• nerdponx 2 hours ago

    I have some questions that are not answered by the homepage.

    1) How does this work with function parameters that are intended to be captured unevaluated with substitute()? Do you type the input as "any" and document separately that the parameter is kept "unevaluated" as a symbol/name or call?

    2) How does this work with existing untyped R code? Does it at least include types for the standard library (or some subset thereof?)

    3) Is there any type inference, or does it require explicit type annotation everywhere?

    4) How do you propose to handle NA (which can appear "within" any typed vector)? Does the compiler support refinement types? If not, how does checking for and preventing nullability work, when checking for NA values requires a runtime check?

    5) How do data frames work? Are they typed like structs?

    6) Which object systems does it support, if any? S3, S4, Reference Classes, or the 3rd-party R6?

    As much as I like static types, I feel like R is maybe the language where I need or want them the _least_. How often do you really run into a situation where you pass a character vector to a function that requires a numeric vector and it crashes your program?

    99% of the time what you really want is known-valid data frames for data processing, and statically-sized arrays for math stuff.

    • andrewla 6 hours ago

      As an R programmer the examples given on the landing page seem very foreign to me -- you are almost always writing vectorized code in R, so I would think that would be front and center.

          let x: int = 1
      
      Is this a list of ints or a pure singleton? R doesn't have scalar types, so it would seem the former, but the example makes it unclear. Later in the docs it makes it clearer:

          let x: int = (1, 2, 3)
      
      And this, as an R developer, I can definitely get behind -- the c(...) syntax is always awkward and having a native syntax for static arrays is a welcome change.
      • juujian 5 hours ago

        Yeah, it's not an idiomatic example. I like the idea, but this makes me worry that the project does not have the right priorities. I.e., supporting my use cases :D

      • johnnybzane 2 hours ago

        How do I find jobs that use the R language? It's impossible to search the letter "R" on LinkedIn or Indeed without getting a bunch of unrelated job postings

        "R" is the only programming language I know and I can't find a job that uses R because job search engines don't allow you to sort by skill

        "R language" is the closest substitute on LinkedIn but the results are still a jumbled mess of jobs, some looking more for other skills (SQL/Python)

        I know R-heavy jobs exist but finding them on LinkedIn is virtually impossible

        • clircle 2 hours ago

          Why would you do that? R is just a tool for doing statistics or research. You need to search for jobs in your subject area like "ecologist", "econometrician", "green energy researcher", etc.

          • johnnybzane an hour ago

            There are hedge funds that like hiring people who know how to manipulate data in R using dplyr and data.table

            Looking for a similar job where my desire/interest to spend all day in RStudio is a value add to a business

        • kagevf 16 minutes ago

          I tried using "r" (with quotes) on Indeed, and got some hits where R was listed as one of the necessary skills.

          • Balladeer an hour ago

            How does "R language" compare to searching for one of the popular R packages? Searching for "tidyverse", "dplyr", or "ggplot" seems to get a good chunk of hits. That being said, yeah, there does seem to be a trio of skills that often go together (R, Python, SQL)

            • johnnybzane an hour ago

              If you search specific packages on LinkedIn the number of jobs is usually very small

              E.g. tidyverse or dplyr is like 20-40 jobs. ggplot is 88. There are definitely way more than 100 companies looking for R-heavy users.

          • mushufasa 6 hours ago

            The main reason we shy away from R for production apps is all the silent errors where things seem to succeed while being horribly wrong if you take a look. Typing would certainly help mitigate that.

            • ecshafer 6 hours ago

              I think this is a great idea for the project. I don't dislike the syntax, but the syntax seems more ML than R to me. I think keeping the syntax more R-like could be worthwhile.

              • joshdavham 2 hours ago

                Looks interesting! What types of programs do you think people would write in this language? I don't see an obvious need for traditional R programs which are usually just scripts for working with data, but maybe people could write R packages in this language?

                • uptownfunk 5 hours ago

                  Will this fix the problems it claims to? The power of R is the rich package ecosystem. It caters to people who don’t want to think about engineering concerns but want a fast way to access the powers of computation rather than building a scalable system, two very different things. It excels at the former. A new language will not fix this, because this type of thinking has infected the entire package ecosystem. Frankly with code translation you probably don’t need a new language. Prototype in R and code translate to Python or whatever you want to use in prod. Or frankly just do code gen directly in Python so you can skip having to confirm if the results match.

                  To be clear, I love R, it excels in prototyping but I have seen too many real world struggles of folks trying to move to prod that I would say save it for EDA projects and one time analyses.

                  • _Wintermute 5 hours ago

                    I often find I want a specific statistical package that's only in R, but want a more general purpose language for all the other stuff that's involved (parsing, filesystem stuff, error handling etc). I don't want to risk re-writing the statistical methods and all their dependencies in the sensible language, so I end up calling R only for the statistical methods, but I can see this as an alternative.

                    • joshdavham 2 hours ago

                      > A new language will not fix this, because this type of thinking has infected the entire package ecosystem.

                      Do you think the culture of the package ecosystem could possibly change in the future?

                    • clircle 6 hours ago

                      Statisticians and researchers, is this helpful?

                      • tech_ken 4 hours ago

                        I would say that the vast majority of type problems in data science/stats workflows come from data tables "trojan-horsing" type or missing data issues, rather than type problems strictly at the code level. Type annotations won't help you when your upstreams decide they want to change the format of their year-quarter strings without telling you.

                        • dragonwriter 4 hours ago

                          > Type annotations won't help you when your upstreams decide they want to change the format of their year-quarter strings without telling you.

                          IME with both Python and JS/TS, it helps a lot (which is different than completely solving the problem), for reasons which should generalize to other typing add-ons/supersets for untyped languages. Typing your code forces validations at the boundaries, which obviously doesn't stop upstream sources from messing with formats but it does mean that you are much more likely to catch it at the boundary rather than having weird breakages deep in your code that you have to trace back to bad upstream data.

                          • tech_ken 3 hours ago

                            Is the idea that if my year_quarter parser is properly typed then it should detect the format change and throw an error? (kind of a silly example, just trying to be illustrative)

                            • Nadya 2 hours ago

                              Yes. Your type can encode what the proper format for a string should be and if a string is passed that does not meet that format it will throw an error allowing you to make any necessary adjustments to handle the new date year_quarter format.

                              e.g. type DateString = `${number}/${number}/${number}`

                              A super naïve check for using "/" instead of "-" as the separator character for a date formatted as a string. If a date is provided with some other separator character it will throw an error. If my function takes a DateString the string must be formatted correctly to pass the type check. Obviously this isn't enough (YYYY/MM/DD is different than DD/MM/YYYY) but the intention was to show a way to enforce something via types: rather than validating a string to check that you have a DateString, you can simply enforce that you have one.
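
                              A rough runtime counterpart of that check, sketched in Python rather than TypeScript. The names DATE_STRING and require_date_string are made up for illustration, and the pattern only accepts plain digit runs, so it is a little stricter than `${number}`:

```python
import re

# Hypothetical runtime analogue of the DateString idea: three runs of digits
# separated by "/". Field order (YYYY/MM/DD vs DD/MM/YYYY) is NOT checked.
DATE_STRING = re.compile(r"\d+/\d+/\d+")


def require_date_string(value: str) -> str:
    """Return value unchanged if it looks like digits/digits/digits, else raise."""
    if DATE_STRING.fullmatch(value) is None:
        raise ValueError(f"not a DateString: {value!r}")
    return value
```

                              Calling require_date_string("2024-01-31") raises ValueError at the point where the string enters the program, which is where a changed separator is cheapest to notice.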

                              • dllthomas 2 hours ago

                                "Typing your code forces validations at the boundaries" was too strong because of course you can type your code without actually doing the validations, but you can structure your code such that that won't happen accidentally: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...

                                The idea is that checking should be the only way of making a value of the type. That prevents you from forgetting to check when you turn some broader type (say, string) into the more narrow one (date, in this case).
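
                                A minimal Python sketch of that idea, using the year-quarter example from upthread. The YearQuarter class, the "2024Q3" wire format and next_quarter are all assumptions for illustration; Python cannot fully hide a constructor, so the range check sits in __post_init__ where every construction path runs it:

```python
import re
from dataclasses import dataclass

# Assumed wire format, e.g. "2024Q3"; a real upstream may differ.
_YEAR_QUARTER = re.compile(r"(\d{4})Q([1-4])")


@dataclass(frozen=True)
class YearQuarter:
    year: int
    quarter: int

    def __post_init__(self) -> None:
        # Runs on every construction, so an out-of-range quarter cannot exist.
        if not 1 <= self.quarter <= 4:
            raise ValueError(f"quarter out of range: {self.quarter}")

    @classmethod
    def parse(cls, text: str) -> "YearQuarter":
        # The only str -> YearQuarter path: unknown formats fail here.
        match = _YEAR_QUARTER.fullmatch(text)
        if match is None:
            raise ValueError(f"unrecognised year-quarter: {text!r}")
        return cls(int(match.group(1)), int(match.group(2)))


def next_quarter(yq: YearQuarter) -> YearQuarter:
    # Accepts the narrow type, so callers must have parsed already.
    if yq.quarter == 4:
        return YearQuarter(yq.year + 1, 1)
    return YearQuarter(yq.year, yq.quarter + 1)
```

                                Because next_quarter takes a YearQuarter and not a str, a type checker flags any call site that forgot to parse, and a format change such as "2024-Q4" fails inside parse rather than deep in the code.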

                                • dragonwriter 13 minutes ago

                                  > "Typing your code forces validations at the boundaries" was too strong because of course you can type your code without actually doing the validations

                                  Yeah, of course you can cheat the typechecking in the code at the boundary in several ways, or convert from wire format to internal types in a way which plugs in type-valid defaults for bad data rather than erroring, or just use too-broad internal types to start with (you can have "stringly-typed code"), and fail to help the problems. But if you use the types that make sense internally for what the code is doing, then conversion including validation at the boundary becomes the path of least resistance in most cases. "Forces" is not strictly true, but my experience is that adding types does create a strong push for boundary validation.

                          • ellisv 2 hours ago

                            It is probably helpful in some cases and unhelpful in others. R uses multiple dispatch, so calling `foo` on different types can produce different output. It isn't clear to me how Vapour handles this. In general though, folks are passing around data.frame or similar objects.
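
                            To illustrate why dispatch complicates typing, here is a rough Python analogue using functools.singledispatch. It selects an implementation from the class of the first argument only, so it is closer to S3 than to S4-style multiple dispatch; the name foo follows the comment above and the two methods are invented:

```python
from functools import singledispatch


@singledispatch
def foo(x):
    # Fallback when no method is registered for the argument's class.
    raise TypeError(f"no foo method for {type(x).__name__}")


@foo.register
def _(x: str) -> int:
    # Method for strings: returns an int.
    return len(x)


@foo.register
def _(x: list) -> str:
    # Method for lists: returns a str, a different type from the str method.
    return ", ".join(map(str, x))
```

                            The return type of foo depends on which method is chosen, so a checker has to know the argument's class before it can say what a call returns.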

                            • levocardia 3 hours ago

                              Not really, because honestly a lot of us who came into programming via research never learned typed languages or unit tests or any of those best practices - we were just hacking around in MATLAB, R, or Python from the start. What I really need is a seamless and easy way to run statistical models that can only be fit in R, but from Python or Node. There are several categories of statistical modeling where R completely blows Python out of the water, and it's incredibly wasteful (and error-prone) to try to re-implement these yourself in Python.

                          • russellbeattie 2 hours ago

                            This isn't specifically about Vapour, just about what's become the common way to specify types.

                            I know this is totally bike shedding, semantics, vi vs Emacs, BigEndian vs LittleEndian and it's too late now to affect anything, but to me using a colon after the variable is just wrong!

                                let x : int = 1

                                func add(x: int, y: int): int { return x + y }

                            I see that and it looks like int = 1 and the function's return type is totally lost.

                            This seems completely backwards to me. Maybe I'm just used to the way C did it, but the variable modifiers should come first.

                                let int x = 1

                                func int add(int x, int y) { return x + y }

                            Why we reversed it and added in the colon just doesn't make much sense to me.

                            • bachmeier 5 hours ago

                              I took a couple stabs at this long ago (even before there was a Typescript for inspiration). The first attempt was to add types to the syntax of R, but that would have required a lot more time than I had. Properly catching errors is a massive undertaking requiring a lot of background I don't have. The second attempt was to add syntax for types to R and then compile the code to another language. That's easy to do, but really boring, so I wasn't able to stick with it. It comes with the advantages of static typing and R code that runs very fast. I gave up and went with embedding R inside a statically typed language. Very happy with my choice.

                              Good luck to the authors of this. I believe it solves an important problem for R package authors and others wanting to write bigger programs. It's hard to argue with the benefits of static typing for this type of work.

                              • lloydatkinson 6 hours ago

                                This looks nice. I find R to be an unreadable mess. The comparison shows a great improvement.

                                • qudat 4 hours ago

                                  The default IDE workflow is like a Python "notebook" where code can be, and is, run in whatever order the creator wants. Every piece of R code I've read treats it as such and it results in an absolute mess to read and manage.

                                • layer8 5 hours ago

                                  Sounds like vapourware. ;)

                                  • joshdavham 2 hours ago

                                    I mean, there is an alpha you can download. If it was just a landing page and an email waitlist, then that would be vaporware.

                                    • layer8 an hour ago

                                      I was commenting on the naming choice.

                                  • brudgers 2 days ago

                                    [flagged]

                                    • johncoene 2 days ago

                                      First, how is that "giving myself an excuse"? Second, it's a total non sequitur, and even then, it's a day old; has it broken?

                                      • brudgers 2 days ago

                                        > the syntax might change, things will break, expect bugs.

                                        Bugs are normal software development.

                                        Changing syntax and breaking things make work for everyone else for the convenience of developers. Reliability is what makes a tool a tool.

                                        • Terretta 12 hours ago

                                          > Changing syntax and breaking things make work

                                          How else might one explore a new language (vapour) in the open among interested like-minded developers seeking to iterate on a tool found lacking (R)?

                                          Changing and iterating things makes.

                                          • ausbah 5 hours ago

                                            they aren’t wrong. backwards compatibility is supposed to be one of the first promises of any mature programming language, unless you make it explicit by noting breaking changes in major version updates (1.X.X —> 2.X.X) or the language is purely for R&D and makes no guarantee of anything