Comments Page - Designing type inference for high quality type errors

« Back Designing type inference for high quality type errorsblog.polybdenum.comSubmitted by PaulHoule 6 months ago

johnbender 6 months ago
Minor nit:
> The job of a compiler error message is to prove to the user that their code is invalid
The job of the compiler error message is to convey why the compiler couldn’t demonstrate the code is correct. The code may be valid but the compiler just can’t prove it (analyses are sound but not complete unless the language is strongly normalizing)
- PaulHoule 6 months ago
  Practically the difficulty of generating error messages is one reason why "we can't have nice things". I ran into this problem while trying to write an adventure game (like Zork) in Drools. Drools sucks in a lot of stuff from a Java compiler and mashes it into a production rules framework and the result is I was getting error messages that made no sense at all and thus gave up on the project.
  One approach to compiler development is one of successive transformations and it is possible to cut compilers into very thin slices like the original FORTRAN for the IBM 1401. Similarly you can go beyond the methods taught in Graham's On Lisp to do rather complex metaprogramming in Lisp or some other language, trouble is that an error manifests at stage N+5 but the information required to explain what went wrong was elided at stage N so it is quite difficult to figure out what's going wrong.
- tshaddox 6 months ago
  Sure, but the goal when designing a compiler and its error messages is to prove to the user that their code in invalid. Most practical type systems are going to be either unsound or incomplete (or both), but the goal should be to make encountering those limitations rare in practice.
  AnimalMuppet 6 months ago
  It seems to me that, if the limitations are encountered too often in practice, then people stop using that language to write that kind of code, because it's too frustrating to use that way.
  LoganDark 6 months ago
  > Sure, but the goal when designing a compiler and its error messages is to prove to the user that their code in invalid.
  The goal when designing a compiler and its error messages should be to give the user the information they need to correct their code.
  tshaddox 6 months ago
  A distinction without a difference. Perhaps "prove" is sounding too mathematical. I could rephrase it as "the goal...is to demonstrate to the user what is invalid about their code."
  LoganDark 6 months ago
  > I could rephrase it as "the goal...is to demonstrate to the user what is invalid about their code."
  This is closer, yes.
  tshaddox 6 months ago
  To prove is to demonstrate, and the way to do it is to display information about what is invalid about the code. So all these ways of phrasing it are synonymous.
  antonvs 6 months ago
  The meaning of "proof" in a computer science context is much stronger than that. A proof does involve demonstration, but it's demonstration via a deductive logical argument that shows that the conclusion is logically guaranteed to follow from the premises.
  Compiler error messages would be a lot more verbose if they did that rigorously.
  tshaddox 6 months ago
  Again, I didn’t intend the mathematical definition of “prove.”
- layer8 6 months ago
  One way of formally defining a programming languages is as the set of strings accepted as programs by the compiler, or by the formally defined algorithm that is implemented by the compiler. By such a definition, a given string is not a valid program of the respective programming language if the algorithm rejects it. And it is useful for the error message to show how the input program fails the requirements of the algorithm. The input program may be sound in type-system terms, but still not valid for the respective programming language.
tayistay 6 months ago
I've been working on a type checker for a language with ad-hoc overloading and what I did was have the checker proceed iteratively, making passes over the set of constraints and applying deduction rules. So it never guesses, branches, or has to backtrack. If it can't make progress because there's too much overloading, it gives up and asks the user to add some type annotations. I suspect this will actually work quite well in practice even if it can't type check some valid programs.
ruuda 6 months ago
For reporting the expected type on mismatch, I do something similar in RCL to be able to track the source of the expectation: https://ruudvanasseldonk.com/2024/implementing-a-typechecker...
renox 6 months ago
I'm not sure I value type inference more than function overloading.. Local type inference yes, global, mmmh..
estebank 6 months ago
Hot take: the primary role of a compiler is to turn malformed code into human readable diagnostics. They are error reporting tools with a codegen is side-gig.
- JonChesterfield 6 months ago
  This can and should be a separate tool, probably called a linter. The compiler dealing with codegen can then just say "nope, that's not a program in this language, try the linter", optionally calling into the linter for you.
  This does of course mean that divergent ideas of "correct" between linter and compiler are confusing and really do need to be avoided for happy developer times.
init1 6 months ago
All these years and Ada is still the undisputed and unmatched master of type safety.