• Jtsummers 2 days ago

    If you like Ghuloum's paper, there are three fairly recent compiler books that are inspired by it:

    https://nostarch.com/writing-c-compiler - Writing a C Compiler by Nora Sandler, which is agnostic about the implementation language.

    https://mitpress.mit.edu/9780262047760/essentials-of-compila... - Essentials of Compilation (using Racket) by Jeremy Siek

    https://mitpress.mit.edu/9780262048248/essentials-of-compila... - Essentials of Compilation (using Python) by Jeremy Siek

    Those last two both have open access versions.

  • WantonQuantum 2 days ago

    The "lambda lifting" seems to be referring to section 3.11 "Complex Constants" in the linked Ghuloum PDF:

    Scheme’s constants are not limited to the immediate objects. Using the quote form, lists, vectors, and strings can be turned into constants as well. The formal semantics of Scheme require that quoted constants always evaluate to the same object. The following example must always evaluate to true:

        (let ((f (lambda () (quote (1 . "H")))))
          (eq? (f) (f)))
    
    So, in general, we cannot transform a quoted constant into an unquoted series of constructions as the following incorrect transformation demonstrates:

        (let ((f (lambda () (cons 1 (string #\H)))))
          (eq? (f) (f)))
    
    One way of implementing complex constants is by lifting their construction to the top of the program. The example program can be transformed to an equivalent program containing no complex constants as follows:

        (let ((tmp0 (cons 1 (string #\H))))
          (let ((f (lambda () tmp0)))
            (eq? (f) (f))))
    
    Performing this transformation before closure conversion makes the introduced temporaries occur as free variables in the enclosing lambdas. This increases the size of many closures, increasing heap consumption and slowing down the compiled programs. Another approach for implementing complex constants is by introducing global memory locations to hold the values of these constants. Every complex constant is assigned a label, denoting its location. All the complex constants are initialized at the start of the program. Our running example would be transformed to:

        (labels ((f0 (code () () (constant-ref t1)))
                 (t1 (datum)))
          (constant-init t1 (cons 1 (string #\H)))
          (let ((f (closure f0)))
            (eq? (f) (f))))
    
    The code generator should now be modified to handle the data labels as well as the two internal forms constant-ref and constant-init.
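
    For a concrete picture, here is a minimal sketch of such a lifting pass in Scheme (my own illustration, not the paper's code; it assumes a tiny core language and traverses binding forms naively):

        ;; Walk an expression, replace each quoted complex constant with a
        ;; fresh temporary, and collect (tmp constructor-expr) bindings to
        ;; wrap around the whole program.
        (define lifted '())   ; accumulated (tmp constructor-expr) bindings
        (define counter 0)

        (define (fresh-tmp)
          (set! counter (+ counter 1))
          (string->symbol
            (string-append "tmp" (number->string (- counter 1)))))

        (define (complex-datum? d)
          (or (pair? d) (string? d) (vector? d)))

        ;; Turn a datum into an expression that constructs it at run time.
        (define (build-expr d)
          (cond ((pair? d)   (list 'cons (build-expr (car d)) (build-expr (cdr d))))
                ((string? d) (cons 'string (string->list d)))  ; "H" => (string #\H)
                ((vector? d) (cons 'vector (map build-expr (vector->list d))))
                (else        (list 'quote d))))

        (define (lift-constants e)
          (cond ((and (pair? e) (eq? (car e) 'quote) (complex-datum? (cadr e)))
                 (let ((tmp (fresh-tmp)))
                   (set! lifted (cons (list tmp (build-expr (cadr e))) lifted))
                   tmp))
                ((pair? e) (map lift-constants e))
                (else e)))

        (define (lift-program e)
          (set! lifted '())
          (let ((body (lift-constants e)))
            (if (null? lifted) body (list 'let (reverse lifted) body))))

    Running it on the example above reproduces the paper's transformation:

        (lift-program '(let ((f (lambda () '(1 . "H")))) (eq? (f) (f))))
        ;; => (let ((tmp0 (cons (quote 1) (string #\H))))
        ;;      (let ((f (lambda () tmp0))) (eq? (f) (f))))
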
    • JonChesterfield 2 days ago

      The idea is to move free variables from the body of the function to the argument list and rewrite the call sites to match.

      That decreases the size of the closure (and increases the size of the code, and of however you're passing arguments).

      Do it repeatedly, though, and you end up with no free variables, i.e. no closure to allocate. Hence the name: the lambda (closure) has been lifted (through the call tree) to the top level, where it is now a plain function (and not a lambda, following the usual conflation of anonymous functions with allocated closures).
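
      A tiny before/after sketch in Scheme (hypothetical names, my own illustration) of what the rewrite does:

        ;; Before: n is free in the inner lambda, so a closure captures it.
        (define (make-adder n)
          (lambda (x) (+ x n)))

        (define add3 (make-adder 3))
        (add3 4)                      ; => 7, via an allocated closure

        ;; After lifting: n becomes a parameter, the lambda has no free
        ;; variables, and it becomes a plain top-level function. Every
        ;; call site must be rewritten to pass n explicitly.
        (define (adder x n) (+ x n))
        (adder 4 3)                   ; => 7, no closure allocated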

      Doesn't work in the general case because you can't find all the call sites.

      • zozbot234 2 days ago

        I think the "no closure to allocate" is not quite right, because the captured parameters of a first-class function still need to be stored somewhere. The allocation just happens in the calling code instead; e.g. consider how a "function object" in C++/Java works: the operator() or .call() code does not need to allocate anything, but allocation may occur when the object itself is constructed.
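
        A sketch in Scheme of what closure conversion produces (my illustration, not any particular compiler's output) makes the same split visible:

          ;; A converted closure is just code plus its captured values;
          ;; consing them together is the allocation.
          (define (make-closure code env) (cons code env))            ; allocates
          (define (closure-apply clo arg) ((car clo) (cdr clo) arg)) ; does not

          ;; (lambda (x) (+ x n)) with free n becomes code that takes
          ;; its environment explicitly:
          (define (adder-code env x) (+ x (car env)))

          (define clo (make-closure adder-code (list 5)))  ; construction allocates
          (closure-apply clo 2)                            ; => 7, allocation-free call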

        • mrkeen 2 days ago

          Once they've been converted from free variables to formal parameters, it's assumed you can just stack-allocate them and pop them off when you return from your lambda (which is no longer a closure).

      • undefined 2 days ago
        [deleted]
        • MangoToupe 2 days ago

          > Using the quote form, lists, vectors, and strings can be turned into constants as well.

          So all of these forms will be transformed into bss?

          • JonChesterfield 2 days ago

            bss stores zeros, but sure: this lot could end up in rodata if you were careful about the runtime representation, or in data if you were a little less careful. Treat the ELF (ro)data section as the longest-lived region in the garbage collector and/or don't decrement refcounts found there. It's a good thing to do for the language's standard library.

        • kazinator 2 days ago

          In the TXR Lisp compiler, I did lambda lifting simply: lambda expressions that don't capture variables can move to the top via a code transformation that inserts them into a load-time form (very similar to ANSI Common Lisp's load-time-value).

          E.g.

            (let ((fun (lambda (x) (+ x x))))
              ...)
          
          That can just be turned into:

            (let ((fun (load-time (lambda (x) (+ x x)))))
              ...)
          
          Then the compilation strategy for load-time takes care of it. I had load-time working and debugged by the time I started thinking about optimizing lambdas in this way, so the approach was obvious.

          load-time creates a kind of pseudo-constant. The compiler arranges for the enclosed expression to be evaluated just once. The object is captured and it becomes a de-facto constant after that; each time the expression is evaluated it just refers to that object.
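
          A rough analogy in plain Scheme (my sketch, not TXR's mechanism; TXR evaluates the expression once up front rather than on first use, but the once-only, shared-object behavior is the same):

            ;; The load-time expression, hoisted to a top-level cell that
            ;; is evaluated at most once.
            (define lt-cell (delay (cons 1 (string #\H))))

            (define (f) (force lt-cell))  ; every call returns the same object
            (eq? (f) (f))                 ; => #t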

          At the VM level, constants are represented by D registers. The only reason D registers are writable is to support load-time: the load-time code stores a value into a D register, where it becomes indistinguishable from a constant. If I were so inclined, I could add a VM feature that write-protects the D register file after the static-time code has finished executing.

          If we compile the following expression, the d0 register is initially nil. The d1 register holds 3, which comes from the (+ 3 x) expression:

            1> (compile-toplevel '(lambda () (lambda (x) (+ 3 x))))
            #<sys:vm-desc: a32a150>
            2> (disassemble *1)
            data:
                0: nil
                1: 3
            syms:
                0: sys:b+
            code:
                0: 8C000009 close d0 0 4 9 1 1 nil t2
                1: 00000400
                2: 00010001
                3: 00000004
                4: 00000002
                5: 20020003 gcall t3 0 d1 t2
                6: 04010000
                7: 00000002
                8: 10000003 end t3
                9: 8C00000E close t2 0 2 14 0 0 nil
               10: 00000002
               11: 00000000
               12: 00000002
               13: 10000400 end d0
               14: 10000002 end t2
            instruction count:
                6
            #<sys:vm-desc: a32a150>
          
          The close instruction has d0 as its destination register ("close d0 ..."). Its 9 argument indicates the instruction offset to jump to after the closure is created: offset 9, where another close instruction is found, representing the outer (lambda () ...).

          We have only compiled this top-level form and not yet executed any of the code. To execute it we can call it as if it were a function, with no arguments:

            3> [*1]
            #<vm fun: 0 param>
          
          It returns the outer lambda produced at instruction 9, as expected. When we disassemble the compiled form again, register d0 is filled in, because the close instruction at 0 has executed:

            4> (disassemble *1)
            data:
                0: #<vm fun: 1 param>
                1: 3
            syms:
                0: sys:b+
            code:
                0: 8C000009 close d0 0 4 9 1 1 nil t2
            [... SNIP; all same]
          
          
          d0 now holds a #<vm fun: 1 param>, which is the compiled (lambda (x) ...). We can call the #<vm fun: 0 param> returned at prompt 3 to get that inner lambda:

            5> [*3]
            #<vm fun: 1 param>
            6> [*5 4]
            7
          
          We can disassemble the functions from prompts 3 and 5; we get the same assembly code but different entry points. I.e. both lambdas reference this same VM description for their code and static data:

            7> (disassemble *3)   ; <-- outer (lambda () ...)
            data:
                0: #<vm fun: 1 param>
                1: 3
            [ SNIP same disassembly ]
                9: 8C00000E close t2 0 2 14 0 0 nil
               10: 00000002
               11: 00000000
               12: 00000002
               13: 10000400 end d0    <---
               14: 10000002 end t2
            instruction count:
                6
            entry point:
               13                     <---
            #<vm fun: 0 param>
          
          The entry point for the outer-lambda is offset 13. And that just executes "end d0": terminate and return d0, which holds the compiled inner lambda.

          If we disassemble that inner lambda:

            8> (disassemble *5) ; <--- inner lambda (lambda (x) (+ 3 x)) 
            data:
                0: #<vm fun: 1 param>  <--- also in here 
                1: 3
            syms:
                0: sys:b+
            code:
                0: 8C000009 close d0 0 4 9 1 1 nil t2
                1: 00000400
                2: 00010001
                3: 00000004
                4: 00000002    <---
                5: 20020003 gcall t3 0 d1 t2    <-- sys:b+ 3 x
                6: 04010000
                7: 00000002
                8: 10000003 end t3
            [ SNIP ... ]
            entry point:
                4             <---
            #<vm fun: 1 param>
          
          The entry point is 4, referencing into the lifted lambda that was placed into d0. Entry point 4 is the part of the close instruction that encodes the parameter mapping: the word there indicates that the argument value is to be put into register t2. Then function 0 is called with the t2 argument and d1 (which is 3); function 0 is in the syms table: sys:b+, a binary add. When it returns, its value is put into t3 and execution terminates with "end t3".

          • FrustratedMonky 2 days ago

            Off topic, but does anybody have a quick take on why LISP isn't the primary language of all of these AI models, and why everybody defaulted to using Python?

            I just remember 30 years ago everyone thought LISP would be the language of AI.

            Was it just that some nice, easy Python libraries came out, and that was enough to win the mindshare market? More people can use Python glue?

            • Jtsummers 2 days ago

              Lisp's market share was declining 30 years ago and only continued to decline, while Python's has consistently risen over that same period. Also, Lisp offers little benefit, if any, when all the ANN implementations rely on C++ and CUDA code. You can write fast numerical code in Lisp, but it's not as straightforward and it's certainly not an idiomatic or common way to use Lisp. That could have let it compete with the C++ libraries, but it wouldn't help with the GPU programming part. Lisp could have been the glue language like Python, but again, Python's popularity was on the rise and Lisp's was on the decline.

              • medo-bear 2 days ago

                [dead]

              • cardanome 2 days ago

                Lisp was more associated with the classical symbolic and logic-based AI approach, which doesn't share much in common with generative AI. It's questionable whether the latter should even be called "AI", but that battle was lost years ago.

                Python is just a really good glue language and was good enough for the task.

                • spauldo a day ago

                  A lot of Lisp was funded by AI research. That research dried up in the 90s (see "AI Winter"). Regular computers of the time were underpowered for Lisp, which originated on mainframes and then moved onto purpose-built hardware. So the new generation of programmers learned Pascal, C, and C++ instead and were mostly disconnected from the AI research that had come before.

                  Modern AI focuses on numeric methods, whereas classic AI was heavily symbolic. That makes Lisp less well suited for it, since symbolic computation is where Lisp shines. Classic AI does have some useful bits to it, but as far as AI development goes, it's mostly considered a dead end.

                  • rurban a day ago

                    Because worse is better. You cannot persuade dummies to use a better language; they'll always fall back to the currently worst language, be it VB then or Python now.

                  • comonoid 2 days ago

                    Do you even lift?