Skip to content

Ideas to enable a DEFTRANSFORM facility.

Charles Zhang edited this page Sep 15, 2020 · 1 revision

Cleavir needs to have a known function facility to recognize known calls and facilitate `deftransform’. And also when to fire these.

Think of the trivial optimization which Python can do easily via known functions:

(if (not x) 1 2) => (if x 2 1)

OR

(not (not x)) => x

In current cleavir, even an optimization like this is hopeless since we can’t leverage information about the standard library.

How do we do known functions in Cleavir, in a client independent way?

We need a way to mark calls as being known calls in the IR, so that something like `deftransform’ can fire during BIR optimize.

For example, we can write (defknown cl:not …) so that the client can inform the compiler that the compiler can use special properties of cl:not. For all standard CL functions, it will probably be a good idea to make these known function attributes client independent.

Python has a bunch of known function attributes. We can adopt most of them profitably I think.

How to actually do the above optimization, given the database?

At first, we have the initial BIR for (if (not x) 1 2)

Perhaps during the transformation from CST to AST, we mark that NOT is in fact a known function and the call can be translated as a known call.

We need the compiler to mark function calls as known function calls and make it easy for the IR to access the name of the known function being called to look up information in the database.

Then in the function BIR-OPTIMIZE-IF (analogous to Python’s IR1-OPTIMIZE-IF) it can simply pattern match the known call to NOT and do the swap. This is also how (not (not <e>)) can get simplified to <e>. The outer known call invokes its deftransform

(deftransform not ((<transfer e>) *) (splice-fun-args <transfer e> ‘not 1) ‘<transfer e>)

where splice-fun-args basically takes the transfer-def of the transfer, checks that the def is a known call to NOT (corresponding to the inner NOT), and splices the argument of the inner NOT into the outer continuation _ in the outer (not _). So basically (not (not <e>)) -> (not <e>), and then the return value of the transform just changes this to <e>. This is exactly how Python implements this transform for (- (- <e>)).

When do these data driven transforms actually fire?

So now you might wonder when BIR-OPTIMIZE-IF and the transform for NOT in the preceding discussion might fire. In Python, it’s simple. There’s a function called IR1-OPTIMIZE that just runs a grab bag of optimizations in a fixpoint loop until code stops changing. It manages to do this efficiently by annotating the lvar structures of the things that change on each iteration of the loop, so that the entire graph doesn’t need to be run over, just the parts of the graph that have information updated on every run. If you look at the SBCL IR1 datastructures, LVARs and NODEs all have reoptimize flags for this purpose.

Of course current Cleavir doesn’t do this, and it just has a very primitive notion of one time passes. We will probably need some way of persistently annotating the IR data structures themselves for when an optimization changes a part of the code. For example, the NOT NOT transform would fire and only mark the local part of the graph affected by this transform for reoptimization, so that doing another iteration of BIR-OPTIMIZE won’t need to needlessly reexamine everything. SBCL does this via reoptimize flags on the structs themselves. This is not a bad idea at all, but there might be a more elegant way to do this. You can think of deftransform as basically a datadriven optimization that ruthlessly does high level lisp simplifications to bring the graph into a canonical form that makes it easy for the compiler to do further analysis, which causes more eligible transforms, and so on and so forth… Oh, and this is exactly where and when bottom up type information is locally transferred. Then Top down type propagation (“constraint propagation”) ties all this local information together with global flow analysis.