-
Notifications
You must be signed in to change notification settings - Fork 12
The Compiler
This is a placeholder for a guide to understanding how the compiler is implemented, as well as possible avenues for possible optimization, workflows and tools for validating performance and checking compiled output at various stages, etc.
Functional, immutable. No accidental side-effects.
The compiler first optimizes the source code and then generates Racket target code (including any necessary bindings).
The optimization is accomplished in a series of passes, each implementing a specific kind of optimization. Currently there are only two passes: (1) normalization and (2) deforestation.
Normalization rewrites source expressions to a standard and simple form, mapping many equivalent expressions to a common representative one, that can then be optimized in subsequent passes which need not worry about the many permutations of syntax that they may be applicable to. We call this normalized form "Qi Normal Form," and it is yet to be formally defined. As the representative expression can in some cases be faster than the source expression, we do consider Normalization to be an "optimization" though that is not its primary purpose.
Stream fusion - sometimes called deforestation - is implemented for a very limited set of host expressions.
The implementation of the whole deforestation pass is in the
deforest.rkt
module.
For certain sequences of flow steps that normally operate on lists, a stream approach for processing is implemented which generates optimized code without generating intermediate lists after each operation.
There are three types of operations that are eligible for stream fusion:
- Stream Producers mark the beginning of fused sequence and without deforestation they would produce a list. When fused, these generate the stream of values that would normally be contained in the list produced.
- Stream Transformers normally take a list of values and produce a new list of values. They can add, remove and modify the elements of the list. When fused, these changes are executed on a per-step basis and a standard semantics of how this is related to the input stream elements is defined.
- Stream Consumers take a list of values and produce a single value. When fused, these take care of terminating the fused sequence and produce a final value of the optimized part of the flow.
The following producers are implemented:
- implicit producer from list value,
- range generator producer.
The following transformers are implemented:
- filter - can be used with implicit producer,
- map - cannot be used with implicit producer currently.
The following consumers are implemented:
- implicit consumer that collects the result in list,
- foldr,
- foldl,
- car.
The core of this stream fusion optimizer is continuation-passing style stream implementation. Stream is represented as lambda term - usually called "next" - with three arguments:
- done - lambda accepting no arguments which gets evaluated when no more elements are generated by this stream,
- skip - lambda accepting one argument - the new state - which is evaluated when the current input element is not to be used,
- yield - lambda accepting two arguments - current value and the new state - which is evaluated when the current element is to be processed.
Producers must accepts these three mandatory arguments and produce a lambda accepting the initial state.
Transformers must at least accept the "next" lambda and produce a lambda term with the stream signature, implementing given transformation. The transformers usually accept more arguments specifying the transformation parameters.
Consumers must at least accept the "next" lambda and run the whole fused pipeline producing a result value. The consumer usually accepts more arguments specifying the operation parameters - like transformers do.
Fusable operations within the flow are matched using syntax classes which also provide attributes required by the fused operation generator.
The fusable-stream-producer
syntax class matches both implicit and
explicit producers and it provides the following attributes for
generating the fused operation:
-
next
- low-level stream procedure (see above) -
prepare
- wrapper procedure that converts input and applies it appropriately to thenext
procedure -
contract
- procedure contract which is used for wrapping the whole fused operation and ensure runtime checks of its input values -
name
- used for error reporting -
curry
- currying procedure for theprepare
part - it must handle direct usage and fine and blanket templates
The following operations are supported:
list->cstream
range
The fusable-stream-transformer0
matches any transformer that can be
be used with implicit producer. It provides the following attributes
for the fused operation generator:
-
f
- the operation of the transformer (right now this class is tailored towards filter- and map-like transformers, it might require extensions in the future) -
next
- low-level stream procedure
Currently only filter
is implemented.
The fusable-stream-transformer
matches any transformer and provides
the same attributes as fusable-stream-transformer0
. The following
operations are supported:
filter
map
The fusable-stream-consumer
matches any consumer. It provides the
following attribute for the fused operation generator:
-
end
- the beginning of the S-expression surrounding the generated fused operation
The following operations are supported:
cstream->list
foldr
foldl
car
The actual deforestation pass is implemented as syntax transformation of the source flow based on aforementioned syntax classes. The rewrite looks for the following patterns spliced within the flows:
- producer transformer ... consumer
- transformer0 transformer ... consumer
- producer transformer ...+
- transformer0 transformer ...+
For each of these patterns, there is a template that converts the sequence into a directly fusable (normalized) form as follows:
- producer transformer ... consumer
This syntax rewrite ensures the first element is (maybe an implicit) producer and the last element is (maybe an implicit) consumer.
The deforestation pass syntax rewrite uses the
generate-fused-operation
to convert a normalized fusable sequence
into a Racket expression the implements this sequence as a sequence of
fused stream operations.
It also attaches the runtime contracts to both producer and consumer if applicable.
To implement deforestation of a new operation, the low-level CPS implementation must be created first and then appropriate syntax class must be extended with a pattern mattching the operation in the source flow. This pattern must also specify all the required attributes.
- Using environment variables like
PLT_LINKLET_SHOW_CP0=1
. - Inspecting compiler output at intermediate stages.
- Inspecting expansions using
raco expand
. - Gotchas: stale compilation. Use
racket -y
.
Different strategies for testing the compiler are appropriate depending on what you're trying to do.
- Are you trying to verify the meaning of a compiled expression? Write a regular unit test in
qi-test/tests/flow.rkt
. - Are you trying to verify that an optimization was applied? Do you already know what the optimized Qi version should look like? Write an "equivalence" test in
qi-test/tests/compiler.rkt
(see the normalization tests for example), verifying that independently compiling the original expression and the optimized expression produce the same result. - Otherwise, if you don't already know the optimized expression, then write a granular test like the ones for the deforestation pass.
In addition, also consider writing a benchmark to validate the performance improvement, in qi-sdk/profile/nonlocal
.
Ideally, it would be good to develop a body of proofs of the correctness of every compiler transformation (see Validly Verifying that We're Compiling Correctly).
Also see Best Practices for Testing Compiler Optimizations?.
- Alternative layouts - E-graphs, etc.
- Applying stream fusion to core routing forms (may involve new forms of fusion that aren't exactly the traditional kind -- especially to handle multiple values, but also potentially for combinators like
-<
). - Avoiding values->list->values at the level of interface macros, and in host expressions that are subexpressions of flows.
Home | Developer's Guide | Calendar | Events | Projects | Meeting Notes