
Compiler note system better debugging

Charles Zhang edited this page Sep 22, 2020 · 1 revision

We’d like to have a system for producing compiler notes and possibly style-warnings and normal warnings.

A compiler note is usually optimization-related: it reports that code was deleted, that an optimization failed, that code was pessimized, or even that code was successfully optimized. A note usually means the user hasn’t done anything wrong, but that there may be room to improve the code’s performance.

A style-warning is emitted for code that the user most likely did not intend, or that is potentially non-conforming. For example, a bound lexical variable that is never referenced falls under this category, as does a destructive function modifying a constant list.

The compiler emits warnings for code that will surely fail at runtime. For example, a manifest type conflict detected by type inference results in a warning. The user must rewrite this code.

What types of warnings do we want? How do we implement them?

In general, the compiler should accumulate notes and emit them at the end of compilation.
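One way to picture this accumulation is with the Common Lisp condition system. The following is only a minimal sketch: the names `compiler-note`, `note`, `call-with-note-accumulation`, and `report-notes` are hypothetical and are not part of Cleavir. Notes are signaled where they arise, collected by a handler, and reported once the compilation thunk has finished.

```lisp
;; Hypothetical names throughout; this is an illustration, not Cleavir's API.
(define-condition compiler-note (condition)
  ((message :initarg :message :reader note-message)
   (origin :initarg :origin :initform nil :reader note-origin)))

(defvar *accumulated-notes* '()
  "Notes collected during the current compilation, newest first.")

(defun note (message &optional origin)
  "Signal a COMPILER-NOTE. Returns NIL if no handler transfers control."
  (signal 'compiler-note :message message :origin origin))

(defun call-with-note-accumulation (thunk)
  "Call THUNK, collecting every COMPILER-NOTE it signals.
Returns the notes in the order they were signaled."
  (let ((*accumulated-notes* '()))
    (handler-bind ((compiler-note
                     ;; Record the note and decline, so compilation continues.
                     (lambda (c) (push c *accumulated-notes*))))
      (funcall thunk))
    (reverse *accumulated-notes*)))

(defun report-notes (notes &optional (stream *standard-output*))
  "Print each accumulated note, with its origin when one was recorded."
  (dolist (n notes)
    (format stream "~&; note: ~a~@[ in ~s~]~%"
            (note-message n) (note-origin n))))
```

Because the handler declines rather than unwinding, signaling a note never interrupts the pass that produced it.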

Unreferenced variable warning

This arises from something like

(defun f (x)
  2)

which would yield a style-warning telling the user that X was never used. This is extremely important for catching typos, for example. It can be caught rather easily in the initial BIR: if a lexical variable has no references, a style-warning is accumulated, since the user has probably made a typo or forgotten an ignore declaration.
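The check described above might look roughly like the following. This is a sketch under assumptions: `lexical-variable` is a toy structure standing in for whatever BIR uses, and the slot and condition names are invented for illustration.

```lisp
;; Toy stand-in for a BIR lexical variable; all names here are hypothetical.
(defstruct lexical-variable
  name
  (readers '())               ; instructions that read the variable
  (ignore-declaration nil))   ; NIL, :IGNORE, or :IGNORABLE

(define-condition unused-variable (style-warning)
  ((name :initarg :name :reader unused-variable-name))
  (:report (lambda (c stream)
             (format stream "The variable ~s is defined but never used."
                     (unused-variable-name c)))))

(defun unused-variable-warnings (variables)
  "Return UNUSED-VARIABLE conditions for VARIABLES that have no readers and
no IGNORE or IGNORABLE declaration. Conditions are created, not signaled,
so the caller can accumulate them and emit them later."
  (loop for v in variables
        when (and (null (lexical-variable-readers v))
                  (null (lexical-variable-ignore-declaration v)))
          collect (make-condition 'unused-variable
                                  :name (lexical-variable-name v))))
```

Creating the conditions without signaling them fits the idea of accumulating diagnostics and emitting them at the end of compilation.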

Referencing an unknown variable

This arises from something like

(defun f (typographer)
  (feed typographer)
  typo-grapher)

Clearly the user meant to write typographer. We don’t want the undefined variable error to surface only at runtime, so the compiler should notice that typo-grapher is a reference to a variable with no lexical binding. We should emit a style-warning noting that this variable is not declared special and will be treated as special. This can happen during any stage that processes the variables in the environment, probably in CST-to-AST.
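As a rough illustration of that lookup, the sketch below classifies a variable reference against plain lists of symbols that stand in for a real compilation environment. The function `classify-variable` and the condition `undefined-variable-reference` are hypothetical names, not Cleavir's.

```lisp
;; Hypothetical names; the "environment" is just two lists of symbols.
(define-condition undefined-variable-reference (style-warning)
  ((name :initarg :name :reader undefined-variable-name))
  (:report (lambda (c stream)
             (format stream "Undefined variable ~s; treating it as special."
                     (undefined-variable-name c)))))

(defun classify-variable (name lexicals specials)
  "Return :LEXICAL or :SPECIAL for NAME. An unknown NAME is reported with a
style-warning and then treated as special."
  (cond ((member name lexicals) :lexical)
        ((member name specials) :special)
        (t (warn 'undefined-variable-reference :name name)
           :special)))
```

With the example above, `(classify-variable 'typo-grapher '(typographer) '())` signals the style-warning and returns `:special`, while `typographer` itself is classified as `:lexical`.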

What about a lexical variable with references deleted after optimization?

We probably want to handle this differently from the case where the variable was unreferenced initially, because here it is the compiler that has made the variable unused. Various triggers will exist in BIR to delete unused code, and in particular the deletion trigger on the readvar instruction will check whether the lexical variable it referenced is no longer read anywhere. That trigger can emit a compiler note saying that the lexical variable has been optimized away and deleted. On high debug, we can avoid deleting the lexical variable and still emit the DWARF, or whatever lower-level debug information is associated with it. This way we leave a cookie for tools like SLIME to figure out what happened to the lexical variable. This is already much better than in HIR, where there was simply no concept of a source-level lexical variable.
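A deletion trigger of this kind could be sketched as follows. The structures `lexvar` and `readvar`, the variable `*preserve-variables*`, and the note format are all assumptions made for the example; the real BIR classes and debug policy will differ.

```lisp
;; Toy stand-ins; none of these names are taken from BIR.
;; These structures are circular: bind *PRINT-CIRCLE* to T before printing.
(defstruct lexvar name (readers '()))
(defstruct readvar variable)

(defvar *notes* '() "Accumulated notes, newest first.")
(defvar *preserve-variables* nil
  "When true (for example under high DEBUG), keep variables with no readers.")

(defun delete-readvar (inst)
  "Remove INST from its variable's readers. If that was the last read,
record a note. Returns :STILL-USED, :KEPT-FOR-DEBUG, or :DELETED."
  (let ((var (readvar-variable inst)))
    (setf (lexvar-readers var) (remove inst (lexvar-readers var)))
    (cond ((lexvar-readers var) :still-used)
          (*preserve-variables*
           ;; Keep the variable so debug information can still describe it.
           (push (list :variable-unread (lexvar-name var)) *notes*)
           :kept-for-debug)
          (t
           (push (list :variable-deleted (lexvar-name var)) *notes*)
           :deleted))))
```

The note is recorded only when the last reader disappears, which distinguishes this case from a variable that never had any readers in the initial BIR.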

Type conflicts, deletion notes

These are all accumulated during the optimization phases. When data start having asserted types that conflict, we accumulate the corresponding warnings, style-warnings, or notes, to be finalized somehow at the end of optimization. We are deciding to no longer GC code, so that we can always explicitly track what is being deleted and where. It would be nice to provide more detail on unreachable code. For example, we will probably still GC at least some kinds of unreachable code, which then never make it into the final IR. As long as we keep track of all converted code coming out of AST-to-BIR, we can at least run over this unreachable code, emit unreachable-code notes, and let SLIME or the debugging environment know which source forms are unreachable.
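The bookkeeping for unreachable code can be thought of as a set difference between what was converted and what survives. The sketch below assumes that each piece of converted code carries a source origin; here origins are plain forms compared with EQUAL purely for illustration, and `unreachable-origins` is a hypothetical helper.

```lisp
(defun unreachable-origins (converted reachable)
  "CONVERTED lists the source origins recorded as code left AST-to-BIR.
REACHABLE lists the origins still present in the final IR.
Returns the origins that were converted but did not survive, in the
order they were converted."
  (let ((live (make-hash-table :test #'equal)))
    (dolist (origin reachable)
      (setf (gethash origin live) t))
    (remove-if (lambda (origin) (gethash origin live)) converted)))
```

Each origin in the result could then become one unreachable-code note for SLIME or another debugging environment to display.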

Optimization notes

These are a bit harder to do in Cleavir, since they are more client-specific. This is probably where we allow clients to hook into the note system, so that it can be used in their lower levels when translating out of the client-independent HIR. For example, we could have Clasp emit warnings for cases where a full unwind is needed, since in Clasp that is a particularly slow operation.

We could have a note saying something like: “SJLJ optimization failed, using full unwind”. This would greatly help users figure out at compile time where the astronomically slow parts of their code are, so they can rewrite them.
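One possible shape for such a client hook is a generic function specialized on the client, in the usual Cleavir style of passing a client object. Everything in this sketch is an assumption: `report-note`, `example-client`, and the `:full-unwind` kind are invented names, not an existing protocol.

```lisp
;; Hypothetical protocol; not an existing Cleavir or Clasp interface.
(defgeneric report-note (client kind origin stream)
  (:documentation
   "Write a note of KIND about ORIGIN to STREAM on behalf of CLIENT.
Returns true if a note was written."))

;; Default method: a client that does not specialize a kind reports nothing.
(defmethod report-note (client kind origin stream)
  (declare (ignore client kind origin stream))
  nil)

(defclass example-client () ())   ; stands in for a real client class

(defmethod report-note ((client example-client) (kind (eql :full-unwind))
                        origin stream)
  (format stream "~&; note: using full unwind in ~s~%" origin)
  t)
```

A client that cares about a particular slow path adds a method for that kind, and all other clients fall through to the silent default.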

In general, we might want to note and trace all optimization passes to see exactly what gets digested. Being able to see some kind of source-to-source transformation, or at least a localized version of one, would be extremely helpful to performance-minded users, and it would be easy to integrate into SLIME. Also see what the Python compiler (the CMUCL and SBCL compiler, not the programming language) does at (optimize (SPEED 3)) to get an idea of what kind of optimization notes people might want.