diff --git a/src/evaluation/evaluation.pdf b/src/evaluation/evaluation.pdf
index 7cf7e2d..32d4c82 100644
Binary files a/src/evaluation/evaluation.pdf and b/src/evaluation/evaluation.pdf differ
diff --git a/src/evaluation/evaluation.tex b/src/evaluation/evaluation.tex
index ee35267..7401aa4 100644
--- a/src/evaluation/evaluation.tex
+++ b/src/evaluation/evaluation.tex
@@ -4,6 +4,7 @@
 \ourchapter{Evaluation}\label{sec:evaluation}
 
+% TODO: signpost that the main contribution is left-recursion transformation so this is the most important thing to evaluate!!
 The project was initially intended to focus mostly on implementing a variety of parser lints; however, implementing these rules revealed an unanticipated level of complexity in the infrastructure needed to support them.
 This ended up necessitating far more work on the intermediate \scala{Parser} and \scala{Expr} \textsc{ast}s than expected, which has led to robust internal machinery that allows \texttt{parsley-garnish} to be easily extended with more domain-specific rules in the future.
 However, the time spent on this infrastructure work meant that the set of implemented linting rules was not as comprehensive as initially desired.
@@ -17,17 +18,44 @@
 \subsection{Performance of Expression Normalisation}\label{sec:eval-exprs}
 % TODO: move this to evaluation chapter?
 % TODO: tbh could've just stopped at HOAS, but in our case, NbE was an easily attainable logical next step to more aggressively use meta-language constructs
 Originally, the implementation of \scala{Expr} used a named approach with Barendregt's convention, generating fresh variable names using an atomic counter.
-However, this required an extra $\alpha$-conversion pass to clean up variable names before pretty-printing the expression term, since fresh name generation resulted in undesirably long names like \scala{x4657476}.
+Normalisation proceeded by manually substituting variable objects following the $\beta$-reduction rules.
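As an illustration of this original scheme, a minimal sketch in Scala might look as follows. This is a hypothetical reconstruction, not \texttt{parsley-garnish}'s actual \scala{Expr} implementation: the names \scala{subst}, \scala{fresh}, and \scala{betaStep} are invented for this example.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical sketch of the named approach with a simple lambda-calculus AST;
// not parsley-garnish's actual Expr implementation.
sealed trait Expr
case class Var(name: String) extends Expr
case class Lam(param: String, body: Expr) extends Expr
case class App(fun: Expr, arg: Expr) extends Expr

val counter = new AtomicInteger(0)
def fresh(): String = s"x${counter.incrementAndGet()}" // names grow like x4657476 over time

// Capture-avoiding substitution: each binder is renamed to a fresh name
// (Barendregt's convention), so every pass copies and rebuilds the term.
def subst(e: Expr, name: String, replacement: Expr): Expr = e match {
  case Var(n)    => if (n == name) replacement else e
  case App(f, a) => App(subst(f, name, replacement), subst(a, name, replacement))
  case Lam(p, body) =>
    if (p == name) e // `name` is shadowed here; nothing free to substitute
    else {
      val p2 = fresh()
      Lam(p2, subst(subst(body, p, Var(p2)), name, replacement))
    }
}

// A single beta-reduction step at the root, if one applies.
def betaStep(e: Expr): Option[Expr] = e match {
  case App(Lam(p, body), arg) => Some(subst(body, p, arg))
  case _                      => None
}
```

Note that every call to \scala{subst} allocates a fresh copy of the traversed term, and the freshly generated names are what necessitate the clean-up pass before pretty-printing.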
+However, this required an extra $\alpha$-conversion pass to clean up variable names before pretty-printing the expression, as fresh name generation resulted in undesirably long names like \scala{x4657476}.
 The implementation was then switched to a \textsc{hoas} approach, which made the correctness of the normalisation process easier to reason about.
 However, this lost the ability to easily show $\alpha$-equivalence of terms, so the next logical step was to implement the final two-tiered \textsc{nbe} approach, where a syntactic representation allows easy $\alpha$-equivalence checking and a semantic \textsc{hoas}-based representation is used for normalisation.
+\Cref{fig:benchmark} shows the results of benchmarking the expression normalisation process on five tasks.
+The benchmarks were compiled with Scala 2.13.14 and run on an AdoptOpenJDK \textsc{jvm}, version 8.
+\begin{itemize}
+  \item \emph{parser} normalises a fairly standard parser term generated during the left-recursion transformation. This $\lambda$-expression has a maximum depth of 10 nested terms and contains 39 variables in total.
+  \item \emph{norm tree} normalises a $\lambda$-expression encoding a binary tree in the $\lambda$-calculus. This has a depth of 23 and contains 1,117 variables.
+  \item \emph{sum tree} normalises a $\lambda$-expression encoding a sum fold over the same binary tree, using Church numerals to represent integers. This has a depth of 25 and contains 1,144 variables.
+  \item \emph{eq 20} normalises a $\lambda$-expression comparing two differently built Church numerals representing the integer 20 for equality. This has a depth of 27 and contains 326 variables.
+  \item \emph{eq 40} is the same as \emph{eq 20}, but for the integer 40. This also has a depth of 27, and contains 539 variables.
+\end{itemize}
+
 \begin{figure}[htbp]
-\includesvg{assets/benchmark.svg}
-\caption{Benchmark results for the expression normalisation process.}
-\label{fig:benchmark}
+  \includesvg{assets/benchmark.svg}
+  \caption{Benchmark results for the expression normalisation process on five tasks.}
+  \label{fig:benchmark}
 \end{figure}
+Note that the benchmark results are visualised on a log scale: \textsc{nbe} and \textsc{hoas} are orders of magnitude faster than the named substitution approach.
+As expected, \textsc{nbe} and \textsc{hoas} are very close in performance, as both use Scala's native variable bindings.
+In all cases, \textsc{nbe} is slightly faster than \textsc{hoas}, which is also expected, as it incurs slightly less overhead in creating new \scala{Expr} objects.
+
+In the \emph{parser} task, which is representative of the standard use case for the normalisation process, the substitution approach is still relatively performant, taking only 0.5ms to normalise the term.
+This shows that even if \texttt{parsley-garnish} had remained with the original approach, its performance would still have been acceptable.
+However, the other approaches outperform it by 50x even in this simple case.
+For more complex tasks, the gap widens to as much as 6000x, as in the \emph{eq 40} task.
+The benchmark results make it evident that the cause is not the total number of variables: the \emph{eq} tasks contain fewer variables than the \emph{tree} tasks, yet perform worse.
+Maximum depth is also an unlikely explanation -- rather, the size of the individual terms being substituted is likely the main factor in the deteriorating performance of the named substitution approach.
+Each substitution requires copying and re-constructing term objects, which adds up to a significant amount of overhead.
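The difference between the two strategies can be sketched concretely. The following is a hypothetical two-tiered \textsc{nbe} normaliser in Scala; the names are invented, and the syntactic tier here uses de Bruijn indices (which need not match \texttt{parsley-garnish}'s actual representation) so that $\alpha$-equivalence is plain structural equality.

```scala
// Hypothetical two-tiered NbE sketch; de Bruijn indices are used for the
// syntactic tier, which need not match parsley-garnish's actual representation.
sealed trait Expr
case class Var(idx: Int) extends Expr // de Bruijn index: alpha-equivalence is ==
case class Lam(body: Expr) extends Expr
case class App(fun: Expr, arg: Expr) extends Expr

// Semantic tier (HOAS): binders are native Scala functions, so beta-reduction
// is just Scala function application, with no term copying.
sealed trait Sem
case class SLam(f: Sem => Sem) extends Sem
case class SNeu(level: Int, spine: List[Sem]) extends Sem // stuck variable applied to arguments

def eval(e: Expr, env: List[Sem]): Sem = e match {
  case Var(i) => env(i)
  case Lam(b) => SLam(x => eval(b, x :: env))
  case App(f, a) => eval(f, env) match {
    case SLam(g)         => g(eval(a, env)) // native function application
    case SNeu(lv, spine) => SNeu(lv, spine :+ eval(a, env))
  }
}

// Quotation reads a semantic value back into the syntactic tier.
def quote(v: Sem, depth: Int): Expr = v match {
  case SLam(f) => Lam(quote(f(SNeu(depth, Nil)), depth + 1))
  case SNeu(lv, spine) =>
    spine.foldLeft(Var(depth - lv - 1): Expr)((acc, a) => App(acc, quote(a, depth)))
}

def normalise(e: Expr): Expr = quote(eval(e, Nil), 0)
```

For instance, with Church numerals encoded as above, normalising the application of Church addition to two copies of the numeral one yields a term structurally equal to the numeral two, so the equality check after normalisation is a plain \scala{==} comparison.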
+Contrast this with the \textsc{nbe} and \textsc{hoas} approaches, whose performance does not degrade at the same rate, as they mitigate this overhead by using Scala's native function application.
+
+\paragraph{Summary}
+The performance benchmarks show that the final implementation of expression normalisation is extremely performant, even for unrealistically large $\lambda$-terms.
+Although the original named substitution approach would likely have been sufficient for the standard use case of normalising parser terms, the low operational cost of \textsc{nbe} allows expression normalisation to be used frequently and liberally throughout the system, without worrying about performance bottlenecks.
 
 \section{Removing Left-Recursion}\label{sec:eval-leftrec}
 Broadly speaking, left-recursive grammars can be classified into three categories: direct, indirect, and hidden.
@@ -247,7 +275,7 @@ \subsection{Hidden Left-Recursion}
 This can be seen in the error message, although in this context it is not particularly helpful, as it leaks how \texttt{parsley-garnish} has desugared the parser while attempting to unfold it -- for example, the \scala{some} results from unfolding the \scala{many} combinator.
 Future work could improve the error messages to make them more helpful and informative in the case of hidden left-recursion.
 
-\subsection*{Summary}
+\paragraph{Summary}
 The evaluation in this \namecref{sec:eval-leftrec} shows that \texttt{parsley-garnish} is able to detect and handle all forms of left-recursion, although the quality of the output varies depending on the nature of the left-recursion.
 Revisiting the evaluation criteria:
 \begin{itemize}