In Depth

What are the approaches that you implemented in Cinje?

The final version of cinje (proper use does not capitalize the name unless at the start of a sentence) was actually the fifth complete iteration. Four other overall approaches were tried. Because it’s a template engine there are two main aspects of its operation that each require their own distinct approaches:

Template parsing / translation into Python code.
Execution of the template to generate page content.

On the first, the other approaches tried included: classical grammar-based parser/generator, regular expression string replacement, and buffering iterative string translation, before settling on a reentrant streaming approach. This generates the code that is eventually executed during application runtime.

Parser/generators are extremely complex, with formal structures and organization that require effectively implementing the engine as a new programming language whose interpreter happens to be written in Python. This is exceptionally sub-optimal from the perspectives of performance (using an interpreted language to write an interpreter is inherantly slow) and simplicity (you need to replicate the logic of Python yourself), though it does give Jinja2 one feature that cinje will never compete with: sandbox isolation. The cost is an additional 4000 to 6000 lines of code in the engine, all of which needs testing. There are no parser/generator engines smaller than 4000 lines of code or so.

Regular expression string replacement is conceptually simple—you define input and output patterns and just let it go through the template rewriting as it goes—but unwieldly to manage and difficult to encapsulate the idea of context or scope into, such as template text stored within a nested “template function”. The code generation needs to be aware of the surrounding context (in order to make the right decisions on when to emit buffer construction code, when to flush, etc.) and the internal complexity of tracking that state grew faster than the features of the engine did. Additionally, regular expressions are their own interpreted micro-language, and thus slower than native string manipulation. Some regular expression engines accomplish their duties in as few as 1200 lines of code.

Buffering iterative string translation is a basic translation of regular expressions into standard string manipulation functions with the only benefit being the elimination of regular expression overhead.

Reentrant streaming, where the object being iterated upon iterates other objects which may, in turn, continue iterating the original object, and so forth, down through structured levels of scope, was more of a challenge or experiment in purity; to apply the streaming approach everywhere in the engine, not just at the time the templates are finally run to generate content. It also happened to be the simplest, structurally, with absolutely extreme levels of separation of concern. The code to translate a function, for example, only handles the function definition itself, then re-enters the stream to continue processing the function’s code, and is able to observe (and act) during the start and end of its nested scope.

On the second, alternate approaches included some particularly strange applications of Python langauge features, such as the use of objects representing DOM elements and array notation to denote child relationships. For example:

html [ body [ p [ “Hello.” ] ] ]

That’s 100% valid Python, which is weird enough, representing an HTML page. A second approach, bare yielding, returned to processing the text generated by the template as plain, dumb text, but suffered from yielding too frequently, requiring additional buffering and boilerplate by users of the templates. The final version relies on only a single feature of the languge: lists, or specificaly append, extend, and join operations. Instead of yielding each chunk, it extends the buffer with the chunks available in each step of processing, finally joining the list together to return the result.

Beyond those two main aspects of processing in the engine, there was also the overall approach for using the engine that had to be considered. Typical use of an engine looks something like:

from engine import Template

data = dict()
template = Template(open(‘file.tmpl’, ‘r’))
result = template.render(**data)

Opening and rendering can be abbreviated on one line, but this is code required on each use of any template; boilerplate that I do not find acceptable. DRY (Don’t Repeat Yourself) required I find a better solution. To resolve this, I “borrowed” a technique I had previously explored but never resulted in a usable product: unicode encoding. This is another virtually unique application of Python to template engines, which many users upon initial inspection find abhorrent (I use such a strong word intentionally; users can react almost violently towards unexpected choices) for generally asthetic reasons, not technical.

When a source file is loaded by Python, the first line or two is checked for a “coding” declaration, such as “encoding: utf-8” or “encoding: iso-8859-1”. This is generally used to support the use of unicode strings within the source file, and is implemented using a unfiied object protocol with plugins providing different encodings. Cinje implements this protocol. While no other template engine could be found that takes this approach, there are a small number of “utility” encodings available that aren’t true Unicode encodings, such as ‘base64’, ‘gzip', or ‘hex’. Instead of transforming a byte stream into a Unicode text stream, cinje transforms template source into plain Python source, and this process is transparent, automatic, and happens during module import.

As a side-effect of inserting itself into Python’s own code loading machinery, cinje did not have to reimplement a syntax tree generator, bytecode cache, or even error handling mechanisms. It instead simply relies on the Python runtime to provide those in their most optimized forms. This also allows cinje to benefit from language-level runtime differences between CPython (the core language runtime) and Pypy (a Python interpreter featuring Just-In-Time compilation), for example.

And where did you take them (from which engine)?

The overall syntax of cinje is a very, very close match to pyTenjin (using several optional extensions), which was the principle inspiration for the whole package. Additionally, the author of Tenjin, having implemented the engine across quite a large number of langauges (such as PHP and Perl) has also given presentations on the most efficient template execution processes possible using each language, serving as inspiration for the list append/extend/join final approach. Unfortunately, several code practices within Tenjin preclude its use, such as module compaction (one .py file, multiple importable modules), and stack frame manipulation.

Grammar-based parser/generator is the approach taken by a large majority of engines; inspiration would have been from Mako (which we used), Jinja2, and Django’s Template package. Regular expression string replacement is the approach taken by a majority of text markup systems such as bbcode, markdown, or textile. Buffering iterative string tranlsation was an aborted attempt to simplify regular expressions out of the system due to inherant performance issues. After being dissatisfied with the simple replacement of regular expressions with native string manipulation all existing approaches were dismissed as unnessicarily complex. The result, reentrant streaming generation, is an approach not a single other template engine in Python uses; it is a seemingly unique new development.

On the output front, 99.9% of alternate engines fully buffer output. Of the one competing package that offers streaming—Jinja2—only implements “bare yield” use, requiring the additional boilerplate to re-add controllable buffering.

After implementing them, what experiments or tests did you perform to validate the optimization of each approach?

Template translation generally revolved around minimizing effort to use (boilerplate) and increased readability of the generated code. This involved writing or porting over a large number of templates of various complexity from Mako to the work-in-progress cinje, then comparing the resulting generated code, generation time, function call counts, and other statistics between the two engines processing the “same” template. It was very much a process of iterative development and refinement to identify dead-end approaches and points of optimization as quickly as possible, before they became cemented in the code. Once a baseline was measured, each subsequent modification, change, or refactoring could be rapidly compared to earlier versions, to see how performance, use, and generated code compared.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

In Depth

Clone this wiki locally