StencilStream Version 2.0.0

@JOOpdenhoevel JOOpdenhoevel released this 06 Jul 13:37

We are very excited to bring you the new version 2.0.0 of StencilStream, the Generic Stencil Simulation Library for FPGAs!

For this release, we have fundamentally changed the way StencilStream works internally, which allows for simulation grids of arbitrary size and better scaling for smaller grids. Let's go into the details:

What's new?

Architecture

StencilStream now uses a spatial tiling approach introduced by Hamid Reza Zohouri, Artur Podobas and Satoshi Matsuoka that partitions a dynamically sized grid into statically sized tiles which can be better handled by the processing pipeline.
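To illustrate the idea, here is a minimal sketch of how a grid whose size is only known at runtime could be split into tiles with a compile-time maximum size. The names `TileRange` and `partition_grid` are hypothetical and not part of StencilStream, and the sketch ignores any overlap between neighboring tiles that a real implementation may need.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type, not part of the StencilStream API.
// Describes one tile as half-open column and row ranges.
struct TileRange {
    std::size_t c_begin, c_end; // columns [c_begin, c_end)
    std::size_t r_begin, r_end; // rows [r_begin, r_end)
};

// Splits a dynamically sized grid into tiles of at most
// tile_width x tile_height cells. Tiles on the right and bottom
// border are clipped to the grid size.
template <std::size_t tile_width, std::size_t tile_height>
std::vector<TileRange> partition_grid(std::size_t grid_width, std::size_t grid_height) {
    static_assert(tile_width > 0 && tile_height > 0, "tiles must not be empty");
    std::vector<TileRange> tiles;
    for (std::size_t c = 0; c < grid_width; c += tile_width) {
        for (std::size_t r = 0; r < grid_height; r += tile_height) {
            tiles.push_back({c, std::min(c + tile_width, grid_width),
                             r, std::min(r + tile_height, grid_height)});
        }
    }
    return tiles;
}
```

With 4x4 tiles, for example, a grid of 10 columns and 6 rows is covered by six tiles, and the tiles in the last column and row are smaller than the maximum tile size.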

Defined Grid Halo

This also allows for a new way to handle the grid halo, i.e. the cells outside the grid that are required to calculate the cells on the grid's edge. In the previous version, these cells were undefined and transition functions had to use the cell indices to check for edge cases. Now, the user can provide a constant value to StencilStream and the pipeline guarantees that all cells in the grid halo will have this value. Old transition functions might still work, but their complexity can be vastly reduced using this precondition.

For example, you would have needed to write a transition function like this in v1.1.1 in order to sum up the neighbors of a cell:

auto trans_func = [grid_width, grid_height](Stencil2D<float, 1> const &stencil, Stencil2DInfo const &info) {
    // Only add neighbors that lie inside the grid; halo cells are undefined in v1.1.1.
    float sum = 0;
    if (info.center_cell_id.c > 0) {
        sum += stencil[ID(-1, 0)];
    }
    if (info.center_cell_id.c < grid_width - 1) {
        sum += stencil[ID(1, 0)];
    }
    if (info.center_cell_id.r > 0) {
        sum += stencil[ID(0, -1)];
    }
    if (info.center_cell_id.r < grid_height - 1) {
        sum += stencil[ID(0, 1)];
    }
    return sum;
};

Now, you can set the halo value to 0.0 and simply write:

auto trans_func = [](Stencil<float, 1> const &stencil) {
    return stencil[ID(-1, 0)] + stencil[ID(1, 0)] + stencil[ID(0, -1)] + stencil[ID(0, 1)];
};

Edge cases are automatically handled by StencilStream.
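To see why a defined halo value removes the edge checks, consider the following plain C++ sketch. It is only an illustration under our own assumptions: `HaloGrid` and `neighbor_sum` are hypothetical names and do not reflect how StencilStream stores or streams cells internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a grid with a defined halo; not the StencilStream API.
class HaloGrid {
  public:
    HaloGrid(std::ptrdiff_t width, std::ptrdiff_t height, float halo_value)
        : width_(width), height_(height), halo_value_(halo_value),
          cells_(static_cast<std::size_t>(width * height), 0.0f) {
        assert(width > 0 && height > 0);
    }

    // Writes a cell; the coordinates must lie inside the grid.
    void set(std::ptrdiff_t c, std::ptrdiff_t r, float value) {
        assert(inside(c, r));
        cells_[static_cast<std::size_t>(c * height_ + r)] = value;
    }

    // Returns the stored cell inside the grid and the halo value outside of it.
    float get(std::ptrdiff_t c, std::ptrdiff_t r) const {
        if (!inside(c, r)) {
            return halo_value_;
        }
        return cells_[static_cast<std::size_t>(c * height_ + r)];
    }

  private:
    bool inside(std::ptrdiff_t c, std::ptrdiff_t r) const {
        return c >= 0 && r >= 0 && c < width_ && r < height_;
    }

    std::ptrdiff_t width_, height_;
    float halo_value_;
    std::vector<float> cells_;
};

// Sums the four direct neighbors without any edge checks,
// because every out-of-grid access yields the halo value.
float neighbor_sum(HaloGrid const &grid, std::ptrdiff_t c, std::ptrdiff_t r) {
    return grid.get(c - 1, r) + grid.get(c + 1, r) + grid.get(c, r - 1) + grid.get(c, r + 1);
}
```

With a halo value of 0.0 and a 3x3 grid filled with ones, `neighbor_sum` yields 4 for the center cell and 2 for a corner cell, which matches the result of the index-checking version.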

Pipeline Length as a Template Parameter

The previous version of StencilStream used preprocessor macros to duplicate the execution stages of a pipeline. This came with the limitations that the pipeline length was capped at 1024 stages and that the length also had to be set via a macro definition. In version 2.0.0, we have overcome these limitations: the pipeline length of a design is now set as a template parameter of the StencilExecutor class.
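The general technique can be sketched in a few lines of standard C++. This is not the StencilExecutor implementation; `run_pipeline` is a hypothetical function that only shows how a compile-time length can replace macro-based stage duplication.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch, not the StencilStream implementation.
// Applies `stage` to `value` exactly `pipeline_length` times. Since the
// length is a template parameter, the loop bound is a compile-time
// constant and the compiler is free to unroll the loop into separate stages.
template <std::size_t pipeline_length, typename Stage>
float run_pipeline(float value, Stage stage) {
    for (std::size_t i = 0; i < pipeline_length; i++) {
        value = stage(value);
    }
    return value;
}
```

Calling `run_pipeline<4>(0.0f, add_one)` with a stage that adds one to its input returns 4. Different pipeline lengths can coexist in one program, which a single macro definition would not allow.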

Breaking Changes

This release also brings some breaking changes to the user-facing interface to reduce verbosity and improve clarity:

  • StencilStream now uses StencilStream as the default include directory name, and a one-file-per-class policy has been adopted where suitable. For example, the include line for the StencilExecutor class is #include <StencilStream/StencilExecutor.hpp> instead of #include <stencil/stencil.hpp>.
  • The Stencil2D class has been renamed to Stencil.
  • The Stencil2DInfo class has been merged into Stencil; transition functions now only accept a Stencil instance as a parameter.
  • The StencilExecutor class has been completely rewritten.

More information

More information on how StencilStream is structured and how the interface is designed can be found in the documentation. It is both hosted online and attached as a tarball.

What's next?

This version marks the introduction of the spatial tiling architecture. Until now, we have focused on correctness and clarity and treated performance as a secondary concern. In subsequent releases, we will profile and improve the performance of StencilStream and also provide optimization guides so that users can achieve the full potential of their applications.

Your feedback is always welcome! Please submit an issue if you find a bug or have a feature request.