Commit
[Docs] Update manually written docs
auphelia committed Mar 25, 2024
1 parent 85baad0 commit 7b50f16
Showing 4 changed files with 37 additions and 75 deletions.
31 changes: 12 additions & 19 deletions docs/finn/developers.rst
@@ -10,7 +10,7 @@ Power users may also find this information useful.
Prerequisites
================

Before starting to do development on FINN it's a good idea to start
Before starting to do development on FINN it is a good idea to start
with understanding the basics as a user. Going through all of the
:ref:`tutorials` is strongly recommended if you haven't already done so.
Additionally, please review the documentation available on :ref:`internals`.
@@ -61,7 +61,7 @@ further detailed below:
Docker images
===============

If you want to add new dependencies (packages, repos) to FINN it's
If you want to add new dependencies (packages, repos) to FINN it is
important to understand how we handle this in Docker.

The finn.dev image is built and launched as follows:
@@ -70,7 +70,7 @@ The finn.dev image is built and launched as follows:

2. run-docker.sh launches the build of the Docker image with `docker build` (unless ``FINN_DOCKER_PREBUILT=1``). The Docker image is built from docker/Dockerfile.finn using the following steps:

* Base: PyTorch dev image
* Base: Ubuntu 22.04 LTS image
* Set up apt dependencies: apt-get install a few packages for verilator and
* Set up pip dependencies: Python packages FINN depends on are listed in requirements.txt, which is copied into the container and pip-installed. Some additional packages (such as Jupyter and Netron) are also installed.
* Install XRT deps, if needed: For Vitis builds we need to install the extra dependencies for XRT. This is only triggered if the image is built with the INSTALL_XRT_DEPS=1 argument.
@@ -84,9 +84,9 @@ The finn.dev image is built and launched as follows:

4. Upon launching the container, the entrypoint script (docker/finn_entrypoint.sh) performs the following:

* Source Vivado settings64.sh from specified path to make vivado and vivado_hls available.
* Download PYNQ board files into the finn root directory, unless they already exist.
* Source Vitits settings64.sh if Vitis is mounted.
* Source Vivado settings64.sh from specified path to make vivado and vitis_hls available.
* Download board files into the finn root directory, unless they already exist or ``FINN_SKIP_BOARD_FILES=1``.
* Source Vitis settings64.sh if Vitis is mounted.

5. Depending on the arguments to run-docker.sh a different application is launched. run-docker.sh notebook launches a Jupyter server for the tutorials, whereas run-docker.sh build_custom and run-docker.sh build_dataflow trigger a dataflow build (see documentation). Running without arguments yields an interactive shell. See run-docker.sh for other options.
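
The same launch can be scripted; a minimal sketch in Python, assuming run-docker.sh sits in the current directory and using the environment variables described in the steps above::

  import os
  import subprocess

  # Reuse a prebuilt finn.dev image and skip the board-file download,
  # then start the Jupyter notebook server (one of the documented modes).
  env = dict(os.environ, FINN_DOCKER_PREBUILT="1", FINN_SKIP_BOARD_FILES="1")
  subprocess.run(["bash", "./run-docker.sh", "notebook"], env=env, check=True)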

@@ -106,7 +106,7 @@ Linting
We use a pre-commit hook to auto-format Python code and check for issues.
See https://pre-commit.com/ for installation. Once you have pre-commit, you can install
the hooks into your local clone of the FINN repo.
It's recommended to do this **on the host** and not inside the Docker container:
It is recommended to do this **on the host** and not inside the Docker container:

::

@@ -119,7 +119,7 @@ you may have to fix it manually, then run `git commit` once again.
The checks are configured in .pre-commit-config.yaml under the repo root.
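
The same setup can be scripted; a minimal sketch, assuming ``pre-commit`` is already installed on the host::

  import subprocess

  # Install the hooks into the local clone, then run every check once
  # over the whole tree (both are standard pre-commit subcommands).
  subprocess.run(["pre-commit", "install"], check=True)
  subprocess.run(["pre-commit", "run", "--all-files"], check=True)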

Testing
=======
========

Tests are vital to keep FINN running. All the FINN tests can be found at https://github.com/Xilinx/finn/tree/main/tests.
These tests can be roughly grouped into three categories:
@@ -132,7 +132,7 @@ These tests can be roughly grouped into three categories:

Additionally, qonnx, brevitas and finn-hlslib also include their own test suites.
The full FINN compiler test suite
(which will take several hours to run and require a PYNQ board) can be executed
(which will take several hours to run) can be executed
by:

::
@@ -146,7 +146,7 @@ requiring Vivado or as slow-running tests:

bash ./run-docker.sh quicktest

When developing a new feature it's useful to be able to run just a single test,
When developing a new feature it is useful to be able to run just a single test,
or a group of tests that e.g. share the same prefix.
You can do this inside the Docker container
from the FINN root directory as follows:
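
A sketch of such an invocation (the test name is an assumption): pytest's ``-k`` option selects a single test, or a group of tests sharing a prefix, by substring match::

  import pytest

  # Run every test whose name contains this prefix, verbosely.
  pytest.main(["-k", "test_brevitas_export", "-v"])
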
@@ -178,16 +178,9 @@ FINN provides two types of documentation:
* manually written documentation, like this page
* autogenerated API docs from Sphinx

Everything is built using Sphinx, which is installed into the finn.dev
Docker image. You can build the documentation locally by running the following
inside the container:

::

python setup.py docs
Everything is built using Sphinx.

You can view the generated documentation on build/html/index.html.
The documentation is also built online by readthedocs:
The documentation is built online by readthedocs:

* finn.readthedocs.io contains the docs from the master branch
* finn-dev.readthedocs.io contains the docs from the dev branch
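
The documentation can still be built locally with Sphinx; a sketch, assuming the RST sources live under docs/finn as in this repository::

  from sphinx.cmd.build import build_main

  # Build the HTML docs; open build/html/index.html afterwards.
  build_main(["-b", "html", "docs/finn", "build/html"])
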
21 changes: 1 addition & 20 deletions docs/finn/faq.rst
@@ -7,16 +7,6 @@ Frequently Asked Questions
Can't find the answer to your question here? Check `FINN GitHub Discussions <https://github.com/Xilinx/finn/discussions>`_.


Can I install FINN out of the Docker container?
We do not support out of the Docker implementations at the moment. This is due
to the high complexity of the FINN project dependencies.

Since FINN uses ONNX, can I compile any model from the ONNX Model Zoo to an FPGA accelerator?
The short answer is no. FINN uses ONNX in a specific (non-standard) way, including custom layer
types and quantization annotations. Networks must be first quantized using Brevitas and exported
to FINN-ONNX to be converted to FPGA accelerators.


Can I install FINN out of the Docker container?
We do not support installations outside of the Docker container at the moment. This is due
to the high complexity of the FINN project dependencies.
@@ -52,7 +42,6 @@ What operating systems are supported by FINN?
FINN should work fine under any Linux-based OS capable of running Vivado/Vitis, as long
as you install Docker (``docker-ce``) on your machine.


I am getting DocNav and Model_Composer errors when launching the Docker image.
We do not mount those particular directories into the Docker container because they are not
used. The errors are Vivado related but you can safely ignore them.
@@ -74,16 +63,8 @@ How can I target an arbitrary Xilinx FPGA without PYNQ support?
Why do FINN-generated architectures need FIFOs between layers?
See https://github.com/Xilinx/finn/discussions/383

How do I tell FINN to utilize DSPs instead of LUTs for MAC operations in particular layers?
This is done with the ``resType="dsp"`` attribute on ``MatrixVectorActivation`` and ``Vector_Vector_Activate`` instances.
When using the ``build_dataflow`` system, this can be specified at a per layer basis by specifying it as part of one or more layers’
folding config (:py:mod:`finn.builder.build_dataflow_config.DataflowBuildConfig.folding_config_file`).
This is a good idea for layers with more weight/input act bits and high PE*SIMD.
See the `MobileNet-v1 build config for ZCU104 in finn-examples <https://github.com/Xilinx/finn-examples/blob/main/build/mobilenet-v1/folding_config/ZCU104_folding_config.json#L15>`_ for reference.


How do I tell FINN to utilize a particular type of memory resource in particular layers?
This is done with the ``ram_style`` attribute. Check the particular ``HLSCustomOp`` attribute definition to see
This is done with the ``ram_style`` attribute. Check the particular ``HWCustomOp`` attribute definition to see
which modes are supported (`example for MatrixVectorActivation <https://github.com/Xilinx/finn/blob/dev/src/finn/custom_op/fpgadataflow/matrixvectoractivation.py#L101>`_).
When using the ``build_dataflow`` system, this can be specified on a per-layer basis as part of one or more layers’
folding config (:py:mod:`finn.builder.build_dataflow_config.DataflowBuildConfig.folding_config_file`).
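
A sketch of such a folding config (the layer name and values are assumptions); the same per-layer entry can carry ``PE``/``SIMD``, ``resType`` and ``ram_style``::

  import json

  folding = {
      "Defaults": {},
      "MatrixVectorActivation_0": {  # hypothetical layer name
          "PE": 4,
          "SIMD": 8,
          "resType": "dsp",      # map this layer's MACs to DSPs
          "ram_style": "block",  # weight memory resource type
      },
  }
  with open("folding_config.json", "w") as f:
      json.dump(folding, f, indent=2)
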
Binary file modified docs/finn/img/repo-structure.png
60 changes: 24 additions & 36 deletions docs/finn/internals.rst
@@ -27,8 +27,6 @@ Custom Operations/Nodes

FINN uses many custom operations (op_type in ONNX NodeProto) that are not defined in the ONNX operator schema. These custom nodes are marked with domain="finn.*" or domain="qonnx.*" in the protobuf to identify them as such. These nodes can represent specific operations that we need for low-bit networks, or operations that are specific to a particular hardware backend. To get more familiar with custom operations and how they are created, please take a look in the Jupyter notebook about CustomOps (see chapter :ref:`tutorials` for details) or directly in the module :py:mod:`finn.custom_op`.
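
As a short sketch (the model file name is a placeholder), a custom node can be wrapped in its CustomOp class to inspect and modify its attributes::

  from qonnx.core.modelwrapper import ModelWrapper
  from qonnx.custom_op.registry import getCustomOp

  model = ModelWrapper("finn_model.onnx")  # placeholder file name
  node = model.graph.node[0]               # assumed to be a custom op
  inst = getCustomOp(node)                 # wrapper class for this op_type
  print(inst.get_nodeattr_types())         # supported attributes and defaults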

.. note:: See the description of `this PR <https://github.com/Xilinx/finn-base/pull/6>`_ for more on how the operator wrapper library is organized.

Custom ONNX Execution Flow
==========================

@@ -137,7 +135,7 @@ ModelWrapper contains more useful functions, if you are interested please have a
Analysis Pass
=============

An analysis pass traverses the graph structure and produces information about certain properties. It gets the model in the ModelWrapper as input and returns a dictionary of the properties the analysis extracts. If you are interested in how to write an analysis pass for FINN, please take a look at the Jupyter notebook about how to write an analysis pass, see chapter :ref:`tutorials` for details. For more information about existing analysis passes in FINN, see module :py:mod:`finn.analysis` .
An analysis pass traverses the graph structure and produces information about certain properties. It gets the model in the ModelWrapper as input and returns a dictionary of the properties the analysis extracts. If you are interested in how to write an analysis pass for FINN, please take a look at the Jupyter notebook about how to write an analysis pass, see chapter :ref:`tutorials` for details. For more information about existing analysis passes in FINN, see module :py:mod:`finn.analysis`.
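
A minimal sketch of an analysis pass (a toy example, with a placeholder file name): it takes the ModelWrapper and returns a dictionary::

  from qonnx.core.modelwrapper import ModelWrapper

  def count_node_types(model):
      # toy analysis: map each op_type to how often it occurs in the graph
      counts = {}
      for node in model.graph.node:
          counts[node.op_type] = counts.get(node.op_type, 0) + 1
      return {"node_type_counts": counts}

  model = ModelWrapper("finn_model.onnx")
  result = model.analysis(count_node_types)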

.. _transformation_pass:

@@ -148,44 +146,42 @@ A transformation pass changes (transforms) the given model: it gets the model in the ModelWrapper as input and returns the changed model (ModelWrapper) to the FINN flow.
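
A minimal sketch of a transformation (a toy example): subclass ``Transformation``, implement ``apply`` and return the model together with a flag that tells FINN whether the pass needs to be applied again::

  from qonnx.core.modelwrapper import ModelWrapper
  from qonnx.transformation.base import Transformation

  class ClearDocstrings(Transformation):
      # toy transformation: strip the doc_string field from every node
      def apply(self, model):
          for node in model.graph.node:
              node.doc_string = ""
          # False: this pass does not need to be re-applied
          return (model, False)

  model = ModelWrapper("finn_model.onnx")  # placeholder file name
  model = model.transform(ClearDocstrings())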

.. _mem_mode:

MatrixVectorActivation *mem_mode*
==================================
HLS variant of MatrixVectorActivation: *mem_mode*
=================================================

FINN supports three types of the so-called *mem_mode* attribute for the node MatrixVectorActivation. This mode controls how the weight values are accessed during the execution. That means the mode setting has direct influence on the resulting circuit. Currently three settings for the *mem_mode* are supported in FINN:

* "const"
* "internal_embedded" (former "const" mode)

* "decoupled"
* "internal_decoupled" (former "decoupled" mode)

* "external"

The following picture shows the idea behind the "const" and "decoupled" mode.
The following picture shows the idea behind the "internal_embedded" and "internal_decoupled" mode.

.. image:: img/mem_mode.png
:scale: 55%
:align: center

Const mode
----------
In *const* mode the weights are "baked in" into the Matrix-Vector-Activate-Unit (MVAU), which means they are part of the HLS code. During the IP block generation the weight values are integrated as *params.h* file in the HLS code and synthesized together with it. For the *const* mode IP block generation the `Matrix_Vector_Activate_Batch function <https://github.com/Xilinx/finn-hlslib/blob/master/mvau.hpp#L92>`_ from the finn-hls library is used, which implements a standard MVAU. The resulting IP block has an input and an output stream, as shown in the above picture on the left. FIFOs in the form of verilog components are connected to these.
Internal_embedded mode
------------------------
In *internal_embedded* mode the weights are "baked in" into the Matrix-Vector-Activate-Unit (MVAU), which means they are part of the HLS code. During the IP block generation the weight values are integrated as *params.h* file in the HLS code and synthesized together with it. For the *internal_embedded* mode IP block generation the `Matrix_Vector_Activate_Batch function <https://github.com/Xilinx/finn-hlslib/blob/master/mvau.hpp#L92>`_ from the finn-hls library is used, which implements a standard MVAU. The resulting IP block has an input and an output stream, as shown in the above picture on the left. FIFOs in the form of verilog components are connected to these.

Advantages:

* smaller resource footprint

* easier to debug layer in cppsim since no additional components

* well-tested and mature components

Disadvantages:

* can lead to very long HLS synthesis times for certain weight array shapes

* less control over the weight memory FPGA primitives; Vivado HLS doesn't always make the best resource allocation decisions

Decoupled mode
--------------
In *decoupled* mode a different variant of the MVAU with three ports is used. Besides the input and output streams, which are fed into the circuit via Verilog FIFOs, there is another input, which is used to stream the weights. For this the `streaming MVAU <https://github.com/Xilinx/finn-hlslib/blob/master/mvau.hpp#L214>`_ from the finn-hls library is used. To make the streaming possible a Verilog weight streamer component accesses the weight memory and sends the values via another FIFO to the MVAU. This component can be found in the `finn-rtllib <https://github.com/Xilinx/finn/tree/dev/finn-rtllib>`_ under the name *memstream.v*. For the IP block generation this component, the IP block resulting from the synthesis of the HLS code of the streaming MVAU and a FIFO for the weight stream are combined in a verilog wrapper. The weight values are saved in .dat files and stored in the weight memory from which the weight streamer reads. The resulting verilog component, which is named after the name of the node and has the suffix "_memstream.v", exposes only two ports to the outside, the data input and output. It therefore behaves externally in the same way as the MVAU in *const* mode.
Internal_decoupled mode
------------------------
In *internal_decoupled* mode a different variant of the MVAU with three ports is used. Besides the input and output streams, which are fed into the circuit via Verilog FIFOs, there is another input, which is used to stream the weights. For this the `streaming MVAU <https://github.com/Xilinx/finn-hlslib/blob/master/mvau.hpp#L214>`_ from the finn-hls library is used. To make the streaming possible a Verilog weight streamer component accesses the weight memory and sends the values via another FIFO to the MVAU. This component can be found in the `finn-rtllib <https://github.com/Xilinx/finn/tree/dev/finn-rtllib>`_ under the name *memstream.v*. For the IP block generation this component, the IP block resulting from the synthesis of the HLS code of the streaming MVAU and a FIFO for the weight stream are combined in a verilog wrapper. The weight values are saved in .dat files and stored in the weight memory from which the weight streamer reads. The resulting verilog component, which is named after the name of the node and has the suffix "_memstream.v", exposes only two ports to the outside, the data input and output. It therefore behaves externally in the same way as the MVAU in *internal_embedded* mode.

Advantages:

@@ -197,14 +193,12 @@ Advantages:

Disadvantages:

* somewhat less well-tested compared to the const mode

* higher resource footprint due to additional weight streamer and weight FIFO
* slightly higher resource footprint due to additional weight streamer and weight FIFO


How to set *mem_mode*
---------------------
When the nodes in the network are converted to HLS layers, the *mem_mode* can be passed. More detailed information about the transformations that prepare the network and the transformation that performs the conversion to HLS layers can be found in chapter :ref:`nw_prep`. The *mem_mode* is passed as argument. Note that if no argument is passed, the default is *const*.
When the nodes in the network are specialized to HLS layers, the *mem_mode* can be passed. More detailed information about the transformations that prepare the network and the transformation that performs the specialization to HLS layers can be found in chapter :ref:`nw_prep`. The *mem_mode* is set as a node attribute and can be passed as part of the folding configuration. The default is *internal_decoupled*.
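
Since it is an ordinary node attribute, it can also be set directly on a node; a sketch, with placeholder file and node names::

  from qonnx.core.modelwrapper import ModelWrapper
  from qonnx.custom_op.registry import getCustomOp

  model = ModelWrapper("finn_model.onnx")        # placeholder file name
  node = model.get_node_from_name("MVAU_hls_0")  # hypothetical node name
  getCustomOp(node).set_nodeattr("mem_mode", "internal_decoupled")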


.. _folding_factors:
@@ -217,46 +211,43 @@ Constraints to folding factors per layer
* - **Layers**
- **Parameters**
- **Constraints**
* - Addstreams_Batch
* - Addstreams
- PE
- inp_channels % PE == 0
* - ChannelwiseOp_Batch
* - ChannelwiseOp
- PE
- channels % PE == 0
* - ConvolutionInputGenerator
- SIMD
- inp_channels % SIMD == 0
* - ConvolutionInputGenerator1d
- SIMD
- inp_channels % SIMD == 0
* - Downsampler
- SIMD
- inp_channels % SIMD == 0
* - DuplicateStreams_Batch
* - DuplicateStreams
- PE
- channels % PE == 0
* - Eltwise
* - StreamingEltwise
- PE
- inp_channels % PE == 0
* - FMPadding_batch
* - FMPadding
- SIMD
- inp_channels % SIMD == 0
* - FMPadding_rtl
* - FMPadding_Pixel
- SIMD
- inp_channels % SIMD == 0
* - Globalaccpool_Batch
* - Globalaccpool
- PE
- channels % PE == 0
* - Labelselect_Batch
* - Labelselect
- PE
- num_labels % PE == 0
* - MatrixVectorActivation
- PE & SIMD
- MH % PE == 0 & MW % SIMD == 0
* - Pool_Batch
* - Pool
- PE
- inp_channels % PE == 0
* - Thresholding_Batch
* - Thresholding
- PE
- MH % PE == 0
* - VectorVectorActivation
@@ -280,9 +271,6 @@ This RTL version is an alternative to the original `HLS implementation <https://


The component is implemented by generating (System-)Verilog code for each individual instance, realized via the template + replacement dictionary mechanism found in other FINN components.
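
A sketch of that template + replacement mechanism (the template text and keys are toy assumptions): placeholders in the template are replaced one dictionary entry at a time::

  # toy Verilog template with $...$ placeholders
  template = "module $TOP_MODULE_NAME$; localparam SIMD = $SIMD$; endmodule"
  code_gen_dict = {"$TOP_MODULE_NAME$": "swg_inst_0", "$SIMD$": "4"}
  for key, value in code_gen_dict.items():
      template = template.replace(key, value)
  print(template)
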
Despite the HDL implementation, the component is managed by its own HLSCustomOp (!) named "ConvolutionInputGenerator_rtl". Naturally, HLS simulation & synthesis are not supported.

The RTL SWG is currently disabled by default and can be enabled either in the corresponding HLS conversion transformation (:py:mod:`finn.transformation.fpgadataflow.convert_to_hls_layers.InferConvInpGen`) with `use_rtl_variant=True` or in the build configuration (:py:mod:`finn.builder.build_dataflow_config.DataflowBuildConfig.force_rtl_conv_inp_gen` set to True).

Implementation styles
---------------------
