SMRA (Software Managed Reconfigurable Accelerator) is an initiative of the Compiler Microarchitecture Lab (CML) to boost and promote the development of reconfigurable accelerators. It is an open-source, automated framework for simulating applications on coarse-grained reconfigurable arrays (CGRAs) attached to multi-core processors. SMRA's toolchain runs from an LLVM-based compiler to a gem5-based processor simulator and aims to accelerate the C/C++ applications given as input. The CGRA is modeled as a core in gem5, on which performance-critical loops can be accelerated.
The SMRA workflow is as follows:
- Preprocessing: Identifies and extracts the loops from the C code. It also creates a multithreaded file in which the loop is one thread and the rest of the application code is another thread. Input: application.c. Output: multithreaded.c and application_loop.c.
- Compiler Frontend (LLVM): An LLVM pass that generates the data flow graph (DFG) of the loop. Input: application_loop.c. Output: node file, edge file.
- Scheduling and Mapping Through Compiler (REGIMap): A mapping algorithm that schedules the DFG nodes and maps them onto the CGRA architecture. Input: node file, edge file, CGRA architecture description. Output: schedule files (prolog, kernel, and epilog schedules) and updated node and edge files (with routing nodes added).
- Instruction Generator/Compiler Backend: Generates 32-bit instructions for the CGRA based on the mapping, using a custom, optimized instruction-set architecture (ISA) for CGRAs. Input: schedule files, updated node and edge files. Output: prolog, kernel, and epilog binary files, which are inserted into the loop thread of multithreaded.c.
- Architectural Simulator (gem5): The CGRA is implemented as a co-processor alongside an ARM main processor. Input: multithreaded binary file. Output: acceleration statistics and output verification.
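The prolog, kernel, and epilog produced by the scheduling step follow the usual shape of a software-pipelined loop. The sketch below illustrates that shape by hand in plain C. It is not output from REGIMap, and the function names and the two-stage split are assumptions made only for illustration. The input and output arrays are assumed not to overlap.

```c
#include <stddef.h>

/* Reference form: each iteration performs two dependent operations. */
void scale_add_reference(const int *a, const int *b, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int t = a[i] * 3;      /* stage 1: multiply */
        out[i] = t + b[i];     /* stage 2: add */
    }
}

/* Hand-pipelined form that produces the same result. On a CGRA the two
 * kernel statements could be placed on different processing elements and
 * issued in the same cycle; in plain C they simply run in sequence. */
void scale_add_pipelined(const int *a, const int *b, int *out, size_t n)
{
    if (n == 0) {
        return;
    }

    /* Prolog: start iteration 0 (stage 1 only) to fill the pipeline. */
    int t = a[0] * 3;

    /* Kernel: finish iteration i-1 (stage 2), then start iteration i (stage 1). */
    for (size_t i = 1; i < n; i++) {
        out[i - 1] = t + b[i - 1];
        t = a[i] * 3;
    }

    /* Epilog: drain the pipeline by finishing the last iteration. */
    out[n - 1] = t + b[n - 1];
}
```

Both functions write identical values to out. The pipelined version only makes explicit which work belongs to the fill phase, the steady state, and the drain phase.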
- Automated Install: Give execute permission to the install and dependencies scripts.
$ ./dependencies
This installs all the dependency packages required for LLVM and gem5. The script currently works only on Ubuntu (tested on version 14.04). If you use a different Linux distribution, please install the dependencies manually (see Manual Install below).
$ ./install
This script automatically builds base LLVM + CGRALoopPass, REGIMap, the Instruction Generator, and gem5 + SMRA.
- Manual Install: First, download (or pull) this repository. Next, install the dependencies for gem5 and the LLVM tools. Then build the LLVM and gem5 tools, in that order. gem5 can be built with the following commands.
$ cd /home/cmlasu/cml-cgra/2.0/gem5/
$ scons build/ARM/gem5.opt
For the compiler pass, copy the MahdiLoop pass into the LLVM transformation passes and build it with the make command. This generates an output file with the extension .so, which in turn needs to be linked to the script file. For more information on how to write and build a pass, please see the LLVM documentation (http://llvm.org/docs/WritingAnLLVMPass.html).
If you encounter errors about the locations of the tools, you can set them permanently by modifying the run script.
Currently, SMRA cannot simulate application-level acceleration. However, you can execute loops on SMRA through the automated toolchain. To do so, go to the work folder and create a new folder containing your C source file. A sample folder (sample_test) is already available in the work directory to show what the output directories look like. Copy the multithreading template file and the run script file into your new folder. Then execute run.sh, and the toolchain will automate the process for you, producing the final output directories.
Version 2 of SMRA supports:
- Clang -O3 Compiler Optimizations
- Data Dependencies
- Inter-Iteration/Loop-Carried Dependencies (Usage Of Phi Nodes)
- Fixed Implementation of Software Pipelining/Iterative Modulo Scheduling
- Provision of Storing Live Variables After Loop Execution On CGRA
- Provision of Validating Output
- Comparison of Statistics With a Standalone Main Processor Run
- Single Script For Flow And Single Script To Clean The Space Inside Work Folder
Currently, this version does not support:
- Vector/SIMD Instructions
- Application Level Simulation
- Loops With Control Flow (if-then-else, unconditional jumps)
- Irregular Nested Loops
- Loops Other Than for Loops
- Floating Point Implementation
- Loops Containing Function Calls
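As a rough guide to the lists above, the following sketch shows a loop written to stay within the stated limits: a single for loop over integers, with no branches in the body, no function calls, and no floating point. The function name is hypothetical, and whether a particular SMRA build extracts and maps this exact loop has not been verified here.

```c
#include <stddef.h>

/* Weighted integer sum: one for loop with straight-line integer
 * arithmetic in its body. `acc` carries a value from one iteration to
 * the next (a loop-carried dependency, expressed as a phi node in LLVM
 * IR) and remains live after the loop ends. */
int weighted_sum(const int *x, const int *w, size_t n)
{
    int acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += x[i] * w[i];
    }
    return acc;
}
```

Rewriting the same computation as a while loop, adding an if inside the body, or switching the element type to float would each move it into the unsupported list.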
For any questions or comments on SMRA development, please email us at [email protected]
CML's CGRA Webpage - http://aviral.lab.asu.edu/coarse-grain-reconfigurable-arrays/