A s1mple P language compiler course project, compiling P to RISC-V RV32 instructions.
- Course website
- Discussion repo
- This course uses GitHub Classroom with an awesome automation tool made by the TAs.
The project is evaluated in 5 stages.
- Implement lexical analyzer (scanner)
- Implement syntax analyzer (parser) - Writing Syntactic Definitions
- Implement syntax analyzer (parser) - Constructing Abstract Syntax Trees
- Implement semantic analyzer
- Implement code generator
Write regular expressions with the tool flex to split code into tokens. I follow the convention of the starter code provided by the TAs, creating some macros for later convenience, then implement the regular expression rules.
I found that the rules for some types, like octal, float, and scientific notation, can easily get mixed up.
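To show how these numeric rules can overlap, here is a minimal C++ sketch that classifies a lexeme with `std::regex`. The function name `classifyNumber` and the exact patterns are assumptions made for illustration; they are not the course's official token definitions, and a real flex scanner resolves overlaps by longest match and rule order rather than by an explicit chain of checks.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: classify a numeric lexeme. The patterns below are
// illustrative assumptions, not the official P language definitions.
std::string classifyNumber(const std::string& lexeme) {
    // Scientific: a decimal integer or float, then E/e and an optionally signed exponent.
    static const std::regex scientific(R"((0|[1-9][0-9]*)(\.[0-9]+)?[Ee][+-]?(0|[1-9][0-9]*))");
    // Float: integer part, a dot, and at least one fractional digit.
    static const std::regex floating(R"((0|[1-9][0-9]*)\.[0-9]+)");
    // Octal: a leading 0 followed by one or more octal digits.
    static const std::regex octal(R"(0[0-7]+)");
    // Decimal: 0 alone, or a nonzero digit followed by any digits.
    static const std::regex decimal(R"(0|[1-9][0-9]*)");

    // Check the most specific shapes first so a prefix is not misread.
    if (std::regex_match(lexeme, scientific)) return "scientific";
    if (std::regex_match(lexeme, floating))   return "float";
    if (std::regex_match(lexeme, octal))      return "octal";
    if (std::regex_match(lexeme, decimal))    return "decimal";
    return "invalid";
}
```

With these assumed patterns, "017" is octal, "0" is decimal, and "08" matches nothing, which is the kind of edge case that is easy to get wrong when the rules are written separately.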
The parser takes the tokens returned by the scanner and parses them with grammar rules I wrote using the tool bison, checking for errors. Be careful about the left/right associativity of some tokens. As the teacher mentioned in class, we are writing an ambiguous grammar. With the help of the TAs' instructions, I could write the grammar with less pain.
- The token for the keyword "begin" should not be named BEGIN. The name bison generates for it collides with the BEGIN macro that flex defines for start conditions. [source]
- I was not familiar with the P language and needed to check the syntax back and forth.
- Top down approach? Bottom up?
- At first I wrote the grammar top down, but I easily got confused when I was not sure which base components (rules) were available. So I tried writing the grammar bottom up. Then I found I still needed a top-down overview of my grammar.
- In the end I finished my grammar with both the top-down and the bottom-up approach. I spent a lot of time fooling around and writing junk at the beginning. After finding a structured way to construct the grammar, everything went well.
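Why associativity matters can be seen with plain subtraction. The sketch below is not parser code; `foldLeft` and `foldRight` are hypothetical helpers that evaluate the same operand list the way a left-associative (`%left` in bison) or right-associative (`%right`) declaration would group it.

```cpp
#include <cstddef>
#include <vector>

// Left-associative grouping: ((a - b) - c).
// Assumes `operands` is non-empty.
int foldLeft(const std::vector<int>& operands) {
    int result = operands.front();
    for (std::size_t i = 1; i < operands.size(); ++i) {
        result -= operands[i];
    }
    return result;
}

// Right-associative grouping: (a - (b - c)).
// Assumes `operands` is non-empty.
int foldRight(const std::vector<int>& operands) {
    int result = operands.back();
    for (std::size_t i = operands.size() - 1; i-- > 0;) {
        result = operands[i] - result;
    }
    return result;
}
```

For the operands 8, 3, 2, the left grouping gives 3 while the right grouping gives 7, so declaring the wrong associativity silently changes the meaning of the program.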
Add custom actions after the grammar rules to construct nodes for the AST. Using the visitor pattern to construct the AST is suggested.
Mainly repeating these three steps:
- Give a non-terminal a type
- Write actions for non-terminals
- Follow the AST Guide to implement nodes for actions
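As a rough picture of what these steps produce, here is a minimal visitor-pattern sketch in C++. All class names (`AstNode`, `ConstantNode`, `BinaryOpNode`, `DumpVisitor`) and the helper `dumpExample` are hypothetical and much smaller than the nodes in the course's AST guide; the tree is built by hand where a bison action would normally build it.

```cpp
#include <memory>
#include <string>
#include <utility>

struct ConstantNode;
struct BinaryOpNode;

// Visitor interface: one visit overload per concrete node type.
struct AstVisitor {
    virtual ~AstVisitor() = default;
    virtual void visit(const ConstantNode& node) = 0;
    virtual void visit(const BinaryOpNode& node) = 0;
};

struct AstNode {
    virtual ~AstNode() = default;
    virtual void accept(AstVisitor& visitor) const = 0;
};

struct ConstantNode : AstNode {
    int value;
    explicit ConstantNode(int v) : value(v) {}
    void accept(AstVisitor& visitor) const override { visitor.visit(*this); }
};

struct BinaryOpNode : AstNode {
    char op;
    std::unique_ptr<AstNode> lhs, rhs;  // children are owned by the parent
    BinaryOpNode(char o, std::unique_ptr<AstNode> l, std::unique_ptr<AstNode> r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {}
    void accept(AstVisitor& visitor) const override { visitor.visit(*this); }
};

// Example visitor: dumps the tree in prefix form.
struct DumpVisitor : AstVisitor {
    std::string out;
    void visit(const ConstantNode& node) override { out += std::to_string(node.value); }
    void visit(const BinaryOpNode& node) override {
        out += "(";
        out += node.op;
        out += " ";
        node.lhs->accept(*this);
        out += " ";
        node.rhs->accept(*this);
        out += ")";
    }
};

// Builds 1 + 2 * 3 by hand (standing in for parser actions) and dumps it.
std::string dumpExample() {
    auto product = std::make_unique<BinaryOpNode>(
        '*', std::make_unique<ConstantNode>(2), std::make_unique<ConstantNode>(3));
    auto sum = std::make_unique<BinaryOpNode>(
        '+', std::make_unique<ConstantNode>(1), std::move(product));
    DumpVisitor visitor;
    sum->accept(visitor);
    return visitor.out;
}
```

The point of the pattern is that a new pass (dumping, semantic checks, code generation) becomes a new visitor class, and the node classes themselves do not change.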
This stage stretches your C++ skills. It may leave you battle-scarred, but it is worth it.
Use the AST constructed in the last stage to detect semantic errors. The analyzer should print detailed error messages as well as the symbol table.
- This stage stretches C++ skills even more, especially syntax from C++11 or later. A great way to make me grow.
- Passing information between nodes can be complicated.
- Be careful about memory management. Referencing an address that has already been released can cause segmentation faults that are hard to trace.
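One way to keep both the scoping rules and the memory hazards manageable is a scoped symbol table that hands out copies instead of pointers. The sketch below is an assumption-laden illustration: the class `SymbolTable`, its method names, and the use of plain strings for types are all hypothetical and far simpler than what the assignment requires.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical scoped symbol table: a stack of name -> type maps.
class SymbolTable {
public:
    void pushScope() { scopes_.emplace_back(); }
    void popScope() { scopes_.pop_back(); }

    // Declares `name` in the innermost scope.
    // Returns false if it is already declared there (a redeclaration error).
    // Assumes at least one scope has been pushed.
    bool declare(const std::string& name, const std::string& type) {
        return scopes_.back().emplace(name, type).second;
    }

    // Searches from the innermost scope outward. On success, copies the type
    // into `typeOut`, so the caller never holds a pointer into a scope that
    // may later be popped.
    bool lookup(const std::string& name, std::string& typeOut) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) {
                typeOut = found->second;
                return true;
            }
        }
        return false;
    }

private:
    std::vector<std::unordered_map<std::string, std::string>> scopes_;
};
```

A visitor would call `pushScope()` when entering a program, function, or compound statement and `popScope()` when leaving it. Returning a copy costs a little, but it avoids exactly the use-after-release bug described in the list above.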
Use the information stored in the AST to generate RISC-V assembly code. Basic assembly programming experience is needed. Writing the generation code directly in the AST nodes can be fast, but it is not the best way in terms of abstraction and maintainability.
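To illustrate keeping generation code out of the tree, here is a small C++ sketch of a stack-based generator for integer expressions. The `Expr` struct, the helpers `leaf` and `binary`, and the class `StackCodeGen` are hypothetical, and the 4-byte stack pushes are a simplification I am assuming for brevity (the RISC-V calling convention expects 16-byte stack alignment at call boundaries).

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical minimal expression node with no code generation logic inside.
struct Expr {
    char op;    // '+' or '-' for a binary node, 0 for a constant leaf
    int value;  // used only by leaves
    std::unique_ptr<Expr> lhs, rhs;
};

std::unique_ptr<Expr> leaf(int value) {
    auto e = std::make_unique<Expr>();
    e->op = 0;
    e->value = value;
    return e;
}

std::unique_ptr<Expr> binary(char op, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
    auto e = std::make_unique<Expr>();
    e->op = op;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// The generator lives outside the tree, so target details stay in one place.
class StackCodeGen {
public:
    // Emits code that leaves the value of `e` on top of the stack.
    void emit(const Expr& e) {
        if (e.op == 0) {
            out_ << "    li t0, " << e.value << "\n";
            push();
            return;
        }
        emit(*e.lhs);
        emit(*e.rhs);
        out_ << "    lw t1, 0(sp)\n    addi sp, sp, 4\n";  // pop right operand
        out_ << "    lw t0, 0(sp)\n    addi sp, sp, 4\n";  // pop left operand
        out_ << "    " << (e.op == '+' ? "add" : "sub") << " t0, t0, t1\n";
        push();
    }

    std::string str() const { return out_.str(); }

private:
    // Pushes t0 onto the stack.
    void push() { out_ << "    addi sp, sp, -4\n    sw t0, 0(sp)\n"; }

    std::ostringstream out_;
};
```

Calling `emit` on the tree built by `binary('+', leaf(1), leaf(2))` produces two pushes, two pops, and one `add`. Retargeting or optimizing then means editing `StackCodeGen` only, which is the abstraction benefit mentioned above.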
- Run the shell script at the root of each HW folder to enter the container provided by the TAs.
- $ make to compile.
- $ make test to test.
For more details, please check the readme file in each folder. Each HW builds on the content of the previous ones.
gcc
flex
bison
The TAs provide a docker image so we can work within docker. A regular Ubuntu with flex and bison installed would work as well. I use Arch by the way.
I only wrote the code in the src folder. Everything else was either written by the TAs or provided by third-party tools. For more details, please check the readme file in each folder.
If you are an NCTU/NYCU student and find this repo INTERESTING, please keep in mind that it is for sharing knowledge and the learning process. You can ask questions by opening issues. Please try your best on your own before looking for something HELPFUL here.
Don't stop pursuing greatness, whatever way you are heading.