Replies: 22 comments 4 replies
-
Summary
https://github.com/MelindaFang-code/bril/blob/395081661ffe06083a9ecfae3bd9465918f17c82/assignment/dataflow.py

For this task, I implemented a generic data flow framework. The cfg.py file contains the logic for constructing a map from block name to block and for building the control flow graph over those blocks. The cfg function essentially just returns the predecessors and successors of the blocks, since those are all the dataflow framework needs. dataflow.py contains the main logic for the dataflow algorithm. The framework is generic in the sense that we can pass in arguments to decide which analysis to compute. My current implementation only includes live variable analysis and defined variables. To support this, the framework allows both forward and backward computation; depending on the direction, we swap the predecessor and successor maps as well as the in and out maps. In addition, I implemented different merge and transfer functions.

Testing
I first implemented the non-generic version, live variable analysis. I then ran my program on the bril files in examples and compared against the existing solution manually, since my print function didn't produce the same output format. Once I confirmed that my implementation was correct, I used turnt to capture my results and then implemented the generic version. Finally, I reran the tests with turnt to confirm that the outputs did not change after making my functions generic.

Hardest part
What was the hardest part of the task? How did you solve this problem?
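The direction switch described here (swapping predecessors with successors and the in map with the out map) can be sketched as follows. This is a minimal illustration, not the author's code; all names and data shapes are made up for the example.

```python
# Hypothetical shapes: blocks maps name -> instruction list; preds/succs map
# name -> list of neighbor names; merge takes a list of values.
def worklist_solve(blocks, preds, succs, init, merge, transfer, forward=True):
    if not forward:
        preds, succs = succs, preds  # a backward analysis walks edges in reverse
    in_map = {name: init() for name in blocks}
    out_map = {name: init() for name in blocks}
    work = list(blocks)
    while work:
        name = work.pop()
        # merge the values flowing out of all predecessors into this block
        in_map[name] = merge([out_map[p] for p in preds[name]])
        new_out = transfer(name, blocks[name], in_map[name])
        if new_out != out_map[name]:
            out_map[name] = new_out
            work.extend(succs[name])  # dependents must be recomputed
    return in_map, out_map

# Example: "defined variables" (forward, merge = union, out = in + defs).
blocks = {"entry": [{"dest": "a"}], "then": [{"dest": "b"}], "exit": []}
preds = {"entry": [], "then": ["entry"], "exit": ["then", "entry"]}
succs = {"entry": ["then", "exit"], "then": ["exit"], "exit": []}

ins, outs = worklist_solve(
    blocks, preds, succs,
    init=set,
    merge=lambda values: set().union(*values),
    transfer=lambda name, instrs, inp: inp | {i["dest"] for i in instrs if "dest" in i},
)
```

For a backward analysis, callers just interpret the returned maps with the roles flipped.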
-
Summarize what you did.
I implemented a generic data flow analysis framework and used it to implement reaching definitions and live variable analysis. First, using the basic block generator from lesson 3's task, I coded a control flow graph generator that takes in a list of instructions and gives back an adjacency list with the blocks' indices as IDs. Using this generator, I created a data flow analysis framework in the form of an abstract interface that describes the initial conditions, merge, and transfer functions. An instance of an implementation of this interface is then fed into the worklist solver. To make the worklist algorithm general enough to allow both forward and backward versions, I opted to have a property in the abstract interface indicating the analysis direction.

Explain how you know your implementation works—how did you test it? Which test inputs did you use? Do you have any quantitative results to report?
What was the hardest part of the task? How did you solve this problem?
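The abstract interface and direction property described in the summary might look like this in Python. This is a sketch only; the class and method names are invented, not taken from the author's repo.

```python
from abc import ABC, abstractmethod

class Analysis(ABC):
    direction = "forward"  # subclasses may override with "backward"

    @abstractmethod
    def initial(self):
        """Initial value attached to every block."""

    @abstractmethod
    def merge(self, values):
        """Combine the values flowing in from neighboring blocks."""

    @abstractmethod
    def transfer(self, block, value):
        """Push a value through one basic block."""

class LiveVariables(Analysis):
    direction = "backward"

    def initial(self):
        return set()

    def merge(self, values):
        return set().union(*values)

    def transfer(self, block, live_out):
        live = set(live_out)
        for instr in reversed(block):  # walk instructions backward
            live.discard(instr.get("dest"))     # a write kills liveness
            live.update(instr.get("args", []))  # a read makes a variable live
        return live

block = [{"dest": "c", "args": ["a", "b"]}, {"dest": "d", "args": ["c"]}]
live_in = LiveVariables().transfer(block, {"d"})
```

A generic solver can then consult `direction` to decide whether to traverse the CFG forward or backward.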
-
Summary
Repo
I implemented a CFG and a general dataflow analysis framework. To support more efficient basic-block-level analysis, I implemented a struct that maintains the uses & defs of each basic block, precomputed when the blocks are generated. It was extremely simple to integrate this basic block struct into the generic CFG framework by implementing the relevant trait. Usage details are provided in the readme.

Testing
Correctness testing was done mostly through manual inspection of the output for several of the provided programs.

Challenges
My primary challenges were working with Rust generics, as I didn't have much experience with them prior to this assignment. I really wanted to make my framework as general and extensible as possible, so this was definitely a priority for me. Since recursive datatypes are generally considered somewhat difficult to work with in Rust, I avoided them in my representation. Implementation aside, testing was particularly challenging, as there's no really good way of verifying the results besides manual inspection.
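The uses & defs precomputation described here can be sketched in Python (standing in for the Rust struct; the instruction encoding is illustrative). The key subtlety is that only upward-exposed uses, reads that happen before any write in the same block, count as uses.

```python
class BasicBlock:
    """Precompute uses and defs once, when the block is constructed."""

    def __init__(self, instrs):
        self.instrs = instrs
        self.defs = set()  # variables written in this block
        self.uses = set()  # variables read before any write in this block
        for instr in instrs:
            for arg in instr.get("args", []):
                if arg not in self.defs:  # only upward-exposed uses count
                    self.uses.add(arg)
            if "dest" in instr:
                self.defs.add(instr["dest"])

bb = BasicBlock([
    {"op": "const", "dest": "x", "value": 1},
    {"op": "add", "dest": "y", "args": ["x", "z"]},
])
```

Here `x` is read but defined earlier in the block, so only `z` is an upward-exposed use.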
-
Summary
Details/Testing
I implemented live variable and reaching definitions analyses via a generic solver. The solver uses the worklist algorithm and takes as parameters a direction, a domain of values equipped with a top value and a meet operator, and a transfer function. I tested the implementation by manually inspecting the output for a number of test programs.

Difficulties
I didn't have a good mechanism for systematically testing the implementations for this task. Other than running the implementations on all the benchmarks to ensure termination and the absence of runtime errors, the best I could do to test correctness was manually inspect the output. I attempted to make this manual testing a bit easier by visualizing the CFG and analysis results using Graphviz, but it was still somewhat painful.
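The "domain as top value plus meet operator" parameterization can be illustrated with a must-be-defined analysis, where top is the set of all variables and meet is intersection. A small sketch (names are invented for illustration):

```python
def meet_all(values, top):
    """Fold the meet operator over incoming values, starting from top."""
    result = set(top)          # top: the most optimistic assumption
    for v in values:
        result &= v            # meet = intersection for a "must" analysis
    return result

all_vars = {"a", "b", "c"}
paths = [{"a", "b"}, {"a", "c"}]          # defined along two incoming paths
must_defined = meet_all(paths, all_vars)  # only "a" is defined on every path
```

Starting from top makes the degenerate case (no predecessors yet processed) fall out naturally: the meet over zero values is top itself.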
-
I have implemented the data flow algorithm for the reaching definitions use case. It is not yet a general framework, but it can be extended into one fairly easily. I tested my implementation by manually inspecting a number of Bril programs, all included in the repo. I had two main challenges:
-
Summary
Details
Control Flow Graph (CFG)
First I implemented a new cfg library that converts bril programs into a CFG. After discussing with my peer @SanjitBasker, I decided to represent basic blocks at the instruction level, since this makes reasoning about the transfer and merge functions much easier. This logic is encapsulated in its own module.

Generic Solver
I designed a generic solver that is initialized with a specification of the analysis to run.

DFA Analysis using generic solver
Leveraging my generic solver, I implemented reaching definitions and constant propagation. The reaching definitions implementation followed pretty naturally from the pseudocode in the lecture. My implementation of constant propagation used an abstract value domain.

DFA Visualization as a GIF
To inspect and test my data flow analysis, I decided not only to create a graphviz representation of the state of the data flow analysis, but also to animate the iterations of the algorithm as a GIF. I made a few design decisions for the visualization and animations.
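The abstract-value meet at the heart of constant propagation can be sketched as follows: two paths that agree on a constant keep it, and any disagreement collapses to "unknown." This is an illustrative encoding, not the author's implementation.

```python
UNKNOWN = object()  # sentinel: the variable's value differs across paths

def meet_values(a, b):
    if a is None:   # None: no information yet from this side
        return b
    if b is None:
        return a
    return a if a == b else UNKNOWN

def meet_envs(envs):
    """Merge per-variable constant environments from all predecessors."""
    result = {}
    for env in envs:
        for var, val in env.items():
            result[var] = meet_values(result.get(var), val)
    return result

merged = meet_envs([{"x": 4, "y": 1}, {"x": 4, "y": 2}])
```

In the merged environment `x` stays the constant 4, while `y` becomes unknown because the two paths disagree.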
Testing/Evaluation
Potential Extensions:
Difficulties
-
Summary
Implementation details
Testing
Testing the correctness of this analysis is challenging, as the only way is to inspect the outputs. To comprehensively test my implementation, I created multiple test cases.

Challenges
The most difficult part for me was composing the transfer function. In class, the definitions and uses are kept in sets; however, elements in a set do not have an order, so a set does not capture the fact that later definitions kill earlier ones. I ended up using a list to preserve the order of the definitions. The downside is that this makes the merge function a little less intuitive.
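The list-based transfer described above might look like this: definitions are kept in program order, and a later definition of a variable kills earlier ones. The `(index, var)` encoding is illustrative, not taken from the author's code.

```python
def transfer_block(instrs, reaching_in):
    """Push an ordered list of (index, var) definitions through one block."""
    defs = list(reaching_in)
    for idx, instr in enumerate(instrs):
        dest = instr.get("dest")
        if dest is not None:
            defs = [d for d in defs if d[1] != dest]  # kill earlier defs of dest
            defs.append((idx, dest))                  # the new def survives
    return defs

block = [
    {"op": "const", "dest": "x", "value": 4},
    {"op": "const", "dest": "x", "value": 5},  # kills the def of x above
    {"op": "const", "dest": "y", "value": 6},
]
out = transfer_block(block, [])
```

Only the second definition of `x` survives the block, exactly the kill behavior a plain set cannot express by itself.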
-
Summary
For this task, I implemented a generic dataflow analysis framework and, leveraging this framework, live variable analysis. Then, I used this analysis to implement proper dead code elimination. My implementation is spread across several files.

Implementation
Dataflow Framework
Liveness Analysis and DCE
I then used this dataflow framework to implement a live variables analysis tailored for DCE. Like a typical live variables analysis, it is a backward analysis. I used my dataflow framework to compute the variables live out at every instruction. To implement DCE, I deleted every pure instruction which writes to a variable not live out at that program point.

Testing
I tested my implementation of DCE against every Bril benchmark.

Evaluation
While comparing the performance of DCE and TDCE at eliminating dead code produced by LVN, I found that DCE never performed worse than TDCE and occasionally performed slightly better, although not significantly so. In fact, the largest speedup of LVN + DCE over LVN + TDCE was a little over 1%, and the median speedup was 0% (the full dataset is available here for anyone interested). I anticipated this result, as I had previously observed that performance improvements due to LVN + TDCE were largely relegated to benchmarks produced by the TS frontend. I believe identifier names are generated by the TS frontend using a simple counter, so identifier names are likely globally unique anyway, allowing assignments to dead variables to be effectively optimized by TDCE.

Challenges
The largest challenge I encountered was the question of how generic I should make my dataflow analysis framework. In the end, I decided to make it as general as possible to support future analyses. I only encountered one (particularly frustrating) bug while implementing DCE: I had accidentally swapped the infix meet and join operators exposed by the lattices library, resulting in incredibly confusing test failures. Once I ironed out this issue, I did not encounter any more problems.
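The DCE step described above, deleting pure instructions whose destination is not live out, can be sketched for a single block. This is an illustration under assumed names (`PURE_OPS` and the instruction encoding are made up), not the author's implementation.

```python
PURE_OPS = {"const", "add", "mul", "id"}  # illustrative side-effect-free ops

def dce_block(instrs, live_out_of_block):
    """Drop pure instructions whose result is never read afterward."""
    live = set(live_out_of_block)
    kept = []
    for instr in reversed(instrs):  # walk backward, tracking liveness
        dest = instr.get("dest")
        if dest is not None and dest not in live and instr.get("op") in PURE_OPS:
            continue  # dead: writes a value nobody reads
        kept.append(instr)
        if dest is not None:
            live.discard(dest)
        live.update(instr.get("args", []))
    kept.reverse()
    return kept

block = [
    {"op": "const", "dest": "a", "value": 1},
    {"op": "const", "dest": "b", "value": 2},   # dead: b is never read
    {"op": "add", "dest": "c", "args": ["a", "a"]},
]
optimized = dce_block(block, {"c"})
```

Walking backward lets the liveness bookkeeping and the deletion happen in a single pass.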
-
Summary
For this task, @emwangs and I implemented a Reaching Definitions analysis. You can see the code here, in the lesson 4 folder.

Implementation Details
Challenges
Testing
-
Summary
Implementation details
How did you test it? Which test inputs did you use? Do you have any quantitative results to report?
What was the hardest part of the task? How did you solve this problem?
Example CFG output after live variable and (cropped) constant propagation analysis.
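CFG output like this is typically rendered by emitting Graphviz DOT text with the analysis facts attached to each node's label. A minimal sketch (the block names, edges, and annotations here are made up):

```python
def cfg_to_dot(edges, annotations):
    """Emit DOT source for a CFG whose nodes are labeled with analysis facts."""
    lines = ["digraph cfg {"]
    for block, facts in annotations.items():
        label = f"{block}\\n{', '.join(sorted(facts)) or '(empty)'}"
        lines.append(f'  "{block}" [shape=box, label="{label}"];')
    for src, dst in edges:
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)

dot = cfg_to_dot(
    edges=[("entry", "loop"), ("loop", "loop"), ("loop", "exit")],
    annotations={"entry": set(), "loop": {"i", "n"}, "exit": {"n"}},
)
```

The resulting string can be piped to `dot -Tpng` or pasted into an online viewer.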
-
Summary
Implementation Details
Testing
Challenges
-
Summary
Liveness Analysis

Implementation Details
Difficulties
Test
Runnable on hand-written bril programs and on eight-queens.bril.

Future Work
-
@xalbt and I worked together on lesson 4.

Summary
Implementation
Difficulties
Testing
Output:
-
Summary
Implementation Details
In this assignment @SanjitBasker and I implemented a dataflow analysis framework and instantiated it on two dataflow analyses. We continued to use C++ for the assignment. We implemented the formation of CFGs (an extension of the basic block forming algorithm we used in the last assignment) and then wrote a template class for the dataflow analysis. We had some trouble getting the main program to take advantage of our template class, but it can be instantiated through a simple interface in which one provides the meet and transfer functions (and a utility function for making "top" values with which to initialize the analysis). The analysis can proceed in a forward or backward direction, but we realized only after implementing the solver that it might have been a cleaner design to make the direction a template parameter.

Testing
In order to validate our implementation, we implemented analyses for constant propagation and defined variables. We then visualized our results by generating graphs in DOT. This was a bit painful because we couldn't find any C++ libraries for graphs that give full control over the rendered labels, so we generated the graphs ourselves and visualized them in Edotor. For fun, we also implemented a DFS-based traversal of the CFG and used it to solve the dataflow equations: when we discussed this topic in CS 4120, we learned that solving the strongly connected components of the graph in order results in faster convergence than a naive worklist, since nodes are solved after their dependencies. In both implementations, we measured the number of times a node was taken off the worklist and processed (equivalent to counting invocations of the meet function).
This was our measure of "performance." Below are some charts detailing the results of our two solvers on our first analysis, defined variables. And here are some charts detailing their results on our second analysis, constant propagation. Firstly, we want to note that the primary purpose of developing this alternate traversal of the CFG was to test against our original method, and the results we found here are very preliminary. Regardless, we find it surprising that so many of the speedups are by exactly a factor of 2, and that the supposedly more efficient DFS-based traversal of the CFG performs so poorly on some of these test cases (most of the poor-performing tests consist of one large SCC); we feel it would be interesting to analyze this behavior further.

Challenges
Getting used to the class hierarchy structure in C++ was a challenge, as this was the first time we had to deal with it in a project. In addition, we faced some difficulties with the transfer function for constant propagation, as we did not handle undefined variables correctly on our first pass.
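The dependency-respecting traversal idea mentioned above is often approximated with reverse postorder: for a forward analysis, processing nodes in reverse postorder tends to converge in fewer worklist iterations because most nodes are visited after their predecessors (SCC ordering strengthens this for cyclic graphs). A Python sketch of the ordering itself, with an invented example CFG:

```python
def reverse_postorder(succs, entry):
    """DFS from the entry; reversing the postorder yields the iteration order."""
    seen, order = set(), []

    def dfs(node):
        seen.add(node)
        for nxt in succs[node]:
            if nxt not in seen:
                dfs(nxt)
        order.append(node)  # postorder: emitted after all descendants

    dfs(entry)
    return list(reversed(order))

# Diamond CFG: entry branches to a and b, which both reach c.
succs = {"entry": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
rpo = reverse_postorder(succs, "entry")
```

In the resulting order, `c` comes after both `a` and `b`, so its inputs are already up to date when it is first processed.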
-
Summary
I switched over to a new repo after working on a fork of the original.

Implementation
I switched over completely to Rust from TypeScript, which led to a lot of time spent learning and catching back up. I had to rebuild (and build new) utility functions, and opted to reimplement my LVN (previously in TS) just to get more familiar with Rust. As a result, I started a bit late on this. I started by building a CFG. I used a graph library to skip the painful parts and modeled a CFG as a directed graph with each vertex being either a basic block or an empty return representing the exit. I then looped over the basic blocks in order as they appear and handled the Return, Jump, and Branch instructions accordingly. As a fun/annoying implementation detail, I ended up referring to each block by its index in the overall ordering for the function; I am new to Rust and was having a hard time using the Block structs themselves as vertices. Random note, but I thought to label each directed edge with extra information: for a Branch instruction, I labeled the two edges True and False. This didn't turn out to be useful for the analyses I implemented in this task, but I think it could be used in other kinds of CFG analyses, coupled with constant folding, for example, to eliminate dead blocks. I then implemented reaching definitions using the CFG, using the worklist algorithm from class as a model but pretty much hardcoding everything. This actually turned out to be the wrong thing to start with, because each reaching definition is not a string/variable name but a reference to where something was defined, which made everything much more complicated. However, I ended up making things work and took advantage of Rust traits to factor everything out into the framework.

Difficulties
Learning Rust was and still is pretty difficult, but I'm sure it'll get better over time.
I found going from a "hardcoded" reaching definitions algorithm to a generic worklist algorithm to be the hardest part of this task. As mentioned before, I used block indices to refer to blocks, and a reaching definition required one of these indices to distinguish itself from a definition in a different block.

Testing
I mostly relied on handwritten bril programs to verify correctness. I also took some benchmarks (including the one I added), did the analyses by hand, and made sure my code was giving the same results. Finally, I ran my DFAs on every benchmark just to make sure nothing crashed, though this wasn't really a convincing test of correctness. Here is one example output of reaching definitions for a handwritten test (I spent a lot of effort to pretty-print the reaching definitions in a readable way).

Next Steps
I'm going to make the worklist algorithm more generic. I also think I can refactor the way I handled going backwards via the worklist algorithm. It was more of an afterthought, as I had started with just the forward version, so there is some hacky logic I think I can improve.
-
Summary
@20ashah and I implemented a data flow framework for the special case of a reaching definitions analysis. We used our CFG module from a previous lesson to construct a CFG, which served as input to our implementation of the worklist algorithm. Currently, we can accurately compute reaching definitions at any arbitrary block in a given CFG. A goal of ours is to produce a generalized implementation of the data flow framework that takes in arbitrary meet operators, transfer functions, and directions.

Testing
The testing for this task was rather ad hoc. We verified the correctness of our results manually based on printed output. Obviously, this is not ideal and certainly not a robust way to confirm the correctness of our implementation. Using GraphViz would make results simpler to verify by inspection, and we could also write a test that checks the .dot files used to generate the GraphViz images.

What was the hardest part of the task? How did you solve this problem?
The hardest parts for us were the subtleties of implementing a data flow framework where the fundamental units are blocks rather than instructions (which we'd used in a previous class). An example of a subtlety that threw us for a loop was handling a variable being redefined within a single block (e.g., within one block, x = 4; x = 5;, where only x = 5 should be in the out set).
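The within-block subtlety mentioned above can be handled by keeping at most one definition per variable while scanning the block, so `x = 5` kills `x = 4`. A small sketch (the instruction encoding is illustrative):

```python
def block_defs(instrs):
    """Return var -> index of the definition that survives the block."""
    last_def = {}
    for idx, instr in enumerate(instrs):
        if "dest" in instr:
            last_def[instr["dest"]] = idx  # later writes overwrite earlier ones
    return last_def

block = [
    {"op": "const", "dest": "x", "value": 4},
    {"op": "const", "dest": "x", "value": 5},  # only this def reaches the exit
]
survivors = block_defs(block)
```

Only the second definition of `x` appears in the block's out set, matching the `x = 4; x = 5;` example.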
-
Summary
I implemented reaching definitions, keeping track of each definition by its line number in the original program. The analysis then prints out which specific definitions can still reach any given block. This gives a little extra information: which specific definition of a variable is reachable, rather than just the variable's name.

Testing
I tested with a few programs I inspected manually, including some larger ones from the benchmarks. This is not a very satisfying way to test; ideally, I would have liked to implement an optimization I could test using brench. I was quite excited about this week's task and enjoyed implementing it. Unfortunately, I didn't have much time to do anything more interesting than the basics, but I plan on revisiting this topic to build out the general framework and more interesting data flow analyses that I can build into optimizations.

Challenging parts
I had some pretty frustrating debugging experiences with the last assignment, so thankfully this one was much smoother and easy to translate from what we had done in class. The main difficulty was convincing myself that my implementation is correct by manual inspection, which is time consuming, error prone, and not that convincing.
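Identifying each definition by its line number, as described here, pairs naturally with the standard gen/kill worklist. A sketch under assumed encodings (the tiny three-block program and `(line, var)` representation are made up for illustration):

```python
def reaching_defs(blocks, preds, defs_by_line):
    """defs_by_line: block -> list of (line, var) in program order."""
    ins = {b: set() for b in blocks}
    outs = {b: set() for b in blocks}
    work = list(blocks)
    while work:
        b = work.pop(0)
        ins[b] = set().union(*(outs[p] for p in preds[b]))
        out = set(ins[b])
        for line, var in defs_by_line[b]:
            out = {(l, v) for (l, v) in out if v != var}  # kill other defs of var
            out.add((line, var))                          # gen this definition
        if out != outs[b]:
            outs[b] = out
            work.extend(s for s in blocks if b in preds[s])  # requeue successors
    return ins, outs

blocks = ["entry", "loop", "exit"]
preds = {"entry": [], "loop": ["entry", "loop"], "exit": ["loop"]}
defs_by_line = {"entry": [(1, "i")], "loop": [(3, "i")], "exit": []}
ins, outs = reaching_defs(blocks, preds, defs_by_line)
```

At the loop head, both the line-1 and line-3 definitions of `i` can reach, which is exactly the extra precision line numbers buy over bare variable names.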
-
data flow analysis implementation: code here

Summary
I implemented a generic data flow analysis framework and worklist algorithm using templates, and then tested it with reaching definitions. Since I have been migrating to C++, I started by rewriting my lesson 2 Python CFG implementation in C++ and added functionality for getting successors/predecessors for the data flow analysis task. Then I wrote a generic data flow analysis framework and worklist algorithm, with the goal of supporting multiple analyses. Since different data flow analysis problems have different types, I designed a "function factory" to produce the different merge and transfer functions.

Testing
Testing this task is tricky; I had to manually draw out the control flow graph and inspect the values passing in and out of each basic block. I also used turnt to test some representative cases to verify that my reaching definitions implementation was correct.

Challenges
I rewrote my form-basic-blocks and build-CFG code in C++ and didn't fully test it before moving on to data flow analysis. This led to a long and frustrating debugging process. In the future, I will write unit tests for small functions instead of only testing the final deliverable at the end. Testing bit by bit, although it demands more time and effort in the moment, will save a ton of time in the long run.
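The "function factory" idea can be sketched in Python (standing in for the C++ templates): given an analysis name, it hands back the matching merge and transfer functions. The analysis names and instruction encoding here are invented for illustration.

```python
def make_analysis(kind):
    """Return (merge, transfer) for the requested analysis."""
    if kind == "live":
        def merge(values):
            return set().union(*values)
        def transfer(instrs, live_out):
            live = set(live_out)
            for instr in reversed(instrs):
                live.discard(instr.get("dest"))
                live.update(instr.get("args", []))
            return live
        return merge, transfer
    if kind == "defined":
        def merge(values):
            return set().union(*values)
        def transfer(instrs, defined_in):
            return defined_in | {i["dest"] for i in instrs if "dest" in i}
        return merge, transfer
    raise ValueError(f"unknown analysis: {kind}")

merge, transfer = make_analysis("defined")
out = transfer([{"op": "const", "dest": "x"}], {"y"})
```

A generic worklist driver can then stay completely ignorant of which analysis it is running.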
-
Summary
I used Python for this task. I first implemented a build_cfg function to build the CFG of the program; it outputs a dictionary mapping each block's name to its parents and children. I also implemented a generic worklist_algo function that takes an Analyzer object to perform the data flow analysis. The Analyzer object should have two functions, merge and transfer, and worklist_algo calls these two functions to carry out the data flow analysis.

Testing

Hardest part
-
Source
Implementation and issues
For this lesson, I made use of the existing CFG construction code in the bril repository. I implemented a constant folding worklist algorithm as an intra-procedural analysis. Constant propagation (CP) computes, at each point in the CFG, the set of variables that hold a known constant value.

Benchmarking
For benchmarking, I made use of brench along with my own birthday.bril example, manually comparing the results. Given the exhaustiveness of brench's output, I'm convinced my eyeball verification technique is fine enough for this implementation.
-
This thread is for summarizing how your implementation of the data flow framework and its client optimizations went!