-
Notifications
You must be signed in to change notification settings - Fork 9
Writing analyses using McSAF
This page gives a tutorial on how to write analysis using McSAF. We will start with a simple Name Collector, which collects the names of the variables on the LHS. Next we will implement a reaching definitions analysis. Note that we assume knowledge of Java and theoretical knowledge of compiler analyses.
###Name Collector Analysis
The analysis for collecting the names is fairly straightforward. We do not require a fixed point to implement it. Hence, we will be using the AbstractDepthFirstAnalysis
[1]. We start by extending the AbstractDepthFirstAnalysis
class.
public class NameCollector extends AbstractDepthFirstAnalysis<Set<Name>>
We call our analysis class NameCollector. AbstractDepthFirstAnalysis uses generics to define the type of the data structures used to hold the information collected from the analysis. In this case, we use a set of Name
objects. The Name
class is a node of the McSAF AST.
Next we declare the class members that we need for the analysis.
protected Set<Name> fullSet;
protected boolean inLHS = false;
We define constructors for the class.
public NameCollector(ASTNode<?> tree)
{
super(tree);
fullSet = newInitialFlow();
}
The method newInitialFlow()
is an abstract method from the AbstractDepthFirstAnalysis
class which we are required to implement.
@Override
public Set<Name> newInitialFlow()
{
return new HashSet<>();
}
We then define public methods to allow other classes to access the collected names.
public Set<Name> getAllNames()
{
return fullSet;
}
public Set<Name> getNames( AssignStmt node )
{
return flowSets.get(node);
}
AbstractDepthFirstAnalysis
provides a method for every node in the McSAF AST and defines the default behaviour for each of them. We will override the methods that we require to collect the names, namely, caseAssignStmt
, caseParameterizedExpr
, caseCelIndexExpr
, caseDotExpr
, caseFunction
, caseGlobalStmt
.
Let us look at the implementation of each method.
@Override
public void caseAssignStmt( AssignStmt node )
{
inLHS = true;
currentSet = newInitialFlow();
analyze( node.getLHS() );
flowSets.put(node,currentSet);
fullSet.addAll( currentSet );
inLHS = false;
}
In the above method, we set the class member variable inLHS to true. inLhs is used to ensure that only names that are on the LHS of the statements are collected. Members flowSets
and currentSet
are inherited through the AbstractDepthFirstAnalysis
class. flowSets
is a map which maps a ASTNode to its equivalent set of names. Note that the the type of the value in the map can differ depending on the generic type that is provided. currentSet
is used to hold the set of names for the current node. fullSet
is a member defined in this class and is a set of all the names on the LHS in the AST.
@Override
public void caseParameterizedExpr( ParameterizedExpr node )
{
analyze(node.getTarget());
}
@Override
public void caseCellIndexExpr(CellIndexExpr node) {
analyze(node.getTarget());
}
@Override
public void caseDotExpr(DotExpr node) {
analyze(node.getTarget());
}
The above methods all traverse down to the child node referring to the index variable. For example, in case of the parametrised expression a(i)
, we traverse down the node referring to a
.
@Override
public void caseFunction(Function f) {
for (Name name : f.getInputParams()) {
currentSet.add(name);
}
fullSet.addAll(currentSet);
analyze(f.getStmts());
analyze(f.getNestedFunctions());
}
In case of functions, we analyse the input parameters, function body and the nested functions.
@Override
public void caseGlobalStmt(GlobalStmt node) {
for (Name name : node.getNames()) {
currentSet.add(name);
}
fullSet.addAll(currentSet);
}
For the global statement, we analyse use the getNames
method to fetch all the names and add them to the currentSet and the fullSet.
You can find the full code for this analysis here
###Definitely Assigned Analysis
Next let us look at the definitely assigned analysis. You can find more information on how the analysis works in theory here. The analysis is a forward analysis and hence we will be extending the ForwardAnalysis
class to implement the analysis. Similar to the AbstractDepthFirstAnalysis
class, ForwardAnalysis
is also generic and hence a type needs to be specified.
public class DefinitelyAssignedAnalysis extends ForwardAnalysis<Set<String>>
We then define the constructor similar to the previous name collector analysis.
public DefinitelyAssignedAnalysis(ASTNode tree) {
super(tree);
}
Additionally ForwardAnalysis
also provides the currentOutSet
, currentInSet
, inflowSets
and outflowSets
data structures for holding the intermediate and final results, similar to the name collector analysis.
We have to override and implement the newInitialFlow
method.
@Override
public Set<String> newInitialFlow() {
return new HashSet<>();
}
Additionally, we also have to define the methods merge
and copy
. The merge method defines the operation to be carried out at a merge point and copy defines what needs to be done for a copy.
@Override
public Set<String> copy(Set<String> source) {
return new HashSet<>(source);
}
@Override
public Set<String> merge(Set<String> in1, Set<String> in2) {
Set<String> out = new HashSet<>(in1);
out.retainAll(in2);
return out;
}
In our case, the copy operation simply returns a new copy of the input. The merge operation performs a set intersection. We then override the methods of handling assignment statements, global statements , scripts and functions. We also override the method for the statement class which is the parent class for all statements.
@Override
public void caseScript(Script node) {
currentInSet = newInitialFlow();
currentOutSet = new HashSet<>(currentInSet);
node.getStmts().analyze(this);
outFlowSets.put(node, currentOutSet);
}
@Override
public void caseFunction(Function node) {
currentInSet = newInitialFlow();
currentOutSet = new HashSet<>(currentInSet);
for (Name n : node.getInputParams()) {
currentInSet.add(n.getID());
}
node.getStmts().analyze(this);
outFlowSets.put(node, currentOutSet);
node.getNestedFunctions().analyze(this);
}
@Override
public void caseAssignStmt(AssignStmt node) {
inFlowSets.put(node, currentInSet);
currentOutSet = new HashSet<>(currentInSet);
currentOutSet.addAll(node.getLValues());
outFlowSets.put(node, currentOutSet);
}
@Override
public void caseGlobalStmt(GlobalStmt node) {
inFlowSets.put(node, currentInSet);
currentOutSet = new HashSet<>(currentInSet);
for (Name name : node.getNames()) {
currentOutSet.add(name.getID());
}
outFlowSets.put(node, currentOutSet);
}
@Override
public void caseStmt(Stmt node) {
inFlowSets.put(node, currentInSet);
currentOutSet = currentInSet;
outFlowSets.put(node, currentOutSet);
}
For functions, the input parameters are considered to be already defined and hence are added to the currentInSet
. The body and the nested functions are then analysed. Since a script does not have any input parameters, the currentInSet
is initialised and copied over to the currentOutSet
. For the global statement, all the names are copied over to the currentOutSet
. For assignment statements, we add all the values on the LHS to the outset apart from copying over all the values from inset. The implementation for the rest of the statements is handled by the caseStmt
method. The inset is directly copied over to the outset. The complete analysis can be found here