This is an implementation of an abstract CESK machine that supports concurrency constructs (spawn/join), based on Matthew Might's description of CESK machines (mainly from the papers Abstracting Abstract Machines and A Family of Abstract Interpretations for Static Analysis of Concurrent Higher-Order Programs).
This work is done in the context of my Master’s Thesis, available here.
The following OCaml packages are required:
- ocamlgraph
- oUnit
- batteries
Those packages are installable using OPAM:
opam install ocamlgraph oUnit batteries
To compile this project, you'll also need ocamlbuild (generally provided with
OCaml). It has been tested with OCaml 4.01, but should probably work with older
versions too. To compile, simply launch `make`. To launch the test suite,
execute `make test`.
usage: ./main.byte [-v level] [-i input] [-g graph_output] [-k polyvariance]
[-t1 tag] [-t2 tag] [-target target] [other flags (see below)]
-v : verbose level (0 by default)
-progress : print progress (number of states computed)
-i : input file (stdin by default)
-g : output file for the generated graph (nothing by default)
-k : polyvariance (k-CFA) (k=0 by default)
-gc-before : run garbage collection before stepping (disabled by default)
-gc : run garbage collection after stepping (disabled by default)
-no-store-strong-updates : turn off strong updates in the store
-p : turn on parallelism with spawn and join (disabled by default)
-cpor : turn on cartesian partial order reduction (disabled by default)
-r : remove threads when they halt (disabled by default)
-j : do strong updates when evaluating a join (disabled by default)
-s : don't explore a state if another state that subsumes it has already been explored (disabled by default)
-l : enable first-class locks (disabled by default)
-ll : maximum length of abstracted lists (5 by default)
-no-threads-strong-updates : turn off strong updates for the threads
-quiet : don't print the results nor the parameters used, only the time and graph size (disabled by default)
-t1 : tag corresponding to the first expression used for MHP analysis
-t2 : tag corresponding to the second expression used for MHP analysis
-target {run|ast|length|mhp|cmp|unretriedcas|deadlocks|deadlocks1|ldeadlocks|setconflicts|allconflicts|conflicts|race}: action to do with the input (run by default)
-e {bfs|dfs}: graph traversal method used ('bfs' or 'dfs', bfs by default)
-help Display this list of options
--help Display this list of options
Those parameters are explained in more detail below, and examples are given afterwards.
- `-v`: when verbose mode is activated (> 0), the execution will also print a trace of the states computed (before the results), with eval states printed in red and continuation states printed in green.
- `-i`: indicates the input file from which to read the program. By default, standard input is read. For example input files, look at the `input/` directory.
- `-g`: indicates where to write the state graph. Graphviz's dot format is used and images can be generated using `dot` (`dot -Tpng foo.dot > foo.png`).
- `-quiet`: disables the values printed at the beginning of the execution.
- `-p`: turns on parallelism (PCESK is used instead of CESK), thus supporting `spawn` and `join`.
- `-k`: tunes the polyvariance (see Abstracting Abstract Machines); the higher the k, the higher the precision. However, for higher-order programs, k > 1 is not really useful.
- `-gc`: activates abstract garbage collection. It results in a much higher precision on complex programs.
- `-gc-before`: similar to `-gc`, but the garbage collection is done before the transition function instead of after. This is generally a bad idea, as running the GC after the transition reclaims more unreachable addresses.
- `-s`: activates reduction of the state space by subsumption (a state is not explored if another state that subsumes it has already been explored). This experimentally gives really good results for the PCESK machine.
- `-no-store-strong-updates`: disables strong updates in the store. It should result in a loss of precision.
Parallelism should first be turned on with `-p` before using the options below. The use of first-class locks can be activated with `-l`.

- `-j`: when a `join` is done, a strong update is performed to increase precision.
- `-r`: each time a thread halts, it is removed from the thread map. This should result in better precision.
- `-no-threads-strong-updates`: disables strong updates in the thread map. It should result in a great loss of precision.
- `-cpor`: uses Cartesian Partial Order Reduction to reduce the state space.
Multiple targets (i.e. actions to run) exist. Each target makes use of the options defined above.
The default target is `run`, which simply computes the state graph for the given program.
The `ast` target displays the parsed AST, annotated with tags (a number identifying each AST node), in order to find the tags to use for the `mhp` target.
The `mhp` target performs a may-happen-in-parallel (MHP) analysis on the nodes identified by the tags given with the `-t1` and `-t2` parameters. To do so, it computes the state graph and traverses it to find a state in which the two nodes can be evaluated. It then prints whether they can be evaluated in parallel or not.
The `conflicts` target detects read/write and write/write conflicts in a program, in a similar way to the MHP analysis (but the user doesn't have to specify two expressions to check; the whole program is checked). This target filters out some conflicts that are considered harmless.
The `allconflicts` target is similar to `conflicts` but doesn't filter out any conflict. It will detect more conflicts, but with many false positives when `cas` is used.
The `setconflicts` target is also similar to `conflicts` but only checks for read/write and write/write conflicts involving a `set!` (it assumes `cas` is correctly used).
The `unretriedcas` target finds potential errors where `cas` is incorrectly used. The return value of `cas` should always be checked, and the `cas` should be retried if it failed. This analysis looks for `cas` operations that are not retried when they fail, which is a source of race conditions.
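For instance, `input/pcounter.scm` (shown in full later in this README) retries its `cas` when it fails; the following sketch, by contrast, ignores the result of `cas` and is the kind of code this target reports (a hypothetical example, not a file from `input/`):

```scheme
;; The result of the cas is neither checked nor retried: if the cas fails
;; (another thread changed counter in between), the increment is silently lost.
(letrec ((counter 0)
         (f (lambda ()
              (letrec ((old counter)
                       (new (+ old 1)))
                (cas counter old new))))   ; result ignored: flagged by unretriedcas
         (t1 (spawn (f)))
         (t2 (spawn (f))))
  (join t1)
  (join t2)
  counter)
```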
The `race` target combines the `conflicts` and `unretriedcas` targets to detect race conditions.
The `deadlocks` target finds potential deadlocks in programs where locks are implemented through `cas`. It does so by looking for cycles on a `cas` that will not terminate (i.e. the `cas` will always evaluate to `#f` and be retried).
The `ldeadlocks` target finds potential deadlocks in programs that use first-class locks (enabled with the `-l` option).
You can run a sequential CESK machine on programs that do not make use of the parallel operators. Some examples are given in the `input/seq/` directory.
With the `run` target, the program will be evaluated and the possible results will be printed, one per line; each line corresponds to a possible final state of the execution. The last line of the output contains the number of states in the graph, the number of edges, and the time it took to compute this graph.
For example:
$ ./main.byte -i input/seq/mut-rec.scm
#f
#t
#f
#t
#f
#t
145/145/0.189
The PCESK machine can be used to run simple programs that make use of the parallel operators. Parallelism is turned on by the `-p` parameter. By default, nothing is done to reduce the state space and the computation might take a long time. A sane default to improve this is to use the parameters `-j -s -r`. The garbage collector (`-gc` or `-gc-before`) tends to increase the size of the state space compared to just using a reduction by subsumption (`-s`).
For example:
$ ./main.byte -i input/pcounter.scm -p -j -r -s
Int
Int
2578/6333/431.817
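The program analysed here is a parallel counter; reconstructed from the output of the `ast` target shown in the next example (with the tags and some formatting removed), it is roughly:

```scheme
;; input/pcounter.scm, reconstructed from the ast output below: two threads
;; increment a shared counter, retrying the cas whenever it fails.
(letrec ((counter 0)
         (f (lambda ()
              (letrec ((old counter)
                       (new (+ old 1)))
                (if (cas counter old new)
                    #t
                    (f)))))          ; cas failed: retry
         (t1 (spawn (f)))
         (t2 (spawn (f))))
  (join t1)
  (join t2)
  counter)
```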
We can check whether two expressions may happen in parallel. First, the two expressions have to be identified by their tags, by analyzing the output of the `ast` target. Then, those two expression identifiers (tags) are given as the values of `-t1` and `-t2`, and the `mhp` target is run.
For example:
$ ./main.byte -i input/pcounter.scm -p -j -r -target ast
(letrec ((counter@2 0@3) (f@4 (lambda () (letrec ((old@7 counter@8) (new@9 (+@11 old@12 1@13)@10)) (if (cas counter old@17 new@18)@15 #t@19 (f@21 )@20)@14)@6)@5) (t1@22 (spawn (f@25 )@24)@23) (t2@26 (spawn (f@29 )@28)@27)) (join t1@31)@30 (join t2@33)@32 counter@34)@1
$ ./main.byte -i input/pcounter.scm -p -j -r -target mhp -quiet -t1 15 -t2 15
The expressions (cas counter old@17 new@18)@15 and (cas counter old@17 new@18)@15 may happen in parallel
(The two `cas` expressions may safely happen in parallel because of their atomicity.)
We can detect race conditions in a program with the following targets: `conflicts`, `setconflicts`, `unretriedcas`, and `race`. `conflicts` will look for every read/write and write/write conflict, but some conflicts (those involving `cas`) might not lead to race conditions. We can thus make the assumption that `cas` is correctly used and look for `setconflicts` instead (i.e. conflicts that do not involve `cas` usages). To verify the assumption that `cas` is correctly used, we can finally use the `unretriedcas` analysis. The `race` target combines `conflicts` and `unretriedcas`.
For example:
$ ./main.byte -p -r -j -i input/pcounter.scm -target setconflicts
No conflicts detected
$ ./main.byte -p -r -j -i input/pcounter.scm -target unretriedcas
No unretried cas detected
$ ./main.byte -p -r -j -i input/pcounter-race.scm -target setconflicts
2 conflicts detected between the following pairs of expressions:
(set! counter (+@9 counter@10 1@11)@8)@6, (set! counter (+@9 counter@10 1@11)@8)@6
(set! counter (+@9 counter@10 1@11)@8)@6, counter@10
$ ./main.byte -p -r -j -i input/pcounter-race.scm -target race
2 conflicts detected between the following pairs of expressions:
(set! counter (+@9 counter@10 1@11)@8)@6, (set! counter (+@9 counter@10 1@11)@8)@6
(set! counter (+@9 counter@10 1@11)@8)@6, counter@10
No unretried cas detected
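Judging from the conflicting expression reported above, `input/pcounter-race.scm` updates the counter with a plain `set!` instead of a `cas` retry loop. A sketch of such a racy counter (not necessarily the exact contents of that file):

```scheme
;; Both threads may read counter before either has written it back, so an
;; unprotected (set! counter (+ counter 1)) can lose an increment.
(letrec ((counter 0)
         (f (lambda () (set! counter (+ counter 1))))
         (t1 (spawn (f)))
         (t2 (spawn (f))))
  (join t1)
  (join t2)
  counter)
```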
To detect deadlocks in a program where locks are implemented with `cas`, the `deadlocks` target looks for a cycle in the state graph starting at a state that evaluates a `cas` that fails (resulting in `#f`). If such a cycle exists, there is a possibility of staying blocked in this cycle forever.
For example:
$ ./main.byte -p -r -j -i input/deadlock-simple.scm -target deadlocks -gc
1 possible deadlocks detected, starting at the following expressions:
(cas lock #f@9 #t@10)@7 [on tid 1]
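The expression reported above is the usual cas-based spin lock; a sketch of the pattern follows (not necessarily the exact contents of `input/deadlock-simple.scm`, and `acquire-lock` is a hypothetical helper name):

```scheme
;; A lock implemented with cas: acquire-lock retries until the cas succeeds.
;; If the lock is taken and never released, the cas keeps evaluating to #f
;; forever, which is exactly the cycle the deadlocks target looks for.
(letrec ((lock #f)
         (acquire-lock (lambda ()
                         (if (cas lock #f #t)
                             #t                  ; lock taken
                             (acquire-lock))))   ; cas failed: retry
         (t1 (spawn (acquire-lock)))             ; takes the lock and never releases it
         (t2 (spawn (acquire-lock))))            ; may spin on the cas forever
  (join t1)
  (join t2))
```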
When the `-l` flag is given, first-class locks can be used in the input programs. The deadlock analysis for those locks (`ldeadlocks`) is more precise and takes less time than the deadlock analysis for locks implemented with `cas` (`deadlocks`).
For example:
$ ./main.byte -p -r -j -i input/locks/deadlock.scm -target ldeadlocks -l
1 possible deadlocks detected, at the following states:
{1: {(join t1)}
2: {(acquire b)}
3: {(acquire a)}}
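The states reported above correspond to the classic lock-ordering deadlock: one thread holds lock `a` and waits for `b`, while the other holds `b` and waits for `a`. A sketch of such a program follows; only `acquire` appears in the output above, so the lock constructor `new-lock` is an assumed name (check `input/locks/` for the actual primitives):

```scheme
;; Two locks acquired in opposite orders by two threads (run with -p -l).
;; new-lock is an assumed constructor name; see input/locks/ for the real one.
(letrec ((a (new-lock))
         (b (new-lock))
         (f1 (lambda () (letrec ((x (acquire a))    ; holds a...
                                 (y (acquire b)))   ; ...waits for b
                          #t)))
         (f2 (lambda () (letrec ((x (acquire b))    ; holds b...
                                 (y (acquire a)))   ; ...waits for a
                          #t)))
         (t1 (spawn (f1)))
         (t2 (spawn (f2))))
  (join t1)
  (join t2))
```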