Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP Talos impl. guide #303

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions talos/Readme.rst
Original file line number Diff line number Diff line change
Expand Up @@ -26,3 +26,10 @@ To run ``talos`` you need a Daedalus specification describing the parser. Talos
cabal run talos -- -n 4 tests/T003.ddl

The ``-n`` flag controls how many example inputs are generated. In this example, we generate four possible inputs that satisfy the parser.

Developing Talos
----------------

See the [implementatino guide]_.

.. _[implementatino guide]: implementation.md
185 changes: 185 additions & 0 deletions talos/implementation.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,185 @@

Talos implementation
====================

The Talos tool synthesises valid documents from DDL grammars. The tool has the following phases:

1. Parsing the input file and converting to Talos-flavoured Core;
2. Analysing the Core to decompose the grammar into slices;
3. Traversing the Core to synthesize a document, calling into slice model generators when the start of a slice is encountered;

The implementatino is discussed below

Text to Core
------------

The majority of the translation from the text file into Core is
implemented as part of Daedalus, leaving us with a [Core module](../daedalus-core/src/Daedalus/Core/Decl.hs).

Talos [performs](src/Talos.hs) a number of standard and specific
Core to Core passes ([Talos.Passes]), namely

* Fail propagation (making cases partial)
* Polymorphic type monomorphisation
* Constant folding
* Naming arguments to functions;
* Naming the argument to string matches (which is also the result of the match)

These passes ensure that the analysis phase has a name for the
arguments to a function (as the argument to a function may be the
start of a slice, so it needs a name), and that we can safely ignore
string matches when slicing (as a string match always returns its
argument on success, so we just use the argument directly).

Slicing
-------

Briefly, the [Talos.Analysis.Exported][] module contains the data
structures used by the synthesis phase of Talos, using definitions
from [Talos.Analysis.Slice][] The below notes on the implementation of
slicing is not required to use the results.

The slicing phase lives under [Talos.Analysis][]. The aim of the
slicing phase is to decompose the input grammar into a collection of
independent slices (defined in [Talos.Analysis.Slice][], along with
sliced expressions in [Talos.Analysis.SLExpr][]). A slice is, in
essence, a grammar where some nodes have been replaced by holes.
Slices are independent if each choice in the input grammar (i.e., byte
values and grammar alternatives) occurs in exactly one slice (or no
slice if the choice cannot impact the feasibility of synthesis).
Slices are also closed, in the sense that all (parts of) variables
used in the grammar are defined in the grammar.

The slicing process has two phases, firstly a fixpoint is constructed;
and secondly, the fixpoint is _exported_ [Talos.Analysis.Exported][].
The export process simplifies the representation of slices from that
used in the fixpoint construction, including replacing holes in
expressions by default values (which should never actually be
examined), and forgetting which abstract environment (see below) was
used (to avoid carrying around the class in the remainder of Talos).

The slicing algorithm is abstract in the way in which the grammar is
analyzed; the abstraction is in
[Talos.Analysis.AbsEnv][]. Briefly, an
_abstract environment_ relates variables with _abstract predicates_,
and has a precondition operation which calculates the
abstract environment required for an expression to satisfy an abstract predicate.

The two main abstract environments are the whole-variable environment
([Talos.Analysis.VarAbsEnv][])
and the field-sensitive abstract environment
([Talos.Analysis.FieldAbsEnv][]).
The whole-variable abstract predicate carries no information (we want
the whole variable), and the abstract environment is equivalent to a
set of variables --- the abstract precondition operation then just
calculates the set of free variables.

The field-sensitive abstract predicate projects a subset of the fields
in a struct, including nested fields for fields with structure type.
The abstract precondition operation then calculates which fields of
the free variables are used in an expression.

<!-- Slicing is per-function backwards analysis; initially, each function -->
<!-- is sliced ignoring the result. Predicates are introduced when a -->
<!-- failing construct (i.e., a case) is reached: -->

The slicing algorithm uses the following auxiliary modules
* [Talos.Analysis.Eqv][]: notions of equivalence used to determine if a fixpoint has been reached;
* [Talos.Analysis.Merge][]: merging is used to combine slices when they are discovered to be non-independent (i.e., rely on the same choice);
* [Talos.Analysis.Domain][]: the domain over which the slicing algorithm operates, containing the in-progress slices for the current grammar function;
* [Talos.Analysis.Fixpoint][]: a naive algorithm for constructing the fixpoint for the slicing algorithm
* [Talos.Analysis.Monad][]: the monad containing the state for slicing, mostly this builds on Talos.Analysis.Fixpoint.

Non-dependent Document Generation
---------------------------------

The outer synthesis loop is in [Talos.Synthesis][].

Slice Model Generation
----------------------

A model for a slice is a path ([Talos.SymExec.Path][]), which can be
thought of as a parse tree with holes (corresponding to the holes in
the slice). The path tells the outer synthesis loop about decisions
that were made during slice synthesis (i.e., choices and bytes which
occur in the slice).

The main modules are:
* [Talos.Strategy][]: Strategy execution and utils (timing etc.)
* [Talos.Strategy.Monad][]: Core monad for strategies (random numners, Core definitions, etc)
* [Talos.Strategy.PathSymbolic][]: Current main symbolic strategy (pathsymb)
* [Talos.Strategy.PathSymbolicM][]: Monad and aux types for the pathsymb strat
* [Talos.Strategy.BTRand][]: Random selection with backtracking strategy.
* [Talos.Strategy.Symbolic][]: Experimental symbolic strategy (probably should be deprecated)
* [Talos.Strategy.SymbolicM][]: Monad for the symbolic strat

Modules for pathsymb:
* [Talos.Strategy.PathCondition][]: Path condition guards for the
pathsymb strat, corresponding to sets of choices in the slice
* [Talos.Strategy.MuxValue][]: Symbolic values used in pathsymb.
The basic idea is that values in the symbolic execution are sets
of values, each value being guarded by a path condition.

with helper modules:
* [Talos.Strategy.MemoSearch][]: A memoising search procedure (used by symbolic strat)
* [Talos.Strategy.OptParser][]: An option parser for strat arguments
* [Talos.Strategy.DFST][]: A DFS monad transformer
* [Talos.Strategy.SearchTree][]: An explicit representation of search trees

Symbolic Execution
------------------

These modules are involved in talking to the solver:
* [Talos.SymExec.SolverT][]: Solver support with support for context management
* [Talos.SymExec.StdLib][]: Support SMTLIB definitions
* [Talos.SymExec.ModelParser][]: Reading back models from SMTLIB
* [Talos.SymExec.Expr][]: Symbolic execution of expressions
* [Talos.SymExec.Funs][]: Define functions in SMT from Core functions
* [Talos.SymExec.SemiValue][]: Values with non-value leaves
* [Talos.SymExec.SemiExpr][]: SemiValue with SExpr leaves
* [Talos.SymExec.Type][]: Converting type to SMT types

<!-- Definitions of refs -->

[Talos.SymExec.SolverT]: src/Talos/SymExec/SolverT.hs
[Talos.SymExec.StdLib]: src/Talos/SymExec/StdLib.hs
[Talos.SymExec.ModelParser]: src/Talos/SymExec/ModelParser.hs
[Talos.SymExec.Expr]: src/Talos/SymExec/Expr.hs
[Talos.SymExec.Funs]: src/Talos/SymExec/Funs.hs
[Talos.SymExec.SemiExpr]: src/Talos/SymExec/SemiExpr.hs
[Talos.SymExec.SemiValue]: src/Talos/SymExec/SemiValue.hs
[Talos.SymExec.Type]: src/Talos/SymExec/Type.hs
[Talos.SymExec.Path]: src/Talos/SymExec/Path.hs
[Talos.Analysis]: src/Talos/Analysis.hs
[Talos.Analysis.AbsEnv]: src/Talos/Analysis/AbsEnv.hs
[Talos.Analysis.VarAbsEnv]: src/Talos/Analysis/VarAbsEnv.hs
[Talos.Analysis.FieldAbsEnv]: src/Talos/Analysis/FieldAbsEnv.hs
[Talos.Analysis.Eqv]: src/Talos/Analysis/Eqv.hs
[Talos.Analysis.Merge]: src/Talos/Analysis/Merge.hs
[Talos.Analysis.Domain]: src/Talos/Analysis/Domain.hs
[Talos.Analysis.Fixpoint]: src/Talos/Analysis/Fixpoint.hs
[Talos.Analysis.Monad]: src/Talos/Analysis/Monad.hs
[Talos.Analysis.Slice]: src/Talos/Analysis/Slice.hs
[Talos.Analysis.SLExpr]: src/Talos/Analysis/SLExpr.hs
[Talos.Analysis.Exported]: src/Talos/Analysis/Exported.hs
[Talos.Strategy]: src/Talos/Strategy.hs
[Talos.Strategy.Monad]: src/Talos/Strategy/Monad.hs
[Talos.Strategy.DFST]: src/Talos/Strategy/DFST.hs
[Talos.Strategy.SearchTree]: src/Talos/Strategy/SearchTree.hs
[Talos.Strategy.SearchT]: src/Talos/Strategy/SearchT.hs
[Talos.Strategy.SymbolicM]: src/Talos/Strategy/SymbolicM.hs
[Talos.Strategy.PathSymbolicM]: src/Talos/Strategy/PathSymbolicM.hs
[Talos.Strategy.MuxValue]: src/Talos/Strategy/MuxValue.hs
[Talos.Strategy.BTRand]: src/Talos/Strategy/BTRand.hs
[Talos.Strategy.Symbolic]: src/Talos/Strategy/Symbolic.hs
[Talos.Strategy.PathSymbolic]: src/Talos/Strategy/PathSymbolic.hs
[Talos.Strategy.PathCondition]: src/Talos/Strategy/PathCondition.hs
[Talos.Strategy.MemoSearch]: src/Talos/Strategy/MemoSearch.hs
[Talos.Strategy.OptParser]: src/Talos/Strategy/OptParser.hs
[Talos.Synthesis]: src/Talos/Synthesis.hs
[Talos.Passes]: src/Talos/Passes.hs





163 changes: 163 additions & 0 deletions talos/notes/2023-01-28-code-walkthrough.org
Original file line number Diff line number Diff line change
@@ -0,0 +1,163 @@

* Core walkthrough

** Operation

1. Run Daedalus
2. Slice
3. Outer loop, calling ...
4. ... strategies

** Concepts

- Entangled :: wrt variables: when the choice of a variable impacts
the valid choices for another variable


** [[file:~/galois/safedocs/daedalus/daedalus-core/src/Daedalus/Core/Grammar.hs::data Grammar =][Grammar]] and [[file:~/galois/safedocs/daedalus/daedalus-core/src/Daedalus/Core/Expr.hs::data Expr =][Expr]]

Core data-structures targeted by Daedalus front-end, and used by
Talos. We assume (it should be true) that variable binders are
unique, if only the GUID part of the Name. This means we can use
variables as node identifiers in the AST (e.g. have a Map Name t)

** Synthesis outer loop

The synthesis is syntax-directed: we follow the syntax of the
grammar to synthesise bytes until we hit a variable which starts a
slice --- part of the synthesis state is a map from variable to slice.

- [[file:~/galois/safedocs/daedalus/talos/src/Talos/Synthesis.hs::synthesiseG SelectedHole][synthesisG random mode]]

- [[file:~/galois/safedocs/daedalus/talos/src/Talos/SymExec/Path.hs::data SelectedPathF ch ca a =][SelectdPath]]

This type is how we record decisions that were made when solving
slices (a SelectedPath is a model for a slice). Essentially
tells which decisions to make. It is strongly (structurally)
related to grammars (fills in bytes and choices).

When a slice is solved, the result (a SelectedPath) is merged
into any path we have from previous slice solving. The path is
inspected in synthesiseG, and is initially SelectedHole, o.e., we
have made no choices/constraints on future choices.

- [[file:~/galois/safedocs/daedalus/talos/src/Talos/Synthesis.hs::synthesiseG ::][Worker for Grammar (synthesiseG)]]

** Slices

A slice is a fragment of a grammer, by 'fragment' I mean a grammar
where some nodes in the ast are replaced by 'holes'. A slice
occurs when a grammar is partial: all bits of the grammar which
impact the partiality are in the slice, the idea being that the
slice contains all the information needed to synthesise the (bits
of) the variables bound in the slice.

*** Example

As an example, consider

block
a = A
x = P
y = Q
z = R
a < x is true
y > 0 is true

becomes:

- 'slice starting at a'

block
a = A
x = P
y = HOLE
z = HOLE
a < x is true
HOLE

- 'slice starting at y'

block
a = HOLE
x = HOLE
y = Q
z = HOLE
HOLE
y > 0 is true

with 'z' no being used in a partial way.

*** Notes

We won't talk about how we generate a slice, it is enough to know
that slices are

- closed :: contain all information needed to synthesise the bytes
in the slice
- independent :: slices can be synthesised without worrying about
other slices
- composable :: two slices can be merged into a new slice; the sum of
all slices is the dependent part of the grammar (the remnants
always succeeed)

Note tha SelectedPaths are also composable; merging paths should
result in a model for the merged slices. Because slices are
independent, a choice occurs in at most 1 model, so merging of
models (and slices) is trivial, and is just hole-plugging.

Slices can be at the field level, so a struct-typed variable may
have mutiple slices, one for each field that is a dependency. One
further wrinkle is that slicing depends on how things are used, and
so there is an additional FInstId which is used to tell the
synthesis which function summary to use.

- [[file:~/galois/safedocs/daedalus/talos/src/Talos/Analysis/Slice.hs::data Slice' cn sle =][Slice definition]]

This is what the strategies have to solve, it is parametric in
how calls are represented, and how expressions are represented.

- [[file:~/galois/safedocs/daedalus/talos/src/Talos/Analysis/Exported.hs::type ExpSlice = Slice' ExpCallNode Expr][Instantiation of Slice']]

This is how slices are used outside of the slicing algorithm
(hence 'exported' --- all the slicing-specific annotations etc
get removed).

** Strategies

A strategy is a way of solving a slice: it is a function from ExpSlices to SelectedPaths.

- [[file:~/galois/safedocs/daedalus/talos/src/Talos/Strategy.hs::findModel ::][Running Strategies]]

There is a bit of infrastructure for running strategies, including
timing etc. Strategies can generate many models (it is more
efficient to ask for multiple models from the solver) so we run the
strategy once and consult the cache when we next see that slice.

The simplest strategy is the random selection with backtracking
strategy, which is made a bit mode complicated by abstracting over
the method of backtracking.

- [[file:~/galois/safedocs/daedalus/talos/src/Talos/Strategy/BTRand.hs::stratSlice ::][Random strat]]

We mainly use the PathSymbolic strategy, which constructs a big
solver model along with a parametric SelectedPath, by which I mean
a SelectedPath where the decisions are solver variables. The
solver model is then used to instantiate the parametrict
SelectedPath.

- [[file:~/galois/safedocs/daedalus/talos/src/Talos/Strategy/PathSymbolic.hs::pathSymbolicStrat :: Strategy][Path symbolic strategy]]

This strategy is made (much mode) complicated as we avoid having
non-product types in the solver. We can discuss this if there is
time/interest.