Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

heuristic based boolean sat solving #109

Merged
merged 1 commit into from
Dec 29, 2019

Conversation

izgzhen
Copy link
Collaborator

@izgzhen izgzhen commented Oct 15, 2019

Description

the goal is to speed up the counterexample finding for deciding whether target and new_target is equivalent. We used to do it in SAT solver, but this could be extremely slow when given a deeply nested flatMap desugared from list comprehension expression. In that case, we can exploit conditions clauses and make smarter random assignments that satisfy the refutation easier

This aims to solve #108

How to test

$ cd examples
$ make select
...
  _var758 : Bag<(Int, Int, String, Int)> = FlatMap {r -> FlatMap {s -> FlatMap {q -> Map {w -> ((r).A, (s).C, (q).B, (w).C)} (Filter {w -> (((q).B == (w).B) and ((r).A == 15))} (Ws))} (Qs)} (Ss)} (Rs)
  return _var758
...

This means that Cozy successfully applies heuristic in determining that it can lift the list comprehension from query to state (since in this particular example there is no update to it).

@izgzhen izgzhen requested a review from Calvin-L October 15, 2019 15:10
@izgzhen
Copy link
Collaborator Author

izgzhen commented Oct 15, 2019

Still WIP

  • Support uninterpreted function (extern function) added to FIXME
  • Make it default off on and able to be turned off with a command-line option
  • tests.synthesis.TestSynthesisCore fails for some mysterious reason with this new change (I will elaborate later) added to TODOs

Further concerns

  1. Is the correctness of this heuristic automatically guaranteed (as long as test suite passes)? @Calvin-L

@izgzhen
Copy link
Collaborator Author

izgzhen commented Oct 15, 2019

The unit test is failed by enumerating to the candidate of:

ETreeMultisetElems(EMakeMaxTreeMultiset(EVar('xs').with_type(TBag(TBool()))).with_type(TMaxTreeMultiset(TBool()))).with_type(TList(TBool()))

However, the <= operator failed ToZ3 since TBool is not comparable and thus EMakeMaxTreeMultiset should not have the element type of TBool in the first place.

It seems that during random enumeration. EMakeMaxTreeMultiset is used to replace TBag — but this seems wrong. I would believe that it is another bug, related to the treemultiset implementation I’ve done before. However, I don't why this bug is revealed only with the current change (which seems remotely related).

@izgzhen
Copy link
Collaborator Author

izgzhen commented Oct 15, 2019

Also, notice that there are a few changes that set use_default_values_for_undefined_vars=True -- If not, it just doesn't work since some variables only exist in the types context but not in value context. I am not sure what is the implicit invariant that is offended, but there is indeed a difference between random_assignment.satisfy and solver.satisfy -- solver.satisfy sometimes return a map in which keys might contain variables that are bonded inside some inner lambda expressions -- which is counter-intuitive some I suppose they should not appear outside its scope.

@izgzhen izgzhen force-pushed the eval-refutation branch 3 times, most recently from cf407c6 to b0c2539 Compare October 15, 2019 17:41
@Calvin-L
Copy link
Collaborator

Is this still in progress? I won't take a close look at the source code until you are ready.

Here are a few notes that might save you some time:

Cozy requires the solver to produce "complete" examples. A complete example includes a binding for every variable in the context, even if that variable's value does not affect satisfiability, and even if that variable is not in the given formula. I would expect random_assignment.satisfy() to take a list of variables or a Context object so that it can fill in values for every variable. This may explain why you needed to add use_default_values_for_undefined_vars=True in a few places.

Currently, Cozy communicates the set of variables to the solver module when it constructs the solver object:
https://github.com/CozySynthesizer/cozy/blob/f2f999c/cozy/synthesis/core.py#L192

solver.satisfy sometimes return a map in which keys might contain variables that are bonded inside some inner lambda expressions

There are two possible reasons.

  1. There is a bug in the solver, and it "leaks" unnecessary assignments from some bound variables.
  2. Cozy sometimes invokes the solver to determine semantic equality of two expressions that appear underneath a binder.

I know for sure that (2) is true. (1) might be true also! If you address the complete-examples point above, I think we'll have a better idea of whether there is a bug.

I don't why this bug is revealed only with the current change (which seems remotely related).

I think you are right, the TreeSet bug is unrelated. I can guess at a reason: sometimes Cozy's behavior is affected by which counterexamples it sees. The counterexamples affect which expressions Cozy throws out as duplicates during synthesis. There is probably some very complicated explanation for this.

@izgzhen
Copy link
Collaborator Author

izgzhen commented Oct 16, 2019

Thanks for taking a look at this!

Is this still in progress? I won't take a close look at the source code until you are ready.

I think it is ready for review and it has already fulfilled the functionality I want it to, despite triggering other bugs.

Also, since it is a heuristic and designed optional, I might not implement some unsupported features such as uninterpreted function.

Cozy requires the solver to produce "complete" examples. A complete example includes a binding for every variable in the context, even if that variable's value does not affect satisfiability, and even if that variable is not in the given formula.

I am still confused -- why is that necessary?

@Calvin-L
Copy link
Collaborator

Complete examples are not strictly necessary. However, there is a tradeoff. I had to choose between:

  1. Require the solver to produce complete examples.
  2. Make every other component able to handle incomplete examples.

I opted for (1).

To illustrate how confusing incomplete examples can be: how would you compare them for equality? Consider these two examples:

{}
{"x": 0}

Are they equal? In one sense no: the first one does not constrain the value for x. They are structurally different. But in another sense they are equal: eval(e, example, use_default_values_for_undefined_vars=True) will produce the same output for all expressions e because 0 happens to be the default value for int variables.

When I was writing the solver/evaluator/synthesis code, I thought it would be simpler to require complete examples everywhere. The solver has an extra responsibility, but there are fewer surprises everywhere else.

@Calvin-L
Copy link
Collaborator

I will take a look at the code here when I get a chance in the evening. :)

cozy/solver.py Outdated Show resolved Hide resolved
cozy/solver.py Outdated Show resolved Hide resolved
cozy/solver.py Outdated Show resolved Hide resolved
cozy/synthesis/core.py Outdated Show resolved Hide resolved
tests/counterexamples.py Outdated Show resolved Hide resolved
tests/counterexamples_driver.py Outdated Show resolved Hide resolved
cozy/random_assignment.py Outdated Show resolved Hide resolved
cozy/random_assignment.py Show resolved Hide resolved
cozy/random_assignment.py Outdated Show resolved Hide resolved
cozy/random_assignment.py Outdated Show resolved Hide resolved
@izgzhen
Copy link
Collaborator Author

izgzhen commented Oct 18, 2019

Updated, thanks for the comments!

@izgzhen izgzhen changed the title heuristic based expression solving heuristic based boolean sat solving Oct 18, 2019
@izgzhen
Copy link
Collaborator Author

izgzhen commented Oct 18, 2019

(NOTE: This is still the remotely related problem mentioned above....)

How to reproduce the errors as in CI:

python -m unittest -vb tests.synthesis.TestSynthesisCore

Also, the error msg:

======================================================================
ERROR: test_easy_synth (tests.synthesis.TestSynthesisCore)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/zhen/projects/cozy/tests/synthesis.py", line 52, in test_easy_synth
    assert check_discovery(target, EStateVar(EVar("xs")), args=[x], state_vars=[xs], assumptions=assumptions)
  File "/Users/zhen/projects/cozy/tests/synthesis.py", line 33, in check_discovery
    examples=examples):
  File "/Users/zhen/projects/cozy/cozy/synthesis/core.py", line 241, in improve
    blacklist=blacklist):
  File "/Users/zhen/projects/cozy/cozy/synthesis/core.py", line 408, in search_for_improvements
    for info in enum.enumerate_with_info(size=size, context=ctx, pool=pool):
  File "/Users/zhen/projects/cozy/cozy/synthesis/enumeration.py", line 623, in enumerate_with_info
    yield from self._enumerate_with_info(context, size, pool)
  File "/Users/zhen/projects/cozy/cozy/synthesis/enumeration.py", line 682, in _enumerate_with_info
    to_keep = retention_policy(e, context, prev_exp, context, pool, cost_model)
  File "/Users/zhen/projects/cozy/cozy/synthesis/enumeration.py", line 239, in retention_policy
    ordering = cost_model.compare(new_exp, old_exp, context, pool)
  File "/Users/zhen/projects/cozy/cozy/cost_model.py", line 197, in compare
    lambda: order_objects(e1.size(), e2.size()))
  File "/Users/zhen/projects/cozy/cozy/cost_model.py", line 67, in prioritized_order
    o = f()
  File "/Users/zhen/projects/cozy/cozy/cost_model.py", line 196, in <lambda>
    lambda: self._compare(storage_size(e1, self.freebies), storage_size(e2, self.freebies), context),
  File "/Users/zhen/projects/cozy/cozy/cost_model.py", line 164, in _compare
    always_le = self.solver.valid(EImplies(path_condition, ELe(e1, e2)))
  File "/Users/zhen/projects/cozy/cozy/solver.py", line 1295, in valid
    return not self.satisfiable(ENot(e))
  File "/Users/zhen/projects/cozy/cozy/solver.py", line 1292, in satisfiable
    return self.satisfy(e) is not None
  File "/Users/zhen/projects/cozy/cozy/solver.py", line 1286, in satisfy
    x = self.solver.satisfy(e)
  File "/Users/zhen/projects/cozy/cozy/solver.py", line 1150, in satisfy
    a = self._convert(e)
  File "/Users/zhen/projects/cozy/cozy/solver.py", line 1067, in _convert
    return self.visitor.visit(e, self._env)
  File "/Users/zhen/projects/cozy/cozy/solver.py", line 835, in visit
    return super().visit(e, *args)
  File "/Users/zhen/projects/cozy/cozy/common.py", line 281, in visit
    return f(x, *args, **kwargs)
  File "/Users/zhen/projects/cozy/cozy/solver.py", line 788, in visit_ELet
    env[v.id] = self.visit(x, env)
  File "/Users/zhen/projects/cozy/cozy/solver.py", line 835, in visit
    return super().visit(e, *args)
  File "/Users/zhen/projects/cozy/cozy/common.py", line 281, in visit
    return f(x, *args, **kwargs)
  File "/Users/zhen/projects/cozy/cozy/solver.py", line 628, in visit_ETreeMultisetElems
    return self.visit(e.e, env)
  File "/Users/zhen/projects/cozy/cozy/solver.py", line 835, in visit
    return super().visit(e, *args)
  File "/Users/zhen/projects/cozy/cozy/common.py", line 281, in visit
    return f(x, *args, **kwargs)
  File "/Users/zhen/projects/cozy/cozy/solver.py", line 826, in visit_Exp
    return self.visit(h.encode(e), *args)
  File "/Users/zhen/projects/cozy/cozy/solver.py", line 835, in visit
    return super().visit(e, *args)
  File "/Users/zhen/projects/cozy/cozy/common.py", line 281, in visit
    return f(x, *args, **kwargs)
  File "/Users/zhen/projects/cozy/cozy/solver.py", line 623, in visit_ESorted
    self.solver.add(self.implies(self.all([bag_mask2[i], bag_mask2[j]]), ite(BOOL, asc, bag_elems2[i] <= bag_elems2[j], bag_elems2[i] >= bag_elems2[j])))
TypeError: '<=' not supported between instances of 'BoolRef' and 'BoolRef'

@izgzhen
Copy link
Collaborator Author

izgzhen commented Oct 18, 2019

Also, I found an interesting case from the failed unit test that for expression (not (Filter {_var12 -> _var12} (EStateVar(xs)) == EStateVar(xs))):

  • solver.satisfy returns None
  • random_assignment.satisfy returns {'xs': Bag((False,)), 'x': False}

I stared at it for a while and still believe that random assignment does satisfy that boolean expression. What do you think @Calvin-L ?

@Calvin-L
Copy link
Collaborator

You are right, that is a satisfying assignment. However, for me, the solver is able to find it. Do you have a test case that exhibits the problem? Here's the code I threw together:

from cozy.target_syntax import *
from cozy.syntax_tools import pprint, mk_lambda
from cozy.typecheck import retypecheck
from cozy.solver import satisfy

xs = EVar("xs").with_type(BOOL_BAG)
var12 = EVar("_var12").with_type(BOOL)
e = ENot(EEq(
    EFilter(EStateVar(xs), ELambda(var12, var12)),
    EStateVar(xs)))
assert retypecheck(e)

print(pprint(e))
print(satisfy(e))

Output:

(not (Filter {_var12 -> _var12} (EStateVar(xs)) == EStateVar(xs)))
{'xs': Bag((False,))}

It's possible that the solver returned None because it was given some assumptions that make the formula unsatisfiable.

As a performance optimization, since assumptions are static for the duration of a synthesis job, the assumptions are set when the solver is created. They are not part of the formula passed to satisfy, but every assignment must also satisfy the assumptions. Internally, the assumptions are encoded and sent to Z3 only once. Z3 can re-use them for multiple calls.

In general, these two expressions should be equivalent:

solver_for_context(ctx, assumptions).satisfy(e)
satisfy(EAll([assumptions, e]),
    vars=[v for v, _ in context.vars()],
    funcs=context.funcs())

You may want to pass the assumptions to random_assignment.satisfy as well.

@izgzhen
Copy link
Collaborator Author

izgzhen commented Oct 19, 2019

You may want to pass the assumptions to random_assignment.satisfy as well.

Ah, this is very helpful! I caught that when I set a breakpoint during the synthesis loop -- let me work on it a bit more

@izgzhen
Copy link
Collaborator Author

izgzhen commented Oct 19, 2019

@Calvin-L regarding the TTreeMultiset related bug, I set the newly added flag to False in that failing test case which instructs it to always use the solver instead, and it makes the full test suite pass. I think I will go with this solution for now and file a bug for it.

@izgzhen
Copy link
Collaborator Author

izgzhen commented Dec 9, 2019

@Calvin-L Do you have time to take a look at this?

Comment on lines 50 to 70
(EBinOp(EUnaryOp('sum', EStateVar(EVar('l').with_type(TBag(TInt()))).with_type(TBag(TInt()))).with_type(TInt()), '+', EVar('i').with_type(TInt())).with_type(TInt()), ENum(0).with_type(TInt()), {'l': Bag((0, -2438, 2437)), 'i': 0}),
(EBinOp(EBinOp(EFilter(EStateVar(EVar('l').with_type(TBag(TInt()))).with_type(TBag(TInt())), ELambda(EVar('_var372').with_type(TInt()), EBinOp(EVar('_var372').with_type(TInt()), '>', ENum(10).with_type(TInt())).with_type(TBool()))).with_type(TBag(TInt())), '+', ECond(EBinOp(EVar('i').with_type(TInt()), '>', ENum(10).with_type(TInt())).with_type(TBool()), ESingleton(EVar('i').with_type(TInt())).with_type(TBag(TInt())), EEmptyList().with_type(TBag(TInt()))).with_type(TBag(TInt()))).with_type(TBag(TInt())), '-', EStateVar(EFilter(EVar('l').with_type(TBag(TInt())), ELambda(EVar('_var492').with_type(TInt()), EBinOp(EVar('_var492').with_type(TInt()), '>', ENum(10).with_type(TInt())).with_type(TBool()))).with_type(TBag(TInt()))).with_type(TBag(TInt()))).with_type(TBag(TInt())), EBinOp(EBinOp(EFilter(EStateVar(EVar('l').with_type(TBag(TInt()))).with_type(TBag(TInt())), ELambda(EVar('_var372').with_type(TInt()), EBinOp(EVar('_var372').with_type(TInt()), '>', ENum(10).with_type(TInt())).with_type(TBool()))).with_type(TBag(TInt())), '+', ECond(EBool(True).with_type(TBool()), ESingleton(EVar('i').with_type(TInt())).with_type(TBag(TInt())), EEmptyList().with_type(TBag(TInt()))).with_type(TBag(TInt()))).with_type(TBag(TInt())), '-', EStateVar(EFilter(EVar('l').with_type(TBag(TInt())), ELambda(EVar('_var492').with_type(TInt()), EBinOp(EVar('_var492').with_type(TInt()), '>', ENum(10).with_type(TInt())).with_type(TBool()))).with_type(TBag(TInt()))).with_type(TBag(TInt()))).with_type(TBag(TInt())), {'l': Bag(()), 'i': 0}),
(EBinOp(EUnaryOp('sum', EStateVar(EVar('l').with_type(TBag(TInt()))).with_type(TBag(TInt()))).with_type(TInt()), '+', EVar('i').with_type(TInt())).with_type(TInt()), EUnaryOp('sum', EStateVar(EVar('l').with_type(TBag(TInt()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit that you may or may not care about: the context object lists all the variables that the solver should care about. It so happens that the solver allows additional variables in the formula (i.e. the free variables in the test cases above). However, I don't know if we want to guarantee that behavior. A free variable in a formula that is not present in the context may indicate an internal bug, so we might change the solver in the future to throw an exception in this case.

A "more correct" thing would be to construct a context containing a superset of each test case's free variables. However, what you have here does work today.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(The comment above refers to the init_solver function; I accidentally attached it to the wrong lines!)

@@ -30,7 +31,7 @@
from cozy.typecheck import is_collection, is_scalar
from cozy.syntax_tools import subst, pprint, free_vars, fresh_var, alpha_equivalent, strip_EStateVar, freshen_binders, wrap_naked_statevars, break_conj, inline_lets
from cozy.wf import exp_wf
from cozy.common import No, OrderedSet, unique, OrderedSet, StopException
from cozy.common import No, unique, OrderedSet, StopException
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this import 👍

Test:

  $ cd examples
  $ make select
  ...
  _var758 : Bag<(Int, Int, String, Int)> = FlatMap {r -> FlatMap {s -> FlatMap {q -> Map {w -> ((r).A, (s).C, (q).B, (w).C)} (Filter {w -> (((q).B == (w).B) and ((r).A == 15))} (Ws))} (Qs)} (Ss)} (Rs)
  return _var758
  ...
@izgzhen
Copy link
Collaborator Author

izgzhen commented Dec 29, 2019

I've rebased the change and added a new issue regarding our discussion.

@izgzhen izgzhen merged commit 4e96922 into CozySynthesizer:master Dec 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants