Skip to content

Proving Equivalence of K Rules and Strata's BitVector Formulas(BVFs)

Sandeep Dasgupta edited this page Feb 24, 2018 · 13 revisions

Overview

While we are generating the K rules using the stratified synthesis mechanism of Strata, there should be some way to check if the rules that we are generating are semantically equivalent to corresponding Stata's BVFs.

Initially we used simple scripts to make some simply rewrites to K rule to transform them to strata's BVF, then that is not sufficient mainly because of the simpification lemmas that we employ to simplify the K rules to a degree that the K rules are syntactically very different that the BVFs.

So we have the idea to convert the K rules, e.q

convToRegKeys(R2) |-> (extractMInt(addMInt(concatenateMInt(mi(1, 0), getParentValue(R1, RSMap)), concatenateMInt(mi(1, 0), getParentValue(R2, RSMap))), 1, 65) )

and strata's bit-vector formula or rule like

%rbx   : (plus (concat <0x0|1> <%rcx|64>) (concat <0x0|1> <%rbx|64>))[63:0]

to their respective smt2 format:

PK_R2 = (Extract( ( Concat(ZERO1, R1) + Concat(ZERO1, R2) ).size() - 1 - 1, ( Concat(ZERO1, R1) + Concat(ZERO1, R2) ).size() - 65, ( Concat(ZERO1, R1) + Concat(ZERO1, R2) )  )  )

and

PS_R2 = (Extract (63, 0, (((Concat((ZERO1), (R1))) + (Concat((ZERO1), (R2)))))))

and use Z3 to check if Not (PK_R2 == PS_R2) is unsat

Motivation

  • To expose flaws in strata's symbolic engine's.
    • Upon disprepancy only test is sufficent to say which one is right.
  • To gain the same confidence in K rules as we have in strata.

Challenges

Converting FLoating point instruction to smt formulas

In case of floating point instructions, it is not possible for efficiently encode the K rules as the resultant z3 formula span multiple Z3 theories (bit-vector, Int, Float), making z3 super slow. On the other hand, strata's rule does not care about the operation semantics of those instruction, instead they represent it as uninterpreted function. In order to prove formal equivalence of K and strata's rule we decided to emit k rules involving the same uninterpreted functions as strata generates.

For that purpose, we have a different version of base Instructions instructions_with_uif/baseInstructions/ and derived instructions instructions_with_uif/derivedInstructions/ and they are used for creating K rules containing uninterpreted functions which are ONLY used for proving equivalences. Having said that, we have other scripts to compile the rules in the above folders and generating and proving equivalence between formulas.

How to compile using the uninterpreted version of base and derived Instructions

cd x86-semantics ./scripts/process_spec.pl --compile --useuif

How to generate the z3 formulas

cd x86-semantics ./scripts/process_spec.pl --getz3formula

How to prove equivalence between K and Strata' rules

cd x86-semantics ./scripts/process_spec.pl --z3prove

Adding axioms over uninterpreted functons (UIF)

There are UIFs like add_double(A:BitVec, B:BitVec). For some strange reason Strata's UIFs has the arguments in reverse order. Due to that Z3 fails (for obvious reasons) to prove that Ks_UIF_add_double(A,B) == Stratas_add_double(B,A). There are few ways to tackle the problem:

  1. Adding axioms We tried adding axioms
  def Ks_UIF_add_double(a, b):
    return Stratas_UIF_add_double(b, a)

which works, but then there are some complicated cases where in Strata we have the UIF like maxcmp(A, B) which returns true is A > B, false otherwise and K has UIF function like max(A, B) which return the max of A and B. My idea is have the axiom like

  define max(A, B): 
     if(maxcmp(A,B) = True): 
      return A 
     else: 
      return B

But as per like, but it seems that python's Z3 interface has some limitations in writing non trivial axioms over UIFs.

  1. Making changes in K Another idea is to have K generates the rules with same UIFs (means syntactic similar) as Strata's rules. But I decided not to make any changes in K, because
  • In case K generate rules involving
  if(maxcmp(B,A) = True): 
    return B 
  else: 
    return A

instead of max(A,B), some simplification lemmas will kick in and can make the resultant formula too big(because of the presence of If#Then#else) resulting it z3 taking long times to prove equivalnce. Also the compilation and krpove run slows down owing to big rules. Either we can waive those lemmas for verifications, but that way we fail to check if the lemas are correct.

  1. Script changes We let K generates its own UIFs (like max(A, B)) and we use scripting to rewrite them to the ones strata expects, ie.e
if(maxcmp(B,A) = True): 
    return B 
 else: 
    return A

Proving equivalence of certain instructions

There are instructions like I arg1, arg2 (where arg1 and arg2 are regosters), for example

xaddb_r8_r8	
xaddl_r32_r32
xaddq_r64_r64	

and many more..

for which strata gives the semantics in terms of sequence of simple x86 instructions when arg1 != arg2. The K rules that we got after symbolically executing the sequence is the semnactis of I when arg1 != arg2.

For arg1 == arg2, strata can generate the correct BVF (need to know how). I am manually generating the K rule for this case (arg1 == arg2) by consulting the arg1 != arg2 case. This is feasible because the number of such instructions are not many.

Now as the z3 formula are generated from the output of krpvover i.e. kast (not from the k rules itself), so we have the z3 formula for the case when arg1 != arg2. So it remains to prove that the manually written K rule for the case when arg1 == arg2 is semantically equivalent to corresponding strata's BVF. There could be 2 ways to do that:

  1. Encode the manually wriiten K rule for the case arg1 == arg2 to smt formula and use Z3 to prove its equivalence with the corresponding smt formula from strata's BVF for the same case.

  2. As the semanics of higher stratum instructions will be generated using the manually written rules (for the case arg1 == arg2) and we are anyway proving the equivalence of those higher stratum instructions with the corresponding strata's rule, we can skip exclicit proving of the correction of the manually written rules. Because if the manulally written rules were not consistent with the corresponding strata's rules, then this will lead to inconsistent semantics of higher stratum instrcutions.

Case study when z3 says "failed to prove"

  • z3EquivFormulas/x86-shlq_r64_cl.py While checking the equivalnce of K rule and strata's BV formula for instr shlq %cl R:64, Z3 failed to prove equivalance for the output OF (overflow flag). Upon checking the counter example, it revleaed that the failure is due to the fact that Intel manual says value of of is undefined when value of %cl is not 0 nor 1, which is correctly encoded in K rules. Strata's formula on the other hand assings the old value of 'of' in that case.

  • z3EquivFormulas/x86-vcvtsi2sdl_xmm_xmm_r32.py

    Help fixing a bug in K rule. commit

  • derivedInstructions/x86-vmovmskpd_r32_xmm.k

    This is a manulally generated instruction by consuling the manual. The reason that I chose to manually generate it because the K rule which I otherwise get from symbolically executing strata's instruction seq to too huge and complex. Also the next stratum instruction which are dependent on that instruxtions are getting more and more complex. This cipples the kompile, krpove and most importantly krun tool. Z3 decision help to not only fix a bug in that k rule but anso validate that the fixed rule is equiv with starta's rule.

Some application of the framework of equivalence checking

  1. For certain instructions like x86-vmovmskpd_r32_xmm.k, we can simplify the K semantics (which other wise is very complex as such instruction comes at very higher stratum level) by consulting the Intel's manual and with Z3 we can check if the semantics is as par with Strata;s semantics.

  2. There are some instructions like cmpxcng, for we we can only generate the K semantics for the case when the destination registers anre not comflicting, as strata does for give the sequence of simpler instructions to run kprover on, when the destination register are conflicting. We can manually generate the rule for those case and use Z3 to verify equivalence with corresponding strata's rules. (Note even if strata does not give seq of x86 instruction for those cases, where can generate the BVF)