diff --git a/doc/invargent-manual.pdf b/doc/invargent-manual.pdf index 8e74ed6..1f3b086 100644 Binary files a/doc/invargent-manual.pdf and b/doc/invargent-manual.pdf differ diff --git a/doc/invargent-manual.tm b/doc/invargent-manual.tm index f605a45..99ce7a3 100644 --- a/doc/invargent-manual.tm +++ b/doc/invargent-manual.tm @@ -34,9 +34,10 @@ does not currently cover records, the module system, objects, and polymorphic variant types. It supports higher-order functions, algebraic data-types including built-in tuple types, and linear pattern matching. It - supports conjunctive patterns using the keyword, but it does - not support disjunctive patterns. It does not currently support guarded - patterns, i.e. no support for the keyword of OCaml. + supports conjunctive patterns using the keyword, but it + currently does not support disjunctive patterns. It currently has limited + support for guarded patterns: after , only inequality + => between values of the type are allowed. The sort of a type variable is identified by the first letter of the variable. ,,,,,,,... @@ -46,16 +47,22 @@ . Remaining letters are reserved for sorts that may be added in the future. Value constructors (like in OCaml) and type constructors (unlike in OCaml) have the same syntax: capitalized name followed by a - tuple of arguments. They are introduced by and - respectively. The declaration might + tuple of arguments. They are introduced by and + respectively. The declaration might be misleading in that it only lists the sorts of the arguments of the type, the resulting sort is always . Values assumed into the environment are introduced by . There is a built-in type - corresponding to declaration and definitions + corresponding to declaration and definitions of numeric constants ... The programmer can use declarations to give the semantics of choice to the data-type. + When solving negative constraints, arising from + clauses, we assume that the intended domain of the sort is + integers. This is a workaround to the lack of strict inequality in the sort + . We do not make the whole sort an integer + domain because it would complicate the algorithms. + In examples here we use Unicode characters. For ASCII equivalents, take a quick look at the tables in the following section. @@ -230,11 +237,14 @@ InvarGenT commits to a type of a toplevel definition before proceeding to the next one, so sometimes we need to provide more information in the - program. Besides type annotations, there are two means to enrich the + program. Besides type annotations, there are three means to enrich the generated constraints: syntax for providing - negative constraints, and syntax for including constraints - of use cases with constraint of a toplevel definition. To ensure only one - maximally general type for , we use both. We add the lines: + negative constraints, >> and + = e2; >> for positive + constraints, and syntax for including constraints of use + cases with constraint of a toplevel definition. To ensure only one + maximally general type for , we use + and . We add the lines: <\code> \ \ \| TInt, TList l -\> (function Nil -\> assert false) @@ -496,7 +506,7 @@ Knowing i> and not knowing >n>, for the case 0>, we get: >k,n.Binary k>Binary - n>>:i[0>k>n>i>i>n+k].Binary + n>>i[0>k>n>i>i>n+k].Binary i>. >i> follows from i>, >n+k> follows from i> and >k>, but @@ -559,10 +569,10 @@ p1,p2 = e1 in e2>>>||>>|>>||\>>\>;e>>|>>||e\e;e>>|= e2; e3>>>>>> + type >\>>\>;e>>|>>||e\e;e>>|= e2; e3>>>>>> Toplevel expressions (corresponding to structure items in OCaml) introduce types, type and value constructors, global variables with given type @@ -584,7 +594,7 @@ en>>>||......>>>>> - Tests list expressions of type that at runtime have to + Tests list expressions of type that at runtime have to evaluate to . Type inference is affected by the constraints generated to typecheck the expressions. @@ -880,6 +890,172 @@ Based on user feedback, we will likely increase the default values of parameters in a future version. + + + + Type inference for the type system underlying InvarGenT is undecidable. In + some cases, the failure to infer a type is not at all problematic. Consider + this example due to Chuan-kai Lin: + + <\code> + datatype EquLR : type * type * type + + datacons EquL : >a, b. EquLR (a, a, b) + + datacons EquR : >a, b. EquLR (a, b, b) + + datatype Box : type + + datacons Cons : >a. a > Box a + + external let eq : >a. a > a + > Bool = "(=)" + + \; + + let vary = fun e y -\ + + \ \ match e with + + \ \ \| EquL, EquL -\ eq y "c" + + \ \ \| EquR, EquR -\ Cons (match y with True -\ 5 \| False + -\ 7) + + + Although has multiple types, it is a contrived example + unlikely to have an intended type. However, not all cases of failure to + infer a type for a correct program are due to contrived examples. The + problems are not insurmountable theoretically. The algorithms used in the + inference can incorporate heuristics for special cases, and can be modified + to do a more exhaustive search. + + The following example illustrates a limitation of our numerical abduction + algorithm that is not intrinsic to the numerical abduction problem. I.e. it + might be fixed by a smarter algorithm. + + <\code> + datatype Elem + + datatype List : num + + datacons LNil : List 0 + + datacons LCons : \n [0\n]. Elem * List n + \ List (n+1) + + external length : \n. List n \ Num n = "length" + + \; + + let rec append = + + \ \ function + + \ \ \ \ \| LNil -\ + + \ \ \ \ \ \ (function l when (length l + 1) \= 0 -\ assert + false \| l -\ l) + + \ \ \ \ \| LCons (x, xs) -\ + + \ \ \ \ \ \ (function l when (length l + 1) \= 0 -\ assert + false + + \ \ \ \ \ \ \| l -\ LCons (x, append xs l)) + + + The expected type is >a, n, + k[0>n>0>k]. List + n>List k>List (n+k)>. When our + algorithm discovers that the result is , rather than , it + is already committed to requiring that the result is no less than . + The answers on successive iterations of the main algorithm do not converge: + if the length of the tail has to be at least one, then the length of the + input list has to be at least two, etc. + + The following example is a natural variant of a function from the + example. + + <\code> + let rec add = fun x -\ efunction + + \ \ \| Empty -\ Node (Empty, x, Empty, 1) + + \ \ \| Node (l, y, r, h) -\ + + \ \ \ \ ematch compare x y with + + \ \ \ \ \| EQ -\ Node (l, x, r, h) + + \ \ \ \ \| LT -\ + + \ \ \ \ \ \ let l' = add x l in + + \ \ \ \ \ \ (ematch height l', height r with + + \ \ \ \ \ \ \ \| hl', hr when hl' \= hr+2 -\ create l' y r + + \ \ \ \ \ \ \ \| hl', hr when hr+3 \= hl' -\ rotr l' y r) + + \ \ \ \ \| GT -\ + + \ \ \ \ \ \ let r' = add x r in + + \ \ \ \ \ \ (ematch height r', height l with + + \ \ \ \ \ \ \ \| hr', hl when hr' \= hl+2 -\ create l y r' + + \ \ \ \ \ \ \ \| hr', hl when hl+3 \= hr' -\ rotl l y r') + + + The difference with the function in the file + amounts to computing , resp. near + the places where they are used. The inference fails because of lack of + sharing of information about due to facts about , resp. about due to facts about , with the other branch. More sophisticated algorithms might mitigate + that. + + We end with an example where there is little hope of improvement. The + and functions in + use assertions to convey the preconditions. Ideally, we would like to be + able to simply write an implementation similar to the following one: + + <\code> + let rotr = fun l x r -\ + + \ \ \ \ ematch l with + + \ \ \ \ \| Empty -\ assert false + + \ \ \ \ \| Node (ll, lx, lr, _) -\ + + \ \ \ \ \ \ (ematch height ll, height lr with + + \ \ \ \ \ \ \| m, n when n \= m -\ + + \ \ \ \ \ \ \ \ let r' = create lr x r in + + \ \ \ \ \ \ \ \ create ll lx r' + + \ \ \ \ \ \ \| m, n when m+1 \= n -\ + + \ \ \ \ \ \ \ \ (ematch lr with + + \ \ \ \ \ \ \ \ \| Empty -\ assert false + + \ \ \ \ \ \ \ \ \| Node (lrl, lrx, lrr, _) -\ + + \ \ \ \ \ \ \ \ \ \ let l' = create ll lx lrl in + + \ \ \ \ \ \ \ \ \ \ let r' = create lrr x r in + + \ \ \ \ \ \ \ \ \ \ create l' lrx r')) + + + Unfortunately, it seems it would require too much ``guesswork'' from the + inference algorithms. <\initial> @@ -927,7 +1103,7 @@ > > > - > + > > > > diff --git a/src/InvarGenTTest.ml b/src/InvarGenTTest.ml index 4b38532..2b5cbbe 100644 --- a/src/InvarGenTTest.ml +++ b/src/InvarGenTTest.ml @@ -160,7 +160,7 @@ let tests = "InvarGenT" >::: [ test_case "non_pointwise_vary" ()); "avl_tree" >:: (fun () -> - (* skip_if !debug "debug"; *) + skip_if !debug "debug"; test_case "avl_tree" ()); ] diff --git a/src/InvariantsTest.ml b/src/InvariantsTest.ml index c40106c..a0fb3e6 100644 --- a/src/InvariantsTest.ml +++ b/src/InvariantsTest.ml @@ -383,7 +383,7 @@ let rec append = (fun () -> todo "too hard for current numerical abduction"; skip_if !debug "debug"; - (* Too hard for the current abduction algo: whe it discovers + (* Too hard for the current abduction algo: when it discovers that the result is [n + k], rather than [n], it is already committed to requiring that the result is no less than [1], which on following iterations blows up. *) @@ -2552,7 +2552,6 @@ let rec add = fun x -> efunction (* Tricky because of lack of sharing of information about [l] due to facts about [l' = add x l], resp. about [r] due to facts about [r' = add x r], with the other branch. *) -(* TODO: explain in the manual. *) [2,"∃n, a. δ = (a → Avl (a, n) → ∃k[k ≤ n + 1 ∧ 1 ≤ k ∧