-
Notifications
You must be signed in to change notification settings - Fork 112
Draft floating point Boogie language extension
This wiki details the proposed syntax for the floating-point extension for Boogie. The float type proposed is the IEEE-754 floating point, a standard which can be found at http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=4610935&filter=AND(p_Publication_Number:4610933)
The proposed floating point syntax is based on the SMT-LIB standard, a summary of which can be found here: http://smtlib.cs.uiowa.edu/theories-FloatingPoint.shtml
The proposal and discussion history for this extension can be found at https://github.com/boogie-org/boogie/issues/29
A floating point type with 'sig' significand bits and 'exp' exponent bits can be declared using the syntax:
float[sig+1]e[exp]
For example, a variable 'name' with type equivalent to IEEE-binary32 could be declared with:
var name : float24e8;
A floating point constant with a 'sig_value' significand, 'exp_value' exponent, 'sig' significand bits and 'exp' exponent bits can be declared using the syntax:
(-)[sig_value]e[exp_value]f[sig]e[exp]
Where a minus sign is placed at the beginning to represent a sign bit of 1.
For example, we can assign the constant -2.25 to var 'name' using the following syntax:
var name : float24e8;
name := -1048576e128f24e8;
Special constants with 'exp' exponent bits and 'sig' significand bits can of course be generated by constructing a floating point constant and manually computing the appropriate 'exp_value' and 'sig_value'. Alternatively, it is possible to use the following syntax:
Value | Declaration |
---|---|
NaN | 0NaN[sig]e[exp] |
+oo | 0+oo[sig]e[exp] |
-oo | 0-oo[sig]e[exp] |
+zero | 0e0f[sig]e[exp] |
-zero | -0e0f[sig]e[exp] |
The following operators are built into Boogie with the following syntax, where x and y are floating point values with identical exponent and significand sizes. Note that each of these operators is assumed to use the round to nearest (RNE) rounding mode. For an explanation on how to use different rounding modes, see the section "SMT-LIBv2 operators via 'builtin' attribute." Note also that the data type returned by each operation is listed in the third column; when a function returns data of type float, that return value is assumed to have the same exponent and significand bit sizes as x and y.
Operator | Boogie Equivalent | Return Type |
---|---|---|
fp.neg(x) | -x | float |
fp.add(x, y) | x + y | float |
fp.sub(x, y) | x - y | float |
fp.mul(x, y) | x * y | float |
fp.div(x, y) | x / y | float |
fp.leq(x, y) | x <= y | boolean |
fp.lt(x, y) | x < y | boolean |
fp.geq(x, y) | x >= y | boolean |
fp.gt(x, y) | x > y | boolean |
fp.eq(x, y) | x == y | boolean |
The rmode type allows the ability to store one of the five below rounding modes:
Rounding Mode | Acronym |
---|---|
Round Nearest Ties To Even | RNE |
Round Nearest Ties To Away | RNA |
Round Toward Positive | RTP |
Round Toward Negative | RTN |
Round Toward Zero | RTZ |
For example, the Round Toward Positive rounding mode may be assigned to var 'mode' using the following syntax:
var mode: rmode;
mode := RTP;
The operators listed in the section "Built-in operators" can be modified to use one of the five rounding modes listed in the above "Rounding mode" section.
This is done by creating a new function with 'function_name' that takes in type arguments 'args', along with an rmode variable, and returns a value of type 'return_type' with the syntax below:
function {:builtin "fp.op"} 'function_name'(args, rmode) returns(return_type);
For example, the following code demonstrates how to declare the "fp.add" operation and use it with both the RTZ and RTN rounding modes:
function {:builtin "fp.add" } FP_ADD_FLOAT32(float24e8, float24e8, rmode) returns(float24e8);
.
.
.
.
var z : float24e8;
z := fp.add(x, y, RTZ);
z := fp.add(x, y, RTN);
Or, alternatively:
function {:builtin "fp.add" } FP_ADD_FLOAT32(float24e8, float24e8, rmode) returns(float24e8);
.
.
.
.
var mode : rmode;
var z : float24e8;
mode := RTZ;
z := fp.add(x, y, mode);
mode := RTN;
z := fp.add(x, y, mode);
The "builtin" attribute can also be used to generate the floating point operators listed below. Descriptions of these functions can be found in the SMT-LIB standard. Note that 'x', 'y', and 'z' are all floating points with equal exponent and significand bit sizes. Note also that when functions return a value of type float, that value is assumed to have the same exponent and significand bit sizes as the arguments to the function.
Operators | Return Type |
---|---|
fp.abs(x) | float |
fp.fma(x, y, z) | float |
fp.sqrt(x) | float |
fp.rem(x, y) | float |
fp.roundToIntegral(x) | float |
fp.min(x, y) | float |
fp.max(x, y) | float |
fp.isNormal(x) | boolean |
fp.isSubnormal(x) | boolean |
fp.isZero(x) | boolean |
fp.isInfinite(x) | boolean |
fp.isNaN(x) | boolean |
fp.isNegative(x) | boolean |
fp.isPositive(x) | boolean |
The syntax for creating functions for these operators is identical to the syntax for creating functions for the built-in operators, except that a specific rounding mode is specified using the "rm" attribute, rather than being passed as an argument. As a result, separate functions will have to be declared for the same operator using different rounding modes.
For example, to create a function to calculate the absolute value of IEEE-binary32 sized floats using rounding modes RNA and RTP, separate functions will need to be declared:
function {:builtin "fp.abs" :rm "RNA" } FP_ABS_FLOAT32_RNA(float24e8) returns(float24e8);
function {:builtin "fp.abs" :rm "RTP" } FP_ABS_FLOAT32_RTP(float24e8) returns(float24e8);
.
.
.
.
var f : float24e8;
f := FP_ABS_FLOAT32_RNA(x); //want to use RNA rounding mode
f := FP_ABS_FLOAT32_RTP(x); //want to use RTP rounding mode
Conversion to and from floating point values in Boogie is also done using the "builtin" attribute. The following conversions are well defined, where the argument given to an operator is the data type given to the function. Note that when float is given as an argument or return type, the number of exponent and significand bits are defined as part of the function declaration. Also note that a rounding mode may be defined for each conversion to a floating point type:
Operator | Return Type |
---|---|
to_fp(int) | float |
to_fp(real) | float |
to_fp(float) | float |
to_fp(bit_vec)* | float |
to_fp_unsigned(bit_vec) | float |
fp.to_ubv(float) | bit_vec |
fp.to_sbv(float) | bit_vec |
fp.to_real(float) | real |
*When converting from a bit vector to a floating point value using this builtin definition, the bitvector must be the same size as the resulting floating point value and, in particular, the first 'exp' bits of the bitvector become the exponent bits of the resulting floating point while the remaining bits become the significand of the resulting floating point. To treat the given bitvector as a signed two's complement integer, a special attribute 'ai' must be given with the value "True." For example, to convert a size 16 bitvector to a IEEE-binary16 floating point value directly, one could declare the following function:
function {:builtin "(_ to_fp 5 11) RNE" } TO_FLOAT16_BITVECTOR(bv) returns(float11e5);
While to convert a bitvector representing signed two's complement integer to a IEEE-binary16 floating point with rounding mode RNA, one could declare the following function:
function {:builtin "(_ to_fp 5 11) RNA" :ai "True" } TO_FLOAT16_SINTEGER_RNA(bv) returns(float11e5);