Skip to content

Draft floating point Boogie language extension

liammachado edited this page Mar 4, 2018 · 25 revisions

DRAFT: Floating point type support in the Boogie IVL

Summary

This wiki details the proposed syntax for the floating-point extension for Boogie. The float type proposed is the IEEE-754 floating point, a standard which can be found at http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=4610935&filter=AND(p_Publication_Number:4610933)

The proposed floating point syntax is based on the SMT-LIB standard, a summary of which can be found here: http://smtlib.cs.uiowa.edu/theories-FloatingPoint.shtml

The proposal and discussion history for this extension can be found at https://github.com/boogie-org/boogie/issues/29

Type declarations

A floating point type with 'sig' significand bits and 'exp' exponent bits can be declared using the syntax:

float[sig+1]e[exp]

For example, a variable 'name' with type equivalent to IEEE-binary32 could be declared with:

var name : float24e8;

Constant declarations

A floating point constant with a 'sig_value' significand, 'exp_value' exponent, 'sig' significand bits and 'exp' exponent bits can be declared using the syntax:

(-)[sig_value]e[exp_value]f[sig]e[exp]

Where a minus sign is placed at the beginning to represent a sign bit of 1.

For example, we can assign the constant -2.25 to var 'name' using the following syntax:

var name : float24e8;
name := -1048576e128f24e8;

Special constants with 'exp' exponent bits and 'sig' significand bits can of course be generated by constructing a floating point constant and manually computing the appropriate 'exp_value' and 'sig_value'. Alternatively, it is possible to use the following syntax:

Value Declaration
NaN 0NaN[sig]e[exp]
+oo 0+oo[sig]e[exp]
-oo 0-oo[sig]e[exp]
+zero 0e0f[sig]e[exp]
-zero -0e0f[sig]e[exp]

Built-in operators

The following operators are built into Boogie with the following syntax, where x and y are floating point values with identical exponent and significand sizes. Note that each of these operators is assumed to use the round to nearest (RNE) rounding mode. For an explanation on how to use different rounding modes, see the section "SMT-LIBv2 operators via 'builtin' attribute." Note also that the data type returned by each operation is listed in the third column; when a function returns data of type float, that return value is assumed to have the same exponent and significand bit sizes as x and y.

Operator Boogie Equivalent Return Type
fp.neg(x) -x float
fp.add(x, y) x + y float
fp.sub(x, y) x - y float
fp.mul(x, y) x * y float
fp.div(x, y) x / y float
fp.leq(x, y) x <= y boolean
fp.lt(x, y) x < y boolean
fp.geq(x, y) x >= y boolean
fp.gt(x, y) x > y boolean
fp.eq(x, y) x == y boolean

Rounding mode

The rmode type allows the ability to store one of the five below rounding modes:

Rounding Mode Acronym
roundNearestTiesToEven RNE
roundNearestTiesToAway RNA
roundTowardPositive RTP
roundTowardNegative RTN
roundTowardZero RTZ

For example, the Round Toward Positive rounding mode may be assigned to var 'mode' using the following syntax:

var mode: rmode;
mode := RTP;

Or, alternatively, the full name may be used:

var mode: rmode;
mode := roundTowardPositive;

SMT-LIBv2 operators via "builtin" attribute

Some of the operators listed in the "Built-in operators" section can be modified to use one of the five rounding modes listed in the "Rounding mode" section.

This is done by creating a new function with 'function_name' that takes in type arguments 'args', along with an rmode variable, and returns a value of type 'return_type' with the syntax below:

function {:builtin "fp.op"} 'function_name'(rmode, args) returns(return_type);

For example, the following code demonstrates how to declare the "fp.add" operation and use it with both the RTZ and RTN rounding modes:

function {:builtin "fp.add"} FP_ADD_FLOAT32(rmode, float24e8, float24e8) returns(float24e8);

.
.
.
.

var x : float24e8;
var y : float24e8;
var z : float24e8;

z := FP_ADD_FLOAT32(RTZ, x, y);
z := FP_ADD_FLOAT32(RTN, x, y);

Or, alternatively:

function {:builtin "fp.add"} FP_ADD_FLOAT32(rmode, float24e8, float24e8) returns(float24e8);

.
.
.
.

var mode : rmode;
var x : float24e8;
var y : float24e8;
var z : float24e8;

mode := RTZ;
z := FP_ADD_FLOAT32(mode, x, y);
mode := RTN;
z := FP_ADD_FLOAT32(mode, x, y);

The "builtin" attribute can also be used to generate the floating point operators listed below. Descriptions of these functions can be found in the SMT-LIB standard. Note that 'x', 'y', and 'z' are all floating points with equal exponent and significand bit sizes. Note also that when functions return a value of type float, that value is assumed to have the same exponent and significand bit sizes as the arguments to the function.

Operators Return Type
fp.abs(x) float
fp.fma(x, y, z) float
fp.sqrt(x) float
fp.rem(x, y) float
fp.roundToIntegral(x) float
fp.min(x, y) float
fp.max(x, y) float
fp.isNormal(x) boolean
fp.isSubnormal(x) boolean
fp.isZero(x) boolean
fp.isInfinite(x) boolean
fp.isNaN(x) boolean
fp.isNegative(x) boolean
fp.isPositive(x) boolean

The syntax for creating functions for these operators is identical to the syntax for creating functions for the built-in operators.

Alternatively, the above operators (and the built-in operators) may have functions declared that do not accept an "rmode" variable, instead defaulting to the "RNE" rounding mode. For example:

function {:builtin "fp.sqrt"} FP_SQRT_FLOAT32(float24e8) returns(float24e8);

.
.
.
.

var x : float24e8;
x := FP_SQRT_FLOAT32(x);

Type conversion

Conversion to and from floating point values in Boogie is also done using the "builtin" attribute. The following conversions are well defined, where the argument given to an operator is the data type given to the function. Note that when float is given as an argument or return type, the number of exponent and significand bits are defined as part of the function declaration. Also note that a rounding mode may be defined for each conversion to a floating point type:

Operator Return Type
to_fp(int) float
to_fp(real) float
to_fp(float) float
to_fp(bit_vec)* float
to_fp_unsigned(bit_vec) float
fp.to_ubv(float) bit_vec
fp.to_sbv(float) bit_vec
fp.to_real(float) real

*When converting from a bit vector to a floating point value using this builtin definition, the bitvector must be the same size as the resulting floating point value and, in particular, the first 'exp' bits of the bitvector become the exponent bits of the resulting floating point while the remaining bits become the significand of the resulting floating point. To treat the given bitvector as a signed two's complement integer, a special attribute 'ai' must be given with the value "True." For example, to convert a size 16 bitvector to a IEEE-binary16 floating point value directly, one could declare the following function:

function {:builtin "(_ to_fp 5 11) RNE" } TO_FLOAT16_BITVECTOR(bv) returns(float11e5);

While to convert a bitvector representing signed two's complement integer to a IEEE-binary16 floating point with rounding mode RNA, one could declare the following function:

function {:builtin "(_ to_fp 5 11) RNA" :ai "True" } TO_FLOAT16_SINTEGER_RNA(bv) returns(float11e5);

Clone this wiki locally