Boon Binary Object Notation for Streaming and Framing (BBONSF)
All types are expressed in Big Endian Format. The specification is designed to be easy to parse and easy to understand. Some size optimizations were given up for ease of use.
BBONSF has the following basic data encodings:
- Type - Type enum, or a partial 8-bit number
- Octet - 8-bit signed number with a minimum value of -128 and a maximum value of 127 (inclusive)
- UInt - 16-bit unsigned number, 0 to 65,535 inclusive
- Int - 32-bit signed two's complement integer
- VarInt - Variable-sized Int
- Decimal (Number) - Variable-size, variable-precision number; all numbers of all sizes can be expressed
- String - UTF-8 encoded string with size or a numeric type
- Array - Uniform List of data values with starting size
- Stream - Uniform List of data values with last flag and chunk size
- Pair - Key value pair
- List - Non Uniform List of data value starting with size
There are only 11 main data encoding types. With these 11 types you should be able to encode all types from all languages efficiently.
UINT is a UINT16, and INT is an INT32.
There are no separate UINT8, INT8, UINT16, INT16, UINT32, INT32, UINT64, INT64 types, etc.
There is also no Char, Short, Long, etc.
Java and C# Mappings:
- Type - Type enum or byte or boolean
- Octet - Byte
- UInt - char or int or short
- Int - int or short or long
- VarInt - int or short or long or BigInteger
- Decimal - long, float, double, short, byte, BigInteger, BigDecimal, etc.
- String - String, StringBuilder, StringBuffer, CharBuffer, etc. (Also float, BigDecimal)
- Array - primitive arrays, String arrays, Maps, Objects
- Stream - primitive arrays, String arrays, Maps, Objects
- Pair - Map Key/Value pairs, Object properties
- List - List of objects or an object
There is no concept of Map or Object, but one could use the above constructs to map Objects and Maps. A List of Pairs is an object of sorts, or a map.
This specification does not define language mappings, just the data expressed on the wire and the type annotations for that data. From this, one could build more complicated mappings.
Basic types
Numeric Types
- 1 - Octet
- 2 - UInt
- 3 - Int
- 4 - VarInt
- 5 - Decimal
Text
- 6 - String
Arrays
- 7 - Octet Array
- 8 - UInt Array
- 9 - Int Array
- 10 - VarInt Array
- 11 - Decimal Array
- 12 - String Array
Streams
- 13 - Octet Stream
- 14 - UInt Stream
- 15 - Int Stream
- 16 - VarInt Stream
- 17 - Decimal Stream
- 18 - String Stream
Special Data Structs
- 19 - Pair STRING : ENCODED VALUE
- 20 - Pair INT : ENCODED VALUE
- 21 - Array of Pair STRING VAL
- 22 - Array of Pair INT VAL
- 23 - List
- 23 - Stream List (streams of lists)
To encode a Type enum, you do this:
-128 + TYPE_ENUM = ON_THE_WIRE_ENCODING_OF_TYPE_ENUM
For example, Octet (enum 1) is sent as -127 and List (enum 23) is sent as -105.
All values are preceded by their type unless the value is contained in the type byte itself.
0 is special: it means NULL, 0, and FALSE. 0 is both a value and a type enum, but unlike the other type enums it is not offset; it is written as 0.
Any Type byte greater than -101 is that value itself. -127 through -101 are reserved for type enumeration data.
Enum, name, and on-the-wire value:
* 1 - Octet -127
* 2 - UInt -126
* 3 - Int -125
* 4 - VarInt -124
* 5 - Decimal -123
* 6 - String -122
* 7 - Octet Array -121
* 8 - UInt Array -120
* 9 - Int Array -119
* 10 - VarInt Array -118
* 11 - Decimal Array -117
* 12 - String Array -116
* 13 - Octet Stream -115
* 14 - UInt Stream -114
* 15 - Int Stream -113
* 16 - VarInt Stream -112
* 17 - Decimal Stream -111
* 18 - String Stream -110
* 19 - Pair STRING : ENCODED VALUE -109
* 20 - Pair INT : ENCODED VALUE -108
* 21 - Array of Pair STRING VAL -107
* 22 - Array of Pair INT VAL -106
* 23 - List -105
* RESERVED -104
* RESERVED -103
* RESERVED -102
* RESERVED -101
* -100 through -1: the actual INT value
* 0 means INT value 0, NULL, and FALSE
* 1 means INT value 1 and TRUE
* Greater than 1: the actual INT value
A Type byte can therefore hold 228 of the 256 OCTET values (-100 through 127) directly, which is more than half.
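The mapping between type enums, wire bytes, and inline values can be sketched in a few lines of Python. This is only an illustration that follows the table above (enum 1 maps to -127, enum 23 to -105); the names `type_to_wire` and `classify` are hypothetical and not part of Boon. The source does not say what the byte -128 means, so the sketch rejects it.

```python
RESERVED_MIN, RESERVED_MAX = -127, -101  # wire bytes reserved for type enums


def type_to_wire(type_enum: int) -> int:
    """Map a type enum to its signed wire byte (enum 1 -> -127, per the table)."""
    if not 1 <= type_enum <= 27:  # 24..27 land on the RESERVED slots
        raise ValueError("type enum outside the reserved range")
    return -128 + type_enum


def classify(wire_byte: int):
    """Return ("type", enum) for a reserved byte, else ("value", int)."""
    if not -128 <= wire_byte <= 127:
        raise ValueError("not a signed octet")
    if wire_byte == -128:
        # Assumption: -128 is left undefined by the text, so it is rejected here.
        raise ValueError("-128 is not defined by the specification text")
    if RESERVED_MIN <= wire_byte <= RESERVED_MAX:
        return ("type", wire_byte + 128)
    return ("value", wire_byte)  # -100..127 carry their own value
```

For instance, `classify(-125)` reports the Int type, while `classify(1)` is simply the value 1 (which also stands for TRUE).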
TYPE_STRING -> SIZE -> STRING DATA -> END
Location Description Contents
byte 0: SIZE 0 Unsigned two byte int
byte 1: SIZE 1
byte 2: STRING DATA String encoded as UTF-8
byte N: STRING DATA String encoded as UTF-8
Strings can hold up to 65,534 UTF-8 encoded characters. If SIZE is equal to 65,535, it means that the next String is considered part of this String.
65,535 is considered MORE_LEFT.
A string larger than 65,534 would be encoded as follows:
TYPE_STRING -> MORE_LEFT -> STRING DATA -> SIZE -> STRING DATA
Size is a UINT value.
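A rough Python sketch of the String framing follows. It rests on two assumptions the text does not spell out: SIZE counts UTF-8 bytes, and a segment marked MORE_LEFT always carries exactly 65,534 bytes. The function name `encode_string` is hypothetical.

```python
import struct

TYPE_STRING = -122    # wire byte for String, from the type table
MORE_LEFT = 65_535    # SIZE sentinel: another segment follows
MAX_SEGMENT = 65_534  # largest segment the text allows


def encode_string(text: str) -> bytes:
    """Frame a string as TYPE_STRING -> [MORE_LEFT -> DATA]* -> SIZE -> DATA."""
    data = text.encode("utf-8")
    out = bytearray(struct.pack(">b", TYPE_STRING))
    # Assumption: every MORE_LEFT segment is exactly MAX_SEGMENT bytes long.
    while len(data) > MAX_SEGMENT:
        out += struct.pack(">H", MORE_LEFT) + data[:MAX_SEGMENT]
        data = data[MAX_SEGMENT:]
    # The final segment carries its real length as a big-endian UINT.
    out += struct.pack(">H", len(data)) + data
    return bytes(out)
```

A short string such as `"hi"` becomes five bytes: the type byte, the two SIZE bytes, and the two data bytes.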
The Int would be encoded as follows:
Location Description Contents
byte 0: Type Enum -125
byte 1: INT OCTET 0 INT Octet
byte 2: INT OCTET 1 INT Octet
byte 3: INT OCTET 2 INT Octet
byte 4: INT OCTET 3 INT Octet
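As an illustration, an Int can be written with Python's `struct` module, assuming the four Int octets directly follow the type byte in big-endian order. The helper names `encode_int` and `decode_int` are hypothetical.

```python
import struct

TYPE_INT = -125  # wire byte for Int, from the type table


def encode_int(value: int) -> bytes:
    """Type byte followed by a big-endian 32-bit two's complement integer."""
    return struct.pack(">bi", TYPE_INT, value)


def decode_int(buf: bytes) -> int:
    """Read back an Int frame, checking the type byte first."""
    type_byte, value = struct.unpack_from(">bi", buf, 0)
    if type_byte != TYPE_INT:
        raise ValueError("not an Int frame")
    return value
```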
The Decimal format is similar to the database formats DECIMAL and NUMERIC, or to Java's BigDecimal. Decimal is important for preserving exact precision, for example with monetary data.
TYPE_DECIMAL -> PRECISION -> SCALE -> BYTES -> END
Decimal is stored in binary format. The number is a byte array containing the two's-complement binary representation of an integer. PRECISION determines the size of the array. SCALE determines where to put the decimal point.
PRECISION and SCALE are both UINT values.
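The Decimal layout might be sketched as below. The sketch reads "PRECISION determines the size of the array" literally and treats PRECISION as the byte length of the unscaled integer; that is an assumption, as is the use of the shortest two's-complement form that Python produces. The names `encode_decimal` and `decode_decimal` are hypothetical.

```python
import struct
from decimal import Decimal

TYPE_DECIMAL = -123  # wire byte for Decimal, from the type table


def encode_decimal(value: Decimal) -> bytes:
    """TYPE_DECIMAL -> PRECISION -> SCALE -> BYTES, all big-endian."""
    sign, digits, exponent = value.as_tuple()
    if not isinstance(exponent, int):
        raise ValueError("NaN and Infinity cannot be encoded")
    unscaled = int("".join(map(str, digits)))
    if exponent > 0:
        unscaled *= 10 ** exponent   # SCALE is unsigned, so fold it in
    if sign:
        unscaled = -unscaled
    scale = max(-exponent, 0)        # digits after the decimal point
    length = (unscaled.bit_length() + 8) // 8
    raw = unscaled.to_bytes(length, "big", signed=True)
    # Assumption: PRECISION is the number of bytes in the array.
    return struct.pack(">bHH", TYPE_DECIMAL, len(raw), scale) + raw


def decode_decimal(buf: bytes) -> Decimal:
    type_byte, precision, scale = struct.unpack_from(">bHH", buf, 0)
    if type_byte != TYPE_DECIMAL:
        raise ValueError("not a Decimal frame")
    unscaled = int.from_bytes(buf[5:5 + precision], "big", signed=True)
    digits = tuple(int(c) for c in str(abs(unscaled)))
    return Decimal((1 if unscaled < 0 else 0, digits, -scale))
```

With these assumptions, 12.34 is carried as the integer 1234 in two bytes with a SCALE of 2.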
The VarInt format is similar to database formats NUMERIC or Java's BigInteger.
TYPE_VARINT -> SIZE -> BYTES -> END
VarInt is stored in binary format. The number is a byte array containing the two's-complement binary representation of an integer.
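A minimal sketch of a VarInt frame, assuming SIZE is a UINT (as it is for Strings and Arrays) and that the VarInt type byte is -124 as listed in the type table. The function names are hypothetical.

```python
import struct

TYPE_VARINT = -124  # wire byte for VarInt, from the type table


def encode_varint(value: int) -> bytes:
    """Type byte, UINT size, then big-endian two's complement bytes."""
    length = (value.bit_length() + 8) // 8  # leaves room for the sign bit
    raw = value.to_bytes(length, "big", signed=True)
    if len(raw) > 65_535:
        raise ValueError("integer too large for a UINT size field")
    return struct.pack(">bH", TYPE_VARINT, len(raw)) + raw


def decode_varint(buf: bytes) -> int:
    type_byte, size = struct.unpack_from(">bH", buf, 0)
    if type_byte != TYPE_VARINT:
        raise ValueError("not a VarInt frame")
    return int.from_bytes(buf[3:3 + size], "big", signed=True)
```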
TYPE_INT_STREAM -> IS_LAST_FLAG -> SIZE -> [UNIFORM ARRAY OF INTS] -> END
Location Description Contents
byte 0: DONE FLAG 0 Means last Chunk, 1 means 1 or more chunks left
byte 1: SIZE 0 Unsigned two byte int
byte 2: SIZE 1
byte 3: INT 1 OCTET 0
byte 4: INT 1 OCTET 1
byte 5: INT 1 OCTET 2
byte 6: INT 1 OCTET 3
byte 7: INT 2 OCTET 0
byte 8: INT 2 OCTET 1
byte 9: INT 2 OCTET 2
byte 10: INT 2 OCTET 3
byte N: INT N OCTET N
Streams can hold up to 65,535 values per chunk, and there can be N chunks. To send more than one chunk, one must do this:
TYPE_INT_STREAM -> NOT_DONE_FLAG -> SIZE -> INT ARRAY DATA -> ... -> DONE_FLAG -> SIZE -> INT ARRAY DATA
Size is a UINT value.
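The chunked Int stream can be sketched as follows. The sketch assumes the type byte appears once at the start, that each chunk is a flag byte (1 for more chunks, 0 for the last chunk) followed by a UINT size and the Ints, and that an empty stream is a single last chunk of size 0. The name `encode_int_stream` and the `chunk_size` parameter are hypothetical.

```python
import struct

TYPE_INT_STREAM = -113  # wire byte for Int Stream, from the type table
MAX_CHUNK = 65_535      # most Ints a single chunk can announce


def encode_int_stream(values, chunk_size: int = MAX_CHUNK) -> bytes:
    """Frame Ints as TYPE -> (FLAG -> SIZE -> INTS) repeated per chunk."""
    if not 1 <= chunk_size <= MAX_CHUNK:
        raise ValueError("chunk_size must fit in a UINT")
    values = list(values)
    # Assumption: an empty stream is sent as one empty last chunk.
    chunks = [values[i:i + chunk_size]
              for i in range(0, len(values), chunk_size)] or [[]]
    out = bytearray(struct.pack(">b", TYPE_INT_STREAM))
    for index, chunk in enumerate(chunks):
        more = 0 if index == len(chunks) - 1 else 1  # 0 marks the last chunk
        out += struct.pack(">BH", more, len(chunk))
        out += struct.pack(f">{len(chunk)}i", *chunk)
    return bytes(out)
```

Calling `encode_int_stream([1, 2, 3], chunk_size=2)` produces two chunks: a not-done chunk holding 1 and 2, then a last chunk holding 3.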
TYPE_OCTET_ARRAY -> SIZE -> OCTET DATA -> END
Location Description Contents
byte 0: SIZE 0 Unsigned two byte int
byte 1: SIZE 1
byte 2: OCTET DATA OCTET bytes
byte N: OCTET DATA OCTET bytes
Arrays can hold up to 65,534 values, in this case Octets (bytes).
If SIZE is equal to 65,535, it means that the next Array is considered part of this Array.
65,535 is considered MORE_LEFT.
An Array larger than 65,534 would be encoded as follows:
TYPE_OCTET_ARRAY -> MORE_LEFT -> OCTET DATA -> SIZE -> OCTET DATA
Size is a UINT value.
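Reading an Octet Array back might look like the sketch below. It assumes that a segment whose SIZE is MORE_LEFT carries exactly 65,534 octets, which the text implies but does not state outright. The name `decode_octet_array` is hypothetical.

```python
import struct

TYPE_OCTET_ARRAY = -121  # wire byte for Octet Array, from the type table
MORE_LEFT = 65_535       # SIZE sentinel: another segment follows
MAX_SEGMENT = 65_534     # largest segment the text allows


def decode_octet_array(buf: bytes) -> bytes:
    """Join all segments of an Octet Array frame into one bytes object."""
    (type_byte,) = struct.unpack_from(">b", buf, 0)
    if type_byte != TYPE_OCTET_ARRAY:
        raise ValueError("not an Octet Array frame")
    pos, out = 1, bytearray()
    while True:
        (size,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        # Assumption: a MORE_LEFT segment is exactly MAX_SEGMENT octets long.
        length = MAX_SEGMENT if size == MORE_LEFT else size
        segment = buf[pos:pos + length]
        if len(segment) != length:
            raise ValueError("truncated frame")
        out += segment
        pos += length
        if size != MORE_LEFT:  # a real size marks the final segment
            return bytes(out)
```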
YourKit supports the Boon open source project with its full-featured Java Profiler. YourKit, LLC is the creator of innovative and intelligent tools for profiling Java and .NET applications. Take a look at YourKit's leading software products: YourKit Java Profiler and YourKit .NET Profiler.