Packed Boolean Arrays #56
base: master
I'll need to think about this but I have some reservations. First, I'm not
convinced that using a Flags enum isn't the better way to handle bit
fields, and that is already fully supported. I also don't necessarily buy
into the idea that endianness applies to bit-fields (at a bit-level) as
that isn't something that is well-defined. In general I've avoided dealing
with bits across the entire framework since it opens a lot of pitfalls and
complicates the API. Is it possible to accomplish what you're trying to do
with a flags enum field?
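To make the flags-enum alternative concrete, here is a minimal sketch in Python rather than the library's C#. `enum.IntFlag` stands in for a C# `[Flags]` enum, and the `Permissions` type and its member names are hypothetical, chosen only for illustration. It shows that a flags enum covers a fixed, named set of bits that fit in one integer field:

```python
from enum import IntFlag


class Permissions(IntFlag):
    """Hypothetical flag set; each member owns one fixed bit."""
    READ = 0x01
    WRITE = 0x02
    EXECUTE = 0x04


# Combine two flags and store them in a single byte.
value = Permissions.READ | Permissions.EXECUTE
raw = int(value).to_bytes(1, "little")
print(raw.hex())  # 05

# Reading the byte back recovers the same named flags.
restored = Permissions(int.from_bytes(raw, "little"))
print(Permissions.EXECUTE in restored)  # True
print(Permissions.WRITE in restored)    # False
```

The number of flags here is fixed when the type is declared, which is the main difference from an array whose length is only known at run time.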
…On Sat, Feb 25, 2017 at 11:48 AM, Davipb ***@***.***> wrote:
Right now, when a boolean array is serialized, every boolean is written to
a byte, even though only one bit of the byte is ever used, leading to a
waste of space.
This pull request introduces PackAttribute. When a boolean array is
marked with [Pack], its contents are "packed" into the bits of the
resulting bytes, which means we can now put up to 8 booleans in a single
byte (reducing by 8 times the space usage!).
For example, the array {true, false, true, false, true, false, true,
false} would normally be serialized into 8 bytes: 0x01 0x00 0x01 0x00
0x01 0x00 0x01 0x00. If marked with [Pack], they now fit into a single
byte: 1010 1010 or 0xAA (On Big-Endian mode, see below).
Endianness is respected. On Little-Endian mode, the bools are written from
the Least-Significant Bit (LSB) to the Most-Significant Bit (MSB): 0000
000X, 0000 00X0, 0000 0X00, etc. On Big-Endian, they're written from the
MSB to the LSB: X000 0000, 0X00 0000, 00X0 0000, etc.
Boolean Arrays not marked with [Pack] do not change, which means this
addition is 100% backwards-compatible.
Commit Summary
- Add packed boolean array nodes
- Add packed boolean array tests
- Make boolean packing optional through PackAttribute
- Add test for unpacked boolean arrays
File Changes
- *A* BinarySerializer.Test/PackedBoolean/ConstantSizePackedBooleanClass.cs <https://github.com/jefffhaynes/BinarySerializer/pull/56/files#diff-0> (20)
- *A* BinarySerializer.Test/PackedBoolean/EndianAwarePackedBooleanClass.cs <https://github.com/jefffhaynes/BinarySerializer/pull/56/files#diff-1> (15)
- *A* BinarySerializer.Test/PackedBoolean/PackedBooleanTests.cs <https://github.com/jefffhaynes/BinarySerializer/pull/56/files#diff-2> (157)
- *A* BinarySerializer.Test/PackedBoolean/UnpackedBooleanClass.cs <https://github.com/jefffhaynes/BinarySerializer/pull/56/files#diff-3> (16)
- *A* BinarySerializer.Test/PackedBoolean/ValidPackedBooleanClass.cs <https://github.com/jefffhaynes/BinarySerializer/pull/56/files#diff-4> (16)
- *M* BinarySerializer/Graph/TypeGraph/ContainerTypeNode.cs <https://github.com/jefffhaynes/BinarySerializer/pull/56/files#diff-5> (17)
- *A* BinarySerializer/Graph/TypeGraph/PackedBooleanArrayTypeNode.cs <https://github.com/jefffhaynes/BinarySerializer/pull/56/files#diff-6> (17)
- *A* BinarySerializer/Graph/ValueGraph/PackedBooleanArrayValueNode.cs <https://github.com/jefffhaynes/BinarySerializer/pull/56/files#diff-7> (105)
- *A* BinarySerializer/PackAttribute.cs <https://github.com/jefffhaynes/BinarySerializer/pull/56/files#diff-8> (14)
Patch Links:
- https://github.com/jefffhaynes/BinarySerializer/pull/56.patch
- https://github.com/jefffhaynes/BinarySerializer/pull/56.diff
--
If you want to build a ship, don't drum up people together to collect wood
and don't assign them tasks and work, but rather teach them to long for the
endless immensity of the sea.
Antoine de Saint-Exupery
This is mostly a way to save space when storing a boolean array, not an ad-hoc bitfield creator. A bitfield lets you change a fixed set of flags, but a packed boolean array lets you store an arbitrary number of booleans compactly.

The "bit manipulation" is only done inside the boolean array's serializer, so it doesn't affect anything else in the library: an array with 9 booleans will take 2 bytes, and the deserializer will use the Field Count to extract the proper number of booleans from the bytes.

It can be used, for example, to store a 2D game map's tile collision data, which would be a big boolean array with hundreds or thousands of entries. Going from 10,000 bytes for a 100 x 100 map to 1,250 bytes is a huge improvement.

As for the endianness, it seemed like a good way to let the user configure the order in which the booleans are stored (since the library's goal is to give you as much control over the output as possible). But the same could be done by adding a property to PackAttribute, or by just defining a standard storage order, so that can easily be changed.
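The size arithmetic and the count-driven read described in this comment can be sketched as follows. This is a language-neutral illustration in Python, not the pull request's C# code; `packed_size` and `unpack_bools` are hypothetical helper names, and the `count` argument plays the role of the Field Count mentioned above.

```python
def packed_size(count):
    """Bytes needed to hold `count` booleans at 8 per byte (rounded up)."""
    return (count + 7) // 8


def unpack_bools(data, count, msb_first=True):
    """Read exactly `count` booleans back out of packed bytes.

    Padding bits in the last byte are ignored because only `count`
    bits are ever examined.
    """
    if count > len(data) * 8:
        raise ValueError("not enough bytes for the requested count")
    result = []
    for i in range(count):
        bit = i % 8
        shift = 7 - bit if msb_first else bit
        result.append(bool((data[i // 8] >> shift) & 1))
    return result


print(packed_size(9))       # 2
print(packed_size(10_000))  # 1250, e.g. a 100 x 100 grid

# Nine booleans stored MSB-first occupy two bytes; the last 7 bits are padding.
nine = unpack_bools(bytes([0b10101010, 0b10000000]), count=9)
print(len(nine), nine[8])   # 9 True
```

Without an externally supplied count, the reader could not tell padding bits from real `false` values, which is why the length has to come from somewhere outside the packed bytes.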
Understood. Ok, I definitely need to think about it since this feels like
a significant foray into the world of bits. If we can pack bools then why
can't we pack other things at weird bit offsets, etc...
…On Sat, Feb 25, 2017 at 1:16 PM, Davipb ***@***.***> wrote:
This is mostly a way to save space when storing a boolean array, not an
ad-hoc bitfield creator. A bitfield lets you change a fixed set of flags,
but a packed boolean array allows you to store any arbitrary number of
booleans in a manner that saves space.
The "bit manipulation" is only done inside the boolean array's serializer,
so it doesn't affect anything else in the library: An array with 9 booleans
will take 2 bytes, and the deserializer will use the Field Count to extract
the proper number of booleans from the bytes.
It can be used, for example, to store a 2D Game Map's tile collision data,
which would be a big boolean array, in the hundreds or thousands. Going
from 10,000 bytes for a 100 x 100 map to 1,250 bytes is a huge improvement.
As for the Endianness, it seemed like a good choice of a way to let
the user configure in which order the booleans are stored (Since the
library's goal is to give you as much control over the output as possible).
But the same could be done by adding a property to PackAttribute, or by
just defining a standard storage order, so that can easily be changed.
238a66f to d399a46
760f928 to ff1e13c
Hey, just wondering, are you still considering adding this to your Serializer?
Right now, when a boolean array is serialized, every boolean is written to a byte, even though only one bit of the byte is ever used, leading to a waste of space.

This pull request introduces `PackAttribute`. When a boolean array is marked with `[Pack]`, its contents are "packed" into the bits of the resulting bytes, which means we can now put up to 8 booleans in a single byte (reducing by 8 times the space usage!).

For example, the array `{true, false, true, false, true, false, true, false}` would normally be serialized into 8 bytes: `0x01 0x00 0x01 0x00 0x01 0x00 0x01 0x00`. If marked with `[Pack]`, they now fit into a single byte: `1010 1010` or `0xAA` (On Big-Endian mode, see below).

Endianness is respected. On Little-Endian mode, the bools are written from the Least-Significant Bit (LSB) to the Most-Significant Bit (MSB): `0000 000X`, `0000 00X0`, `0000 0X00`, etc. On Big-Endian, they're written from the MSB to the LSB: `X000 0000`, `0X00 0000`, `00X0 0000`, etc.

Boolean Arrays not marked with `[Pack]` do not change, which means this addition is 100% backwards-compatible.
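
The two bit orders described above can be illustrated with a short sketch. It is written in Python as a language-neutral illustration and is not the code from this pull request; `pack_bools` and its `msb_first` parameter are hypothetical names, with `msb_first=True` standing in for the "Big-Endian" mode and `msb_first=False` for the "Little-Endian" mode.

```python
def pack_bools(values, msb_first=True):
    """Pack booleans into bytes, 8 per byte.

    msb_first=True puts the first boolean in the most-significant bit
    (X000 0000); msb_first=False puts it in the least-significant bit
    (0000 000X). Unused bits in the last byte stay 0.
    """
    values = list(values)
    out = bytearray((len(values) + 7) // 8)  # round up to whole bytes
    for i, flag in enumerate(values):
        if flag:
            bit = i % 8
            shift = 7 - bit if msb_first else bit
            out[i // 8] |= 1 << shift
    return bytes(out)


pattern = [True, False, True, False, True, False, True, False]
print(pack_bools(pattern, msb_first=True).hex())   # aa
print(pack_bools(pattern, msb_first=False).hex())  # 55
```

The same eight booleans produce `0xAA` in one order and `0x55` in the other, so a reader and a writer have to agree on the bit order for the data to round-trip.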