We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
I have a use case where I want to send a fairly small arrays/scalars via Arrow IPC. Memory efficiency is quite important for us.
One simple way of doing this would be to put the ints in a Record batch and serialize.
However, I notice that serializing an int, 3 chars, and a bool takes a whopping 512 bytes on my machine:
https://gist.github.com/JerAguilon/8c70107a576c0b04b5846ad8207651e7
STDOUT: batch: pyarrow.RecordBatch a: int64 b: string c: bool, num_rows: 1 buf.size: 512
Things like the schema are well known ahead of time, I'd like to minimize as much data in the payload as possible.
In Arrow IPC (the C++ library works for me), is there some way to serialize just a scalar or array? I see only a RecordBatchWriter exposed:
RecordBatchWriter
https://arrow.apache.org/docs/cpp/api/ipc.html#_CPPv416MakeStreamWriterPN2io12OutputStreamERKNSt10shared_ptrI6SchemaEERK15IpcWriteOptions
C++
The text was updated successfully, but these errors were encountered:
No branches or pull requests
Describe the usage question you have. Please include as many useful details as possible.
I have a use case where I want to send a fairly small arrays/scalars via Arrow IPC. Memory efficiency is quite important for us.
One simple way of doing this would be to put the ints in a Record batch and serialize.
However, I notice that serializing an int, 3 chars, and a bool takes a whopping 512 bytes on my machine:
https://gist.github.com/JerAguilon/8c70107a576c0b04b5846ad8207651e7
Things like the schema are well known ahead of time, I'd like to minimize as much data in the payload as possible.
In Arrow IPC (the C++ library works for me), is there some way to serialize just a scalar or array? I see only a
RecordBatchWriter
exposed:https://arrow.apache.org/docs/cpp/api/ipc.html#_CPPv416MakeStreamWriterPN2io12OutputStreamERKNSt10shared_ptrI6SchemaEERK15IpcWriteOptions
Component(s)
C++
The text was updated successfully, but these errors were encountered: