Implement zero-copy ByteString <=> ByteVector conversions #1

mpilquist · 2015-09-07T14:17:46Z

No description provided.

aloiscochard · 2015-09-07T18:00:26Z

:+1:

That would be awesome!

rkuhn · 2015-09-07T19:49:54Z

Step one is done by #2; taking care of composite variants is more complex than I can tackle tonight. Shall we create a more specific ticket for those? (Or even one per direction?)

mpilquist · 2015-09-07T22:48:47Z

That sounds good. A ticket per direction might be good -- I've been experimenting with ByteVector => ByteString and there's not much more we can do without a minor change to scodec-bits. Tracking those details would be best in an issue specific to that conversion.

Another thing to note -- the next major release of scodec-bits changes the indexing on ByteVector from Int to Long, so in the case where a ByteVector is larger than Int.MaxValue, the conversion would have to fail.
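To make the Int vs. Long point concrete, here is a minimal sketch of the kind of size check such a conversion would need. `SizeCheck` and `toIntSize` are hypothetical names used only for illustration; they are not part of scodec-bits, Akka, or this library.

```scala
object SizeCheck {
  /** Narrow a Long byte count to Int, or explain why that is impossible. */
  def toIntSize(size: Long): Either[String, Int] =
    if (size < 0L) Left(s"negative size: $size")
    else if (size > Int.MaxValue.toLong)
      Left(s"$size bytes exceeds Int.MaxValue (${Int.MaxValue})")
    else Right(size.toInt)
}
```

Whether the real conversion should return an `Either`, an `Option`, or throw is a design choice this sketch does not settle.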

rkuhn · 2015-10-01T09:22:07Z

I’m unfortunately quite overwhelmed with other things right now. @rklaehn, would you be interested in tinkering with this?

rklaehn · 2015-10-01T14:16:04Z

@rkuhn So basically akka.util.ByteString and scodec.bits.ByteVector are both unbalanced rope-like data structures representing sequences of bytes, and you want a conversion that does not copy the byte arrays in the leaves. Conversion between leaves is already implemented using this Java hack to get around accessibility restrictions, right?
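For readers following along, a rough sketch of the rope shape described above, using only `java.nio.ByteBuffer`. `Rope`, `Leaf`, `Concat`, `buffers` and `singleBuffer` are hypothetical stand-ins, not the real ByteString or ByteVector internals. It illustrates why a lone leaf can be handed over without copying, while a composite needs either one buffer per leaf or a copy.

```scala
import java.nio.ByteBuffer

// Hypothetical stand-in for a rope of bytes; not the real ByteString or ByteVector.
sealed trait Rope {
  def size: Int
}
final case class Leaf(bytes: Array[Byte], offset: Int, length: Int) extends Rope {
  def size: Int = length
}
final case class Concat(left: Rope, right: Rope) extends Rope {
  val size: Int = left.size + right.size
}

object Rope {
  // Read-only view over a leaf's slice; shares the backing array, copies nothing.
  private def view(leaf: Leaf): ByteBuffer =
    ByteBuffer.wrap(leaf.bytes, leaf.offset, leaf.length).slice().asReadOnlyBuffer()

  /** One read-only buffer per leaf, each sharing that leaf's backing array. */
  def buffers(rope: Rope): List[ByteBuffer] = rope match {
    case leaf: Leaf          => List(view(leaf))
    case Concat(left, right) => buffers(left) ::: buffers(right)
  }

  /** A single buffer: zero-copy for a lone leaf, one copy for a composite. */
  def singleBuffer(rope: Rope): ByteBuffer = rope match {
    case leaf: Leaf => view(leaf)
    case composite =>
      val out = ByteBuffer.allocate(composite.size)
      buffers(composite).foreach(b => out.put(b))
      out.flip()
      out.asReadOnlyBuffer()
  }
}
```

Because the leaf view shares the array, a later write to that array is visible through the buffer. That sharing is exactly what "zero-copy" buys, and also why the source array must be treated as immutable afterwards.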

rkuhn · 2015-10-01T14:27:33Z

Well, kind of. Currently it “works”, but it avoids the copy only in the compact case. Concerning access restrictions @mpilquist told me that on the scodec end there might be some work needed to make the necessary constructors accessible, even for Java.

mpilquist · 2015-10-01T14:57:36Z

I'm not sure what to do about access restrictions. I don't really want to make the various concrete ByteVector subtypes public, though perhaps we could make them private[bits] and use a similar Java accessibility workaround. My concern with doing that is that scodec-bits and this library are more tightly coupled than implied by semantic versioning -- e.g., scodec-bits could evolve in a binary compatible way as far as API usage is concerned, but end up breaking this interop library.

IIRC, we have a similar issue with ByteString, where some of the constructors are private[akka].

rkuhn · 2015-10-01T14:58:54Z

Yes, this interop library may require more frequent releases than normal client code, but that might be worth it. WDYT?

mpilquist · 2015-10-01T15:01:15Z

Works for me, assuming we can avoid exposing internals to Scala clients (via package private, Java workarounds, etc.).

rklaehn · 2015-10-01T16:02:09Z

@rkuhn Yes, that is what I meant by leaves. The compact ByteString1C is the leaf of your rope data structure, right? It is (currently?) the only implementation of CompactByteString. I am not familiar with the scodec one, but it seems to be roughly similar except for some trickery with mutable buffers...

In the long term, wouldn't it be best if akka and scodec could come up with a common ByteString implementation that works for both?

rkuhn · 2015-10-01T18:24:01Z

Well, ideally we would just use ByteString :-) (saying that since Akka is quite a bit older than scodec, but also because we would have quite some difficulty phasing out ByteString given our binary compatibility constraints). But scodec will probably not want to depend on akka-actor and we cannot break ByteString out of that artifact without violating some useful practices (like not splitting packages across multiple artifacts—which is a definite no-go for OSGi [which we unfortunately do support]). What would the migration strategies look like?

mpilquist · 2015-10-01T20:10:58Z

@rkuhn Strangely, I feel that the ideal is that we'd just use ByteVector... :)

In all seriousness though, you are correct that we don't want scodec depending on akka-actor. Also, BitVector is much more important to scodec than ByteVector, and we get quite a bit of convenience by having these two types defined together. Further, ByteVector has been optimized for the types of code paths that occur frequently when encoding/decoding binary. ByteString hasn't been optimized for those code paths (you can see microbenchmark results that compare the two implementations in the scodec-bits/benchmark project).

Internally, the ByteVector structure is a balanced tree, with a mutable scratch buffer at the end, which allows referentially transparent copying into the scratch buffer (where concurrent writes are raced, with the loser having to copy).
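The "raced writes, loser copies" idea can be sketched roughly as follows. This is a simplified model written for illustration, not the scodec-bits implementation; `Scratch`, `AppendBuffer`, `claimed` and `sharesScratchWith` are hypothetical names, and the tree part of the structure is left out entirely.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical model of appending into a shared scratch buffer; not scodec-bits code.
final class Scratch(val bytes: Array[Byte]) {
  // Number of slots already claimed by some appender.
  val claimed = new AtomicInteger(0)
}

final class AppendBuffer private (private val scratch: Scratch, val size: Int) {

  def append(b: Byte): AppendBuffer = {
    val hasRoom = size < scratch.bytes.length
    if (hasRoom && scratch.claimed.compareAndSet(size, size + 1)) {
      // Won the race: slot `size` is ours, and no existing value reads that slot.
      scratch.bytes(size) = b
      new AppendBuffer(scratch, size + 1)
    } else {
      // Lost the race (or out of room): copy our prefix into a fresh scratch.
      val fresh = new Scratch(new Array[Byte](math.max(16, size * 2)))
      System.arraycopy(scratch.bytes, 0, fresh.bytes, 0, size)
      fresh.bytes(size) = b
      fresh.claimed.set(size + 1)
      new AppendBuffer(fresh, size + 1)
    }
  }

  /** Copies out the bytes visible to this value: the first `size` slots only. */
  def toArray: Array[Byte] = java.util.Arrays.copyOf(scratch.bytes, size)

  def sharesScratchWith(other: AppendBuffer): Boolean = scratch eq other.scratch
}

object AppendBuffer {
  def empty: AppendBuffer = new AppendBuffer(new Scratch(new Array[Byte](16)), 0)
}
```

Appending twice to the same value shows the behaviour: the first append reuses the scratch array, the second finds the slot already claimed and copies, and the original value is unchanged either way.

```scala
val base   = AppendBuffer.empty.append(1).append(2)
val first  = base.append(3) // claims slot 2 in the shared scratch
val second = base.append(4) // slot 2 already claimed, so this one copies
```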

As a result, I think an interop layer is our best bet for the foreseeable future. If folks were interested in SLIP-ing something based on BitVector/ByteVector/ByteString, I'd be interested in moving to a standard library type, but only if the performance and convenience of that type were on par with the existing scodec-bits types.

rkuhn mentioned this issue Sep 7, 2015

use non-copying ByteBuffer transfer #2

Merged
