Replaces persistent collections with immutable collections #101
This change looks large, but ~18 of the modified files are just changing `Persistent*` to `Immutable*`.
I recommend starting with ImmutableList.kt and continuing down to ImmutableMap.kt and the associated tests for those two files. Then, take a look at IonElementLoaderImpl, and then look at the rest in whatever order you like.
var metas = EMPTY_METAS
if (options.includeLocationMeta) {
    val location = ionReader.currentLocation()
    if (location != null) metas = location.toMetaContainer()
}
🗺️ This is partly a stylistic change. The `toMetaContainer()` call wraps the `IonLocation` in an `ImmutableMap` implementation, which is the main performance improvement for the location metadata. The rest of this change is stylistic to try to make it more readable and concise.
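To illustrate why a dedicated single-entry map is cheap, here is a minimal sketch. `SingleEntryImmutableMap`, `LOCATION_KEY`, and `FakeLocation` are hypothetical names, not the PR's actual `IonLocationBackedImmutableMap` or the real `IonLocation` type. Only the key name `$ion_location` comes from the PR description.

```kotlin
// Hypothetical stand-in for a location value; not the real IonLocation type.
data class FakeLocation(val line: Long, val column: Long)

// Key name taken from the PR description.
const val LOCATION_KEY = "\$ion_location"

// A read-only map that always holds exactly one entry under LOCATION_KEY.
// It stores only a reference to the value, so no hash table is allocated.
// AbstractMap here is kotlin.collections.AbstractMap (read-only).
class SingleEntryImmutableMap(private val location: Any) : AbstractMap<String, Any>() {
    override val size: Int get() = 1

    // Lookups are answered directly, without building the entry set.
    override fun containsKey(key: String): Boolean = key == LOCATION_KEY
    override fun get(key: String): Any? = if (key == LOCATION_KEY) location else null

    // The entry set is only allocated when a caller actually iterates the map.
    override val entries: Set<Map.Entry<String, Any>>
        get() = setOf(java.util.AbstractMap.SimpleImmutableEntry(LOCATION_KEY, location))
}
```

In this sketch, `SingleEntryImmutableMap(FakeLocation(1, 2))[LOCATION_KEY]` returns the wrapped value, and any other key returns `null`.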
@@ -89,18 +89,13 @@ internal class IonElementLoaderImpl(private val options: IonElementLoaderOptions
 return handleReaderException(ionReader) {
     val valueType = requireNotNull(ionReader.type) { "The IonReader was not positioned at an element." }

-    val annotations = ionReader.typeAnnotations!!.asList().toEmptyOrPersistentList()
+    val annotations = ionReader.typeAnnotations!!.toImmutableListUnsafe()
🗺️ Previously, this line was creating a list from an array, and then creating a persistent list by copying all of the elements from the intermediate list (an O(n) operation). Now, it just wraps the array with an `ArrayBackedImmutableList` (an O(1) operation), and `ArrayBackedImmutableList` has a smaller object layout to begin with.
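The O(1) wrapping described above can be sketched roughly as follows. `ArrayBackedList` and `wrapAsImmutableListUnsafe` are hypothetical names for illustration; the PR's real `ArrayBackedImmutableList` and `toImmutableListUnsafe()` may be implemented differently.

```kotlin
// Hypothetical sketch of an array-backed read-only list.
// The only field is a reference to the existing array, so wrapping is O(1)
// and no elements are copied. AbstractList is kotlin.collections.AbstractList.
class ArrayBackedList<E>(private val items: Array<E>) : AbstractList<E>() {
    override val size: Int get() = items.size
    override fun get(index: Int): E = items[index]
}

// "Unsafe" in the sense that the array is shared rather than copied:
// the caller must not modify the array after handing it over.
fun <E> Array<E>.wrapAsImmutableListUnsafe(): List<E> = ArrayBackedList(this)
```

Because the array is shared, a later write to the array would be visible through the list, which is presumably why the real function carries the `Unsafe` suffix.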
.toMap()
.toImmutableMapUnsafe()
🗺️ I wanted to create only a minimal set of extension functions, so I didn't create `Iterable<Pair<K, V>>.toImmutableMap()`. Instead, I just use the equivalent `.toMap().toImmutableMapUnsafe()` here.
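A rough sketch of how such a wrapping extension might look, using Kotlin class delegation. `WrappedImmutableMap` and `wrapAsImmutableMapUnsafe` are hypothetical names; the PR's actual `ImmutableMap` and `toImmutableMapUnsafe()` may differ.

```kotlin
// Hypothetical read-only wrapper: every Map member is forwarded to `delegate`.
// Only the read-only Map interface is exposed, and nothing is copied.
class WrappedImmutableMap<K, V>(private val delegate: Map<K, V>) : Map<K, V> by delegate {
    // Class delegation does not forward equals/hashCode/toString, so do it by hand.
    override fun equals(other: Any?): Boolean = delegate == other
    override fun hashCode(): Int = delegate.hashCode()
    override fun toString(): String = delegate.toString()
}

// "Unsafe" because the receiver is shared, not copied: the caller must not
// keep a mutable reference to it and modify it later.
fun <K, V> Map<K, V>.wrapAsImmutableMapUnsafe(): Map<K, V> = WrappedImmutableMap(this)
```

With this in place, `listOf("a" to 1).toMap().wrapAsImmutableMapUnsafe()` is safe in practice because the map produced by `toMap()` is not referenced anywhere else.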
* We cannot use `Map.of(...)` because those were introduced in JDK 9, and we still
* support JDK 8.
Suggested change:
- * We cannot use `Map.of(...)` because those were introduced in JDK 9, and we still
- * support JDK 8.
+ * We cannot use `Map.of(...)` because those were introduced in JDK 9, and we still
+ * support JDK 8.
+ * See: https://github.com/amazon-ion/ion-element-kotlin/issues/102
* We cannot use `List.of(...)` because those were introduced in JDK 9, and we still
* support JDK 8.
Suggested change:
- * We cannot use `List.of(...)` because those were introduced in JDK 9, and we still
- * support JDK 8.
+ * We cannot use `List.of(...)` because those were introduced in JDK 9, and we still
+ * support JDK 8.
+ * See: https://github.com/amazon-ion/ion-element-kotlin/issues/102
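For context, the closest standard-library options that do exist on JDK 8 are the `java.util.Collections.unmodifiable*` wrappers. The sketch below is not what this PR does (it adds its own wrappers instead); the helper names `jdk8UnmodifiableListOf` and `jdk8UnmodifiableMapOf` are hypothetical.

```kotlin
import java.util.Collections

// JDK 8 compatible stand-in for List.of(...): copy the varargs into a list
// and wrap it in an unmodifiable view.
fun <E> jdk8UnmodifiableListOf(vararg elements: E): List<E> =
    Collections.unmodifiableList(elements.toList())

// JDK 8 compatible stand-in for Map.of(...), built from key-value pairs.
fun <K, V> jdk8UnmodifiableMapOf(vararg pairs: Pair<K, V>): Map<K, V> =
    Collections.unmodifiableMap(mapOf(*pairs))
```

Note that these wrappers add one object on top of an already allocated collection, whereas `List.of(...)` and `Map.of(...)` on JDK 9+ return compact dedicated implementations.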
I've had a few offline questions that can be roughly summarized as "Why are the persistent collections so bad for performance?" I don't think persistent collections are bad, per se, but they might not be the right choice here.
Well, I just looked again, and Is it more valuable to minimize dependencies on other libraries or to use the interface of another library?
If we can use Do we still need or benefit from our own specialized Ion location map implementation?
We still need the specialized Ion location map. It accounts for the majority of the performance improvement in the cases when we're filling location metadata.
That's what I thought. So if we use I'm finding documentation on
While the [API details] section includes:
Then the implication is that Checking the implementation it looks like this behavior comes from creating an empty immutable thing and then adding the other elements to it while still using the storage of source collection, as described in the README
We can eliminate the interfaces and collections I added, aside from the special IonLocation map.
That's what I would have hoped. However, That function calls That is what I was trying to avoid. However, I did end up finding
Issue #, if available:
None. However, an offline discussion with someone who was trying out IonElement indicated that loading all values as IonElement was allocating up to 4x as much memory as loading the equivalent data as IonValue.
Description of changes:
This replaces `PersistentList` and `PersistentMap` with custom `ImmutableList` and `ImmutableMap` implementations. These implementations are much simpler than `PersistentList` and `PersistentMap`; they can wrap any existing `List`/`Map` with very little overhead. I chose to do this because a lightweight wrapper will require less allocation than having to copy the data to a completely new instance of a `List` or `Map`. I think my implementations are pretty much the most minimal implementation one could make for an immutable wrapper of `List` or `Map`.
I've also added an `ArrayBackedImmutableList` so that we don't need to allocate to convert the `String[]` of annotations into a `List` before wrapping it to make an `ImmutableList`.
Finally, I've added an `IonLocationBackedImmutableMap`, a special-purpose map implementation that can only contain one key-value pair, and the key can only be `$ion_location`. This saves a huge amount of allocation and significantly speeds up reading with location metadata to the point where reading `IonElement` with the additional overhead of location data is faster than reading `IonValue` (which has no location metadata).
Compatibility
The public API only exposes `List` and `Map`. `PersistentList` and `PersistentMap` have never been part of the API, so it is allowable to replace them with a different implementation.
However, Java users can still call mutating methods on a Kotlin `List`, so I did some manual testing to confirm that the behavior has not changed in that case. Both `PersistentList` and my `ImmutableList` implementations will throw `UnsupportedOperationException` if a method such as `add()` is called. The `PersistentMap` and my `ImmutableMap` implementations behave the same way.
Performance Data
Composition of some arbitrary data (~5kb)

| Time per op | Normalized allocation rate |
|---|---|
| 0.053 ± 0.001 | 81128.034 ± 0.004 |
| 0.047 ± 0.001 | 215416.030 ± 0.004 |
| 0.032 ± 0.001 | 131720.020 ± 0.003 |
| 0.024 ± 0.001 | 51952.015 ± 0.002 |

Sample binary Ion application logs (~20MB)

| Time per op | Normalized allocation rate |
|---|---|
| 635.450 ± 75.384 | 625408257.218 ± 49.211 |
| 630.640 ± 215.099 | 2348935671.047 ± 146.942 |
| 452.451 ± 283.069 | 2245469209.217 ± 113.536 |
| 312.335 ± 24.048 | 446810105.670 ± 14.961 |

Sample product catalog data (~40kb)

| Time per op | Normalized allocation rate |
|---|---|
| 0.616 ± 0.093 | 696584.433 ± 0.297 |
| 0.559 ± 0.028 | 2999688.359 ± 0.046 |
| 0.513 ± 0.007 | 2854072.329 ± 0.041 |
| 0.286 ± 0.016 | 631800.182 ± 0.021 |

For two of the samples, this PR has a normalized allocation rate that is about 1/4 of the current rate. For the one sample that displayed less of an improvement, it might be because it does not have as many lists/structs as the other data or it could be that it has a higher incidence of annotations. Either way, I'm not too concerned. Cumulatively, this change and #100 have improved the normalized memory allocation rate for all these samples from being ~4x more than `IonValue` to being less than `IonValue` (and in some cases, significantly less).
Performance Data (with IonLocation metadata)
`(*)` indicates performance without location because location data is not supported by that API
Composition of some arbitrary data (~5kb)

| Time per op | Normalized allocation rate |
|---|---|
| (*) 0.053 ± 0.001 | (*) 81128.034 ± 0.004 |
| 0.129 ± 0.003 | 714456.083 ± 0.011 |
| 0.076 ± 0.001 | 368472.048 ± 0.006 |
| 0.033 ± 0.001 | 113096.021 ± 0.002 |

Sample binary Ion application logs (~20MB)

| Time per op | Normalized allocation rate |
|---|---|
| (*) 635.450 ± 75.384 | (*) 625408257.218 ± 49.211 |
| 2492.157 ± 460.771 | 8622042546.800 ± 685.087 |
| 1296.107 ± 182.448 | 3979394547.314 ± 175.082 |
| 453.957 ± 62.876 | 679669196.884 ± 42.721 |

Sample product catalog data (~40kb)

| Time per op | Normalized allocation rate |
|---|---|
| (*) 0.616 ± 0.093 | (*) 696584.433 ± 0.297 |
| 1.806 ± 0.055 | 10717049.154 ± 0.136 |
| 0.905 ± 0.025 | 4688840.633 ± 0.452 |
| 0.326 ± 0.006 | 1026104.209 ± 0.027 |

This PR brings a ~75% decrease in the allocation rate and a ~60% decrease in time per op compared to the previous commit, and cumulatively, the changes from 1.2.0 to this PR provide up to a 90% decrease in the allocation rate and up to an 80% decrease in time per op.
When compared to no location metadata, there is still a performance penalty associated with populating the location metadata, but while it was once 150-300% more time per op, it is now only 10-50% more time per op.
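The percentages quoted here can be reproduced from the benchmark figures with two small helpers. The helper names are hypothetical. The example values are the ~5kb sample; which benchmark run each value belongs to is an inference from the surrounding text, since the extracted tables carry no row labels.

```kotlin
// Percentage by which `after` is smaller than `before`.
fun percentDecrease(before: Double, after: Double): Double =
    (before - after) / before * 100.0

// Percentage by which `value` exceeds `baseline`.
fun percentMore(baseline: Double, value: Double): Double =
    (value - baseline) / baseline * 100.0

// Assumed pairing (inferred, not stated in the tables), ~5kb sample:
// allocation with location metadata, previous commit vs. this PR.
val allocationDecrease = percentDecrease(before = 368472.048, after = 113096.021)

// Assumed pairing: time per op for this PR, without vs. with location metadata.
val locationPenalty = percentMore(baseline = 0.024, value = 0.033)
```

Under those assumptions, `allocationDecrease` comes out at roughly 69 and `locationPenalty` at roughly 37.5, in line with the approximate figures in the text.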
Populating the location metadata obviously requires more memory allocation than not doing it, so including `IonValue` here is not necessarily fair, but it is worth noting that for all of the sample data, the time per op is now ~40% less than `IonValue`, even with the location metadata.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.