-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make loadAllElements
fall back to an iterative implementation once a certain depth is reached
#103
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #103 +/- ##
=========================================
Coverage ? 86.69%
Complexity ? 531
=========================================
Files ? 34
Lines ? 1135
Branches ? 161
=========================================
Hits ? 984
Misses ? 99
Partials ? 52 ☔ View full report in Codecov by Sentry. |
9b0edb7
to
14785d3
Compare
* | ||
* This will fail for IonDatagram if the IonDatagram does not contain exactly one user value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ This was a non-obvious edge case that I ran into while testing my changes, so I figured I'd add a note here.
@@ -31,7 +31,7 @@ internal class StructElementImpl( | |||
) : AnyElementBase(), StructElement { | |||
|
|||
override val type: ElementType get() = ElementType.STRUCT | |||
override val size = allFields.size | |||
override val size: Int get() = allFields.size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ Both ListElementImpl
and SexpElementImpl
(by way of inheritance from SeqElementImpl
) delegate to the underlying collection rather than storing the size in a field. It's necessary to delegate to the underlying collection so that we don't read the size from the collection before it is fully populated by the element loader.
@@ -45,6 +50,8 @@ class IonElementLoaderTests { | |||
// Converting from IonValue to IonElement should result in an IonElement that is equivalent to the | |||
// parsed IonElement | |||
assertEquals(parsedIonValue.toIonElement(), parsedIonElement) | |||
|
|||
assertEquals(tc.expectedElement, parsedIonElement) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ I was surprised to discover that there was previously no assertion that the parsed element was equal to the expected element. 😮
@@ -57,15 +64,29 @@ class IonElementLoaderTests { | |||
|
|||
IonElementLoaderTestCase("1", ionInt(1)), | |||
|
|||
IonElementLoaderTestCase("existence::42", ionInt(1).withAnnotations("existence")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ A couple of test cases were wrong, but otherwise these changes are just adding test cases for every Ion type.
@@ -128,26 +142,41 @@ internal class IonElementLoaderImpl(private val options: IonElementLoaderOptions | |||
IonType.BLOB -> BlobElementImpl(ionReader.newBytes(), annotations, metas) | |||
IonType.LIST -> { | |||
ionReader.stepIn() | |||
val list = ListElementImpl(loadAllElements(ionReader).toImmutableListUnsafe(), annotations, metas) | |||
val listContent = ArrayList<AnyElement>() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it would improve performance to pool these using something like IonJava's RecyclingStack. We could get away with having one per depth level right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm intentionally not using a recycling stack because on L154, we wrap it in an immutable list, and then forget about the underlying mutable reference.
ListElementImpl(listContent.toImmutableListUnsafe(), annotations, metas)
However, it would be worth investigating whether there is a performance benefit to having a pool of ArrayLists that have a relatively larger capacity, and then copying to an exact-sized immutable list.
Issue #, if available:
None
Description of changes:
This ensures that we cannot get a
StackOverflowException
by loading data that has very deeply nested containers.I started out with a purely iterative approach, but I discovered that it introduced a performance regression for some of my sample data, so instead, I made it use the existing recursive approach with a fallback to the iterative approach once the
IonReader
reaches a certain depth of nested containers. (In this case, I've selected 100 as an arbitrary-ish value that ought to be good enough to use recursion for most use cases.)I also noticed that some of my test data had a slight performance improvement using the iterative approach, so I think that we should add an iterative vs recursive flag to the
IonElementLoaderOptions
class. However, it's not straightforward to do so because it has the potential to be a breaking change if done incorrectly, so I want to defer that for a future PR.Performance
ion-java
.master
.MAX_RECURSION_DEPTH
constant is set to 0 (i.e. immediately fall back to the iterative implementation).There is a slight performance regression, but I believe that it is acceptable in order to prevent
StackOverflowError
s from occurring. A purely iterative approach is faster for some data, which is why I plan to make it configurable.Composition of some arbitrary data (~5kb)
0.053 ± 0.001
81128.034 ± 0.004
0.023 ± 0.001
54856.014 ± 0.002
0.025 ± 0.001
54832.016 ± 0.002
0.027 ± 0.001
55912.018 ± 0.002
Sample binary Ion application logs (~20MB)
635.450 ± 75.384
625408257.218 ± 49.211
286.219 ± 28.832
449205604.132 ± 17.491
295.489 ± 32.007
449205608.339 ± 23.524
310.604 ± 22.694
454411169.864 ± 13.729
Sample product catalog data (~40kb)
0.616 ± 0.093
696584.433 ± 0.297
0.310 ± 0.006
632136.199 ± 0.023
0.315 ± 0.004
632080.202 ± 0.026
0.284 ± 0.003
632968.182 ± 0.022
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.