From 70a676aecb0f36c9989a9fd7ec1e09a73347c4d8 Mon Sep 17 00:00:00 2001
From: Greg Dennis
Date: Fri, 26 Apr 2024 14:27:57 +1200
Subject: [PATCH 1/5] add post about performance improvements
---
 .../2024-04-17-more-performance-updates.md | 138 ++++++++++++++++++
 1 file changed, 138 insertions(+)
 create mode 100644 _posts/2024/2024-04-17-more-performance-updates.md
diff --git a/_posts/2024/2024-04-17-more-performance-updates.md b/_posts/2024/2024-04-17-more-performance-updates.md
new file mode 100644
index 0000000..6ea34a3
--- /dev/null
+++ b/_posts/2024/2024-04-17-more-performance-updates.md
@@ -0,0 +1,138 @@
+---
+title: "More Performance Updates"
+date: 2024-04-26 09:00:00 +1200
+tags: [json-pointer, json-patch, json-schema, architecture, performance]
+toc: true
+pin: false
+---
+
+I've been focused on performance, specifically memory management, a lot recently. My latest target has been _JsonPointer.Net_.
+
+I've made a significant update that I hope will make everyone's day a little better. This post explores the architectural differences and the fallout of the changes in the other libs.
+
+## Regarding performance
+
+Parsing numbers are _way_ down!
+
+This benchmark measures parsing the set of pointers in the spec _n_ times.
+
+| Version| n | Mean | Error | StdDev | Gen0 | Allocated |
+|------- |------ |-----------:|----------:|-----------:|---------:|----------:|
+| v4.0.1 | 1 | 2.778 us | 0.0546 us | 0.1025 us | 4.1962 | 8.57 KB |
+| v5.0.0 | 1 | 1.718 us | 0.0335 us | 0.0435 us | 1.4915 | 3.05 KB |
+| v4.0.1 | 10 | 26.749 us | 0.5000 us | 0.7330 us | 41.9617 | 85.7 KB |
+| v5.0.0 | 10 | 16.719 us | 0.3219 us | 0.4186 us | 14.8926 | 30.47 KB |
+| v4.0.1 | 100 | 286.995 us | 5.6853 us | 12.5983 us | 419.4336 | 857.03 KB |
+| v5.0.0 | 100 | 157.159 us | 2.5567 us | 2.1350 us | 149.1699 | 304.69 KB |
+
+Run time is down 45% and memory allocations are down 65%!
+
+But... that's just parsing. Pointer math times actually went up.
+
+This benchmark takes those same pointers and just combines them to themselves.
+
+| Version| n | Mean | Error | StdDev | Gen0 | Allocated |
+|------- |------ |------------:|------------:|------------:|---------:|----------:|
+| v4.0.1 | 1 | 661.2 ns | 12.86 ns | 11.40 ns | 1.1473 | 2.34 KB |
+| v5.0.0 | 1 | 1.912 us | 0.0376 us | 0.0586 us | 1.1101 | 2.27 KB |
+| v4.0.1 | 10 | 6,426.4 ns | 124.10 ns | 121.88 ns | 11.4746 | 23.44 KB |
+| v5.0.0 | 10 | 18.830 us | 0.3746 us | 0.4600 us | 11.1084 | 22.73 KB |
+| v4.0.1 | 100 | 64,469.6 ns | 1,309.01 ns | 1,093.08 ns | 114.7461 | 234.38 KB |
+| v5.0.0 | 100 | 188.406 us | 3.6606 us | 5.1317 us | 111.3281 | 227.34 KB |
+
+The run time just about tripled, but the memory usage went down slightly. We'll talk about the reason behind the increase in the next section about the architecture changes.
+
+## A new architecture and a new API
+
+In previous versions, `JsonPointer` was a class that held multiple `PointerSegment`s, and each `PointerSegment` held the decoded string. Whenever you needed the full pointer, the segments would be re-encoded and concatenated. To help matters, it would cache this string, so if you needed the full pointer again, it would just give you the previously calculated value.
+
+In v5, `JsonPointer` is a struct that holds the entire pointer as a string along with an array of `Range` structs which provide the indices in the string for each segment. The string and array are still on the heap, but they're the only memory that needs allocating. And when parsing, the string is already provided by the user.
+
+### Combining pointers
+
+In previous versions, when one pointer needed to be concatenated with another pointer (or any additional segments), the resulting pointer could just take the `PointerSegment` instances it wanted without having to allocate new ones. That means that multiple pointers can actually share `PointerSegment` instances.
+
+However, because the new architecture just stores the entire string, it has to basically build a new string and then parse the whole thing to get the new ranges. This explains the longer run time and why the memory improvement isn't as significant.
+
+I'm continuing to work on this, and hopefully I'll have updates out soon to address this.
+
+### API changes
+
+As mentioned, `JsonPointer` is now a struct (as is `RelativeJsonPointer`).
+
+I've also replaced the `.Segments` collection with a `.SegmentCount` property and an indexer that gets you the `ReadOnlySpan` that represents the pointer-encoded segment.
+
+To address that you're not getting decoded string segments, I've also defined some extension methods:
+
+- `.SegmentEquals()` - an allocation-free string comparison extension on `ReadOnlySpan` that accounts for JSON Pointer's need to encode the `~` and `/` characters.
+- `.GetSegmentName()` - decodes a segment into a string.
+- `.GetSegmentIndex()` - parses the segment int an int (int segments don't have to worry about encoding though).
+
+## Fallout
+
+While that sums up the changes made to JSON Pointer, it caused a few changes in both _JsonPatch.Net_ and _JsonSchema.Net_.
+
+The update didn't cause any API changes in _JsonPatch.Net_, so I'm not going really cover it except to say that it was updated. There was some internal code I had to change, but that's it.
+
+But when I updated _JsonSchema.Net_, it seemed a good time to make some other changes that I discovered while trying to apply the [model-less paradigm](./logic-without-models) to evaluating schemas.
+
+> You can view and play with the new concept in my [schema/experiment-modelless-schema](https://github.com/gregsdennis/json-everything/tree/schema/experiment-modelless-schema) branch.
+{: .prompt-info }
+
+While those updates did result in a few breaking changes, like the previous few major versions, unless you're building your own keywords, it's not likely going to affect you much.
+
+## _JsonSchema.Net_ updates
+
+While I can say that the performance noticeably improved, it's not quite as much as I had hoped. I think part of that is the pointer math problem I mentioned before; evaluating schemas _does_ do a lot of pointer math. So if I can figure that out, evaluating schemas will just benefit.
+
+### Performance
+
+This benchmark runs the JSON Schema Test Suite _n_ times.
+
+| Version | n | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
+|----------|--- |-----------:|---------:|---------:|------------:|-----------:|----------:|
+| v6.1.0 | 1 | 412.7 ms | 14.16 ms | 41.30 ms | 27000.0000 | 1000.0000 | 82.66 MB |
+| v7.0.0 | 1 | 301.6 ms | 5.93 ms | 10.07 ms | 23000.0000 | 7000.0000 | 78.41 MB |
+| v6.1.0 | 10 | 1,074.7 ms | 22.24 ms | 63.82 ms | 218000.0000 | 11000.0000 | 476.56 MB |
+| v7.0.0 | 10 | 945.9 ms | 18.64 ms | 32.15 ms | 216000.0000 | 5000.0000 | 472.94 MB |
+
+The improvements are
+
+- single evaluation - 27% reduced run time / 5% reduced allocations
+- repeated evaluations - 22% reduced run time / negligible allocation reduction
+
+I was really hoping for more out of this exercise, but something is... something. And as with JSON Pointer, I'll keep working on it.
+
+### API changes
+
+After the change to perform static analysis by gathering reusable constraints, the code started to spaghettify a bit, and I needed to do some refactoring internally to reign that in. Unfortunately, some of that refactoring spilled out into the public API.
+
+#### `IJsonSchemaKeyword`
+
+The first is a slight change to `IJsonSchemaKeyword.GetConstraint()`. One of the parameters provides access to constraints that have been previously generated (i.e. dependent keywords). While this was a read-only list, due to some memory management updates, it's now a read-only span. I was able to update most of my keywords just by changing the parameter in the method signature.
+
+#### Schema meta-data
+
+Previously, I was storing all of the schema meta-data, like anchors, on the schema itself, but in my experiments, I discovered that it made sense to move that stuff to the schema registry. This meant that the registry could perform a lot of stuff at registration time that would have otherwise be done at evaluation time:
+
+- scan for anchors (found in `$id`, `$anchor`, `$recursiveAnchor`, and `$dynamicAnchor`)
+- set base URIs
+- set spec versions (determined by `$schema`)
+- set dialect (determined by meta-schema's `$vocabuary`)
+
+Since this data is now identified through a one-time static analysis, I don't have to calculate it at evaluation time.
+
+#### Vocabulary registry
+
+The schema registry follows a "default pattern" where there's a single static instance, `.Global`, but there are also local instances on the evaluation options. Searching the local one will automatically search the global one as a fallback. It's really quite useful for when you want to register the dependent schemas for an evaluation, but you don't want all evaluations to have access to them.
+
+I had followed this same pattern with vocabularies as well. However reflecting on it, I think I was over-engineering. The keyword registry is static, and it made sense that the vocabulary registry should also be static.
+
+So now it is.
+
+As a result, it's also been removed from the evaluation options.
+
+
+## Sum-up
+
+Overall, I'm happy with the direction the libraries are going. I still have some work to do to get the performance better, but I feel the improvements I've made so far are worth putting out there.
From d92b510d8de1b18ecb0278bb5ce129ed03572e2c Mon Sep 17 00:00:00 2001
From: Greg Dennis
Date: Mon, 29 Apr 2024 09:39:14 +1200
Subject: [PATCH 2/5] edits
---
 .jekyll-metadata | Bin 25920 -> 28564 bytes
 .../2024-04-17-more-performance-updates.md | 31 ++++++++++--------
 2 files changed, 17 insertions(+), 14 deletions(-)
diff --git a/.jekyll-metadata b/.jekyll-metadata
index 1a33f67f40382f3789c6397243bc8e348c893b66..32b837670436a8cadd07926dc9da1e64ea27e417 100644
GIT binary patch
delta 228
zcmX?bigC()MrIa{>dAqj`jg*la%{AcPiDz+6uUC{qq+#wffbt@y)SbzMohkFt~j|t
znrm`HD2J(uuA#YZZhld!Zb52MT7FS(VqS8pZfQYEVo7STUTzATHA6@>y90-3sWlt#
zRDSt}l)aN?PhO}kB6eVfHMawo15dQP0}~&U0~0?G3ouOQ+JDf+ufk_A`APmGJ
zKr9NxVn8eo#1cR(3B*!BEDh2SrQ*OO3*^WFu{;nf0I?zvD*>@G5UT*OD%0cuuiXIM
Cx;P*J
delta 50
zcmbPopYgycMs^mC>R?YLugMq96(=`H3vINLPZm;EP}` that accounts for JSON Pointer's need to encode the `~` and `/` characters.
 - `.GetSegmentName()` - decodes a segment into a string.
-- `.GetSegmentIndex()` - parses the segment int an int (int segments don't have to worry about encoding though).
+- `.GetSegmentIndex()` - parses the segment into an int (int segments don't have to worry about encoding though).
 
 ## Fallout
 
@@ -79,11 +79,14 @@ But when I updated _JsonSchema.Net_, it seemed a good time to make some other ch
 > You can view and play with the new concept in my [schema/experiment-modelless-schema](https://github.com/gregsdennis/json-everything/tree/schema/experiment-modelless-schema) branch.
 {: .prompt-info }
 
-While those updates did result in a few breaking changes, like the previous few major versions, unless you're building your own keywords, it's not likely going to affect you much.
+While those updates did result in a few breaking changes, unless you're building your own keywords, it's not likely going to affect you much, if at all.
+
+> You can see what changed in _JsonPatch.Net_ and _JsonSchema.Net_ in [these commits](https://github.com/gregsdennis/json-everything/pull/719/files/98dff44238c6d252e6a0a5b80e2f54c86be70b86#diff-0106bcd119785c478a42e8a021100335a9a6f9c22b0bb2a4da59a47d25aeb400) and the release notes are in the [docs](https://docs.json-everything.net).
+{: .prompt-info }
 
 ## _JsonSchema.Net_ updates
 
-While I can say that the performance noticeably improved, it's not quite as much as I had hoped. I think part of that is the pointer math problem I mentioned before; evaluating schemas _does_ do a lot of pointer math. So if I can figure that out, evaluating schemas will just benefit.
+While I can say that the run times noticeably improved, the reduction in memory usage isn't quite as much as I had hoped. I think part of that is the pointer math problem I mentioned before; evaluating schemas _does_ do a lot of pointer math. So if I can figure that out, evaluating schemas will just benefit.
 
 ### Performance
 
@@ -101,7 +104,7 @@ The improvements are
 - single evaluation - 27% reduced run time / 5% reduced allocations
 - repeated evaluations - 22% reduced run time / negligible allocation reduction
 
-I was really hoping for more out of this exercise, but something is... something. And as with JSON Pointer, I'll keep working on it.
+I was really hoping for more out of this exercise, but anything is something. And as with JSON Pointer, I'll keep working on it.
 
 ### API changes
 
@@ -113,7 +116,7 @@ The first is a slight change to `IJsonSchemaKeyword.GetConstraint()`. One of th
 
 #### Schema meta-data
 
-Previously, I was storing all of the schema meta-data, like anchors, on the schema itself, but in my experiments, I discovered that it made sense to move that stuff to the schema registry. This meant that the registry could perform a lot of stuff at registration time that would have otherwise be done at evaluation time:
+Previously, I was storing all of the schema meta-data, like anchors, on the schema itself, but in my experiments, I discovered that it made sense to move that stuff to the schema registry. This meant that I could pre-calculate a lot at registration time that would have otherwise be done at evaluation time:
 
 - scan for anchors (found in `$id`, `$anchor`, `$recursiveAnchor`, and `$dynamicAnchor`)
 - set base URIs
@@ -126,7 +129,7 @@ Since this data is now identified through a one-time static analysis, I don't ha
 
 The schema registry follows a "default pattern" where there's a single static instance, `.Global`, but there are also local instances on the evaluation options. Searching the local one will automatically search the global one as a fallback. It's really quite useful for when you want to register the dependent schemas for an evaluation, but you don't want all evaluations to have access to them.
 
-I had followed this same pattern with vocabularies as well. However reflecting on it, I think I was over-engineering. The keyword registry is static, and it made sense that the vocabulary registry should also be static.
+I had followed this same pattern with vocabularies as well. However reflecting on it, I think I was over-engineering: mindlessly following a design pattern. The keyword registry is static, and it makes sense that the vocabulary registry should also be static.
 
 So now it is.
 
From 503dcb60b2153836a6e89261863ce14e754b3d7f Mon Sep 17 00:00:00 2001
From: Greg Dennis
Date: Tue, 30 Apr 2024 17:22:10 +1200
Subject: [PATCH 3/5] rewrite json pointer post
---
 .jekyll-metadata | Bin 28564 -> 31312 bytes
 ...024-04-17-more-performance-updates copy.md | 143 ++++++++++++++++++
 .../2024-04-17-more-performance-updates.md | 141 -----------------
 assets/css/style.scss | 4 +
 4 files changed, 147 insertions(+), 141 deletions(-)
 create mode 100644 _posts/2024/2024-04-17-more-performance-updates copy.md
 delete mode 100644 _posts/2024/2024-04-17-more-performance-updates.md
diff --git a/.jekyll-metadata b/.jekyll-metadata
index 32b837670436a8cadd07926dc9da1e64ea27e417..ce6afc0d42380bdd63a4fb593dbe0063fda422ab 100644
GIT binary patch
delta 89
zcmV-f0H*(x-vQ9{0S5#KdsRsyN|VtyA(Ma`60tg3ZlM#MrIa{>dAqj`jg*la%{AU$_D^w5D1e1
diff --git a/_posts/2024/2024-04-17-more-performance-updates copy.md b/_posts/2024/2024-04-17-more-performance-updates copy.md
new file mode 100644
index 0000000..63b9d43
--- /dev/null
+++ b/_posts/2024/2024-04-17-more-performance-updates copy.md
@@ -0,0 +1,143 @@
+---
+title: "Better JSON Pointer"
+date: 2024-04-30 09:00:00 +1200
+tags: [json-pointer, architecture, performance]
+toc: true
+pin: false
+---
+
+This post was going to be something else, and somewhat more boring. Be glad you're not reading that.
+
+But instead of blindly forging on, I stopped to consider whether I actually wanted to push out the changes I had made. In the end, I'm glad I hesitated.
+
+In this post and probably the couple that follow, I will cover my experience trying to squeeze some more performance out of a simple, immutable type.
+
+## Current state (as it was)
+
+The `JsonPointer` class is a typical object-oriented approach to implementing the JSON Pointer specification, RFC 6901.
+
+Syntactically, a JSON Pointer is nothing more a series of string segments separated by forward slashes. All of the pointer segments follow the same rule: any tildes (`~`) or forward slashes (`/`) need to be escaped; otherwise, just use the string as-is.
+
+Since all of the segments follow a rule, a class is created to model a segment (`PointerSegment`) and then a another class is created to house a series of them (`JsonPointer`). Easy.
+
+Tack on some functionality for parsing, evaluation, and maybe some pointer math (combining and building pointers), and you have a full implementation.
+
+## An idea is formed
+
+In thinking about how the model could be better, I realized that the class is immutable, and it doesn't directly hold a lot of data. What if it were a struct? Then it could live on the stack, eliminating a memory allocation.
+
+Then, instead of holding a collection of strings, it could hold just the full string and a collection of `Range` objects could indicate the segments: one string allocation instead of an array of objects that hold strings.
+
+This raises a question of whether the string should hold pointer-encoded segments. If it did, then `.ToString()` could just return the string, eliminating the need to build it, and I could provide new allocation-free string comparison methods that accounted for encoding so that users could still operate on segments.
+
+I implemented all of this, and it worked! It actually worked quite well:
+
+| Version| n | Mean | Error | StdDev | Gen0 | Allocated |
+|------- |------ |-----------:|----------:|-----------:|---------:|----------:|
+| v4.0.1 | 1 | 2.778 us | 0.0546 us | 0.1025 us | 4.1962 | 8.57 KB |
+| v5.0.0 | 1 | 1.718 us | 0.0335 us | 0.0435 us | 1.4915 | 3.05 KB |
+| v4.0.1 | 10 | 26.749 us | 0.5000 us | 0.7330 us | 41.9617 | 85.7 KB |
+| v5.0.0 | 10 | 16.719 us | 0.3219 us | 0.4186 us | 14.8926 | 30.47 KB |
+| v4.0.1 | 100 | 286.995 us | 5.6853 us | 12.5983 us | 419.4336 | 857.03 KB |
+| v5.0.0 | 100 | 157.159 us | 2.5567 us | 2.1350 us | 149.1699 | 304.69 KB |
+
+... for parsing. Pointer math was a bit different:
+
+| Version| n | Mean | Error | StdDev | Gen0 | Allocated |
+|------- |------ |------------:|------------:|------------:|---------:|----------:|
+| v4.0.1 | 1 | 661.2 ns | 12.86 ns | 11.40 ns | 1.1473 | 2.34 KB |
+| v5.0.0 | 1 | 916.3 ns | 17.46 ns | 15.47 ns | 1.1120 | 2.27 KB |
+| v4.0.1 | 10 | 6,426.4 ns | 124.10 ns | 121.88 ns | 11.4746 | 23.44 KB |
+| v5.0.0 | 10 | 9,128.2 ns | 180.82 ns | 241.39 ns | 11.1237 | 22.73 KB |
+| v4.0.1 | 100 | 64,469.6 ns | 1,309.01 ns | 1,093.08 ns | 114.7461 | 234.38 KB |
+| v5.0.0 | 100 | 92,437.0 ns | 1,766.38 ns | 1,963.33 ns | 111.3281 | 227.34 KB |
+
+While the memory allocation decrease was... fine, the 50% run-time increase was unacceptable. I couldn't figure out what was going on here, so I left it for about a week and started on some updates for _JsonSchema.Net_ (post coming soon).
+
+Initially for the pointer math, I was just creating a new string and then parsing that. The memory usage was a bit higher than what's shown above, but the run-time was almost double. After a bit of thought, I realized I can explicitly build the string _and_ the range array, which cut down on both the run time and the memory, but only these numbers.
+
+## Eureka!
+
+After a couple days, I finally figured out that by storing each segment, the old way could re-use segments between pointers.
+
+For example, let's combine `/foo/bar` and `/baz`. The pointers for those hold the arrays `['foo', 'bar']` and `['baz']`. When combining under the old way, I'd just merge the arrays: `['foo', 'bar', 'baz']`. It's allocating a new array, but not new strings. All of the segment strings stayed the same.
+
+Under the new way, I'd actually build a new string `/foo/bar/baz` and then build a new array of `Range`s to point to the substrings.
+
+So this new architecture isn't better after all.
+
+## Deep in thought
+
+I thought some more about the two approaches. The old approach does pointer math really well, but I don't like that I have an object (`JsonPointer`) that contains more objects (`PointerSegment`) that each contain strings. That seems wasteful.
+
+Also, why did I make it a struct? Structs should be a fixed size, and strings are never a fixed size (which is a major reason `string` is a class). Secondly, the memory of a struct should also live on the stack, and strings and arrays (even arrays of structs) are stored on the heap; so really it's only the container that's on the stack. A struct just isn't the right choice for this type, so change it back to a class.
+
+What if the pointer just held the strings directly instead of having a secondary `PointerSegment` class? Then all of the decoding/encoding logic would have to live somewhere else, but that's fine. So I don't need a model for the segments; plain strings will do.
+
+Lastly, I could make it implement `IReadOnlyList`. That would give users a `.Count` property, an indexer to access segments, and allow them to iterate over segments directly.
+
+## A new implementation
+
+Taking in all of this analysis, I updated `JsonPointer` again:
+
+- It's a class again.
+- It holds an array of (decoded) strings for the segments.
+- It will cache its string representation.
+  - Parsing a pointer already has the string; just store it.
+  - Constructing a pointer and calling `.ToString()` builds on the fly and caches.
+
+`PointerSegment`, which had also been changed to a struct in the first set of changes, remains a struct and acts as an intermediate type so that building pointers in code can mix strings and integer indices. (See the `.Create()` method used in the code samples below.) Keeping this as a struct means no allocations.
+
+I fixed all of my tests and ran the benchmarks again:
+
+| Parsing | Count | Mean | Error | StdDev | Gen0 | Allocated |
+|------- |------ |-----------:|----------:|----------:|---------:|----------:|
+| 5.0.0 | 1 | 3.825 us | 0.0760 us | 0.0961 us | 3.0823 | 6.3 KB |
+| 5.0.0 | 10 | 36.155 us | 0.6979 us | 0.9074 us | 30.8228 | 62.97 KB |
+| 5.0.0 | 100 | 362.064 us | 6.7056 us | 6.2724 us | 308.1055 | 629.69 KB |
+
+| Math | Count | Mean | Error | StdDev | Gen0 | Allocated |
+|------- |------ |------------:|----------:|----------:|--------:|----------:|
+| 5.0.0 | 1 | 538.2 ns | 10.12 ns | 10.83 ns | 0.9794 | 2 KB |
+| 5.0.0 | 10 | 5,188.1 ns | 97.80 ns | 104.65 ns | 9.7885 | 20 KB |
+| 5.0.0 | 100 | 58,245.0 ns | 646.43 ns | 539.80 ns | 97.9004 | 200 KB |
+
+For parsing, run time is a higher, generally about 30%, but allocations are down 26%.
+
+For pointer math, run time and allocations are both down, about 20% and 15%, respectively.
+
+I'm comfortable with the parsing time being a bit higher since I expect more usage of the pointer math.
+
+## Some new toys
+
+In addition to the simple indexer you get from `IReadOnlyList`, if you're working in .Net 8, you also get a `Range` indexer which allows you to create a pointer using a subset of the segments. This is really handy when you want to get the parent of a pointer
+
+```c#
+var pointer = JsonPointer.Create("foo", "bar", 5, "baz");
+var parent = pointer[..^1]; // /foo/bar/5
+```
+
+or maybe the relative local pointer (i.e. the last segment)
+
+```c#
+var pointer = JsonPointer.Create("foo", "bar", 5, "baz");
+var local = pointer[^1..]; // /baz
+```
+
+These operations are pretty common in _JsonSchema.Net_.
+
+For those of you who haven't made it to .Net 8 just yet, this functionality is also available as methods:
+
+```c#
+var pointer = JsonPointer.Create("foo", "bar", 5, "baz");
+var parent = pointer.GetAncestor(1); // /foo/bar/5
+var local = pointer.GetLocal(1); // /baz
+```
+
+Personally, I like the indexer syntax. I was concerned at first that having an indexer return a new object might feel unorthodox to some developers, but that's exactly what `string` is doing, so I'm fine with it.
+
+## Wrap up
+
+I like where this landed a lot more than where it was in the middle. Something just felt off with the design, and I was having trouble isolating what the issue was. I like that `PointerSegment` isn't part of the model anymore, and it's just "syntax candy" to help build pointers. I really like the performance.
+
+I learned a lot about memory management, which will be the subject of the next post. But more than that, I learned that sometimes inaction is the right action. I hesitated, and the library is better for it.
diff --git a/_posts/2024/2024-04-17-more-performance-updates.md b/_posts/2024/2024-04-17-more-performance-updates.md
deleted file mode 100644
index c901ecc..0000000
--- a/_posts/2024/2024-04-17-more-performance-updates.md
+++ /dev/null
@@ -1,141 +0,0 @@
----
-title: "More Performance Updates"
-date: 2024-04-26 09:00:00 +1200
-tags: [json-pointer, json-patch, json-schema, architecture, performance]
-toc: true
-pin: false
----
-
-I've been focused on performance a lot recently, specifically memory management. My latest target has been _JsonPointer.Net_.
-
-This post explores the architectural differences between the latest update and previous versions, as well as the fallout of changes in the other libs.
-
-## Regarding performance
-
-Parsing numbers is _way_ down!
-
-This benchmark measures parsing the set of pointers in the spec _n_ times.
-
-| Version| n | Mean | Error | StdDev | Gen0 | Allocated |
-|------- |------ |-----------:|----------:|-----------:|---------:|----------:|
-| v4.0.1 | 1 | 2.778 us | 0.0546 us | 0.1025 us | 4.1962 | 8.57 KB |
-| v5.0.0 | 1 | 1.718 us | 0.0335 us | 0.0435 us | 1.4915 | 3.05 KB |
-| v4.0.1 | 10 | 26.749 us | 0.5000 us | 0.7330 us | 41.9617 | 85.7 KB |
-| v5.0.0 | 10 | 16.719 us | 0.3219 us | 0.4186 us | 14.8926 | 30.47 KB |
-| v4.0.1 | 100 | 286.995 us | 5.6853 us | 12.5983 us | 419.4336 | 857.03 KB |
-| v5.0.0 | 100 | 157.159 us | 2.5567 us | 2.1350 us | 149.1699 | 304.69 KB |
-
-Run time is down 45% and memory allocations are down 65%!
-
-But... that's just parsing. Pointer math times actually went up.
-
-This benchmark takes those same pointers and just combines them to themselves.
-
-| Version| n | Mean | Error | StdDev | Gen0 | Allocated |
-|------- |------ |------------:|------------:|------------:|---------:|----------:|
-| v4.0.1 | 1 | 661.2 ns | 12.86 ns | 11.40 ns | 1.1473 | 2.34 KB |
-| v5.0.0 | 1 | 916.3 ns | 17.46 ns | 15.47 ns | 1.1120 | 2.27 KB |
-| v4.0.1 | 10 | 6,426.4 ns | 124.10 ns | 121.88 ns | 11.4746 | 23.44 KB |
-| v5.0.0 | 10 | 9,128.2 ns | 180.82 ns | 241.39 ns | 11.1237 | 22.73 KB |
-| v4.0.1 | 100 | 64,469.6 ns | 1,309.01 ns | 1,093.08 ns | 114.7461 | 234.38 KB |
-| v5.0.0 | 100 | 92,437.0 ns | 1,766.38 ns | 1,963.33 ns | 111.3281 | 227.34 KB |
-
-The memory usage went down a bit, but the run time is about 50% longer. We'll talk about the reason behind the increase in the next section about the architecture changes.
-
-## A new architecture and a new API
-
-In previous versions, `JsonPointer` was a class that held multiple `PointerSegment`s, and each `PointerSegment` held the decoded string. Whenever you needed the full pointer, the segments would be re-encoded and concatenated. To help matters, it would cache this string, so if you needed the full pointer again, it would just give you the previously calculated value.
-
-In v5, `JsonPointer` is a struct that holds the entire pointer as a string along with an array of `Range` structs which provide the indices in the string for each segment. The string and array are still on the heap, but they're the only memory that needs allocating. And when parsing, the string is already provided by the user.
-
-### Combining pointers
-
-In previous versions, when one pointer needed to be concatenated with another pointer (or any additional segments), the resulting pointer could just take the `PointerSegment` instances it wanted without having to allocate new ones. That means that multiple pointers can actually share `PointerSegment` instances.
-
-However, because the new architecture just stores the entire string, it has to build a new string, which results in a memory allocation.
-
-I'm continuing to work on this, and hopefully I'll have updates out soon to address this.
-
-### API changes
-
-As mentioned, `JsonPointer` is now a struct (as is `RelativeJsonPointer`).
-
-I've also replaced the `.Segments` collection with a `.SegmentCount` property and an indexer that gets you the `ReadOnlySpan` that represents the pointer-encoded segment.
-
-To address that you're not getting decoded string segments, I've also defined some extension methods:
-
-- `.SegmentEquals()` - an allocation-free string comparison extension on `ReadOnlySpan` that accounts for JSON Pointer's need to encode the `~` and `/` characters.
-- `.GetSegmentName()` - decodes a segment into a string.
-- `.GetSegmentIndex()` - parses the segment into an int (int segments don't have to worry about encoding though).
-
-## Fallout
-
-While that sums up the changes made to JSON Pointer, it caused a few changes in both _JsonPatch.Net_ and _JsonSchema.Net_.
-
-The update didn't cause any API changes in _JsonPatch.Net_, so I'm not going really cover it except to say that it was updated. There was some internal code I had to change, but that's it.
-
-But when I updated _JsonSchema.Net_, it seemed a good time to make some other changes that I discovered while trying to apply the [model-less paradigm](./logic-without-models) to evaluating schemas.
-
-> You can view and play with the new concept in my [schema/experiment-modelless-schema](https://github.com/gregsdennis/json-everything/tree/schema/experiment-modelless-schema) branch.
-{: .prompt-info }
-
-While those updates did result in a few breaking changes, unless you're building your own keywords, it's not likely going to affect you much, if at all.
-
-> You can see what changed in _JsonPatch.Net_ and _JsonSchema.Net_ in [these commits](https://github.com/gregsdennis/json-everything/pull/719/files/98dff44238c6d252e6a0a5b80e2f54c86be70b86#diff-0106bcd119785c478a42e8a021100335a9a6f9c22b0bb2a4da59a47d25aeb400) and the release notes are in the [docs](https://docs.json-everything.net).
-{: .prompt-info }
-
-## _JsonSchema.Net_ updates
-
-While I can say that the run times noticeably improved, the reduction in memory usage isn't quite as much as I had hoped. I think part of that is the pointer math problem I mentioned before; evaluating schemas _does_ do a lot of pointer math. So if I can figure that out, evaluating schemas will just benefit.
-
-### Performance
-
-This benchmark runs the JSON Schema Test Suite _n_ times.
- -| Version | n | Mean | Error | StdDev | Gen0 | Gen1 | Allocated | -|----------|--- |-----------:|---------:|---------:|------------:|-----------:|----------:| -| v6.1.0 | 1 | 412.7 ms | 14.16 ms | 41.30 ms | 27000.0000 | 1000.0000 | 82.66 MB | -| v7.0.0 | 1 | 301.6 ms | 5.93 ms | 10.07 ms | 23000.0000 | 7000.0000 | 78.41 MB | -| v6.1.0 | 10 | 1,074.7 ms | 22.24 ms | 63.82 ms | 218000.0000 | 11000.0000 | 476.56 MB | -| v7.0.0 | 10 | 945.9 ms | 18.64 ms | 32.15 ms | 216000.0000 | 5000.0000 | 472.94 MB | - -The improvements are - -- single evaluation - 27% reduced run time / 5% reduced allocations -- repeated evaluations - 22% reduced run time / negligible allocation reduction - -I was really hoping for more out of this exercise, but anything is something. And as with JSON Pointer, I'll keep working on it. - -### API changes - -After the change to perform static analysis by gathering reusable constraints, the code started to spaghettify a bit, and I needed to do some refactoring internally to reign that in. Unfortunately, some of that refactoring spilled out into the public API. - -#### `IJsonSchemaKeyword` - -The first is a slight change to `IJsonSchemaKeyword.GetConstraint()`. One of the parameters provides access to constraints that have been previously generated (i.e. dependent keywords). While this was a read-only list, due to some memory management updates, it's now a read-only span. I was able to update most of my keywords just by changing the parameter in the method signature. - -#### Schema meta-data - -Previously, I was storing all of the schema meta-data, like anchors, on the schema itself, but in my experiments, I discovered that it made sense to move that stuff to the schema registry. 
This meant that I could pre-calculate a lot at registration time that would have otherwise been done at evaluation time: - -- scan for anchors (found in `$id`, `$anchor`, `$recursiveAnchor`, and `$dynamicAnchor`) -- set base URIs -- set spec versions (determined by `$schema`) -- set dialect (determined by meta-schema's `$vocabulary`) - -Since this data is now identified through a one-time static analysis, I don't have to calculate it at evaluation time. - -#### Vocabulary registry - -The schema registry follows a "default pattern" where there's a single static instance, `.Global`, but there are also local instances on the evaluation options. Searching the local one will automatically search the global one as a fallback. It's really quite useful for when you want to register the dependent schemas for an evaluation, but you don't want all evaluations to have access to them. - -I had followed this same pattern with vocabularies as well. However, reflecting on it, I think I was over-engineering: mindlessly following a design pattern. The keyword registry is static, and it makes sense that the vocabulary registry should also be static. - -So now it is. - -As a result, it's also been removed from the evaluation options. - - -## Sum-up - -Overall, I'm happy with the direction the libraries are going. I still have some work to do to get the performance better, but I feel the improvements I've made so far are worth putting out there. 
diff --git a/assets/css/style.scss b/assets/css/style.scss index 217db2f..e0d54e1 100644 --- a/assets/css/style.scss +++ b/assets/css/style.scss @@ -59,3 +59,7 @@ img { code.highlighter-rouge { font-size: .85em !important; } + +table { + width: -webkit-fill-available; +} From bcf7a859530e58422657b857bd1db7804af33e43 Mon Sep 17 00:00:00 2001 From: Greg Dennis Date: Wed, 1 May 2024 11:47:00 +1200 Subject: [PATCH 4/5] update tags --- _posts/2024/2024-04-17-more-performance-updates copy.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_posts/2024/2024-04-17-more-performance-updates copy.md b/_posts/2024/2024-04-17-more-performance-updates copy.md index 63b9d43..2161a36 100644 --- a/_posts/2024/2024-04-17-more-performance-updates copy.md +++ b/_posts/2024/2024-04-17-more-performance-updates copy.md @@ -1,7 +1,7 @@ --- title: "Better JSON Pointer" date: 2024-04-30 09:00:00 +1200 -tags: [json-pointer, architecture, performance] +tags: [json-pointer, architecture, performance, learning] toc: true pin: false --- From 5d3a6bfc6fcfde296fdf3d172dd368cca97dbcf3 Mon Sep 17 00:00:00 2001 From: Greg Dennis Date: Wed, 1 May 2024 12:30:10 +1200 Subject: [PATCH 5/5] edits --- .jekyll-metadata | Bin 31312 -> 34049 bytes ...y.md => 2024-04-17-better-json-pointer.md} | 26 +++++++++--------- 2 files changed, 13 insertions(+), 13 deletions(-) rename _posts/2024/{2024-04-17-more-performance-updates copy.md => 2024-04-17-better-json-pointer.md} (82%) diff --git a/.jekyll-metadata b/.jekyll-metadata index ce6afc0d42380bdd63a4fb593dbe0063fda422ab..4c90d8bd964ec5e629f2a4b7d1d7bb314ffa7db6 100644 GIT binary patch delta 133 zcmcccg|V@ViJgU`I?PkaYw|^N#mNoQLL03P< z5?jIM+Auq=$%}M!CO3p~*qi7Yn(HQ|mXxFx>1Gw@=jj&YXXXJpdbufV)(j!l><%2B crPge`3Ig&CubLkef=pvP0W(YiXqXKb0CoZ?kpKVy delta 56 zcmZqdV!H5!k)4I3I@D9iYw|^N#mNoQLL03P~%K!iX diff --git a/_posts/2024/2024-04-17-more-performance-updates copy.md b/_posts/2024/2024-04-17-better-json-pointer.md similarity index 82% rename from 
_posts/2024/2024-04-17-more-performance-updates copy.md rename to _posts/2024/2024-04-17-better-json-pointer.md index 2161a36..955ebd5 100644 --- a/_posts/2024/2024-04-17-more-performance-updates copy.md +++ b/_posts/2024/2024-04-17-better-json-pointer.md @@ -8,17 +8,17 @@ pin: false This post was going to be something else, and somewhat more boring. Be glad you're not reading that. -But instead of blindly forging on, I stopped to consider whether I actually wanted to push out the changes I had made. In the end, I'm glad I hesitated. +In the midst of updating _JsonPointer.Net_, instead of blindly forging on when metrics looked decent but the code was questionable, I stopped to consider whether I actually wanted to push out the changes I had made. In the end, I'm glad I hesitated. -In this post and probably the couple that follow, I will cover my experience trying to squeeze some more performance out of a simple, immutable type. +In this post and at least the couple that follow, I will cover my experience trying to squeeze some more performance out of a simple, immutable type. -## Current state (as it was) +## In the before times The `JsonPointer` class is a typical object-oriented approach to implementing the JSON Pointer specification, RFC 6901. Syntactically, a JSON Pointer is nothing more than a series of string segments separated by forward slashes. All of the pointer segments follow the same rule: any tildes (`~`) or forward slashes (`/`) need to be escaped; otherwise, just use the string as-is. -Since all of the segments follow a rule, a class is created to model a segment (`PointerSegment`) and then a another class is created to house a series of them (`JsonPointer`). Easy. +A class is created to model a segment (`PointerSegment`), and then another class is created to house a series of them (`JsonPointer`). Easy. Tack on some functionality for parsing, evaluation, and maybe some pointer math (combining and building pointers), and you have a full implementation. 
@@ -26,7 +26,7 @@ Tack on some functionality for parsing, evaluation, and maybe some pointer math In thinking about how the model could be better, I realized that the class is immutable, and it doesn't directly hold a lot of data. What if it were a struct? Then it could live on the stack, eliminating a memory allocation. -Then, instead of holding a collection of strings, it could hold just the full string and a collection of `Range` objects could indicate the segments: one string allocation instead of an array of objects that hold strings. +Then, instead of holding a collection of strings, it could hold just the full string and a collection of `Range` objects could indicate the segments as sort of "zero-allocation substrings": one string allocation instead of an array of objects that hold strings. This raises a question of whether the string should hold pointer-encoded segments. If it did, then `.ToString()` could just return the string, eliminating the need to build it, and I could provide new allocation-free string comparison methods that accounted for encoding so that users could still operate on segments. @@ -54,25 +54,25 @@ I implemented all of this, and it worked! It actually worked quite well: While the memory allocation decrease was... fine, the 50% run-time increase was unacceptable. I couldn't figure out what was going on here, so I left it for about a week and started on some updates for _JsonSchema.Net_ (post coming soon). -Initially for the pointer math, I was just creating a new string and then parsing that. The memory usage was a bit higher than what's shown above, but the run-time was almost double. After a bit of thought, I realized I can explicitly build the string _and_ the range array, which cut down on both the run time and the memory, but only these numbers. +Initially for the pointer math, I was just creating a new string and then parsing that. The memory usage was a bit higher than what's shown above, but the run-time was almost double. 
After a bit of thought, I realized I can explicitly build the string _and_ the range array, which cut down on both the run time and the memory, but only so far as what's shown above. ## Eureka! -After a couple days, I finally figured out that by storing each segment, the old way could re-use segments between pointers. +After a couple days, I finally figured out that by storing each segment, the old way could re-use segments between pointers. Sharing segments helps with pointer math where we're chopping up and combining pointers. -For example, let's combine `/foo/bar` and `/baz`. The pointers for those hold the arrays `['foo', 'bar']` and `['baz']`. When combining under the old way, I'd just merge the arrays: `['foo', 'bar', 'baz']`. It's allocating a new array, but not new strings. All of the segment strings stayed the same. +For example, let's combine `/foo/bar` and `/baz`. Under the old way, the pointers for those hold the arrays `['foo', 'bar']` and `['baz']`. When combining them, I'd just merge the arrays: `['foo', 'bar', 'baz']`. It's allocating a new array, but not new strings. All of the segment strings stayed the same. Under the new way, I'd actually build a new string `/foo/bar/baz` and then build a new array of `Range`s to point to the substrings. So this new architecture isn't better after all. -## Deep in thought +## A hybrid design I thought some more about the two approaches. The old approach does pointer math really well, but I don't like that I have an object (`JsonPointer`) that contains more objects (`PointerSegment`) that each contain strings. That seems wasteful. -Also, why did I make it a struct? Structs should be a fixed size, and strings are never a fixed size (which is a major reason `string` is a class). Secondly, the memory of a struct should also live on the stack, and strings and arrays (even arrays of structs) are stored on the heap; so really it's only the container that's on the stack. 
A struct just isn't the right choice for this type, so change it back to a class. +Also, why did I make it a struct? Structs should be a fixed size, and strings are never a fixed size (which is a major reason `string` is a class). Secondly, the memory of a struct should also live on the stack, and strings and arrays (even arrays of structs) are stored on the heap; so really it's only the container that's on the stack. A struct just isn't the right choice for this type, so I should change it back to a class. -What if the pointer just held the strings directly instead of having a secondary `PointerSegment` class? Then all of the decoding/encoding logic would have to live somewhere else, but that's fine. So I don't need a model for the segments; plain strings will do. +What if the pointer just held the strings directly instead of having a secondary `PointerSegment` class? In the old design, `PointerSegment` handled all of the decoding/encoding logic, so that would have to live somewhere else, but that's fine. So I don't need a model for the segments; plain strings will do. Lastly, I could make it implement `IReadOnlyList`. That would give users a `.Count` property, an indexer to access segments, and allow them to iterate over segments directly. @@ -102,7 +102,7 @@ I fixed all of my tests and ran the benchmarks again: | 5.0.0 | 10 | 5,188.1 ns | 97.80 ns | 104.65 ns | 9.7885 | 20 KB | | 5.0.0 | 100 | 58,245.0 ns | 646.43 ns | 539.80 ns | 97.9004 | 200 KB | -For parsing, run time is a higher, generally about 30%, but allocations are down 26%. +For parsing, run time is higher, generally about 30%, but allocations are down 26%. For pointer math, run time and allocations are both down, about 20% and 15%, respectively. @@ -134,7 +134,7 @@ var parent = pointer.GetAncestor(1); // /foo/bar/5 var local = pointer.GetLocal(1); // /baz ``` -Personally, I like the indexer syntax. 
I was concerned at first that having an indexer return a new object might feel unorthodox to some developers, but that's exactly what `string` is doing, so I'm fine with it. +Personally, I like the indexer syntax. I was concerned at first that having an indexer return a new object might feel unorthodox to some developers, but that's exactly what `string` does when you use a `Range` index to get a substring, so I'm fine with it. ## Wrap up