Add caching of hash code and protobuf size to model objects #343

Open
jasperpotts opened this issue Jan 9, 2025 · 0 comments · May be fixed by #344

Problem

We calculate the hash code and protobuf size of PBJ model objects many times over. Both are reasonably costly operations, especially for large object trees. As the model objects are immutable, it would be ideal if each value were calculated only once and then reused.

Solution

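As the model objects are Java records, they can't have private fields or mutable fields: a record cannot declare instance fields beyond its components, and those components are final and exposed through public accessors. So we can't have lazy computation in the usual way. There are 3 options I considered: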

  1. Change model objects from records to classes - This would be the most elegant option API-wise, but it would mean that model objects cannot become value types under the future Java Project Valhalla. Value types should offer a huge performance gain and a reduction in short-term garbage generation, so that promise is very hard to give up.
  2. Cheat on immutability by adding a reference field to a mutable object - There are two sub-ideas here. One is an int[] field of size 2, with one element for the hash code and one for the size. The other is a custom object with two int fields. Both would allow lazy computation, but I am concerned about how they would impact future value types. The second big issue is that every model object requires one more object creation. As we create millions of model objects a second, this is a big garbage problem.
  3. Add two int fields to every model object and compute the values on construction - This is simple and the cheapest option memory- and garbage-wise. It is ugly API-wise, as there is no way to hide those fields. It also moves the computation cost from time of use to object construction time. That may or may not be a problem.
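For comparison, the int[] variant of option 2 might look roughly like the sketch below. `TokenModel` and its fields are hypothetical names chosen for illustration, not PBJ generated code, and only the hash code slot is shown; slot 1 would hold the protobuf size in the same way.

```java
import java.util.Objects;

// Hypothetical model object illustrating option 2; not PBJ generated code.
record TokenModel(String symbol, long supply, int[] cache) {

    // Convenience constructor: allocates the mutable cache, which is the
    // extra object creation per model object that option 2 is worried about.
    TokenModel(String symbol, long supply) {
        this(symbol, supply, new int[2]); // [0] = hash code, [1] = protobuf size
    }

    @Override
    public int hashCode() {
        int h = cache[0];
        if (h == 0) { // 0 doubles as "not computed yet"
            h = Objects.hash(symbol, supply);
            cache[0] = h; // unsynchronized write; every thread computes the same value
        }
        return h;
    }

    // The generated equals would compare the int[] by reference,
    // so the cache has to be excluded by hand.
    @Override
    public boolean equals(Object o) {
        return o instanceof TokenModel other
                && supply == other.supply
                && symbol.equals(other.symbol);
    }
}
```

Two costs show up even in this small sketch: the cache array is still visible through the public `cache()` accessor, and `equals` must be written by hand. An object whose hash code happens to be 0 would also be recomputed on every call.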

The current plan is to try out option 3 and see if it is acceptable.
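A minimal sketch of what option 3 could look like, assuming a hypothetical `AccountModel` record with one string field and one int64 field. The names and the simplified size calculation are stand-ins for illustration and are not PBJ's actual generated code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;

// Hypothetical model object illustrating option 3; not PBJ generated code.
record AccountModel(String memo, long balance, int cachedHashCode, int cachedProtobufSize) {

    // Convenience constructor: both values are computed once, at construction time.
    AccountModel(String memo, long balance) {
        this(memo, balance, computeHashCode(memo, balance), computeProtobufSize(memo, balance));
    }

    static int computeHashCode(String memo, long balance) {
        return Objects.hash(memo, balance);
    }

    // Simplified stand-in for a real size calculation:
    // field 1 is a string, field 2 is a varint-encoded int64, defaults are skipped.
    static int computeProtobufSize(String memo, long balance) {
        int size = 0;
        if (!memo.isEmpty()) {
            int utf8Length = memo.getBytes(StandardCharsets.UTF_8).length;
            size += 1 + sizeOfVarint(utf8Length) + utf8Length; // tag + length prefix + bytes
        }
        if (balance != 0) {
            size += 1 + sizeOfVarint(balance); // tag + value
        }
        return size;
    }

    // Number of bytes needed to encode the value as a protobuf varint.
    static int sizeOfVarint(long value) {
        int bytes = 1;
        while ((value & ~0x7FL) != 0) {
            value >>>= 7;
            bytes++;
        }
        return bytes;
    }

    @Override
    public int hashCode() {
        return cachedHashCode; // no recomputation on repeated calls
    }
}
```

The API ugliness is visible here: the canonical four-argument constructor and the `cachedHashCode()` and `cachedProtobufSize()` accessors are public, so nothing stops a caller from passing in inconsistent values. The cost is also paid for every object constructed, even if its hash code or size is never requested.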

Alternatives

No response
