This could be landed in v2-unstable, but it was built off of v2, which apparently has many commits that aren't in v2-unstable.
When dealing with records that have many transactions in their txn-queue, one of the top CPU consumers is the conversion from Token to ObjectId. The primary culprit is hasPreReq, which walks all of the tokens in a document and compares each object ID against a target object ID, converting every Token to an ID multiple times along the way.
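To sketch the shape of the problem and the fix (the `token` type and function names below are illustrative assumptions, not the txn package's exact API): converting each token once and caching the result beats re-parsing inside every comparison loop.

```go
package main

import "fmt"

// token mirrors the shape of the txn package's queue entries: a
// string of the form "<24 hex chars of ObjectId>_<nonce>". The type
// and method names here are assumptions for illustration only.
type token string

// id extracts the ObjectId portion. In the real package this builds
// a bson.ObjectId, which costs a parse and an allocation per call.
func (t token) id() string { return string(t[:24]) }

// hasPreReqSlow models the hot path: the id conversion happens on
// every comparison, so scanning a queue of N tokens many times
// repeats the conversion over and over.
func hasPreReqSlow(queue []token, target string) bool {
	for _, t := range queue {
		if t.id() == target { // conversion on every iteration
			return true
		}
	}
	return false
}

// cacheIds converts each token to its id exactly once, up front.
func cacheIds(queue []token) []string {
	ids := make([]string, len(queue))
	for i, t := range queue {
		ids[i] = t.id()
	}
	return ids
}

// hasPreReqFast reuses the cached ids for all later comparisons.
func hasPreReqFast(ids []string, target string) bool {
	for _, id := range ids {
		if id == target {
			return true
		}
	}
	return false
}

func main() {
	queue := []token{
		"4d88e15b60f486e428412dc9_a1b2c3d4",
		"5099803df3f4948bd2f98391_e5f6a7b8",
	}
	ids := cacheIds(queue)
	fmt.Println(hasPreReqSlow(queue, "5099803df3f4948bd2f98391")) // true
	fmt.Println(hasPreReqFast(ids, "5099803df3f4948bd2f98391"))   // true
}
```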
This is part of a series of small fixes I've put together to make handling large txn queues a little nicer. It includes test cases that expose where the problems are, and patches that address them.
The first commit is just running 'go fmt', which apparently reorders a few imports.
The second introduces the tests, so that you can get an impression of baseline performance. You can see the performance by running:

```sh
cd mgo.v2/txn
go test -check.vv -check.f TxnQueue -qlength 100  # 200, 400, 800, 1600, etc.
```
You can see the O(N^2) performance. I've put together a spreadsheet showing the changes with various patches: https://docs.google.com/spreadsheets/d/1OXcIMooO8fG3xzLmn_4WSECLDdXBjQ-cKzkk26hovNU/edit?usp=sharing
The key indicators on the graphs for this patch are the Blue Circles and the Red Triangles.
This patch has the biggest impact on Setup Prepared, ResumeAll Prepared, Setup Preparing, and ResumeAll Preparing (less of an impact on AssertionGrowth).
The algorithm is still fundamentally O(N^2): we re-read the document from the database N times, and the document carries N transaction tokens each time. Converting from BSON into Go objects accounts for a significant portion of the time; converting from Token to ObjectId was the other significant portion, and that is what this patch addresses.
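To make the N^2 shape concrete: if each of N flushes re-reads the document and decodes all N queue tokens, doubling -qlength quadruples the token-conversion count, which matches the doubling test sequence above. A toy count (not the txn package's actual code):

```go
package main

import "fmt"

// countConversions models the flush loop: each of n flushes
// re-reads the document and decodes all n queue tokens, so the
// total number of conversions grows as n*n.
func countConversions(n int) int {
	conversions := 0
	for flush := 0; flush < n; flush++ {
		for tok := 0; tok < n; tok++ {
			conversions++ // one Token decode per queue entry
		}
	}
	return conversions
}

func main() {
	for _, n := range []int{100, 200, 400} {
		fmt.Printf("qlength=%d conversions=%d\n", n, countConversions(n))
	}
	// Output:
	// qlength=100 conversions=10000
	// qlength=200 conversions=40000
	// qlength=400 conversions=160000
}
```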
I chose to break up the work into smaller performance tweaks so we can discuss them reasonably independently.