Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1.0] fix table id mix up during snapshot creation #575

Merged
merged 2 commits into from
Aug 20, 2024

Conversation

spoonincode
Copy link
Member

#514 is defective: it operates under the premise that table ids will be restored upon snapshot loading. But that isn't true because table ids are just chainbase's primary key which, while they can be inspected, cannot be controlled by user code (it is disallowed to set it even upon creation of a chainbase row). Thus what happens is that if there are any holes in the table id values (which will occur when tables are removed), the snapshot will record table ids that will be mixed up or non-existent on loading the snapshot.

For example consider table ids of 0, 1, 2, 3, 5, 6. (table 4 has been removed at some point). The existing snapshot code will record any rows that go with table id 5 as belonging with table id 5. However when the snapshot is reloaded, table id 5 becomes table id 4, and any rows in the snapshot that were attributed to table id 5 actually get mapped to the old table id 6 (which now table id 5).

This wasn't detected earlier because it requires a table to be removed creating a hole, a snapshot created, loaded, and then some access that fails due to the mixed up table rows. For example, loading a v6 snapshot in to v1.0.0-rc1, running it for a while, creating a v7 snapshot and then reloading that v7 snapshot, all might succeed just fine if no tables were removed prior to creating the snapshot. All snapshot unit tests are trivial enough that they do not stumble on this problem; in fact, there is no need to regenerate the v7 test snapshots due to the change in this PR -- they remain byte identical because the table ids all match up due to no holes.

There is a secondary problem with the approach where table ids are not consensus but rather an implementation detail of the specific storage layout chosen by nodeos. Different nodes can have different table ids for a given table depending upon what block the node was initialized from. Thus snapshots can not be reproducible.

The change made in this PR is to instead of recording the table id each row belongs to, record the position/index of the table in the table index. This was tricky to name but is referred to as the 'flattened table id' in code. That is, if the table ids are 0, 1, 2, 4, 5, 8, 10, when writing the rows for table id 8, flattened table id 5 will be written for those rows. Because when the table index is reloaded, those tables will be assigned ids 0, 1, 2, 3, 4, 5, 6. Keep in mind it's possible for a secondary contract index to not have any rows for a given table id, so some sections of the snapshot may not have all table ids but rather just a subset. So we still need to record this flattened id and not just assume each group of rows goes with flattened id 0, 1, 2, 3, 4, 5, 6.

Resolves #568 (Alternatively we can revert #514 to resolve as well)

Leaving as draft at the moment until considering a better test to demonstrate issue is fixed.

@ericpassmore
Copy link
Contributor

Note:start
group: STABILITY
category: INTERNALS
summary: To support snapshots and and faster table loading introduced in Spring, build stable ids in chainbase for tables.
Note:end

@spoonincode spoonincode linked an issue Aug 19, 2024 that may be closed by this pull request
@spoonincode
Copy link
Member Author

I added a simple test that demonstrates the defect is resolved. The test applied on to main -- where it fails -- is demoed at
https://github.com/AntelopeIO/spring/actions/runs/10463450681

@spoonincode spoonincode marked this pull request as ready for review August 20, 2024 01:25
@spoonincode spoonincode merged commit 13044c5 into release/1.0 Aug 20, 2024
36 checks passed
@spoonincode spoonincode deleted the snapshot_tableid_fix_10 branch August 20, 2024 02:40
@ericpassmore ericpassmore added the bug The product is not working as was intended. label Sep 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The product is not working as was intended.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

v7 snapshot loading failures
4 participants