Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zcash_client_backend: Implement async wallet synchronization function #1184

Merged
merged 2 commits into from
Apr 3, 2024

Conversation

str4d
Copy link
Contributor

@str4d str4d commented Feb 9, 2024

This implements the necessary state machine for taking a wallet in some arbitrary synchronization status, and fully scanning (the remainder of) the chain.

Closes #1169.

Copy link

codecov bot commented Feb 9, 2024

Codecov Report

Attention: Patch coverage is 0% with 170 lines in your changes are missing coverage. Please review.

Project coverage is 63.19%. Comparing base (25b8404) to head (7f017bc).

Files Patch % Lines
zcash_client_backend/src/sync.rs 0.00% 167 Missing ⚠️
zcash_client_backend/src/data_api/chain.rs 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1184      +/-   ##
==========================================
- Coverage   63.93%   63.19%   -0.74%     
==========================================
  Files         123      124       +1     
  Lines       14340    14507     +167     
==========================================
  Hits         9168     9168              
- Misses       5172     5339     +167     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

sapling_outputs_count: 0,
orchard_actions_count: 0,
};
std::fs::remove_file(get_block_path(fsblockdb_root, &meta))
Copy link

@fluidvanadium fluidvanadium Feb 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does it individually remove files instead of relying on truncate_to_height?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WalletWrite::truncate_to_height is truncating the wallet database; it has no knowledge about the internal details of the block cache. This file deletion is part of the concrete behaviour required to use FsBlockDb, and as such it should move behind the new trait (i.e. there should be an equivalent BlockCache::truncate_to_height method).

Copy link
Contributor

@AArnott AArnott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting this started. zcash_client_backend will be much better for having its own sync engine in it. This code looks very much like what I saw in str4d's console wallet prototype, which I took and built up considerably. Would you consider reviewing that implementation to see what you can glean from it to build up your PR?

meta.block_file_path(&fsblockdb_root.join("blocks"))
}

/// Scans the chain until the wallet is up-to-date.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why stop there? Why not add a parameter to indicate that the caller wishes the method to run indefinitely, keeping up with the blockchain?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other parameter ideas from my own implementation:

  • take a callback parameter that receives % progress updates as well as newly found transactions.
  • take a cancellation token so the caller can gracefully abort the sync where it's at (such that it can be resumed at roughly the same location later).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I have said in multiple places, this code is extracted from the existing private code that I have (and that I shared previously with you, on top of which you built your implementation). It is intended as a starting point for further development in-repo, not the perfect sync engine up-front.

So the reason why it stops there is that it was simpler for me to implement initially, and also how my CLI wallet works (which has no long-running backend process). Please move these suggestions into separate issues for future discussion and implementation.

Comment on lines 181 to 193
for deletion in block_deletions {
deletion.await?;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my perf testing, at least on Windows, the disk I/O associated with storing and soon after deleting the blocks from disk dominated the sync time. By keeping the batch in memory, sync went much faster. And I have a fast SSD by the way. The downside of this approach is with a fixed batch size, the memory requirement for this varies widely, as sandblasted blocks can cause a batch that typically requires small memory to take >2GB of memory. So I have in mind another approach that somewhat does away with batches altogether. We can chat about it if you're interested. Essentially though, it's the idea that we'll ask the server to give us all the blocks we're interested in, and we'll throttle the download speed by reading them no faster than we have space for in a deliberately limited queue that feeds into the scan_blocks fn. This way, we can keep the scan_blocks function busy, memory usage under control, and not worry about some somewhat arbitrary value for 'batch size'.

Comment on lines 160 to 175
let block_meta = download_blocks(client, fsblockdb_root, db_cache, &scan_range).await?;

// Scan the downloaded blocks.
let scan_ranges_updated =
scan_blocks(params, fsblockdb_root, db_cache, db_data, &scan_range)?;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have this pattern in my sync implementation as well, and it's painfully obvious when looking at the resource manager that sync time takes about twice as long as it needs to because resource utilization keeps switching between bandwidth and CPU. If we could be downloading the next batch while scanning the prior batch, we could better utilize both resources in parallel and get the job done much faster.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this is exactly what the Android and Swift mobile SDKs do. The approach here was intentionally simplistic for the codebase I wrote it for, but I would want this to also be improved to follow the pipelined approach of the mobile SDKs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated my own sync implementation to be pipelined, limiting resource usage while keeping the CPU busy. I'll be happy to contribute to this PR (or after it) if those changes are welcome.

}

/// Scans the chain until the wallet is up-to-date.
pub async fn run<P, ChT, DbT>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method only scans CompactBlocks, which contain only shielded transaction data. Transparent data is totally omitted. I had to fill in the gap in my sync routine, which is mostly complete, although zcash_client_backend itself doesn't yet support recording a spend of a transparent output, so while I download transparent transactions, the recorded history is incorrect if UTXOs are spent, but it works if they are shielded.
In any case, I think the transparent support either needs to be added here, or the function should be very prominently documented as to its limitations.

zcash_client_backend/Cargo.toml Outdated Show resolved Hide resolved
zcash_client_backend/src/sync.rs Outdated Show resolved Hide resolved
@str4d
Copy link
Contributor Author

str4d commented Apr 1, 2024

Rebased on main to use the new BlockCache trait. I still need to replace the anyhow usage with typed errors, and migrate the previous FsBlockDb-specific code into zcash_client_sqlite.

This implements the necessary state machine for taking a wallet in some
arbitrary synchronization status, and fully scanning (the remainder of)
the chain.

Closes #1169.
@str4d str4d force-pushed the 1169-sync-engine branch from 2d09834 to 24277a6 Compare April 2, 2024 00:26
@str4d
Copy link
Contributor Author

str4d commented Apr 2, 2024

Force-pushed to fix failing tests, and implement an Error enum to replace anyhow usage.

@str4d str4d marked this pull request as ready for review April 2, 2024 00:35
Copy link
Contributor

@nuttycom nuttycom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK. Since this is feature-flagged at this point, I'm happy to go ahead and merge this and have its API revised and improved over time.

@@ -421,7 +422,7 @@ where
/// # Errors
///
/// In the case of an error, some blocks requested for deletion may remain in the block cache.
async fn delete(&self, range: &ScanRange) -> Result<(), Self::Error>;
async fn delete<'a>(&self, range: ScanRange) -> Result<(), Self::Error>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this named lifetime, given that it isn't mentioned in any of the arguments or the result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug, it was me trying to fix the issue a different way that didn't work, and I forgot to undo the lifetime definition.

@nuttycom nuttycom merged commit 430212c into main Apr 3, 2024
24 of 26 checks passed
@nuttycom nuttycom deleted the 1169-sync-engine branch April 3, 2024 17:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extract sync state machine from zec-client-sqlite into zcash_client_backend
5 participants