zcash_client_backend: Implement async wallet synchronization function #1184

str4d · 2024-02-09T17:10:35Z

This implements the necessary state machine for taking a wallet in some arbitrary synchronization status, and fully scanning (the remainder of) the chain.

Closes #1169.

zcash_client_backend/src/sync.rs

codecov · 2024-02-09T17:30:49Z

Codecov Report

Attention: Patch coverage is 0%, with 170 lines in your changes missing coverage. Please review.

Project coverage is 63.19%. Comparing base (25b8404) to head (7f017bc).

Files	Patch %	Lines
zcash_client_backend/src/sync.rs	0.00%	167 Missing ⚠️
zcash_client_backend/src/data_api/chain.rs	0.00%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1184      +/-   ##
==========================================
- Coverage   63.93%   63.19%   -0.74%     
==========================================
  Files         123      124       +1     
  Lines       14340    14507     +167     
==========================================
  Hits         9168     9168              
- Misses       5172     5339     +167

fluidvanadium · 2024-02-12T18:05:26Z

zcash_client_backend/src/sync.rs

+                        sapling_outputs_count: 0,
+                        orchard_actions_count: 0,
+                    };
+                    std::fs::remove_file(get_block_path(fsblockdb_root, &meta))


Why does it individually remove files instead of relying on truncate_to_height?

WalletWrite::truncate_to_height is truncating the wallet database; it has no knowledge about the internal details of the block cache. This file deletion is part of the concrete behaviour required to use FsBlockDb, and as such it should move behind the new trait (i.e. there should be an equivalent BlockCache::truncate_to_height method).
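
To illustrate the suggestion above, here is a minimal sketch of how the file deletion could sit behind a cache-level truncate_to_height. The names `BlockCacheSketch` and `FsCacheSketch` are hypothetical and synchronous for brevity; they are not the BlockCache trait or FsBlockDb from the repository, and the height-to-path map is an assumption made only for this example.

```rust
use std::collections::BTreeMap;
use std::io::ErrorKind;
use std::path::PathBuf;

/// Hypothetical, simplified stand-in for a block cache trait.
/// The real trait in the repository is async and has other methods.
pub trait BlockCacheSketch {
    type Error;

    /// Removes every cached block above `height`, however the cache stores them.
    /// If an error is returned, some of those blocks may remain on disk.
    fn truncate_to_height(&mut self, height: u32) -> Result<(), Self::Error>;
}

/// Hypothetical filesystem-backed cache: maps a block height to its file.
pub struct FsCacheSketch {
    pub block_files: BTreeMap<u32, PathBuf>,
}

impl BlockCacheSketch for FsCacheSketch {
    type Error = std::io::Error;

    fn truncate_to_height(&mut self, height: u32) -> Result<(), Self::Error> {
        // Nothing can be stored above `u32::MAX`, so there is nothing to remove.
        let Some(first_removed) = height.checked_add(1) else {
            return Ok(());
        };
        // `split_off` leaves heights <= `height` in `self.block_files` and
        // returns the entries at or above `first_removed`.
        let removed = self.block_files.split_off(&first_removed);
        for path in removed.values() {
            match std::fs::remove_file(path) {
                Ok(()) => {}
                // A file that is already gone is fine for truncation purposes.
                Err(e) if e.kind() == ErrorKind::NotFound => {}
                Err(e) => return Err(e),
            }
        }
        Ok(())
    }
}
```

With this shape, the sync code would only call `truncate_to_height` on the cache and would never need to know that blocks are stored as individual files.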

AArnott

Thanks for getting this started. zcash_client_backend will be much better for having its own sync engine in it. This code looks very much like what I saw in str4d's console wallet prototype, which I took and built up considerably. Would you consider reviewing that implementation to see what you can glean from it to build up your PR?

AArnott · 2024-03-01T03:36:46Z

zcash_client_backend/src/sync.rs

+    meta.block_file_path(&fsblockdb_root.join("blocks"))
+}
+
+/// Scans the chain until the wallet is up-to-date.


Why stop there? Why not add a parameter to indicate that the caller wishes the method to run indefinitely, keeping up with the blockchain?

Other parameter ideas from my own implementation:

take a callback parameter that receives % progress updates as well as newly found transactions.

take a cancellation token so the caller can gracefully abort the sync wherever it is (such that it can be resumed from roughly the same point later).
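
As a rough illustration of the two parameter ideas above (a progress callback and a cancellation token), the following synchronous sketch uses an `AtomicBool` as the token and a closure as the callback. `run_sync_sketch` and `SyncOutcome` are hypothetical names; nothing here is the signature of the `run` function in this PR, and reporting of newly found transactions is left out.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical outcome of a sync run; not a type from the repository.
#[derive(Debug, PartialEq)]
pub enum SyncOutcome {
    Complete,
    /// Sync stopped early; scanning can resume from this height.
    Cancelled { resume_from: u32 },
}

/// Walks heights `start..=tip` in batches, reporting percent progress after
/// each batch and checking the cancellation flag before each batch.
pub fn run_sync_sketch(
    start: u32,
    tip: u32,
    batch_size: u32,
    cancel: &AtomicBool,
    mut on_progress: impl FnMut(f64),
) -> SyncOutcome {
    assert!(batch_size > 0 && start <= tip);
    let total = f64::from(tip - start) + 1.0;
    let mut next = start;
    loop {
        // Only stop between batches, so the resume point is always well defined.
        if cancel.load(Ordering::Relaxed) {
            return SyncOutcome::Cancelled { resume_from: next };
        }
        let end = next.saturating_add(batch_size - 1).min(tip);
        // A real implementation would download and scan `next..=end` here.
        let done = f64::from(end - start) + 1.0;
        on_progress(100.0 * done / total);
        if end == tip {
            return SyncOutcome::Complete;
        }
        next = end + 1;
    }
}
```

A caller would keep the `AtomicBool` (for example behind an `Arc`) and set it from a UI handler; a `Cancelled { resume_from }` result tells it where a later call should start.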

As I have said in multiple places, this code is extracted from the existing private code that I have (and that I shared previously with you, on top of which you built your implementation). It is intended as a starting point for further development in-repo, not the perfect sync engine up-front.

So the reason why it stops there is that it was simpler for me to implement initially, and also how my CLI wallet works (which has no long-running backend process). Please move these suggestions into separate issues for future discussion and implementation.

AArnott · 2024-03-01T03:41:30Z

zcash_client_backend/src/sync.rs

+    for deletion in block_deletions {
+        deletion.await?;
+    }


In my perf testing, at least on Windows, the disk I/O from storing blocks to disk and deleting them soon after dominated the sync time, even on a fast SSD. Keeping the batch in memory made sync much faster. The downside is that with a fixed batch size the memory requirement varies widely: sandblasted blocks can push a batch that typically needs little memory to more than 2GB.

So I have in mind another approach that mostly does away with batches altogether. We can chat about it if you're interested. Essentially, we ask the server for all the blocks we're interested in, and throttle the download by reading them no faster than we have space for in a deliberately limited queue that feeds into the scan_blocks fn. This keeps scan_blocks busy and memory usage under control, without depending on a somewhat arbitrary 'batch size'.
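
The limited-queue idea described above can be sketched with a bounded channel from the standard library. This is only an illustration under assumptions: `BlockSketch` and `stream_and_scan` are hypothetical names, the "download" and "scan" steps are placeholders, and the queue here bounds the number of blocks rather than their total size in bytes, so very large blocks would still need a byte budget on top.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical stand-in for a downloaded compact block.
pub struct BlockSketch {
    pub height: u32,
    pub data: Vec<u8>,
}

/// Streams blocks `start..=end` through a queue holding at most `queue_len`
/// blocks, and returns the heights in the order the consumer scanned them.
pub fn stream_and_scan(start: u32, end: u32, queue_len: usize) -> Vec<u32> {
    let (tx, rx) = sync_channel::<BlockSketch>(queue_len);

    let downloader = thread::spawn(move || {
        for height in start..=end {
            // Placeholder for reading the next block off the server stream.
            let block = BlockSketch { height, data: vec![0u8; 16] };
            // `send` blocks while the queue is full; this is the throttle.
            if tx.send(block).is_err() {
                break; // The scanner has gone away, so stop downloading.
            }
        }
        // `tx` is dropped here, which ends the consumer's loop below.
    });

    let mut scanned = Vec::new();
    for block in rx {
        // Placeholder for feeding the block into the scanner.
        let _bytes = block.data.len();
        scanned.push(block.height);
    }
    downloader.join().expect("download thread panicked");
    scanned
}
```

The producer can never run more than `queue_len` blocks ahead of the consumer, so the scanner stays fed while the number of blocks held in memory stays capped.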

AArnott · 2024-03-01T03:43:23Z

zcash_client_backend/src/sync.rs

+        let block_meta = download_blocks(client, fsblockdb_root, db_cache, &scan_range).await?;
+
+        // Scan the downloaded blocks.
+        let scan_ranges_updated =
+            scan_blocks(params, fsblockdb_root, db_cache, db_data, &scan_range)?;
+


I have this pattern in my sync implementation as well, and it's painfully obvious when looking at the resource manager that sync takes about twice as long as it needs to, because resource utilization keeps switching between bandwidth and CPU. If we could download the next batch while scanning the prior batch, we could use both resources in parallel and get the job done much faster.

Indeed, this is exactly what the Android and Swift mobile SDKs do. The approach here was intentionally simplistic for the codebase I wrote it for, but I would want this to also be improved to follow the pipelined approach of the mobile SDKs.

I've updated my own sync implementation to be pipelined, limiting resource usage while keeping the CPU busy. I'll be happy to contribute to this PR (or after it) if those changes are welcome.
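
For readers unfamiliar with the pipelined pattern discussed in this thread, here is a minimal sketch using plain threads: the download of batch n+1 is started before batch n is scanned. `download_batch`, `scan_batch`, and `run_pipelined` are hypothetical placeholders; this is neither the code in this PR nor the mobile SDKs' implementation, and an async version would overlap futures instead of spawning threads.

```rust
use std::thread;

/// Placeholder download: pretends each height in the inclusive range is a block.
fn download_batch(range: (u32, u32)) -> Vec<u32> {
    (range.0..=range.1).collect()
}

/// Placeholder scan: returns how many blocks were processed.
fn scan_batch(blocks: &[u32]) -> usize {
    blocks.len()
}

/// Processes the given inclusive ranges, overlapping the download of the next
/// range with the scan of the current one. Returns the total blocks scanned.
pub fn run_pipelined(ranges: Vec<(u32, u32)>) -> usize {
    let mut ranges = ranges.into_iter();
    // Start downloading the first batch straight away.
    let mut in_flight = ranges
        .next()
        .map(|r| thread::spawn(move || download_batch(r)));
    let mut scanned = 0;

    while let Some(handle) = in_flight {
        let blocks = handle.join().expect("download thread panicked");
        // Kick off the next download before scanning, so the network and the
        // CPU are busy at the same time.
        in_flight = ranges
            .next()
            .map(|r| thread::spawn(move || download_batch(r)));
        scanned += scan_batch(&blocks);
    }
    scanned
}
```

At most one batch is being scanned and one is being downloaded at any moment, which also bounds memory use to roughly two batches.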

AArnott · 2024-03-01T03:46:16Z

zcash_client_backend/src/sync.rs

+}
+
+/// Scans the chain until the wallet is up-to-date.
+pub async fn run<P, ChT, DbT>(


This method only scans CompactBlocks, which contain only shielded transaction data; transparent data is omitted entirely. I had to fill that gap in my own sync routine, which is mostly complete. However, zcash_client_backend itself doesn't yet support recording a spend of a transparent output, so although I download transparent transactions, the recorded history is incorrect if UTXOs are spent (it works if they are shielded).
In any case, I think transparent support either needs to be added here, or the function's limitations should be documented very prominently.

str4d · 2024-04-01T21:40:41Z

Rebased on main to use the new BlockCache trait. I still need to replace the anyhow usage with typed errors, and migrate the previous FsBlockDb-specific code into zcash_client_sqlite.

str4d · 2024-04-02T00:30:44Z

Force-pushed to fix failing tests, and implement an Error enum to replace anyhow usage.

nuttycom

utACK. Since this is feature-flagged at this point, I'm happy to go ahead and merge this and have its API revised and improved over time.

nuttycom · 2024-04-01T22:22:23Z

zcash_client_backend/src/data_api/chain.rs

@@ -421,7 +422,7 @@ where
    /// # Errors
    ///
    /// In the case of an error, some blocks requested for deletion may remain in the block cache.
-    async fn delete(&self, range: &ScanRange) -> Result<(), Self::Error>;
+    async fn delete<'a>(&self, range: ScanRange) -> Result<(), Self::Error>;


What is the purpose of this named lifetime, given that it isn't mentioned in any of the arguments or the result?

This is a bug: I tried to fix the issue a different way that didn't work, and forgot to undo the lifetime definition.

str4d force-pushed the 1169-sync-engine branch from acedd6a to c83b443 Compare February 9, 2024 17:14

fluidvanadium reviewed Feb 12, 2024

Oscar-Pepper mentioned this pull request Feb 20, 2024

Add block cache trait #1192

Merged

AArnott reviewed Mar 1, 2024

Oscar-Pepper mentioned this pull request Mar 4, 2024

1169 sync engine with block cache trait #1217

Closed

str4d force-pushed the 1169-sync-engine branch from c83b443 to 2d09834 Compare April 1, 2024 21:39

zcash_client_backend: Implement async wallet synchronization function

24277a6

str4d force-pushed the 1169-sync-engine branch from 2d09834 to 24277a6 Compare April 2, 2024 00:26

str4d marked this pull request as ready for review April 2, 2024 00:35

CI: Test with sync feature flag

7f017bc

nuttycom approved these changes Apr 3, 2024

nuttycom merged commit 430212c into main Apr 3, 2024
24 of 26 checks passed

nuttycom deleted the 1169-sync-engine branch April 3, 2024 17:23
