
Keep data in failure cases in sync service #2361

Open

AurelienFT wants to merge 25 commits into base: master
Conversation

Contributor
@AurelienFT AurelienFT commented Oct 15, 2024

Linked Issues/PRs

Closes #2357

Description

This pull request introduces a caching mechanism to the sync service to avoid redundant data fetching from the network. The most important changes include adding a cache module, modifying the Import struct to include a cache, and updating related methods to utilize this cache.

Caching Mechanism:

  • crates/services/sync/src/import.rs: Added a new cache module and integrated it into the Import struct. Updated methods to use the cache for fetching and storing headers and blocks.
  • The cache mechanism allows the importer to retrieve a stream of batches, each containing either cached headers, cached full blocks, or a range of heights that still needs to be fetched (see the sketch below).
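
A rough sketch of that idea (the type and variant names below are assumptions for illustration, not necessarily the ones used in this PR):

```rust
use std::ops::Range;

// Placeholder types standing in for the real SealedBlockHeader / SealedBlock.
pub struct SealedBlockHeader;
pub struct SealedBlock;

/// What the cache-aware stream yields for each chunk of the requested range:
/// data that was already fetched and validated, or a sub-range that still has
/// to be requested from the network.
pub enum CachedDataBatch {
    Headers(Vec<SealedBlockHeader>),
    Blocks(Vec<SealedBlock>),
    /// Nothing cached for this sub-range; the importer must ask p2p for it.
    None(Range<u32>),
}
```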

Test Updates:

  • Updated the P2P port mocks to be async so they can simulate the more complex scenarios needed for this feature.

Roughly 50% of the changes in this PR are updates to existing tests plus new tests for the cache.

Checklist

  • Breaking changes are clearly marked as such in the PR description and changelog
  • New behavior is reflected in tests
  • The specification matches the implemented behavior (link update PR if changes are needed)

Before requesting review

  • I have reviewed the code myself
  • I have created follow-up issues caused by this PR and linked them here

@AurelienFT AurelienFT marked this pull request as ready for review October 16, 2024 16:44
@AurelienFT AurelienFT requested a review from a team October 16, 2024 16:44
Contributor

@netrome netrome left a comment

I don't understand the import task well enough to approve right now. I need clarification on the following points:

  1. How do we ensure this cache doesn't grow forever? Is the Import task short-lived? While the import task launches short-lived streams, it seems like a long-lived task to me.
  2. How can we be sure we'll query exactly the same ranges as we have cached? Where is that invariant maintained?

Let me know if you want to jump on a call to chat about this, or just write if I'm missing something obvious here.

@@ -98,6 +99,26 @@ pub struct Import<P, E, C> {
executor: Arc<E>,
/// Consensus port.
consensus: Arc<C>,
/// A cache of already validated headers or blocks.
cache: SharedMutex<HashMap<Range<u32>, CachedData>>,
Contributor

suggestion: Perhaps we should consider using a DashMap here instead like in #2352
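
For context, a tiny sketch of what the DashMap alternative could look like (the `dashmap` crate would be a new dependency and `CachedData` is a placeholder type, not the PR's actual type):

```rust
use dashmap::DashMap;
use std::ops::Range;

struct CachedData;

fn sketch() {
    // DashMap shards its locks internally, so the outer SharedMutex is no longer needed.
    let cache: DashMap<Range<u32>, CachedData> = DashMap::new();
    cache.insert(0..10, CachedData);
    if let Some(entry) = cache.get(&(0..10)) {
        let _data: &CachedData = entry.value();
    }
}
```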

Contributor

Alternatively, since we're storing ranges, a BTreeMap could be useful. I'm a bit hesitant about using Range<u32> as the key here. It's not obvious to me that we'll process exactly the same ranges if the stream is restarted. It would be more robust to instead maintain a map from block height to the cached data at that height.

Or can we be sure that we're

  1. not storing overlapping ranges, and
  2. going to query the same ranges that we have cached?
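
For illustration, a minimal sketch of the height-keyed alternative described above (all names are placeholders, not fuel-core types):

```rust
use std::collections::BTreeMap;
use std::ops::Range;

enum CachedData {
    Header(/* SealedBlockHeader */ ()),
    Block(/* SealedBlock */ ()),
}

#[derive(Default)]
struct HeightKeyedCache {
    inner: BTreeMap<u32, CachedData>,
}

impl HeightKeyedCache {
    // Each height has exactly one entry, so overlapping ranges cannot exist by construction.
    fn insert(&mut self, height: u32, data: CachedData) {
        self.inner.insert(height, data);
    }

    // Whatever range the importer asks for, the cached heights inside it can be looked up
    // directly, even if a restarted stream uses different range boundaries.
    fn cached_in<'a>(&'a self, range: Range<u32>) -> impl Iterator<Item = (&'a u32, &'a CachedData)> {
        self.inner.range(range)
    }
}
```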

Contributor Author

@AurelienFT AurelienFT Oct 16, 2024

I can use DashMap, I didn't know about it; seems interesting.

About storing ranges, I answered here: #2361 (comment). However, I'm trying to implement it now, and the problem is that p2p only works with ranges, so if we have cached data in the middle of a range we would have to bisect that range in two.

Contributor

Yeah, that's why I think a BTreeMap is pretty nice, since it allows querying ranges of keys directly.

Contributor Author

Yeah, but this doesn't solve the problem that we will have to split the ranges asked of p2p and multiply the number of requests.
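
To illustrate the concern with a toy example (a hypothetical helper, not code from this PR): a single requested range gets split around the cached heights, turning one p2p request into several.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

// Compute which sub-ranges of `requested` are NOT covered by a cache keyed by
// individual block height.
fn missing_sub_ranges<V>(cache: &BTreeMap<u32, V>, requested: Range<u32>) -> Vec<Range<u32>> {
    let mut gaps = Vec::new();
    let mut next_uncached = requested.start;
    for (&height, _) in cache.range(requested.clone()) {
        if height > next_uncached {
            gaps.push(next_uncached..height);
        }
        next_uncached = height + 1;
    }
    if next_uncached < requested.end {
        gaps.push(next_uncached..requested.end);
    }
    gaps
}

fn main() {
    let mut cache = BTreeMap::new();
    for h in 30..=50u32 {
        cache.insert(h, ());
    }
    // One original p2p request (0..100) becomes two: 0..30 and 51..100.
    assert_eq!(missing_sub_ranges(&cache, 0..100), vec![0..30u32, 51..100]);
}
```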

Contributor Author

DashMap isn't an option anymore, as we are using a BTreeMap.

header_stream
let ranges = range_chunks(range, params.header_batch_size);
futures::stream::iter(ranges)
.map({
Contributor

While the pattern was established before this PR, I think it would be nice to use then instead of map here, and skip the .awaits. We'd be able to return just a Stream<Item = SealedBlockBatch> instead of having the nested futures in the returned stream.
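
For illustration, a toy comparison of the two combinators using the `futures` crate with simplified stand-in types (not the actual import code):

```rust
use futures::stream::{self, StreamExt};
use std::ops::Range;

async fn fetch_batch(range: Range<u32>) -> Vec<u32> {
    range.collect()
}

async fn demo() {
    let ranges = vec![0..2u32, 2..4];

    // `map` keeps the futures as items: the stream type is
    // Stream<Item = impl Future<Output = Vec<u32>>>, so callers must await twice
    // (or drive the inner futures with a combinator such as `buffered`).
    let nested = stream::iter(ranges.clone()).map(fetch_batch);
    let _batches: Vec<Vec<u32>> = nested.buffered(2).collect().await;

    // `then` awaits inside the stream, which directly yields Stream<Item = Vec<u32>>.
    let flat = stream::iter(ranges).then(fetch_batch);
    let _batches: Vec<Vec<u32>> = flat.collect().await;
}
```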

Contributor Author

I agree, and there are a lot more things to improve in this service. I don't want to make this PR even bigger, so I created an issue for that: #2370

crates/services/sync/src/import.rs (outdated comment, resolved)
Contributor Author

AurelienFT commented Oct 16, 2024

@netrome Thanks for taking the time to review this. Regarding your questions:
1 - Yes, for me it will live a long time, but all requested data should eventually succeed and therefore be cleared; otherwise we will only ever have batch_size elements in the cache. But I'm not very sure about this, which is why I placed a note about it under "Interrogation" in the PR. Maybe we need pruning management.
2 - I was thinking that we re-request exactly the same ranges because the batch_size doesn't change, but the starting point can move to the last synced block, and so the ranges can change. I think you are right that the ranges can change; I will ask a few questions to @xgreenx.
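
For point 1, a minimal illustration of what pruning could look like if the cache ends up keyed by block height (`prune_up_to` is a hypothetical helper, not something this PR implements):

```rust
use std::collections::BTreeMap;

// Once blocks up to `committed_height` are executed and committed, everything
// at or below that height can be dropped from the cache.
fn prune_up_to<V>(cache: &mut BTreeMap<u32, V>, committed_height: u32) {
    // `split_off` returns the part of the map with keys >= the given key,
    // so keeping only that part drops every entry at or below `committed_height`.
    *cache = cache.split_off(&(committed_height.saturating_add(1)));
}
```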

@AurelienFT AurelienFT changed the base branch from release/v0.40.0 to master October 16, 2024 21:21
Contributor

@rafal-ch rafal-ch left a comment

So far looks good, I need to have a deeper look at the tests though.

CHANGELOG.md (outdated comment, resolved)
crates/services/sync/src/import.rs (outdated comment, resolved)
crates/services/sync/src/import/back_pressure_tests.rs (outdated comment, resolved)
@AurelienFT AurelienFT marked this pull request as draft October 17, 2024 09:17
@AurelienFT
Contributor Author

Converted to draft because of a big refactor.

@AurelienFT AurelienFT marked this pull request as ready for review October 18, 2024 10:33
@AurelienFT
Contributor Author

Now everything is cached one by one, but there is an issue for which I'm having a hard time finding a solution. When we successfully got the header but never got the transactions, we need the peer_id to ask for the transactions again. However, if I cache the peer_id that gave us the header but failed to give us the transactions, we will ask it again, and I don't think we want to re-ask a peer that returned a failure. But I don't have any way to find a peer that I know has the transactions.

On top of that, the range that I build from cached data could have been fetched from multiple peers.
The only solution I see that simplifies everything, but caches less, is to cache only full blocks.
Any ideas @netrome @xgreenx @rafal-ch ?

@AurelienFT AurelienFT marked this pull request as draft October 21, 2024 11:40
Contributor

netrome commented Oct 21, 2024

Had a chat about this. @xgreenx proposed we change the p2p interface to not require any peer ID when requesting transactions, but instead leave it up to the p2p implementation to decide which peer to request them from and return that peer ID in the response.
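
For reference, a rough sketch of what such a port might look like; every name below is a placeholder rather than the actual fuel-core-sync interface, and it assumes a toolchain with return-position impl Trait in traits:

```rust
use std::future::Future;
use std::ops::Range;

pub struct PeerId(pub Vec<u8>);
pub struct Transactions(pub Vec<Vec<u8>>); // stand-in for the real transactions type

/// Data together with the peer that actually served it.
pub struct SourcePeer<T> {
    pub peer_id: PeerId,
    pub data: T,
}

pub trait PeerToPeerPort {
    /// Fetch the transactions for a block-height range from a peer chosen by the
    /// p2p layer itself; the response reports which peer served the request, so
    /// follow-up calls can reuse (or avoid) that peer. `None` means no peer could serve it.
    fn get_transactions_for_range(
        &self,
        block_heights: Range<u32>,
    ) -> impl Future<Output = Option<SourcePeer<Vec<Transactions>>>> + Send;
}
```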

@AurelienFT AurelienFT changed the base branch from master to add_p2p_fetch_txs_no_peer_specified October 21, 2024 12:44
@AurelienFT AurelienFT marked this pull request as ready for review October 21, 2024 13:22
@AurelienFT AurelienFT self-assigned this Oct 24, 2024
AurelienFT added a commit that referenced this pull request Oct 31, 2024
## Linked Issues/PRs
This is a requirement for
#2361

## Description

This PR adds a way to fetch transactions via p2p without specifying a
particular peer, letting p2p choose the one it prefers.
This will be used in #2361

## Checklist
- [x] Breaking changes are clearly marked as such in the PR description
and changelog
- [x] New behavior is reflected in tests
- [x] [The specification](https://github.com/FuelLabs/fuel-specs/)
matches the implemented behavior (link update PR if changes are needed)

### Before requesting review
- [x] I have reviewed the code myself
- [x] I have created follow-up issues caused by this PR and linked them
here

---------

Co-authored-by: Green Baneling <[email protected]>
Base automatically changed from add_p2p_fetch_txs_no_peer_specified to master October 31, 2024 08:47
@rymnc rymnc requested a review from Copilot November 21, 2024 10:18

Copilot reviewed 6 out of 10 changed files in this pull request and generated no suggestions.

Files not reviewed (4)
  • crates/services/sync/src/import/test_helpers/pressure_peer_to_peer.rs: Evaluated as low risk
  • crates/services/sync/src/import/tests.rs: Evaluated as low risk
  • crates/services/sync/src/ports.rs: Evaluated as low risk
  • CHANGELOG.md: Evaluated as low risk
Development

Successfully merging this pull request may close these issues.

fuel-core-sync should cache the result of responses instead of throwing them away
3 participants