feat(resharding) - congestion info computation #12581

Status: Merged (28 commits, merged into master via the merge queue on Jan 7, 2025)

Conversation

@wacban (Contributor) commented Dec 9, 2024:

Implementing the congestion info computation based on the parent congestion info and the receipt groups info from the parent trie. The seed used for the allowed shard is a bit hacky - please let me know if this makes sense.

I added an assertion checking that the buffered gas in congestion info is zero iff the buffers are empty. With this assertion in place, the test_resharding_v3_buffered_receipts_towards_splitted_shard tests fail without the updated congestion info and pass with it.

@wacban requested a review from a team as a code owner on December 9, 2024 16:45
@@ -213,11 +222,24 @@ impl ReshardingManager {
};
let mem_changes = trie_changes.mem_trie_changes.as_ref().unwrap();
let new_state_root = mem_tries.apply_memtrie_changes(block_height, mem_changes);
drop(mem_tries);
@wacban (Contributor Author):

This is annoying but needed to prevent a deadlock.

Reviewer (Contributor):

Sounds a bit scary 👀, but I assume it has to be this way

@wacban (Contributor Author):

It's a bit ugly but I don't think it's scary. Rust is smart enough to know not to reuse this variable anymore. Unless you had some other risk in mind?
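
For readers outside the thread, a minimal sketch of the pattern being discussed (the mutex and names below are illustrative assumptions, not the actual memtrie API): the guard has to be dropped before the code goes on to lock the same structure again, and once dropped, the compiler rejects any further use of it.

```rust
use std::sync::Mutex;

// Illustrative only: `state` stands in for the shared memtrie structure.
fn apply_and_reload(state: &Mutex<Vec<u64>>) {
    let guard = state.lock().unwrap();
    let _new_state_root = guard.len() as u64; // work that needs the guard
    drop(guard); // release the lock before locking the same mutex again below
    // Without the drop above, this second lock would deadlock (or panic).
    let _trie = state.lock().unwrap();
}

fn main() {
    apply_and_reload(&Mutex::new(vec![1, 2, 3]));
}
```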

let epoch_id = block.header().epoch_id();
let protocol_version = self.epoch_manager.get_epoch_protocol_version(epoch_id)?;

let trie = tries.get_trie_for_shard(new_shard_uid, new_state_root);
@wacban (Contributor Author):

bootstrap_congestion_info requires TrieAccess - can I get it somehow from the memtries above directly?

// TODO(resharding): set all fields of `ChunkExtra`. Consider stronger
// typing. Clarify where it should happen when `State` and
// `FlatState` update is implemented.
let mut child_chunk_extra = ChunkExtra::clone(&chunk_extra);
*child_chunk_extra.state_root_mut() = new_state_root;
// TODO(resharding) - Implement proper congestion info for
// resharding. The current implementation is very expensive.
@jancionear (Contributor), Dec 10, 2024:

Re: "While we're waiting for the bandwidth scheduler to give us an efficient way to do the same"

OutgoingMetadatas are already fully functional and merged into master. I didn't plan to add anything else. What sort of functionality are you looking for?

@wacban (Contributor Author):

Ah nice! I think all I need is the total size of the buffered receipts. I'll check and come back.

@wacban marked this pull request as draft on December 10, 2024 09:30
@marcelo-gonzalez (Contributor) left a comment:

Looks pretty sane to me, but I'm not super familiar with congestion control, so I'll let others approve.

@wacban (Contributor Author) commented Dec 11, 2024:

I will do the proper version anyway, so please don't mind it for now.

@wacban force-pushed the waclaw-resharding branch 2 times, most recently from 9529c2f to e55dae2, on December 12, 2024 11:52
@wacban (Contributor Author) commented Dec 13, 2024:

Nayduck run with extra assertions:
https://nayduck.nearone.org/#/run/917

codecov bot commented Dec 18, 2024:

Codecov Report

Attention: Patch coverage is 88.88889% with 22 lines in your changes missing coverage. Please review.

Project coverage is 70.59%. Comparing base (21b5109) to head (be19d21).
Report is 13 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| chain/chain/src/resharding/manager.rs | 92.45% | 0 Missing and 8 partials ⚠️ |
| ...chain/src/stateless_validation/chunk_validation.rs | 72.00% | 0 Missing and 7 partials ⚠️ |
| chain/client/src/client.rs | 33.33% | 0 Missing and 2 partials ⚠️ |
| core/primitives/src/types.rs | 71.42% | 2 Missing ⚠️ |
| chain/chain/src/runtime/mod.rs | 50.00% | 0 Missing and 1 partial ⚠️ |
| core/primitives/src/congestion_info.rs | 83.33% | 0 Missing and 1 partial ⚠️ |
| runtime/runtime/src/congestion_control.rs | 85.71% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #12581      +/-   ##
==========================================
+ Coverage   70.56%   70.59%   +0.02%     
==========================================
  Files         847      847              
  Lines      172735   173367     +632     
  Branches   172735   173367     +632     
==========================================
+ Hits       121897   122380     +483     
- Misses      45737    45861     +124     
- Partials     5101     5126      +25     
| Flag | Coverage Δ |
| --- | --- |
| backward-compatibility | 0.16% <0.00%> (-0.01%) ⬇️ |
| db-migration | 0.16% <0.00%> (?) |
| genesis-check | 1.36% <0.60%> (-0.01%) ⬇️ |
| linux | 69.18% <31.31%> (-0.08%) ⬇️ |
| linux-nightly | 70.19% <88.88%> (+0.02%) ⬆️ |
| pytests | 1.66% <0.59%> (-0.01%) ⬇️ |
| sanity-checks | 1.47% <0.59%> (-0.01%) ⬇️ |
| unittests | 70.42% <88.88%> (+0.02%) ⬆️ |
| upgradability | 0.20% <0.00%> (-0.01%) ⬇️ |

Flags with carried forward coverage won't be shown.


@marcelo-gonzalez (Contributor) left a comment:

I think it looks okay to me, but I will let @jancionear approve, since I'm not as familiar with the bandwidth scheduler/congestion code. But ping me if you need my stamp to unblock it.

let gas = receipt_groups.total_gas();
let gas = gas.try_into().expect("ReceiptGroup gas must fit in u64");

congestion_info
Reviewer (Contributor):

do we not have to set total_receipts_num as well?

@wacban (Contributor Author):

Good sanity check! The total_receipts_num field is part of ReceiptGroupsQueueData, which is resharded by copying it to the left child only. It is handled by the memtrie and flat storage split operations. The memtrie part is already implemented here and this PR adds the flat storage equivalent.

@jancionear (Contributor) left a comment:

lgtm, apart from one potential overflow

I'm not very familiar with the resharding logic, but the congestion control changes make sense 👍


let bytes = receipt_groups.total_size();
let gas = receipt_groups.total_gas();
let gas = gas.try_into().expect("ReceiptGroup gas must fit in u64");
@jancionear (Contributor):

There is no guarantee that total_gas will fit in u64.
70_000 receipts with 300 TGas each would exceed the capacity of u64. It's unlikely that we'll ever have that many receipts, but in theory it could happen.
IMO it would be better to subtract the u128.
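
For reference, the arithmetic behind this comment, using the numbers from the comment itself (an illustration, not code from the PR):

```rust
fn main() {
    let per_receipt_gas: u128 = 300_000_000_000_000; // 300 TGas = 3e14 gas
    let total: u128 = 70_000 * per_receipt_gas;      // 2.1e19 gas in total
    assert!(total > u64::MAX as u128);               // u64::MAX is roughly 1.84e19
}
```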


.get_shard_index(own_shard)?
.try_into()
.expect("ShardIndex must fit in u64");
let congestion_seed = own_shard_index;
Reviewer (Contributor):

It would be nice to use the same seed as usual (derived from the block height), but I guess it doesn't matter, as long as it's deterministic and only used on resharding boundaries.

@wacban (Contributor Author):

Yeah, the issue was that I couldn't easily get hold of the block height on the stateless validation path. Let me add a TODO, but for now, as you said, as long as it's deterministic and one-off it's fine.

Reviewer (Contributor):

Quick question: doesn't this mess up the coordination of the allowed shard with respect to other shards?
For example, suppose we had 5 shards and a round-robin method to assign allowed_shard. Say right before resharding we had the mapping:
Shard 0 -> Shard 3
Shard 1 -> Shard 4
Shard 2 -> Shard 0
Shard 3 -> Shard 1
Shard 4 -> Shard 2

After resharding we split Shard 4 into Shard 5 and Shard 6 (indices 4 and 5). If we break the coordination and just assign the allowed shard based on congestion_seed = own_shard_index, wouldn't that potentially lead to two shards having the same mapped allowed_shard?

Or is this issue just for the one block after resharding and life returns to normal after this block?

@wacban (Contributor Author):

There shouldn't be any duplicated allowed shards but please don't ask about missing chunks.

If I'm reading the code right, using just the shard index means that each shard will have itself as the allowed shard.

@wacban (Contributor Author):

And yeah it's only for one block during resharding.
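
A toy sketch of the behaviour described in this thread (the modulo-based round robin below is an assumption for illustration, not the exact nearcore formula): if the allowed shard is picked as shards[seed % shards.len()], then seeding each shard with its own index makes it pick itself, so no two shards end up with the same allowed shard for that one block.

```rust
fn allowed_shard(seed: u64, shards: &[u64]) -> u64 {
    shards[(seed % shards.len() as u64) as usize]
}

fn main() {
    // Shard ids after the split in the example above: shard 4 is replaced by 5 and 6.
    let shards: Vec<u64> = vec![0, 1, 2, 3, 5, 6];
    for (index, shard_id) in shards.iter().enumerate() {
        // Seeding with the shard's own index selects the shard itself,
        // so there are no duplicated allowed shards.
        assert_eq!(allowed_shard(index as u64, &shards), *shard_id);
    }
}
```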

@@ -189,7 +189,7 @@ impl TestReshardingParametersBuilder {
limit_outgoing_gas: self.limit_outgoing_gas.unwrap_or(false),
delay_flat_state_resharding: self.delay_flat_state_resharding.unwrap_or(0),
short_yield_timeout: self.short_yield_timeout.unwrap_or(false),
allow_negative_refcount: self.allow_negative_refcount.unwrap_or(false),
allow_negative_refcount: self.allow_negative_refcount.unwrap_or(true),
Reviewer (Contributor):

Is it ok to allow negative refcount in resharding tests by default? This looks like something that could be important to catch somewhere 👀

@wacban (Contributor Author):

Yeah, it's annoying, please see this comment with the latest thoughts on this:
#12581 (comment)

// the parent's buffered receipts from the parent's congestion info.
let mut congestion_info = parent_congestion_info;
for shard_id in parent_shard_layout.shard_ids() {
let receipt_groups = ReceiptGroupsQueue::load(parent_trie, shard_id)?;
@jancionear (Contributor), Jan 2, 2025:

I wonder if there's some way to protect against a scenario where the ReceiptGroupsQueue isn't fully initialized. The old receipts will most likely be forwarded within one epoch, but in theory there could still be some receipts not included in the metadata when the resharding happens.
Could we detect that and postpone the protocol upgrade or something? Idk, sounds complicated.

@wacban (Contributor Author):

I think it's borderline impossible that we wouldn't go through the queue for a full epoch. We do have some assertions in place to check that congestion info is zeroed in the right places iff the queue is empty, so at least we'll detect the issue some time later.

I don't think delaying is feasible as resharding is scheduled an epoch in advance and we only know about this condition at the last moment.

Reviewer (Contributor):

Maybe it'd be possible to bootstrap congestion info? Idk, not ideal either

@wacban (Contributor Author):

Yeah, cool idea, it could work but if we're swamped with receipts that could blow up the state witness. I still don't think it's necessary to worry about this case.
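
A rough sketch of the computation started in the snippet at the top of this thread (the struct fields and the loader closure are assumptions for illustration, not the PR's exact types): start from the parent's congestion info and remove the totals of the parent's buffered receipts recorded in each per-shard receipt-group queue.

```rust
struct CongestionSketch {
    buffered_receipts_gas: u128,
    receipt_bytes: u64,
}

struct GroupTotals {
    total_gas: u128,
    total_size: u64,
}

fn child_congestion_info(
    parent: CongestionSketch,
    parent_shard_ids: &[u64],
    load_receipt_groups: impl Fn(u64) -> Option<GroupTotals>,
) -> CongestionSketch {
    let mut congestion_info = parent;
    for shard_id in parent_shard_ids {
        // Skip shards that never had a receipt-groups metadata entry.
        let Some(groups) = load_receipt_groups(*shard_id) else { continue };
        congestion_info.buffered_receipts_gas =
            congestion_info.buffered_receipts_gas.saturating_sub(groups.total_gas);
        congestion_info.receipt_bytes =
            congestion_info.receipt_bytes.saturating_sub(groups.total_size);
    }
    congestion_info
}

fn main() {
    let parent = CongestionSketch { buffered_receipts_gas: 500, receipt_bytes: 300 };
    let child = child_congestion_info(parent, &[0, 1], |_| {
        Some(GroupTotals { total_gas: 100, total_size: 50 })
    });
    assert_eq!(child.buffered_receipts_gas, 300);
    assert_eq!(child.receipt_bytes, 200);
}
```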

@wacban (Contributor Author) commented Jan 2, 2025:

JFYI I was looking at why the tests need to allow negative refcount: it's the BANDWIDTH_SCHEDULER_STATE that is now double deleted. I think the issue only surfaced now because I moved the bandwidth scheduler protocol feature before resharding. I'm still not too happy about disabling this check by default, but at least it makes sense. The logic is the same as for delayed receipts: both are copied to both children and so are double deleted.

cc @staffik, who's looking into a proper solution for getting rid of allow_negative_refcount.
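
A toy illustration of the double delete described above (not nearcore code): if a single value is copied into both children and each child later removes it, a naive refcount goes negative unless the check is relaxed.

```rust
fn main() {
    // One reference from the parent state.
    let mut refcount: i64 = 1;
    // The split copies the value into both children without bumping the count,
    // and each child eventually deletes its copy.
    for _child in 0..2 {
        refcount -= 1;
    }
    assert_eq!(refcount, -1); // negative unless allow_negative_refcount is set
}
```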

@wacban enabled auto-merge on January 3, 2025 12:25
@shreyan-gupta self-requested a review on January 6, 2025 18:41
@shreyan-gupta (Contributor) left a comment:

Looks great! Thanks for handling this! :)

@@ -300,11 +300,11 @@ impl CongestionInfo {
Ok(())
}

pub fn remove_buffered_receipt_gas(&mut self, gas: Gas) -> Result<(), RuntimeError> {
pub fn remove_buffered_receipt_gas(&mut self, gas: u128) -> Result<(), RuntimeError> {
Reviewer (Contributor):

Is there a specific reason why we had to change this function signature to u128?

@wacban (Contributor Author):

It's possible for the sum of gas from many receipts to exceed u64 - or so @jancionear claimed :)
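
A minimal sketch of the widened signature (the counter, error type and message below are assumptions for illustration): taking u128 lets the caller pass the full receipt-group total without a lossy narrowing to u64, while the checked subtraction still catches underflow.

```rust
struct CongestionInfoSketch {
    buffered_receipts_gas: u128,
}

impl CongestionInfoSketch {
    fn remove_buffered_receipt_gas(&mut self, gas: u128) -> Result<(), String> {
        self.buffered_receipts_gas = self
            .buffered_receipts_gas
            .checked_sub(gas)
            .ok_or_else(|| "removing more buffered gas than was added".to_string())?;
        Ok(())
    }
}

fn main() {
    // A total well above u64::MAX still round-trips without truncation.
    let mut info = CongestionInfoSketch { buffered_receipts_gas: 30_000_000_000_000_000_000_000 };
    info.remove_buffered_receipt_gas(10_000_000_000_000_000_000_000).unwrap();
    assert_eq!(info.buffered_receipts_gas, 20_000_000_000_000_000_000_000);
}
```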

}

// Assert that empty buffers match zero buffered gas.
assert_eq!(all_buffers_empty, self.own_congestion_info.buffered_receipts_gas() == 0);
Reviewer (Contributor):

Nice assert to have!

// Update the congestion info based on the parent shard. It's
// important to do this step before the `retain_split_shard`
// because only the parent has the needed information.
if let Some(congestion_info) = chunk_extra.congestion_info_mut() {
Reviewer (Contributor):

sanity check: with resharding v3 feature enabled after congestion control, shouldn't we always have congestion_info in the chunk header (and chunk extra)?

@wacban (Contributor Author):

Yeah, if you like I can replace the if let with an expect

Reviewer (Contributor):

Yep, thanks! Just wanted to make sure
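
A tiny sketch of the suggested change (the types and expect message are illustrative stand-ins, not the real ChunkExtra API): replacing the if let with an expect makes a missing congestion info fail loudly instead of silently skipping the update.

```rust
struct ChunkExtraSketch {
    congestion_info: Option<u64>, // stand-in for the real CongestionInfo type
}

impl ChunkExtraSketch {
    fn congestion_info_mut(&mut self) -> Option<&mut u64> {
        self.congestion_info.as_mut()
    }
}

fn main() {
    let mut chunk_extra = ChunkExtraSketch { congestion_info: Some(0) };
    let congestion_info = chunk_extra
        .congestion_info_mut()
        .expect("congestion info must exist once congestion control is enabled");
    *congestion_info += 1;
}
```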


// Set the allowed shard based on the child shard.
let next_epoch_id = epoch_manager.get_next_epoch_id(&block_hash)?;
let next_shard_layout = epoch_manager.get_shard_layout(&next_epoch_id)?;
ReshardingManager::finalize_allowed_shard(
Reviewer (Contributor):

This whole bit from ReshardingManager::get_child_congestion_info_not_finalized to ReshardingManager::finalize_allowed_shard is duplicated across here and resharding manager. I think parent_trie.retain_split_shard may be included in the mix. Sounds like a helper function to me!

@wacban added this pull request to the merge queue (Jan 7, 2025)
@wacban removed this pull request from the merge queue due to a manual request (Jan 7, 2025)
@wacban enabled auto-merge (January 7, 2025 13:33)
@wacban added this pull request to the merge queue (Jan 7, 2025)
@github-merge-queue bot removed this pull request from the merge queue due to failed status checks (Jan 7, 2025)
@wacban added this pull request to the merge queue (Jan 7, 2025)
@github-merge-queue bot removed this pull request from the merge queue due to failed status checks (Jan 7, 2025)
@wacban added this pull request to the merge queue (Jan 7, 2025)
Merged via the queue into master with commit 19601cb Jan 7, 2025
28 checks passed
@wacban deleted the waclaw-resharding branch on January 7, 2025 15:34