-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use finalized reference block #350
Use finalized reference block #350
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks a good idea to reference a finalized block
@@ -177,6 +177,13 @@ var ( | |||
EnvVar: common.PrefixEnvVar(envVarPrefix, "MAX_BLOBS_TO_FETCH_FROM_STORE"), | |||
Value: 100, | |||
} | |||
FinalizationBlockDelayFlag = cli.UintFlag{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we query Beacon chain for finalized epoch to get the finalized block?
@@ -28,7 +28,7 @@ abstract contract EigenDAServiceManagerStorage is IEigenDAServiceManager { | |||
* time that nodes can be active after they have deregistered. The larger it is, the farther back stakes can be used, but the longer operators | |||
* have to serve after they've deregistered. | |||
*/ | |||
uint32 public constant BLOCK_STALE_MEASURE = 150; | |||
uint32 public constant BLOCK_STALE_MEASURE = 300; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How was this calculated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This number has to be greater than FinalizationBlockDelay
+ BatchInterval
/12
If the batch interval is 10 minutes, this evaluates to 150 blocks. I doubled it to ensure we were never running up against this constraint.
I think that 30 minutes has negligible UX implications, given that it is small relative to the 14 days that operators are expected to serve after deregistration.
cc @gpsanant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, add this to the comment? It's useful to know and to maintain this value in the future.
On the other hand it raises maintenance question as already 2 offchain parameters can affect this onchain parameter (or conversely, this impacts the choice of batch interval). 10 mins batch interval is 10*60/12=50 blocks, it means it cannot be significantly larger
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default FinalizationBlockDelay is 75
Usage: "The block delay to use for pulling operator state in order to ensure the state is finalized", | ||
Required: false, | ||
EnvVar: common.PrefixEnvVar(envVarPrefix, "FINALIZATION_BLOCK_DELAY"), | ||
Value: 75, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to wait for finalization? What's the downside of using a non-final reference block?
My understanding of the reason why we may want to use an older reference block is to prevent requesting newer blocks than what indexer has indexed. For this purpose, 5~10 blocks seems sufficient.
One downside of having this number too big might a UX concern where operator state is applied that much later. wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The advantage is that the OperatorState which we use is unlikely to change due to a reorg. Using 15 minutes here makes it very unlikely that we'll ever fail a batch due to a reorg as we were seeing on goerli.
I'm not really sure that a period of 15 minutes has any important UX drawbacks. We are mostly moving in the direction of decreasing the frequency of state updates (AVSSync takes us to stake updates once per week and we may do something similar with operator registrations), so I think 15 minutes here doesn't really pose a big issue.
I may be missing something though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on the minimal impact on UX and the advantage of this approach.
UX drawback was the only thing i can think of, but my concern is if there is anything else we might be missing here because we've only tested under the scenario where reference block was very recent, and we haven't seen any issues with operator state being reorg'd.
This is probably fine, but we should definitely test this in the wild for some time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and we haven't seen any issues with operator state being reorg'd.
Haven't we? I think the assumption has been that the reorg issues we've been seeing on testnet were exactly this.
But yeah, let's see if it causes any unexpected issues. I can't think if any right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another part is the operator experience where they will be held longer for dispersal after opt-out.
Overall I think it's an improvement for robustness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
according to this twitter thread, it looks like on average 20 reorg per day. So in the worst case, need to wait 20 blocks. https://twitter.com/terencechain/status/1768664666996355178
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does 20 reorgs / day mean we need to wait 20 blocks? I don't understand why these numbers would be related.
@@ -45,7 +45,6 @@ services: | |||
SRS_LOAD: 2900 | |||
CHALLENGE_ORDER: 3000 | |||
LOG_LEVEL: "debug" | |||
LOG_FORMAT: "text" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we not want this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default is set to "json", so I was going with that. Is there a reason we want text logging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@@ -28,7 +28,7 @@ abstract contract EigenDAServiceManagerStorage is IEigenDAServiceManager { | |||
* time that nodes can be active after they have deregistered. The larger it is, the farther back stakes can be used, but the longer operators | |||
* have to serve after they've deregistered. | |||
*/ | |||
uint32 public constant BLOCK_STALE_MEASURE = 150; | |||
uint32 public constant BLOCK_STALE_MEASURE = 300; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, add this to the comment? It's useful to know and to maintain this value in the future.
On the other hand it raises maintenance question as already 2 offchain parameters can affect this onchain parameter (or conversely, this impacts the choice of batch interval). 10 mins batch interval is 10*60/12=50 blocks, it means it cannot be significantly larger
Why are these changes needed?
Uses the a finalized block for the operator state instead of the most recent block.
Checks