Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect production scheduling after error: "Not producing block because "defproducerd" signed a block at the next block time or later" #2380

Closed
greg7mdp opened this issue Apr 4, 2024 · 3 comments
Assignees
Labels
bug Something isn't working 👍 lgtm OCI Work exclusive to OCI team

Comments

@greg7mdp
Copy link
Contributor

greg7mdp commented Apr 4, 2024

Sometimes, the test trx_finality_status_forked_if_test, and likely others, show in the log of node_03 a volley of messages:

Waiting till another block is received and scheduling Speculative/Production Change

These are emitted in producer_plugin_impl::schedule_production_loop(), and are likely caused by incorrect parameters to, or calculations in block_timing_util::calculate_producer_wake_up_time().

We should investigate whether there is indeed a miscalculation here (maybe an off-by-one issue) and address it.

Below is a larger section of the log showing the repeated error messages.

error 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:1853      start_block          ] Not producing block because "defproducerd" signed a block at the next block time or later (2024-03-13T19:23:57.500) than the pending block time (2024-03-13T19:23:57.500)
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2515      schedule_production_ ] Waiting till another block is received and scheduling Speculative/Production Change
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2589      schedule_delayed_pro ] Scheduling Speculative/Production Change at 2024-03-13T19:23:57.200
error 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:1853      start_block          ] Not producing block because "defproducerd" signed a block at the next block time or later (2024-03-13T19:23:57.500) than the pending block time (2024-03-13T19:23:57.500)
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2515      schedule_production_ ] Waiting till another block is received and scheduling Speculative/Production Change
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2589      schedule_delayed_pro ] Scheduling Speculative/Production Change at 2024-03-13T19:23:57.200
error 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:1853      start_block          ] Not producing block because "defproducerd" signed a block at the next block time or later (2024-03-13T19:23:57.500) than the pending block time (2024-03-13T19:23:57.500)
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2515      schedule_production_ ] Waiting till another block is received and scheduling Speculative/Production Change
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2589      schedule_delayed_pro ] Scheduling Speculative/Production Change at 2024-03-13T19:23:57.200
error 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:1853      start_block          ] Not producing block because "defproducerd" signed a block at the next block time or later (2024-03-13T19:23:57.500) than the pending block time (2024-03-13T19:23:57.500)
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2515      schedule_production_ ] Waiting till another block is received and scheduling Speculative/Production Change
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2589      schedule_delayed_pro ] Scheduling Speculative/Production Change at 2024-03-13T19:23:57.200
error 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:1853      start_block          ] Not producing block because "defproducerd" signed a block at the next block time or later (2024-03-13T19:23:57.500) than the pending block time (2024-03-13T19:23:57.500)
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2515      schedule_production_ ] Waiting till another block is received and scheduling Speculative/Production Change
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2589      schedule_delayed_pro ] Scheduling Speculative/Production Change at 2024-03-13T19:23:57.200
error 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:1853      start_block          ] Not producing block because "defproducerd" signed a block at the next block time or later (2024-03-13T19:23:57.500) than the pending block time (2024-03-13T19:23:57.500)
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2515      schedule_production_ ] Waiting till another block is received and scheduling Speculative/Production Change
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2589      schedule_delayed_pro ] Scheduling Speculative/Production Change at 2024-03-13T19:23:57.200
error 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:1853      start_block          ] Not producing block because "defproducerd" signed a block at the next block time or later (2024-03-13T19:23:57.500) than the pending block time (2024-03-13T19:23:57.500)
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2515      schedule_production_ ] Waiting till another block is received and scheduling Speculative/Production Change
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2589      schedule_delayed_pro ] Scheduling Speculative/Production Change at 2024-03-13T19:23:57.200
error 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:1853      start_block          ] Not producing block because "defproducerd" signed a block at the next block time or later (2024-03-13T19:23:57.500) than the pending block time (2024-03-13T19:23:57.500)
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2515      schedule_production_ ] Waiting till another block is received and scheduling Speculative/Production Change
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2589      schedule_delayed_pro ] Scheduling Speculative/Production Change at 2024-03-13T19:23:57.200
error 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:1853      start_block          ] Not producing block because "defproducerd" signed a block at the next block time or later (2024-03-13T19:23:57.500) than the pending block time (2024-03-13T19:23:57.500)
debug 2024-03-13T19:23:57.378 nodeos    producer_plugin.cpp:2515      schedule_production_ ] Waiting till another block is received and scheduling Speculative/Production Change
debug 2024-03-13T19:23:57.379 nodeos    producer_plugin.cpp:2589      schedule_delayed_pro ] Scheduling Speculative/Production Change at 2024-03-13T19:23:57.200
@greg7mdp greg7mdp changed the title Uncorrect production scheduling after error: "Not producing block because "defproducerd" signed a block at the next block time or later" Incorrect production scheduling after error: "Not producing block because "defproducerd" signed a block at the next block time or later" Apr 4, 2024
@enf-ci-bot enf-ci-bot moved this to Todo in Team Backlog Apr 4, 2024
@greg7mdp greg7mdp self-assigned this Apr 4, 2024
@greg7mdp greg7mdp moved this from Todo to In Progress in Team Backlog Apr 4, 2024
@greg7mdp greg7mdp moved this from In Progress to Todo in Team Backlog Apr 5, 2024
@greg7mdp greg7mdp removed their assignment Apr 5, 2024
@greg7mdp
Copy link
Contributor Author

greg7mdp commented Apr 5, 2024

I looked at the issue some, but this is not an area I'm familiar with, it is hard for me to pinpoint exactly what is wrong.
In addition, making changes to calculate_next_block_slot could have serious consequences for block producers (possibly missing slots), so I feel this should be investigated by someone with more knowledge in this area.

@arhag arhag added bug Something isn't working 👍 lgtm and removed triage labels Apr 5, 2024
@greg7mdp
Copy link
Contributor Author

greg7mdp commented Apr 5, 2024

I have verified that the same behavior is seen in the test which doesn't activate IF, so this is not a regression dues to the Savanna consensus.

Running the test locally as: ./tests/trx_finality_status_forked_test.py "-v" "--keep-logs"

One odd observation is that I see the test failing randomly. Seems to fail about every other run.

This doesn't seem to happen when running the test of the main branch instead of hotstuff_integration, but the test has changed in hotstuff_integration, for example using 4 producers instead of tw.

@heifner
Copy link
Member

heifner commented Aug 5, 2024

Resolved by: AntelopeIO/spring#473

@heifner heifner closed this as completed Aug 5, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in Team Backlog Aug 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working 👍 lgtm OCI Work exclusive to OCI team
Projects
Archived in project
Development

No branches or pull requests

4 participants