Restart and Synchronize Issue #70

Open
zicofish opened this issue Aug 28, 2022 · 4 comments

Comments

@zicofish

Hi, we have been using this library for a consensus scenario, but there seems to be a problem with restarting a node.

In our scenario, we run 4 nodes for consensus.
Then we stop one of them for roughly 30 to 60 minutes.
Then we restart the node.

Afterwards, the restarted node processes a large number of synchronization blocks and then gets stuck. It eventually drags down the other three nodes as well, and the whole system hangs.

What could be the problem, and do you have a solution for this case?

Thanks~

@zicofish
Author

Btw, the database written out by hotstuff also keeps growing, and there is no mechanism to remove old data. Is this expected?

@asonnino
Owner

asonnino commented Sep 1, 2022

Sadly, this codebase does not implement crash recovery, so there is no safe way to restart a node. To support it, we would need to persist several pieces of state to storage (e.g., the preferred round and the last voted round).
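
For readers wondering what persisting that state might look like, here is a minimal sketch. It is not code from this repository: the `SafetyState` type, its `save`/`load`/`try_vote` methods, and the 16-byte file layout are all hypothetical, and only the "do not vote twice in the same or an earlier round" part of the voting rule is modelled. The idea is to write the updated state durably before the vote is released, so a restarted node cannot contradict a vote it cast before crashing.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical safety state; the two fields are the ones named in the comment above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct SafetyState {
    pub last_voted_round: u64,
    pub preferred_round: u64,
}

impl SafetyState {
    /// Write the state to a temporary file, flush it to disk, then rename it
    /// over the real file so a crash never leaves a half-written state behind.
    pub fn save(self, path: &Path) -> io::Result<()> {
        let tmp = path.with_extension("tmp");
        let mut file = File::create(&tmp)?;
        file.write_all(&self.last_voted_round.to_le_bytes())?;
        file.write_all(&self.preferred_round.to_le_bytes())?;
        file.sync_all()?;
        fs::rename(&tmp, path)
    }

    /// Load the state on startup; a missing file is treated as a fresh node.
    pub fn load(path: &Path) -> io::Result<Self> {
        let bytes = match fs::read(path) {
            Ok(bytes) => bytes,
            Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(Self::default()),
            Err(e) => return Err(e),
        };
        if bytes.len() != 16 {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "corrupt safety state"));
        }
        let mut voted = [0u8; 8];
        let mut preferred = [0u8; 8];
        voted.copy_from_slice(&bytes[..8]);
        preferred.copy_from_slice(&bytes[8..]);
        Ok(SafetyState {
            last_voted_round: u64::from_le_bytes(voted),
            preferred_round: u64::from_le_bytes(preferred),
        })
    }

    /// Returns Ok(true) if voting in `round` is allowed. The new state is
    /// persisted before the in-memory copy is updated (simplified rule).
    pub fn try_vote(&mut self, round: u64, path: &Path) -> io::Result<bool> {
        if round <= self.last_voted_round {
            return Ok(false);
        }
        let next = SafetyState { last_voted_round: round, ..*self };
        next.save(path)?;
        *self = next;
        Ok(true)
    }
}
```

A real implementation would presumably store this in the node's existing database rather than a side file, and would also need to fsync the parent directory on some filesystems; both are left out of the sketch.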

@asonnino
Owner

asonnino commented Sep 1, 2022

Regarding the database size, it is unclear how to solve this by looking only at the validator's codebase. A typical solution is to clearly define the "active state" of the validator and clean up everything else at epoch change (which is not currently implemented), then rely on some sort of "archival" nodes to persist the entire history of the blockchain. This, however, implies a blockchain ecosystem, which is beyond the scope of the consensus core.
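
To make the "active state plus cleanup" idea concrete, here is a small sketch under stated assumptions. The `BlockStore` type, its round-keyed in-memory map, and the `prune_before` method are hypothetical illustrations, not part of this codebase; choosing `first_active_round` (for example, at an epoch boundary) is exactly the part that remains undefined here. Everything older than the cutoff is removed from the store and returned, so the caller could hand it to an archival node before dropping it.

```rust
use std::collections::BTreeMap;

/// Hypothetical block store keyed by round number (blocks kept as raw bytes).
pub struct BlockStore {
    blocks: BTreeMap<u64, Vec<u8>>,
}

impl BlockStore {
    pub fn new() -> Self {
        BlockStore { blocks: BTreeMap::new() }
    }

    pub fn insert(&mut self, round: u64, block: Vec<u8>) {
        self.blocks.insert(round, block);
    }

    pub fn contains(&self, round: u64) -> bool {
        self.blocks.contains_key(&round)
    }

    pub fn len(&self) -> usize {
        self.blocks.len()
    }

    /// Keep only blocks with round >= `first_active_round` and return the
    /// removed (older) blocks so they can be archived elsewhere.
    pub fn prune_before(&mut self, first_active_round: u64) -> BTreeMap<u64, Vec<u8>> {
        // `split_off` leaves the older entries in place and returns the active ones.
        let active = self.blocks.split_off(&first_active_round);
        // Swap the active set in and hand the older entries back to the caller.
        std::mem::replace(&mut self.blocks, active)
    }
}
```

With an on-disk key-value store, the same shape would likely become a range delete over round-prefixed keys, run once per epoch change.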

@zicofish
Author

zicofish commented Sep 1, 2022

@asonnino Thanks. I have been using this library in a scenario that requires recovery, even after a long shutdown. I have already implemented something that should work. Perhaps I will open a PR after testing. :)

2 participants