Validators' view numbers go out of sync #2100
Comments
The issues seen here are (at least partly) caused by a bug fixed in #2064. proto-testnet is currently running The code linked has been running in devnet over the weekend, and we have not seen any timeouts that failed to resolve within seconds. I suggest we get this code running in proto-testnet. I'll upgrade once I've confirmed with James.
We should deploy #2064 ASAP, no doubt. But were the issues reported above caused by
I'm seeing nodes receiving a
We're seeing nodes stuck in a conversation about who has which blocks, sending the blocks they have and requesting others, with validators sometimes not having a chance to process
Proposed fixes
Shawn's block syncing work may help as there will be a reduction in
I don't believe that there is any priority order at the moment. The selects are unbiased.
Thanks, you're right. Updated the comment to reflect that.
Status update: I'm holding this ticket open and will re-assess once #2089 is merged.
As observed twice on the proto-testnet and documented on Slack:
https://zilliqa-team.slack.com/archives/C04D147S9QX/p1734803926514389?thread_ts=1734802780.508109&cid=C04D147S9QX
https://zilliqa-team.slack.com/archives/C04D147S9QX/p1734978319797799
without any obvious reason such as nodes being down, the validators formed two groups that ended up 3 views apart. Neither group was large enough to reach a supermajority, so no blocks could be proposed for a couple of hours, until a supermajority of validators was in the same view again.
Investigate the GCP logs to figure out what caused the validators' view numbers to diverge and how this can be prevented in the future.
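To make the stall condition concrete, here is a minimal sketch of the quorum check, not the node's actual code. It assumes equal voting power per validator and a threshold of strictly more than two thirds; the real network may weight votes by stake. The function name `supermajority_view`, the validator names, and the view numbers 100 and 103 are hypothetical.

```python
from collections import Counter
from typing import Hashable, Mapping, Optional


def supermajority_view(views: Mapping[Hashable, int]) -> Optional[int]:
    """Return the view shared by more than 2/3 of validators, or None.

    Assumption: every validator has equal weight. A stake-weighted
    network would sum stake per view instead of counting heads.
    """
    if not views:
        return None
    # Find the most common view and how many validators are in it.
    view, count = Counter(views.values()).most_common(1)[0]
    # Integer form of count / total > 2/3, avoiding float rounding.
    return view if 3 * count > 2 * len(views) else None


# Hypothetical split: ten validators, two groups 3 views apart.
split = {f"v{i}": 100 if i < 5 else 103 for i in range(10)}
print(supermajority_view(split))  # None: neither group can propose

# Once every validator is back in the same view, progress resumes.
resynced = {name: 103 for name in split}
print(supermajority_view(resynced))  # 103
```

With an even split, neither group passes the threshold, which matches the observed behaviour: no blocks until enough validators converge on one view.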