Unusual blocktime with same validator in subsequent blocks after OracleRequest #882
Comments
2336997 is a boring regular one:
2336998 is a bit more interesting:
One simple view change and we're done. It just so happened that for this block/view number combination our node is the one to create the PrepareRequest. The only thing that might be interesting is this line:
but that's neo-project/neo#2057: the node has not seen any traffic from other nodes, so it's a little shy to send a CV; instead it sends a recovery request and waits for other messages. Then 4 CVs are delivered, and after the second timer event (+60 seconds) it feels brave enough to join the CV party.
I think this expectation is not a correct one. Once we're past 15s, a block will be produced right after the view change (if the primary node for this view is alive, of course). Exponential backoff doesn't mean that the first view change will happen after 6*15s; in fact it could happen right after 30s, but then timers will be adjusted to give more and more time after each CV.
Hey @roman-khimov thanks for the logs and the explanation! Okay, then it seems everything is fine. 🙏🏼 So, just to get this straight: As the NSPCC node was speaker for the last block, it sets its timeout to
Then, the timeout is set to
Did I get this right?
Yep, that's how it all has happened.
All right, thanks for verifying. 👍🏼
Observation
The blocks 2336997 and 2336998 showed some inconsistent behaviour. They have timestamps 90 seconds apart, and they seem to have had the same validator (NSPCCpw8YmgNDYWiBfXJHRfz38NDjv6WW3), which, afaik, is the only Neo-Go node amongst the 7 validators by that time, with the others running a C# node. Another observation is that in block 2336997 there's a transaction with an OracleRequest, and in the next block (2336998) is a transaction with an OracleResponse. The blocks before 2336997 and after 2336998 seem to have had normal behaviour.
Expected behaviour
With the same validator in subsequent blocks, it means (in my understanding) that the other validators failed to build and propose a block. In that scenario, I would expect this validator to only propose a new block after exponential time, i.e., waiting a lot longer than 90 seconds (6*15 seconds, with 15 seconds block time).
Potential Issues
@vang1ong7ang, @igormcoelho, and I have already looked into it to some extent, but we haven't found anything concrete yet to explain the observation. The current assumption is that the 6 validators that run a C# node could somehow not manage to include an OracleResponse for the corresponding previous OracleRequest in a block, i.e., there might be an issue with the Oracle plugin there. Further, it seems the Neo Go node proposed a block with linear (instead of exponential) timeout after the C# nodes did not propose any block, which could explain why the Neo Go node proposed a block already after 90 seconds. However, it seems that the Go dbft implementation contains this calculation correctly here. @roman-khimov
@vang1ong7ang, @igormcoelho, please correct me if I got something wrong in this one.