Farmer hangs after Ctrl-C when there are invalid chunk errors #3178
Comments
Attempted Fixes
This issue still happens after the code changes in:
Further Analysis
I added some logs to see where the hang was:
The farmer is blocked at the end of the |
Aha, it did exit, it just took 17 minutes. Full logs (including logs that show where in the code it hung):
Setup
On macOS 13.7 on M1 Max, I ran the following commands:
./subspace-node run --chain devnet --base-path ~/subspace-nets/devnet-2024-10-24/ --blocks-pruning archive --state-pruning archive --sync full --bootstrap-nodes /dns/bootstrap-0.devnet.subspace.foundation/tcp/30333/p2p/12D3KooWKd4qisMjuBXU4YvpbJgepiX278iumqMjCC3GiUjjZEZS --dsn-bootstrap-nodes /dns/bootstrap-0.devnet.subspace.foundation/tcp/30533/p2p/12D3KooWSKAQm66N7jQme72obbPPquBSZuSqMyx5mH5ki1aZ4pGv --pot-external-entropy 0xddca5a66 --farmer
This bug happens with the binaries from:
Root Cause
I understand the network is partly shut down, and the farmer's storage might be corrupted. But I'm not sure how the corruption happened, because I was only running the devnet binaries as part of the test network. Unfortunately I don't have logs of the corruption itself, because my terminal only saves a few thousand lines.
Whatever the root cause was, the farmer shouldn't hang when it's newly started, even if its storage is corrupt.
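As an illustration of the behaviour expected here, the following is a minimal sketch of cooperative shutdown, not the farmer's actual code. It assumes a worker that checks a shared flag before each chunk, so that exit latency is bounded by one chunk rather than by the whole read. The names `read_chunk`, `read_all_chunks` and `run_with_shutdown` are hypothetical, and every chunk is reported as invalid to mimic corrupted storage.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for reading and validating one chunk from disk.
/// Every chunk is reported as invalid to mimic corrupted storage.
fn read_chunk(_index: usize) -> Result<(), &'static str> {
    thread::sleep(Duration::from_millis(1));
    Err("invalid chunk")
}

/// Processes chunks until all are done or `shutdown` is set.
/// Returns the number of chunks attempted.
fn read_all_chunks(total: usize, shutdown: &AtomicBool) -> usize {
    let mut attempted = 0;
    for index in 0..total {
        // Check the flag before each unit of work, so a shutdown request
        // is noticed after at most one more chunk.
        if shutdown.load(Ordering::Relaxed) {
            break;
        }
        if let Err(_error) = read_chunk(index) {
            // A real implementation would log the error; the sketch moves on.
        }
        attempted += 1;
    }
    attempted
}

/// Starts a worker thread, requests shutdown after `delay`, and returns
/// how many chunks the worker attempted before it stopped.
fn run_with_shutdown(total: usize, delay: Duration) -> usize {
    let shutdown = Arc::new(AtomicBool::new(false));
    let worker = {
        let shutdown = Arc::clone(&shutdown);
        thread::spawn(move || read_all_chunks(total, &shutdown))
    };
    // Stand-in for the moment Ctrl-C or SIGTERM arrives.
    thread::sleep(delay);
    shutdown.store(true, Ordering::Relaxed);
    worker.join().expect("worker thread panicked")
}

fn main() {
    let attempted = run_with_shutdown(1_000_000, Duration::from_millis(50));
    println!("stopped after {attempted} of 1000000 chunks");
}
```

In this sketch the worker stops long before it has gone through all chunks, even though every read fails. A loop that only checks for shutdown after the whole batch would instead keep the process alive until the last chunk is processed.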
Errors
There were a bunch of "invalid chunk" errors, so I pressed Ctrl-C:
But the node didn't exit after I pressed Ctrl-C, or when I used kill (SIGTERM) on its pid:
I had to use kill -KILL to get it to exit.
Where it's hanging
The farmer is hanging somewhere in the piece reading code, but I'm not sure exactly which function is blocking it exiting.
Here's what I got from flamegraph --pid PID after the Ctrl-C when the farmer didn't shut down:
Full Logs
The full logs are: