Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libp2p-ping sometimes panics in WASM #5442

Closed
oblique opened this issue Jun 4, 2024 · 4 comments · Fixed by #5447
Closed

libp2p-ping sometimes panics in WASM #5442

oblique opened this issue Jun 4, 2024 · 4 comments · Fixed by #5447

Comments

@oblique
Copy link
Contributor

oblique commented Jun 4, 2024

Summary

While researching our eigerco/lumina#256 bug we found one of the causes were bacause libp2p-ping was panicking.

This was due to inconsistent behavior in futures-timer when it was used in WASM.

Expected behavior

Not to panic

Actual behavior

Panics

Relevant log output

No response

Possible Solution

We fixed the issue with async-rs/futures-timer#74 PR and we are waiting to be reviewed and get merged.

If futures-timer rejects or do not merge our fix, then another fix can be applied directly in libp2p-ping.

Version

No response

Would you like to work on fixing this bug ?

Yes

@dariusc93
Copy link
Member

Thanks for the report and good catch. I havent been able to reproduce this myself though during my testing, so out of curiosity was there any specific conditions in which a panic would happen?

@oblique
Copy link
Contributor Author

oblique commented Jun 4, 2024

It is a bit hard to reproduce it. In our case sometimes takes 30 minute to reproduce. The following flow triggers it:

  1. The first ping request is send (code)
  2. A timeout or IO error is produced (code). These can happen only because of network issues.
  3. In this line only self.outbound is reset.
  4. Because of (3), this is called for a second time and will panic.

More info:

  • gloo-timers panic if an expired TimeoutFuture is reused (code).
  • futures-timer's native implementation does not panic, but they didn't apply this behavior for WASM (code)

The following fixes the issue in libp2p-ping, however we need to recheck all the places that futures_timer::Delay is used in libp2p to make sure that no other places are affected.

diff --git a/protocols/ping/src/handler.rs b/protocols/ping/src/handler.rs
index b9bd25c52..af7a55a13 100644
--- a/protocols/ping/src/handler.rs
+++ b/protocols/ping/src/handler.rs
@@ -183,6 +183,7 @@ impl Handler {
         >,
     ) {
         self.outbound = None; // Request a new substream on the next `poll`.
+        self.interval.reset(Duration::new(0, 0));

         let error = match error {
             StreamUpgradeError::NegotiationFailed => {

@dariusc93
Copy link
Member

Thanks for the info. I will see what i can do on my end to reproduce it just out of my curiosity. In the meantime, maybe we should work on getting a patch done here in libp2p-ping since it might be a while before somebody review your fix in futures-timer (going based on the previous history there).

@oblique
Copy link
Contributor Author

oblique commented Jun 6, 2024

I created a repository with reproduction: https://github.com/oblique/libp2p-issue-5442
Please read README.md for more info.

oblique added a commit to oblique/rust-libp2p that referenced this issue Jun 6, 2024
@mergify mergify bot closed this as completed in #5447 Jun 10, 2024
mergify bot pushed a commit that referenced this issue Jun 10, 2024
TimTinkers pushed a commit to unattended-backpack/rust-libp2p that referenced this issue Sep 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants