long route recovery on sleep device #466

Open
iwanbk opened this issue Oct 2, 2024 · 1 comment

Comments

@iwanbk
Member

iwanbk commented Oct 2, 2024

I tested the Flutter mycelium macOS app and found that it has an issue when the laptop is put to sleep (in my case, after more than 2 minutes of inactivity).

My test setup:

  • macOS mycelium app on a MacBook Pro
  • mycelium on Linux, where I ran the ping

I expected the ping to recover soon after the MacBook woke up.
But many times it had still not recovered even after 10 minutes.

I thought the problem was on the Flutter side, so I implemented a mycelium restart on the wake-up event.

Then I also tested it using my own public node, and the recovery times were significantly faster, often only around 5 seconds.
(After a few tests, the recovery time got longer. Restarting mycelium on my public node solved the issue.)

I copied the logs from threefoldtech/myceliumflut#69 (comment) here:

macOS app side
At 28:26:

Frame error from TCP 192.168.1.6:55080 <-> 68.183.228.64:9651: Connection reset by peer (os error 54)

At 28:31:

Connected to new peer

My own public node
The macOS app has already reconnected to the public node:

2024-10-02T07:28:31.439602Z  INFO mycelium::peer_manager: Accepted new inbound peer
2024-10-02T07:28:31.439650Z  INFO add_peer: mycelium::peer_manager: Added new peer peer.endpoint=Tcp [::ffff:36.80.99.115]:55229
2024-10-02T07:28:31.464380Z  INFO mycelium::router: Acquired route subnet=525:c933:ef2e:bfe7::/64 peer="TCP [::ffff:68.183.228.64]:9651 <-> [::ffff:36.80.99.115]:55229"

Linux

  • But the route on Linux did not recover instantly:
2024-10-02T07:30:38.961313Z  INFO mycelium::router: Acquired route subnet=525:c933:ef2e:bfe7::/64 peer="TCP 192.168.0.108:53200 <-> 68.183.228.64:9651"

@LeeSmet
Contributor

LeeSmet commented Oct 18, 2024

I'm not entirely sure what happens when the device goes to sleep. Basically, mycelium sends a "HELLO" message on every connection every 20 seconds, to which the receiver replies with "IHU". If the receiver fails to reply twice (i.e. the last IHU was received more than 43 seconds ago, iirc), the connection is assumed to be dead, and either the peer is cleaned up (if we did not initiate the connection) or we try to reconnect. On top of this, for TCP there is also the TCP keepalive, which should close the connection automatically once keepalives no longer get through. This should also be detected, leading to a similar scenario as described above.
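To make the liveness rule above concrete, here is a minimal sketch in Python. It is not mycelium's actual code: the `Connection` type, `check_connection`, and the returned action names are hypothetical, and the 43-second threshold is the approximate value recalled in the comment.

```python
from dataclasses import dataclass

HELLO_INTERVAL_S = 20.0  # HELLO is sent every 20 seconds (from the comment above)
IHU_TIMEOUT_S = 43.0     # approximate ("iirc") threshold from the comment above


@dataclass
class Connection:
    """Hypothetical per-connection state, for illustration only."""
    initiated_by_us: bool
    last_ihu_at: float  # timestamp of the last IHU received, in seconds


def check_connection(conn: Connection, now: float) -> str:
    """Decide what to do with a connection based on the last IHU seen.

    Returns "keep" while the peer is still answering, "reconnect" if the
    connection is dead and we dialed it, and "cleanup" if it is dead and
    the peer dialed us.
    """
    if now - conn.last_ihu_at <= IHU_TIMEOUT_S:
        return "keep"
    return "reconnect" if conn.initiated_by_us else "cleanup"
```

With these assumed values, a connection survives one missed IHU (40 seconds of silence) and is declared dead shortly after the second one is missed.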

Once the node reconnects, if the process did not exit, it will likely have the exact same router ID. Depending on how long the node was disconnected, there might still be a source key for this router ID alive in the peers' source key table. This will prevent announcements of the node if the metric is higher. If it is, a seqno request should be sent, which would cause the node to bump its local seqno and resend its routes.
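The source key behaviour described above resembles the feasibility condition of the Babel routing protocol (RFC 8966). The sketch below illustrates that rule in Python, assuming mycelium follows it closely; the function names, the table layout, and the `router-a` ID are hypothetical, and the special case for retractions (infinite metric) is left out.

```python
from typing import Dict, Tuple

SEQNO_MOD = 1 << 16  # Babel sequence numbers are 16-bit and wrap around

SourceKey = Tuple[str, str]                 # (subnet, router ID)
SourceTable = Dict[SourceKey, Tuple[int, int]]  # key -> (seqno, metric)


def seqno_newer(a: int, b: int) -> bool:
    """True if seqno a is newer than seqno b, using modulo 2^16 arithmetic."""
    return 0 < ((a - b) % SEQNO_MOD) < SEQNO_MOD // 2


def is_feasible(table: SourceTable, key: SourceKey, seqno: int, metric: int) -> bool:
    """Babel-style feasibility check against a source table."""
    entry = table.get(key)
    if entry is None:
        return True  # no source key left, so nothing blocks the announcement
    known_seqno, known_metric = entry
    if seqno_newer(seqno, known_seqno):
        return True
    return seqno == known_seqno and metric < known_metric


def handle_update(table: SourceTable, key: SourceKey, seqno: int, metric: int) -> str:
    """Accept a feasible update, otherwise ask the origin to bump its seqno."""
    if is_feasible(table, key, seqno, metric):
        return "accept"
    return "send_seqno_request"
```

In this model, a node that reconnects with the same router ID, the same seqno, and a higher metric is rejected until the seqno request reaches it and it re-announces with a bumped seqno. How long that round trip takes would be one place to look when recovery exceeds the expected 5 seconds.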

In general, reconnection time is expected to be around 5 seconds, as you see with your own public node. If you run that node in debug mode, you might find some clue as to why it sometimes takes longer than 5 seconds. Checking the metrics could also help.
