ensure peer_connected is called before peer_disconnected #3110
("Route Handler", self.message_handler.route_handler.peer_connected(&their_node_id, &msg, peer_lock.inbound_connection)),
("Channel Handler", self.message_handler.chan_handler.peer_connected(&their_node_id, &msg, peer_lock.inbound_connection)),
("Onion Handler", self.message_handler.onion_message_handler.peer_connected(&their_node_id, &msg, peer_lock.inbound_connection)),
];
If you don't like this attempt to DRY up the handling, then I'm fine with just having separate results that I check one by one, each with its own log message.
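A minimal sketch of the pattern in the snippet above, under stated assumptions: `peer_connected` and `notify_all` here are hypothetical stand-ins with simplified signatures, not LDK's handler traits. It only illustrates why collecting every result into an array before inspecting any of them prevents an early return from skipping later handlers.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a handler's `peer_connected`; it counts its calls.
fn peer_connected(calls: &Cell<u32>, accept: bool) -> Result<(), ()> {
    calls.set(calls.get() + 1);
    if accept { Ok(()) } else { Err(()) }
}

/// Invokes all three (hypothetical) handlers, then reports which ones rejected the peer.
fn notify_all(calls: &Cell<u32>, accepts: [bool; 3]) -> Vec<&'static str> {
    // Building the array evaluates every call before any result is inspected,
    // so an early `Err` cannot skip the later handlers.
    let results = [
        ("Route Handler", peer_connected(calls, accepts[0])),
        ("Channel Handler", peer_connected(calls, accepts[1])),
        ("Onion Handler", peer_connected(calls, accepts[2])),
    ];
    let mut rejected = Vec::new();
    for (name, res) in results.iter() {
        if res.is_err() {
            // This is where a per-handler log message would go.
            rejected.push(*name);
        }
    }
    rejected
}
```

With `accepts = [false, true, false]`, all three calls still run, and the returned list names the first and third handlers.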
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3110      +/-   ##
==========================================
+ Coverage   89.84%   90.78%   +0.93%
==========================================
  Files         119      119
  Lines       97561   103463    +5902
  Branches    97561   103463    +5902
==========================================
+ Hits        87655    93925    +6270
+ Misses       7331     7032     -299
+ Partials     2575     2506      -69
Thanks! I think we also need to move the `peer_lock.their_features` call up - we only call `peer_disconnected` if that line has been hit (`Peer::handshake_complete` checks for it) so we want to always hit that immediately before we call `peer_connected`s.
4a1cade
to
db3b148
Compare
Whoops, fixed it. It can't go before the calls to `peer_connected` because they pass a reference to `msg`, but as long as we do it before returning it should be okay.
db3b148
to
3a8f3b2
Compare
Would be nice to get a test (which should be pretty easy), but either way LGTM.
It doesn't look like there's an easy way to test it with the existing test message handlers. Should I create new ones that can error on `peer_connected` and track whether connected/disconnected have been called, or add that functionality to the existing test handlers? Have used something like
Is this what you had in mind for being able to test it?
Yea, I was figuring you'd just create a trivial
Hm, using a `CustomMessageHandler` doesn't really test the fix here since it goes last. One of the issues was the early return causing the later handlers to not get the
I guess at least it would catch the fix for ensuring disconnect is called.
Added a test that passes, but it duplicates a ton of setup code just to use the new message handlers :| Not sure if this is okay; looking for feedback on the test and on how to do it better if it's not.
2c4c40a
to
7a29c39
Compare
7a29c39
to
922c31f
Compare
if let Err(()) = self.message_handler.custom_message_handler.peer_connected(&their_node_id, &msg, peer_lock.inbound_connection) {
	log_debug!(logger, "Custom Message Handler decided we couldn't communicate with peer {}", log_pubkey!(their_node_id));

peer_lock.their_features = Some(msg.features);
I'm a bit confused by this: wouldn't that lead to us falsely assuming the handshake succeeded even though one of our handlers rejected it? And there is a window between us dropping the lock and handling the disconnect event where we would deal with the peer in a 'normal' manner, e.g., accepting further messages, potentially rebroadcasting, etc.?
(cc @TheBlueMatt as he requested this change)
Hm, if that's true then it seems like we'll need to separate "handshake_completed" from "triggered peer_connected" with a new flag on the peer that we can use to decide whether or not to trigger `peer_disconnected` in `do_disconnect` and `disconnect_event_internal`?
Yea, kinda, we'll end up forwarding broadcasts, as you point out, which is maybe not ideal, but we shouldn't process any further messages - we're currently in a read processing call, and we require read processing calls for any given peer to be serial, so presumably when we return an error the read-processing pipeline for this peer will stall and we won't get any more reads. We could make that explicit in the docs, however.
Mhh, rather than introducing this race-y behavior in the first place, couldn't we just introduce a new `handshake_aborted` flag and check that alternatively to `!peer.handshake_complete` in `disconnect_event_internal`?
That's fine too.
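One possible reading of the `handshake_aborted` suggestion, sketched with a hypothetical, heavily simplified `Peer` (the `u64` feature type and the `finish_init` and `should_notify_disconnect` helpers are assumptions for illustration, not the real struct or its methods). The idea is that a rejected handshake never sets `their_features`, so the peer is not treated as connected, yet the disconnect path still knows that `peer_connected` was triggered.

```rust
/// Hypothetical, simplified peer state; not the real `Peer` struct.
#[derive(Default)]
struct Peer {
    /// Set only once every handler has accepted the peer.
    their_features: Option<u64>,
    /// Set when `peer_connected` was called but some handler rejected the peer.
    handshake_aborted: bool,
}

impl Peer {
    fn handshake_complete(&self) -> bool {
        self.their_features.is_some()
    }

    /// Records the outcome after all handlers have seen `peer_connected`.
    fn finish_init(&mut self, features: u64, all_handlers_accepted: bool) {
        if all_handlers_accepted {
            self.their_features = Some(features);
        } else {
            self.handshake_aborted = true;
        }
    }

    /// The check the disconnect path would make before calling `peer_disconnected`.
    fn should_notify_disconnect(&self) -> bool {
        self.handshake_complete() || self.handshake_aborted
    }
}
```

Under this sketch a rejected peer reports `handshake_complete() == false`, so nothing treats it as a live peer, while `should_notify_disconnect()` is still true.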
use crate::ln::{msgs, wire};
use crate::ln::msgs::{Init, LightningError, SocketAddress};
use crate::util::test_utils;


nit: Drop superfluous whitespace.
@@ -2779,6 +2781,76 @@ mod tests {
	}
}

struct TestPeerTrackingMessageHandler {
Hmm, I believe the alternative would be to add `TestCustomMessageHandler` and `TestOnionMessageHandler` to `test_utils` and use them as part of the default test setup?
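For illustration only, a tracking test double along the lines discussed in this thread might look like the sketch below. `PeerTracker` and its methods are hypothetical and do not reproduce the PR's `TestPeerTrackingMessageHandler` or any LDK trait; the sketch just shows how a handler could both force a `peer_connected` error and check that a disconnect never arrives without a preceding connect.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical test double that counts connect and disconnect notifications.
struct PeerTracker {
    connected: AtomicUsize,
    disconnected: AtomicUsize,
    /// When true, `peer_connected` records the call and then returns an error.
    reject: bool,
}

impl PeerTracker {
    fn new(reject: bool) -> Self {
        PeerTracker {
            connected: AtomicUsize::new(0),
            disconnected: AtomicUsize::new(0),
            reject,
        }
    }

    fn peer_connected(&self) -> Result<(), ()> {
        self.connected.fetch_add(1, Ordering::SeqCst);
        if self.reject { Err(()) } else { Ok(()) }
    }

    fn peer_disconnected(&self) {
        // Panics if a disconnect arrives without a preceding connect.
        assert!(
            self.disconnected.load(Ordering::SeqCst) < self.connected.load(Ordering::SeqCst),
            "peer_disconnected called before peer_connected"
        );
        self.disconnected.fetch_add(1, Ordering::SeqCst);
    }

    /// True when every connect has been matched by a disconnect.
    fn is_balanced(&self) -> bool {
        self.connected.load(Ordering::SeqCst) == self.disconnected.load(Ordering::SeqCst)
    }
}
```

A test could install one rejecting tracker ahead of an accepting one, run the handshake, and then assert `is_balanced()` on both.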
@johncantrell97 Any interest in finishing this PR?
Superseded by #3580.
Fixes #3108

Makes sure all message handlers' `peer_connected` methods are called instead of returning early on the first one to error.

As for whether or not the user has to call back into `socket_disconnected` after a `PeerManager::read_event`, I assume you mean after it returns an Err? I think the user does not have to, because `read_event` will call `disconnect_event_internal` on any error before returning it to the user.

I took a look at `lightning-net-tokio` and it appears to be the case over there as well. It does:

Only calling `socket_disconnected` if the disconnection type is one the user detected. If `read_event` returns an Err it breaks with a disconnection type of Disconnect::CloseConnection and does not call back into `socket_disconnected`.

Matt seems to think you do have to, so I'm probably misunderstanding the original question. Happy to dig into it a bit more with some clarification if I misunderstood.