-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add server side ICE consent checks to detect silent WebRTC disconnection #1332
Conversation
- Fixes #1127 ### Details - This PR provides the app with the ability to detect downed networks or silent/abrupts disconnections in client side. - Make mediasoup send ICE consent requests every ~5 seconds once ICE is established. - Move to 'disconnected' state (and notify the app as usual) if no ICE response is received in 30 seconds as per spec (this may change with a setting, see TODO below). - This feature requires providing the mediasoup `WebRtcTransport` with the remote ICE username fragment and ICE password, which will be done by latest mediasoup-client once released. ### TODO - [ ] Add a `iceConsentTimeout` parameter to `WebRtcTransportOptions` to override the default value of 30 seconds (0 means "disable the mechanism"). - [ ] Rust. - [ ] React to 403 ICE error responses by also moving to 'disconnected' state.
- Add iceConsentTimeout in WebRtcTransportOptions (default to 30). - Limit max entries in the queue to 20.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall it looks good to me.
The only concern I have is that the application will receive a 'disconnected' event, but it has no means to know the reason why, and that's IMO important.
I think a 'reason' argument should be added to the notification from C++ to Node/Rust so in this case it would be something like ConsentCheckExpired
and the corresponding ones in other cases.
That could be done in a different PR though.
this.safeEmit('icestatechange', iceState); You mean adding a |
Yes, for example. The problem is not knowing the reason of the disconnected now that it can be due to this checks timeout. |
NOTE: if the WebRTC client disconnects its side by closing the PeerConnection or by closing the browser tab, DTLS close alert arrives BEFORE to mediasoup and "dtlsstatechange" (state "closed") happens BEFORE. So if that happens the app should already close the transport. Basically "iceconnectionstatechange" event with state "disconnected" ONLY happens if consent timeout happens or if the client is connected via TCP and the TCP is abruptly closed, which in both cases means that the network is down. Taking this into account I think that we don't need any "reason" argument. |
Also note that, before this feature, "iceconnectionstatechange" with state "disconnected" ONLY may happen if the client was connected via TCP and it was abruptly closed without sending DTLS close first. So I strongly think we don't need any "reason" argument. |
Ok, we'll see if there is a need when troubleshooting. Not a blocker anyhow. |
NOTE: Not working yet due to FBS issues.
This is done, Rust included. |
Hell, demo crashed:
|
Is it just me or everyone else is also fed up with memory-related issues in C++? |
Ok I got core dump in demo server:
|
It's crashing in I did changes in that method, of course, since we used to only verify that the first part of the USERNAME attribute in the received STUN packet matches our server side ICE username fragment plus colon. But now we also have (or may have) the remote username fragment, so we match the received USERNAME attribute with NOTE that in this PR we are also calling |
I'm refactoring C++ code a bit to have better error traces. |
Wow, another crash in the demo and this is NOT related to STUN...
However ICE disconnection happened before:
|
…LL receive packets on other tuples
And this has been fixed. Problem is that I was deallocating a tuple before notifying it to the listener who was then calling methods on a deallocated instance. |
New stuff in |
I see a coredump in the demo but I strongly fail to correlate it with this PR:
Hell, problem is in
|
Consider not destroying/deallocating anything. Just notify the application about "disconnected" state and let the application decide what to do. Similarly in the browser side the ICE "disconnected" state is not destroying anything and can recover if the user decides to keep the connection open after that event. I think it would simplify the logic and avoid some of the issues you are debugging here. Feel free to ignore it. |
Perhaps I was not clear but I didn't mean that. We are not deallocating the The bug I mean was an internal C++ thing fixed in this commit and has nothing to do with communication with Node or Rust: 7ee1753 |
this->oldUsernameFragment.clear(); | ||
this->oldPassword.clear(); | ||
} | ||
if (IsConsentCheckSupported() && IsConsentCheckRunning()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would suffice:
if (IsConsentCheckSupported() && IsConsentCheckRunning()) | |
if (IsConsentCheckRunning()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I know. But I'd prefer to be super explicit because lot of logic in IceServer file is too sensitive and probably should be refactored in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rationale is that I want to prevent that consent check is started when not supported. It cannot happen because the StartConsentCheck()
method asserts IsConsentCheckSupported()
but anyway.
MS_ASSERT(IsConsentCheckSupported(), "ICE consent check not supported"); | ||
MS_ASSERT(IsConsentCheckRunning(), "ICE consent check not running"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would suffice:
MS_ASSERT(IsConsentCheckSupported(), "ICE consent check not supported"); | |
MS_ASSERT(IsConsentCheckRunning(), "ICE consent check not running"); | |
MS_ASSERT(IsConsentCheckRunning(), "ICE consent check not running"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
MS_ASSERT(IsConsentCheckSupported(), "ICE consent check not supported"); | ||
MS_ASSERT(IsConsentCheckRunning(), "ICE consent check not running"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would suffice.
In general, if we the check is running, we can assume it is supported. And we it's enough to check for support only when start running.
MS_ASSERT(IsConsentCheckSupported(), "ICE consent check not supported"); | |
MS_ASSERT(IsConsentCheckRunning(), "ICE consent check not running"); | |
MS_ASSERT(IsConsentCheckRunning(), "ICE consent check not running"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
# Conflicts: # CHANGELOG.md
Details
iceConsentTimeout
setting inWebRtcTransportOptions
).TODO
iceConsentTimeout
it in the website and extend the description of the "icestatechange" with state "disconnected" (also document that "icestatechange" with state "disconnected" happens after "dtlsstatechange" with state "closed" if client closes its PeerConnection).