Fix issues with concurrent connection filtering #199
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The hope here is that this closes ppy/osu#25635 and fixes similar issues that have been visible on sentry for the past day or so, but I can't provide any strong guarantee that it does, as testing this in practice is difficult.
The goal here is to fix two shortcomings:
Because each hub is a separate endpoint / websocket connection, they do not have to be synchronised wrt connection state at all. Therefore, it is possible for the user to disconnect from only some of the hubs (and I have observed this in practice). The previous cleanup logic implicitly assumed this to be impossible, and as such nuked the user's entire state. To counteract this, each disconnect event only performs cleanup in the context of the hub where the disconnect happened, and the entity is removed from the store if and only if it has no connections registered any more.
This is covered by
TestHubDisconnectsTrackedSeparately()
in the tests. It also can be tested in practice by e.g. launching a game instance in debug and pausing its execution until signalr-side keepalive detects some hubs as dead, and then resuming the game instance. Doing so should result in a sequence similar towith this pull.
Another scenario pointed out by @peppy - theoretical at this point, although plausible - is that if the following sequence of events happens (using server-side message ordering) for a single client instance:
then the last disconnection would not only nuke the entire connection state as mentioned in the previous point, it shouldn't really be doing anything at all, since it's a stale event concerning a stale connection and therefore it should not result in anything changing. Thus, a connection disconnection event can only unregister a connection now if the connection IDs match.
This is covered in the added unit tests, in
TestStaleDisconnectIsANoOp()
.