chore: provide a way to manage RTT component networks when hosts "disappear" (on top of #15) #16
base: corba_multi_dispatcher
…connection The issue with having a connendpoint without having the connection registered is that it crashes on disconnect, since the endpoint calls the port and then the port cannot find the connection
…update the policy Policy updating is needed to pass back some information in the OOB transport case (namely, a name that explains what the other side should do to connect, for instance the MQ name for the MQ transport). It turns out that only the output half does so, and the others take the policy as input. Ideally, we would also have cleaned up what information is or is not being passed to the other calls (the connect calls, for instance, really don't need much policy information), but that would be for another PR.
The current RTT behaviour is to have destructors explicitly disconnect channels. That is all well and good, but at destruction time things are ... disorderly. Allow assuming that a system manager will handle the cleanup when possible.
Looks good to me. I was questioning the need for the remote_side_lock, but it turns out that the remote_side variable itself needs to be protected against concurrent access (independent of the reference counter and the referenced object).
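To illustrate the point about remote_side_lock, here is a minimal sketch. It assumes a reference-counted pointer member, modelled with std::shared_ptr as a stand-in (the actual pointer type used in the PR is not shown here). The `Endpoint` and `RemoteSide` types and the accessor names are hypothetical; only the names remote_side and remote_side_lock come from the discussion. The reference counter is safe to update from several threads, but reading and reassigning the same pointer variable concurrently is still a data race, hence the mutex.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the object at the other end of a channel.
struct RemoteSide {
    std::string name;
};

// Hypothetical endpoint; not the actual RTT class.
class Endpoint {
public:
    // Replace (or clear, with nullptr) the remote side. The pointer variable
    // itself is written here, so the write happens under the lock.
    void setRemoteSide(std::shared_ptr<RemoteSide> side) {
        std::lock_guard<std::mutex> guard(remote_side_lock);
        remote_side = std::move(side);
    }

    // Copy the pointer under the lock. The returned copy keeps the object
    // alive even if another thread clears remote_side right afterwards.
    std::shared_ptr<RemoteSide> getRemoteSide() const {
        std::lock_guard<std::mutex> guard(remote_side_lock);
        return remote_side;
    }

private:
    mutable std::mutex remote_side_lock;      // protects the variable below
    std::shared_ptr<RemoteSide> remote_side;  // the refcount alone is not enough
};
```

Callers work on the copy returned by `getRemoteSide()`, so the lock is held only for the duration of the pointer copy and never across a call to the remote object.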
I'll keep this in mind when reworking the cpp rock-display connection handling.
@maltewi this might be interesting for cnd/execution?
A rock-display-like tool won't necessarily benefit from this. The current connection handling will continue working fine (the signalling flag is a lot more critical). A syskit-like tool, on the other hand, can definitely benefit from this in terms of robustness in distributed systems. On local systems, really not that much. But the migration is quite a bit of work. I'm still testing, there are some crashes. I'd be happy to discuss it with (both of) you over a call if you'd like.
On top of #15
Whenever a remote host "disappears", many dataflow-related operations become blocking as well (they have to wait for a timeout), because these operations call the remote side to disconnect.
This makes systems very unstable and misbehaving for a while, until all these calls clear. It also prevents a system management layer from doing the cleanup with knowledge of what is happening, and leads to situations where some half-channels are left dangling (for instance, a task will get an OldData on a port because its part of the connection is still there).
This PR adds a new API without touching the current behaviour. The API allows managing "half channels", that is, the part of a channel that lives within the process, without touching the remote side.
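The idea can be sketched with a toy model. This is not the PR's API: every name below (`Channel`, `PortModel`, `disconnect`, `disconnectLocalHalf`, `cleanupLostHost`) is hypothetical, and the remote call is modelled as a simple counter rather than a real blocking CORBA call. It only shows the difference between a disconnect that has to reach the remote side and one that drops the local half alone.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical view of one channel from inside the local process.
struct Channel {
    explicit Channel(std::string host) : remote_host(std::move(host)) {}
    std::string remote_host;
    bool local_half_connected = true;
    bool remote_half_connected = true;
};

// Hypothetical port model; remote_calls counts calls that would block until
// timeout if the remote host had disappeared.
class PortModel {
public:
    int remote_calls = 0;
    std::vector<Channel> channels;

    // Existing-style behaviour: tear down both halves, which requires
    // calling the remote side.
    void disconnect(Channel& c) {
        ++remote_calls;
        c.remote_half_connected = false;
        c.local_half_connected = false;
    }

    // Half-channel behaviour: drop only the in-process part, no remote call.
    void disconnectLocalHalf(Channel& c) {
        c.local_half_connected = false;
    }
};

// Hypothetical system-manager helper: drop every local half that points at a
// host known to be gone. Returns how many local halves were dropped.
int cleanupLostHost(PortModel& port, const std::string& host) {
    int dropped = 0;
    for (Channel& c : port.channels) {
        if (c.remote_host == host && c.local_half_connected) {
            port.disconnectLocalHalf(c);
            ++dropped;
        }
    }
    return dropped;
}
```

In this model, a manager that knows a host is gone can clean up the local halves without issuing any remote call, while channels to hosts that are still reachable keep using the regular disconnect.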