thread_signal/2 throws an existence error for threads that terminated but are not yet joined #1236
Comments
Same problem with
Again this behavior forces using a |
What else do you want? Existence is a bit misleading, but a completed thread is what Unix calls a zombie process: the thing is gone, but there is still an entry in the thread/process table that allows for join/wait. It is in no way capable of processing the signal or message. We could consider another exception (permission error?), but IMO that makes things worse, as it would require catching two different exceptions. If the misleading error message is your (only) concern, we could add a comment to the 2nd argument of the error term.

I'm also no fan of the requirement to use catch/3. I see little alternative though. It is a bit like opening a file. In fairly static environments testing the access first may be defensible, but in a dynamic environment you must use catch/3 because the file may disappear or change permissions between the two calls.
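For readers following along, here is a minimal sketch of the catch/3 wrapping discussed in this comment. The names signal_if_alive/2 and demo/0 are hypothetical and not part of SWI-Prolog. The sketch assumes the error raised for a terminated but not yet joined thread has the form error(existence_error(thread, _), _).

```prolog
% Hypothetical helper (not a built-in): signal Thread, treating
% "thread no longer exists" as a no-op. Other errors still propagate.
signal_if_alive(Thread, Goal) :-
    catch(thread_signal(Thread, Goal),
          error(existence_error(thread, _), _),  % assumed error shape
          true).

demo :-
    thread_create(true, Id, []),       % worker finishes almost immediately
    sleep(0.1),                        % likely terminated now, but not yet joined
    signal_if_alive(Id, throw(stop)),  % succeeds whether or not the error is raised
    thread_join(Id, Status),           % the thread can still be joined
    format("joined with status ~w~n", [Status]).
```

Because the wrapper catches only the existence error, it behaves the same on a system that raises the error and on one that silently ignores the signal.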
If the thread message queue (the one used by
P.S. This implementation choice is found (from my limited testing) in ECLiPSe and Trealla Prolog. It's also how it's implemented in LVM.
In SWI-Prolog at least, the entire thread structure is cleared when the thread terminates. So, there is no place to deliver a signal or a message. There is also no point as it would not be processed anyway. I agree that a signal intended to tear down the thread could be ignored if it is already dead. The only candidate for that seems Do you have documentation from the other systems on how this is handled? I'm happy to discuss the topic with other developers. |
How difficult would it be to have that happen only for detached threads, and to postpone it for attached threads until they are joined? Also, how likely is it that this change in semantics/behavior would break existing applications?
Indeed they would be no-ops (as I mentioned above) but that would avoid the need of
A possible alternative in the last scenario would be to use
I don't think this level of implementation detail is explicit in the documentation of other open-source systems. At least not that I could find in a quick search. I'm part of the team developing LVM, but this is a commercial system and its documentation is not (currently) publicly available. A discussion between developers would be welcome. My idea (if I ever find the time) is to update the threads draft standardization proposal (which the Trealla Prolog developers are currently using as a guide) and add a test set to the Logtalk distribution. It would be great to minimize the differences between systems for better portability of multi-threading applications.
It is probably easier to silently ignore messages and signals when we detect that the thread is in a zombie state. I don't really expect that to break properly functioning applications. I expect that silently ignoring signals and messages that cannot be delivered is more a cause of problems than a way to avoid them. Notably you typically send a message to a thread if you want it to be processed. For signals the story is a bit different. Most signals are for aborting or debugging. I have also used signals to actually make threads do something though. For sending messages we have an option list that we could use to avoid an error (like close/1). We do not have that for thread_signal/2. One could also consider a high level interface for aborting and joining a thread. The debug usage is mostly interactive and controlled by more high level utilities.
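A rough sketch of what such a high-level "abort and join" interface could look like. Everything here is an assumption for illustration: abort_and_join/2, worker/0, demo_abort/0 and the stop_requested term are hypothetical names, not an existing or proposed SWI-Prolog API.

```prolog
% Hypothetical helper: ask Thread to stop, then join it.
% If the thread has already terminated, the existence error is
% ignored and we go straight to joining.
abort_and_join(Thread, Status) :-
    catch(thread_signal(Thread, throw(stop_requested)),
          error(existence_error(thread, _), _),  % assumed error shape
          true),
    thread_join(Thread, Status).  % still errors if someone else already joined it

% Hypothetical worker that runs until it is signalled.
worker :-
    sleep(0.05),
    worker.

demo_abort :-
    thread_create(worker, Id, []),
    sleep(0.1),                    % let the worker run for a moment
    abort_and_join(Id, Status),
    format("worker ended with status ~w~n", [Status]).
```

The caller only has to handle the join status, so the race between "is it still running?" and "signal it" is contained in one place.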
If you organize one, I'm happy to join. You've done a lot of good work for the standard and I still regret that it didn't continue. The SWI-Prolog thread API has evolved quite a bit since then.
Using an option in |
Or copy ISO close/2, which implements |
My experience from ~20 years ago, using POSIX threads on a non-Unix real-time OS (VxWorks, IIRC) is that if you don't do things exactly right,(*) all kinds of weird things can happen -- and I don't see how (or why) SWI-Prolog should deal with those situations. There's only so much you can do when the underlying system is buggy or badly designed. (In the case of VxWorks, my recollection is that it had its own threading model and provided a POSIX API that was either not quite compliant or buggy or both.) (*) Where "exactly right" was often undefined in the documentation. |
The implementation is not really a problem. Linux pthreads is rock solid. MacOS has a few tweaks I managed to work around. The Windows implementation has some limits one can work around, mostly by using native Windows alternatives for some. NetBSD and OpenBSD had some flaws in the past, but seem stable now as well. The simple question is what to do if you talk to a thread that terminated, but is not yet joined. It seems some systems silently ignore the signals and messages while SWI-Prolog raises an exception. I still think that is what should happen. The alternative is much harder. Checking that it is still alive before sending a message is no guarantee that it is alive when you send the message. I'm more tempted to add a warning, similar to the one for detached threads not exiting cleanly, for threads that have pending messages in their input queue when they are joined or (for detached threads) die.
From this discussion and my own experience, it seems clear that, depending on the application, we ideally want to either silently succeed or throw an exception when sending a message or a signal to a terminated (but not yet joined) attached thread. My preference is to be able to select the desired behavior using an option. For systems like LVM and Trealla Prolog, where the implementation of multi-threading features is a work in progress, this is an ideal time to sync on a common solution. I will draw the attention of ECLiPSe and YAP developers to this discussion. Thanks for all the feedback.
Consider:
The exception is arguably misleading and this behavior forces wrapping thread_signal/2 calls using catch/3, as a thread may terminate between checking that it's running and calling the predicate.