Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perform busy waiting when the wait duration is less than MIN_SLEEP_DURATION #514

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

byeonggiljun
Copy link
Collaborator

This fixes #513.

Add busy waiting when the wait duration is less than MIN_SLEEP_DURATION to prevent the reactor from advancing to a logical time greater than the current physical time.

@byeonggiljun
Copy link
Collaborator Author

@erlingrj I ended up creating a new PR instead of #403 because that PR is WIP. I have two questions for you.

  1. What do you prefer between removing MIN_SLEEP_DURATION and adding busy waiting?
  2. If we're going to add busy waiting, should we adjust the value of MIN_SLEEP_DURATION? In other words, is busy waiting for at most 10 usec too much?

@byeonggiljun byeonggiljun marked this pull request as ready for review January 28, 2025 19:04
@byeonggiljun byeonggiljun changed the title Perform busy wait when the wait duration is less than MIN_SLEEP_DURATION Perform busy waiting when the wait duration is less than MIN_SLEEP_DURATION Jan 28, 2025
@erlingrj
Copy link
Collaborator

Hmm, I think doing busy-waiting is fine, but I think it should be viewed as a real-time optimization, the purpose of it is to reduce lag. That might not be always what is wanted, however, we only do this a short period of time anyways. Another argument is that if the users are writing real-time programs they have to configure the underlying platform to achieve it. If they are using Linux, this means enabling PREEMPT_RT and using a real-time scheduling policy for the process that runs the LF program.

@byeonggiljun
Copy link
Collaborator Author

byeonggiljun commented Jan 30, 2025

Surprisingly, either removing MIN_SLEEP_DURATION (CI-test#184b55f) or adding busy waiting (CI-test#fd7bd22) in wait_until led to test failures in window-latest while the error didn't occur when I revert the change.

@erlingrj
Copy link
Collaborator

erlingrj commented Jan 30, 2025

My guess here is that we try going to sleep for a negative duration and the Windows platform implementation does not support that. I suggest that we verify that the sleep duration is greater than zero. (Basically setting MIN_SLEEP_DURATION to zero).

The first failures when adding the busy-wait seems to be different than the ones when you removed it altogether:

*** java.lang.instrument ASSERTION FAILED ***: "!errorOutstanding" with message can't create name string at s\src\java.instrument\share\native\libinstrument\JPLISAgent.c line: 830
*** java.lang.instrument ASSERTION FAILED ***: "!errorOutstanding" with message can't create name string at s\src\java.instrument\share\native\libinstrument\JPLISAgent.c line: 830

Could be something else completely

@erlingrj
Copy link
Collaborator

Very wierd Java errors, almost seems like its something with the network?

@byeonggiljun
Copy link
Collaborator Author

Yes, it is very weird. RomanIakovlev/timeshape#1 Maybe the busy waiting causes a stack or heap overflow?

I suggest that we verify that the sleep duration is greater than zero. (Basically setting MIN_SLEEP_DURATION to zero).

Anyway, I'll add your suggestion. Thanks!

@byeonggiljun
Copy link
Collaborator Author

Hmm... I found that there's already the code to handle negative time delays below in lf_windows_support.c. Let's first see whether the test passes now.

int _lf_cond_timedwait(lf_cond_t* cond, instant_t wakeup_time) {
// Convert the absolute time to a relative time.
interval_t wait_duration = wakeup_time - lf_time_physical();
if (wait_duration <= 0) {
// physical time has already caught up sufficiently and we do not need to wait anymore
return 0;
}

@byeonggiljun byeonggiljun requested a review from erlingrj February 7, 2025 00:53
@byeonggiljun
Copy link
Collaborator Author

Why don't we set MIN_SLEEP_DURATION to 0 for now to avoid the flaky error? We can improve the design for real-time optimization in other PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimistic MIN_SLEEP_DURATION causes a problem
2 participants