-
Notifications
You must be signed in to change notification settings - Fork 194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not suspend JobManager during startup by default #2671
base: master
Are you sure you want to change the base?
Conversation
This pull request changes some projects for the first time in this development cycle.
An additional commit containing all the necessary changes was pushed to the top of this PR's branch. To obtain these changes (for example if you want to push more changes) either fetch from your fork or apply the git patch. Git patch
Further information are available in Common Build Issues - Missing version increments. |
514090#2 also says:
|
We can find and fix those, if any |
If the bugs uncovered are downstream, someone else will need to find and fix them. It seems best to err on the side of caution. |
If there are no objections i would like to merge and see what happens. |
Could you please wait a bit? I would like to run a build now to see effects of eclipse-jdt/eclipse.jdt.core#3531 only. |
sure |
I've triggered https://ci.eclipse.org/releng/job/Builds/job/I-build-4.35/80/ to get intermediate results. |
the discovery UI problem has been fixed with eclipse-equinox/p2#604 , restarted build with https://ci.eclipse.org/releng/job/Builds/job/I-build-4.35/81/ |
Given my comment earlier I will explicitly state my objection for historical purposes. I want to ensure that in the future when things are broken and folks arrive at this issue about who it got broken it will be clear that when someone says "We can find and fix those, if any" that it means that someone here has taken responsibility and has volunteered to help fix this problem for the entire Eclipse ecosystem. Of course I may misinterpret this to mean we will fix the problems we find in our stuff, but is the implication that we don't care so much for anything that grows beyond our patch of the garden? |
And, if I find any problems in our own stuff that needs fixing, then I will raise my objection to a strong objection and will ask that a revert be applied if we've steam rolled ahead anyway. |
@merks i.e. you prefer to have a known deadlock over the vague possibility of some regression? Of cause we could revert this if we find anything that we can not solve. |
I share the concerns of @merks (To avoid any misreading here: I agree with Eds statements). |
would you accept a system property that controls the behavior for a possible emergency fall back?, defaulting to not suspending. |
That is actually a good idea. Disable by default, but give a chance to "fix" that for product owners. |
Would you consider @sratz suggestion? The problem with product owners is that mostly the users use EPP packages but then install a whack of things from SimRel and from marketplace. Any one of those might be a problem and then where is the owner? The user will be clueless and will have one more reason why Eclipse sucks so much and in fact sucks more with each release. Err on the side of caution. Default to not changing the behavior giving product owners a chance to test what breaks. But of course no one is likely to do that because there seems not a big advantage to changing the behavior in general, i.e., it seems having a potential avalanche of jobs starting during startup, all of which have validly been able to assume that the workbench is fully initialized by the time they start running is more likely to do more harm than good. |
Because of possible deadlock when Javaeditor from JDT calls PDE calls OOMPH calls P2 to download new target platform waiting for a Job that is not executed because JobManager is suspendend eclipse-pde/eclipse.pde#1481 Was already suggested in https://bugs.eclipse.org/bugs/show_bug.cgi?id=514090 Old behavior can be used by setting VM property -Dorg.eclipse.ui.suspendJobManagerDuringStart=true
I do not understand what you actually suggest. To keep IDE as bad as is (deadlock)? The suspend was initially done for performance reasons without any measurements given. That there are issues with a not initialized UI is pure speculation at this point. |
Reading the old bugzilla it seems to me that actually the suspension was already placed earlier (but I might be wrong), from an IDE perspective it makes sense to only start jobs when there is an UI because then user can cancel them and instead show UI as soon as possible (might not be a big deal today with multi CPU systems being the default). In any case code that relies on fire a job and then join it always looks suspicious to me (even though I was surprised about this join behavior) because it feels one better should use a So I'm not 100% sure what is the use case for firing a job and then join it (in the startup phase) even though it would help here but more in hiding the problem that we do heavy work in the startup phase. |
The suspend state iorg.eclipse.core.internal.jobs.JobManager.suspended s initialized like this: What happens if this is initialized to true? Of course the concern about the default behavior remains the same, but if the value is initialized via a system property that defaults to false, we change nothing in the default behavior, but the Eclipse SDK can set that property to true, as can EPP. This would prevent any job from starting before suspend is called. I've seen assertions to suggest this fixes the problem we're trying to fix. So given there is a historical desire to avoid jobs starting during startup, that would seem the best way to ensure the past target behavior. |
I guess our multi core CPUs are annoyed by a single threaded startup. I agree that creating a job and joining it is an antipattern that could also be fixed. |
I do not understand which problem that should fix. it does not fix the deadlock nor i am aware of any job that starts before jobmanager is suspended |
Build available for testing: https://download.eclipse.org/eclipse/downloads/drops4/I20250108-0430/ |
So let's do nothing until everyone fully understands the problem and what we are doing to fix it. Could someone please take a step back and fully describe the problem and how this change fixes it. ( I was traveling for a day, and am dealing with 150 emails, many related to this topic, and sorry to say this out loud, many with a tone that I find offensive and embarrassing. So now I'm beyond frustrated and am shocked to find things being pushed through as if a fire flared up and needed to be put out before the entire building is engulfed in frames. ) If I read this details correctly, some job is running and then the job manager is suspended but that already-running job is waiting for some other job (that it launched?) to complete, but it cannot because the job manager is not starting any jobs. So if we avoid starting any jobs at all, then we can avoid that problem. I could be wrong. Too much email, too much noise, and not enough simple factual detail. So please let's clarify the facts and the details. Is eclipse-equinox/p2#604 related to this PR. I'm so confused. |
@iloveeclipse did ask @jukzi in #2671 (comment) to wait with this PR until an aggregator build is done with the change in JDT core. The build failed and eclipse-equinox/p2#604 fixed this failure. Now everyone who wants can test the JDT core change before this change. The result of the build can be found here: https://download.eclipse.org/eclipse/downloads/drops4/I20250108-0430/ |
In the interest of gathering facts, I did the following. I installed the Eierlegende Wollmilchsau, created a sample PDE project, opened the Java editor in that project, and closed the IDE. I enabled remote debug in the eclipse.ini, suspending the JVM on start. I set a breakpoint in org.eclipse.core.internal.jobs.JobManager.suspend() and when the debugger stopped there, I change the value back to false. Then I let the IDE complete the startup phase. This appears in the Error log:
So definitely jobs might assume that org.eclipse.ui.PlatformUI.getWorkbench() will work nicely to find that now it does not. The next experiment I did with the same setup, but I stopped in the JobManager constructor and changed suspended to true in the debugger. The IDE starts with the open editor just fine with no logged errors. |
thats good.
And could be fixed for example using org.eclipse.ui.PlatformUI.isWorkbenchRunning() https://bugs.eclipse.org/bugs/show_bug.cgi?id=49316 JDT's org.eclipse.jdt.internal.ui.text.correction.CorrectionCommandInstaller works fine. Anyhow i will wait with any further investigation to find out if the deadlock still ever happens after eclipse-equinox/p2#600 |
I just wanted to note that we are talking about the UI Thread (!) so it is always single threaded and the goal seems to be keep that as short as possible so we show something to the user and then start the "heavy" work, |
Because of possible deadlock when Javaeditor from JDT calls PDE calls OOMPH calls P2 to download new target platform waiting for a Job that is not executed because JobManager is suspendend
eclipse-pde/eclipse.pde#1481
Was already suggested in
https://bugs.eclipse.org/bugs/show_bug.cgi?id=514090
Old behavior can be used by setting VM property
-Dorg.eclipse.ui.suspendJobManagerDuringStart=true