Avoiding idling requires micromanagement #48

Open
dymil opened this issue Dec 30, 2024 · 4 comments

@dymil

dymil commented Dec 30, 2024

Problem

Several projects are dead, dying, or on holiday: Rosetta has zero "Tasks ready to send", WCG is on extended break, etc. Some projects definitely have work, such as Asteroids, yafu, Ramanujan Machine, and ODLK, but I'm not readily assigned those projects.

Below is the result from entering preferences somewhat straightforwardly, e.g., keeping Math/CS at "as needed" with other areas as "prefer", and Russian Academy of Sciences (their only project is ODLK) analogously below other institutions/locations.
[image: screenshot of the resulting project assignment]

This selection of projects is almost guaranteed to get no work. If there are no WUs to be had, I'd at least want a notification, so I can assign the CPUs to Folding@Home.

Workarounds

Allowing new tasks and clicking "update" doesn't do anything when "Will remove when tasks are done" is active, which I think is appropriate behavior from BOINC. I decided the most parsimonious solution is just to manually exclude projects with no work, such as BOINC Central, along with WCG and Rosetta temporarily. I iterate this until "computer info" only lists active projects under "Given the above info, this computer would do work for these projects:". Earlier, I was doing trial & error on countries and science areas and syncing from the BOINC client to see what I would get.

Causes?

I haven't read the source at all, but I believe the issue originates in the basic algorithm of Science United, per the implementation doc: it doesn't appear to guarantee that there will actually be work! A proximate fix is to ensure this when assigning projects to clients: a first pass would ensure that the aggregate "tasks ready to send" is positive, and skip projects with no work.
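To make that proximate fix concrete, here is a minimal Python sketch. It is not based on the Science United source: the `Project` record, its `score` and `tasks_ready_to_send` fields, and both function names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    score: float              # hypothetical preference/allocation score
    tasks_ready_to_send: int  # assumed to mirror the project's server status


def assignable_projects(candidates, max_projects=3):
    """Return up to max_projects highest-scoring candidates that have work."""
    with_work = [p for p in candidates if p.tasks_ready_to_send > 0]
    with_work.sort(key=lambda p: p.score, reverse=True)
    return with_work[:max_projects]


def has_any_work(assigned):
    """True if the aggregate 'tasks ready to send' of the assignment is positive."""
    return sum(p.tasks_ready_to_send for p in assigned) > 0
```

With this filter, a high-scoring project that has nothing to send is never picked ahead of a lower-scoring one that does.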

However, a more ultimate cause is that projects with no work still gain allocation balance up to the limit (and indeed are sitting at 1K allocation score), which might explain why my project list always had dead projects like RNA World (for which I never did work, so there's no special preference from that). This goes against my intuition about the resource share model, that the goal is to ensure fairness and that work isn't queued for too long. A project shouldn't be rewarded for being bursty IMO and certainly not for being inactive.
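One way to express the alternative I have in mind, as a sketch only: balances accrue in proportion to share, but only while the project has work. The function name, its parameters, and the default limit of 1000 (inferred from the 1K score mentioned above) are assumptions, not the actual Science United accounting.

```python
def accrue(balance, share, dt, has_work, limit=1000.0):
    """Advance one project's allocation balance by dt time units.

    Hypothetical rule: an inactive project keeps its balance but gains
    nothing, so it cannot drift up to the limit while it has no work.
    """
    if not has_work:
        return balance
    return min(limit, balance + share * dt)
```

Under this rule a project that is idle for months comes back with the balance it left with, instead of sitting at the cap.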

@dymil changed the title from "Idling requires micromanaging Science/Location prefs" to "Avoiding idling requires micromanagement" on Dec 30, 2024
@davidpanderson
Owner

Thanks for the report; I'll look into this.

BTW, the implementation doc is somewhat outdated.
Since it was written, we added a mechanism to deal with the situation you describe:
the request message includes, for each project, a list of processor types (CPU, GPU)
for which the project failed to supply work.
The SU scheduling algorithm takes this into account in choosing projects;
it tries to include at least one project with work for each processor type.
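For readers following along, the idea can be sketched roughly as below. This is an illustrative Python sketch of the behavior just described, not the actual SU code; `choose_projects`, its arguments, and the data shapes are all hypothetical.

```python
def choose_projects(ranked, no_work, proc_types):
    """Pick projects so that each processor type has a project with work.

    ranked:     list of (project_name, set_of_supported_proc_types),
                in preference order (hypothetical shape)
    no_work:    dict mapping project_name to the set of processor types
                for which that project recently failed to supply work
    proc_types: processor types present on the host, e.g. ["CPU", "GPU"]

    Returns (chosen_project_names, proc_types_left_uncovered).
    """
    wanted = set(proc_types)
    chosen = []
    covered = set()
    for name, supported in ranked:
        # Types this project supports and has not recently failed on.
        usable = supported - no_work.get(name, set())
        newly_covered = (usable & wanted) - covered
        if newly_covered:
            chosen.append(name)
            covered |= newly_covered
        if covered >= wanted:
            break
    return chosen, wanted - covered
```

A non-empty second return value means no candidate project is known to have work for that processor type, which is the case a client would want surfaced.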

It looks like this mechanism isn't working as intended.

@dymil
Author

dymil commented Dec 31, 2024

Ah, I hadn't (and haven't) looked at the source.

Somewhat unfortunately, the third-place project Asteroids (after SiDock & Milkyway) was hanging out after every sync in a "Will remove when tasks are done" state, but now I've tweaked my preferences/bans again to contribute to that and LODA.

I suppose this would be slightly less of an issue if dead projects were booted from Science United. Or alternatively, if I understand "at least one", why schedule projects with no work at all—are SU syncs so infrequent?

EDIT: I also note that the mechanism to keep all processor types busy is not working for the GPU either; the machine in question is solely doing Ramanujan now, whereas it was briefly doing just Asteroids, with a mix of CPU & GPU jobs.

@davidpanderson
Owner

The only totally dead project I could find (on SU) was Universe@home. Are there others?

The polling period for SU is 1 day.

SU will sometimes assign projects that previously had no work (or no work for a particular processor)
because it's possible that they now have work.

@dymil
Author

dymil commented Jan 2, 2025

Well, BOINC Central can be revived whenever something shows up there, and I'd had the impression that RNA World was dead, but I couldn't find proof of that, just the lack of recent work or forum activity. That's beside the point, I guess: some projects are inactive for significant stretches and still accumulate allocations.

Possibly dumb question (again, still haven't dug into the source), but why must SU clients attach to a project to check if it has work? Intuitively, the server status would give that info more readily, and not require downloading kernels or whatever is involved in prepping to run a project. SU could keep track of that pretty readily when doing the client assignments, I would think.
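To illustrate what I mean by keeping track on the SU side, here is a minimal Python sketch of a server-side cache. Everything in it is an assumption: `WorkStatusCache` is a made-up name, and `fetch` stands in for whatever would actually read a project's "tasks ready to send" count (I am not assuming any particular BOINC endpoint or format).

```python
import time


class WorkStatusCache:
    """Caches per-project 'tasks ready to send' counts for a limited time."""

    def __init__(self, fetch, max_age=3600.0, clock=time.time):
        self._fetch = fetch      # hypothetical callable: project name -> int
        self._max_age = max_age  # seconds before a cached count is stale
        self._clock = clock      # injectable for testing
        self._entries = {}       # project name -> (timestamp, count)

    def has_work(self, project):
        """Return True if the project's (possibly cached) count is positive."""
        now = self._clock()
        entry = self._entries.get(project)
        if entry is None or now - entry[0] > self._max_age:
            entry = (now, self._fetch(project))
            self._entries[project] = entry
        return entry[1] > 0
```

The assignment step could then consult `has_work` per project without any client having to attach first; the one-hour default is arbitrary.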
