use mixin class for osu #222
from eessi.testsuite.utils import find_modules, log


def filter_scales_pt2pt():
    """
    Filtering function for filtering scales for the pt2pt OSU test
    returns all scales with either 2 cores, 1 full node, or 2 full nodes
I was a bit surprised that it runs on 2 cores, a full node, and 2 full nodes, but not e.g. on 2 GPUs (which on Snellius happens to be 1_2_node). This was already the case before your changes, so it's not introduced in this PR, but maybe @satishskamath can comment on why this is the case? I do seem to remember him saying he thought exclusive nodes would be good for reproducibility for these tests, but by that argument the 2-cores size would have been filtered out as well. I'd say having 2 GPUs is the equivalent.
Now, I do think it is difficult to filter the GPU runs down to 2-GPU setups, as you don't know the GPU count until after the setup phase. I.e. we'd basically have to accept everything single-node in the init phase, and then skip the ones that would provide more than 2 GPUs.
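The two-phase idea in this comment could be sketched roughly as follows. This is plain Python rather than actual test-suite code: the `SCALES` table is a simplified, hypothetical stand-in, and `candidate_gpu_scales` and `skip_reason` are illustrative names that do not exist in the repository. In a real ReFrame test the second step would presumably live in a post-setup hook.

```python
# Hypothetical, simplified stand-in for the test suite's scale table.
SCALES = {
    '1_core': {'num_nodes': 1, 'num_cpus_per_node': 1},
    '2_cores': {'num_nodes': 1, 'num_cpus_per_node': 2},
    '1_2_node': {'num_nodes': 1, 'node_part': 2},
    '1_node': {'num_nodes': 1, 'node_part': 1},
    '2_nodes': {'num_nodes': 2, 'node_part': 1},
}


def candidate_gpu_scales(scales):
    """Init phase: keep every single-node scale defined via node_part."""
    return [name for name, spec in scales.items()
            if spec['num_nodes'] == 1 and 'node_part' in spec]


def skip_reason(scale_name, gpus_per_node, scales):
    """After setup: return a skip message unless exactly 2 GPUs are allocated.

    gpus_per_node is only known once the partition has been selected,
    which is why this check cannot run in the init phase.
    """
    spec = scales[scale_name]
    allocated = spec['num_nodes'] * (gpus_per_node // spec['node_part'])
    if allocated != 2:
        return f'{scale_name}: {allocated} GPU(s) allocated, pt2pt needs exactly 2'
    return None
```

Under these assumptions, on a node with 4 GPUs the 1_2_node scale passes (2 GPUs) while 1_node is skipped (4 GPUs); on a node with 2 GPUs it is the other way around.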
will fix this in a follow-up PR, see #224
edit: i decided to add it to this PR anyway.
My initial attempt at this, and the reason I kept the 2_cores and 1cpn_2nodes cases for GPUs in the point-to-point tests, was that these cases were valid and one needs at least one host CPU to assign to a device. But these scales are mainly meant for CPUs and can therefore be a bit misleading, and I do lean towards the view that network-based tests are best performed using two full exclusive nodes or 1 full node, which is the objective of this test as well.
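For reference, the scale selection discussed here (2 cores in total, 1 full node, or 2 full nodes) could be expressed along these lines. The `SCALES` table below is a simplified, assumed stand-in for the real one, and `filter_scales_pt2pt_sketch` is a hypothetical name, so this is a sketch of the rule rather than the code under review.

```python
# Assumed, simplified scale definitions; the real table may differ.
SCALES = {
    '1_core': {'num_nodes': 1, 'num_cpus_per_node': 1},
    '2_cores': {'num_nodes': 1, 'num_cpus_per_node': 2},
    '1_2_node': {'num_nodes': 1, 'node_part': 2},
    '1_node': {'num_nodes': 1, 'node_part': 1},
    '1cpn_2nodes': {'num_nodes': 2, 'num_cpus_per_node': 1},
    '2_nodes': {'num_nodes': 2, 'node_part': 1},
    '4_nodes': {'num_nodes': 4, 'node_part': 1},
}


def filter_scales_pt2pt_sketch(scales):
    """Keep scales with exactly 2 cores in total, 1 full node, or 2 full nodes."""
    selected = []
    for name, spec in scales.items():
        # Scales defined by core count: total cores across all nodes.
        total_cores = spec['num_nodes'] * spec.get('num_cpus_per_node', 0)
        # Scales defined by node fraction: node_part == 1 means full nodes.
        full_nodes = spec.get('node_part') == 1
        if total_cores == 2 or (full_nodes and spec['num_nodes'] in (1, 2)):
            selected.append(name)
    return selected
```

With this table, the 2-core rule picks up both 2_cores (one node, two cores) and 1cpn_2nodes (two nodes, one core each), which matches the two CPU-oriented cases mentioned above.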
Tested, runs fine. My main question would be the required_mem_per_node remark; otherwise, I think this is ready to be merged.
I found an issue with the OSU test that I didn't realize was there before. It does NOT pop up if you have the
Now, it makes total sense to use
and set
in order to make sure the right amount of tasks is launched per node. So, I'm a bit at a loss where we messed this up. Should we somehow have this test set the
Note that it is not just important for the
in order for those GPUs to even be requested by the job... otherwise, you'd get a CPU-only allocation on a GPU node. Even more accurate might be:
For collectives, there is no problem, because there we map the
Ok, this doesn't account for the fact that the node could have only 1 GPU per node. We should do:
Note: I made a PR to your feature branch @smoors
@casparvl In the version in the main branch, I used to set this in the test itself.
@casparvl good catch! i'll update this PR and copy the other improvements in your PR that are needed or useful. EDIT: see c33114e, which adds an extra variable
Looks good to me. We should realize though that setting always_request_gpus on scales for which node_part is not set means the maximum number of GPUs will be requested. E.g. for a 1_core scale, you'd still request all GPUs.
It doesn't matter for this test, we only request sizes for pt2pt that have node_part set. So all good. But just to keep in mind.
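The concern as stated in this comment can be modelled with a small function. This is a hypothetical illustration of the described behaviour, not the actual hook code: `gpus_requested` and its parameters are made-up names, and the integer division by node_part is an assumption.

```python
def gpus_requested(scale_spec, max_gpus_per_node):
    """Hypothetical model of the concern, not the actual hook implementation."""
    node_part = scale_spec.get('node_part')
    if node_part is None:
        # No node_part to scale by: the whole node's GPUs would be requested.
        return max_gpus_per_node
    # With node_part set, only the matching fraction of the node's GPUs.
    return max_gpus_per_node // node_part
```

In this model, a scale without node_part on a 4-GPU node asks for all 4 GPUs, whereas a half-node scale (node_part of 2) asks for 2.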
hm, indeed not an issue for this test, but i'll see if we can fix this so it won't bite us later on. EDIT: checking the hooks code, this shouldn't happen because scale
EESSI_Mixin class:
@run_before('setup', always_last=True) to avoid perf_variables being overwritten by the hpctestlib
num_tasks_per_compute_unit
OSU test:
pt2pt GPU test and skip the ones that don't have exactly 2 GPUs
fixes #145