
Sync meeting on EESSI test suite (2024 01 18)

Kenneth Hoste edited this page Feb 1, 2024 · 1 revision

EESSI test suite sync meetings

Planning

  • every 2 weeks on Thursday at 14:00 CE(S)T
  • next meetings:
    • Thu 1 Feb'24 14:00 CET
    • Thu 15 Feb'24 14:00 CET
    • Thu 29 Feb'24 14:00 CET
    • Thu 14 Mar'24 14:00 CET

Meeting (2024-01-18)

  • test-suite on Hortense

    • ReFrame issue 2970 was opened on 24 Aug by @vkarak.
      • Will be addressed by @boegel and Lara.
    • It is linked to issue 68 in the test-suite.
      • Lara submits for every partition; this will be discussed further.
  • Merged #96 which adds --mem to configuration files

    • This was done for each partition separately, but could be done in common for all partitions.
      • Caspar: Agreed, should be in some common config. Who will do it? How?
        • Maybe an options: eessi.testsuite.common_config.get_common_options() entry would be enough?
  • OSU tests

    • Sam reviewed, comments need to be checked by Satish
      • https://github.com/EESSI/test-suite/pull/54#discussion_r1451741808 .
      • Running CUDA modules on the pure CPU nodes using stubs:
        • Currently, a pure CPU test generated for a CUDA module will fail on CPU nodes.
        • For now, remove the CPU tests from CUDA modules.
        • The GROMACS CUDA module runs on CPU devices without complaining, whereas OSU crashes.
        • Should we not allow running CUDA modules on CPU nodes at all?
        • Currently not a blocker, but open an issue.
      • 32 GB of memory for point to point tests is too much.
        • Contact OSU to check this and to ask for better error reporting.
        • Currently not a blocker, but open an issue.
        • Play with this option: -M, --mem-limit SIZE set per process maximum memory consumption to SIZE bytes
      • Install CUDA OSU module, talk to Snellius system admins and get an update on Caspar's request.
    • Lara tested on Hortense CPU, had issues on GPU but those seemed not specific to OSU.
    • Merge now including collectives and figure out the problems later.
    • Hand the test-suite to other partners.
  • bot now picks up the bot/test.sh and bot/check-test.sh scripts in the target repo

    • currently as part of the build phase, in build environment
    • bot is ready but not doing anything for now: OSU and TensorFlow are good candidates.
    • GROMACS tests have been failing.
  • Xin tested docs to see if it was clear how to run (tested on Snellius)

  • MultiXscale deliverable finished and is online.

  • goals for the next weeks

    • Sam/Satish: finish OSU PR
    • Sam
      • CUDA samples
      • maybe port over test from VUB test suite to EESSI test suite
    • Kenneth:
      • maybe look into GROMACS CI test
    • Xin:
      • docs
      • Espresso test
    • Satish
    • fix GROMACS CI test when there are too many cores
      • skip if there are too many cores available per node
      • print a message that there are too many cores available, and give a useful suggestion
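
The last goal above (skip the GROMACS CI test when a node has too many cores, and print a helpful message) could be sketched roughly as follows. This is only an illustration, not the test suite's actual code: the function name check_core_limit, the max_cores threshold, and the message wording are all assumptions. In a ReFrame test, the returned flag and message could then be handed to the framework's skip mechanism (for example skip_if).

```python
def check_core_limit(cores_per_node, max_cores):
    """Decide whether a test should be skipped on a node.

    Returns a (skip, message) tuple; message is empty when the test can run.
    The threshold and the wording are illustrative assumptions.
    """
    if cores_per_node <= max_cores:
        return False, ""
    message = (
        f"Skipping: node has {cores_per_node} cores, but this test supports "
        f"at most {max_cores}. Suggestion: run on a partition with fewer "
        f"cores per node, or limit the number of tasks per node."
    )
    return True, message


if __name__ == "__main__":
    # Hypothetical numbers, only to show the call.
    skip, message = check_core_limit(cores_per_node=128, max_cores=64)
    print(skip, message)
```

Keeping the decision in a small pure function makes it easy to check independently of any cluster or scheduler.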

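The idea raised in the #96 discussion, setting --mem once in a common config instead of per partition, might look something like the sketch below. It is hypothetical: the notes only float the name get_common_options(), so what it returns and how it is merged are assumptions here. The partition dicts are simplified stand-ins for site configuration entries, the 'access' list is assumed to be where scheduler options live, and the --mem=0 value is a placeholder rather than the setting from #96.

```python
import copy


def get_common_options():
    """Return scheduler options shared by all partitions (hypothetical)."""
    # Placeholder value; the actual --mem setting is not stated in the notes.
    return ["--mem=0"]


def apply_common_options(partitions):
    """Return copies of the partition dicts with the common options appended.

    Assumes each partition keeps its scheduler options in an 'access' list.
    """
    updated = []
    for partition in partitions:
        new_partition = copy.deepcopy(partition)
        access = new_partition.setdefault("access", [])
        for option in get_common_options():
            if option not in access:  # avoid adding an option twice
                access.append(option)
        updated.append(new_partition)
    return updated


if __name__ == "__main__":
    # Made-up partitions, only to show the call.
    partitions = [
        {"name": "cpu", "access": ["-p cpu"]},
        {"name": "gpu"},
    ]
    for partition in apply_common_options(partitions):
        print(partition["name"], partition["access"])
```

Returning copies rather than editing in place keeps per-partition definitions untouched, so a site can still override the common value where needed.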
Previous meetings
