bot sync meeting 2024 01 12
Kenneth Hoste edited this page Jan 20, 2024 · 2 revisions
- present: Lara, Kenneth, Richard, Thomas
- goal: high-level discussion of development priorities for 2024 (maybe 1st half of 2024)
- loose list of topics
- see open issues https://github.com/EESSI/eessi-bot-software-layer/issues
- deployment improvements
- single staging PR (issue #192)
- upload metadata and tarball to different directories (cfr. approach in NESSI)
- => create issue with more info
- add issue comment id to metadata
- add additional metadata (which?)
- which bot created the tarball
- support for deleting/overwriting something that is already there (issue #147)
- example: deploying new compat layer
- => create issue for this
- metadata that comes with the tarball should specify which actions should be taken as part of the ingest
- paths to remove
- list of tarballs to ingest => single staging PR!
"payload": { "remove": ["path1", "path2", ...], "add": [{ "filename": "eessi-2023.06-software-linux-x86_64-intel-skylake_avx512-1704485930.tar.gz", "size": "601491765", "ctime": "Fri Jan 5 20:20:16 UTC 2024", "sha256sum": "ded54f4555bf0411f8dc61e2a9b8f78b321533b978231543ed9d00a61c20ae2f", "url": "https://software.eessi.io-2023.06.s3.amazonaws.com/2023.06/software/linux/x86_64/intel/skylake_avx512/1704485930/eessi-2023.06-software-linux-x86_64-intel-skylake_avx512-1704485930.tar.gz" }] }
- ingest script will need to be updated accordingly as well
- should work based on metadata files in S3 bucket (not tarballs)
- cfr. approach in NESSI
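A minimal sketch of how an ingest script could derive its actions from such a metadata file rather than from the tarballs themselves. The function name `plan_ingest`, the enclosing JSON object around `"payload"`, and the choice to process removals before additions are assumptions made for illustration; none of this was decided in the meeting.

```python
import json


def plan_ingest(metadata_text):
    """Return the ordered ingest steps described by a metadata document.

    Assumption: removals are handled before additions, so that a newly
    added tarball can replace a path that was just removed.
    """
    payload = json.loads(metadata_text)["payload"]
    steps = []
    for path in payload.get("remove", []):
        steps.append({"action": "remove", "path": path})
    for entry in payload.get("add", []):
        steps.append({
            "action": "add",
            "filename": entry["filename"],
            # "size" is a string in the example payload, so convert it here.
            "size": int(entry["size"]),
            "sha256sum": entry["sha256sum"],
            "url": entry["url"],
        })
    return steps


# Placeholder values only; a real metadata file would carry real checksums/URLs.
example_metadata = json.dumps({
    "payload": {
        "remove": ["path1", "path2"],
        "add": [{
            "filename": "example.tar.gz",
            "size": "601491765",
            "sha256sum": "<sha256 of example.tar.gz>",
            "url": "https://example.org/example.tar.gz",
        }],
    }
})

for step in plan_ingest(example_metadata):
    print(step["action"], step.get("path") or step.get("filename"))
```

Because the plan is built from the metadata alone, a single staging PR could list several tarballs and removals at once.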
- revise directory structure on S3 bucket
- EESSI: target/tarball+metadata
- NESSI:
- tarballs/target/example.tgz
- new/target/metadata.txt
- other directories for staging/approved/ingested/etc.
- dealing with multiple bots that deploy
- Stratum-0 should be enhanced to know when it can open a staging PR
- only when tarballs for all CPU targets have arrived
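The "all CPU targets have arrived" condition could be checked roughly as follows. This is only a sketch: the helper names, the idea that each uploaded tarball is tagged with a `target` string, and the list of expected targets are all hypothetical.

```python
def missing_targets(expected_targets, arrived_tarballs):
    """Return the expected CPU targets for which no tarball has arrived yet."""
    arrived = {tarball["target"] for tarball in arrived_tarballs}
    return sorted(set(expected_targets) - arrived)


def can_open_staging_pr(expected_targets, arrived_tarballs):
    """A staging PR may be opened only when nothing is missing."""
    return not missing_targets(expected_targets, arrived_tarballs)


# Illustrative target names; the real list would come from configuration.
expected = ["x86_64/generic", "x86_64/intel/skylake_avx512", "aarch64/generic"]
arrived = [
    {"target": "x86_64/intel/skylake_avx512"},
    {"target": "x86_64/generic"},
]

print(missing_targets(expected, arrived))
print(can_open_staging_pr(expected, arrived))
```

With multiple bots deploying, the same check works regardless of which bot uploaded which tarball, as long as every upload records its target.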
- bot deploy implementation is specific to EESSI software-layer (issue #113?)
- support for running a bot/pre-deploy.sh script, only trigger deploy when that script has exit code 0
- => open issue on this
- a more general approach could be to add support to configure the bot with a particular workflow
- => open issue on this
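The pre-deploy gate discussed above could look something like this sketch. The function names and the `deploy` callback are hypothetical; only the rule "deploy when the script exits with code 0" comes from the notes.

```python
import subprocess


def run_pre_deploy(command):
    """Run the pre-deploy command; return True only when it exits with code 0."""
    try:
        completed = subprocess.run(command, check=False)
    except OSError:
        # Script is missing or not executable: treat this as a failed check.
        return False
    return completed.returncode == 0


def maybe_deploy(command, deploy):
    """Call deploy() only if the pre-deploy command succeeded."""
    if run_pre_deploy(command):
        deploy()
        return True
    return False
```

A caller would pass something like `["bot/pre-deploy.sh"]` as `command`, together with whatever function performs the actual upload. Keeping the check behind a generic command is one small step towards the configurable-workflow idea, since the bot no longer needs to know what the script verifies.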
- job manager crashes
- see issues #193, #191, #142
- related: event handler crash (issue #160)
- use retry approach for problems talking to GitHub
- catch exceptions (like we do in PyGhee/event handler)
- report back in GitHub what went wrong, by means of notification
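A retry approach along these lines might be sketched as below. The helper name, the default number of attempts, and the exponential backoff are assumptions; a real implementation would likely catch only the specific exceptions raised by the GitHub client instead of `Exception`.

```python
import time


def call_with_retry(func, attempts=3, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call func(), retrying on failure with a growing delay between attempts.

    Re-raises the last exception when all attempts fail, so the caller can
    catch it and report the problem (for example in a GitHub comment).
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as err:  # assumption: narrow this in real code
            last_error = err
            if attempt < attempts:
                sleep(delay)
                delay *= backoff
    raise last_error
```

The `sleep` parameter exists only so the waiting can be replaced in tests. The job manager would wrap each call that talks to GitHub in `call_with_retry` and treat a final failure as something to report rather than a reason to crash.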
- sync NESSI/EESSI bot codes
- bot implementation of NESSI bot has diverged
- w.r.t. deployment procedure (extra update in GitHub comment)
- extra info in metadata file
- uploading of tarball is done in a different subdirectory of the bucket
- => should open PR with changes made in NESSI bot to EESSI bot to have a clear view of divergence
- support for cancelling a job (issue #190)
- only people with build permissions
- or only person who triggered build + select group of "admins"
- need to make sure that job manager properly cleans up when job gets cancelled (or disappears due to some other problem)
- support for building on top of EESSI/NESSI (local FS, second CVMFS (restricted repo))
- requires update in eessi_container.sh script to bind-mount stuff for multiple repos
- "control center" (issue #96)
- first step could be to use first comment created by each bot to let it provide a status overview
- could be a step towards having a more general web interface that provides an overview of all build jobs/open PRs
- make bot less "chatty" (issue #159)
- bot should warn when something that was built successfully before is being rebuilt
- maybe it should even refuse to rebuild, unless a (new) rebuild command is used instead (issue #92)
- => need to open issue on this
- implement extra test phase in bot
- overlay should not be writable in this case!
- still need to bind mount contents of build tarball
- support GitLab (issue #194)
- what information we need: map between GH and GL
- how to update PyGHee
- plan the work
- milestone for bug fix release 0.2.1
- fix crashes of job manager (+ event handler)
- release 0.2.2: sync EESSI/NESSI code (see differences first)
- open PR [Thomas]
- milestone for minor release 0.3.0
- improve deploy phase
- => open missing issues [Thomas]
- next sync meeting Wed, Feb 7, 10:00
- use weekly support meetings to report on status