Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: clarify the role of "code" #79

Open
lukasheinrich opened this issue Aug 22, 2018 · 2 comments
Open

RFC: clarify the role of "code" #79

lukasheinrich opened this issue Aug 22, 2018 · 2 comments

Comments

@lukasheinrich
Copy link
Member

To reiterate the 4 questions:

  1. What is your input data?
  2. What is your environment?
  3. Which code analyses it?
  4. Which steps did you take?

the role of "code" is not really clear yet. One the one hand it is being treated just like the data by being bind-mounted in, on the other hand a lot of the analysis code (if not all of it) is already captured in the environment (i.e. docker image)

I think it would be useful to collect thoughts to clarify this. In my understanding

code (from 3) +environment (from 4)-> runtime (container_image or filesystem)

there are multiple ways to arrive at this image

  • build the image via Dockerfile
  • various source2image solutions like

buildpacks (https://buildpacks.io)
source2image (https://github.com/openshift/source-to-image.git)
repo2docker (https://github.com/jupyter/repo2docker)
ad-hoc mounting in the data (only works for dynamic languages -- which is what we do now)
(docker run -v code:/code)

For a workflow there might be multple code bases (for each workflow), multiple images, etc. So we should try to define this carefully

Once the and then the data (from 1) ) is made available by mounting in a directory and running the step in 4) :

ctr run -v data:/data container_image "./mycommand.sh /data/in.root /data/out.root"

@lukasheinrich
Copy link
Member Author

I think for me the priority would be that in the REANA codebase, these things are separated already at the code level, such that the code factors the concerns appropriately. I.e. we should bind mount data and code separately.

@tiborsimko
Copy link
Member

Quoting some of my past musings from the chatroom:

We distinguish between the "compile-time code" that builds the container, so
to speak, and the "run-time code" that can be mounted "live" just for the
execution time. Think R environment and running small R scripts, or ROOT
environment and running small macros to produce plots. In this use case, the
user code does not necessarily need to build a new image, it can be
"mounted" into an already existing runtime environment. The use case is a
researcher with simple needs that may not know much about containers and
images and workflows. In this use case, we are basically avoiding the need
to build images.... Make sense if many people can easily work on top of
already-prepared environment image. There are pros and cons obviously. (And
the more common use case would be to work on top of alerady-prepared base
image, creating Dockerfile with ADD . /code during image build time, where
CWL workflow graphs would look naturally pretty...)

Another quote:

Regarding mounting code, for REANA we have foreseen two use cases: (1) users
using images prepared by others; (2) users writing their own images. For the
former use case, we are actually recommending to mount the code and inputs
into the container [...] The assumption is that the number of
people-using-images may be significantly higher than the number of
people-creating-images, so mounting code may simplify things for the
newcomers. 

Another quote:

We'd distinguish between compile-time code (that is helping to build and
define the environment) and the runtime code (that only uses it). Hence the
"code" section outside the "environment" section, in case the said code does
not contribute anything to "build" the environment, but simply "uses" a
provided environment. The distinction can be useful in case you have 1-2
researchers "providing" environments and 10-15 researchers "using" them,
without modification needs... That kind of compile-time vs run-time code.

Since the time of these musings, the build-time-code vs run-time-code
distinction still holds, obviously -- as you also describe it.

However one of major novelties of REANA v0.3.0 is that we have abolished the
separation between code and data runtime inputs in the reana.yaml structure
and also in the CLI usage scenarios, so that the old:

$ reana-client code upload ./mycode.C
$ reana-client inputs upload ./mydata.csv

is now simply:

$ reana-client upload ./mycode.C 
$ reana-client upload ./mydata.csv

Basically, the users are still fully free to select any kind of directory
structure they want for their analyses -- to separate code and data, to not
separate them, to introduce subdirectories etc -- the REANA CLI won't try to map
these into "code" or "data" category anymore. All the code organisation
techniques would work the same.

This will hopefully alleviate possible confusions regarding build-time-code vs
run-time-code.

P.S. See also reanahub/reana-demo-worldpopulation#23
that discusses what might be a good default for various demo examples.

P.S. Note that REANA v0.2.0 was combining code and data inputs together and into
a single mount under the hood anyway...

mdonadoni added a commit to mdonadoni/reana that referenced this issue Mar 5, 2024
chore(reana-ui/master): release 0.9.4

build(reana-ui/package): update yarn.lock (reanahub#399)

build(reana-ui/package): require jsroot<7.6.0 (reanahub#399)

ci(reana-ui/commitlint): allow release commit style (reanahub#400)

docs(reana-ui/authors): complete list of contributors (reanahub#396)

ci(reana-ui/shellcheck): exclude node_modules from the analyzed paths (reanahub#387)

fix(reana-ui/progress): update failed workflows duration using finish time (reanahub#387)

feat(reana-ui/footer): link privacy notice to configured URL (reanahub#393)

refactor(reana-ui/docs): move from reST to Markdown (reanahub#391)

ci(reana-ui/commitlint): check for the presence of concrete PR number (reanahub#390)

ci(reana-ui/shellcheck): fix exit code propagation (reanahub#390)

fix(reana-ui/launcher): remove dollar sign in generated Markdown (reanahub#389)

ci(reana-ui/release-please): update version in package.json and Dockerfile (reanahub#385)

ci(reana-ui/release-please): switch to `simple` release strategy (reanahub#383)

fix(reana-ui/router): show 404 page for invalid URLs (reanahub#382)

ci(reana-ui/release-please): initial configuration (reanahub#380)

ci(reana-ui/commitlint): addition of commit message linter (reanahub#380)

chore(reana-message-broker/master): release 0.9.3

ci(reana-message-broker/commitlint): allow release commit style (reanahub#67)

docs(reana-message-broker/authors): complete list of contributors (reanahub#66)

refactor(reana-message-broker/docs): move from reST to Markdown (reanahub#65)

ci(reana-message-broker/commitlint): check for the presence of concrete PR number (reanahub#64)

ci(reana-message-broker/shellcheck): fix exit code propagation (reanahub#64)

ci(reana-message-broker/release-please): update version in Dockerfile (reanahub#63)

fix(reana-message-broker/startup): handle signals for graceful shutdown (reanahub#59)

ci(reana-message-broker/release-please): initial configuration (reanahub#60)

ci(reana-message-broker/commitlint): addition of commit message linter (reanahub#60)

chore(reana-server/master): release 0.9.3

build(reana-server/python): bump all required packages as of 2024-03-04 (reanahub#674)

build(reana-server/python): bump shared REANA packages as of 2024-03-04 (reanahub#674)

build(reana-server/python): bump shared modules (reanahub#676)

ci(reana-server/commitlint): allow release commit style (reanahub#675)

docs(reana-server/authors): complete list of contributors (reanahub#673)

ci(reana-server/pytest): move to PostgreSQL 14.10 (reanahub#672)

refactor(reana-server/docs): move from reST to Markdown (reanahub#671)

style(reana-server/black): format with black v24 (reanahub#670)

ci(reana-server/commitlint): check for the presence of concrete PR number (reanahub#669)

ci(reana-server/shellcheck): fix exit code propagation (reanahub#669)

ci(reana-server/release-please): update version in Dockerfile/OpenAPI specs (reanahub#668)

build(reana-server/docker): non-editable submodules in "latest" mode (reanahub#656)

build(reana-server/deps): pin invenio-userprofiles to 1.2.4 (reanahub#665)

ci(reana-server/release-please): initial configuration (reanahub#665)

ci(reana-server/commitlint): addition of commit message linter (reanahub#665)

chore(reana-workflow-controller/master): release 0.9.3

build(reana-workflow-controller/python): bump all required packages as of 2024-03-04 (reanahub#574)

build(reana-workflow-controller/python): bump shared REANA packages as of 2024-03-04 (reanahub#574)

feat(reana-workflow-controller/manager): increase termination period of run-batch pods (reanahub#572)

ci(reana-workflow-controller/commitlint): allow release commit style (reanahub#575)

feat(reana-workflow-controller/manager): pass custom env variables to job controller (reanahub#571)

feat(reana-workflow-controller/manager): pass custom env variables to workflow engines (reanahub#571)

docs(reana-workflow-controller/authors): complete list of contributors (reanahub#570)

ci(reana-workflow-controller/pytest): move to PostgreSQL 14.10 (reanahub#568)

fix(reana-workflow-controller/manager): use valid group name when calling `groupadd` (reanahub#566)

refactor(reana-workflow-controller/docs): move from reST to Markdown (reanahub#567)

fix(reana-workflow-controller/stop): store engine logs of stopped workflow (reanahub#563)

fix(reana-workflow-controller/manager): graceful shutdown of job-controller (reanahub#559)

feat(reana-workflow-controller/manager): call shutdown endpoint before workflow stop (reanahub#559)

refactor(reana-workflow-controller/consumer): do not update status of jobs (reanahub#559)

style(reana-workflow-controller/black): format with black v24 (reanahub#564)

ci(reana-workflow-controller/commitlint): check for the presence of concrete PR number (reanahub#562)

ci(reana-workflow-controller/shellcheck): fix exit code propagation (reanahub#562)

ci(reana-workflow-controller/release-please): update version in Dockerfile/OpenAPI specs (reanahub#558)

build(reana-workflow-controller/docker): non-editable submodules in "latest" mode (reanahub#551)

ci(reana-workflow-controller/release-please): initial configuration (reanahub#555)

ci(reana-workflow-controller/commitlint): addition of commit message linter (reanahub#555)

chore(reana-job-controller/master): release 0.9.3

build(reana-job-controller/python): bump all required packages as of 2024-03-04 (reanahub#442)

build(reana-job-controller/python): bump shared REANA packages as of 2024-03-04 (reanahub#442)

ci(reana-job-controller/commitlint): allow release commit style (reanahub#443)

build(reana-job-controller/certificates): update expired CERN Grid CA certificate (reanahub#440)

fix(reana-job-controller/database): limit the number of open database connections (reanahub#437)

docs(reana-job-controller/authors): complete list of contributors (reanahub#434)

perf(reana-job-controller/cache): avoid caching jobs when the cache is disabled (reanahub#435)

ci(reana-job-controller/pytest): move to PostgreSQL 14.10 (reanahub#429)

refactor(reana-job-controller/docs): move from reST to Markdown (reanahub#428)

ci(reana-job-controller/commitlint): check for the presence of concrete PR number (reanahub#425)

ci(reana-job-controller/shellcheck): fix exit code propagation (reanahub#425)

feat(reana-job-controller/shutdown): stop all running jobs before stopping workflow (reanahub#423)

refactor(reana-job-controller/monitor): move fetching of logs to job-manager (reanahub#423)

refactor(reana-job-controller/db): set job status also in the main database (reanahub#423)

refactor(reana-job-controller/monitor): centralise logs and status updates (reanahub#423)

style(reana-job-controller/black): format with black v24 (reanahub#426)

ci(reana-job-controller/release-please): update version in Dockerfile/OpenAPI specs (reanahub#421)

build(reana-job-controller/docker): non-editable submodules in "latest" mode (reanahub#416)

ci(reana-job-controller/release-please): initial configuration (reanahub#417)

ci(reana-job-controller/commitlint): addition of commit message linter (reanahub#417)

chore(reana-workflow-engine-cwl/master): release 0.9.3

build(reana-workflow-engine-cwl/python): bump all required packages as of 2024-03-04 (reanahub#267)

build(reana-workflow-engine-cwl/python): bump shared REANA packages as of 2024-03-04 (reanahub#267)

docs(reana-workflow-engine-cwl/conformance-tests): update CWL conformance test badges (reanahub#264)

ci(reana-workflow-engine-cwl/commitlint): allow release commit style (reanahub#268)

docs(reana-workflow-engine-cwl/authors): complete list of contributors (reanahub#266)

refactor(reana-workflow-engine-cwl/docs): move from reST to Markdown (reanahub#263)

fix(reana-workflow-engine-cwl/progress): handle stopped jobs (reanahub#260)

ci(reana-workflow-engine-cwl/commitlint): check for the presence of concrete PR number (reanahub#262)

ci(reana-workflow-engine-cwl/shellcheck): fix exit code propagation (reanahub#262)

build(reana-workflow-engine-cwl/docker): install correct extras of reana-commons submodule (reanahub#261)

ci(reana-workflow-engine-cwl/release-please): update version in Dockerfile (reanahub#259)

build(reana-workflow-engine-cwl/docker): non-editable submodules in "latest" mode (reanahub#255)

ci(reana-workflow-engine-cwl/release-please): initial configuration (reanahub#256)

ci(reana-workflow-engine-cwl/commitlint): addition of commit message linter (reanahub#256)

chore(reana-workflow-engine-serial/master): release 0.9.3

build(reana-workflow-engine-serial/python): bump all required packages as of 2024-03-04 (reanahub#200)

build(reana-workflow-engine-serial/python): bump shared REANA packages as of 2024-03-04 (reanahub#200)

ci(reana-workflow-engine-serial/commitlint): allow release commit style (reanahub#201)

docs(reana-workflow-engine-serial/authors): complete list of contributors (reanahub#199)

refactor(reana-workflow-engine-serial/docs): move from reST to Markdown (reanahub#198)

fix(reana-workflow-engine-serial/progress): handle stopped jobs (reanahub#195)

ci(reana-workflow-engine-serial/commitlint): check for the presence of concrete PR number (reanahub#197)

ci(reana-workflow-engine-serial/shellcheck): fix exit code propagation (reanahub#197)

build(reana-workflow-engine-serial/docker): install correct extras of reana-commons submodule (reanahub#196)

ci(reana-workflow-engine-serial/release-please): update version in Dockerfile (reanahub#194)

build(reana-workflow-engine-serial/docker): non-editable submodules in "latest" mode (reanahub#190)

ci(reana-workflow-engine-serial/release-please): initial configuration (reanahub#191)

ci(reana-workflow-engine-serial/commitlint): addition of commit message linter (reanahub#191)

chore(reana-workflow-engine-yadage/master): release 0.9.4

build(reana-workflow-engine-yadage/python): bump all required packages as of 2024-03-04 (reanahub#261)

build(reana-workflow-engine-yadage/python): bump shared REANA packages as of 2024-03-04 (reanahub#261)

ci(reana-workflow-engine-yadage/commitlint): allow release commit style (reanahub#262)

docs(reana-workflow-engine-yadage/authors): complete list of contributors (reanahub#260)

refactor(reana-workflow-engine-yadage/docs): move from reST to Markdown (reanahub#259)

fix(reana-workflow-engine-yadage/progress): correctly handle running and stopped jobs (reanahub#258)

ci(reana-workflow-engine-yadage/commitlint): check for the presence of concrete PR number (reanahub#257)

ci(reana-workflow-engine-yadage/shellcheck): fix exit code propagation (reanahub#257)

build(reana-workflow-engine-yadage/docker): install correct extras of reana-commons submodule (reanahub#256)

ci(reana-workflow-engine-yadage/release-please): update version in Dockerfile (reanahub#254)

build(reana-workflow-engine-yadage/docker): non-editable submodules in "latest" mode (reanahub#249)

ci(reana-workflow-engine-yadage/release-please): initial configuration (reanahub#251)

ci(reana-workflow-engine-yadage/commitlint): addition of commit message linter (reanahub#251)

chore(reana-workflow-engine-snakemake/master): release 0.9.3

build(reana-workflow-engine-snakemake/python): bump all required packages as of 2024-03-04 (reanahub#85)

build(reana-workflow-engine-snakemake/python): bump shared REANA packages as of 2024-03-04 (reanahub#85)

ci(reana-workflow-engine-snakemake/commitlint): allow release commit style (reanahub#86)

feat(reana-workflow-engine-snakemake/config): get max number of parallel jobs from env vars (reanahub#84)

feat(reana-workflow-engine-snakemake/executor): upgrade to Snakemake v7.32.4 (reanahub#81)

docs(reana-workflow-engine-snakemake/authors): complete list of contributors (reanahub#83)

refactor(reana-workflow-engine-snakemake/docs): move from reST to Markdown (reanahub#82)

fix(reana-workflow-engine-snakemake/progress): handle stopped jobs (reanahub#78)

ci(reana-workflow-engine-snakemake/commitlint): check for the presence of concrete PR number (reanahub#80)

ci(reana-workflow-engine-snakemake/shellcheck): fix exit code propagation (reanahub#80)

build(reana-workflow-engine-snakemake/docker): install correct extras of reana-commons submodule (reanahub#79)

ci(reana-workflow-engine-snakemake/release-please): update version in Dockerfile (reanahub#77)

build(reana-workflow-engine-snakemake/docker): non-editable submodules in "latest" mode (reanahub#73)

ci(reana-workflow-engine-snakemake/release-please): initial configuration (reanahub#74)

ci(reana-workflow-engine-snakemake/commitlint): addition of commit message linter (reanahub#74)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants