Proposal: explicitly support branches #330
Comments
The idea of a unique experiment ID also suggested appending the experiment ID to uniquely identify So if creating a new branch also triggered a new unique experiment ID, that would cover this use case. |
This was discussed at the COSIMA II technical workshop today. It was suggested that if branch name were used as suggested, that this not be done for the default case of |
An alternative to use git hooks to update the |
Below are some ideas for experiment uuids and payu git branch support: payu commands
Example usage
To list branches and uuids:
Note at this point, the git log could look like this:
To avoid new Size of uuid Experiment Archive Name: Backwards compatibility: What should not change for old experiments is the For new experiments cloned/created with |
Note it is possible to run https://git-scm.com/docs/git#Documentation/git.txt--Cltpathgt
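The linked documentation describes git's `-C <path>` option, which runs a git command as if git had been started in `<path>`. A minimal sketch of how a tool could query a control directory's branch without changing its own working directory is shown here. The helper names `git_command` and `current_branch` are hypothetical and are not part of payu.

```python
import subprocess


def git_command(repo_dir, *args):
    """Build a git command line that runs as if started in repo_dir."""
    return ["git", "-C", str(repo_dir), *args]


def current_branch(repo_dir):
    """Return the name of the branch checked out in repo_dir.

    `git rev-parse --abbrev-ref HEAD` prints the literal string "HEAD"
    when the repository is in a detached-HEAD state.
    """
    result = subprocess.run(
        git_command(repo_dir, "rev-parse", "--abbrev-ref", "HEAD"),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

Calling `current_branch("/path/to/control_dir")` needs git on the `PATH` and an existing repository at that location.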
Apologies if I was unclear, but we definitely want to use the full hash in In an ideal world it should be possible to There are a number of packages/posts/code snippets which use an approach of converting the UUID to a number and then re-encoding using a larger base encoding to represent the number in a shorter string, e.g. https://pypi.org/project/shortuuid/ https://github.com/Devskiller/friendly-id (Effectively BASE64 but with dropping some of the less safe characters) So typically this reduces a 36 character uuid4 string to 22 characters. It also means a shortened version contains more entropy. Maybe not worth the hassle, and not being an acknowledged standard, but thought it was interesting. |
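To illustrate the re-encoding idea, here is a minimal sketch that treats a UUID as a 128-bit integer and writes it in base 62. The alphabet and the function names `encode_uuid` and `decode_uuid` are assumptions for illustration. The packages linked above use their own alphabets, so their output strings will differ.

```python
import string
import uuid

# Hypothetical 62-character alphabet (digits plus ASCII letters).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def encode_uuid(value, alphabet=ALPHABET):
    """Re-encode the 128-bit integer behind a UUID in a larger base."""
    base = len(alphabet)
    # Fixed width: the smallest digit count that can hold any 128-bit value.
    width = 1
    while base ** width < 2 ** 128:
        width += 1
    number = value.int
    digits = []
    for _ in range(width):
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


def decode_uuid(text, alphabet=ALPHABET):
    """Invert encode_uuid and recover the original UUID."""
    number = 0
    for char in text:
        number = number * len(alphabet) + alphabet.index(char)
    return uuid.UUID(int=number)
```

With this alphabet the encoded string is 22 characters, against 36 for the usual hyphenated hex form. Each base-62 character carries just under 6 bits, compared with 4 bits per hex digit, which is why a truncated prefix of the re-encoded form holds more entropy than a hex prefix of the same length.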
I agree with having a longer Yeah, I saw that
To avoid over-writing output in remote directory in |
Good question. So .. digression

Common user workflow

Up to this point the most common (and encouraged) user workflow was to
The experiment name wasn't set in I'll refer to this later as the legacy workflow.

ACCESS-OM3 Organisation

COSIMA (@aekiss and @micaeljtoliveira) are organising their closely related experiments in single repos, with branches for the different combinations of resolution and atmospheric forcing: https://github.com/COSIMA/MOM6-CICE6

This is a good idea from a maintenance point of view: it reduces the number of repos and makes it simpler to alter related configurations by, for example, rebasing from a common shared ancestor. However, it (slightly) alters the currently used workflow: users will need to either do an additional step of checking out a specific branch for the experiment configuration they want, or include the branch name in the

Workflow Proposal: Utilising Branches

How can explicit support for branching work with the legacy workflow, yet also work well with the new COSIMA repo organisation?

From a user perspective the COSIMA repo organisation doesn't change much about how they work. Apart from being asked to work from a fork, they are still encouraged to clone into a local directory that is named for their proposed experiment name.

However, the stated purpose of this issue was to allow users to have a single repository for their related experiments, and use branches for each unique experiment. So locally users were doing something similar to the COSIMA organisation, but whereas COSIMA has branches for very different configurations, users would be branching from a single model configuration.

If we think about this from a namespace point of view, from a user's perspective the cloned experiment represents a namespace from which perturbation experiments can be run. So it makes sense to utilise this idea to reduce the length and complexity of branch names by automatically utilising the experiment directory name. Branch names must be unique within a repo, so we could default to assuming the experiment So a proposed workflow could be:
and alter the The issue proposing unique experiment ids suggested adding a shortened ID to the However, this has the downside of potential namespace conflicts between researchers when copying to a shared space. This risk already exists, but could be mitigated by using experiment IDs in the naming of the So a belt and braces approach would be to use
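As a rough illustration of combining the three ingredients discussed here (control directory name, branch name and a shortened experiment ID), a name could be composed as in the sketch below. The function `experiment_name`, the `-` separator, the ordering of the parts and the default truncation length are all assumptions made for this example. The comment above does not fix any of them.

```python
import uuid
from pathlib import Path


def experiment_name(control_dir, branch, experiment_uuid, uuid_length=8):
    """Compose a name from directory name, branch and a truncated uuid.

    Separator, ordering and truncation length are illustrative only.
    """
    # Drop the hyphens so the truncated ID is made of hex digits only.
    short_id = str(experiment_uuid).replace("-", "")[:uuid_length]
    return "-".join([Path(control_dir).name, branch, short_id])
```

Because the directory name supplies the namespace, branch names can stay short, while the ID suffix keeps names from two researchers distinct in a shared space. Git branch names may contain `/`, so a real implementation would also need to decide how to map such names onto directory names.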
Yes. I agree
Yes, checking if running from We could make a We could add a
See above. |
Currently `payu` doesn't prevent the use of branches in the `git` repo of the model experiment directory, but it has no explicit knowledge of or support for them.

I propose a change to the way `payu` names the `archive` and `work` directories: append the branch name to the model name in the `work` and `archive` directory names.

This has the advantage that a single experiment control directory can be used for perturbations/tests/modifications, and they can happily co-exist as fully-formed experiments. Simply changing experiment with `git branch` will automatically switch between `archive` directories.

This would require changing the symbolic link to the `archive` directory when `git branch` is called. This could be done using git hooks.