Experiment versioning #191

aidanheerdegen · 2019-06-21T00:24:02Z

It would be great to be able to uniquely identify an entire experiment.

Could generate a uuid for this purpose, and save it in a metadata YAML file in archive, if there was not already a metadata file.

It would break if the metadata file was copied from another experiment. Maybe just live with that?
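A minimal sketch of the idea above, using only the Python standard library. The key name `experiment_uuid` and the function name `ensure_experiment_uuid` are assumptions made for illustration, not an existing API. The file is written as a plain `key: value` line, which is valid YAML, so no YAML library is needed here.

```python
import uuid
from pathlib import Path

# The key name is hypothetical, chosen only for this sketch.
METADATA_FILENAME = "metadata.yaml"
UUID_KEY = "experiment_uuid"


def ensure_experiment_uuid(archive_dir):
    """Return the uuid recorded in archive_dir, generating one if absent."""
    metadata_path = Path(archive_dir) / METADATA_FILENAME
    existing = metadata_path.read_text() if metadata_path.exists() else ""

    # Reuse an ID that is already recorded. A metadata file copied from
    # another experiment would be reused too, which is the weakness
    # mentioned above.
    for line in existing.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == UUID_KEY and value.strip():
            return value.strip()

    # No file, or no uuid entry in it: generate a random (version 4) uuid.
    new_id = str(uuid.uuid4())
    metadata_path.parent.mkdir(parents=True, exist_ok=True)
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    with metadata_path.open("a") as f:
        f.write(f"{separator}{UUID_KEY}: {new_id}\n")
    return new_id
```

Calling the function twice on the same archive directory returns the same ID, so it only generates a new one when nothing is recorded yet.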

Would want to be consistent with whatever is done for the cosima cookbook:

COSIMA/cosima-cookbook#134

Would the metadata file be tracked by the git repo? That would mean generating the metadata file in the control directory and copying it to archive. It would make it more ambiguous when it was necessary to generate a new metadata file.

marshallward · 2019-06-21T15:34:33Z

Could the git hash for the experiment act as a unique ID? Not sure what metadata is referring to here.

aidanheerdegen · 2019-06-24T00:27:33Z

Sorry, need more background. There was a discussion in this cosima-cookbook PR about providing some metadata for the cookbook database.

COSIMA/cosima-cookbook#130 (comment)

An example was:

contact: Andrew Kiss
contact_email: [email protected]
created: 2018-01-01
description: "Attempted spinup, using Russ' salt flux fix https://arccss.slack.com/archives/C6PP0GU9Y/p1515460656000124 and https://github.com/mom-ocean/MOM5/pull/208/commits/9f4ee6f8b72b76c96a25bf26f3f6cdf773b424d2 from the start. Used mushy ice from July year 1 onwards to avoid vertical thermo error in cice https://arccss.slack.com/archives/C6PP0GU9Y/p1515842016000079"
notes: "Stripy salt restoring: https://github.com/OceansAus/access-om2/issues/74  tripole seam bug: https://github.com/OceansAus/access-om2/issues/86 requires dt=300s in May, dt=240s in Aug to maintain CFL in CICE near tripoles (storms in those months in 8485RYF); all other months work with dt=400s"

I then suggested we could generate a uuid to uniquely identify an experiment.

We'd need just a single hash, right? So which one? Well, I guess it would be the hash at the time a new experiment was being started? I don't think we could guarantee that was unique. If an experiment is forked and the user specified a reproducible run, I don't think this would trigger a commit, and therefore there would be no new hash.

aidanheerdegen · 2019-10-28T03:26:01Z

Was thinking about this the other day and definitely want an exptID generated separately from the git repo hashes of the control directory.

This exptID needs to be regenerated when various criteria are met. Some I thought of:

run counter is reset
experiment name (directory name) changed
archive directory created
no existing metadata.yaml file

There is redundancy in the above list, as making an archive directory pretty much implies there is no existing metadata file, and the run counter will be reset. Equally, when the experiment name is changed it is likely that a new archive directory will be created. If it is not, for example because a user manually renames the archive directory, would the exptID need to change?
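The criteria above could be sketched as a single check that returns every reason that applies, which also makes the redundancy visible. This is an illustration only: the `ExperimentState` fields and the function name are hypothetical, and how each fact would actually be detected is left open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExperimentState:
    """Hypothetical snapshot of the facts the criteria depend on."""
    run_counter: int               # 0 is taken to mean the counter was reset
    name: str                      # current experiment (directory) name
    recorded_name: Optional[str]   # name stored with the existing ID, if any
    archive_exists: bool           # archive directory present before this run
    metadata_exists: bool          # metadata.yaml already present


def regeneration_reasons(state):
    """Return the list of criteria met; an empty list means keep the exptID."""
    reasons = []
    if state.run_counter == 0:
        reasons.append("run counter is reset")
    if state.recorded_name is not None and state.recorded_name != state.name:
        reasons.append("experiment name (directory name) changed")
    if not state.archive_exists:
        reasons.append("archive directory created")
    if not state.metadata_exists:
        reasons.append("no existing metadata.yaml file")
    return reasons
```

A brand new experiment with no archive reports three reasons at once (counter reset, archive created, no metadata file), which is the overlap described above. A continuing run with matching name, archive, and metadata file reports none.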

Some examples of when exptID would (and should) change:

Clone an experiment but change some parameters/forcing and start from initial conditions
Clone an experiment, but change some parameters/forcing and start from restarts: this is an experiment fork, like a perturbation run
Re-run after payu sweep --hard (so maybe a failed run, or incorrect inputs etc)

If a user clones their own experiment they are pretty much forced to change the experiment name, otherwise they will have a directory name clash in their laboratory. A user cloning someone else's config has no such restriction, but again, the clone will not bring over a metadata.yaml file, so one should be created as desired.

aidanheerdegen · 2020-05-04T00:07:25Z

Some portion of this ID could be appended to work and archive directories to disambiguate between identical experiment names. In this case that would be another reason to regenerate an experiment ID, if that experiment name already exists. Now that I think about it, that case is probably implicitly covered above when there is no archive directory .. but to check for that you'd need the experiment ID. This is getting circular ...
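As a rough illustration of appending a portion of the ID, the sketch below builds a directory name from the experiment name plus the first few characters of the uuid. The function name, the `-` separator, and the default length of 8 are all assumptions, not a decided convention.

```python
def disambiguated_name(expt_name, expt_id, length=8):
    """Append a short prefix of the experiment ID to the experiment name."""
    # Drop the hyphens so the suffix is a plain run of hex characters.
    short_id = expt_id.replace("-", "")[:length]
    return f"{expt_name}-{short_id}"
```

Two experiments with the same name but different IDs would then get different work and archive directory names, as long as the ID prefixes differ.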

aidanheerdegen · 2024-02-08T03:34:33Z

Closed by #384

aidanheerdegen added the feature label Jun 21, 2019

aidanheerdegen mentioned this issue Sep 1, 2020

Deal with duplicated experiment name in querying COSIMA/cosima-cookbook#168

Open

aidanheerdegen mentioned this issue Jun 24, 2022

Proposal: explicitly support branches #330

Closed

jo-basevi self-assigned this Oct 25, 2023

jo-basevi mentioned this issue Nov 20, 2023

Adding experiment uuid, metadata and branch support #384

Merged

aidanheerdegen closed this as completed Feb 8, 2024
