Restart Catalogue #272

charles-turner-1 · 2024-11-26T22:03:34Z

Is your feature request related to a problem? Please describe.

@aidanheerdegen noted yesterday that it might be helpful to have a separate catalogue of restarts, so that users can easily access restarts for model runs but won't come across them accidentally in the main catalogue, which avoids confusion.

Describe the feature you'd like

Intake allows a single catalog to describe multiple sources, i.e. the access_nri and restart catalogues could be combined as

sources:
  access_nri:
    args:
      columns_with_iterables:
      - model
      - realm
      - frequency
      - variable
      mode: r
      name_column: name
      path: /g/data/xp65/public/apps/access-nri-intake-catalog/{{version}}/metacatalog.csv
      yaml_column: yaml
    description: ACCESS-NRI intake catalog
    driver: intake_dataframe_catalog.core.DfFileCatalog
    metadata:
      storage: gdata/fs38+gdata/oi10+gdata/tm70
      version: '{{version}}'
    parameters:
      version:
        default: v0.1.3
        description: Catalog version
        type: str
  restarts:
    args:
      columns_with_iterables:
      - model
      - realm
      - frequency
      - variable
      mode: r
      name_column: name
      path: /g/data/xp65/public/apps/access-nri-intake-catalog/{{version}}/restart_metacatalog.csv
      yaml_column: yaml
    description: ACCESS-NRI restart catalog
    driver: intake_dataframe_catalog.core.DfFileCatalog
    metadata:
      storage: gdata/al33+gdata/rr3+gdata/tm70
      version: '{{version}}'
    parameters:
      version:
        default: v2024-11-11
        description: Catalog version
        max: v2024-11-11
        min: v2024-11-08
        type: str

which would then be accessible through

>>> import intake
>>> intake.cat.access_nri
<access_nri catalog with 94 source(s) across 2272 rows>
>>> intake.cat.restarts
<user_def catalog with x source(s) across y rows>

Describe alternatives you've considered

This feature would build on the approach described in #245 - see there for potential pitfalls.

Additional context


aidanheerdegen · 2024-11-26T22:09:27Z

Thanks @charles-turner-1 for making this issue and pointing out the possibilities for how it might work.

I'll ping @jo-basevi and @tmcadam here as this is part of the experiment provenance and tracking work.

Would it help to have some example restarts to index for testing purposes? We're probably still lacking some important metadata (experiment and run IDs) in the files themselves, see payu-org/payu#510, so that might be a blocker until remedied.

marc-white · 2024-11-28T00:25:28Z

For my own edification (and the future software peeps who aren't from a climate background), can someone give me a definitive explanation (or link to the same) of what the 'restarts' are, particularly with respect to how they differ from the 'outputs'?

charles-turner-1 · 2024-12-02T00:48:44Z

I believe that they contain the fields, as they were at the very last timestep of the model run, whereas the outputs tend to be averaged over the output period.

So for a model with a 6-hour timestep, we might run up until e.g. the end of June, giving us a restart that represents the model state on the 30th of June at 18:00, whereas the June output will be the average conditions over the entirety of June (00:00 1st June - 18:00 30th June).

I would confirm this with someone who regularly performs the model runs - this is based off my conversation with @aidanheerdegen last week and some model runs I performed about 6 years ago...
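To make the distinction concrete, here is a minimal sketch with a toy "model". The field values, the linear drift and the variable names are all made up for illustration and do not come from any real model. It records one value per 6-hour timestep across June, then compares the last state (what a restart would hold) with the monthly mean (what an averaged output would hold).

```python
from statistics import mean

STEP_HOURS = 6
STEPS_PER_DAY = 24 // STEP_HOURS  # 4 timesteps per day
DAYS_IN_JUNE = 30

# Hypothetical field: starts at 10.0 at 00:00 on 1 June and drifts
# upward by 0.5 every timestep (purely illustrative numbers).
history = [10.0 + 0.5 * k for k in range(DAYS_IN_JUNE * STEPS_PER_DAY)]

# Restart-like quantity: the instantaneous state at the final
# timestep (18:00 on 30 June).
restart = history[-1]

# Output-like quantity: the average over every timestep in June.
monthly_mean = mean(history)

print(len(history), restart, monthly_mean)
```

With these made-up numbers the two quantities differ substantially, which is why a restart cannot stand in for an averaged output or vice versa.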

aidanheerdegen · 2024-12-02T06:45:35Z

> I believe that they contain the fields, as they were at the very last timestep of the model run, whereas the outputs tend to be averaged over the output period.

Yeah, that's correct. The models have a start > run > checkpoint > stop > resubmit cycle, in part because there are time limits on PBS jobs on the HPC, but sometimes also because of limits in the models themselves, e.g. time buffers that would overflow, or the need to generate new forcing data.

The checkpoint step dumps to disk all the prognostic fields that the model uses internally for time-stepping, and these can then be read back in when the model restarts.
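As a rough illustration of that cycle, the sketch below uses a toy model with a single prognostic value. The `run_segment`, `write_restart` and `read_restart` helpers, the JSON format and the `restart000.json` file name are all hypothetical; real models write their own restart formats. The point is only that a run split in two by a checkpoint ends in the same state as one uninterrupted run.

```python
import json
import os
import tempfile


def run_segment(state, n_steps):
    """Advance the toy model by n_steps; `state` holds the prognostic fields."""
    for _ in range(n_steps):
        state = {"step": state["step"] + 1, "temp": state["temp"] * 0.5 + 1.0}
    return state


def write_restart(state, path):
    """Checkpoint: dump the prognostic state to disk."""
    with open(path, "w") as f:
        json.dump(state, f)


def read_restart(path):
    """Restart: read the prognostic state back in."""
    with open(path) as f:
        return json.load(f)


initial = {"step": 0, "temp": 8.0}

# One uninterrupted run of 6 steps.
continuous = run_segment(initial, 6)

# The same 6 steps split across two "jobs" with a checkpoint in between.
with tempfile.TemporaryDirectory() as tmp:
    restart_path = os.path.join(tmp, "restart000.json")
    write_restart(run_segment(initial, 3), restart_path)  # job 1: run > checkpoint > stop
    resumed = run_segment(read_restart(restart_path), 3)  # job 2: restart > run

print(resumed == continuous)
```

This only works because everything the time-stepping depends on is in the saved state; anything left out of the checkpoint would make the resumed run diverge.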

If other users want to use an experiment as a base from which they create new runs, typically with some perturbed physics, then they also need the restart files from the experiment.

Usually not all restarts are retained, as it isn't usually necessary to be able to restart the model from any time during the experiment. payu (the model run tool) has a feature to "prune" restarts and retain only a subset of them. It's also not uncommon for the outputs and restarts to be separated, as the outputs are the more widely used product.
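For a sense of what retaining only a subset can look like, here is a small sketch. It is not payu's actual pruning logic: the `restarts_to_keep` function and its keep-every-Nth-plus-latest rule are assumptions chosen only to illustrate the idea.

```python
def restarts_to_keep(restart_indices, keep_every):
    """Hypothetical rule: keep every `keep_every`-th restart plus the latest one."""
    ordered = sorted(restart_indices)
    if not ordered:
        return []
    latest = ordered[-1]
    return [i for i in ordered if i % keep_every == 0 or i == latest]


# Twelve restarts numbered 0..11, retaining every fifth one and the newest.
kept = restarts_to_keep(range(12), keep_every=5)
print(kept)  # [0, 5, 10, 11]
```

A restart catalogue would then only ever index the retained subset, so gaps between catalogued restarts are expected rather than a sign of missing data.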

charles-turner-1 added the enhancement New feature or request label Nov 26, 2024

github-project-automation bot added this to Model Evaluation & Diagnostics Nov 26, 2024

github-project-automation bot moved this to Backlog in Model Evaluation & Diagnostics Nov 26, 2024

charles-turner-1 mentioned this issue Dec 12, 2024

Catalog polling to infer & exclude broken datastores #308

Open
