feature: config validation #157

theissenhelen · 2024-11-22T16:53:03Z

Currently, the configurations are passed via hydra from yaml files. This PR adds structured configs (or schemas) and basic config validation via Pydantic base models.

Some advantages are:

validation and feedback to the user
syntax highlighting
data transformations

Main changes are:

schemas in utils that represent the structure of the yamls
a new command: anemoi-training config validate config_name
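
To illustrate the idea, here is a minimal sketch of a schema for one YAML section. It is not the code in this PR: DataLoaderSchema, its fields, and validate_section are hypothetical names chosen for the example, and the dicts stand in for a section loaded from yaml via hydra.

```python
from pydantic import BaseModel, PositiveInt, ValidationError


class DataLoaderSchema(BaseModel):
    """Hypothetical schema for one YAML section (field names are illustrative)."""

    batch_size: PositiveInt
    num_workers: int = 0


def validate_section(raw: dict) -> list[str]:
    """Return readable validation messages; an empty list means the section is valid."""
    try:
        DataLoaderSchema(**raw)
    except ValidationError as exc:
        # Each error carries the location of the offending field and a message.
        return [f"{'.'.join(map(str, err['loc']))}: {err['msg']}" for err in exc.errors()]
    return []


# Validation and feedback: the negative batch size is reported by field name.
good = validate_section({"batch_size": 4})
bad = validate_section({"batch_size": -1})

# Data transformation: numeric strings are coerced to integers by default.
coerced = DataLoaderSchema(batch_size="2", num_workers="8")
```

The same pattern would extend to nested sections by using one model as the field type of another.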

For developers:
If you make changes to the configs, these need to be represented in the structured configs/schemas.

This is still a work in progress, but I wanted to get feedback on, e.g., where important validations are missing.

📚 Documentation preview 📚: https://anemoi-training--157.org.readthedocs.build/en/157/

HCookie

Overall I like the way this has been implemented, but I do have some concerns.

Where should defaults be specified? I worry about the visibility of having them set in the schema, and values being filled in automagically for the user.
We use hydra instantiate to allow a user to bring some of their own classes and work them into the run. Some of the checks herein limit what can be given to the _target_. I wonder if instead we could have an approach where, if it is a hard subclass, we enforce the parent init args but allow whatever extra kwargs alongside any _target_. Or have a custom model validator which, on validation, loads the _target_, creates a schema for it, and then runs validate. That way the config is still validated, but we allow for any class to be used.
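
A rough sketch of the second idea, under some assumptions: instead of generating a full schema, it checks the remaining keys against the call signature of the _target_ using only the standard library. resolve_target and check_target_kwargs are hypothetical helpers, logging.Formatter is just a stand-in for a user-provided class, and other hydra keys such as _partial_ are not handled.

```python
import inspect
from importlib import import_module


def resolve_target(path: str):
    """Import and return the object named by a dotted path, without calling it."""
    module_name, _, attr = path.rpartition(".")
    return getattr(import_module(module_name), attr)


def check_target_kwargs(config: dict) -> None:
    """Raise TypeError if the non-_target_ keys do not fit the target's signature."""
    kwargs = {key: value for key, value in config.items() if key != "_target_"}
    target = resolve_target(config["_target_"])
    # bind() applies the same argument matching as a real call,
    # but never executes the target itself.
    inspect.signature(target).bind(**kwargs)


# Accepted: fmt is a parameter of logging.Formatter.
check_target_kwargs({"_target_": "logging.Formatter", "fmt": "%(message)s"})
```

An unknown keyword would make bind() raise a TypeError, which a model validator could turn into a validation error. Type checks on the values would still need a schema on top of this.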

HCookie · 2024-11-22T17:08:53Z

src/anemoi/training/utils/schemas/data.py

+        assert target in [
+            "anemoi.models.preprocessing.normalizer.InputNormalizer",
+            "anemoi.models.preprocessing.imputer.InputImputer",
+            "anemoi.models.preprocessing.remapper.Remapper",
+        ]


I have concerns that this hard limitation to the preprocessors provided by anemoi.models is not scalable.
Say I build another processor in another package: with hydra instantiate I can bring that along by providing the path, but I would get blocked here.

theissenhelen · 2024-12-05

There are three options I can think of that we should discuss: 1. no check for valid targets; 2. strict validation, in the sense that we support only preprocessors that are implemented in Anemoi and thus tested; and 3. instantiating the targets and checking whether they are subclassed from the BaseProcessor class. The third option requires additional instantiation of the target classes, which might not be optimal.
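
On the third option, a subclass check may not need an instance at all: importing the class is enough for issubclass. Below is a minimal sketch, assuming the target is given as a dotted path. is_allowed_target is a hypothetical helper, and dict with classes from collections stand in for BaseProcessor and the processors, since the import path of BaseProcessor is not given in this thread.

```python
from importlib import import_module


def resolve_target(path: str):
    """Import and return the object named by a dotted path, without calling it."""
    module_name, _, attr = path.rpartition(".")
    return getattr(import_module(module_name), attr)


def is_allowed_target(path: str, base: type) -> bool:
    """True if `path` names a class that subclasses `base`; nothing is instantiated."""
    try:
        candidate = resolve_target(path)
    except (ImportError, AttributeError, ValueError):
        # Unknown module, unknown attribute, or a path without a module part.
        return False
    return isinstance(candidate, type) and issubclass(candidate, base)


# Stand-ins only: dict plays the role of the base class.
ok = is_allowed_target("collections.OrderedDict", dict)
wrong_base = is_allowed_target("collections.deque", dict)
missing = is_allowed_target("no_such_package.Thing", dict)
```

Importing the module still runs its module-level code, so this is cheaper than instantiation but not free.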

src/anemoi/training/utils/schemas/hardware.py

src/anemoi/training/utils/schemas/training.py

FussyDuck · 2024-12-02T17:18:57Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ theissenhelen
❌ chebertpinard
You have signed the CLA already but the status is still pending? Let us recheck it.

HCookie · 2024-12-19T13:20:11Z

src/anemoi/training/schemas/training.py

+class Rollout(BaseModel):
+    """Rollout configuration."""
+
+    start: PositiveInt = Field(default=1)
+    "Number of rollouts to start with."
+    epoch_increment: NonNegativeInt = Field(default=0)
+    "Number of epochs to increment the rollout."
+    max: PositiveInt = Field(default=1)
+    "Maximum number of rollouts."


To let you know, #206 will change these options.

HCookie reviewed Nov 25, 2024

theissenhelen force-pushed the 1-feature-improved-configuration-and-data-structures branch from 176e652 to 74c7373 Compare December 5, 2024 12:45

theissenhelen added 27 commits December 6, 2024 12:01

feat: dataclasses for training config

639025b

test: add test for dataclasses

ffbdff4

refactor: unused config values

ad9c178

feat: structured TrainingConfig

bae570c

fix: training config not in configstore

76cd43c

feat: HardwareConfig

194bac8

fix: missing base config attributes

3cb472a

fix: variable names (temporary fix only)

cbe84bd

feat: add data config schema

01d7bc0

feat: add structured config for gnn

7d5a047

feat: structured configs for transformer and graphtransformer

83fc4d2

feat: extended config schema for model architectures

a623237

feat: add diagnostics structured config

6f23137

feat: translate hardware config to pydantic

3ffaca2

feat: translate data to pydantic

de33e0c

feat: translate training and diagostic to pydantic

b2f192e

fix: hydra instantiation

9a05ef3

feat: translate gnn config to pydantic

c98a90b

fix: config setup working

25d2c2f

fix: type hints

54c731d

refactor: remove model component

f953c31

feat: translate transformer config to pydantic

b352fa1

feat: translate GraphTransformerConfig to pydantic

caf7b1b

feat: add target validator

0c4568f

chore: refactor

c44d231

feat: add defaults

674e2d7

feat: add basic graph schemas

4d3ef0c

theissenhelen and others added 21 commits December 6, 2024 12:02

feat: add dataloader schema

af855f5

feat: adjust datamodule to use dataloader schema

b9a1690

feat: make Frequency model compatible

c9ee0af

feat: add benchmarkprofiler schema

8b16f64

feat: config validate command

872b3eb

refactor: replace with enum

c50aa3a

docs: add description to dataloader schema

5cdb0be

docs: add description to data schema

a7d95f3

doc: add docstrings

3df5eb2

refactor: move schemas folder up

895a3c7

chore: add autodoc_pydantic

4412a19

doc: dosctrings to hardware schemas

de4f32f

fix: serialising Enums

f7fb892

feat: add http_max_retries to config and basemodel

9e5ef60

feat: set default value of read_group_size

86a8a9a

feat: accelerator check moves to pydantic hardware schema

a9b64e7

[pre-commit.ci] auto fixes from pre-commit.com hooks

edaa60e

for more information, see https://pre-commit.ci

refactor: replace target validation with Literal

091ca0b

feat: replace validators with enums

f445c1b

feat: add model_validator to adjust the learning rate to hardware settings

5ade973

fix: missing configs

844cd24

theissenhelen force-pushed the 1-feature-improved-configuration-and-data-structures branch from 3762500 to 844cd24 Compare December 6, 2024 12:02

chebertpinard and others added 7 commits December 9, 2024 16:15

docs: adjust description format for data and dataloader schema

ee3f362

docs: add description to diagnostics schema

a9f0a3c

Merge branch 'develop' into 1-feature-improved-configuration-and-data-structures

b6cbab2

fix: adjust to changes from develop

e3f2ecf

Merge branch '1-feature-improved-configuration-and-data-structures' of https://github.com/ecmwf/anemoi-training into 1-feature-improved-configuration-and-data-structures

6bd4fac

docs: allow Any plot callbacks in diagnostics

937536f

docs: add description to models schema

d7a8d93

HCookie reviewed Dec 19, 2024
