-
Notifications
You must be signed in to change notification settings - Fork 15
feature: config validation #157
base: develop
Are you sure you want to change the base?
feature: config validation #157
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I like the way this has been implemented, but I do have some concerns.
- Where should defaults be specified? I worry about the visibility of having them set in the schema, and values being filled automajically for the user.
- We use hydra instantiate to allow a user to bring some of thier own classes and work them into the run. Some of the checks herein limit what can be given to the
_target_
. I wonder if instead, we could have an approach, that if it is a hard subclass, we enfore the parent init args, but allow whatever extra kwargs alongside any_target_
. Or have a custom model validator, which on validation, loads the_target_
, creates a schema for it, and then runs validate. That way the config is still validated, but we allow for any class to be used.
assert target in [ | ||
"anemoi.models.preprocessing.normalizer.InputNormalizer", | ||
"anemoi.models.preprocessing.imputer.InputImputer", | ||
"anemoi.models.preprocessing.remapper.Remapper", | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have concerns that this hard limitation to anemoi.models
provided preprocessors is not scalable.
Say I build another processor in another package, with the hydra.instatiate I can bring that along by providing the path, but I would get blocked here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are three options I can think of that we should discuss: 1. No check for valid targets, 2. strict validation in the sense, we support only preprocessors that are implemented in Anemoi and thus tested and 3. we need to instantiate the targets and check whether they are subclassed from the BaseProcessor class. The third option requires addtional insatntiation of the target classes which might not be optimal.
|
176e652
to
74c7373
Compare
for more information, see https://pre-commit.ci
3762500
to
844cd24
Compare
…https://github.com/ecmwf/anemoi-training into 1-feature-improved-configuration-and-data-structures
class Rollout(BaseModel): | ||
"""Rollout configuration.""" | ||
|
||
start: PositiveInt = Field(default=1) | ||
"Number of rollouts to start with." | ||
epoch_increment: NonNegativeInt = Field(default=0) | ||
"Number of epochs to increment the rollout." | ||
max: PositiveInt = Field(default=1) | ||
"Maximum number of rollouts." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To let you know, #206 will change these options.
Currently, the configurations are passed via hydra from yaml files. This PR adds structured configs (or schemas) and basic config validation via Pydantic base models.
Some advantageous are:
Main changes are:
Anemoi-training config validate config_name
For developers:
If you make changes to the configs, these need to be represented in the structured configs/schemas.
This still work in progress, but I wanted to get feedback on e.g. where important validations are missing.
📚 Documentation preview 📚: https://anemoi-training--157.org.readthedocs.build/en/157/