-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pydantic errors are overwhelming in convenience functions #356
Comments
Also using discriminated unions will decrease number of errors, with the caveat that the discriminated field has to be included when initialising the models. see toy example below: from typing import Literal, Union
from pydantic import BaseModel, Field
class A(BaseModel):
type: Literal["a"] = "a"
a_param: Literal["x", "y"]
class B(BaseModel):
type: Literal["b"] = "b"
b_param: int
class Parent(BaseModel):
nested: Union[A, B]
class ParentDiscriminated(BaseModel):
nested: Union[A, B] = Field(discriminator="type") Example errors without discriminator: >>> Parent(nested={"type": "a", "a_param": "z"})
ValidationError: 3 validation errors for Parent
nested.A.a_param
Input should be 'x' or 'y' [type=literal_error, input_value='z', input_type=str]
For further information visit https://errors.pydantic.dev/2.10/v/literal_error
nested.B.type
Input should be 'b' [type=literal_error, input_value='a', input_type=str]
For further information visit https://errors.pydantic.dev/2.10/v/literal_error
nested.B.b_param
Field required [type=missing, input_value={'type': 'a', 'a_param': 'z'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.10/v/missing Example error with discriminator: >>> ParentDiscriminated(nested={"type": "a", "a_param": "z"})
ValidationError: 1 validation error for ParentDiscriminated
nested.a.a_param
Input should be 'x' or 'y' [type=literal_error, input_value='z', input_type=str]
For further information visit https://errors.pydantic.dev/2.10/v/literal_error However the trade off is pydantic will now not infer the classes from the other parameters, e.g.: >>> ParentDiscriminated(nested={"a_param": "x"})
ValidationError: 1 validation error for ParentDiscriminated
nested
Unable to extract tag using discriminator 'type' [type=union_tag_not_found, input_value={'a_param': 'x'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.10/v/union_tag_not_found Whereas, without the discriminated union: >>> Parent(nested={"a_param": "x"})
Parent(nested=A(type='a', a_param='x')) |
And the adapter: TypeAdapter = TypeAdapter(
Anotated[Union[N2VConfiguration, N2NConfiguration, CAREConfiguration], Field(discriminator="type")]
) We might not need to add a new discriminator field to the |
That's a very good idea!! At least for the Algorithm configuration it comes naturally as you pointed out. Also, Regarding the convenience functions, I still feel that they should explicitly return their configuration (`create_n2v_config(...) -> N2VConfiguration), and in that sense it would make sense to do the following: def _create_config(...) -> tuple[GeneralAlgorithmConfiguration, DataConfiguration, TrainingConfiguration]:
...
def create_n2v_configuration(...) -> N2VConfiguration:
...
algo, data, train = _create_config(...)
return N2VConfiguration(
experiment_name="",
algorithm_config=algo,
data_config=data,
trainong_config=train
) I am currently (happy to change my mind!) not a big fan of adding yet another entry that says |
Having each convenience function return the correct configuration makes sense to me! Just as a note if the discriminated field is set by default then users (and elsewhere in the code) never have to interact with it e.g.: class N2VConfiguration(BaseModel):
experiment_name: str
# ...
# other params
# ...
configuration_type: Literal["n2v"] = "n2v" >>> N2VConfiguration(experiment_name="demo") # don't set configuration type
N2VConfiguration(experiment_name='demo', configuration_type='n2v') Also what I meant in my previous comment was: if the algorithm will always be different for different configurations it might be possible to use the adapter: TypeAdapter = TypeAdapter(
Anotated[
Union[N2VConfiguration, N2NConfiguration, CAREConfiguration],
Field(discriminator="algorithm_config")
]
) |
Unfortunately, and probably for reasons, discriminator fields need to be An alternative in that direction is to use |
Ah that's cool, might be a good solution! But I guess we wait for the N2V PR before trying to implement anything. |
So.... I couldn't wait to test it, then almost everything was already done, so I went ahead. Since I worked on the @CatEek's PR, I am fairly certain the two PRs are compatible. Three changes to solve this issue:
|
Ok but if |
Hhm that's a very good point indeed. We need to think a little bit what is going to happen with the LVAE-based algorithms. Because there again the data might have requirement for different data. I will open a new issue and let's discuss it there! |
Waiting for #365 to close |
Problem
Since #344, the Pydantic errors caused in the convenience functions are very long and are drowning the real error into a lot of useless (but consequential) errors.
Reproducing the errors
Leads to the following errors:
The reason is that Pydantic will try to validate the overall configuration against all algorithm (N2V, CARE and N2N). The error we should be focusing on is the error coming from the N2V validation, but Pydantic also validates against CARE and N2N, which yields additional validation errors due to the incompatibility between the different algorithms, losses, transforms etc..
Solutions
1 - Avoid using
configuration_factory
in the convenience functionSince the convenience functions know which algorithm to create, they could simply call the specific one and not use the factory. That would require rewriting a bit the supervised convenience functions.
Advantage is that it clarifies the convenience functions output type.
2 - Filter errors
We could also keep the code base as is, the configuration factory being a one-liner, and wrap it in a
try ... catch
and filter the errors unrelated to the specific configuration. Less efficient, but maybe smaller foot print.Probably too hacky.
3 - Use Pydantic
Discriminator
andField(discriminator=...)
The proper way to do it with Pydantic. It might make reading a bit more complicated, but it is the most elegant and secure solution! (thanks @melisande-c)
The text was updated successfully, but these errors were encountered: