Support user-defined format in string schemas #29
Comments
Hi and sorry for a late reply :) With a global registry, it might be easier to use by 3rd party tools like Schemathesis. E.g. users will be able to use
Hm, for Schemathesis it means that if there is a custom format in the schema, it will fail on any affected endpoint. That may actually help to find spots where we can generate data better suited than random strings. But I see a possible downside: fewer errors might be discovered. We actually found a few flaws in our input validators using data that didn't fit the specified format. With the validation, we'll have to write a strategy for data generation, and the previously discovered cases will no longer be generated. Also, it might slow down error discovery: e.g. if there are two fields, A (without format) and B (with format), and there is an error that depends on A's value, then we won't discover this error until a proper format strategy is specified. That might be hard for users who are not experienced with Hypothesis and use Schemathesis as a zero-config tool. But I do feel that it makes the internal logic more explicit in both packages, and the cases I mentioned above could also be handled explicitly on the Schemathesis side, e.g. by passing some "default" format strategy for unknown formats (this behavior may be configurable as well). Also, since the scope of So, in short, I think passing format strategies explicitly + validation might be the most explicit and clean approach :)
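The "default" format strategy idea mentioned above could look roughly like the following. This is only a sketch in plain Python, not code from Schemathesis or hypothesis-jsonschema: `collect_formats` and `with_default_formats` are hypothetical helpers, and a "strategy" is modelled as a zero-argument callable rather than a real Hypothesis strategy. The schema walk is deliberately naive and would also pick up a string-valued "format" key nested inside `enum` or `default` values.

```python
from typing import Any, Callable, Dict, Set

# Stand-in for this sketch: a "strategy" is a zero-argument callable that
# returns a string. In the real packages this would be a Hypothesis strategy.
Strategy = Callable[[], str]


def collect_formats(schema: Any) -> Set[str]:
    """Naively collect every string-valued "format" keyword in a schema."""
    found: Set[str] = set()
    if isinstance(schema, dict):
        fmt = schema.get("format")
        if isinstance(fmt, str):
            found.add(fmt)
        for value in schema.values():
            found |= collect_formats(value)
    elif isinstance(schema, list):
        for item in schema:
            found |= collect_formats(item)
    return found


def with_default_formats(
    schema: Any,
    explicit: Dict[str, Strategy],
    builtin: Set[str],
    default: Strategy,
) -> Dict[str, Strategy]:
    """Return `explicit` plus `default` for every other non-builtin format."""
    formats = dict(explicit)
    for name in collect_formats(schema) - builtin:
        # Formats the user already supplied keep their own strategy.
        formats.setdefault(name, default)
    return formats
```

With a helper like this, a zero-config tool could keep generating arbitrary strings for formats it has never seen, while still letting users opt in to stricter strategies one format at a time.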
Hmm, some examples might help:

# Imagine that we've registered a validator for the format "hello",
# for which the only valid value is "world", and no other formats.
>>> FORMATS: Dict[str, SearchStrategy[str]] = {"hello": st.just("world")}

# If we pass a strategy for a custom format, it gets used
>>> from_schema({"type": "string", "format": "hello"}, custom_formats=FORMATS)
st.just("world")

# If the schema contains an unknown format, ignore the format keyword (status quo)
>>> from_schema({"type": "string", "format": "something else"})
st.text()

# It's an error to pass a strategy for a standard format
>>> from_schema(..., custom_formats={"date": ...})
Traceback: ...
InvalidArgument("Cannot pass a custom format for `date`, "
                "because it is part of the jsonschema standard.")

# It's an error to pass a strategy for a custom format unless a
# validator for that format is registered on the draft(4,6,7) validator
# See https://python-jsonschema.readthedocs.io/en/stable/validate/#validating-formats
>>> from_schema(..., custom_formats={"foo": ...})
Traceback: ...
InvalidArgument("A strategy was passed for the custom format `foo`, "
                "but no checker is registered for this format.")

# Error if generated value for custom format doesn't validate
>>> from_schema(
...     {"type": "string", "format": "hello"},
...     custom_formats={"hello": st.just("an invalid string")}
... ).example()
Traceback: ...
InvalidArgument("Got 'an invalid string' from custom_formats['hello']=..., "
                "but this value does not conform to the format checker.")

And this shouldn't actually be too hard to implement, so long as we're careful to thread the
Aha, so you propose to connect format strategies with jsonschema's registered checks, right? Sounds interesting to me, I didn't think about it initially :) I like that it will add some extra validation on user-defined strategies. Thank you, Zac, your examples are very helpful and from my pov reflect a very clean and reliable approach :)
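The connection between format strategies and registered checks discussed here can be sketched in plain Python. This is an illustration under stated assumptions, not the hypothesis-jsonschema implementation: `STANDARD_FORMATS` is a small made-up subset of the standard format names, `validate_custom_formats` and `draw_checked` are hypothetical helpers, and strategies and checkers are modelled as plain callables. The `InvalidArgument` class here only borrows its name from the error in the examples.

```python
from typing import Callable, Dict

# Assumption: a small subset of the format names defined by the JSON Schema
# drafts; the real list is longer.
STANDARD_FORMATS = frozenset({"date", "date-time", "email", "ipv4", "ipv6", "uri"})

# Stand-ins for this sketch: a "strategy" is a zero-argument callable that
# returns a string, and a "checker" is a predicate on strings.
Strategy = Callable[[], str]
Checker = Callable[[str], bool]


class InvalidArgument(ValueError):
    """Hypothetical error type, named after the one in the examples."""


def validate_custom_formats(
    custom_formats: Dict[str, Strategy], checkers: Dict[str, Checker]
) -> None:
    """Reject standard formats and formats that have no registered checker."""
    for name in custom_formats:
        if name in STANDARD_FORMATS:
            raise InvalidArgument(
                f"Cannot pass a custom format for `{name}`, "
                "because it is part of the jsonschema standard."
            )
        if name not in checkers:
            raise InvalidArgument(
                f"A strategy was passed for the custom format `{name}`, "
                "but no checker is registered for this format."
            )


def draw_checked(name: str, strategy: Strategy, checker: Checker) -> str:
    """Draw one value and confirm that it conforms to the format checker."""
    value = strategy()
    if not checker(value):
        raise InvalidArgument(
            f"Got {value!r} from custom_formats[{name!r}], "
            "but this value does not conform to the format checker."
        )
    return value
```

Doing the first two checks once, when the mapping is passed in, and the third check per generated value mirrors the order of the errors in the examples.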
Yep, that's right!
Untested prototype: master...custom-formats - @Stranger6667 I'd love to know how this would work for you!
@Zac-HD Seems excellent to me :)
@Stranger6667 - I've rebased, finished, and tested this 🎉 Can you
Hi @Zac-HD ! Thank you for working on this! I am thinking about a couple of use cases where additional
In this case, there is no dependency on
Even though In general, I like the idea of extra validation, but it seems to me that it might negatively affect the experience of the users outside of
That makes sense. Let's relax the requirement a little: still use the registered checker if there is one, but allow custom formats without a registered checker too.
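The relaxed rule might be sketched as follows. Again this is plain Python under assumptions rather than the shipped code: `draw_format_value` is a hypothetical helper, strategies are zero-argument callables, and checkers are predicates on strings.

```python
from typing import Callable, Dict

# Stand-ins for this sketch, not real Hypothesis or jsonschema objects.
Strategy = Callable[[], str]
Checker = Callable[[str], bool]


def draw_format_value(
    name: str,
    custom_formats: Dict[str, Strategy],
    checkers: Dict[str, Checker],
) -> str:
    """Draw a value for `name`, validating only if a checker is registered."""
    value = custom_formats[name]()
    checker = checkers.get(name)
    # Relaxed rule: a missing checker is no longer an error.
    if checker is not None and not checker(value):
        raise ValueError(
            f"Got {value!r} from custom_formats[{name!r}], "
            "but this value does not conform to the format checker."
        )
    return value
```

The only behavioural difference from the stricter proposal is the `checker is not None` guard: formats without a registered checker are generated as-is.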
Sounds awesome! :)
Alright! I've added the feature in 61e0cdf, and shipped it as version 0.17.0 🎉
Great, thanks! I will take a look at integrating the new version into Schemathesis :)
Thanks to @Stranger6667 in schemathesis/schemathesis#337 (comment):
This is definitely in-scope for hypothesis-jsonschema, because user-defined formats are explicitly allowed by the spec! The options for this are to either add an argument, so that the API looks like
or to maintain a global registry of known formats and strategies. In either case I'd want to raise an error if we ever try to generate a format value for which jsonschema does not have a known validator.
At the moment I'm leaning towards global state, with validation and the caveat that users can't override formats defined in the spec, but feedback would be most welcome.