-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add basic enum support. #82
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this!
We'll need to add docs for this behavior. I can polish them, but if you can at least get them started, that would be appreciated.
genson/schema/node.py
Outdated
active_strategy = self._active_strategies[0] | ||
return active_strategy | ||
if kind == "schema": | ||
for special_strategy in [Enum, Typeless]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SchemaNode
shouldn't need to be modified as Enum
doesn't need to (in fact shouldn't) be a special strategy. Just add it to BASIC_SCHEMA_STRATEGIES
instead at the head of the list.
This will mean that enum
and type
will never exist in the same schema even though they technically could, but can add a disclaimer for that in the docs. You will only end up with an enum
schema if you start with (or add) one, so presumably most users who try this will know what they're doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rolled back all the changes to node.py and added Enum into the BASIC_SCHEMA_STRATEGIES. Adding the Enum at the head of the list breaks the tests. All the tests pass if the Enum is added at the tail of the list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I realize now that I misunderstood the complexity of this. Getting enum
to behave the way that I said in my earlier comments would actually break backwards compatibility. This is because there's no distinction between matching when you do vs. don't have an existing schema, and this actually would require different behavior in each case.
So your original implementation is correct, including the throwing exceptions, and possibly even designating this as a "special" strategy, though I think it's still probably cleaner just to put it at the end of the list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think there is a way, but it's a bit hacky:
Basically, you create a @classmethod
match_object()
that always returns False
. But there's another instance method _instance_match_object()
that always returns True
. In __init__()
, you reassign self.match_object = self._instance_match_object()
.
That way you can get different behavior if the SchemaStrategy already exists vs. not and so there's no way an Enum
will get created unless it was explicitly asked for by a schema, and this behavior is not order- or type-dependent, so you can give enum
preference over other strategies and raise an error if its misused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea. Done.
test/test_seed_schema.py
Outdated
@@ -64,3 +64,11 @@ def test_keeps_unrecognized_properties(self): | |||
'properties': {'a': {'type': 'boolean'}}, | |||
'patternProperties': {r'^\d$': {'type': 'integer'}}, | |||
'required': ['a']}) | |||
|
|||
def test_enum(self): | |||
self.add_schema({'type': 'object', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could simplify this test by making enum
the top-level schema.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
test/test_seed_schema.py
Outdated
@@ -64,3 +64,11 @@ def test_keeps_unrecognized_properties(self): | |||
'properties': {'a': {'type': 'boolean'}}, | |||
'patternProperties': {r'^\d$': {'type': 'integer'}}, | |||
'required': ['a']}) | |||
|
|||
def test_enum(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll want to add another test that checks what happens when complex objects are added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Okay, extra credit idea. You don't have to do this, but I might add it after you're done. A problem with this implementation is that if given a schema that contains enum and a type, one or the other will be discarded. This could be fixed by having the This might not be a good idea if we expect there to be several ways to combine different schemas like this, but as far as I can tell, this is the only such keyword. Any other combining keywords are specific to one or two schema types and don't cut across all of them like |
It seems that I am done with all of the comments received so far. I would prefer if you could implement "enum and type" case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again! I've left a few comments. They're all small and hopefully don't seem too nitpicky, but there are a couple of things that look like bugs that I wanted to point out.
test/sort.py
Outdated
@@ -0,0 +1,39 @@ | |||
from sys import intern | |||
|
|||
class Py2Key: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does "Py2" come from in the name? I think it's best to use something more descriptive such as "MultiTypeSortKey"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the name as suggested.
genson/schema/node.py
Outdated
@@ -137,4 +137,4 @@ def _get_strategy_for_(self, kind, schema_or_obj): | |||
# no match found, raise an error | |||
raise SchemaGenerationError( | |||
'Could not find matching schema type for {0}: {1!r}'.format( | |||
kind, schema_or_obj)) | |||
kind, schema_or_obj)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This codebase uses the default flake8 styles, which includes a newline at EOF. Please run flake8
to find and fix formatting issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed linter errors.
genson/schema/strategies/enum.py
Outdated
# and this behavior is not order- or type-dependent. | ||
self.match_object = self._instance_match_object | ||
|
||
@classmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this also be @staticmethod
(or match_schema
be @classmethod
) for consistency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to @staticmethod
def __lt__(self, other): | ||
try: | ||
return self.value < other.value | ||
except TypeError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the case that self
and other
are of the same type, but that type is not sortable? Maybe we don't expect to encounter that (at least not in tests)? If so, please document this behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comment as suggested.
return self.typestr < other.typestr | ||
|
||
|
||
def sort_lists_in_schema(schema, sorted_key): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is only used in one place and always uses the same sorted_key
, why take it as a parameter? Why not just hardcode it and make the interface simpler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the parameter as suggested.
genson/schema/strategies/enum.py
Outdated
if item_type in [bool, str, int, float]: | ||
self._enum.add(item) | ||
elif item is None: | ||
self._enum.add("null") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, this adds a string "null", which is not the same as the original object. JSON.dumps
will handle the conversion from "None" to "null" for you. Just add this to the scalars list and remove this if case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to scalars as suggested and refactored this code.
test/test_seed_schema.py
Outdated
|
||
def test_enum_scalar_list(self): | ||
self.add_schema({"enum": []}) | ||
self.add_object(["123", 1, 1.2, True, None]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this raise an error since it's a list
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. Refactored this code.
test/test_seed_schema.py
Outdated
self.add_schema({"enum": []}) | ||
self.add_object(["123", 1, 1.2, True, None]) | ||
self.assertResult( | ||
{"enum": ["123", 1, 1.2, "null"]}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To get this result, you should have to either start with this schema or add each object individually. To properly test, you should probably do a combination of the two: start with 2 or 3 of the items and then add the rest, making sure you add at least one of each type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the test as suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a couple suggestions left, but lgtm at this point.
Once this is merged, I'll update the readme and add a couple other fixes before releasing the next minor version.
@@ -64,3 +64,21 @@ def test_keeps_unrecognized_properties(self): | |||
'properties': {'a': {'type': 'boolean'}}, | |||
'patternProperties': {r'^\d$': {'type': 'integer'}}, | |||
'required': ['a']}) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why this didn't occur to me before, but there should be some test that adds the same item multiple times to make sure it gets deduped.
# Add only scalar types. Technically, the JSON-Schema spec allows | ||
# any type in an enum list, but using objects and lists is a very | ||
# rare use-case. | ||
if obj is not None and type(obj) not in [bool, str, int, float]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplified:
if obj is not None and type(obj) not in [bool, str, int, float]: | |
if not isinstance(obj, (bool, str, int, float, type(None)): |
Please review this pull request to close the issue of Support constants in JSON schema #81