Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First tuple element type not being checked correctly #41

Open
Yannik opened this issue Aug 22, 2024 · 11 comments
Open

First tuple element type not being checked correctly #41

Yannik opened this issue Aug 22, 2024 · 11 comments

Comments

@Yannik
Copy link

Yannik commented Aug 22, 2024

I have the following code:

import type_enforced

@type_enforced.Enforcer
def test() -> tuple[list[int],list[str]]:
    return ['a'], ['b']

print(test())

I would expect this to throw a TypeError. It however, does not throw an exception, but returns normally and prints (['a'], ['b']).

With method signature def test() -> tuple[list[str],list[int]]: (first element type is correct, second element type is incorrect), the TypeError is correctly thrown.

I'm using type_enforced 1.5.0.

@connor-makowski
Copy link
Owner

Hey @Yannik thanks for creating an issue.

The way that type enforced works is a bit different than your expectations. In this case, you are checking that your output is either a tuple of list of strings or a tuple of list of integers. The comma operates as an OR statement. Type enforced also does not check for length in this case. Since you specify a list of strings or list of integers: if you passed a tuple with a list of mixed ints and strs, you would get an error.

Now to achieve exactly what you are looking for, you could always use a custom constraint to validate that you have a tuple of length 2 composed of lists. See: https://github.com/connor-makowski/type_enforced?tab=readme-ov-file#validate-with-constraints.

Let me know if I can add any better clarity there.

@Yannik
Copy link
Author

Yannik commented Aug 22, 2024

@connor-makowski Okay, got it. That's too bad!

What I don't understand from response is why an error is thrown in the second case (tuple[list[str],list[int]] instead of tuple[list[int],list[str]]). If it's just OR'ed, that shouldn't happen, should it?

@Yannik
Copy link
Author

Yannik commented Aug 22, 2024

@connor-makowski I just tried to write a CustomConstraint for validating that the 2 values are returned from a function, the first one a list[str] and the second one a list[int]:

import typing
from type_enforced.utils import GenericConstraint

CustomConstraint = GenericConstraint(
    {
        'asdf': lambda x: isinstance(x, tuple) and len(x) == 2 and isinstance(x[0],list) and all(isinstance(s, str) for s in x[0]) and isinstance(x[1],list) and all(isinstance(s, int) for s in x[1]),
    }
)


@type_enforced.Enforcer
def test() -> [CustomConstraint]:
    return ['a'], [1]

Is there any better way to do this? That's so hard to read :D

@Yannik
Copy link
Author

Yannik commented Aug 22, 2024

I read some more on this topic, for example this part of the mypy docs: https://mypy.readthedocs.io/en/stable/kinds_of_types.html#tuple-types

They say: The type tuple[T1, ..., Tn] represents a tuple with the item types T1, …, Tn:.

And show the following example:

def f(t: tuple[int, str]) -> None:
    t = 1, 'foo'    # OK
    t = 'foo', 1    # Type check error

I found many more sources supporting that a tuple[a,b] type hint means the the first tuple element is of type a, and the second one is of type b.

However, if I understood you correctly, type_enforced handles this differently, so that tuple[a,b] does not verify the tuple length at all, and allows a/b for any type of the tuple? And will as such not throw an error on the second case in the mypy example?

This seems so confusing to me! Is there a special reason for this?

@connor-makowski
Copy link
Owner

@Yannik These are great comments. Let me address them appropriately.

"What I don't understand from response is why an error is thrown in the second case (tuple[list[str],list[int]] instead of tuple[list[int],list[str]]). If it's just OR'ed, that shouldn't happen, should it?" The order does not matter here. I was referring to the fact that a tuple like (['a','b'],[1,2]) would pass validation but (['a',1],[1,2]) would not. To allow mixed lists to pass, you could simply pass tuple[list[int|str]] or tuple[list[int,str]]

Nice job with the custom constraint. Here is how I might do it:

import type_enforced
from type_enforced.utils import GenericConstraint

# Option 1
def myCustomDataStructCheck(a:tuple[list[int,str]]):
    if len(a) != 2:
        return False
    if len(a[0]) != 1:
        return False
    if len(a[1]) != 1:
        return False
    if not isinstance(a[0][0], int):
        return False
    if not isinstance(a[1][0], str):
        return False
    return True

ConstraintExampleA = GenericConstraint(
    {
        'myDataStruct': myCustomDataStructCheck
    }
)

# Option 2
# Alternatively, you can use a more comprehensive way to define the constraint
# Note this will provide better error messages as the offending key names will be used in the error message
ConstraintExampleB = GenericConstraint(
    {
        'tuple_len_2': lambda x: len(x) == 2,
        'list1_len_1': lambda x: len(x[0]) == 1,
        'list2_len_1': lambda x: len(x[1]) == 1,
        'list1_is_int': lambda x: isinstance(x[0][0], int),
        'list2_is_str': lambda x: isinstance(x[1][0], str)
    }
)

# Note that all regular checks happen before constraints so you can still use normal type hints
# prior to adding your constraints.
# See: https://github.com/connor-makowski/type_enforced?tab=readme-ov-file#validate-with-constraints
@type_enforced.Enforcer()
def test() -> [tuple[list[int|str]], ConstraintExampleB]:
    return ([1], ['a'])

test()

tuple[a, b]: Back story: Type hints are loosely defined by python. Before python3.10, the | operator was not supported in bracket enclosed type hints. To allow certain type combinations to compile on the fly in pure python at the time, I had to take certain liberties with what was allowed in python and I had to find a way to pull it out of the function. Mypy is actually a different python runtime that can allow for more complex scenarios. In my case, I wanted pure standard python with easily toggable type checking. So I had to take advantage of what was able to be run without being flagged in standard python.

EG: To allow multiple types as ors, we used the accepted comma separated bracket enclosed list. To allow for a list of strings of undefined length, list[str] made sense. Now what about strings or ints: list[int,str] which can now be also written as list[int|str]. Now list[str] makes more sense to be a list of strings of undefined length over a list of 1 string.

@Yannik
Copy link
Author

Yannik commented Aug 23, 2024

The order does not matter here. I was referring to the fact that a tuple like (['a','b'],[1,2]) would pass validation but (['a',1],[1,2]) would not. To allow mixed lists to pass, you could simply pass tuple[list[int|str]] or tuple[list[int,str]]

I think we still have a misunderstanding here.

To me it seems like the order does matter.

This does not throw an error:

import type_enforced

@type_enforced.Enforcer
def test() -> tuple[list[int],list[str]]:
   return ['a'], ['b']

test()

This does:

import type_enforced

@type_enforced.Enforcer
def test2() -> tuple[list[str],list[int]]:
   return ['a'], ['b']
   
test()

The only difference here is the order of list[int] and list[str].

@Yannik
Copy link
Author

Yannik commented Aug 23, 2024

tuple[a, b]: Back story: Type hints are loosely defined by python. Before python3.10, the | operator was not supported in bracket enclosed type hints. To allow certain type combinations to compile on the fly in pure python at the time, I had to take certain liberties with what was allowed in python and I had to find a way to pull it out of the function. Mypy is actually a different python runtime that can allow for more complex scenarios. In my case, I wanted pure standard python with easily toggable type checking. So I had to take advantage of what was able to be run without being flagged in standard python.

Understood.

EG: To allow multiple types as ors, we used the accepted comma separated bracket enclosed list. To allow for a list of strings of undefined length, list[str] made sense.
Now what about strings or ints: list[int,str] which can now be also written as list[int|str]. Now list[str] makes more sense to be a list of strings of undefined length over a list of 1 string.

I would expect the following behavior:

# List/set of items of type int
x: list[int] = [1]
x: set[int] = {6, 7}

# Dict of keys of type str, values of type float
x: dict[str, float] = {"field": 2.0} 

# Tuple of fixed size with first element int, second element str, third element float
x: tuple[int, str, float] = (3, "yes", 7.5)

# Tuple of variable size containing elements of type int
x: tuple[int, ...] = (1, 2, 3) 

This is exactly how all static type checkers I know work (including mypy) and how I would usually expect type-hints to function, were they enforced by the language.

Since we have | available for unions in python 3.10+ (and typing.Union[a,b] in earlier versions), wouldn't it be possible to now exclusively use this for unions and use the comma to specify fixed size tuple element types?

@connor-makowski
Copy link
Owner

@Yannik those are good points.

Regarding different results based on order, I'll have to look into that. If so, I would expect a new minor version sometime soon.

Regarding changing the way nested structures (like tuples of specific length) work, I want to look into that more. This would require a major version change with the package since it is a breaking change. I want to do some homework here before proceeding. I'm not sure on a timeframe for these changes.

On a side note, it's great to have you using type_enforced. I designed it to easily enforce types in python webservers on the cloud. I open sourced it on pypi because it made server setup super easy. I am however curious as to why you are using it over mypy. Could you let me know why you are using this over another package? it would help me to better design it for other similar users.

@Yannik
Copy link
Author

Yannik commented Aug 23, 2024

Regarding changing the way nested structures (like tuples of specific length) work, I want to look into that more. This would require a major version change with the package since it is a breaking change. I want to do some homework here before proceeding. I'm not sure on a timeframe for these changes.

Makes sense! Thank you for looking into it.

On a side note, it's great to have you using type_enforced. I designed it to easily enforce types in python webservers on the cloud. I open sourced it on pypi because it made server setup super easy. I am however curious as to why you are using it over mypy. Could you let me know why you are using this over another package? it would help me to better design it for other similar users.

Backstory: I am not a python programmer by trade, I mostly write one-file functional scripts or ansible modules to use in my infrastructure as code/devops work. The dynamically typed nature of python was never really a problem for this, however, for a bigger project I am currently working on (actually a web-api!) I wanted something more robust.
I have programmed in lots of statically typed languages before, as well as hybrid languages like php (with declare(strict_types=1)) and I had seen python code with typehints before, so I just started using them. I quickly found out that they merely serve as documentation and do actually do any typechecking! That got me searching for a solution, and your project came up as one of the first results. Then I tried it out and ran into the unexpected tuple-behavior that got me opening up this issue :D

@connor-makowski
Copy link
Owner

Thanks for the backstory. This is helpful.

I was able to identify the issue with two different nested types. EG:

import type_enforced

@type_enforced.Enforcer
def test(
    x: [list[str]|list[int]]
) -> None:
    pass

test([1, 2, 3])  # Passes
test(["a", "b", "c"])  # Should pass but fails

To follow up on this, I am not entirely sure how I want to proceed here. This may take more time to solve than I was expecting.

For internal use:

This is related to how we build acceptable dicts of type values. See:

https://github.com/connor-makowski/type_enforced/blob/main/type_enforced/enforcer.py#L87

The solution for checking multiple overlapping (collision) nested types as or statements cleanly is not immediately apparent. Current thinking on how to solve this require reworking the type checking data structure. This, however, would convolute the code and would substantially slow runtime speed.

As a quick fix, consider firing a warning when this occurs. This should more appropriately be an error in the next major version if the larger issue is not fixed.

@connor-makowski
Copy link
Owner

Thanks for the backstory. This is helpful.

I was able to identify the issue with two different nested types. EG:

import type_enforced

@type_enforced.Enforcer
def test(
    x: [list[str]|list[int]]
) -> None:
    pass

test([1, 2, 3])  # Passes
test(["a", "b", "c"])  # Should pass but fails

To follow up on this, I am not entirely sure how I want to proceed here. This may take more time to solve than I was expecting.

For internal use:

This is related to how we build acceptable dicts of type values. See:

https://github.com/connor-makowski/type_enforced/blob/main/type_enforced/enforcer.py#L87

The solution for checking multiple overlapping (collision) nested types as or statements cleanly is not immediately apparent. Current thinking on how to solve this require reworking the type checking data structure. This, however, would convolute the code and would substantially slow runtime speed.

As a quick fix, consider firing a warning when this occurs. This should more appropriately be an error in the next major version if the larger issue is not fixed.

Fixed in 1.6.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants