Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggest async with when with finds no __enter__/__exit__ #128398

Open
gvanrossum opened this issue Jan 1, 2025 · 17 comments
Open

Suggest async with when with finds no __enter__/__exit__ #128398

gvanrossum opened this issue Jan 1, 2025 · 17 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@gvanrossum
Copy link
Member

gvanrossum commented Jan 1, 2025

Bug report

Bug description:

(This is not an asyncio bug! I am just using asyncio.TaskGroup() as an example.)

import asyncio
def foo():
    with asyncio.TaskGroup() as g:  # BUG: should be `async with`
        pass

This currently gives an error ending in

TypeError: 'TaskGroup' object does not support the context manager protocol (missed __exit__ method)

That's not very clear about what's wrong. Maybe when issuing this TypeError we could check if the object supports __aexit__ and __aenter__, and if so, suggest something like "maybe try async with ?".

CPython versions tested on:

3.12, 3.13, 3.14

Operating systems tested on:

No response

@gvanrossum gvanrossum added type-bug An unexpected behavior, bug, or error type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Jan 1, 2025
@gvanrossum gvanrossum changed the title Hint about async with when with finds no __enter__/__exit__ Suggest async with when with finds no __enter__/__exit__ Jan 1, 2025
@graingert
Copy link
Contributor

graingert commented Jan 1, 2025

In trio extra __enter__ and __exit__ methods are added that raise a nicer exception message:

https://github.com/python-trio/trio/blob/main/src/trio/_core/_run.py#L1057

It's a bit of a hack, and they have to be hidden from the type checking

It would be nice to be able to avoid this

@TeamSpen210
Copy link

Might be good to also do this with iteration, and the other direction (async used on a sync object)?

@gvanrossum
Copy link
Member Author

Someone should look in the source code and see how easy this would be to implement first. Possibly a good first issue, but possibly not (I'm pretty sure that error comes from C code).

@picnixz picnixz added stdlib Python modules in the Lib dir interpreter-core (Objects, Python, Grammar, and Parser dirs) and removed stdlib Python modules in the Lib dir labels Jan 1, 2025
@picnixz
Copy link
Member

picnixz commented Jan 1, 2025

(I'm pretty sure that error comes from C code).

Yes, that's why I removed the stdlib and used the other label (my bad for this one). It's here:

cpython/Python/ceval.c

Lines 375 to 379 in a327810

[SPECIAL___ENTER__] = {
.name = &_Py_ID(__enter__),
.error = "'%.200s' object does not support the "
"context manager protocol (missed __enter__ method)",
},
.

@ZeroIntensity
Copy link
Member

I've provisionally marked this as "easy", as we only need to change an error, but there's a very good chance that it's not. It might be a good issue for anyone wanting to learn about the interpreter core, though!

@picnixz
Copy link
Member

picnixz commented Jan 2, 2025

I'm not sure it's really easy as the error is used here:

cpython/Python/bytecodes.c

Lines 3163 to 3166 in 58e9f95

_PyErr_Format(tstate, PyExc_TypeError,
_Py_SpecialMethods[oparg].error,
Py_TYPE(owner_o)->tp_name);
}

(what's hard here is understanding how bytecode.c is written as it's not true C code)

@abhijeetsharma200
Copy link
Contributor

Hey, I would like to take up this issue. I am going to put a breakpoint in the file mentioned by @picnixz to follow trace of the error being as I can't find a reference to _Py_SpecialMethods in the docs. I would be happy to hear any suggestions on a better way to approach this! Thanks

@ZeroIntensity
Copy link
Member

Go for it! That sounds like a fine way to approach it to me.

@picnixz
Copy link
Member

picnixz commented Jan 2, 2025

You won't have anything in the docs I think. Those are essentially internal stuff. You'll likely need to modify bytecode.c (see #128398 (comment)) and run make regen-all to regenerate related files. You should also grep _Py_SpecialMethods to locate other occurrences and decide whether they need to be modified (though I'm not sure they should).

A solution is to add a new error field that you could use if the oparg satisfies other properties:

 [SPECIAL___ENTER__] = { 
     .name = &_Py_ID(__enter__), 
     .error = "'%.200s' object does not support the " 
              "context manager protocol (missed __enter__ method)", 
     .suggestion = "%T object does [...]"
 }, 

@abhijeetsharma200
Copy link
Contributor

image

Hi, so I started of by trying to put a break point on line 374 but for some reason(I would love to know why) gdb does not want to put a breakpoint there and moves it to line 412. I have also tried searching for _Py_SpecialMethod and _Py_SpecialMethods in gdb without any result. The I am working is to use backtrace from the return of main and try to find where the error get populated. But that seems like a very inefficient way to go about it. I would appreciate any help on going about doing this the right way. Thanks!

@picnixz
Copy link
Member

picnixz commented Jan 3, 2025

gdb does not want to put a breakpoint there and moves it to line 412

Likely because it's at the end the static array declaration.


Instead of doing it using GDB, here's what you should do:

  • Open Python/bytecode.c. The file is a "fake" C file.
  • Go to line 3163. The usage of _Py_SpecialMethods is as follows:
     _PyErr_Format(tstate, PyExc_TypeError, 
                   _Py_SpecialMethods[oparg].error, 
                   Py_TYPE(owner_o)->tp_name); 

Here, oparg is the special method (SPECIAL___ENTER__) while owner_o is the current object (in Guido's example, this would be the variable g).

Now, let's have a look at the entire "block":

inst(LOAD_SPECIAL, (owner -- attr, self_or_null)) {
    assert(oparg <= SPECIAL_MAX);
    PyObject *owner_o = PyStackRef_AsPyObjectSteal(owner);
    PyObject *name = _Py_SpecialMethods[oparg].name;
    PyObject *self_or_null_o;
    PyObject *attr_o = _PyObject_LookupSpecialMethod(owner_o, name, &self_or_null_o);
    if (attr_o == NULL) {
        if (!_PyErr_Occurred(tstate)) {
            _PyErr_Format(tstate, PyExc_TypeError,
                          _Py_SpecialMethods[oparg].error,
                          Py_TYPE(owner_o)->tp_name);
        }
        ERROR_IF(true, error);
    }
    attr = PyStackRef_FromPyObjectSteal(attr_o);
    self_or_null = self_or_null_o == NULL ?
        PyStackRef_NULL : PyStackRef_FromPyObjectSteal(self_or_null_o);
}

The LHS of -- is the list of inputs and the RHS of -- is the list of of outputs (you can read more in https://github.com/python/cpython/blob/main/Tools/cases_generator/interpreter_definition.md). Above, you have owner as an input and attr as an output. What we are doing is roughly equivalent to (C pseudocode):

static PyObject *
lookup_special_method(PyObject *owner, uint8_t oparg)
{
    PyObject *name = _Py_SpecialMethods[oparg].name;
    PyObject *dummy;
    PyObject *method = _PyObject_LookupSpecialMethod(owner, name, &dummy);
    if (method == NULL) {
        PyErr_Format(PyExc_TypeError, _Py_specialMethods[oparg].error, ...);
        return NULL;
    }
    return method;
}

Guido is suggesting that we change this to something like that:

static PyObject *
lookup_special_method(PyObject *owner, uint8_t oparg)
{
    PyObject *name = _Py_SpecialMethods[oparg].name;
    PyObject *dummy;
    PyObject *method = _PyObject_LookupSpecialMethod(owner, name, &dummy);
    if (method == NULL) {
        if (PyUnicode_EqualToUTF8(name, "__enter__") && supports_async_with(owner)) {
            // use an alternative suggestion if the object supports __aenter__ instead
            PyErr_Format(PyExc_TypeError, _Py_specialMethods[oparg].alterror, ...);            
            return NULL;
        }
        PyErr_Format(PyExc_TypeError, _Py_specialMethods[oparg].error, ...);
        return NULL;
    }
    return method;
}

The above is just a pseudo-code and there are still stuff to check (like supports_async_with, creating the "alt" error, etc). We should decide when in which case we want to print an alternative error. In my example above, I'm printing another error if I wanted to use __enter__ (i.e. with x:) but x actually supports async with x: (namely, x supports the asynchronous context manager protocol but not the synchronous one).

@picnixz picnixz removed the easy label Jan 3, 2025
@picnixz
Copy link
Member

picnixz commented Jan 3, 2025

(I'm removing the easy label because it's not that easy IMO; you need to navigate in a lot of internal parts, but we can definitely help you @abhijeetsharma200).

I would advise first focusing on the following questions:

  • when do we need to print that error? namely, how would you write the error handling in plain Python?
  • how do you modify the items in _Py_SpecialMethods? how does it affect other places? (if we are using alternative errors, then I think we would also store the alternative error format in this structure).
  • what kind of additional helpers do you need?

@abhijeetsharma200
Copy link
Contributor

Thanks, appreciate the detailed comments!

@abhijeetsharma200
Copy link
Contributor

abhijeetsharma200 commented Jan 4, 2025

Hey, I have thought about the questions. Here are my thoughts on the questions you posed:

when do we need to print that error? namely, how would you write the error handling in plain Python?

So from my understanding, we print the alterror whenever __enter__, or __exit__ or both are missing. Then we check if both __aenter__ and __aexit__ exist. This is done by supports_async_with function in my code.

Initially, I only checked for __enter__, assuming both methods would either exist together or not at all(same for the async case). However, the original error message posted by Guido says (missed __exit__ method). I tested that this is only possible if there is __enter__ and no __exit__. I don't quite understand why this is happening because TaskGroup does not have __enter__.

Moving on to my implementation of testing this, I added a check_against flag to see if we are checking for async/non-async cases mostly to make my python implementation more readable. I would like to know how this part is actually implemented in the code. Best I can guess is that oparg is set to look for __enter__ or __aenter__ somewhere upsteam? This also brings up if we should include a similar alterror if the object supports with x: when trying to use async with:.

def supports_async_with(owner):
    return hasattr(owner, "__aenter__") and hasattr(owner, "__aexit__")


def handle_with_error(owner, check_against):
    '''
    :param owner: Object being evaluated containing methods
    :param check_against: Context Manager ("cm") or Async Context Manager ("acm")
    '''
    missing_enter = (f"'{type(owner).__name__}' object does not support the context manager protocol"
                     f" (missed __enter__ method)")
    missing_exit = (f"'{type(owner).__name__}' object does not support the context manager protocol"
                    f" (missed __exit__ method)")
    general_error = f"'{type(owner).__name__}' object does not support the context manager protocol"
    alt_error = (f"'{type(owner).__name__}' object does not support the context manager protocol, "
                 f"but it supports the asynchronous context manager protocol. "
                 f"Did you mean to use 'async with'?")

    enter_method = hasattr(owner, "__enter__")
    exit_method = hasattr(owner, "__exit__")


    if check_against == "cm":
        # There is only enter or exit or neither enter or exit but it has support
        if (not (enter_method and exit_method)) and supports_async_with(owner):
            raise TypeError(alt_error)

        if not enter_method and not exit_method:
            raise TypeError(general_error)

        if not enter_method:
            raise TypeError(missing_enter)

        if not exit_method:
            raise TypeError(missing_exit)

    elif check_against == "acm":
        # Same as above but with aenter, aexit methods and errors
        pass

@abhijeetsharma200
Copy link
Contributor

how do you modify the items in _Py_SpecialMethods? how does it affect other places? (if we are using alternative errors, then I think we would also store the alternative error format in this structure).

So, we can add alterror parameter in _Py_SpecialMethod typedef in pycore_ceval.h and populate those fields based in ceval.c with the alterror defined in my code above. The exact places within that will depend on how exactly the implementation currently works. However, when alterror has been added we need to add the same if statement or define flag that ensures that only one of the two errors gets propogated.

@picnixz
Copy link
Member

picnixz commented Jan 4, 2025

The algorithmic part looks fine (namely I also had that kind of thought), though we will probably use something else than hasattr for checking support.

I would like to know how this part is actually implemented in the code. Best I can guess is that oparg is set to look for __enter__ or __aenter__ somewhere upsteam?

In order to understand bytecode.c, you should read main/Tools/cases_generator/interpreter_definition.md. This file explains the syntax of this file quite precisely.

Now, oparg in this case is a special integer denoting __enter__, __aenter__ etc. I don't think it's necessary to know about how it has been deduced. Essentially, it's the index of the special method in the _Py_SpecialMethods array. You can see how it's being used here: PyObject *name = _Py_SpecialMethods[oparg].name; We are simply looking up the name of the operation we are trying to perform (namely, __enter__, __aenter__, __exit__ or __aexit__).

So, we can add alterror parameter in _Py_SpecialMethod typedef in pycore_ceval.h and populate those fields based in ceval.c with the alterror defined in my code above. The exact places within that will depend on how exactly the implementation currently works. However, when alterror has been added we need to add the same if statement or define flag that ensures that only one of the two errors gets propogated.

Yes, the idea is simply to add a new field in the _Py_SpecialMethod structure and update the _Py_SpecialMethods array. I don't know what you mean by "However, when alterror has been added we need to add the same if statement or define flag that ensures that only one of the two errors gets propogated.". By modifying bytecode.c, you would guarantee this automatically. As you can see in

    if (attr_o == NULL) {
        if (!_PyErr_Occurred(tstate)) {
            _PyErr_Format(tstate, PyExc_TypeError,
                          _Py_SpecialMethods[oparg].error,
                          Py_TYPE(owner_o)->tp_name);
        }
        ERROR_IF(true, error);
    }

Having attr_o == NULL means that looking up the special method somehow failed (i.e. _PyObject_LookupSpecialMethod failed to find anything; see here for its implementation).

Note that this function does not necessarily set an exception (hence if (!_PyErr_Occurred(tstate))). So we should inject the logic of selecting the error message in there:

if (!_PyErr_Occurred(tstate)) {
    const char *fmt = _Py_SpecialMethods[oparg].error;
    if (use_alterror_instead(...)) {
       fmt = _Py_SpecialMethods[oparg].alterror;
    }
    _PyErr_Format(tstate, PyExc_TypeError, fmt,
                  Py_TYPE(owner_o)->tp_name);
}

That's essentially the only thing we would do in bytecode.c. I think we can implement use_alterror_instead() as an internal function next to _PyObject_LookupSpecialMethod and says that it's only used by bytecode.c. It could look like that:

static int
use_alterror_instead(PyObject *self, int oparg)
{
    switch (oparg) {
        case SPECIAL___ENTER__: {
            assert(!hasattr(self, "__exit__"));
            return hasattr(self, "__aenter__") && hasattr(self, "__aexit__");
        }
        ...
    }
}

and hasattr would be a smart way to detect whether the object supports that or this special method.


Note that what I'm suggesting is essentially without me coding anything so maybe there are things that are not compatible. Since @markshannon added _PyObject_LookupSpecialMethod, we can also ask him if he has some recommendations.

@abhijeetsharma200
Copy link
Contributor

Thanks for helpful insights. I am going through Tools/cases_generator/interpreter_definition.md to better understand bytecode.c.

Yes, the idea is simply to add a new field in the _Py_SpecialMethod structure and update the _Py_SpecialMethods array. I don't know what you mean by"However, when alterror has been added we need to add the same if statement or define flag that ensures that only one of the two errors gets propogated.". By modifying bytecode.c, you would guarantee this automatically. As you can see in

I had misunderstood how errors were being propagated because I found occurrences of _Py_SpecialMethods[oparg].error in executor_cases.c.h and generated_cases.c.h. I have since seen that they are generated by a script rather thus would not need to be modified.

Having attr_o == NULL means that looking up the special method somehow failed (i.e. _PyObject_LookupSpecialMethod failed to find anything; see here for its implementation).
Note that this function does not necessarily set an exception (hence if (!_PyErr_Occurred(tstate))). So we should inject the logic of selecting the error message in there:

Thanks for providing a clear description of the flow through this function. It has really helped in building a correct mental model for me and I really appreciate it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

6 participants