Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-123954: proposal for improving logic for setting SSL exceptions #128391

Draft
wants to merge 28 commits into
base: main
Choose a base branch
from

Conversation

picnixz
Copy link
Contributor

@picnixz picnixz commented Jan 1, 2025

I'd recommend also reviewing each commit separately due to the changes arising from logic extraction. I've also observed that we incorrectly use ERR_GET_REASON(p) on a custom error code. It's not really an issue because this is a no-op but I've added an assertion and removed the call for the semantics to match (we format the message using the error coming from ERR_peek_last_error() but the exception object is initialized with another error code that can also be obtained by SSL_get_error() or is a custom one).

I haven't benchmarked the branch, I just want to share a proposal. Note that refactoring the logic could help target the operations that take longer time.

cc @tarasko @kumaraditya303 @gpshead

Modules/_ssl.c Outdated Show resolved Hide resolved
Modules/_ssl.c Outdated Show resolved Hide resolved
Modules/_ssl.c Outdated Show resolved Hide resolved
Modules/_ssl.c Outdated Show resolved Hide resolved
Modules/_ssl.c Outdated Show resolved Hide resolved
Modules/_ssl.c Outdated Show resolved Hide resolved
Modules/_ssl.c Outdated Show resolved Hide resolved
Modules/_ssl.c Outdated Show resolved Hide resolved
@picnixz
Copy link
Contributor Author

picnixz commented Jan 1, 2025

Since all strings that will be rendered in the exception are actually ASCII messages (no localization), I'm considering switching from a PyDict storage to a _Py_hashtable_t structure but I don't know which one is the most efficient. With a _Py_hashtable_t structure, I can simply forget about Unicode PyObject * objects and deal with const char * objects only.

I haven't checked more in details though since I already want to know whether the proposal performs well.

@ZeroIntensity
Copy link
Member

_Py_hashtable_t will probably perform better, but it's not thread safe: is it possible for concurrent writes to be an issue here?

@picnixz
Copy link
Contributor Author

picnixz commented Jan 1, 2025

No, we're only reading from the hash table. It is cached in the module's state and we only lookup strings by integer keys or pairs of integers.

@picnixz
Copy link
Contributor Author

picnixz commented Jan 3, 2025

Ok so I've converted it into _Py_hashtable_t (we don't need to build tuples any more) but I have some weird stuff happening and I'm pretty sure it's because the keys are deemed identical even though the raw pointer value is not.

I need to remember how I can cast my integer properly so that it can interpreted as a const void * and then hash this specific void * but this is something I barely do in C :')

@picnixz
Copy link
Contributor Author

picnixz commented Jan 4, 2025

Great. The SSL data is indeed corrupted (see https://github.com/python/cpython/actions/runs/12609630149/job/35143368822?pr=128391)

BROTLI_DEFLATE_ERROR 41, 103, 0x14800067, 343933031, 343933031
BROTLI_ENCODE_ERROR 41, 103, 0x14800067, 343933031, 343933031

Two errors have the exact same library and reason codes. I don't know how to fix this. It's either an OpenSSL bug or it's a generation error on our side:

cc @gpshead

@ZeroIntensity
Copy link
Member

I'll investigate. My first impression is that perhaps this is intentional--it's supposed to be a generic brotli error, but has two separate names for semantic reasons.

@picnixz
Copy link
Contributor Author

picnixz commented Jan 4, 2025

I'll investigate. My first impression is that perhaps this is intentional--it's supposed to be a generic brotli error, but has two separate names for semantic reasons.

You can see that the issue is that we are having two different codes when the macros are not specified, but if we're using the macros they seem to collide:

cpython/Modules/_ssl_data_34.h

Lines 2125 to 2134 in a4e773c

#ifdef COMP_R_BROTLI_DEFLATE_ERROR
{"BROTLI_DEFLATE_ERROR", ERR_LIB_COMP, COMP_R_BROTLI_DEFLATE_ERROR},
#else
{"BROTLI_DEFLATE_ERROR", 41, 103},
#endif
#ifdef COMP_R_BROTLI_ENCODE_ERROR
{"BROTLI_ENCODE_ERROR", ERR_LIB_COMP, COMP_R_BROTLI_ENCODE_ERROR},
#else
{"BROTLI_ENCODE_ERROR", 41, 106},
#endif

@ZeroIntensity
Copy link
Member

ZeroIntensity commented Jan 4, 2025

Looks like an upstream issue. (I'll file an issue and CC you.) There's two pieces of relevant code on OpenSSL's end. Here:
https://github.com/openssl/openssl/blob/817a2b2b4955da0233fe7e6e4bd16c0255262b4f/crypto/err/openssl.txt#L409 and here: https://github.com/openssl/openssl/blob/817a2b2b4955da0233fe7e6e4bd16c0255262b4f/include/openssl/comperr.h#L27

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants