extend sys module on windows #102536

Open
maxbachmann opened this issue Mar 8, 2023 · 31 comments
Labels
OS-windows type-feature A feature request or enhancement

Comments

@maxbachmann
Contributor

maxbachmann commented Mar 8, 2023

It would be useful to be able to detect:

  1. the platform as 32-bit vs 64-bit vs ARM64.
  2. the windows api partition
    • multiple partitions can be true at the same time, so this should probably be a named tuple

@zooba you mentioned adding them to sys.implementation. Since they are only available on windows, I assume this would mean adding them with an underscore prefix? The alternative would be a separate API similar to sys.winver.

This is a follow up to #102256.

@maxbachmann maxbachmann added the type-feature A feature request or enhancement label Mar 8, 2023
@zooba
Member

zooba commented Mar 8, 2023

Yeah, I don't think we want to bother with making them required attributes, so I'd add:

  • sys.implementation._architecture and make it the value of $(ArchName) from python.props (this approx. matches platform.architecture)
  • sys.implementation._api_partitions and make it a tuple of all the ones we test for. No need to add names, simple in checks will be the most common use
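A minimal sketch of how calling code might consume these attributes if they were added. Both sys.implementation._architecture and sys.implementation._api_partitions are proposals from this thread, not existing attributes, and the partition name "desktop" is a hypothetical value used only for illustration. The getattr defaults keep the code working on builds that lack them.

```python
import sys


def build_architecture(default=None):
    """Return the proposed build architecture, or `default` if unavailable.

    ASSUMPTION: `_architecture` is only a proposal in this thread.
    """
    return getattr(sys.implementation, "_architecture", default)


def has_api_partition(name):
    """Check membership in the proposed tuple of Windows API partitions.

    ASSUMPTION: `_api_partitions` is only a proposal in this thread.
    """
    return name in getattr(sys.implementation, "_api_partitions", ())


print(build_architecture("unknown"))
print(has_api_partition("desktop"))  # "desktop" is a hypothetical name
```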

@ericsnowcurrently
Member

Is sys.implementation the right place? It should have things related only to the Python implementation, that other implementations also have. The architecture is a build characteristic and the api partitions seem like it as well. Where do we normally record build or platform (or OS) information?

@zooba
Member

zooba commented Sep 26, 2024

The only real alternative is at the top level of the sys module. Maybe you could make a case for it going into sysconfig, but we still need an attribute on a native module somewhere, as it's the only way we can set the field at build time.

I assume this hasn't come up before because Windows is the only OS with transparent CPU emulation? So querying the current OS architecture isn't sufficient to determine the current process architecture.

@ericsnowcurrently
Member

sys._build.*?

@zooba
Member

zooba commented Sep 26, 2024

Really it should be sys.platform_but_for_real_this_time as we can't just go updating sys.platform any more.

What if we added sys.architecture for all platforms? platform.architecture(sys.executable) is too limited in its return values to work here, but it ought to be known at compile time, right? And something like Jython or IronPython can use clr or jvm if they prefer. Maybe they've already got an additional field for that?

@vstinner
Member

the platform as 32-bit vs 64-bit vs ARM64.

There are multiple existing APIs to do that. Examples:

>>> platform.architecture()
('64bit', 'ELF')
>>> os.uname().machine
'x86_64'

I don't think that we should add information retrieved at runtime in sys. Usually, sys holds information computed during the build.

@zooba
Member

zooba commented Sep 27, 2024

platform.architecture() doesn't distinguish between ARM64 and x64, and os.uname().machine doesn't reflect how CPython was compiled. So we need to store the target architecture at build time.

But you're correct that we don't need to retrieve it at runtime. (Though worth noting that a lot of sys is dynamic - path, modules, various hooks, etc.)
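To make the limitation concrete, here is a small sketch that collects what the existing platform APIs report. It uses platform.machine() rather than os.uname() because os.uname() is not available on Windows. The helper name runtime_report is made up for this example.

```python
import platform


def runtime_report():
    """Collect what the existing platform APIs say about this process."""
    # bits is a pointer-width string such as '64bit'; it carries no CPU family.
    bits, linkage = platform.architecture()
    return {
        "bits": bits,
        "linkage": linkage,
        # machine describes the host as the OS reports it, which is not
        # necessarily the architecture this interpreter was compiled for.
        "machine": platform.machine(),
    }


print(runtime_report())
```

Neither value records the compile-time target, which is the gap described above.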

@rruuaanng
Contributor

Hmmm. Actually, the discussion in the PR failed because no one decided whether to merge it or not (it seems that the only supporter is @zooba).
The newly added arch is used to get the current build architecture in Windows (it comes from python.props). What do you think?

link:
#124582
https://discuss.python.org/t/regarding-whether-we-should-add-py-currentarch-or-py-archname-function/65191/17

@ericsnowcurrently
Member

My main concern was/is that this information doesn't belong on sys.implementation. That's meant for information about the Python implementation (e.g. cpython, pypy, jython, micropython), not about the build. See PEP 421.

Something like sys._architecture, sys.build.architecture, or platform.??? or on sysconfig (like @zooba suggested) would be more appropriate. (FWIW, I like the idea of sys.build.*, PEP-421-style, but its hypothetical contents might already be covered elsewhere.)

All that said, I haven't looked closely at what we're trying to actually accomplish here, so I don't have much feedback on that.

@zooba
Member

zooba commented Oct 17, 2024

I'm often the only supporter of things involving Windows :-) Usually aggregated from a lot of discussions with a range of users, but not easily cited (many Windows users are scared off by the culture here, so they won't report stuff themselves).

In this case, we've been looking at adding support to projects' build scripts to allow building on Windows ARM64, which requires the builder to know whether it should target ARM64 or not. That's usually done by running in the ARM64 interpreter, and then to cross-compile you'd run in the x64 or x86 interpreter, but we don't currently have a reliable way to detect that. Historically we haven't needed it, because sys.maxsize (or sys.maxint) was sufficient to detect 64-bit vs 32-bit. But that can't tell the difference between ARM64 and AMD64.
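The historical check mentioned here can be sketched as follows. The helper name interpreter_bits is made up for this example.

```python
import sys


def interpreter_bits():
    """Classic pointer-width check based on sys.maxsize."""
    # sys.maxsize is 2**63 - 1 on 64-bit builds and 2**31 - 1 on 32-bit ones.
    return 64 if sys.maxsize > 2**32 else 32


print(interpreter_bits())
```

An ARM64 build and an AMD64 build both report 64 here, so a build script cannot use this to choose its target.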

More importantly, and relevant for this case, is that platform.architecture() can't return the difference, even if it could detect it, because it's defined as only returning "64bit" or "32bit". And PEP 421 explicitly suggests sys.implementation is the place to store information that would be returned from platform about the Python implementation. The architecture the current runtime was compiled for sure seems to be related to the implementation, and since platform would be the obvious place to return such information (as it already is, albeit insufficiently), adding the field to sys.implementation seems fine.

Having now re-read that PEP though,[1] I think we should drop the underscore. Make it sys.implementation.architecture and if it exists then platform.architecture(sys.executable) can return it rather than querying the actual binary (if it's within the range of allowable return values, and perhaps we can later add platform.architecture_but_for_real_this_time())

Footnotes

  1. Specifically Non-required Attributes and Use Cases, the platform and Jython use cases being most relevant.

@vstinner
Member

platform.machine() is another possible place for such information. It already has Windows specific code to detect the architecture.

@erlend-aasland
Contributor

platform.processor() also seems like a good candidate. There's also a handful of platform.win32_*() APIs. Perhaps a similar one could be added. I agree with Eric and others that sys.implementation seems unfit for this API.

@zooba
Member

zooba commented Oct 18, 2024

platform.machine() and platform.processor() are for different purposes. We want to know the compile-time target of the runtime, which is unrelated to the current machine type (Windows has emulation on most architectures, so the machine could be ARM64 but the runtime is x86 and there's currently no way to determine this).

@rruuaanng
Contributor

I agree. If I remember correctly, the Windows kernel actually emulates multiple subsystems at the same time (such as win32, which exists for compatibility with historical calls). The build might target win32 while the machine is actually amd64, or maybe arm64. This information can only be captured at build time.

@ZeroIntensity
Member

OK, I think I understand Steve's point of view now. There's precedent for this in the stdlib: we're using sys.implementation in platform here:

if sys.implementation._multiarch.endswith("simulator"):

I'm only comfortable with sys.implementation._architecture if it's an internal thing used only for platform, instead of a public interface--then, it really is an implementation detail, and sort of fits there.

Alternatively, we could add a public sys.architecture that also supports other systems, rather than just Windows. It doesn't seem like it would be all that difficult to just define ARCH_NAME as something else for Linux builds.
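The sys.implementation._multiarch check quoted above can be written defensively, since that private attribute is not present on every build (Windows builds, for example, do not set it). This is only a sketch; the helper name is_simulator_build is made up here.

```python
import sys


def is_simulator_build():
    """Return True when the private _multiarch value names a simulator target."""
    # _multiarch is a private attribute and may be missing, so fall back to "".
    multiarch = getattr(sys.implementation, "_multiarch", "")
    return multiarch.endswith("simulator")


print(is_simulator_build())
```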

@rruuaanng
Contributor

I think that for sys.architecture, a new PR can be submitted after the _architecture attribute is added. This issue may be put on hold for now. In fact, if possible, the value obtained by sys.architecture should be the same as the _architecture attribute.

@ZeroIntensity
Member

If we go for sys.architecture, then there's no need for an _architecture attribute at all.

@rruuaanng
Contributor

As far as the discussion is concerned, it seems that only the _architecture attribute is what we need, because other platforms are not actually as complicated as windows.

@zooba
Member

zooba commented Oct 18, 2024

My main hesitation with defining it as public sys.architecture is that now we have to figure out sensible values for all the other platforms as well... what do we put for WASI? How do we under-specify it enough that we can expand to new platforms later on and not run into platform.architecture type issues (or sys.platform type issues)?

@erlend-aasland
Contributor

Platforms that use our configure build system (that is, most platforms) can retrieve compile time stuff via sysconfig APIs.
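As a rough illustration of that, the sketch below reads a few build-time variables through sysconfig. Which names are defined depends on the platform and build, so every lookup is treated as optional; the selection of names and the helper build_time_vars are choices made for this example.

```python
import sysconfig


def build_time_vars(names=("HOST_GNU_TYPE", "SOABI", "EXT_SUFFIX")):
    """Look up build-time variables; undefined names map to None."""
    # sysconfig.get_config_var returns None when a name is not defined.
    return {name: sysconfig.get_config_var(name) for name in names}


for name, value in build_time_vars().items():
    print(f"{name} = {value!r}")
```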

@vstinner
Member

vstinner commented Oct 21, 2024

The sysconfig module sounds like a good place to add the build-time architecture on Windows.

@zooba
Member

zooba commented Oct 21, 2024

The sysconfig module doesn't have a native component, and we don't have a compile-time generated file that ships with the runtime (yet). sysconfig retrieves compile-time stuff on Linux because we include a copy of Makefile, and it scans it.

Storing a single value somewhere in a native module is much easier on Windows than adding an entirely new static file that is generated at compile time. The question is about where we should store the value, more than where it gets retrieved from.

@zware
Member

zware commented Oct 22, 2024

The sysconfig module doesn't have a native component, and we don't have a compile-time generated file that ships with the runtime (yet). sysconfig retrieves compile-time stuff on Linux because we include a copy of Makefile, and it scans it.

Actually, I was just surprised to learn that it looks like a _sysconfig module has snuck in via gh-88402/GH-110049.[1]

Footnotes

  1. And to be clear, I consider that a good thing :). More of sysconfig/_sysconfigdata should probably make its way there eventually.

@FFY00
Member

FFY00 commented Oct 23, 2024

IMO, this should be implemented by adding MULTIARCH (which represents a target triple) in sysconfig.get_config_vars() on Windows, while we don't have a better API. I don't think sys.implementation is the right place; this is a build detail, not an implementation-specific detail.
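A sketch of how a consumer might read such a value. MULTIARCH is not defined on every build, so the lookup is optional. Taking the first dash-separated field as the CPU part is an assumption that follows the GNU-style triple layout (for example x86_64-linux-gnu); no Windows format has been specified in this thread. The helper names are made up for this example.

```python
import sysconfig


def target_triple():
    """Return the MULTIARCH config variable, or None when it is unset or empty."""
    return sysconfig.get_config_var("MULTIARCH") or None


def triple_arch(triple):
    """Return the leading field of a triple, or None if there is no triple.

    ASSUMPTION: the CPU architecture comes first, as in GNU-style triples.
    """
    return triple.split("-", 1)[0] if triple else None


print(triple_arch(target_triple()))
```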

@zooba
Member

zooba commented Oct 23, 2024

I was just surprised to learn that it looks like a _sysconfig module has snuck in

Ah, so it has.

this should be implemented by adding MULTIARCH — which represents a target triple

Given that we can embed it at compile time, I'm fine with this. Do we have official target triples for Windows? Or do we just borrow them from someone else (I'm pretty sure Rust uses them, yeah?)

@rruuaanng
Contributor

rruuaanng commented Oct 25, 2024

I actually have no problem with adding the arch member to MULTIARCH, but would that be better than adding it to the implementation? For example, in terms of declarative?

@zooba, what do you think about this?

@zooba
Member

zooba commented Oct 25, 2024

would that be better than adding it to the implementation? For example, in terms of declarative?

It would be added to the Modules\_sysconfig.c file (which I'd forgotten about), so it's easy enough to add. It's a bit rough on users, but at least it'll be possible to get at the information.

We need a spec for MULTIARCH though, to define exactly what the values should look like.

@rruuaanng
Contributor

rruuaanng commented Oct 25, 2024

That is, it should look like this

>>> import _sysconfig
>>> _sysconfig.config_vars()
{'EXT_SUFFIX': '.cp314-win_amd64.pyd', 'SOABI': 'cp314-win_amd64', 'Py_GIL_DISABLED': 0, 'ARCH': 'win32'}

Edit
(or maybe I didn't output the value correctly).
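Since _sysconfig is a private module that older interpreters do not ship, code that wants to peek at it would need a guard. A minimal sketch, where the helper name native_config_vars is made up and the 'ARCH' key is only the proposal from this comment, not an existing entry:

```python
def native_config_vars():
    """Return the private _sysconfig variables, or {} when unavailable."""
    try:
        import _sysconfig  # private module; absent on older interpreters
        return dict(_sysconfig.config_vars())
    except (ImportError, AttributeError):
        return {}


config = native_config_vars()
# 'ARCH' is hypothetical: it is the key proposed above, so default to None.
print(config.get("ARCH"))
```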

@zooba
Member

zooba commented Nov 12, 2024

Well, if EXT_SUFFIX contains win_amd64 then I'd expect ARCH to be amd64, but the spec has to list all possible values and what they mean, across all platforms (or we make it unsupported on some, and it might already exist on others). And it should go in the documentation somewhere.
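Until such a spec exists, the relationship described here could be approximated by parsing EXT_SUFFIX. This is only a sketch that assumes the '.cpXY-win_ARCH.pyd' shape shown in the output above; the helper name arch_from_ext_suffix is made up, and suffixes without a 'win_' platform tag (including the 32-bit 'win32' tag) are deliberately left unmapped.

```python
def arch_from_ext_suffix(ext_suffix):
    """Extract the architecture from a suffix like '.cp314-win_amd64.pyd'.

    Returns None when the platform tag does not start with 'win_'.
    """
    tag = ext_suffix.strip(".").split(".")[0]  # e.g. 'cp314-win_amd64'
    _, _, platform_tag = tag.partition("-")    # e.g. 'win_amd64'
    if platform_tag.startswith("win_"):
        return platform_tag[len("win_"):]
    return None


print(arch_from_ext_suffix(".cp314-win_amd64.pyd"))
```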

@malemburg
Member

If you want to access build information, sysconfig is the right module. The platform module is for information about the platform Python is currently running on.

At the time the platform module was conceived, there was no ARM64 architecture and only 32bit vs. 64bit was important.

It's certainly possible to add another argument family='' and have this return the architecture family, if it can be detected. The return value of platform.architecture() would then have to be turned into a tuple subclass to maintain backwards compatibility and still return the new field as an attribute (similar to how codecs.CodecInfo works). family could then return "ARM64", "ARM32", "x86_64", "x86", etc.
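The tuple-subclass idea could look roughly like the sketch below, modelled on how codecs.CodecInfo attaches attributes to a tuple. The class name ArchitectureInfo and the sample values are hypothetical; nothing like this exists in the platform module today.

```python
class ArchitectureInfo(tuple):
    """A (bits, linkage) tuple that also carries a `family` attribute."""

    def __new__(cls, bits, linkage, family=""):
        # Keep the tuple at two items so existing unpacking code still works.
        self = super().__new__(cls, (bits, linkage))
        self.family = family
        return self


info = ArchitectureInfo("64bit", "WindowsPE", family="ARM64")
bits, linkage = info  # existing two-value unpacking is unaffected
print(bits, linkage, info.family)
```

Existing callers that unpack two values keep working, while new callers can read info.family.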

@rruuaanng
Contributor

If you want to access build information, sysconfig is the right module. The platform module is for information about the platform Python is currently running on.

Maybe add an ARCH field to sysconfig? This is my understanding of this :(
