Install a static installation description file as part of the Python installation #107956
Comments
To be clear, this is for a CPython installation, not a general Python installation based on what's included (and assuming it's all required as nothing is specified as optional). |
I'd like to make this a general Python installation thing, by putting the CPython-specific details in their own section. |
Also, to clarify:
Is this to mean that the description file would be the same across all virtual environments made from a specific installation? For example, let us say I install python version 3.11 and then make 3 virtual environments: |
Yes, as the description file would be located in the Python installation paths themselves, instead of the virtual environment. Btw, standard Python virtual environments already have such a file, |
My thoughts on this general topic are written down in brettcannon/python-launcher#168 . While that proposal does not make sense in view of a static file tied to an interpreter, below is probably what would make sense (if you created this file for virtual environments as well): {
// An array specifying what is required to execute the interpreter.
// The expectation is to append args to the end of the array before
// execution.
// E.g. for conda environments:
// ```
// ["/path/to/conda", "run", "--path",
// "/home/brett/.conda/envs/conda-env", "--no-capture-output"]
// ```.
"run": [
"/home/brett/my-venvs/my-venv/bin/python3.10"
],
"python_version": {
// `sys.version_info`
"major": 3, // Optional
"minor": 10, // Optional
"micro": 1, // Optional
"releaselevel": "final", // Optional
"serial": 0 // Optional
},
"implementation": { // Optional
// `sys.implementation`
"name": "cpython",
"version": { // Has the same structure as `python_version` above.
"major": 3, // Optional
"minor": 10, // Optional
"micro": 1, // Optional
"releaselevel": "final", // Optional
"serial": 0 // Optional
}
},
"executable": {
"bits": 64, // Optional; `math.log2(sys.maxsize+1) + 1`
"architecture": "x86-64" // Optional; platform.machine()
},
"environment": { // Optional
// What type of environment, e.g. "virtual", "conda", etc.
"type": "virtual",
"name": "my-venv"
}
} |
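As a minimal sketch of how a launcher might consume the `run` array described above: the caller appends its own arguments to the end of the array before executing. The function name `build_command` and the sample paths are hypothetical, and the description is given as a plain dict because the example above uses `//` comments, which a strict JSON parser would reject.

```python
def build_command(description, args):
    """Return the argv to execute: the "run" prefix followed by the caller's args."""
    run = description.get("run")
    if not isinstance(run, list) or not run:
        raise ValueError('"run" must be a non-empty array')
    if not all(isinstance(part, str) for part in run):
        raise ValueError('every entry of "run" must be a string')
    # Build a new list so the description itself is never mutated.
    return [*run, *args]


# Hypothetical descriptions mirroring the two cases mentioned in the example.
venv_description = {"run": ["/home/brett/my-venvs/my-venv/bin/python3.10"]}
conda_description = {
    "run": [
        "/path/to/conda", "run", "--path",
        "/home/brett/.conda/envs/conda-env", "--no-capture-output",
    ]
}

print(build_command(venv_description, ["-m", "pip", "--version"]))
print(build_command(conda_description, ["python", "-V"]))
```

Keeping `run` as an array rather than a single string avoids any shell-quoting questions when the command is eventually passed to something like `subprocess.run`.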
@brettcannon thanks! Your use-case is a bit different than mine (improving cross-compilation support), so it's definitely . I have a working prototype locally, which I am planning to propose soon, and there are a couple of things I'd like to finalize before that, if you have any thoughts:
This initial implementation proposal will only include the main key details (I think probably the version, executable path, and stdlib path), and the rest of the details can then be proposed later in smaller PRs/proposals. My key goal here is to get the static description file itself sorted out.
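To make the "main key details" concrete, here is a rough sketch that collects the version, executable path, and stdlib path from a running interpreter and serializes them. The key names (`python_version`, `executable_path`, `stdlib_path`) are assumptions for illustration only; the thread does not settle on a schema, and a real build would presumably generate the file at install time rather than at runtime.

```python
import json
import sys
import sysconfig


def minimal_description():
    """Collect the three details named above from the running interpreter."""
    v = sys.version_info
    return {
        # Hypothetical key names; not a proposed or agreed schema.
        "python_version": {
            "major": v.major,
            "minor": v.minor,
            "micro": v.micro,
            "releaselevel": v.releaselevel,
            "serial": v.serial,
        },
        "executable_path": sys.executable,
        "stdlib_path": sysconfig.get_path("stdlib"),
    }


print(json.dumps(minimal_description(), indent=2))
```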
That was specifically left out of scope because it's tricky to do properly, the main thing that jumps out to me being the Python installation updating and leaving incorrect virtual environment static description files in place. Some thoughts on possible approaches to tackle this issue:
But ultimately, this is a significantly harder problem, and considering this is not incompatible with the current proposal and that the current proposal would already be a huge improvement over the current status quo, I think it makes sense to pursue it with the current scope.
|
I think the question is whether you expect people to ever read or write this file? If the answer is in general "no", then I think JSON is the easier format for people to ingest tooling-wise. But if you expect a human being to ever interact with the file I can see TOML making sense.
I think the way
Fair enough. I always have to remember Unix is just symlinks and so it can change underneath itself while Windows is a hard copy and so you can't mess that detail up. But that might actually be an argument for the file when copying the |
This seems to me like it should just be a Hopefully between a couple of substitutes (i.e. allow I don't see any reason to separate it from |
Yes, I expect people to read, and in some situations, write this file, but not too frequently. Some use-cases:
Yes, I expect other implementations to use this file, though not necessarily ship it in the same way. My plan was for the format to be shared between everyone, but if and how the static description file is provided, to be up to each implementation. Of course, I'd like the way the static description file is provided to be consistent, but as you point out, this is something that can be highly dependent on the implementation, build format, target platform, etc., so I don't think we can define a single way to do it. We could define how this should work in different scenarios, but I don't think we (CPython) should be responsible for telling other implementations how they should implement this, especially in scenarios where we might not be experts. For consistency, I would expect implementations that use an installation scheme similar to CPython to place the file in the same/equivalent place, but they wouldn't be required to do so. I think trying to get this to happen is something we should do as a community, by engaging in discussions with the other parties. Finally, as I mentioned above, deciding if the file should be provided at all, is something I also think should be up to each implementation, and is something that could depend on the target, build format, etc.
Regarding the name, I think using "sysconfig" in the name would imply a tie to the Regarding the location, while I guess using
I don't quite agree here 😅. This goes into the bigger picture question of the direction we want to push IMO For this reason, instead of building further on top of That said, Regarding being able to simply replace @zooba does this make sense to you? Like I said, it depends a lot on our higher level goals regarding the direction we want to move |
Signed-off-by: Filipe Laíns <[email protected]>
Yeah, fair enough. It was more of a "similar intent" suggestion, rather than saying we should actually do it. But I do think it's important to have a sensible migration path. So any data that we can provide through existing (or new) config vars, we should, so that way users don't have to do a switch on version. Any data that we fix should be fixed in both, and existing scenarios should keep working and even get better without people having to change their code. The static file should include all of But I agree with the overall goal of providing the actual info needed for building and installing extensions. We want the new fields to be the actual commands, not just the ones that were used to build CPython. This probably ties into #108064 to document these commands. |
I partially agree. While I think any data changes should be reflected on all mechanisms/interfaces (old
While I understand the reasoning behind this, I am very uneasy about the long-term maintenance aspect of that. Instead of blindly adding all the Also, I don't know if this was clear enough in the proposal description, and I think we might not be on the same page regarding it, but my purpose for this file was to help in use-cases where running the interpreter from introspection is impossible or undesirable. The objective wasn't to replace
🤣 okay, this is something I slightly disagree again! While I think providing the commands is helpful, I don't think this is the place to do so. IMO both in the static description file and the Sorry for the large chunk of text again, but this is something I have extensively thought about, so I have opinions 😅. I am definitely happy to discuss and figure out together which would be the best solution for these different questions. |
We're more in agreement than you think 😉 I'm just stating things very simply compared to the depth that you've thought about them, and I think you're assuming that I've also thought these through 110% and am making definitive design statements. I'm not - I'm just standing somewhere that I can see the problem and generally waving in their direction (I should add, I'm confident in that direction, I'm just not spelling out as full an implementation plan as you are). I definitely agree that not all the Agreed that replacing sysconfig isn't the goal, but it is a good test for whether we've provided enough information. Right now, the gaps in this area are filled by people using sysconfig and often making assumptions about how Python is normally installed - if we can't replace all of those with this file (plus some runtime calculation, of course), I think we've missed the mark. People shouldn't need to use sysconfig if we get this right, whether they can launch the interpreter or not. And yeah, by "commands" I really just meant the compile-time options required, in some format that a library can figure out which actual options to use with its own compiler. That doc page ought to end up with specific compiler commands as examples, but we wouldn't put those in here. However, there should be enough information in this file that a program/script can figure out how to match the original compiler settings enough to get a compatible extension module. Again, if we can't provide that, I think we've missed our goal, so this is more of a validation test than a specific feature.
|
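One way to read the "validation test" idea above is as a check that a static description agrees with what `sysconfig` reports at runtime. The sketch below assumes a flat description with hypothetical keys (`version`, `stdlib`, `include`, `ext_suffix`); it is only meant to show the shape of such a check, not a proposed schema.

```python
import sysconfig


def runtime_values():
    """Values a tool would otherwise obtain by running the interpreter."""
    return {
        # Hypothetical key names chosen for this illustration.
        "version": sysconfig.get_python_version(),
        "stdlib": sysconfig.get_path("stdlib"),
        "include": sysconfig.get_path("include"),
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
    }


def mismatches(description):
    """Return {key: (static_value, runtime_value)} for every disagreement."""
    return {
        key: (description.get(key), value)
        for key, value in runtime_values().items()
        if description.get(key) != value
    }


# A description generated from this interpreter should validate cleanly,
# while a stale one (e.g. left behind after an upgrade) should not.
fresh = runtime_values()
stale = dict(fresh, version="0.0")
print(mismatches(fresh))
print(sorted(mismatches(stale)))
```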
I'm not entirely sure I understand the distinction you are making between "the actual commands" and "the ones that were used to build CPython". The commands used for building CPython are actual real commands, they contain for example the name of a specific compiler. That's actually problematic because build systems such as meson, cmake, autotools, (and to a small extent that I wish were a lot larger, setuptools) etc. cannot accept "actual commands" from sysconfig since they need to be able to use different commands, and sometimes compile different languages that aren't used in the CPython build system at all (such as mingw Fortran on Windows, which is a GCC component that, yes, is getting mixed with MSVC-built pythons). Any proposed API for CPython that wants to replace current usage of sysconfig.get_config_vars as the preferred information source used by build systems, cannot hardcode the name of a compiler to use. If it does hardcode the name of a compiler to use then we (meson, in this case) will simply ignore that new API and keep using sysconfig.get_config_vars with a hacked up heuristic -- our current one to find the correct library / import library is, depending on Unix/windows and also CPython / PyPy, a mixture of templating the hardcoded word Basically, what we would like to see in order to make our jobs as buildsystem developers easier, is a way to do for Windows what We don't really want to know the commands to use -- we have our own commands that we're probably already using to build (embedded copies of?) regular C/C++/Fortran libraries into static libraries that will then get linked with a python binding file and the python import library. We need the flexibility to choose our own toolchain, because to us, libpython is just "yet another dependency" irrespective of the fact that the libpython dependency happens to parallel the runtime we are building plugins for. 
If people want to document and know an example command capable of being copy-pasted and run to build an extension without resorting to a build system like meson, cmake, or autotools... then IMHO this is best as documentation, not as part of a stdlib API. A stdlib API should only contain the python-specific parts that would be used by the documentation. It's up to the documentation to then explain how to find/choose/activate/run a compiler. |
Thanks for clarifying this. I appear to have missed that response while writing my own, lol... I think a large part of my confusion was that I typically think of, say, cl.exe as a command, and things like header search directories or import libraries as flags. Different terminologies I guess. |
Gotcha. I wasn't assuming that, but wasn't sure of the tone you were going for (yay nonverbal communication). For me it's easier to spell everything out, to make sure we are on the same page and there isn't any misunderstanding.
If the only thing people are missing is access to the
I agree. Though, we shouldn't forget that this file isn't targeting all the use-cases covered by sysconfig (eg. the install layout), for now at least.
That makes more sense, we're on the same page then.
My goal, as I also stated above, is to provide all the compiler-related information in a compiler-agnostic way. Downstream users, like Meson, should be able to take that information and give it to whichever compiler backend it wants to use and have that be able to generate the actual compiler-specific commands. This should be a 1st party supported use-case. |
This doesn't work on Windows - no file is generated. But we still may need to get fields for a non-executable runtime (such as an ARM64 build on an x64 machine).
I assume by "install layout" you mean how to install wheels? It ought to cover the locations where CPython itself installed things to, such as its own headers and libs, right? (Those are "compiler-related information," I guess.) Other than that, agreed with all the rest. |
Well, the install scheme really includes both what wheels typically use as well as a bit more -- it's actually legal to install headers in a wheel, which maps to the same location as the CPython headers in "include" / "platinclude". I don't think that wheels can do anything with the stdlib location though... which is on that The install layout from sysconfig doesn't currently say anything about the directory where libpython itself is though -- maybe it should? 🤔 People keep on wanting to package up C/++ libraries via tools like auditwheel repair / delvewheel and it's a big pain to do cross platform what with adding the DLL directory, or embedding private paths, having a single conventional directory that python itself guarantees is in the DLL search path for all modules could be handy... |
Legal, but unspecified, and not portable. The current The plan right now is not to specify those. So we'd include the actual directory where CPython stuff is installed to, and anyone who interprets that as "I can also install stuff here" is going off-label. |
Okay, to move things forward, let's set some of the base implementation details, so that GH-108483 is unblocked. I am gonna try to summarize the discussions so far, and what I think to be the most reasonable outcome for each topic.
Sorry if I missed anything, misunderstood anything, or if there was any bias. Please let me know what you think, and if the proposed outcomes seem correct. Also, this is my first time writing a discussion summary of this kind, so I am sorry if you feel misrepresented in any way. Please let me know if that is the case, so that I can try to prevent it in the future and try to improve. |
It's a good summary. The only bias I feel is there is (apparently/weakly) dismissing my proposals because nobody else is talking about them 😆 Some responses that I started posting on the PR, but make more sense here:
As far as I know, neither Bash nor PowerShell have anything built in, which means you can't write a native script to handle it. (PowerShell definitely has JSON, and I'm not sure about Bash, but I bet that a command line tool like .NET certainly has nothing native, though adding additional dependencies is only sometimes an issue. But when it is, writing a basic parser is far more likely than jumping through whatever hoops are necessary to get one. LTS versions of Python likely to be on existing Linux distros don't have it, which means system scripts or tools on those will need an additional dependency to handle the file. Again, not impossible, but potentially complicated enough that people will reach for
What different types do we need? If certain fields are defined as being int or float, it's easy enough for any language to parse those. But I expect most fields are going to be strings, and the escaping rules for anything other than
|
I'd like to raise an additional defense of using json over toml:
It sounds to me like there's no real concern that users are meant to write the file, just read it:
I do hear the rationale for reading the file, mainly for debugging -- in general, data formats that a human can read somehow, are beneficial absent compelling need otherwise. But I'm not sure what the rationale for writing one is. That there isn't one already existing? Why is this a reason to write one? I would think it's a reason to raise a request for your python binary distributor to add one. I doubt software will be dropping support for running against a python that doesn't have this file, which means old pythons are covered... and new pythons should have the file, right? ... If it's only interesting to read the file, not to write it, then I still think json is a good choice. The main problem with json is that it's annoying to write correctly (in particular, the requirement to separate elements in a list with commas, but raise a syntax error if the last element has a non-ambiguous trailing comma). Reading it is mostly easy, you just pretty-print it when originally generating the json file. Optionally, if you want comments, you have to "cheat" and hack those in by creating json entries called
jq is very common, yes. Every time I've ever wanted to parse a toml file from a shell though, I ended up finding one of like 8 different programs all called "toml2json", then passing that to jq. |
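The two JSON points raised above (pretty-print at generation time so the file stays readable, and a trailing comma being a syntax error when writing by hand) can be illustrated with the stdlib `json` module. The sample keys and values are placeholders, not part of any proposed schema.

```python
import json

# Placeholder content; the real field names are still under discussion.
description = {
    "implementation": {"name": "cpython"},
    "python_version": {"major": 3, "minor": 12},
}

# Generated once by tooling: indentation and sorted keys keep it readable
# and produce stable diffs between builds.
text = json.dumps(description, indent=2, sort_keys=True) + "\n"
print(text, end="")

# Reading it back is lossless.
assert json.loads(text) == description

# Hand-written JSON is easy to get wrong: a trailing comma is rejected.
try:
    json.loads('{"major": 3, "minor": 12,}')
except json.JSONDecodeError as exc:
    trailing_comma_error = str(exc)
    print("rejected:", trailing_comma_error)
```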
I'd be +1 TOML if we had a writer in the stdlib. I'm maybe +0 currently, though, as the TOML implementation in the PR seems a little fragile. JSON does have the benefits Steve mentioned for PS/bash, etc, though. A |
FWIW, feature flags that affect the stable ABI are currently:
|
Feature or enhancement
Ship a static file that describes various aspects of the installation as part of the Python installation.
Pitch
Shipping such a file would make it much easier to introspect a Python installation without having to run the interpreter. There are many use-cases for this, but some key ones would be, e.g., Python launchers and cross-compilation tooling.
Information we could provide:
libpython (available?)
libpython (available?)
libpython3.so
(incomplete table, just an initial proposal)
Note: This issue specifically targets a descriptor file for the Python installation, not a Python environment, so paths are out of scope.
Previous discussion
https://discuss.python.org/t/what-information-is-useful-to-know-statically-about-an-interpreter/25563.
(cc @brettcannon)
Linked PRs