Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch sysconfig data at install time #9857

Merged
merged 1 commit into from
Dec 13, 2024
Merged

Conversation

charliermarsh
Copy link
Member

@charliermarsh charliermarsh commented Dec 13, 2024

Summary

This PR reimplements sysconfigpatcher in Rust and applies it to our Python installations at install-time, ensuring that the sysconfig data is more likely to be correct.

For now, we only rewrite prefixes (i.e., any path that starts with /install gets rewritten to the correct absolute path for the current machine).

Unlike sysconfigpatcher, this PR does not yet do any of the following:

  • Patch pkginfo files.
  • Change clang references to cc.

A few things that we should do as follow-ups, in my opinion:

  1. Rewrite AR.
  2. Remove -isysroot, which we already do for newer builds.

@charliermarsh charliermarsh added no-build Disable building binaries in CI no-test Disable CI tests for a pull request labels Dec 13, 2024
@charliermarsh charliermarsh marked this pull request as draft December 13, 2024 00:40
@charliermarsh charliermarsh force-pushed the charlie/patch-sysconfig branch 4 times, most recently from 2db5f83 to 6587146 Compare December 13, 2024 00:57
@charliermarsh charliermarsh marked this pull request as ready for review December 13, 2024 00:57
@charliermarsh charliermarsh removed no-build Disable building binaries in CI no-test Disable CI tests for a pull request labels Dec 13, 2024
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially thought we could just parse this as JSON, but we only started to output sysconfig data as JSON in the most recent PBS release. So it won't apply to existing Pythons.

Instead, this is a small parser that just handles Python dictionaries with the assumption that keys are string and values are (strings or integers). Main complexity is that strings can be implicitly concatenated (and are in practice), but it's not too bad.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to have @dhruvmanila review this? :)

Copy link
Member Author

@charliermarsh charliermarsh Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good idea. @dhruvmanila -- this is applied to a file like the following. We know that it's an object with string keys, and the values must be strings or integers (no f-strings, but they can be implicitly concatenated).

# system configuration generated and used by the sysconfig module
build_time_vars = {'ABIFLAGS': '',
 'AC_APPLE_UNIVERSAL_BUILD': 0,
 'AIX_BUILDDATE': 0,
 'AIX_GENUINE_CPLUSPLUS': 0,
 ...
 'XMLLIBSUBDIRS': 'xml xml/dom xml/etree xml/parsers xml/sax',
 'abs_builddir': '/private/var/folders/0w/4z5l9vds32nbkz7l22n8j6s80000gn/T/tmp5be2znu_/Python-3.10.16',
 'abs_srcdir': '/private/var/folders/0w/4z5l9vds32nbkz7l22n8j6s80000gn/T/tmp5be2znu_/Python-3.10.16',
 'datarootdir': '/install/share',
 'exec_prefix': '/install',
 'prefix': '/install',
 'srcdir': '.'}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parser looks good. I think it just has a lot of lookaheads where we could just use bump but if this is not in a critical path (which it seems like so), then it shouldn't matter practically.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comes from Ruff (which ported it from rustc).

@charliermarsh
Copy link
Member Author

\cc @bluss who I also chatted with on Discord

@charliermarsh charliermarsh force-pushed the charlie/patch-sysconfig branch from 6587146 to 26fda2f Compare December 13, 2024 00:59
@charliermarsh charliermarsh added enhancement New feature or improvement to existing functionality compatibility Compatibility with a specification or another tool labels Dec 13, 2024
@charliermarsh charliermarsh force-pushed the charlie/patch-sysconfig branch 5 times, most recently from b3c7870 to fddd5d9 Compare December 13, 2024 03:10
@charliermarsh charliermarsh added the no-build Disable building binaries in CI label Dec 13, 2024
@charliermarsh charliermarsh force-pushed the charlie/patch-sysconfig branch 5 times, most recently from e33c0a5 to 0248c28 Compare December 13, 2024 04:07
crates/uv-python/src/sysconfig/parser.rs Outdated Show resolved Hide resolved
}

loop {
match cursor.first() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to always do the lookahead first and then bump the character? Typically, we avoid doing lookaheads and only do it when required. If we bump the character first, it always avoids bumping it for every branch in the loop.

This would look like:

loop {
	let Some(next) = cursor.bump() else {
		break;
	};

	match next {
		...
	}
}

For other parse methods, it would then need to accept the character that's bumped. For strings, you could then utilize the cursor.previous() method to add a debug assertion that the previous character was the same as the given quote character.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left it as-is but I'll play with changing it.

crates/uv-python/src/sysconfig/parser.rs Outdated Show resolved Hide resolved
Comment on lines +158 to +189
if cursor.first() == EOF_CHAR {
return Err(Error::UnexpectedCharacter(EOF_CHAR));
}

// Handle escaped quotes.
if cursor.first() == '\\' {
// Consume the backslash.
cursor.bump();
if cursor.first() == quote {
result.push(quote);
cursor.bump();
continue;
}

// Keep the backslash and following character.
result.push('\\');
result.push(cursor.first());
cursor.bump();
continue;
}

// Consume closing quote.
if cursor.first() == quote {
cursor.bump();
break;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we could utilize a match to avoid computing self.first for each branch

crates/uv-python/src/sysconfig/parser.rs Outdated Show resolved Hide resolved
@charliermarsh charliermarsh force-pushed the charlie/patch-sysconfig branch from 0248c28 to ed4a2cc Compare December 13, 2024 19:20
@charliermarsh charliermarsh force-pushed the charlie/patch-sysconfig branch from ed4a2cc to ffd3394 Compare December 13, 2024 19:25
@charliermarsh charliermarsh merged commit d2fb4c5 into main Dec 13, 2024
71 of 73 checks passed
@charliermarsh charliermarsh deleted the charlie/patch-sysconfig branch December 13, 2024 19:36
charliermarsh added a commit that referenced this pull request Dec 13, 2024
## Summary

Based on some review feedback from
#9857.
charliermarsh pushed a commit that referenced this pull request Dec 15, 2024
## Summary

Minor follow up to #9857 to patch
AR.

Implements the AR replacement used in
[sysconfigpatcher](https://github.com/bluss/sysconfigpatcher/blob/main/src/sysconfigpatcher.py#L54),
namely

```python
DEFAULT_VARIABLE_UPDATES = {
    ...
    "AR": "ar",
}
```

## Test Plan

Added an additional test. Tested local python installs.

Related traces
```
TRACE Updated `AR` from `/tools/clang-linux64/bin/llvm-ar` to `ar`
```

/// A cursor represents a pointer in the source code.
///
/// Based on [`rustc`'s `Cursor`](https://github.com/rust-lang/rust/blob/d1b7355d3d7b4ead564dbecb1d240fcc74fff21b/compiler/rustc_lexer/src/cursor.rs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a lot like https://github.com/typst/unscanny

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks interesting!

tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request Dec 20, 2024
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [astral-sh/uv](https://github.com/astral-sh/uv) | patch | `0.5.8` -> `0.5.11` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>astral-sh/uv (astral-sh/uv)</summary>

### [`v0.5.11`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0511)

[Compare Source](astral-sh/uv@0.5.10...0.5.11)

##### Enhancements

-   Normalize `platform_system` to `sys_platform` ([#&#8203;9949](astral-sh/uv#9949))
-   Improve retry mechanisms on Windows for `copy_atomic` and `write_atomic` ([#&#8203;10026](astral-sh/uv#10026))
-   Add nuance to prefetch logging ([#&#8203;9984](astral-sh/uv#9984))
-   Update to [`python-build-standalone 20241219`](https://github.com/astral-sh/python-build-standalone/releases/tag/20241219)

##### Preview features

-   Build backend: Preserve executable bits for scripts in distributions ([#&#8203;10027](astral-sh/uv#10027))
-   Build backend: Handle case where `metadata_directory` already contains `dist-info` directory ([#&#8203;10005](astral-sh/uv#10005))

##### Performance

-   Batch resolver pre-fetches per fork ([#&#8203;10029](astral-sh/uv#10029))

##### Bug fixes

-   Allow `--script` to be provided with `uv run -` ([#&#8203;10035](astral-sh/uv#10035))
-   Allow `uv run` arguments when reading from `stdin` ([#&#8203;10034](astral-sh/uv#10034))
-   Prefer higher Python lower-bounds when forking ([#&#8203;10007](astral-sh/uv#10007))
-   Remove references to deprecated `first-match` ([#&#8203;10036](astral-sh/uv#10036))

##### Documentation

-   Add `uv python install --preview` to the documentation ([#&#8203;10010](astral-sh/uv#10010))
-   Fix `uv python install --default` note about multiple requests ([#&#8203;10011](astral-sh/uv#10011))
-   Fix typo in Caching docs ([#&#8203;10032](astral-sh/uv#10032))
-   Remove remaining references to deprecated `first-match` ([#&#8203;10038](astral-sh/uv#10038))
-   Supplement missing separators for `UV_INSTALL_DIR` directions on Windows ([#&#8203;9507](astral-sh/uv#9507))

### [`v0.5.10`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0510)

[Compare Source](astral-sh/uv@0.5.9...0.5.10)

##### Enhancements

-   Improve backtracking behavior when packages conflict repeatedly ([#&#8203;9843](astral-sh/uv#9843))
-   Patch Python `sysconfig` values such as `AR` at `ar` install time ([#&#8203;9905](astral-sh/uv#9905))
-   Patch Python `sysconfig` values such as `clang` to `cc` at install time ([#&#8203;9916](astral-sh/uv#9916))
-   Skip `--native-tls` in `pip compile` header ([#&#8203;9913](astral-sh/uv#9913))
-   Add resolver error hint for no-binary and no-build failures ([#&#8203;9948](astral-sh/uv#9948))
-   Improve build error messages ([#&#8203;9660](astral-sh/uv#9660))
-   Reduce redundant Python version incompatibilities in resolver error message ([#&#8203;9957](astral-sh/uv#9957))
-   Reduce redundant enumeration of all package versions in some resolver errors ([#&#8203;9885](astral-sh/uv#9885))
-   Improve display of ranges when pre-releases are not allowed ([#&#8203;9944](astral-sh/uv#9944))
-   Improve error messages for `uv remove` ([#&#8203;9959](astral-sh/uv#9959))
-   Improve phrasing for single term incompatibilities ([#&#8203;9953](astral-sh/uv#9953))
-   Improve styling of `uv remove` dependency hints ([#&#8203;9960](astral-sh/uv#9960))
-   Omit trailing zeros on Python requirements inferred from versions ([#&#8203;9952](astral-sh/uv#9952))
-   Show a concise error message for missing `version` field ([#&#8203;9912](astral-sh/uv#9912))
-   Use the build options value to improve hints for no wheel / source distribution errors ([#&#8203;9950](astral-sh/uv#9950))

##### Bug fixes

-   Allow multiple disjoint URLs in overrides ([#&#8203;9893](astral-sh/uv#9893))
-   Include explicit indexes in publish index choice ([#&#8203;9932](astral-sh/uv#9932))
-   Fix Python interpreter detection for 32-bit operating systems on 64-bit hosts ([#&#8203;9970](astral-sh/uv#9970))

##### Documentation

-   Fix typo "operation system" ([#&#8203;9971](astral-sh/uv#9971))
-   Clarify uninstallation docs ([#&#8203;9938](astral-sh/uv#9938))
-   Add a note to say that dependencies between workspace members are editable ([#&#8203;9363](astral-sh/uv#9363))
-   Correctly document default value of `fork-strategy` setting ([#&#8203;9931](astral-sh/uv#9931))
-   Use double quotes for Windows support in examples ([#&#8203;9946](astral-sh/uv#9946))
-   Remove `pypy` from top-level pin example ([#&#8203;9896](astral-sh/uv#9896))
-   Update references to `python-build-standalone` to reflect the transferred project ([#&#8203;9977](astral-sh/uv#9977))
-   Use a different Ruff version in documentation ([#&#8203;9943](astral-sh/uv#9943))
-   Change example so it works as-is on `powershell` and `cmd.exe` ([#&#8203;9903](astral-sh/uv#9903))
-   Clarify best practice for Python matrix strategy in GitHub Actions ([#&#8203;9454](astral-sh/uv#9454))
-   Add documentation for `uv-lock` and `uv-export` pre-commit hooks ([#&#8203;9872](astral-sh/uv#9872))

##### Preview features

-   Build backend: Fix pre-PEP 639 license files ([#&#8203;9965](astral-sh/uv#9965))

### [`v0.5.9`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#059)

[Compare Source](astral-sh/uv@0.5.8...0.5.9)

##### Enhancements

-   Fork version selection based on `requires-python` requirements ([#&#8203;9827](astral-sh/uv#9827))
-   Patch `sysconfig` data at install time ([#&#8203;9857](astral-sh/uv#9857))
-   Remove `-isysroot` when patching sysconfig ([#&#8203;9860](astral-sh/uv#9860))

##### Configuration

-   Introduce a `--fork-strategy` preference mode ([#&#8203;9868](astral-sh/uv#9868))
-   Add support for `UV_OFFLINE` ([#&#8203;9795](astral-sh/uv#9795))

##### Bug fixes

-   Avoid `panic!()` when current directory does not exist ([#&#8203;9876](astral-sh/uv#9876))
-   Avoid reusing interpreter metadata when running under Rosetta ([#&#8203;9846](astral-sh/uv#9846))
-   Avoid trailing slash when deserializing from lockfile ([#&#8203;9848](astral-sh/uv#9848))
-   Fix bug in terms when collapsing unavailable versions in resolver errors ([#&#8203;9877](astral-sh/uv#9877))
-   Fix suggestion to use `uv help python` on invalid install requests ([#&#8203;9820](astral-sh/uv#9820))
-   Skip root when assessing prefix viability ([#&#8203;9823](astral-sh/uv#9823))
-   Avoid spurious 'Upgraded tool environment' in `uv tool upgrade` ([#&#8203;9870](astral-sh/uv#9870))

##### Rust API

-   Upgrade minimum Rust version to 1.83 ([#&#8203;9815](astral-sh/uv#9815))

##### Documentation

-   Document the `--fork-strategy` setting ([#&#8203;9887](astral-sh/uv#9887))

##### Preview features

-   Build backend: Allow underscores in entrypoints ([#&#8203;9825](astral-sh/uv#9825))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOS42NC4wIiwidXBkYXRlZEluVmVyIjoiMzkuNzUuMCIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOlsiUmVub3ZhdGUgQm90Il19-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compatibility Compatibility with a specification or another tool enhancement New feature or improvement to existing functionality no-build Disable building binaries in CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants