
Add trusted publisher release workflow #13048

Merged: 11 commits into pypa:main from trusted-publisher-sbi, Jan 24, 2025

Conversation


@sbidoul sbidoul commented Oct 27, 2024

closes #12708


ichard26 commented Nov 5, 2024

It'd be neat to also have the workflow cut a GitHub Release automatically: #12169. Of course, this can be done later.


sbidoul commented Nov 6, 2024

@ichard26 I'd prefer to handle GitHub releases as a separate work stream indeed.

.github/workflows/release.yml (outdated review thread, resolved)
pypi:
name: upload release to PyPI
runs-on: ubuntu-latest
environment: release
Member

Are we going to include any deployment protection rules for the release environment? I realize that means requiring a second person to briefly review any release deployments, but that seems prudent given the rise of supply-chain attacks. For example, if I were to be a maintainer, I don't think I'd need the ability to cut a release completely solo.

Member Author

What would we ask a second reviewer to verify before approving the release?

Member

The steps in the release process that this replaces are all ones that happen locally. So there's nothing visible that a second reviewer could check. It's still perfectly fine for someone doing a release to stop just before pushing the tag and ask for a review of the commit that will become the release, if they feel the need to do so.

Member

I was more so suggesting that we require a second pair of eyes for any release as a defence-in-depth mechanism against account takeover and other supply-chain attacks. Realistically, we would be compromised in other ways anyway and it isn't worth the hassle.

Member

Hmm, surprisingly I can't mark my own review comment as resolved? Either way, I have no further comments.

Member

Disclaimer: I've spent an hour writing this. I tried to present a logically consistent argument, but if it starts to run off course, you know why.

I'm somewhat uncomfortable relying on PyPA owner status for admin access to pip. I'd much rather we had a clear and well-defined set of security rules and levels within the project.

Fair enough. I brought them up as even if we introduced stricter protection rules and dropped the base pip committer permission, they would still retain full administrator permissions. Thus, they would logically be the starting point for deciding who would be in the "project owner" group. We definitely don't have to tie project ownership to PyPA ownership.

And yeah, GitHub's permission model for organisation repositories is not exactly flexible. I chose Maintain simply as it's the highest access role w/o administrator (i.e. bypass everything) permissions.

I'm not convinced we need the distinction between committers and release managers - until now we've managed just fine with all committers also being potential release managers, and I don't think we're big enough as a team to make it worth separating the roles.

Agreed.

I'm also wary of a distinct "project owner" status, as it goes somewhat contrary to the "team effort" picture I have of pip, to have privileged "owners".

I don't think having granular committer/owner roles is antithetical to the "team effort" approach to pip. When I used to maintain Black, I only retained committer access to the project, but I still felt fully able to contribute.1 Actually, I used to be effectively the lead maintainer when I had so much free time. Sure, technically I couldn't do as much as the "project owners", but 99% of what I did only required committer access anyway. More importantly, we still treated each other's opinions and suggestions from a level playing field. Yes, there was a technical imbalance, but socially, no.

I suppose this is a difference of perspective, but I consider the division of committer access as a reasonable compromise between ease of maintenance and security. If everyone has administrator access, then, yes, when we need to update a sensitive setting (say to land #13107), it's easier as we don't have to bother a specific person (or group of people). OTOH, it's rare that we actually need to do something that requires administrator access (editing protection rules, GHA settings, webhooks, etc). For the vast majority of the time when we don't need admin access, those accounts represent a larger-than-necessary security risk (if they were to be compromised somehow). The small additional friction to do X thing is IMO outweighed by the security benefits.

So I guess what I'm saying is that I don't see the point in splitting (2), (3) and (4), except conceptually. Certainly when I'm asked if someone should be a committer, I assume that includes the potential to be a RM and an admin on the project.

I trust everyone else on the pip committer team to be a RM and administrator on the project as well. If a pip committer needed to be an administrator for a legitimate reason, I have no objections to extending that permission to them. I also trust everyone to take proper measures to secure their accounts. Adding "security levels" doesn't change that for me. The problem is that mistakes happen. Access tokens get mistakenly leaked, credentials get phished or bruteforced, and a variety of other creative attacks occur. We should be prepared for that.

I'm happy to require reviews if people think that's necessary. I feel that it might introduce yet more delays into what is already a frustratingly slow process

I'm actually not entirely in favour of requiring reviews for similar reasons. I do think it's worth it to drop the base permission and require reviews for releases as they're infrequent events.

Of course, if we didn't have to worry about security then none of this would need to be discussed, but for better or worse, we don't live in such a world anymore :( Also, I haven't been a RM before (and I don't have any plans to be one soon, for lack of time). If you, one of our regular RMs, find these suggestions to be too onerous, then I'm fine with maintaining the status quo.

Footnotes

  1. There was a time where only Łukasz had admin permissions, which was annoying as we occasionally needed him to do something when he wasn't available, but that was addressed once an active maintainer (not me) was given admin permissions.

Member

I suppose this is a difference of perspective, but I consider the division of committer access as a reasonable compromise between ease of maintenance and security.

I don't disagree with that. I think part of my "hidden agenda" here is that if we're going to have distinct project owners, then I'd prefer not to be one, simply because I don't want to be a bottleneck on actions that require an admin1. I also don't want to be characterised as "owning" pip, because I'm very conscious of our shortcomings as a project, and I could do without feeling more responsible for that. But I do want to remain a PyPA owner, as I feel that I have a useful role in that context. That's all very muddled, and not really actionable, but not having an "owner vs committer" distinction brushes the problem under the carpet, which works for me 🙂

If you, one of our regular RMs, find these suggestions to be too onerous, then I'm fine with maintaining the status quo.

I think I might. My usual release workflow is that I set aside a Saturday morning for doing the release. I'll manage all the outstanding PRs and milestones in the week or two before the release, then on the day I'll go through the release process and I'm done. I keep track of the release in my head, which is fine as it's a single piece of work with no interruptions. Adding a review to that process would introduce a delay where I'd need someone else to be available, and approve the release - there's no guarantee that would happen in the sort of timescale (an hour or so max) that I'm working to.

So I think I'd need to see the Release Process section of the docs split into two parts in order to incorporate a review. Do part 1, request a review, then do part 2. And I wouldn't be able to guarantee when I could assign time for part 2 in advance, because I don't know when I'll get an approval. So the git repo needs to be (at least in some sense) frozen between parts 1 and 2, which could be days apart.

Am I overthinking this? It feels like requiring a review during the release process necessitates splitting the process into two parts like I describe above, which is bad for scheduling. But maybe I've misunderstood how getting a review would work in practice?

Of course, if we didn't have to worry about security then none of this would need to be discussed, but for better or worse, we don't live in such a world anymore :(

Agreed. But conversely, admin around security is not what I want to spend my volunteer open source time on, so keeping things streamlined is important to me.

Footnotes

  1. And I know that not having an extra admin is more of a bottleneck than me being one but occasionally unavailable; still, I don't want to feel responsible for keeping on top of "things that need admin access".

Member

FWIW, I've created a release environment for this as well.

@pradyunsg
Note: You don't actually have to pre-create it unless you're going to configure it. They are auto-created.
Though, in another comment, I tried explaining why it shouldn't be named after a process, due to the semantics. And I'll note that somebody should make sure to duplicate that name on PyPI.

@ichard26 hint: it is possible to disallow self-approvals in the environment protections, FYI. Also, it's possible to add up to 6 entries into the required reviewers list there. Not only users, but teams — you can let more people approve if you use teams.

Although, I tend to enable required reviews even on projects where I'm the only committer. This allows me to have more control over the process, and this pauses the workflow run just before starting the job. So when the build job produces the wheels, I could even download them locally, and inspect if I wanted to.
This is another reason for splitting the job into separate security scopes with lower permissions for the build one.

@pfmoore similarly, to address the “delay” concern — setting up required reviews with just 1 reviewer required and self-reviews allowed would let you have just enough control over the last action which is immutable (the actual PyPI upload) w/o contributing to any delay meaningfully. This is what I tend to configure.
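
To make the split being described concrete, here is a minimal sketch of such a two-job workflow; the job names, artifact name and environment name are illustrative, not necessarily what this PR uses:

    jobs:
      build:
        runs-on: ubuntu-latest
        permissions:
          contents: read          # building needs no other token privileges
        steps:
          - uses: actions/checkout@v4
          - run: python -m pip install build && python -m build
          - uses: actions/upload-artifact@v4
            with:
              name: dist
              path: dist/
      publish:
        needs: build
        runs-on: ubuntu-latest
        environment: release      # required reviewers / protection rules attach here
        permissions:
          id-token: write         # OIDC token for trusted publishing, only in this job
        steps:
          - uses: actions/download-artifact@v4
            with:
              name: dist
              path: dist/
          - uses: pypa/gh-action-pypi-publish@release/v1

With this shape, the approval pause happens just before the publish job starts, which is the point above about being able to download and inspect the built artifacts first.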

Member Author

I would like to come back to my initial question: What would we ask a second reviewer to verify before approving the release?

When we set up a review process, I think it is important to explain what is expected from the reviewer. When reviewing or merging a PR, this is implicit but I think generally accepted: the person who approves or merges communicates that they have looked at, and agree with, the change, unless explicitly mentioned otherwise.

Currently, as a RM preparing a release, I do not re-review or look at everything that was merged in the last quarter to ensure that no malicious code was merged on main. In effect, I assume that the review process was effective and guarded against malicious intent.

If we introduce an approval step in the release process, what would the reviewer need to do in that step? Downloading the built wheel and sdist and inspecting them certainly does not look like something that would be very practical to me. This is a genuine question.

So if we want to guard against compromise of a maintainer's GitHub account, I'd think a second approval on release is good, but only effective if we also protect main and strictly require a second review (no self-review) before every merge.

As a side note, I also think it would somewhat complicate the release process, which I usually do when I have time on a weekend, without currently coordinating with the availability of another maintainer. But I'll adapt if we reach the conclusion that this is important, of course.

Member

Agreed. My RM process is very similar. And in particular, the assumption that what is merged on main is correct, valid and ready for release. If we were to require the RM to (in effect) validate all code that went into the release I'm pretty certain I wouldn't have the time to be a RM.


sbidoul commented Nov 19, 2024

@pfmoore @pradyunsg as regular release managers, what do you think about letting a GitHub Action do the publishing to PyPI using trusted publishers?


pfmoore commented Nov 19, 2024

+1 from me.

@sbidoul sbidoul added this to the 25.0 milestone Nov 19, 2024
@ichard26 ichard26 mentioned this pull request Dec 7, 2024
@webknjaz webknjaz left a comment (Member)

👋 hey, so with my pypi-publish maintainer hat on, I figured I'd point out a few places that are considered discouraged/dangerous. Plus, there are a few suggestions that are not strictly security-related. JFYI.

.github/workflows/release.yml (outdated review thread, resolved)
.github/workflows/release.yml (outdated review thread, resolved)


sbidoul commented Dec 10, 2024

@webknjaz thanks! I have applied your recommendations.

@sbidoul sbidoul requested a review from webknjaz December 10, 2024 10:47
# Used to authenticate to PyPI via OIDC.
id-token: write
steps:
- uses: actions/checkout@v4
Contributor

Pin all the action steps to commit SHAs instead of git tags to avoid a source of mutability. You can use frizbee to do this for you if you'd like.
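
For illustration, a pinned step ends up looking like this; the SHA below is a placeholder for the full commit SHA that the tag currently points to (frizbee rewrites it for you):

    steps:
      # Placeholder SHA; pin to the real 40-character commit the tag resolves to,
      # keeping the human-readable version in a trailing comment.
      - uses: actions/checkout@<full-commit-sha>  # v4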

Member

There's also https://github.com/davidism/gha-update. And Dependabot knows to update the hashes too (also bumping the human-readable tag in a comment on the same line).
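
For reference, the Dependabot side of this is a small config; a sketch of a typical .github/dependabot.yml (pip's actual file may differ):

    # .github/dependabot.yml
    version: 2
    updates:
      - package-ecosystem: "github-actions"
        directory: "/"
        schedule:
          interval: "weekly"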

Member Author

I pinned the actions using frizbee.

@webknjaz webknjaz left a comment (Member)

@sbidoul I looked closer into other bits of the patch and noticed a few more things that I'd rather change/keep before merging.

.github/workflows/ci.yml (outdated review thread, resolved)
.github/workflows/release.yml (outdated review thread, resolved)
.github/workflows/release.yml (outdated review thread, resolved)
on:
push:
tags:
- "*"
Member

This is unnecessary, it's the same by default:

Suggested change: remove the - "*" line.

Member Author

When I remove this line, vscode complains. I could put an empty sequence but I'm not sure it is easier to read.

Member

I'm not sure that an empty sequence has the same semantics. I think null or ~ might be equivalent, though.
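
For reference, the two spellings under discussion look like this; whether the null form really behaves identically was not verified here:

    # Form kept in the PR: run on every tag push
    on:
      push:
        tags:
          - "*"

    # Alternative raised above (null filter value, semantics unconfirmed):
    # on:
    #   push:
    #     tags: ~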

noxfile.py (outdated)
@@ -315,94 +314,3 @@ def prepare_release(session: nox.Session) -> None:
next_dev_version = release.get_next_development_version(version)
release.update_version_file(next_dev_version, VERSION_FILE)
release.commit_file(session, VERSION_FILE, message="Bump for development")


@nox.session(name="build-release")
Member

Can we keep this session so nox is still at the center of the process and the manual path remains in the repo? Plus, consider implementing pinning of the build env at this level per my other comments.

@sbidoul sbidoul mentioned this pull request Dec 11, 2024
@sbidoul sbidoul force-pushed the trusted-publisher-sbi branch 5 times, most recently from 604154c to cd82ee7, January 12, 2025 15:05

sbidoul commented Jan 12, 2025

Here is an updated version.

I pinned the build dependencies, and use them in a dedicated build environment, with build --no-isolation.

I pinned the release GitHub actions (using frizbee).

I chose not to use the setup-python action, since we have Python in the GitHub runner already, and it is therefore one less thing to audit.

In a follow-up I plan to update the nox build session to use a similar build process with the pinned build deps. I chose, however, not to use nox in the release process, to avoid having to pin nox's dependencies: there are too many of them and I feel that would make auditing the build environment harder.

Is this reasoning of limiting the number of dependencies used in the build process in order to facilitate auditing valid? One question I have is about the auditability of the GitHub runner used for the build. The job log gives a link to the runner image release. But what if the job log is lost? Is that recorded somewhere else?
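
For concreteness, the pinned, non-isolated build described above boils down to steps along these lines; this is a sketch, with the requirements file name taken from the pip-compile header quoted further down, and the exact workflow may differ:

      # Install the pinned build tooling from hash-checked requirements,
      # then build without isolation so only those pinned tools are used.
      - run: python -m pip install --require-hashes -r build-requirements.txt
      - run: python -m build --no-isolation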


notatallshaw commented Jan 12, 2025

Is this reasoning of limiting the number of dependencies used in the build process in order to facilitate auditing valid? One question I have is about the auditability of the GitHub runner used for the build. The job log gives a link to the runner image release. But what if the job log is lost? Is that recorded somewhere else?

I can't speak for any "official" audit processes, but when I look at validating artifacts myself I'm looking for, at a minimum:

  • Are there version controlled build steps?
  • Are there sufficiently pinned dependencies?
  • When I run the build steps with the pinned dependencies do they produce binary identical artifacts? (wheel and sdist in this case)

From that point of view I'm supportive of minimizing dependencies, as there's less chance of things going wrong.

Beyond that, is pinning down the non-Python package dependencies an explicit goal of this PR? If so, I would recommend using a pinned Python Docker image to run the build process, with Python always called in isolated mode.
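
If that ever did become a goal, GitHub Actions can run a job inside a digest-pinned container; a rough sketch (the digest is a placeholder, not a real value):

    build:
      runs-on: ubuntu-latest
      container:
        image: python@sha256:<digest>   # pin the image by digest, not by tag
      steps:
        - uses: actions/checkout@v4
        - run: python -m pip install --require-hashes -r build-requirements.txt
        - run: python -m build --no-isolation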


sbidoul commented Jan 12, 2025

Beyond that, is pinning down the non-Python package dependencies an explicit goal of this PR?

Not a goal of mine, at least.

If so I would recommend using a pinned Python docker image to run the build processes with Python always called using isolated mode.

But that would add one more moving piece to the game.

So I think I'm happy with this PR as it is. My question about the auditability of the GitHub runner is more curiosity than anything I want or think we should address.

# This file is autogenerated by pip-compile with Python 3.12
# by the following command:
#
# pip-compile --allow-unsafe --generate-hashes build-requirements.in
Member

Side note: these options can be set in a config file that pip-tools supports. Consider doing so in a follow-up if more of pip-tools eventually ends up being used.

@webknjaz (Member)

I think the CI logs live for about 3 months before garbage collection. The artifacts, I think, also live for the same amount of time. In CI/CD workflows where the process is shared between release and testing, I tend to conditionally set the retention time to 90 days, which is usually max.

To make the dists reproducible, you have to set the $SOURCE_DATE_EPOCH env var which most build backends (including setuptools) would recognize. People typically use the timestamp that Git's HEAD is pointing to at the time of building: https://github.com/ansible/awx-plugins/blob/c8cff62/.github/workflows/ci-cd.yml#L543C15-L543C59.

This would have to be duplicated in noxfile.py, though, to ensure that building locally (which might be needed in cases of emergency or for verification purposes) is the same. This is kinda the main reason I wanted it to be reused in the workflow. If you don't want to have this in noxfile.py directly, it might make sense to put it into a Python script that both would call. Might be an idea for a follow-up.
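
Concretely, the linked pattern amounts to a single step before the build; a sketch, not necessarily how pip's workflow ends up wiring it:

      - name: Export SOURCE_DATE_EPOCH from the commit being built
        run: echo "SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)" >> "$GITHUB_ENV"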

Using the Python Docker image would indeed improve reproducibility at the cost of delegating reviewing that image to somebody else (provided that it's pinned using SHA).

That said, I don't see any serious blockers here. If you're happy with the PR, I'd say — merge it and think about other things in a follow-up.

@sbidoul sbidoul merged commit 6b0fb90 into pypa:main Jan 24, 2025
31 checks passed
@sbidoul sbidoul deleted the trusted-publisher-sbi branch January 24, 2025 07:39

sbidoul commented Jan 24, 2025

We'll know soon :) Thanks again to everyone involved here!


sbidoul commented Jan 24, 2025

We have the first dependabot update already: #13171.

@webknjaz (Member)

We have the first dependabot update already: #13171.

Yeah, that allows you to start using license expressions in the core packaging metadata.


sbidoul commented Jan 26, 2025

The new release process went well.

Two things I note:

  • While SOURCE_DATE_EPOCH is taken into account in the built wheel, and unzip -v shows the exact same content (including timestamps and crc32), I could not reproduce a wheel with the same sha256 on my machine. I don't know if that would be expected.
  • While the PyPI simple index page shows a data-provenance attribute with value https://pypi.org/integrity/pip/25.0/pip-25.0-py3-none-any.whl/provenance, that URL returns {"message":"Request not acceptable"}.

@webknjaz (Member)

I could not reproduce a wheel with the same sha256 on my machine. I don't know if that would be expected.

And with the same env var set to the same epoch? Have you tried comparing the contents with a recursive diff?

@webknjaz (Member)

While the PyPI simple index page shows a data-provenance attribute with value https://pypi.org/integrity/pip/25.0/pip-25.0-py3-none-any.whl/provenance, that URL returns {"message":"Request not acceptable"}

@woodruffw is this expected?


potiuk commented Jan 26, 2025

While SOURCE_DATE_EPOCH is taken into account in the built wheel, and unzip -v shows the exact same content (including timestamps and crc32), I could not reproduce a wheel with the same sha256 on my machine. I don't know if that would be expected.

Comment: I have solved quite a number of these reproducibility issues in Airflow - our packages have been reproducible for about a year (well, almost; sometimes we find some small issues).

One cause that is non-obvious is the umask of the system you run the build on.

Generally speaking, when Git checks out code it uses the umask by default to create files - https://git-scm.com/docs/git-config#Documentation/git-config.txt-coresharedRepository. On some systems the umask allows group write, on some group read, on some neither. While Git preserves some permission bits on POSIX filesystems (like the executable bit for the user), it relies on the umask for most other things, and a difference in umask produces different binary artifacts. The solution is to have a script that clears the group bits before building the package (the other bits are generally clear already, but you can clear them as well).

I heartily recommend https://diffoscope.org/ which is a fantastic tool to compare artifacts and see the differences. It's been developed as part of the "reproducible builds" effort.
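
A sketch of the kind of normalization step described above, run before invoking the build (assuming, as the later comments suggest, that group/other write bits are the relevant difference):

      - name: Normalize permissions so the local umask does not leak into the archives
        run: chmod -R g-w,o-w .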


potiuk commented Jan 26, 2025

Some other things - you can take a look here, where I keep everything needed to make Airflow builds reproducible: https://github.com/apache/airflow/blob/main/dev/breeze/src/airflow_breeze/utils/reproducible.py -> some packaging tools (mostly for tars) do not package things deterministically, so I had to rewrite parts of the process. For example, you have to set the same LOCALE during the build, because if you want to repack things deterministically, file order matters, and LOCALE affects sort order.

The part with permissions is here: https://github.com/apache/airflow/blob/main/dev/breeze/src/airflow_breeze/utils/reproducible.py#L110


sbidoul commented Jan 26, 2025

The one reason that is non-obvious is a question of umask of the system you run it on.

That's it, thanks!


sbidoul commented Jan 26, 2025

I heartily recommend https://diffoscope.org/ which is a fantastic tool to compare artifacts and see the differences. It's been developed as part of the "reproducible builds" effort.

I had tried unzip -v, pkgdiff and pip-wheel-diff. TIL diffoscope, which revealed the file permission difference.


sbidoul commented Jan 26, 2025

FWIW, I don't think we should add more code to our build process for reproducibility. If that is considered important I'd rather switch to a build backend that supports that out of the box.

@ichard26 (Member)

One question I have is about the auditability of the GitHub runner used for the build.

The nice thing about trusted publishing is that it links the release artifacts to the exact GHA run that built and published them. The sigstore transparency logs (sdist, wheel) have a Run Invocation URI attribute. Checking our settings, our logs persist for the longest period allowed by GitHub, which is 90 days. That's long enough that a compromised release would be detected and audited before the logs expire.


notatallshaw commented Jan 26, 2025

FWIW, I don't think we should add more code to our build process for reproducibility. If that is considered important I'd rather switch to a build backend that supports that out of the box.

I agree, and I think it's worth a separate issue if it's time to move pip to a different backend.

For example, flit or hatchling: hatchling has been gaining significant popularity, and flit is very minimal. My understanding is that both are better for build reproducibility. At work I switched to hatchling; it solved my reproducibility issues and let me remove the supporting code I previously needed for that (though I wouldn't have had this umask issue).

@woodruffw (Member)

While the PyPI simple index page shows a data-provenance attribute with value https://pypi.org/integrity/pip/25.0/pip-25.0-py3-none-any.whl/provenance, that URL returns {"message":"Request not acceptable"}

@woodruffw is this expected?

Yep: the endpoint is currently pretty strict about the Accept header it wants. I have a PR open to relax it a bit, but in the meantime passing application/vnd.pypi.integrity.v1+json as the Accept header should make the request succeed.

(See pypi/warehouse#17498)
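
Concretely, until that relaxation lands, a request with that Accept header should work; for example (shown as a workflow-style step, though the same curl invocation works from a local shell):

      - name: Fetch the provenance document for the 25.0 wheel
        run: >-
          curl --fail
          -H 'Accept: application/vnd.pypi.integrity.v1+json'
          https://pypi.org/integrity/pip/25.0/pip-25.0-py3-none-any.whl/provenance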


sbidoul commented Jan 26, 2025

@woodruffw thanks! Out of curiosity, what is the easiest way to get something a human can grok out of that provenance URL?


potiuk commented Jan 26, 2025

FWIW, I don't think we should add more code to our build process for reproducibility. If that is considered important I'd rather switch to a build backend that supports that out of the box.

I agree, and I think it's worth a separate issue if it's time to move pip to a different backend.

For example flit or hatchling, hatchling has been gaining significant popularity, and flit is very minimal. My understanding is both are better for build reproducibility, at work I switched to hatchling and it solved my reproducibility issues and was able to remove supporting code that I had to do that previously (though I wouldn't have had this umask issue).

We use both flit and hatchling and they did not solve all the issues.

My recommendation (and this is what we do) is to just do the reproducible build in a controlled environment (a container image - we use a Debian buster Python image as the base). That helps to battle all the environmental issues - and gives easy instructions to someone who wants to verify the build.

The important thing about reproducible builds is not that they are "always reproducible" - but that they can be "easily reproduced when following the same build steps and environment". That allows 3rd parties (which are inevitably going to start doing this) to verify and attest that the build published by the maintainer has not been tampered with.

It will be enough for 3 or 4 such trusted parties to keep a public ledger where they attest that indeed, when you follow the build process and check out this git branch, you get a binary-identical result.

So what is important is to have a way that they can easily follow to reproduce it. This is the real value of reproducible builds. And it might help to prevent things like ultralytics https://blog.pypi.org/posts/2024-12-11-ultralytics-attack-analysis/ and the XZ backdoor https://en.wikipedia.org/wiki/XZ_Utils_backdoor - both of which involved a package that contained different things than the repository tag it was produced from, because either a rogue maintainer modified the scripts (in the xz case) or cache poisoning modified the package through GitHub Actions (in the ultralytics case).

@sethmlarson -> WDYT? Am I right with my assessment? Do you know of any 3rd parties that might attempt this kind of public ledger/verification of the artifacts published on PyPI?


potiuk commented Jan 26, 2025

BTW, in Airflow we effectively already have those 3rd parties - 3 members of the Airflow PMC have to build the packages we release, and only after the 3 of them independently confirm that the packages are the same do we release them.

Others might not have that luxury - but the Apache Software Foundation has always been serious about a release being a legal act of the foundation, with 3 PMC members having to vote +1 on such a release - so it was very easy to plug this into our process.

@woodruffw (Member)

@woodruffw thanks! Out of curiosity, what is the easiest way to get something a human can grok out of that provenance URL?

No problem!

The easiest human-grokkable presentation is probably the one on PyPI itself at the moment, e.g. for the sdist: https://pypi.org/project/pip/#pip-25.0.tar.gz

Here's how that appears in my browser: [screenshot]

For the JSON itself, the next best thing would probably be pypi-attestations verify pypi, although that's currently more of a demo CLI than something users should stabilize on 🙂


pfmoore commented Jan 26, 2025

FWIW, I don't think we should add more code to our build process for reproducibility.

I agree. One question: is the target simply to have the official builds, run on GitHub Actions, be reproducible? Or is the intention that a pip maintainer, or a 3rd party, can reproduce the same build locally?

The reason I ask is that we've traditionally had problems with the build/release process on Windows, because I'm probably the only maintainer who uses Windows. And the more complexity we have in the build process, the more risk there is that something doesn't get tested on Windows, and I hit it during a release... (I don't think this is the case here, but I've not been following the changes closely).


sbidoul commented Jan 26, 2025

@pfmoore I have tested running build-project.py on Windows. I have not tested the build-release nox session, but it should still work on all platforms too.

I have considered running our packaging test on Windows and macOS but have refrained so far because that step is a prerequisite for other steps and I worried it would make CI slower.

@ichard26 (Member)

I have considered running our packaging test on Windows and macos but refrained so far because that step is a prerequisite for other steps and worried to make CI slower.

We could remove the dependency on the packaging job for the test jobs to run. I don't think the packaging job fails that often in practice, so we're not really saving any CI resources. 👍 to test our packaging flow on at least Windows.
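
A sketch of what that could look like, if the packaging job grew an OS matrix (job name and steps are illustrative):

    packaging:
      strategy:
        matrix:
          os: [ubuntu-latest, windows-latest]
      runs-on: ${{ matrix.os }}
      steps:
        - uses: actions/checkout@v4
        - run: python -m pip install --require-hashes -r build-requirements.txt
        - run: python -m build --no-isolation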


pfmoore commented Jan 26, 2025

Just to be clear, I wasn't suggesting there would be problems. Just that the talk about umasks made me wonder how (or if) that would apply to Windows, and whether the same build process would give identical results on Windows and Unix. Or would we get concerns from people who couldn't reproduce our "reproducible" build, simply because I did the build on Windows1.

My recommendation (and this is what we do) is just do the reproducible build in controlled environment (container image - we use debian buster python as a base).

This is the sort of thing that concerns me, as it assumes everyone has docker available (which, for example, wasn't true at my previous place of work). I'm not against adding prerequisites, but I'd prefer that we were cautious in doing so. We have enough resource problems already, and I'd rather we didn't add any more obstacles to adding new maintainers/RMs than we have to.

Footnotes

  1. This isn't entirely hypothetical. At one point, we had an issue raised because I did a release and one of our processes didn't normalise line endings, meaning that a text file in the release had CRLF line endings.


potiuk commented Jan 26, 2025

This isn't entirely hypothetical. At one point, we had an issue raised because I did a release and one of our processes didn't normalise line endings, meaning that a text file in the release had CRLF line endings.

It's about verification, not the release process. With trusted publishing, the release process happens on GitHub, so you - or any other maintainer - can run the "release" workflow and it will work, regardless of what machine you have locally.

It's more about giving "others" (3rd parties, or selected other maintainers who volunteer to verify that the build is reproducible) a clear and unambiguous description of how to do so.


potiuk commented Jan 26, 2025

The main point, simply, is that not everyone must be capable of running the release and getting a reproducible build. This is the same concept as the sigstore ledger - you just need enough trusted people, including 3rd parties, to be able to run (and to actually run) the reproducible build process and arrive at the same binary. Hopefully it will become the norm that they do it and publish the results - using the specified build environment and process.

But it absolutely does not mean that everyone in all circumstances will be able to produce the same binary result - that has never been the goal of the "reproducible builds" idea. The idea is that you give those who want to verify it a clear recipe for how to do the reproducible build (and make sure that the build you publish to PyPI - for example using a GitHub Action - is done using the same recipe). That's all.

@pradyunsg (Member)

FWIW, I'm still mildly concerned about the increased/moved security surface from pypi.org to pypi.org/github.com account.

Prior to this change, the only way to cut a compromised release (barring a malicious maintainer) was to compromise the pypi.org account of one of the maintainers, which gets used in a very limited context. This now changes that to the pypi.org account of one of the maintainers, or anyone with admin on the github.com repository (which is all maintainers + all PyPA admins) -- the latter of which gets used in a lot more places, including as a login provider.

It's not a big enough problem to be a blocking concern (evidently) but I'm noting it down here nonetheless since it's a change we should all be mindful of.1 One thing worth noting is that we do have 2FA enforced on the org as well, so things should be fine in the grand scheme of things.

Footnotes

  1. Partly writing this for me, TBH.


potiuk commented Jan 27, 2025

FWIW, I'm still mildly concerned about the increased/moved security surface from pypi.org to pypi.org/github.com account.

In case you have not done it - the best practice (also mentioned by @sethmlarson in https://blog.pypi.org/posts/2024-12-11-ultralytics-attack-analysis/) is to have a separate deployment environment and configure your trusted publishing to only accept releases from that environment. There are various protection rules that you can implement, and you can set up up to 6 people to be able to actually run the release job there - as far as I understand. In fact, we are waiting to enable trusted publishing in all Apache Software Foundation projects before we give projects the ability to manage such deployment environments:

https://docs.github.com/en/actions/managing-workflow-runs-and-deployments/managing-deployments/managing-environments-for-deployment#deployment-protection-rules


sbidoul commented Jan 27, 2025

To elaborate on the current config, the PyPI environment here on GitHub is configured to require review by a member of pypa/pip-committers, and the PyPI side is configured to require that environment.

During the release process, a confirmation by a member of pypa/pip-committers is therefore required.

Nevertheless, @pradyunsg is correct in saying that the attack vectors for getting a pip release out have changed.

Whether this is worse or better than before, I can't tell, and there is certainly no absolute answer to that question.

@woodruffw (Member)

Just as an update: the improvements to the Accept header handling have landed on PyPI, so consumers of the provenance endpoints should see fewer HTTP 406s now 🙂

Successfully merging this pull request may close these issues.

Improve the release process to enable trusted publishing
10 participants