Only show the latest version in the Arch index #33262

Open
wants to merge 13 commits into
base: main

ExplodingDragon
Contributor

@ExplodingDragon ExplodingDragon commented Jan 14, 2025

Only show the latest version of each package in the Arch repo index. Having too many packages makes the index larger, and most of the time there is little demand for downloading older versions of software packages.
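
For illustration, a minimal Go sketch of "one entry per package name in the index". The `PackageVersion` type and `LatestPerName` function are hypothetical and are not the code in this PR. It also assumes that upload time is an acceptable stand-in for version ordering; a real Arch index would need pacman's version comparison rules instead.

```go
package main

import (
	"fmt"
	"sort"
)

// PackageVersion is a hypothetical, simplified record for one uploaded
// package version. It is not Gitea's actual data model.
type PackageVersion struct {
	Name         string
	Version      string
	UploadedUnix int64 // upload time in Unix seconds (assumed ordering key)
}

// LatestPerName returns one entry per package name: the most recently
// uploaded one. Assumption: upload time stands in for version ordering.
// Older versions are only left out of the result, nothing is deleted.
func LatestPerName(all []PackageVersion) []PackageVersion {
	latest := make(map[string]PackageVersion, len(all))
	for _, pv := range all {
		cur, seen := latest[pv.Name]
		if !seen || pv.UploadedUnix > cur.UploadedUnix {
			latest[pv.Name] = pv
		}
	}

	out := make([]PackageVersion, 0, len(latest))
	for _, pv := range latest {
		out = append(out, pv)
	}
	// Sort by name so the generated index is deterministic.
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	all := []PackageVersion{
		{Name: "foo", Version: "1.0-1", UploadedUnix: 100},
		{Name: "foo", Version: "1.1-1", UploadedUnix: 200},
		{Name: "bar", Version: "2.0-1", UploadedUnix: 100},
	}
	for _, pv := range LatestPerName(all) {
		fmt.Println(pv.Name, pv.Version)
	}
}
```

The input slice is left untouched, which matches the "hide from the index" reading: older versions stay stored and downloadable, they are just not listed.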

@GiteaBot added the `lgtm/need 2` label Jan 14, 2025
@pull-request-size bot added the `size/L` label Jan 14, 2025
@github-actions bot added the `modifies/go` and `docs-update-needed` labels Jan 14, 2025
@ExplodingDragon changed the title from "Feat(package): Keep only the latest version in Arch DB" to "WIP: Keep only the latest version in Arch DB" Jan 14, 2025
@ExplodingDragon changed the title from "WIP: Keep only the latest version in Arch DB" to "Keep only the latest version in Arch Repo" Jan 15, 2025
@ExplodingDragon marked this pull request as ready for review January 15, 2025 07:30
@GiteaBot added the `lgtm/need 1` label and removed the `lgtm/need 2` label Jan 20, 2025
@wxiaoguang
Contributor

Actually I have some questions about the design.

  • KEEP_LATEST_VERSION sounds like "only keep the package with the latest version, and delete the others"
  • After reading the code, I think it means "only show the package with the latest version in the package index"

So the questions are:

  • Should we delete the old packages? Or not delete them, and just hide them from the index (as this PR does)?
  • Could it ("deleting old packages" or "not deleting, just hiding old packages from the index") be a general behavior for other package registries?
  • The config name sounds misleading: I think the name should make the expected result clear: deleting, or just hiding.

@ExplodingDragon changed the title from "Keep only the latest version in Arch Repo" to "Only show the latest version in the Arch index" Jan 21, 2025
@ExplodingDragon
Contributor Author

> Actually I have some questions about the design.
>
> * `KEEP_LATEST_VERSION` sounds like "only keep the package with the latest version, and delete the others"
> * After reading the code, I think it means "only show the package with the latest version in the package index"
>
> So the questions are:
>
> * Should we delete the old packages? Or not delete them, and just hide them from the index (as this PR does)?
> * Could it ("deleting old packages" or "not deleting, just hiding old packages from the index") be a general behavior for other package registries?
> * The config name sounds misleading: I think the name should make the expected result clear: deleting, or just hiding.

@wxiaoguang Sorry, I misspoke. `KEEP_LATEST_VERSION` will be renamed to `SHOW_LATEST_VERSION`.

@wxiaoguang
Contributor

Are there any reference documents for the "only show latest version in index" behavior? (The question is why it needs to be done on the server side, since the client could always figure out the latest version.)

@ExplodingDragon
Contributor Author

ExplodingDragon commented Jan 21, 2025

> Are there any reference documents for the "only show latest version in index" behavior? (The question is why it needs to be done on the server side, since the client could always figure out the latest version.)

@wxiaoguang No, this comes from my subjective opinion. Having too many packages will make the index larger, and most of the time, there isn't much demand for downloading older versions of software packages. Other package registries are facing the same situation.

This is just like Arch Linux's rolling release model, which only keeps the latest version.

@wxiaoguang
Contributor

Thank you for the clarification. To be honest, I think it needs more time to make this PR mature.

  1. We could/should learn from other package registries (and maybe the official Arch site) to see how they handle such requirements.
  2. If there are too many outdated packages (for example, in in-house usage where packages are published rapidly), I think it's better to delete them rather than just hide them.

@ExplodingDragon
Contributor Author

ExplodingDragon commented Jan 21, 2025

@wxiaoguang Thanks for your review.

> We could/should learn from other package registries (and maybe the official Arch site) to see how they handle such requirements.

In Arch Linux, older versions of packages are stored in the Arch Linux Archive, and the index only contains the latest versions, e.g. archlinux/core/os/x86_64. Other package repositories, like Alpine's, work the same way.

But for some software package repositories, like Kubernetes, keeping only the latest version isn't suitable because users need to install older versions to meet their requirements.

> If there are too many outdated packages (for example, in in-house usage where packages are published rapidly), I think it's better to delete them rather than just hide them.

The old package cleanup feature already exists and meets the requirements (it cleans up outdated packages and rebuilds the index).
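
A rough Go sketch of that cleanup-then-rebuild flow, for comparison with hiding. The names `Entry`, `Cleanup`, and `BuildIndex` are hypothetical, and an upload counter is assumed in place of real version comparison; this is not Gitea's actual cleanup code.

```go
package main

import (
	"fmt"
	"sort"
)

// Entry is a hypothetical stored package version. Seq is an upload counter
// used here in place of real version comparison (an assumption).
type Entry struct {
	Name    string
	Version string
	Seq     int
}

// Cleanup keeps the `keep` newest entries per package name and reports the
// rest as removed. A real registry would delete the removed files here.
func Cleanup(all []Entry, keep int) (kept, removed []Entry) {
	byName := make(map[string][]Entry)
	for _, e := range all {
		byName[e.Name] = append(byName[e.Name], e)
	}

	// Walk names in sorted order so the result is deterministic.
	names := make([]string, 0, len(byName))
	for name := range byName {
		names = append(names, name)
	}
	sort.Strings(names)

	for _, name := range names {
		group := byName[name]
		// Newest first within each package name.
		sort.Slice(group, func(i, j int) bool { return group[i].Seq > group[j].Seq })
		for i, e := range group {
			if i < keep {
				kept = append(kept, e)
			} else {
				removed = append(removed, e)
			}
		}
	}
	return kept, removed
}

// BuildIndex rebuilds the index lines from whatever entries remain.
func BuildIndex(entries []Entry) []string {
	lines := make([]string, 0, len(entries))
	for _, e := range entries {
		lines = append(lines, e.Name+"-"+e.Version)
	}
	return lines
}

func main() {
	all := []Entry{
		{Name: "foo", Version: "1.0-1", Seq: 1},
		{Name: "foo", Version: "1.1-1", Seq: 2},
		{Name: "foo", Version: "1.2-1", Seq: 3},
		{Name: "bar", Version: "2.0-1", Seq: 1},
	}
	kept, removed := Cleanup(all, 1)
	fmt.Println("index:", BuildIndex(kept))
	fmt.Println("removed:", len(removed))
}
```

The difference from hiding is that the removed entries are gone from storage, so the index shrinks because the data does, not because of a display filter.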

@wxiaoguang
Contributor

> > We could/should learn from other package registries (and maybe the official Arch site) to see how they handle such requirements.
>
> In Arch Linux, older versions of packages are stored in the Arch Linux Archive, and the index only contains the latest versions, e.g. archlinux/core/os/x86_64. Other package repositories, like Alpine's, work the same way.

So for this case, it doesn't need a new option, just "only show latest packages" for arch/alpine?

@ExplodingDragon
Contributor Author

ExplodingDragon commented Jan 21, 2025

> So for this case, it doesn't need a new option, just "only show latest packages" for arch/alpine?

@wxiaoguang As mentioned earlier (for some software package repositories like Kubernetes), there are users who need to keep historical versions.

In fact, this option should be included in the organization settings. The maximum package upload size limit and custom package registry GPG signatures also need to be moved to the organization settings.

@wxiaoguang
Contributor

> > So for this case, it doesn't need a new option, just "only show latest packages" for arch/alpine?
>
> @wxiaoguang As mentioned earlier (for some software package repositories like Kubernetes), there are users who need to keep historical versions.

Yes, so I said just "only show latest packages" for arch/alpine? It should follow the official behavior.

@wxiaoguang
Contributor

> In fact, this option should be included in the organization settings. The maximum package upload size limit and custom package registry GPG signatures also need to be moved to the organization settings.

Maybe no one has interest/motivation to refactor the user (org) settings system at the moment ..... anyway, "open source".

@lunny lunny added this to the 1.24.0 milestone Jan 21, 2025
Labels
docs-update-needed The document needs to be updated synchronously lgtm/need 1 This PR needs approval from one additional maintainer to be merged. modifies/go Pull requests that update Go code size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
5 participants