-
-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC 0185] Redistribute redistributable software #185
base: master
Are you sure you want to change the base?
Conversation
Co-authored-by: Matt McHenry <[email protected]>
I think a drawback for the plan as-is would be that some of the stuff is actually pretty large so space usage at Hydra/cache goes up. A missing related piece of work here is to look at the reverse dependencies of unfree packages and to check if they are a good idea to build. For normal packages we have some «it's large but downloading is longer than the build» packages marked as |
87cb264
to
27241c3
Compare
Good points! I just added them to the RFC. Also I'll incidentally say that TPTP seems to already be marked as |
I think that given the skew of unfree stuff towards large binaries, significantly increasing the evaluation time and somewhat increasing the storage growth are separate points to mention. (Probably we need to ask for feedback from the infrastructure team on all that at some point) |
@7c6f434c Why would binary-based packages significantly increase the evaluation time? Nixpkgs requires packages to pass strict evaluation, which means that downloading would never occur during evaluation. I haven't experiment it, but I guess that packages that requires a long evaluation time typically falls into the categories below:
none of the above are specific to unfree or binary-based packages. Update: Some binary-based packages might be built with legacy versions of libraries, which would require custom overriding if such version is uncommon in Nixpkgs. Still, such situation also occurs to large packages like TensorFlow, and small projects with few dependencies wouldn't take too long to evaluate. |
Evaluation time will increase because life is hard. Basically, even though |
I think I already listed the two points you're mentioning? The evaluation time issue was listed in the settings from the start; and I just added the built size issue as an unresolved question, as it's currently unclear whether it's negligible or not |
Also you seem to be hinting at a doubling of the eval time but I don't think that'd be the case. Hydra would eval a pkgset that'd consist of essentially |
We could do the Hydra's eval simply as now but with unfree allowed – and do this check separately in a channel-blocking job (executed on a builder instead of centralized with the eval). We have similar checks already in the CI might check this as well, but such regressions seem quite unlikely to me. |
Oh right, evaluating all the ISOs is not negligible, but indeed can be pushed to a build |
Thank you so much for working on this. Since MongoDB is the biggest offender which causes many people serious trouble day-to-day to build, perhaps we could also consider a phased rollout plan where that is the first thing to be included 😄 |
And MongoDB indeed has license which avoids most general concerns, in the sense that the source is available, and both arbitrary patches (as they are derivative, they are presumed same-license-as-MongoDB in Nixpkgs anyway) and running inside a network-isolated sandbox are permitted without restriction. This is not true for all unfree-redistributable things… |
The way I understand (and mean) the current RFC text, all currently unfree redistributable packages would stay out of hydra until marked Are there any remaining concerns on the current RFC, that I could address? :) |
@NixOS/infra-build just so that all of you see it… |
No objections to this RFC |
According to [this discussion](https://github.com/NixOS/nixpkgs/issues/83433), the current statu quo dates back to the 20.03 release meeting. | ||
More than four years have passed, and it is likely worth rekindling this discussion, especially now that we actually have a Steering Committee. | ||
|
||
Recent exchanges have been happening in [this issue](https://github.com/NixOS/nixpkgs/issues/83884). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For context, we also started building all the redistribuable+unfree packages in the nix-community sister project.
See all the unfree-redis* jobsets here: https://hydra.nix-community.org/project/nixpkgs
It's only ~400 packages. The builds are available at https://nix-community.cachix.org/
The jobset is defined in nixpkgs to make upstreaming easier:
https://github.com/NixOS/nixpkgs/blob/master/pkgs/top-level/release-unfree-redistributable.nix
If this RFC passes it will be even better as users don't necessarily know about or want to trust a secondary cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's great to know, thank you! Though we may need to do a bit more to properly handle the "cannot be run on hydra" point that was raised above.
I can already see on the hydra link you sent that eval takes <1min, so should be a negligible addition to hydra's current eval times. Build times seem to take ~half a day. AFAIU there's a single machine running the jobs. If I read correctly, hydra currently has ~5 builders, and one trunk-combined build takes ~1 day. So it means that the build times would increase by at most ~10%, and probably less considering that there is probably duplication between what the nix-community hydra builds and what nixos' hydra is already building. I'm also not taking into account machine performance, which is probably stronger on nixos' hydra than nix-community's hydra.
I think this means eval/build times are things we can reasonably live with, and if we get any surprise we can always rollback.
There's just one thing I can't find in the links you sent to properly adjust the unresolved questions: do you know how large one build closure is on nix-community's hydra? I don't know how to get it on nixos' hydra either but it'd still help confirm there's zero risk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this means eval/build times are things we can reasonably live with, and if we get any surprise we can always rollback.
Yes, especially since the way the unfree-redis jobset is put together is by evaluating and filtering trough all the nixpkgs derivations. So most likely the combined eval time is much smaller than the addition of both.
There's just one thing I can't find in the links you sent to properly adjust the unresolved questions: do you know how large one build closure is on nix-community's hydra?
The best I can think of is to build a script that takes all the successful store paths, pulls them from the cache, runs nix path-info -s
on them and then sums up the value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your answer! I actually more or less found the answer from Hydra's UI. Here is my script:
curl https://hydra.nix-community.org/jobset/nixpkgs/cuda/channel/latest > hydra-jobs
cat hydra-jobs | grep '<td><a href="https://hydra.nix-community.org/build/' | cut -d '"' -f 2 > job-urls
for u in $(cat job-urls); curl "$u" 2>/dev/null | grep -A 1 'Output size' | tail -n 1 | cut -d '>' -f 2 >> job-sizes; wc -l < job-sizes | head -c -1; echo -n " / "; wc -l < job-urls; end
awk '{sum += $1} END {print sum}' job-sizes
# NVidia kernel packages take ~1.3GiB each and there are 334-164 = 170
# Total: 215G, so 45G without NVidia kernel packages
I got the following results:
- For
unfree-redist-full
, a total of 215G, including 200G for NVidia kernel packages and 15G for the rest of the software - For
cuda
, a total of 482G
Unfortunately I cannot run the same test on NixOS' hydra, considering that it disabled the channels API.
I just updated the RFC with these numbers, it might make sense to not build all of cuda on hydra at first, considering the literally hundreds of duplicated above-1G derivations :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So with the current Hydra workflows I'd estimate that very roughly as uploading 2 TB per month to S3. (we rebuild stuff) Except that we upload compressed NARs, so it would be less.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand correctly, that it'd be reasonable to do the following?
- Just push everything, and
- if compression is not good enough rollback CUDA & NVidia kernels; and
- even if we need to rollback, the added <1T would not be an issue to keep "forever"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. To me it doesn't even feel like a technical question. (3. is WIP so far, I think. There's no removal from cache.nixos.org yet.)
Thank you for the work on this! I agree with the implementation. Redistributable in general in the license doesn't necessarily mean redistributable on Hydra, having a secondary flag that indicates whether we are comfortable doing so seems prudent. I would second a phased rollout plan where MongoDB (and maybe CUDA) come first since they have elicited the most complaints. |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/lastmodified-attribute-mismatch-in-input-lastmodified-0/58395/1 |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/rfcsc-meeting-2025-01-06/58396/1 |
Rendered