
lammps: Use cudaPackages.backendStdenv if needed #366528

Closed
wants to merge 1 commit

Conversation

doronbehar
Contributor

Description of changes

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.


@vcunat vcunat changed the base branch from staging-next to master December 23, 2024 14:02
@ofborg ofborg bot requested a review from costrouc December 24, 2024 03:34
@ofborg ofborg bot added 11.by: package-maintainer This PR was created by the maintainer of the package it changes 10.rebuild-darwin: 1-10 10.rebuild-linux: 1-10 labels Dec 24, 2024
@markuskowa
Member

Does lammps even have CUDA support at the moment? (I don't see any CUDA inputs.)
I wonder if it would not be more efficient to find a solution at the level of the CUDA packages (e.g. cudaPackages could expose a compatible stdenv?). Every package that uses CUDA should have a similar problem.
In any case, I would suggest making the pinning of the stdenv conditional on whether CUDA is enabled.

@doronbehar
Contributor Author

Does lammps even have CUDA support at the moment? (I don't see any CUDA inputs.)

Not at the moment, but I add it in my personal overlays via the extraBuildInputs argument.

I wonder if it would not be more efficient to find a solution at the level of the CUDA packages (e.g. cudaPackages could expose a compatible stdenv?). Every package that uses CUDA should have a similar problem.

Yea, I agree. Perhaps the @NixOS/cuda-maintainers will have an opinion on the matter. The only thing that is important to me is to help future users of the package avoid the issue I encountered here: lammps/lammps#4420 . Perhaps making autoAddDriverRunpath error out early if the compiler version it finds is too high would be best (we can worry about making its inclusion in lammps conditional afterwards).

In any case, I would suggest making the pinning of the stdenv conditional on whether CUDA is enabled.

I agree with that too.
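
Such conditional pinning could be sketched roughly like this (a hypothetical sketch, not the actual package.nix: the argument names are assumptions, and gcc12Stdenv stands in for whatever pinned stdenv is chosen):

```nix
{ stdenv, gcc12Stdenv, config, extraCmakeFlags ? { }, ... }:
let
  # Pin the compiler only when a CUDA build is actually requested.
  cudaRequested =
    (config.cudaSupport or false)
    || (extraCmakeFlags.GPU_API or "") == "cuda";
  stdenv' = if cudaRequested then gcc12Stdenv else stdenv;
in
stdenv'.mkDerivation {
  pname = "lammps";
  # ... rest of the derivation unchanged
}
```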

@markuskowa
Copy link
Member

Turns out there is already cudaPackages.backendStdenv, which could be used here for CUDA builds.

@doronbehar
Copy link
Contributor Author

Turns out there is already cudaPackages.backendStdenv, which could be used here for CUDA builds.

Nice. I think this resolves our request to the @NixOS/cuda-maintainers. Based on a pattern I found with git grep cudaPackages.backendStdenv, I pushed something which I think will satisfy you too.

@github-actions github-actions bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux and removed 10.rebuild-darwin: 1-10 10.rebuild-linux: 1-10 labels Dec 25, 2024
@doronbehar doronbehar changed the title lammps: Use gcc12Stdenv because CUDA doesn't support GCC14 lammps: Use cudaPackages.backendStdenv if needed Dec 25, 2024
pkgs/by-name/la/lammps/package.nix (outdated)
# Ease adding CUDA support. See:
# https://github.com/lammps/lammps/issues/4420
stdenv' =
if overrideStdenv && (config.cudaSupport || ((extraCmakeFlags.GPU_API or "") == "cuda")) then
Member

In case "cudaSupport" is true, do you want to also set the corresponding cmakeFlags and buildInputs so that lammps gets built automatically with CUDA (i.e. in a similar way to what we do in openmpi)?

Contributor Author

In case "cudaSupport" is true, do you want to also set the corresponding cmakeFlags and buildInputs so that lammps gets built automatically with CUDA (i.e. in a similar way to what we do in openmpi)?

I think it makes the logic pretty complicated, and I'm not sure I want to delve into implementing this. How many details do we want to be taken care of automatically? What do we want to happen in case someone simply sets config.cudaSupport? TBH, I don't like this flag at all: it makes package maintainers' lives pretty complicated, especially for complex packages such as this one. Since the CUDA support for every package is different, users should read the docs of such packages anyway, and not simply count on config.cudaSupport to do the job.

Here's a list of scenarios I was able to think about that we should handle:

  # config.cudaSupport | packages.GPU | extraCmakeFlags.GPU_API || use cudaPackages.backendStdenv? || is valid?
  # ------------------ | ------------ | ----------------------- || ------------------------------- || ---------
  #      false         |    false     |         "opencl"        ||                                 ||
  #      true          |    false     |         "opencl"        ||                                 ||
  #      false         |    true      |         "opencl"        ||                                 ||
  #      true          |    true      |         "opencl"        ||                                 ||
  #      false         |    false     |         "cuda"          ||                                 ||
  #      true          |    false     |         "cuda"          ||                                 ||
  #      false         |    true      |         "cuda"          ||                                 ||
  #      true          |    true      |         "cuda"          ||                                 ||
  #      false         |    false     |         "hip"           ||                                 ||
  #      true          |    false     |         "hip"           ||                                 ||
  #      false         |    true      |         "hip"           ||                                 ||
  #      true          |    true      |         "hip"           ||                                 ||

Not only that, there are the cases where packages.GPU is not specified, and/or extraCmakeFlags.GPU_API is not specified. And when the opencl GPU_API is used, you should add the ocl-icd package to buildInputs, and cudaPackages.cudatoolkit when the cuda GPU_API is used.

I tried for an hour or so to implement logic that would satisfy everyone, and I got lost in the sea of options above. In the end, I decided to revert the last revision, which read the value of config.cudaSupport, as I want the expression to be left clean of heuristic decisions that are hard to figure out just from reading the Nix code.

I guess this is the price we pay for the full configurability of extraCmakeFlags (and packages) and extraBuildInputs. I added a NOTE comment explaining this decision.
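
For illustration, the backend-dependent inputs mentioned above could be sketched like this (a hypothetical fragment, not the actual package.nix; it assumes extraCmakeFlags, extraBuildInputs, ocl-icd and cudaPackages are all in scope):

```nix
let
  gpuApi = extraCmakeFlags.GPU_API or "opencl";
in
{
  buildInputs =
    extraBuildInputs
    ++ lib.optionals (gpuApi == "opencl") [ ocl-icd ]
    ++ lib.optionals (gpuApi == "cuda") [ cudaPackages.cudatoolkit ];
}
```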

Member

I think I understand the problem. However, if the config is too complex, and if the intention is to configure it externally via extraCmakeFlags, I am not sure overrideStdenv is even needed here.
In that case you could override it accordingly and have more transparency:

lammps.override {
  stdenv = cudaPackages.backendStdenv;
  extraCmakeFlags = { GPU_API = "cuda"; };
  # <+ whatever else needs to be overridden>
}

Contributor Author

I think I understand the problem. However, if the config is too complex, and if the intention is to configure it externally via extraCmakeFlags, I am not sure overrideStdenv is even needed here. In that case you could override it accordingly and have more transparency:

lammps.override {
  stdenv = cudaPackages.backendStdenv;
  extraCmakeFlags = { GPU_API = "cuda"; };
  # <+ whatever else needs to be overridden>
}

When you suggested adding an overrideStdenv, I thought you wanted to help with the following case: a user wishes to build with -DGPU_API=cuda (and -DPKG_GPU=true), and to use neither the default stdenv nor cudaPackages.backendStdenv, but something like clangStdenv (perhaps for testing the build, or whatever). Without an overrideStdenv flag, this user would have no way to tell the build to use their supplied stdenv argument.

The discussion of whether to have an overrideStdenv doesn't add to or subtract from the complexity described in the above table.

Contributor

-DGPU_API=cuda

Our overrideXXXX zoo is undeniably a source of infinite pain, but I think for now it's reasonable to say that in these cases "the user is on their own" and they should just use overrideAttrs to add -D flags manually.

The if cudaSupport then backendStdenv else stdenv is an unfortunate pattern, since the user now has to .override both stdenv = ...; and cudaPackages = ...; to be sure their argument isn't silently ignored. Your overrideStdenv solution helps with that, but not in a systematic way. Maybe we're better off without it and should build towards changing the callPackage mechanism instead.

Contributor Author

Here, specifically, it's the case when the user wants to override the stdenv but still pass the GPU flags. We can try to handle every conceivable combination, but afaiu this is getting complicated, and we'll need to keep maintaining that complexity.

My goal here is only to help Nix users of this package avoid the build error described here:

This build error is barely understandable, and hence it'd be nice to spare users' time by making the usage of both the -DGPU_API="cuda" and -DPKG_GPU=ON CMake flags automatically trigger the usage of cudaPackages.backendStdenv. I want to make this logic as simple as possible, and hence introducing config.cudaSupport into this logic game is way too much IMO.

Explaining the situation once more, and also after the "systematicity" discussion, I think perhaps we can use an assertion that checks whether stdenv.cc.version is lower than or equal to cudaPackages.backendStdenv.cc.version. With this logic, we'd spare the overrideStdenv argument, while still helping users avoid the build error mentioned above. What do you think about that?
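
Such an assertion could be sketched like this (a hypothetical fragment; it relies on lib.versionAtLeast x y holding when x >= y):

```nix
# Fail evaluation early when a CUDA build is requested with a host
# compiler newer than the one backendStdenv ships.
assert ((extraCmakeFlags.GPU_API or "") == "cuda")
  -> lib.versionAtLeast cudaPackages.backendStdenv.cc.version stdenv.cc.version;
```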

Disclaimer: I may not have familiarized myself with the expression enough to jump to conclusions, I'm not sure if I'm being helpful

The expression is not that complicated at the moment, so I think this is not why I personally feel you are not being helpful. I feel you are not being helpful because it takes me too long to understand you.

Member

Sorry for the late reply. I do understand your motivation to handle the stdenv logic internally. It seems like an acceptable compromise for now.
For now the situation with CUDA is a global mess.

Contributor Author

For now the situation with CUDA is a global mess.

I just thought of another idea that @NixOS/cuda-maintainers might like: add a setup hook to cudaPackages.cudatoolkit that would prevent packages from being compiled with a gcc --version newer than cudaPackages.backendStdenv.cc.version!
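
A setup hook along these lines might look roughly like the following (a hypothetical sketch, not the real cudaPackages implementation: the function names are made up, and "12.4.0" merely stands in for cudaPackages.backendStdenv.cc.version):

```shell
# True when $1 <= $2 under version-sort ordering (GNU sort -V).
versionLeq() {
  [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

# Hypothetical hook body: abort if the host gcc is newer than the
# newest gcc version that nvcc is known to support.
checkCudaHostCompiler() {
  local maxSupported="12.4.0"   # stand-in for backendStdenv.cc.version
  local actual
  actual="$(gcc -dumpfullversion)"
  if ! versionLeq "$actual" "$maxSupported"; then
    echo "error: gcc $actual is newer than the newest nvcc-supported gcc ($maxSupported)" >&2
    return 1
  fi
}

# preConfigureHooks+=(checkCudaHostCompiler)  # how a real hook would register itself
```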

Contributor

I just thought of another idea that @NixOS/cuda-maintainers might like: add a setup hook to cudaPackages.cudatoolkit that would prevent packages from being compiled with a gcc --version newer than cudaPackages.backendStdenv.cc.version!

Yes, this conversation also made me think in this direction

Contributor Author

I just thought of another idea that @NixOS/cuda-maintainers might like: add a setup hook to cudaPackages.cudatoolkit that would prevent packages from being compiled with a gcc --version newer than cudaPackages.backendStdenv.cc.version!

Yes, this conversation also made me think in this direction

OK, great. I think this should resolve the whole purpose of this PR. For reference:

Hence, I'm closing this.

@@ -44,9 +45,27 @@
extraCmakeFlags ? { },
Contributor

It's true the semantics of config.cudaSupport are somewhat vague. I think my expectation would be something along the lines of extraCmakeFlags ? optionalAttrs config.cudaSupport { GPU_API = "cuda"; }, which would parallel how other packages accept cudaSupport ? config.cudaSupport

Contributor

Alternatively, one could cut off some of the combinations of arguments using asserts...

Contributor Author

It's true the semantics of config.cudaSupport are somewhat vague. I think my expectation would be something along the lines of extraCmakeFlags ? optionalAttrs config.cudaSupport { GPU_API = "cuda"; }, which would parallel how other packages accept cudaSupport ? config.cudaSupport

Yes, but that won't be enough to make the build change anything: you also have to set packages.GPU to true (or, CMake-wise, -DPKG_GPU=ON). See my table outside the review comments.

Contributor

I suppose

"the user is on their own" and they should just use overrideAttrs to add -D flags manually.

...is my stance on this, but it's obviously a compromise

@doronbehar
Contributor Author

Please tell me how to fill in this table and I'll consider implementing what you think makes sense:

  config.cudaSupport | packages.GPU    | extraCmakeFlags.GPU_API | does it make sense? | use cudaPackages.backendStdenv? | CMake flags to set
  ------------------ | --------------- | ----------------------- | ------------------- | ------------------------------- | ------------------
  false (default)    | false (default) | opencl (default)        |                     |                                 |
  true               | false (default) | opencl (default)        |                     |                                 |
  false (default)    | true            | opencl (default)        |                     |                                 |
  true               | true            | opencl (default)        |                     |                                 |
  false (default)    | false (default) | cuda                    |                     |                                 |
  true               | false (default) | cuda                    |                     |                                 |
  false (default)    | true            | cuda                    |                     |                                 |
  true               | true            | cuda                    |                     |                                 |
  false (default)    | false (default) | hip                     |                     |                                 |
  true               | false (default) | hip                     |                     |                                 |
  false (default)    | true            | hip                     |                     |                                 |
  true               | true            | hip                     |                     |                                 |

Contributor

Another observation along the same lines: .override { extraBuildInputs = as; extraCmakeFlags = bs; } is "just" .overrideAttrs (oldAttrs: { buildInputs = oldAttrs.buildInputs or [ ] ++ as; cmakeFlags = oldAttrs.cmakeFlags or [ ] ++ bs; })

Arguably, this need is very common in nixpkgs, which I think is evidenced by experiments like DavHau's "drv-parts" and amjoseph's "infuse" (both being more "systematic" solutions in the sense described in the other comment). Few packages in Nixpkgs have this kind of ad hoc helper (referring to extraCmakeFlags).

So I wonder if we should keep maintaining these arguments and the associated complexity.
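
Spelled out, the equivalence claimed above might look like this (a sketch: `as` and `bs` are placeholder lists, and it assumes the package expression simply appends them; in the actual lammps expression extraCmakeFlags is an attrset, so the correspondence is looser than written):

```nix
# Hypothetically equivalent ways to extend the package, per the claim above:
pkgViaOverride = lammps.override {
  extraBuildInputs = as;
  extraCmakeFlags = bs;
};

pkgViaOverrideAttrs = lammps.overrideAttrs (oldAttrs: {
  buildInputs = oldAttrs.buildInputs or [ ] ++ as;
  cmakeFlags = oldAttrs.cmakeFlags or [ ] ++ bs;
});
```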

Contributor Author

Arguably, this need is very common in nixpkgs, which I think is evidenced by experiments like DavHau's "drv-parts" and amjoseph's "infuse" (both being more "systematic" solutions in the sense described in the other comment). Few packages in Nixpkgs have this kind of ad hoc helpers (referring to extraCmakeFlags).

So I wonder if we should keep maintaining these arguments and the associated complexity.

I think people are waiting for structured attributes for the kind of features you are describing.

Contributor

I promised to stop posting confusing messages, feel free to dismiss this reply

Ah, right, un-structuredAttrs do make overrides harder, but I rather meant that even if you assume that lists are always lists and attrsets are always attrsets (instead of being strings in 10 derivations out of 100_000), the overrideAttrs approach still results in very deep expressions and breaks composability; https://discourse.nixos.org/t/infuse-nix-deep-override-attrs-generalizes-pipe-and-recursiveupdate/ and https://discourse.nixos.org/t/drv-parts-configure-packages-like-nixos-systems/23629 attempt to address this specifically, by exposing a declarative interface instead of the point-ful functional style

Contributor

I see. I'll stop posting confusing messages. My opinion, on behalf of cuda-maintainers, is that the "officially" advertised way to use CUDA in nixpkgs is to import nixpkgs { config.cudaSupport = true; } (because of transitive dependencies, it's safer to just re-evaluate the whole of nixpkgs). So I would ask that lammps take config.cudaSupport into account when choosing default values. It's perfectly consistent with the current semantics to ignore config.cudaSupport when the user explicitly passes (in callPackage or .override) something like GPU_API = "opencl", because then they explicitly opt out of the "defaults".

So in terms of the existing derivation, I think the behaviour I expect is something like this:

{
  config,
  packages ? {
    ...,
    GPU ? (extraCmakeFlags.GPU_API or null != null)
  },
  extraCmakeFlags ? {
    GPU_API ? if config.cudaSupport then "cuda" else if config.rocmSupport then "hip" else "opencl"
  },
  overrideStdenv ? if (extraCmakeFlags.GPU_API or null) == "cuda" then cudaPackages.backendStdenv else stdenv,
  ...
}:
assert (extraCmakeFlags.GPU_API or null != null) -> (packages.GPU or false);
assert (extraCmakeFlags.GPU_API or null == "cuda") -> (overrideStdenv is compatible with nvcc);
...

Regarding the last (optional) assert, cudaPackages currently doesn't expose a function to test stdenv compatibility but maybe it should; the data is there in nvccCompatibleToolchains if I remember the attribute name correctly

@SomeoneSerge
Contributor

Hence, I'm closing this.

Just to be clear, I think the work making lammps automatically use the right stdenv in simple known cases is still relevant, but you're the judge whether it's worth more of your time. Thank you for the PR and the discussion
