Add rootFS and driverInstallDir fields to ClusterPolicy #747

Merged · 6 commits merged into main from custom-roots on Jun 17, 2024

Conversation

cdesiniotis (Contributor):

No description provided.

@elezar (Member) left a comment:

Thanks @cdesiniotis. It's a big PR, but I took a basic pass.

A general question is whether it makes sense to factor some of the transform logic we are adding into functions instead of implementing it inline in the three cases where we need it.

@@ -41,7 +41,7 @@ spec:
         seLinuxOptions:
           level: "s0"
       volumeMounts:
-        - name: driver-install-path
+        - name: driver-root
           mountPath: /run/nvidia/driver
Member:
Question (and maybe out of scope for this PR): Does it make sense to set this to /driver-root in the container instead?

cdesiniotis (Author):
I am using the exact hostPath so that dev-char symlink creation works correctly; I want to make sure the target of the symlink resolves correctly on the host. For example, we create a symlink /host-dev-char/195:0 --> /run/nvidia/driver/dev/nvidia0 in the driver validator container. If we transformed the target path before creating the symlink, then we could always mount the driver install directory at /driver-root or /driver-install-dir. What do you think?
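A minimal sketch in Go of the symlink creation being discussed (illustrative only, not the PR's actual code). The key point is that the link target must be the device-node path as it resolves on the host, not a container-local mount point such as /driver-root:

package main

import "os"

func main() {
	// 195:0 is the major:minor of the /dev/nvidia0 character device.
	target := "/run/nvidia/driver/dev/nvidia0" // must resolve on the host
	link := "/host-dev-char/195:0"
	if err := os.Symlink(target, link); err != nil && !os.IsExist(err) {
		panic(err)
	}
}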

Member:
I believe we'd be able to create the symlink regardless of whether the target exists or not, so as long as NVIDIA_DEV_ROOT (or NVIDIA_DEV_PATH) refers to the correct path on the host, the symlinks should work as expected.

In the medium term we could consider moving this out of the toolkit container anyway ... since it really should be a postcondition of the driver container.

Let's leave it as is for now though.

Comment on lines 126 to 128
	hostRootFlag   string
	driverRootFlag string
	devRootFlag    string
Member:
Instead of separate flags, how about the following:

type options struct {
	hostPaths struct {
		root       string
		driverRoot string
		devRoot    string
	}
}

(or even a named struct for the hostPaths).

If we don't want to update the function signatures to pass options instead, then we can create a global variable:

var hostPaths hostPathsType

and refer to this when setting the command-line arguments.
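For illustration, a minimal sketch of the global-variable variant using the standard flag package; the hostPathsType name, flag names, and defaults here are assumptions, not the PR's code:

package main

import (
	"flag"
	"fmt"
)

type hostPathsType struct {
	root       string
	driverRoot string
	devRoot    string
}

var hostPaths hostPathsType

func main() {
	// Bind the grouped host paths to command-line arguments.
	flag.StringVar(&hostPaths.root, "host-root", "/", "path to the host root filesystem")
	flag.StringVar(&hostPaths.driverRoot, "driver-root", "/", "root at which the NVIDIA driver is installed")
	flag.StringVar(&hostPaths.devRoot, "dev-root", "/", "root under which NVIDIA device nodes exist")
	flag.Parse()
	fmt.Printf("%+v\n", hostPaths)
}

Grouping the three paths keeps related configuration together and avoids threading three separate values through function signatures.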

@cdesiniotis cdesiniotis force-pushed the custom-roots branch 5 times, most recently from dc44164 to b6fe11d Compare June 13, 2024 23:54
@cdesiniotis cdesiniotis requested review from elezar and tariq1890 June 14, 2024 00:21
Comment on lines +74 to +81
// getDevRoot returns the dev root associated with the root.
// If the root is not a dev root, this defaults to "/".
func (r root) getDevRoot() string {
	if r.isDevRoot() {
		return string(r)
	}
	return "/"
}
Contributor:

I was surprised not to see a field for devPath in HostPathsSpec, but then I recalled a conversation about not needing to specify it if we assume that it is only ever at either:

  1. <DriverInstallDir>/dev; OR
  2. /dev

It looks like this is the logic that is implementing this assumption.

cdesiniotis (Author):

Correct. We decided to omit devPath from HostPathsSpec to start with. If a use case presents itself where device nodes need to exist at some other location, we can introduce this field to the API.
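For reference, a hedged sketch of the isDevRoot probe implied by this thread; the exact check in the PR may differ, but the stated assumption is that a root is a dev root when NVIDIA device nodes exist under <root>/dev:

package validator

import (
	"os"
	"path/filepath"
)

type root string

// isDevRoot reports whether NVIDIA device nodes exist under <root>/dev,
// probing for the well-known nvidiactl node.
func (r root) isDevRoot() bool {
	_, err := os.Stat(filepath.Join(string(r), "dev", "nvidiactl"))
	return err == nil
}

With this in place, getDevRoot (quoted above) falls back to "/" whenever the probe fails, covering exactly the two cases listed in this thread.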

@elezar (Member) left a comment:

Thanks @cdesiniotis. I think we're getting there.

I still have some comments and questions, but these are mostly informative.

echo "CONTAINER_DRIVER_ROOT=$CONTAINER_DRIVER_ROOT"

set -o allexport
cat /run/nvidia/validations/driver-ready

Reviewer:

Does it also export the custom toolkit installation path from the toolkit values to the device plugin, for the nvidia-ctk tool?

cdesiniotis (Author):

No. But we set the NVIDIA_CTK_PATH variable here for the device-plugin in the clusterpolicy controller: https://github.com/NVIDIA/gpu-operator/blob/main/controllers/object_controls.go#L1284.

…ripts

This commit updates the driver validation to always create a 'driver-ready' file, regardless of whether the driver is installed on the host or not. It also populates this file with a list of environment variables, one per line, which are required by some operands.

The startup scripts for several operands are simplified to source the content of this file before executing the main program for the container.

Signed-off-by: Christopher Desiniotis <[email protected]>
@cdesiniotis cdesiniotis force-pushed the custom-roots branch 4 times, most recently from 292e8d6 to 11757cf Compare June 15, 2024 20:30
@cdesiniotis cdesiniotis requested review from elezar and tariq1890 June 15, 2024 20:37
@elezar (Member) left a comment:

I'm happy to move forward with this as is and then address any findings from testing as follow-ups.

@@ -12,3 +12,9 @@ rules:
   - use
   resourceNames:
   - privileged
+- apiGroups:
Member:
I was not aware of the toolkit inspecting daemonsets. Was this change intentional?

Member:
Is this because the validator runs here too? I thought we changed the logic to just look for the ready file?

cdesiniotis (Author):
Yes, it is because the validator runs here as an init container. We still need to run the driver validation here since the operator-validator pod runs with the nvidia runtime class and thus needs the toolkit pod to run first before it can start performing any validations.

@@ -67,6 +71,8 @@ spec:
           value: "management.nvidia.com/gpu"
         - name: NVIDIA_VISIBLE_DEVICES
           value: "void"
+        - name: TOOLKIT_PID_FILE
Member:
For reference: this requires NVIDIA/nvidia-container-toolkit#544.

		devRoot = hostRoot
		devRootCtrPath = "/host"
	} else {
		devRoot = driverInstallDir
Member:
Just so that I'm clear, this is only triggered for managed drivers?

cdesiniotis (Author):
No. This logic applies to any driver not installed on the host directly at /. So this can be the operator-managed driver or an unmanaged driver, like GKE's daemonset.

cdesiniotis (Author):
The driverInstallDir that is referenced here will correspond to what is set in ClusterPolicy as hostPaths.driverInstallDir.
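Putting the two replies together, a hedged reconstruction of the quoted branch as a function (the names come from the diff; the container-side path in the non-host case is an assumption, since the excerpt ends before devRootCtrPath is set in the else branch):

package validator

func getDevRootPaths(isHostDriver bool, hostRoot, driverInstallDir string) (devRoot, devRootCtrPath string) {
	if isHostDriver {
		// Driver installed directly on the host at "/": device nodes
		// live under the host root, mounted at /host in the container.
		return hostRoot, "/host"
	}
	// Operator-managed or unmanaged driver container (e.g. GKE's
	// daemonset): device nodes are rooted at hostPaths.driverInstallDir.
	// Assumption: that directory is mounted at its host path inside the
	// validator container, so the container path matches.
	return driverInstallDir, driverInstallDir
}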

		cmd.Stdout = os.Stdout
		cmd.Stderr = os.Stderr
	}
	return cmd.Run()
Member:
My understanding is that nvidia-smi expects the device nodes to be created at /dev and that there isn't a way to control this.

Comment on lines +802 to +808
statusFileContent := strings.Join([]string{
	fmt.Sprintf("IS_HOST_DRIVER=%t", driverInfo.isHostDriver),
	fmt.Sprintf("NVIDIA_DRIVER_ROOT=%s", driverInfo.driverRoot),
	fmt.Sprintf("DRIVER_ROOT_CTR_PATH=%s", driverInfo.driverRootCtrPath),
	fmt.Sprintf("NVIDIA_DEV_ROOT=%s", driverInfo.devRoot),
	fmt.Sprintf("DEV_ROOT_CTR_PATH=%s", driverInfo.devRootCtrPath),
}, "\n") + "\n"
Member:
nit: is this something that we could hang off the driverInfo type?
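A sketch of what the nit suggests, with the rendering hung off driverInfo as a method; the struct here only mirrors the fields used in the snippet above and is not the PR's full type:

package validator

import (
	"fmt"
	"strings"
)

type driverInfo struct {
	isHostDriver      bool
	driverRoot        string
	driverRootCtrPath string
	devRoot           string
	devRootCtrPath    string
}

// statusFileContent renders the driver-ready status file, one
// KEY=VALUE environment variable per line, exactly as the inline
// version above does.
func (d driverInfo) statusFileContent() string {
	return strings.Join([]string{
		fmt.Sprintf("IS_HOST_DRIVER=%t", d.isHostDriver),
		fmt.Sprintf("NVIDIA_DRIVER_ROOT=%s", d.driverRoot),
		fmt.Sprintf("DRIVER_ROOT_CTR_PATH=%s", d.driverRootCtrPath),
		fmt.Sprintf("NVIDIA_DEV_ROOT=%s", d.devRoot),
		fmt.Sprintf("DEV_ROOT_CTR_PATH=%s", d.devRootCtrPath),
	}, "\n") + "\n"
}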

cdesiniotis and others added 5 commits June 17, 2024 10:37
…naged by GPU Operator

The commit updates our driver validator to only check for the presence of .driver-ctr.ready, a file created by our driver daemonset's readiness probe, if the driver container is managed by GPU Operator. This allows us to support non-standard environments, like GKE, where a driver container is deployed but not managed by GPU Operator.

Signed-off-by: Christopher Desiniotis <[email protected]>
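A hedged sketch of the managed-only check this commit describes; the .driver-ctr.ready file name comes from the commit message, while the function shape and validationDir parameter are assumptions:

package validator

import (
	"os"
	"path/filepath"
)

// driverContainerReady gates on .driver-ctr.ready only when the driver
// container is managed by GPU Operator; unmanaged driver containers
// (e.g. GKE's daemonset) never create this file.
func driverContainerReady(managedByOperator bool, validationDir string) (bool, error) {
	if !managedByOperator {
		return true, nil
	}
	_, err := os.Stat(filepath.Join(validationDir, ".driver-ctr.ready"))
	if err == nil {
		return true, nil
	}
	if os.IsNotExist(err) {
		return false, nil
	}
	return false, err
}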
This commit updates our driver validator code to only chroot when validating a host-installed driver. When validating a driver container install, we discover the paths to libnvidia-ml.so.1 and nvidia-smi at the driver container root and then run 'LD_PRELOAD=/driverRoot/path/to/libnvidia-ml.so.1 nvidia-smi'.

This sets the stage for validating driver container installs where driverRoot does not represent a full filesystem hierarchy that one can 'chroot' into.

Signed-off-by: Christopher Desiniotis <[email protected]>
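A minimal sketch of the non-chroot validation flow described above; the helper name is hypothetical, and smiPath and nvmlPath are assumed to have been discovered under the driver-container root beforehand (the discovery step is not shown):

package validator

import (
	"os"
	"os/exec"
)

// runNvidiaSMI runs the discovered nvidia-smi binary with LD_PRELOAD
// pointing at the discovered libnvidia-ml.so.1, so the driver's NVML is
// resolved from the driver root without chroot-ing into it.
func runNvidiaSMI(smiPath, nvmlPath string) error {
	cmd := exec.Command(smiPath)
	cmd.Env = append(os.Environ(), "LD_PRELOAD="+nvmlPath)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}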
RootFS represents the path to the root filesystem of the host. This is used by components that need to interact with the host filesystem and, as such, it must be a chroot-able filesystem. Examples include the MIG Manager and the Toolkit Container, which may need to stop, start, or restart systemd services.

Signed-off-by: Christopher Desiniotis <[email protected]>
Co-authored-by: Angelos Kolaitis <[email protected]>
…and containers except the driver-validator

Having a static path inside our containers will make it easier when driverRoot is a configurable field. If driverRoot is set to a custom path, we can transform the host path for the volume while keeping the container path unchanged.

The driver-validation initContainer is the exception to this rule. In the driver-validation initContainer, the container path must match the host path; otherwise, the /dev/char symlinks will not resolve correctly on the host. The target of the symlinks must correspond to the path of the device nodes on the host. For example, when the NVIDIA device nodes are present under `/run/nvidia/driver/dev` on the host, running the following command from inside the container would create an invalid symlink:

  ln -s /driver-root/dev/nvidiactl /host-dev-char/195:255

while running the below command from inside the container would create a valid symlink:

  ln -s /run/nvidia/driver/dev/nvidiactl /host-dev-char/195:255

Signed-off-by: Christopher Desiniotis <[email protected]>
This allows for non-standard driver container installations, where
the driver installation path and device nodes are rooted at paths
other than '/run/nvidia/driver'.

Note: setting driverInstallDir to a custom value is currently only supported for driver container installations not managed by GPU Operator, for example, the GKE use case where a driver daemonset is deployed prior to installing GPU Operator and the GPU Operator-managed driver is disabled.

The GPU Operator's driver container daemonset still assumes that
the full driver installation is made available at '/run/nvidia/driver'
on the host, and consequently, we always mount '/run/nvidia/driver'
into the GPU Operator managed daemonset. We may consider removing this
assumption in the future and support driver container implementations
which allow for a custom driverInstallDir to be specified.

Signed-off-by: Christopher Desiniotis <[email protected]>
@cdesiniotis changed the title from "Add hostRoot, driverInstallDir, and devPath options to ClusterPolicy" to "Add rootFS and driverInstallDir fields to ClusterPolicy" on Jun 17, 2024.
@cdesiniotis cdesiniotis enabled auto-merge June 17, 2024 18:22
@cdesiniotis cdesiniotis merged commit 2c4f301 into main Jun 17, 2024
11 checks passed
@tariq1890 tariq1890 deleted the custom-roots branch June 17, 2024 19:18