Add rootFS and driverInstallDir fields to ClusterPolicy #747
Conversation
Thanks @cdesiniotis. It's a big PR, but I took a basic pass.
A general question is whether it makes sense to factor some of the transform logic that we are adding into functions instead of implementing them in-line in the three cases where we need them.
@@ -41,7 +41,7 @@ spec:
      seLinuxOptions:
        level: "s0"
      volumeMounts:
-       - name: driver-install-path
+       - name: driver-root
          mountPath: /run/nvidia/driver
Question (and maybe out of scope for this PR): Does it make sense to set this to /driver-root in the container instead?
I am using the exact hostPath so that dev-char symlink creation works correctly. I want to make sure the target of the symlink resolves correctly on the host. So for example, we create a symlink /host-dev-char/195:0 --> /run/nvidia/driver/dev/nvidia0 in the driver validator container. I guess if we transformed the target path before creating the symlink, then we could always mount the driver install directory at /driver-root or /driver-install-dir. What do you think?
I believe we'd be able to create the symlink regardless of whether the target exists or not, so as long as NVIDIA_DEV_ROOT (or NVIDIA_DEV_PATH) refers to the correct path on the host, the symlinks should work as expected.
In the medium term we could consider moving this out of the toolkit container anyway ... since it really should be a postcondition of the driver container.
Let's leave it as is for now though.
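For illustration, a minimal sketch of the dev-char symlink creation discussed above (the paths and variable names are placeholders for this discussion, not the actual validator code); it shows why the target has to be the host path rather than a container-internal mount point:

package main

import (
    "os"
    "path/filepath"
)

func main() {
    // The symlink target must be the device node path as seen on the host,
    // because the link is resolved on the host when /dev/char/195:0 is
    // followed, not inside the validator container.
    hostDevicePath := "/run/nvidia/driver/dev/nvidia0" // placeholder host path
    link := filepath.Join("/host-dev-char", "195:0")   // hostPath mount of /dev/char

    // os.Symlink does not require the target to exist at creation time.
    if err := os.Symlink(hostDevicePath, link); err != nil && !os.IsExist(err) {
        panic(err)
    }
}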
validator/main.go
Outdated
	hostRootFlag   string
	driverRootFlag string
	devRootFlag    string
Instead of separate flags how about the following:

type options struct {
	hostPaths struct {
		root       string
		driverRoot string
		devRoot    string
	}
}

(or even a named struct for the hostPaths).
If we don't want to update the function signatures to pass options instead, then we can create a global variable:

var hostPaths hostPathsType

and refer to this when setting the command line arguments.
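For what it's worth, a minimal sketch of that wiring using the standard library flag package (purely illustrative; the flag names and defaults here are assumptions, and the validator may register its flags through a different CLI library):

package main

import "flag"

type hostPathsType struct {
    root       string
    driverRoot string
    devRoot    string
}

// Global instance that the command line arguments are bound to, so existing
// function signatures do not need to change.
var hostPaths hostPathsType

func init() {
    flag.StringVar(&hostPaths.root, "host-root", "/", "root of the host filesystem")
    flag.StringVar(&hostPaths.driverRoot, "driver-root", "/", "root at which the driver is installed")
    flag.StringVar(&hostPaths.devRoot, "dev-root", "/", "root under which device nodes are created")
}

func main() {
    flag.Parse()
    _ = hostPaths // referenced by the rest of the validator
}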
Force-pushed from dc44164 to b6fe11d
// getDevRoot returns the dev root associated with the root.
// If the root is not a dev root, this defaults to "/".
func (r root) getDevRoot() string {
	if r.isDevRoot() {
		return string(r)
	}
	return "/"
}
I was surprised to not see a field for devPath in HostPathsSpec, but then I recalled a conversation about not needing to specify it if we assume that it is only ever at either:
- <DriverInstallDir>/dev; OR
- /dev
It looks like this is the logic that is implementing this assumption.
Correct. We decided to omit devPath from HostPathsSpec to start with. If a use case presents itself where device nodes need to exist at some other location, we can introduce this field to the API.
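For context, a rough sketch of how that assumption might be implemented (the isDevRoot probe below is illustrative and checks only for nvidiactl; the actual implementation in this PR may differ):

package validator // hypothetical package name

import (
    "os"
    "path/filepath"
)

// root represents a directory on the host under which a driver may be installed.
type root string

// isDevRoot reports whether device nodes exist under <root>/dev by probing
// for the nvidiactl node. When this returns false, getDevRoot falls back to "/".
func (r root) isDevRoot() bool {
    _, err := os.Stat(filepath.Join(string(r), "dev", "nvidiactl"))
    return err == nil
}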
Thanks @cdesiniotis. I think we're getting there.
I still have some comments / questions, but these are mostly informative.
echo "CONTAINER_DRIVER_ROOT=$CONTAINER_DRIVER_ROOT" | ||
|
||
set -o allexport | ||
cat /run/nvidia/validations/driver-ready |
Does it also export the custom toolkit installation path (from the toolkit values) to the device plugin for the nvidia-ctk tool?
No. But we set the NVIDIA_CTK_PATH variable here for the device-plugin in the clusterpolicy controller: https://github.com/NVIDIA/gpu-operator/blob/main/controllers/object_controls.go#L1284.
…ripts This commit updates the driver validation to always create a 'driver-ready' file, regardless of whether the driver is installed on the host or not. It also populates this file with a list of environment variables, one per line, which are required by some operands. The startup scripts for several operands are simplified to source the content of this file before executing the main program for the container. Signed-off-by: Christopher Desiniotis <[email protected]>
Force-pushed from 292e8d6 to 11757cf
I'm happy to move forward with this as is and then address any findings from testing as follow-ups.
@@ -12,3 +12,9 @@ rules:
    - use
  resourceNames:
    - privileged
- apiGroups:
I was not aware of the toolkit inspecting daemonsets. Was this change intentional?
Is this because the validator runs here too? I thought we changed the logic to just look for the ready file?
Yes, it is because the validator runs here as an init container. We still need to run the driver validation here since the operator-validator pod runs with the nvidia runtime class, and thus needs the toolkit pod to run first before it can start performing any validations.
@@ -67,6 +71,8 @@ spec:
      value: "management.nvidia.com/gpu"
    - name: NVIDIA_VISIBLE_DEVICES
      value: "void"
+   - name: TOOLKIT_PID_FILE
For reference. This requires NVIDIA/nvidia-container-toolkit#544
	devRoot = hostRoot
	devRootCtrPath = "/host"
} else {
	devRoot = driverInstallDir
Just so that I'm clear, this is only triggered for managed drivers?
No. This logic applies to any driver not installed on the host directly at /. So this can be the operator-managed driver or an unmanaged driver, like GKE's daemonset.
The driverInstallDir that is referenced here will correspond to what is set in ClusterPolicy as hostPaths.driverInstallDir.
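Pulling the two comments together, a sketch of the branching being discussed (variable names follow the snippet above; the isHostDriver flag and the container mount path in the else branch are assumptions for illustration):

// determineDevRoot decides where device nodes are looked up on the host
// (devRoot) and from inside the container (devRootCtrPath).
func determineDevRoot(isHostDriver bool, hostRoot, driverInstallDir string) (devRoot, devRootCtrPath string) {
    if isHostDriver {
        // Driver installed directly on the host at "/": device nodes live
        // under the host root, which is mounted into the container at /host.
        devRoot = hostRoot
        devRootCtrPath = "/host"
    } else {
        // Operator-managed driver or an unmanaged driver container (e.g.
        // GKE's daemonset): device nodes live under hostPaths.driverInstallDir.
        devRoot = driverInstallDir
        devRootCtrPath = "/driver-root" // illustrative container mount path
    }
    return devRoot, devRootCtrPath
}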
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
}
return cmd.Run()
My understanding is that nvidia-smi expects the device nodes to be created at /dev and that there isn't a way to control this.
statusFileContent := strings.Join([]string{
	fmt.Sprintf("IS_HOST_DRIVER=%t", driverInfo.isHostDriver),
	fmt.Sprintf("NVIDIA_DRIVER_ROOT=%s", driverInfo.driverRoot),
	fmt.Sprintf("DRIVER_ROOT_CTR_PATH=%s", driverInfo.driverRootCtrPath),
	fmt.Sprintf("NVIDIA_DEV_ROOT=%s", driverInfo.devRoot),
	fmt.Sprintf("DEV_ROOT_CTR_PATH=%s", driverInfo.devRootCtrPath),
}, "\n") + "\n"
nit: is this something that we could hang off the driverInfo type?
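One way that nit could be addressed, sketched under the assumption that driverInfo has the fields used in the snippet above (the method name envFileContent is hypothetical):

package validator // hypothetical package name

import (
    "fmt"
    "strings"
)

type driverInfo struct {
    isHostDriver      bool
    driverRoot        string
    driverRootCtrPath string
    devRoot           string
    devRootCtrPath    string
}

// envFileContent renders the driver information as KEY=VALUE pairs, one per
// line, ready to be written to the driver-ready status file.
func (d driverInfo) envFileContent() string {
    return strings.Join([]string{
        fmt.Sprintf("IS_HOST_DRIVER=%t", d.isHostDriver),
        fmt.Sprintf("NVIDIA_DRIVER_ROOT=%s", d.driverRoot),
        fmt.Sprintf("DRIVER_ROOT_CTR_PATH=%s", d.driverRootCtrPath),
        fmt.Sprintf("NVIDIA_DEV_ROOT=%s", d.devRoot),
        fmt.Sprintf("DEV_ROOT_CTR_PATH=%s", d.devRootCtrPath),
    }, "\n") + "\n"
}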
…naged by GPU Operator The commit updates our driver validator to only check the presence of .driver-ctr.ready, a file created by our driver daemonset readiness probe, if the driver container is managed by GPU Operator. This allows us to support non-standard environments, like GKE, where a driver container is deployed but not managed by GPU Operator. Signed-off-by: Christopher Desiniotis <[email protected]>
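A rough sketch of the check described in this commit (the file path and function name below are illustrative assumptions, not lifted from the PR):

package validator // hypothetical package name

import "os"

// assertDriverContainerReady checks for the readiness file created by the
// driver daemonset's readiness probe. The check is skipped for driver
// containers not managed by GPU Operator (e.g. GKE's daemonset), which do
// not create this file.
func assertDriverContainerReady(managedByOperator bool) error {
    if !managedByOperator {
        return nil
    }
    _, err := os.Stat("/run/nvidia/driver/.driver-ctr.ready") // illustrative path
    return err
}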
This commit updates our driver validator code to only chroot when validating a host installed driver. When validating a driver container install, we discover the paths to libnvidia-ml.so.1 and nvidia-smi at the driver container root and then run 'LD_PRELOAD=/driverRoot/path/to/libnvidia-ml.so.1 nvidia-smi'. This sets the stage for validating driver container installs where driverRoot does not represent a full filesystem hierarchy that one can 'chroot' into. Signed-off-by: Christopher Desiniotis <[email protected]>
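A sketch of the LD_PRELOAD approach this commit describes (the library and binary locations are illustrative; the actual code discovers them under the driver container root):

package validator // hypothetical package name

import (
    "os"
    "os/exec"
    "path/filepath"
)

// runNvidiaSMIWithPreload validates a driver container install without
// chroot-ing into driverRoot: it preloads the container's libnvidia-ml.so.1
// and executes the container's nvidia-smi directly.
func runNvidiaSMIWithPreload(driverRootCtrPath, nvmlLibPath, nvidiaSMIPath string) error {
    // nvmlLibPath and nvidiaSMIPath are the discovered paths relative to the
    // driver container root, e.g. "usr/lib/.../libnvidia-ml.so.1" and
    // "usr/bin/nvidia-smi" (illustrative).
    cmd := exec.Command(filepath.Join(driverRootCtrPath, nvidiaSMIPath))
    cmd.Env = append(os.Environ(), "LD_PRELOAD="+filepath.Join(driverRootCtrPath, nvmlLibPath))
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    return cmd.Run()
}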
RootFS represents the path to the root filesystem of the host. This is used by components that need to interact with the host filesystem and as such this must be a chroot-able filesystem. Examples include the MIG Manager and Toolkit Container which may need to stop, start, or restart systemd services. Signed-off-by: Christopher Desiniotis <[email protected]> Co-authored-by: Angelos Kolaitis <[email protected]>
…and containers except the driver-validator Having a static path inside our containers will make it easier when driverRoot is a configurable field. If driverRoot is set to a custom path, we can transform the host path for the volume while keeping the container path unchanged. The driver-validation initContainer is the exception to this rule. From the driver validation initContainer, the container path must match the host path otherwise the /dev/char symlinks will not resolve correctly on the host. The target of the symlinks must correspond to the path of the device nodes on the host. For example, when the NVIDIA device nodes are present under `/run/nvidia/driver/dev` on the host, running the following command from inside the container would create an invalid symlink: ln -s /driver-root/dev/nvidiactl /host-dev-char/195:255 while running the below command from inside the container would create a valid symlink: ln -s /run/nvidia/driver/dev/nvidiactl /host-dev-char/195:255 Signed-off-by: Christopher Desiniotis <[email protected]>
This allows for non-standard driver container installations, where the driver installation path and device nodes are rooted at paths other than '/run/nvidia/driver'. Note, setting driverInstallDir to a custom value is currently only supported for driver container installations not managed by GPU Operator. For example, in the GKE use case where a driver daemonset is deployed prior to installing GPU Operator and the GPU Operator managed driver is disabled. The GPU Operator's driver container daemonset still assumes that the full driver installation is made available at '/run/nvidia/driver' on the host, and consequently, we always mount '/run/nvidia/driver' into the GPU Operator managed daemonset. We may consider removing this assumption in the future and support driver container implementations which allow for a custom driverInstallDir to be specified. Signed-off-by: Christopher Desiniotis <[email protected]>
Force-pushed from 11757cf to d53e3e3
No description provided.