@klueska, how is the channel device created, by the way: by the driver or by the IMEX DaemonSet here? I only saw validation logic in the container YAML file. For /dev/nvidia-caps-imex-channels/channel0, is imex-channel-0 the resource name or the real char device name?
Multi-Node NVLink support with host-installed driver
Prerequisites:
Set up the GPU Operator and install the DRA driver
Fetch the latest GPU Operator and DRA driver Helm charts (from main):
Install the GPU Operator for use with HOST managed drivers:
Install the DRA Driver with support for HOST managed drivers:
Validate that the GPU Operator and DRA driver are running
Validate that all GPU Operator components are running and in a Ready state:
Validate that the DRA driver components are running and in a Ready state:
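A minimal sketch of how the two readiness checks above could be scripted, assuming the pod list has been exported with `kubectl get pods -o json`. The function name `unready_pods` and the sample pod names are hypothetical and only illustrate the shape of the data; they do not come from the GPU Operator or the DRA driver.

```python
from typing import List


def unready_pods(pod_list: dict) -> List[str]:
    """Return the names of pods whose Ready condition is not True.

    pod_list is the parsed output of `kubectl get pods -o json`.
    Pods in the Succeeded phase are skipped, because completed
    one-shot pods do not report Ready.
    """
    unready = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") == "Succeeded":
            continue
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in status.get("conditions", [])
        )
        if not ready:
            unready.append(pod["metadata"]["name"])
    return unready


# Hypothetical sample shaped like `kubectl get pods -o json` output.
sample = {
    "items": [
        {
            "metadata": {"name": "driver-a"},
            "status": {
                "phase": "Running",
                "conditions": [{"type": "Ready", "status": "True"}],
            },
        },
        {
            "metadata": {"name": "driver-b"},
            "status": {
                "phase": "Running",
                "conditions": [{"type": "Ready", "status": "False"}],
            },
        },
        {
            "metadata": {"name": "validator"},
            "status": {"phase": "Succeeded", "conditions": []},
        },
    ]
}

print(unready_pods(sample))
```

Against a real cluster, the same function could be fed `json.load(sys.stdin)` with the output of `kubectl get pods -n <namespace> -o json` piped in; the namespace depends on how the charts were installed.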
Confirm that all GPU nodes are labeled with clique IDs:
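A minimal sketch of the clique label check, assuming the label key is `nvidia.com/gpu.clique` and that the node list comes from `kubectl get nodes -o json`. The label key is an assumption that should be verified against your cluster, and the helper names, node names, and label values below are made up for illustration.

```python
from typing import Dict, List, Optional

# Assumed label key; verify it against the labels on your own GPU nodes.
CLIQUE_LABEL = "nvidia.com/gpu.clique"


def clique_by_node(node_list: dict) -> Dict[str, Optional[str]]:
    """Map each node name to its clique label value (None if unlabeled)."""
    result = {}
    for node in node_list.get("items", []):
        meta = node.get("metadata", {})
        labels = meta.get("labels") or {}
        result[meta["name"]] = labels.get(CLIQUE_LABEL)
    return result


def nodes_missing_clique(node_list: dict) -> List[str]:
    """Return the names of nodes that carry no clique label."""
    return [name for name, clique in clique_by_node(node_list).items() if clique is None]


# Hypothetical sample shaped like `kubectl get nodes -o json` output.
sample_nodes = {
    "items": [
        {"metadata": {"name": "node-a", "labels": {CLIQUE_LABEL: "example-cluster.0"}}},
        {"metadata": {"name": "node-b", "labels": {CLIQUE_LABEL: "example-cluster.0"}}},
        {"metadata": {"name": "node-c", "labels": {}}},
    ]
}

print(clique_by_node(sample_nodes))
print(nodes_missing_clique(sample_nodes))
```

An empty result from `nodes_missing_clique` would indicate that every listed node is labeled. The list would need to be restricted to GPU nodes first, since non-GPU nodes are not expected to carry the label.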
Simple IMEX channel injection with IMEX daemon running
Run a simple test to validate IMEX daemons are started and IMEX channels are injected:
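A minimal sketch of one way to confirm, from inside the test container, that an injected channel is a real character device. The path `/dev/nvidia-caps-imex-channels/channel0` is the one quoted in the question in this thread; the helper name `describe_device` is hypothetical, and this is not the validation logic used by the DRA driver itself.

```python
import os
import stat
from typing import Optional

# Path taken from the question in this thread.
IMEX_CHANNEL_PATH = "/dev/nvidia-caps-imex-channels/channel0"


def describe_device(path: str) -> Optional[dict]:
    """Return device details for path, or None if the path does not exist."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return {
        "is_char_device": stat.S_ISCHR(st.st_mode),
        # Major and minor numbers are only meaningful for device nodes.
        "major": os.major(st.st_rdev),
        "minor": os.minor(st.st_rdev),
    }


info = describe_device(IMEX_CHANNEL_PATH)
if info is None:
    print(f"{IMEX_CHANNEL_PATH} is not present")
elif info["is_char_device"]:
    print(f"{IMEX_CHANNEL_PATH} is a char device ({info['major']}:{info['minor']})")
else:
    print(f"{IMEX_CHANNEL_PATH} exists but is not a char device")
```

Run inside a pod that requested an IMEX channel, the char device branch would be the expected outcome. On a machine without IMEX channels the script simply reports that the path is not present.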
Multi-Node MPI test
Install the latest version of the MPI Operator:
Run a multi-node nvbandwidth test requiring IMEX channels with MPI: