Automatically provisioning X11 and Wayland devices of the GPU inside a container? #118
I have added this feature request to our backlog. At present the backlog is quite large, so it's unclear exactly when we will be able to look at this in detail. That said, it feels like it could be added as a new
As you can see from the thumbs up, this feature is in quite high demand, so it would be great to have it implemented quickly. Thank you.
If you get a chance to do that, maybe add
In addition to the initial feature request, these are all the devices that need to be provisioned automatically for NVIDIA to officially support displays (e.g. X11, Wayland) in Docker. If these devices can be provisioned automatically by the container toolkit, the nvidia/opengl container (nvidia-docker) can properly support the NVIDIA version of XWayland (support for which is currently being worked into the Linux kernel by NVIDIA developers) and thus support displays. A lot of people are waiting for display support in Docker and Kubernetes, especially because NVIDIA is expected to support XWayland in the near future. Please implement this feature to streamline this.
Any updates? @klueska
Please use https://gitlab.com/arm-research/smarter/smarter-device-manager for /dev/dri/card* and /dev/dri/render* if you stumble upon this issue.
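Until automatic provisioning exists, the same effect can be approximated by passing the DRM nodes into the container by hand. This is a hedged sketch: the node names (card0, renderD128) are host-specific assumptions, and the script is guarded so it is a no-op on machines without Docker or a GPU.

```shell
#!/bin/sh
# Sketch: manually pass one GPU's DRM nodes into a container with --device.
# card0/renderD128 are assumptions; check `ls /dev/dri` on the host first.
if command -v docker >/dev/null 2>&1 && [ -e /dev/dri/card0 ]; then
  docker run --rm --gpus all \
    --device /dev/dri/card0 \
    --device /dev/dri/renderD128 \
    nvidia/opengl:1.2-glvnd-devel-ubuntu20.04 \
    ls /dev/dri
  status="ran"
else
  echo "skipping: docker or /dev/dri/card0 not available"
  status="skipped"
fi
```

The drawback, as noted later in the thread, is that this maps fixed device paths rather than only the nodes of the GPU actually allocated to the container.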
EGL does not require
Still likely needed for Wayland with GBM.
Thanks @ehfd. We are working on improving the injection of these devices in an upcoming release. Note that the current plan is to do so using the

Do you have sample containers / test cases that you would be able to provide to ensure that we meet the requirements?
@elezar These two repositories involve a series of hacks to make NVIDIA GPUs work reliably inside an unprivileged container with a properly accelerated GUI.

docker-nvidia-glx-desktop must install the userspace driver components at startup, mostly following your examples, but after reading from

In the current state, the same userspace driver installation must be done for Wayland by reading

Also, in docker-nvidia-egl-desktop, where the userspace drivers aren't installed at startup, an annoying situation arises where Vulkan requires the

Vulkan should be possible with only the

And the

Please also consider injecting the necessary libraries for NVFBC with the

We really hope that

Note that we currently use https://gitlab.com/arm-research/smarter/smarter-device-manager for provisioning /dev/dri devices, but there is no mechanism to push only the devices for the GPU allocated to the container.

Thanks a lot!
Thanks for all the information. I will comb through it while working on the feature. Hopefully we can improve things significantly! |
I'm writing a remote Wayland compositor and am currently busy integrating it with k8s, and I can independently confirm everything @ehfd has stated so far, as I've hit all of these issues in the last couple of weeks. Being able to access

@elezar I'm happy to assist and answer any questions you might have to help move this forward!
Thanks @Zubnix. We have started work on injecting the

I think in all cases, having a list of the specific devices, libraries, and environment variables that are required in a container for things to work as expected would be quite useful. We will be sure to update this issue as soon as there is something out for testing and early feedback.
@Zubnix Hi! I've been interested in Greenfield for a long time. Nice to meet you here! I also think that eliminating a container's driver dependencies is very important. Thanks for your feedback!
@elezar Hi! I saw that the /dev/dri component got merged. I believe that should make Wayland fundamentally work in Kubernetes. Would it be possible to pass the library components below for enhanced X11/Wayland support?
Thanks @ehfd, I will have a look at the link you suggested.
@elezar Anything marked with AND should be injected in either case. And as you know well, the generic symlinks to the .so.525.78.01 files should be passed. And I believe that, for practical use, everything in Configuration .json files should be added to the container like the base images do now.
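The generic symlinks mentioned above can be sketched as follows. This is illustrative only: the library name, directory, and SONAME major are assumptions (the version string 525.78.01 is taken from this thread), and the sketch works on a dummy file so it can run anywhere.

```shell
#!/bin/sh
# Sketch: create the generic/SONAME symlinks next to fully versioned driver
# libraries. Library name, directory, and SONAME major "1" are assumptions.
libdir="$(mktemp -d)"   # stand-in for e.g. /usr/lib/x86_64-linux-gnu
version="525.78.01"     # driver version quoted in this thread

# Create a dummy versioned library so the sketch can run anywhere.
: > "$libdir/libnvidia-egl-wayland.so.$version"

for lib in "$libdir"/lib*.so."$version"; do
  [ -e "$lib" ] || continue
  base="${lib%.so.$version}"                     # path without .so.<version>
  ln -sf "$(basename "$lib")" "$base.so.1"       # SONAME link (major assumed)
  ln -sf "$(basename "$base").so.1" "$base.so"   # generic development link
done
ls -l "$libdir"
```

In a real container the loop would point at the directory where the toolkit injected the versioned libraries, and the SONAME major must match each library's actual SONAME.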
As of libnvidia-container
These important libraries are still not provisioned.
@klueska @elezar A reminder for you guys... The libraries below are the only ones left before I can finally close this three-year-old issue and both X11 and Wayland work inside a container. This is likely 30 minutes of work for you. Things mostly work now, but only after downloading

If you can't include some of these in the container toolkit, please tell us why.
@ehfd thanks for the reminder here. Some of the libraries are already handled by the NVIDIA Container Toolkit -- with the caveat that their detection may be distribution-dependent at the moment. The main thing to change here is where we search for the libraries. There is no technical reason why we haven't done this; the delay is largely caused by resource constraints. Note that, in theory, if you mount these missing libraries from the host, it should not be required to use the .run file to install the userspace libraries in the container. If you have capacity to contribute the changes, I would be happy to review them. Note that I would recommend making these changes against the NVIDIA Container Toolkit, where we already inject some of the libraries that you mentioned.
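The suggestion to bind-mount missing host libraries (instead of installing userspace drivers with the .run file) can be sketched like this. The path and library name are illustrative assumptions; locate the real ones with `ldconfig -p | grep nvidia` on the host.

```shell
#!/bin/sh
# Sketch: bind-mount a driver library the toolkit does not yet inject.
# The path and library name are assumptions, not toolkit-confirmed values.
lib="/usr/lib/x86_64-linux-gnu/libnvidia-egl-wayland.so.1"
if command -v docker >/dev/null 2>&1 && [ -e "$lib" ]; then
  docker run --rm --gpus all \
    -v "$lib:$lib:ro" \
    nvidia/opengl:1.2-glvnd-devel-ubuntu20.04 \
    ls -l "$lib"
  status="ran"
else
  echo "skipping: docker or $lib not available"
  status="skipped"
fi
```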
Thank you @elezar
The core issue seems to be that https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/blob/main/internal/discover/graphics.go is somehow not invoked with Docker. Perhaps this might be because the Docker runner is not based on CDI?
To trigger the logic as linked you need to:

To configure the

Then we can run a container:

This does not require CDI support explicitly.
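The elided commands above are not recoverable from the scrape, but as a hedged sketch, the graphics discovery is tied to the driver capabilities a container requests: asking for the graphics and display capabilities is what routes a run through that code path. The image below is just an example.

```shell
#!/bin/sh
# Sketch: request graphics/display driver capabilities so the toolkit's
# graphics discovery runs for this container. NVIDIA_DRIVER_CAPABILITIES is
# a documented toolkit variable; the image tag is an arbitrary example.
if command -v docker >/dev/null 2>&1 && [ -e /dev/nvidiactl ]; then
  docker run --rm --gpus all \
    -e NVIDIA_DRIVER_CAPABILITIES=graphics,display,utility \
    ubuntu:20.04 \
    ls /dev/dri
  status="ran"
else
  echo "skipping: docker or NVIDIA devices not available"
  status="skipped"
fi
```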
Most of the above issues were probably because the PPA for graphics drivers did not install:
@elezar I have a contribution. |
NVIDIA/nvidia-container-toolkit#490 (comment) describes the situation and the requirements to close this issue conclusively in more detail.
PR to fix Wayland: NVIDIA/nvidia-container-toolkit#548 - Merged. New issue for X11: NVIDIA/nvidia-container-toolkit#563 |
Redirected from NVIDIA/k8s-device-plugin#206 to a more suitable repository.
1. Issue or feature description
In Docker and Kubernetes, people have had to perform manual host setup to provision the X server using the host path directive /tmp/.X11-unix. This is quite tedious for sysadmins and at the same time a security threat, as people can spoof the host.

To mitigate this, there have been attempts (https://github.com/ehfd/docker-nvidia-glx-desktop, which is based on https://github.com/ryought/glx-docker-headless-gpu) to execute an X server and use GLX inside the container after the GPU has been provisioned using libnvidia-container.
An alternative was created by the developers of VirtualGL (used widely in HPC to enable GPU-based rendering in VNC virtual display environments): they developed a feature that uses the EGL API to enable 3D GL rendering in applications such as Blender, MATLAB, and Unity, previously possible only with GLX and thus an X server. As you know well, nvidia-docker does not support GLX but introduced the EGL API just under two years ago.
See EGL config section of VirtualGL/virtualgl#113 (comment)
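As a hedged sketch of that configuration: VirtualGL's EGL back end renders through a DRM device rather than a GLX-capable X server, and the device can be passed with `vglrun -d`. The card index is a host-specific assumption.

```shell
#!/bin/sh
# Sketch: use VirtualGL's EGL back end via a DRM device instead of GLX.
# `vglrun -d <device>` follows the VirtualGL EGL documentation referenced
# above; /dev/dri/card0 is a host-specific assumption.
if command -v vglrun >/dev/null 2>&1 && [ -e /dev/dri/card0 ]; then
  vglrun -d /dev/dri/card0 glxinfo -B   # render via EGL on card0
  status="ran"
else
  echo "skipping: vglrun or /dev/dri/card0 not available"
  status="skipped"
fi
```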
EGL is also required to start a Wayland compositor inside a container with the EGLStreams specification on NVIDIA GPUs, which is the way forward now that X11 development has stopped.
These use cases require access to the devices /dev/dri/cardX corresponding to each GPU provisioned using libnvidia-container. However, libnvidia-container does not seem to provision these automatically. I would like to ask whether this is possible, and how it can be configured.

2. Steps to reproduce the issue
Provision one GPU inside the container nvidia/cudagl:11.0-devel-ubuntu20.04 or nvidia/opengl:1.2-glvnd-devel-ubuntu20.04 in Docker CE 19.03 (or using one nvidia.com/gpu: 1 with k8s-device-plugin v0.7.0 with default configurations in Kubernetes v1.18.6).

Do:
ls /dev
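The reproduction steps can be sketched as a single command, using the image tag from the report. It requires a host with Docker and the NVIDIA runtime, so the sketch is guarded to be a no-op elsewhere.

```shell
#!/bin/sh
# Sketch: list /dev inside a GPU container and check for /dev/dri.
if command -v docker >/dev/null 2>&1 && [ -e /dev/nvidiactl ]; then
  docker run --rm --gpus 1 \
    nvidia/cudagl:11.0-devel-ubuntu20.04 \
    sh -c 'ls /dev; if [ -d /dev/dri ]; then echo "/dev/dri present"; else echo "/dev/dri missing"; fi'
  status="ran"
else
  echo "skipping: docker or NVIDIA devices not available"
  status="skipped"
fi
```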
Result: Inside the container you see /dev/nvidiaX, /dev/nvidia-modeset, /dev/nvidia-uvm, and /dev/nvidia-uvm-tools; HOWEVER, the directory /dev/dri does not exist.

Wayland compositors are unlikely to start inside a container without DRM devices. VirtualGL does not work through any devices other than /dev/dri/cardX either.

3. Information to attach (optional if deemed irrelevant)
Other issues and repositories:

Example of VirtualGL EGL configuration that requires /dev/dri/cardX: https://github.com/ehfd/docker-nvidia-egl-desktop

Implementation of an unprivileged remote desktop bundling an X server with many hacks: https://github.com/ehfd/docker-nvidia-glx-desktop