Missing config file in /etc/ld.so.conf.d/ pointing to /lib64 causes ldconfig to cache the wrong glibc and breaks the k8s-device-plugin container image #1182

Open
fangpenlin opened this issue Feb 27, 2025 · 0 comments · May be fixed by #1183


I followed the guide described here to install the plugin Helm chart. A few containers show error messages like this:

symbol lookup error: /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libc.so.6: undefined symbol: __tunable_is_initialized, version GLIBC_PRIVATE

By the way, this Kubernetes cluster is running on top of NixOS. I can reproduce the error with a simple podman CLI run like this:

$ podman run -it --entrypoint=/bin/sh --rm --device nvidia.com/gpu=all --security-opt=label=disable nvcr.io/nvidia/k8s-device-plugin:v0.17.0
/bin/sh: symbol lookup error: /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libc.so.6: undefined symbol: __tunable_is_initialized, version GLIBC_PRIVATE

After digging into the problem, I found that the ldconfig cache OCI hook generated in the CDI config is the root cause:

      {
        "hookName": "createContainer",
        "path": "/nix/store/il5kz2p67hdd05c9gmg8m5c5l8gbrk90-container-toolkit-container-toolkit-1.15.0-rc.3/bin/nvidia-ctk",
        "args": [
          "nvidia-ctk",
          "hook",
          "update-ldcache",
          "--ldconfig-path",
          "/nix/store/29mb4q8b5306f4gk2wh38h0c1akb0n97-glibc-2.40-36-bin/bin/ldconfig",
          "--folder",
          "/nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib"
        ]
      }
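
To see at a glance which ldconfig binary and which folders such a hook passes along, the entry can be parsed with a few lines of Python. This is only a sketch: the `flag_values` helper is a hypothetical name, and it assumes each flag is followed directly by its value, as in the entry above.

```python
import json

# Hook entry copied from the CDI config above (store paths are specific to this host).
HOOK_JSON = """
{
  "hookName": "createContainer",
  "path": "/nix/store/il5kz2p67hdd05c9gmg8m5c5l8gbrk90-container-toolkit-container-toolkit-1.15.0-rc.3/bin/nvidia-ctk",
  "args": [
    "nvidia-ctk", "hook", "update-ldcache",
    "--ldconfig-path", "/nix/store/29mb4q8b5306f4gk2wh38h0c1akb0n97-glibc-2.40-36-bin/bin/ldconfig",
    "--folder", "/nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib"
  ]
}
"""


def flag_values(args, flag):
    """Return every value that directly follows `flag` in an argv-style list."""
    return [args[i + 1] for i, arg in enumerate(args[:-1]) if arg == flag]


hook = json.loads(HOOK_JSON)
ldconfig_paths = flag_values(hook["args"], "--ldconfig-path")
folders = flag_values(hook["args"], "--folder")

print("ldconfig binary:", ldconfig_paths)
print("extra folders:  ", folders)
```

Note that the ldconfig binary here comes from the host's glibc 2.40 package, not from the container image.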

The nvidia-ctk command line tool creates a config file such as nvcr-1167929244.conf in /etc/ld.so.conf.d for the mounted folders we provide via the --folder argument. Before the hook runs, the ld cache looks like this:

$ ldconfig -p
131 libs found in cache `/etc/ld.so.cache'
        libzstd.so.1 (libc6,x86-64) => /lib64/libzstd.so.1
        libz.so.1 (libc6,x86-64) => /lib64/libz.so.1
        libyaml-0.so.2 (libc6,x86-64) => /lib64/libyaml-0.so.2
        libxml2.so.2 (libc6,x86-64) => /lib64/libxml2.so.2
        libverto.so.1 (libc6,x86-64) => /lib64/libverto.so.1
        libuuid.so.1 (libc6,x86-64) => /lib64/libuuid.so.1
        libutil.so.1 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libutil.so.1
        libunistring.so.2 (libc6,x86-64) => /lib64/libunistring.so.2
        libudev.so.1 (libc6,x86-64) => /lib64/libudev.so.1
        libtinfo.so.6 (libc6,x86-64) => /lib64/libtinfo.so.6
        libtic.so.6 (libc6,x86-64) => /lib64/libtic.so.6
        libthread_db.so.1 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libthread_db.so.1
        libtasn1.so.6 (libc6,x86-64) => /lib64/libtasn1.so.6
        libsystemd.so.0 (libc6,x86-64) => /lib64/libsystemd.so.0
        libstdc++.so.6 (libc6,x86-64) => /lib64/libstdc++.so.6
        libssl.so.3 (libc6,x86-64) => /lib64/libssl.so.3
        libsqlite3.so.0 (libc6,x86-64) => /lib64/libsqlite3.so.0
        libsolvext.so.1 (libc6,x86-64) => /lib64/libsolvext.so.1
        libsolv.so.1 (libc6,x86-64) => /lib64/libsolv.so.1
        libsmartcols.so.1 (libc6,x86-64) => /lib64/libsmartcols.so.1
        libslapi.so.2 (libc6,x86-64) => /lib64/libslapi.so.2
        libsigsegv.so.2 (libc6,x86-64) => /lib64/libsigsegv.so.2
        libsepol.so.2 (libc6,x86-64) => /lib64/libsepol.so.2
        libsemanage.so.2 (libc6,x86-64) => /lib64/libsemanage.so.2
        libselinux.so.1 (libc6,x86-64) => /lib64/libselinux.so.1
        libsasl2.so.3 (libc6,x86-64) => /lib64/libsasl2.so.3
        librt.so.1 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/librt.so.1
        librpmio.so.9 (libc6,x86-64) => /lib64/librpmio.so.9
        librpm.so.9 (libc6,x86-64) => /lib64/librpm.so.9
        librhsm.so.0 (libc6,x86-64) => /lib64/librhsm.so.0
        libresolv.so.2 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libresolv.so.2
        librepo.so.0 (libc6,x86-64) => /lib64/librepo.so.0
        libreadline.so.8 (libc6,x86-64) => /lib64/libreadline.so.8
        libp11-kit.so.0 (libc6,x86-64) => /lib64/libp11-kit.so.0
        libpthread.so.0 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libpthread.so.0
        libpsx.so.2 (libc6,x86-64) => /lib64/libpsx.so.2
        libpopt.so.0 (libc6,x86-64) => /lib64/libpopt.so.0
        libpeas-1.0.so.0 (libc6,x86-64) => /lib64/libpeas-1.0.so.0
        libpcre2-8.so.0 (libc6,x86-64) => /lib64/libpcre2-8.so.0
        libpcre2-posix.so.3 (libc6,x86-64) => /lib64/libpcre2-posix.so.3
        libpcreposix.so.0 (libc6,x86-64) => /lib64/libpcreposix.so.0
        libpcre.so.1 (libc6,x86-64) => /lib64/libpcre.so.1
        libpcprofile.so (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libpcprofile.so
        libpanelw.so.6 (libc6,x86-64) => /lib64/libpanelw.so.6
        libpanel.so.6 (libc6,x86-64) => /lib64/libpanel.so.6
        libnssckbi.so (libc6,x86-64) => /lib64/libnssckbi.so
        libnss_systemd.so.2 (libc6,x86-64) => /lib64/libnss_systemd.so.2
        libnss_resolve.so.2 (libc6,x86-64) => /lib64/libnss_resolve.so.2
        libnss_myhostname.so.2 (libc6,x86-64) => /lib64/libnss_myhostname.so.2
        libnss_files.so.2 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libnss_files.so.2
        libnss_dns.so.2 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libnss_dns.so.2
        libnss_compat.so.2 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libnss_compat.so.2
        libnpth.so.0 (libc6,x86-64) => /lib64/libnpth.so.0
        libnghttp2.so.14 (libc6,x86-64) => /lib64/libnghttp2.so.14
        libnettle.so.8 (libc6,x86-64) => /lib64/libnettle.so.8
        libncursesw.so.6 (libc6,x86-64) => /lib64/libncursesw.so.6
        libncurses.so.6 (libc6,x86-64) => /lib64/libncurses.so.6
        libmvec.so.1 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libmvec.so.1
        libmpfr.so.6 (libc6,x86-64) => /lib64/libmpfr.so.6
        libmount.so.1 (libc6,x86-64) => /lib64/libmount.so.1
        libmodulemd.so.2 (libc6,x86-64) => /lib64/libmodulemd.so.2
        libmenuw.so.6 (libc6,x86-64) => /lib64/libmenuw.so.6
        libmenu.so.6 (libc6,x86-64) => /lib64/libmenu.so.6
        libmemusage.so (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libmemusage.so
        libmagic.so.1 (libc6,x86-64) => /lib64/libmagic.so.1
        libm.so.6 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libm.so.6
        liblz4.so.1 (libc6,x86-64) => /lib64/liblz4.so.1
        liblzma.so.5 (libc6,x86-64) => /lib64/liblzma.so.5
        liblua-5.4.so (libc6,x86-64) => /lib64/liblua-5.4.so
        libldap.so.2 (libc6,x86-64) => /lib64/libldap.so.2
        liblber.so.2 (libc6,x86-64) => /lib64/liblber.so.2
        libk5crypto.so.3 (libc6,x86-64) => /lib64/libk5crypto.so.3
        libksba.so.8 (libc6,x86-64) => /lib64/libksba.so.8
        libkrb5support.so.0 (libc6,x86-64) => /lib64/libkrb5support.so.0
        libkrb5.so.3 (libc6,x86-64) => /lib64/libkrb5.so.3
        libkrad.so.0 (libc6,x86-64) => /lib64/libkrad.so.0
        libkeyutils.so.1 (libc6,x86-64) => /lib64/libkeyutils.so.1
        libkdb5.so.10 (libc6,x86-64) => /lib64/libkdb5.so.10
        libjson-glib-1.0.so.0 (libc6,x86-64) => /lib64/libjson-glib-1.0.so.0
        libjson-c.so.5 (libc6,x86-64) => /lib64/libjson-c.so.5
        libidn2.so.0 (libc6,x86-64) => /lib64/libidn2.so.0
        libhogweed.so.6 (libc6,x86-64) => /lib64/libhogweed.so.6
        libhistory.so.8 (libc6,x86-64) => /lib64/libhistory.so.8
        libgthread-2.0.so.0 (libc6,x86-64) => /lib64/libgthread-2.0.so.0
        libgssrpc.so.4 (libc6,x86-64) => /lib64/libgssrpc.so.4
        libgssapi_krb5.so.2 (libc6,x86-64) => /lib64/libgssapi_krb5.so.2
        libgpgme.so.11 (libc6,x86-64) => /lib64/libgpgme.so.11
        libgpg-error.so.0 (libc6,x86-64) => /lib64/libgpg-error.so.0
        libgobject-2.0.so.0 (libc6,x86-64) => /lib64/libgobject-2.0.so.0
        libgnutls.so.30 (libc6,x86-64) => /lib64/libgnutls.so.30
        libgmp.so.10 (libc6,x86-64) => /lib64/libgmp.so.10
        libgmodule-2.0.so.0 (libc6,x86-64) => /lib64/libgmodule-2.0.so.0
        libglib-2.0.so.0 (libc6,x86-64) => /lib64/libglib-2.0.so.0
        libgirepository-1.0.so.1 (libc6,x86-64) => /lib64/libgirepository-1.0.so.1
        libgio-2.0.so.0 (libc6,x86-64) => /lib64/libgio-2.0.so.0
        libgdbm_compat.so.4 (libc6,x86-64) => /lib64/libgdbm_compat.so.4
        libgdbm.so.6 (libc6,x86-64) => /lib64/libgdbm.so.6
        libgcrypt.so.20 (libc6,x86-64) => /lib64/libgcrypt.so.20
        libgcc_s.so.1 (libc6,x86-64) => /lib64/libgcc_s.so.1
        libformw.so.6 (libc6,x86-64) => /lib64/libformw.so.6
        libform.so.6 (libc6,x86-64) => /lib64/libform.so.6
        libffi.so.8 (libc6,x86-64) => /lib64/libffi.so.8
        libevent_pthreads-2.1.so.7 (libc6,x86-64) => /lib64/libevent_pthreads-2.1.so.7
        libevent_openssl-2.1.so.7 (libc6,x86-64) => /lib64/libevent_openssl-2.1.so.7
        libevent_extra-2.1.so.7 (libc6,x86-64) => /lib64/libevent_extra-2.1.so.7
        libevent_core-2.1.so.7 (libc6,x86-64) => /lib64/libevent_core-2.1.so.7
        libevent-2.1.so.7 (libc6,x86-64) => /lib64/libevent-2.1.so.7
        libdrop_ambient.so.0 (libc6,x86-64) => /lib64/libdrop_ambient.so.0
        libdnf.so.2 (libc6,x86-64) => /lib64/libdnf.so.2
        libdl.so.2 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libdl.so.2
        libcurl.so.4 (libc6,x86-64) => /lib64/libcurl.so.4
        libcrypto.so.3 (libc6,x86-64) => /lib64/libcrypto.so.3
        libcrypt.so.2 (libc6,x86-64) => /lib64/libcrypt.so.2
        libcom_err.so.2 (libc6,x86-64) => /lib64/libcom_err.so.2
        libcap.so.2 (libc6,x86-64) => /lib64/libcap.so.2
        libcap-ng.so.0 (libc6,x86-64) => /lib64/libcap-ng.so.0
        libc_malloc_debug.so.0 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libc_malloc_debug.so.0
        libc.so.6 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libc.so.6
        libbz2.so.1 (libc6,x86-64) => /lib64/libbz2.so.1
        libblkid.so.1 (libc6,x86-64) => /lib64/libblkid.so.1
        libauparse.so.0 (libc6,x86-64) => /lib64/libauparse.so.0
        libauparse.so (libc6,x86-64) => /lib64/libauparse.so
        libaudit.so.1 (libc6,x86-64) => /lib64/libaudit.so.1
        libattr.so.1 (libc6,x86-64) => /lib64/libattr.so.1
        libassuan.so.0 (libc6,x86-64) => /lib64/libassuan.so.0
        libarchive.so.13 (libc6,x86-64) => /lib64/libarchive.so.13
        libanl.so.1 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libanl.so.1
        libacl.so.1 (libc6,x86-64) => /lib64/libacl.so.1
        libSegFault.so (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libSegFault.so
        libBrokenLocale.so.1 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libBrokenLocale.so.1
        ld-linux-x86-64.so.2 (libc6,x86-64) => /lib64/ld-linux-x86-64.so.2
Cache generated by: ldconfig (GNU libc) stable release version 2.34

But right after the ldconfig hook runs, here's what the ld cache looks like inside the container's namespace:

ldconfig -r /home/fangpen/.local/share/containers/storage/overlay/3e2b7a6d05ca228fafe39b47a19952d4b81a9c413618c66a7013d794e4db3d96/merged -C /etc/ld.so.cache -f /etc/ld.so.conf -p
64 libs found in cache `/etc/ld.so.cache'
        libutil.so.1 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libutil.so.1
        libutil.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libutil.so
        libthread_db.so.1 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libthread_db.so.1
        libthread_db.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libthread_db.so
        librt.so.1 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/librt.so.1
        librt.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/librt.so
        libresolv.so.2 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libresolv.so.2
        libresolv.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libresolv.so
        libpthread.so.0 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libpthread.so.0
        libpthread.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libpthread.so
        libpcprofile.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libpcprofile.so
        libnvoptix.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvoptix.so.1
        libnvidia-vksc-core.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-vksc-core.so.1
        libnvidia-tls.so.565.77 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-tls.so.565.77
        libnvidia-sandboxutils.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-sandboxutils.so.1
        libnvidia-rtcore.so.565.77 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-rtcore.so.565.77
        libnvidia-ptxjitcompiler.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-ptxjitcompiler.so.1
        libnvidia-pkcs11-openssl3.so.565.77 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-pkcs11-openssl3.so.565.77
        libnvidia-opticalflow.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-opticalflow.so.1
        libnvidia-opencl.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-opencl.so.1
        libnvidia-nvvm.so.4 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-nvvm.so.4
        libnvidia-ngx.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-ngx.so.1
        libnvidia-ml.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-ml.so.1
        libnvidia-gpucomp.so.565.77 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-gpucomp.so.565.77
        libnvidia-glvkspirv.so.565.77 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-glvkspirv.so.565.77
        libnvidia-glsi.so.565.77 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-glsi.so.565.77
        libnvidia-glcore.so.565.77 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-glcore.so.565.77
        libnvidia-fbc.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-fbc.so.1
        libnvidia-encode.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-encode.so.1
        libnvidia-eglcore.so.565.77 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-eglcore.so.565.77
        libnvidia-egl-gbm.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-egl-gbm.so.1
        libnvidia-cfg.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-cfg.so.1
        libnvidia-allocator.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvidia-allocator.so.1
        libnvcuvid.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libnvcuvid.so.1
        libnss_hesiod.so.2 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libnss_hesiod.so.2
        libnss_hesiod.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libnss_hesiod.so
        libnss_files.so.2 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libnss_files.so.2
        libnss_dns.so.2 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libnss_dns.so.2
        libnss_db.so.2 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libnss_db.so.2
        libnss_db.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libnss_db.so
        libnss_compat.so.2 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libnss_compat.so.2
        libnss_compat.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libnss_compat.so
        libnsl.so.1 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libnsl.so.1
        libmvec.so.1 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libmvec.so.1
        libmvec.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libmvec.so
        libmemusage.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libmemusage.so
        libm.so.6 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libm.so.6
        libglxserver_nvidia.so.565.77 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libglxserver_nvidia.so.565.77
        libdl.so.2 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libdl.so.2
        libdl.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libdl.so
        libcudadebugger.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libcudadebugger.so.1
        libcuda.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libcuda.so.1
        libc_malloc_debug.so.0 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libc_malloc_debug.so.0
        libc_malloc_debug.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libc_malloc_debug.so
        libc.so.6 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libc.so.6
        libanl.so.1 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libanl.so.1
        libanl.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libanl.so
        libGLX_nvidia.so.0 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libGLX_nvidia.so.0
        libGLESv2_nvidia.so.2 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libGLESv2_nvidia.so.2
        libGLESv1_CM_nvidia.so.1 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libGLESv1_CM_nvidia.so.1
        libEGL_nvidia.so.0 (libc6,x86-64) => /nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib/libEGL_nvidia.so.0
        libBrokenLocale.so.1 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libBrokenLocale.so.1
        libBrokenLocale.so (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libBrokenLocale.so
        ld-linux-x86-64.so.2 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/ld-linux-x86-64.so.2
Cache generated by: ldconfig (GNU libc) stable release version 2.40

As you can see, libc now points to the one provided by my CDI config file, which the NVIDIA drivers and executables rely on:

libc.so.6 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libc.so.6
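
Comparing two `ldconfig -p` dumps by eye is tedious, so here is a minimal Python sketch that parses them and lists the sonames whose resolved path changed. The function name `parse_ldconfig_p` is hypothetical, and the sample text is just a few lines taken from the two dumps above.

```python
import re

# Matches entry lines such as:
#   libc.so.6 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libc.so.6
ENTRY = re.compile(r"^\s*(\S+)\s+\(([^)]*)\)\s+=>\s+(\S+)\s*$")


def parse_ldconfig_p(text):
    """Map each soname in `ldconfig -p` output to the first path listed for it."""
    libs = {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            soname, _flags, path = match.groups()
            libs.setdefault(soname, path)
    return libs


BEFORE = """\
131 libs found in cache `/etc/ld.so.cache'
        libc.so.6 (libc6,x86-64, OS ABI: Linux 3.2.0) => /lib64/libc.so.6
        ld-linux-x86-64.so.2 (libc6,x86-64) => /lib64/ld-linux-x86-64.so.2
Cache generated by: ldconfig (GNU libc) stable release version 2.34
"""

AFTER = """\
64 libs found in cache `/etc/ld.so.cache'
        libc.so.6 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/libc.so.6
        ld-linux-x86-64.so.2 (libc6,x86-64) => /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36/lib/ld-linux-x86-64.so.2
Cache generated by: ldconfig (GNU libc) stable release version 2.40
"""

before = parse_ldconfig_p(BEFORE)
after = parse_ldconfig_p(AFTER)

# Sonames present in both caches whose resolved path differs.
moved = sorted(s for s in before if s in after and before[s] != after[s])
for soname in moved:
    print(f"{soname}: {before[soname]} -> {after[soname]}")
```

Feeding it the full dumps instead of the excerpts should show every glibc library that moved from /lib64 to the Nix store.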

I guess the glibc version built into the plugin container image is different from the one shipped with my NixOS NVIDIA binaries (the two cache dumps above were generated by ldconfig 2.34 and 2.40 respectively). As a result, we see this error:

undefined symbol: __tunable_is_initialized, version GLIBC_PRIVATE

Solving this problem is fairly simple. One only needs to create a lib64.conf file inside the /etc/ld.so.conf.d folder that points to /lib64 in the container in the first place. Like this:

$ echo "/lib64" > lib64.conf
$ podman run -v $PWD/lib64.conf:/etc/ld.so.conf.d/lib64.conf -it --entrypoint=/bin/sh --rm --device nvidia.com/gpu=all --security-opt=label=disable nvcr.io/nvidia/k8s-device-plugin:v0.17.0
sh-5.1#
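
To check whether an image's root filesystem already lists /lib64, something like the following Python sketch could be used. It is a simplification under stated assumptions: it only reads `*.conf` files under `etc/ld.so.conf.d`, keeps only absolute-path entries (so `include` lines are skipped), and the helper names `conf_dirs` and `lists_dir` are hypothetical. The demo builds a throwaway directory tree rather than touching a real container.

```python
import tempfile
from pathlib import Path


def conf_dirs(root):
    """Collect absolute directory entries from <root>/etc/ld.so.conf.d/*.conf."""
    conf_d = Path(root) / "etc" / "ld.so.conf.d"
    if not conf_d.is_dir():
        return []
    dirs = []
    for conf in sorted(conf_d.glob("*.conf")):
        for line in conf.read_text().splitlines():
            entry = line.split("#", 1)[0].strip()  # drop comments and whitespace
            if entry.startswith("/"):              # keep directory entries only
                dirs.append(entry)
    return dirs


def lists_dir(root, wanted="/lib64"):
    """Return True if some config file under `root` names `wanted`."""
    return wanted in conf_dirs(root)


with tempfile.TemporaryDirectory() as root:
    conf_d = Path(root) / "etc" / "ld.so.conf.d"
    conf_d.mkdir(parents=True)

    # State after the hook ran: only the generated NVIDIA entry exists.
    (conf_d / "nvcr-1167929244.conf").write_text(
        "/nix/store/x5522a7p46nnbwxjv8w942p6qps7x0lw-nvidia-x11-565.77-6.6.79/lib\n"
    )
    before_fix = lists_dir(root)

    # State with the proposed lib64.conf in place.
    (conf_d / "lib64.conf").write_text("/lib64\n")
    after_fix = lists_dir(root)

print("lists /lib64 before fix:", before_fix)
print("lists /lib64 after fix: ", after_fix)
```

Pointing `lists_dir` at an unpacked image root (for example the overlay `merged` directory) would tell you whether the image needs the extra config file.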

I checked other common Docker images, such as ubuntu. They don't have this issue because they ship with all the needed ld config files:

$ podman run -it ubuntu
root@3e192d26a2c8:/# ls -al /etc/ld.so.conf.d/
total 16
drwxr-xr-x 2 root root 4096 Jan 27 02:09 .
drwxr-xr-x 1 root root 4096 Feb 27 20:30 ..
-rw-r--r-- 1 root root   44 Aug  2  2022 libc.conf
-rw-r--r-- 1 root root  100 Mar 30  2024 x86_64-linux-gnu.conf

I think it makes sense to update the Dockerfile in this repo to include the ldconfig config file so that we can avoid problems like this. I am going to create a PR shortly to add the missing config file.

fangpenlin added a commit to fangpenlin/k8s-device-plugin that referenced this issue Feb 27, 2025
fangpenlin added a commit to fangpenlin/k8s-device-plugin that referenced this issue Feb 28, 2025
…b64 in plugin container image

Signed-off-by: Fang-Pen Lin <[email protected]>