ceph-csi for rbd cannot mount image after upgrade k8s to v1.31 #5066

Closed
hasonhai opened this issue Jan 9, 2025 · 5 comments
Labels
component/rbd Issues related to RBD

Comments


hasonhai commented Jan 9, 2025

Describe the bug

After we upgraded the K8S cluster from 1.30.4 to 1.31.4, ceph-rbdplugin cannot mount the image anymore. It still works fine on nodes that have kubelet version 1.30.4.
We started with ceph-csi v3.12.1. When the error occurred, we tried upgrading to v3.13.0 to see if it would fix the issue, but the behavior is the same.

Environment details

  • Image/version of Ceph CSI driver : v3.13.0
  • Helm chart version :
  • Kernel version : RHEL9 5.14.0-503.19.1.el9_5.x86_64
  • Mounter used for mounting PVC (for CephFS it's fuse or kernel; for RBD it's
    krbd or rbd-nbd) : krbd
  • Kubernetes cluster version : v1.31.4
  • Ceph cluster version : v18.2.4

Steps to reproduce

Steps to reproduce the behavior:

  1. Setup details
    Storage class:
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2021-04-01T13:37:08Z"
  name: dynamic-ceph-storage
  resourceVersion: "177756013"
  uid: a533c5dc-402c-4ad4-9a81-c543accbd954
mountOptions:
- nodelalloc
parameters:
  clusterID: --masked--
  csi.storage.k8s.io/controller-expand-secret-name: ceph-user-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: access-control
  csi.storage.k8s.io/fstype: ext4
  csi.storage.k8s.io/node-stage-secret-name: ceph-user-secret
  csi.storage.k8s.io/node-stage-secret-namespace: access-control
  csi.storage.k8s.io/provisioner-secret-name: ceph-user-secret
  csi.storage.k8s.io/provisioner-secret-namespace: access-control
  imageFeatures: layering
  pool: k8s-sharedpool
provisioner: rbd.csi.ceph.com
reclaimPolicy: Delete
volumeBindingMode: Immediate

User permission:

[client.kube]
        key = --masked--
        caps mon = "allow r"
        caps osd = "allow class-read object_prefix rbd_children, allow rwx pool=k8s-sharedpool"

We also tried the new capabilities from the docs, but it did not help (a sketch of creating such a user follows the error output below):

[client.newkube]
        key = --masked--
        caps mgr = "profile rbd pool=k8s-sharedpool"
        caps mon = "profile rbd"
        caps osd = "profile rbd pool=k8s-sharedpool"
  2. Deployment to trigger the issue '....'
  3. See error
    Pod stuck in the Init stage with the reported error:
 Normal   Scheduled               95s                default-scheduler        Successfully assigned logging-system/aap-es-data-1 to defr4app510
  Warning  FailedAttachVolume      95s                attachdetach-controller  Multi-Attach error for volume "pvc-0091ed72-b8d3-4642-9c65-cb45ddfc328e" Volume is already exclusively attached to one node
  Normal   SuccessfulAttachVolume  85s                attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-0091ed72-b8d3-4642-9c65-cb45ddfc328e"
  Warning  FailedMount             18s (x8 over 84s)  kubelet                  MountVolume.MountDevice failed for volume "pvc-0091ed72-b8d3-4642-9c65-cb45ddfc328e" : rpc error: code = Internal desc = exi
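For reference, a user with the profile-based capabilities shown above could be created along these lines (a sketch only; the client name and pool are taken from the caps above, and the exact invocation may differ in your environment):

# Sketch: create a client with profile-based caps (names from above; adjust as needed).
ceph auth get-or-create client.newkube \
    mon 'profile rbd' \
    osd 'profile rbd pool=k8s-sharedpool' \
    mgr 'profile rbd pool=k8s-sharedpool'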

Actual results

The node can map the block device but cannot mount it. From the logs, I think the driver tries to read the block device's metadata with the blkid command but does not succeed. Everything works fine on nodes with kubelet v1.30.

Expected behavior

The node can map and mount the block device and provide it to the pods.

Logs

If the issue is in PVC mounting, please attach complete logs of the containers below.

  • csi-rbdplugin/csi-cephfsplugin and driver-registrar container logs from
    plugin pod from the node where the mount is failing.
I0109 09:44:29.224808  941792 nodeserver.go:422] ID: 327 Req-ID: 0001-0024-4d3a09c7-d8d2-4927-91cd-08ca6601d0b2-0000000000000007-fe3ca7ee-580e-11ec-b976-a289cdd026fa rbd image: k8s-sharedpool/csi-vol-fe3ca7ee-580e-11ec-b976-a289cdd026fa was successfully mapped at /dev/rbd0
I0109 09:44:29.224926  941792 mount_linux.go:577] Attempting to determine if disk "/dev/rbd0" is formatted using blkid with args: ([-p -s TYPE -s PTTYPE -o export /dev/rbd0])
I0109 09:44:29.227079  941792 mount_linux.go:580] Output: "blkid: error: /dev/rbd0: Operation not permitted\n"
E0109 09:44:29.229984  941792 nodeserver.go:825] ID: 327 Req-ID: 0001-0024-4d3a09c7-d8d2-4927-91cd-08ca6601d0b2-0000000000000007-fe3ca7ee-580e-11ec-b976-a289cdd026fa failed to run mkfs.ext4 ([-m0 -Enodiscard,lazy_itable_init=1,lazy_journal_init=1 /dev/rbd0]) error: exit status 1, output: mke2fs 1.46.5 (30-Dec-2021)
mkfs.ext4: Operation not permitted while trying to determine filesystem size
I0109 09:44:29.311555  941792 cephcmds.go:105] ID: 327 Req-ID: 0001-0024-4d3a09c7-d8d2-4927-91cd-08ca6601d0b2-0000000000000007-fe3ca7ee-580e-11ec-b976-a289cdd026fa command succeeded: rbd [unmap /dev/rbd0 --device-type krbd --options noudev]
E0109 09:44:29.311786  941792 utils.go:245] ID: 327 Req-ID: 0001-0024-4d3a09c7-d8d2-4927-91cd-08ca6601d0b2-0000000000000007-fe3ca7ee-580e-11ec-b976-a289cdd026fa GRPC error: rpc error: code = Internal desc = exit status 1

Note: if it's an RBD issue, please provide only RBD-related logs; if it's a
CephFS issue, please provide CephFS logs.

nixpanic added the component/rbd label on Jan 9, 2025

nixpanic (Member) commented Jan 9, 2025

Hi,

These lines in the logs show most clearly what is happening:

I0109 09:44:29.224808  941792 nodeserver.go:422] ID: 327 Req-ID: 0001-0024-4d3a09c7-d8d2-4927-91cd-08ca6601d0b2-0000000000000007-fe3ca7ee-580e-11ec-b976-a289cdd026fa rbd image: k8s-sharedpool/csi-vol-fe3ca7ee-580e-11ec-b976-a289cdd026fa was successfully mapped at /dev/rbd0
I0109 09:44:29.224926  941792 mount_linux.go:577] Attempting to determine if disk "/dev/rbd0" is formatted using blkid with args: ([-p -s TYPE -s PTTYPE -o export /dev/rbd0])
I0109 09:44:29.227079  941792 mount_linux.go:580] Output: "blkid: error: /dev/rbd0: Operation not permitted\n"
  1. Mapping the RBD image worked; it is available as the device /dev/rbd0.
  2. Reading from the device (with blkid) fails with "Operation not permitted".

Does this problem happen with any RBD-image, pre-existing as well as newly created ones?

Kubernetes 1.31 (minikube) is part of the CI runs that are done for every change, so there must be something else that causes the failure.

Were there any other changes made? Upgrade of the OS on the nodes, updates of the Ceph cluster, ... ?
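One way to narrow this down could be to compare reading the device on the host with reading it from inside the plugin container (a sketch; the namespace and pod name are placeholders, and the blkid arguments are the ones from the log above):

# On the affected node, while the image is still mapped at /dev/rbd0:
blkid -p -s TYPE -s PTTYPE -o export /dev/rbd0

# The same check from inside the csi-rbdplugin container on that node
# (namespace and pod name are placeholders):
kubectl -n <csi-namespace> exec <csi-rbdplugin-pod> -c csi-rbdplugin -- \
    blkid -p -s TYPE -s PTTYPE -o export /dev/rbd0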

hasonhai (Author) commented Jan 9, 2025

The working nodes and the non-working nodes differ only in the versions of Kubernetes and CRI-O.
The failing nodes have CRI-O v1.31.3 and K8S v1.31.4.
Ceph-CSI works fine on the nodes that have K8S v1.30.4 and CRI-O v1.30.4.
Their configuration is the same; we push the same config to every node.

There is no issue creating a new volume (both PVC and PV):

Name:          ceph-rbd-vol
Namespace:     kube-system
StorageClass:  dynamic-ceph-storage
Status:        Bound
Volume:        pvc-d57cd08f-222d-492f-9a9e-8d08a93c86aa
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: rbd.csi.ceph.com
               volume.kubernetes.io/storage-provisioner: rbd.csi.ceph.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      1Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       ceph-rbd-test
Events:
  Type    Reason                 Age                From                                                                                           Message
  ----    ------                 ----               ----                                                                                           -------
  Normal  ExternalProvisioning   32s (x2 over 32s)  persistentvolume-controller                                                                    Waiting for a volume to be created either by the external provisioner 'rbd.csi.ceph.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
  Normal  Provisioning           32s                rbd.csi.ceph.com_csi-rbdplugin-provisioner-7864679-n4flm_bb314cb9-8c88-4764-8299-8049e1a8e696  External provisioner is provisioning volume for claim "kube-system/ceph-rbd-vol"
  Normal  ProvisioningSucceeded  32s                rbd.csi.ceph.com_csi-rbdplugin-provisioner-7864679-n4flm_bb314cb9-8c88-4764-8299-8049e1a8e696  Successfully provisioned volume pvc-d57cd08f-222d-492f-9a9e-8d08a93c86aa
Name:            pvc-d57cd08f-222d-492f-9a9e-8d08a93c86aa
Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: rbd.csi.ceph.com
                 volume.kubernetes.io/provisioner-deletion-secret-name: ceph-user-secret
                 volume.kubernetes.io/provisioner-deletion-secret-namespace: access-control
Finalizers:      [external-provisioner.volume.kubernetes.io/finalizer kubernetes.io/pv-protection]
StorageClass:    dynamic-ceph-storage
Status:          Bound
Claim:           kube-system/ceph-rbd-vol
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        1Gi
Node Affinity:   <none>
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            rbd.csi.ceph.com
    FSType:            ext4
    VolumeHandle:      0001-0024---masked---0000000000000007-1b828329-2005-4798-8803-65f3ae1c330c
    ReadOnly:          false
    VolumeAttributes:      clusterID=--masked--
                           imageFeatures=layering
                           imageName=csi-vol-1b828329-2005-4798-8803-65f3ae1c330c
                           journalPool=k8s-sharedpool
                           pool=k8s-sharedpool
                           storage.kubernetes.io/csiProvisionerIdentity=1736421088713-9702-rbd.csi.ceph.com
Events:                <none>

The only issue is with mounting:

kdp ceph-rbd-test
Name:             ceph-rbd-test
Namespace:        kube-system
Priority:         0
Service Account:  default
Node:             --masked--
Start Time:       Thu, 09 Jan 2025 17:12:49 +0100
Labels:           <none>
Annotations:      <none>
Status:           Pending
SeccompProfile:   RuntimeDefault
IP:
IPs:              <none>
Containers:
  ceph-rbd-test:
    Container ID:
    Image:         --masked--/library/alpine:3.17
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/bash
    Args:
      -c
      touch /usr/share/cephdir/hello && ls -lsah /usr/share/cephdir && rm -f /usr/share/cephdir/hello && sleep 1d
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /usr/share/cephdir from ceph-vol (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qdq8c (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   False
  Initialized                 True
  Ready                       False
  ContainersReady             False
  PodScheduled                True
Volumes:
  ceph-vol:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  ceph-rbd-vol
    ReadOnly:   false
  kube-api-access-qdq8c:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age              From                     Message
  ----     ------                  ----             ----                     -------
  Warning  FailedMount             15s              kubelet                  Unable to attach or mount volumes: unmounted volumes=[ceph-vol], unattached volumes=[], failed to process volumes=[ceph-vol]: error processing PVC kube-system/ceph-rbd-vol: PVC is not bound
  Normal   SuccessfulAttachVolume  15s              attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-d57cd08f-222d-492f-9a9e-8d08a93c86aa"
  Warning  FailedMount             0s (x2 over 1s)  kubelet                  MountVolume.MountDevice failed for volume "pvc-d57cd08f-222d-492f-9a9e-8d08a93c86aa" : rpc error: code = Internal desc = exit status 1
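For completeness, the objects above can be inspected with standard kubectl commands (namespace and names taken from the output above):

kubectl -n kube-system describe pvc ceph-rbd-vol
kubectl describe pv pvc-d57cd08f-222d-492f-9a9e-8d08a93c86aa
kubectl -n kube-system describe pod ceph-rbd-test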

Madhu-1 (Collaborator) commented Jan 9, 2025

If I remember correctly, there was a bug in CRI-O; @iPraveenParihar or @Nikhil-Ladha might have a link to it.
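If it is that runtime bug, a quick way to confirm the exact runtime versions on a failing node might be the standard version flags (output format varies between builds):

# Run on a node where the mount fails:
crio --version
crun --version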

Nikhil-Ladha (Contributor) commented:

This is the fix for the issue: containers/crun#1614
I am not sure what the backport status is for it; the linked issue only talks about OCP and not any other flavour of k8s.
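A possible workaround until a fixed runtime ships could be to swap in a newer crun binary on the affected nodes (a sketch only; the downloaded binary name and the install path are assumptions for a static CRI-O bundle, so adjust them to your setup):

# Sketch: replace the bundled crun with a release containing the fix, then restart CRI-O.
# Source binary name and destination path are assumptions; adjust as needed.
install -m 0755 ./crun-1.19.1-linux-amd64 /usr/local/bin/crun
systemctl restart crio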

hasonhai (Author) commented:

I can now confirm that the issue is fixed after I replaced crun v1.18 from the static release bundle of CRI-O with crun 1.19.1, which was released three weeks ago.
Thanks, everyone, for helping.
