As the title says, I'm trying to test restoring a PVC (mainly, but also its VM), backed up and existing in one namespace, into another (empty) namespace.
The issue is that 9 times out of 10 the restore ends up creating the PVC in the trident-protect namespace. The other times, without changing anything except the name of the restore CR, it works and creates the PVC in the intended namespace.
Here is the yaml I use (I tried tridentctl-protect as well, with the same results, but the command is a bit too long to paste here):
Everything is internal and for test purposes, so there is no need to worry about the appvault name or path being disclosed.
I had to include two resourceMatchers because, although the backup contains the PVC bound to this VM, restoring only the VM (when it worked) would complain about the PVC not existing, and the VM could then not start since the PVC is part of its definition. storageClassMapping seems to have no effect here since there is only this one storage class in our cluster.
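For reference, the restore CR has roughly this shape (a sketch only: the names, namespaces, and archive path below are placeholders, not the actual values from my manifest, and field names are as I understand them from the Trident protect docs):

```yaml
apiVersion: protect.trident.netapp.io/v1
kind: BackupRestore
metadata:
  name: test-cos-restore-new-10        # bumped on every retry
  namespace: test-migr2                # destination namespace
spec:
  appVaultRef: s3-atl2
  appArchivePath: <archive-path-within-appvault>
  namespaceMapping:
    - source: <source-namespace>
      destination: test-migr2
  # Both matchers are needed: restoring only the VM fails because
  # the PVC referenced in its definition does not exist yet.
  resourceFilter:
    resourceSelectionCriteria: Include
    resourceMatchers:
      - kind: PersistentVolumeClaim
        version: v1
      - group: kubevirt.io
        kind: VirtualMachine
        version: v1
  storageClassMapping:
    - source: aff-volume
      destination: aff-volume
```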
Versions:
trident-protect-100.2410.1, trident-operator-100.2410.0
Nodes run on Debian GNU/Linux 12, kernel 6.1.0-23-amd64, containerd://1.7.20 with Kubernetes at v1.29.8
Kubevirt is on top at v1.3.1
How I do it:
-- after a restore worked OK:
tridentctl-protect get backuprestore -n test-migr2
k -n test-migr2 get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
feds-pvc Bound pvc-cebef5e5-ff9b-4453-989f-b251fe1f6fad 1Gi RWO aff-volume <unset> 83s
k -n test-migr2 get vms
NAME AGE STATUS READY
feds-cos 89s Running True
-- delete all:
k -n test-migr2 delete vm feds-cos
virtualmachine.kubevirt.io "feds-cos" deleted
k -n test-migr2 delete pvc feds-pvc
persistentvolumeclaim "feds-pvc" deleted
k -n migr2 get all,pvc
Warning: kubevirt.io/v1 VirtualMachineInstancePresets is now deprecated and will be removed in v2.
No resources found
k delete -f test-cos-restore-to-new-new-ns.yaml
backuprestore.protect.trident.netapp.io "test-cos-restore-new-9" deleted
-- edit the yaml to set a new name for the restore CR:
vim test-cos-restore-to-new-new-ns.yaml
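For repeated runs, bumping the CR name can also be scripted instead of hand-edited in vim (a sketch; the manifest here is a stand-in file, and the names match this test run):

```shell
#!/bin/sh
# Stand-in manifest with the current restore CR name
# (in practice this would be test-cos-restore-to-new-new-ns.yaml).
f=$(mktemp)
printf 'metadata:\n  name: test-cos-restore-new-9\n' > "$f"

# Bump the CR name so the next apply creates a fresh BackupRestore object.
sed -i 's/test-cos-restore-new-9/test-cos-restore-new-10/' "$f"

grep 'name:' "$f"
rm -f "$f"
```

After the rename, `k apply -f` creates a new BackupRestore rather than patching the old one.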
-- run another restore:
k apply -f test-cos-restore-to-new-new-ns.yaml
-- result:
tridentctl-protect get backuprestore -n test-migr2
+-------------------------+----------+--------+-------+--------------------------------+
| NAME | APPVAULT | STATE | AGE | ERROR |
+-------------------------+----------+--------+-------+--------------------------------+
| test-cos-restore-new-10 | s3-atl2 | Failed | 6m59s | VolumeRestoreHandler failed |
| | | | | with permanent error k... |
+-------------------------+----------+--------+-------+--------------------------------+
-- or, with the full error logged:
k -n test-migr2 get backuprestore.protect.trident.netapp.io/test-cos-restore-new-10|tail
NAME STATE ERROR AGE
test-cos-restore-new-10 Failed VolumeRestoreHandler failed with permanent error kopiaVolumeRestore timed out for volume trident-protect/feds-pvc-b7f5e21a687e061098dbbb2c4cbddca0: permanent error 10m
-- the PVC again lands in the trident-protect namespace instead of my test-migr2:
k -n trident-protect get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
feds-pvc-b7f5e21a687e061098dbbb2c4cbddca0 Bound pvc-988fb8ef-030a-4dcd-b1fe-d738af254a7e 1Gi RWO aff-volume <unset> 15m
-- grepping for errors in the trident-protect pod, I get only these:
2025-01-30T12:46:55.334333584Z ERROR KopiaVolumeRestore has timed out {"controller": "kopiavolumerestore", "controllerGroup": "protect.trident.netapp.io", "controllerKind": "KopiaVolumeRestore", "KopiaVolumeRestore": {"name":"kvr-c0167ecb-af03-45ab-a461-72d658bbb44d-c3eca3af-1c0d-4b","namespace":"test-migr2"}, "namespace": "test-migr2", "name": "kvr-c0167ecb-af03-45ab-a461-72d658bbb44d-c3eca3af-1c0d-4b", "reconcileID": "b9bb55f8-2dde-43c0-9103-1699dc857847", "correlationid": "b40fd216-ea23-4ae8-8bcd-a7dbe9d7e1f0", "resourceid": "6bdcf83a-b36d-416e-aeed-4e8eea7bfb7f", "lastUpdatedAt": "2025-01-30 12:41:55 +0000 UTC", "error": "progress has not been updated in the allotted time"}
2025-01-30T12:46:55.34862339Z ERROR Reconciler error {"controller": "kopiavolumerestore", "controllerGroup": "protect.trident.netapp.io", "controllerKind": "KopiaVolumeRestore", "KopiaVolumeRestore": {"name":"kvr-c0167ecb-af03-45ab-a461-72d658bbb44d-c3eca3af-1c0d-4b","namespace":"test-migr2"}, "namespace": "test-migr2", "name": "kvr-c0167ecb-af03-45ab-a461-72d658bbb44d-c3eca3af-1c0d-4b", "reconcileID": "b9bb55f8-2dde-43c0-9103-1699dc857847", "error": "progress has not been updated in the allotted time"}
2025-01-30T12:46:55.354051338Z ERROR KopiaVolumeRestore has timed out {"controller": "kopiavolumerestore", "controllerGroup": "protect.trident.netapp.io", "controllerKind": "KopiaVolumeRestore", "KopiaVolumeRestore": {"name":"kvr-c0167ecb-af03-45ab-a461-72d658bbb44d-c3eca3af-1c0d-4b","namespace":"test-migr2"}, "namespace": "test-migr2", "name": "kvr-c0167ecb-af03-45ab-a461-72d658bbb44d-c3eca3af-1c0d-4b", "reconcileID": "fd96e6f9-9bb9-4ac4-bfb9-63cc3bbf5ae2", "correlationid": "b40fd216-ea23-4ae8-8bcd-a7dbe9d7e1f0", "resourceid": "6bdcf83a-b36d-416e-aeed-4e8eea7bfb7f", "lastUpdatedAt": "2025-01-30 12:41:55 +0000 UTC", "error": "progress has not been updated in the allotted time"}
2025-01-30T12:46:55.361262051Z ERROR Failed to update status {"controller": "kopiavolumerestore", "controllerGroup": "protect.trident.netapp.io", "controllerKind": "KopiaVolumeRestore", "KopiaVolumeRestore": {"name":"kvr-c0167ecb-af03-45ab-a461-72d658bbb44d-c3eca3af-1c0d-4b","namespace":"test-migr2"}, "namespace": "test-migr2", "name": "kvr-c0167ecb-af03-45ab-a461-72d658bbb44d-c3eca3af-1c0d-4b", "reconcileID": "fd96e6f9-9bb9-4ac4-bfb9-63cc3bbf5ae2", "correlationid": "b40fd216-ea23-4ae8-8bcd-a7dbe9d7e1f0", "resourceid": "6bdcf83a-b36d-416e-aeed-4e8eea7bfb7f", "error": "Operation cannot be fulfilled on kopiavolumerestores.protect.trident.netapp.io "kvr-c0167ecb-af03-45ab-a461-72d658bbb44d-c3eca3af-1c0d-4b": the object has been modified; please apply your changes to the latest version and try again"}
So, is something delaying the restore operations, causing trident-protect to default to its own namespace? The nodes are not under any pressure, and this is a test environment with only a few deployments. Any ideas?