Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbd: VolumeGroupReplicationContent controller to regenerate the OMAP data #4750

Open
wants to merge 4 commits into
base: devel
Choose a base branch
from

Conversation

iPraveenParihar
Copy link
Contributor

@iPraveenParihar iPraveenParihar commented Aug 6, 2024

Describe what this PR does

This commit adds new controller that watches for the
VolumeGroupReplicationContent and regenerates the OMAP data if
it doesn't exists.

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Reviewed the developer guide on Submitting a Pull
    Request
  • Pending release notes updated with breaking and/or notable changes for the next major release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

Show available bot commands

These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated
    failure (please report the failure too!)

@iPraveenParihar iPraveenParihar self-assigned this Aug 6, 2024
@mergify mergify bot added the component/rbd Issues related to RBD label Aug 6, 2024
type volumeGroup struct {
// id is a unique value for this volume group in the Ceph cluster, it
// VolumeGroup handles all requests for 'rbd group' operations.
type VolumeGroup struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't export these, they are private intentional. They implement the interface in the internal/rbd/types/*.go files. You can use rbd.NewManager() to get a manager struct and use the Manager API to get the VolumeGroup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are referring to GetVolumeGroupByID, then It doesn't suite well to use it here. Because in GetVolumeGroup we have GetPoolName by LocationID (which is PoolID) and PoolID may be different in secondary cluster.

pool, err := util.GetPoolName(mons, creds, csiID.LocationID)
if err != nil {
return nil, fmt.Errorf("failed to get pool for volume group id %q: %w", id, err)
}

Else we can do it by passing an optional parameter PoolName for GetVolumeGroup ?
If it is passed as empty, then call GetPoolName else use the PoolName passed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to add more functions to the manager and other types. Ideally the API that the manager provides stays simple to use.

@iPraveenParihar iPraveenParihar force-pushed the vg/replication-omap-regenerate branch 4 times, most recently from 2a75eec to 6d757b8 Compare August 6, 2024 11:26
@iPraveenParihar iPraveenParihar marked this pull request as ready for review August 6, 2024 11:31
@iPraveenParihar
Copy link
Contributor Author

Initial Test results -

E0806 10:45:48.804306       1 omap.go:80] omap not found (pool="replicapool", namespace="", name="csi.groups.default"): rados: ret=-2, No such file or directory
I0806 10:45:48.804338       1 rbd_journal.go:764] the journal does not contain a reservation for a volume group with name "vgrcontent-88b1d055-4c66-4218-a81c-487ba0b58ccf" yet
I0806 10:45:48.811316       1 omap.go:159] set omap keys (pool="replicapool", namespace="", name="csi.groups.default"): map[csi.volume.group.vgrcontent-88b1d055-4c66-4218-a81c-487ba0b58ccf:88b1d055-4c66-4218-a81c-487ba0b58ccf])
I0806 10:45:48.814878       1 omap.go:159] set omap keys (pool="replicapool", namespace="", name="csi.volume.group.88b1d055-4c66-4218-a81c-487ba0b58ccf"): map[csi.groupname:csi-vol-group-88b1d055-4c66-4218-a81c-487ba0b58ccf csi.volname:vgrcontent-88b1d055-4c66-4218-a81c-487ba0b58ccf])
I0806 10:45:48.816837       1 omap.go:89] got omap values: (pool="replicapool", namespace="", name="csi.volume.5bfefce1-6427-45a3-b088-6d43d19b1e9c"): map[csi.imageid:288758945f2b8 csi.imagename:csi-vol-5bfefce1-6427-45a3-b088-6d43d19b1e9c csi.volname:pvc-34dc4369-2eb3-492a-88aa-5cd37b557154 csi.volume.owner:rook-ceph]
I0806 10:45:48.838187       1 omap.go:89] got omap values: (pool="replicapool", namespace="", name="csi.volume.896c6912-604b-43db-916c-1e386a71973f"): map[csi.imageid:216b4d44baf60 csi.imagename:csi-vol-896c6912-604b-43db-916c-1e386a71973f csi.volname:pvc-2506eb3c-69e9-4054-8b8d-405de060c0f4 csi.volume.owner:rook-ceph]
I0806 10:45:48.856998       1 rbd_journal.go:799] adding volume mapping for volume "0001-0009-rook-ceph-0000000000000001-5bfefce1-6427-45a3-b088-6d43d19b1e9c" to volume group "csi-vol-group-88b1d055-4c66-4218-a81c-487ba0b58ccf"
I0806 10:45:48.860180       1 omap.go:159] set omap keys (pool="replicapool", namespace="", name="csi.volume.group.88b1d055-4c66-4218-a81c-487ba0b58ccf"): map[0001-0009-rook-ceph-0000000000000001-5bfefce1-6427-45a3-b088-6d43d19b1e9c:])
I0806 10:45:48.860287       1 rbd_journal.go:799] adding volume mapping for volume "0001-0009-rook-ceph-0000000000000001-896c6912-604b-43db-916c-1e386a71973f" to volume group "csi-vol-group-88b1d055-4c66-4218-a81c-487ba0b58ccf"
I0806 10:45:48.863734       1 omap.go:159] set omap keys (pool="replicapool", namespace="", name="csi.volume.group.88b1d055-4c66-4218-a81c-487ba0b58ccf"): map[0001-0009-rook-ceph-0000000000000001-896c6912-604b-43db-916c-1e386a71973f:])
I0806 10:45:48.863972       1 rbd_journal.go:816] re-generated Group ID (0001-0009-rook-ceph-0000000000000001-88b1d055-4c66-4218-a81c-487ba0b58ccf) and Group Name (csi-vol-group-88b1d055-4c66-4218-a81c-487ba0b58ccf) for request name (vgrcontent-88b1d055-4c66-4218-a81c-487ba0b58ccf)
bash-5.1$ rados listomapvals csi.groups.default -p replicapool
csi.volume.group.vgrcontent-88b1d055-4c66-4218-a81c-487ba0b58ccf
value (36 bytes) :
00000000  38 38 62 31 64 30 35 35  2d 34 63 36 36 2d 34 32  |88b1d055-4c66-42|
00000010  31 38 2d 61 38 31 63 2d  34 38 37 62 61 30 62 35  |18-a81c-487ba0b5|
00000020  38 63 63 66                                       |8ccf|
00000024

bash-5.1$ rados listomapvals csi.volume.group.88b1d055-4c66-4218-a81c-487ba0b58ccf -p replicapool
0001-0009-rook-ceph-0000000000000001-5bfefce1-6427-45a3-b088-6d43d19b1e9c
value (0 bytes) :

0001-0009-rook-ceph-0000000000000001-896c6912-604b-43db-916c-1e386a71973f
value (0 bytes) :

csi.groupname
value (50 bytes) :
00000000  63 73 69 2d 76 6f 6c 2d  67 72 6f 75 70 2d 38 38  |csi-vol-group-88|
00000010  62 31 64 30 35 35 2d 34  63 36 36 2d 34 32 31 38  |b1d055-4c66-4218|
00000020  2d 61 38 31 63 2d 34 38  37 62 61 30 62 35 38 63  |-a81c-487ba0b58c|
00000030  63 66                                             |cf|
00000032

csi.volname
value (47 bytes) :
00000000  76 67 72 63 6f 6e 74 65  6e 74 2d 38 38 62 31 64  |vgrcontent-88b1d|
00000010  30 35 35 2d 34 63 36 36  2d 34 32 31 38 2d 61 38  |055-4c66-4218-a8|
00000020  31 63 2d 34 38 37 62 61  30 62 35 38 63 63 66     |1c-487ba0b58ccf|

@iPraveenParihar iPraveenParihar force-pushed the vg/replication-omap-regenerate branch 4 times, most recently from 0290729 to 628b00b Compare August 6, 2024 13:35
Copy link
Contributor

mergify bot commented Aug 6, 2024

This pull request now has conflicts with the target branch. Could you please resolve conflicts and force push the corrected changes? 🙏

internal/rbd/manager.go Outdated Show resolved Hide resolved
Copy link
Contributor

mergify bot commented Aug 9, 2024

The PR description contains the unsupported [skip ci] command, please update the description and mark the PR ready for review again.

@mergify mergify bot marked this pull request as draft August 9, 2024 10:07
@iPraveenParihar iPraveenParihar marked this pull request as ready for review August 12, 2024 04:15
@iPraveenParihar iPraveenParihar force-pushed the vg/replication-omap-regenerate branch 5 times, most recently from 17e3418 to 62e2f4a Compare August 12, 2024 13:42
internal/rbd/rbd_journal.go Outdated Show resolved Hide resolved
internal/rbd/rbd_journal.go Outdated Show resolved Hide resolved
@iPraveenParihar iPraveenParihar force-pushed the vg/replication-omap-regenerate branch 2 times, most recently from 87e64d7 to 01ca0f3 Compare August 14, 2024 11:40
Copy link
Contributor

mergify bot commented Aug 27, 2024

This pull request now has conflicts with the target branch. Could you please resolve conflicts and force push the corrected changes? 🙏

Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Sep 26, 2024
@iPraveenParihar iPraveenParihar added keepalive This label can be used to disable stale bot activiity in the repo and removed stale labels Sep 27, 2024
@iPraveenParihar iPraveenParihar force-pushed the vg/replication-omap-regenerate branch 3 times, most recently from 98a23cd to 8a72e9c Compare November 6, 2024 12:40
@iPraveenParihar
Copy link
Contributor Author

@nixpanic, I have addressed your requested changes and rebased.

@iPraveenParihar iPraveenParihar force-pushed the vg/replication-omap-regenerate branch from 8a72e9c to 0927a48 Compare November 6, 2024 12:46
nixpanic
nixpanic previously approved these changes Nov 22, 2024
Copy link
Member

@nixpanic nixpanic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Looks like there is a conflict in go.mod, that needs addressing before this can be merged.

@nixpanic nixpanic requested a review from Madhu-1 November 22, 2024 13:48
Copy link
Contributor

mergify bot commented Nov 22, 2024

This pull request now has conflicts with the target branch. Could you please resolve conflicts and force push the corrected changes? 🙏

@iPraveenParihar iPraveenParihar force-pushed the vg/replication-omap-regenerate branch from 0927a48 to aa7cf8a Compare November 25, 2024 07:37
@mergify mergify bot dismissed nixpanic’s stale review November 25, 2024 07:37

Pull request has been modified.

@iPraveenParihar
Copy link
Contributor Author

Thanks!

Looks like there is a conflict in go.mod, that needs addressing before this can be merged.

Thanks, rebased now

Copy link
Collaborator

@Madhu-1 Madhu-1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iPraveenParihar i hope you already tested this change and we are able to regenerate and able to find the group with newly generated omap.

)

require github.com/go-task/slim-sprig/v3 v3.0.0 // indirect
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we already have require section for indirect dep, this should be added there, please check

Comment on lines 553 to 556
journalPool = mgr.parameters["journalPool"]
if journalPool == "" {
journalPool = pool
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have journalPool in volumegroup?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CreateVolumeGroup has it -

// journalPool is an optional parameter, use pool if it is not set
journalPool, ok := mgr.parameters["journalPool"]
if !ok || journalPool == "" {
journalPool = pool
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, i dont know in which case we will have journalPool, i think we removed it as its was not required. cc @nixpanic

}
defer vgJournal.Destroy()

namePrefix = mgr.parameters["volumeNamePrefix"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have volumeNamePrefix in group?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, it should be groupNamePrefix. CreateVolumeGroupSnapshot uses groupNamePrefix.
But CreateVolumeGroup uses volumeNamePrefix

// volumeNamePrefix is an optional parameter, can be an empty string
prefix := mgr.parameters["volumeNamePrefix"]

Will change it to groupNamePrefix ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes lets stick to one groupNamePrefix for both as they need to work together

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed CephFS uses volumeGroupNamePrefix naming. Let's keep it consistent across component.
Raised a separate PR to address this - #5063

@iPraveenParihar iPraveenParihar force-pushed the vg/replication-omap-regenerate branch from aa7cf8a to 149ce56 Compare December 4, 2024 07:34
return "", fmt.Errorf("failed to get poolID for %q: %w", groupUUID, err)
}

groupHandle, err := util.GenerateVolID(ctx, monitors, mgr.creds, poolID, pool, clusterID, groupUUID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it return an updated handle when we modify the VolumeGroup? If yes, then we need to check for the VG response always while calling the ModifyVolumeGroup RPC and update the VGRContent CR as well, if the handle has changed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the params passed to the function, I am assuming if the handle is already created it should not get updated as the params would remain the same 🤔

This commit adds groupUUID param for `ReserveName` to be used for
OMAP name reserve instead of auto-generating.
This is useful for mirroring and metro-DR ensuring that mirrored
resources have consistent OMAP names across mirrored clusters.

Signed-off-by: Praveen M <[email protected]>
@iPraveenParihar iPraveenParihar force-pushed the vg/replication-omap-regenerate branch 4 times, most recently from 6014d6e to 967098d Compare January 8, 2025 10:35
This commit adds `RegenerateVolumeGroupJournal` to Manager
interface. RegenerateVolumeGroupJournal regenerate the omap
data for the volume group.

This performs the following operations:
  - extracts clusterID and Mons from the cluster mapping
  - Retrieves pool and journalPool parameters from the VolumeGroupReplicationClass
  - Reserves omap data
  - Add volumeIDs mapping to the reserved volume group omap object
  - Generate new volume group handle

Returns the generated volume group handler.

Signed-off-by: Praveen M <[email protected]>
This commit adds new controller that watches for the
VolumeGroupReplicationContent and regenerates the OMAP data if
it doesn't exists.

Signed-off-by: Praveen M <[email protected]>
VolumeGroupReplicationContent controller needs `get`, `list` and `watch`
access control for resource `VolumeGroupReplicationContents`. And `get`
access control for resource `VolumeGroupReplicationClasses`.

Signed-off-by: Praveen M <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/rbd Issues related to RBD keepalive This label can be used to disable stale bot activiity in the repo
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants