Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storage: Avoid filling volume config on VolumeDBCreate #14923

Draft
wants to merge 23 commits into
base: main
Choose a base branch
from

Conversation

hamistao
Copy link
Contributor

@hamistao hamistao commented Feb 5, 2025

This is a pre-requisite for #14595.

The whole idea here is that we want to adapt the provided volume config on VolumeDBCreate to populate volatile.rootfs.size without changing the original config passed in, as this would be a very easy and consistent way to keep track of volume sizes.

But we are currently relying on VolumeDBCreate to populate some fields within the volume for future use, which is giving many unrelated purposes to this function. So this instead proposed filling this config on GetNewVolume instead, as it makes sense for a new volume to have default conifg for the driver filled in. That, along with a few more tweaks, allows us to use VolumeDBCreate exclusively to create the database entry for the volume, adapting the volume config as needed without compromising the original config passed in as an argument.

@hamistao hamistao force-pushed the move_volume_config_filling branch 2 times, most recently from 8a73064 to 1497ea0 Compare February 5, 2025 04:50
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm I'd be hesitant to start forcefully killing ongoing storage commands that are mutating the local or remote storage layers during a clean shutdown. This feels like its asking for trouble to me.

Was there a reason you didn't go with context.Background() here to ensure the commands safely completed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just failed to consider the risks of stopping these commands midway. Considering that doing so would simply leave the storage layers unharmed is far too optimistic.

@@ -289,7 +289,7 @@ func (d *common) moveGPTAltHeader(devPath string) error {
return nil
}

_, err = shared.RunCommand(path, "--move-second-header", devPath)
_, err = shared.RunCommandContext(d.state.ShutdownCtx, path, "--move-second-header", devPath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont think we should use d.state.ShutdownCtx for anything mutating the disk, as Go will then forcefully kill the command, leaving it in an inconsistent state.

We've been bitten before by this in the past where we killed an ongoing ceph map command, which then left unrecoverable structures in the kernel and the system had to be rebooted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've been bitten before by this in the past where we killed an ongoing ceph map command, which then left unrecoverable structures in the kernel and the system had to be rebooted.

Always useful to know some history, thanks for the insight!
I will make the necessary updates.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general if a command isnt mutating something we can use d.state.shutdownCtx to exit more quickly, but if we're mutating something it'd be better to let it finish in my view (ofc if thats going to take a long time and some cleanup is going to happen on error, such as writing a large file out and then removing it on error in a defer then that would be different).

Copy link
Member

@tomponline tomponline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets avoid introducing potential new issues by forcefully canceling commands that are mutating on disk structures during a clean shutdown. Use context.Background() for these until such time as you're prepared to test each one for its resulting behaviour.

@hamistao hamistao changed the title Storage: Avoid filling volume config on VolumeDBCreate` Storage: Avoid filling volume config on VolumeDBCreate Feb 18, 2025
@hamistao hamistao force-pushed the move_volume_config_filling branch 4 times, most recently from 0e06451 to e8cb5a9 Compare February 25, 2025 03:46
This erros is always nil, so there is no reason for is to keep this return type.

Signed-off-by: hamistao <[email protected]>
Relying on VolumeDBCreate to perform this adds unnecessary contraints.
Such as not enabling to perform operations that require having a filled in volume
config before create the DB volume unless one adds a redundant call to FillVolumeConfig.
Or not being able to set keys that aren't supposed to go in the database before calling
VolumeDBCreate, such as `size` or `state.size`.

Signed-off-by: hamistao <[email protected]>
The intention is to cover for almost all cases where we were filling volume configuration on `VolumeDBCreate`. Now this makes more sense as it is expected for a newly created volume to have the default configuration for its pool.

Signed-off-by: hamistao <[email protected]>
@hamistao hamistao force-pushed the move_volume_config_filling branch from e8cb5a9 to a3b794e Compare February 25, 2025 15:53
This simplification also fits nicely into the concept of expecting a
volume with a previously filled config.

Signed-off-by: hamistao <[email protected]>
These volume are not created with `GetNewVolume`, so we need to call
`FillVolumeConfig` on them.

Signed-off-by: hamistao <[email protected]>
@hamistao hamistao force-pushed the move_volume_config_filling branch from a3b794e to 66cf1ec Compare February 27, 2025 02:14
Signed-off-by: hamistao <[email protected]>
@hamistao hamistao force-pushed the move_volume_config_filling branch from 66cf1ec to d8df58a Compare February 27, 2025 02:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants