Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Volume information caching in $scfg #15

Open
amulet1 opened this issue Jan 18, 2025 · 4 comments
Open

Volume information caching in $scfg #15

amulet1 opened this issue Jan 18, 2025 · 4 comments

Comments

@amulet1
Copy link
Collaborator

amulet1 commented Jan 18, 2025

The current PVE Storage library does not maintain internal storage configuration state. Any high-level call, whether through the GUI or command line, creates a new storage configuration instance, resetting all parameters.

For example, running any storage-related qm command reinitializes all storage plugins, which also occurs with GUI actions. This can be observed in the debug output - cached authentication tokens for arrays are lost between actions (e.g. try resizing the same disk twice).

As a result, caching volume information is mostly ineffective (except in cases like list_images() when Storage.pm itself supplies the short-lived $cache variable). It introduces unnecessary complexity and could lead to out-of-sync caches in different instances of the same storage plugin, which is risky.

It's unclear whether this behavior is intentional or an oversight by the developers. We might want to open a ticket.

Ultimately, we should avoid caching volume information in the $scfg variable. While storing cached data in a static file (and re-reading it on the timestamp change) might be an option, it is simply not worth it.

After merging my forthcoming changes, we should remove the $scfg->{cache} logic entirely.

@amulet1
Copy link
Collaborator Author

amulet1 commented Jan 19, 2025

Since I now have write access, do you want me to post verified commits directly to main branch or still do a PR?

@timansky
Copy link
Collaborator

timansky commented Jan 19, 2025

As a result, caching volume information is mostly ineffective (except in cases like list_images() when Storage.pm itself supplies the short-lived $cache variable). It introduces unnecessary complexity and could lead to out-of-sync caches in different instances of the same storage plugin, which is risky.

If I understand correctly, caching works within the scope of a single API call. Yes, this can be risky, which is why the cache should be cleared when creating or modifying data. However, without caching, the UI becomes very laggy. With a large number of volumes, using the UI becomes challenging.

For example, list_volumes is triggered when opening a window related to adding or modifying a disk.

Caching is a necessity; I agree that in its current form, it is not very efficient and definetly requires improvement.

Since I now have write access, do you want me to post verified commits directly to main branch or still do a PR?

For main branch PR please.

@amulet1
Copy link
Collaborator Author

amulet1 commented Jan 19, 2025

list_volumes() uses $cache offered by the Storage.pm itself, and this is how it should be. But based on what I see in Storage.pm, this is the only place where it is useful for the plugin.

Caching of volume info in purestorage_volume_info() is essentially useless as it is one time use only. I did not manage to trigger it to return cached data, and I see no places in Storage.pm where it would use it.

We could potentially cache volumes info in $cache passed to activate_volume() and deactivate_volume(), but it is not worth it too. These calls are done volume-by-volume, the plugin has no way of knowing if multiple volumes will be activated/deactivated at once, and retrieving/caching all volumes when PVE probably tries to activate just one VM's disks is more expensive thing to do, IMHO.

I looked into Storage.pm and unless they improve the logic (for a start, have a single instance of storage/plugin configuration and only reset storage configuration whenever /etc/pve/storage.cfg changes), there is not much we can do (whatever we cache will not have a chance to be used).

@timansky
Copy link
Collaborator

timansky commented Jan 21, 2025

My thoughts:

I think cache must be file based. File can be places under /etc/pve to share between nodes. File based cache is simple to develop and do not need another dependencies.
At least we can cache:

  • Pure token
  • Volume list for list_images
  • Volume info for list_images

On volume change/add/delete call - cache reset.

At this level race condition is not important, but if cache will be added to another functions then lock need to be implemented

PS

I am not a perl developer, so there is a big question what is the best way to implement cache

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants