K8SPXC-1461: improve pvc resize logs #1947

Open
wants to merge 1 commit into
base: main

Conversation

pooknull
Contributor

@pooknull pooknull commented Jan 20, 2025

https://perconadev.atlassian.net/browse/K8SPXC-1461

DESCRIPTION

Problem:
In situations where storage resources are limited and we attempt to resize PVCs, some PVCs may be successfully resized while others fail. In this case, we log only the PVCs that failed to resize, without logging the PVCs that were resized successfully.

Solution:
Create a Warning Event for each failed PVC update when the error is either "exceeded quota" or "storageclass that provisions the pvc must support resize". The operator relies on these events to get information about each PVC resize failure. It should wait until every PVC resize operation has either completed or failed, so that all PVC resize results are logged. Only then should it revert the PVC size in cr.yaml.

CHECKLIST

Jira

  • Is the Jira ticket created and referenced properly?
  • Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?
  • Does the Jira ticket link to the proper milestone (Fix Version field)?

Tests

  • Is an E2E test/test case added for the new feature/change?
  • Are unit tests added where appropriate?
  • Are OpenShift compare files changed for E2E tests (compare/*-oc.yml)?

Config/Logging/Testability

  • Are all needed new/changed options added to default YAML files?
  • Are all needed new/changed options added to the Helm Chart?
  • Did we add proper logging messages for operator actions?
  • Did we ensure compatibility with the previous version or cluster upgrade process?
  • Does the change support oldest and newest supported PXC version?
  • Does the change support oldest and newest supported Kubernetes version?

@pull-request-size pull-request-size bot added the size/M 30-99 lines label Jan 20, 2025
@JNKPercona
Collaborator

Test name Status
affinity-8-0 passed
auto-tuning-8-0 passed
cross-site-8-0 passed
custom-users-8-0 passed
demand-backup-cloud-8-0 passed
demand-backup-encrypted-with-tls-8-0 passed
demand-backup-8-0 passed
haproxy-5-7 passed
haproxy-8-0 passed
init-deploy-5-7 passed
init-deploy-8-0 passed
limits-8-0 passed
monitoring-2-0-8-0 passed
one-pod-5-7 passed
one-pod-8-0 passed
pitr-8-0 passed
pitr-gap-errors-8-0 passed
proxy-protocol-8-0 passed
proxysql-sidecar-res-limits-8-0 passed
pvc-resize-5-7 passed
pvc-resize-8-0 passed
recreate-8-0 passed
restore-to-encrypted-cluster-8-0 passed
scaling-proxysql-8-0 passed
scaling-8-0 passed
scheduled-backup-5-7 passed
scheduled-backup-8-0 passed
security-context-8-0 passed
smart-update1-8-0 passed
smart-update2-8-0 passed
storage-8-0 passed
tls-issue-cert-manager-ref-8-0 passed
tls-issue-cert-manager-8-0 passed
tls-issue-self-8-0 passed
upgrade-consistency-8-0 passed
upgrade-haproxy-5-7 passed
upgrade-haproxy-8-0 passed
upgrade-proxysql-5-7 passed
upgrade-proxysql-8-0 passed
users-5-7 passed
users-8-0 passed
validation-hook-8-0 passed
We run 42 out of 42

commit: 55b50bb
image: perconalab/percona-xtradb-cluster-operator:PR-1947-55b50bbf

@egegunes egegunes added this to the v1.17.0 milestone Jan 21, 2025
@pooknull pooknull marked this pull request as ready for review January 21, 2025 11:43
Comment on lines +199 to +202
if len(resizeErrors) > 0 {
	if pendingResize {
		return nil
	}
Contributor

@gkech gkech Jan 24, 2025

Is there a specific reason we check for a pending resize state only when resize errors exist? If the two are not necessarily connected, we can return nil the moment we see an event.Reason related to pending resizing, as noted in L.187. That would simplify the logic.

But maybe I'm missing something.
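
For reference, the simplification suggested in this comment could look roughly like the sketch below. It is illustrative only and does not reproduce the PR's code: `pvcEvent`, `checkResizeEvents`, and both reason constants are hypothetical names, and it assumes the pending check really is independent of the collected errors, which is exactly the open question.

```go
package main

import (
	"errors"
	"fmt"
)

// pvcEvent is a hypothetical, trimmed-down stand-in for a Kubernetes Event.
type pvcEvent struct {
	PVC     string
	Reason  string
	Message string
}

// Assumed reason strings, chosen for illustration only.
const (
	reasonPending = "Resizing"
	reasonFailed  = "ResizeFailed"
)

// checkResizeEvents returns nil as soon as any event shows a resize that is
// still in progress. Otherwise it joins every recorded failure into a single
// error, or returns nil if there were none. errors.Join requires Go 1.20+.
func checkResizeEvents(events []pvcEvent) error {
	var resizeErrors []error
	for _, ev := range events {
		switch ev.Reason {
		case reasonPending:
			// Early return: no separate pendingResize flag is needed.
			return nil
		case reasonFailed:
			resizeErrors = append(resizeErrors, fmt.Errorf("%s: %s", ev.PVC, ev.Message))
		}
	}
	return errors.Join(resizeErrors...)
}

func main() {
	events := []pvcEvent{
		{PVC: "datadir-cluster1-pxc-0", Reason: reasonFailed, Message: "exceeded quota"},
		{PVC: "datadir-cluster1-pxc-1", Reason: reasonPending},
	}
	fmt.Println("still pending, error is nil:", checkResizeEvents(events) == nil)
	fmt.Println("all settled:", checkResizeEvents(events[:1]))
}
```

With the early return, the nested `if pendingResize` inside the `len(resizeErrors) > 0` branch disappears. The trade-off is that a pending event also short-circuits when no errors were recorded, which only matters if that case needs different handling.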
