Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[main] Prevent the cluster status from becoming active prematurely #863

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

krunalhinguu
Copy link
Contributor

@krunalhinguu krunalhinguu commented Feb 19, 2025

What this PR does / why we need it:

While updating multiple fields, changes were being applied one by one. During this process, the system periodically polls the cluster for status updates. However, the cluster status was prematurely transitioning to active instead of updating. This led to:

  • Some updates being reverted.
  • Certain changes getting skipped intermittently.

To address this, we now ensure that while making updates, the cluster status is enqueued as updating. This keeps the cluster continuously checking for upstream changes and prevents unintended timeouts.

Which issue(s) this PR fixes
Issue #667

Checklist:

  • squashed commits into logical changes
  • includes documentation
  • adds unit tests
  • adds or updates e2e tests
  • backport needed

@krunalhinguu krunalhinguu added the kind/bug Something isn't working label Feb 19, 2025
@krunalhinguu krunalhinguu requested review from a team as code owners February 19, 2025 22:25
@yiannistri yiannistri self-requested a review February 26, 2025 11:01
vardhaman22
vardhaman22 previously approved these changes Mar 3, 2025
Copy link
Contributor

@vardhaman22 vardhaman22 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines +869 to +873
// If status is not updating, then enqueue the update (to re-enter the onChange handler)
if config.Status.Phase != aksConfigUpdatingPhase {
return h.enqueueUpdate(config)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this block be inside the if !resourceGroupExists block a few lines up? Apologies if my previous review comment was confusing. In my opinion, we should enqueue an update right before we begin making any changes, does that make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yiannistri We enqueue the update whenever we update the cluster. However, during the first update, the cluster does not always go into the Updating state automatically. That’s why we originally added this check—to ensure the update is re-queued when necessary.
If we move this check inside the if block, we might miss forcing the status to Updating in cases where the condition is not met. Let me know if you have any concerns!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working
Projects
Status: PR to be reviewed
Development

Successfully merging this pull request may close these issues.

3 participants