[main] Prevent the cluster status from becoming active prematurely #863

krunalhinguu · 2025-02-19T22:25:13Z

What this PR does / why we need it:

While updating multiple fields, the changes are applied one at a time, and during this process the system periodically polls the cluster for status updates. The cluster status was transitioning to active prematurely instead of showing updating. This led to:

Some updates being reverted.
Certain changes getting skipped intermittently.

To address this, we now make sure the cluster is enqueued with an updating status while updates are being made. This keeps the cluster continuously checking for upstream changes and prevents unintended timeouts.
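The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the operator's actual code: `clusterConfig`, `field`, `reconcile`, and the phase constants are hypothetical stand-ins, and the upstream cluster is modeled as a plain map. The point it shows is that the phase is forced to updating before the first change is applied and only returns to active once no change is pending.

```go
package main

import "fmt"

// Illustrative phase names; the real operator defines its own constants.
const (
	phaseActive   = "active"
	phaseUpdating = "updating"
)

// field is one desired setting (hypothetical type for this sketch).
type field struct {
	Name  string
	Value string
}

// clusterConfig is a hypothetical stand-in for the operator's config object.
type clusterConfig struct {
	Phase   string
	Desired []field
}

// reconcile handles at most one pending change per call and reports whether
// the config should be enqueued again. If a change is pending and the phase
// is not yet updating, it only sets the phase and asks for a requeue, so the
// next pass starts from the updating state.
func reconcile(cfg *clusterConfig, upstream map[string]string) bool {
	for _, f := range cfg.Desired {
		if upstream[f.Name] == f.Value {
			continue // already in sync
		}
		if cfg.Phase != phaseUpdating {
			cfg.Phase = phaseUpdating
			return true
		}
		upstream[f.Name] = f.Value // apply a single change per pass
		return true
	}
	// Nothing left to apply: only now may the phase become active.
	cfg.Phase = phaseActive
	return false
}

func main() {
	cfg := &clusterConfig{
		Phase: phaseActive,
		Desired: []field{
			{Name: "kubernetesVersion", Value: "1.31"},
			{Name: "nodeCount", Value: "5"},
		},
	}
	upstream := map[string]string{"kubernetesVersion": "1.30", "nodeCount": "3"}

	for pass := 1; ; pass++ {
		requeue := reconcile(cfg, upstream)
		fmt.Printf("pass %d: phase=%s requeue=%t\n", pass, cfg.Phase, requeue)
		if !requeue {
			break
		}
	}
}
```

In this sketch the phase never reads active while a field is still out of sync, which is the property the PR description is after.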

Which issue(s) this PR fixes
Issue #667

Checklist:

vardhaman22

LGTM

controller/aks-cluster-config-handler.go

yiannistri · 2025-03-05T08:34:54Z

+		// If status is not updating, then enqueue the update (to re-enter the onChange handler)
+		if config.Status.Phase != aksConfigUpdatingPhase {
+			return h.enqueueUpdate(config)
+		}
+


Should this block be inside the if !resourceGroupExists block a few lines up? Apologies if my previous review comment was confusing. In my opinion, we should enqueue an update right before we begin making any changes. Does that make sense?

krunalhinguu · Mar 6, 2025

@yiannistri We enqueue the update whenever we update the cluster. However, during the first update, the cluster does not always go into the Updating state automatically. That's why we originally added this check: to ensure the update is re-queued when necessary.
If we move this check inside the if block, we might miss forcing the status to Updating in cases where the condition is not met. Let me know if you have any concerns!
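The placement question in this thread can be illustrated with a small sketch. It is not the handler's real code: `nextPhaseGuardInside`, `nextPhaseGuardOutside`, and the phase constants are hypothetical, and each function only returns the phase the config would end up in. It shows that nesting the check under the !resourceGroupExists condition leaves the phase unchanged whenever the resource group already exists.

```go
package main

import "fmt"

// Illustrative phase names; the real operator defines its own constants.
const (
	phaseActive   = "active"
	phaseUpdating = "updating"
)

// nextPhaseGuardInside models the placement the reviewer asks about: the
// phase check only runs when the resource group does not exist.
func nextPhaseGuardInside(phase string, resourceGroupExists bool) string {
	if !resourceGroupExists {
		if phase != phaseUpdating {
			return phaseUpdating
		}
	}
	return phase
}

// nextPhaseGuardOutside models the placement in the diff: the phase check
// runs regardless of whether the resource group exists.
func nextPhaseGuardOutside(phase string, resourceGroupExists bool) string {
	if phase != phaseUpdating {
		return phaseUpdating
	}
	return phase
}

func main() {
	for _, exists := range []bool{false, true} {
		fmt.Printf("resourceGroupExists=%t inside=%s outside=%s\n",
			exists,
			nextPhaseGuardInside(phaseActive, exists),
			nextPhaseGuardOutside(phaseActive, exists))
	}
}
```

With the guard nested, the case where the resource group exists keeps the phase at active, which is the case the reply is concerned about.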

Refactor update re-enqueue logic for better readability and maintainability

0bc730c

krunalhinguu added the kind/bug Something isn't working label Feb 19, 2025

krunalhinguu requested review from a team as code owners February 19, 2025 22:25

yiannistri self-requested a review February 26, 2025 11:01

vardhaman22 previously approved these changes Mar 3, 2025

This was referenced Mar 4, 2025

[backport][v2.11] Prevent the cluster status from becoming active prematurely #874

Open

[backport][v2.10] Prevent the cluster status from becoming active prematurely #875

Open

yiannistri reviewed Mar 4, 2025

controller/aks-cluster-config-handler.go (outdated)

move enqueue update logic in updateUpstreamClusterState

1b330c9

krunalhinguu dismissed vardhaman22’s stale review via 1b330c9 March 5, 2025 04:54

yiannistri reviewed Mar 5, 2025
