Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(operator): Add Status.AvailableReplicas to Model CRD #5873

Merged
merged 4 commits into from
Jan 29, 2025

Conversation

lc525
Copy link
Member

@lc525 lc525 commented Sep 4, 2024

The Model.Status.Replicas field always reflects the total number of Model replicas in existence, irrespective of their status. This is because it is used by the scale subresource (enabling the configuration of HPA on Model CRs).

However, we need a way of showing how many of those Replicas are available in order to distinguish between various partial availability scenarios, when the ModelReady condition is set to false.

This PR introduces a new Status.AvailableReplicas field, showing exactly that.

The context of this change is the following:

  • Say we start with a cluster where Model A has 2 replicas, and the Server on which it is scheduled also has 2 replicas
  • For whatever reason (manual change to the Server CR, failure) the number of Server replicas goes down to 1
  • As a result, Model A's ModelReady status condition transitions to false, with a message of ScheduleFailed: Failed to schedule model as no matching server had enough suitable replicas
  • However, in reality there is still 1 replica of Model A, which remains available and can serve requests. Without additional data in the Model CR Status, there's no way of knowing that.

Out-of-scope

An additional, related issue is the interaction of this type of Model status transition and Pipelines/Experiments. For example, a pipeline that has Model A as a step, will listen to Model A's Status transitions and update the PipelineReady and ModelsReady conditions to false whenever one of the models is no longer "available" (as decided by the condition here). Deciding on a way of showing partial availability of models which are part of pipelines will be tracked as a separate issue.

In general, we should consider the ergonomics of clients interacting with the k8s "API" as defined by our CR status fields.

TODOs

  • smoke test in Kind cluster

Notes to reviewers:

This PR contains files that were automatically generated as part of the process of building the operator and the CRDs/helm-charts. I've left a comment where that's the case, describing how the file was generated.

@lc525 lc525 added the v2 label Sep 4, 2024
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file generated by make create in [seldon-core-2-root]/k8s dir

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file generated by make create in [seldon-core-2-root]/k8s dir

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file generated by make create in [seldon-core-2-root]/k8s dir

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file generated by make create in [seldon-core-2-root]/k8s dir

Copy link
Member Author

@lc525 lc525 Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file generated by kubebuilder via make manifests in [seldon-core-2-root]/operator dir

@lc525 lc525 requested review from abhimanyu003 and driev September 5, 2024 11:21
@lc525 lc525 marked this pull request as ready for review September 5, 2024 11:22
@lc525 lc525 requested a review from sakoush as a code owner September 5, 2024 11:22
@lc525 lc525 removed the request for review from sakoush September 5, 2024 11:25
@lc525
Copy link
Member Author

lc525 commented Sep 17, 2024

There is a bug where the expected number of model replicas does not update after change in the number of server replicas. Will fix that part first before merging

lc525 added 2 commits January 29, 2025 11:09
The Model.Status.Replicas field always reflects the total number of Model
replicas in existence, irrespective of their status. This is because it
is used by the scale subresource (enabling the configuration of HPA on
Model CRs).

However, we need a way of showing how many of those Replicas are available in
order to distinguish between various partial availability scenarios, when the
ModelReady condition is set to false.

This introduces a new Status.AvailableReplicas field, showing exactly that
@sakoush sakoush force-pushed the quickfix/model-status-availablereplicas branch from 98ae6ea to 873c542 Compare January 29, 2025 11:11
@sakoush
Copy link
Member

sakoush commented Jan 29, 2025

There is a bug where the expected number of model replicas does not update after change in the number of server replicas. Will fix that part first before merging

I think this is now fixed, will revisit in a follow up PR if it resurfaced.

@sakoush sakoush merged commit bdf3fc0 into SeldonIO:v2 Jan 29, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants