
add preference binpack replicas plugin #295

Merged
mrlihanbo merged 1 commit into kubewharf:main from feat/PreferenceBinpack-2 on Dec 12, 2023

Conversation

@mrlihanbo (Collaborator)

No description provided.

mrlihanbo force-pushed the feat/PreferenceBinpack-2 branch from b4172cf to 2aa1c64 on December 8, 2023 08:46

codecov bot commented Dec 8, 2023

Codecov Report

Attention: 126 lines in your changes are missing coverage. Please review.

Comparison is base (e57d64a) 30.42% compared to head (863b6ef) 31.15%.

Files Patch % Lines
...k/plugins/preferencebinpack/preference_bin_pack.go 0.00% 89 Missing ⚠️
...ler/framework/plugins/preferencebinpack/planner.go 92.57% 9 Missing and 6 partials ⚠️
pkg/controllers/automigration/controller.go 0.00% 13 Missing ⚠️
pkg/controllers/scheduler/profile.go 25.00% 2 Missing and 1 partial ⚠️
pkg/controllers/scheduler/scheduler.go 40.00% 2 Missing and 1 partial ⚠️
pkg/controllers/scheduler/schedulingunit.go 66.66% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #295      +/-   ##
==========================================
+ Coverage   30.42%   31.15%   +0.73%     
==========================================
  Files         120      122       +2     
  Lines       13944    14266     +322     
==========================================
+ Hits         4242     4445     +203     
- Misses       9296     9406     +110     
- Partials      406      415       +9     
Flag Coverage Δ
unittests 31.15% <62.83%> (+0.73%) ⬆️

Flags with carried forward coverage won't be shown.

mrlihanbo force-pushed the feat/PreferenceBinpack-2 branch 4 times, most recently from 4f4b26c to 61f2525 on December 10, 2023 09:53
@@ -21,7 +21,7 @@ import (
"github.com/kubewharf/kubeadmiral/pkg/controllers/scheduler/framework/plugins/names"
)

-func GetDefaultEnabledPlugins() *fedcore.EnabledPlugins {
+func GetDefaultEnabledPlugins(replicasPlugin string) *fedcore.EnabledPlugins {
@gary-lgy (Member) commented on Dec 11, 2023:

Any chance to mutate the returned EnabledPlugins in the scheduler instead? The default list is the baseline, and should not vary depending on configuration.
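For illustration, a rough Go sketch of this suggestion: keep GetDefaultEnabledPlugins parameter-free and let the scheduler swap the replicas plugin per policy. The ReplicasPlugins field, the "Binpack" strategy value, and the names.PreferenceBinPack constant are assumptions made for the sketch, not the actual API in this PR.

func enabledPluginsForPolicy(replicasStrategy string) *fedcore.EnabledPlugins {
    // The default list stays a fixed baseline, independent of configuration.
    enabled := GetDefaultEnabledPlugins()

    if replicasStrategy == "Binpack" {
        // Hypothetical field: swap the spread replicas plugin for the
        // binpack variant only for policies that ask for it.
        enabled.ReplicasPlugins = []string{names.PreferenceBinPack}
    }
    return enabled
}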

for _, pod := range podList {
if pod.GetDeletionTimestamp() != nil {
continue
}

scheduledCondition, isUnschedulable := getPodScheduledCondition(pod)
if !isUnschedulable {
if scheduledCondition != nil {
A Member commented:

Should we be more conservative and additionally check that scheduledCondition.Status == True?

@mrlihanbo (Collaborator, PR author) replied:

lgtm, I will add the additional check
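For reference, a self-contained sketch of the stricter check agreed on above, using the standard k8s.io/api/core/v1 types; the helper name and the package line are assumptions and not part of this PR.

package preferencebinpack

import corev1 "k8s.io/api/core/v1"

// isPodScheduled reports whether the PodScheduled condition is present and its
// status is actually True, rather than merely present.
func isPodScheduled(pod *corev1.Pod) bool {
    for i := range pod.Status.Conditions {
        cond := &pod.Status.Conditions[i]
        if cond.Type == corev1.PodScheduled {
            return cond.Status == corev1.ConditionTrue
        }
    }
    return false
}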

return scheduledReplicas, nil, nil
}

// If we don't desiredPlan to avoid migration, just return the plan computed from preferences
A Member commented:

Suggested change
// If we don't desiredPlan to avoid migration, just return the plan computed from preferences
// If we don't need to avoid migration, just return the plan computed from preferences

if !keepUnschedulableReplicas {
newOverflow := make(map[string]int64)
for key, value := range desiredOverflow {
value = framework.MinInt64(value, totalReplicas-currentTotalScheduledReplicas)
A Member commented:

Let's add a comment -

// Cap the overflow at currently unscheduled replicas.
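For illustration, the capping loop could read as follows with the suggested comment in place; only framework.MinInt64 and the variable names come from the diff, while the value > 0 filter and the final assignment back to desiredOverflow are assumptions about the surrounding code.

if !keepUnschedulableReplicas {
    newOverflow := make(map[string]int64)
    for key, value := range desiredOverflow {
        // Cap the overflow at currently unscheduled replicas.
        value = framework.MinInt64(value, totalReplicas-currentTotalScheduledReplicas)
        if value > 0 {
            newOverflow[key] = value
        }
    }
    desiredOverflow = newOverflow
}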

return currentReplicaCount, nil
}

func preCheck(
@gary-lgy (Member) commented on Dec 11, 2023:

Suggested change
func preCheck(
func shouldPlan(

limitedCapacity map[string]int64,
avoidDisruption bool,
keepUnschedulableReplicas bool,
scheduledReplicas map[string]int64,
A Member commented:

Let's add a comment for this param - "Replicas that have been scheduled in the member clusters" - to avoid the confusion that "scheduled" here might refer to federated scheduling.
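A sketch combining the shouldPlan rename suggested earlier with the requested parameter comment; the bool return type and the body are placeholders, not the code in this PR.

func shouldPlan(
    limitedCapacity map[string]int64,
    avoidDisruption bool,
    keepUnschedulableReplicas bool,
    // scheduledReplicas holds replicas that have already been scheduled in the
    // member clusters, as opposed to replicas placed by federated scheduling.
    scheduledReplicas map[string]int64,
) bool {
    // planning pre-checks elided in this sketch
    return true
}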

}()

for cluster, replicas := range scheduledReplicas {
if capacity, exist := estimatedCapacity[cluster]; exist && capacity != replicas {
@gary-lgy (Member) commented on Dec 11, 2023:

Let's add a comment here to explain why we skip planning:

If `capacity != replicas`, some replicas are still being scheduled. We defer planning until the status of all replicas is known.
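For illustration, the loop with the requested explanatory comment in place; the early return and its value are assumptions about the enclosing function.

for cluster, replicas := range scheduledReplicas {
    // If capacity != replicas, some replicas in this cluster are still being
    // scheduled, so we defer planning until the status of every replica is known.
    if capacity, exist := estimatedCapacity[cluster]; exist && capacity != replicas {
        return false
    }
}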

@gary-lgy (Member) commented on Dec 11, 2023

Related to #289

The new PropagationPolicy field ReplicasStrategy decides which replicas plugin to use - the traditional spread, or the new capacity-probing binpack.

Scheduling profile was designed to configure plugins based on the scheduling requirements. Adding ReplicasStrategy defeats the purpose of scheduling profiles, and opens us up to the slippery slope of blindly adding new fields for every single plugin combination.

This is bad design imo.

@mrlihanbo mrlihanbo force-pushed the feat/PreferenceBinpack-2 branch from 61f2525 to 2ffd8d3 Compare December 11, 2023 13:41
@limhawjia (Contributor) commented:

Adding on to @gary-lgy 's point, the scheduling profile used to dictate which fields in PropagationPolicy are respected and what semantics they had. In a sense, the scheduling profile governed PropagationPolicy. There used to be a clean one-way relation between profiles and policies.

This PR introduces the inverse of this, creating a two-way relation between profiles and policies where they can affect one another in elaborate and unexpected ways. This is bad design and a slippery slope in my opinion as well.

@limhawjia (Contributor) commented:

I propose an alternative solution. @mrlihanbo @gary-lgy

Instead of introducing a new ReplicasStrategy field that modifies the plugins used in the scheduling profile, how about we provide multiple built-in profiles, one of which is spread (the default) and one of which is binpack?

spread will consist of the current default list of plugins, while binpack will contain plugins configured to provide bin-packing capabilities. Doing so will retain ease-of-use for the user. They do not have to configure their own scheduling profile, and enabling bin-packing is still a matter of setting a single field (just schedulingProfile instead of replicasStrategy).
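For illustration, a rough Go sketch of the built-in profiles idea as a simple name-to-EnabledPlugins registry; the profile names, the ReplicasPlugins field, and the names.PreferenceBinPack constant are assumptions, not existing API.

func builtInProfiles() map[string]*fedcore.EnabledPlugins {
    // spread keeps the current default plugin list untouched.
    spread := GetDefaultEnabledPlugins()

    // binpack differs only in the replicas plugin it enables.
    binpack := GetDefaultEnabledPlugins()
    binpack.ReplicasPlugins = []string{names.PreferenceBinPack}

    return map[string]*fedcore.EnabledPlugins{
        "spread":  spread,  // the default when pp.spec.schedulingProfile is unset
        "binpack": binpack, // selected via pp.spec.schedulingProfile = binpack
    }
}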

@Poor12 (Collaborator) commented on Dec 12, 2023

Agreed on all points. But there is one thing: schedulingProfile is indeed more complicated than adding replicaStrategy to the policy. I think if preferenceBinpack is a basic feature, it would be better to add replicaStrategy. Otherwise, if it's an extended feature, it would be better to add it in schedulingProfile. During the earlier discussion, it seemed we wanted it to be a basic feature?

mrlihanbo force-pushed the feat/PreferenceBinpack-2 branch from 2ffd8d3 to 863b6ef on December 12, 2023 02:52
@gary-lgy (Member) commented on Dec 12, 2023

To those without context, there was an offline discussion among @manmanhappy @gary-lgy @mrlihanbo @JackZxj and @Poor12 on this topic. @gary-lgy proposed the 2 built-in profiles solution mentioned by @limhawjia above. However, @manmanhappy rejected it and proposed the current solution as implemented in #289 and this PR (and did not leave much room for further discussion).

Replying to @Poor12's comment:

> Agreed on all points. But there is one thing: schedulingProfile is indeed more complicated than adding replicaStrategy to the policy.

From whose perspective? The profiles are built-in (either in code or created by the platform). From the users' perspective, they need to configure a single field in the policy spec, be it pp.spec.schedulingProfile = binpack or pp.spec.replicasStrategy = binpack. I do not see how one is "more complicated" than the other. For developers, profiles are an existing mechanism while the current solution requires API changes, so the latter is more complicated.
Most importantly, the current solution pollutes the scheduler architecture for the reasons stated in previous comments.

> I think if preferenceBinpack is a basic feature, it would be better to add replicaStrategy. Otherwise, if it's an extended feature, it would be better to add it in schedulingProfile.

Basic or extended is subjective, and I don't think it's the right yardstick to determine whether a configuration option should become a policy field. There are in fact many considerations for preferring the profile approach:

  1. We already have too many fields in the policy spec. We can't expect users to keep all of them in their heads, and adding more fields further increases the cognitive load of using a policy. Certainly, we can use defaulting to reduce some of the complexity, but managing the fields is also a burden for us, as the interactions between them become ever harder to reason about.
  2. This new replica scheduling behaviour only works well for serverless/on-demand Kubernetes clusters and will spell disaster if used on a normal cluster backed by reserved nodes. What percentage of our user base would need this niche feature? Does it justify adding a new policy field when we already have another solution (profiles) that satisfies the requirements?

To be absolutely clear, I maintain my position (which I made clear during the initial offline discussion) that we shouldn't go down this slippery slope of continuously adding new fields.

mrlihanbo requested a review from Poor12 on December 12, 2023 03:27
mrlihanbo merged commit 6d35318 into kubewharf:main on Dec 12, 2023
7 checks passed
miraclejzd pushed a commit to miraclejzd/kubeadmiral that referenced this pull request Jan 3, 2024