Create Position around Management Cluster #8210

Open: wants to merge 1 commit into base: master
46 changes: 46 additions & 0 deletions sig-multicluster/management-cluster-position-statement.md
@@ -0,0 +1,46 @@
# Management Cluster - SIG Multicluster Position Statement
Contributor

I feel like the term "Management" is a bit too generic. Could we do a poll first to gather some community feedback?

Author

This is a great idea, thank you! I'll ask in Slack for a list and try to figure out where/how to run said poll :)

Member

@MikeSpreitzer, Dec 12, 2024

Not all multi-cluster management designs have a monolith. I do not think this SIG should take a position that requires or recommends a monolith. There are multiple roles involved in multi-cluster management. I think that it may be more helpful to identify those roles. A system with a monolith can be described as having one cluster that plays all those roles.

For example, in https://github.com/kubestellar/kubestellar/ we identify distinct roles and allow flexibility in what plays which role. We start by identifying a concept that is less than a "cluster". We define a "space" to be the fragment of cluster behavior that is only concerned with generic API machinery. A space can store and serve kube API objects and subject them to the general-purpose controllers (the ones that apply to all kinds of API objects, not the controllers involved specifically with containerized workloads such as Pod, Service, ...). KubeStellar defines the following roles.

  • an "inventory space" holds the inventory of managed clusters. That is, API objects that refer to or describe the managed clusters. In OCM these are ManagedCluster objects.
  • a "workload description space (WDS)" holds both (a) objects that describe the workload (both desired state from users and reported state to users) and (b) control objects that control the multi-cluster management behavior.
  • a "workload execution cluster (WEC)" is a managed cluster, where the workload is run or executed.
  • the current KubeStellar implementation uses OCM and has a "transport space" that holds the workload interface objects of OCM (e.g., ManifestWork). The current implementation of KubeStellar requires that one space play both "inventory space" and "transport space" roles, calling this joint role "inventory and transport space (ITS)".
  • the current implementation of KubeStellar uses https://github.com/kubestellar/kubeflex as a space manager. The KubeFlex API objects reside in, and the KubeFlex central controller runs in, a role called the "KubeFlex hosting cluster".
  • KubeFlex considers itself to manage "ControlPlanes", and supports a few different types. One is "vcluster", which uses https://www.vcluster.com/ to create virtual slices of its hosting cluster. Another is "k8s", which is essentially a hosted Kubernetes kube-apiserver, its storage, and general-purpose controllers (the essence of a "space"). KubeFlex can also have a ControlPlane that represents its hosting cluster. We are working on enabling KubeFlex to adopt an arbitrary existing cluster as a ControlPlane.

One configuration that KubeStellar supports is one real cluster playing the roles of WDS, ITS, and KubeFlex hosting cluster.

In OCM, where the workload description is wrapped and the workload execution cluster holds the unwrapped objects, could one cluster play both WDS and WEC roles?
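
To make the role-based framing above concrete, here is a minimal Python sketch. It is illustrative only and is not KubeStellar code: the `Role`, `Cluster`, `plays_all`, and `its_violations` names are hypothetical, and it simplifies by treating every role holder as a "cluster" even though a space is less than a cluster.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    # Role names taken from the list above; the enum itself is hypothetical.
    INVENTORY = "inventory space"
    TRANSPORT = "transport space"
    WDS = "workload description space"
    WEC = "workload execution cluster"
    KUBEFLEX_HOST = "KubeFlex hosting cluster"


@dataclass
class Cluster:
    name: str
    roles: set = field(default_factory=set)


def plays_all(cluster, roles):
    """True if the cluster plays every role in `roles`."""
    return set(roles) <= cluster.roles


def its_violations(clusters):
    """Names of clusters that play exactly one of the inventory/transport roles.

    Mirrors the stated constraint that one space currently plays both roles (ITS).
    """
    its = {Role.INVENTORY, Role.TRANSPORT}
    return [c.name for c in clusters if len(c.roles & its) == 1]


# One real cluster playing the WDS, ITS, and KubeFlex hosting roles.
combined = Cluster(
    "hub", {Role.WDS, Role.INVENTORY, Role.TRANSPORT, Role.KUBEFLEX_HOST}
)
edge = Cluster("edge-1", {Role.WEC})

print(plays_all(combined, set(Role) - {Role.WEC}))  # True
print(its_violations([combined, edge]))             # []
```

In this model a "monolith" is just the special case where `plays_all` holds for every management-side role on a single cluster; nothing in the data structure requires it.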

Author

Thank you for those details. I agree with your point about not requiring a monolith.
I'm only trying to define what a management cluster is; I don't think I force everything to be in the same cluster or to be a monolith. Maybe my wording is wrong? I could rename the title to "Management Clusters" to make it clearer, but I'm just trying to get a wide definition of what such a cluster is (as opposed to a workload cluster).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder what KubeStellar's recommended way to host the WDS, ITS, and KubeFlex is?


Author: Corentin Debains (**[@corentone](https://github.com/corentone)**), Google
Last Edit: 2024/12/09
Status: DRAFT

## Goal
To establish a standard definition for a central cluster that is leveraged by multicluster
controllers to manage multicluster applications or features across an inventory of clusters.

## Context
Multicluster controllers have always needed a place to run. This may happen in external
proprietary control-planes but for more generic platforms, it has been natural for the
Kubernetes community to leverage a Kubernetes Cluster and the existing api-machinery
available. There are a variety of examples, such as ArgoCD, MultiKueue,
or any of the Federation efforts (Karmada, KubeAdmiral); all of them either do not name the
"location" where they run or do not align on a name (Admin Cluster, Hub Cluster, Manager Cluster...).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you read Hub Cluster, you do think of the exact location of where it runs, right in the middle, especially if the term spoke comes up next.

Author

I didn't pick "hub cluster" for two reasons:
1/ I was worried it'd sound weird when the "hub" cluster is part of the workload clusters themselves.
2/ Given there can be multiple "hub clusters", "hub" may lose some meaning because it's not a singleton; multiple hubs seems odd?

So I prioritized its "function" over what it was? But those arguments are weak, and someone suggested polling the community, which is likely to be better than my arguments :)

The [ClusterInventory](https://github.com/kubernetes/enhancements/blob/master/keps/sig-multicluster/4322-cluster-inventory/README.md)
(ClusterProfile CRDs) is also the starting point for a lot of multicluster controllers and,
being a CRD, it requires an api-machinery to host it.

## Definition

A (multicluster) management cluster is a Kubernetes cluster that acts as a
control-plane for other Kubernetes clusters (named Workload Clusters to differentiate
them). It MUST have visibility over the available clusters and MAY have administrative

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder what "visibility" means here? There can be workload clusters that do not have a public IP.

Author

Agreed, visibility is vague. I think it is roughly the ability to query more data about the cluster. I'm thinking it would be the ability to "READ" via kubectl (though it may not cover everything). But it should potentially be able to see the Cluster object in GKE or other platforms (wherever the provisioning/lifecycle happens).

Maybe this is too vague. I originally had this in to mandate some kind of visibility on the cluster so that the controller could do something (if the controller only has access to the ClusterProfile, I don't know what it can really do).

But we may be able to remove it and it wouldn't affect the definition much.

privileges over them. It SHOULD NOT be part of the workload clusters, to provide better
Member

I am not convinced of this "SHOULD". I am not convinced that this sort of statement belongs in a definition.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether you are unconvinced that the "hub" cluster and a "spoke" cluster should not be the same, or whether you just think a "should" does not belong in a "definition"? I am curious about the reason if it's the former.

security isolation, especially when it has any administrative privileges over them.
There MAY be multiple management clusters overseeing the same set of Workload Clusters
Member

This statement supposes multiplicity only in the form of potentially competing equals; it omits the possibility of clusters fulfilling distinct roles.

Author

I tried to clarify in the next section that it could be multiple roles:

to allow for separation of functionality (security-enforcer management cluster vs
config-delivery management cluster)

Is the wording not strong enough? I don't mean to close that door; I thought the current wording was enough and did not prescribe a direction, only requiring that the admin oversee potential overlap between different management clusters. If there is no overlap, they are fine to co-exist as separate clusters.

and it is left to the administrator to guarantee that they don't compete in their
management tasks. There SHOULD be a single clusterset managed by a management cluster.
Member

This raises the question of what a "clusterset" is.

Author

https://multicluster.sigs.k8s.io/api-types/cluster-set/

I tried to tie it to the sig-mc definition (I need to link it, indeed!).

Management clusters can be used for both control-plane and data-plane features.
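
As an illustration only, the requirement levels in this definition could be checked mechanically along the following lines. This is a sketch under stated assumptions, not part of the position statement: the `ManagementCluster` type and `check` function are hypothetical, and "visibility" is modeled simply as a set of workload cluster names the management cluster can observe, which is one possible reading of a term the discussion acknowledges is vague.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ManagementCluster:
    name: str
    visible_clusters: frozenset        # workload clusters it can observe (assumed model)
    clustersets: frozenset             # ClusterSets it manages
    is_workload_cluster: bool = False  # does it also run business workloads?


def check(mc, available_clusters):
    """Return (errors, warnings): MUST violations are errors, SHOULD violations are warnings."""
    errors, warnings = [], []
    if not set(available_clusters) <= mc.visible_clusters:
        errors.append("MUST have visibility over the available clusters")
    if mc.is_workload_cluster:
        warnings.append("SHOULD NOT be part of the workload clusters")
    if len(mc.clustersets) != 1:
        warnings.append("SHOULD manage a single clusterset")
    return errors, warnings


mgmt = ManagementCluster("mgmt", frozenset({"wl-1", "wl-2"}), frozenset({"prod"}))
print(check(mgmt, {"wl-1", "wl-2"}))  # ([], [])
```

Separating errors from warnings keeps the MUST/SHOULD distinction visible: a cluster that also runs workloads is discouraged but still fits the definition.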


### Rationale on some specific points of the definition

* Multiple management clusters: While it often makes sense to have a single "Brain" overseeing
a Fleet of Clusters, there is a need for flexibility over the number of management clusters: to
allow redundancy to improve reliability, to allow sharding of responsibility (for regionalized
controllers), to allow for separation of functionality (security-enforcer management cluster vs
config-delivery management cluster), to allow for migrations (from old management cluster to new
management cluster) and likely more.
* Management cluster also being part of the workload-running Fleet: We do recommend that the
management cluster(s) be isolated from the running Workload Fleet for security and management
Member

This makes sense in general, but I am not convinced that there are no use cases for combining roles in one cluster.

Author

The first paragraph actually encourages different clusters for different roles.
This second paragraph is just about being part of the workload clusters or not.

Let me try to think of introducing the notion of "role" or something like that, as a subdivision of the broad Management Cluster.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder what the definition of "workload" is? I usually associate workloads with applications, not controllers, so it's okay to me to run controllers that require leader election in the "central" cluster.

Author

It is definitely okay to run a controller in the management cluster. A controller doesn't have to be a "Workload"; it can be considered part of the "control-plane".

I think it's the persona that matters. If it is a platform-admin-owned controller performing management tasks, I wouldn't consider it a workload. A workload, to me, is an application serving an actual business-logic purpose.

All in all, running those management controllers is the reason why I want to define management clusters and not just a management API. (we had discussed internally giving simply an API with machinery... but then very quickly you want to bring a controller to act on this API and look where to run it)

concerns. But there may be specific cases or applications that require mixing the two. For example,
controllers that take a "leader-election" approach and want a smaller footprint.
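
The definition leaves it to the administrator to guarantee that multiple management clusters do not compete. A minimal sketch of what such a check might look like follows; it is illustrative only. The `find_conflicts` helper, the cluster names, and the function labels are all hypothetical, and the sketch treats any overlap as a conflict, so deliberate redundancy or migration setups would need an explicit exception.

```python
from collections import defaultdict


def find_conflicts(assignments):
    """Find functions performed by more than one management cluster on one workload cluster.

    `assignments` is an iterable of (management_cluster, function, workload_cluster).
    Returns a sorted list of (function, workload_cluster, [management clusters]).
    """
    owners = defaultdict(set)
    for mgmt, function, workload in assignments:
        owners[(function, workload)].add(mgmt)
    return sorted(
        (function, workload, sorted(mgmts))
        for (function, workload), mgmts in owners.items()
        if len(mgmts) > 1
    )


assignments = [
    ("mgmt-security", "security-enforcement", "wl-1"),
    ("mgmt-config", "config-delivery", "wl-1"),
    ("mgmt-config-eu", "config-delivery", "wl-1"),
]
print(find_conflicts(assignments))
# [('config-delivery', 'wl-1', ['mgmt-config', 'mgmt-config-eu'])]
```

Here the security-enforcer and config-delivery management clusters co-exist without conflict because their functions differ, while the two config-delivery clusters are flagged because they target the same workload cluster.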