-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kubeadm: check health status of all control plane components in wait-control-plane phase for inti #119598
Kubeadm: check health status of all control plane components in wait-control-plane phase for inti #119598
Conversation
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Please note that we're already in Test Freeze for the Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Wed Jul 26 16:09:42 UTC 2023. |
Welcome @tommas1988! |
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Hi @tommas1988. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: tommas1988 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the PR has one blocker about imports / code organization.
but let's continue the discussion on the ticket, to decide if we want this.
also we are in code freeze for 1.28.
cmd/kubeadm/.import-restrictions
Outdated
@@ -3,3 +3,4 @@ rules: | |||
- selectorRegexp: k8s[.]io/kubernetes | |||
allowedPrefixes: | |||
- k8s.io/kubernetes/cmd/kubeadm | |||
- k8s.io/kubernetes/pkg/probe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kubeadm must not import kubernetes/pkg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why kubeadm is limited from importing kubernetes/pkg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see kubernetes/kubeadm#1600
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kubernetes/pkg contains tools that must not be imported by external projects.
at the same time, we want to move kubeadm outside of kubernetes/kubernetes one day.
"k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm" | ||
"k8s.io/kubernetes/cmd/kubeadm/app/phases/controlplane" | ||
"k8s.io/kubernetes/pkg/probe" | ||
httpprobe "k8s.io/kubernetes/pkg/probe/http" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as mentioned above kubeadm must not import kubernetes/pkg.
note, we use a few import groups separated by new line:
- go stdlib
- external (e.g. github)
- k8s.io (non kubeadm)
- k8s.io kubeadm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for updating the PR. i will comment on the ticket.
/ok-to-test adding test runs, but note that this needs more discussion. i think. |
if cfg.ControlPlane != nil && features.Enabled(initCfg.FeatureGates, features.WaitForAllControlPlaneComponents) { | ||
fmt.Println("[kubelet-start] Wait for control plane components") | ||
timeout := 40 * time.Second | ||
err := controlplane.WaitForControlPlaneComponents( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after the kubelet start phase there is one called "control-plane-join"
can we have this logic there instead?
kubernetes/cmd/kubeadm/app/cmd/join.go
Lines 215 to 219 in 114aca5
joinRunner.AppendPhase(phases.NewPreflightPhase()) | |
joinRunner.AppendPhase(phases.NewControlPlanePreparePhase()) | |
joinRunner.AppendPhase(phases.NewCheckEtcdPhase()) | |
joinRunner.AppendPhase(phases.NewKubeletStartPhase()) | |
joinRunner.AppendPhase(phases.NewControlPlaneJoinPhase()) |
i don't recall if we had this discussion already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we haven`t discuss this before.
There are three subphases in "control-plane-join" phase, which subphase should contain the WaitForControlPlaneComponents
code? MarkControlPlane
phase ?
Or create a new subphase ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think a new Phase{} that is hidden is best here.
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/cmd/phases/workflow/phase.go#L46C9-L46C13
perhaps wait-control-plane
@@ -217,6 +220,21 @@ func runKubeletStartJoinPhase(c workflow.RunData) (returnErr error) { | |||
return err | |||
} | |||
|
|||
if cfg.ControlPlane != nil && features.Enabled(initCfg.FeatureGates, features.WaitForAllControlPlaneComponents) { | |||
fmt.Println("[kubelet-start] Wait for control plane components") | |||
timeout := 40 * time.Second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and following on the prev comment, there should be possible to use clusterConfig.APIServer.TimeoutForControlPlane.Duration
instead of 40s.
FTR, we have this pending:
#122529
but my proposal is to still have a single timeout value for any CP component - 4m.
9d779ff
to
c37d1e8
Compare
c37d1e8
to
e432781
Compare
cmd/kubeadm/app/cmd/init.go
Outdated
initRunner.AppendPhase(phases.NewUploadConfigPhase()) | ||
initRunner.AppendPhase(phases.NewUploadCertsPhase()) | ||
initRunner.AppendPhase(phases.NewWaitControlPlanePhase()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused, why change the sequence of phases?
I'm afraid executing UploadConfigPhase
before the control plane is ready will fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kubelet needs cluster-admins
role binding exists when requesting pods from kubelet. And initData.Client()
will make sure the role binding exists before returning api client.
I have reverted init.go
back, and create a new commit to move this logic into controlplane/components.go
e432781
to
b041dcc
Compare
@tommas1988: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
// ControlPlaneComponents contains all components in control plane | ||
var ControlPlaneComponents = []string{ | ||
kubeadmconstants.KubeAPIServer, | ||
kubeadmconstants.KubeControllerManager, | ||
kubeadmconstants.KubeScheduler, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's make this a function instead that returns a private list:
func ControlPlaneComponents() []string {
...
}
|
||
if config.Authorization.Mode == kubeletconfig.KubeletAuthorizationModeWebhook { | ||
// make sure cluster admins role binding is created, thus request to kubelet will pass server authorization | ||
if _, err := kubeconfig.EnsureAdminClusterRoleBinding(kubeadmconstants.KubernetesDir, nil); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't the RB already created at this point?
it is one of the first things that happen during kubeadm init and for joining CP nodes it should be there already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't the RB already created at this point? it is one of the first things that happen during kubeadm init and for joining CP nodes it should be there already.
This RB is created in function initData.Client()
during kubeadm init, and initData.Client()
is ever not called till WaitControlPlanePhase
is reached
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the CRB is required earlier perhaps we can call a _ = initData.Client()
instead of calling kubeconfig.EnsureAdminClusterRoleBinding
. that is because we want this Client() function to set the right vars / state. getKubeletEndpoint() should just return the endpoint and not perform CRB creation.
in runWaitControlPlanePhase there is a client, err := data.ClientWithoutBootstrap()
, but this client doesn't have permissions.
but, in this discussion we talked about using the apiserver client cert/key
https://github.com/kubernetes/kubernetes/pull/119598/files#r1430504568
and the code in this PR is still using it. i'm not sure i understand why we need to bootstrap the admin.conf client?
i.e.
// make sure cluster admins role binding is created, thus request to kubelet will pass server authorization
kubeconfig.EnsureAdminClusterRoleBinding() ensures that the kubeadm:cluster-admins
group (which admin.conf is part of) is grated the cluster-admin privilege. it seems unrelated to getting the pods with the apiserver client cert/key. that user should already be able to get the pods from the kubelet, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but, in this discussion we talked about using the apiserver client cert/key
https://github.com/kubernetes/kubernetes/pull/119598/files#r1430504568
and the code in this PR is still using it. i'm not sure i understand why we need to bootstrap the admin.conf client?
previously waitconrolpane,go
use data.Client()
to get client which is changed by this commit.
kubeconfig.EnsureAdminClusterRoleBinding() ensures that the kubeadm:cluster-admins group (which admin.conf is part of) is grated the cluster-admin privilege. it seems unrelated to getting the pods with the apiserver client cert/key. that user should already be able to get the pods from the kubelet, no?
kubelet server will install auth filter with Authorization
configuration, and when Authorization.Mode = Webhook
(which is default), kubelet server will create SubjectAccessReview
to authorize requests to kubelet.
if the CRB is required earlier perhaps we can call a _ = initData.Client() instead of calling kubeconfig.EnsureAdminClusterRoleBinding. that is because we want this Client() function to set the right vars / state. getKubeletEndpoint() should just return the endpoint and not perform CRB creation.
Why put this code in getKubeletEndpoint()
is only use _ = initData.Client()
is not clear why request to kubelet will be failed without kubeadm:cluster-admins
RB is created (I did spend some time to find why request to kubelet return athorize error).
Now, I need to revert client, err := data.ClientWithoutBootstrap()
to client, err := data.Client()
or _ = data.Client()
when this feature gate is enabled in waitcontrolplane?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, I need to revert client, err := data.ClientWithoutBootstrap() to client, err := data.Client() or _ = data.Client() when this feature gate is enabled in waitcontrolplane?
there might be a "chicken and egg" problem, because data.ClientWithoutBootstrap was used because data.Client() does not work yet. i.e. the CRB cannot be performed due to apiserver pod not ready. you can try it, of course.
kubelet server will install auth filter with Authorization configuration, and when Authorization.Mode = Webhook (which is default), kubelet server will create SubjectAccessReview to authorize requests to kubelet.
i see, so it's because of this:
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/phases/certs/certlist.go#L294
apiserver-kubelet-client needs the admin credentials, to get pods or do other actions.
if it's possible to create the Client() earlier that might be best, but if it's not, perhaps we can:
- during init use the credentials from super-admin.conf
- during join just use admin.conf or apiserver-kubelet-client. at that point the CRB should already exist.
|
||
resp, err := client.Get(kubeletEndpoint) | ||
if err != nil { | ||
fmt.Printf("[kubelet client] Error getting pods [%v]\n", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of printing these errors we can store the last error as a lastError
and return it?
err := wait.PollUntilContextTimeout...
if err != nil {
return lastError
}
} | ||
} | ||
|
||
comp.touched = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on a quick look it's not obvious why we the code is tracking labels and touched state. can you elaborate why status.Ready is not enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on a quick look it's not obvious why we the code is tracking labels and touched state. can you elaborate why status.Ready is not enough?
component.touched: bool
is dealing with the case: when pods info returned from kubelet cannot be found with this component.labels
.
I will add some comments on component
struct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, i will check the comments later.
currently, i still don't get why we need to the labels and touched flags.
components[i] = &component{name, labels, false} | ||
} | ||
|
||
return wait.PollUntilContextTimeout(context.Background(), 5*time.Second, timeout, false, func(ctx context.Context) (bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use the same retry interval as the legacy waiter for API server:
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/util/apiclient/wait.go#L82
or it's too short?
return false, nil | ||
} | ||
|
||
defer resp.Body.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unless i'm mistaken the linter in this repo will complain that the return value is not checked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unless i'm mistaken the linter in this repo will complain that the return value is not checked.
Yes, the linter did complaint with this, but it is not a required check, and i also check some old code that did not check the returned err with closing a response body, so I just levea is as it is.
I will check this returned error in new commit
func newWaitControlPlaneComponentsSubphase() workflow.Phase { | ||
return workflow.Phase{ | ||
Name: "wait-control-plane-components", | ||
Hidden: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a new hidden phases seems ok.
i guess on join we are not calling WaitForAPI or have a dedicated phase to wait for the apiserver to come up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a new hidden phases seems ok. i guess on join we are not calling WaitForAPI or have a dedicated phase to wait for the apiserver to come up?
Do you mean we need to add legacy WaitForAPI
code when this feature gate is disabled in this hidden phase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, i'm just trying to remember what happens on join.
i guess we are not even calling WaitForAPI on join, so adding the new hidden phase seems OK.
later if we decide we can make it non-hidden.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
give this a try:
https://github.com/kubernetes/kubernetes/pull/119598/files#r1477736950
if it doesn't work i can try downloading the PR locally and check what can be done.
Given the credential data from super-admin.conf also dosen`t work. And after reading code of kubelet and kube-apiserver, it seems that kubelet will use a RequestAttributes that contains by the way, is that reasonable for creating |
it doesn't?
i am surprised if system:masters doesn't work for this. one of the reasons it exists is to create initial rbac rules on bootstrap.
not sure I understand the question. but please tell me if you have more ideas. |
@tommas1988 see this patch. i tested it locally and it works with go routines at /healthz. the code is relatively simple: diff --git a/cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go b/cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go
index 8b746f288c3..f0b0a9587ce 100644
--- a/cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go
+++ b/cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go
@@ -112,7 +112,7 @@ func runWaitControlPlanePhase(c workflow.RunData) error {
}
waiter.SetTimeout(data.Cfg().Timeouts.ControlPlaneComponentHealthCheck.Duration)
- if err := waiter.WaitForAPI(); err != nil {
+ if err := waiter.WaitForControlPlaneComponents(); err != nil {
return handleError(err)
}
diff --git a/cmd/kubeadm/app/phases/upgrade/staticpods_test.go b/cmd/kubeadm/app/phases/upgrade/staticpods_test.go
index 94ba22b66ba..017b28c7077 100644
--- a/cmd/kubeadm/app/phases/upgrade/staticpods_test.go
+++ b/cmd/kubeadm/app/phases/upgrade/staticpods_test.go
@@ -98,6 +98,11 @@ func NewFakeStaticPodWaiter(errsToReturn map[string]error) apiclient.Waiter {
}
}
+// WaitForControlPlaneComponents just returns a dummy nil, to indicate that the program should just proceed
+func (w *fakeWaiter) WaitForControlPlaneComponents() error {
+ return nil
+}
+
// WaitForAPI just returns a dummy nil, to indicate that the program should just proceed
func (w *fakeWaiter) WaitForAPI() error {
return nil
diff --git a/cmd/kubeadm/app/util/apiclient/wait.go b/cmd/kubeadm/app/util/apiclient/wait.go
index da122b553dd..2b396e36394 100644
--- a/cmd/kubeadm/app/util/apiclient/wait.go
+++ b/cmd/kubeadm/app/util/apiclient/wait.go
@@ -18,6 +18,7 @@ package apiclient
import (
"context"
+ "crypto/tls"
"fmt"
"io"
"net/http"
@@ -28,6 +29,7 @@ import (
v1 "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+ utilerrors "k8s.io/apimachinery/pkg/util/errors"
netutil "k8s.io/apimachinery/pkg/util/net"
"k8s.io/apimachinery/pkg/util/wait"
clientset "k8s.io/client-go/kubernetes"
@@ -37,6 +39,8 @@ import (
// Waiter is an interface for waiting for criteria in Kubernetes to happen
type Waiter interface {
+ // WaitForControlPlaneComponents waits for all control plane components to report "ok" on /healthz
+ WaitForControlPlaneComponents() error
// WaitForAPI waits for the API Server's /healthz endpoint to become "ok"
WaitForAPI() error
// WaitForPodsWithLabel waits for Pods in the kube-system namespace to become Ready
@@ -72,6 +76,69 @@ func NewKubeWaiter(client clientset.Interface, timeout time.Duration, writer io.
}
}
+// WaitForControlPlaneComponents waits for all control plane components to report "ok" on /healthz
+func (w *KubeWaiter) WaitForControlPlaneComponents() error {
+ fmt.Printf("[control-plane-check] Waiting for healthy control plane components."+
+ " This can take up to %v\n", w.timeout)
+
+ type component struct {
+ name string
+ port int
+ }
+ var components = []component{
+ {name: "kube-api-server", port: 6443},
+ {name: "kube-controller-manager", port: 10257},
+ {name: "kube-scheduler", port: 10259},
+ }
+ var errs []error
+ errChan := make(chan error, len(components))
+
+ for _, comp := range components {
+ go func(comp component) {
+ tr := &http.Transport{
+ TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
+ }
+ client := &http.Client{Transport: tr}
+ start := time.Now()
+ var lastError error
+
+ err := wait.PollUntilContextTimeout(
+ context.Background(),
+ constants.KubernetesAPICallRetryInterval,
+ w.timeout,
+ true, func(ctx context.Context) (bool, error) {
+ resp, err := client.Get(fmt.Sprintf("https://127.0.0.1:%d/healthz", comp.port))
+ if err != nil {
+ lastError = errors.WithMessagef(err, "%s /healthz check failed", comp.name)
+ return false, nil
+ }
+
+ defer resp.Body.Close()
+ if resp.StatusCode != http.StatusOK {
+ lastError = errors.Errorf("%s /healthz check failed with status: %d", comp.name, resp.StatusCode)
+ return false, nil
+ }
+
+ return true, nil
+ })
+ if err != nil {
+ fmt.Printf("[control-plane-check] %s is not healthy after %v\n", comp.name, time.Since(start))
+ errChan <- lastError
+ return
+ }
+ fmt.Printf("[control-plane-check] %s is healthy after %v\n", comp.name, time.Since(start))
+ errChan <- nil
+ }(comp)
+ }
+
+ for i := 0; i < len(components); i++ {
+ if err := <-errChan; err != nil {
+ errs = append(errs, err)
+ }
+ }
+ return utilerrors.NewAggregate(errs)
+}
+
// WaitForAPI waits for the API Server's /healthz endpoint to report "ok"
func (w *KubeWaiter) WaitForAPI() error {
fmt.Printf("[api-check] Waiting for a healthy API server. This can take up to %v\n", w.timeout)
diff --git a/cmd/kubeadm/app/util/dryrun/dryrun.go b/cmd/kubeadm/app/util/dryrun/dryrun.go
index b1c4e7a4712..ae34d2bdee9 100644
--- a/cmd/kubeadm/app/util/dryrun/dryrun.go
+++ b/cmd/kubeadm/app/util/dryrun/dryrun.go
@@ -88,6 +88,11 @@ func NewWaiter() apiclient.Waiter {
return &Waiter{}
}
+// WaitForControlPlaneComponents just returns a dummy nil, to indicate that the program should just proceed
+func (w *Waiter) WaitForControlPlaneComponents() error {
+ return nil
+}
+
// WaitForAPI just returns a dummy nil, to indicate that the program should just proceed
func (w *Waiter) WaitForAPI() error {
fmt.Println("[dryrun] Would wait for the API Server's /healthz endpoint to return 'ok'")
output:
EDIT: one note here, apiserver has /readyz which is better but the other components don't have it, thus i used /healthz for all of them. also the old WaitForAPI code uses /healthz anyhow. IMPORTANT: kubeadm does not support very well --annonymous-auth=false on the kube-apiserver, but if some user decides to turn it off they will face this problem: but i guess our WaitForAPI already has this problem. in any case, i think what we can do the following:
WDYT? |
/kind feature
@tommas1988 the release note block should started with
|
@pacoxu: /release-note-edit must be used with a single release note block. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
https://raw.githubusercontent.com/kubernetes/kubernetes/master/.github/PULL_REQUEST_TEMPLATE.md |
i opened a new PR for the same feature |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
closing in favor of #123341
/close |
@neolit123: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
What type of PR is this?
/kind bug
What this PR does / why we need it:
Fixed kubeadm init command of wait-control-plane phase dose not check health status of all control plane components
Which issue(s) this PR fixes:
Fixes kubernetes/kubeadm#2907
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: