Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP for adding toggle behaviour to KudoOperator #1794

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
147 changes: 147 additions & 0 deletions keps/0036-kudo-operator-toggle-behavior.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,147 @@
---
kep-number: 36
short-desc: KudoOperator Toggling
title: Parameter-controlled toggling of KudoOperator tasks
authors:
- "@asekretenko"
owners:
- "@asekretenko"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Historically owners have been committers and I don't know @asekretenko. Are we expecting him to be a project committer? There is usually a voting process by owners in the CNCF process for this. That said there is a lot of flux in the committers and owners which is unfortunate.

Please review the KEP process. In particular: https://github.com/kudobuilder/kudo/blame/main/keps/0001-kep-process.md#L168-L173


The OWNER that is most closely associated with this KEP. If there is code or
 other artifacts that will result from this KEP, then it is expected that
 this OWNER will take responsibility for the bulk of those artifacts.

@asekretenko will you be able to serve in this capacity and own feature?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kensipe the intention is to have @asekretenko become an owner down the line and we're earning our way to it via contributions in the form of bug-fixes and other KEPs down the line.

editor: @asekretenko
creation-date: 2021-04-30
last-updated: 2021-04-30
status: provisional
see-also:
- KEP-23
- KEP-29
---

# Parameter-controlled toggling of KudoOperator tasks

## Table of Contents
* [Summary](#summary)
* [Motivation](#motivation)
* [Goals](#goals)
* [Non-Goals](#non-goals)
* [Proposal](#proposal)
* [Interface](#interface)
* [Validation](#validation)
* [Changing enablingParameter on upgrade](#changing-enablingparameter-on-upgrade)
* [Implementation Notes](#implementation-notes)
* [Risks and Mitigations](#risks-and-mitigations)
* [Drawbacks](#drawbacks)
* [Alternatives](#alternatives)

<!-- [Tools for generating]: https://github.com/ekalinin/github-markdown-toc -->
asekretenko marked this conversation as resolved.
Show resolved Hide resolved

## Summary

This KEP aims to add an ability to switch an operator dependency on and off
via a specified parameter of a parent operator. Namely, as a result of
implementing this KEP, `KudoOperator` task definition will get an optional field
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
implementing this KEP, `KudoOperator` task definition will get an optional field
implementing this KEP, `KudoOperator` task definition will have an optional field

containing a name of a boolean parameter (of the parent operator), the value
of which will determine whether the child operator instance should exist.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't make sense... in particular should exist. I'm guessing.. but perhaps you mean that the operator will expect that the child operator exists and will not be created as part of a deployment.

Also... it might be worth editing this paragraph with a focus on "why" and less on how. The how will be outlined in the KEP.. the summary should strongly support "why" it adds value. what the world is likely without and what the world will be like with it.... which then be amplified by the motivation below.


## Motivation
From [#1775](https://github.com/kudobuilder/kudo/issues/1775):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏

> Currently, KudoOperator tasks don't support toggle behavior which
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
> Currently, KudoOperator tasks don't support toggle behavior which
> Currently, KudoOperator tasks doesn't support toggle behavior which

> significantly complicates the handling of the optional dependencies. A good
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'handling of "the" optional dependencies'... gives the impression of "one" dependencies and requires a lot of assumptions of the reader. Likely we mean "handling of optional dependencies", which is explained below.

Suggested change
> significantly complicates the handling of the optional dependencies. A good
> significantly complicates the handling of optional dependencies. A good

> example could be Spark and History Server or Kafka and Schema Registry. The
> only possible options to handle it today are either include the templates
> as-is into the parent operator and control the contents at the template level
> or to implement a job (via Toggle task) which deploys/removes the dependency
> based on a parameter value. Also, cascading updates/upgrades become
> complicated in these scenarios.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The paragraph has each line with a >. please remove.

Also better to have as one line so that edits which result in shorter lines doesn't require refactoring all the lines of a paragraph causing noise in review changes.


### Goals

* Add an ability to enforce existence/**non-existence** of the child
operator instance (and all its children) via a parent operator parameter,
i.e. to toggle the dependency.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we please rephrase? It is hard to understand and it is the goal. Are we expecting KUDO to "enforce" something? That isn't the mental model I formed to this point of reading. I am expecting that a defined dependency is already installed and those I don't want KUDO to install it (as a dependency) of the parent kudo operator. KUDO isn't "enforcing" anything at that point... it seems to be allowing a deviation from the current norm.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kensipe thanks, I'll rephrase this (and also other locations).

I should have been much more careful with the word "enforce". While an execution of a task (Apply/Delete/Toggle/KudoOperator) is enforcing existence/deletion of a resource (probably "is ensuring" would be a better word), neither tasks nor plans nor KUDO are really enforcing anything; such an enforcement is very far from the scope of this KEP.
Sorry for confusion.


* Make parent operators specifying an invalid parameter for toggling fail
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm question "invalid" parameter... I don't understand what that means in this context. It also sounds inverted... perhaps more like "required" dependency which can be set false. It is unclear what we are communicating here. What exactly are we toggling?

instance admission (as opposed to simply failing a plan with misconfigured
asekretenko marked this conversation as resolved.
Show resolved Hide resolved
`KudoOperator` task).

### Non-Goals

* Chain validation of child instances. `KudoOperator` configurations that result
in failed instance admission of a child do not fail parent instance admission,
only plan execution. Functionality introduced by this proposal is also
affected by this: for example, if a parent switches on a child that uses a
non-existent parameter for toggling a grandchild, parent instance admission
will succeed, only the corresponding parent plan will fail.

* OperatorVersion validation. This proposal adds one more way to create
an operator that just cannot be installed regardless of instance parameters.

* Introducing an `UninstallKudoOperator` task. An ability to unconditionally
remove a child operator (for example, installed by a previous version), while
no more difficult, is not a goal of this proposal.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before getting to "proposal"... we need more information outlined to set expectations..
For examples questions / comments below, assuming Spark operator dependent on History Server and the KUDO spark operator is configured for a dependent KUDO History Server.

When KUDO Spark is installed... it will install KUDO History Server. Resulting in KUDO having "ownership" as a controller over the history server. Resulting in param changes for a kudo update being managed for the history server.

Questions:

  1. IF we opt-out of the dependency, does KUDO own it or not own it? It seems it would not but the KEP should make that clear... with all the responsibilities for the dependency being removed from kudo. (like updates, deletes etc.)
  2. IF we opt-out, (and I'm going off really old memory here).. I can't remember if we have the ability to have output values from a dependency... for instance a dependent zookeeper, provide a zookeeper client details to a parent.. if we have that, how does that work for a parent that opts out?
  3. If we opt-out, what happens for kudo update for parameters which are for an opt-out dependency? Do we error out? does it ignore it?
  4. same for a kudo upgrade
  5. for an opt-out, can an kudo update reverse the decision? Opting in and provisioning a new dependency?
  6. What does the plan status look like for an opt-out... now it shows a tree of phases including all dependencies... does it show the dependencies? Does it reflect the opt-out?

There is likely more... we need these questions answered in the KEP to frame a mental model for discussions, expectations and awareness. A feature of this magnitude requires some thoughts around impact to all other features and usability. It really is required prior to any proposal as the proposal will be evaluated against all the edge cases.

We don't have to boil the ocean.. this KEP is "provisional"... the first go of it, doesn't require a proposal... just a lot of detail of all the edge cases and a strong case around the acceptance criteria. It is also reasonable to identify cases which are too complex and will not be supported but they need to called out in this doc which becomes the foundation by which we add the feature.

Copy link
Member Author

@asekretenko asekretenko May 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kensipe Thanks for these questions!

Regarding (1), I've tried to reword this KEP to make it clear that the goal is not toggling of a dependency relationship between two operator instances, but controlling whether the dependency instance exists by means of a parameter.

(2) is not an issue: as of today, there are no means for obtaining an output from the dependency.
I've tried to address (3) - (6) in the Non-Goals section.

Please take a new look; hopefully, this makes more sense now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asekretenko sorry for delays.. I will make time to review again tomorrow morning... it should be there by your mid-day.

## Proposal

### Interface
An optional field `enablingParameter` will be added to the `KudoOperator` task:
```
- name: dependency
kind: KudoOperator
spec:
package: "./packages/dependency"
parameterFile: dependency-params.yaml
enablingParameter: "installDependency"
```
In this example, when the parameter `installDependency` equals to `true`, the
task will be handled as a normal, unconditional `KudoOperator`. When the
parameter equals to `false`, execution of the task will ensure that
the corresponding operator instance is not installed (uninstalling it if
necessary).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will the uninstall step trigger the clanup plan of the child operator?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akirillov Yes, it will. Normally, removing an instance triggers the cleanup plan (given that the plan exists, hasn't been triggered yet, and so on). In this KEP, there is no plan to treat child instance removal as a special case.


#### Validation
A `KudoOperator` task referencing a non-existing parent parameter will be
treated as invalid: instance admission (and plan execution, if instance
admission is not used) will fail.

Task execution and instance validation will require that the type
of the specified parameter is either "boolean" or a string convertible
to boolean according to rules used by Go's `strconv.ParseBool()`

#### Changing `enablingParameter` on upgrade
No special handling for upgrade is planned.

Some consequencees of this choice:
* Dropping the parameter will convert the task into an unconditional
`KudoOperator` managing the same instance.
* Switching to a parent operator parameter with a different name will have
an effect determined only by the value of the new parameter.
* Dropping a `KudoOperator` task in one operator version and re-introducing
a dependency with the same name with completely different parameters
in some next version will still allow the new version to uninstall
the formerly child operator by upgrade with setting the newly introduced
enabling parameter to `false`. Operator developers will still need to
consider the whole history of development of an operator to avoid unintended
consequences of upgrades.

### Implementation Notes

Implementation is relatively straightforward, including the instance removal
case: the child instance to be removed will be identified via the same mechanism
as one used for identifying the instance to patch on upgrade/update.

## Risks and Mitigations

## Drawbacks

This proposal makes the operator developer API even less robust with regards
to developer errors than it is now (see [Non-Goals](#non-goals)).

## Alternatives

Currently existing alternative is conditionally creating a child operator
instance (and also Operator/OperatorVersion if needed) by means of `Toggle`
tasks. This is extremely cumbersome in the existing form, but might become much
more convenient if we are to implement a stable API for managing operators
via a single custom resource (aka CRD-based installation; KEP yet to be filed).

However, one might argue that `KudoOperator` will still provide a much more
convenient framework for cases when a parent operator has multiple children
that are never used on their own.