operator: poc: implement event-driven lifecycle of controllers by operator #1557
Labels
area/api-gateway
Issues or PRs related to api-gateway
kind/feature
Categorizes issue or PR as related to a new feature.
/kind feature
/area api-gateway
Description
Currently, all controllers in api-gateway-manager are enabled on startup. To enable updates to the APIGateway CR, which serves as the operator for the functionality of the entire APIGateway module, we cannot use Watches on external resources.
After a short discussion with @pbochynski, we agreed that the APIGateway Operator could dynamically register and unregister controllers based on a specific set of dependencies. The two requirements are:
Some of our controllers, like APIRule or the newly created RateLimit controller, have a direct dependency on Istio resources. Without those resources, the controllers simply cannot work. If the dependencies are missing, we can simply not enable the controller.
controller-runtime makes it possible to create unmanaged controllers, which can then be started manually. We could extend the APIGateway CR, watch CustomResourceDefinition resources, and enable or disable controllers based on the cluster state. This seems to be the most reasonable option, but it requires a bit more research.
We could create a separate DynamicCRDController which registers controllers if the CRDs are installed, like in the Kong KIC project. However, this has a downside when you try to disable or restart a controller whose precondition is no longer met: e.g. if the EnvoyFilter CRD gets deleted, the manager may still crash because the resource is gone while a watch on it is still registered.
Another approach is to separate the controllers into individual binaries. The APIGateway Operator would orchestrate the lifecycle of the controller deployments, applying or removing them based on the initial configuration provided by the APIGateway CR. This way we can observe the deployments and react to their state changes.
We could just poll for a resource, but that goes against the Kubernetes controller pattern, as stated in the Kubebuilder book. It also implies that changes to managed external resources created by our controllers, like VirtualServices or EnvoyFilters, would never be reconciled immediately. That alone can cause instability in the API and potential connectivity downtime until the next reconciliation occurs.
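To make the event-driven idea more concrete, here is a minimal sketch of a lifecycle manager that starts and stops controllers as CRD events arrive, instead of polling. It uses only the Go standard library and is not the api-gateway implementation: `ManagedController`, `Lifecycle`, `OnCRDEvent` and `waitForCancel` are hypothetical names, and the mapping of the APIRule and RateLimit controllers to specific Istio CRDs is an assumption made for illustration.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"sync"
)

// ManagedController is a hypothetical descriptor: a controller name, the CRDs
// it needs, and a Run function that must return once ctx is cancelled.
type ManagedController struct {
	Name     string
	Requires []string
	Run      func(ctx context.Context)
}

// Lifecycle starts and stops controllers in response to CRD events.
type Lifecycle struct {
	mu          sync.Mutex
	wg          sync.WaitGroup
	installed   map[string]bool
	controllers []ManagedController
	stops       map[string]context.CancelFunc
}

func NewLifecycle(cs ...ManagedController) *Lifecycle {
	return &Lifecycle{
		installed:   map[string]bool{},
		controllers: cs,
		stops:       map[string]context.CancelFunc{},
	}
}

// OnCRDEvent records whether a CRD is present, then starts every controller
// whose dependencies are all met and stops every running one that lost any.
func (l *Lifecycle) OnCRDEvent(crd string, present bool) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.installed[crd] = present
	for _, c := range l.controllers {
		ready := true
		for _, dep := range c.Requires {
			ready = ready && l.installed[dep]
		}
		stop, running := l.stops[c.Name]
		switch {
		case ready && !running:
			ctx, cancel := context.WithCancel(context.Background())
			l.stops[c.Name] = cancel
			l.wg.Add(1)
			go func(run func(context.Context)) {
				defer l.wg.Done()
				run(ctx)
			}(c.Run)
		case !ready && running:
			stop()
			delete(l.stops, c.Name)
		}
	}
}

// Running returns the names of the currently started controllers, sorted.
func (l *Lifecycle) Running() []string {
	l.mu.Lock()
	defer l.mu.Unlock()
	names := make([]string, 0, len(l.stops))
	for name := range l.stops {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// Wait blocks until every started controller goroutine has returned.
func (l *Lifecycle) Wait() { l.wg.Wait() }

// waitForCancel is a placeholder controller body that idles until stopped.
func waitForCancel(ctx context.Context) { <-ctx.Done() }

func main() {
	// Assumed dependency mapping, for illustration only.
	const vs = "virtualservices.networking.istio.io"
	const ef = "envoyfilters.networking.istio.io"
	l := NewLifecycle(
		ManagedController{Name: "apirule", Requires: []string{vs}, Run: waitForCancel},
		ManagedController{Name: "ratelimit", Requires: []string{ef}, Run: waitForCancel},
	)
	l.OnCRDEvent(vs, true)
	fmt.Println(l.Running())
	l.OnCRDEvent(ef, true)
	fmt.Println(l.Running())
	l.OnCRDEvent(ef, false) // EnvoyFilter CRD deleted: ratelimit is stopped
	fmt.Println(l.Running())
	l.OnCRDEvent(vs, false)
	l.Wait()
}
```

In an actual operator, `Run` would presumably wrap the start of an unmanaged controller-runtime controller, and `OnCRDEvent` would be driven by a watch on CustomResourceDefinition objects. The sketch does not address whether watches registered by a stopped controller are cleaned up in the manager, which is the crash risk described above for the DynamicCRDController approach.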