Enhance restart proxy feature to differentiate between user workloads and kyma workloads #1249
Closed
21 tasks done
Labels
kind/feature
Categorizes issue or PR as related to a new feature.
Milestone
Description
An Istio bump requires restarting every Pod that has an Istio sidecar injected, so that the proxy stays in sync with istiod. If a customer's workload is broken for any reason and its Pods cannot start, we cannot reconcile the data plane of the service mesh to the required state. The Istio operator then retries the proxy restart indefinitely, and the Istio CR loops through processing -> error -> processing. The root cause is outside the Istio Module team's control, but because it surfaces as a failure to restart a Pod, the Istio Module team receives the alerts and, with them, the responsibility.
As a solution, we should limit the number of retries during the proxy restart. If a workload cannot be restarted successfully, the Istio CR should end up in the warning state, indicating that action is required on the customer side: the customer must fix the workload so that the proxies can be updated.
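A minimal sketch of the bounded-retry idea, not the operator's actual code. All names here (`CRState`, `restartFunc`, `restartWithLimit`, `ErrRestartFailed`) are hypothetical, and the real reconciler would work on Kubernetes objects rather than plain strings. It only illustrates that each workload gets a fixed number of attempts and that any remaining failure maps to a warning instead of another reconcile loop.

```go
package main

import "errors"

// CRState is a hypothetical stand-in for the state reported on the Istio CR.
type CRState string

const (
	StateReady   CRState = "Ready"
	StateWarning CRState = "Warning"
)

// ErrRestartFailed is a hypothetical error returned when a proxy restart fails.
var ErrRestartFailed = errors.New("proxy restart failed")

// restartFunc is a hypothetical hook that restarts the proxy of one workload.
type restartFunc func(workload string) error

// restartWithLimit tries each workload at most maxAttempts times.
// Workloads that still fail are returned so they can be listed in the
// CR status; their presence yields a warning rather than an error.
func restartWithLimit(workloads []string, maxAttempts int, restart restartFunc) (CRState, []string) {
	var failed []string
	for _, w := range workloads {
		restarted := false
		for attempt := 1; attempt <= maxAttempts; attempt++ {
			if err := restart(w); err == nil {
				restarted = true
				break
			}
		}
		if !restarted {
			failed = append(failed, w)
		}
	}
	if len(failed) > 0 {
		return StateWarning, failed
	}
	return StateReady, nil
}
```

Returning the list of failed workloads alongside the state would let the warning message name exactly what the customer has to fix.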
Consider the case of a single Pod without a parent Deployment, which we cannot restart. We should decide whether to ignore it or to set the Istio CR status to warning.
Consider the Istio CR status when a node is draining and a Deployment/Pod is in the evicted state. This should not cause an error status.
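The two edge cases above could be handled by classifying each Pod before attempting a restart. The sketch below is only one possible outcome of that decision: `podInfo`, `restartAction`, and `classify` are hypothetical names, the struct is a simplified stand-in for the Kubernetes Pod object, and treating an ownerless Pod as a warning (rather than ignoring it) is an assumption, since the issue leaves that choice open. It relies on the fact that Kubernetes reports an evicted Pod with phase `Failed` and reason `Evicted`.

```go
package main

// podInfo is a simplified, hypothetical view of a Pod; a real
// implementation would read these fields from the Kubernetes API object.
type podInfo struct {
	Name     string
	HasOwner bool   // false for a bare Pod with no owner references
	Phase    string // for example "Running" or "Failed"
	Reason   string // for example "Evicted"
}

// restartAction is a hypothetical decision for a single Pod.
type restartAction string

const (
	actionRestart restartAction = "restart"
	actionSkip    restartAction = "skip"
	actionWarn    restartAction = "warn"
)

// classify decides how the proxy restart should treat a Pod.
func classify(p podInfo) restartAction {
	// Evicted Pods (for example during a node drain) are skipped so
	// that they do not push the Istio CR into an error status.
	if p.Phase == "Failed" && p.Reason == "Evicted" {
		return actionSkip
	}
	// Assumption: a Pod without a parent controller cannot be restarted
	// by the operator, so it is surfaced as a warning for the customer.
	if !p.HasOwner {
		return actionWarn
	}
	return actionRestart
}
```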
TODOs:
PRs
ACs [PO]
DoD [Developer & Reviewer]