feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.5.2 ) #2596

Open
wants to merge 1 commit into base: main

Conversation

snoopy82481-bot[bot]
Contributor

snoopy82481-bot bot commented Feb 6, 2025

This PR contains the following updates:

Package: kube-prometheus-stack (source)
Update: major
Change: 67.11.0 -> 69.5.2

Release Notes

prometheus-community/helm-charts (kube-prometheus-stack)

v69.5.2

Compare Source

kube-prometheus-stack collects Kubernetes manifests, Grafana dashboards, and Prometheus rules combined with documentation and scripts to provide easy to operate end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus Operator.

What's Changed

Full Changelog: prometheus-community/helm-charts@prom-label-proxy-0.10.2...kube-prometheus-stack-69.5.2

v69.5.1

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@prometheus-operator-crds-18.0.1...kube-prometheus-stack-69.5.1

v69.5.0

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@prometheus-operator-admission-webhook-0.19.0...kube-prometheus-stack-69.5.0

v69.4.1

Compare Source

What's Changed

New Contributors

Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-69.4.0...kube-prometheus-stack-69.4.1

v69.4.0

Compare Source

What's Changed

New Contributors

Full Changelog: prometheus-community/helm-charts@prometheus-rabbitmq-exporter-2.1.1...kube-prometheus-stack-69.4.0

v69.3.3

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@prometheus-rabbitmq-exporter-2.1.0...kube-prometheus-stack-69.3.3

v69.3.2

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@prometheus-elasticsearch-exporter-6.6.1...kube-prometheus-stack-69.3.2

v69.3.1

Compare Source

What's Changed

New Contributors

Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-69.3.0...kube-prometheus-stack-69.3.1

v69.3.0

Compare Source

What's Changed

New Contributors

Full Changelog: prometheus-community/helm-charts@prometheus-json-exporter-0.16.0...kube-prometheus-stack-69.3.0

v69.2.4

Compare Source

What's Changed

New Contributors

Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-69.2.3...kube-prometheus-stack-69.2.4

v69.2.3

Compare Source

What's Changed

New Contributors

Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-69.2.2...kube-prometheus-stack-69.2.3

v69.2.2

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-69.2.1...kube-prometheus-stack-69.2.2

v69.2.1

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@kube-state-metrics-5.30.0...kube-prometheus-stack-69.2.1

v69.2.0

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-69.1.2...kube-prometheus-stack-69.2.0

v69.1.2

Compare Source

What's Changed

New Contributors

Full Changelog: prometheus-community/helm-charts@prometheus-json-exporter-0.15.0...kube-prometheus-stack-69.1.2

v69.1.1

Compare Source

What's Changed

New Contributors

Full Changelog: prometheus-community/helm-charts@prometheus-operator-crds-18.0.0...kube-prometheus-stack-69.1.1

v69.1.0

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-69.0.0...kube-prometheus-stack-69.1.0

v69.0.0

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-68.5.0...kube-prometheus-stack-69.0.0

v68.5.0

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@prometheus-operator-admission-webhook-0.18.2...kube-prometheus-stack-68.5.0

v68.4.5

Compare Source

What's Changed

New Contributors

Full Changelog: prometheus-community/helm-charts@prometheus-27.3.0...kube-prometheus-stack-68.4.5

v68.4.4

Compare Source

What's Changed

New Contributors

Full Changelog: prometheus-community/helm-charts@prometheus-27.2.0...kube-prometheus-stack-68.4.4

v68.4.3

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-68.4.2...kube-prometheus-stack-68.4.3

v68.4.2

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@kube-state-metrics-5.29.0...kube-prometheus-stack-68.4.2

v68.4.0

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-68.3.3...kube-prometheus-stack-68.4.0

v68.3.3

Compare Source

What's Changed

New Contributors

Full Changelog: prometheus-community/helm-charts@prometheus-sql-exporter-0.2.2...kube-prometheus-stack-68.3.3

v68.3.2

Compare Source

What's Changed

New Contributors

Full Changelog: prometheus-community/helm-charts@prometheus-nginx-exporter-1.0.1...kube-prometheus-stack-68.3.2

v68.3.0

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-68.2.2...kube-prometheus-stack-68.3.0

v68.2.2

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@prometheus-conntrack-stats-exporter-0.5.15...kube-prometheus-stack-68.2.2

v68.2.1

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-68.2.0...kube-prometheus-stack-68.2.1

v68.2.0

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@prometheus-yet-another-cloudwatch-exporter-0.39.3...kube-prometheus-stack-68.2.0

v68.1.1

Compare Source

What's Changed

New Contributors

Full Changelog: prometheus-community/helm-charts@prometheus-pushgateway-2.17.0...kube-prometheus-stack-68.1.1

v68.1.0

Compare Source

What's Changed

New Contributors

Full Changelog: prometheus-community/helm-charts@prometheus-windows-exporter-0.8.0...kube-prometheus-stack-68.1.0

v68.0.0

Compare Source

What's Changed

Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-67.11.0...kube-prometheus-stack-68.0.0


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR has been generated by Renovate Bot.

snoopy82481-bot bot added the renovate/helm, type/major, size/XS (denotes a PR that changes 0-9 lines, ignoring generated files) and area/kubernetes (changes made in the kubernetes namespace directory) labels, and removed the size/XS label, on Feb 6, 2025
@snoopy82481-bot
Contributor Author

snoopy82481-bot bot commented Feb 6, 2025

--- kubernetes/apps/monitoring/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack HelmRelease: monitoring/kube-prometheus-stack

+++ kubernetes/apps/monitoring/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack HelmRelease: monitoring/kube-prometheus-stack

@@ -13,13 +13,13 @@

     spec:
       chart: kube-prometheus-stack
       sourceRef:
         kind: HelmRepository
         name: prometheus-community
         namespace: flux-system
-      version: 67.11.0
+      version: 69.5.2
   dependsOn:
   - name: rook-ceph-cluster
     namespace: rook-ceph
   - name: cert-manager
     namespace: cert-manager
   install:

@snoopy82481-bot
Contributor Author

snoopy82481-bot bot commented Feb 6, 2025

--- HelmRelease: monitoring/kube-prometheus-stack ConfigMap: monitoring/kube-prometheus-stack-alertmanager-overview

+++ HelmRelease: monitoring/kube-prometheus-stack ConfigMap: monitoring/kube-prometheus-stack-alertmanager-overview

@@ -1,42 +0,0 @@

----
-apiVersion: v1
-kind: ConfigMap
-metadata:
-  namespace: monitoring
-  name: kube-prometheus-stack-alertmanager-overview
-  labels:
-    grafana_dashboard: '1'
-    app: kube-prometheus-stack-grafana
-    app.kubernetes.io/managed-by: Helm
-    app.kubernetes.io/instance: kube-prometheus-stack
-    app.kubernetes.io/part-of: kube-prometheus-stack
-    release: kube-prometheus-stack
-    heritage: Helm
-data:
-  alertmanager-overview.json: '{"graphTooltip":1,"panels":[{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":0},"id":1,"panels":[],"title":"Alerts","type":"row"},{"datasource":{"type":"prometheus","uid":"$datasource"},"description":"current
-    set of alerts stored in the Alertmanager","fieldConfig":{"defaults":{"custom":{"fillOpacity":10,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"none"}},"gridPos":{"h":7,"w":12,"x":0,"y":1},"id":2,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum(alertmanager_alerts{namespace=~\"$namespace\",service=~\"$service\"})
-    by (namespace,service,instance)","intervalFactor":2,"legendFormat":"{{instance}}"}],"title":"Alerts","type":"timeseries"},{"datasource":{"type":"prometheus","uid":"$datasource"},"description":"rate
-    of successful and invalid alerts received by the Alertmanager","fieldConfig":{"defaults":{"custom":{"fillOpacity":10,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"ops"}},"gridPos":{"h":7,"w":12,"x":12,"y":1},"id":3,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum(rate(alertmanager_alerts_received_total{namespace=~\"$namespace\",service=~\"$service\"}[$__rate_interval]))
-    by (namespace,service,instance)","intervalFactor":2,"legendFormat":"{{instance}}
-    Received"},{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum(rate(alertmanager_alerts_invalid_total{namespace=~\"$namespace\",service=~\"$service\"}[$__rate_interval]))
-    by (namespace,service,instance)","intervalFactor":2,"legendFormat":"{{instance}}
-    Invalid"}],"title":"Alerts receive rate","type":"timeseries"},{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":8},"id":4,"panels":[],"title":"Notifications","type":"row"},{"datasource":{"type":"prometheus","uid":"$datasource"},"description":"rate
-    of successful and invalid notifications sent by the Alertmanager","fieldConfig":{"defaults":{"custom":{"fillOpacity":10,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"ops"}},"gridPos":{"h":7,"w":12,"x":0,"y":9},"id":5,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi"}},"pluginVersion":"v11.4.0","repeat":"integration","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum(rate(alertmanager_notifications_total{namespace=~\"$namespace\",service=~\"$service\",
-    integration=\"$integration\"}[$__rate_interval])) by (integration,namespace,service,instance)","intervalFactor":2,"legendFormat":"{{instance}}
-    Total"},{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum(rate(alertmanager_notifications_failed_total{namespace=~\"$namespace\",service=~\"$service\",
-    integration=\"$integration\"}[$__rate_interval])) by (integration,namespace,service,instance)","intervalFactor":2,"legendFormat":"{{instance}}
-    Failed"}],"title":"$integration: Notifications Send Rate","type":"timeseries"},{"datasource":{"type":"prometheus","uid":"$datasource"},"description":"latency
-    of notifications sent by the Alertmanager","fieldConfig":{"defaults":{"custom":{"fillOpacity":10,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"s"}},"gridPos":{"h":7,"w":12,"x":12,"y":9},"id":6,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi"}},"pluginVersion":"v11.4.0","repeat":"integration","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"histogram_quantile(0.99,\n  sum(rate(alertmanager_notification_latency_seconds_bucket{namespace=~\"$namespace\",service=~\"$service\",
-    integration=\"$integration\"}[$__rate_interval])) by (le,namespace,service,instance)\n)\n","intervalFactor":2,"legendFormat":"{{instance}}
-    99th Percentile"},{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"histogram_quantile(0.50,\n  sum(rate(alertmanager_notification_latency_seconds_bucket{namespace=~\"$namespace\",service=~\"$service\",
-    integration=\"$integration\"}[$__rate_interval])) by (le,namespace,service,instance)\n)\n","intervalFactor":2,"legendFormat":"{{instance}}
-    Median"},{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum(rate(alertmanager_notification_latency_seconds_sum{namespace=~\"$namespace\",service=~\"$service\",
-    integration=\"$integration\"}[$__rate_interval])) by (namespace,service,instance)\n/\nsum(rate(alertmanager_notification_latency_seconds_count{namespace=~\"$namespace\",service=~\"$service\",
-    integration=\"$integration\"}[$__rate_interval])) by (namespace,service,instance)\n","intervalFactor":2,"legendFormat":"{{instance}}
-    Average"}],"title":"$integration: Notification Duration","type":"timeseries"}],"schemaVersion":39,"tags":["alertmanager-mixin"],"templating":{"list":[{"current":{"selected":false,"text":"Prometheus","value":"Prometheus"},"hide":0,"label":"Data
-    Source","name":"datasource","query":"prometheus","type":"datasource"},{"current":{"selected":false,"text":"","value":""},"datasource":{"type":"prometheus","uid":"${datasource}"},"includeAll":false,"label":"namespace","name":"namespace","query":"label_values(alertmanager_alerts,
-    namespace)","refresh":2,"sort":1,"type":"query"},{"current":{"selected":false,"text":"","value":""},"datasource":{"type":"prometheus","uid":"${datasource}"},"includeAll":false,"label":"service","name":"service","query":"label_values(alertmanager_alerts,
-    service)","refresh":2,"sort":1,"type":"query"},{"current":{"selected":false,"text":"$__all","value":"$__all"},"datasource":{"type":"prometheus","uid":"${datasource}"},"hide":2,"includeAll":true,"name":"integration","query":"label_values(alertmanager_notifications_total{integration=~\".*\"},
-    integration)","refresh":2,"sort":1,"type":"query"}]},"time":{"from":"now-1h","to":"now"},"timepicker":{"refresh_intervals":["30s"]},"timezone":
-    "utc","title":"Alertmanager / Overview","uid":"alertmanager-overview"}'
-
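
Side note on the dashboard ConfigMaps in this diff: Grafana in kube-prometheus-stack loads them through its dashboard sidecar, which watches for ConfigMaps carrying the grafana_dashboard: '1' label visible above. A custom dashboard can be shipped the same way — a minimal sketch with hypothetical names (a real Grafana dashboard JSON export would go in the data value):

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-team-dashboard          # hypothetical name
  namespace: monitoring
  labels:
    grafana_dashboard: '1'         # picked up by the Grafana dashboard sidecar
data:
  my-team-dashboard.json: |
    {"title": "My Team Dashboard", "schemaVersion": 39, "panels": []}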
--- HelmRelease: monitoring/kube-prometheus-stack ConfigMap: monitoring/kube-prometheus-stack-node-cluster-rsrc-use

+++ HelmRelease: monitoring/kube-prometheus-stack ConfigMap: monitoring/kube-prometheus-stack-node-cluster-rsrc-use

@@ -10,597 +10,41 @@

     app.kubernetes.io/managed-by: Helm
     app.kubernetes.io/instance: kube-prometheus-stack
     app.kubernetes.io/part-of: kube-prometheus-stack
     release: kube-prometheus-stack
     heritage: Helm
 data:
-  node-cluster-rsrc-use.json: |-
-    {
-        "graphTooltip": 1,
-        "panels": [
-            {
-                "collapsed": false,
-                "gridPos": {
-                    "h": 1,
-                    "w": 24,
-                    "x": 0,
-                    "y": 0
-                },
-                "id": 1,
-                "panels": [
+  node-cluster-rsrc-use.json: '{"graphTooltip":1,"panels":[{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":0},"id":1,"panels":[],"title":"CPU","type":"row"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"percentunit"}},"gridPos":{"h":7,"w":12,"x":0,"y":1},"id":2,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"((\n  instance:node_cpu_utilisation:rate5m{job=\"node-exporter\",
+    cluster=\"$cluster\"}\n  *\n  instance:node_num_cpu:sum{job=\"node-exporter\",
+    cluster=\"$cluster\"}\n) != 0 )\n/ scalar(sum(instance:node_num_cpu:sum{job=\"node-exporter\",
+    cluster=\"$cluster\"}))\n","legendFormat":"{{ instance }}"}],"title":"CPU Utilisation","type":"timeseries"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"percentunit"}},"gridPos":{"h":7,"w":12,"x":12,"y":1},"id":3,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"(\n  instance:node_load1_per_cpu:ratio{job=\"node-exporter\",
+    cluster=\"$cluster\"}\n  / scalar(count(instance:node_load1_per_cpu:ratio{job=\"node-exporter\",
+    cluster=\"$cluster\"}))\n)  != 0\n","legendFormat":"{{ instance }}"}],"title":"CPU
+    Saturation (Load1 per CPU)","type":"timeseries"},{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":8},"id":4,"panels":[],"title":"Memory","type":"row"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"percentunit"}},"gridPos":{"h":7,"w":12,"x":0,"y":9},"id":5,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"(\n  instance:node_memory_utilisation:ratio{job=\"node-exporter\",
+    cluster=\"$cluster\"}\n  / scalar(count(instance:node_memory_utilisation:ratio{job=\"node-exporter\",
+    cluster=\"$cluster\"}))\n) != 0\n","legendFormat":"{{ instance }}"}],"title":"Memory
+    Utilisation","type":"timeseries"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"rds"}},"gridPos":{"h":7,"w":12,"x":12,"y":9},"id":6,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"instance:node_vmstat_pgmajfault:rate5m{job=\"node-exporter\",
+    cluster=\"$cluster\"}","legendFormat":"{{ instance }}"}],"title":"Memory Saturation
+    (Major Page Faults)","type":"timeseries"},{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":16},"id":7,"panels":[],"title":"Network","type":"row"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"Bps"},"overrides":[{"matcher":{"id":"byRegexp","options":"/Transmit/"},"properties":[{"id":"custom.transform","value":"negative-Y"}]}]},"gridPos":{"h":7,"w":12,"x":0,"y":17},"id":8,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"instance:node_network_receive_bytes_excluding_lo:rate5m{job=\"node-exporter\",
+    cluster=\"$cluster\"} != 0","legendFormat":"{{ instance }} Receive"},{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"instance:node_network_transmit_bytes_excluding_lo:rate5m{job=\"node-exporter\",
+    cluster=\"$cluster\"} != 0","legendFormat":"{{ instance }} Transmit"}],"title":"Network
+    Utilisation (Bytes Receive/Transmit)","type":"timeseries"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"Bps"},"overrides":[{"matcher":{"id":"byRegexp","options":"/Transmit/"},"properties":[{"id":"custom.transform","value":"negative-Y"}]}]},"gridPos":{"h":7,"w":12,"x":12,"y":17},"id":9,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"instance:node_network_receive_drop_excluding_lo:rate5m{job=\"node-exporter\",
+    cluster=\"$cluster\"} != 0","legendFormat":"{{ instance }} Receive"},{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"instance:node_network_transmit_drop_excluding_lo:rate5m{job=\"node-exporter\",
+    cluster=\"$cluster\"} != 0","legendFormat":"{{ instance }} Transmit"}],"title":"Network
+    Saturation (Drops Receive/Transmit)","type":"timeseries"},{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":24},"id":10,"panels":[],"title":"Disk
+    IO","type":"row"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"percentunit"}},"gridPos":{"h":7,"w":12,"x":0,"y":25},"id":11,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"instance_device:node_disk_io_time_seconds:rate5m{job=\"node-exporter\",
+    cluster=\"$cluster\"}\n/ scalar(count(instance_device:node_disk_io_time_seconds:rate5m{job=\"node-exporter\",
+    cluster=\"$cluster\"}))\n","legendFormat":"{{ instance }} {{device}}"}],"title":"Disk
+    IO Utilisation","type":"timeseries"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"percentunit"}},"gridPos":{"h":7,"w":12,"x":12,"y":25},"id":12,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"instance_device:node_disk_io_time_weighted_seconds:rate5m{job=\"node-exporter\",
+    cluster=\"$cluster\"}\n/ scalar(count(instance_device:node_disk_io_time_weighted_seconds:rate5m{job=\"node-exporter\",
+    cluster=\"$cluster\"}))\n","legendFormat":"{{ instance }} {{device}}"}],"title":"Disk
+    IO Saturation","type":"timeseries"},{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":34},"id":13,"panels":[],"title":"Disk
+    Space","type":"row"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"percentunit"}},"gridPos":{"h":7,"w":24,"x":0,"y":35},"id":14,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum
+    without (device) (\n  max without (fstype, mountpoint) ((\n    node_filesystem_size_bytes{job=\"node-exporter\",
+    fstype!=\"\", mountpoint!=\"\", cluster=\"$cluster\"}\n    -\n    node_filesystem_avail_bytes{job=\"node-exporter\",
+    fstype!=\"\", mountpoint!=\"\", cluster=\"$cluster\"}\n  ) != 0)\n)\n/ scalar(sum(max
+    without (fstype, mountpoint) (node_filesystem_size_bytes{job=\"node-exporter\",
+    fstype!=\"\", mountpoint!=\"\", cluster=\"$cluster\"})))\n","legendFormat":"{{
+    instance }}"}],"title":"Disk Space Utilisation","type":"timeseries"}],"refresh":"30s","schemaVersion":39,"tags":["node-exporter-mixin"],"templating":{"list":[{"name":"datasource","query":"prometheus","type":"datasource"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"hide":2,"includeAll":false,"name":"cluster","query":"label_values(node_time_seconds,
+    cluster)","refresh":2,"sort":1,"type":"query","allValue":".*"}]},"time":{"from":"now-1h","to":"now"},"timezone":
+    "utc","title":"Node Exporter / USE Method / Cluster","uid":"3e97d1d02672cdd0861f4c97c64f89b2"}'
 
-                ],
-                "title": "CPU",
-                "type": "row"
-            },
-            {
-                "datasource": {
-                    "type": "prometheus",
-                    "uid": "${datasource}"
-                },
-                "fieldConfig": {
-                    "defaults": {
-                        "custom": {
-                            "fillOpacity": 100,
-                            "showPoints": "never",
-                            "stacking": {
-                                "mode": "normal"
-                            }
-                        },
-                        "unit": "percentunit"
-                    }
-                },
-                "gridPos": {
-                    "h": 7,
-                    "w": 12,
-                    "x": 0,
[Diff truncated by flux-local]
--- HelmRelease: monitoring/kube-prometheus-stack ConfigMap: monitoring/kube-prometheus-stack-prometheus

+++ HelmRelease: monitoring/kube-prometheus-stack ConfigMap: monitoring/kube-prometheus-stack-prometheus

@@ -10,74 +10,46 @@

     app.kubernetes.io/managed-by: Helm
     app.kubernetes.io/instance: kube-prometheus-stack
     app.kubernetes.io/part-of: kube-prometheus-stack
     release: kube-prometheus-stack
     heritage: Helm
 data:
-  prometheus.json: '{"annotations":{"list":[]},"editable":true,"gnetId":null,"graphTooltip":0,"hideControls":false,"links":[],"refresh":"60s","rows":[{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":1,"id":1,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":1,"links":[],"nullPointMode":"null
-    as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":12,"stack":false,"steppedLine":false,"styles":[{"alias":"Time","dateFormat":"YYYY-MM-DD
-    HH:mm:ss","pattern":"Time","type":"hidden"},{"alias":"Count","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD
-    HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill
-    down","linkUrl":"","pattern":"Value #A","thresholds":[],"type":"hidden","unit":"short"},{"alias":"Uptime","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD
-    HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill
-    down","linkUrl":"","pattern":"Value #B","thresholds":[],"type":"number","unit":"s"},{"alias":"Cluster","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD
-    HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill
-    down","linkUrl":"","pattern":"cluster","thresholds":[],"type":"number","unit":"short"},{"alias":"Instance","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD
-    HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill
-    down","linkUrl":"","pattern":"instance","thresholds":[],"type":"number","unit":"short"},{"alias":"Job","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD
-    HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill
-    down","linkUrl":"","pattern":"job","thresholds":[],"type":"number","unit":"short"},{"alias":"Version","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD
-    HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill
-    down","linkUrl":"","pattern":"version","thresholds":[],"type":"number","unit":"short"},{"alias":"","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD
-    HH:mm:ss","decimals":2,"pattern":"/.*/","thresholds":[],"type":"string","unit":"short"}],"targets":[{"expr":"count
+  prometheus.json: '{"panels":[{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":0},"id":1,"panels":[],"title":"Prometheus
+    Stats","type":"row"},{"datasource":{"type":"prometheus","uid":"$datasource"},"fieldConfig":{"defaults":{"decimals":2,"displayName":"","unit":"short"},"overrides":[{"matcher":{"id":"byName","options":"Time"},"properties":[{"id":"displayName","value":"Time"},{"id":"custom.align","value":null},{"id":"custom.hidden","value":"true"}]},{"matcher":{"id":"byName","options":"cluster"},"properties":[{"id":"custom.align","value":null},{"id":"unit","value":"short"},{"id":"decimals","value":2},{"id":"displayName","value":"Cluster"}]},{"matcher":{"id":"byName","options":"job"},"properties":[{"id":"custom.align","value":null},{"id":"unit","value":"short"},{"id":"decimals","value":2},{"id":"displayName","value":"Job"}]},{"matcher":{"id":"byName","options":"instance"},"properties":[{"id":"displayName","value":"Instance"},{"id":"custom.align","value":null},{"id":"unit","value":"short"},{"id":"decimals","value":2}]},{"matcher":{"id":"byName","options":"version"},"properties":[{"id":"displayName","value":"Version"},{"id":"custom.align","value":null},{"id":"unit","value":"short"},{"id":"decimals","value":2}]},{"matcher":{"id":"byName","options":"Value
+    #A"},"properties":[{"id":"displayName","value":"Count"},{"id":"custom.align","value":null},{"id":"unit","value":"short"},{"id":"decimals","value":2},{"id":"custom.hidden","value":"true"}]},{"matcher":{"id":"byName","options":"Value
+    #B"},"properties":[{"id":"displayName","value":"Uptime"},{"id":"custom.align","value":null},{"id":"unit","value":"s"}]}]},"gridPos":{"h":7,"w":24,"x":0,"y":1},"id":2,"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"count
     by (cluster, job, instance, version) (prometheus_build_info{cluster=~\"$cluster\",
-    job=~\"$job\", instance=~\"$instance\"})","format":"table","instant":true,"legendFormat":"","refId":"A"},{"expr":"max
+    job=~\"$job\", instance=~\"$instance\"})","format":"table","instant":true,"legendFormat":""},{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"max
     by (cluster, job, instance) (time() - process_start_time_seconds{cluster=~\"$cluster\",
-    job=~\"$job\", instance=~\"$instance\"})","format":"table","instant":true,"legendFormat":"","refId":"B"}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"Prometheus
-    Stats","tooltip":{"shared":true,"sort":2,"value_type":"individual"},"transform":"table","type":"table","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"short","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"short","label":null,"logBase":1,"max":null,"min":null,"show":false}]}],"repeat":null,"repeatIteration":null,"repeatRowId":null,"showTitle":true,"title":"Prometheus
-    Stats","titleSize":"h6"},{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":1,"id":2,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":1,"links":[],"nullPointMode":"null
-    as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":6,"stack":false,"steppedLine":false,"targets":[{"expr":"sum(rate(prometheus_target_sync_length_seconds_sum{cluster=~\"$cluster\",job=~\"$job\",instance=~\"$instance\"}[5m]))
-    by (cluster, job, scrape_job, instance) * 1e3","format":"time_series","legendFormat":"{{cluster}}:{{job}}:{{instance}}:{{scrape_job}}","legendLink":null}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"Target
-    Sync","tooltip":{"shared":true,"sort":2,"value_type":"individual"},"type":"graph","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"ms","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"short","label":null,"logBase":1,"max":null,"min":null,"show":false}]},{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":10,"id":3,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":0,"links":[],"nullPointMode":"null
-    as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":6,"stack":true,"steppedLine":false,"targets":[{"expr":"sum
+    job=~\"$job\", instance=~\"$instance\"})","format":"table","instant":true,"legendFormat":""}],"title":"Prometheus
+    Stats","type":"table"},{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":8},"id":3,"panels":[],"title":"Discovery","type":"row"},{"datasource":{"type":"prometheus","uid":"$datasource"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":10,"showPoints":"never"},"min":0,"unit":"ms"}},"gridPos":{"h":7,"w":12,"x":0,"y":9},"id":4,"options":{"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum(rate(prometheus_target_sync_length_seconds_sum{cluster=~\"$cluster\",job=~\"$job\",instance=~\"$instance\"}[5m]))
+    by (cluster, job, scrape_job, instance) * 1e3","format":"time_series","legendFormat":"{{cluster}}:{{job}}:{{instance}}:{{scrape_job}}"}],"title":"Target
+    Sync","type":"timeseries"},{"datasource":{"type":"prometheus","uid":"$datasource"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"lineWidth":0,"showPoints":"never","stacking":{"mode":"normal"}},"min":0,"unit":"short"}},"gridPos":{"h":7,"w":12,"x":12,"y":9},"id":5,"options":{"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum
     by (cluster, job, instance) (prometheus_sd_discovered_targets{cluster=~\"$cluster\",
-    job=~\"$job\",instance=~\"$instance\"})","format":"time_series","legendFormat":"{{cluster}}:{{job}}:{{instance}}","legendLink":null}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"Targets","tooltip":{"shared":true,"sort":2,"value_type":"individual"},"type":"graph","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"short","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"short","label":null,"logBase":1,"max":null,"min":null,"show":false}]}],"repeat":null,"repeatIteration":null,"repeatRowId":null,"showTitle":true,"title":"Discovery","titleSize":"h6"},{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":1,"id":4,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":1,"links":[],"nullPointMode":"null
-    as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":4,"stack":false,"steppedLine":false,"targets":[{"expr":"rate(prometheus_target_interval_length_seconds_sum{cluster=~\"$cluster\",
[Diff truncated by flux-local]
--- HelmRelease: monitoring/kube-prometheus-stack Service: monitoring/kube-prometheus-stack-kube-state-metrics

+++ HelmRelease: monitoring/kube-prometheus-stack Service: monitoring/kube-prometheus-stack-kube-state-metrics

@@ -8,14 +8,12 @@

     app.kubernetes.io/managed-by: Helm
     app.kubernetes.io/component: metrics
     app.kubernetes.io/part-of: kube-state-metrics
     app.kubernetes.io/name: kube-state-metrics
     app.kubernetes.io/instance: kube-prometheus-stack
     release: kube-prometheus-stack
-  annotations:
-    prometheus.io/scrape: 'true'
 spec:
   type: ClusterIP
   ports:
   - name: http
     protocol: TCP
     port: 8080
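
The removed prometheus.io/scrape annotation only affects annotation-based scrape configs; in this chart kube-state-metrics is scraped through a ServiceMonitor that selects this Service by label, so nothing should stop being scraped. Illustrative sketch of such a ServiceMonitor (the chart renders its own, which is not shown in this diff; names here are indicative only):

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-prometheus-stack-kube-state-metrics
  namespace: monitoring
  labels:
    release: kube-prometheus-stack   # matched by the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
      app.kubernetes.io/instance: kube-prometheus-stack
  endpoints:
    - port: http                     # the Service port name shown above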
--- HelmRelease: monitoring/kube-prometheus-stack DaemonSet: monitoring/kube-prometheus-stack-prometheus-node-exporter

+++ HelmRelease: monitoring/kube-prometheus-stack DaemonSet: monitoring/kube-prometheus-stack-prometheus-node-exporter

@@ -40,13 +40,13 @@

         runAsGroup: 65534
         runAsNonRoot: true
         runAsUser: 65534
       serviceAccountName: kube-prometheus-stack-prometheus-node-exporter
       containers:
       - name: node-exporter
-        image: quay.io/prometheus/node-exporter:v1.8.2
+        image: quay.io/prometheus/node-exporter:v1.9.0
         imagePullPolicy: IfNotPresent
         args:
         - --path.procfs=/host/proc
         - --path.sysfs=/host/sys
         - --path.rootfs=/host/root
         - --path.udev.data=/host/root/run/udev/data
--- HelmRelease: monitoring/kube-prometheus-stack Deployment: monitoring/kube-prometheus-stack-kube-state-metrics

+++ HelmRelease: monitoring/kube-prometheus-stack Deployment: monitoring/kube-prometheus-stack-kube-state-metrics

@@ -44,13 +44,13 @@

       - name: kube-state-metrics
         args:
         - --port=8080
         - --resources=certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,leases,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments
         - --metric-labels-allowlist=deployments=[*],persistentvolumeclaims=[*]
         imagePullPolicy: IfNotPresent
-        image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.14.0
+        image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.15.0
         ports:
         - containerPort: 8080
           name: http
         livenessProbe:
           failureThreshold: 3
           httpGet:
--- HelmRelease: monitoring/kube-prometheus-stack Deployment: monitoring/kube-prometheus-stack-operator

+++ HelmRelease: monitoring/kube-prometheus-stack Deployment: monitoring/kube-prometheus-stack-operator

@@ -31,20 +31,20 @@

         app: kube-prometheus-stack-operator
         app.kubernetes.io/name: kube-prometheus-stack-prometheus-operator
         app.kubernetes.io/component: prometheus-operator
     spec:
       containers:
       - name: kube-prometheus-stack
-        image: quay.io/prometheus-operator/prometheus-operator:v0.79.2
+        image: quay.io/prometheus-operator/prometheus-operator:v0.80.1
         imagePullPolicy: IfNotPresent
         args:
         - --kubelet-service=kube-system/kube-prometheus-stack-kubelet
         - --kubelet-endpoints=true
         - --kubelet-endpointslice=false
         - --localhost=127.0.0.1
-        - --prometheus-config-reloader=quay.io/prometheus-operator/prometheus-config-reloader:v0.79.2
+        - --prometheus-config-reloader=quay.io/prometheus-operator/prometheus-config-reloader:v0.80.1
         - --config-reloader-cpu-request=0
         - --config-reloader-cpu-limit=0
         - --config-reloader-memory-request=0
         - --config-reloader-memory-limit=0
         - --thanos-default-base-image=quay.io/thanos/thanos:v0.37.2
         - --secret-field-selector=type!=kubernetes.io/dockercfg,type!=kubernetes.io/service-account-token,type!=helm.sh/release.v1
--- HelmRelease: monitoring/kube-prometheus-stack Prometheus: monitoring/kube-prometheus-stack

+++ HelmRelease: monitoring/kube-prometheus-stack Prometheus: monitoring/kube-prometheus-stack

@@ -10,14 +10,14 @@

     app.kubernetes.io/instance: kube-prometheus-stack
     app.kubernetes.io/part-of: kube-prometheus-stack
     release: kube-prometheus-stack
     heritage: Helm
 spec:
   automountServiceAccountToken: true
-  image: quay.io/prometheus/prometheus:v3.1.0
-  version: v3.1.0
+  image: quay.io/prometheus/prometheus:v3.2.0
+  version: v3.2.0
   externalUrl: http://prometheus...PLACEHOLDER_SECRET_DOMAIN../
   paused: false
   replicas: 1
   shards: 1
   logLevel: info
   logFormat: logfmt
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-etcd

+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-etcd

@@ -26,15 +26,15 @@

         or
           count without (To) (
             sum without (instance, pod) (rate(etcd_network_peer_sent_failures_total{job=~".*etcd.*"}[120s])) > 0.01
           )
         )
         > 0
-      for: 10m
-      labels:
-        severity: critical
+      for: 20m
+      labels:
+        severity: warning
     - alert: etcdInsufficientMembers
       annotations:
         description: 'etcd cluster "{{ $labels.job }}": insufficient members ({{ $value
           }}).'
         summary: etcd cluster has insufficient number of members.
       expr: sum(up{job=~".*etcd.*"} == bool 1) without (instance, pod) < ((count(up{job=~".*etcd.*"})
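
Upstream relaxed this etcd alert (for: 10m → 20m, severity: critical → warning). If the stricter behaviour is preferred, the bundled etcd rule group can be disabled and replaced through chart values — a sketch assuming the standard kube-prometheus-stack values keys defaultRules.rules.etcd and additionalPrometheusRulesMap (the alert name and threshold below are illustrative, based on the expression in this diff):

defaultRules:
  rules:
    etcd: false                      # drop the chart's bundled etcd rule group
additionalPrometheusRulesMap:
  etcd-overrides:                    # hypothetical group name
    groups:
      - name: etcd-overrides
        rules:
          - alert: EtcdPeerSendFailures           # hypothetical alert name
            expr: sum without (instance, pod) (rate(etcd_network_peer_sent_failures_total{job=~".*etcd.*"}[2m])) > 0.01
            for: 10m
            labels:
              severity: critical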
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kube-apiserver-slos

+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kube-apiserver-slos

@@ -14,13 +14,14 @@

 spec:
   groups:
   - name: kube-apiserver-slos
     rules:
     - alert: KubeAPIErrorBudgetBurn
       annotations:
-        description: The API server is burning too much error budget.
+        description: The API server is burning too much error budget on cluster {{
+          $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeapierrorbudgetburn
         summary: The API server is burning too much error budget.
       expr: |-
         sum by (cluster) (apiserver_request:burnrate1h) > (14.40 * 0.01000)
         and on (cluster)
         sum by (cluster) (apiserver_request:burnrate5m) > (14.40 * 0.01000)
@@ -28,13 +29,14 @@

       labels:
         long: 1h
         severity: critical
         short: 5m
     - alert: KubeAPIErrorBudgetBurn
       annotations:
-        description: The API server is burning too much error budget.
+        description: The API server is burning too much error budget on cluster {{
+          $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeapierrorbudgetburn
         summary: The API server is burning too much error budget.
       expr: |-
         sum by (cluster) (apiserver_request:burnrate6h) > (6.00 * 0.01000)
         and on (cluster)
         sum by (cluster) (apiserver_request:burnrate30m) > (6.00 * 0.01000)
@@ -42,13 +44,14 @@

       labels:
         long: 6h
         severity: critical
         short: 30m
     - alert: KubeAPIErrorBudgetBurn
       annotations:
-        description: The API server is burning too much error budget.
+        description: The API server is burning too much error budget on cluster {{
+          $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeapierrorbudgetburn
         summary: The API server is burning too much error budget.
       expr: |-
         sum by (cluster) (apiserver_request:burnrate1d) > (3.00 * 0.01000)
         and on (cluster)
         sum by (cluster) (apiserver_request:burnrate2h) > (3.00 * 0.01000)
@@ -56,13 +59,14 @@

       labels:
         long: 1d
         severity: warning
         short: 2h
     - alert: KubeAPIErrorBudgetBurn
       annotations:
-        description: The API server is burning too much error budget.
+        description: The API server is burning too much error budget on cluster {{
+          $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeapierrorbudgetburn
         summary: The API server is burning too much error budget.
       expr: |-
         sum by (cluster) (apiserver_request:burnrate3d) > (1.00 * 0.01000)
         and on (cluster)
         sum by (cluster) (apiserver_request:burnrate6h) > (1.00 * 0.01000)
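
For context, the four KubeAPIErrorBudgetBurn rules above implement the standard multi-window, multi-burn-rate pattern: the 0.01000 factor is the error budget of a 99% availability SLO, and each long/short window pair (1h/5m, 6h/30m, 1d/2h, 3d/6h) fires when a fixed share of a 30-day budget has been burned within the long window:

$$
\text{budget consumed} = \text{burn rate} \times \frac{\text{window}}{30\,\text{d}}:\qquad
14.4 \times \tfrac{1\,\text{h}}{720\,\text{h}} = 2\%,\quad
6 \times \tfrac{6\,\text{h}}{720\,\text{h}} = 5\%,\quad
3 \times \tfrac{24\,\text{h}}{720\,\text{h}} = 10\%,\quad
1 \times \tfrac{72\,\text{h}}{720\,\text{h}} = 10\%.
$$

The short window in each pair keeps the alert from staying active once the error rate has already recovered.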
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-apps

+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-apps

@@ -15,24 +15,25 @@

   groups:
   - name: kubernetes-apps
     rules:
     - alert: KubePodCrashLooping
       annotations:
         description: 'Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container
-          }}) is in waiting state (reason: "CrashLoopBackOff").'
+          }}) is in waiting state (reason: "CrashLoopBackOff") on cluster {{ $labels.cluster
+          }}.'
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubepodcrashlooping
         summary: Pod is crash looping.
       expr: max_over_time(kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff",
         job="kube-state-metrics", namespace=~".*"}[5m]) >= 1
       for: 15m
       labels:
         severity: warning
     - alert: KubePodNotReady
       annotations:
         description: Pod {{ $labels.namespace }}/{{ $labels.pod }} has been in a non-ready
-          state for longer than 15 minutes.
+          state for longer than 15 minutes on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubepodnotready
         summary: Pod has been in a non-ready state for more than 15 minutes.
       expr: |-
         sum by (namespace, pod, cluster) (
           max by (namespace, pod, cluster) (
             kube_pod_status_phase{job="kube-state-metrics", namespace=~".*", phase=~"Pending|Unknown|Failed"}
@@ -44,26 +45,27 @@

       labels:
         severity: warning
     - alert: KubeDeploymentGenerationMismatch
       annotations:
         description: Deployment generation for {{ $labels.namespace }}/{{ $labels.deployment
           }} does not match, this indicates that the Deployment has failed but has
-          not been rolled back.
+          not been rolled back on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubedeploymentgenerationmismatch
         summary: Deployment generation mismatch due to possible roll-back
       expr: |-
         kube_deployment_status_observed_generation{job="kube-state-metrics", namespace=~".*"}
           !=
         kube_deployment_metadata_generation{job="kube-state-metrics", namespace=~".*"}
       for: 15m
       labels:
         severity: warning
     - alert: KubeDeploymentReplicasMismatch
       annotations:
         description: Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has
-          not matched the expected number of replicas for longer than 15 minutes.
+          not matched the expected number of replicas for longer than 15 minutes on
+          cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubedeploymentreplicasmismatch
         summary: Deployment has not matched the expected number of replicas.
       expr: |-
         (
           kube_deployment_spec_replicas{job="kube-state-metrics", namespace=~".*"}
             >
@@ -76,58 +78,60 @@

       for: 15m
       labels:
         severity: warning
     - alert: KubeDeploymentRolloutStuck
       annotations:
         description: Rollout of deployment {{ $labels.namespace }}/{{ $labels.deployment
-          }} is not progressing for longer than 15 minutes.
+          }} is not progressing for longer than 15 minutes on cluster {{ $labels.cluster
+          }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubedeploymentrolloutstuck
         summary: Deployment rollout is not progressing.
       expr: |-
         kube_deployment_status_condition{condition="Progressing", status="false",job="kube-state-metrics", namespace=~".*"}
         != 0
       for: 15m
       labels:
         severity: warning
     - alert: KubeStatefulSetReplicasMismatch
       annotations:
         description: StatefulSet {{ $labels.namespace }}/{{ $labels.statefulset }}
-          has not matched the expected number of replicas for longer than 15 minutes.
+          has not matched the expected number of replicas for longer than 15 minutes
+          on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubestatefulsetreplicasmismatch
         summary: StatefulSet has not matched the expected number of replicas.
       expr: |-
         (
           kube_statefulset_status_replicas_ready{job="kube-state-metrics", namespace=~".*"}
             !=
-          kube_statefulset_status_replicas{job="kube-state-metrics", namespace=~".*"}
+          kube_statefulset_replicas{job="kube-state-metrics", namespace=~".*"}
         ) and (
           changes(kube_statefulset_status_replicas_updated{job="kube-state-metrics", namespace=~".*"}[10m])
             ==
           0
         )
       for: 15m
       labels:
         severity: warning
     - alert: KubeStatefulSetGenerationMismatch
       annotations:
         description: StatefulSet generation for {{ $labels.namespace }}/{{ $labels.statefulset
           }} does not match, this indicates that the StatefulSet has failed but has
-          not been rolled back.
+          not been rolled back on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubestatefulsetgenerationmismatch
         summary: StatefulSet generation mismatch due to possible roll-back
       expr: |-
         kube_statefulset_status_observed_generation{job="kube-state-metrics", namespace=~".*"}
           !=
         kube_statefulset_metadata_generation{job="kube-state-metrics", namespace=~".*"}
       for: 15m
       labels:
         severity: warning
     - alert: KubeStatefulSetUpdateNotRolledOut
       annotations:
         description: StatefulSet {{ $labels.namespace }}/{{ $labels.statefulset }}
-          update has not been rolled out.
+          update has not been rolled out on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubestatefulsetupdatenotrolledout
         summary: StatefulSet update has not been rolled out.
       expr: |-
         (
           max by (namespace, statefulset, job, cluster) (
             kube_statefulset_status_current_revision{job="kube-state-metrics", namespace=~".*"}
@@ -148,13 +152,14 @@

       for: 15m
       labels:
         severity: warning
     - alert: KubeDaemonSetRolloutStuck
       annotations:
         description: DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset }} has
-          not finished or progressed for at least 15m.
+          not finished or progressed for at least 15m on cluster {{ $labels.cluster
+          }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubedaemonsetrolloutstuck
         summary: DaemonSet rollout is stuck.
       expr: |-
         (
           (
             kube_daemonset_status_current_number_scheduled{job="kube-state-metrics", namespace=~".*"}
@@ -182,70 +187,74 @@

       labels:
         severity: warning
     - alert: KubeContainerWaiting
       annotations:
         description: 'pod/{{ $labels.pod }} in namespace {{ $labels.namespace }} on
           container {{ $labels.container}} has been in waiting state for longer than
-          1 hour. (reason: "{{ $labels.reason }}").'
+          1 hour. (reason: "{{ $labels.reason }}") on cluster {{ $labels.cluster }}.'
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubecontainerwaiting
         summary: Pod container waiting longer than 1 hour
       expr: kube_pod_container_status_waiting_reason{reason!="CrashLoopBackOff", job="kube-state-metrics",
         namespace=~".*"} > 0
       for: 1h
       labels:
         severity: warning
     - alert: KubeDaemonSetNotScheduled
       annotations:
         description: '{{ $value }} Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset
-          }} are not scheduled.'
+          }} are not scheduled on cluster {{ $labels.cluster }}.'
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubedaemonsetnotscheduled
         summary: DaemonSet pods are not scheduled.
       expr: |-
         kube_daemonset_status_desired_number_scheduled{job="kube-state-metrics", namespace=~".*"}
           -
         kube_daemonset_status_current_number_scheduled{job="kube-state-metrics", namespace=~".*"} > 0
       for: 10m
       labels:
         severity: warning
     - alert: KubeDaemonSetMisScheduled
       annotations:
         description: '{{ $value }} Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset
-          }} are running where they are not supposed to run.'
+          }} are running where they are not supposed to run on cluster {{ $labels.cluster
+          }}.'
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubedaemonsetmisscheduled
         summary: DaemonSet pods are misscheduled.
       expr: kube_daemonset_status_number_misscheduled{job="kube-state-metrics", namespace=~".*"}
         > 0
       for: 15m
       labels:
         severity: warning
     - alert: KubeJobNotCompleted
       annotations:
         description: Job {{ $labels.namespace }}/{{ $labels.job_name }} is taking
-          more than {{ "43200" | humanizeDuration }} to complete.
+          more than {{ "43200" | humanizeDuration }} to complete on cluster {{ $labels.cluster
+          }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubejobnotcompleted
         summary: Job did not complete in time
       expr: |-
         time() - max by (namespace, job_name, cluster) (kube_job_status_start_time{job="kube-state-metrics", namespace=~".*"}
           and
         kube_job_status_active{job="kube-state-metrics", namespace=~".*"} > 0) > 43200
       labels:
         severity: warning
     - alert: KubeJobFailed
       annotations:
[Diff truncated by flux-local]
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-resources

+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-resources

@@ -70,13 +70,13 @@

       for: 5m
       labels:
         severity: warning
     - alert: KubeQuotaAlmostFull
       annotations:
         description: Namespace {{ $labels.namespace }} is using {{ $value | humanizePercentage
-          }} of its {{ $labels.resource }} quota.
+          }} of its {{ $labels.resource }} quota on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubequotaalmostfull
         summary: Namespace quota is going to be full.
       expr: |-
         kube_resourcequota{job="kube-state-metrics", type="used"}
           / ignoring(instance, job, type)
         (kube_resourcequota{job="kube-state-metrics", type="hard"} > 0)
@@ -84,13 +84,13 @@

       for: 15m
       labels:
         severity: info
     - alert: KubeQuotaFullyUsed
       annotations:
         description: Namespace {{ $labels.namespace }} is using {{ $value | humanizePercentage
-          }} of its {{ $labels.resource }} quota.
+          }} of its {{ $labels.resource }} quota on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubequotafullyused
         summary: Namespace quota is fully used.
       expr: |-
         kube_resourcequota{job="kube-state-metrics", type="used"}
           / ignoring(instance, job, type)
         (kube_resourcequota{job="kube-state-metrics", type="hard"} > 0)
@@ -98,13 +98,13 @@

       for: 15m
       labels:
         severity: info
     - alert: KubeQuotaExceeded
       annotations:
         description: Namespace {{ $labels.namespace }} is using {{ $value | humanizePercentage
-          }} of its {{ $labels.resource }} quota.
+          }} of its {{ $labels.resource }} quota on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubequotaexceeded
         summary: Namespace quota has exceeded the limits.
       expr: |-
         kube_resourcequota{job="kube-state-metrics", type="used"}
           / ignoring(instance, job, type)
         (kube_resourcequota{job="kube-state-metrics", type="hard"} > 0)
@@ -113,13 +113,13 @@

       labels:
         severity: warning
     - alert: CPUThrottlingHigh
       annotations:
         description: '{{ $value | humanizePercentage }} throttling of CPU in namespace
           {{ $labels.namespace }} for container {{ $labels.container }} in pod {{
-          $labels.pod }}.'
+          $labels.pod }} on cluster {{ $labels.cluster }}.'
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/cputhrottlinghigh
         summary: Processes experience elevated CPU throttling.
       expr: |-
         sum(increase(container_cpu_cfs_throttled_periods_total{container!="", job="kubelet", metrics_path="/metrics/cadvisor", }[5m])) without (id, metrics_path, name, image, endpoint, job, node)
           /
         sum(increase(container_cpu_cfs_periods_total{job="kubelet", metrics_path="/metrics/cadvisor", }[5m])) without (id, metrics_path, name, image, endpoint, job, node)
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system-apiserver

+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system-apiserver

@@ -53,13 +53,14 @@

       for: 10m
       labels:
         severity: warning
     - alert: KubeAggregatedAPIDown
       annotations:
         description: Kubernetes aggregated API {{ $labels.name }}/{{ $labels.namespace
-          }} has been only {{ $value | humanize }}% available over the last 10m.
+          }} has been only {{ $value | humanize }}% available over the last 10m on
+          cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeaggregatedapidown
         summary: Kubernetes aggregated API is down.
       expr: (1 - max by (name, namespace, cluster)(avg_over_time(aggregator_unavailable_apiservice{job="apiserver"}[10m])))
         * 100 < 85
       for: 5m
       labels:
@@ -73,13 +74,13 @@

       for: 15m
       labels:
         severity: critical
     - alert: KubeAPITerminatedRequests
       annotations:
         description: The kubernetes apiserver has terminated {{ $value | humanizePercentage
-          }} of its incoming requests.
+          }} of its incoming requests on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeapiterminatedrequests
         summary: The kubernetes apiserver has terminated {{ $value | humanizePercentage
           }} of its incoming requests.
       expr: sum by (cluster) (rate(apiserver_request_terminations_total{job="apiserver"}[10m]))
         / ( sum by (cluster) (rate(apiserver_request_total{job="apiserver"}[10m]))
         + sum by (cluster) (rate(apiserver_request_terminations_total{job="apiserver"}[10m]))
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system-kubelet

+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system-kubelet

@@ -14,135 +14,149 @@

 spec:
   groups:
   - name: kubernetes-system-kubelet
     rules:
     - alert: KubeNodeNotReady
       annotations:
-        description: '{{ $labels.node }} has been unready for more than 15 minutes.'
+        description: '{{ $labels.node }} has been unready for more than 15 minutes
+          on cluster {{ $labels.cluster }}.'
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubenodenotready
         summary: Node is not ready.
-      expr: kube_node_status_condition{job="kube-state-metrics",condition="Ready",status="true"}
-        == 0
+      expr: |-
+        kube_node_status_condition{job="kube-state-metrics",condition="Ready",status="true"} == 0
+        and on (cluster, node)
+        kube_node_spec_unschedulable{job="kube-state-metrics"} == 0
       for: 15m
       labels:
         severity: warning
     - alert: KubeNodeUnreachable
       annotations:
         description: '{{ $labels.node }} is unreachable and some workloads may be
-          rescheduled.'
+          rescheduled on cluster {{ $labels.cluster }}.'
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubenodeunreachable
         summary: Node is unreachable.
       expr: (kube_node_spec_taint{job="kube-state-metrics",key="node.kubernetes.io/unreachable",effect="NoSchedule"}
         unless ignoring(key,value) kube_node_spec_taint{job="kube-state-metrics",key=~"ToBeDeletedByClusterAutoscaler|cloud.google.com/impending-node-termination|aws-node-termination-handler/spot-itn"})
         == 1
       for: 15m
       labels:
         severity: warning
     - alert: KubeletTooManyPods
       annotations:
         description: Kubelet '{{ $labels.node }}' is running at {{ $value | humanizePercentage
-          }} of its Pod capacity.
+          }} of its Pod capacity on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubelettoomanypods
         summary: Kubelet is running at capacity.
       expr: |-
-        count by (cluster, node) (
-          (kube_pod_status_phase{job="kube-state-metrics",phase="Running"} == 1) * on (instance,pod,namespace,cluster) group_left(node) topk by (instance,pod,namespace,cluster) (1, kube_pod_info{job="kube-state-metrics"})
+        (
+          max by (cluster, instance) (
+            kubelet_running_pods{job="kubelet", metrics_path="/metrics"} > 1
+          )
+          * on (cluster, instance) group_left(node)
+          max by (cluster, instance, node) (
+            kubelet_node_name{job="kubelet", metrics_path="/metrics"}
+          )
         )
-        /
+        / on (cluster, node) group_left()
         max by (cluster, node) (
-          kube_node_status_capacity{job="kube-state-metrics",resource="pods"} != 1
+          kube_node_status_capacity{job="kube-state-metrics", resource="pods"} != 1
         ) > 0.95
       for: 15m
       labels:
         severity: info
     - alert: KubeNodeReadinessFlapping
       annotations:
         description: The readiness status of node {{ $labels.node }} has changed {{
-          $value }} times in the last 15 minutes.
+          $value }} times in the last 15 minutes on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubenodereadinessflapping
         summary: Node readiness status is flapping.
-      expr: sum(changes(kube_node_status_condition{job="kube-state-metrics",status="true",condition="Ready"}[15m]))
-        by (cluster, node) > 2
+      expr: |-
+        sum(changes(kube_node_status_condition{job="kube-state-metrics",status="true",condition="Ready"}[15m])) by (cluster, node) > 2
+        and on (cluster, node)
+        kube_node_spec_unschedulable{job="kube-state-metrics"} == 0
       for: 15m
       labels:
         severity: warning
     - alert: KubeletPlegDurationHigh
       annotations:
         description: The Kubelet Pod Lifecycle Event Generator has a 99th percentile
-          duration of {{ $value }} seconds on node {{ $labels.node }}.
+          duration of {{ $value }} seconds on node {{ $labels.node }} on cluster {{
+          $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletplegdurationhigh
         summary: Kubelet Pod Lifecycle Event Generator is taking too long to relist.
       expr: node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile{quantile="0.99"}
         >= 10
       for: 5m
       labels:
         severity: warning
     - alert: KubeletPodStartUpLatencyHigh
       annotations:
         description: Kubelet Pod startup 99th percentile latency is {{ $value }} seconds
-          on node {{ $labels.node }}.
+          on node {{ $labels.node }} on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletpodstartuplatencyhigh
         summary: Kubelet Pod startup latency is too high.
       expr: histogram_quantile(0.99, sum(rate(kubelet_pod_worker_duration_seconds_bucket{job="kubelet",
         metrics_path="/metrics"}[5m])) by (cluster, instance, le)) * on (cluster,
         instance) group_left(node) kubelet_node_name{job="kubelet", metrics_path="/metrics"}
         > 60
       for: 15m
       labels:
         severity: warning
     - alert: KubeletClientCertificateExpiration
       annotations:
         description: Client certificate for Kubelet on node {{ $labels.node }} expires
-          in {{ $value | humanizeDuration }}.
+          in {{ $value | humanizeDuration }} on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletclientcertificateexpiration
         summary: Kubelet client certificate is about to expire.
       expr: kubelet_certificate_manager_client_ttl_seconds < 604800
       labels:
         severity: warning
     - alert: KubeletClientCertificateExpiration
       annotations:
         description: Client certificate for Kubelet on node {{ $labels.node }} expires
-          in {{ $value | humanizeDuration }}.
+          in {{ $value | humanizeDuration }} on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletclientcertificateexpiration
         summary: Kubelet client certificate is about to expire.
       expr: kubelet_certificate_manager_client_ttl_seconds < 86400
       labels:
         severity: critical
     - alert: KubeletServerCertificateExpiration
       annotations:
         description: Server certificate for Kubelet on node {{ $labels.node }} expires
-          in {{ $value | humanizeDuration }}.
+          in {{ $value | humanizeDuration }} on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletservercertificateexpiration
         summary: Kubelet server certificate is about to expire.
       expr: kubelet_certificate_manager_server_ttl_seconds < 604800
       labels:
         severity: warning
     - alert: KubeletServerCertificateExpiration
       annotations:
         description: Server certificate for Kubelet on node {{ $labels.node }} expires
-          in {{ $value | humanizeDuration }}.
+          in {{ $value | humanizeDuration }} on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletservercertificateexpiration
         summary: Kubelet server certificate is about to expire.
       expr: kubelet_certificate_manager_server_ttl_seconds < 86400
       labels:
         severity: critical
     - alert: KubeletClientCertificateRenewalErrors
       annotations:
         description: Kubelet on node {{ $labels.node }} has failed to renew its client
-          certificate ({{ $value | humanize }} errors in the last 5 minutes).
+          certificate ({{ $value | humanize }} errors in the last 5 minutes) on cluster
+          {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletclientcertificaterenewalerrors
         summary: Kubelet has failed to renew its client certificate.
       expr: increase(kubelet_certificate_manager_client_expiration_renew_errors[5m])
         > 0
       for: 15m
       labels:
         severity: warning
     - alert: KubeletServerCertificateRenewalErrors
       annotations:
         description: Kubelet on node {{ $labels.node }} has failed to renew its server
-          certificate ({{ $value | humanize }} errors in the last 5 minutes).
+          certificate ({{ $value | humanize }} errors in the last 5 minutes) on cluster
+          {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletservercertificaterenewalerrors
         summary: Kubelet has failed to renew its server certificate.
       expr: increase(kubelet_server_expiration_renew_errors[5m]) > 0
       for: 15m
       labels:
         severity: warning
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system

+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system

@@ -15,24 +15,25 @@

   groups:
   - name: kubernetes-system
     rules:
     - alert: KubeVersionMismatch
       annotations:
         description: There are {{ $value }} different semantic versions of Kubernetes
-          components running.
+          components running on cluster {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeversionmismatch
         summary: Different semantic versions of Kubernetes components running.
       expr: count by (cluster) (count by (git_version, cluster) (label_replace(kubernetes_build_info{job!~"kube-dns|coredns"},"git_version","$1","git_version","(v[0-9]*.[0-9]*).*")))
         > 1
       for: 15m
       labels:
         severity: warning
     - alert: KubeClientErrors
       annotations:
         description: Kubernetes API server client '{{ $labels.job }}/{{ $labels.instance
-          }}' is experiencing {{ $value | humanizePercentage }} errors.'
+          }}' is experiencing {{ $value | humanizePercentage }} errors on cluster
+          {{ $labels.cluster }}.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeclienterrors
         summary: Kubernetes API server client is experiencing errors.
       expr: |-
         (sum(rate(rest_client_requests_total{job="apiserver",code=~"5.."}[5m])) by (cluster, instance, job, namespace)
           /
         sum(rate(rest_client_requests_total{job="apiserver"}[5m])) by (cluster, instance, job, namespace))
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-node-exporter

+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-node-exporter

@@ -340,12 +340,24 @@

         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/node/nodesystemdservicefailed
         summary: Systemd service has entered failed state.
       expr: node_systemd_unit_state{job="node-exporter", state="failed"} == 1
       for: 5m
       labels:
         severity: warning
+    - alert: NodeSystemdServiceCrashlooping
+      annotations:
+        description: Systemd service {{ $labels.name }} has been restarted too many
+          times at {{ $labels.instance }} for the last 15 minutes. Please check if
+          the service is crash looping.
+        runbook_url: https://runbooks.prometheus-operator.dev/runbooks/node/nodesystemdservicecrashlooping
+        summary: Systemd service keeps restarting, possibly crash looping.
+      expr: increase(node_systemd_service_restart_total{job="node-exporter"}[5m])
+        > 2
+      for: 15m
+      labels:
+        severity: warning
     - alert: NodeBondingDegraded
       annotations:
         description: Bonding interface {{ $labels.master }} on {{ $labels.instance
           }} is in degraded state due to one or more slave failures.
         runbook_url: https://runbooks.prometheus-operator.dev/runbooks/node/nodebondingdegraded
         summary: Bonding interface is degraded
--- HelmRelease: monitoring/kube-prometheus-stack ServiceMonitor: monitoring/kube-prometheus-stack-kubelet

+++ HelmRelease: monitoring/kube-prometheus-stack ServiceMonitor: monitoring/kube-prometheus-stack-kubelet

@@ -79,12 +79,22 @@

       - __name__
     - action: drop
       regex: container_(file_descriptors|tasks_state|threads_max)
       sourceLabels:
       - __name__
     - action: drop
+      regex: container_memory_failures_total;hierarchy
+      sourceLabels:
+      - __name__
+      - scope
+    - action: drop
+      regex: container_network_.*;(cali|cilium|cni|lxc|nodelocaldns|tunl).*
+      sourceLabels:
+      - __name__
+      - interface
+    - action: drop
       regex: container_spec.*
       sourceLabels:
       - __name__
     - action: drop
       regex: .+;
       sourceLabels:
--- HelmRelease: monitoring/kube-prometheus-stack Job: monitoring/kube-prometheus-stack-admission-create

+++ HelmRelease: monitoring/kube-prometheus-stack Job: monitoring/kube-prometheus-stack-admission-create

@@ -30,13 +30,13 @@

         heritage: Helm
         app.kubernetes.io/name: kube-prometheus-stack-prometheus-operator
         app.kubernetes.io/component: prometheus-operator-webhook
     spec:
       containers:
       - name: create
-        image: registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6
+        image: registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.5.1
         imagePullPolicy: IfNotPresent
         args:
         - create
         - --host=kube-prometheus-stack-operator,kube-prometheus-stack-operator.monitoring.svc
         - --namespace=monitoring
         - --secret-name=kube-prometheus-stack-admission
--- HelmRelease: monitoring/kube-prometheus-stack Job: monitoring/kube-prometheus-stack-admission-patch

+++ HelmRelease: monitoring/kube-prometheus-stack Job: monitoring/kube-prometheus-stack-admission-patch

@@ -30,13 +30,13 @@

         heritage: Helm
         app.kubernetes.io/name: kube-prometheus-stack-prometheus-operator
         app.kubernetes.io/component: prometheus-operator-webhook
     spec:
       containers:
       - name: patch
-        image: registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6
+        image: registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.5.1
         imagePullPolicy: IfNotPresent
         args:
         - patch
         - --webhook-name=kube-prometheus-stack-admission
         - --namespace=monitoring
         - --secret-name=kube-prometheus-stack-admission

@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from bd8fa8c to 7ec9658 Compare February 6, 2025 17:07
@snoopy82481-bot snoopy82481-bot bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Feb 6, 2025
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.0.0 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.1.0 ) Feb 6, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from 7ec9658 to 4b2505f Compare February 6, 2025 23:08
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.1.0 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.1.1 ) Feb 6, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from 4b2505f to 5af78c4 Compare February 7, 2025 07:08
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.1.1 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.1.2 ) Feb 7, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from 5af78c4 to ef7c814 Compare February 7, 2025 08:09
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.1.2 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.2.0 ) Feb 7, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from ef7c814 to 02c2bfe Compare February 10, 2025 09:08
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.2.0 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.2.1 ) Feb 10, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from 02c2bfe to 92601fc Compare February 10, 2025 16:25
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.2.1 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.2.2 ) Feb 10, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from 92601fc to 866461b Compare February 11, 2025 22:07
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.2.2 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.2.3 ) Feb 11, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from 866461b to 91d8ad6 Compare February 12, 2025 23:07
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.2.3 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.2.4 ) Feb 12, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from 91d8ad6 to 1d164ad Compare February 14, 2025 19:06
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.2.4 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.3.0 ) Feb 14, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from 1d164ad to e526c97 Compare February 15, 2025 17:07
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.3.0 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.3.1 ) Feb 15, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from e526c97 to 5852642 Compare February 18, 2025 18:10
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.3.1 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.3.2 ) Feb 18, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from 5852642 to 15fb087 Compare February 20, 2025 16:09
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.3.2 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.3.3 ) Feb 20, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from 15fb087 to 6ace61d Compare February 21, 2025 13:12
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.3.3 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.4.0 ) Feb 21, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from 6ace61d to 245fe05 Compare February 21, 2025 19:07
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.4.0 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.4.1 ) Feb 21, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from 245fe05 to e866b5e Compare February 25, 2025 09:08
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.4.1 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.5.0 ) Feb 25, 2025
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from e866b5e to 6ec9237 Compare February 25, 2025 10:09
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.5.0 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.5.1 ) Feb 25, 2025
| datasource | package               | from    | to     |
| ---------- | --------------------- | ------- | ------ |
| helm       | kube-prometheus-stack | 67.11.0 | 69.5.2 |
@snoopy82481-bot snoopy82481-bot bot force-pushed the renovate/kube-prometheus-stack-69.x branch from 6ec9237 to a8c32b6 Compare February 25, 2025 21:07
@snoopy82481-bot snoopy82481-bot bot changed the title feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.5.1 ) feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.5.2 ) Feb 25, 2025
Labels
area/kubernetes Changes made in the kubernetes namespace directory renovate/helm size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. type/major