feat(helm)!: Update kube-prometheus-stack ( 67.11.0 → 69.5.2 ) #2596
Open
snoopy82481-bot wants to merge 1 commit into main from renovate/kube-prometheus-stack-69.x
Conversation
--- kubernetes/apps/monitoring/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack HelmRelease: monitoring/kube-prometheus-stack
+++ kubernetes/apps/monitoring/kube-prometheus-stack/app Kustomization: flux-system/kube-prometheus-stack HelmRelease: monitoring/kube-prometheus-stack
@@ -13,13 +13,13 @@
spec:
chart: kube-prometheus-stack
sourceRef:
kind: HelmRepository
name: prometheus-community
namespace: flux-system
- version: 67.11.0
+ version: 69.5.2
dependsOn:
- name: rook-ceph-cluster
namespace: rook-ceph
- name: cert-manager
namespace: cert-manager
install: |
--- HelmRelease: monitoring/kube-prometheus-stack ConfigMap: monitoring/kube-prometheus-stack-alertmanager-overview
+++ HelmRelease: monitoring/kube-prometheus-stack ConfigMap: monitoring/kube-prometheus-stack-alertmanager-overview
@@ -1,42 +0,0 @@
----
-apiVersion: v1
-kind: ConfigMap
-metadata:
- namespace: monitoring
- name: kube-prometheus-stack-alertmanager-overview
- labels:
- grafana_dashboard: '1'
- app: kube-prometheus-stack-grafana
- app.kubernetes.io/managed-by: Helm
- app.kubernetes.io/instance: kube-prometheus-stack
- app.kubernetes.io/part-of: kube-prometheus-stack
- release: kube-prometheus-stack
- heritage: Helm
-data:
- alertmanager-overview.json: '{"graphTooltip":1,"panels":[{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":0},"id":1,"panels":[],"title":"Alerts","type":"row"},{"datasource":{"type":"prometheus","uid":"$datasource"},"description":"current
- set of alerts stored in the Alertmanager","fieldConfig":{"defaults":{"custom":{"fillOpacity":10,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"none"}},"gridPos":{"h":7,"w":12,"x":0,"y":1},"id":2,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum(alertmanager_alerts{namespace=~\"$namespace\",service=~\"$service\"})
- by (namespace,service,instance)","intervalFactor":2,"legendFormat":"{{instance}}"}],"title":"Alerts","type":"timeseries"},{"datasource":{"type":"prometheus","uid":"$datasource"},"description":"rate
- of successful and invalid alerts received by the Alertmanager","fieldConfig":{"defaults":{"custom":{"fillOpacity":10,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"ops"}},"gridPos":{"h":7,"w":12,"x":12,"y":1},"id":3,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum(rate(alertmanager_alerts_received_total{namespace=~\"$namespace\",service=~\"$service\"}[$__rate_interval]))
- by (namespace,service,instance)","intervalFactor":2,"legendFormat":"{{instance}}
- Received"},{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum(rate(alertmanager_alerts_invalid_total{namespace=~\"$namespace\",service=~\"$service\"}[$__rate_interval]))
- by (namespace,service,instance)","intervalFactor":2,"legendFormat":"{{instance}}
- Invalid"}],"title":"Alerts receive rate","type":"timeseries"},{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":8},"id":4,"panels":[],"title":"Notifications","type":"row"},{"datasource":{"type":"prometheus","uid":"$datasource"},"description":"rate
- of successful and invalid notifications sent by the Alertmanager","fieldConfig":{"defaults":{"custom":{"fillOpacity":10,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"ops"}},"gridPos":{"h":7,"w":12,"x":0,"y":9},"id":5,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi"}},"pluginVersion":"v11.4.0","repeat":"integration","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum(rate(alertmanager_notifications_total{namespace=~\"$namespace\",service=~\"$service\",
- integration=\"$integration\"}[$__rate_interval])) by (integration,namespace,service,instance)","intervalFactor":2,"legendFormat":"{{instance}}
- Total"},{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum(rate(alertmanager_notifications_failed_total{namespace=~\"$namespace\",service=~\"$service\",
- integration=\"$integration\"}[$__rate_interval])) by (integration,namespace,service,instance)","intervalFactor":2,"legendFormat":"{{instance}}
- Failed"}],"title":"$integration: Notifications Send Rate","type":"timeseries"},{"datasource":{"type":"prometheus","uid":"$datasource"},"description":"latency
- of notifications sent by the Alertmanager","fieldConfig":{"defaults":{"custom":{"fillOpacity":10,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"s"}},"gridPos":{"h":7,"w":12,"x":12,"y":9},"id":6,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi"}},"pluginVersion":"v11.4.0","repeat":"integration","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"histogram_quantile(0.99,\n sum(rate(alertmanager_notification_latency_seconds_bucket{namespace=~\"$namespace\",service=~\"$service\",
- integration=\"$integration\"}[$__rate_interval])) by (le,namespace,service,instance)\n)\n","intervalFactor":2,"legendFormat":"{{instance}}
- 99th Percentile"},{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"histogram_quantile(0.50,\n sum(rate(alertmanager_notification_latency_seconds_bucket{namespace=~\"$namespace\",service=~\"$service\",
- integration=\"$integration\"}[$__rate_interval])) by (le,namespace,service,instance)\n)\n","intervalFactor":2,"legendFormat":"{{instance}}
- Median"},{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum(rate(alertmanager_notification_latency_seconds_sum{namespace=~\"$namespace\",service=~\"$service\",
- integration=\"$integration\"}[$__rate_interval])) by (namespace,service,instance)\n/\nsum(rate(alertmanager_notification_latency_seconds_count{namespace=~\"$namespace\",service=~\"$service\",
- integration=\"$integration\"}[$__rate_interval])) by (namespace,service,instance)\n","intervalFactor":2,"legendFormat":"{{instance}}
- Average"}],"title":"$integration: Notification Duration","type":"timeseries"}],"schemaVersion":39,"tags":["alertmanager-mixin"],"templating":{"list":[{"current":{"selected":false,"text":"Prometheus","value":"Prometheus"},"hide":0,"label":"Data
- Source","name":"datasource","query":"prometheus","type":"datasource"},{"current":{"selected":false,"text":"","value":""},"datasource":{"type":"prometheus","uid":"${datasource}"},"includeAll":false,"label":"namespace","name":"namespace","query":"label_values(alertmanager_alerts,
- namespace)","refresh":2,"sort":1,"type":"query"},{"current":{"selected":false,"text":"","value":""},"datasource":{"type":"prometheus","uid":"${datasource}"},"includeAll":false,"label":"service","name":"service","query":"label_values(alertmanager_alerts,
- service)","refresh":2,"sort":1,"type":"query"},{"current":{"selected":false,"text":"$__all","value":"$__all"},"datasource":{"type":"prometheus","uid":"${datasource}"},"hide":2,"includeAll":true,"name":"integration","query":"label_values(alertmanager_notifications_total{integration=~\".*\"},
- integration)","refresh":2,"sort":1,"type":"query"}]},"time":{"from":"now-1h","to":"now"},"timepicker":{"refresh_intervals":["30s"]},"timezone":
- "utc","title":"Alertmanager / Overview","uid":"alertmanager-overview"}'
-
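Reviewer note: per the hunk above, the alertmanager-overview dashboard ConfigMap appears to be dropped entirely from the rendered output. The underlying data stays queryable; for example, the first panel's alert-count query, copied verbatim from the removed JSON ($namespace and $service are Grafana template variables, not values from this cluster), can be run directly against Prometheus:

  sum(alertmanager_alerts{namespace=~"$namespace",service=~"$service"}) by (namespace,service,instance)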
--- HelmRelease: monitoring/kube-prometheus-stack ConfigMap: monitoring/kube-prometheus-stack-node-cluster-rsrc-use
+++ HelmRelease: monitoring/kube-prometheus-stack ConfigMap: monitoring/kube-prometheus-stack-node-cluster-rsrc-use
@@ -10,597 +10,41 @@
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/instance: kube-prometheus-stack
app.kubernetes.io/part-of: kube-prometheus-stack
release: kube-prometheus-stack
heritage: Helm
data:
- node-cluster-rsrc-use.json: |-
- {
- "graphTooltip": 1,
- "panels": [
- {
- "collapsed": false,
- "gridPos": {
- "h": 1,
- "w": 24,
- "x": 0,
- "y": 0
- },
- "id": 1,
- "panels": [
+ node-cluster-rsrc-use.json: '{"graphTooltip":1,"panels":[{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":0},"id":1,"panels":[],"title":"CPU","type":"row"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"percentunit"}},"gridPos":{"h":7,"w":12,"x":0,"y":1},"id":2,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"((\n instance:node_cpu_utilisation:rate5m{job=\"node-exporter\",
+ cluster=\"$cluster\"}\n *\n instance:node_num_cpu:sum{job=\"node-exporter\",
+ cluster=\"$cluster\"}\n) != 0 )\n/ scalar(sum(instance:node_num_cpu:sum{job=\"node-exporter\",
+ cluster=\"$cluster\"}))\n","legendFormat":"{{ instance }}"}],"title":"CPU Utilisation","type":"timeseries"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"percentunit"}},"gridPos":{"h":7,"w":12,"x":12,"y":1},"id":3,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"(\n instance:node_load1_per_cpu:ratio{job=\"node-exporter\",
+ cluster=\"$cluster\"}\n / scalar(count(instance:node_load1_per_cpu:ratio{job=\"node-exporter\",
+ cluster=\"$cluster\"}))\n) != 0\n","legendFormat":"{{ instance }}"}],"title":"CPU
+ Saturation (Load1 per CPU)","type":"timeseries"},{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":8},"id":4,"panels":[],"title":"Memory","type":"row"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"percentunit"}},"gridPos":{"h":7,"w":12,"x":0,"y":9},"id":5,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"(\n instance:node_memory_utilisation:ratio{job=\"node-exporter\",
+ cluster=\"$cluster\"}\n / scalar(count(instance:node_memory_utilisation:ratio{job=\"node-exporter\",
+ cluster=\"$cluster\"}))\n) != 0\n","legendFormat":"{{ instance }}"}],"title":"Memory
+ Utilisation","type":"timeseries"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"rds"}},"gridPos":{"h":7,"w":12,"x":12,"y":9},"id":6,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"instance:node_vmstat_pgmajfault:rate5m{job=\"node-exporter\",
+ cluster=\"$cluster\"}","legendFormat":"{{ instance }}"}],"title":"Memory Saturation
+ (Major Page Faults)","type":"timeseries"},{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":16},"id":7,"panels":[],"title":"Network","type":"row"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"Bps"},"overrides":[{"matcher":{"id":"byRegexp","options":"/Transmit/"},"properties":[{"id":"custom.transform","value":"negative-Y"}]}]},"gridPos":{"h":7,"w":12,"x":0,"y":17},"id":8,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"instance:node_network_receive_bytes_excluding_lo:rate5m{job=\"node-exporter\",
+ cluster=\"$cluster\"} != 0","legendFormat":"{{ instance }} Receive"},{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"instance:node_network_transmit_bytes_excluding_lo:rate5m{job=\"node-exporter\",
+ cluster=\"$cluster\"} != 0","legendFormat":"{{ instance }} Transmit"}],"title":"Network
+ Utilisation (Bytes Receive/Transmit)","type":"timeseries"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"Bps"},"overrides":[{"matcher":{"id":"byRegexp","options":"/Transmit/"},"properties":[{"id":"custom.transform","value":"negative-Y"}]}]},"gridPos":{"h":7,"w":12,"x":12,"y":17},"id":9,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"instance:node_network_receive_drop_excluding_lo:rate5m{job=\"node-exporter\",
+ cluster=\"$cluster\"} != 0","legendFormat":"{{ instance }} Receive"},{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"instance:node_network_transmit_drop_excluding_lo:rate5m{job=\"node-exporter\",
+ cluster=\"$cluster\"} != 0","legendFormat":"{{ instance }} Transmit"}],"title":"Network
+ Saturation (Drops Receive/Transmit)","type":"timeseries"},{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":24},"id":10,"panels":[],"title":"Disk
+ IO","type":"row"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"percentunit"}},"gridPos":{"h":7,"w":12,"x":0,"y":25},"id":11,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"instance_device:node_disk_io_time_seconds:rate5m{job=\"node-exporter\",
+ cluster=\"$cluster\"}\n/ scalar(count(instance_device:node_disk_io_time_seconds:rate5m{job=\"node-exporter\",
+ cluster=\"$cluster\"}))\n","legendFormat":"{{ instance }} {{device}}"}],"title":"Disk
+ IO Utilisation","type":"timeseries"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"percentunit"}},"gridPos":{"h":7,"w":12,"x":12,"y":25},"id":12,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"instance_device:node_disk_io_time_weighted_seconds:rate5m{job=\"node-exporter\",
+ cluster=\"$cluster\"}\n/ scalar(count(instance_device:node_disk_io_time_weighted_seconds:rate5m{job=\"node-exporter\",
+ cluster=\"$cluster\"}))\n","legendFormat":"{{ instance }} {{device}}"}],"title":"Disk
+ IO Saturation","type":"timeseries"},{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":34},"id":13,"panels":[],"title":"Disk
+ Space","type":"row"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"showPoints":"never","stacking":{"mode":"normal"}},"unit":"percentunit"}},"gridPos":{"h":7,"w":24,"x":0,"y":35},"id":14,"options":{"legend":{"showLegend":false},"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum
+ without (device) (\n max without (fstype, mountpoint) ((\n node_filesystem_size_bytes{job=\"node-exporter\",
+ fstype!=\"\", mountpoint!=\"\", cluster=\"$cluster\"}\n -\n node_filesystem_avail_bytes{job=\"node-exporter\",
+ fstype!=\"\", mountpoint!=\"\", cluster=\"$cluster\"}\n ) != 0)\n)\n/ scalar(sum(max
+ without (fstype, mountpoint) (node_filesystem_size_bytes{job=\"node-exporter\",
+ fstype!=\"\", mountpoint!=\"\", cluster=\"$cluster\"})))\n","legendFormat":"{{
+ instance }}"}],"title":"Disk Space Utilisation","type":"timeseries"}],"refresh":"30s","schemaVersion":39,"tags":["node-exporter-mixin"],"templating":{"list":[{"name":"datasource","query":"prometheus","type":"datasource"},{"datasource":{"type":"prometheus","uid":"${datasource}"},"hide":2,"includeAll":false,"name":"cluster","query":"label_values(node_time_seconds,
+ cluster)","refresh":2,"sort":1,"type":"query","allValue":".*"}]},"time":{"from":"now-1h","to":"now"},"timezone":
+ "utc","title":"Node Exporter / USE Method / Cluster","uid":"3e97d1d02672cdd0861f4c97c64f89b2"}'
- ],
- "title": "CPU",
- "type": "row"
- },
- {
- "datasource": {
- "type": "prometheus",
- "uid": "${datasource}"
- },
- "fieldConfig": {
- "defaults": {
- "custom": {
- "fillOpacity": 100,
- "showPoints": "never",
- "stacking": {
- "mode": "normal"
- }
- },
- "unit": "percentunit"
- }
- },
- "gridPos": {
- "h": 7,
- "w": 12,
- "x": 0,
[Diff truncated by flux-local]
--- HelmRelease: monitoring/kube-prometheus-stack ConfigMap: monitoring/kube-prometheus-stack-prometheus
+++ HelmRelease: monitoring/kube-prometheus-stack ConfigMap: monitoring/kube-prometheus-stack-prometheus
@@ -10,74 +10,46 @@
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/instance: kube-prometheus-stack
app.kubernetes.io/part-of: kube-prometheus-stack
release: kube-prometheus-stack
heritage: Helm
data:
- prometheus.json: '{"annotations":{"list":[]},"editable":true,"gnetId":null,"graphTooltip":0,"hideControls":false,"links":[],"refresh":"60s","rows":[{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":1,"id":1,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":1,"links":[],"nullPointMode":"null
- as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":12,"stack":false,"steppedLine":false,"styles":[{"alias":"Time","dateFormat":"YYYY-MM-DD
- HH:mm:ss","pattern":"Time","type":"hidden"},{"alias":"Count","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD
- HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill
- down","linkUrl":"","pattern":"Value #A","thresholds":[],"type":"hidden","unit":"short"},{"alias":"Uptime","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD
- HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill
- down","linkUrl":"","pattern":"Value #B","thresholds":[],"type":"number","unit":"s"},{"alias":"Cluster","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD
- HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill
- down","linkUrl":"","pattern":"cluster","thresholds":[],"type":"number","unit":"short"},{"alias":"Instance","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD
- HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill
- down","linkUrl":"","pattern":"instance","thresholds":[],"type":"number","unit":"short"},{"alias":"Job","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD
- HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill
- down","linkUrl":"","pattern":"job","thresholds":[],"type":"number","unit":"short"},{"alias":"Version","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD
- HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill
- down","linkUrl":"","pattern":"version","thresholds":[],"type":"number","unit":"short"},{"alias":"","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD
- HH:mm:ss","decimals":2,"pattern":"/.*/","thresholds":[],"type":"string","unit":"short"}],"targets":[{"expr":"count
+ prometheus.json: '{"panels":[{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":0},"id":1,"panels":[],"title":"Prometheus
+ Stats","type":"row"},{"datasource":{"type":"prometheus","uid":"$datasource"},"fieldConfig":{"defaults":{"decimals":2,"displayName":"","unit":"short"},"overrides":[{"matcher":{"id":"byName","options":"Time"},"properties":[{"id":"displayName","value":"Time"},{"id":"custom.align","value":null},{"id":"custom.hidden","value":"true"}]},{"matcher":{"id":"byName","options":"cluster"},"properties":[{"id":"custom.align","value":null},{"id":"unit","value":"short"},{"id":"decimals","value":2},{"id":"displayName","value":"Cluster"}]},{"matcher":{"id":"byName","options":"job"},"properties":[{"id":"custom.align","value":null},{"id":"unit","value":"short"},{"id":"decimals","value":2},{"id":"displayName","value":"Job"}]},{"matcher":{"id":"byName","options":"instance"},"properties":[{"id":"displayName","value":"Instance"},{"id":"custom.align","value":null},{"id":"unit","value":"short"},{"id":"decimals","value":2}]},{"matcher":{"id":"byName","options":"version"},"properties":[{"id":"displayName","value":"Version"},{"id":"custom.align","value":null},{"id":"unit","value":"short"},{"id":"decimals","value":2}]},{"matcher":{"id":"byName","options":"Value
+ #A"},"properties":[{"id":"displayName","value":"Count"},{"id":"custom.align","value":null},{"id":"unit","value":"short"},{"id":"decimals","value":2},{"id":"custom.hidden","value":"true"}]},{"matcher":{"id":"byName","options":"Value
+ #B"},"properties":[{"id":"displayName","value":"Uptime"},{"id":"custom.align","value":null},{"id":"unit","value":"s"}]}]},"gridPos":{"h":7,"w":24,"x":0,"y":1},"id":2,"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"count
by (cluster, job, instance, version) (prometheus_build_info{cluster=~\"$cluster\",
- job=~\"$job\", instance=~\"$instance\"})","format":"table","instant":true,"legendFormat":"","refId":"A"},{"expr":"max
+ job=~\"$job\", instance=~\"$instance\"})","format":"table","instant":true,"legendFormat":""},{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"max
by (cluster, job, instance) (time() - process_start_time_seconds{cluster=~\"$cluster\",
- job=~\"$job\", instance=~\"$instance\"})","format":"table","instant":true,"legendFormat":"","refId":"B"}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"Prometheus
- Stats","tooltip":{"shared":true,"sort":2,"value_type":"individual"},"transform":"table","type":"table","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"short","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"short","label":null,"logBase":1,"max":null,"min":null,"show":false}]}],"repeat":null,"repeatIteration":null,"repeatRowId":null,"showTitle":true,"title":"Prometheus
- Stats","titleSize":"h6"},{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":1,"id":2,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":1,"links":[],"nullPointMode":"null
- as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":6,"stack":false,"steppedLine":false,"targets":[{"expr":"sum(rate(prometheus_target_sync_length_seconds_sum{cluster=~\"$cluster\",job=~\"$job\",instance=~\"$instance\"}[5m]))
- by (cluster, job, scrape_job, instance) * 1e3","format":"time_series","legendFormat":"{{cluster}}:{{job}}:{{instance}}:{{scrape_job}}","legendLink":null}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"Target
- Sync","tooltip":{"shared":true,"sort":2,"value_type":"individual"},"type":"graph","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"ms","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"short","label":null,"logBase":1,"max":null,"min":null,"show":false}]},{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":10,"id":3,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":0,"links":[],"nullPointMode":"null
- as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":6,"stack":true,"steppedLine":false,"targets":[{"expr":"sum
+ job=~\"$job\", instance=~\"$instance\"})","format":"table","instant":true,"legendFormat":""}],"title":"Prometheus
+ Stats","type":"table"},{"collapsed":false,"gridPos":{"h":1,"w":24,"x":0,"y":8},"id":3,"panels":[],"title":"Discovery","type":"row"},{"datasource":{"type":"prometheus","uid":"$datasource"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":10,"showPoints":"never"},"min":0,"unit":"ms"}},"gridPos":{"h":7,"w":12,"x":0,"y":9},"id":4,"options":{"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum(rate(prometheus_target_sync_length_seconds_sum{cluster=~\"$cluster\",job=~\"$job\",instance=~\"$instance\"}[5m]))
+ by (cluster, job, scrape_job, instance) * 1e3","format":"time_series","legendFormat":"{{cluster}}:{{job}}:{{instance}}:{{scrape_job}}"}],"title":"Target
+ Sync","type":"timeseries"},{"datasource":{"type":"prometheus","uid":"$datasource"},"fieldConfig":{"defaults":{"custom":{"fillOpacity":100,"lineWidth":0,"showPoints":"never","stacking":{"mode":"normal"}},"min":0,"unit":"short"}},"gridPos":{"h":7,"w":12,"x":12,"y":9},"id":5,"options":{"tooltip":{"mode":"multi","sort":"desc"}},"pluginVersion":"v11.4.0","targets":[{"datasource":{"type":"prometheus","uid":"$datasource"},"expr":"sum
by (cluster, job, instance) (prometheus_sd_discovered_targets{cluster=~\"$cluster\",
- job=~\"$job\",instance=~\"$instance\"})","format":"time_series","legendFormat":"{{cluster}}:{{job}}:{{instance}}","legendLink":null}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"Targets","tooltip":{"shared":true,"sort":2,"value_type":"individual"},"type":"graph","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"short","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"short","label":null,"logBase":1,"max":null,"min":null,"show":false}]}],"repeat":null,"repeatIteration":null,"repeatRowId":null,"showTitle":true,"title":"Discovery","titleSize":"h6"},{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":1,"id":4,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":1,"links":[],"nullPointMode":"null
- as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":4,"stack":false,"steppedLine":false,"targets":[{"expr":"rate(prometheus_target_interval_length_seconds_sum{cluster=~\"$cluster\",
[Diff truncated by flux-local]
--- HelmRelease: monitoring/kube-prometheus-stack Service: monitoring/kube-prometheus-stack-kube-state-metrics
+++ HelmRelease: monitoring/kube-prometheus-stack Service: monitoring/kube-prometheus-stack-kube-state-metrics
@@ -8,14 +8,12 @@
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/component: metrics
app.kubernetes.io/part-of: kube-state-metrics
app.kubernetes.io/name: kube-state-metrics
app.kubernetes.io/instance: kube-prometheus-stack
release: kube-prometheus-stack
- annotations:
- prometheus.io/scrape: 'true'
spec:
type: ClusterIP
ports:
- name: http
protocol: TCP
port: 8080
--- HelmRelease: monitoring/kube-prometheus-stack DaemonSet: monitoring/kube-prometheus-stack-prometheus-node-exporter
+++ HelmRelease: monitoring/kube-prometheus-stack DaemonSet: monitoring/kube-prometheus-stack-prometheus-node-exporter
@@ -40,13 +40,13 @@
runAsGroup: 65534
runAsNonRoot: true
runAsUser: 65534
serviceAccountName: kube-prometheus-stack-prometheus-node-exporter
containers:
- name: node-exporter
- image: quay.io/prometheus/node-exporter:v1.8.2
+ image: quay.io/prometheus/node-exporter:v1.9.0
imagePullPolicy: IfNotPresent
args:
- --path.procfs=/host/proc
- --path.sysfs=/host/sys
- --path.rootfs=/host/root
- --path.udev.data=/host/root/run/udev/data
--- HelmRelease: monitoring/kube-prometheus-stack Deployment: monitoring/kube-prometheus-stack-kube-state-metrics
+++ HelmRelease: monitoring/kube-prometheus-stack Deployment: monitoring/kube-prometheus-stack-kube-state-metrics
@@ -44,13 +44,13 @@
- name: kube-state-metrics
args:
- --port=8080
- --resources=certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,leases,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments
- --metric-labels-allowlist=deployments=[*],persistentvolumeclaims=[*]
imagePullPolicy: IfNotPresent
- image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.14.0
+ image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.15.0
ports:
- containerPort: 8080
name: http
livenessProbe:
failureThreshold: 3
httpGet:
--- HelmRelease: monitoring/kube-prometheus-stack Deployment: monitoring/kube-prometheus-stack-operator
+++ HelmRelease: monitoring/kube-prometheus-stack Deployment: monitoring/kube-prometheus-stack-operator
@@ -31,20 +31,20 @@
app: kube-prometheus-stack-operator
app.kubernetes.io/name: kube-prometheus-stack-prometheus-operator
app.kubernetes.io/component: prometheus-operator
spec:
containers:
- name: kube-prometheus-stack
- image: quay.io/prometheus-operator/prometheus-operator:v0.79.2
+ image: quay.io/prometheus-operator/prometheus-operator:v0.80.1
imagePullPolicy: IfNotPresent
args:
- --kubelet-service=kube-system/kube-prometheus-stack-kubelet
- --kubelet-endpoints=true
- --kubelet-endpointslice=false
- --localhost=127.0.0.1
- - --prometheus-config-reloader=quay.io/prometheus-operator/prometheus-config-reloader:v0.79.2
+ - --prometheus-config-reloader=quay.io/prometheus-operator/prometheus-config-reloader:v0.80.1
- --config-reloader-cpu-request=0
- --config-reloader-cpu-limit=0
- --config-reloader-memory-request=0
- --config-reloader-memory-limit=0
- --thanos-default-base-image=quay.io/thanos/thanos:v0.37.2
- --secret-field-selector=type!=kubernetes.io/dockercfg,type!=kubernetes.io/service-account-token,type!=helm.sh/release.v1
--- HelmRelease: monitoring/kube-prometheus-stack Prometheus: monitoring/kube-prometheus-stack
+++ HelmRelease: monitoring/kube-prometheus-stack Prometheus: monitoring/kube-prometheus-stack
@@ -10,14 +10,14 @@
app.kubernetes.io/instance: kube-prometheus-stack
app.kubernetes.io/part-of: kube-prometheus-stack
release: kube-prometheus-stack
heritage: Helm
spec:
automountServiceAccountToken: true
- image: quay.io/prometheus/prometheus:v3.1.0
- version: v3.1.0
+ image: quay.io/prometheus/prometheus:v3.2.0
+ version: v3.2.0
externalUrl: http://prometheus...PLACEHOLDER_SECRET_DOMAIN../
paused: false
replicas: 1
shards: 1
logLevel: info
logFormat: logfmt
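Reviewer note: Prometheus moves from v3.1.0 to v3.2.0 along with the chart. A minimal post-rollout sanity check, assuming the default prometheus_build_info metric is scraped (the Prometheus Stats dashboard in this same diff already queries it):

  count by (version) (prometheus_build_info)

Once the new pod is running, this should report only the 3.2.0 build for this instance.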
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-etcd
+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-etcd
@@ -26,15 +26,15 @@
or
count without (To) (
sum without (instance, pod) (rate(etcd_network_peer_sent_failures_total{job=~".*etcd.*"}[120s])) > 0.01
)
)
> 0
- for: 10m
- labels:
- severity: critical
+ for: 20m
+ labels:
+ severity: warning
- alert: etcdInsufficientMembers
annotations:
description: 'etcd cluster "{{ $labels.job }}": insufficient members ({{ $value
}}).'
summary: etcd cluster has insufficient number of members.
expr: sum(up{job=~".*etcd.*"} == bool 1) without (instance, pod) < ((count(up{job=~".*etcd.*"})
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kube-apiserver-slos
+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kube-apiserver-slos
@@ -14,13 +14,14 @@
spec:
groups:
- name: kube-apiserver-slos
rules:
- alert: KubeAPIErrorBudgetBurn
annotations:
- description: The API server is burning too much error budget.
+ description: The API server is burning too much error budget on cluster {{
+ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeapierrorbudgetburn
summary: The API server is burning too much error budget.
expr: |-
sum by (cluster) (apiserver_request:burnrate1h) > (14.40 * 0.01000)
and on (cluster)
sum by (cluster) (apiserver_request:burnrate5m) > (14.40 * 0.01000)
@@ -28,13 +29,14 @@
labels:
long: 1h
severity: critical
short: 5m
- alert: KubeAPIErrorBudgetBurn
annotations:
- description: The API server is burning too much error budget.
+ description: The API server is burning too much error budget on cluster {{
+ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeapierrorbudgetburn
summary: The API server is burning too much error budget.
expr: |-
sum by (cluster) (apiserver_request:burnrate6h) > (6.00 * 0.01000)
and on (cluster)
sum by (cluster) (apiserver_request:burnrate30m) > (6.00 * 0.01000)
@@ -42,13 +44,14 @@
labels:
long: 6h
severity: critical
short: 30m
- alert: KubeAPIErrorBudgetBurn
annotations:
- description: The API server is burning too much error budget.
+ description: The API server is burning too much error budget on cluster {{
+ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeapierrorbudgetburn
summary: The API server is burning too much error budget.
expr: |-
sum by (cluster) (apiserver_request:burnrate1d) > (3.00 * 0.01000)
and on (cluster)
sum by (cluster) (apiserver_request:burnrate2h) > (3.00 * 0.01000)
@@ -56,13 +59,14 @@
labels:
long: 1d
severity: warning
short: 2h
- alert: KubeAPIErrorBudgetBurn
annotations:
- description: The API server is burning too much error budget.
+ description: The API server is burning too much error budget on cluster {{
+ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeapierrorbudgetburn
summary: The API server is burning too much error budget.
expr: |-
sum by (cluster) (apiserver_request:burnrate3d) > (1.00 * 0.01000)
and on (cluster)
sum by (cluster) (apiserver_request:burnrate6h) > (1.00 * 0.01000)
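Reviewer note: only the description template gains a cluster label here; the burn-rate thresholds are unchanged. For context, this is standard multiwindow burn-rate alerting against the 1% error budget visible in the expressions: assuming a 30-day SLO window, a burn rate of 14.40 sustained for 1 hour consumes 14.40 × (1 h / 720 h) = 2% of the budget, hence the 1h/5m pair pages as critical while the 1d/2h pair at burn rate 3.00 only warns.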
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-apps
+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-apps
@@ -15,24 +15,25 @@
groups:
- name: kubernetes-apps
rules:
- alert: KubePodCrashLooping
annotations:
description: 'Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container
- }}) is in waiting state (reason: "CrashLoopBackOff").'
+ }}) is in waiting state (reason: "CrashLoopBackOff") on cluster {{ $labels.cluster
+ }}.'
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubepodcrashlooping
summary: Pod is crash looping.
expr: max_over_time(kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff",
job="kube-state-metrics", namespace=~".*"}[5m]) >= 1
for: 15m
labels:
severity: warning
- alert: KubePodNotReady
annotations:
description: Pod {{ $labels.namespace }}/{{ $labels.pod }} has been in a non-ready
- state for longer than 15 minutes.
+ state for longer than 15 minutes on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubepodnotready
summary: Pod has been in a non-ready state for more than 15 minutes.
expr: |-
sum by (namespace, pod, cluster) (
max by (namespace, pod, cluster) (
kube_pod_status_phase{job="kube-state-metrics", namespace=~".*", phase=~"Pending|Unknown|Failed"}
@@ -44,26 +45,27 @@
labels:
severity: warning
- alert: KubeDeploymentGenerationMismatch
annotations:
description: Deployment generation for {{ $labels.namespace }}/{{ $labels.deployment
}} does not match, this indicates that the Deployment has failed but has
- not been rolled back.
+ not been rolled back on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubedeploymentgenerationmismatch
summary: Deployment generation mismatch due to possible roll-back
expr: |-
kube_deployment_status_observed_generation{job="kube-state-metrics", namespace=~".*"}
!=
kube_deployment_metadata_generation{job="kube-state-metrics", namespace=~".*"}
for: 15m
labels:
severity: warning
- alert: KubeDeploymentReplicasMismatch
annotations:
description: Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has
- not matched the expected number of replicas for longer than 15 minutes.
+ not matched the expected number of replicas for longer than 15 minutes on
+ cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubedeploymentreplicasmismatch
summary: Deployment has not matched the expected number of replicas.
expr: |-
(
kube_deployment_spec_replicas{job="kube-state-metrics", namespace=~".*"}
>
@@ -76,58 +78,60 @@
for: 15m
labels:
severity: warning
- alert: KubeDeploymentRolloutStuck
annotations:
description: Rollout of deployment {{ $labels.namespace }}/{{ $labels.deployment
- }} is not progressing for longer than 15 minutes.
+ }} is not progressing for longer than 15 minutes on cluster {{ $labels.cluster
+ }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubedeploymentrolloutstuck
summary: Deployment rollout is not progressing.
expr: |-
kube_deployment_status_condition{condition="Progressing", status="false",job="kube-state-metrics", namespace=~".*"}
!= 0
for: 15m
labels:
severity: warning
- alert: KubeStatefulSetReplicasMismatch
annotations:
description: StatefulSet {{ $labels.namespace }}/{{ $labels.statefulset }}
- has not matched the expected number of replicas for longer than 15 minutes.
+ has not matched the expected number of replicas for longer than 15 minutes
+ on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubestatefulsetreplicasmismatch
summary: StatefulSet has not matched the expected number of replicas.
expr: |-
(
kube_statefulset_status_replicas_ready{job="kube-state-metrics", namespace=~".*"}
!=
- kube_statefulset_status_replicas{job="kube-state-metrics", namespace=~".*"}
+ kube_statefulset_replicas{job="kube-state-metrics", namespace=~".*"}
) and (
changes(kube_statefulset_status_replicas_updated{job="kube-state-metrics", namespace=~".*"}[10m])
==
0
)
for: 15m
labels:
severity: warning
- alert: KubeStatefulSetGenerationMismatch
annotations:
description: StatefulSet generation for {{ $labels.namespace }}/{{ $labels.statefulset
}} does not match, this indicates that the StatefulSet has failed but has
- not been rolled back.
+ not been rolled back on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubestatefulsetgenerationmismatch
summary: StatefulSet generation mismatch due to possible roll-back
expr: |-
kube_statefulset_status_observed_generation{job="kube-state-metrics", namespace=~".*"}
!=
kube_statefulset_metadata_generation{job="kube-state-metrics", namespace=~".*"}
for: 15m
labels:
severity: warning
- alert: KubeStatefulSetUpdateNotRolledOut
annotations:
description: StatefulSet {{ $labels.namespace }}/{{ $labels.statefulset }}
- update has not been rolled out.
+ update has not been rolled out on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubestatefulsetupdatenotrolledout
summary: StatefulSet update has not been rolled out.
expr: |-
(
max by (namespace, statefulset, job, cluster) (
kube_statefulset_status_current_revision{job="kube-state-metrics", namespace=~".*"}
@@ -148,13 +152,14 @@
for: 15m
labels:
severity: warning
- alert: KubeDaemonSetRolloutStuck
annotations:
description: DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset }} has
- not finished or progressed for at least 15m.
+ not finished or progressed for at least 15m on cluster {{ $labels.cluster
+ }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubedaemonsetrolloutstuck
summary: DaemonSet rollout is stuck.
expr: |-
(
(
kube_daemonset_status_current_number_scheduled{job="kube-state-metrics", namespace=~".*"}
@@ -182,70 +187,74 @@
labels:
severity: warning
- alert: KubeContainerWaiting
annotations:
description: 'pod/{{ $labels.pod }} in namespace {{ $labels.namespace }} on
container {{ $labels.container}} has been in waiting state for longer than
- 1 hour. (reason: "{{ $labels.reason }}").'
+ 1 hour. (reason: "{{ $labels.reason }}") on cluster {{ $labels.cluster }}.'
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubecontainerwaiting
summary: Pod container waiting longer than 1 hour
expr: kube_pod_container_status_waiting_reason{reason!="CrashLoopBackOff", job="kube-state-metrics",
namespace=~".*"} > 0
for: 1h
labels:
severity: warning
- alert: KubeDaemonSetNotScheduled
annotations:
description: '{{ $value }} Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset
- }} are not scheduled.'
+ }} are not scheduled on cluster {{ $labels.cluster }}.'
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubedaemonsetnotscheduled
summary: DaemonSet pods are not scheduled.
expr: |-
kube_daemonset_status_desired_number_scheduled{job="kube-state-metrics", namespace=~".*"}
-
kube_daemonset_status_current_number_scheduled{job="kube-state-metrics", namespace=~".*"} > 0
for: 10m
labels:
severity: warning
- alert: KubeDaemonSetMisScheduled
annotations:
description: '{{ $value }} Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset
- }} are running where they are not supposed to run.'
+ }} are running where they are not supposed to run on cluster {{ $labels.cluster
+ }}.'
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubedaemonsetmisscheduled
summary: DaemonSet pods are misscheduled.
expr: kube_daemonset_status_number_misscheduled{job="kube-state-metrics", namespace=~".*"}
> 0
for: 15m
labels:
severity: warning
- alert: KubeJobNotCompleted
annotations:
description: Job {{ $labels.namespace }}/{{ $labels.job_name }} is taking
- more than {{ "43200" | humanizeDuration }} to complete.
+ more than {{ "43200" | humanizeDuration }} to complete on cluster {{ $labels.cluster
+ }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubejobnotcompleted
summary: Job did not complete in time
expr: |-
time() - max by (namespace, job_name, cluster) (kube_job_status_start_time{job="kube-state-metrics", namespace=~".*"}
and
kube_job_status_active{job="kube-state-metrics", namespace=~".*"} > 0) > 43200
labels:
severity: warning
- alert: KubeJobFailed
annotations:
[Diff truncated by flux-local]
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-resources
+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-resources
@@ -70,13 +70,13 @@
for: 5m
labels:
severity: warning
- alert: KubeQuotaAlmostFull
annotations:
description: Namespace {{ $labels.namespace }} is using {{ $value | humanizePercentage
- }} of its {{ $labels.resource }} quota.
+ }} of its {{ $labels.resource }} quota on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubequotaalmostfull
summary: Namespace quota is going to be full.
expr: |-
kube_resourcequota{job="kube-state-metrics", type="used"}
/ ignoring(instance, job, type)
(kube_resourcequota{job="kube-state-metrics", type="hard"} > 0)
@@ -84,13 +84,13 @@
for: 15m
labels:
severity: info
- alert: KubeQuotaFullyUsed
annotations:
description: Namespace {{ $labels.namespace }} is using {{ $value | humanizePercentage
- }} of its {{ $labels.resource }} quota.
+ }} of its {{ $labels.resource }} quota on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubequotafullyused
summary: Namespace quota is fully used.
expr: |-
kube_resourcequota{job="kube-state-metrics", type="used"}
/ ignoring(instance, job, type)
(kube_resourcequota{job="kube-state-metrics", type="hard"} > 0)
@@ -98,13 +98,13 @@
for: 15m
labels:
severity: info
- alert: KubeQuotaExceeded
annotations:
description: Namespace {{ $labels.namespace }} is using {{ $value | humanizePercentage
- }} of its {{ $labels.resource }} quota.
+ }} of its {{ $labels.resource }} quota on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubequotaexceeded
summary: Namespace quota has exceeded the limits.
expr: |-
kube_resourcequota{job="kube-state-metrics", type="used"}
/ ignoring(instance, job, type)
(kube_resourcequota{job="kube-state-metrics", type="hard"} > 0)
@@ -113,13 +113,13 @@
labels:
severity: warning
- alert: CPUThrottlingHigh
annotations:
description: '{{ $value | humanizePercentage }} throttling of CPU in namespace
{{ $labels.namespace }} for container {{ $labels.container }} in pod {{
- $labels.pod }}.'
+ $labels.pod }} on cluster {{ $labels.cluster }}.'
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/cputhrottlinghigh
summary: Processes experience elevated CPU throttling.
expr: |-
sum(increase(container_cpu_cfs_throttled_periods_total{container!="", job="kubelet", metrics_path="/metrics/cadvisor", }[5m])) without (id, metrics_path, name, image, endpoint, job, node)
/
sum(increase(container_cpu_cfs_periods_total{job="kubelet", metrics_path="/metrics/cadvisor", }[5m])) without (id, metrics_path, name, image, endpoint, job, node)
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system-apiserver
+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system-apiserver
@@ -53,13 +53,14 @@
for: 10m
labels:
severity: warning
- alert: KubeAggregatedAPIDown
annotations:
description: Kubernetes aggregated API {{ $labels.name }}/{{ $labels.namespace
- }} has been only {{ $value | humanize }}% available over the last 10m.
+ }} has been only {{ $value | humanize }}% available over the last 10m on
+ cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeaggregatedapidown
summary: Kubernetes aggregated API is down.
expr: (1 - max by (name, namespace, cluster)(avg_over_time(aggregator_unavailable_apiservice{job="apiserver"}[10m])))
* 100 < 85
for: 5m
labels:
@@ -73,13 +74,13 @@
for: 15m
labels:
severity: critical
- alert: KubeAPITerminatedRequests
annotations:
description: The kubernetes apiserver has terminated {{ $value | humanizePercentage
- }} of its incoming requests.
+ }} of its incoming requests on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeapiterminatedrequests
summary: The kubernetes apiserver has terminated {{ $value | humanizePercentage
}} of its incoming requests.
expr: sum by (cluster) (rate(apiserver_request_terminations_total{job="apiserver"}[10m]))
/ ( sum by (cluster) (rate(apiserver_request_total{job="apiserver"}[10m]))
+ sum by (cluster) (rate(apiserver_request_terminations_total{job="apiserver"}[10m]))
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system-kubelet
+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system-kubelet
@@ -14,135 +14,149 @@
spec:
groups:
- name: kubernetes-system-kubelet
rules:
- alert: KubeNodeNotReady
annotations:
- description: '{{ $labels.node }} has been unready for more than 15 minutes.'
+ description: '{{ $labels.node }} has been unready for more than 15 minutes
+ on cluster {{ $labels.cluster }}.'
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubenodenotready
summary: Node is not ready.
- expr: kube_node_status_condition{job="kube-state-metrics",condition="Ready",status="true"}
- == 0
+ expr: |-
+ kube_node_status_condition{job="kube-state-metrics",condition="Ready",status="true"} == 0
+ and on (cluster, node)
+ kube_node_spec_unschedulable{job="kube-state-metrics"} == 0
for: 15m
labels:
severity: warning
- alert: KubeNodeUnreachable
annotations:
description: '{{ $labels.node }} is unreachable and some workloads may be
- rescheduled.'
+ rescheduled on cluster {{ $labels.cluster }}.'
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubenodeunreachable
summary: Node is unreachable.
expr: (kube_node_spec_taint{job="kube-state-metrics",key="node.kubernetes.io/unreachable",effect="NoSchedule"}
unless ignoring(key,value) kube_node_spec_taint{job="kube-state-metrics",key=~"ToBeDeletedByClusterAutoscaler|cloud.google.com/impending-node-termination|aws-node-termination-handler/spot-itn"})
== 1
for: 15m
labels:
severity: warning
- alert: KubeletTooManyPods
annotations:
description: Kubelet '{{ $labels.node }}' is running at {{ $value | humanizePercentage
- }} of its Pod capacity.
+ }} of its Pod capacity on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubelettoomanypods
summary: Kubelet is running at capacity.
expr: |-
- count by (cluster, node) (
- (kube_pod_status_phase{job="kube-state-metrics",phase="Running"} == 1) * on (instance,pod,namespace,cluster) group_left(node) topk by (instance,pod,namespace,cluster) (1, kube_pod_info{job="kube-state-metrics"})
+ (
+ max by (cluster, instance) (
+ kubelet_running_pods{job="kubelet", metrics_path="/metrics"} > 1
+ )
+ * on (cluster, instance) group_left(node)
+ max by (cluster, instance, node) (
+ kubelet_node_name{job="kubelet", metrics_path="/metrics"}
+ )
)
- /
+ / on (cluster, node) group_left()
max by (cluster, node) (
- kube_node_status_capacity{job="kube-state-metrics",resource="pods"} != 1
+ kube_node_status_capacity{job="kube-state-metrics", resource="pods"} != 1
) > 0.95
for: 15m
labels:
severity: info
- alert: KubeNodeReadinessFlapping
annotations:
description: The readiness status of node {{ $labels.node }} has changed {{
- $value }} times in the last 15 minutes.
+ $value }} times in the last 15 minutes on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubenodereadinessflapping
summary: Node readiness status is flapping.
- expr: sum(changes(kube_node_status_condition{job="kube-state-metrics",status="true",condition="Ready"}[15m]))
- by (cluster, node) > 2
+ expr: |-
+ sum(changes(kube_node_status_condition{job="kube-state-metrics",status="true",condition="Ready"}[15m])) by (cluster, node) > 2
+ and on (cluster, node)
+ kube_node_spec_unschedulable{job="kube-state-metrics"} == 0
for: 15m
labels:
severity: warning
- alert: KubeletPlegDurationHigh
annotations:
description: The Kubelet Pod Lifecycle Event Generator has a 99th percentile
- duration of {{ $value }} seconds on node {{ $labels.node }}.
+ duration of {{ $value }} seconds on node {{ $labels.node }} on cluster {{
+ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletplegdurationhigh
summary: Kubelet Pod Lifecycle Event Generator is taking too long to relist.
expr: node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile{quantile="0.99"}
>= 10
for: 5m
labels:
severity: warning
- alert: KubeletPodStartUpLatencyHigh
annotations:
description: Kubelet Pod startup 99th percentile latency is {{ $value }} seconds
- on node {{ $labels.node }}.
+ on node {{ $labels.node }} on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletpodstartuplatencyhigh
summary: Kubelet Pod startup latency is too high.
expr: histogram_quantile(0.99, sum(rate(kubelet_pod_worker_duration_seconds_bucket{job="kubelet",
metrics_path="/metrics"}[5m])) by (cluster, instance, le)) * on (cluster,
instance) group_left(node) kubelet_node_name{job="kubelet", metrics_path="/metrics"}
> 60
for: 15m
labels:
severity: warning
- alert: KubeletClientCertificateExpiration
annotations:
description: Client certificate for Kubelet on node {{ $labels.node }} expires
- in {{ $value | humanizeDuration }}.
+ in {{ $value | humanizeDuration }} on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletclientcertificateexpiration
summary: Kubelet client certificate is about to expire.
expr: kubelet_certificate_manager_client_ttl_seconds < 604800
labels:
severity: warning
- alert: KubeletClientCertificateExpiration
annotations:
description: Client certificate for Kubelet on node {{ $labels.node }} expires
- in {{ $value | humanizeDuration }}.
+ in {{ $value | humanizeDuration }} on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletclientcertificateexpiration
summary: Kubelet client certificate is about to expire.
expr: kubelet_certificate_manager_client_ttl_seconds < 86400
labels:
severity: critical
- alert: KubeletServerCertificateExpiration
annotations:
description: Server certificate for Kubelet on node {{ $labels.node }} expires
- in {{ $value | humanizeDuration }}.
+ in {{ $value | humanizeDuration }} on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletservercertificateexpiration
summary: Kubelet server certificate is about to expire.
expr: kubelet_certificate_manager_server_ttl_seconds < 604800
labels:
severity: warning
- alert: KubeletServerCertificateExpiration
annotations:
description: Server certificate for Kubelet on node {{ $labels.node }} expires
- in {{ $value | humanizeDuration }}.
+ in {{ $value | humanizeDuration }} on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletservercertificateexpiration
summary: Kubelet server certificate is about to expire.
expr: kubelet_certificate_manager_server_ttl_seconds < 86400
labels:
severity: critical
- alert: KubeletClientCertificateRenewalErrors
annotations:
description: Kubelet on node {{ $labels.node }} has failed to renew its client
- certificate ({{ $value | humanize }} errors in the last 5 minutes).
+ certificate ({{ $value | humanize }} errors in the last 5 minutes) on cluster
+ {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletclientcertificaterenewalerrors
summary: Kubelet has failed to renew its client certificate.
expr: increase(kubelet_certificate_manager_client_expiration_renew_errors[5m])
> 0
for: 15m
labels:
severity: warning
- alert: KubeletServerCertificateRenewalErrors
annotations:
description: Kubelet on node {{ $labels.node }} has failed to renew its server
- certificate ({{ $value | humanize }} errors in the last 5 minutes).
+ certificate ({{ $value | humanize }} errors in the last 5 minutes) on cluster
+ {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeletservercertificaterenewalerrors
summary: Kubelet has failed to renew its server certificate.
expr: increase(kubelet_server_expiration_renew_errors[5m]) > 0
for: 15m
labels:
severity: warning
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system
+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-kubernetes-system
@@ -15,24 +15,25 @@
groups:
- name: kubernetes-system
rules:
- alert: KubeVersionMismatch
annotations:
description: There are {{ $value }} different semantic versions of Kubernetes
- components running.
+ components running on cluster {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeversionmismatch
summary: Different semantic versions of Kubernetes components running.
expr: count by (cluster) (count by (git_version, cluster) (label_replace(kubernetes_build_info{job!~"kube-dns|coredns"},"git_version","$1","git_version","(v[0-9]*.[0-9]*).*")))
> 1
for: 15m
labels:
severity: warning
- alert: KubeClientErrors
annotations:
description: Kubernetes API server client '{{ $labels.job }}/{{ $labels.instance
- }}' is experiencing {{ $value | humanizePercentage }} errors.'
+ }}' is experiencing {{ $value | humanizePercentage }} errors on cluster
+ {{ $labels.cluster }}.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubeclienterrors
summary: Kubernetes API server client is experiencing errors.
expr: |-
(sum(rate(rest_client_requests_total{job="apiserver",code=~"5.."}[5m])) by (cluster, instance, job, namespace)
/
sum(rate(rest_client_requests_total{job="apiserver"}[5m])) by (cluster, instance, job, namespace))
--- HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-node-exporter
+++ HelmRelease: monitoring/kube-prometheus-stack PrometheusRule: monitoring/kube-prometheus-stack-node-exporter
@@ -340,12 +340,24 @@
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/node/nodesystemdservicefailed
summary: Systemd service has entered failed state.
expr: node_systemd_unit_state{job="node-exporter", state="failed"} == 1
for: 5m
labels:
severity: warning
+ - alert: NodeSystemdServiceCrashlooping
+ annotations:
+ description: Systemd service {{ $labels.name }} has been restarted too many
+ times at {{ $labels.instance }} in the last 15 minutes. Please check if
+ the service is crash looping.
+ runbook_url: https://runbooks.prometheus-operator.dev/runbooks/node/nodesystemdservicecrashlooping
+ summary: Systemd service keeps restarting, possibly crash looping.
+ expr: increase(node_systemd_service_restart_total{job="node-exporter"}[5m])
+ > 2
+ for: 15m
+ labels:
+ severity: warning
- alert: NodeBondingDegraded
annotations:
description: Bonding interface {{ $labels.master }} on {{ $labels.instance
}} is in degraded state due to one or more slave failures.
runbook_url: https://runbooks.prometheus-operator.dev/runbooks/node/nodebondingdegraded
summary: Bonding interface is degraded
--- HelmRelease: monitoring/kube-prometheus-stack ServiceMonitor: monitoring/kube-prometheus-stack-kubelet
+++ HelmRelease: monitoring/kube-prometheus-stack ServiceMonitor: monitoring/kube-prometheus-stack-kubelet
@@ -79,12 +79,22 @@
- __name__
- action: drop
regex: container_(file_descriptors|tasks_state|threads_max)
sourceLabels:
- __name__
- action: drop
+ regex: container_memory_failures_total;hierarchy
+ sourceLabels:
+ - __name__
+ - scope
+ - action: drop
+ regex: container_network_.*;(cali|cilium|cni|lxc|nodelocaldns|tunl).*
+ sourceLabels:
+ - __name__
+ - interface
+ - action: drop
regex: container_spec.*
sourceLabels:
- __name__
- action: drop
regex: .+;
sourceLabels:
--- HelmRelease: monitoring/kube-prometheus-stack Job: monitoring/kube-prometheus-stack-admission-create
+++ HelmRelease: monitoring/kube-prometheus-stack Job: monitoring/kube-prometheus-stack-admission-create
@@ -30,13 +30,13 @@
heritage: Helm
app.kubernetes.io/name: kube-prometheus-stack-prometheus-operator
app.kubernetes.io/component: prometheus-operator-webhook
spec:
containers:
- name: create
- image: registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6
+ image: registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.5.1
imagePullPolicy: IfNotPresent
args:
- create
- --host=kube-prometheus-stack-operator,kube-prometheus-stack-operator.monitoring.svc
- --namespace=monitoring
- --secret-name=kube-prometheus-stack-admission
--- HelmRelease: monitoring/kube-prometheus-stack Job: monitoring/kube-prometheus-stack-admission-patch
+++ HelmRelease: monitoring/kube-prometheus-stack Job: monitoring/kube-prometheus-stack-admission-patch
@@ -30,13 +30,13 @@
heritage: Helm
app.kubernetes.io/name: kube-prometheus-stack-prometheus-operator
app.kubernetes.io/component: prometheus-operator-webhook
spec:
containers:
- name: patch
- image: registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6
+ image: registry.k8s.io/ingress-nginx/kube-webhook-certgen:v1.5.1
imagePullPolicy: IfNotPresent
args:
- patch
- --webhook-name=kube-prometheus-stack-admission
- --namespace=monitoring
- --secret-name=kube-prometheus-stack-admission
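The two new drop rules in the kubelet ServiceMonitor above prune high-cardinality cAdvisor series: hierarchy-scoped `container_memory_failures_total` samples and `container_network_*` samples from virtual CNI interfaces. A minimal sketch of the equivalent rules as raw Prometheus `metric_relabel_configs`, assuming an illustrative scrape job name:

```yaml
scrape_configs:
  - job_name: kubelet-cadvisor   # illustrative job name
    metric_relabel_configs:
      # Drop hierarchy-scoped memory failure counters; per-container
      # (scope="container") series are kept.
      - action: drop
        source_labels: [__name__, scope]
        regex: container_memory_failures_total;hierarchy
      # Drop network series for CNI/virtual interfaces (calico, cilium, ...).
      - action: drop
        source_labels: [__name__, interface]
        regex: container_network_.*;(cali|cilium|cni|lxc|nodelocaldns|tunl).*
```

Prometheus joins the listed source labels with the default `;` separator, which is why each regex matches the concatenated `name;label` value.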
| datasource | package               | from    | to     |
| ---------- | --------------------- | ------- | ------ |
| helm       | kube-prometheus-stack | 67.11.0 | 69.5.2 |
Labels: area/kubernetes (changes made in the kubernetes namespace directory), renovate/helm, size/XS (a PR that changes 0-9 lines, ignoring generated files), type/major
This PR contains the following updates: 67.11.0 -> 69.5.2
Release Notes
prometheus-community/helm-charts (kube-prometheus-stack)
v69.5.2
Compare Source
kube-prometheus-stack collects Kubernetes manifests, Grafana dashboards, and Prometheus rules combined with documentation and scripts to provide easy to operate end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus Operator.
What's Changed
Full Changelog: prometheus-community/helm-charts@prom-label-proxy-0.10.2...kube-prometheus-stack-69.5.2
v69.5.1
Compare Source
What's Changed
Full Changelog: prometheus-community/helm-charts@prometheus-operator-crds-18.0.1...kube-prometheus-stack-69.5.1
v69.5.0
Compare Source
What's Changed
Full Changelog: prometheus-community/helm-charts@prometheus-operator-admission-webhook-0.19.0...kube-prometheus-stack-69.5.0
v69.4.1
Compare Source
What's Changed
New Contributors
Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-69.4.0...kube-prometheus-stack-69.4.1
v69.4.0
Compare Source
What's Changed
New Contributors
Full Changelog: prometheus-community/helm-charts@prometheus-rabbitmq-exporter-2.1.1...kube-prometheus-stack-69.4.0
v69.3.3
Compare Source
What's Changed
`tpl` support for additional secret names by @richardtief in https://github.com/prometheus-community/helm-charts/pull/5339
Full Changelog: prometheus-community/helm-charts@prometheus-rabbitmq-exporter-2.1.0...kube-prometheus-stack-69.3.3
v69.3.2
Compare Source
What's Changed
Full Changelog: prometheus-community/helm-charts@prometheus-elasticsearch-exporter-6.6.1...kube-prometheus-stack-69.3.2
v69.3.1
Compare Source
What's Changed
New Contributors
Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-69.3.0...kube-prometheus-stack-69.3.1
v69.3.0
Compare Source
What's Changed
New Contributors
Full Changelog: prometheus-community/helm-charts@prometheus-json-exporter-0.16.0...kube-prometheus-stack-69.3.0
v69.2.4
Compare Source
What's Changed
New Contributors
Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-69.2.3...kube-prometheus-stack-69.2.4
v69.2.3
Compare Source
What's Changed
New Contributors
Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-69.2.2...kube-prometheus-stack-69.2.3
v69.2.2
Compare Source
What's Changed
`tpl` in various prometheus spec fields by @richardtief in https://github.com/prometheus-community/helm-charts/pull/5286
Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-69.2.1...kube-prometheus-stack-69.2.2
v69.2.1
Compare Source
What's Changed
85f7611 by @renovate in https://github.com/prometheus-community/helm-charts/pull/5301
Full Changelog: prometheus-community/helm-charts@kube-state-metrics-5.30.0...kube-prometheus-stack-69.2.1
v69.2.0
Compare Source
What's Changed
Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-69.1.2...kube-prometheus-stack-69.2.0
v69.1.2
Compare Source
What's Changed
New Contributors
Full Changelog: prometheus-community/helm-charts@prometheus-json-exporter-0.15.0...kube-prometheus-stack-69.1.2
v69.1.1
Compare Source
What's Changed
New Contributors
Full Changelog: prometheus-community/helm-charts@prometheus-operator-crds-18.0.0...kube-prometheus-stack-69.1.1
v69.1.0
Compare Source
What's Changed
`--force-conflicts` on CRD job by @onedr0p in https://github.com/prometheus-community/helm-charts/pull/5288
Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-69.0.0...kube-prometheus-stack-69.1.0
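The `--force-conflicts` change above matters for this 67.x → 69.x jump, since major chart versions ship updated CRDs and server-side apply may need to take field ownership. A hedged sketch of driving that from Helm values; the `crds.upgradeJob` keys below are inferred from this changelog entry, not verified against the 69.5.2 values.yaml:

```yaml
# Values sketch: let the chart's pre-upgrade job server-side-apply its CRDs.
# NOTE: crds.upgradeJob and its sub-keys are assumptions based on the
# changelog entry above; confirm the exact keys in the chart's values.yaml.
crds:
  enabled: true
  upgradeJob:
    enabled: true         # run a job that applies updated CRDs on upgrade
    forceConflicts: true  # pass --force-conflicts to the server-side apply
```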
v69.0.0
Compare Source
What's Changed
Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-68.5.0...kube-prometheus-stack-69.0.0
v68.5.0
Compare Source
What's Changed
Full Changelog: prometheus-community/helm-charts@prometheus-operator-admission-webhook-0.18.2...kube-prometheus-stack-68.5.0
v68.4.5
Compare Source
What's Changed
New Contributors
Full Changelog: prometheus-community/helm-charts@prometheus-27.3.0...kube-prometheus-stack-68.4.5
v68.4.4
Compare Source
What's Changed
New Contributors
Full Changelog: prometheus-community/helm-charts@prometheus-27.2.0...kube-prometheus-stack-68.4.4
v68.4.3
Compare Source
What's Changed
Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-68.4.2...kube-prometheus-stack-68.4.3
v68.4.2
Compare Source
What's Changed
Full Changelog: prometheus-community/helm-charts@kube-state-metrics-5.29.0...kube-prometheus-stack-68.4.2
v68.4.0
Compare Source
What's Changed
Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-68.3.3...kube-prometheus-stack-68.4.0
v68.3.3
Compare Source
What's Changed
New Contributors
Full Changelog: prometheus-community/helm-charts@prometheus-sql-exporter-0.2.2...kube-prometheus-stack-68.3.3
v68.3.2
Compare Source
What's Changed
New Contributors
Full Changelog: prometheus-community/helm-charts@prometheus-nginx-exporter-1.0.1...kube-prometheus-stack-68.3.2
v68.3.0
Compare Source
What's Changed
Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-68.2.2...kube-prometheus-stack-68.3.0
v68.2.2
Compare Source
What's Changed
Full Changelog: prometheus-community/helm-charts@prometheus-conntrack-stats-exporter-0.5.15...kube-prometheus-stack-68.2.2
v68.2.1
Compare Source
What's Changed
Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-68.2.0...kube-prometheus-stack-68.2.1
v68.2.0
Compare Source
What's Changed
Full Changelog: prometheus-community/helm-charts@prometheus-yet-another-cloudwatch-exporter-0.39.3...kube-prometheus-stack-68.2.0
v68.1.1
Compare Source
What's Changed
New Contributors
Full Changelog: prometheus-community/helm-charts@prometheus-pushgateway-2.17.0...kube-prometheus-stack-68.1.1
v68.1.0
Compare Source
What's Changed
New Contributors
Full Changelog: prometheus-community/helm-charts@prometheus-windows-exporter-0.8.0...kube-prometheus-stack-68.1.0
v68.0.0
Compare Source
What's Changed
Full Changelog: prometheus-community/helm-charts@kube-prometheus-stack-67.11.0...kube-prometheus-stack-68.0.0
Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR has been generated by Renovate Bot.
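Given the type/major label, a failed upgrade is a realistic outcome, so it can help to pair this bump with automatic remediation on the consuming Flux HelmRelease. A minimal sketch using fields from the `helm.toolkit.fluxcd.io/v2` API (the retry count is illustrative):

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: kube-prometheus-stack
  namespace: monitoring
spec:
  # ...chart spec as in the diff above...
  upgrade:
    remediation:
      retries: 3          # retry a failed upgrade up to 3 times
      strategy: rollback  # roll back to the last successful release on failure
```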