Alert solutions

This document lists possible solutions for alerts that are firing in Sourcegraph's monitoring. If your alert isn't mentioned here, or if the solution doesn't help, contact us for assistance.

To learn more about Sourcegraph's alerting and how to set up alerts, see our alerting guide.

frontend: 99th_percentile_search_request_duration

99th percentile successful search request duration over 5m

Descriptions

warning frontend: 20s+ 99th percentile successful search request duration over 5m

Possible solutions

Identify the exact slow queries by setting "observability.logSlowSearches": 20 in the site configuration, then look for frontend warning logs prefixed with "slow search request".
Check that most repositories are indexed by visiting https://sourcegraph.example.com/site-admin/repositories?filter=needs-index (it should show few or no results).
Kubernetes: Check CPU usage of zoekt-webserver in the indexed-search pod, and consider increasing CPU limits in indexed-search.Deployment.yaml if it regularly hits max CPU utilization.
Docker Compose: Check CPU usage on the Zoekt Web Server dashboard, and consider increasing cpus: of the zoekt-webserver container in docker-compose.yml if it regularly hits max CPU utilization.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_99th_percentile_search_request_duration"
]
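The identifiers accepted by "observability.silenceAlerts" throughout this document follow the pattern <level>_<service>_<alert name>. As a minimal sketch of that convention (the helper name silence_alert_ids is hypothetical and not part of Sourcegraph), the entries could be generated like this:

```python
import json


def silence_alert_ids(service, alert, levels):
    """Build silenceAlerts entries of the form <level>_<service>_<alert>.

    Hypothetical helper: it only mirrors the naming pattern visible in
    this document and does not call any Sourcegraph API.
    """
    allowed = {"warning", "critical"}
    unknown = [level for level in levels if level not in allowed]
    if unknown:
        raise ValueError(f"unknown alert level(s): {unknown}")
    return [f"{level}_{service}_{alert}" for level in levels]


ids = silence_alert_ids(
    "frontend", "99th_percentile_search_request_duration", ["warning"]
)
# Print a JSON object holding the setting, for copying into the site configuration.
print(json.dumps({"observability.silenceAlerts": ids}, indent=2))
```

Passing both levels, for example ["warning", "critical"], yields the two-entry form used by alerts that define both severities.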

_{Managed by the Sourcegraph Search team.}

frontend: 90th_percentile_search_request_duration

90th percentile successful search request duration over 5m

Descriptions

warning frontend: 15s+ 90th percentile successful search request duration over 5m

Possible solutions

Identify the exact slow queries by setting "observability.logSlowSearches": 15 in the site configuration, then look for frontend warning logs prefixed with "slow search request".
Check that most repositories are indexed by visiting https://sourcegraph.example.com/site-admin/repositories?filter=needs-index (it should show few or no results).
Kubernetes: Check CPU usage of zoekt-webserver in the indexed-search pod, and consider increasing CPU limits in indexed-search.Deployment.yaml if it regularly hits max CPU utilization.
Docker Compose: Check CPU usage on the Zoekt Web Server dashboard, and consider increasing cpus: of the zoekt-webserver container in docker-compose.yml if it regularly hits max CPU utilization.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_90th_percentile_search_request_duration"
]
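Once "observability.logSlowSearches" is set, the relevant frontend log lines can be picked out by the "slow search request" phrase. A minimal sketch, assuming the logs are available as plain text lines and that the phrase appears verbatim in them (the sample lines and the helper name slow_search_lines are made up for illustration):

```python
def slow_search_lines(log_lines, marker="slow search request"):
    """Return the log lines that contain the slow-search marker."""
    return [line.rstrip("\n") for line in log_lines if marker in line]


# Made-up sample lines; real frontend log formatting may differ.
sample = [
    't=12:00:01 lvl=warn msg="slow search request" query="foo"\n',
    't=12:00:02 lvl=info msg="serving request"\n',
]
for line in slow_search_lines(sample):
    print(line)
```

In practice the input would be the frontend container's log output rather than an in-memory list.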

_{Managed by the Sourcegraph Search team.}

frontend: hard_timeout_search_responses

hard timeout search responses every 5m

Descriptions

warning frontend: 2%+ hard timeout search responses every 5m for 15m0s
critical frontend: 5%+ hard timeout search responses every 5m for 15m0s
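
The two descriptions above can be read as thresholds that must hold for a sustained period. A minimal sketch, assuming the "for 15m0s" suffix behaves like a Prometheus `for` clause (the condition has to be true at every evaluation across the window); the helper name classify and the sample values are hypothetical:

```python
def classify(samples, warning=2.0, critical=5.0):
    """Return the alert level for a window of hard-timeout percentages.

    `samples` holds one percentage per evaluation across the 15m window.
    Assumption: an alert fires only if every sample in the window meets
    the threshold, similar to a Prometheus `for` clause.
    """
    if not samples:
        return None
    if all(s >= critical for s in samples):
        return "critical"
    if all(s >= warning for s in samples):
        return "warning"
    return None


print(classify([5.5, 6.1, 7.0]))  # every sample is at or above 5%
print(classify([2.4, 6.0, 3.1]))  # every sample is at or above 2%, not all above 5%
print(classify([0.5, 6.0, 3.1]))  # one sample dipped below 2%
```

Under this reading, a short spike above 5% would not page anyone unless the rate stays elevated for the whole window.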

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_hard_timeout_search_responses",
  "critical_frontend_hard_timeout_search_responses"
]

_{Managed by the Sourcegraph Search team.}

frontend: hard_error_search_responses

hard error search responses every 5m

Descriptions

warning frontend: 2%+ hard error search responses every 5m for 15m0s
critical frontend: 5%+ hard error search responses every 5m for 15m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_hard_error_search_responses",
  "critical_frontend_hard_error_search_responses"
]

_{Managed by the Sourcegraph Search team.}

frontend: partial_timeout_search_responses

partial timeout search responses every 5m

Descriptions

warning frontend: 5%+ partial timeout search responses every 5m for 15m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_partial_timeout_search_responses"
]

_{Managed by the Sourcegraph Search team.}

frontend: search_alert_user_suggestions

search alert user suggestions shown every 5m

Descriptions

warning frontend: 5%+ search alert user suggestions shown every 5m for 15m0s

Possible solutions

This indicates your users are making syntax errors or similar user errors.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_search_alert_user_suggestions"
]

_{Managed by the Sourcegraph Search team.}

frontend: page_load_latency

90th percentile page load latency over all routes over 10m

Descriptions

critical frontend: 2s+ 90th percentile page load latency over all routes over 10m

Possible solutions

Confirm that the Sourcegraph frontend has enough CPU/memory using the provisioning panels.
Trace a request to see what the slowest part is: https://docs.sourcegraph.com/admin/observability/tracing
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_frontend_page_load_latency"
]

_{Managed by the Sourcegraph Core application team.}

frontend: blob_load_latency

90th percentile blob load latency over 10m

Descriptions

critical frontend: 5s+ 90th percentile blob load latency over 10m

Possible solutions

Confirm that the Sourcegraph frontend has enough CPU/memory using the provisioning panels.
Trace a request to see what the slowest part is: https://docs.sourcegraph.com/admin/observability/tracing
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_frontend_blob_load_latency"
]

_{Managed by the Sourcegraph Core application team.}

frontend: 99th_percentile_search_codeintel_request_duration

99th percentile code-intel successful search request duration over 5m

Descriptions

warning frontend: 20s+ 99th percentile code-intel successful search request duration over 5m

Possible solutions

Identify the exact slow queries by setting "observability.logSlowSearches": 20 in the site configuration, then look for frontend warning logs prefixed with "slow search request".
Check that most repositories are indexed by visiting https://sourcegraph.example.com/site-admin/repositories?filter=needs-index (it should show few or no results).
Kubernetes: Check CPU usage of zoekt-webserver in the indexed-search pod, and consider increasing CPU limits in indexed-search.Deployment.yaml if it regularly hits max CPU utilization.
Docker Compose: Check CPU usage on the Zoekt Web Server dashboard, and consider increasing cpus: of the zoekt-webserver container in docker-compose.yml if it regularly hits max CPU utilization.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_99th_percentile_search_codeintel_request_duration"
]

_{Managed by the Sourcegraph Code-intelligence team.}

frontend: 90th_percentile_search_codeintel_request_duration

90th percentile code-intel successful search request duration over 5m

Descriptions

warning frontend: 15s+ 90th percentile code-intel successful search request duration over 5m

Possible solutions

Identify the exact slow queries by setting "observability.logSlowSearches": 15 in the site configuration, then look for frontend warning logs prefixed with "slow search request".
Check that most repositories are indexed by visiting https://sourcegraph.example.com/site-admin/repositories?filter=needs-index (it should show few or no results).
Kubernetes: Check CPU usage of zoekt-webserver in the indexed-search pod, and consider increasing CPU limits in indexed-search.Deployment.yaml if it regularly hits max CPU utilization.
Docker Compose: Check CPU usage on the Zoekt Web Server dashboard, and consider increasing cpus: of the zoekt-webserver container in docker-compose.yml if it regularly hits max CPU utilization.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_90th_percentile_search_codeintel_request_duration"
]

_{Managed by the Sourcegraph Code-intelligence team.}

frontend: hard_timeout_search_codeintel_responses

hard timeout search code-intel responses every 5m

Descriptions

warning frontend: 2%+ hard timeout search code-intel responses every 5m for 15m0s
critical frontend: 5%+ hard timeout search code-intel responses every 5m for 15m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_hard_timeout_search_codeintel_responses",
  "critical_frontend_hard_timeout_search_codeintel_responses"
]

_{Managed by the Sourcegraph Code-intelligence team.}

frontend: hard_error_search_codeintel_responses

hard error search code-intel responses every 5m

Descriptions

warning frontend: 2%+ hard error search code-intel responses every 5m for 15m0s
critical frontend: 5%+ hard error search code-intel responses every 5m for 15m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_hard_error_search_codeintel_responses",
  "critical_frontend_hard_error_search_codeintel_responses"
]

_{Managed by the Sourcegraph Code-intelligence team.}

frontend: partial_timeout_search_codeintel_responses

partial timeout search code-intel responses every 5m

Descriptions

warning frontend: 5%+ partial timeout search code-intel responses every 5m for 15m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_partial_timeout_search_codeintel_responses"
]

_{Managed by the Sourcegraph Code-intelligence team.}

frontend: search_codeintel_alert_user_suggestions

search code-intel alert user suggestions shown every 5m

Descriptions

warning frontend: 5%+ search code-intel alert user suggestions shown every 5m for 15m0s

Possible solutions

This indicates a bug in Sourcegraph; please open an issue.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_search_codeintel_alert_user_suggestions"
]

_{Managed by the Sourcegraph Code-intelligence team.}

frontend: 99th_percentile_search_api_request_duration

99th percentile successful search API request duration over 5m

Descriptions

warning frontend: 50s+ 99th percentile successful search API request duration over 5m

Possible solutions

Identify the exact slow queries by setting "observability.logSlowSearches": 20 in the site configuration, then look for frontend warning logs prefixed with "slow search request".
Check that most repositories are indexed by visiting https://sourcegraph.example.com/site-admin/repositories?filter=needs-index (it should show few or no results).
Kubernetes: Check CPU usage of zoekt-webserver in the indexed-search pod, and consider increasing CPU limits in indexed-search.Deployment.yaml if it regularly hits max CPU utilization.
Docker Compose: Check CPU usage on the Zoekt Web Server dashboard, and consider increasing cpus: of the zoekt-webserver container in docker-compose.yml if it regularly hits max CPU utilization.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_99th_percentile_search_api_request_duration"
]

_{Managed by the Sourcegraph Search team.}

frontend: 90th_percentile_search_api_request_duration

90th percentile successful search API request duration over 5m

Descriptions

warning frontend: 40s+ 90th percentile successful search API request duration over 5m

Possible solutions

Identify the exact slow queries by setting "observability.logSlowSearches": 15 in the site configuration, then look for frontend warning logs prefixed with "slow search request".
Check that most repositories are indexed by visiting https://sourcegraph.example.com/site-admin/repositories?filter=needs-index (it should show few or no results).
Kubernetes: Check CPU usage of zoekt-webserver in the indexed-search pod, and consider increasing CPU limits in indexed-search.Deployment.yaml if it regularly hits max CPU utilization.
Docker Compose: Check CPU usage on the Zoekt Web Server dashboard, and consider increasing cpus: of the zoekt-webserver container in docker-compose.yml if it regularly hits max CPU utilization.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_90th_percentile_search_api_request_duration"
]

_{Managed by the Sourcegraph Search team.}

frontend: hard_timeout_search_api_responses

hard timeout search API responses every 5m

Descriptions

warning frontend: 2%+ hard timeout search API responses every 5m for 15m0s
critical frontend: 5%+ hard timeout search API responses every 5m for 15m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_hard_timeout_search_api_responses",
  "critical_frontend_hard_timeout_search_api_responses"
]

_{Managed by the Sourcegraph Search team.}

frontend: hard_error_search_api_responses

hard error search API responses every 5m

Descriptions

warning frontend: 2%+ hard error search API responses every 5m for 15m0s
critical frontend: 5%+ hard error search API responses every 5m for 15m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_hard_error_search_api_responses",
  "critical_frontend_hard_error_search_api_responses"
]

_{Managed by the Sourcegraph Search team.}

frontend: partial_timeout_search_api_responses

partial timeout search API responses every 5m

Descriptions

warning frontend: 5%+ partial timeout search API responses every 5m for 15m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_partial_timeout_search_api_responses"
]

_{Managed by the Sourcegraph Search team.}

frontend: search_api_alert_user_suggestions

search API alert user suggestions shown every 5m

Descriptions

warning frontend: 5%+ search API alert user suggestions shown every 5m

Possible solutions

This indicates your users' search API requests have syntax errors or similar user errors. Check the responses the API sends back for an explanation.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_search_api_alert_user_suggestions"
]

_{Managed by the Sourcegraph Search team.}

frontend: internal_indexed_search_error_responses

internal indexed search error responses every 5m

Descriptions

warning frontend: 5%+ internal indexed search error responses every 5m for 15m0s

Possible solutions

Check the Zoekt Web Server dashboard for indications it might be unhealthy.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_internal_indexed_search_error_responses"
]

_{Managed by the Sourcegraph Search team.}

frontend: internal_unindexed_search_error_responses

internal unindexed search error responses every 5m

Descriptions

warning frontend: 5%+ internal unindexed search error responses every 5m for 15m0s

Possible solutions

Check the Searcher dashboard for indications it might be unhealthy.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_internal_unindexed_search_error_responses"
]

_{Managed by the Sourcegraph Search team.}

frontend: internal_api_error_responses

internal API error responses every 5m by route

Descriptions

warning frontend: 5%+ internal API error responses every 5m by route for 15m0s

Possible solutions

This may not be a substantial issue; check the frontend logs for potential causes.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_internal_api_error_responses"
]

_{Managed by the Sourcegraph Core application team.}

frontend: 99th_percentile_gitserver_duration

99th percentile successful gitserver query duration over 5m

Descriptions

warning frontend: 20s+ 99th percentile successful gitserver query duration over 5m

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_99th_percentile_gitserver_duration"
]

_{Managed by the Sourcegraph Core application team.}

frontend: gitserver_error_responses

gitserver error responses every 5m

Descriptions

warning frontend: 5%+ gitserver error responses every 5m for 15m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_gitserver_error_responses"
]

_{Managed by the Sourcegraph Core application team.}

frontend: observability_test_alert_warning

warning test alert metric

Descriptions

warning frontend: 1+ warning test alert metric

Possible solutions

This alert is triggered via the triggerObservabilityTestAlert GraphQL endpoint, and will automatically resolve itself.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_observability_test_alert_warning"
]
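
To fire this test alert deliberately, a request to the triggerObservabilityTestAlert GraphQL mutation is needed. The following is a minimal sketch that only builds the request body. The `level` string argument and the `alwaysNil` return field are assumptions about the schema and should be checked against your instance's GraphQL API; the helper name build_test_alert_request is hypothetical.

```python
import json


def build_test_alert_request(level):
    """Build a GraphQL request body for triggerObservabilityTestAlert.

    Assumption: the mutation takes a `level` string argument and returns
    a placeholder `alwaysNil` field. Verify against your instance's
    GraphQL schema before relying on this.
    """
    if level not in ("warning", "critical"):
        raise ValueError("level must be 'warning' or 'critical'")
    query = (
        "mutation($level: String!) {"
        " triggerObservabilityTestAlert(level: $level) { alwaysNil } }"
    )
    return {"query": query, "variables": {"level": level}}


body = build_test_alert_request("warning")
print(json.dumps(body, indent=2))
```

Sending the request is intentionally left out; it would typically be posted to the instance's GraphQL API using a site admin's access token.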

_{Managed by the Sourcegraph Distribution team.}

frontend: observability_test_alert_critical

critical test alert metric

Descriptions

critical frontend: 1+ critical test alert metric

Possible solutions

This alert is triggered via the triggerObservabilityTestAlert GraphQL endpoint, and will automatically resolve itself.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_frontend_observability_test_alert_critical"
]

_{Managed by the Sourcegraph Distribution team.}

frontend: mean_blocked_seconds_per_conn_request

mean blocked seconds per conn request

Descriptions

warning frontend: 0.05s+ mean blocked seconds per conn request for 5m0s
critical frontend: 0.1s+ mean blocked seconds per conn request for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_mean_blocked_seconds_per_conn_request",
  "critical_frontend_mean_blocked_seconds_per_conn_request"
]

_{Managed by the Sourcegraph Core application team.}

frontend: container_cpu_usage

container cpu usage total (1m average) across all cores by instance

Descriptions

warning frontend: 99%+ container cpu usage total (1m average) across all cores by instance

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the (frontend|sourcegraph-frontend) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_container_cpu_usage"
]

_{Managed by the Sourcegraph Core application team.}

frontend: container_memory_usage

container memory usage by instance

Descriptions

warning frontend: 99%+ container memory usage by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the (frontend|sourcegraph-frontend) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_container_memory_usage"
]

_{Managed by the Sourcegraph Core application team.}

frontend: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning frontend: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the (frontend|sourcegraph-frontend) service.
Docker Compose: Consider increasing cpus: of the (frontend|sourcegraph-frontend) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_provisioning_container_cpu_usage_long_term"
]

_{Managed by the Sourcegraph Core application team.}

frontend: provisioning_container_memory_usage_long_term

container memory usage (1d maximum) by instance

Descriptions

warning frontend: 80%+ container memory usage (1d maximum) by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing memory limits in the Deployment.yaml for the (frontend|sourcegraph-frontend) service.
Docker Compose: Consider increasing memory: of the (frontend|sourcegraph-frontend) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_provisioning_container_memory_usage_long_term"
]

_{Managed by the Sourcegraph Core application team.}

frontend: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning frontend: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the (frontend|sourcegraph-frontend) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Core application team.}

frontend: provisioning_container_memory_usage_short_term

container memory usage (5m maximum) by instance

Descriptions

warning frontend: 90%+ container memory usage (5m maximum) by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the (frontend|sourcegraph-frontend) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_provisioning_container_memory_usage_short_term"
]

_{Managed by the Sourcegraph Core application team.}

frontend: go_goroutines

maximum active goroutines

Descriptions

warning frontend: 10000+ maximum active goroutines for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_go_goroutines"
]

_{Managed by the Sourcegraph Core application team.}

frontend: go_gc_duration_seconds

maximum go garbage collection duration

Descriptions

warning frontend: 2s+ maximum go garbage collection duration

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_go_gc_duration_seconds"
]

_{Managed by the Sourcegraph Core application team.}

frontend: pods_available_percentage

percentage pods available

Descriptions

critical frontend: less than 90% percentage pods available for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_frontend_pods_available_percentage"
]

_{Managed by the Sourcegraph Core application team.}

frontend: mean_successful_sentinel_duration_5m

mean successful sentinel search duration over 5m

Descriptions

warning frontend: 5s+ mean successful sentinel search duration over 5m for 15m0s
critical frontend: 8s+ mean successful sentinel search duration over 5m for 30m0s

Possible solutions

Look at the breakdown by query to determine whether a specific query type is affected.
Check for high CPU usage on zoekt-webserver.
Check Honeycomb for unusual activity.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_mean_successful_sentinel_duration_5m",
  "critical_frontend_mean_successful_sentinel_duration_5m"
]

_{Managed by the Sourcegraph Search team.}

frontend: mean_sentinel_stream_latency_5m

mean sentinel stream latency over 5m

Descriptions

warning frontend: 2s+ mean sentinel stream latency over 5m for 15m0s
critical frontend: 3s+ mean sentinel stream latency over 5m for 30m0s

Possible solutions

Look at the breakdown by query to determine whether a specific query type is affected.
Check for high CPU usage on zoekt-webserver.
Check Honeycomb for unusual activity.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_mean_sentinel_stream_latency_5m",
  "critical_frontend_mean_sentinel_stream_latency_5m"
]

_{Managed by the Sourcegraph Search team.}

frontend: 90th_percentile_successful_sentinel_duration_5m

90th percentile successful sentinel search duration over 5m

Descriptions

warning frontend: 5s+ 90th percentile successful sentinel search duration over 5m for 15m0s
critical frontend: 10s+ 90th percentile successful sentinel search duration over 5m for 30m0s

Possible solutions

Look at the breakdown by query to determine whether a specific query type is affected.
Check for high CPU usage on zoekt-webserver.
Check Honeycomb for unusual activity.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_90th_percentile_successful_sentinel_duration_5m",
  "critical_frontend_90th_percentile_successful_sentinel_duration_5m"
]

_{Managed by the Sourcegraph Search team.}

frontend: 90th_percentile_sentinel_stream_latency_5m

90th percentile sentinel stream latency over 5m

Descriptions

warning frontend: 4s+ 90th percentile sentinel stream latency over 5m for 15m0s
critical frontend: 6s+ 90th percentile sentinel stream latency over 5m for 30m0s

Possible solutions

Look at the breakdown by query to determine whether a specific query type is affected.
Check for high CPU usage on zoekt-webserver.
Check Honeycomb for unusual activity.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_frontend_90th_percentile_sentinel_stream_latency_5m",
  "critical_frontend_90th_percentile_sentinel_stream_latency_5m"
]

_{Managed by the Sourcegraph Search team.}

gitserver: disk_space_remaining

disk space remaining by instance

Descriptions

warning gitserver: less than 25% disk space remaining by instance
critical gitserver: less than 15% disk space remaining by instance

Possible solutions

Provision more disk space: Sourcegraph will begin deleting least-used repository clones at 10% disk space remaining, which may result in decreased performance, users having to wait for repositories to clone, etc.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_gitserver_disk_space_remaining",
  "critical_gitserver_disk_space_remaining"
]
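
As a rough illustration of the thresholds above, the following sketch measures the free space on a path and maps it to the 25%, 15%, and 10% levels. The path passed to `percent_free` is a placeholder (point it at the volume that holds gitserver's repositories), the function names are hypothetical, and treating exactly 10% as the cleanup level is an assumption.

```python
import shutil


def percent_free(path: str) -> float:
    """Return the percentage of disk space still free on the volume holding path."""
    usage = shutil.disk_usage(path)
    return usage.free * 100 / usage.total


def classify_disk_space(pct_free: float) -> str:
    """Map a free-space percentage to the levels described in this section."""
    if pct_free <= 10:
        return "cleanup"   # least-used repository clones begin to be deleted
    if pct_free < 15:
        return "critical"
    if pct_free < 25:
        return "warning"
    return "ok"


# "." is a placeholder for the gitserver data volume.
current = classify_disk_space(percent_free("."))
```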

_{Managed by the Sourcegraph Core application team.}

gitserver: running_git_commands

git commands running on each gitserver instance

Descriptions

warning gitserver: 50+ git commands running on each gitserver instance for 2m0s
critical gitserver: 100+ git commands running on each gitserver instance for 5m0s

Possible solutions

Check if the problem may be an intermittent and temporary peak using the "Container monitoring" section at the bottom of the Git Server dashboard.
Single-container deployments: Consider upgrading to a Docker Compose deployment, which offers better scalability and resource isolation.
Kubernetes and Docker Compose: Check that you are running a similar number of git server replicas and that their CPU/memory limits are allocated according to what is shown in the Sourcegraph resource estimator.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_gitserver_running_git_commands",
  "critical_gitserver_running_git_commands"
]
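
The "for 2m0s" and "for 5m0s" qualifiers above mean the value has to stay at or over the threshold for that long before the alert fires, which is why the first solution suggests checking for a temporary peak. The sketch below approximates that behavior over a list of samples. It is a simplified model under stated assumptions, not the alerting engine's actual evaluation logic, and `fires` is a hypothetical name.

```python
from typing import Iterable, Optional, Tuple


def fires(samples: Iterable[Tuple[float, float]],
          threshold: float, hold_seconds: float) -> bool:
    """Return True once the value has stayed >= threshold for hold_seconds.

    samples are (unix_seconds, value) pairs in time order.
    """
    breach_start: Optional[float] = None
    for timestamp, value in samples:
        if value >= threshold:
            if breach_start is None:
                breach_start = timestamp
            if timestamp - breach_start >= hold_seconds:
                return True
        else:
            # A dip below the threshold resets the hold period.
            breach_start = None
    return False


# Made-up samples: 60 running git commands, measured every 30 seconds.
steady = [(0, 60), (30, 60), (60, 60), (90, 60), (120, 60)]
peak = [(0, 60), (30, 40), (60, 60), (90, 60), (120, 60)]

warning_steady = fires(steady, threshold=50, hold_seconds=120)
warning_peak = fires(peak, threshold=50, hold_seconds=120)
critical_steady = fires(steady, threshold=100, hold_seconds=300)
```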

_{Managed by the Sourcegraph Core application team.}

gitserver: repository_clone_queue_size

repository clone queue size

Descriptions

warning gitserver: 25+ repository clone queue size

Possible solutions

If you just added several repositories, the warning may be expected.
Check which repositories need cloning by visiting, for example, https://sourcegraph.example.com/site-admin/repositories?filter=not-cloned
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_gitserver_repository_clone_queue_size"
]

_{Managed by the Sourcegraph Core application team.}

gitserver: repository_existence_check_queue_size

repository existence check queue size

Descriptions

warning gitserver: 25+ repository existence check queue size

Possible solutions

Check the code host status indicator for errors: on the Sourcegraph app homepage, when signed in as an admin, click the cloud icon in the top right corner of the page.
Check if the issue continues to happen after 30 minutes; it may be temporary.
Check the gitserver logs for more information.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_gitserver_repository_existence_check_queue_size"
]

_{Managed by the Sourcegraph Core application team.}

gitserver: frontend_internal_api_error_responses

frontend-internal API error responses every 5m by route

Descriptions

warning gitserver: 2%+ frontend-internal API error responses every 5m by route for 5m0s

Possible solutions

Single-container deployments: Check docker logs $CONTAINER_ID for logs starting with gitserver that indicate requests to the frontend service are failing.
Kubernetes:
- Confirm that kubectl get pods shows the frontend pods are healthy.
- Check kubectl logs gitserver for logs indicating request failures to frontend or frontend-internal.
Docker Compose:
- Confirm that docker ps shows the frontend-internal container is healthy.
- Check docker logs gitserver for logs indicating request failures to frontend or frontend-internal.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_gitserver_frontend_internal_api_error_responses"
]
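
The 2% threshold above is a ratio of error responses to all responses over a window. A minimal sketch of that arithmetic follows; the exact query behind the alert is not shown in this document, so which responses count as errors is an assumption, and the function names are hypothetical.

```python
def error_percentage(error_responses: int, total_responses: int) -> float:
    """Return error responses as a percentage of all responses in a window."""
    if total_responses == 0:
        return 0.0
    return 100 * error_responses / total_responses


def exceeds_warning(error_responses: int, total_responses: int,
                    threshold_pct: float = 2.0) -> bool:
    """True when the error percentage reaches the warning threshold."""
    return error_percentage(error_responses, total_responses) >= threshold_pct


# Made-up counts for one route over a 5-minute window.
rate = error_percentage(3, 100)
over = exceeds_warning(3, 100)
```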

_{Managed by the Sourcegraph Core application team.}

gitserver: mean_blocked_seconds_per_conn_request

mean blocked seconds per conn request

Descriptions

warning gitserver: 0.05s+ mean blocked seconds per conn request for 5m0s
critical gitserver: 0.1s+ mean blocked seconds per conn request for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_gitserver_mean_blocked_seconds_per_conn_request",
  "critical_gitserver_mean_blocked_seconds_per_conn_request"
]
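
This section lists no solution beyond silencing, so it may help to spell out what the metric name suggests: the total time connection requests spent blocked, divided by the number of connection requests, over some window. The sketch below works through that arithmetic under this reading; the interpretation is an assumption, since the underlying query is not shown here, and the function names are hypothetical.

```python
WARNING_SECONDS = 0.05   # warning level from the description above
CRITICAL_SECONDS = 0.1   # critical level from the description above


def mean_blocked_seconds(blocked_seconds: float, conn_requests: int) -> float:
    """Average time each connection request spent blocked, in seconds."""
    if conn_requests == 0:
        return 0.0
    return blocked_seconds / conn_requests


def classify_blocking(mean_seconds: float) -> str:
    """Map a mean blocked time to the alert level it would reach."""
    if mean_seconds >= CRITICAL_SECONDS:
        return "critical"
    if mean_seconds >= WARNING_SECONDS:
        return "warning"
    return "ok"


# Made-up numbers: 6 seconds of blocking spread over 100 connection requests.
mean = mean_blocked_seconds(6.0, 100)
level = classify_blocking(mean)
```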

_{Managed by the Sourcegraph Core application team.}

gitserver: container_cpu_usage

container cpu usage total (1m average) across all cores by instance

Descriptions

warning gitserver: 99%+ container cpu usage total (1m average) across all cores by instance

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the gitserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_gitserver_container_cpu_usage"
]

_{Managed by the Sourcegraph Core application team.}

gitserver: container_memory_usage

container memory usage by instance

Descriptions

warning gitserver: 99%+ container memory usage by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the gitserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_gitserver_container_memory_usage"
]

_{Managed by the Sourcegraph Core application team.}

gitserver: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning gitserver: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the gitserver service.
Docker Compose: Consider increasing cpus: of the gitserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_gitserver_provisioning_container_cpu_usage_long_term"
]
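
The hold period in the description above is written in an hours-minutes-seconds form (336h0m0s), which is easier to reason about once converted: 336 hours is 14 days. The following sketch parses durations of that shape; it only handles the h/m/s units that appear in this document, and `parse_duration` is a hypothetical helper.

```python
import re
from datetime import timedelta

# Matches strings such as "336h0m0s", "30m0s", or "10m0s".
_DURATION = re.compile(r"^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$")


def parse_duration(text: str) -> timedelta:
    """Parse an h/m/s duration string as used in the alert descriptions."""
    match = _DURATION.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


long_term_hold = parse_duration("336h0m0s")
short_term_hold = parse_duration("30m0s")
```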

_{Managed by the Sourcegraph Core application team.}

gitserver: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning gitserver: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the gitserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_gitserver_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Core application team.}

gitserver: go_goroutines

maximum active goroutines

Descriptions

warning gitserver: 10000+ maximum active goroutines for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_gitserver_go_goroutines"
]

_{Managed by the Sourcegraph Core application team.}

gitserver: go_gc_duration_seconds

maximum go garbage collection duration

Descriptions

warning gitserver: 2s+ maximum go garbage collection duration

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_gitserver_go_gc_duration_seconds"
]

_{Managed by the Sourcegraph Core application team.}

gitserver: pods_available_percentage

percentage pods available

Descriptions

critical gitserver: less than 90% percentage pods available for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_gitserver_pods_available_percentage"
]

_{Managed by the Sourcegraph Core application team.}

github-proxy: github_proxy_waiting_requests

number of requests waiting on the global mutex

Descriptions

warning github-proxy: 100+ number of requests waiting on the global mutex for 5m0s

Possible solutions

Check github-proxy logs for network connection issues.
Check GitHub status.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_github-proxy_github_proxy_waiting_requests"
]

_{Managed by the Sourcegraph Core application team.}

github-proxy: container_cpu_usage

container cpu usage total (1m average) across all cores by instance

Descriptions

warning github-proxy: 99%+ container cpu usage total (1m average) across all cores by instance

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the github-proxy container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_github-proxy_container_cpu_usage"
]

_{Managed by the Sourcegraph Core application team.}

github-proxy: container_memory_usage

container memory usage by instance

Descriptions

warning github-proxy: 99%+ container memory usage by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the github-proxy container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_github-proxy_container_memory_usage"
]

_{Managed by the Sourcegraph Core application team.}

github-proxy: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning github-proxy: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the github-proxy service.
Docker Compose: Consider increasing cpus: of the github-proxy container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_github-proxy_provisioning_container_cpu_usage_long_term"
]

_{Managed by the Sourcegraph Core application team.}

github-proxy: provisioning_container_memory_usage_long_term

container memory usage (1d maximum) by instance

Descriptions

warning github-proxy: 80%+ container memory usage (1d maximum) by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing memory limits in the Deployment.yaml for the github-proxy service.
Docker Compose: Consider increasing memory: of the github-proxy container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_github-proxy_provisioning_container_memory_usage_long_term"
]

_{Managed by the Sourcegraph Core application team.}

github-proxy: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning github-proxy: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the github-proxy container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_github-proxy_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Core application team.}

github-proxy: provisioning_container_memory_usage_short_term

container memory usage (5m maximum) by instance

Descriptions

warning github-proxy: 90%+ container memory usage (5m maximum) by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the github-proxy container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_github-proxy_provisioning_container_memory_usage_short_term"
]

_{Managed by the Sourcegraph Core application team.}

github-proxy: go_goroutines

maximum active goroutines

Descriptions

warning github-proxy: 10000+ maximum active goroutines for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_github-proxy_go_goroutines"
]

_{Managed by the Sourcegraph Core application team.}

github-proxy: go_gc_duration_seconds

maximum go garbage collection duration

Descriptions

warning github-proxy: 2s+ maximum go garbage collection duration

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_github-proxy_go_gc_duration_seconds"
]

_{Managed by the Sourcegraph Core application team.}

github-proxy: pods_available_percentage

percentage pods available

Descriptions

critical github-proxy: less than 90% percentage pods available for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_github-proxy_pods_available_percentage"
]

_{Managed by the Sourcegraph Core application team.}

postgres: connections

active connections

Descriptions

warning postgres: less than 5 active connections for 5m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_postgres_connections"
]

_{Managed by the Sourcegraph Core application team.}

postgres: transaction_durations

maximum transaction durations

Descriptions

warning postgres: 300ms+ maximum transaction durations for 5m0s
critical postgres: 500ms+ maximum transaction durations for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_postgres_transaction_durations",
  "critical_postgres_transaction_durations"
]

_{Managed by the Sourcegraph Core application team.}

postgres: postgres_up

database availability

Descriptions

critical postgres: less than 0 database availability for 5m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_postgres_postgres_up"
]

_{Managed by the Sourcegraph Core application team.}

postgres: invalid_indexes

invalid indexes (unusable by the query planner)

Descriptions

critical postgres: 1+ invalid indexes (unusable by the query planner)

Possible solutions

Drop and re-create the invalid index - please contact Sourcegraph to supply the index definition.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_postgres_invalid_indexes"
]
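
Before contacting Sourcegraph, it can help to know which indexes are affected. The sketch below lists indexes that Postgres has marked invalid by reading the `pg_index` system catalog. It assumes you already have a DB-API 2.0 connection to the database (for example, one opened with a driver such as psycopg2); `list_invalid_indexes` is a hypothetical helper and not part of Sourcegraph.

```python
from typing import Any, List

# Indexes with indisvalid = false cannot be used by the query planner.
INVALID_INDEXES_SQL = """
SELECT n.nspname AS schema_name, c.relname AS index_name
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE NOT i.indisvalid
ORDER BY 1, 2
"""


def list_invalid_indexes(conn: Any) -> List[str]:
    """Return schema-qualified names of invalid indexes.

    conn is assumed to be a DB-API 2.0 connection to the Postgres database.
    """
    cur = conn.cursor()
    try:
        cur.execute(INVALID_INDEXES_SQL)
        return [f"{schema}.{name}" for schema, name in cur.fetchall()]
    finally:
        cur.close()
```

An empty list means no index is currently marked invalid.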

_{Managed by the Sourcegraph Core application team.}

postgres: pg_exporter_err

errors scraping postgres exporter

Descriptions

warning postgres: 1+ errors scraping postgres exporter for 5m0s

Possible solutions

Ensure the Postgres exporter can access the Postgres database. Also, check the Postgres exporter logs for errors.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_postgres_pg_exporter_err"
]

_{Managed by the Sourcegraph Core application team.}

postgres: migration_in_progress

active schema migration

Descriptions

critical postgres: 1+ active schema migration for 5m0s

Possible solutions

The database migration has been in progress for 5 or more minutes - please contact Sourcegraph if this persists.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_postgres_migration_in_progress"
]

_{Managed by the Sourcegraph Core application team.}

postgres: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning postgres: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the (pgsql|codeintel-db) service.
Docker Compose: Consider increasing cpus: of the (pgsql|codeintel-db) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_postgres_provisioning_container_cpu_usage_long_term"
]

_{Managed by the Sourcegraph Core application team.}

postgres: provisioning_container_memory_usage_long_term

container memory usage (1d maximum) by instance

Descriptions

warning postgres: 80%+ container memory usage (1d maximum) by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing memory limits in the Deployment.yaml for the (pgsql|codeintel-db) service.
Docker Compose: Consider increasing memory: of the (pgsql|codeintel-db) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_postgres_provisioning_container_memory_usage_long_term"
]

_{Managed by the Sourcegraph Core application team.}

postgres: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning postgres: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the (pgsql|codeintel-db) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_postgres_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Core application team.}

postgres: provisioning_container_memory_usage_short_term

container memory usage (5m maximum) by instance

Descriptions

warning postgres: 90%+ container memory usage (5m maximum) by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the (pgsql|codeintel-db) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_postgres_provisioning_container_memory_usage_short_term"
]

_{Managed by the Sourcegraph Core application team.}

postgres: pods_available_percentage

percentage pods available

Descriptions

critical postgres: less than 90% percentage pods available for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_postgres_pods_available_percentage"
]

_{Managed by the Sourcegraph Core application team.}

precise-code-intel-worker: frontend_internal_api_error_responses

frontend-internal API error responses every 5m by route

Descriptions

warning precise-code-intel-worker: 2%+ frontend-internal API error responses every 5m by route for 5m0s

Possible solutions

Single-container deployments: Check docker logs $CONTAINER_ID for logs starting with precise-code-intel-worker that indicate requests to the frontend service are failing.
Kubernetes:
- Confirm that kubectl get pods shows the frontend pods are healthy.
- Check kubectl logs precise-code-intel-worker for logs indicating request failures to frontend or frontend-internal.
Docker Compose:
- Confirm that docker ps shows the frontend-internal container is healthy.
- Check docker logs precise-code-intel-worker for logs indicating request failures to frontend or frontend-internal.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_precise-code-intel-worker_frontend_internal_api_error_responses"
]

_{Managed by the Sourcegraph Code-intelligence team.}

precise-code-intel-worker: mean_blocked_seconds_per_conn_request

mean blocked seconds per conn request

Descriptions

warning precise-code-intel-worker: 0.05s+ mean blocked seconds per conn request for 5m0s
critical precise-code-intel-worker: 0.1s+ mean blocked seconds per conn request for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_precise-code-intel-worker_mean_blocked_seconds_per_conn_request",
  "critical_precise-code-intel-worker_mean_blocked_seconds_per_conn_request"
]

_{Managed by the Sourcegraph Core application team.}

precise-code-intel-worker: container_cpu_usage

container cpu usage total (1m average) across all cores by instance

Descriptions

warning precise-code-intel-worker: 99%+ container cpu usage total (1m average) across all cores by instance

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the precise-code-intel-worker container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_precise-code-intel-worker_container_cpu_usage"
]

_{Managed by the Sourcegraph Code-intelligence team.}

precise-code-intel-worker: container_memory_usage

container memory usage by instance

Descriptions

warning precise-code-intel-worker: 99%+ container memory usage by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the precise-code-intel-worker container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_precise-code-intel-worker_container_memory_usage"
]

_{Managed by the Sourcegraph Code-intelligence team.}

precise-code-intel-worker: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning precise-code-intel-worker: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the precise-code-intel-worker service.
Docker Compose: Consider increasing cpus: of the precise-code-intel-worker container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_precise-code-intel-worker_provisioning_container_cpu_usage_long_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

precise-code-intel-worker: provisioning_container_memory_usage_long_term

container memory usage (1d maximum) by instance

Descriptions

warning precise-code-intel-worker: 80%+ container memory usage (1d maximum) by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing memory limits in the Deployment.yaml for the precise-code-intel-worker service.
Docker Compose: Consider increasing memory: of the precise-code-intel-worker container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_precise-code-intel-worker_provisioning_container_memory_usage_long_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

precise-code-intel-worker: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning precise-code-intel-worker: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the precise-code-intel-worker container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_precise-code-intel-worker_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

precise-code-intel-worker: provisioning_container_memory_usage_short_term

container memory usage (5m maximum) by instance

Descriptions

warning precise-code-intel-worker: 90%+ container memory usage (5m maximum) by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the precise-code-intel-worker container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_precise-code-intel-worker_provisioning_container_memory_usage_short_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

precise-code-intel-worker: go_goroutines

maximum active goroutines

Descriptions

warning precise-code-intel-worker: 10000+ maximum active goroutines for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_precise-code-intel-worker_go_goroutines"
]

_{Managed by the Sourcegraph Code-intelligence team.}

precise-code-intel-worker: go_gc_duration_seconds

maximum go garbage collection duration

Descriptions

warning precise-code-intel-worker: 2s+ maximum go garbage collection duration

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_precise-code-intel-worker_go_gc_duration_seconds"
]

_{Managed by the Sourcegraph Code-intelligence team.}

precise-code-intel-worker: pods_available_percentage

percentage pods available

Descriptions

critical precise-code-intel-worker: less than 90% percentage pods available for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_precise-code-intel-worker_pods_available_percentage"
]

_{Managed by the Sourcegraph Code-intelligence team.}

query-runner: frontend_internal_api_error_responses

frontend-internal API error responses every 5m by route

Descriptions

warning query-runner: 2%+ frontend-internal API error responses every 5m by route for 5m0s

Possible solutions

Single-container deployments: Check docker logs $CONTAINER_ID for logs starting with query-runner that indicate requests to the frontend service are failing.
Kubernetes:
- Confirm that kubectl get pods shows the frontend pods are healthy.
- Check kubectl logs query-runner for logs indicating request failures to frontend or frontend-internal.
Docker Compose:
- Confirm that docker ps shows the frontend-internal container is healthy.
- Check docker logs query-runner for logs indicating request failures to frontend or frontend-internal.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_query-runner_frontend_internal_api_error_responses"
]

_{Managed by the Sourcegraph Search team.}

query-runner: container_cpu_usage

container cpu usage total (1m average) across all cores by instance

Descriptions

warning query-runner: 99%+ container cpu usage total (1m average) across all cores by instance

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the query-runner container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_query-runner_container_cpu_usage"
]

_{Managed by the Sourcegraph Search team.}

query-runner: container_memory_usage

container memory usage by instance

Descriptions

warning query-runner: 99%+ container memory usage by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the query-runner container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_query-runner_container_memory_usage"
]

_{Managed by the Sourcegraph Search team.}

query-runner: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning query-runner: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the query-runner service.
Docker Compose: Consider increasing cpus: of the query-runner container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_query-runner_provisioning_container_cpu_usage_long_term"
]
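
The alert descriptions in this document give durations in an h/m/s notation such as 336h0m0s, which can be hard to read at a glance. The sketch below converts such strings to a Python timedelta. The helper name parse_duration is an assumption, and it only handles whole hours, minutes, and seconds.

```python
import re
from datetime import timedelta

# Matches strings such as "336h0m0s", "10m0s", or "30s" (whole numbers only).
_DURATION = re.compile(r"^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$")


def parse_duration(text):
    """Convert an h/m/s duration string into a timedelta."""
    match = _DURATION.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(group or 0) for group in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)
```

For instance, parse_duration("336h0m0s").days evaluates to 14, so this alert only fires after usage has stayed above the threshold for two weeks.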

_{Managed by the Sourcegraph Search team.}

query-runner: provisioning_container_memory_usage_long_term

container memory usage (1d maximum) by instance

Descriptions

warning query-runner: 80%+ container memory usage (1d maximum) by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing memory limits in the Deployment.yaml for the query-runner service.
Docker Compose: Consider increasing memory: of the query-runner container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_query-runner_provisioning_container_memory_usage_long_term"
]

_{Managed by the Sourcegraph Search team.}

query-runner: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning query-runner: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the query-runner container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_query-runner_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Search team.}

query-runner: provisioning_container_memory_usage_short_term

container memory usage (5m maximum) by instance

Descriptions

warning query-runner: 90%+ container memory usage (5m maximum) by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the query-runner container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_query-runner_provisioning_container_memory_usage_short_term"
]

_{Managed by the Sourcegraph Search team.}

query-runner: go_goroutines

maximum active goroutines

Descriptions

warning query-runner: 10000+ maximum active goroutines for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_query-runner_go_goroutines"
]

_{Managed by the Sourcegraph Search team.}

query-runner: go_gc_duration_seconds

maximum go garbage collection duration

Descriptions

warning query-runner: 2s+ maximum go garbage collection duration

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_query-runner_go_gc_duration_seconds"
]

_{Managed by the Sourcegraph Search team.}

query-runner: pods_available_percentage

percentage pods available

Descriptions

critical query-runner: less than 90% percentage pods available for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_query-runner_pods_available_percentage"
]
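
To get a feel for the 90% threshold, the arithmetic can be sketched as follows. This assumes the percentage is simply available pods divided by desired pods, which this document does not state explicitly. The function names are hypothetical, and the 10m0s hold period is not modeled.

```python
def pods_available_percentage(available, desired):
    """Percentage of desired pods that are currently available."""
    if desired <= 0:
        raise ValueError("desired must be positive")
    return 100.0 * available / desired


def below_threshold(available, desired, threshold=90.0):
    """True when availability is under the critical threshold from this section."""
    return pods_available_percentage(available, desired) < threshold
```

Under this assumption, a deployment with 3 replicas drops below the threshold as soon as one pod is unavailable (about 66.7%), while a deployment with 10 replicas tolerates one unavailable pod (exactly 90%).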

_{Managed by the Sourcegraph Search team.}

worker: worker_job_codeintel-janitor_count

number of worker instances running the codeintel-janitor job

Descriptions

warning worker: less than 1 number of worker instances running the codeintel-janitor job for 1m0s
critical worker: less than 1 number of worker instances running the codeintel-janitor job for 5m0s

Possible solutions

Ensure your instance defines a worker container such that:
- WORKER_JOB_ALLOWLIST contains "codeintel-janitor" (or "all"), and
- WORKER_JOB_BLOCKLIST does not contain "codeintel-janitor"
Ensure that such a container is not failing to start or stay active
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_worker_worker_job_codeintel-janitor_count",
  "critical_worker_worker_job_codeintel-janitor_count"
]
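
The two conditions above can be expressed as a small check. This is a sketch under the assumption that WORKER_JOB_ALLOWLIST and WORKER_JOB_BLOCKLIST hold comma-separated job names; the helper names are hypothetical and this is not the worker's actual implementation.

```python
def parse_job_list(value):
    """Split a comma-separated list of job names, ignoring blanks and whitespace."""
    return {name.strip() for name in value.split(",") if name.strip()}


def job_enabled(job, allowlist, blocklist=""):
    """True when the allowlist names the job (or "all") and the blocklist does not."""
    allowed = parse_job_list(allowlist)
    blocked = parse_job_list(blocklist)
    return (job in allowed or "all" in allowed) and job not in blocked
```

For example, job_enabled("codeintel-janitor", "all", "codeintel-janitor") is False, because the blocklist entry overrides the "all" allowlist.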

_{Managed by the Sourcegraph Code-intelligence team.}

worker: worker_job_codeintel-commitgraph_count

number of worker instances running the codeintel-commitgraph job

Descriptions

warning worker: less than 1 number of worker instances running the codeintel-commitgraph job for 1m0s
critical worker: less than 1 number of worker instances running the codeintel-commitgraph job for 5m0s

Possible solutions

Ensure your instance defines a worker container such that:
- WORKER_JOB_ALLOWLIST contains "codeintel-commitgraph" (or "all"), and
- WORKER_JOB_BLOCKLIST does not contain "codeintel-commitgraph"
Ensure that such a container is not failing to start or stay active
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_worker_worker_job_codeintel-commitgraph_count",
  "critical_worker_worker_job_codeintel-commitgraph_count"
]
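
The identifiers used in observability.silenceAlerts throughout this document appear to follow the pattern level_service_alert, with one entry per severity level. The sketch below builds them programmatically. The helper names are hypothetical, and the pattern is inferred from the examples in this document rather than from a specification.

```python
import json


def silence_ids(service, alert, levels):
    """Build one silence identifier per severity level."""
    return [f"{level}_{service}_{alert}" for level in levels]


def silence_config(ids):
    """Render the identifiers as a JSON object holding the silenceAlerts property."""
    return json.dumps({"observability.silenceAlerts": ids}, indent=2)
```

Calling silence_ids("worker", "worker_job_codeintel-commitgraph_count", ["warning", "critical"]) yields the two identifiers listed above. The surrounding braces in the rendered output only make it valid JSON on its own; copy just the property into your site configuration.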

_{Managed by the Sourcegraph Code-intelligence team.}

worker: worker_job_codeintel-auto-indexing_count

number of worker instances running the codeintel-auto-indexing job

Descriptions

warning worker: less than 1 number of worker instances running the codeintel-auto-indexing job for 1m0s
critical worker: less than 1 number of worker instances running the codeintel-auto-indexing job for 5m0s

Possible solutions

Ensure your instance defines a worker container such that:
- WORKER_JOB_ALLOWLIST contains "codeintel-auto-indexing" (or "all"), and
- WORKER_JOB_BLOCKLIST does not contain "codeintel-auto-indexing"
Ensure that such a container is not failing to start or stay active
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_worker_worker_job_codeintel-auto-indexing_count",
  "critical_worker_worker_job_codeintel-auto-indexing_count"
]

_{Managed by the Sourcegraph Code-intelligence team.}

worker: frontend_internal_api_error_responses

frontend-internal API error responses every 5m by route

Descriptions

warning worker: 2%+ frontend-internal API error responses every 5m by route for 5m0s

Possible solutions

Single-container deployments: Check docker logs $CONTAINER_ID for logs starting with worker that indicate requests to the frontend service are failing.
Kubernetes:
- Confirm that kubectl get pods shows the frontend pods are healthy.
- Check kubectl logs worker for logs indicating request failures to frontend or frontend-internal.
Docker Compose:
- Confirm that docker ps shows the frontend-internal container is healthy.
- Check docker logs worker for logs indicating request failures to frontend or frontend-internal.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_worker_frontend_internal_api_error_responses"
]

_{Managed by the Sourcegraph Code-intelligence team.}

worker: mean_blocked_seconds_per_conn_request

mean blocked seconds per conn request

Descriptions

warning worker: 0.05s+ mean blocked seconds per conn request for 5m0s
critical worker: 0.1s+ mean blocked seconds per conn request for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_worker_mean_blocked_seconds_per_conn_request",
  "critical_worker_mean_blocked_seconds_per_conn_request"
]

_{Managed by the Sourcegraph Core application team.}

worker: container_cpu_usage

container cpu usage total (1m average) across all cores by instance

Descriptions

warning worker: 99%+ container cpu usage total (1m average) across all cores by instance

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the worker container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_worker_container_cpu_usage"
]

_{Managed by the Sourcegraph Code-intelligence team.}

worker: container_memory_usage

container memory usage by instance

Descriptions

warning worker: 99%+ container memory usage by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the worker container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_worker_container_memory_usage"
]

_{Managed by the Sourcegraph Code-intelligence team.}

worker: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning worker: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the worker service.
Docker Compose: Consider increasing cpus: of the worker container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_worker_provisioning_container_cpu_usage_long_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

worker: provisioning_container_memory_usage_long_term

container memory usage (1d maximum) by instance

Descriptions

warning worker: 80%+ container memory usage (1d maximum) by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing memory limits in the Deployment.yaml for the worker service.
Docker Compose: Consider increasing memory: of the worker container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_worker_provisioning_container_memory_usage_long_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

worker: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning worker: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the worker container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_worker_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

worker: provisioning_container_memory_usage_short_term

container memory usage (5m maximum) by instance

Descriptions

warning worker: 90%+ container memory usage (5m maximum) by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the worker container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_worker_provisioning_container_memory_usage_short_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

worker: go_goroutines

maximum active goroutines

Descriptions

warning worker: 10000+ maximum active goroutines for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_worker_go_goroutines"
]

_{Managed by the Sourcegraph Code-intelligence team.}

worker: go_gc_duration_seconds

maximum go garbage collection duration

Descriptions

warning worker: 2s+ maximum go garbage collection duration

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_worker_go_gc_duration_seconds"
]

_{Managed by the Sourcegraph Code-intelligence team.}

worker: pods_available_percentage

percentage pods available

Descriptions

critical worker: less than 90% percentage pods available for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_worker_pods_available_percentage"
]

_{Managed by the Sourcegraph Code-intelligence team.}

repo-updater: src_repoupdater_max_sync_backoff

time since oldest sync

Descriptions

critical repo-updater: 32400s+ time since oldest sync for 10m0s

Possible solutions

An alert here indicates that no code host connections have synced in at least 9h0m0s. This suggests a configuration issue with your code host connections or a networking issue affecting communication with your code hosts.
Check the code host status indicator (cloud icon in top right of Sourcegraph homepage) for errors.
Make sure external services do not have invalid tokens by navigating to them in the web UI and clicking save. If there are no errors, they are valid.
Check the repo-updater logs for errors about syncing.
Confirm that outbound network connections are allowed where repo-updater is deployed.
Check back in an hour to see if the issue has resolved itself.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_repo-updater_src_repoupdater_max_sync_backoff"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: src_repoupdater_syncer_sync_errors_total

site level external service sync error rate

Descriptions

critical repo-updater: 0+ site level external service sync error rate for 10m0s

Possible solutions

An alert here indicates errors syncing site level repo metadata with code hosts. This suggests a configuration issue with your code host connections or a networking issue affecting communication with your code hosts.
Check the code host status indicator (cloud icon in top right of Sourcegraph homepage) for errors.
Make sure external services do not have invalid tokens by navigating to them in the web UI and clicking save. If there are no errors, they are valid.
Check the repo-updater logs for errors about syncing.
Confirm that outbound network connections are allowed where repo-updater is deployed.
Check back in an hour to see if the issue has resolved itself.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_repo-updater_src_repoupdater_syncer_sync_errors_total"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: syncer_sync_start

repo metadata sync was started

Descriptions

warning repo-updater: less than 0 repo metadata sync was started for 9h0m0s

Possible solutions

Check repo-updater logs for errors.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_syncer_sync_start"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: syncer_sync_duration

95th repositories sync duration

Descriptions

warning repo-updater: 30s+ 95th repositories sync duration for 5m0s

Possible solutions

Check that the network latency between Sourcegraph and the code host is reasonable (<50ms).
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_syncer_sync_duration"
]
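
One rough way to check the latency mentioned above is to time a TCP connection from the machine where repo-updater runs, since a TCP handshake takes roughly one network round trip. The sketch below is an approximation, not a Sourcegraph tool: the function names are hypothetical, port 443 is an assumption about your code host, and the median over a few samples is used to smooth out outliers.

```python
import socket
import statistics
import time


def connect_latency_ms(host, port=443, timeout=5.0):
    """Time a single TCP connection to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def latency_is_reasonable(samples_ms, threshold_ms=50.0):
    """True when the median sample is under the threshold from this section."""
    return statistics.median(samples_ms) < threshold_ms
```

A typical use would be to collect a handful of samples, for example [connect_latency_ms("codehost.example.com") for _ in range(5)] with your own code host's hostname, and pass them to latency_is_reasonable.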

_{Managed by the Sourcegraph Core application team.}

repo-updater: source_duration

95th repositories source duration

Descriptions

warning repo-updater: 30s+ 95th repositories source duration for 5m0s

Possible solutions

Check that the network latency between Sourcegraph and the code host is reasonable (<50ms).
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_source_duration"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: syncer_synced_repos

repositories synced

Descriptions

warning repo-updater: less than 0 repositories synced for 9h0m0s

Possible solutions

Check network connectivity to code hosts
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_syncer_synced_repos"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: sourced_repos

repositories sourced

Descriptions

warning repo-updater: less than 0 repositories sourced for 9h0m0s

Possible solutions

Check network connectivity to code hosts
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_sourced_repos"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: user_added_repos

total number of user added repos

Descriptions

critical repo-updater: 180000+ total number of user added repos for 5m0s

Possible solutions

Check for unusual spikes in user added repos. Each user is only allowed to add 2000 repos.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_repo-updater_user_added_repos"
]
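
To put the 180000 threshold in context next to the per-user limit of 2000, the arithmetic can be sketched as below. The function names and the repos_by_user mapping are hypothetical; how you obtain per-user counts depends on your instance.

```python
PER_USER_LIMIT = 2000        # per-user limit stated in this section
CRITICAL_THRESHOLD = 180000  # threshold from the alert description


def min_users_to_reach(threshold=CRITICAL_THRESHOLD, per_user=PER_USER_LIMIT):
    """Smallest number of users who could reach the threshold at the per-user limit."""
    return -(-threshold // per_user)  # ceiling division


def top_contributors(repos_by_user, count=5):
    """Return the users with the most added repos, largest first."""
    ranked = sorted(repos_by_user.items(), key=lambda item: item[1], reverse=True)
    return ranked[:count]
```

With the values above, min_users_to_reach() evaluates to 90, so the alert cannot fire unless at least 90 users have added repos. A sudden spike concentrated in a few accounts is therefore worth a closer look.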

_{Managed by the Sourcegraph Core application team.}

repo-updater: purge_failed

repositories purge failed

Descriptions

warning repo-updater: 0+ repositories purge failed for 5m0s

Possible solutions

Check repo-updater's connectivity with gitserver, and check the gitserver logs.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_purge_failed"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: sched_auto_fetch

repositories scheduled due to hitting a deadline

Descriptions

warning repo-updater: less than 0 repositories scheduled due to hitting a deadline for 9h0m0s

Possible solutions

Check repo-updater logs. This alert is expected to fire if there are no user-added code hosts.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_sched_auto_fetch"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: sched_known_repos

repositories managed by the scheduler

Descriptions

warning repo-updater: less than 0 repositories managed by the scheduler for 10m0s

Possible solutions

Check repo-updater logs. This alert is expected to fire if there are no user-added code hosts.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_sched_known_repos"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: sched_update_queue_length

rate of growth of update queue length over 5 minutes

Descriptions

critical repo-updater: 0+ rate of growth of update queue length over 5 minutes for 2h0m0s

Possible solutions

Check repo-updater logs for indications that the queue is not being processed. The queue length should trend downwards over time as items are sent to gitserver.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_repo-updater_sched_update_queue_length"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: sched_loops

scheduler loops

Descriptions

warning repo-updater: less than 0 scheduler loops for 9h0m0s

Possible solutions

Check repo-updater logs for errors. This alert is expected to fire if there are no user-added code hosts.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_sched_loops"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: sched_error

repositories schedule error rate

Descriptions

critical repo-updater: 1+ repositories schedule error rate for 1m0s

Possible solutions

Check repo-updater logs for errors
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_repo-updater_sched_error"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: perms_syncer_perms

time gap between least and most up to date permissions

Descriptions

warning repo-updater: 259200s+ time gap between least and most up to date permissions for 5m0s

Possible solutions

Increase the API rate limit to GitHub, GitLab or Bitbucket Server.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_perms_syncer_perms"
]
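
The threshold for this alert is given in seconds (259200s), while other alerts in this document use an h/m/s notation. The sketch below converts a number of seconds into that notation for easier comparison. The helper name is hypothetical and the formatting only approximates the style used elsewhere in this document.

```python
def hms_duration(total_seconds):
    """Format whole seconds in h/m/s notation, dropping leading zero units."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    if hours:
        return f"{hours}h{minutes}m{seconds}s"
    if minutes:
        return f"{minutes}m{seconds}s"
    return f"{seconds}s"
```

Here hms_duration(259200) gives "72h0m0s", so the warning fires when the least and most up-to-date permissions are at least three days apart.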

_{Managed by the Sourcegraph Core application team.}

repo-updater: perms_syncer_stale_perms

number of entities with stale permissions

Descriptions

warning repo-updater: 100+ number of entities with stale permissions for 5m0s

Possible solutions

Increase the API rate limit to GitHub, GitLab or Bitbucket Server.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_perms_syncer_stale_perms"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: perms_syncer_no_perms

number of entities with no permissions

Descriptions

warning repo-updater: 100+ number of entities with no permissions for 5m0s

Possible solutions

Enabled permissions for the first time: Wait a few minutes and see if the number goes down.
Otherwise: Increase the API rate limit to GitHub, GitLab or Bitbucket Server.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_perms_syncer_no_perms"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: perms_syncer_sync_duration

95th permissions sync duration

Descriptions

warning repo-updater: 30s+ 95th permissions sync duration for 5m0s

Possible solutions

Check that the network latency between Sourcegraph and the code host is reasonable (<50ms).
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_perms_syncer_sync_duration"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: perms_syncer_queue_size

permissions sync queued items

Descriptions

warning repo-updater: 100+ permissions sync queued items for 5m0s

Possible solutions

Enabled permissions for the first time: Wait a few minutes and see if the number goes down.
Otherwise: Increase the API rate limit to GitHub, GitLab or Bitbucket Server.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_perms_syncer_queue_size"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: perms_syncer_sync_errors

permissions sync error rate

Descriptions

critical repo-updater: 1+ permissions sync error rate for 1m0s

Possible solutions

Check the network connectivity between Sourcegraph and the code host.
Check if API rate limit quota is exhausted on the code host.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_repo-updater_perms_syncer_sync_errors"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: src_repoupdater_external_services_total

the total number of external services

Descriptions

critical repo-updater: 20000+ the total number of external services for 1h0m0s

Possible solutions

Check for spikes in external services, which could indicate abuse.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_repo-updater_src_repoupdater_external_services_total"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: src_repoupdater_user_external_services_total

the total number of user added external services

Descriptions

warning repo-updater: 20000+ the total number of user added external services for 1h0m0s

Possible solutions

Check for spikes in external services, which could indicate abuse.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_src_repoupdater_user_external_services_total"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: repoupdater_queued_sync_jobs_total

the total number of queued sync jobs

Descriptions

warning repo-updater: 100+ the total number of queued sync jobs for 1h0m0s

Possible solutions

Check if jobs are failing to sync: "SELECT * FROM external_service_sync_jobs WHERE state = 'errored';"
Increase the number of workers using the repoConcurrentExternalServiceSyncers site config.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_repoupdater_queued_sync_jobs_total"
]
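
The query for errored sync jobs compares state against the string literal 'errored'. Without the single quotes, SQL treats errored as a column name and the query fails. The sketch below demonstrates the quoted form against an in-memory SQLite table used purely as a stand-in. The real table lives in Sourcegraph's database, and the id column and the non-errored state values shown here are assumptions.

```python
import sqlite3

# The string literal must be single-quoted; unquoted, it is parsed as a column name.
QUERY = "SELECT id, state FROM external_service_sync_jobs WHERE state = 'errored' ORDER BY id"


def errored_jobs(conn):
    """Return (id, state) rows for sync jobs in the errored state."""
    return conn.execute(QUERY).fetchall()


def demo_connection():
    """Build a throwaway table with hypothetical rows for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE external_service_sync_jobs (id INTEGER PRIMARY KEY, state TEXT)")
    conn.executemany(
        "INSERT INTO external_service_sync_jobs (id, state) VALUES (?, ?)",
        [(1, "completed"), (2, "errored"), (3, "queued"), (4, "errored")],
    )
    return conn
```

Running errored_jobs(demo_connection()) returns only the two rows whose state is errored.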

_{Managed by the Sourcegraph Core application team.}

repo-updater: repoupdater_completed_sync_jobs_total

the total number of completed sync jobs

Descriptions

warning repo-updater: 100000+ the total number of completed sync jobs for 1h0m0s

Possible solutions

Check repo-updater logs. Jobs older than 1 day should have been removed.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_repoupdater_completed_sync_jobs_total"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: repoupdater_errored_sync_jobs_percentage

the percentage of external services that have failed their most recent sync

Descriptions

warning repo-updater: 10%+ the percentage of external services that have failed their most recent sync for 1h0m0s

Possible solutions

Check repo-updater logs and verify connectivity to the code host.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_repoupdater_errored_sync_jobs_percentage"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: github_graphql_rate_limit_remaining

remaining calls to GitHub graphql API before hitting the rate limit

Descriptions

critical repo-updater: less than 250 remaining calls to GitHub graphql API before hitting the rate limit

Possible solutions

Try restarting the pod to get a different public IP.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_repo-updater_github_graphql_rate_limit_remaining"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: github_rest_rate_limit_remaining

remaining calls to GitHub rest API before hitting the rate limit

Descriptions

critical repo-updater: less than 250 remaining calls to GitHub rest API before hitting the rate limit

Possible solutions

Try restarting the pod to get a different public IP.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_repo-updater_github_rest_rate_limit_remaining"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: github_search_rate_limit_remaining

remaining calls to GitHub search API before hitting the rate limit

Descriptions

critical repo-updater: less than 5 remaining calls to GitHub search API before hitting the rate limit

Possible solutions

Try restarting the pod to get a different public IP.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_repo-updater_github_search_rate_limit_remaining"
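
To see how much of the GitHub budget is left before deciding what to do, one option is to query GitHub's `GET /rate_limit` endpoint with the same token the code host connection uses. The sketch below is an illustration under assumptions, not how Sourcegraph computes these metrics: the mapping of the three GitHub alerts to the `graphql`, `core` (REST), and `search` resources is assumed, the helper names are hypothetical, and the thresholds are copied from the alert descriptions above.

```python
import json
import urllib.request

# Alert thresholds from the descriptions above, keyed by the resource
# names GitHub uses in its /rate_limit response ("core" is the REST API).
THRESHOLDS = {"graphql": 250, "core": 250, "search": 5}


def low_resources(payload, thresholds=THRESHOLDS):
    """Return {resource: remaining} for resources below their threshold."""
    resources = payload.get("resources", {})
    low = {}
    for name, minimum in thresholds.items():
        remaining = resources.get(name, {}).get("remaining")
        if remaining is not None and remaining < minimum:
            low[name] = remaining
    return low


def fetch_rate_limit(token, api_url="https://api.github.com"):
    """Fetch the current rate limit status for the given token.

    api_url is an assumption for github.com; a GitHub Enterprise
    instance would use its own API base URL.
    """
    request = urllib.request.Request(
        f"{api_url}/rate_limit",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Sample payload in the shape GitHub returns; values are made up.
    sample = {
        "resources": {
            "core": {"limit": 5000, "remaining": 4000},
            "search": {"limit": 30, "remaining": 3},
            "graphql": {"limit": 5000, "remaining": 100},
        }
    }
    print(low_resources(sample))
```

Each resource in the response also carries a `reset` timestamp, which indicates when the budget is replenished and can help decide whether waiting is enough.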
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: gitlab_rest_rate_limit_remaining

remaining calls to GitLab rest API before hitting the rate limit

Descriptions

critical repo-updater: less than 30 remaining calls to GitLab rest API before hitting the rate limit

Possible solutions

Try restarting the pod to get a different public IP.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_repo-updater_gitlab_rest_rate_limit_remaining"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: frontend_internal_api_error_responses

frontend-internal API error responses every 5m by route

Descriptions

warning repo-updater: 2%+ frontend-internal API error responses every 5m by route for 5m0s

Possible solutions

Single-container deployments: Check docker logs $CONTAINER_ID for logs starting with repo-updater that indicate requests to the frontend service are failing.
Kubernetes:
- Confirm that kubectl get pods shows the frontend pods are healthy.
- Check kubectl logs repo-updater for logs indicating request failures to frontend or frontend-internal.
Docker Compose:
- Confirm that docker ps shows the frontend-internal container is healthy.
- Check docker logs repo-updater for logs indicating request failures to frontend or frontend-internal.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_frontend_internal_api_error_responses"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: mean_blocked_seconds_per_conn_request

mean blocked seconds per conn request

Descriptions

warning repo-updater: 0.05s+ mean blocked seconds per conn request for 5m0s
critical repo-updater: 0.1s+ mean blocked seconds per conn request for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_mean_blocked_seconds_per_conn_request",
  "critical_repo-updater_mean_blocked_seconds_per_conn_request"
]
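
The silence entries on this page appear to follow the pattern `<level>_<service>_<alert name>`. As a small sketch under that assumption (the helper names are hypothetical and not part of Sourcegraph), the entries for an alert with several levels can be generated rather than typed by hand:

```python
import json


def silence_entries(service, alert, levels=("warning",)):
    """Build silenceAlerts entries of the form <level>_<service>_<alert>."""
    return [f"{level}_{service}_{alert}" for level in levels]


def silence_config(entries):
    """Render the entries as a JSON object holding the site config key."""
    return json.dumps({"observability.silenceAlerts": entries}, indent=2)


if __name__ == "__main__":
    entries = silence_entries(
        "repo-updater",
        "mean_blocked_seconds_per_conn_request",
        levels=("warning", "critical"),
    )
    print(silence_config(entries))
```

The rendered output is a standalone JSON object; when editing the site configuration, merge the `observability.silenceAlerts` key into the existing settings instead of pasting the outer braces.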

_{Managed by the Sourcegraph Core application team.}

repo-updater: container_cpu_usage

container cpu usage total (1m average) across all cores by instance

Descriptions

warning repo-updater: 99%+ container cpu usage total (1m average) across all cores by instance

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the repo-updater container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_container_cpu_usage"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: container_memory_usage

container memory usage by instance

Descriptions

critical repo-updater: 90%+ container memory usage by instance for 10m0s

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the repo-updater container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_repo-updater_container_memory_usage"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning repo-updater: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the repo-updater service.
Docker Compose: Consider increasing cpus: of the repo-updater container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_provisioning_container_cpu_usage_long_term"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: provisioning_container_memory_usage_long_term

container memory usage (1d maximum) by instance

Descriptions

warning repo-updater: 80%+ container memory usage (1d maximum) by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing memory limits in the Deployment.yaml for the repo-updater service.
Docker Compose: Consider increasing memory: of the repo-updater container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_provisioning_container_memory_usage_long_term"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning repo-updater: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the repo-updater container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: provisioning_container_memory_usage_short_term

container memory usage (5m maximum) by instance

Descriptions

warning repo-updater: 90%+ container memory usage (5m maximum) by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the repo-updater container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_provisioning_container_memory_usage_short_term"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: go_goroutines

maximum active goroutines

Descriptions

warning repo-updater: 10000+ maximum active goroutines for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_go_goroutines"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: go_gc_duration_seconds

maximum go garbage collection duration

Descriptions

warning repo-updater: 2s+ maximum go garbage collection duration

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_repo-updater_go_gc_duration_seconds"
]

_{Managed by the Sourcegraph Core application team.}

repo-updater: pods_available_percentage

percentage pods available

Descriptions

critical repo-updater: less than 90% percentage pods available for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_repo-updater_pods_available_percentage"
]

_{Managed by the Sourcegraph Core application team.}

searcher: unindexed_search_request_errors

unindexed search request errors every 5m by code

Descriptions

warning searcher: 5%+ unindexed search request errors every 5m by code for 5m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_searcher_unindexed_search_request_errors"
]

_{Managed by the Sourcegraph Search team.}

searcher: replica_traffic

requests per second over 10m

Descriptions

warning searcher: 5+ requests per second over 10m

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_searcher_replica_traffic"
]

_{Managed by the Sourcegraph Search team.}

searcher: frontend_internal_api_error_responses

frontend-internal API error responses every 5m by route

Descriptions

warning searcher: 2%+ frontend-internal API error responses every 5m by route for 5m0s

Possible solutions

Single-container deployments: Check docker logs $CONTAINER_ID for logs starting with searcher that indicate requests to the frontend service are failing.
Kubernetes:
- Confirm that kubectl get pods shows the frontend pods are healthy.
- Check kubectl logs searcher for logs indicating request failures to frontend or frontend-internal.
Docker Compose:
- Confirm that docker ps shows the frontend-internal container is healthy.
- Check docker logs searcher for logs indicating request failures to frontend or frontend-internal.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_searcher_frontend_internal_api_error_responses"
]

_{Managed by the Sourcegraph Search team.}

searcher: container_cpu_usage

container cpu usage total (1m average) across all cores by instance

Descriptions

warning searcher: 99%+ container cpu usage total (1m average) across all cores by instance

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the searcher container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_searcher_container_cpu_usage"
]

_{Managed by the Sourcegraph Search team.}

searcher: container_memory_usage

container memory usage by instance

Descriptions

warning searcher: 99%+ container memory usage by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the searcher container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_searcher_container_memory_usage"
]

_{Managed by the Sourcegraph Search team.}

searcher: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning searcher: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the searcher service.
Docker Compose: Consider increasing cpus: of the searcher container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_searcher_provisioning_container_cpu_usage_long_term"
]

_{Managed by the Sourcegraph Search team.}

searcher: provisioning_container_memory_usage_long_term

container memory usage (1d maximum) by instance

Descriptions

warning searcher: 80%+ container memory usage (1d maximum) by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing memory limits in the Deployment.yaml for the searcher service.
Docker Compose: Consider increasing memory: of the searcher container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_searcher_provisioning_container_memory_usage_long_term"
]

_{Managed by the Sourcegraph Search team.}

searcher: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning searcher: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the searcher container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_searcher_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Search team.}

searcher: provisioning_container_memory_usage_short_term

container memory usage (5m maximum) by instance

Descriptions

warning searcher: 90%+ container memory usage (5m maximum) by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the searcher container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_searcher_provisioning_container_memory_usage_short_term"
]

_{Managed by the Sourcegraph Search team.}

searcher: go_goroutines

maximum active goroutines

Descriptions

warning searcher: 10000+ maximum active goroutines for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_searcher_go_goroutines"
]

_{Managed by the Sourcegraph Search team.}

searcher: go_gc_duration_seconds

maximum go garbage collection duration

Descriptions

warning searcher: 2s+ maximum go garbage collection duration

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_searcher_go_gc_duration_seconds"
]

_{Managed by the Sourcegraph Search team.}

searcher: pods_available_percentage

percentage pods available

Descriptions

critical searcher: less than 90% percentage pods available for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_searcher_pods_available_percentage"
]

_{Managed by the Sourcegraph Search team.}

symbols: store_fetch_failures

store fetch failures every 5m

Descriptions

warning symbols: 5+ store fetch failures every 5m

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_symbols_store_fetch_failures"
]

_{Managed by the Sourcegraph Code-intelligence team.}

symbols: current_fetch_queue_size

current fetch queue size

Descriptions

warning symbols: 25+ current fetch queue size

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_symbols_current_fetch_queue_size"
]

_{Managed by the Sourcegraph Code-intelligence team.}

symbols: frontend_internal_api_error_responses

frontend-internal API error responses every 5m by route

Descriptions

warning symbols: 2%+ frontend-internal API error responses every 5m by route for 5m0s

Possible solutions

Single-container deployments: Check docker logs $CONTAINER_ID for logs starting with symbols that indicate requests to the frontend service are failing.
Kubernetes:
- Confirm that kubectl get pods shows the frontend pods are healthy.
- Check kubectl logs symbols for logs indicating request failures to frontend or frontend-internal.
Docker Compose:
- Confirm that docker ps shows the frontend-internal container is healthy.
- Check docker logs symbols for logs indicating request failures to frontend or frontend-internal.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_symbols_frontend_internal_api_error_responses"
]

_{Managed by the Sourcegraph Code-intelligence team.}

symbols: container_cpu_usage

container cpu usage total (1m average) across all cores by instance

Descriptions

warning symbols: 99%+ container cpu usage total (1m average) across all cores by instance

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the symbols container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_symbols_container_cpu_usage"
]

_{Managed by the Sourcegraph Code-intelligence team.}

symbols: container_memory_usage

container memory usage by instance

Descriptions

warning symbols: 99%+ container memory usage by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the symbols container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_symbols_container_memory_usage"
]

_{Managed by the Sourcegraph Code-intelligence team.}

symbols: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning symbols: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the symbols service.
Docker Compose: Consider increasing cpus: of the symbols container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_symbols_provisioning_container_cpu_usage_long_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

symbols: provisioning_container_memory_usage_long_term

container memory usage (1d maximum) by instance

Descriptions

warning symbols: 80%+ container memory usage (1d maximum) by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing memory limits in the Deployment.yaml for the symbols service.
Docker Compose: Consider increasing memory: of the symbols container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_symbols_provisioning_container_memory_usage_long_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

symbols: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning symbols: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the symbols container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_symbols_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

symbols: provisioning_container_memory_usage_short_term

container memory usage (5m maximum) by instance

Descriptions

warning symbols: 90%+ container memory usage (5m maximum) by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the symbols container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_symbols_provisioning_container_memory_usage_short_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

symbols: go_goroutines

maximum active goroutines

Descriptions

warning symbols: 10000+ maximum active goroutines for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_symbols_go_goroutines"
]

_{Managed by the Sourcegraph Code-intelligence team.}

symbols: go_gc_duration_seconds

maximum go garbage collection duration

Descriptions

warning symbols: 2s+ maximum go garbage collection duration

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_symbols_go_gc_duration_seconds"
]

_{Managed by the Sourcegraph Code-intelligence team.}

symbols: pods_available_percentage

percentage pods available

Descriptions

critical symbols: less than 90% percentage pods available for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_symbols_pods_available_percentage"
]

_{Managed by the Sourcegraph Code-intelligence team.}

syntect-server: container_cpu_usage

container cpu usage total (1m average) across all cores by instance

Descriptions

warning syntect-server: 99%+ container cpu usage total (1m average) across all cores by instance

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the syntect-server container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_syntect-server_container_cpu_usage"
]

_{Managed by the Sourcegraph Core application team.}

syntect-server: container_memory_usage

container memory usage by instance

Descriptions

warning syntect-server: 99%+ container memory usage by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the syntect-server container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_syntect-server_container_memory_usage"
]

_{Managed by the Sourcegraph Core application team.}

syntect-server: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning syntect-server: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the syntect-server service.
Docker Compose: Consider increasing cpus: of the syntect-server container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_syntect-server_provisioning_container_cpu_usage_long_term"
]

_{Managed by the Sourcegraph Core application team.}

syntect-server: provisioning_container_memory_usage_long_term

container memory usage (1d maximum) by instance

Descriptions

warning syntect-server: 80%+ container memory usage (1d maximum) by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing memory limits in the Deployment.yaml for the syntect-server service.
Docker Compose: Consider increasing memory: of the syntect-server container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_syntect-server_provisioning_container_memory_usage_long_term"
]

_{Managed by the Sourcegraph Core application team.}

syntect-server: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning syntect-server: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the syntect-server container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_syntect-server_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Core application team.}

syntect-server: provisioning_container_memory_usage_short_term

container memory usage (5m maximum) by instance

Descriptions

warning syntect-server: 90%+ container memory usage (5m maximum) by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the syntect-server container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_syntect-server_provisioning_container_memory_usage_short_term"
]

_{Managed by the Sourcegraph Core application team.}

syntect-server: pods_available_percentage

percentage pods available

Descriptions

critical syntect-server: less than 90% percentage pods available for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_syntect-server_pods_available_percentage"
]

_{Managed by the Sourcegraph Core application team.}

zoekt-indexserver: average_resolve_revision_duration

average resolve revision duration over 5m

Descriptions

warning zoekt-indexserver: 15s+ average resolve revision duration over 5m
critical zoekt-indexserver: 30s+ average resolve revision duration over 5m

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_zoekt-indexserver_average_resolve_revision_duration",
  "critical_zoekt-indexserver_average_resolve_revision_duration"
]

_{Managed by the Sourcegraph Search team.}
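The silence identifiers in this document appear to follow the pattern level, service, and alert name joined by underscores. A minimal sketch under that assumption is shown below. The helper silence_entries is hypothetical and not part of Sourcegraph. It builds the entries for an alert that has both a warning and a critical level.

```python
import json


def silence_entries(service, alert, levels=("warning", "critical")):
    """Build observability.silenceAlerts entries for one alert.

    Assumes the "<level>_<service>_<alert>" naming seen in this document.
    """
    return [f"{level}_{service}_{alert}" for level in levels]


if __name__ == "__main__":
    entries = silence_entries(
        "zoekt-indexserver", "average_resolve_revision_duration"
    )
    # Prints a JSON fragment that can be merged into the site configuration.
    print(json.dumps({"observability.silenceAlerts": entries}, indent=2))
```

Pass levels=("critical",) for alerts that only define a critical level, such as pods_available_percentage.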

zoekt-indexserver: container_cpu_usage

container cpu usage total (1m average) across all cores by instance

Descriptions

warning zoekt-indexserver: 99%+ container cpu usage total (1m average) across all cores by instance

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the zoekt-indexserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_zoekt-indexserver_container_cpu_usage"
]

_{Managed by the Sourcegraph Search team.}

zoekt-indexserver: container_memory_usage

container memory usage by instance

Descriptions

warning zoekt-indexserver: 99%+ container memory usage by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the zoekt-indexserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_zoekt-indexserver_container_memory_usage"
]

_{Managed by the Sourcegraph Search team.}

zoekt-indexserver: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning zoekt-indexserver: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the zoekt-indexserver service.
Docker Compose: Consider increasing cpus: of the zoekt-indexserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_zoekt-indexserver_provisioning_container_cpu_usage_long_term"
]

_{Managed by the Sourcegraph Search team.}
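The "for 336h0m0s" suffix in the description above reads like a Go-style duration, and it presumably means the condition must hold for that long before the alert fires. The sketch below converts such strings into a Python timedelta, which shows that 336 hours is 14 days. The function name is hypothetical.

```python
import re
from datetime import timedelta

# Maps the unit suffixes used in this document to timedelta keyword names.
_UNITS = {"h": "hours", "m": "minutes", "s": "seconds"}


def parse_alert_duration(text):
    """Parse a duration such as "336h0m0s" or "30m0s" into a timedelta.

    Only whole-number h/m/s components are handled, which covers the
    durations printed in this document.
    """
    if not re.fullmatch(r"(\d+[hms])+", text):
        raise ValueError(f"unsupported duration: {text!r}")
    total = timedelta()
    for value, unit in re.findall(r"(\d+)([hms])", text):
        total += timedelta(**{_UNITS[unit]: int(value)})
    return total


if __name__ == "__main__":
    print(parse_alert_duration("336h0m0s").days)  # 14
```

The same helper handles the shorter windows used elsewhere, such as "30m0s" and "10m0s".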

zoekt-indexserver: provisioning_container_memory_usage_long_term

container memory usage (1d maximum) by instance

Descriptions

warning zoekt-indexserver: 80%+ container memory usage (1d maximum) by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing memory limits in the Deployment.yaml for the zoekt-indexserver service.
Docker Compose: Consider increasing memory: of the zoekt-indexserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_zoekt-indexserver_provisioning_container_memory_usage_long_term"
]

_{Managed by the Sourcegraph Search team.}

zoekt-indexserver: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning zoekt-indexserver: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the zoekt-indexserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_zoekt-indexserver_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Search team.}

zoekt-indexserver: provisioning_container_memory_usage_short_term

container memory usage (5m maximum) by instance

Descriptions

warning zoekt-indexserver: 90%+ container memory usage (5m maximum) by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the zoekt-indexserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_zoekt-indexserver_provisioning_container_memory_usage_short_term"
]

_{Managed by the Sourcegraph Search team.}

zoekt-indexserver: pods_available_percentage

percentage pods available

Descriptions

critical zoekt-indexserver: less than 90% percentage pods available for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_zoekt-indexserver_pods_available_percentage"
]

_{Managed by the Sourcegraph Search team.}

zoekt-webserver: indexed_search_request_errors

indexed search request errors every 5m by code

Descriptions

warning zoekt-webserver: 5%+ indexed search request errors every 5m by code for 5m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_zoekt-webserver_indexed_search_request_errors"
]

_{Managed by the Sourcegraph Search team.}

zoekt-webserver: container_cpu_usage

container cpu usage total (1m average) across all cores by instance

Descriptions

warning zoekt-webserver: 99%+ container cpu usage total (1m average) across all cores by instance

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the zoekt-webserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_zoekt-webserver_container_cpu_usage"
]

_{Managed by the Sourcegraph Search team.}

zoekt-webserver: container_memory_usage

container memory usage by instance

Descriptions

warning zoekt-webserver: 99%+ container memory usage by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the zoekt-webserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_zoekt-webserver_container_memory_usage"
]

_{Managed by the Sourcegraph Search team.}

zoekt-webserver: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning zoekt-webserver: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the zoekt-webserver service.
Docker Compose: Consider increasing cpus: of the zoekt-webserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_zoekt-webserver_provisioning_container_cpu_usage_long_term"
]

_{Managed by the Sourcegraph Search team.}

zoekt-webserver: provisioning_container_memory_usage_long_term

container memory usage (1d maximum) by instance

Descriptions

warning zoekt-webserver: 80%+ container memory usage (1d maximum) by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing memory limits in the Deployment.yaml for the zoekt-webserver service.
Docker Compose: Consider increasing memory: of the zoekt-webserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_zoekt-webserver_provisioning_container_memory_usage_long_term"
]

_{Managed by the Sourcegraph Search team.}

zoekt-webserver: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning zoekt-webserver: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the zoekt-webserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_zoekt-webserver_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Search team.}

zoekt-webserver: provisioning_container_memory_usage_short_term

container memory usage (5m maximum) by instance

Descriptions

warning zoekt-webserver: 90%+ container memory usage (5m maximum) by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the zoekt-webserver container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_zoekt-webserver_provisioning_container_memory_usage_short_term"
]

_{Managed by the Sourcegraph Search team.}

prometheus: prometheus_rule_eval_duration

average prometheus rule group evaluation duration over 10m by rule group

Descriptions

warning prometheus: 30s+ average prometheus rule group evaluation duration over 10m by rule group

Possible solutions

Check the Container monitoring (not available on server) panels and try increasing resources for Prometheus if necessary.
If the rule group taking a long time to evaluate belongs to /sg_prometheus_addons, try reducing the complexity of any custom Prometheus rules provided.
If the rule group taking a long time to evaluate belongs to /sg_config_prometheus, please open an issue.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_prometheus_rule_eval_duration"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: prometheus_rule_eval_failures

failed prometheus rule evaluations over 5m by rule group

Descriptions

warning prometheus: 0+ failed prometheus rule evaluations over 5m by rule group

Possible solutions

Check Prometheus logs for messages related to rule group evaluation (generally with log field component="rule manager").
If the rule group failing to evaluate belongs to /sg_prometheus_addons, ensure any custom Prometheus configuration provided is valid.
If the rule group failing to evaluate belongs to /sg_config_prometheus, please open an issue.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_prometheus_rule_eval_failures"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: alertmanager_notification_latency

alertmanager notification latency over 1m by integration

Descriptions

warning prometheus: 1s+ alertmanager notification latency over 1m by integration

Possible solutions

Check the Container monitoring (not available on server) panels and try increasing resources for Prometheus if necessary.
Ensure that your observability.alerts configuration (in site configuration) is valid.
Check if the relevant alert integration service is experiencing downtime or issues.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_alertmanager_notification_latency"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: alertmanager_notification_failures

failed alertmanager notifications over 1m by integration

Descriptions

warning prometheus: 0+ failed alertmanager notifications over 1m by integration

Possible solutions

Ensure that your observability.alerts configuration (in site configuration) is valid.
Check if the relevant alert integration service is experiencing downtime or issues.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_alertmanager_notification_failures"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: prometheus_config_status

prometheus configuration reload status

Descriptions

warning prometheus: less than 1 prometheus configuration reload status

Possible solutions

Check Prometheus logs for messages related to configuration loading.
Ensure any custom configuration you have provided to Prometheus is valid.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_prometheus_config_status"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: alertmanager_config_status

alertmanager configuration reload status

Descriptions

warning prometheus: less than 1 alertmanager configuration reload status

Possible solutions

Ensure that your observability.alerts configuration (in site configuration) is valid.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_alertmanager_config_status"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: prometheus_tsdb_op_failure

prometheus tsdb failures by operation over 1m by operation

Descriptions

warning prometheus: 0+ prometheus tsdb failures by operation over 1m by operation

Possible solutions

Check Prometheus logs for messages related to the failing operation.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_prometheus_tsdb_op_failure"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: prometheus_target_sample_exceeded

prometheus scrapes that exceed the sample limit over 10m

Descriptions

warning prometheus: 0+ prometheus scrapes that exceed the sample limit over 10m

Possible solutions

Check Prometheus logs for messages related to target scrape failures.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_prometheus_target_sample_exceeded"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: prometheus_target_sample_duplicate

prometheus scrapes rejected due to duplicate timestamps over 10m

Descriptions

warning prometheus: 0+ prometheus scrapes rejected due to duplicate timestamps over 10m

Possible solutions

Check Prometheus logs for messages related to target scrape failures.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_prometheus_target_sample_duplicate"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: container_cpu_usage

container cpu usage total (1m average) across all cores by instance

Descriptions

warning prometheus: 99%+ container cpu usage total (1m average) across all cores by instance

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the prometheus container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_container_cpu_usage"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: container_memory_usage

container memory usage by instance

Descriptions

warning prometheus: 99%+ container memory usage by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the prometheus container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_container_memory_usage"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning prometheus: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the prometheus service.
Docker Compose: Consider increasing cpus: of the prometheus container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_provisioning_container_cpu_usage_long_term"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: provisioning_container_memory_usage_long_term

container memory usage (1d maximum) by instance

Descriptions

warning prometheus: 80%+ container memory usage (1d maximum) by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing memory limits in the Deployment.yaml for the prometheus service.
Docker Compose: Consider increasing memory: of the prometheus container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_provisioning_container_memory_usage_long_term"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning prometheus: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the prometheus container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: provisioning_container_memory_usage_short_term

container memory usage (5m maximum) by instance

Descriptions

warning prometheus: 90%+ container memory usage (5m maximum) by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the prometheus container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_prometheus_provisioning_container_memory_usage_short_term"
]

_{Managed by the Sourcegraph Distribution team.}

prometheus: pods_available_percentage

percentage pods available

Descriptions

critical prometheus: less than 90% percentage pods available for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_prometheus_pods_available_percentage"
]

_{Managed by the Sourcegraph Distribution team.}

executor: container_cpu_usage

container cpu usage total (1m average) across all cores by instance

Descriptions

warning executor: 99%+ container cpu usage total (1m average) across all cores by instance

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the (executor|sourcegraph-code-intel-indexers|executor-batches) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_executor_container_cpu_usage"
]

_{Managed by the Sourcegraph Code-intelligence team.}

executor: container_memory_usage

container memory usage by instance

Descriptions

warning executor: 99%+ container memory usage by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the (executor|sourcegraph-code-intel-indexers|executor-batches) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_executor_container_memory_usage"
]

_{Managed by the Sourcegraph Code-intelligence team.}

executor: provisioning_container_cpu_usage_long_term

container cpu usage total (90th percentile over 1d) across all cores by instance

Descriptions

warning executor: 80%+ container cpu usage total (90th percentile over 1d) across all cores by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the Deployment.yaml for the (executor|sourcegraph-code-intel-indexers|executor-batches) service.
Docker Compose: Consider increasing cpus: of the (executor|sourcegraph-code-intel-indexers|executor-batches) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_executor_provisioning_container_cpu_usage_long_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

executor: provisioning_container_memory_usage_long_term

container memory usage (1d maximum) by instance

Descriptions

warning executor: 80%+ container memory usage (1d maximum) by instance for 336h0m0s

Possible solutions

Kubernetes: Consider increasing memory limits in the Deployment.yaml for the (executor|sourcegraph-code-intel-indexers|executor-batches) service.
Docker Compose: Consider increasing memory: of the (executor|sourcegraph-code-intel-indexers|executor-batches) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_executor_provisioning_container_memory_usage_long_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

executor: provisioning_container_cpu_usage_short_term

container cpu usage total (5m maximum) across all cores by instance

Descriptions

warning executor: 90%+ container cpu usage total (5m maximum) across all cores by instance for 30m0s

Possible solutions

Kubernetes: Consider increasing CPU limits in the relevant Deployment.yaml.
Docker Compose: Consider increasing cpus: of the (executor|sourcegraph-code-intel-indexers|executor-batches) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_executor_provisioning_container_cpu_usage_short_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

executor: provisioning_container_memory_usage_short_term

container memory usage (5m maximum) by instance

Descriptions

warning executor: 90%+ container memory usage (5m maximum) by instance

Possible solutions

Kubernetes: Consider increasing the memory limit in the relevant Deployment.yaml.
Docker Compose: Consider increasing memory: of the (executor|sourcegraph-code-intel-indexers|executor-batches) container in docker-compose.yml.
Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_executor_provisioning_container_memory_usage_short_term"
]

_{Managed by the Sourcegraph Code-intelligence team.}

executor: go_goroutines

maximum active goroutines

Descriptions

warning executor: 10000+ maximum active goroutines for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_executor_go_goroutines"
]

_{Managed by the Sourcegraph Code-intelligence team.}

executor: go_gc_duration_seconds

maximum go garbage collection duration

Descriptions

warning executor: 2s+ maximum go garbage collection duration

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "warning_executor_go_gc_duration_seconds"
]

_{Managed by the Sourcegraph Code-intelligence team.}

executor: pods_available_percentage

percentage pods available

Descriptions

critical executor: less than 90% percentage pods available for 10m0s

Possible solutions

Silence this alert: If you are aware of this alert and want to silence notifications for it, add the following to your site configuration and set a reminder to re-evaluate the alert:

"observability.silenceAlerts": [
  "critical_executor_pods_available_percentage"
]

_{Managed by the Sourcegraph Code-intelligence team.}
