This repository has been archived by the owner on Jan 15, 2025. It is now read-only.

2.43.0 updates (#1271)
* release setup

* PMM-13133

* improved wording

* Update query-analytics.md

* Download server diagnostics endpoint

* Update troubleshoot.md

* PMM-13054

* formatting

* typo

* Update docs/get-started/query-analytics.md

Co-authored-by: Alex Demidoff <[email protected]>

* Update docs/get-started/query-analytics.md

Co-authored-by: Alex Demidoff <[email protected]>

* transferred RelNotes

* formatting

* resized image

* feedback from Steve

* Update docs/release-notes/2.43.0.md

Co-authored-by: Steve Hoffman <[email protected]>

* feedback from Steve

* Update docs/release-notes/2.43.0.md

Co-authored-by: Roman Novikov <[email protected]>

* release date

* Update 2.43.0.md

Co-authored-by: Alex Demidoff <[email protected]>

* feedback from Roma

* IMPROVED  Monitoring PBM SECTION

* formatting

* formatting

* formatting

* formatting

* typo

* Update docs/release-notes/2.43.0.md

Co-authored-by: Michael Okoko <[email protected]>

* updated release date

* updated link text

* Update 2.43.0.md

* PMM-13243

* PMM-13327 - New MongoDB Router Summary (#1306)

* initial mention of the new Router Summary in the 2.43.0 RN

* initial commit new mongodb router summary doc

* catalina's feedback

Co-authored-by: Catalina A <[email protected]>

* catalina's feedback 2

Co-authored-by: Catalina A <[email protected]>

* catalina's feedback 3

Co-authored-by: Catalina A <[email protected]>

* catalina's feedback 4

Co-authored-by: Catalina A <[email protected]>

* catalina's feedback 5

Co-authored-by: Catalina A <[email protected]>

* catalina's feedback 6

Co-authored-by: Catalina A <[email protected]>

* catalina's feedback 7

Co-authored-by: Catalina A <[email protected]>

* catalina's feedback 8

Co-authored-by: Catalina A <[email protected]>

* catalina's feedback 9

Co-authored-by: Catalina A <[email protected]>

* catalina's feedback 10

Co-authored-by: Catalina A <[email protected]>

* minor edits

* duplicate row after applying suggestions

* link to dashboard doc

---------

Co-authored-by: Catalina A <[email protected]>

* PMM-13244

* GAed and redesigned MongoDB dashboards  (#1305)

* draft

* draft

* added descriptions for replset summary

* draft

* added content and updated dashboard status

* linked new topics in ToC

* feedback from Santo

* Update docs/details/dashboards/dashboard-sharded-cluster-summary.md

Co-authored-by: Santo <[email protected]>

* feedback from Santo

* added description for chunks distribution

---------

Co-authored-by: Santo <[email protected]>

* added reference to roma's blogpost

* New MongoDB dashboards - attempt to fix multiple issues on ` 2.43.0_updates` (#1307)

* remove redundant element

* mongodb sharded cluster (old and new)

* replicaset summary (old and new)

* collection overview - fix broken link / name incongruences

* lower case - or link is broken

* lower case - or link is broken

* lower case - or link is broken

* mongodb instances overview was missing - but the link was there

* order consistency - menu / table

* attempt to fix things

* add experimental in the router summary name

* remove unused experimental oplog link

there is one for non-experimental

* fix Toclinks

* fix ToC links

* fix links

* added known issue

* PMM-13348

* Update docs/release-notes/2.43.0.md

Co-authored-by: Santo <[email protected]>

* feedback from Nurlan

* feedback from Santo

* feedback from Ivan

* removed known issues

* PMM-13141

* format

* PMM-13362 - feedback from Santo

* feedback from Nurlan

* PMM-13141

---------

Co-authored-by: Alex Demidoff <[email protected]>
Co-authored-by: Steve Hoffman <[email protected]>
Co-authored-by: Roman Novikov <[email protected]>
Co-authored-by: Michael Okoko <[email protected]>
Co-authored-by: Santo <[email protected]>
6 people authored Sep 19, 2024
1 parent 0db9283 commit 34e1a7a
Showing 25 changed files with 926 additions and 236 deletions.
Binary file added docs/_images/MongoDB_ReplSetSummary.png
Binary file added docs/_images/MongoDB_Sharded_Cluster_Summary.png
Binary file added docs/_images/Mongodb_Collections_Overview.png
Binary file added docs/_images/Mongodb_Oplog.png
Binary file added docs/_images/PMM_MongoDB_Router_Summary.png
Binary file added docs/_images/alert_flow - Copy.png
Binary file added docs/_images/new_Mongo_menu.png
291 changes: 143 additions & 148 deletions docs/details/commands/pmm-admin.md

Large diffs are not rendered by default.

@@ -1,4 +1,8 @@
# MongoDB Cluster Summary
# MongoDB Cluster Summary (OLD)

??? info "Dashboard update notice"
    A [new version of the MongoDB Sharded Cluster Summary dashboard](../../details/dashboards/dashboard-sharded-cluster-summary.md) is available.
    This older version will be deprecated and removed from PMM in the near future. We encourage you to start using the new dashboard to benefit from its enhanced monitoring capabilities.

![!image](../../_images/PMM_MongoDB_Cluster_Summary.jpg)


This file was deleted.

11 changes: 0 additions & 11 deletions docs/details/dashboards/dashboard-mongodb-experimental_oplog.md

This file was deleted.

@@ -1,4 +1,7 @@
# MongoDB ReplSet Summary
# MongoDB ReplSet Summary (OLD)

??? info "Dashboard update notice"
    A [new version of the MongoDB ReplSet Summary dashboard](../../details/dashboards/dashboard-replsetsummary.md) is available. This older version will be deprecated and removed from PMM in the near future. We encourage you to start using the new dashboard to benefit from its enhanced monitoring capabilities.

![!image](../../_images/PMM_MongoDB_ReplSet_Summary.jpg)

62 changes: 62 additions & 0 deletions docs/details/dashboards/dashboard-mongodb-router-summary.md
@@ -0,0 +1,62 @@
# Experimental MongoDB Router Summary

This dashboard is available starting from PMM 2.43 and is specifically designed for monitoring MongoS (router) nodes in sharded MongoDB clusters.

![!image](../../_images/PMM_MongoDB_Router_Summary.png)

## Overview
For each MongoS in the cluster, this section includes main monitoring metrics like CPU, memory and disk usage. Uptime and MongoS version are reported as well.

### CPU Usage
Shows CPU usage as a percentage from 0% to 100%. It updates every minute, turning from green to red when usage exceeds 80%. This helps quickly spot high CPU load, which could affect system performance, and monitor how hard the CPU is working at a glance.
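
The 80% threshold described above can be sketched as a small function. This is only an illustration of the color rule, not the dashboard's actual panel configuration; the function name `gauge_color` and the `red_above` parameter are hypothetical.

```python
def gauge_color(usage_percent: float, red_above: float = 80.0) -> str:
    """Return the gauge color for a usage value between 0 and 100.

    Hypothetical helper: mirrors the documented rule that the gauge
    turns red only when usage exceeds the threshold (80% by default).
    """
    if not 0.0 <= usage_percent <= 100.0:
        raise ValueError("usage_percent must be between 0 and 100")
    return "red" if usage_percent > red_above else "green"


print(gauge_color(42.5))  # green
print(gauge_color(80.0))  # green (exactly 80% does not exceed the threshold)
print(gauge_color(93.0))  # red
```

The same rule applies to the other gauges in this section that turn red above 80%.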

### Memory Used
Displays the percentage of total system memory currently in use. It updates regularly, showing green up to 80% of usage and red beyond that threshold.

Use this as a quick visual indicator of memory consumption and of how much memory remains available before the system starts swapping. It's an easy way to assess how close the system is to its memory limits.

### Disk IO Utilization
Shows how busy the disk is handling read/write requests. The meter turns red above 80%, warning of potential slowdowns. It updates regularly, giving administrators a quick way to check if the disk is keeping up with demand or if it's becoming a bottleneck in system performance.

### Disk Space Utilization
Shows how much of the total disk space is currently in use. The meter turns red when usage exceeds 80%, warning of low free space. It updates regularly, giving you a quick way to check if the disk is nearing capacity. This helps prevent "disk full" errors that could disrupt services or system operation.

### Disk IOPS
Shows how many read and write operations the disk performs each second. The blue color helps spot spikes in disk activity. These spikes could mean the disk is struggling to keep up, which might slow down the system. It's a quick way for you to check if the disk is working too hard.

### Network Traffic
Combines both incoming (received) and outgoing (transmitted) data, excluding local traffic. It gives you a quick view of overall network activity, helping spot unusual spikes or drops in data flow that might affect system performance.

### Uptime
Shows how long the system has been running without a restart. As uptime increases, the color changes from red to orange to green, giving a quick visual indicator of system stability. Red indicates very recent restarts (less than 5 minutes), orange shows short uptimes (5 minutes to 1 hour), and green represents longer uptimes (over 1 hour). This helps you easily spot recent system restarts or confirm continuous operation.
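
As a rough sketch, the uptime color bands above map to the following logic. The function name is hypothetical, and the handling of the exact boundaries (5 minutes counts as orange, exactly 1 hour still counts as orange) is an assumption based on the wording "less than 5 minutes" and "over 1 hour".

```python
from datetime import timedelta


def uptime_color(uptime: timedelta) -> str:
    """Classify an uptime value into the documented color bands.

    Hypothetical helper; boundary handling is an assumption.
    """
    if uptime < timedelta(minutes=5):
        return "red"      # very recent restart
    if uptime <= timedelta(hours=1):
        return "orange"   # short uptime: 5 minutes to 1 hour
    return "green"        # longer uptime: over 1 hour


print(uptime_color(timedelta(minutes=2)))   # red
print(uptime_color(timedelta(minutes=30)))  # orange
print(uptime_color(timedelta(days=12)))     # green
```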

### Version
Displays the current version of MongoDB running on the system. This information is crucial for ensuring the system is running the intended version and for quickly identifying any nodes that might need updates.

## Node States
Shows the status of all MongoDB router (MongoS) nodes in the selected cluster over time. It uses a color-coded timeline: green bars mean a node is "UP" and working, while red bars show it's "DOWN" or unreachable. This simple view helps you quickly spot which nodes are active, see any recent status changes, and identify patterns in node availability.

## Details
This section includes additional information like "Command Operations", "Connections", "Query execution times" and "Query efficiency".

### Command Operations
Shows MongoDB command operations over time, displaying rates for inserts, updates, deletes, queries, and TTL deletions per second.

Use this to monitor overall database workload, compare operation types, spot peak usage and unusual patterns, assess replication activity, and track automatic data cleanup.

### Connections
Displays MongoDB connection metrics over time, showing both current and available connections. Use this to monitor connection usage trends, identify periods of high demand, and ensure the database isn't reaching its connection limits.

By comparing current to available connections, it's easy to spot potential bottlenecks or capacity issues before they impact performance.
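
One way to make that comparison concrete is to express current connections as a share of the total. The sketch below assumes that "available" counts the unused connections still remaining, so the limit is the sum of both values; the function name is hypothetical.

```python
def connection_utilization(current: int, available: int) -> float:
    """Return current connections as a percentage of the connection limit.

    Assumes `available` is the number of unused connections remaining,
    so the effective limit is current + available.
    """
    if current < 0 or available < 0:
        raise ValueError("connection counts must be non-negative")
    total = current + available
    if total == 0:
        return 0.0
    return 100.0 * current / total


# 200 connections in use with 800 still available is 20% of the limit.
print(connection_utilization(200, 800))  # 20.0
```

A value that climbs steadily toward 100% suggests the instance is approaching its connection limit.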

### Query Execution Times
Shows the average execution times for MongoDB queries over time, categorized into read, write, and other command operations.

Use this to identify slow queries, performance bottlenecks, and unusual spikes in execution times. Comparing latencies across operation types can also guide decisions on indexing strategies and query optimizations.

### Query Efficiency
Visualizes MongoDB query efficiency over time, displaying the ratio of scanned documents or index entries to returned documents, along with operation latencies.

A ratio near 1 indicates highly efficient queries, while higher values (e.g., 100) suggest inefficiency.

Compare document scans, index scans, and operation latencies to quickly identify poorly performing queries, and ensure that queries execute as efficiently as possible.
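
The ratio itself is simple arithmetic. The following sketch shows how it could be computed from two counters; the function name and the treatment of the zero-returned case are assumptions for illustration, not the dashboard's actual query.

```python
def scan_ratio(scanned: int, returned: int) -> float:
    """Return scanned documents (or index entries) per returned document.

    Hypothetical helper. When nothing is returned, the ratio is reported
    as infinity if anything was scanned and 0.0 otherwise (an assumption).
    """
    if scanned < 0 or returned < 0:
        raise ValueError("counters must be non-negative")
    if returned == 0:
        return float("inf") if scanned > 0 else 0.0
    return scanned / returned


print(scan_ratio(500, 500))     # 1.0   -> every scanned document was returned
print(scan_ratio(10_000, 100))  # 100.0 -> 100 documents scanned per result
```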
11 changes: 11 additions & 0 deletions docs/details/dashboards/dashboard-mongodb_collection_overview.md
@@ -0,0 +1,11 @@
# MongoDB Collections Overview

This real-time dashboard contains panels showing data about the hottest collections in the MongoDB database.

The Instance level includes two panels: **Hottest Collections by Read (Total)** and **Hottest Collections by Write (Total)**.

The next panel displays data at the **Database Level**, where you can view MongoDB metrics such as **Commands**, **Inserts**, **Updates**, **Removes**, and **Getmore**.

The last panel shows the number of operations in the chosen database.

![!image](../../_images/Mongodb_Collections_Overview.png)
5 changes: 5 additions & 0 deletions docs/details/dashboards/dashboard-mongodb_oplog.md
@@ -0,0 +1,5 @@
# MongoDB Oplog Details

This real-time dashboard contains Oplog details such as Recovery Window, Processing Time, Buffer Capacity, and Oplog Operations.

![!image](../../_images/Mongodb_Oplog.png)
167 changes: 167 additions & 0 deletions docs/details/dashboards/dashboard-replsetsummary.md
@@ -0,0 +1,167 @@

# MongoDB ReplSet Summary

The MongoDB ReplSet Summary dashboard offers a comprehensive view of your MongoDB replica set's health and performance. It provides clear insights for both simple and complex, multi-environment setups.

The dashboard displays key metrics for individual nodes and the entire replica set, allowing you to quickly spot issues and maintain optimal database performance. With focused information and effective visualizations, it helps you identify and resolve potential problems efficiently, making it easier to manage MongoDB deployments of any size.

![MongoDB ReplSet Summary](../../_images/MongoDB_ReplSetSummary.png)

## Overview

The overview section displays essential data for individual nodes, such as their role, CPU usage, memory consumption, disk space, network traffic, uptime, and the current MongoDB version.

### State

Displays the current state of a MongoDB replica set member. It shows a single value representing the node's role, such as PRIMARY, SECONDARY, or ARBITER. The state is color-coded for quick visual identification. This information is crucial for understanding the current role and health of each node in your MongoDB replica set.

### CPU Usage

Displays the current CPU usage percentage for the selected MongoDB service. It shows how much of the CPU's capacity is being used, with a range from 0% to 100%.

The gauge is color-coded, turning red when usage exceeds 80%, helping you quickly identify high CPU load situations. This metric is crucial for monitoring the performance and resource utilization of your MongoDB instance, allowing you to spot potential bottlenecks or overloaded servers at a glance.

### Memory Used

Shows an estimate of how much memory can be used without causing swapping on the MongoDB server. It displays the percentage of memory currently in use, with 100% indicating that all available memory is used and swapping may occur. The gauge turns red above 80% usage, signaling that free memory is running low. This metric is crucial for predicting potential performance issues due to memory constraints, helping you proactively manage your MongoDB instance's memory resources to avoid swapping and maintain optimal performance.

### Disk I/O Utilization

Displays disk utilization as a percentage, showing how often there was at least one I/O request active for the MongoDB server. Ranging from 0% to 100%, it helps determine if disk load is evenly distributed or if I/O is bottlenecked. Higher values suggest more intense, potentially queued disk activity. The gauge turns red above 80%, indicating possible I/O constraints. Use this metric alongside I/O latency and queue depth to assess overall storage performance and identify potential disk-related issues affecting your MongoDB instance's responsiveness.

### Disk Space Utilization

Shows the percentage of used disk space for the MongoDB server's data storage. It ranges from 0% to 100%, with higher values indicating less free space. The gauge turns red above 80% usage, warning of potential disk space issues. This metric is crucial for preventing *Disk full* errors that could disrupt services or crash the system. When free space approaches zero, consider removing unused files or expanding storage capacity to ensure smooth MongoDB operation and prevent data-related incidents.

### Disk IOPS

This stat panel displays the current rate of disk Input/Output Operations Per Second (IOPS) for the MongoDB server, showing separate values for read and write operations. It provides a real-time view of the physical I/O load on the storage system. The panel uses an area graph to visualize recent trends. High IOPS values or sudden spikes can indicate potential performance issues due to I/O subsystem overload. Monitor this metric to identify periods of intense disk activity and potential storage bottlenecks that could affect MongoDB's performance.

### Network Traffic

Displays the current network traffic for the MongoDB server, showing separate values for inbound and outbound data transfer rates in bytes per second. It uses an area graph to visualize recent trends in network activity. The panel provides a real-time view of data movement across the network, helping you monitor the MongoDB server's network load.

High values or sudden spikes can indicate increased database activity, potential performance bottlenecks, or unusual network patterns. Use this metric to assess network utilization and identify periods of intense data transfer that might affect MongoDB's performance or user experience.

### Uptime

Displays the current uptime of the MongoDB server, showing how long it has been running without a shutdown or restart. The value is presented in seconds and uses color-coding for quick status assessment: red for very recent starts, orange for short uptimes, and green for longer periods.

This metric is useful for tracking system stability, identifying recent restarts, and monitoring continuous operation time. Long uptimes generally indicate stable operation, while short uptimes might suggest recent maintenance or unexpected restarts that could warrant investigation.

### Version

Shows the current version of MongoDB running on the selected node in the replica set.

This information is crucial for ensuring consistency across your MongoDB deployment, tracking upgrade status, and identifying potential version-related issues or compatibility concerns. Regular checks of this panel can help maintain a uniform MongoDB version across your infrastructure and assist in planning future upgrades or troubleshooting version-specific problems.

## States

### Node States

Visualizes the status changes of each node in the MongoDB replica set over the selected time range.

The timeline format allows you to easily track state transitions, identify periods of instability, and understand the roles of different nodes throughout the monitored period. This visualization is crucial for monitoring replica set health, detecting failovers or reconfigurations, and ensuring the overall stability of your MongoDB cluster. Use this panel to quickly spot any unusual patterns or frequent state changes that might require further investigation.

For more details on replica set states, see the [MongoDB documentation](https://www.mongodb.com/docs/manual/reference/replica-states/).

## Details

### Command Operations

Shows the rates of different MongoDB operations per second, including primary operations (like queries, inserts, updates, and deletes), replicated operations on secondary nodes, and automatic deletions by TTL indexes.

It helps you visualize your database's workload, showing how different types of operations contribute to overall activity. Use this to spot unusual patterns, balance between read and write operations, and understand your MongoDB instance's performance at a glance.

You can filter the chart to focus on specific command types by clicking on their names in the legend. This will display only the selected metric. To view multiple specific metrics, use *Ctrl + click* to select multiple items.

### Top Hottest Collections by Read

Lists the five collections with the highest read activity. Use this panel to quickly identify which collections are under the most demand, allowing you to monitor read-heavy workloads and optimize performance accordingly.

### Query Execution Times

Displays the average latency of operations, categorized by read, write, or command. It visualizes how long each type of operation takes to execute over time, helping you identify trends or potential performance bottlenecks in your database operations. Use this panel to monitor and optimize the responsiveness of your MongoDB cluster.

### Top Hottest Collections by Write

Lists the five collections with the most write operations.

### Query Efficiency

Measures the efficiency of queries in your MongoDB cluster by showing the ratio of documents or index entries scanned versus documents returned. A ratio of 1 indicates that every document scanned was also returned, so the query examined only what it needed, while a higher value, such as 100, suggests that on average, 100 documents were scanned to return a single document.

Use this panel to assess query performance, identify inefficient queries, and optimize indexing strategies.

### Queued Operations

Displays the number of operations that are queued due to locks within your MongoDB cluster. It helps identify performance bottlenecks by showing how many operations are waiting because of locking issues.

Use this panel to track queued operations, monitor the impact of locking on system performance over time, and take action if necessary.

### Reads & Writes

Tracks the number of read and write operations over time in your MongoDB environment. Reads represent data queries, while writes represent data modifications.

Use this panel to get insights into the workload distribution and monitor the performance of database operations, ensuring that the system is handling read and write operations efficiently.

### Connections

Monitors the average number of active and available MongoDB connections over time.

Use this panel to track connection usage and ensure the database has sufficient capacity to handle incoming requests without reaching its limit.

## Collection Details

### Size of Collections

Displays the storage size of MongoDB collections, which are analogous to tables in relational databases, offering insights into the storage footprint of each collection across different nodes.

Use it to effectively monitor and manage data distribution and storage consumption. The data is organized by database name, collection, and node, and can be easily filtered and sorted for detailed analysis.

### Number of Collections

Provides a count of collections across different databases and nodes, helping you understand the structure and scale of your MongoDB deployments. The data is organized by database name and node, and you can filter and sort it for detailed insights.

Use this table to monitor the distribution of collections and manage your database schema effectively.

## Replication

### Replication Lag

Monitors replication lag, which occurs when a secondary node cannot replicate data as fast as it is written to the primary node. Causes of lag can include network latency, packet loss, or routing issues.
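
As a minimal sketch, lag can be thought of as the difference between the timestamp of the last operation applied on the primary and the last operation applied on a secondary. The function name and the sample timestamps below are hypothetical; the dashboard's own calculation may differ.

```python
from datetime import datetime, timezone


def replication_lag_seconds(primary_optime: datetime,
                            secondary_optime: datetime) -> float:
    """Return how many seconds a secondary is behind the primary.

    Hypothetical helper: compares the last applied operation times and
    clamps negative results (for example, from clock skew) to zero.
    """
    lag = (primary_optime - secondary_optime).total_seconds()
    return max(0.0, lag)


primary = datetime(2024, 9, 19, 12, 0, 30, tzinfo=timezone.utc)
secondary = datetime(2024, 9, 19, 12, 0, 18, tzinfo=timezone.utc)
print(replication_lag_seconds(primary, secondary))  # 12.0
```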

### OpLog Recovery Window

Indicates the timespan (window) between the newest and oldest operations in the Oplog collection.
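
The window is simply the newest Oplog timestamp minus the oldest one. The sketch below uses hypothetical function names and sample timestamps to show the arithmetic.

```python
from datetime import datetime, timedelta, timezone


def oplog_window(oldest: datetime, newest: datetime) -> timedelta:
    """Return the timespan covered by the Oplog (newest minus oldest entry)."""
    if newest < oldest:
        raise ValueError("newest entry must not be older than oldest entry")
    return newest - oldest


def oplog_window_hours(oldest: datetime, newest: datetime) -> float:
    """Return the same window expressed in hours."""
    return oplog_window(oldest, newest).total_seconds() / 3600


oldest = datetime(2024, 9, 18, 6, 0, tzinfo=timezone.utc)
newest = datetime(2024, 9, 19, 12, 0, tzinfo=timezone.utc)
print(oplog_window_hours(oldest, newest))  # 30.0
```

A secondary that falls further behind than this window can no longer catch up from the Oplog alone, which is why a shrinking window deserves attention.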

## Performance

### Flow Control

Monitors and displays the performance metrics related to flow control in a MongoDB cluster. It provides insights into the frequency and duration of lagged operations, which can help you identify potential bottlenecks or performance issues.

### WiredTiger Concurrency Tickets Available

Shows the number of available WiredTiger concurrency tickets, which control the number of operations that can run simultaneously in the storage engine.

## Nodes Summary

Provides a quick overview of the health and resource utilization of your nodes, making it easy to spot any potential issues or resource constraints.

### CPU Usage

Measures CPU time as a percentage of the CPU's total capacity, providing insights into CPU utilization.

### CPU Saturation

Indicates when a system is running at maximum CPU capacity, leading to increased data queuing and potential performance degradation.

### Disk I/O and Swap Activity

Tracks disk I/O operations and swap activity, which involve transferring data between the hard disk drive and RAM.

### Network Traffic

Monitors network traffic, showing the amount of data moving across the network at any given time.