2.43.0 updates (#1271)

* release setup * PMM-13133 * improved wording * Update query-analytics.md * Download server diagnostics endpoint * Update troubleshoot.md * PMM-13054 * fortmatting * typo * Update docs/get-started/query-analytics.md Co-authored-by: Alex Demidoff <[email protected]> * Update docs/get-started/query-analytics.md Co-authored-by: Alex Demidoff <[email protected]> * transferred RelNotes * formatting * resized image * feedback from Steve * Update docs/release-notes/2.43.0.md Co-authored-by: Steve Hoffman <[email protected]> * feedback from Steve * Update docs/release-notes/2.43.0.md Co-authored-by: Roman Novikov <[email protected]> * release date * Update 2.43.0.md Co-authored-by: Alex Demidoff <[email protected]> * feedback from Roma * IMPROVED Monitoring PBM SECTION * formatting * formatting * formatting * formatting * typo * Update docs/release-notes/2.43.0.md Co-authored-by: Michael Okoko <[email protected]> * updated release date * updated link text * Update 2.43.0.md * PMM-13243 * PMM-13327 - New MongoDB Router Summary (#1306) * initial mention of the new Router Summary in the 2.43.0 RN * initial commit new mongodb router summary doc * catalina's feedback Co-authored-by: Catalina A <[email protected]> * catalina's feedback 2 Co-authored-by: Catalina A <[email protected]> * catalina's feedback 3 Co-authored-by: Catalina A <[email protected]> * catalina's feedback 4 Co-authored-by: Catalina A <[email protected]> * catalina's feedback 5 Co-authored-by: Catalina A <[email protected]> * catalina's feedback 6 Co-authored-by: Catalina A <[email protected]> * catalina's feedback 7 Co-authored-by: Catalina A <[email protected]> * catalina's feedback 8 Co-authored-by: Catalina A <[email protected]> * catalina's feedback 9 Co-authored-by: Catalina A <[email protected]> * catalina's feedback 10 Co-authored-by: Catalina A <[email protected]> * minor edits * duplicate row after applying suggestions * link to dashboard doc --------- Co-authored-by: Catalina A <[email protected]> 
* PMM-13244 * GAed and redesigned MongoDB dashboards (#1305) * draft * draft * added descriptions for replset summary * draft * added content and updated sarshboard status * linked new topics in ToC * feedback from Santo * Update docs/details/dashboards/dashboard-sharded-cluster-summary.md Co-authored-by: Santo <[email protected]> * feedback from Santo * added description for chunks distribution --------- Co-authored-by: Santo <[email protected]> * added rederence to roma's blogpost * New MongoDB dashboards - attempt to fix multiple issues on ` 2.43.0_updates` (#1307) * remove redundant element * mongodb sharded cluster (old and new) * replicaset summary (old and new) * collection overview - fix broken link / name incongruences * lower case - or link is broken * lower case - or link is broken * lower case - or link is broken * mongodb instances overview was missing - but the link was there * order consistency - menu / table * attempt to fix things * add experimental in the router summary name * remove unused experimental oplog link there is one for non-experimental * fix Toclinks * fix ToC links * fix links * added known issue * PMM-13348 * Update docs/release-notes/2.43.0.md Co-authored-by: Santo <[email protected]> * feedback from Nurlan * feedback from Santo * feedback from Ivan * removed known issues * PMM-13141 * format * PMM-13362 - feedback from Santo * feedback from Nurlan * PMM-13141 --------- Co-authored-by: Alex Demidoff <[email protected]> Co-authored-by: Steve Hoffman <[email protected]> Co-authored-by: Roman Novikov <[email protected]> Co-authored-by: Michael Okoko <[email protected]> Co-authored-by: Santo <[email protected]>
percona · Sep 19, 2024 · 34e1a7a · 34e1a7a
1 parent 0db9283
commit 34e1a7a
Showing 25 changed files with 926 additions and 236 deletions.
diff --git a/docs/_images/MongoDB_ReplSetSummary.png b/docs/_images/MongoDB_ReplSetSummary.png
diff --git a/docs/_images/MongoDB_Sharded_Cluster_Summary.png b/docs/_images/MongoDB_Sharded_Cluster_Summary.png
diff --git a/docs/_images/Mongodb_Collections_Overview.png b/docs/_images/Mongodb_Collections_Overview.png
diff --git a/docs/_images/Mongodb_Oplog.png b/docs/_images/Mongodb_Oplog.png
diff --git a/docs/_images/PMM_MongoDB_Router_Summary.png b/docs/_images/PMM_MongoDB_Router_Summary.png
diff --git a/docs/_images/alert_flow - Copy.png b/docs/_images/alert_flow - Copy.png
diff --git a/docs/_images/new_Mongo_menu.png b/docs/_images/new_Mongo_menu.png
diff --git a/docs/details/commands/pmm-admin.md b/docs/details/commands/pmm-admin.md
diff --git a/...ards/dashboard-mongodb-cluster-summary.md → .../dashboard-mongodb-cluster-summary-old.md b/...ards/dashboard-mongodb-cluster-summary.md → .../dashboard-mongodb-cluster-summary-old.md
@@ -1,4 +1,8 @@
-# MongoDB Cluster Summary
+# MongoDB Cluster Summary (OLD)
+
+??? info "Dashboard update notice"
+     A [new version of the MongoDB Sharded Cluster Summary dashboard](../../details/dashboards/dashboard-sharded-cluster-summary.md) is available. 
+     This older version will be deprecated and removed from PMM in the near future. We encourage you to start using the new dashboard to benefit from its enhanced monitoring capabilities.
 
 ![!image](../../_images/PMM_MongoDB_Cluster_Summary.jpg)
 

diff --git a/docs/details/dashboards/dashboard-mongodb-experimental_collection_overview.md b/docs/details/dashboards/dashboard-mongodb-experimental_collection_overview.md
diff --git a/docs/details/dashboards/dashboard-mongodb-experimental_oplog.md b/docs/details/dashboards/dashboard-mongodb-experimental_oplog.md
diff --git a/...ards/dashboard-mongodb-replset-summary.md → .../dashboard-mongodb-replset-summary-old.md b/...ards/dashboard-mongodb-replset-summary.md → .../dashboard-mongodb-replset-summary-old.md
@@ -1,4 +1,7 @@
-# MongoDB ReplSet Summary
+# MongoDB ReplSet Summary (OLD)
+
+??? info "Dashboard update notice"
+     A [new version of the MongoDB ReplSet Summary dashboard](../../details/dashboards/dashboard-replsetsummary.md) is available. This older version will be deprecated and removed from PMM in the near future. We encourage you to start using the new dashboard to benefit from its enhanced monitoring capabilities.
 
 ![!image](../../_images/PMM_MongoDB_ReplSet_Summary.jpg)
 

diff --git a/docs/details/dashboards/dashboard-mongodb-router-summary.md b/docs/details/dashboards/dashboard-mongodb-router-summary.md
@@ -0,0 +1,62 @@
+# Experimental MongoDB Router Summary
+
+This dashboard is available starting from PMM 2.43 and is specifically designed for monitoring MongoS (router) nodes in sharded MongoDB clusters.
+
+![!image](../../_images/PMM_MongoDB_Router_Summary.png)
+
+## Overview
+For each MongoS in the cluster, this section includes main monitoring metrics like CPU, memory and disk usage. Uptime and MongoS version are reported as well.
+
+### CPU Usage
+Shows CPU usage as a percentage from 0% to 100%. It updates every minute, turning from green to red when usage exceeds 80%. This helps quickly spot high CPU load, which could affect system performance, and monitor how hard the CPU is working at a glance.
+
+### Memory Used
+Displays the percentage of total system memory currently in use. It updates regularly, showing green up to 80% of usage and red beyond that threshold.
+
+Use this as a quick visual indicator of memory consumption: it shows how close the system is to its memory limits and how much memory is still available before swapping starts.
+
+### Disk IO Utilization
+Shows how busy the disk is handling read/write requests. The meter turns red above 80%, warning of potential slowdowns. It updates regularly, giving administrators a quick way to check if the disk is keeping up with demand or if it's becoming a bottleneck in system performance.
+
+### Disk Space Utilization
+Shows how much of the total disk space is currently in use. The meter turns red when usage exceeds 80%, warning of low free space. It updates regularly, giving you a quick way to check if the disk is nearing capacity. This helps prevent "disk full" errors that could disrupt services or system operation.
+
+### Disk IOPS
+Shows how many read and write operations the disk performs each second. The blue color helps spot spikes in disk activity. These spikes could mean the disk is struggling to keep up, which might slow down the system. It's a quick way for you to check if the disk is working too hard.
+
+### Network Traffic
+Combines both incoming (received) and outgoing (transmitted) data, excluding local traffic. It gives you a quick view of overall network activity, helping spot unusual spikes or drops in data flow that might affect system performance.
+
+### Uptime
+Shows how long the system has been running without a restart. As uptime increases, the color changes from red to orange to green, giving a quick visual indicator of system stability. Red indicates very recent restarts (less than 5 minutes), orange shows short uptimes (5 minutes to 1 hour), and green represents longer uptimes (over 1 hour). This helps you easily spot recent system restarts or confirm continuous operation.
+
+### Version
+Displays the current version of MongoDB running on the system. This information is crucial for ensuring the system is running the intended version and for quickly identifying any nodes that might need updates.
+
+## Node States
+Shows the status of all MongoDB router (MongoS) nodes in the selected cluster over time. It uses a color-coded timeline: green bars mean a node is "UP" and working, while red bars show it's "DOWN" or unreachable. This simple view helps you quickly spot which nodes are active, see any recent status changes, and identify patterns in node availability.
+
+## Details
+This section includes additional information like "Command Operations", "Connections", "Query execution times" and "Query efficiency".
+
+### Command Operations
+Shows MongoDB command operations over time, displaying rates for inserts, updates, deletes, queries, and TTL deletions per second.
+
+Use this to monitor overall database workload, compare operation types, spot peak usage and unusual patterns, assess replication activity, and track automatic data cleanup.
+
+### Connections
+Displays MongoDB connection metrics over time, showing both current and available connections. Use this to monitor connection usage trends, identify periods of high demand, and ensure the database isn't reaching its connection limits.
+
+By comparing current to available connections, it's easy to spot potential bottlenecks or capacity issues before they impact performance.
+
+### Query Execution Times
+Shows the average execution times for MongoDB queries over time, categorized into read, write, and other command operations.
+
+Use this to identify slow queries, performance bottlenecks, and unusual spikes in execution times. Comparing latencies across operation types can also guide decisions on indexing strategies and query optimizations.
+
+### Query Efficiency
+Visualizes MongoDB query efficiency over time, displaying the ratio of scanned documents or index entries to returned documents, along with operation latencies.
+
+A ratio near 1 indicates highly efficient queries, while higher values (e.g., 100) suggest inefficiency.
+
+Compare document scans, index scans, and operation latencies to quickly identify poorly performing queries, and ensure that queries execute as efficiently as possible.
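
The scanned-to-returned ratio described under Query Efficiency can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the query the dashboard actually runs: the function name `scan_ratio` and the sample counter values are assumptions made for this example.

```python
def scan_ratio(scanned: int, returned: int) -> float:
    """Return how many documents (or index entries) were scanned per document returned."""
    if scanned < 0 or returned < 0:
        raise ValueError("counters must be non-negative")
    if returned == 0:
        # Nothing was returned: 0.0 if nothing was scanned either, otherwise infinity.
        return 0.0 if scanned == 0 else float("inf")
    return scanned / returned


# Hypothetical counter deltas collected over one sampling interval.
print(scan_ratio(1_000, 1_000))  # 1.0   -> every scanned document was returned
print(scan_ratio(50_000, 500))   # 100.0 -> 100 documents scanned per document returned
```

A value that stays close to 1 suggests queries are well served by indexes, while a persistently high value is a hint to review the indexing strategy for the affected collections.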
diff --git a/docs/details/dashboards/dashboard-mongodb_collection_overview.md b/docs/details/dashboards/dashboard-mongodb_collection_overview.md
@@ -0,0 +1,11 @@
+# MongoDB Collections Overview
+
+This realtime dashboard contains panels of data about the Hottest Collections in the MongoDB database.
+
+The Instance level includes two panels: one for the **Hottest Collections by Read (Total)** and one for the **Hottest Collections by Write (total)**.
+
+The next panel displays data at the **Database Level**, where you can view MongoDB metrics such as **Commands**, **Inserts**, **Updates**, **Removes**, and **Getmore**.
+
+The last panel shows the number of operations in the chosen database.
+
+![!image](../../_images/Mongodb_Collections_Overview.png)
diff --git a/docs/details/dashboards/dashboard-mongodb_oplog.md b/docs/details/dashboards/dashboard-mongodb_oplog.md
@@ -0,0 +1,5 @@
+# MongoDB Oplog Details
+
+This realtime dashboard contains Oplog details such as Recovery Window, Processing Time, Buffer Capacity, and Oplog Operations.
+
+![!image](../../_images/Mongodb_Oplog.png)
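
The Recovery Window listed above is the time span between the oldest and the newest entry in the oplog. A minimal Python sketch of that calculation follows; the function name `oplog_window` and the sample timestamps are assumptions for illustration, and the dashboard itself may derive the value differently.

```python
from datetime import datetime, timedelta, timezone


def oplog_window(oldest: datetime, newest: datetime) -> timedelta:
    """Return the time span covered by the oplog, from its oldest to its newest entry."""
    if newest < oldest:
        raise ValueError("newest entry must not be older than the oldest entry")
    return newest - oldest


# Hypothetical timestamps of the first and last oplog entries.
oldest = datetime(2024, 9, 18, 6, 0, tzinfo=timezone.utc)
newest = datetime(2024, 9, 19, 6, 0, tzinfo=timezone.utc)

window = oplog_window(oldest, newest)
print(window.total_seconds() / 3600)  # 24.0 hours of history available for recovery
```

A secondary that stays offline for longer than this window can no longer catch up from the oplog and needs a full resync, which is why a shrinking window deserves attention.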
diff --git a/docs/details/dashboards/dashboard-replsetsummary.md b/docs/details/dashboards/dashboard-replsetsummary.md
@@ -0,0 +1,167 @@
+
+# MongoDB ReplSet Summary
+
+The MongoDB ReplSet Summary dashboard offers a comprehensive view of your MongoDB replica set's health and performance. It provides clear insights for both simple and complex, multi-environment setups.
+
+The dashboard displays key metrics for individual nodes and the entire replica set, allowing you to quickly spot issues and maintain optimal database performance. With focused information and effective visualizations, it helps you identify and resolve potential problems efficiently, making it easier to manage MongoDB deployments of any size.
+
+![MongoDB ReplSet Summary](../../_images/MongoDB_ReplSetSummary.png)
+
+## Overview
+
+The overview section displays essential data for individual nodes, such as their role, CPU usage, memory consumption, disk space, network traffic, uptime, and the current MongoDB version.
+
+### State
+
+Displays the current state of a MongoDB replica set member. It shows a single value representing the node's role, such as PRIMARY, SECONDARY, or ARBITER. The state is color-coded for quick visual identification. This information is crucial for understanding the current role and health of each node in your MongoDB replica set.
+
+### CPU Usage
+
+Displays the current CPU usage percentage for the selected MongoDB service. It shows how much of the CPU's capacity is being used, with a range from 0% to 100%. 
+
+The gauge is color-coded, turning red when usage exceeds 80%, helping you quickly identify high CPU load situations. This metric is crucial for monitoring the performance and resource utilization of your MongoDB instance, allowing you to spot potential bottlenecks or overloaded servers at a glance.
+
+### Memory Used
+
+Shows an estimate of how much memory can be used without causing swapping on the MongoDB server. It displays the percentage of memory currently in use, with 100% indicating that all available memory is used and swapping may occur. The gauge turns red above 80% usage, signaling that free memory is running low. This metric is crucial for predicting potential performance issues due to memory constraints, helping you proactively manage your MongoDB instance's memory resources to avoid swapping and maintain optimal performance.
+
+### Disk I/O Utilization
+
+Displays disk utilization as a percentage, showing how often there was at least one I/O request active for the MongoDB server. Ranging from 0% to 100%, it helps determine if disk load is evenly distributed or if I/O is bottlenecked. Higher values suggest more intense, potentially queued disk activity. The gauge turns red above 80%, indicating possible I/O constraints. Use this metric alongside I/O latency and queue depth to assess overall storage performance and identify potential disk-related issues affecting your MongoDB instance's responsiveness.
+
+### Disk Space Utilization
+
+Shows the percentage of used disk space for the MongoDB server's data storage. It ranges from 0% to 100%, with higher values indicating less free space. The gauge turns red above 80% usage, warning of potential disk space issues. This metric is crucial for preventing *Disk full* errors that could disrupt services or crash the system. When free space approaches zero, consider removing unused files or expanding storage capacity to ensure smooth MongoDB operation and prevent data-related incidents.
+
+### Disk IOPS
+
+This stat panel displays the current rate of disk Input/Output Operations Per Second (IOPS) for the MongoDB server, showing separate values for read and write operations. It provides a real-time view of the physical I/O load on the storage system. The panel uses an area graph to visualize recent trends. High IOPS values or sudden spikes can indicate potential performance issues due to I/O subsystem overload. Monitor this metric to identify periods of intense disk activity and potential storage bottlenecks that could affect MongoDB's performance.
+
+### Network Traffic
+
+Displays the current network traffic for the MongoDB server, showing separate values for inbound and outbound data transfer rates in bytes per second. It uses an area graph to visualize recent trends in network activity. The panel provides a real-time view of data movement across the network, helping you monitor the MongoDB server's network load.
+
+High values or sudden spikes can indicate increased database activity, potential performance bottlenecks, or unusual network patterns. Use this metric to assess network utilization and identify periods of intense data transfer that might affect MongoDB's performance or user experience.
+
+### Uptime
+
+Displays the current uptime of the MongoDB server, showing how long it has been running without a shutdown or restart. The value is presented in seconds and uses color-coding for quick status assessment: red for very recent starts, orange for short uptimes, and green for longer periods.
+
+This metric is useful for tracking system stability, identifying recent restarts, and monitoring continuous operation time. Long uptimes generally indicate stable operation, while short uptimes might suggest recent maintenance or unexpected restarts that could warrant investigation.
+
+### Version
+
+Shows the current version of MongoDB running on the selected node in the replica set.
+
+This information is crucial for ensuring consistency across your MongoDB deployment, tracking upgrade status, and identifying potential version-related issues or compatibility concerns. Regular checks of this panel can help maintain a uniform MongoDB version across your infrastructure and assist in planning future upgrades or troubleshooting version-specific problems.
+
+## States
+
+### Node States
+
+Visualizes the status changes of each node in the MongoDB replica set over the selected time range. 
+
+The timeline format allows you to easily track state transitions, identify periods of instability, and understand the roles of different nodes throughout the monitored period. This visualization is crucial for monitoring replica set health, detecting failovers or reconfigurations, and ensuring the overall stability of your MongoDB cluster. Use this panel to quickly spot any unusual patterns or frequent state changes that might require further investigation.
+
+For more details on replica set states, see the [MongoDB documentation](https://www.mongodb.com/docs/manual/reference/replica-states/).
+
+## Details
+
+### Command Operations
+
+Shows the rates of different MongoDB operations per second, including primary operations (like queries, inserts, updates, and deletes), replicated operations on secondary nodes, and automatic deletions by TTL indexes.
+
+It helps you visualize your database's workload, showing how different types of operations contribute to overall activity. Use this to spot unusual patterns, balance between read and write operations, and understand your MongoDB instance's performance at a glance.
+
+You can filter the chart to focus on specific command types by clicking on their names in the legend. This will display only the selected metric. To view multiple specific metrics, use *Ctrl + click* to select multiple items.
+
+### Top Hottest Collections by Read
+
+Lists the five collections with the highest read activity. Use this panel to quickly identify which collections are under the most demand, allowing you to monitor read-heavy workloads and optimize performance accordingly.
+
+### Query Execution Times
+
+Displays the average latency of operations, categorized by read, write, or command. It visualizes how long each type of operation takes to execute over time, helping you identify trends or potential performance bottlenecks in your database operations. Use this panel to monitor and optimize the responsiveness of your MongoDB cluster.
+
+### Top Hottest Collections by Write
+
+Lists the five collections with the most write operations.
+
+### Query Efficiency
+
+Measures the efficiency of queries in your MongoDB cluster by showing the ratio of documents or index entries scanned versus documents returned. A ratio of 1 indicates that every document returned matched the query criteria exactly, while a higher value, such as 100, suggests that on average, 100 documents were scanned to return a single document.
+
+Use this panel to assess query performance, identify inefficient queries, and optimize indexing strategies. 
+
+### Queued Operations
+
+Displays the number of operations that are queued due to locks within your MongoDB cluster. It helps identify performance bottlenecks by showing how many operations are waiting because of locking issues. 
+
+Use this panel to track queued operations, monitor the impact of locking on system performance over time, and take action if necessary.
+
+### Reads & Writes
+
+Tracks the number of read and write operations over time in your MongoDB environment. Reads represent data queries, while writes represent data modifications.
+
+Use this panel to get insights into the workload distribution and monitor the performance of database operations, ensuring that the system is handling read and write operations efficiently.
+
+### Connections
+
+Monitors the average number of active and available MongoDB connections over time.
+
+Use this panel to track connection usage and ensure the database has sufficient capacity to handle incoming requests without reaching its limit.
+
+## Collection Details
+
+### Size of Collections
+
+Displays the storage size of MongoDB collections, which are analogous to tables in relational databases, offering insights into the storage footprint of each collection across different nodes.
+
+Use it to effectively monitor and manage data distribution and storage consumption. The data is organized by database name, collection, and node, and can be easily filtered and sorted for detailed analysis.
+
+### Number of Collections
+
+Provides a count of collections across different databases and nodes, helping you understand the structure and scale of your MongoDB deployments. The data is organized by database name and node, and you can filter and sort it for detailed insights. 
+
+Use this table to monitor the distribution of collections and manage your database schema effectively.
+
+## Replication
+
+### Replication Lag
+
+Monitors replication lag, which occurs when a secondary node cannot replicate data as fast as it is written to the primary node. Causes of lag can include network latency, packet loss, or routing issues.
+
+### OpLog Recovery Window
+
+Indicates the timespan (window) between the newest and oldest operations in the Oplog collection.
+
+## Performance
+
+### Flow Control
+
+Monitors and displays the performance metrics related to flow control in a MongoDB cluster. It provides insights into the frequency and duration of lagged operations, which can help you identify potential bottlenecks or performance issues.
+
+### WiredTiger Concurrency Tickets Available
+
+Shows the number of available WiredTiger concurrency tickets, which control the number of operations that can run simultaneously in the storage engine.
+
+## Nodes Summary
+
+Provides a quick overview of the health and resource utilization of your nodes, making it easy to spot any potential issues or resource constraints.
+
+### CPU Usage
+
+Measures CPU time as a percentage of the CPU's total capacity, providing insights into CPU utilization.
+
+### CPU Saturation
+
+Indicates when a system is running at maximum CPU capacity, leading to increased data queuing and potential performance degradation.
+
+### Disk I/O and Swap Activity
+
+Tracks disk I/O operations and swap activity, which involve transferring data between the hard disk drive and RAM.
+
+### Network Traffic
+
+Monitors network traffic, showing the amount of data moving across the network at any given time.
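
To make the Replication Lag panel described earlier more concrete, the sketch below computes lag per secondary as the difference between the primary's last applied operation time and each secondary's. It works on a plain list of dictionaries that mimics the `name`, `stateStr`, and `optimeDate` fields of the `members` array returned by MongoDB's `replSetGetStatus` command. The helper name `replication_lag_seconds`, the host names, and the timestamps are assumptions for illustration; this is not how PMM itself collects the metric.

```python
from datetime import datetime, timezone


def replication_lag_seconds(members: list[dict]) -> dict[str, float]:
    """Return the lag in seconds of every SECONDARY relative to the PRIMARY."""
    primary = next((m for m in members if m["stateStr"] == "PRIMARY"), None)
    if primary is None:
        raise ValueError("no PRIMARY member found")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hypothetical replica set status with one primary and two secondaries.
members = [
    {"name": "mongo1:27017", "stateStr": "PRIMARY",
     "optimeDate": datetime(2024, 9, 19, 12, 0, 30, tzinfo=timezone.utc)},
    {"name": "mongo2:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2024, 9, 19, 12, 0, 30, tzinfo=timezone.utc)},
    {"name": "mongo3:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2024, 9, 19, 12, 0, 18, tzinfo=timezone.utc)},
]

lags = replication_lag_seconds(members)
print(lags)  # {'mongo2:27017': 0.0, 'mongo3:27017': 12.0}
```

Here `mongo3` is 12 seconds behind the primary. A lag that keeps growing rather than fluctuating around a small value is the pattern to watch for on the panel.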