[Meta][Metricbeat] - Collect additional Elasticsearch node metrics for enhanced dashboards #42131

Open
5 tasks
VimCommando opened this issue Dec 20, 2024 · 0 comments

As of 8.15.3, the Metricbeat collection used for stack monitoring is still missing some metrics that would help in building comprehensive monitoring dashboards.

Here is a potential list of metrics to include from _nodes/stats:

jvm.threads.count
http.total_opened
process.open_file_descriptors
process.mem.total_virtual_in_bytes

transport.rx_count
transport.rx_size_in_bytes
transport.tx_count
transport.tx_size_in_bytes

ingest.total.count
ingest.total.time_in_millis
ingest.total.failed

indices.fielddata.evictions
indices.get.time_in_millis
indices.get.total
indices.merges.total
indices.merges.total_time_in_millis
indices.search.fetch_time_in_millis
indices.search.fetch_total
indices.search.query_time_in_millis
indices.search.query_total
indices.translog.operations
indices.translog.size_in_bytes

thread_pool.esql_worker.active
thread_pool.esql_worker.queue
thread_pool.esql_worker.rejected
thread_pool.flush.active
thread_pool.flush.queue
thread_pool.flush.rejected
thread_pool.force_merge.active
thread_pool.get.active
thread_pool.search.active
thread_pool.write.active
thread_pool.search_worker.active
thread_pool.search_worker.queue
thread_pool.search_worker.rejected
thread_pool.snapshot.active
thread_pool.snapshot.queue
thread_pool.snapshot.rejected
thread_pool.system_read.active
thread_pool.system_read.queue
thread_pool.system_read.rejected
thread_pool.system_write.active
thread_pool.system_write.queue
thread_pool.system_write.rejected

Some newer features such as ES|QL (esql_worker) and intra-segment search parallelism (search_worker) have been introduced in 8.x and Metricbeat monitoring isn't capturing the relevant thread pools yet.
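To illustrate how the dotted names in the list above map onto the nested node stats document, here is a minimal Python sketch. It is not Metricbeat's implementation. The helper name `pick_metrics` and the sample values are hypothetical, and the sample assumes the stats for a single node are nested exactly as the dotted paths suggest.

```python
from typing import Any, Iterable

_MISSING = object()  # sentinel so that falsy values such as 0 are still kept


def pick_metrics(node_stats: dict[str, Any], paths: Iterable[str]) -> dict[str, Any]:
    """Return {dotted_path: value} for every path present in node_stats.

    Hypothetical helper: paths that are absent (for example a thread pool
    that does not exist on an older node) are skipped rather than raising.
    """
    picked: dict[str, Any] = {}
    for path in paths:
        value: Any = node_stats
        for key in path.split("."):
            if isinstance(value, dict) and key in value:
                value = value[key]
            else:
                value = _MISSING
                break
        if value is not _MISSING:
            picked[path] = value
    return picked


# Made-up sample shaped like the stats of one node (values are illustrative only).
sample_node_stats = {
    "jvm": {"threads": {"count": 112}},
    "thread_pool": {
        "esql_worker": {"active": 0, "queue": 0, "rejected": 0},
        "search_worker": {"active": 3, "queue": 7, "rejected": 0},
    },
}

wanted = [
    "jvm.threads.count",
    "thread_pool.esql_worker.active",
    "thread_pool.search_worker.queue",
    "thread_pool.system_read.active",  # not in the sample, so it is skipped
]

result = pick_metrics(sample_node_stats, wanted)
```

Skipping missing paths instead of failing is a design choice for this sketch: the thread pools available differ between Elasticsearch versions, so a collector would likely need to tolerate absent pools.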

The average service time can also be helpful, for example the write time per document or the query time per search. This is usually a simple division, such as indices.indexing.index_time_in_millis / indices.indexing.index_total, but if it is calculated at ingest time, it becomes possible to sort by this metric in visualizations.
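The division can be sketched in Python as follows. This is an assumption-laden illustration, not how Metricbeat computes anything today. The function name `avg_service_time_ms` is hypothetical. Because the node stats counters are cumulative, the sketch divides the deltas between two consecutive samples rather than the raw totals, which gives the average for that interval. The example uses indices.search.query_time_in_millis and indices.search.query_total from the list above with made-up values.

```python
from typing import Optional


def avg_service_time_ms(
    prev_time_ms: int,
    prev_total: int,
    curr_time_ms: int,
    curr_total: int,
) -> Optional[float]:
    """Average time per operation (ms) between two samples of cumulative counters.

    Returns None when no operations completed in the interval, or when a
    counter went backwards (for example after a node restart), so that the
    caller can omit the field instead of storing a misleading value.
    """
    delta_total = curr_total - prev_total
    delta_time = curr_time_ms - prev_time_ms
    if delta_total <= 0 or delta_time < 0:
        return None
    return delta_time / delta_total


# Made-up samples of query_time_in_millis and query_total, taken one interval apart.
query_avg = avg_service_time_ms(
    prev_time_ms=1000, prev_total=100,
    curr_time_ms=1600, curr_total=300,
)  # 600 ms spent on 200 queries

idle_avg = avg_service_time_ms(1600, 300, 1600, 300)  # no new queries in the interval
```

Storing the result as its own field at collection time is what would allow sorting on it in visualizations, as described above.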
