Add a markdown with high level kubernetes metadata enrichment explanation #38757
Merged: MichaelKatsoulis merged 7 commits into elastic:main from MichaelKatsoulis:update-k8s-enrichers-doc on Apr 15, 2024
Commits:
ceee9dd Add a markdown with high level kubernetes metadata enrichment explana… (MichaelKatsoulis)
cce85bb Updates (MichaelKatsoulis)
40238e0 Update (MichaelKatsoulis)
f0ffafc Add table of watchers (MichaelKatsoulis)
524d388 Update metricbeat/module/kubernetes/util/enrichers.md (MichaelKatsoulis)
55c9c11 Update metricbeat/module/kubernetes/util/enrichers.md (MichaelKatsoulis)
dd945de Minor changes (MichaelKatsoulis)
@@ -0,0 +1,89 @@
## Kubernetes Metadata enrichment

The metadata enrichment process associates contextual information, such as Kubernetes metadata (e.g., labels, annotations, resource names), with the metrics and events that Elastic Agent and Beats collect in Kubernetes environments. The added context makes the collected data easier to understand and analyze.

### Key Components:

1. **Metricsets:**
   - Metricsets collect metrics and events from various sources within Kubernetes, such as kubelet and kube-state-metrics.

2. **Enrichers:**
   - Enrichers enrich the collected data with Kubernetes metadata. Each metricset has its own enricher, which handles the metadata enrichment process for that metricset.

3. **Watchers:**
   - Watchers monitor Kubernetes resources and detect changes, such as the addition, update, or deletion of resources like pods or nodes.

4. **Metadata Generators:**
   - Metadata generators generate the metadata associated with Kubernetes resources. Enrichers use them to collect the relevant metadata. Each enricher has one metadata generator.

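The relationships between these components can be sketched as a small data model. This is an illustrative Python sketch, not the Beats code: the class names, the field names, and the shape of the generated metadata are all assumptions chosen to mirror the list above (one enricher per metricset, one metadata generator per enricher, one watcher shared by several enrichers).

```python
from dataclasses import dataclass, field


@dataclass
class MetadataGenerator:
    """Hypothetical generator: builds a metadata dict for one resource kind."""
    kind: str

    def generate(self, resource: dict) -> dict:
        # Keep only the kinds of fields the text names: name, labels, annotations.
        return {
            "kind": self.kind,
            "name": resource["name"],
            "labels": dict(resource.get("labels", {})),
            "annotations": dict(resource.get("annotations", {})),
        }


@dataclass
class Enricher:
    """One enricher per metricset; it owns one metadata generator."""
    metricset: str
    generator: MetadataGenerator
    metadata: dict = field(default_factory=dict)  # resource ID -> metadata


@dataclass
class Watcher:
    """One watcher per resource kind, shared by several enrichers."""
    kind: str
    enrichers: list = field(default_factory=list)


# A single pod watcher serving the enrichers of four metricsets.
pod_watcher = Watcher(kind="pod")
for metricset in ("pod", "state_pod", "container", "state_container"):
    pod_watcher.enrichers.append(Enricher(metricset, MetadataGenerator("pod")))
```

Each enricher keeps its own metadata map, so sharing a watcher does not mean sharing enrichment state.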
### Metadata Generation Process:

1. **Initialization:**
   - Metricsets are initialized with their respective enrichers during startup. Each enricher manages the metadata enrichment process for its metricset.

2. **Watcher Creation:**
   - Watchers are created to monitor the Kubernetes resources relevant to the metricset's data collection requirements. For example, the pod metricset triggers the creation of watchers for pods, nodes, and namespaces.
   - Multiple enrichers are associated with one watcher. For example, a pod watcher is associated with the pod, state_pod, container, and state_container metricsets and their enrichers.

3. **Metadata Generation:**
   - When a watcher detects a change in a monitored resource (e.g., a new pod creation or a label update), it triggers the metadata generation process of the associated enrichers.

4. **Enrichment Generation Process:**
   - The enricher collects the relevant metadata from the Kubernetes API objects that correspond to the detected changes. This metadata includes information such as labels, annotations, and resource names.

5. **Association with Events:**
   - The collected metadata is associated with the metricset's events, enriching them with contextual information. The enriched events generated by Beats/Elastic Agent are then sent to the configured output (e.g., Elasticsearch).

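Steps 3 to 5 above can be sketched as follows. This is a minimal Python model under stated assumptions, not the actual implementation: the method names (`notify`, `on_change`, `on_delete`, `enrich`), the `resource_id` event field, and the `kubernetes` key used for the merged metadata are hypothetical.

```python
class Enricher:
    def __init__(self, metricset):
        self.metricset = metricset
        self.metadata = {}  # resource ID -> metadata collected so far

    def on_change(self, resource_id, resource):
        # Steps 3-4: rebuild the metadata from the changed API object.
        self.metadata[resource_id] = {
            "labels": dict(resource.get("labels", {})),
            "annotations": dict(resource.get("annotations", {})),
        }

    def on_delete(self, resource_id):
        # A deleted resource no longer contributes metadata.
        self.metadata.pop(resource_id, None)

    def enrich(self, events):
        # Step 5: attach the stored metadata to each event of this metricset.
        enriched = []
        for event in events:
            meta = self.metadata.get(event["resource_id"], {})
            enriched.append({**event, "kubernetes": meta})
        return enriched


class Watcher:
    def __init__(self):
        self.enrichers = []

    def notify(self, action, resource_id, resource=None):
        # Fan the detected change out to every enricher sharing this watcher.
        for enricher in self.enrichers:
            if action == "delete":
                enricher.on_delete(resource_id)
            else:
                enricher.on_change(resource_id, resource)


watcher = Watcher()
pod_enricher = Enricher("pod")
state_pod_enricher = Enricher("state_pod")
watcher.enrichers += [pod_enricher, state_pod_enricher]

watcher.notify("add", "default/nginx", {"labels": {"app": "nginx"}})
events = pod_enricher.enrich([{"resource_id": "default/nginx", "cpu": 0.2}])
```

In this sketch an event for a resource the enricher has not seen yet is passed through with empty metadata rather than dropped; whether the real enrichers behave that way is not stated in the text.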
### Handling Edge Cases:

1. **Synchronization:**
   - Special mechanisms handle scenarios where resources trigger events before the associated enrichers are fully initialized. Proactive synchronization ensures that the metadata of existing resources is captured and updated in the enricher maps.
   - When a watcher detects events (such as object additions or updates), it adds the IDs of the detected objects to a list (`metadataObjects`). Before a new enricher is introduced, the existing `metadataObjects` are reviewed: for each object ID, the corresponding metadata is retrieved and used to update the new enricher's metadata map. This ensures accurate metadata enrichment even for resources that triggered events before the enricher was initialized.

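The synchronization described above amounts to replaying already-seen objects into an enricher that joins late. The following Python sketch illustrates the idea under stated assumptions: `metadata_objects` stands in for the `metadataObjects` list, and the `store` dict, `on_event`, `add_enricher`, and `update` names are hypothetical.

```python
class Enricher:
    def __init__(self, metricset):
        self.metricset = metricset
        self.metadata = {}  # resource ID -> metadata

    def update(self, resource_id, resource):
        self.metadata[resource_id] = {"labels": dict(resource.get("labels", {}))}


class Watcher:
    def __init__(self):
        self.store = {}             # resource ID -> last seen API object
        self.metadata_objects = []  # IDs of objects detected so far
        self.enrichers = []

    def on_event(self, resource_id, resource):
        # Record the object ID once, then update every enricher already attached.
        self.store[resource_id] = resource
        if resource_id not in self.metadata_objects:
            self.metadata_objects.append(resource_id)
        for enricher in self.enrichers:
            enricher.update(resource_id, resource)

    def add_enricher(self, enricher):
        # Replay the objects seen before this enricher existed,
        # so its metadata map starts out complete.
        for resource_id in self.metadata_objects:
            enricher.update(resource_id, self.store[resource_id])
        self.enrichers.append(enricher)


watcher = Watcher()
watcher.on_event("default/nginx", {"labels": {"app": "nginx"}})  # no enricher attached yet

late_enricher = Enricher("state_pod")
watcher.add_enricher(late_enricher)
```

After `add_enricher` returns, the late enricher holds metadata for `default/nginx` even though that object's event fired before the enricher existed.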
### Watcher Management:

1. **Initialization Sequence:**
   - Watchers are initialized and managed by metricsets. Extra watchers, such as those for namespaces and nodes, are initialized first so that their metadata is available before the main watcher starts monitoring resources.

2. **Configuration Updates:**
   - Watcher configurations, such as watch options or resource filtering criteria, can be updated dynamically. A mechanism is in place to transition to the updated configuration without disrupting data collection.

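The initialization sequence can be illustrated with a short Python sketch. It is a simplified model, not the Beats code: `SimpleWatcher`, `start_watchers`, and the in-memory `store` are hypothetical names, and "starting" a watcher here just means loading its objects.

```python
class SimpleWatcher:
    def __init__(self, kind, objects):
        self.kind = kind
        self._objects = objects  # stand-in for what the Kubernetes API would return
        self.store = {}
        self.started = False

    def start(self):
        self.store = dict(self._objects)
        self.started = True


def start_watchers(main, extras):
    """Start the extra watchers first, then the main one; return the start order."""
    order = []
    for watcher in extras:
        if not watcher.started:  # a shared watcher may already be running
            watcher.start()
            order.append(watcher.kind)
    main.start()
    order.append(main.kind)
    return order


namespace_watcher = SimpleWatcher("namespace", {"default": {"labels": {"team": "a"}}})
node_watcher = SimpleWatcher("node", {"node-1": {"labels": {"zone": "eu"}}})
pod_watcher = SimpleWatcher("pod", {"default/nginx": {"labels": {"app": "nginx"}}})

order = start_watchers(pod_watcher, [namespace_watcher, node_watcher])
```

With this ordering, namespace and node metadata are already in place by the time the pod watcher begins reporting pods.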
### Flow example

The following diagram shows an example of different metricsets sharing the same watchers. Each metricset has its own enricher, but the watchers are shared. The watchers monitor the Kubernetes API for updates to specific resources.
[metadata diag](../_meta/images/enrichers.png)

### Expected watchers per metricset

The following table shows which watchers each metricset needs by default.
Note that a watcher for a given resource kind is created only once, even when several metricsets need it.

| Metricset            | Namespace watcher | Node watcher | Resource watcher | Notes                                                     |
|----------------------|:-----------------:|:------------:|:----------------:|-----------------------------------------------------------|
| API Server           |         ✕         |      ✕       |        ✕         |                                                           |
| Container            |         ✓         |      ✓       |        ✓         |                                                           |
| Controller manager   |         ✕         |      ✕       |        ✓         |                                                           |
| Event                |         ✓         |      ✕       |        ✓         |                                                           |
| Node                 |         ✕         |      ✓       |        ✓         | Resource watcher should be the same as node watcher.      |
| Pod                  |         ✓         |      ✓       |        ✓         |                                                           |
| Proxy                |         ✕         |      ✕       |        ✕         |                                                           |
| Scheduler            |         ✕         |      ✕       |        ✕         |                                                           |
| State container      |         ✓         |      ✓       |        ✓         |                                                           |
| State cronjob        |         ✓         |      ✕       |        ✓         |                                                           |
| State daemonset      |         ✓         |      ✕       |        ✓         |                                                           |
| State deployment     |         ✓         |      ✕       |        ✓         |                                                           |
| State job            |         ✓         |      ✕       |        ✓         |                                                           |
| State namespace      |         ✓         |      ✕       |        ✓         | Resource watcher should be the same as namespace watcher. |
| State node           |         ✕         |      ✓       |        ✓         | Resource watcher should be the same as node watcher.      |
| State PV             |         ✕         |      ✕       |        ✓         |                                                           |
| State PVC            |         ✓         |      ✕       |        ✓         |                                                           |
| State pod            |         ✓         |      ✓       |        ✓         |                                                           |
| State replicaset     |         ✓         |      ✕       |        ✓         |                                                           |
| State resource quota |         ✓         |      ✕       |        ✓         |                                                           |
| State service        |         ✓         |      ✕       |        ✓         |                                                           |
| State statefulset    |         ✓         |      ✕       |        ✓         |                                                           |
| State storage class  |         ✕         |      ✕       |        ✓         |                                                           |
| System               |         ✕         |      ✕       |        ✕         |                                                           |
| Volume               |         ✕         |      ✕       |        ✕         |                                                           |

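The "created only once" rule can be read as taking the union of the watcher kinds that the enabled metricsets need. The Python sketch below encodes a few rows of the table to show this. The `TABLE` structure and the `required_watchers` helper are hypothetical, and the resource kind names (for example `deployment`) are assumptions about how each resource watcher would be keyed.

```python
# metricset -> (needs namespace watcher, needs node watcher, resource watcher kind or None)
TABLE = {
    "pod": (True, True, "pod"),
    "state_pod": (True, True, "pod"),
    "container": (True, True, "pod"),         # containers are served by the pod watcher
    "node": (False, True, "node"),            # resource watcher is the node watcher
    "state_namespace": (True, False, "namespace"),
    "state_deployment": (True, False, "deployment"),
    "proxy": (False, False, None),
}


def required_watchers(metricsets):
    """Return the distinct watcher kinds needed by the given metricsets."""
    kinds = set()  # a set guarantees one watcher per resource kind
    for name in metricsets:
        needs_namespace, needs_node, resource_kind = TABLE[name]
        if needs_namespace:
            kinds.add("namespace")
        if needs_node:
            kinds.add("node")
        if resource_kind is not None:
            kinds.add(resource_kind)
    return sorted(kinds)
```

For instance, enabling `pod`, `state_pod`, and `container` together still yields just three watchers (namespace, node, pod), and `node` alone yields a single node watcher because its resource watcher and node watcher are the same.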
To differentiate from datasets: just write something noting that the same concept is called datasets in elastic-agent.
https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme