
Investigate if it is possible to set orchestrator fields from Cloud provider kubernetes metadata #33081

Closed
tetianakravchenko opened this issue Sep 14, 2022 · 16 comments · Fixed by #37685
Assignees
Labels
Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team

Comments

@tetianakravchenko
Contributor

tetianakravchenko commented Sep 14, 2022

Azure

Describe the enhancement:

orchestrator.cluster.name and orchestrator.cluster.url will not be set when metricbeat is running on AKS.

as mentioned in https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-module-kubernetes.html#_dashboard_32:

This field gets its value from sources like kube_config, kubeadm-config configMap, and Google Cloud’s meta API for GKE.

this feature was introduced by #26056

Similar to how we support GKE Metadata now, we should investigate if it is possible to get k8s cluster name and k8s cluster url to set orchestrator.cluster fields using the Azure kubernetes metadata.

AWS

EKS: initial investigation - #30229 (comment)

Google cloud

we already support setting orchestrator.cluster.* from GKE metadata

@tetianakravchenko tetianakravchenko added the Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team label Sep 14, 2022
@tetianakravchenko tetianakravchenko changed the title Investigate if it is possible to set orchestrator fields from Azure kubernetes metadata Investigate if it is possible to set orchestrator fields from Cloud provider kubernetes metadata Sep 19, 2022
@tetianakravchenko
Contributor Author

tetianakravchenko commented Oct 20, 2022

Work on this issue might require alignment on how to organize the code better:
currently we have GetKubernetesClusterIdentifier, which only checks the kubeconfig and kubeadm-config sources to set the cluster name and URL. At the same time, cluster metadata for GKE is added in the add_cloud_metadata processor.
Cluster metadata should be set in the same place, which requires code re-organisation as we are planning to add support for more providers.

Based on that, we might reconsider the work done in elastic/cloudbeat#455; see the relevant discussion https://github.com/elastic/cloudbeat/pull/455/files#r999273250

@ofiriro3
Contributor

ofiriro3 commented Oct 20, 2022

Hi,

My team @elastic/cloud-security-posture has just implemented a processor that uses the GetKubernetesClusterIdentifier function.

We would be happy if you could create another action item to revisit how we can organize the various implementations better.

cc @ChrsMark

@ChrsMark
Member

ChrsMark commented Nov 2, 2022

@tetianakravchenko FYI, in GKE Autopilot the fields are not set.

@gsantoro
Contributor

gsantoro commented Nov 7, 2022

@tetianakravchenko AKS is missing kubeconfig, and it doesn't contain the cluster name or the cluster url.

@gsantoro
Contributor

hey @ChrsMark and @tetianakravchenko,
a couple of ideas:

  • since 3/4 of our cloud environments (all except GKE standard) don't provide a cluster.name/url in the metadata API...
  • ...we can either provide a custom solution in code for each provider (which will probably take some time to implement)...
  • ...or we have to write docs for each cloud environment so that users can fix this on their own in Kibana (at least in the meantime)...
  • ...and the fix in Kibana is quite complicated, since the user has to paste a few lines of processor YAML into the Processor box in the right location (i.e. under Kube-state-metrics / Node Metrics UI)...
  • ...and that is probably very error prone, and I can already see the infinite list of SDHs coming our way...
  • ...and I am not a huge fan of docs if there is a better alternative...

... I was thinking

  • ...if instead of providing docs for the customer to fix this issue on their own...
  • ...we could embed the "Processor" code directly in the integration (in the right place) and expose an optional field to the user in the Kibana UI.

A couple of use cases for how I see this playing out:

  • if we can get this info from the metadata API (as for GKE standard), we can use that value and ignore any user input
  • otherwise (for the other 3/4 of options), we ask the user to provide a cluster.name/url.
    • if the user doesn't provide a cluster name, we might use a default value like "cluster-name" (or anything else). This way we still fix the dashboards and it works either way.
    • if the user provides a cluster name, we use that value instead. We might even consider using this value anyway for GKE standard, if we want to give users the ability to provide a more user-friendly name for the cluster.

I don't mean this to be the final solution, but it could easily be a temporary fix until we find a better way to fix this in code.

What do you guys think?

@ChrsMark
Member

  • ...if instead of providing docs for the customer to fix this issue on their own...
  • ...we could embed the "Processor" code directly in the integration (in the right place) and expose an optional field to the user in the Kibana UI.

That would work, yes, especially if we define this at the integration level and not just at the data_stream level.

However, in that case we expose something that is a "patch" to the users, and at some point we will remove it, so I have mixed feelings about this and would prefer prioritizing the backend implementation and investing time directly in that.

In any case, that would be doable if @mlunadia agrees with that from product perspective.
cc: @gizas

@gsantoro
Contributor

gsantoro commented Nov 23, 2022

hello @ChrsMark and @tetianakravchenko, @gizas ,
I managed to get the cluster name from the AKS metadata endpoint with the following bash script

kubectl debug node/aks-nodepool1-36348082-vmss000000 -it --image=mcr.microsoft.com/dotnet/runtime-deps:6.0 -- /bin/bash -c 'apt-get update; \
apt-get install -y curl jq; \
RESOURCE_NAME=$(curl -s -H Metadata:true --noproxy "*" "http://169.254.169.254/metadata/instance?api-version=2021-02-01" | jq -r .compute.resourceGroupName); \
arrIN=(${RESOURCE_NAME//_/ }); \
echo ${arrIN[2]}'

Here I'm using a debug command to get a node shell to query the AKS metadata from inside the K8s cluster. The AKS metadata exposes the cluster name in a field under the jsonpath .compute.resourceGroupName. You need to split that string by _ and then take the third element of the array.

@ChrsMark
Member

hello @ChrsMark and @tetianakravchenko, @gizas , I managed to get the cluster name from the AKS metadata endpoint with the following bash script […]

Nice! I guess this can be added at https://github.com/elastic/beats/blob/25786cdda70b31cb1738373265bf3a0f3dec76f6/libbeat/processors/add_cloud_metadata/provider_azure_vm.go similarly to what we do for the gke case at https://github.com/elastic/beats/blob/25786cdda70b31cb1738373265bf3a0f3dec76f6/libbeat/processors/add_cloud_metadata/provider_google_gce.go.
Btw, where can we find this 169.254.169.254 IP? Is this the Node's IP?

In general we try to add the cloud-provider-specific implementations under the add_cloud_metadata processor.
So the basic implementation for these metadata, which uses the kubeconfig approach, lives in https://github.com/elastic/elastic-agent-autodiscover/blob/6ee69244193e9ba3159304650f39152f8fad32a7/kubernetes/metadata/metadata.go#L110, but when that is not able to cover the case, we leverage the add_cloud_metadata processor (which is enabled by default) to get them.

@gizas
Contributor

gizas commented Nov 24, 2022

169.254.169.254

This is a predefined IP for Azure.
https://learn.microsoft.com/en-us/azure/virtual-machines/windows/instance-metadata-service?tabs=linux

I guess the result will be the same for 168.63.129.16?

@tetianakravchenko
Contributor Author

tetianakravchenko commented May 14, 2023

hello @ChrsMark and @tetianakravchenko, @gizas , I managed to get the cluster name from the AKS metadata endpoint with the following bash script […] You need to split that string by _ and then take the third element of the array.

This approach will not work when the Resource group or the cluster name contains a _:

  1. if the Resource group is not defined, the default name will be _group
  2. _ is also an acceptable character in the cluster name

The node resource group has the format MC_<resourceGroupName>_<clusterName>_<region>; for this cluster it will be MC_k8s_cluster_name_group_k8s_cluster_name_eastus
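To make the failure concrete, here is a minimal sketch (using the example node resource group quoted above) showing how the naive underscore split picks the wrong element:

```shell
# Naive split by "_", as in the script earlier in this thread. With
# underscores inside the resource group and cluster name, index 2 is
# no longer the cluster name.
RG="MC_k8s_cluster_name_group_k8s_cluster_name_eastus"
arrIN=(${RG//_/ })
echo "${arrIN[2]}"   # prints "cluster" instead of "k8s_cluster_name"
```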

@Vbubblery

Hi,

Any updates or solution here?

@Grinch321

Hi, any updates?

@gizas
Contributor

gizas commented Dec 22, 2023

Support for AWS is already included in agent version 8.9.0 and later (see release-notes-8.9.0.html, issue #35182).

AKS is still on the roadmap.

@ptonini

ptonini commented Jan 14, 2024

I've implemented a simple workaround, setting the cluster name as an environment variable on the pod:

processors: |
  - add_fields:
      target: orchestrator.cluster
      fields:
        name: $${env.CLUSTER_NAME}
        url: $${env.CLUSTER_URL}
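For completeness, the environment variables referenced by that processor would be set in the pod spec, along these lines (the variable names match the processor above; the values are placeholders you would set per cluster):

```yaml
# Hypothetical container spec fragment; values are placeholders.
env:
  - name: CLUSTER_NAME
    value: my-aks-cluster
  - name: CLUSTER_URL
    value: https://example-dns-1234.hcp.eastus.azmk8s.io
```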

@MichaelKatsoulis MichaelKatsoulis self-assigned this Jan 22, 2024
@MichaelKatsoulis
Contributor

The best way to retrieve the AKS cluster name is by using the Azure SDK: for the given subscription ID, list the Managed Clusters, then filter by the resourceGroupName that we get from the metadata endpoint.
This solution can give us the cluster id and cluster name. But due to Azure authentication requirements, TENANT_ID, CLIENT_ID and CLIENT_SECRET are needed. These can be provided as env vars in metricbeat/agent.

#37685
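As a rough sketch of that lookup (this is not the SDK code from the PR; the az CLI query in the comment and the sample listing below are illustrative assumptions):

```shell
# Equivalent lookup with the az CLI would be roughly:
#   az aks list --query "[?nodeResourceGroup=='${NODE_RG}'].name" -o tsv
# Here we simulate the same filter over a made-up name<TAB>nodeResourceGroup
# listing, since the real call needs Azure credentials.
NODE_RG="MC_my-rg_my-cluster_eastus"   # from IMDS .compute.resourceGroupName
printf 'my-cluster\tMC_my-rg_my-cluster_eastus\nother-cluster\tMC_other_other_westus\n' |
  awk -F'\t' -v rg="$NODE_RG" '$2 == rg { print $1 }'   # prints "my-cluster"
```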

@randywatson1979

randywatson1979 commented Mar 1, 2024

Well, you could at least get the resource group / cluster name from:
kubernetes.node.labels.kubernetes_azure_com/cluster
which looks like "MC_some_name_some_name_westeurope" and is unique to an AKS cluster.

I'd like to output that value in an advanced watcher template, but I have no idea how to escape/format that forward slash so that the message parser recognises and reads the value of that key.
