A prometheus exporter for playbook metrics ? #177
Comments
Hi! I think I can take a look at that. I'm not very familiar with ara, but I used it a while ago and I'm also a casual prometheus user. I'll dive into the code this weekend to see if I can be of any help 😉 |
@IlyesSemlali nice, thanks for looking at it! |
Wanted to let you know that I've proposed an implementation to query ara in order to return metrics about tasks: https://review.opendev.org/#/c/760736/ For example, metrics from the last 1000 tasks:
The CLI framework lets us return that data in json or csv which I suppose could then be made available by a prometheus exporter:
Let me know what you think ? |
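As a rough illustration of that idea, JSON rows could be turned into the Prometheus text exposition format along these lines. This is a minimal sketch using only the standard library; the field names (`action`, `duration_avg`) and the metric name `ara_task_duration_seconds` are assumptions made up for the example, not the actual output of the ara CLI.

```python
import json


def escape_label_value(value: str) -> str:
    """Escape backslashes, newlines and double quotes inside a label value."""
    return value.replace("\\", "\\\\").replace("\n", "\\n").replace('"', '\\"')


def rows_to_exposition(rows, metric="ara_task_duration_seconds"):
    """Render one gauge sample per row in the Prometheus text format."""
    lines = [
        f"# HELP {metric} Average task duration per action (hypothetical metric).",
        f"# TYPE {metric} gauge",
    ]
    for row in rows:
        # Hypothetical fields: "action" becomes a label, "duration_avg" the value.
        labels = f'action="{escape_label_value(row["action"])}"'
        lines.append(f"{metric}{{{labels}}} {float(row['duration_avg'])}")
    return "\n".join(lines) + "\n"


# Stand-in for JSON produced by a CLI command; the shape is assumed.
sample = json.loads(
    '[{"action": "command", "duration_avg": 1.5},'
    ' {"action": "package", "duration_avg": 20.25}]'
)
print(rows_to_exposition(sample), end="")
```

An exporter would serve text like this from a `/metrics` endpoint for prometheus to scrape.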
There are other works in progress for ara playbook metrics and ara host metrics. I don't have a lot of experience with prometheus but there is a client library that we could use in python: https://pypi.org/project/prometheus-client/ |
@IlyesSemlali @dmsimard any further plans here? |
@b-reich I am not using prometheus at this time and so I am not pursuing this right now. You may want to look at the following CLI commands to find out if there is something that can help: |
Hello, In my work, I use Ansible with ARA and the possibility of having metrics in the Prometheus/Grafana stack interests me a lot. |
Hi @TibScript, this may surface on my end in the not too distant future. Did you end up with something that works ? |
I am experimenting and learning this as I go -- I don't believe I am using the right approach but I'm sharing this in case anyone has suggestions, comments or would like to improve on it: https://gist.github.com/dmsimard/68c149eea34dbff325c9e4e9c39980a0 I've included a sample of the /metrics endpoint in the gist. Edit: I forgot to mention that playbook and task durations are notably missing from this first iteration. I've tried to implement them using [...] Here are two sample screenshots showing it works with a local prometheus instance: |
For a third iteration, I decided to move the standalone script into the ara CLI so it's possible to start the exporter by running [...] I've opened up a branch and a PR so I will work there instead of the gist. |
Hey @dmsimard, my familiarity with Prometheus comes from Nautobot, which is also Django-based and leverages the django-prometheus library. I believe it has a lot of built-in Django metrics that you can enable by default (API performance, model performance, DB queries and the like), but the common thread seems to be that they are able to quickly add new metrics about the data Nautobot holds as well. For Ansible, I could see things like job/task performance, a host failure hot list (which hosts are failing most often) or the number of times a playbook was run. Technically, I think you already have a lot of great metrics that you show in various ways, but exporting to prometheus lets folks use something like Grafana to graph the things that are important to them with minimal effort and without having to write middleware. So in my opinion, just starting with some of the metrics about plays, hosts, tasks, etc. that you have now would get the ball rolling! |
I have not considered the django side of the metrics yet but it's true that it can be useful and it's good to know, thanks ! If anyone wants to tackle this, they can go ahead as I continue iterating on the playbook metrics.
Yes, we can consider that one of the objectives is pretty graphs about playbook metrics in grafana :p While I have been a user (and operator) of both prometheus and grafana, I have been mostly privileged by the fact that so many exporters and graphs had already been written so until now I have not needed to truly learn how it all works underneath. There be dragons. The challenge is parsing data into the right formats (and field types) as a proper time series with the timestamps provided by ara -- not the time the metric sample is taken. Then, we probably need to do some math and find the right arcane grafana or promql query to produce the pretty graphs. If anyone wants to help or point me in the right direction, head to the PR 🙏 |
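To make the timestamp problem concrete: the Prometheus text exposition format allows an optional timestamp, in integer milliseconds since the Unix epoch, after the sample value. The sketch below shows one way a recorded timestamp could be attached to a sample. It assumes the timestamps are ISO 8601 strings and that naive values are UTC; the metric and label names are hypothetical. Whether prometheus accepts samples dated far in the past is a separate question worth verifying before relying on this.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(iso_timestamp: str) -> int:
    """Convert an ISO 8601 timestamp to integer milliseconds since the epoch."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on.
    parsed = datetime.fromisoformat(iso_timestamp.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


def sample_line(metric, labels, value, iso_timestamp):
    """Format one sample as: name{labels} value timestamp_ms."""
    # Label values are assumed not to need escaping in this sketch.
    label_text = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
    return f"{metric}{{{label_text}}} {value} {to_epoch_ms(iso_timestamp)}"


print(
    sample_line(
        "ara_playbook_duration_seconds",  # hypothetical metric name
        {"playbook": "site.yml"},         # hypothetical label
        42.5,
        "2021-01-01T00:00:00Z",
    )
)
```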
I have come across this insightful mailing list thread about ingesting metrics with supplied timestamps: https://groups.google.com/g/prometheus-users/c/YqFc1MZLCsM There is a suggestion to try Histograms instead of Gauges so I will look into that next. Edit: some additional reading I've come across about setting timestamps on metrics: |
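On the Histogram suggestion from that thread: instead of one sample per playbook, durations would be counted into cumulative buckets, so the scrape time no longer has to match when each playbook ran. Below is a minimal sketch of that bucketing rendered in the text exposition format. The bucket boundaries and the metric name are arbitrary assumptions, and a real exporter would more likely rely on the `Histogram` type from the prometheus-client library than on hand-written output.

```python
def histogram_exposition(
    durations,
    buckets=(1.0, 5.0, 30.0),                # assumed boundaries, in seconds
    metric="ara_playbook_duration_seconds",  # hypothetical metric name
):
    """Render durations as a cumulative histogram in the text format."""
    lines = [f"# TYPE {metric} histogram"]
    for upper in buckets:
        # Buckets are cumulative: each one counts every observation <= upper.
        count = sum(1 for duration in durations if duration <= upper)
        lines.append(f'{metric}_bucket{{le="{upper}"}} {count}')
    # The mandatory +Inf bucket always equals the total number of observations.
    lines.append(f'{metric}_bucket{{le="+Inf"}} {len(durations)}')
    lines.append(f"{metric}_sum {sum(durations)}")
    lines.append(f"{metric}_count {len(durations)}")
    return "\n".join(lines) + "\n"


print(histogram_exposition([0.5, 2.0, 2.0, 45.0]), end="")
```

With a histogram, questions such as "what share of playbooks finish in under 30 seconds" can be answered from the bucket counters rather than from individual timestamped samples.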
What is the idea ?
I haven't yet looked at what this would look like in practice but we have a lot of metrics in ara, such as:
We could explore how to make these metrics useful for monitoring or fancy graphs.