You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Is your feature request related to a problem? Please describe.
As a developer or operator of a system using OpenTelemetry behind a REST API, the calls through the Kusto Query client library show up in App Insights (and other monitoring/telemetry tools) as pain HTTP dependency calls. Challenges:
Hard to troubleshoot slow queries without seeing commands and seeing the query stats
No easy way to correlate client-side logs in App Insights with service diagnostics logs in Azure Monitor
These can be implemented by each client, but many modern
Describe the solution you'd like
This work could be done inside of the Kusto client library or possibly as an add-on package in there are sufficient hooks to wire into it from the outside. (If not and an external instrumentation library is preferred, it may be necessary to add some hooks into a PR into this library).
Create a Span for query execution calls
Enhance the Kusto Query (and possibly ingest) client using OpenTelemetry, creating a Span around the call into .execute method. On that Span add attributes like:
Query:
query statement (opt-in due to possible data leakage into logs)
Response:
Number of rows returned in main result set (possibly other result metadata)
Returned query stats JSON
The Span would have the client perspective on query duration and wrap the downstream HTTP request which would still land in OpenTelemetry through instrumentation of underlying HTTP client libraries.
Pass the OpenTelemetry trace context into the call to ADX
This would allow joining the client Span data (in App Insights, the dependency table) to ADX's Azure Monitor server-side query logs. This makes it easy to trace a client-side performance issue to the server-side perspective.
Do this by capturing the OpenTelemetry spanContext.traceId into the clientRequestProperties.clientRequestId property in .execute. (Use this as the default unless the client passes a value in for this into .execute
I haven't worked enough with the Ingest side of this library to have an opinion on how OpenTelemetry could add value there. I suggest starting with the query side for this first cut.
Describe alternatives you've considered
Currently wrapping the KustClient to add this functionality. It has been very useful in the short time we have had it.
We have a plan to introduce client-side logs to all SDKs. though we do not plan to take a dependency on OpenTelemetry.
Generally, Query requests contain ClientRequestId which allows for to correlate actions with the Kusto backend (though I can't attest as to how it works in App Insights or Azure Monitor.
With just out-of-the-box OpenTelemetry instrumentation on a typical Java environment, the call to ADX just just an opaque REST request:
It's kind of a dead-end TBH.
But with SQL Server, CosmosDb and others, I automatically get custom Trace data with Spans that give me useful information on the query and ADX's work on that operation so I don't have to copy/paste or join between systems to learn the basics.
On a recent project, my team added custom OpenTelemetry hooks at the application level in our Kusto Client query wrapper library, adding the query command, RequestProperties, QueryCompetionProperties and the returned rowcount in a new ADX Query Trace Span that appears in the App Insights distributed trace tree. With this in place, a system operator can easily troubleshoot an issue entirely in App Insights vs having to search log files or copy/paste/join correlation IDs between different logs. (In other work with CosmosDb and SQL Server, this wasn't necessary -- it was just something we had to enable or configure.)
An instrumentation would define some sensible defaults on what to capture since some of this context is verbose, but OOTB some reasonable defaults would make this more operator and developer friendly to use.
This does seem like a common issue across any version of the library.. .NET, Java, Node/JS all have OpenTelemetry SDKs to enable this.
Are there hooks where an OpenTelemetry add-on package/library could be created? Many libraries do it this way. If the hooks are there, an add-on would be registered on top of the library.
Is your feature request related to a problem? Please describe.
As a developer or operator of a system using OpenTelemetry behind a REST API, the calls through the Kusto Query client library show up in App Insights (and other monitoring/telemetry tools) as pain HTTP dependency calls. Challenges:
These can be implemented by each client, but many modern
Describe the solution you'd like
This work could be done inside of the Kusto client library or possibly as an add-on package in there are sufficient hooks to wire into it from the outside. (If not and an external instrumentation library is preferred, it may be necessary to add some hooks into a PR into this library).
Enhance the Kusto Query (and possibly ingest) client using OpenTelemetry, creating a Span around the call into
.execute
method. On that Span add attributes like:Query:
Response:
The Span would have the client perspective on query duration and wrap the downstream HTTP request which would still land in OpenTelemetry through instrumentation of underlying HTTP client libraries.
This would allow joining the client Span data (in App Insights, the
dependency
table) to ADX's Azure Monitor server-side query logs. This makes it easy to trace a client-side performance issue to the server-side perspective.Do this by capturing the OpenTelemetry
spanContext.traceId
into theclientRequestProperties.clientRequestId
property in.execute
. (Use this as the default unless the client passes a value in for this into.execute
I haven't worked enough with the Ingest side of this library to have an opinion on how OpenTelemetry could add value there. I suggest starting with the query side for this first cut.
Describe alternatives you've considered
Currently wrapping the KustClient to add this functionality. It has been very useful in the short time we have had it.
Additional context
Prior art:
I may have a hackathon team interested in working on this and contributing it here... TBD.
The text was updated successfully, but these errors were encountered: