ddtrace/tracer: add integration tag to spans_started/finished #3023

hannahkm · 2024-12-10T16:06:19Z

What does this PR do?

Add an integration tag to the existing datadog.tracer.spans_started and datadog.tracer.spans_finished metrics. The value of the tag is the name of the component that started the span. For a contrib, it is the name of the contrib package (chi, net/http, etc.). For spans that were created manually, the tag value is manual.
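As a rough illustration of the tag values described above, here is a minimal sketch. The helper name `integrationTag` and the `integration:<name>` string layout are assumptions made for this example, not code taken from the PR.

```go
package main

import "fmt"

// integrationTag is a hypothetical helper that builds the tag attached to
// the spans_started / spans_finished metrics. An empty component name is
// treated as a manually created span.
func integrationTag(component string) string {
	if component == "" {
		component = "manual"
	}
	// Plain concatenation keeps this cheap on a hot path.
	return "integration:" + component
}

func main() {
	fmt.Println(integrationTag("chi"))
	fmt.Println(integrationTag("net/http"))
	fmt.Println(integrationTag(""))
}
```

A contrib span would carry its package name, while a span started directly by user code would fall back to manual.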

Motivation

We want to know, in addition to when a span is started, where the span originated from. This could be a contrib or a manual implementation.

Reviewer's Checklist

Changed code has unit tests for its functionality at or near 100% coverage.
System-Tests covering this feature have been added and enabled with the va.b.c-dev version tag.
There is a benchmark for any new code, or changes to existing code.
If this interacts with the agent in a new way, a system test has been added.
Add an appropriate team label so this PR gets put in the right place for the release notes.
Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild.
For internal contributors, a matching PR should be created to the v2-dev branch and reviewed by @DataDog/apm-go.

Unsure? Have a question? Request a review!

datadog-datadog-prod-us1 · 2024-12-10T16:13:11Z

Datadog Report

Branch report: apm-rd/span-source-health-metric
Commit report: ddbc079
Test service: dd-trace-go

✅ 0 Failed, 5199 Passed, 72 Skipped, 2m 55.39s Total Time

pr-commenter · 2024-12-10T17:48:35Z

Benchmarks

Benchmark execution time: 2025-01-17 16:08:31

Comparing candidate commit 6899d90 in PR branch apm-rd/span-source-health-metric with baseline commit 382c311 in branch main.

Found 1 performance improvements and 5 performance regressions! Performance is the same for 53 metrics, 0 unstable metrics.

scenario:BenchmarkInjectW3C-24

🟩 execution_time [-133.484ns; -112.916ns] or [-3.243%; -2.743%]

scenario:BenchmarkSetTagStringPtr-24

🟥 execution_time [+3.976ns; +6.764ns] or [+2.325%; +3.956%]

scenario:BenchmarkSingleSpanRetention/no-rules-24

🟥 execution_time [+8.931µs; +9.840µs] or [+3.811%; +4.199%]

scenario:BenchmarkSingleSpanRetention/with-rules/match-all-24

🟥 execution_time [+8.712µs; +9.619µs] or [+3.684%; +4.068%]

scenario:BenchmarkSingleSpanRetention/with-rules/match-half-24

🟥 execution_time [+8.676µs; +11.137µs] or [+3.668%; +4.709%]

scenario:BenchmarkTracerAddSpans-24

🟥 execution_time [+170.625ns; +190.975ns] or [+4.435%; +4.964%]

mtoffl01

OK, so you're reporting spansStarted/spansFinished on span.Start/span.Finished if the integration is not empty, and leaving the chunk reporting to any spans that are manual... I understand why you did this, but I'm not totally sure about the approach.

span.Start and span.Stop are typically called quite frequently, so if a majority of the spans are from automatic integrations, this will be very noisy (and defeats the purpose of reporting the metrics at a specified interval, which is to reduce noise).

One alternative idea:
Change the way we track spansStarted and spansFinished to be some kind of counter map that includes the integration name, e.g. map[string]uint32, where the key is the integration name and the value is the count of spans started/finished for that integration name. Then, in this goroutine, we'll have to iterate over the map and report the spans started/finished per integration.
(Or some other idea I haven't thought of?)
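The counter-map idea above could look roughly like the following sketch. The type `spanCounter` and its methods are hypothetical names for illustration, and a mutex is assumed for safety; the PR's final code may differ (later commits mention xsync and atomics).

```go
package main

import (
	"fmt"
	"sync"
)

// spanCounter is a hypothetical per-integration counter guarded by a mutex.
type spanCounter struct {
	mu     sync.Mutex
	counts map[string]uint32
}

func newSpanCounter() *spanCounter {
	return &spanCounter{counts: make(map[string]uint32)}
}

// inc records one span for the given integration name.
func (c *spanCounter) inc(integration string) {
	c.mu.Lock()
	c.counts[integration]++
	c.mu.Unlock()
}

// flush returns the counts accumulated since the previous flush and
// replaces the internal map with an empty one.
func (c *spanCounter) flush() map[string]uint32 {
	c.mu.Lock()
	defer c.mu.Unlock()
	out := c.counts
	c.counts = make(map[string]uint32)
	return out
}

func main() {
	started := newSpanCounter()
	started.inc("chi")
	started.inc("chi")
	started.inc("manual")
	// A periodic reporting goroutine would do this once per interval.
	for name, n := range started.flush() {
		fmt.Printf("spans_started integration:%s count=%d\n", name, n)
	}
}
```

With this shape, span start and finish only touch an in-memory counter, and the metric is emitted once per interval per integration.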

ddtrace/mocktracer/mockspan.go

ddtrace/mocktracer/mockspan_test.go

hannahkm · 2024-12-17T20:01:43Z

@mtoffl01 Good points! A map would probably work better; I was hesitant at first since I didn't want to change too much of what already exists, but knowing that these metrics are pretty old... I'm more down to change it up now.

darccio · 2024-12-20T13:12:34Z

@hannahkm I'm approving this but we should investigate why the benchmarks report the increased allocations.

mtoffl01

Overall, I definitely have some concerns 🤔 Maybe you can write some additional tests to provide peace of mind...

Tests designed to try to make the system fail: what happens when multiple goroutines call a start span / finish span method at the same time? Can we prove that we've protected against a race condition?
Maybe you want to write dedicated benchmarks to show how much performance is impacted.
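One way to exercise the concurrency concern above is to hammer a counter from many goroutines and check that no increments are lost. This is a sketch using only the standard library (`sync.Map` holding `*atomic.Int64`, Go 1.19+); the names `atomicCounters`, `inc`, `drain`, and `hammer` are assumptions for illustration and are not the tracer's API.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// atomicCounters is a hypothetical per-integration counter set.
type atomicCounters struct {
	m sync.Map // integration name -> *atomic.Int64
}

// inc adds one to the counter for the given integration.
func (c *atomicCounters) inc(integration string) {
	v, _ := c.m.LoadOrStore(integration, new(atomic.Int64))
	v.(*atomic.Int64).Add(1)
}

// drain swaps every counter to zero and returns the non-zero values read.
func (c *atomicCounters) drain() map[string]int64 {
	out := make(map[string]int64)
	c.m.Range(func(k, v any) bool {
		if n := v.(*atomic.Int64).Swap(0); n > 0 {
			out[k.(string)] = n
		}
		return true
	})
	return out
}

// hammer starts workers goroutines that each call inc perWorker times,
// and waits for all of them to finish.
func hammer(c *atomicCounters, integration string, workers, perWorker int) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				c.inc(integration)
			}
		}()
	}
	wg.Wait()
}

func main() {
	var c atomicCounters
	hammer(&c, "chi", 8, 1000)
	fmt.Println(c.drain()["chi"])
}
```

Running a test like this with the race detector enabled (`go test -race` or `go run -race`) is what would surface unsynchronized map access.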

ddtrace/tracer/tracer.go

Co-authored-by: Mikayla Toffler <[email protected]>

ddtrace/tracer/tracer.go

ddtrace/tracer/metrics.go

ddtrace/tracer/tracer.go

ddtrace/tracer/spancontext.go

ddtrace/tracer/metrics.go

mtoffl01

It lgtm, BUT I would recommend we have more complex tests for SpansStarted and SpansFinished; like, generating multiple spans of different integrations and checking the metrics reported (rather than just 1 span).
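A test along the lines suggested above might check that several integrations each produce their own tagged count. The sketch below uses a hypothetical in-memory `fakeStatsd` recorder and a hypothetical `report` function; neither is the repository's test helper.

```go
package main

import (
	"fmt"
	"sort"
)

// call records one Count invocation on the fake client.
type call struct {
	name  string
	value int64
	tag   string
}

// fakeStatsd is a hypothetical stand-in for a statsd client that only
// remembers what was sent to it.
type fakeStatsd struct{ calls []call }

func (f *fakeStatsd) Count(name string, value int64, tag string) {
	f.calls = append(f.calls, call{name, value, tag})
}

// report emits one count per integration, in sorted order so that the
// output is deterministic and easy to assert on.
func report(s *fakeStatsd, metric string, counts map[string]int64) {
	names := make([]string, 0, len(counts))
	for n := range counts {
		names = append(names, n)
	}
	sort.Strings(names)
	for _, n := range names {
		s.Count(metric, counts[n], "integration:"+n)
	}
}

func main() {
	s := &fakeStatsd{}
	report(s, "datadog.tracer.spans_started",
		map[string]int64{"chi": 3, "net/http": 2, "manual": 1})
	for _, c := range s.calls {
		fmt.Println(c.name, c.value, c.tag)
	}
}
```

Asserting on each recorded call, rather than on a single span, covers the case where counts from one integration leak into another.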

ddtrace/mocktracer/mockspan.go

ddtrace/tracer/metrics.go

ddtrace/tracer: use ext.Component to report source of new spans

1aa93b8

hannahkm added 2 commits December 10, 2024 11:14

ddtrace/tracer: apply source to finished spans health metric

0a51b8a

ddtrace/tracer: check for nil span before checking source

7601d35

hannahkm added 9 commits December 10, 2024 14:25

ddtrace/mocktracer: update mockspan to also hold source

13ecd2c

contrib: check for correct source on mockspans in tests

31aa679

contrib: remove incorrect checks for source

819e312

ddtrace/tracer: check for appropriate tag in spans_started metric

24ad484

ddtrace/tracer: test for different values of source

a8f097d

contrib,ddtrace/tracer: rename source to integration

feb73d7

ddtrace/mocktracer: replace missed source with integration

7fdb0c8

ddtrace/tracer: fix false positives in test

4b609a2

ddtrace/tracer: create test for spans_finished integration tag

f061f22

hannahkm changed the title ~~ddtrace/tracer: add source tag to spans_started health metric~~ ddtrace/tracer: add integration tag to spans_started/finished Dec 12, 2024

github-actions bot added the apm:ecosystem contrib/* related feature requests or bugs label Dec 12, 2024

ddtrace/tracer: fix failing smoke tests

236cb25

mtoffl01 requested changes Dec 17, 2024

ddtrace/mocktracer/mockspan.go (outdated)

ddtrace/mocktracer/mockspan.go (outdated)

ddtrace/mocktracer/mockspan_test.go (outdated)

hannahkm added 4 commits December 17, 2024 15:28

ddtrace/tracer: use map to keep track of spans started and finished

090c79c

ddtrace/tracer: fix races when accessing maps

c7db44d

ddtrace/tracer: replace sprintf usage with concat

df8f03e

Merge remote-tracking branch 'origin' into apm-rd/span-source-health-…

5cf2e2e

…metric

hannahkm marked this pull request as ready for review December 19, 2024 21:27

hannahkm requested review from a team as code owners December 19, 2024 21:27

darccio approved these changes Dec 20, 2024

mtoffl01 reviewed Dec 20, 2024

ddtrace/tracer/tracer.go (outdated)

ddtrace/tracer/tracer.go (outdated)

hannahkm and others added 7 commits January 3, 2025 13:28

ddtrace/tracer: improve testname

c005d73

Co-authored-by: Mikayla Toffler <[email protected]>

ddtrace/tracer: convert map to use int64 instead of uint32

cf97626

ddtrace/tracer: fix missed type change

f3505c9

ddtracer/tracer: check that metric counts revert to 0 after reporting

dbf336c

ddtrace/tracer: use xsync Map for spansStarted and Finished

c5f05f5

internal/exectracetest: go mod tidy

32c20f7

try using delete instead of clear

5d56dc5

felixge reviewed Jan 7, 2025

ddtrace/tracer/tracer.go (outdated)

hannahkm added 7 commits January 8, 2025 13:08

use atomic.int64 instead of int64

f4e7820

use waitgroups to control goroutines

aca72be

Merge remote-tracking branch 'origin' into apm-rd/span-source-health-…

2e98ead

…metric

Merge remote-tracking branch 'origin' into apm-rd/span-source-health-…

fc70dca

…metric

fix flaky attempt: add sleep time

889f6f6

Merge branch 'main' into apm-rd/span-source-health-metric

7ea2d4c

properly check for integration tags on metric spans

4c93020

mtoffl01 reviewed Jan 10, 2025

hannahkm added 6 commits January 10, 2025 16:32

nit: comment and doc clarifications

2ea545c

don't return early from reporting span metrics

8983bbc

remove unnecessary for loop for counting span metrics

ce1b7eb

use atomic.Swap() to reset values to 0

31cb904

Merge branch 'main' into apm-rd/span-source-health-metric

facb353

increase statsInterval to resolve flaking

753232f

mtoffl01 approved these changes Jan 14, 2025

ddtrace/mocktracer/mockspan.go (outdated)

ddtrace/tracer/metrics.go

hannahkm added 7 commits January 15, 2025 11:20

review fixes

9fca3e9

test metrics with multiple different integrations

88c1a4b

Merge branch 'main' into apm-rd/span-source-health-metric

9b92152

fix test and hopefully fix a flake?

e908760

remove sleep to fix flake in test

28d392f

Merge remote-tracking branch 'origin' into apm-rd/span-source-health-…

cfdbc7f

…metric

Merge branch 'main' into apm-rd/span-source-health-metric

6899d90
