fix: merge gauges properly and avoid dropping existing points #444
Conversation
-    (Self::Gauge(a), Self::Gauge(b)) => *a = b,
+    (Self::Gauge(a), Self::Gauge(b)) => a.merge(b, merge_scalar_latest),
This is the fix, basically. Right here. This one lil guy.
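For context, a rough sketch of the behavioral difference. The types and the resolver below are illustrative stand-ins, not the actual Saluki definitions: the old arm replaced the entire set of gauge points with the incoming set, while a timestamp-aware merge keeps points from both sides and only resolves a conflict when two points share a timestamp.

```rust
use std::collections::BTreeMap;

// Illustrative stand-in for a gauge's set of points, keyed by timestamp.
type Points = BTreeMap<u64, f64>;

// Illustrative resolver in the spirit of a "merge_scalar_latest": when two
// points share a timestamp, the incoming value wins.
fn merge_scalar_latest(_existing: f64, incoming: f64) -> f64 {
    incoming
}

// Old behavior (`*a = b`): the incoming points wholesale replace the existing
// ones, silently dropping any existing point at a timestamp `b` doesn't carry.
fn merge_replace(a: &mut Points, b: Points) {
    *a = b;
}

// New behavior: keep points from both sides, and only apply the resolver when
// both sides carry a point at the same timestamp.
fn merge_by_timestamp(a: &mut Points, b: Points, resolve: fn(f64, f64) -> f64) {
    for (ts, value) in b {
        a.entry(ts)
            .and_modify(|existing| *existing = resolve(*existing, value))
            .or_insert(value);
    }
}
```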
Regression Detector (DogStatsD)

Regression Detector Results

Run ID: a37173bd-7754-4503-9ef7-af66b2e5a1c7
Baseline: 7.62.0-rc.8

Optimization Goals: ✅ No significant changes detected
perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
---|---|---|---|---|---|---|
➖ | dsd_uds_100mb_3k_contexts_distributions_only | memory utilization | +1.48 | [+1.28, +1.69] | 1 | |
➖ | quality_gates_idle_rss | memory utilization | +0.42 | [+0.31, +0.52] | 1 | |
➖ | dsd_uds_10mb_3k_contexts | ingress throughput | +0.00 | [-0.00, +0.00] | 1 | |
➖ | dsd_uds_1mb_3k_contexts | ingress throughput | +0.00 | [-0.00, +0.00] | 1 | |
➖ | dsd_uds_512kb_3k_contexts | ingress throughput | +0.00 | [-0.01, +0.01] | 1 | |
➖ | dsd_uds_1mb_50k_contexts | ingress throughput | +0.00 | [-0.00, +0.00] | 1 | |
➖ | dsd_uds_100mb_250k_contexts | ingress throughput | -0.00 | [-0.00, +0.00] | 1 | |
➖ | dsd_uds_1mb_3k_contexts_dualship | ingress throughput | -0.00 | [-0.00, +0.00] | 1 | |
➖ | dsd_uds_1mb_50k_contexts_memlimit | ingress throughput | -0.00 | [-0.00, +0.00] | 1 | |
➖ | dsd_uds_100mb_3k_contexts | ingress throughput | -0.00 | [-0.05, +0.04] | 1 | |
➖ | dsd_uds_40mb_12k_contexts_40_senders | ingress throughput | -0.00 | [-0.00, +0.00] | 1 | |
➖ | dsd_uds_500mb_3k_contexts | ingress throughput | -0.27 | [-0.30, -0.24] | 1 |
Bounds Checks: ❌ Failed
perf | experiment | bounds_check_name | replicates_passed | links |
---|---|---|---|---|
❌ | quality_gates_idle_rss | memory_usage | 0/10 |
Explanation
Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".
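The three criteria above amount to a simple decision rule. A minimal sketch, using hypothetical field names rather than the detector's actual schema:

```rust
// Hypothetical summary of one experiment's result (not the detector's real data model).
struct ExperimentResult {
    delta_mean_pct: f64, // estimated Δ mean %
    ci_low_pct: f64,     // lower bound of the 90% CI on Δ mean %
    ci_high_pct: f64,    // upper bound of the 90% CI on Δ mean %
    erratic: bool,       // whether the experiment's configuration marks it "erratic"
}

// Flag a regression only if the effect size clears the tolerance, the
// confidence interval excludes zero, and the experiment isn't erratic.
fn is_regression(r: &ExperimentResult, tolerance_pct: f64) -> bool {
    let big_enough = r.delta_mean_pct.abs() >= tolerance_pct;
    let ci_excludes_zero = r.ci_low_pct > 0.0 || r.ci_high_pct < 0.0;
    big_enough && ci_excludes_zero && !r.erratic
}
```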
Regression Detector (Saluki)

Regression Detector Results

Run ID: bb0a943d-79e1-47f1-9d95-58230bc95bbd
Baseline: 3576732

❌ Experiments with missing or malformed data

This is a critical error. No usable optimization goal data was produced by the listed experiments. This may be a result of misconfiguration. Ping #single-machine-performance and we can help out.

Optimization Goals: ✅ No significant changes detected
perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
---|---|---|---|---|---|---|
➖ | dsd_uds_500mb_3k_contexts | ingress throughput | +3.26 | [+3.14, +3.39] | 1 | |
➖ | dsd_uds_100mb_3k_contexts_distributions_only | memory utilization | +0.86 | [+0.73, +0.99] | 1 | |
➖ | dsd_uds_50mb_10k_contexts_no_inlining | ingress throughput | +0.02 | [-0.05, +0.09] | 1 | |
➖ | dsd_uds_50mb_10k_contexts_no_inlining_no_allocs | ingress throughput | +0.01 | [-0.05, +0.07] | 1 | |
➖ | dsd_uds_1mb_3k_contexts_dualship | ingress throughput | +0.00 | [-0.00, +0.01] | 1 | |
➖ | dsd_uds_40mb_12k_contexts_40_senders | ingress throughput | +0.00 | [-0.02, +0.03] | 1 | |
➖ | dsd_uds_100mb_3k_contexts | ingress throughput | +0.00 | [-0.05, +0.05] | 1 | |
➖ | dsd_uds_512kb_3k_contexts | ingress throughput | +0.00 | [-0.01, +0.01] | 1 | |
➖ | dsd_uds_1mb_50k_contexts | ingress throughput | -0.01 | [-0.02, +0.01] | 1 | |
➖ | dsd_uds_1mb_3k_contexts | ingress throughput | -0.01 | [-0.03, +0.00] | 1 | |
➖ | dsd_uds_10mb_3k_contexts | ingress throughput | -0.02 | [-0.04, +0.01] | 1 | |
➖ | quality_gates_idle_rss | memory utilization | -0.12 | [-0.15, -0.09] | 1 | |
➖ | dsd_uds_1mb_50k_contexts_memlimit | ingress throughput | -3.50 | [-4.00, -3.01] | 1 |
Bounds Checks: ✅ Passed
perf | experiment | bounds_check_name | replicates_passed | links |
---|---|---|---|---|
✅ | quality_gates_idle_rss | memory_usage | 10/10 |
Explanation
Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".
Regression Detector Links

Experiment Result Links
Force-pushed from b1cda7c to d59b379
Force-pushed from c695664 to 0a73476
Force-pushed from d59b379 to 36f5486
Force-pushed from 5287ff7 to 0d1856f
…flushed during aggregation
Force-pushed from 0d1856f to 0549c44
Context
When metrics are merged in the aggregate transform, we call MetricValues::merge, which is meant to uniformly handle the merge of different points for all metric types. This method handles both the merging of points based on their timestamp, as well as the specific merge behavior for the point values themselves.

An issue was pointed out with how ADP was handling gauges, and after some debugging, it became apparent that when merging gauges, we were actually ignoring the timestamp, which meant that if you had an existing point for a gauge, say with timestamp X, and you merged in another point with timestamp Y, the first point would be entirely dropped.
This manifested itself as a fun bug where, due to the bucket width and flush interval defaults, a gauge being reported every 10 seconds was only ever emitted every 30 seconds.
Solution
While gauges nominally have "last write wins" behavior, that logic only applies for a given timestamp. This PR fixes the logic of merging gauges such that we don't drop the previous points, and only apply the LWW behavior when merging two points with an identical timestamp.
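As a concrete illustration of that rule (toy values, and a plain map standing in for the actual MetricValues type): a point at a new timestamp is kept alongside the existing one, and only a point at the same timestamp overwrites the previous value.

```rust
use std::collections::BTreeMap;

fn main() {
    // Toy stand-in for a gauge's points; not the actual MetricValues API.
    let mut gauge: BTreeMap<u64, f64> = BTreeMap::from([(10, 5.0)]);

    // Merge in a point at a different timestamp: both points survive.
    gauge.entry(20).and_modify(|v| *v = 7.0).or_insert(7.0);
    assert_eq!(gauge.len(), 2);

    // Merge in a point at the same timestamp: last write wins.
    gauge.entry(10).and_modify(|v| *v = 9.0).or_insert(9.0);
    assert_eq!(gauge[&10], 9.0);
    assert_eq!(gauge.len(), 2);
}
```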
Further, we've added a number of unit tests to ensure the merging behavior is correct for each metric type. We've additionally added a number of From and Display impls on the related types... some of which we only needed for initial debugging (but might be valuable in the future) and some of which are used by the unit tests.
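For example, one of those conversion/formatting helpers might look roughly like this (names here are hypothetical; the real impls live on the crate's own metric value types), where From keeps test fixtures terse and Display keeps assertion failures readable:

```rust
use std::fmt;

// Hypothetical point type for illustration only.
struct GaugePoint {
    timestamp: u64,
    value: f64,
}

// A `From` impl like this lets tests build points from plain tuples.
impl From<(u64, f64)> for GaugePoint {
    fn from((timestamp, value): (u64, f64)) -> Self {
        Self { timestamp, value }
    }
}

// A `Display` impl keeps debugging output and test failure messages readable.
impl fmt::Display for GaugePoint {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[{}] {}", self.timestamp, self.value)
    }
}
```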