-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: introduce first-class origin tags abstraction #441
Conversation
Regression Detector (DogStatsD)Regression Detector ResultsRun ID: b95e7750-c31f-4dbc-81d2-1ba814485db5 Baseline: 7.62.0-rc.8 Optimization Goals: ✅ No significant changes detected
|
perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
---|---|---|---|---|---|---|
➖ | dsd_uds_100mb_3k_contexts_distributions_only | memory utilization | +1.56 | [+1.36, +1.76] | 1 | |
➖ | dsd_uds_100mb_3k_contexts | ingress throughput | +0.00 | [-0.05, +0.05] | 1 | |
➖ | dsd_uds_40mb_12k_contexts_40_senders | ingress throughput | +0.00 | [-0.00, +0.00] | 1 | |
➖ | dsd_uds_500mb_3k_contexts | ingress throughput | +0.00 | [-0.01, +0.01] | 1 | |
➖ | dsd_uds_1mb_3k_contexts_dualship | ingress throughput | +0.00 | [-0.00, +0.00] | 1 | |
➖ | dsd_uds_1mb_50k_contexts_memlimit | ingress throughput | +0.00 | [-0.00, +0.00] | 1 | |
➖ | dsd_uds_100mb_250k_contexts | ingress throughput | -0.00 | [-0.00, +0.00] | 1 | |
➖ | dsd_uds_1mb_50k_contexts | ingress throughput | -0.00 | [-0.00, +0.00] | 1 | |
➖ | dsd_uds_512kb_3k_contexts | ingress throughput | -0.00 | [-0.01, +0.01] | 1 | |
➖ | dsd_uds_1mb_3k_contexts | ingress throughput | -0.00 | [-0.00, +0.00] | 1 | |
➖ | quality_gates_idle_rss | memory utilization | -0.00 | [-0.08, +0.08] | 1 | |
➖ | dsd_uds_10mb_3k_contexts | ingress throughput | -0.01 | [-0.02, +0.00] | 1 |
Bounds Checks: ❌ Failed
perf | experiment | bounds_check_name | replicates_passed | links |
---|---|---|---|---|
❌ | quality_gates_idle_rss | memory_usage | 0/10 |
Explanation
Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:
-
Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
-
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
-
Its configuration does not mark it "erratic".
Regression Detector LinksExperiment Result Links
|
d9aba07
to
80ef087
Compare
ec65bf8
to
30c0540
Compare
80ef087
to
b21a8c5
Compare
30c0540
to
71ee9c4
Compare
fe4c64e
to
71ee9c4
Compare
71ee9c4
to
623ad73
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note on something it'd be nice to simplify and/or document a bit more, which can certainly be a followup.
/// See [`OriginEnrichmentConfiguration`] for more details. | ||
#[serde(default)] | ||
origin_enrichment: OriginEnrichmentConfiguration, | ||
pub struct DogStatsDConfiguration<R = ()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice not to add this generic, since I don't think it's really serving any meaningful purpose. You can't build the source with the default unit type, and there's really only one implementation of the trait.
More generally, abstracting with the trait leads to a bigger stack of nouns to hold in your head when reviewing this stuff (i.e. WorkloadProvider
-> OriginTagsResolver
-> OriginTagsQuerier
-> TagStoreQuerier
+ DogStatsDOriginTagResolver
), which is especially confusing because at least TagStoreQuerier
is used at multiple levels. That's not to say the design is wrong, it's just complex. If there's a workable way to simplify it, great, and if not then it might be useful have some kind of ascii diagram in a comment that illustrates how all the pieces fit together 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do an immediate follow-up PR to clean up all of this and remove the generic because I don't disagree that it's a bit intertwined.
Context
Currently, we've frontloaded origin detection/enrichment in the ingestion phase, where we strive to both resolve the origin of a metric and then also resolve the tags for that origin, cache them, and return a stable reference to those tags.... all before we resolve the context.
This technically works, but has one major issue: origin tags arrive incrementally from the Core Agent, especially during startup. Taking this approach, which necessitates caching to maintain reasonable performance, leads to issues with missing tags and effectively breaks origin enrichment.
Solution
We've taken the tack of breaking apart the resolving of an origin -- taking the raw identifiers and computing the unique origin identifier -- and the resolving of an origins tags.
Our new approach maintains much of the same machinery -- turning a raw
OriginInfo<'a>
into anOriginKey
that is later used to look up the origin tags -- but we've added the aspect of deferring the actual resolution of the origin tags until they're queried.To support this, we've created a new abstraction for origin tags called
OriginTags
. This type is the facade to visiting the tags attached to a given origin. It is powered byOriginKey
, and a new trait,OriginTagsResolver
, that replaces the oldOriginEnricher
. This allows us to resolve a stable origin identifier (OriginKey
), bundle that identifier (and the resolver used to get it) intoOriginTags
itself, and then at the point where the origin tags are querier, the resolver is used again to look up the tags associated with theOriginKey
.Instead of merging and allocating storage for these tags, we've also added a new abstraction --
OriginTagsVisitor
-- that is used to visit the tags attached to an origin. This lets us maintain a single copy of the tags inTagStore
and simply iterate over them on-demand when requested.A new type,
OriginTagsQuerier
, manages the link between the tag store, External Data resolver, and the logic to determine how to "resolve" an origin to its constituent parts -- pod, container, etc -- and sits withinsaluki-env
. This type is wired into the ADP-specific workload provider, and bridged back tosaluki-context
by implementing the newOriginTagsResolver
, such that configuring the DogStatsD source to use it is as simple as referencing the existing workload provider used in ADP.Additionally, the
Tagged
implementation forContext
now transparently visits the origin tags as well, which means both instrumented and origin tags are transparently chained together when visiting the tags for a context.