-
Notifications
You must be signed in to change notification settings - Fork 619
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add latency metrics for DiscoverPollEndpoint and ACS #4454
Conversation
@@ -776,6 +782,7 @@ func (client *ecsClient) discoverPollEndpoint(containerInstanceArn string, | |||
} | |||
return nil, err | |||
} | |||
client.metricsFactory.New(metrics.DiscoverPollEndpointDurationName).WithGauge(time.Since(discoverPollEndpointStartTime)).Done(nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides this gauge metric, can we emit one more counter metric to monitor if the discoverPollEndpoint call is successful or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, think this will be a good addition to get more insights into DiscoverPollEndpoint()
// WithMetricsFactory is an ECSClientOption that configures | ||
// ecsClient.metricsFactory with the value passed as a parameter. | ||
// This is especially useful for emitting metrics in the ECS Client | ||
func WithMetricsFactory(metricsFactory metrics.EntryFactory) ECSClientOption { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
ecs-agent/acs/session/session.go
Outdated
@@ -262,8 +271,13 @@ func (s *session) startSessionOnce(ctx context.Context) error { | |||
}) | |||
return err | |||
} | |||
acsConnectionMetric.Done(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
withGauge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And besides the duration metric, can we also have a counter metric for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
withGauge()
is used to add any value to a given metric - useful when we want to add info to a metric outside of what's already provided (ie startTime, metric name, etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the purpose for the counter metric?
ecs-agent/metrics/constants.go
Outdated
@@ -46,6 +46,11 @@ const ( | |||
ACSDisconnectTimeoutMetricName = agentAvailabilityNamespace + ".ACSDisconnectTimeout" | |||
TCSDisconnectTimeoutMetricName = agentAvailabilityNamespace + ".TCSDisconnectTimeout" | |||
|
|||
// ACS Session Metrics | |||
acsStartSessionNamespace = "ACSStartSession" | |||
DiscoverPollEndpointDurationName = acsStartSessionNamespace + ".DiscoverPollEndpointDuration" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use a different namespace/prefix for this DiscoverPollEndpointDurationName
? because discoverPollEndpoint is not for ACS only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh right yes good point ill go ahead and change that
|
||
// WithMetricsFactory is an ECSClientOption that configures | ||
// ecsClient.metricsFactory with the value passed as a parameter. | ||
// This is especially useful for emitting metrics in the ECS Client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit. Comments end with period.
|
||
mockACSConnectEndpointEntry := mock_metrics.NewMockEntry(ctrl) | ||
mockACSConnectEndpointEntry.EXPECT().Done(gomock.Any()).AnyTimes() | ||
mockMetricsFactory.EXPECT().New("ACSStartSession.ACSConnectEndpointDuration").Return(mockACSConnectEndpointEntry).AnyTimes() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit. Import the metric instead of hard-coding it.
Summary
Adding new metrics and variables in acs session
Implementation details
Added new metrics and variables in acs session for
StartSessionOnce
Testing
Tested by running all unit tests in acs
New tests cover the changes:
No new tests, but modified existing tests to account for new metrics (ie mocked the calls to create metrics)
Description for the changelog
Enhancement - add metrics to monitor latency in acs session
Additional Information
Does this PR include breaking model changes? If so, Have you added transformation functions?
No
Does this PR include the addition of new environment variables in the README?
No
Licensing
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.