chore: modify wantLastValueMetric() to verify the last value of a metric #627

Closed
wants to merge 7 commits into from

rhatgadkar-goog
Contributor

No description provided.

@rhatgadkar-goog
Contributor Author

These errors occur because the open_connections metric has the value 0. A likely cause is that the metric is updated in a goroutine inside the Dial() function: https://github.com/GoogleCloudPlatform/alloydb-go-connector/blob/623-wantlastvaluemetric/dialer.go#L377. That goroutine may not have run yet when the test reads the metric.

rhatgadkar-goog force-pushed the 623-wantlastvaluemetric branch from 400b1b7 to 6508352 on October 1, 2024 22:45
rhatgadkar-goog changed the title from "fix: modify wantLastValueMetric() to verify the last value of a metric" to "chore: modify wantLastValueMetric() to verify the last value of a metric" on Oct 1, 2024
rhatgadkar-goog force-pushed the 623-wantlastvaluemetric branch 3 times, most recently from 69fd4d4 to 896d256, on October 2, 2024 16:29
@rhatgadkar-goog
Contributor Author

I found something interesting about this issue. It does not happen when running only the TestDialerWithMetrics test, like this: go test -run TestDialerWithMetrics. But it happens intermittently when running all the short tests, like this: go test -short. So the test is flaky only when it runs alongside the other short tests.

To help debug this, I added print statements in RecordOpenConnections(), in wantLastValueMetric(), and at the beginning and end of TestDialerWithMetrics.

When running only TestDialerWithMetrics, the value of open_connections is consistently 2 once both connections are dialed, so its last value is always 2:
[Screenshot: singletest]

But when running all the short tests, the last value of open_connections fluctuates between 0 and 2, which makes the test flaky.
When TestDialerWithMetrics passed:
[Screenshot: passingshorttests]
When TestDialerWithMetrics failed:
[Screenshot: failingshorttests]

I suspect the 0 value comes from tests that ran before TestDialerWithMetrics. I tried to fix this by having wantLastValueMetric() process only the metrics recorded after TestDialerWithMetrics started (by reading each metric's timestamp), but that didn't work.

I think this issue may take time to troubleshoot. Since it's not blocking anything, we can hold off on working on it for now.

enocom force-pushed the 623-wantlastvaluemetric branch from 44e7dfc to 0c37f77 on October 9, 2024 17:59
enocom force-pushed the 623-wantlastvaluemetric branch from 0c37f77 to d5ac5e4 on October 9, 2024 17:59
@nancynh
Collaborator

nancynh commented Oct 10, 2024

@rhatgadkar-goog is this ready for review? I see y'all have managed to get the tests passing :)

@rhatgadkar-goog
Contributor Author

No, it's not ready for review. Eno and I discussed this, and there's still more we need to investigate. We're keeping this as low priority for now.

@enocom
Member

enocom commented Jan 14, 2025

Going to close this since we're not actively working on it. We can re-open if we pick this back up.

enocom closed this Jan 14, 2025
Development

Successfully merging this pull request may close these issues.

wantLastValueMetric() is not verifying the last value of a metric
3 participants