algod: Add static EnableTelemetry retry #6183
base: master
This PR does not solve the HeartBeat race, where the HB is sometimes not sent via telemetry.
I'm ok with adding the goroutine loop, but if we're doing that, let's only do that. We can remove the initial attempt to enable, and only enable in the loop.
This change will make static telemetry init fully async instead of sync.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6183      +/-   ##
==========================================
- Coverage   55.70%   54.90%   -0.81%
==========================================
  Files         494      494
  Lines       69972    69976       +4
==========================================
- Hits        38981    38422     -559
- Misses      28276    28875     +599
+ Partials     2715     2679      -36
@urtho want to merge in master here and we'll get this pulled in?
}
fmt.Fprintln(os.Stdout, "error creating telemetry hook", err)
// Try to reenable every minute
time.Sleep(time.Minute)
Do we want this to go on indefinitely?
I'm inclined to say yes. It's only once per minute. If that seems like too much, maybe double the time each time, up to, say, 10 minutes?
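For reference, the doubling suggested above could look roughly like the sketch below. The helper name `nextDelay` and its signature are made up for illustration and are not part of this PR; the 1 minute start and 10 minute cap come from the comment.

```go
package main

import (
	"fmt"
	"time"
)

// nextDelay doubles the current retry delay and caps it at max.
// Hypothetical helper for illustration; not taken from the PR diff.
func nextDelay(cur, max time.Duration) time.Duration {
	next := cur * 2
	if next > max {
		return max
	}
	return next
}

func main() {
	// Start at one minute and cap at ten, as proposed in the review comment.
	d := time.Minute
	for i := 0; i < 6; i++ {
		fmt.Println("next retry in", d)
		d = nextDelay(d, 10*time.Minute)
	}
}
```

With these values the delays would run 1, 2, 4, 8, 10, 10 minutes, so a long outage settles at one attempt every 10 minutes.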
To keep those, how about a short sleep, say 2 seconds, after starting the goroutine? If you want to get very fancy, you could have the init code write to a channel when it completes, and the sleep could be a select on the channel or until a few seconds timer expires. But I would be happy with a short sleep and a comment explaining the reasoning.
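A minimal sketch of the "fancy" variant, assuming the init goroutine closes a channel when it finishes. `waitForInit` is a hypothetical name, and the 50 ms sleep only stands in for the real telemetry setup.

```go
package main

import (
	"fmt"
	"time"
)

// waitForInit blocks until done is closed or timeout elapses and reports
// whether initialization finished in time. Hypothetical helper, not PR code.
func waitForInit(done <-chan struct{}, timeout time.Duration) bool {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case <-done:
		return true
	case <-timer.C:
		return false
	}
}

func main() {
	done := make(chan struct{})
	go func() {
		// Placeholder for the real telemetry init.
		time.Sleep(50 * time.Millisecond)
		close(done)
	}()

	// Give telemetry up to 2 seconds so start events are not lost,
	// but never block startup for longer than that.
	if waitForInit(done, 2*time.Second) {
		fmt.Println("telemetry ready before start events")
	} else {
		fmt.Println("continuing startup; telemetry still initializing")
	}
}
```

Compared with a fixed sleep, startup only pays the full 2 seconds when the remote service is actually slow or unreachable.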
I would accept this, but I'd prefer a little delay to try to get telemetry initialized before start events.
Maybe just try a synchronous init attempt and then run a goroutine in case of failure?
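Roughly what that could look like, as a sketch only. `enableWithFallback` and `failNTimes` are hypothetical names, and `enable` stands in for whatever call actually creates the telemetry hook.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errUnavailable = errors.New("remote service unavailable")

// failNTimes returns a stand-in for the real enable call: it fails n times,
// then succeeds, counting every call in *calls. Illustration only.
func failNTimes(n int, calls *int) func() error {
	return func() error {
		*calls++
		if *calls <= n {
			return errUnavailable
		}
		return nil
	}
}

// enableWithFallback makes one synchronous attempt. If that fails, it keeps
// retrying in a goroutine. The returned channel is closed once enable succeeds.
func enableWithFallback(enable func() error, interval time.Duration) <-chan struct{} {
	done := make(chan struct{})
	if err := enable(); err == nil {
		close(done)
		return done
	}
	go func() {
		defer close(done)
		for {
			time.Sleep(interval)
			if err := enable(); err == nil {
				return
			}
		}
	}()
	return done
}

func main() {
	calls := 0
	// A short interval keeps the demo quick; the PR retries once per minute.
	<-enableWithFallback(failNTimes(2, &calls), 10*time.Millisecond)
	fmt.Println("enabled after", calls, "attempts")
}
```

On the happy path this stays fully synchronous, so start events still reach telemetry, and the goroutine only exists when the first attempt fails.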
Remote telemetry with a static URI in the config is never enabled after the single initial attempt at algod startup.
As a result, remote logging to a static URI stays disabled if the Internet or the remote service is unavailable while algod starts.
Dynamic remote telemetry (DNS-based discovery) does not have this issue, because it retries the connection with TelemetryURIUpdateService.
This PR adds a loop that retries the static remote service every minute until it succeeds.
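The general shape of such a loop, as a simplified sketch rather than the actual diff: `retryEvery` and `demoAttempts` are hypothetical names, and the context-based stop is an assumption added here to show one way the loop could end on shutdown.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retryEvery calls try until it succeeds or ctx is cancelled, waiting
// interval between attempts. It returns the number of attempts made.
func retryEvery(ctx context.Context, interval time.Duration, try func() error) (int, error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for attempts := 1; ; attempts++ {
		err := try()
		if err == nil {
			return attempts, nil
		}
		fmt.Println("error creating telemetry hook:", err)
		select {
		case <-ctx.Done():
			return attempts, ctx.Err()
		case <-ticker.C:
			// Interval elapsed; try again.
		}
	}
}

// demoAttempts simulates a remote service that is down for the first
// `failures` attempts and reports how many attempts were needed.
func demoAttempts(failures int) int {
	remaining := failures
	try := func() error {
		if remaining > 0 {
			remaining--
			return errors.New("remote service unavailable")
		}
		return nil
	}
	// A millisecond interval keeps the demo fast; the PR uses one minute.
	n, _ := retryEvery(context.Background(), time.Millisecond, try)
	return n
}

func main() {
	fmt.Println("attempts:", demoAttempts(2))
}
```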