Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove sleep loop from moth/setup routines. #1244

Closed

Conversation

petersilva
Copy link
Contributor

working on #339.

We if we have multiple broker connections, then we can't loop waiting to connect to each one... because every time
one broker connection goes down, it will just get stuck there.

  • old behaviour: getSetup and putSetup loop waiting to get a connection.
  • new behavious: getSetup and putSetup fail and return.
  • main loop then calls them again.

This is also consistent with the pattern of trying to have a single place where the code sleeps.

The only change in this code is to remove the loop (which outdented the entire routine by 4... sigh...)
changing "break" to "return" in a few places, and removing the sleeps.

old behaviour: getSetup and putSetup loop waiting to get a connection.
new behavious: getSetup and putSetup fail and return.

main loop then calls them again.
Copy link

Test Results

249 tests   246 ✅  1m 24s ⏱️
  1 suites    1 💤
  1 files      2 ❌

For more details on these failures, see this check.

Results for commit f05d7eb.

@petersilva
Copy link
Contributor Author

so... first patch or two was to make sure that, we don't loop when a broker is down in the connection establishment phase. A second step is that... when the connection establishment does fail, we should notice and back off for a while.

The later patches:

  • notice how long a connection attempt took.
  • implement exponential backoff as a function of how many connection failures, and how long a connection failure takes (if it takes 5 minutes, then it doesn't make much sense to wait 1 second before trying again.) It should be at least 10 minutes before we try again.

@petersilva
Copy link
Contributor Author

There is still an inner loop in the mqtt getsetup stuff... Just doing the same thing to mqtt that was done to amqp isn't enough... The change should work in some failure modes, but perhaps not others.

mshak2 and others added 18 commits October 3, 2024 09:45
* QoS should be called by all instances, not just the lead. It's a per-channel/connection setting, not per-queue

* Cancel consumer on clean up

* we log when the consumer is started, log when it's cancelled too

* getCleanUp should have been close. Cleanup is for deleting queue, etc.
old behaviour: getSetup and putSetup loop waiting to get a connection.
new behavious: getSetup and putSetup fail and return.

main loop then calls them again.
@petersilva
Copy link
Contributor Author

oh my goodness... every time I try to resolve conflicts... I just get more conflicts...
I'm going to try a fresh branch and some cherry-picking.

@petersilva
Copy link
Contributor Author

replaced by #1247

@petersilva petersilva closed this Oct 3, 2024
@petersilva petersilva deleted the issue339_try3_part1_fail_connections_in_main_loop branch November 28, 2024 20:36
@petersilva petersilva restored the issue339_try3_part1_fail_connections_in_main_loop branch November 28, 2024 20:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants