
Connection handling improvements. #66

Merged: 8 commits merged on Jan 15, 2014

Conversation

@arohter (Contributor) commented Jan 8, 2014

More robust zk client error handling: support progressively more severe retry logic (see the sketch below).
Automatic fork and InheritedError handling:
  Lazy reconnects reduce zookeeper and redis server load.
  Reduces/prevents kernel mutex deadlocks in forking env.
Trace connection debug logging.
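For readers unfamiliar with the approach, here is a rough sketch of what "progressively more severe" retry logic means. It is illustrative only; the thresholds, the rescued error class, and the `client.reconnect` call are assumptions, not the actual redis_failover code.

```ruby
# Hypothetical escalation ladder: retry the command first, then back off,
# then rebuild the connection, and finally give up.
MAX_ATTEMPTS = 5

def call_with_escalating_retries(client)
  attempts = 0
  begin
    yield
  rescue StandardError => e               # real code would rescue specific zk/redis errors
    attempts += 1
    raise e if attempts >= MAX_ATTEMPTS   # most severe: give up and surface the error
    if attempts == 1
      # mildest response: simply retry the command once
    elsif attempts <= 3
      sleep(0.5 * attempts)               # back off before retrying
    else
      client.reconnect                    # harshest retry: rebuild the client connection
    end
    retry
  end
end
```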

@arohter (Contributor Author) commented Jan 8, 2014

We make heavy use of process-forking environments (unicorn and resque), along with multiple parallel redis server connections (sharded clusters). In production we were running into kernel mutex deadlocks (we're on ruby 1.9). Research (1) indicates that there are some subtle pthread mutex handling bugs in ruby, especially during process termination. In our case the deadlocks were primarily caused by reconnect() calls, which trigger zookeeper thread mutex deadlocking. The ZK and redis libs can detect InheritedErrors after forking, so there's really no need to preemptively reconnect() anymore.

As a nice side effect of this new 'lazy' connection handling, we reduce zookeeper and redis server load, since we're cycling client connections and performing lookups far less often.

We've also done rather extensive zookeeper cluster failure load testing, which led to more robust handling of zookeeper connectivity failures. ZK failures are still pretty much stop-the-world events for redis command processing, but using cached redis cluster state is on the todo list.

(1)
#38
zk-ruby/zk#50
resque/resque#578
https://bugs.ruby-lang.org/issues/6438
vendor/bundle/ruby/1.9.1/gems/zookeeper-1.4.7/ext/c_zookeeper.rb:237:in 'join': deadlock detected (fatal)
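To illustrate the lazy fork handling described above, here's a minimal sketch (not the PR's actual code): the exception classes are the ones the zk and redis-rb gems raise when a forked child touches a connection it inherited from its parent, while the `@zk`/`@redis` instance variables, the retry cap, and the reconnect calls are assumptions.

```ruby
# Run a command and reopen a client only when it reports an inherited connection,
# instead of preemptively reconnecting after every fork.
def execute_with_lazy_reconnect(max_retries = 3)
  retries = 0
  begin
    yield
  rescue ZK::Exceptions::InheritedConnectionError, Redis::InheritedError => e
    retries += 1
    raise if retries > max_retries
    if e.is_a?(ZK::Exceptions::InheritedConnectionError)
      @zk.reopen                  # re-establish the ZK session in the child process
    else
      @redis.client.reconnect     # redis-rb won't reuse a parent's socket across forks
    end
    retry
  end
end
```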

@arohter (Contributor Author) commented Jan 8, 2014

This merge will resolve both #56 and #47 / #48

Also, the previous PR #48 should resolve #44

Also, the previous PR #57 should resolve #42

@arohter (Contributor Author) commented Jan 8, 2014

Will look into test failures....

#Resque automatically calls this method on job fork.
#We now auto-detect underlying zk & redis client InheritedError's and reconnect automatically as needed.

#purge_clients
Owner

Can we remove all code that's commented out like this? We can always look at the previous code via git history.

@ryanlecompte (Owner)

Thanks for putting this together! These fixes look good. Just a minor comment about leaving in commented-out code. Also, any thoughts on the build failures?

@ryanlecompte (Owner)

@arohter - I just made you a committer for the project! Thanks for helping maintain redis_failover. Feel free to merge this once you think it's ready.

@arohter (Contributor Author) commented Jan 15, 2014

Fixed and added some tests. Have some in-progress tests for zk error handling, but will do that in a separate PR.

The lingering jruby test failure should disappear in the read-only slave healthchecking fix PR coming next.

arohter added a commit that referenced this pull request Jan 15, 2014
@arohter arohter merged commit 5049611 into ryanlecompte:master Jan 15, 2014
@arohter arohter deleted the connection-handling-improvements branch January 15, 2014 21:04
#NOTE: We've found that using the 'recommended' zk fork-hook would trigger
#kernel mutex deadlocks in forking env (unicorn & resque) [ruby 1.9]
#https://github.com/zk-ruby/zk/wiki/Forking & https://github.com/zk-ruby/zk/blob/master/RELEASES.markdown#v150
#ZK.install_fork_hook
Collaborator

What version of zk and zookeeper were you using?

In the past couple months I made a lot of fixes that improved some of these situations.

Collaborator

...and not calling the fork hooks will likely lead to segfaults (which is why this stuff is here in the first place).

The issue is that the apache zookeeper C library does not provide a way to say, "clean up this zookeeper connection without issuing a close", so the child can't clean up the parent's connection without closing the parent's connection.

Collaborator

this is insanely dangerous and will break worse with this disabled. Safe operation requires the use of the fork hooks to ensure that the mutexes are owned by the thread calling fork() and the child is able to clean up the connection properly.

This is likely just masking the error or causing a different kind of broken behavior. The latest releases, zk 1.9.3 & zookeeper 1.4.8, have fixes for the event delivery system that we believe caused the deadlocks (@eric has been running them with success in production for several weeks).

You must install the fork hooks for correct operation in forking environments.

Contributor Author

I've addressed this in my longer comment below, but to be clear, redis_failover has never installed ZK.install_fork_hook. My PR simply adds a comment about our experiences with it.

P.S. I'm happy to remove the comment, especially now that we have new zk/zookeeper releases.

@arohter (Contributor Author) commented Jan 16, 2014

We're running the latest released versions of zk (1.9.2) and zookeeper (1.4.7).

You made a lot of fixes to redis_failover, or to zk/zookeeper code in the past few months? I'd be very interested in seeing these :)

Interestingly enough, we found that using the "recommended" ZK.install_fork_hook actually made things far worse - we were getting a ton of deadlocks - which is why we stopped using it (but I added the NOTE text for future ref).

After some research, as you mentioned, the issues stem from the underlying zookeeper threading model, combined with ruby's (at least 1.9.3; have yet to try 2.0 or 2.1) buggy kernel mutex handling. In our case, due to our heavy use of resque's process-forking jobs, the preemptive reconnect() calls (which trigger zk.reopen) combined with process termination trigger the bug. In the end, removing the preemptive reopen, combined with lazy retry handling, significantly reduced mutex contention, virtually eliminating deadlocks in our production env.

@eric (Collaborator) commented Jan 16, 2014

The changes were made to zk-ruby/zk and zk-ruby/zookeeper.

Do you have a ruby stack trace of where the processes were hung?

I just noticed it looks like there are a couple fixes that haven't made it into official releases yet, so I'll work with @slyphon to make that happen.

@arohter (Contributor Author) commented Jan 16, 2014

Some traces [pthread_mutex_destroy: Device or resource busy EBUSY]:
https://gist.github.com/arohter/5daa1262ba0921b7a831
https://gist.github.com/arohter/eae44f2a9513f46c4202
https://gist.github.com/arohter/c4622eaf177172925bea

Here's a bit with fork hook enabled:
https://gist.github.com/arohter/a32b38649d6d2a249faa
https://gist.github.com/arohter/716a26274b71d032c391

strace of the hung processes shows them spinning on sched_yield (sometimes using 100% cpu), similar to zk-ruby/zk#50.

@eric (Collaborator) commented Jan 16, 2014

Are you saying those first ones with the pthread_mutex_destroy error were happening with the fork hook enabled or disabled? Those would definitely be the sorts of problems that could happen without the hook enabled.

Without the stack trace of all threads in the ones with fork hooks enabled, it's hard to know exactly what else was going on to cause the deadlocks.

I am working with @slyphon to create releases for the latest changes to zk and zookeeper. I'd be interested to see if they fix the issues you see with using the fork hook.

@eric (Collaborator) commented Jan 16, 2014

Okay. zookeeper 1.4.8 and zk 1.9.3 have been released.

Let me know if those fix your problem when you have the fork hooks enabled.

@arohter (Contributor Author) commented Jan 16, 2014

The pthread_mutex_destroy traces are without the fork hook. I don't think I have any gist logs of the traces with the fork hook enabled.

@eric (Collaborator) commented Jan 16, 2014

I'm definitely concerned with master not having the fork hooks enabled. They're there to prevent those pthread errors you saw.

#NOTE: Explicit/manual reconnects are no longer needed or desired, and
#triggered kernel mutex deadlocks in forking env (unicorn & resque) [ruby 1.9]
#Resque automatically calls this method on job fork.
#We now auto-detect underlying zk & redis client InheritedError's and reconnect automatically as needed.
Collaborator

no no no, do not do this please, for the love of god. we've fixed the deadlocks. don't do a buggy workaround.

Contributor Author

This PR does not change behavior wrt use (or not) of ZK.install_fork_hook

@eric (Collaborator) commented Jan 16, 2014

I would suggest this pull request be reverted in its entirety and revisited.

There's a bunch of stuff here that is troubling.

ryanlecompte added a commit that referenced this pull request Jan 16, 2014
…rovements"

This reverts commit 5049611, reversing
changes made to a95bb65.
ryanlecompte added a commit that referenced this pull request Jan 16, 2014
Revert "Merge pull request #66 from fanhattan/connection-handling-improv...
@arohter arohter restored the connection-handling-improvements branch January 16, 2014 07:38
@arohter (Contributor Author) commented Jan 16, 2014

Wow. Go offline for a few hours and things turn crazy... ;-)

Let's be perfectly clear here, folks: ZK.install_fork_hook is not, and has never been, enabled in redis_failover.
This PR changed nothing in this regard. It simply added a (commented out) reference to the existence of this hook, since I've spent a great deal of time playing (and deadlocking) with it.

Walking through the logic and production experience backing this PR:

Ruby apparently has a bug related to kernel mutex handling, particularly during process shutdown.
The underlying zookeeper C library unfortunately exposes this bug, due to its heavy use of thread mutexes.
Combine this with forking applications like Resque (lots of processes starting and stopping), and the exposure to this mutex handling surface becomes much greater.
This is exacerbated by the fact that Resque preemptively calls reconnect() after fork, and thus zk.reopen(), since redis-rb can't share connections between different processes (it'll throw an InheritedError when you try).
Holding a mutex during process shutdown makes things go 💥

redis_failover already detected zookeeper InheritedConnectionErrors, automatically (and correctly) calling zk.reopen as needed.
This PR simply adds auto detection and handling of the equivalent redis InheritedErrors, removing the need for explicit and preemptive reconnect calls by forking applications.
To be clear, this doesn't fix (or change) the underlying mutex handling bug; it merely reduces the exposure to running into it.
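To make the "no preemptive reconnect" point concrete, here's a hedged sketch of what this means for a forking consumer like Resque. The RedisFailover::Client constructor follows the gem's documented usage; the old after_fork hook shown in the comment, the reconnect call, and the zkservers value are illustrative assumptions.

```ruby
require 'redis_failover'

client = RedisFailover::Client.new(:zkservers => 'zk1:2181,zk2:2181,zk3:2181')

# Previously, a forking app had to force fresh connections in every child, e.g.:
#
#   Resque.after_fork { client.reconnect }   # triggered zk.reopen on every job fork
#
# With this change the hook is unnecessary: the first command issued in the child
# hits the inherited socket, the client rescues the InheritedError internally,
# reopens lazily, and retries.
client.set('foo', 'bar')   # behaves the same in the parent and in forked children
```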

Reverting this PR is unnecessary, and without these code changes redis_failover was unusable for us in the real world. In our heavy production workload, moving to this "only as needed" reconnect/reopen handling virtually eliminated the large number of pthread_mutex_destroy errors we were seeing. It also noticeably reduced the connection load on our zookeeper and redis clusters.


Now, on to the root topic of the objections: ZK.install_fork_hook

We were initially very excited about discovering the existence of ZK.install_fork_hook, as it seemed to offer a solution to the very problem we were hitting. Unfortunately, after enabling it, we started running into deadlocks. [The good news is it appeared to reduce/eliminate the pthread_mutex_destroy errors!] Deadlocks and 100% cpu spins are far worse than ungraceful process exits. ZK.install_fork_hook + zk 1.9.2 + zookeeper 1.4.7 was a completely unusable combo for us.

Good news is apparently we're not alone in running into ZK.install_fork_hook problems. And @eric has kindly worked on fixing the underlying issue! I'm looking forward to trying it again with zookeeper 1.4.8 and zk 1.9.3.
I'd consider that a separate PR however, as this PR does not change behavior wrt use (or not) of ZK.install_fork_hook.

@ryanlecompte (Owner)

@arohter, could you please do the following:

  • test redis_failover with the latest versions of zookeeper/zk and ZK.install_fork_hook
  • issue a new pull request with only the fixes from this PR that you still think are valid after testing with the versions above

This will give us more context and help us re-review the changes that were made. Sorry for the confusion.

@arohter (Contributor Author) commented Jan 17, 2014

All the changes in this PR are still valid, as they have nothing to do with the use of ZK.install_fork_hook.
I've removed my comments about the hook however, since it was clearly confusing folks.
I've re-issued the PR here: #71, although it's a bit unhappy atm due to your forced revert.

I will test and issue a new PR for the use of ZK.install_fork_hook with latest zk/zookeeper deps, as this would technically be a new feature to redis_failover.

@arohter (Contributor Author) commented Mar 17, 2014

For the record, the use of the zk fork hook ( https://github.com/fanhattan/redis_failover/compare/conn_handling_v2_nofork...conn_handling_v2_forkhook ) still causes almost immediate deadlocks in our env, despite running latest versions of zk/zookeeper.

https://gist.github.com/arohter/a2c8f4171fcd19d7e738

@eric (Collaborator) commented Mar 17, 2014

That is definitely a strange one. I wonder what is causing this to happen when it doesn't happen in the tests.

Would you be able to set the environment variable ZKRB_DEBUG=true and run it again to get some debug output from the logger?

It appears that something about the shutdown process isn't working as it should.

The reason why it's so important to be running the fork hooks is that without them, you're going to get random "flakiness" that could include: random hangs, deadlocks, and garbled communication.

We've put a lot of effort into the zookeeper forking behavior to accommodate the specifics of how the C zookeeper library works and how the ruby runtime works, so it would be good to get to the bottom of this so it can be fixed for everyone.


@ryanlecompte (Owner)

+1 for trying to debug the ZK shutdown issue!


@arohter (Contributor Author) commented Mar 17, 2014

Yes, I will try and grab some debug logs of this in action.
