Possible race condition under load #89
Perhaps just an exception getting swallowed, maybe here.
Ah, yes, that sounds like a swallowed exception. galgeek recently added some more logging to WbCdxApi to help with that.
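To illustrate what a swallowed exception looks like in practice, here is a minimal Java sketch; the class and method names are hypothetical, not code from OutbackCDX or WbCdxApi. The first handler discards the error, so the only symptom is a dropped connection on the client side; the second logs it, which is roughly what the extra logging mentioned above is for.

```java
// Illustrative only: none of these names come from OutbackCDX or WbCdxApi.
import java.io.IOException;
import java.io.OutputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

class ExampleHandler {
    private static final Logger log = Logger.getLogger(ExampleHandler.class.getName());

    void handleSwallowed(OutputStream out) {
        try {
            out.write("HTTP/1.1 200 OK\r\n\r\n".getBytes());
        } catch (IOException ignored) {
            // Swallowed: the failure leaves no trace in the server logs,
            // and the client just sees the connection drop.
        }
    }

    void handleLogged(OutputStream out) {
        try {
            out.write("HTTP/1.1 200 OK\r\n\r\n".getBytes());
        } catch (IOException e) {
            // Logged: the same failure is now visible when diagnosing load problems.
            log.log(Level.WARNING, "Error writing response", e);
        }
    }
}
```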
Yeah, I dunno what's going on. Some clumsy handling of error conditions in my client was confusing things. Having tidied that up I only get
Having added some logging to OutbackCDX, I can see
(warning: mixture of threads in that log) So the error seems to be consistent with the client going away (
I note my NGINX comparison would not have stretched the chunked transfer decoding, for example, so it may not be a fair test. It's also possible I'm exhausting ephemeral ports on this machine, but it's an odd way for that error to show up.
Yeah, that's consistent with the client closing the connection, although it could obviously be doing that in response to something the server does, like invalid chunked encoding. It can't be ephemeral port exhaustion, because the connection was already established (unless the client library responds to being unable to open new connections by closing random existing ones or something). One thing you could try is running OutbackCDX with the undocumented and experimental '-u' command-line option. This makes it use Undertow instead of NanoHTTPD as the HTTP server. Chances are Undertow has more overhead, but it's more battle-tested and probably has better error handling.
Oof, well, to give you an idea of how it's going, I tried upping the threads some more and got this!
So perhaps there are deeper problems. Thanks for the tip about Undertow. It does seem slightly happier, reporting some errors on the back-end but not in the client (?!).
and then later on:
Upped the threads to OutbackCDX and these errors no longer showed up. Could be coincidence. Very odd speed profile: first column is parallel thread count, third is seconds per thread (where each thread has the same workload). Gets faster up to 750 threads, then a big slowdown at 1000. Not clear why.
I think the problem is likely not with
Using Undertow, through Docker:
No errors! Then back to NanoHTTPD, still under Docker:
So, while the underlying reasons are unclear, Undertow appears to be faster in general and more stable in particular, so I'll try using Undertow mode in production.
Good to know. Maybe I'll make Undertow mode the default then and deprecate NanoHTTPD.
I've been running some load tests on OutbackCDX, and as indicated in this comment, when I run 1000 threads (all running the same GET 100 times), I start seeing odd errors:
It's rock solid at 750 threads, but at 1000 it goes wonky. The same client works fine at 1000 threads when running against an NGINX instance configured to respond as if it were OutbackCDX.
So, this seems like it might be a subtle race condition under load in OutbackCDX itself?
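For a sense of the test shape described above (many threads, each issuing the same GET 100 times), here is a rough Java sketch. The endpoint URL, collection name, and counts are placeholders, and this is not the actual client used in these tests.

```java
// Hypothetical load-test sketch: THREADS workers each issue the same GET
// REPEATS times and count failures. URL and numbers are placeholders only.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CdxLoadTest {
    static final int THREADS = 1000;   // 750 was stable here; 1000 went wonky
    static final int REPEATS = 100;
    static final String URL = "http://localhost:8080/mycollection?url=example.org";

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();  // thread-safe, shared by all workers
        HttpRequest request = HttpRequest.newBuilder(URI.create(URL)).GET().build();
        AtomicInteger failures = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        for (int t = 0; t < THREADS; t++) {
            pool.submit(() -> {
                for (int i = 0; i < REPEATS; i++) {
                    try {
                        HttpResponse<String> resp =
                                client.send(request, HttpResponse.BodyHandlers.ofString());
                        if (resp.statusCode() != 200) failures.incrementAndGet();
                    } catch (Exception e) {
                        // connection resets, truncated chunked bodies, timeouts, etc.
                        failures.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        System.out.println("Failures: " + failures.get());
    }
}
```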
EDIT: Sorry, I should have mentioned that this is irrespective of the number of threads OutbackCDX is configured to use (as long as it's plenty!) and doesn't seem to be related to ulimits (which manifests itself differently).