doc,zlib: improve note on threadpool usage #20380
Changing the wording while still referring to the threadpool size doesn't seem to make much sense. I think the issue is two-fold: that too few threads could starve async zlib requests (if I understand the original text correctly) and the memory fragmentation issue. So I think we should make this text clearer, with the threadpool link being associated with the former and perhaps some additional information for the latter.
Fragmentation is still caused by thread pool usage, didn't want to go into details as that depends on the OS/allocator.
What I mean is it's confusing when someone clicks that link, expecting to find more information about the |
Ok, I see. The text on https://nodejs.org/api/cli.html#cli_uv_threadpool_size_size seems ok. I'm not sure how to change that to make it clearer. Suggestions are welcome.
Force-pushed from 831e238 to f313879.
That text has nothing to do with the memory fragmentation problem being described now.
@mscdex better like this?
I suppose, if we do not have any concrete suggestions for the memory issue.
Linked issue has some suggestions like disabling THP or using a different allocator but they are impractical or ineffective.
information.
threadpool. This can lead to surprising effects in some applications, such as
subpar performance (which can be mitigated by adjusting the [pool size][])
and/or unrecoverable and catastrophic memory fragmentation.
IMO this warning is a bit lax. I think it would be best to give a more detailed explanation of what is wrong and how to mitigate it (CPU bound tasks need their time one way or the other, so a solution could be to use a separate Node.js instance as a worker that is connected to the main application with a queue and the queue sends new tasks as soon as the worker is done with one entry).
@BridgeAR Can you expand on that? I can’t really make out a difference between what you’re describing and how the libuv event loop works right now…
So, as I understand the issue so far, the actual calls to the async functions cause the problem. The task itself is CPU bound, and if we trigger lots of async calls, we end up with catastrophic memory fragmentation. The libuv event loop cannot prevent each call from allocating at least some memory.
I just suggest documenting that it is best to have only a single worker for n Node.js instances that handles all the async tasks. The single worker could actually process m tasks in parallel, where m is the number of CPU cores. That should mitigate the issue, if I am not mistaken. Besides that, we might want to re-evaluate the recommendation to always use async calls when the actual work is CPU bound. Using sync calls will definitely not cause this problem.
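For reference, the synchronous calls mentioned above look roughly like this. This is only a minimal sketch: `roundTripSync` is a hypothetical helper name, and the trade-off is that the sync APIs run on the calling thread and block the event loop while they work, instead of going through the threadpool.

```javascript
const zlib = require('zlib');

// Hypothetical helper: compresses and decompresses a Buffer using only the
// synchronous zlib APIs, so no work is queued on the libuv threadpool.
function roundTripSync(input) {
  const compressed = zlib.deflateSync(input);   // blocks until done
  const restored = zlib.inflateSync(compressed); // blocks until done
  return { compressed, restored };
}

const { restored } = roundTripSync(Buffer.from('hello hello hello hello'));
console.log(restored.toString());
```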
@BridgeAR In ws we are doing something similar. We use a queue to limit the maximum number of concurrent calls to zlib: https://github.com/websockets/ws/blob/690b3f277c6f5c3aef8cd84792929450f516b3ae/lib/permessage-deflate.js#L67-L73.
It helps, but according to this comment even setting concurrency to 1 does not fully fix the issue. Your suggestion can help with applications, but it's a bit impractical for libraries.
Also, the point of this PR is only to make people aware of the "issue". A detailed explanation of why and where it happens and how to mitigate it is out of the scope of this PR, as that would require multiple pages of docs.
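To illustrate the queue idea described above, here is a minimal sketch of capping concurrent async zlib calls. It is not the ws implementation: `Limiter`, `deflateLimited`, and the `MAX_CONCURRENT` value are hypothetical names and numbers chosen for the example, and as noted above a cap like this reduces the problem rather than eliminating it.

```javascript
const zlib = require('zlib');

// Hypothetical helper: runs at most `concurrency` jobs at a time and
// queues the rest until a running job signals completion.
class Limiter {
  constructor(concurrency) {
    this.concurrency = concurrency;
    this.pending = 0; // jobs currently running
    this.jobs = [];   // jobs waiting for a free slot
  }

  add(job) {
    this.jobs.push(job);
    this._run();
  }

  _run() {
    if (this.pending >= this.concurrency || this.jobs.length === 0) return;
    const job = this.jobs.shift();
    this.pending++;
    // The job must call `done` exactly once when it has finished.
    job(() => {
      this.pending--;
      this._run();
    });
  }
}

// Assumed value for illustration only; tune it for the application.
const MAX_CONCURRENT = 2;
const limiter = new Limiter(MAX_CONCURRENT);

// Wraps the async zlib.deflate so that no more than MAX_CONCURRENT calls
// are handed to the threadpool at once.
function deflateLimited(data) {
  return new Promise((resolve, reject) => {
    limiter.add((done) => {
      zlib.deflate(data, (err, result) => {
        done();
        if (err) reject(err);
        else resolve(result);
      });
    });
  });
}
```

Callers would use `deflateLimited(buffer)` wherever they previously called `zlib.deflate` directly; requests beyond the cap simply wait in the queue.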
This is really a terrible warning. It is just scary words with no explanation or advice. You may as well have said "Warning: if you use this code you may die. Good luck!"
This needs a rebase.
Raise awareness against the catastrophic memory fragmentation that can be created while using the asynchronous zlib APIs. Refs: nodejs#8871
Force-pushed from 9574a79 to 951e50c.
Done.
Raise awareness against the catastrophic memory fragmentation that can be created while using the asynchronous zlib APIs. PR-URL: #20380 Refs: #8871 Reviewed-By: Ruben Bridgewater <[email protected]> Reviewed-By: Anatoli Papirovski <[email protected]>
Landed in 0234068