Memory queue: free event data sooner after acknowledgments #38042
Pinging @elastic/elastic-agent (Team:Elastic-Agent)
@faec this might be more trouble than it is worth, but what do you think about adding a benchmark test for this? Two scenarios might be interesting: 1) a single producer and a single consumer going as fast as possible, and 2) multiple producers going as fast as possible with multiple consumers, one of which is "slow".
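For illustration, the two scenarios could be sketched roughly as below. This is only a sketch under stated assumptions: a buffered channel stands in for the memory queue, and `runScenario`, its parameters, and the event counts are hypothetical names and values, not part of the Beats codebase. A real benchmark would drive the queue's own producer and consumer interfaces.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"testing"
	"time"
)

// runScenario pushes eventsPerProducer events from each producer through a
// buffered channel (a hypothetical stand-in for the memory queue) and returns
// how many events the consumers received. If slowDelay > 0, consumer 0 sleeps
// for that long after each event to simulate a slow output.
func runScenario(producers, consumers, eventsPerProducer int, slowDelay time.Duration) int64 {
	queue := make(chan int, 1024)
	var consumed int64
	var prodWG, consWG sync.WaitGroup

	for c := 0; c < consumers; c++ {
		consWG.Add(1)
		go func(id int) {
			defer consWG.Done()
			for range queue {
				atomic.AddInt64(&consumed, 1)
				if id == 0 && slowDelay > 0 {
					time.Sleep(slowDelay)
				}
			}
		}(c)
	}
	for p := 0; p < producers; p++ {
		prodWG.Add(1)
		go func() {
			defer prodWG.Done()
			for i := 0; i < eventsPerProducer; i++ {
				queue <- i
			}
		}()
	}

	prodWG.Wait() // all events have been enqueued
	close(queue)  // lets the consumers drain the queue and exit
	consWG.Wait()
	return atomic.LoadInt64(&consumed)
}

func main() {
	// Scenario 1: single producer, single consumer, as fast as possible.
	single := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			runScenario(1, 1, 1000, 0)
		}
	})
	fmt.Println("1 producer, 1 consumer:", single, single.MemString())

	// Scenario 2: several producers and consumers, one consumer is slow.
	slow := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			runScenario(4, 4, 250, time.Microsecond)
		}
	})
	fmt.Println("4 producers, 4 consumers (one slow):", slow, slow.MemString())
}
```

Reporting allocations alongside timing seems useful here, since the change is aimed at memory use rather than throughput.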
LGTM
// ackedChan is buffered so output workers don't block on acknowledgment
// if ackLoop is busy. (10 is probably more than we really need, but
// no harm in being safe.)
ackedChan: make(chan *batch, 10),
Not a blocker.
If the buffering is for the workers, could we perhaps make this dynamic based on the number of workers? Something like number of workers + 1? I'm just wondering: if you are on a 64-core Graviton server and the number of workers is 64, is 10 enough?
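A minimal sketch of that suggestion, assuming the worker count is known when the channel is created. `newAckedChan` and the simplified `batch` type are hypothetical and only illustrate the sizing rule; they are not the queue's actual constructor or type.

```go
package main

import "fmt"

// batch is a hypothetical stand-in for the queue's batch type.
type batch struct {
	id int
}

// newAckedChan sizes the acknowledgment channel from the number of output
// workers (workers + 1), so each worker can hand off one acknowledged batch
// without blocking even if the receiving loop is busy.
func newAckedChan(numWorkers int) chan *batch {
	if numWorkers < 0 {
		numWorkers = 0
	}
	return make(chan *batch, numWorkers+1)
}

func main() {
	const numWorkers = 64
	acked := newAckedChan(numWorkers)

	// Nothing is receiving yet, but every worker can still send once.
	for i := 0; i < numWorkers; i++ {
		select {
		case acked <- &batch{id: i}:
		default:
			panic("a worker would have blocked on acknowledgment")
		}
	}
	fmt.Println(len(acked), cap(acked)) // prints: 64 65
}
```

With a fixed capacity of 10, the same loop would hit the `default` branch on the eleventh send, which is the situation the question is about.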
This PR is on hold because #38166 gets a much better improvement with a much smaller change, and might make this one completely redundant. Once that one's merged, I'll reevaluate whether this one is still worth going ahead with.
Closing since this is redundant with other changes that have been merged in the meantime.
Proposed commit message
Refactor the memory queue's ackLoop goroutine to allow earlier freeing of event data:
In benchmarks, CPU and ingestion rate were unaffected, while memory used dropped by 0-5% depending on configuration, with the biggest improvements being in the scale and throughput performance presets.
There is no visible functional difference from this change, except that event pointers are reset to null slightly sooner in the acknowledgment process.
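As a rough illustration of why clearing pointers earlier can reduce memory use, here is a simplified model in which a buffer drops its reference to an event as soon as that event is acknowledged. The `ringBuffer` and `event` types are hypothetical and single-goroutine; this is not the ackLoop code from this PR.

```go
package main

import "fmt"

// event is a hypothetical stand-in for a queued event and its payload.
type event struct {
	payload []byte
}

// ringBuffer is a simplified, single-goroutine model of a fixed-size queue
// buffer. It is not the memory queue's actual data structure.
type ringBuffer struct {
	entries []*event
	head    int // index of the oldest unacknowledged entry
	count   int // number of unacknowledged entries
}

func newRingBuffer(size int) *ringBuffer {
	return &ringBuffer{entries: make([]*event, size)}
}

// push stores e in the next free slot and reports whether there was room.
func (r *ringBuffer) push(e *event) bool {
	if r.count == len(r.entries) {
		return false
	}
	r.entries[(r.head+r.count)%len(r.entries)] = e
	r.count++
	return true
}

// ack acknowledges up to n of the oldest entries and returns how many were
// acknowledged. Each slot is set to nil right away, so the buffer stops
// referencing the event at acknowledgment time rather than when the slot is
// eventually overwritten, and the garbage collector can reclaim it sooner.
func (r *ringBuffer) ack(n int) int {
	if n > r.count {
		n = r.count
	}
	if n <= 0 {
		return 0
	}
	for i := 0; i < n; i++ {
		r.entries[(r.head+i)%len(r.entries)] = nil
	}
	r.head = (r.head + n) % len(r.entries)
	r.count -= n
	return n
}

// referenced counts the slots that still hold an event pointer.
func (r *ringBuffer) referenced() int {
	n := 0
	for _, e := range r.entries {
		if e != nil {
			n++
		}
	}
	return n
}

func main() {
	rb := newRingBuffer(4)
	for i := 0; i < 3; i++ {
		rb.push(&event{payload: make([]byte, 1024)})
	}
	rb.ack(2)
	fmt.Println("slots still referencing events:", rb.referenced()) // prints 1
}
```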
Checklist
I have made corresponding changes to the documentation
I have made corresponding changes to the default configuration files
I have added tests that prove my fix is effective or that my feature works
I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.