Fix thread-local leaks (alternate approach) #205
…sting usages used the interface incorrectly and leaked contents outside the context of the try-with-resources block. Refactor as StorageSupport, which lets us drop the autocloseable wrapper object.
lgtm after a quick glance, just some random thoughts about array pooling
I assume testing shows that the leak is gone.
}
var localRavv = ravvCopy.get();
float[] v = localRavv.vectorValue(targetOrd);
return localRavv.isValueShared() ? Arrays.copyOf(v, v.length) : v;
random thoughts:
will we benefit from pooling the float[]? E.g. we could use multiple pools: one for small arrays (< 256 len), one for medium (< 768), one for large (< 2048), and leave anything bigger unpooled, adjusting the thresholds to the most common cases.
will vector instructions benefit from off-heap (direct memory) arrays?
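As a rough illustration of the size-class idea floated above, here is a minimal sketch. `FloatArrayPool` and its methods are hypothetical names, not part of this project; the thresholds are the ones from the comment, and the class is deliberately not thread-safe (a real version would presumably be per-thread or synchronized).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical size-class pool for float[] (illustration only, not thread-safe). */
final class FloatArrayPool {
    // Exclusive upper bounds of the size classes suggested in the review comment.
    private static final int[] LIMITS = {256, 768, 2048};
    // One free list per size class.
    private final List<ArrayDeque<float[]>> free = new ArrayList<>();

    FloatArrayPool() {
        for (int i = 0; i < LIMITS.length; i++) {
            free.add(new ArrayDeque<>());
        }
    }

    /** Returns an array with at least {@code length} elements; contents are unspecified. */
    float[] acquire(int length) {
        for (int i = 0; i < LIMITS.length; i++) {
            if (length < LIMITS[i]) {
                float[] pooled = free.get(i).pollFirst();
                // Pooled arrays always have the full capacity of their size class.
                return pooled != null ? pooled : new float[LIMITS[i]];
            }
        }
        return new float[length]; // too large for any class: plain allocation
    }

    /** Returns an array to its size class; arrays of any other length are left to the GC. */
    void release(float[] array) {
        for (int i = 0; i < LIMITS.length; i++) {
            if (array.length == LIMITS[i]) {
                free.get(i).addFirst(array);
                return;
            }
        }
    }

    public static void main(String[] args) {
        FloatArrayPool pool = new FloatArrayPool();
        float[] first = pool.acquire(100);
        pool.release(first);
        float[] second = pool.acquire(200);
        System.out.println(first == second); // same backing array is reused
    }
}
```

One catch with this design: the array handed back is usually longer than the requested length, so callers that currently rely on `v.length` as the vector dimension would need to carry the logical length separately.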
We definitely could benefit from pooling that (or a similar solution); this area of code is something I've looked at in the native branch WIP, which changes vector representation and many of the details of our computations around them.
The main point of PS was to add support for virtual threads via QueuedPooling. If we don't care about supporting VT then we should just rip the whole thing out and use ETL instead which has a much friendlier API (doesn't require explicit close).
Re the motivation for this approach over ExplicitThreadLocal,
I think it's pretty reasonable to say "we assume that you call I don't want to leave TL+TLM in the code, it's too fragile both wrt the circular references you identified and also even without circular references there are just no guarantees when Entry references get expunged without explicit calls to
I agree with that conclusion.
We can't allow object-level thread-local values to reference the object, since this means there will always be a strong reference to the key in the thread-local map, preventing these thread-locals for long-lived threads from being reclaimed.
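To make the reference chain concrete, here is a minimal sketch of the pattern and one way to break it. `Searcher`, `LeakyBuilder`, and `SaferBuilder` are hypothetical stand-ins, not the project's classes, and the "safer" variant is only one possible shape of a fix, not necessarily what this PR does. The underlying JDK behavior is that a thread's thread-local map holds each key weakly but each value strongly, so a value that can reach its own key keeps the entry alive for as long as the thread lives.

```java
import java.util.function.IntToDoubleFunction;

/** Hypothetical searcher that remembers the last score function it was given. */
final class Searcher {
    IntToDoubleFunction scoreFunction; // retained between calls, as with a resumable search

    double search(IntToDoubleFunction sf, int node) {
        this.scoreFunction = sf;
        return sf.applyAsDouble(node);
    }
}

/** Leaky shape: the thread-local value ends up referencing its owner. */
final class LeakyBuilder {
    private final float[] scores;
    // Object-level thread-local: this field is the only strong path to the map key.
    private final ThreadLocal<Searcher> searcher = ThreadLocal.withInitial(Searcher::new);

    LeakyBuilder(float[] scores) { this.scores = scores; }

    private double score(int node) { return scores[node]; }

    double run(int node) {
        // this::score captures the builder, so the chain is
        // thread -> map entry value (Searcher) -> lambda -> builder -> ThreadLocal key.
        return searcher.get().search(this::score, node);
    }
}

/** Safer shape: the stored function captures only the data it needs. */
final class SaferBuilder {
    private final float[] scores;
    private final ThreadLocal<Searcher> searcher = ThreadLocal.withInitial(Searcher::new);

    SaferBuilder(float[] scores) { this.scores = scores; }

    double run(int node) {
        float[] local = scores; // the lambda captures this local, not `this`
        return searcher.get().search(n -> local[n], node);
    }

    public static void main(String[] args) {
        float[] scores = {1.5f, 2.5f};
        System.out.println(new LeakyBuilder(scores).run(1));
        System.out.println(new SaferBuilder(scores).run(0));
    }
}
```

In the safer shape, once the builder becomes unreachable nothing strong points at the `ThreadLocal` key any more, so the entry can go stale. Stale entries are still only cleaned up lazily by later operations on the thread's map, which is the fragility raised earlier in the thread.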
This PR fixes two thread-local leak sources. The first was introduced with resumable search, where the `GraphSearcher` stores the score function. In several cases, the `scoreFunction` is a lambda with an implicit `this` referring to a `GraphIndexBuilder`. If the last search run by a `GraphIndexBuilder` is one of these cases, the `GraphSearcher` thread-local will be kept alive indefinitely by this circular reference. The second source is much older, but very similar. A `GraphIndexBuilder`'s thread-local `GraphSearcher` references a view of an `OnHeapGraphIndex` initialized in the `GraphIndexBuilder` constructor. This `OnHeapGraphIndex` was initialized with a `neighborFactory` containing an implicit `this` to the GraphIndexBuilder instance. This loop also keeps the `GraphSearcher` thread-local alive indefinitely.

In the process of working on this PR, I discovered several cases where a try-with-resources block over a PoolingSupport retained the contents received from a `RandomAccessVectorValues` outside the context of the try-with-resources. If this PoolingSupport was ever a queued version, this would violate the contract of the autocloseable and potentially cause incorrect behavior. Since we don't currently use queued versions of the PoolingSupport, I opted to simplify PoolingSupport to just shared/thread-local values, which eliminates the need for autocloseable entirely. I find this cleaner given that we don't use queueing, but as an alternative, I could fix all cases where a pooled value leaks contents outside a try-with-resources (which would come at runtime cost with no benefit, since we don't use queuing).
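The contract violation described above can be reproduced with a small, self-contained sketch. `SharedBufferSource`, `QueuedPool`, and `LeakDemo` are hypothetical names invented for illustration; they only mimic the shape of a queued pool over a vector source that reuses one buffer, and are not the project's `PoolingSupport` or `RandomAccessVectorValues`.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

/** Hypothetical vector source that returns the same buffer on every call. */
final class SharedBufferSource {
    private final float[][] data;
    private final float[] buffer;

    SharedBufferSource(float[][] data) {
        this.data = data;
        this.buffer = new float[data[0].length];
    }

    float[] vectorValue(int ord) {
        System.arraycopy(data[ord], 0, buffer, 0, buffer.length);
        return buffer; // shared: overwritten by the next call
    }
}

/** Hypothetical queue-backed pool; closing a lease hands the source to the next borrower. */
final class QueuedPool {
    private final ArrayDeque<SharedBufferSource> idle = new ArrayDeque<>();
    private final float[][] data;

    QueuedPool(float[][] data) { this.data = data; }

    final class Lease implements AutoCloseable {
        final SharedBufferSource source;
        Lease(SharedBufferSource source) { this.source = source; }
        @Override public void close() { idle.push(source); }
    }

    Lease borrow() {
        SharedBufferSource s = idle.poll();
        return new Lease(s != null ? s : new SharedBufferSource(data));
    }
}

final class LeakDemo {
    /** Leaks the pooled buffer out of the try-with-resources block. */
    static float[] leaky(QueuedPool pool, int ord) {
        try (QueuedPool.Lease lease = pool.borrow()) {
            return lease.source.vectorValue(ord);
        }
    }

    /** Copies before the lease closes, at the cost of an allocation per call. */
    static float[] safe(QueuedPool pool, int ord) {
        try (QueuedPool.Lease lease = pool.borrow()) {
            float[] v = lease.source.vectorValue(ord);
            return Arrays.copyOf(v, v.length);
        }
    }

    public static void main(String[] args) {
        QueuedPool pool = new QueuedPool(new float[][] {{1f, 1f}, {2f, 2f}});
        float[] leaked = leaky(pool, 0);
        float[] copied = safe(pool, 0);
        leaky(pool, 1); // the next borrower reuses the source and overwrites its buffer
        System.out.println(Arrays.toString(leaked)); // prints [2.0, 2.0]
        System.out.println(Arrays.toString(copied)); // prints [1.0, 1.0]
    }
}
```

The `safe` variant corresponds to the alternative mentioned above (fixing each call site), and its extra copy is the runtime cost that alternative would introduce.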