Exact search evaluates vectors linearly, one after another. Using IndexInput#prefetch to load the next vector into memory ahead of time may reduce the read cost at runtime and, in turn, latency. Prefetch issues a madvise WILL_NEED system call; the kernel may use this hint to load the requested bytes asynchronously.
We need to benchmark this to see whether it yields improvements.
Pre-requisites
Lucene 10.x: the prefetch API is only available in Lucene 10.x.
Prefetch support in FloatVectorValues: Lucene does not currently support this, so it requires a Lucene contribution.
This can help speed up filtering queries, rescoring, and exact search scripting.
A similar mechanism is being addressed here for searchable snapshots in core, where read-ahead of blocks can be performed based on file type. For exact search over flat vector files, access to those files could implicitly use the read-ahead functionality to help with sequential access. This could tie in well with the prefetch interface later, where the accessor gives a specific indication of when to perform read-ahead and when not to (random access).
@sohami Thanks for the reference. I would be interested in the low-level RFC/implementation. Currently there are only specific cases where we want prefetch, since it affects search latencies for the Lucene engine (and, with partial loading, it might affect the Faiss engine as well). It's easy to add a prefetch API in float vector values that uses IndexInput#prefetch, and then call prefetch based on how many vectors you need instead of a predefined block of data.
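
To make the "prefetch by vector count" idea concrete, here is a rough, self-contained sketch. It does not use Lucene: `VectorInput`, `RecordingInput`, `scoreAll`, and the `lookahead` parameter are hypothetical names made up for illustration, and the layout assumes vectors are stored contiguously as raw floats. In a real Lucene 10.x change the hint would presumably be forwarded to IndexInput#prefetch with the same offset and length; here the stand-in only records the hints so the arithmetic can be checked.

```java
import java.util.ArrayList;
import java.util.List;

final class ExactSearchPrefetchSketch {

    /** Hypothetical stand-in for the small part of an index input used here. */
    interface VectorInput {
        /** Hint that the byte range [offset, offset + length) will be read soon. */
        void prefetch(long offset, long length);

        /** Reads one vector of the given dimension starting at a byte offset. */
        float[] readVector(long offset, int dimension);
    }

    /** In-memory input that records prefetch hints instead of issuing madvise. */
    static final class RecordingInput implements VectorInput {
        private final float[] data;
        final List<long[]> prefetched = new ArrayList<>();

        RecordingInput(float[] data) {
            this.data = data;
        }

        @Override
        public void prefetch(long offset, long length) {
            prefetched.add(new long[] {offset, length});
        }

        @Override
        public float[] readVector(long offset, int dimension) {
            float[] vector = new float[dimension];
            System.arraycopy(data, (int) (offset / Float.BYTES), vector, 0, dimension);
            return vector;
        }
    }

    static float dotProduct(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /**
     * Scores all vectors in ordinal order. At the start of each window of
     * `lookahead` vectors, one hint is issued for the following window, sized
     * by vector count rather than by a fixed block size.
     */
    static float[] scoreAll(VectorInput in, int count, int dimension,
                            float[] query, int lookahead) {
        if (lookahead < 1) {
            throw new IllegalArgumentException("lookahead must be >= 1");
        }
        long vectorBytes = (long) dimension * Float.BYTES;
        float[] scores = new float[count];
        for (int ord = 0; ord < count; ord++) {
            if (ord % lookahead == 0) {
                int nextStart = ord + lookahead;
                int nextCount = Math.min(lookahead, count - nextStart);
                if (nextCount > 0) {
                    in.prefetch(nextStart * vectorBytes, nextCount * vectorBytes);
                }
            }
            scores[ord] = dotProduct(query, in.readVector(ord * vectorBytes, dimension));
        }
        return scores;
    }
}
```

With `lookahead` set to 1 this reduces to hinting only the next vector, as in the issue description. Larger values issue fewer, bigger hints. Which setting helps, if any, is exactly what the benchmark would need to show.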