Concurrency optimization for native graph loading #2345
Can we please have a private method, openIndex(), here so this is taken care of as and when the code changes?
Can we have a private method for this as well? There will be an additional null check, but every time get() returns, accessRecency should be updated.
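The helper being asked for might look like the following sketch. RecencyCache, touch, and recencyOf are illustrative names, not the PR's actual code; the point is that the null check and the accessRecency update live in one private method, so every get() path goes through it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a cache whose get() always routes through a private
// helper that updates accessRecency, so callers cannot forget to do it.
public class RecencyCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final Map<K, Long> accessRecency = new HashMap<>();
    private long clock = 0;

    public V get(K key) {
        return touch(key, entries.get(key));
    }

    // Private helper: performs the null check once and bumps accessRecency
    // on every successful lookup.
    private V touch(K key, V value) {
        if (value != null) {
            accessRecency.put(key, ++clock);
        }
        return value;
    }

    public void put(K key, V value) {
        entries.put(key, value);
        accessRecency.put(key, ++clock);
    }

    public Long recencyOf(K key) {
        return accessRecency.get(key);
    }
}
```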
There could be a case where multiple threads trigger eviction and graph loading concurrently, leading to temporary spikes in memory usage. Can we think of using bounded concurrency for the eviction and graph-loading tasks with thread pools?
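One way to bound that concurrency, sketched here with a plain Semaphore rather than a full thread-pool design (GraphLoadThrottle and MAX_CONCURRENT_LOADS are illustrative names, not from the PR):

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical sketch: cap how many graph loads may run at once so that
// concurrent eviction + loading cannot spike off-heap memory. Eviction
// tasks could share a similar throttle.
public class GraphLoadThrottle {
    private static final int MAX_CONCURRENT_LOADS = 2; // illustrative bound
    private final Semaphore permits = new Semaphore(MAX_CONCURRENT_LOADS);

    // Runs the supplied load task only when a permit is available,
    // blocking otherwise; the permit is always returned, even on failure.
    public <T> T loadWithBound(Supplier<T> loadTask) {
        permits.acquireUninterruptibly();
        try {
            return loadTask.get();
        } finally {
            permits.release();
        }
    }
}
```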
Will take it up in a separate issue.
This is a fair callout. I think we need to improve our cache operations in general.
The problem we are running into right now is that the cache operations can be async in nature (cleanup, eviction), whereas we use the cache as a 1:1 reference for the off-heap memory in use.
We can create a tracking issue and deal with this separately.
Can we avoid the case where the graph is partially loaded, or an error occurs during loading, and the cache ends up in an inconsistent state? Can we ensure atomicity in graph loading and only put the entry in the cache if loading succeeds?
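A minimal sketch of that load-then-put idea, assuming a ConcurrentHashMap-backed cache (GraphCache and loadGraphFromDisk are hypothetical names, not the PR's code): computeIfAbsent runs the loader atomically per key, and a loader that throws installs nothing, so a failed or partial load never appears in the cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of "only cache on success": the graph is fully
// loaded inside computeIfAbsent, and the entry is installed only if the
// loader returns normally.
public class GraphCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public String get(String indexPath) {
        // If loadGraphFromDisk throws, computeIfAbsent leaves no entry
        // behind, so the cache never holds a partially loaded graph.
        return cache.computeIfAbsent(indexPath, path -> loadGraphFromDisk(path));
    }

    private String loadGraphFromDisk(String path) {
        // Stand-in for the native load; a real implementation would
        // return a graph handle instead of a string.
        return "graph:" + path;
    }
}
```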
If there is an error in graph loading, then the entry will not be in the cache. What would be the scenario where the cache ends up in an inconsistent state?
@Gankris96 Can we wrap this call behind the same lock-based logic above, just to make sure we do not open the same index files concurrently in two different threads?
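One sketch of such per-index locking (IndexOpenLocks and withIndexLock are illustrative names, not the PR's code): a lock per index path prevents two threads from opening the same index files at once, while loads of different indices still proceed in parallel.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical sketch: serialize index opens per path with one
// ReentrantLock per index, created lazily and atomically.
public class IndexOpenLocks {
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Runs openIndex while holding the lock for this index path, so the
    // same files are never opened concurrently by two threads.
    public <T> T withIndexLock(String indexPath, Supplier<T> openIndex) {
        ReentrantLock lock = locks.computeIfAbsent(indexPath, p -> new ReentrantLock());
        lock.lock();
        try {
            return openIndex.get();
        } finally {
            lock.unlock();
        }
    }
}
```

A simple per-path map like this does grow with the number of indices; lock striping or removing locks on index close would bound it.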
Wrapping this within a lock still seems to fail some bwc search tests, where we end up getting incorrect results. Even doing so would not really help, because it doesn't solve the underlying problem of multiple graph files getting loaded at the same time now that the load is no longer synchronized.
This probably requires revisiting in a separate issue where we refactor the whole cache strategy, imo.
Please create an issue so that we can track it.
I guess the bwc failure was a different issue, unrelated to this. I did add back the locking logic for this as well; it seems to work fine, so we can keep this in.