Indexing causes SEARCH to become very slow #38

Original report:
While attempting to index a lot of documents, I also tried to send a few SEARCH commands to see how things would behave. While the indexing was still going on, SEARCH commands became very, very slow.
I think this is just the price to pay for a single-threaded architecture. I'm not sure which part is slowest, but if it is something like the indexer, we could try to move that to a new thread.
The question, too, is: will there be a need for concurrent connections? If I am creating a search engine, I am going to have a crawler and the SEARCH engine running in parallel, so information will be constantly indexed, but I still want my searches to be fast. I like the idea of moving the indexer into a new thread and keeping the connection handling single-threaded.
Nice conversation, folks. One idea I had was something similar to the double buffer of graphics systems: the index used for searching can be read-only, and when a new index is ready, we point to the new index so that all new searches happen on it going forward. This keeps the crawler separate from the querying engine.
Interesting idea. The problem, though, is that each read and write is a blocking operation because of the single-threaded architecture and the fact that we currently process all requests from a single connection until the `read()` is complete. On the Slack channel this conversation was continued, and @00-matt suggested that instead of handling one connection at a time until the `read()` is complete, we handle a single command per connection, in a similar fashion to the Node.js event loop. This would prevent other connections from hanging for longer than they need to.

Would your idea of a double buffer fix the blocking issue? I also don't know if this would be possible or realistic to do within a single thread.
Hmmm, got it. The double-buffer technique is used to decouple index creation and querying. Another idea, @f-prime: did you think about a callback model for giving out the results? You take a query and a callback on one request, and when the results are ready, you pass the result through the callback to the client that requested the search. Clients don't synchronously wait for you to reply, but get fed the results when they are ready. I find the discussion very interesting!
On Thursday, 11 July 2019 00:11:04 BST, Deepan Prabhu Babu wrote:
> One idea I had was something similar to the double buffer of graphics
> systems. The index for searching can be read-only. When a new index is
> ready, we can point to the new index and all new searches would happen
> on the new index going forward. This will keep the crawler separate
> from the querying engine.

I think that this is definitely worth trying, although it should probably be made optional, because it could be worse for people with large data sets that don't change very often.
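For concreteness, here is a minimal sketch of what the double-buffer swap could look like in C11. All of the names here (`index_t`, `index_free`, `publish_index`, `current_index`) are hypothetical and not part of fist's current code; it only illustrates the pointer-swap idea, under the assumption that a separate thread builds the new index.

```c
/* Hypothetical sketch of the double-buffer idea, not fist's actual code.
 * SEARCH always reads through `live`; the indexer thread builds a fresh
 * index off to the side and publishes it with one atomic swap. */
#include <stdatomic.h>

typedef struct index index_t;            /* hypothetical opaque index type */

extern void index_free(index_t *idx);    /* assumed cleanup function */

static _Atomic(index_t *) live;          /* the index SEARCH reads from */

/* Indexer thread: called once a rebuilt index is complete. */
void publish_index(index_t *fresh) {
    index_t *old = atomic_exchange(&live, fresh);
    /* CAUTION: freeing `old` right away is only safe if no search can
     * still be reading it; a real version needs refcounting or a
     * grace period before reclaiming the old buffer. */
    if (old)
        index_free(old);
}

/* Search path: grab the current read-only index. */
index_t *current_index(void) {
    return atomic_load(&live);
}
```

This also fits the note above about making it optional: with no background rebuilds, `live` simply never changes, so the swap costs nothing for data sets that rarely change.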
On Thursday, 11 July 2019 00:54:07 BST, Frankie Primerano wrote:
> instead of handling one connection at a time until the `read()` is
> complete, we handle a single command per connection, in a similar
> fashion to the Node.js event loop. This will prevent other connections
> from hanging for longer periods of time than they need to.

Yeah, it would be like using `process.nextTick()` in Node.js so that all users get treated fairly. It wouldn't really make anything faster, though.

> Would your idea of a double buffer fix the blocking issue? I also don't
> know if this would be possible or realistic to do within a single thread.

It should, but it will require another thread to do the indexing in the background.
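A rough sketch of the "one command per connection" idea, in C with `poll()`. The helpers `read_one_command` and `handle_command` are invented for illustration; the point is only the fairness structure, where each pass over the poll set services at most one command per ready socket.

```c
/* Hypothetical event loop, not fist's actual code: each iteration
 * services at most one command per ready connection, so one chatty
 * client cannot starve the others (akin to process.nextTick() fairness). */
#include <poll.h>

extern int  read_one_command(int fd, char *buf, int len); /* assumed helper */
extern void handle_command(int fd, const char *cmd);      /* assumed helper */

void event_loop(struct pollfd *conns, int nconns) {
    char cmd[4096];
    for (;;) {
        if (poll(conns, nconns, -1) <= 0)
            continue;
        for (int i = 0; i < nconns; i++) {
            if (!(conns[i].revents & POLLIN))
                continue;
            /* Handle exactly one command, then move on to the next
             * connection instead of draining this socket fully. A real
             * version also needs per-connection buffers for commands
             * that arrive in fragments. */
            if (read_one_command(conns[i].fd, cmd, sizeof cmd) > 0)
                handle_command(conns[i].fd, cmd);
        }
    }
}
```

As noted above, this spreads latency fairly across connections but does not make indexing itself any faster.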
On Thursday, 11 July 2019 04:56:58 BST, Deepan Prabhu Babu wrote:
> Did you think about a callback model for giving out the results? So you
> take **a query and a callback** on one request, and when the results are
> ready, you pass the result through a callback to the client which
> requested the search. Clients don't synchronously wait for you to reply,
> but actually get fed the results when they are ready.

I don't think that doing this on the server would provide any value. Clients can already be non-blocking (see the Node.js client library). Having replies happen out of order from the server would mean that we would need to give each request a unique ID to match it with a later response, similar to what JSON-RPC does.
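To illustrate that last point, here is a hypothetical sketch of the client-side bookkeeping that out-of-order replies would force, in the spirit of JSON-RPC's `id` field. Nothing here exists in fist today; the `SEARCH <id> <query>` wire format and all helper names are invented for the example.

```c
/* Hypothetical client-side sketch, not part of fist: every SEARCH is
 * stamped with an ID and its callback is remembered, so replies tagged
 * with that ID can be dispatched even if they arrive out of order. */
#define MAX_PENDING 128

typedef void (*search_cb)(const char *results);

static struct { unsigned id; search_cb cb; } pending[MAX_PENDING];
static int npending;
static unsigned next_id;

/* Remember the callback and send something like "SEARCH <id> <query>". */
unsigned send_search(const char *query, search_cb cb) {
    if (npending == MAX_PENDING)
        return 0;                       /* table full; 0 signals failure */
    unsigned id = ++next_id;
    pending[npending].id = id;
    pending[npending].cb = cb;
    npending++;
    (void)query;  /* the actual socket write is omitted in this sketch */
    return id;
}

/* A reply tagged with <id> arrived; find and fire its callback. */
void on_reply(unsigned id, const char *results) {
    for (int i = 0; i < npending; i++) {
        if (pending[i].id == id) {
            pending[i].cb(results);
            pending[i] = pending[--npending]; /* swap-remove */
            return;
        }
    }
}
```

This is exactly the extra complexity the reply above argues isn't worth adding on the server, given that clients can already be non-blocking.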
It's been a while since I touched C and C++.
Hey @deepanprabhu, glad you are interested! I'd suggest joining the Slack channel here: https://join.slack.com/t/fist-global/shared_invite/enQtNjcyNzY4MTUwMDg0LTRiYzM5ZWNkOTMwODYzODRjNDQzNThiYjdhNjgzZDUxZGYxODRjOTI4NTcwYmYzYmI5MTViYjFiNGFlNWEwYjY You can also check the current open tasks here: https://github.com/f-prime/fist/projects/1 Just claim something in Todo and make a PR for the fix, and we can discuss the implementation there :)