Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Global, incremental, syntax based, fuzzy index of all types/function in the project #73

Open
matklad opened this issue Dec 2, 2024 · 0 comments
Labels
enhancement New feature or request
Milestone

Comments

@matklad
Copy link

matklad commented Dec 2, 2024

This shouldn't be too hard, and is quite cute! That's what you need to do:

  • A function that takes a file, parses it into tree sitter tree, and extracts set of symbols and meta information about the symbol (so, string, line/column, flag for pub, enum(u8) for kind (function, constant, struct, etc))

  • Some infra to discover all .zig files in project and run the above processing for all of them, in parallel

  • A function that, for each file, converts the set of strings to an FSM (an optionally compressed trie)

  • A lookup function that takes two state machines and builds their "intersection state machine", an FSM containing strings which belong to both input state machines. This is how you do fuzzy search --- the trie from the previous step holds all the strings you have, a fuzzy query can be thought of as a subsequence state machine, and the intersection of the two is the query result.

  • Secret Sauce: an incremental update procedure which, when a single file changes, doesn't re-build the entire index, and only re-builds the state machine for this specific file.

    In general, because this stuff is trivial to make both parallel & incremental, it can feel very, very fast. It kills me that VS Code has an artificial debouncing delay between when you type the first letter of the query, and when the results are returned. It would be stupid to not get first dozen results in under 8 milliseconds!

  • File watcher to kick the above when appropriate

  • Some glue code to take user's input and thread it through all FSMs, appropriately waiting for FSMs to get built!

The original implementation in rust-analyzer, which took advantage of the existing fsm crate, was about 200 lines of code!

@neurocyte neurocyte added the enhancement New feature or request label Dec 2, 2024
@neurocyte neurocyte added this to the v1.0.0 milestone Dec 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants