multiline search #16

Open
stelf opened this issue Apr 6, 2021 · 2 comments

Comments

stelf commented Apr 6, 2021

perhaps this is obvious to others, but it would be really nice to have an example of multi-line search. not sure whether this is a feature request or a docs improvement request.

otherwise all works great on Win10 20H2 with Windows Terminal and PS 7.2

update

what I understand from https://zeux.io/2019/04/20/qgrep-internals/ is that qgrep works on a line-by-line basis, but the article then states:

The search is done on a line by line basis, however instead of feeding each line to the regular expression engine at a time, the regular expression is ran on the entire file at a time

which suggests that there could be an option to apply the s or sm flags to the regexp (re2 supports these, although when I tried to feed (?sm) to qgrep it produced an error).

... so the issue is rather a feature request.
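For readers unfamiliar with the two flags mentioned above, here is a minimal sketch of what they change. It uses Python's `re` module purely as a stand-in, on the assumption that `s` and `m` carry the same meaning there as in RE2 syntax (`s` lets `.` match a newline, `m` lets `^` and `$` match at line boundaries). It says nothing about what qgrep itself accepts, and the sample text is made up.

```python
import re

# Hypothetical sample: a call that was wrapped across two lines.
text = "result = compute(alpha,\n                 beta)"

# Without flags, "." stops at the newline, so the pattern cannot span lines.
no_flags = re.search(r"compute\(.*\)", text)

# With (?s), "." also matches "\n", so the match can cross the line break.
dotall = re.search(r"(?s)compute\(.*\)", text)

# With (?m), "^" matches at the start of every line, not only at the
# start of the whole input.
first_words_plain = re.findall(r"^\w+", "one\ntwo")
first_words_multiline = re.findall(r"(?m)^\w+", "one\ntwo")

print(no_flags)               # None: the closing ")" is on the next line
print(repr(dotall.group(0)))  # the whole wrapped call, newline included
print(first_words_plain, first_words_multiline)
```

The `s` flag is the one that matters for matching across wrapped lines; `m` only changes where the anchors apply.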

Owner

zeux commented May 5, 2021

The core problem with multiline search is that qgrep splits (long) files across multiple chunks. This is fairly critical for maintaining good search performance on large files: without it, occasional large files would make chunks very uneven in size, which would significantly decrease the efficiency of parallel search.

It's easy, of course, to ask re2 to do multiline search, but this will occasionally miss matches that cross a chunk boundary.
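The boundary problem can be illustrated with a small sketch. This is not qgrep's actual chunking code: `split_into_chunks`, `search_chunks`, the chunk size of 30 characters, and the sample text are all hypothetical. Python's `re` stands in for re2, and splitting on line boundaries is an assumption made for the example.

```python
import re

def split_into_chunks(text, max_chunk_size):
    """Greedily pack whole lines into chunks of at most max_chunk_size
    characters (a single longer line still gets a chunk of its own)."""
    chunks, current = [], ""
    for line in text.splitlines(keepends=True):
        if current and len(current) + len(line) > max_chunk_size:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks

def search_chunks(pattern, chunks):
    """Search every chunk independently, as a parallel search would."""
    regex = re.compile(pattern)
    return [m.group(0) for chunk in chunks for m in regex.finditer(chunk)]

# Hypothetical source file with a SQL string wrapped across two lines.
text = (
    "query = 'SELECT id '\n"
    "        'FROM users'\n"
    "run(query)\n"
)
pattern = r"(?s)SELECT.*?FROM"  # multiline pattern: "." may cross "\n"

chunks = split_into_chunks(text, 30)
whole_file = search_chunks(pattern, [text])  # file searched as one piece
chunked = search_chunks(pattern, chunks)     # file searched chunk by chunk

print(len(chunks), whole_file, chunked)
```

Here the two halves of the wrapped string land in different chunks, so the chunk-by-chunk search returns nothing even though the same pattern matches when the file is searched as a whole.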

Author

stelf commented May 27, 2021

this makes sense, indeed.

perhaps then there could be an option to have chunks split on line boundaries. while it could still miss some matches, it would at least find reasonable results that can afterwards be double-checked with ripgrep or ag. devs often split lines to stay within an 80-character width or just to keep code readable: long function calls, SQL concatenations, etc. too many examples, really.

point being that qgrep does an incredible job and is my tool of choice, but its results still have to be double-checked now and then when a match is expected to spill over a line boundary.

for the record: I'm presently using it against 90k source files of various origins and languages, but still have to double-check certain results with rg.
