Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

globset: Support paths already present as [u8] (e.g. typed_path::Path) #2954

Open
aawsome opened this issue Dec 24, 2024 · 3 comments
Open

Comments

@aawsome
Copy link

aawsome commented Dec 24, 2024

Hi!

I'd like to use globset to match paths - but I like to use typed_path::Path for them. This means they are already represented as [u8] on windows and unix platforms. Unfortunately, matching is currently only supported for AsRef<Path> and Candidate which can itself only be created from AsRef<Path>.

I could convert typed_path::Path into a std::pathPath/std::path::PathBuf but this is ugly because it will internally convert back and forth and (more annoying) the conversation may fail (e.g. non-UTF parts on windows).

Is it possible to add support for AsRef<[u8]> or something similar?

Thanks in advance!

@BurntSushi
Copy link
Owner

globset itself uses &[u8] internally, so in theory, this shouldn't be much of a problem. But you can't ask for either AsRef<[u8]> or AsRef<Path> in the same API without a new trait, which I have no intent of doing. Which means you need another API for it.

However... it looks like typed_path::Path implements AsRef<OsStr>? Although, I guess that's only on Unix. But at least on Unix, it looks like you can get an &OsStr and thus a &Path with zero cost.

I might revisit this in a globset 0.5 release (no idea when that will happen), but for now, I think you might need to just deal with the conversion (which should be zero cost on Unix).

@aawsome
Copy link
Author

aawsome commented Dec 24, 2024

Thanks @BurntSushi for your fast response!

I must say that I am not very satisfied with your proposal :-/ The idea of using typed_path::Path is to have treatment of unix-style paths under unix and windows without the need for platform-specific code. Also, as mentioned, using a non-valid path on windows leads to extra trouble which I don't see how to fix, actually..

About API: A Candidate::from_bytes<P: AsRef<[u8]>(path: P) -> Self (replace naming if you find something more suited) would be sufficient from my point of view. I agree that having traits such that the matching methods would run with both AsRef<std::path::Path> and AsRef<[u8]> would be even more elegant.

I however respect if you don't want to change your API. In that case I'd actually prefer to maintain a fork which provides that functionality (which seems to be the lesser evil to me compared to the conversion hassle when coding platform-agnostic)

Merry Christmas!

@BurntSushi
Copy link
Owner

I think I'd be open to a Candidate::from_bytes constructor. That's pretty unobtrusive.

Note that the encoding of the bytes still has to be conventionally UTF-8, even if it isn't fully valid. So, e.g., WTF-8 would be okay. But you'll get nonsense if you pass a &[u8] that is UCS-2 or UTF-16.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants