Remove hot uses of pathlib #14110
Isn't this just os.path.commonpath? This is a well-known problem with pathlib -- it loves to reimplement os.path via complicated and fiddly data structures with questionable correctness characteristics. And apparently questionable performance characteristics too...
It's similar to
Yeah, I like
Faster, maybe. I haven't timed it. But with regard to returning a boolean etc., the point of commonpath isn't "this function returns whether a path is an ancestor" but rather that it's the trivial building block for such a check. It's code we already use in several places:

    if os.path.commonpath([x, y]) == x:
        do_thing_with(y)
Yes, it is faster: :)
Also it doesn't crash and burn if you have different drives on Windows. :) Should I adjust other uses of
Yup, we catch ValueError for those. So I suppose it may anyways be worth extracting into a dedicated function. :P
Yes, please. If we need this function for performance in a hot path anyway, then we might as well use it everywhere the same code pattern is used.
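For reference, the existing idiom discussed above (a `commonpath` comparison guarded by a `ValueError` handler) might be extracted into a helper along these lines. This is only a sketch: the name `is_under_commonpath` is hypothetical and is not taken from Meson, and it assumes `parent` is already normalized, since `os.path.commonpath` returns a normalized path.

```python
import os


def is_under_commonpath(parent: str, trial: str) -> bool:
    """Return True if `trial` is `parent` itself or lies beneath it.

    Hypothetical helper; assumes `parent` is normalized (no trailing
    separator, no '.' segments), because the result of commonpath is.
    """
    try:
        # commonpath takes a single iterable of paths, not two arguments.
        return os.path.commonpath([parent, trial]) == parent
    except ValueError:
        # Raised when the paths are on different drives (Windows) or when
        # absolute and relative paths are mixed; treat both as "not under".
        return False
```

The exception handler is the part that a dedicated function can hide from callers, which is the motivation given in the comments above.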
Introduce an alternative to os.path.commonpath([name, path]) == path, which is a common idiom throughout Meson. Call it is_parent_path just like the existing static method in Generator. It is a bit faster and handles drives on Windows without the need for an exception handler.

Signed-off-by: Paolo Bonzini <[email protected]>
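One way such a check could work without an exception handler is to compare path components directly. The following is a minimal sketch under stated assumptions, not the code from this commit: the names `_components` and `is_parent_path_sketch` are hypothetical, and both arguments are assumed to be either both absolute or both relative.

```python
import os
from typing import List


def _components(path: str) -> List[str]:
    """Split a path into components, dropping empty and '.' segments."""
    if os.sep != '/':
        # On Windows, accept both separators by normalizing to '/'.
        path = path.replace(os.sep, '/')
    return [c for c in path.split('/') if c and c != '.']


def is_parent_path_sketch(parent: str, trial: str) -> bool:
    """Return True if `trial` is `parent` itself or lies beneath it.

    Hypothetical illustration. A drive letter is just another component,
    so paths on different drives compare unequal instead of raising.
    """
    parent_parts = _components(parent)
    trial_parts = _components(trial)
    # If trial is shorter than parent, the slice is shorter and never equal.
    return trial_parts[:len(parent_parts)] == parent_parts
```

Because the comparison is a plain list-prefix test, `is_parent_path_sketch('C:/src', 'D:/src/file.c')` simply returns `False`, where `os.path.commonpath` would raise `ValueError` on Windows.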
This saves about 1% of execution time; path manipulation is expensive. About 1% more could be saved by avoiding usage of "foo in bar.parents", which is an expensive way to say "bar.startswith(foo)".

Signed-off-by: Paolo Bonzini <[email protected]>
pathlib's implementation of Path iteration is expensive; "foo.parents" has quadratic complexity when building it:

    return self._path._from_parsed_parts(self._drv, self._root, self._tail[:-idx - 1])

and comparisons performed by "path in file.parents" potentially have the same issue. Use the newly introduced "is_parent_path" function to check whether a file is under a path in the same way. This removes Path from its hottest use.

Signed-off-by: Paolo Bonzini <[email protected]>
pathlib's implementation of Path iteration is expensive; "foo.parents" has quadratic complexity when building it:

    return self._path._from_parsed_parts(self._drv, self._root, self._tail[:-idx - 1])

and comparisons performed by "path in file.parents" potentially have the same issue. Introduce a new function that checks whether a file is under a path in the same way, removing usage of Path from the biggest hotspot.
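To make the comparison concrete, here is a small sketch that places the `path in file.parents` pattern next to a string-based equivalent and times both. The function names are hypothetical, `PurePosixPath` is used only so the example behaves the same on every platform, and the string version assumes absolute, `/`-separated inputs. Note that `parents` excludes the path itself, so the string version uses a strict length comparison to match.

```python
import timeit
from pathlib import PurePosixPath


def under_with_pathlib(parent: str, trial: str) -> bool:
    """The pattern being replaced: build the parents sequence and search it."""
    return PurePosixPath(parent) in PurePosixPath(trial).parents


def under_with_strings(parent: str, trial: str) -> bool:
    """String-based equivalent (hypothetical); assumes absolute POSIX-style paths."""
    parent_parts = [c for c in parent.split('/') if c and c != '.']
    trial_parts = [c for c in trial.split('/') if c and c != '.']
    n = len(parent_parts)
    # Strictly longer, because a path is not one of its own parents.
    return len(trial_parts) > n and trial_parts[:n] == parent_parts


if __name__ == "__main__":
    args = ('/src/proj', '/src/proj/a/b/c/d/file.c')
    for fn in (under_with_pathlib, under_with_strings):
        elapsed = timeit.timeit(lambda: fn(*args), number=20000)
        print(f"{fn.__name__}: {elapsed:.3f}s for 20000 calls")
```

The absolute timings depend on the Python version and machine, so no figures are quoted here; the 1% savings mentioned in the commit messages refer to Meson's overall execution time, not to this micro-benchmark.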