Unlike the other steps, I've done practically nothing to optimize the initial recursive tree traversal phase.
I'll want to do some cost-benefit research on the following, and identify other potential improvements:
Look into the performance effect of checking whether excludes contain meta-characters and using simple string matching if they don't.
As I understand it, fnmatch.fnmatch uses regexes internally and doesn't cache them. Given how many times it gets called, I should try using re.compile with fnmatch.translate instead.
I should also look into the performance effect of programmatically combining multiple fnmatch.translate outputs so the ignore check can be handled in a single pass.
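A minimal sketch of how the three exclude-matching ideas above could fit together, for benchmarking purposes only. The names `build_exclude_matcher` and `_GLOB_CHARS` are hypothetical, and the code assumes Python 3.6 or later, where fnmatch.translate emits scoped flags that can be joined safely with `|`. Note that current CPython versions of fnmatch do cache compiled patterns internally, so any gain is likely to come from skipping per-call overhead rather than recompilation, and that is worth measuring before committing to this. Unlike fnmatch.fnmatch, this sketch does not apply os.path.normcase, so it behaves like fnmatch.fnmatchcase.

```python
import fnmatch
import re

# Characters that give a pattern special meaning to fnmatch.
_GLOB_CHARS = frozenset('*?[')


def build_exclude_matcher(excludes):
    """Return a function that tells whether a name matches any exclude.

    Hypothetical helper: patterns without glob meta-characters are checked
    with a set lookup, and all remaining patterns are merged into a single
    compiled regex so each name is tested in one pass.
    """
    literals = set()
    globs = []
    for pattern in excludes:
        if _GLOB_CHARS.intersection(pattern):
            globs.append(pattern)
        else:
            literals.add(pattern)

    combined = None
    if globs:
        # Wrap each translated pattern in a non-capturing group and join
        # them into one alternation, compiled exactly once.
        combined = re.compile(
            '|'.join('(?:%s)' % fnmatch.translate(p) for p in globs))

    def is_excluded(name):
        if name in literals:
            return True
        return combined is not None and combined.match(name) is not None

    return is_excluded
```

Usage would look like `is_excluded = build_exclude_matcher(['.git', '*.pyc'])`, built once before the walk and then called on each directory entry. Timing this against plain fnmatch.fnmatch calls on a real tree would show whether the extra code pays for itself.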
Look into the memory-I/O trade-offs inherent in doing one stat call for each file and then caching it so it can be used both for sizeClassifier and for things like inode-based hardlink detection.
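The stat-caching idea could be prototyped roughly as follows. This is a sketch under stated assumptions, not the project's actual code: `gather_paths`, `group_by_size`, and `group_hardlinks` are hypothetical names, and `group_by_size` is only a stand-in for whatever sizeClassifier really does. Each file is stat'ed exactly once during the walk and the result is reused for both size grouping and inode-based hardlink detection.

```python
import os
import stat
from collections import defaultdict


def gather_paths(root):
    """Walk root once and return {path: os.stat_result} for regular files."""
    stat_cache = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # the only stat call made for this file
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                stat_cache[path] = st
    return stat_cache


def group_by_size(stat_cache):
    """Group paths by file size using the cached stat results."""
    groups = defaultdict(list)
    for path, st in stat_cache.items():
        groups[st.st_size].append(path)
    return groups


def group_hardlinks(stat_cache):
    """Return sorted lists of paths that share a (device, inode) pair."""
    by_inode = defaultdict(list)
    for path, st in stat_cache.items():
        by_inode[(st.st_dev, st.st_ino)].append(path)
    return [sorted(paths) for paths in by_inode.values() if len(paths) > 1]
```

The memory side of the trade-off is that one stat result is held per file for the whole run. Storing only a `(st_size, st_dev, st_ino)` tuple per path would shrink that, at the cost of another stat call if other fields are needed later.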
ssokolow changed the title from "Look into optimization the initial "gather paths to analyze" phase" to "Look into optimizations for the initial "gather paths to analyze" phase" on Aug 21, 2014