-
-
Notifications
You must be signed in to change notification settings - Fork 278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance by caching find_spec #2408
Improve performance by caching find_spec #2408
Conversation
The cache feels dirty but There may be room for improvement, however I think this can get the ball rolling to discuss changes/improvements. Here is a brief review of the profiling I did. Before PRTimings using hyperfine
Syscalls
After PRTimings using hyperfine
Syscalls
|
I also obtained this call graph to find the best place to cache. The issue refers to |
Also, I used the repository linked in the issue to obtain all the profiling data. I am still yet to profile the memory allocated for the process execution, will try to obtain that since I want to measure the hit an unbound cache may create. |
astroid/interpreter/_import/spec.py
Outdated
def wrapper(*args): | ||
key = ".".join(args[0]) | ||
if key not in modpath_cache: | ||
modpath_cache[key] = func(*args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe this cache also needs to be cleared there ?
Line 437 in 4ccb9c4
def clear_cache(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing that out. The cache is now cleared from there and I also renamed variables a little bit and made the code look more like existing cache code (e.g. the inference tip cache).
4ccb9c4
to
2dfce27
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, it seems good to clear the cache here as well. I check "Approve" here, but I don't astroid well ( I just submitted an issue yesterday for the first time ). A maintainer should approve, not me, but I can say that these changes look good to me.
astroid/interpreter/_import/spec.py
Outdated
def spec_cache(func): | ||
def wrapper(*args): | ||
key = ".".join(args[0]) | ||
if key not in _spec_cache: | ||
_spec_cache[key] = func(*args) | ||
|
||
return _spec_cache[key] | ||
|
||
return wrapper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add typing to this function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have typed the function and also the dictionary used for caching (mypy was complaining).
astroid/interpreter/_import/spec.py
Outdated
@@ -423,6 +429,18 @@ def _find_spec_with_path( | |||
raise ImportError(f"No module named {'.'.join(module_parts)}") | |||
|
|||
|
|||
def spec_cache(func: Callable) -> Callable: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def spec_cache(func: Callable) -> Callable: | |
def spec_cache(func: Callable[[list[str], Sequence[str] | None], ModuleSpec]) -> Callable[[...], ModuleSpec]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes have been applied.
cc546b9
to
7fcfdac
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the analysis of what need to be cachec. Any reason not to use functools.cache
or functools.lru_cache
instead of using a global ?
I really wanted to use something standard, however the problem I found with functools caches is that they take into consideration all function arguments while I thought it only made sense to cache the requested module because:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we also need a changelog entry?
Code LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right thank you for the explanation. In find_spec
we have :type modpath: list or tuple
, maybe we could force it to be a tuple with something like this:
def make_modpath_hashable(
func: Callable[[list[str], Sequence[str] | None], ModuleSpec]
) -> Callable[..., ModuleSpec]:
def wrapper(modpath: list[str], *args) -> ModuleSpec:
return func(tuple(modpath), *args)
return wrapper
@make_modpath_hashable
@lru_cache(maxsize=1024)
def find_spec(modpath: list[str], path: Sequence[str] | None = None) -> ModuleSpec:
(I didn't test if it work tbh, but it should ? )
astroid/interpreter/_import/spec.py
Outdated
func: Callable[[list[str], Sequence[str] | None], ModuleSpec] | ||
) -> Callable[..., ModuleSpec]: | ||
def wrapper(modpath: list[str], *args) -> ModuleSpec: | ||
key = ".".join(modpath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
key = ".".join(modpath) | |
key = tuple(modpath) |
I suppose it's faster to get something hashable that way ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks better, thanks for the suggestion, I have applied it.
I tried the approach but It does not seem to work, main reason being that We would also need to make What do you think? Maybe I am missing something. I think we could work a hack use lru_cache internals but the code would be probably more complicated than using our own cache 🤔 |
I'm hesitant to add a global, because pylint/astroid can be used on massive codebase and we had issues with memory constantly increasing in the past. I don't remember all cache clear API where we need to add this global to myself, and would need some time to check it (Jacob probably remember better than me if it comes to it though). Using a global is not impossible but it's easy to make a mistake and create a memory leak. So using lru_cache to set a max number of value stored would be convenient / reassuring. I suppose another (less elegant than adding two decorators) approach would be to create a new cached subfunction inside the current function with the paths we want to cache specifically, what do you think ? |
38315c0
to
1b2f354
Compare
I think I was able to find a solution that lets us use Without cache:
With cache:
What do you think? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing, thank you ! The perf increase and the design are great. I need to fix the coverage and the pipeline for pypy 3.10 on main but I think we're going to release astroid 3.2.0 for this alone !
1b2f354
to
e593f84
Compare
Nuked AstroidManager as suggested. Just waiting on your thoughts about exceptions preventing not found packages from being cached. |
Thanks, I didn't notice that ImportError is raised when nothing is found. Good point. However, I still think we should add the path to the cache key even when modules were found. The following is unexpected, and I think it's realistic given the upstream callers like >>> from astroid.interpreter._import.spec import find_spec
>>> find_spec(['brain'], ['astroid'])
ModuleSpec(name='brain', type=<ModuleType.PKG_DIRECTORY: 3>, location='astroid/brain', origin=None, submodule_search_locations=None)
>>> find_spec(['brain'], ['pylint'])
ModuleSpec(name='brain', type=<ModuleType.PKG_DIRECTORY: 3>, location='astroid/brain', origin=None, submodule_search_locations=None) Line 195 in 7a3b482
|
For example, the path is part of the cache key for modules here: Lines 301 to 305 in 7a3b482
|
I made the search path part of its cache by converting it into a frozenset however there are lots of cache misses and the performance benefit gets nullified, so maybe this is harder than I thought 🤔 . This is the diff I tried (when compared with the current PR) diff --git a/astroid/interpreter/_import/spec.py b/astroid/interpreter/_import/spec.py
index c812a612..62bb3e6c 100644
--- a/astroid/interpreter/_import/spec.py
+++ b/astroid/interpreter/_import/spec.py
@@ -428,6 +428,7 @@ def _find_spec_with_path(
@dataclass(frozen=True)
class SpecArgs:
key: tuple
+ path_cache: frozenset | None
modpath: list[str] = field(compare=False, hash=False)
path: Sequence[str] | None = field(compare=False, hash=False)
@@ -449,7 +450,7 @@ def find_spec(modpath: list[str], path: Sequence[str] | None = None) -> ModuleSp
:return: A module spec, which describes how the module was
found and where.
"""
- return _find_spec(SpecArgs(tuple(modpath), modpath, path))
+ return _find_spec(SpecArgs(tuple(modpath), frozenset(path) if path else None, modpath, path)) Here are the cache specs after running it on the test repo provided in the original issue: No cache for search path: Am I missing something? Maybe a better way to cache the path? Otherwise it seems like it may be better to optimize somewhere else. |
Also since the main point of this PR (perf) may take a while, I can make a new PR to apply the nuking of the Astroid test class separately. |
About the dataclasses, did you consider doing something like this for simplicity? diff --git a/astroid/interpreter/_import/spec.py b/astroid/interpreter/_import/spec.py
index c812a612..65d7c88d 100644
--- a/astroid/interpreter/_import/spec.py
+++ b/astroid/interpreter/_import/spec.py
@@ -16,7 +16,6 @@ import types
import warnings
import zipimport
from collections.abc import Iterator, Sequence
-from dataclasses import dataclass, field
from functools import lru_cache
from pathlib import Path
from typing import Any, Literal, NamedTuple, Protocol
@@ -425,13 +424,6 @@ def _find_spec_with_path(
raise ImportError(f"No module named {'.'.join(module_parts)}")
-@dataclass(frozen=True)
-class SpecArgs:
- key: tuple
- modpath: list[str] = field(compare=False, hash=False)
- path: Sequence[str] | None = field(compare=False, hash=False)
-
-
def find_spec(modpath: list[str], path: Sequence[str] | None = None) -> ModuleSpec:
"""Find a spec for the given module.
@@ -449,15 +441,14 @@ def find_spec(modpath: list[str], path: Sequence[str] | None = None) -> ModuleSp
:return: A module spec, which describes how the module was
found and where.
"""
- return _find_spec(SpecArgs(tuple(modpath), modpath, path))
-
+ return _find_spec(tuple(modpath), tuple(path) if path else None)
@lru_cache(maxsize=1024)
-def _find_spec(args: SpecArgs) -> ModuleSpec:
- _path = args.path or sys.path
+def _find_spec(modpath: tuple[str], path: tuple[str] | None = None) -> ModuleSpec:
+ _path = path or sys.path
# Need a copy for not mutating the argument.
- modpath = args.modpath[:]
+ modpath = list(modpath)
submodule_path = None
module_parts = modpath[:]
@@ -466,7 +457,7 @@ def _find_spec(args: SpecArgs) -> ModuleSpec:
while modpath:
modname = modpath.pop(0)
finder, spec = _find_spec_with_path(
- _path, modname, module_parts, processed, submodule_path or args.path
+ _path, modname, module_parts, processed, submodule_path or path
)
processed.append(modname)
if modpath: |
I'm wondering if the issue is in pylint. We're probably doing some unnecessary "is this import a relative import, if so provide the filepath we started from". It's providing all those filepaths that get checked first and strike out that are adding to the number of calls here. This diff reduces a ton of calls, but it would need to be checked for correctness: diff --git a/pylint/checkers/imports.py b/pylint/checkers/imports.py
index ac8962c50..361bd5571 100644
--- a/pylint/checkers/imports.py
+++ b/pylint/checkers/imports.py
@@ -1043,11 +1043,13 @@ class ImportsChecker(DeprecatedMixin, BaseChecker):
module_file = node.root().file
context_name = node.root().name
base = os.path.splitext(os.path.basename(module_file))[0]
-
try:
- importedmodname = astroid.modutils.get_module_part(
- importedmodname, module_file
- )
+ if isinstance(node, nodes.ImportFrom):
+ importedmodname = astroid.modutils.get_module_part(
+ importedmodname, module_file
+ )
+ else:
+ importedmodname = astroid.modutils.get_module_part(importedmodname)
except ImportError:
pass |
Yes, it seems like there are lots of calls for the same package with different paths which obliterates the cache. I collected all calls to find_spec with their arguments and grouped them by requested modpath. Here is an excerpt:
jinja2 was requested 403 times and those were the search paths used. I also wonder why jinja2 is looked for since its not used in the test repo. I will investigate in pylint. |
e593f84
to
53c71a7
Compare
I followed the suggestion and applied a similar change in pylint. Cache specs do improve a lot. Also it seems like previous tests were running with a polluted virtualenv so that explains stuff like looking for jinja2 😅. Run times diminished in all cases because of the clean virtualenv, however relatively speaking the performance increase seems even better than the first iteration. I also changed tuples to frozensets since apparently they are faster by a noticeable margin (according to hyperfine). StatsNo Cache
Cachetuple
frozenset
The related PR in pylint that would make this cache work is pylint#9595. |
53c71a7
to
9984bc4
Compare
Nevermind about the frozenset, when converting it back to a list ordering is sometimes lost so it can't be used, at least not for the sensitive modpath, for the search paths I don't think the order matters too much (?) so we can use frozensets there. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2408 +/- ##
==========================================
+ Coverage 92.76% 92.78% +0.02%
==========================================
Files 94 94
Lines 11087 11095 +8
==========================================
+ Hits 10285 10295 +10
+ Misses 802 800 -2
Flags with carried forward coverage won't be shown. Click here to find out more.
|
The potential indeterminacy of the frozenset does scare me a little bit. Just this week on Discord someone was asking for pointers about debugging indeterminate behavior of the I'd suggest using tuple for both. But if you want to prepare a new benchmark and discuss it, that's fine too. |
Right, more than 20% faster is HUGE !
Not sure if I understand this right but the order of discovery does matter, right ? Doesn't it affect pylint-dev/pylint#6535 (pinned issue in pylint) ? |
Yeah, I traced what happens to that path argument, and it eventually ends up here, in an order-sensitive for loop in Side note: it seems like our constructor for astroid/astroid/interpreter/_import/spec.py Lines 85 to 86 in 4a8827d
|
Certain checkers upstream on pylint like import-error heavily use find_spec. This method is IO intensive as it looks for files across several search paths to return a ModuleSpec. Since imports across files may repeat themselves it makes sense to cache this method in order to speed up the linting process. Closes pylint-dev/pylint#9310.
This class predates efforts to have a central interface to control global state (including caches) and it is no longer needed.
9984bc4
to
98a0380
Compare
Ooops , I changed the search paths to tuple as well. Perf is still good, in fact since most of the time we should be getting no search paths I ran hyperfine again and there was not much of an impact. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, great work!
Thanks to everyone as well for the help and reviews 🤝 |
Certain checkers upstream on pylint like import-error heavily use find_spec. This method is IO intensive as it looks for files across several search paths to return a ModuleSpec.
Since imports across files may repeat themselves it makes sense to cache this method in order to speed up the linting process.
Local testing shows that caching reduces the total amount of calls to find_module methods (used by find_spec) by about 50%. Linting the test repository in the related issue goes from 40 seconds to 37 seconds. This was on a NVME disk and after warmup, so timing gains may be bigger on slower file systems like the one mentioned in the referenced issue.
Closes pylint-dev/pylint#9310.