Speedup pytensor import #1161
Conversation
Codecov Report

Attention: Patch coverage is

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             main    #1161      +/-   ##
==========================================
- Coverage   82.11%   82.07%   -0.04%
==========================================
  Files         186      186
  Lines       48201    48232      +31
  Branches     8679     8673       -6
==========================================
+ Hits        39579    39588       +9
- Misses       6447     6473      +26
+ Partials     2175     2171       -4
```
I read through the changes and they look good. However, I'm wondering if replacing one initial import with possibly several (one per function) wouldn't make things slower overall?
Yeah, it should ideally be done outside of the function, but inside the `perform` method. For all those Scalar Ops that have a The concern is valid for the linalg / random stuff. Gonna think about it a bit more then...
There's an uglier alternative with global variables xD

```python
import numpy as np
from numpy import sin

def foo0(x):
    # Attribute lookup on the already-imported module on every call.
    return np.sin(x)

def foo1(x):
    # Name bound once at import time; plain global lookup per call.
    return sin(x)

sin2 = None

def foo2(x):
    # Lazy: import on first call, then cache in a module-level global.
    global sin2
    if sin2 is None:
        from numpy import sin
        sin2 = sin
    return sin2(x)

def foo3(x):
    # Re-runs the import statement (a sys.modules lookup) on every call.
    from numpy import sin
    return sin(x)
```

```python
%timeit foo0(5.0)
%timeit foo1(5.0)
%timeit foo2(5.0)
%timeit foo3(5.0)
# 931 ns ± 20.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
# 840 ns ± 16.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
# 847 ns ± 7.45 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
# 1.77 μs ± 12.7 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
```

Always importing has a considerable overhead indeed.
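The global-variable trick can also be tidied into a small memoizing helper. This is just a sketch (not code from this PR), using `functools.cache` and the stdlib `math` module as a stand-in for `numpy` so it stays self-contained:

```python
import importlib
from functools import cache

@cache
def _lazy(module_name: str):
    # First call imports the module; later calls return the cached module object.
    return importlib.import_module(module_name)

def foo4(x):
    # Pays the import cost once, then only a cached call plus attribute lookup.
    # `math` stands in for `numpy` here to keep the sketch stdlib-only.
    return _lazy("math").sin(x)
```

It behaves like `foo2` above but without needing a dedicated global per lazily imported name.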
Could this help? https://scientific-python.org/specs/spec-0001/
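SPEC 1's lazy-loading idea can also be approximated with nothing but the standard library. This is the recipe from the `importlib` docs (not code from this PR): `importlib.util.LazyLoader` builds the module object immediately but defers executing it until the first attribute access:

```python
import importlib.util
import sys

def lazy_import(name: str):
    # Recipe from the importlib documentation: create the module object now,
    # but defer actually executing it until an attribute is first accessed.
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module

# Cheap at pytensor import time; numpy would only really load on first use:
# np = lazy_import("numpy")
# np.sin(0.0)  # the real import is triggered here
```

The caveat is that import errors surface at first use rather than at import time, which can make failures harder to trace.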
I have to check carefully; we do a lot of registering stuff when pytensor is loaded (all the rewrites and whatnot), which involves instantiating many Ops.
If I provide the `blas__ldflags` to avoid #1160, I get these results running:

```
python -m benchmark_imports pytensor
```

Before:
After:

For perspective, numpy used to take 6.5% of the total import time, and now takes 25%.
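A back-of-envelope check on those shares (my arithmetic, assuming numpy's absolute import time is unchanged by this PR):

```python
# numpy's absolute import time is fixed; only its share of the total changed.
share_before = 0.065  # numpy was 6.5% of pytensor's total import time
share_after = 0.25    # numpy is now 25% of it

# t_numpy = share * total  =>  total_after / total_before = share_before / share_after
ratio = share_before / share_after
print(f"total import time after ≈ {ratio:.0%} of before")  # ≈ 26%
```

That works out to pytensor importing in roughly a quarter of the time it took before, i.e. just under a 4× speedup, consistent with numpy now dominating a much larger slice of what remains.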
I couldn't avoid importing scipy.sparse because we register the base `spmatrix` on the `as_symbolic` and `shared_constructor` dispatch functions. The import is messy: scipy/scipy#22382

📚 Documentation preview 📚: https://pytensor--1161.org.readthedocs.build/en/1161/