Improve multithreading in fit_poisson_nmf #56

Open
pcarbo opened this issue Nov 12, 2024 · 4 comments

Comments

@pcarbo
Member

pcarbo commented Nov 12, 2024

From @aksarkar:

I found that fit_poisson_nmf spends almost all of its time in single-threaded code, specifically in cost and in poisson_nmf_kkt. One obvious enhancement would be to move the objective function computation into the additive Poisson regression code (i.e., split it across threads) and then sum the per-thread results in single-threaded code. Similarly, one could compute the maximum absolute KKT residual inside the Poisson regression code and then take the maximum over the per-thread results in single-threaded code.
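The split-then-combine idea above can be sketched in plain R. This is only an illustration under stated assumptions, not the fastTopics implementation: chunk_stats is a hypothetical helper, the loss is taken to be the Poisson negative log-likelihood up to a constant, the KKT residual for L is taken to be L times the gradient elementwise, and lapply stands in for whatever threaded loop would actually run the blocks.

```r
# Hypothetical helper (not part of fastTopics): loss and max absolute KKT
# residual for one block of rows of X and L, with F held fixed.
chunk_stats <- function (rows, X, L, F, e = 1e-8) {
  Xc     <- X[rows,,drop = FALSE]
  Lc     <- L[rows,,drop = FALSE]
  Lambda <- tcrossprod(Lc,F) + e          # Poisson rates for this block.
  loss   <- sum(Lambda - Xc*log(Lambda))  # Neg. log-likelihood, up to a constant.
  G      <- (1 - Xc/Lambda) %*% F         # Gradient w.r.t. this block of L.
  list(loss = loss,kkt = max(abs(Lc*G)))
}

# Small simulated data set (assumed shapes: X is n x m, L is n x k, F is m x k).
set.seed(1)
n <- 40; m <- 30; k <- 3
L <- matrix(runif(n*k),n,k)
F <- matrix(runif(m*k),m,k)
X <- matrix(rpois(n*m,tcrossprod(L,F)),n,m)

# "Map" step: each block of rows is processed independently.
chunks <- split(seq_len(n),rep(1:4,length.out = n))
out <- lapply(chunks,function (rows) chunk_stats(rows,X,L,F))

# "Reduce" step: cheap single-threaded sum and max over the block results.
total_loss <- sum(sapply(out,"[[","loss"))
max_kkt    <- max(sapply(out,"[[","kkt"))
```

Because both quantities separate over rows, the combined values should match a single pass over the full matrices. On Unix-alikes, swapping lapply for parallel::mclapply would be one way to actually run the blocks concurrently.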

@aksarkar
Member

Thinking about this a bit more, my suggested enhancements may make extrapolation more complex since the current implementation extrapolates the whole L (F) matrix at once.

Is it principled to extrapolate each additive Poisson regression?

@pcarbo
Member Author

pcarbo commented Nov 12, 2024

The computations for extrapolation are particularly simple, so I'm not sure if there is much to be gained by parallelizing this code (but I could be wrong).

@aksarkar
Member

The issue is that

fit$loss.fnly <- sum(cost(X,fit$Ly,t(Fn),control$eps))

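is also a single-threaded bottleneck for large data sets, and my suggestion to compute the loss function inside each subproblem does not work here, because this loss is evaluated at the extrapolated estimate rather than at the output of the subproblems.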
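For illustration, here is a rough sketch of how the loss at the extrapolated point could still be accumulated block by block. It assumes an extrapolation step of the common form Ly = max(0, Ln + beta*(Ln - L0)) with a single beta shared by all rows; that form, the helper block_loss, and the variable names L0 and beta are assumptions for this example, not taken from the fastTopics source.

```r
# Hypothetical helper (not from fastTopics): Poisson NMF loss for a block of rows.
block_loss <- function (X, L, F, e = 1e-8) {
  Lambda <- tcrossprod(L,F) + e
  sum(Lambda - X*log(Lambda))
}

set.seed(1)
n <- 40; m <- 30; k <- 3
L0 <- matrix(runif(n*k),n,k)  # Previous iterate.
Ln <- matrix(runif(n*k),n,k)  # New iterate (stand-in for the regression output).
Fn <- matrix(runif(m*k),m,k)
X  <- matrix(rpois(n*m,tcrossprod(Ln,Fn)),n,m)
beta <- 0.5                   # Extrapolation parameter shared by all blocks.

blocks <- split(seq_len(n),rep(1:4,length.out = n))
loss_blocks <- sapply(blocks,function (rows) {
  # Extrapolate only this block of rows, then evaluate its share of the loss.
  Lb <- Ln[rows,,drop = FALSE]
  Ly <- pmax(Lb + beta*(Lb - L0[rows,,drop = FALSE]),0)
  block_loss(X[rows,,drop = FALSE],Ly,Fn)
})
loss_fnly <- sum(loss_blocks)
```

Under these assumptions the extrapolation is elementwise, so doing it per block gives the same matrix as doing it all at once. What remains global is the decision to accept or reject the extrapolated point, which needs the summed loss.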

@pcarbo
Member Author

pcarbo commented Nov 13, 2024

Yes, although this relates to your earlier point about the cost function not being parallelized. With a careful design, I think I can tackle all of these at the same time. I'll circle back to this soon.
