update fit_cov_ebnmf #4

willwerscheid · 2023-11-08T14:05:50Z

use flash_update_data and lowrank plus sparse representation

pcarbo · 2023-11-08T22:38:05Z

@willwerscheid Is the idea here that this new flash_fit_cov_ebnmf should produce the same result as the old flash_fit_cov_ebnmf, but the new version should be much faster than the old version when Y is sparse?

willwerscheid · 2023-11-09T01:08:31Z

Yes, this should produce the same result. (If I add flash_factors_split then the result could be very slightly different since I would keep the values of EL2 and EF2 when doing the splitting.) I am not sure about it being much faster. Hopefully at least somewhat faster but I don't have any guesses about how much. We will need to do some testing. I think @YushaLiu was planning to test on the pdac dataset.

YushaLiu · 2023-12-04T20:59:04Z

Hi @willwerscheid, sorry, I wanted to try this earlier on the pancreatic cancer data, but my RCC account was locked and was only just reactivated. I went over the updated file and have a few quick questions.

1. In lines 134 and 137, you use * for matrix multiplication. Do you intend element-wise multiplication here? In particular, in line 137, L.pm and F.pm do not seem to have the same dimensions, so does it make sense to multiply them with *?
2. With your new definition of fit_ebmf_to_YY, my understanding is that dat can be either a covariance matrix or a low-rank representation, depending on the size of Y. If that is the case, I think lines 142-143 also need to handle the matrix case and the list case separately?

willwerscheid · 2023-12-04T21:06:41Z

I think lines 134 and 137 are correct. You should not need to reconstruct the full matrix to get the diagonals. L.pm and F.pm are the same dimension; n = p because it is a covariance matrix. But I think you are correct about lines 142-3. There should be a line handling the case where dat is a matrix.
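To illustrate the point about the diagonals, here is a minimal sketch (in plain Python rather than R, and not the code from the PR) of why an element-wise product is enough. When L and F are both n x k, the i-th diagonal entry of L F^T is the sum over k of L[i][k] * F[i][k], so the n x n product never has to be formed. In R this is typically written as rowSums(L * F). The function names and toy matrices below are hypothetical, chosen only for illustration.

```python
def diag_of_product(L, F):
    """Diagonal of L F^T for two n x k matrices, without forming the n x n product."""
    if len(L) != len(F):
        raise ValueError("L and F must have the same number of rows (n = p)")
    # Element-wise product of matching rows, then a row sum.
    return [sum(l * f for l, f in zip(l_row, f_row)) for l_row, f_row in zip(L, F)]


def diag_via_full_product(L, F):
    """Reference version: build the full n x n matrix L F^T, then read its diagonal."""
    n, k = len(L), len(L[0])
    full = [[sum(L[i][m] * F[j][m] for m in range(k)) for j in range(n)]
            for i in range(n)]
    return [full[i][i] for i in range(n)]


# Toy posterior means (made-up values): n = p = 3, k = 2.
L_pm = [[1.0, 2.0], [0.5, 0.0], [3.0, 1.0]]
F_pm = [[2.0, 1.0], [4.0, 1.0], [1.0, 2.0]]

print(diag_of_product(L_pm, F_pm))        # [4.0, 2.0, 5.0]
print(diag_via_full_product(L_pm, F_pm))  # same values, at O(n^2 k) cost
```

The shortcut costs O(nk) instead of O(n^2 k), which matters when the number of rows is large.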

YushaLiu · 2023-12-05T01:10:29Z

> I think lines 134 and 137 are correct. You should not need to reconstruct the full matrix to get the diagonals. L.pm and F.pm are the same dimension; n = p because it is a covariance matrix. But I think you are correct about lines 142-3. There should be a line handling the case where dat is a matrix.

Oh, you are right! I forgot that L.pm and F.pm have the same dimensions here because n = p.
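For the remaining issue in lines 142-143, a fix would presumably branch on the type of dat. The sketch below shows the general shape of such a branch in plain Python. It assumes, purely for illustration, that the dense case is a square matrix and that the low-rank case is a dict with hypothetical keys "U" and "V" such that the data equal U V^T. The actual structure used in the PR (including any sparse component) is not shown in this thread and may differ.

```python
def data_diagonal(dat):
    """Return the diagonal of the data for either representation of dat.

    Hypothetical layouts (assumptions, not the PR's actual structures):
      - dense: a square list of lists (the covariance matrix itself)
      - low-rank: a dict {"U": n x k, "V": n x k} representing U V^T
    """
    if isinstance(dat, dict):
        # Low-rank case: diag(U V^T) via row-wise element products.
        return [sum(u * v for u, v in zip(u_row, v_row))
                for u_row, v_row in zip(dat["U"], dat["V"])]
    # Dense case: read the diagonal directly.
    return [dat[i][i] for i in range(len(dat))]


dense = [[4.0, 1.0], [1.0, 2.0]]
lowrank = {"U": [[1.0, 2.0], [0.5, 0.0]], "V": [[2.0, 1.0], [4.0, 1.0]]}

print(data_diagonal(dense))    # [4.0, 2.0]
print(data_diagonal(lowrank))  # [4.0, 2.0]
```

In R the equivalent check would likely use is.matrix or is.list on dat.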

willwerscheid added 3 commits November 6, 2023 16:20

update fit_cov_ebnmf to use lowrank plus sparse data structure

ab66574

fix calculation of data diagonal

10874cd

form covariance matrix when there are many more genes than cells

720f837
