add option to use maskfile #12

Open
zmaroti opened this issue Oct 19, 2023 · 1 comment

Comments

@zmaroti

zmaroti commented Oct 19, 2023

Hi,

It would be nice if you could add a maskfile option as a parameter to either

hapBLOCK_chroms (to not emit IBD from masked areas, since all the relevant genomic coordinate information is available there)
or
filter_ibd_df plus its callers create_ind_ibd_df and ind_all_ibd_df (to filter IBD instead of, or in addition to, the SNP density parameter),

since this could be handled naturally in the base package.
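To illustrate the second option, here is a minimal, dependency-free sketch of what mask-based filtering of IBD segments could look like. It is not the package's code: the function names, the `mask` argument, and the record keys (`ch`, `StartM`, `EndM`, `iid1`, `iid2`) are assumptions made for this example, and both the segments and the mask are assumed to be in the same unit.

```python
def overlaps_mask(ch, start, end, mask):
    """Return True if (start, end) on chromosome `ch` overlaps any masked interval."""
    return any(
        m_ch == ch and start < m_end and end > m_start
        for m_ch, m_start, m_end in mask
    )


def filter_segments_by_mask(segments, mask):
    """Keep only the IBD segments that do not overlap a masked region.

    segments: list of dicts with hypothetical keys "ch", "StartM", "EndM"
    mask:     list of (chromosome, start, end) tuples in the same unit
    """
    return [
        s for s in segments
        if not overlaps_mask(s["ch"], s["StartM"], s["EndM"], mask)
    ]


# Hypothetical example data: one masked region on chromosome 1.
mask = [(1, 0.50, 0.60)]
segments = [
    {"iid1": "A", "iid2": "B", "ch": 1, "StartM": 0.10, "EndM": 0.30},
    {"iid1": "A", "iid2": "C", "ch": 1, "StartM": 0.55, "EndM": 0.75},
    {"iid1": "B", "iid2": "C", "ch": 2, "StartM": 0.55, "EndM": 0.75},
]
kept = filter_segments_by_mask(segments, mask)
```

Here the A/C segment is dropped because it overlaps the masked region, while the segment on chromosome 2 is kept even though its positions are numerically the same. A real implementation might instead trim segments at the mask boundaries or drop only those that fall mostly inside a mask; that is a design choice this sketch leaves open.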

(The individual IBD data in the output of hapBLOCK_chroms does not (yet) contain the genomic coordinates, and the mapping data is not on the same scale (M vs. cM) as the mask data, so doing this with simple "shell magic" would be complex.)
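The scale mismatch itself is only a factor of 100 (1 Morgan = 100 centimorgans). A small sketch of bringing mask intervals onto the Morgan scale, assuming the mask is given in cM and the map positions in Morgans; the tuple layout and the function name are hypothetical:

```python
CM_PER_MORGAN = 100.0  # 1 Morgan = 100 centimorgans


def mask_cm_to_morgan(mask_cm):
    """Convert (chromosome, start_cM, end_cM) tuples to Morgan units."""
    return [
        (ch, start / CM_PER_MORGAN, end / CM_PER_MORGAN)
        for ch, start, end in mask_cm
    ]


# Hypothetical mask intervals in centimorgans.
mask_cm = [(1, 50.0, 60.0), (2, 120.0, 125.0)]
mask_m = mask_cm_to_morgan(mask_cm)
```

After this conversion the mask intervals can be compared directly against map positions in Morgans.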

With a few samples, and when looking at IBD sharing for individual pairs, this is not an issue. But when you work with several hundred individuals, the number of pairs (N*(N-1)/2) gets large, and at these genomic locations almost everyone shares IBD with all other samples. This results in a needlessly large proportion of false-positive IBD in the outputs compared to the randomly distributed true IBD.
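To give a feel for how quickly the number of pairs grows, a short sketch (the sample sizes are arbitrary illustrations):

```python
def n_pairs(n):
    """Number of unordered pairs among n individuals: n * (n - 1) / 2."""
    return n * (n - 1) // 2


for n in (10, 100, 500):
    print(n, n_pairs(n))
# prints:
# 10 45
# 100 4950
# 500 124750
```

If nearly every pair shares a segment in a masked region, each such region can add on the order of that many rows to the output.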

Thanks!

@hringbauer
Owner

That is an excellent suggestion showing some deep competence. Thank you!

We will work on implementing it, as it could substantially speed up the post-processing for large datasets. I will leave the thread open until then.
