The windows.bed is empty #4

Open
jacobhsu35 opened this issue Nov 23, 2016 · 1 comment


@jacobhsu35

I'm trying CLAMMS on my exome sequencing data and have followed the instructions on the GitHub page.
However, I keep getting error messages like the following for all of the chromosomes, and the resulting windows.bed is empty:
Feature (Y:1264940-1265060) beyond the length of Y size (0 bp). Skipping.

The command I ran is as follows:
PATH/clamms/annotate_windows.sh xgen_IDT_forCLAMMS.bed GATK/bundle/2.8/hg19/ucsc.hg19.fasta mappability.bed $INSERT_SIZE PATH/clamms/data/clamms_special_regions.bed > windows.bed

I have tried several different insert sizes, including 0, 50, 100, 130, 150, 300, and 400.
Do you have any idea what I might be missing?

Could you please provide a test sample and script so that I can try to figure it out myself? Thank you for considering my request.

@rgcgithub
Owner

In case you didn't receive my email, I'll paste the response here:

This is an error being thrown by bedtools, which is called twice in the annotate_windows.sh script. My guess is that the first call to bedtools nuc is the one erring, suggesting that either the genome.fa file or targets.bed file has regions on Y (and possibly other chromosomes) that are not represented in the other file. You can test this by running just the relevant command, i.e.:

bedtools nuc -fi GATK/bundle/2.8/hg19/ucsc.hg19.fasta -bed xgen_IDT_forCLAMMS.bed

If that reproduces the error, make sure the files are consistent (including chromosome names, which should not have “chr” in them).
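If the mismatch turns out to be a "chr" prefix on one side only, one way to make the names consistent is to strip the prefix from both files. This is only a minimal sketch: the function names `strip_chr_bed` and `strip_chr_fasta` and the output filenames are hypothetical, and it assumes the prefix is the only difference in naming (it does not handle cases such as "chrM" versus "MT").

```shell
# Hypothetical helpers, not part of CLAMMS or bedtools.

# Remove a leading "chr" from column 1 of a BED file (prints to stdout).
strip_chr_bed() {
    sed 's/^chr//' "$1"
}

# Remove a leading "chr" from FASTA header lines only (prints to stdout).
# Sequence lines are left untouched because they never start with ">".
strip_chr_fasta() {
    sed 's/^>chr/>/' "$1"
}

# Example usage (output filenames are placeholders):
#   strip_chr_bed   xgen_IDT_forCLAMMS.bed > targets.nochr.bed
#   strip_chr_fasta ucsc.hg19.fasta        > genome.nochr.fa
```

Writing the results to new filenames, rather than overwriting the originals, avoids accidentally pairing the renamed FASTA with an index built for the original one.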

Evan

As I said above, this is most likely due to regions in your targets file (xgen_IDT_forCLAMMS.bed) not being represented in your genome FASTA file (ucsc.hg19.fasta) or not being coded consistently (e.g. one has "chrY" and the other has "Y"). Make sure these commands output the same chromosome names and that they both have the Y chromosome:

`cut -f 1 xgen_IDT_forCLAMMS.bed | uniq`
`grep '^>' ucsc.hg19.fasta`
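Comparing the two listings by eye is error-prone when there are many contigs, so the check can also be automated. The following is a rough sketch, assuming a tab-delimited BED file and a FASTA whose sequence name is the first whitespace-delimited token of each header; `missing_chroms` is a hypothetical helper name, not something provided by CLAMMS or bedtools.

```shell
# Hypothetical helper: print chromosome names that appear in the BED file
# but have no matching sequence name in the FASTA file.
missing_chroms() {
    bed="$1"
    fasta="$2"
    workdir=$(mktemp -d)

    # Unique chromosome names from column 1 of the BED file.
    cut -f 1 "$bed" | LC_ALL=C sort -u > "$workdir/bed_names"

    # Unique sequence names from FASTA headers: drop ">" and keep the
    # first whitespace-delimited token.
    grep '^>' "$fasta" | sed 's/^>//' | awk '{print $1}' \
        | LC_ALL=C sort -u > "$workdir/fasta_names"

    # comm -23 keeps only the lines unique to the first (BED) list.
    LC_ALL=C comm -23 "$workdir/bed_names" "$workdir/fasta_names"

    rm -rf "$workdir"
}

# Example usage:
#   missing_chroms xgen_IDT_forCLAMMS.bed ucsc.hg19.fasta
```

If this prints anything (for example "Y" when the FASTA only contains "chrY"), those are the names bedtools cannot find in the genome file. No output means every chromosome in the targets file has a same-named sequence in the FASTA.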

As for examples, you can take a look at the "CLAMMS: Compute and annotate windows from exome targets" precisionFDA app (requires a precisionFDA account): https://precision.fda.gov/apps/app-F0V2jQ804XXz6G90pZB10fJj
 
