-B A2G output format explanation #46

ckuenne · 2021-08-23T09:40:31Z

Hi,

I'm back with more questions. Seems like I don't really understand the output format when using the "-B A2G" parameter.

cmd help:

-B <READ-TAG>         Tag reads by base substitution.
                      Count non-reference base substitution per read and stratify.
                      Requires stranded library type.
                      (Format for T to C mismatch: T2C; use ',' to separate substitutions)
                      Default: none

The manual has this to say:

6.3 Adding read/base stratification
Read stratification or partitioning based on base substitution(s) can be enabled by adding “-B <BASE-SUB>” to
your JACUSA2 run statement. <BASE-SUB> defines the base substitution X2Y : X, Y ∈ {A, C, G, T}, X ≠ Y
of interest, where X is the reference base and Y is a base call from some read. It is required to provide a stranded
library type for each condition because otherwise X2Y cannot unambiguously be determined from the sequencing
data. It is possible to provide multiple base substitutions by separating each with a “,”.
For each site the output will consist of at least one line that represents the total, unstratified reads. The “info”
column will contain the field “tag=*”, indicating that the total reads are shown. If a read with the
wanted base substitution, A2G for example, is encountered, all sites that are covered by this read will have an
additional line of output and the “info” column will have a value of “tag=A2G”.

call:
java -Xmx55g -jar /mnt/software/x86_64/packages/jacusa/2.0.1/jacusa.jar call-2 -a D,M,Y -filterNM 5 -s -c 2 -T 1 -P FR-SECONDSTRAND -B A2G -p 16 -r x/j2.a2g ./star_ht_1/igv/ht_1.bam,./star_ht_2/igv/ht_2.bam,./star_ht_3/igv/ht_3.bam,./star_ht_4/igv/ht_4.bam ./star_m3d-ht_1/igv/m3d-ht_1.bam,./star_m3d-ht_2/igv/m3d-ht_2.bam,./star_m3d-ht_3/igv/m3d-ht_3.bam,./star_m3d-ht_4/igv/m3d-ht_4.bam

example of j2.a2g:

#contig	start	end	name	score	strand	bases11	bases12	bases13	bases14	bases21	bases22	bases23	bases24	info	filter	ref		
chr1	3386985	3386986	call-2	1.086141859	-	6,0,0,0	5,0,0,0	8,0,0,0	18,0,2,0	4,0,0,0	7,0,0,0	12,0,0,0	3,0,0,0	tag=*	*	A
chr1	3386985	3386986	call-2	*	-	1,0,0,0	1,0,0,0	2,0,0,0	4,0,2,0	*	2,0,0,0	*	*	tag=A2G	*	A

bases1* = 4 replicates of the reference condition (ht)
bases2* = 4 replicates of the tribe condition (m3dht, = modified A2G)

So the first line per variant is the "normal" jacusa output with ACGT coverage and the second should be only the A2G modifications? But how do I read that second line?

ryrl9703 · 2022-01-25T05:22:27Z

I have the same question. Can anyone explain?

piechottam · 2022-01-25T06:04:01Z

The first line "tag=*" contains ALL reads.
The second line "tag=A2G" contains ONLY reads with A->G substitutions.
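
To make the two-line layout concrete, here is a minimal Python sketch that groups the lines of such an output file by site and by tag. It is not part of JACUSA2; the helper names `parse_counts` and `group_by_tag` are hypothetical. It assumes that the four numbers in each bases column are ordered A,C,G,T (as the question suggests), that "*" in a bases column means no tagged reads for that replicate, and that the info column holds ';'-separated key=value pairs.

```python
from collections import defaultdict

BASES = ("A", "C", "G", "T")  # assumed order of the four counts in each bases field


def parse_counts(field):
    """'18,0,2,0' -> {'A': 18, 'C': 0, 'G': 2, 'T': 0}; '*' -> None (assumed: no reads)."""
    if field == "*":
        return None
    return dict(zip(BASES, map(int, field.split(","))))


def group_by_tag(lines):
    """Return {(contig, start, end, strand): {tag: {bases column: counts}}}."""
    header = None
    sites = defaultdict(dict)
    for line in lines:
        line = line.rstrip()  # also drops trailing tabs after the last column
        if not line:
            continue
        if line.startswith("#"):
            if line.startswith("#contig"):
                header = line[1:].split("\t")
            continue
        row = dict(zip(header, line.split("\t")))
        # Assumption: info holds ';'-separated key=value pairs such as 'tag=A2G'.
        info = dict(kv.split("=", 1) for kv in row["info"].split(";") if "=" in kv)
        key = (row["contig"], int(row["start"]), int(row["end"]), row["strand"])
        counts = {c: parse_counts(v) for c, v in row.items() if c.startswith("bases")}
        sites[key][info.get("tag", "*")] = counts
    return dict(sites)


# The two example lines from the question, written with explicit tabs.
example = [
    "#contig\tstart\tend\tname\tscore\tstrand\tbases11\tbases12\tbases13\tbases14"
    "\tbases21\tbases22\tbases23\tbases24\tinfo\tfilter\tref",
    "chr1\t3386985\t3386986\tcall-2\t1.086141859\t-\t6,0,0,0\t5,0,0,0\t8,0,0,0\t18,0,2,0"
    "\t4,0,0,0\t7,0,0,0\t12,0,0,0\t3,0,0,0\ttag=*\t*\tA",
    "chr1\t3386985\t3386986\tcall-2\t*\t-\t1,0,0,0\t1,0,0,0\t2,0,0,0\t4,0,2,0"
    "\t*\t2,0,0,0\t*\t*\ttag=A2G\t*\tA",
]

sites = group_by_tag(example)
site = sites[("chr1", 3386985, 3386986, "-")]
print(site["*"]["bases14"])    # all reads of replicate bases14 at this site
print(site["A2G"]["bases14"])  # only reads tagged A2G
print(site["A2G"]["bases21"])  # None: '*' in the tagged line
```

Read this way, and going by the manual text quoted above (a tagged read adds a line for every site it covers), the bases14 entry of the "tag=A2G" line (4,0,2,0) would mean that 6 of the 20 reads at this site carry an A->G substitution somewhere along the read, and 2 of those 6 show the G at this position.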
