Commit
added test files for bubble detection
lh3 committed Feb 22, 2024
1 parent 8b97aaa commit a9582c3
Showing 16 changed files with 183 additions and 3 deletions.
9 changes: 6 additions & 3 deletions pangene.1
Original file line number Diff line number Diff line change
@@ -119,6 +119,12 @@ on the genome is at least
.I FLOAT
fraction of the shorter alignment [0.5]
.TP
.BI -J
Do not attempt to identify pseudogenes jointly across samples.
.TP
.BI -E
Ignore genes that are single-exon in all genomes.
.TP
.BI -p \ FLOAT
A gene is considered a segment in the graph if it is
.I dominant
@@ -143,9 +149,6 @@ Drop a gene if it connects more than
.I INT
loci that are distant from each other [3]
.TP
.BI -J
Do not attempt to identify pseudogenes jointly across samples.
.TP
.BI -b \ FLOAT
Demote a branching arc if it is weaker than the best branching arc by
.I FLOAT
10 changes: 10 additions & 0 deletions test/bubble/t1-1.gfa
@@ -0,0 +1,10 @@
S CAPNS2 * LN:i:248 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:CAPNS2:ENSP00000400882.2
S CES1 * LN:i:568 ng:i:98 nc:i:111 c1:i:98 c2:i:0 pp:Z:CES1:ENSP00000353720.4
S CES5A * LN:i:604 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:CES5A:ENSP00000428864.1
S GNAO1 * LN:i:354 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:GNAO1:ENSP00000262493.6
S SLC6A2 * LN:i:628 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:SLC6A2:ENSP00000219833.8
L CAPNS2 + SLC6A2 + 0M L1:i:248 L2:i:628
L CES1 - CES5A - 0M L1:i:568 L2:i:604
L CES5A - GNAO1 + 0M L1:i:604 L2:i:354
L SLC6A2 + CES1 + 0M L1:i:628 L2:i:568
L SLC6A2 + CES1 - 0M L1:i:628 L2:i:568
6 changes: 6 additions & 0 deletions test/bubble/t1-2.gfa
@@ -0,0 +1,6 @@
S CES1 * LN:i:568 ng:i:98 nc:i:111 c1:i:98 c2:i:0 pp:Z:CES1:ENSP00000353720.4
S CES5A * LN:i:604 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:CES5A:ENSP00000428864.1
S SLC6A2 * LN:i:628 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:SLC6A2:ENSP00000219833.8
L CES1 - CES5A - 0M L1:i:568 L2:i:604
L SLC6A2 + CES1 + 0M L1:i:628 L2:i:568
L SLC6A2 + CES1 - 0M L1:i:628 L2:i:568
12 changes: 12 additions & 0 deletions test/bubble/t1-3.gfa
@@ -0,0 +1,12 @@
S DMRTC1B * LN:i:192 ng:i:77 nc:i:129 c1:i:77 c2:i:0 pp:Z:DMRTC1B:ENSP00000362632.3
S FAM236A * LN:i:79 ng:i:78 nc:i:151 c1:i:78 c2:i:0 pp:Z:FAM236A:ENSP00000490343.2
S FAM236C * LN:i:79 ng:i:78 nc:i:125 c1:i:77 c2:i:1 pp:Z:FAM236C:ENSP00000490543.1
S NAP1L2 * LN:i:460 ng:i:78 nc:i:79 c1:i:78 c2:i:20 pp:Z:NAP1L2:ENSP00000362616.3
S PABPC1L2B * LN:i:200 ng:i:78 nc:i:158 c1:i:78 c2:i:20 pp:Z:PABPC1L2B:ENSP00000362621.3
S PHKA1 * LN:i:1240 ng:i:78 nc:i:78 c1:i:78 c2:i:0 pp:Z:PHKA1:ENSP00000362640.3
L DMRTC1B - FAM236C + 0M L1:i:192 L2:i:79
L DMRTC1B - FAM236A + 0M L1:i:192 L2:i:79
L FAM236A + PABPC1L2B + 0M L1:i:79 L2:i:200
L FAM236C + FAM236A + 0M L1:i:79 L2:i:79
L PABPC1L2B - NAP1L2 - 0M L1:i:200 L2:i:460
L PHKA1 - FAM236A - 0M L1:i:1240 L2:i:79
8 changes: 8 additions & 0 deletions test/bubble/t1-4.gfa
@@ -0,0 +1,8 @@
S FAM102B * LN:i:360 ng:i:98 nc:i:99 c1:i:98 c2:i:0 pp:Z:FAM102B:ENSP00000359052.3
S NBPF4 * LN:i:667 ng:i:94 nc:i:95 c1:i:94 c2:i:4 pp:Z:NBPF4:ENSP00000479545.1
S NBPF6 * LN:i:667 ng:i:97 nc:i:101 c1:i:97 c2:i:1 pp:Z:NBPF6:ENSP00000359057.3
S SLC25A24 * LN:i:477 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:SLC25A24:ENSP00000457733.1
L NBPF4 - NBPF6 + 0M L1:i:667 L2:i:667
L NBPF6 + FAM102B + 0M L1:i:667 L2:i:360
L SLC25A24 - NBPF4 - 0M L1:i:477 L2:i:667
L SLC25A24 - NBPF6 - 0M L1:i:477 L2:i:667
8 changes: 8 additions & 0 deletions test/bubble/t1-5.gfa
@@ -0,0 +1,8 @@
S RGPD1 * LN:i:1756 ng:i:62 nc:i:63 c1:i:60 c2:i:38 pp:Z:RGPD1:ENSP00000492954.1
S IGKV3OR2-268 * LN:i:116 ng:i:98 nc:i:99 c1:i:98 c2:i:0 pp:Z:IGKV3OR2-268:ENSP00000474297.1
S PLGLB2 * LN:i:96 ng:i:98 nc:i:198 c1:i:98 c2:i:0 pp:Z:PLGLB2:ENSP00000352458.4
S RGPD2 * LN:i:1756 ng:i:97 nc:i:133 c1:i:94 c2:i:4 pp:Z:RGPD2:ENSP00000381214.3
L RGPD1 + PLGLB2 - 0M L1:i:1756 L2:i:96
L IGKV3OR2-268 + PLGLB2 + 0M L1:i:116 L2:i:96
L IGKV3OR2-268 - PLGLB2 + 0M L1:i:116 L2:i:96
L PLGLB2 + RGPD2 - 0M L1:i:96 L2:i:1756
11 changes: 11 additions & 0 deletions test/bubble/t1-6.gfa
@@ -0,0 +1,11 @@
S BET1L * LN:i:152 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:BET1L:ENSP00000386558.1
S SCGB1C1 * LN:i:95 ng:i:98 nc:i:98 c1:i:74 c2:i:24 pp:Z:SCGB1C1:ENSP00000344545.2
S ODF3 * LN:i:254 ng:i:98 nc:i:99 c1:i:98 c2:i:0 pp:Z:ODF3:ENSP00000325868.5
S RIC8A * LN:i:537 ng:i:98 nc:i:99 c1:i:98 c2:i:0 pp:Z:RIC8A:ENSP00000325941.5
S SIRT3 * LN:i:399 ng:i:98 nc:i:99 c1:i:98 c2:i:0 pp:Z:SIRT3:ENSP00000372191.4
L BET1L + ODF3 - 0M L1:i:152 L2:i:254
L BET1L - SCGB1C1 + 0M L1:i:152 L2:i:95
L BET1L - RIC8A + 0M L1:i:152 L2:i:537
L SCGB1C1 + ODF3 + 0M L1:i:95 L2:i:254
L ODF3 + RIC8A + 0M L1:i:254 L2:i:537
L RIC8A + SIRT3 - 0M L1:i:537 L2:i:399
16 changes: 16 additions & 0 deletions test/bubble/t1-7.gfa
@@ -0,0 +1,16 @@
S C1GALT1 * LN:i:363 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:C1GALT1:ENSP00000389176.2
S CCZ1B * LN:i:482 ng:i:98 nc:i:195 c1:i:98 c2:i:0 pp:Z:CCZ1B:ENSP00000314544.8
S OCM * LN:i:109 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:OCM:ENSP00000401365.1
S PMS2 * LN:i:924 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:PMS2:ENSP00000514637.1
S RSPH10B * LN:i:870 ng:i:75 nc:i:93 c1:i:62 c2:i:36 pp:Z:RSPH10B:ENSP00000385443.1
S RSPH10B2 * LN:i:870 ng:i:80 nc:i:102 c1:i:76 c2:i:22 pp:Z:RSPH10B2:ENSP00000297186.3
S ZNF12 * LN:i:697 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:ZNF12:ENSP00000385939.1
L CCZ1B - C1GALT1 + 0M
L OCM + CCZ1B + 0M
L PMS2 + RSPH10B2 + 0M
L RSPH10B + CCZ1B - 0M
L RSPH10B - PMS2 - 0M
L RSPH10B - ZNF12 + 0M
L RSPH10B2 + CCZ1B - 0M
L ZNF12 - RSPH10B2 + 0M
L ZNF12 + PMS2 + 0M
16 changes: 16 additions & 0 deletions test/bubble/t1-7a.gfa
@@ -0,0 +1,16 @@
S OCM * LN:i:109 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:OCM:ENSP00000401365.1
S C1GALT1 * LN:i:363 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:C1GALT1:ENSP00000389176.2
S CCZ1B * LN:i:482 ng:i:98 nc:i:195 c1:i:98 c2:i:0 pp:Z:CCZ1B:ENSP00000314544.8
S PMS2 * LN:i:924 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:PMS2:ENSP00000514637.1
S RSPH10B * LN:i:870 ng:i:75 nc:i:93 c1:i:62 c2:i:36 pp:Z:RSPH10B:ENSP00000385443.1
S RSPH10B2 * LN:i:870 ng:i:80 nc:i:102 c1:i:76 c2:i:22 pp:Z:RSPH10B2:ENSP00000297186.3
S ZNF12 * LN:i:697 ng:i:98 nc:i:98 c1:i:98 c2:i:0 pp:Z:ZNF12:ENSP00000385939.1
L CCZ1B - C1GALT1 + 0M
L OCM + CCZ1B + 0M
L PMS2 + RSPH10B2 + 0M
L RSPH10B + CCZ1B - 0M
L RSPH10B - PMS2 - 0M
L RSPH10B - ZNF12 + 0M
L RSPH10B2 + CCZ1B - 0M
L ZNF12 - RSPH10B2 + 0M
L ZNF12 + PMS2 + 0M
19 changes: 19 additions & 0 deletions test/bubble/t1-8.gfa
@@ -0,0 +1,19 @@
S RTL8C * LN:i:113 ng:i:78 nc:i:80 c1:i:78 c2:i:0 pp:Z:RTL8C:ENSP00000257013.7
S CT55 * LN:i:264 ng:i:79 nc:i:83 c1:i:79 c2:i:19 pp:Z:CT55:ENSP00000276241.6
S ETDB * LN:i:59 ng:i:78 nc:i:161 c1:i:78 c2:i:0 pp:Z:ETDB:ENSP00000490943.1
S ETDC * LN:i:59 ng:i:78 nc:i:80 c1:i:78 c2:i:0 pp:Z:ETDC:ENSP00000490576.1
S INTS6L * LN:i:898 ng:i:78 nc:i:78 c1:i:78 c2:i:20 pp:Z:INTS6L:ENSP00000491427.1
S RTL8A * LN:i:113 ng:i:78 nc:i:81 c1:i:78 c2:i:0 pp:Z:RTL8A:ENSP00000375267.1
S SMIM10L2B * LN:i:78 ng:i:78 nc:i:161 c1:i:78 c2:i:20 pp:Z:SMIM10L2B:ENSP00000487709.1
S ZNF449 * LN:i:518 ng:i:78 nc:i:81 c1:i:78 c2:i:20 pp:Z:ZNF449:ENSP00000339585.4
S ZNF75D * LN:i:510 ng:i:78 nc:i:80 c1:i:78 c2:i:0 pp:Z:ZNF75D:ENSP00000359802.3
L RTL8C + RTL8A - 0M L1:i:113 L2:i:113
L ETDB + ZNF75D - 0M L1:i:59 L2:i:510
L ETDB - CT55 + 0M L1:i:59 L2:i:264
L ETDB - CT55 - 0M L1:i:59 L2:i:264
L ETDC + ZNF449 + 0M L1:i:59 L2:i:518
L RTL8A - SMIM10L2B - 0M L1:i:113 L2:i:78
L SMIM10L2B + INTS6L + 0M L1:i:78 L2:i:898
L SMIM10L2B - ETDB - 0M L1:i:78 L2:i:59
L SMIM10L2B - ZNF449 - 0M L1:i:78 L2:i:518
L ZNF75D - ETDC + 0M L1:i:510 L2:i:59
22 changes: 22 additions & 0 deletions test/bubble/t1-8c.gfa
@@ -0,0 +1,22 @@
S RTL8C * LN:i:113 ng:i:78 nc:i:80 c1:i:78 c2:i:0 pp:Z:RTL8C:ENSP00000257013.7
S CT55 * LN:i:264 ng:i:79 nc:i:83 c1:i:79 c2:i:19 pp:Z:CT55:ENSP00000276241.6
S ETDB * LN:i:59 ng:i:78 nc:i:161 c1:i:78 c2:i:0 pp:Z:ETDB:ENSP00000490943.1
S ETDC * LN:i:59 ng:i:78 nc:i:80 c1:i:78 c2:i:0 pp:Z:ETDC:ENSP00000490576.1
S INTS6L * LN:i:898 ng:i:78 nc:i:78 c1:i:78 c2:i:20 pp:Z:INTS6L:ENSP00000491427.1
S RTL8A * LN:i:113 ng:i:78 nc:i:81 c1:i:78 c2:i:0 pp:Z:RTL8A:ENSP00000375267.1
S SMIM10L2B * LN:i:78 ng:i:78 nc:i:161 c1:i:78 c2:i:20 pp:Z:SMIM10L2B:ENSP00000487709.1
S ZNF449 * LN:i:518 ng:i:78 nc:i:81 c1:i:78 c2:i:20 pp:Z:ZNF449:ENSP00000339585.4
S ZNF75D * LN:i:510 ng:i:78 nc:i:80 c1:i:78 c2:i:0 pp:Z:ZNF75D:ENSP00000359802.3
S CT45A1 * LN:i:100
L RTL8C + RTL8A - 0M L1:i:113 L2:i:113
L ETDB + ZNF75D - 0M L1:i:59 L2:i:510
L ETDB - CT55 + 0M L1:i:59 L2:i:264
L ETDB - CT55 - 0M L1:i:59 L2:i:264
L ETDC + ZNF449 + 0M L1:i:59 L2:i:518
L RTL8A - SMIM10L2B - 0M L1:i:113 L2:i:78
L SMIM10L2B + INTS6L + 0M L1:i:78 L2:i:898
L SMIM10L2B - ETDB - 0M L1:i:78 L2:i:59
L SMIM10L2B - ZNF449 - 0M L1:i:78 L2:i:518
L ZNF75D - ETDC + 0M L1:i:510 L2:i:59
L INTS6L + CT45A1 + 0M
L CT45A1 + CT45A1 + 0M
8 changes: 8 additions & 0 deletions test/bubble/t2-0-simple.gfa
@@ -0,0 +1,8 @@
S s1 * LN:i:100
S s2 * LN:i:100
S s3 * LN:i:100
S s4 * LN:i:100
L s1 + s2 +
L s1 + s3 +
L s2 + s4 +
L s3 + s4 +
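
The diamond in t2-0-simple.gfa (s1 branches to s2 and s3, which rejoin at s4) is the simplest bubble shape. The Python below is a minimal sketch, not pangene's own bubble finder (this commit adds only test data, not the algorithm). It parses S and L lines into a bidirected graph and reports such diamonds. The names `parse_gfa` and `simple_bubbles` are hypothetical, and the detector assumes exactly two branches of one segment each.

```python
from collections import defaultdict

FLIP = {"+": "-", "-": "+"}

# Contents of test/bubble/t2-0-simple.gfa (fields split on any whitespace).
T2_0_SIMPLE = """\
S s1 * LN:i:100
S s2 * LN:i:100
S s3 * LN:i:100
S s4 * LN:i:100
L s1 + s2 +
L s1 + s3 +
L s2 + s4 +
L s3 + s4 +
"""


def parse_gfa(text):
    """Return (segments, arcs); arcs maps an oriented vertex (name, strand) to its successors."""
    segments, arcs = set(), defaultdict(set)
    for line in text.splitlines():
        f = line.split()  # tolerate tabs or spaces
        if not f:
            continue
        if f[0] == "S":
            segments.add(f[1])
        elif f[0] == "L":
            v, w = (f[1], f[2]), (f[3], f[4])
            arcs[v].add(w)
            # A link v->w also implies the reverse-complement arc w'->v'.
            arcs[(w[0], FLIP[w[1]])].add((v[0], FLIP[v[1]]))
    return segments, dict(arcs)


def simple_bubbles(arcs):
    """Find diamonds src -> {a, b} -> sink, reported once per strand pair."""
    preds = defaultdict(set)
    for v, ws in arcs.items():
        for w in ws:
            preds[w].add(v)
    found = set()
    for src, ws in arcs.items():
        if len(ws) != 2:
            continue
        a, b = ws
        out_a, out_b = arcs.get(a, set()), arcs.get(b, set())
        # Both branches must have one outgoing arc, to the same sink.
        if len(out_a) != 1 or out_a != out_b:
            continue
        # Both branches must be entered only from the source.
        if preds[a] != {src} or preds[b] != {src}:
            continue
        (sink,) = out_a
        if sink[0] == src[0] or preds[sink] != {a, b}:
            continue
        # The same bubble is seen from both strands; keep one orientation.
        rc = ((sink[0], FLIP[sink[1]]), (src[0], FLIP[src[1]]))
        found.add(min((src, sink), rc))
    return sorted(found)


segments, arcs = parse_gfa(T2_0_SIMPLE)
bubbles = simple_bubbles(arcs)  # [(("s1", "+"), ("s4", "+"))]
```

Storing both an arc and its reverse complement keeps the traversal strand-agnostic, at the cost of seeing each bubble twice, so the sketch keeps only the smaller (source, sink) pair. A linear chain such as t2-3.gfa has no vertex with two outgoing arcs and yields an empty list.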
15 changes: 15 additions & 0 deletions test/bubble/t2-1.gfa
@@ -0,0 +1,15 @@
S s1 * LN:i:100
S s2 * LN:i:100
S s3 * LN:i:100
S s4 * LN:i:100
S s5 * LN:i:100
S s6 * LN:i:100
S s7 * LN:i:100
L s1 + s2 + 0M
L s2 + s3 + 0M
L s1 + s4 + 0M
L s4 + s5 + 0M
L s5 + s6 + 0M
L s4 + s7 + 0M
L s7 + s6 + 0M
L s6 + s3 + 0M
15 changes: 15 additions & 0 deletions test/bubble/t2-2.gfa
@@ -0,0 +1,15 @@
S s1 * LN:100
S s2 * LN:100
S s3 * LN:100
S s4 * LN:100
S s5 * LN:100
S s6 * LN:100
S s7 * LN:100
L s1 + s2 +
L s1 + s3 +
L s2 + s4 +
L s3 + s4 +
L s4 + s5 +
L s4 + s6 +
L s5 + s7 +
L s6 + s7 +
5 changes: 5 additions & 0 deletions test/bubble/t2-3.gfa
@@ -0,0 +1,5 @@
S s1 * LN:100
S s2 * LN:100
S s3 * LN:100
L s1 + s2 +
L s3 + s1 +
6 changes: 6 additions & 0 deletions test/bubble/t2-4.gfa
@@ -0,0 +1,6 @@
S s1 * LN:i:100
S s2 * LN:i:100
S s3 * LN:i:100
L s1 + s2 + 0M
L s2 + s2 - 0M
L s2 - s3 + 0M
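
The link `L s2 + s2 -` in t2-4.gfa joins a segment to its own opposite strand. In a bidirected GFA graph every link has a reverse-complement counterpart, and this link is its own counterpart, which is presumably the edge case this file exercises. Below is a minimal sketch of that check. The helper names `complement` and `is_self_complementary` are hypothetical and are not taken from pangene.

```python
FLIP = {"+": "-", "-": "+"}


def complement(link):
    """Return the reverse complement of a link ((from, strand), (to, strand))."""
    (v, vs), (w, ws) = link
    return ((w, FLIP[ws]), (v, FLIP[vs]))


def is_self_complementary(link):
    """True when a link and its reverse complement are the same arc."""
    return complement(link) == link


# The three links of test/bubble/t2-4.gfa.
links = [
    (("s1", "+"), ("s2", "+")),
    (("s2", "+"), ("s2", "-")),  # joins s2 to its own opposite strand
    (("s2", "-"), ("s3", "+")),
]
flags = [is_self_complementary(link) for link in links]  # [False, True, False]
```

A traversal that adds both an arc and its complement should treat such a link as a single arc rather than two.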
