+1 end position in _ampliconx_ecDNA_x_intervals.bed in dir _classification_bed_files #13
Comments
Hi, thanks for reaching out - I will attempt to reproduce the issue locally and get back to you in the next couple of days. Re: 0.4.16 "result table creation bugfixes" - that fix was related to transforming the classification results into more condensed .tsv and .json files, which also report the paths of some relevant files for each focal amp. They were not bugfixes to the classification table itself. Please let me know if I misunderstood your query though. Thanks,
Sorry - can you clarify, are you experiencing this coordinate error in 0.4.13? Did you also experience it in the latest version, 0.4.16?
Hi Jens,
Thanks for your timely reply. I just tested, and the coordinate error also shows up in v0.4.16. Please find the output details below.
=-=-= _amplicon1_cycles.txt =-=-=
Interval 1 chr7 54754577 55441772
=-=-= _amplicon1_ecDNA_1_intervals.bed =-=-=
chr7 54770769 55085822
=-=-= _amplicon1_unknown_1_intervals.bed =-=-=
Thanks,
Hi Nan,
Thanks again for checking your local files and providing these examples. I realize now the issue is as follows:
AmpliconArchitect uses a 0-based fully-closed counting system (endpoint included). This allows 'intervals' of size 1bp to have the same start and end coordinate. It is useful for AA, but not for external programs.
AmpliconClassifier reports intervals using a 0-based half-closed counting system (endpoint excluded). This is what the UCSC genome browser and the IGV browser use. We made this change to enable better compatibility with outside tools when seeing where intervals map.
I have updated the README to contain this information. In my examination of my own files, I found that all amplicon types are reported with the 0-based half-closed system (at least in 0.4.16 and on). If you find some amplicon types that don't obey that, please do let me know, and if possible provide the graph and cycles file so I can reproduce it locally.
Congratulations by the way on the eccDNAdb site. It has a lot of very nice functionality!
Thank you,
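For readers comparing the two outputs, the relationship described above can be sketched in a few lines of Python. This is only an illustration of the two counting conventions, not code from AmpliconArchitect or AmpliconClassifier; the `Interval` type and all function names are hypothetical, and "half-closed" here means what is often called half-open (end excluded).

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Hypothetical container for one genomic interval."""
    chrom: str
    start: int
    end: int


def closed_to_half_open(iv: Interval) -> Interval:
    """0-based fully-closed (end included) -> 0-based half-open (end excluded)."""
    return Interval(iv.chrom, iv.start, iv.end + 1)


def half_open_to_closed(iv: Interval) -> Interval:
    """0-based half-open (end excluded) -> 0-based fully-closed (end included)."""
    return Interval(iv.chrom, iv.start, iv.end - 1)


def length_closed(iv: Interval) -> int:
    # Both endpoints are part of the interval, so add 1.
    return iv.end - iv.start + 1


def length_half_open(iv: Interval) -> int:
    # The end coordinate is excluded, so a plain difference is the length.
    return iv.end - iv.start


# The example coordinates from the original report.
cycles_iv = Interval("chr1", 100, 500)      # fully-closed, as in the cycles file
bed_iv = closed_to_half_open(cycles_iv)     # end excluded, as in the BED output
print(bed_iv)  # Interval(chrom='chr1', start=100, end=501)
```

Under these assumptions both records describe the same 401 bases, so the `+1` on the end coordinate is a change of convention and not a change in the reported region.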
Hi Jens,
Thanks for your detailed explanation of the difference between AmpliconArchitect and AmpliconClassifier outputs. I have recently been working with AmpliconClassifier outputs, and I will let you know if I see other issues. Congratulations on your new publication in Nature. I also appreciate your generous sharing of the Amplicon software suite, which has helped me a lot.
Warmest regards,
v0.4.13
The coordinate in the cycles file is, for example,
chr1 100 500
However, in the ecDNA_x_intervals.bed output file, it becomes
chr1 100 501
For other outputs, such as BFB_x_intervals.bed and unknown_x_intervals.bed, it is unchanged, namely
chr1 100 500
I tried to locate the code that causes the difference. It might be within the amplicon_annotation function definition in ac_annotation.py, but I am not sure. Please help to fix it.
I also noticed that the v0.4.16 changelog says "Result table creation bugfixes". Has the issue been fixed in the new version?