inconsistency between the ecDNA counts reported by annotated_cycles and ecDNA_counts.tsv #20

Open
newpest opened this issue Apr 16, 2024 · 3 comments

Comments

@newpest

newpest commented Apr 16, 2024

AC's annotated_cycles file lists two ecDNA-like cycles, but ecDNA_counts.tsv reports only one ecDNA, and the bed file for that ecDNA combines the regions of both cycles. Which of these should be treated as the final result?

ecDNA_counts.tsv

ampliconarchitect amplicon1 Cyclic Positive None detected 1

bed

20 45770081 45771921
20 45778344 45788294
20 52181786 53557332
20 54157523 54158002
20 54159266 54161359
20 55675464 55855043
20 62288769 62289400
17 60001725 60001897
17 60094504 60095816
17 60188227 60190908
17 60212997 62388060

cycles

Cycle=1;Copy_count=19.6402746181;Length=19150;IsCyclicPath=True;CycleClass=ecDNA-like;Segments=62+,95+,50-,48-,84+,83-,60+,47+
Cycle=21;Copy_count=3.77523588444;Length=3734287;IsCyclicPath=True;CycleClass=ecDNA-like;Segments=53+,76-,87-,56-

@jluebeck
Member

Hi, thanks for reaching out with this question.

One thing to keep in mind is how to interpret the AA cycles file. In most cases, a combination of substructures may be reported for a single ecDNA. One of the benefits of AmpliconClassifier is that it determines whether these substructures come from the same ecDNA. When ecDNA structure is complex, it typically cannot be resolved uniquely with short reads alone.

From the AA Readme:
Except in structurally simple cases, the decompositions reported by AA in the cycles file may represent computational substructures of a larger complete ecDNA of unknown structure. Because the signatures of ecDNA are still present in these substructures, for predictions on which entries are ecDNA-like, and for the predicted genome intervals captured on ecDNA please use AmpliconClassifier (AC) on your AA outputs to predict ecDNA status. Annotated versions of these cycles files indicating which cycles appear ecDNA-like, bed files and additional summary tables are directly produced by AC, simplifying interpretation. Most importantly however, individual AA cycles should not be interpreted as complete ecDNA reconstructions without first ruling out that these are substructures of the amplified regions.

In the case you showed, it would appear that the larger cycle (21) is more representative of the true ecDNA structure, and cycle 1 is some sort of substructure.

Thanks,
Jens
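
For readers who want to compare the annotated cycle entries programmatically, here is a minimal sketch that parses the two lines posted above and picks out the longer cycle. It assumes the `key=value;...` layout exactly as shown in this issue; the function name `parse_cycle_line` is hypothetical and not part of AA or AC. Picking the longest cycle only mirrors the reasoning in the reply above and is not a general rule for identifying the true ecDNA structure.

```python
# The two annotated cycle lines, copied from the issue text.
CYCLE_LINES = [
    "Cycle=1;Copy_count=19.6402746181;Length=19150;IsCyclicPath=True;"
    "CycleClass=ecDNA-like;Segments=62+,95+,50-,48-,84+,83-,60+,47+",
    "Cycle=21;Copy_count=3.77523588444;Length=3734287;IsCyclicPath=True;"
    "CycleClass=ecDNA-like;Segments=53+,76-,87-,56-",
]


def parse_cycle_line(line):
    """Parse one 'key=value;key=value' cycle entry into a dict (hypothetical helper)."""
    fields = dict(item.split("=", 1) for item in line.strip().split(";"))
    return {
        "cycle": int(fields["Cycle"]),
        "copy_count": float(fields["Copy_count"]),
        "length": int(fields["Length"]),
        "is_cyclic": fields["IsCyclicPath"] == "True",
        "cycle_class": fields["CycleClass"],
        # Each segment is an ID followed by its orientation, e.g. "62+".
        "segments": [(int(s[:-1]), s[-1]) for s in fields["Segments"].split(",")],
    }


cycles = [parse_cycle_line(line) for line in CYCLE_LINES]
ecdna_like = [c for c in cycles if c["cycle_class"] == "ecDNA-like"]
largest = max(ecdna_like, key=lambda c: c["length"])
print(largest["cycle"], largest["length"], largest["copy_count"])
```
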

@newpest
Author

newpest commented Apr 17, 2024

How can I obtain the structure of the ecDNA identified by AC from the classification_bed_files?

@jluebeck
Member

jluebeck commented Apr 17, 2024

The classification bed files are sorted by coordinate and are not helpful for structure determination. The exact structures of complex ecDNA are often only resolvable with long-read technologies, or by applying techniques like Circle-Seq or CRISPR-Catch.
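
Although the bed file does not encode segment order or orientation, it can still be used to summarize which regions are predicted to be on the ecDNA. The sketch below totals the interval lengths per chromosome for the bed entries posted in this issue. It assumes the standard half-open BED convention (length = end - start), which may not match AC's exact coordinate convention, and `footprint_by_chrom` is a hypothetical helper name.

```python
from collections import defaultdict

# Intervals copied from the bed file shown in the issue (chrom, start, end).
BED_TEXT = """\
20 45770081 45771921
20 45778344 45788294
20 52181786 53557332
20 54157523 54158002
20 54159266 54161359
20 55675464 55855043
20 62288769 62289400
17 60001725 60001897
17 60094504 60095816
17 60188227 60190908
17 60212997 62388060
"""


def footprint_by_chrom(bed_text):
    """Sum interval lengths per chromosome, assuming half-open coordinates."""
    totals = defaultdict(int)
    for line in bed_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        chrom, start, end = line.split()[:3]
        totals[chrom] += int(end) - int(start)
    return dict(totals)


totals = footprint_by_chrom(BED_TEXT)
total_bp = sum(totals.values())
print(totals, total_bp)
```

This gives the amplified footprint only; how the segments are connected still has to come from the cycles or graph output, or from long-read data as noted above.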
