interpreting the results of AmpliconClassifier #18

Open
eesiribloom opened this issue Nov 23, 2023 · 1 comment

@eesiribloom

I am struggling to fully understand the differences between the outputs of AA (AmpliconArchitect) and AC (AmpliconClassifier), and how to translate them into a presentation of the analysis results.
If I have an individual sample that appears to have ecDNA, how should I interpret a result such as:

Several amplicons from AA, some of which might contain "ecDNA" according to AC and multiple ecDNA-like cycles.

I understand that AC merges overlapping cycles/paths to estimate what might be an ecDNA. Firstly, what is an "amplicon" in comparison to the ecDNAs (anywhere from 1 to 5 or so) that AC reports within it? Within these there are several "ecDNA-like" cycles/paths, which I am a bit confused about. Are the ecDNAs reported by AC considered distinct? If I have a sample with, e.g., two ecDNA-containing amplicons, one with 5 ecDNAs and one with 1 ecDNA, should I interpret this single sample as containing 6 distinct ecDNA species? Essentially, what is the level of resolution at which I might say "this sample has this particular ecDNA made up of these particular genomic regions/intervals": the amplicon, the ecDNA, or the cycle?

@jluebeck
Member

Hi,

Thanks for reaching out with this great question. I think this is a very good point to clarify.

First, there is a bit of confusing terminology:

  • Focal amplification: A genomic amplification where all segments are amplified together as a single unit.
  • AA amplicon: A collection of genome regions where one or more focal amplifications exist in close proximity or which are connected by structural variants.

The distinction here is that an AA "amplicon" can contain multiple focal amplifications. When we report an ID for each feature, we give it in the format [sample]_[AA amplicon number]_[feature type]_[index of that feature in the AA amplicon]. For instance, sample_amplicon1_ecDNA_5 is the 5th ecDNA reported in the sample's AA amplicon1.
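As an illustration of the ID format described above, here is a minimal Python sketch that splits a feature ID into its four parts. It is not part of AmpliconClassifier; the names `FeatureID` and `parse_feature_id` are hypothetical, and the sketch assumes that the feature type itself contains no underscores (sample names may).

```python
from typing import NamedTuple


class FeatureID(NamedTuple):
    """Hypothetical container for the four parts of a feature ID."""
    sample: str
    amplicon: str       # e.g. "amplicon1"
    feature_type: str   # e.g. "ecDNA" or "BFB"
    index: int          # index of the feature within its AA amplicon


def parse_feature_id(feature_id: str) -> FeatureID:
    # Split from the right so that sample names containing underscores
    # stay intact. Assumption: the feature type has no underscores.
    sample, amplicon, feature_type, index = feature_id.rsplit("_", 3)
    return FeatureID(sample, amplicon, feature_type, int(index))


parsed = parse_feature_id("sample_amplicon1_ecDNA_5")
print(parsed.amplicon, parsed.feature_type, parsed.index)
```

Splitting from the right is a design choice for this sketch: the last three fields have a fixed shape, whereas the sample name is free-form.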

The multiple features reported in a single AA amplicon (e.g. ecDNA_1 and ecDNA_2) are indeed considered distinct by AC, as they are unconnected by SVs and/or are separated by >500 kbp. Features of different classes (e.g. ecDNA and BFB) can overlap genomically and still be considered distinct, as the patterns of SVs and copy numbers (CNs) that give rise to each classification are separate.

In your example, with one AA amplicon containing 5 ecDNAs and another containing 1 ecDNA, then yes, in total we should count 6 distinct ecDNAs predicted by AC. We consider events distinct at the "feature" level (ecDNA_1, ecDNA_2, etc.) and count across all amplicons in a sample. We don't count individual cycles, because multiple cycles can be detected from the same feature, particularly if the structure is complex and AA only extracts a collection of substructures of the larger ecDNA.
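The counting rule above (one count per ecDNA feature, summed across all amplicons of a sample, ignoring cycles) could be sketched as follows. This is only an illustration under stated assumptions: the function name `count_ecdna_per_sample` is hypothetical, the input is assumed to be a plain list of feature IDs in the [sample]_[amplicon]_[feature type]_[index] format, and the feature type is assumed to contain no underscores.

```python
from collections import Counter


def count_ecdna_per_sample(feature_ids):
    """Count distinct ecDNA features per sample across all AA amplicons."""
    counts = Counter()
    # De-duplicate first, so an ID listed once per cycle is counted once.
    for feature_id in set(feature_ids):
        sample, _amplicon, feature_type, _index = feature_id.rsplit("_", 3)
        if feature_type == "ecDNA":
            counts[sample] += 1
    return counts


# Made-up IDs mirroring the example: amplicon1 has 5 ecDNAs (plus a BFB),
# amplicon2 has 1 ecDNA.
ids = [f"sampleA_amplicon1_ecDNA_{i}" for i in range(1, 6)]
ids += ["sampleA_amplicon1_BFB_1", "sampleA_amplicon2_ecDNA_1"]
print(count_ecdna_per_sample(ids)["sampleA"])  # 6
```

The BFB feature is skipped because only features of type ecDNA are counted, and cycles never enter the count because the input is at the feature level.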

Please let me know if you have any additional questions; I'd be happy to answer them.

Thanks,
Jens
