Could i use gfase for Flye (HERRO corrected ul-ONT) #30

Open
YouxinZhao opened this issue Oct 7, 2024 · 1 comment

Comments

@YouxinZhao

Could I use GFAse on the GFA produced by Flye (HERRO-corrected ul-ONT reads)?

@rlorigro
Owner

Hi, apologies for the late reply. The answer is likely no: in the past, Flye did not generate a GFA in the late stages of assembly with haplotypes resolved well enough for mapping reads. If that has changed, I would refer to these requirements from the publication:

  1. The cumulative consumed query/target length of a GFA overlap cigar must not exceed the length of the query/target node. This is not possible in a properly formatted GFA, but it has been observed in both hifiasm and Verkko overlaps. If this input error is encountered, it is recommended to use the --skip_unzip argument.
  2. Edges (L lines) are not required. For phasing an input that is not in graph form (e.g., in FASTA/Q format), it is possible to simply convert it to a GFA without any L lines, and invoke the --skip_unzip and --use_homology flags to bin the contigs into bin 1 or 2 or unphased. Homology will be inferred entirely independently of the graph edges.
  3. When using an alignment as input, nodes (“segments” or S lines) in the graph should be mappable. Short nodes that are insufficient in length to map a Hi-C or Pore-C subread (≥150 bp of nonrepetitive sequence) will not accumulate any contacts in the mapping step. Nodes without contacts will not be phased, and GFAse will put them in the “unphased” bin instead of using them to extend a phased chain. In addition to this, when using the homology-based alt detection, nodes must be long enough to confidently map to each other. We recommend starting with an assembly configuration that produces a phase N50 as large as possible without introducing any switch errors. For GFAs in which nodes are not mappable, the user may provide a custom contact map CSV as input.
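
As a rough pre-flight illustration of the three requirements above, here is a minimal Python sketch. It is not part of GFAse, and the function names (`fasta_to_gfa_lines`, `consumed_lengths`, `check_gfa`) are hypothetical. It assumes a GFA1 file with tab-separated S and L lines, treats the "from" segment of an L line as the target and the "to" segment as the query (an assumption; swap them if your assembler uses the opposite convention), and checks node length only, not repeat content.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
MIN_MAPPABLE_BP = 150  # length threshold quoted above; repeats are not checked


def fasta_to_gfa_lines(fasta_text):
    """Requirement 2: turn FASTA text into a GFA with S lines only (no L lines)."""
    lines, name, seq = ["H\tVN:Z:1.0"], None, []
    for raw in fasta_text.splitlines():
        raw = raw.strip()
        if raw.startswith(">"):
            if name is not None:
                lines.append(f"S\t{name}\t{''.join(seq)}")
            # Keep only the first word of the header as the segment name.
            name, seq = raw[1:].split()[0], []
        elif raw:
            seq.append(raw)
    if name is not None:
        lines.append(f"S\t{name}\t{''.join(seq)}")
    return lines


def consumed_lengths(cigar):
    """Return (target_bp, query_bp) consumed by a CIGAR string."""
    target = query = 0
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op in "M=X":      # consumes both sequences
            target += n
            query += n
        elif op in "DN":     # consumes target only
            target += n
        elif op in "IS":     # consumes query only
            query += n
    return target, query


def segment_length(fields):
    """Length of an S line, falling back to the LN:i tag when sequence is '*'."""
    if fields[2] != "*":
        return len(fields[2])
    for tag in fields[3:]:
        if tag.startswith("LN:i:"):
            return int(tag[5:])
    return 0


def check_gfa(gfa_lines):
    """Requirements 1 and 3: report over-long overlaps and short nodes."""
    lengths, links = {}, []
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":
            lengths[fields[1]] = segment_length(fields)
        elif fields[0] == "L":
            links.append(fields)
    bad_overlaps = []
    for fields in links:
        src, dst, cigar = fields[1], fields[3], fields[5]
        if cigar == "*" or src not in lengths or dst not in lengths:
            continue  # nothing to compare against
        target_bp, query_bp = consumed_lengths(cigar)
        if target_bp > lengths[src] or query_bp > lengths[dst]:
            bad_overlaps.append((src, dst, cigar))
    short_nodes = sorted(n for n, bp in lengths.items() if bp < MIN_MAPPABLE_BP)
    return bad_overlaps, short_nodes
```

If `check_gfa` reports any overlaps, that is the input error for which the quoted text recommends `--skip_unzip`; a GFA built with `fasta_to_gfa_lines` has no L lines, which is the case described for `--skip_unzip` together with `--use_homology`. Nodes listed as short are likely to end up in the "unphased" bin.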
