
min_hash_distance

Young edited this page Feb 9, 2024 · 5 revisions

mash subworkflow

Mash dist is not designed to identify contamination (which is why there are optional Kraken2 and Blobtools subworkflows), but it can quickly search a large number of references for similar hits. As such, mash dist is used here to identify possible matching sequences rather than to characterize the reads. There is a plan to replace mash dist with mash screen in the future, but there is no firm completion date yet.

---
min hash distance
---
flowchart LR
fastq --> mash
contigs --> mash
mash --> A[other subworkflows]

A custom mash reference file can be supplied. Instructions are on a separate page in this wiki.

Relevant params with their default values:

// number of mash dist hits to save
params.mash_max_hits = 25
// specifying a different mash reference file
params.mash_db       = ""
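
As a rough sketch of how these defaults might be overridden, the two params above could be set in a small custom config file. The file name `mash_custom.config`, the hit count of 50, and the reference path are hypothetical placeholders, not values taken from this wiki.

```groovy
// mash_custom.config (hypothetical file name)

// keep more mash dist hits than the default of 25 (50 is an arbitrary example)
params.mash_max_hits = 50

// point to a custom mash reference file (placeholder path; replace with your own sketch file)
params.mash_db       = "/path/to/custom_reference.msh"
```

A file like this is normally passed to Nextflow with the `-c` option of `nextflow run`, which layers it over the workflow's default configuration. Leaving `params.mash_db` as an empty string keeps the workflow's default reference.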