
Workflow to run FCS script that detects foreign genomic and adapter contamination


hloucks/FCS-WDL


WDL that runs FCS workflow

Hailey Loucks 5/26/23

contact hloucks@ucsc.edu

This workflow runs FCS-GX and FCS-adaptor and outputs the reports they generate, along with a gzipped FASTA in which contaminant sequences have been removed and adapter sequences hard masked.

Inputs

  • Assembly
  • FCS-GX database; download instructions here (complete the validation step before running)

If running locally, the inputs.json file should look something like this:

{
 "RunFCS.assembly": "Assembly.fa.gz",
 "RunFCS.blast_div": "/test-only/test-only.blast_div.tsv.gz",
 "RunFCS.GXI": "/test-only/test-only.gxi",
 "RunFCS.GXS": "/test-only/test-only.gxs",
 "RunFCS.manifest": "/test-only/test-only.manifest",
 "RunFCS.metaJSON": "/test-only/test-only.meta.jsonl",
 "RunFCS.seq_info": "/test-only/test-only.seq_info.tsv.gz",
 "RunFCS.taxa": "/test-only/test-only.taxa.tsv",
 "RunFCS.diskSizeGBGX": 500,
 "RunFCS.diskSizeGBAdapter": 32,
 "RunFCS.threadCount": 20,
 "RunFCS.preemptible": 1
}

The workflow localizes all of the database files; you can ignore the readme file.
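When the database files share a common directory and prefix, as in the example above, the inputs file can be generated rather than typed by hand. The following is a minimal sketch, not part of the workflow itself: the helper name `build_inputs` and its parameters are hypothetical, and the resource values are simply copied from the example inputs.json.

```python
import json
from pathlib import PurePosixPath

# Workflow input key -> database file suffix, as listed in the example inputs.json.
DB_SUFFIXES = {
    "RunFCS.blast_div": ".blast_div.tsv.gz",
    "RunFCS.GXI": ".gxi",
    "RunFCS.GXS": ".gxs",
    "RunFCS.manifest": ".manifest",
    "RunFCS.metaJSON": ".meta.jsonl",
    "RunFCS.seq_info": ".seq_info.tsv.gz",
    "RunFCS.taxa": ".taxa.tsv",
}


def build_inputs(assembly, db_dir, db_prefix):
    """Return a dict of workflow inputs (hypothetical helper).

    Assumes every database file lives in db_dir and is named
    db_prefix + suffix, as in the test-only example.
    """
    inputs = {"RunFCS.assembly": assembly}
    for key, suffix in DB_SUFFIXES.items():
        inputs[key] = str(PurePosixPath(db_dir) / (db_prefix + suffix))
    # Resource settings copied from the example; adjust for your environment.
    inputs.update({
        "RunFCS.diskSizeGBGX": 500,
        "RunFCS.diskSizeGBAdapter": 32,
        "RunFCS.threadCount": 20,
        "RunFCS.preemptible": 1,
    })
    return inputs


if __name__ == "__main__":
    example = build_inputs("Assembly.fa.gz", "/test-only", "test-only")
    with open("inputs.json", "w") as handle:
        json.dump(example, handle, indent=1)
```

Running the script writes an inputs.json equivalent to the example above; swap in your own assembly path, database directory, and prefix.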

Outputs

  • Assembly.clean.fasta.gz - assembly with contaminant contigs/scaffolds removed
  • Assembly.contam.fasta.gz - FASTA file containing the contaminant contigs/scaffolds
  • Assembly.fcs_gx_report.txt - FCS-GX report of the genomic contamination
  • Assembly.fa.adapterClean.fa.gz - cleaned assembly with adapter sequences hard masked
  • fcs_adaptor_report.txt - report of the adapter contamination identified

Notes

  • This workflow is hard-coded for human assemblies.
  • As of 6/13/23, the FCS-adaptor output is labeled .gz but is not actually gzipped, and this workflow reflects that. If FCS-adaptor fixes this, the workflow will need to be updated.
  • Output file names often include ".fa" because of the naming convention of the FCS-GX screen.
