Update README.md

FAANG · Jun 4, 2021 · commit 535c74b
1 parent 672d01f
Showing 1 changed file with 27 additions and 28 deletions.
diff --git a/README.md b/README.md
@@ -4,15 +4,14 @@
 
 To use this pipeline, simply clone or download this repository, and install the dependencies:
 
-- [Nextflow](https://www.nextflow.io/docs/latest/getstarted.html) >= 20.01.0
-- [Docker](https://docs.docker.com/engine/install/) >= 19.03.2 or [Singularity](https://sylabs.io/guides/3.5/user-guide/quick_start.html) >= 3.4
-
+- [Nextflow](https://www.nextflow.io/docs/latest/getstarted.html) >= 21.04.1
+- [Docker](https://docs.docker.com/engine/install/) >= 19.03.2 or [Singularity](https://sylabs.io/guides/3.5/user-guide/quick_start.html) >= 3.7.3
 
 ## Usage
 
 Try out this nextflow pipeline with:
 
-    ./nextflow-run FAANG/analysis-TAGADA --revision 0.3.1 --output directory --profile test docker
+    ./nextflow-run FAANG/analysis-TAGADA --revision 1.0.0 --output directory --profile test docker
 
 The `./nextflow-run` launcher script replaces the `nextflow run` command and grants these benefits:
 - Options can receive multiple space-separated parameters and unquoted globs.
@@ -25,17 +24,17 @@ The `./nextflow-run` launcher script replaces the `nextflow run` command and gra
 
 | Option | Parameter(s) | Description | Requirement |
 |--------|--------------|-------------|-------------|
-| __`--profile`__ | `<profile1>` `<profile2>` `...` | Profile(s) to use when<br>running the pipeline.<br>Specify the profiles that<br>fit your infrastructure<br>among `singularity`,<br>`docker`, `kubernetes`,<br>`slurm`. | Required |
-| __`--output`__ | `<directory>` | Output directory where<br>all temporary files, logs,<br>and results are written. | Required |
-| __`--reads`__ | `<reads.fq>` `<*.bam>` `...` | Input `fastq` file(s)<br>and/or `bam` file(s).<br><br>For single-end reads,<br>name your files:<br>`name.fq[.gz]`<br><br>For paired-end reads,<br>name your files:<br>`name_[R]{1,2}.fq[.gz]`<br><br>For mapped reads,<br>name your files:<br>`name.bam`<br><br>You may also provide urls<br>of files to be downloaded.<br><br>If the files are numerous,<br>you may provide a `.txt`<br>sheet with one path or url<br>per line. | Required |
-| __`--annotation`__ | `<annotation.gtf[.gz]>` | Input reference<br>annotation file or url. | Required |
-| __`--genome`__ | `<genome.fa[.gz]>` | Input genome<br>sequence file or url. | Required |
-| __`--index`__ | `<directory[.tar.gz]>` | Input genome index<br>directory or url. | Optional to skip<br>genome indexing. |
-| __`--metadata`__ | `<metadata.tsv>` | Input tabulated<br>metadata file or url. | Required if `--merge`<br>is provided. |
-| __`--merge`__ | `<factor1>` `<factor2>` `...` | Factor(s) to merge<br>mapped reads. See<br>the [merge factors](https://github.com/FAANG/analysis-TAGADA#merge-factors)<br>section for details. | Optional |
-| __`--max-cpus`__ | `<16>` | Maximum number of<br>CPU cores that can be<br>used for each process.<br>This is a limit, not the<br>actual number of<br>requested CPU cores. | Optional |
-| __`--max-memory`__ | `<64GB>` | Maximum memory that<br>can be used for each<br>process. This is a limit,<br>not the actual amount<br>of allotted memory. | Optional |
-| __`--max-time`__ | `<12h>` | Maximum time that can<br>be spent on each<br>process. This is a limit<br>and has no effect on the<br>duration of each process. | Optional |
+| __`--profile`__ | `profile1` `profile2` `...` | Profile(s) to use when<br>running the pipeline.<br>Specify the profiles that<br>fit your infrastructure<br>among `singularity`,<br>`docker`, `kubernetes`,<br>`slurm`. | Required |
+| __`--output`__ | `directory` | Output directory where<br>all temporary files, logs,<br>and results are written. | Required |
+| __`--reads`__ | `reads.fq` `*.bam` `...` | Input `fastq` file(s)<br>and/or `bam` file(s).<br><br>For single-end reads,<br>your files must end with:<br>`.fq[.gz]`<br><br>For paired-end reads,<br>your files must end with:<br>`_[R]{1,2}.fq[.gz]`<br><br>For mapped reads,<br>your files must end with:<br>`.bam`<br><br>You may also provide urls<br>of files to be downloaded.<br><br>If the files are numerous,<br>you may provide a `.txt`<br>sheet with one path or url<br>per line. | Required |
+| __`--annotation`__ | `annotation.gtf[.gz]` | Input reference<br>annotation file or url. | Required |
+| __`--genome`__ | `genome.fa[.gz]` | Input genome<br>sequence file or url. | Required |
+| __`--index`__ | `directory[.tar.gz]` | Input genome index<br>directory or url. | Optional to skip<br>genome indexing. |
+| __`--metadata`__ | `metadata.tsv` | Input tabulated<br>metadata file or url. | Required if `--merge`<br>is provided. |
+| __`--merge`__ | `factor1` `factor2` `...` | Factor(s) to merge<br>mapped reads. See<br>the [merge factors](https://github.com/FAANG/analysis-TAGADA#merge-factors)<br>section for details. | Optional |
+| __`--max-cpus`__ | `16` | Maximum number of<br>CPU cores that can be<br>used for each process.<br>This is a limit, not the<br>actual number of<br>requested CPU cores. | Optional |
+| __`--max-memory`__ | `64GB` | Maximum memory that<br>can be used for each<br>process. This is a limit,<br>not the actual amount<br>of allotted memory. | Optional |
+| __`--max-time`__ | `12h` | Maximum time that can<br>be spent on each<br>process. This is a limit<br>and has no effect on the<br>duration of each process. | Optional |
 | __`--resume`__ | | Preserve temporary files<br>and resume the pipeline<br>from the last completed<br>process. If this option is<br>absent, temporary files<br>will be deleted upon<br>completion, and the<br>pipeline will not be<br>resumable. | Optional |
 | __`--feelnc-args`__ | `'--mode shuffle ...'` | Custom arguments to<br>pass to FEELnc's<br>[coding potential](https://github.com/tderrien/FEELnc#2--feelnc_codpotpl) script<br>when detecting long<br>non-coding RNAs. | Optional |
 | __`--skip-feelnc`__ | | Skip the detection of long<br>non-coding RNAs with FEELnc. | Optional |
@@ -74,25 +73,25 @@ With the following arguments:
 
 ## Workflow
 
-The pipeline executes the following processes:
-1. Control reads __quality__ with [FastQC](https://github.com/s-andrews/FastQC).  
-   Outputs quality reports to `output/quality/raw`.
-2. __Trim__ adapters from reads with [Trim Galore](https://github.com/FelixKrueger/TrimGalore).  
-   Outputs quality reports to `output/quality/trimmed`.
+The pipeline executes the following main processes:
+1. Control reads __quality__ with [FastQC](https://github.com/s-andrews/FastQC).
+2. __Trim__ adapters from reads with [Trim Galore](https://github.com/FelixKrueger/TrimGalore).
 3. Estimate __overhang__ length of splice junctions, and __index__ the genome sequence with [STAR](https://github.com/alexdobin/STAR).  
-   Outputs indexed genome to `output/index`.
-4. __Map__ reads to indexed genome with [STAR](https://github.com/alexdobin/STAR).  
-   Outputs mapped reads to `output/maps`.
+   The indexed genome is saved to `output/index`.
+4. __Map__ reads to the indexed genome with [STAR](https://github.com/alexdobin/STAR).  
+   The mapped reads are saved to `output/maps`.
 5. Estimate __direction__ and __length__ of mapped reads, and compute genome __coverage__ with [Bedtools](https://github.com/arq5x/bedtools2).  
-   Outputs bedGraph files to `output/coverage`.
+   The coverage files are saved to `output/coverage`.
 6. __Merge__ mapped reads by factors with [Samtools](https://github.com/samtools/samtools).  
    See the [merge factors](#merge-factors) section for details.
-7. __Assemble__ transcripts and __combine__ them into a new assembly annotation with [StringTie](https://github.com/gpertea/stringtie).  
-   Outputs the new assembly annotation to `output/assembly`.
+7. __Assemble__ transcripts and __combine__ them into a novel annotation with [StringTie](https://github.com/gpertea/stringtie).  
+   The novel annotation is saved to `output/annotation`.
 8. __Detect__ long non-coding RNAs with [FEELnc](https://github.com/tderrien/FEELnc).  
-   Outputs the annotated long non-coding RNAs to `output/assembly`.
+   The annotations of long non-coding RNAs are saved to `output/annotation`.
 9. __Quantify__ genes and transcripts with [StringTie](https://github.com/gpertea/stringtie), and __format__ them into tabulated files.  
-   Outputs TPM values and read counts to `output/quantification` for the reference and assembly annotations.
+   The TPM values and read counts for each annotation are saved to `output/quantification`.
+10. Aggregate quality controls into a __report__ with [MultiQC](https://github.com/ewels/MultiQC).  
+    The report is saved to `output/control`.
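
As a rough illustration of the output layout described in the workflow list, the following Python sketch reports which of the named result directories are absent after a run. The helper name `missing_outputs` and the constant `EXPECTED_DIRS` are hypothetical and not part of the pipeline; the directory names are taken from the list above, and some of them may legitimately be absent depending on the options used.

```python
from pathlib import Path

# Result directories named in the workflow list (assumption: the pipeline
# may also write other directories that are not checked here).
EXPECTED_DIRS = [
    "index",
    "maps",
    "coverage",
    "annotation",
    "quantification",
    "control",
]


def missing_outputs(output_dir):
    """Return the expected result directories that do not exist in output_dir."""
    root = Path(output_dir)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

For example, `missing_outputs("directory")` could be called after the test run shown in the Usage section, where `directory` is the value passed to `--output`. An empty list means every directory named above was found.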
 
 
 ## About this project