POC using module classes v2 #6

Status: Open. Wants to merge 5 commits into base: dev
4 changes: 2 additions & 2 deletions README.md
@@ -63,14 +63,14 @@ Each row represents a set of sequences (in this case the seatoxin and toxin prot

The toolsheet specifies **which combination of tools will be deployed and benchmark in the pipeline**.
Each line of the toolsheet defines a combination of guide tree and multiple sequence aligner to run with the respective arguments to be used.
The only required field is `aligner`. The fields `tree`, `args_tree` and `args_aligner` are optional and can be left empty.
The only required field is `aligner`. The fields `tree`, `args_guidetree` and `args_aligner` are optional and can be left empty.

It should look at follows:

`toolsheet.csv`:

```csv
tree,args_tree,aligner,args_aligner,
tree,args_guidetree,aligner,args_aligner,
FAMSA, -gt upgma -medoidtree, FAMSA,
, ,TCOFFEE,
FAMSA,,REGRESSIVE,
6 changes: 3 additions & 3 deletions assets/multiqc_config.yml
@@ -117,11 +117,11 @@ custom_table_header_config:
hidden: False
namespace: "Alignment"
scale: "Paired"
args_tree:
args_guidetree:
description: "Arguments used to build the tree."
hidden: True
namespace: "Alignment"
args_tree_clean:
args_guidetree_clean:
description: "Arguments used to build the tree."
hidden: True
namespace: "Alignment"
@@ -143,7 +143,7 @@ table_columns_placement:
summary_stats:
fasta: 90
tree: 150
args_tree: 170
args_guidetree: 170
aligner: 200
args_aligner: 220
n_sequences: 250
30 changes: 21 additions & 9 deletions assets/schema_tools.json
@@ -7,31 +7,43 @@
"items": {
"type": "object",
"properties": {
"tree": {
"guidetree": {
"type": "string",
"pattern": "^\\S+$",
"errorMessage": "tree name cannot contain spaces",
"meta": ["tree"],
"meta": ["guidetree"],
"default": ""
},
"args_tree": {
"args_guidetree": {
"type": "string",
"meta": ["args_tree"],
"meta": ["args_guidetree"],
"default": ""
},
"aligner": {
"treealign": {
"type": "string",
"meta": ["aligner"],
"meta": ["treealign"],
"pattern": "^\\S+$",
"errorMessage": "align name must be provided and cannot contain spaces",
"default": ""
},
"args_aligner": {
"args_treealign": {
"type": "string",
"meta": ["args_aligner"],
"meta": ["args_treealign"],
"default": ""
},
"alignment": {
"type": "string",
"meta": ["alignment"],
"pattern": "^\\S+$",
"errorMessage": "align name must be provided and cannot contain spaces",
"default": ""
},
"args_alignment": {
"type": "string",
"meta": ["args_alignment"],
"default": ""
}
},
"required": ["aligner"]
"oneOf": [{ "required": ["alignment"] }, { "required": ["guidetree", "treealign"] }]
}
}
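
Note on the `oneOf` clause above: JSON Schema accepts an object under `oneOf` only when exactly one of the listed subschemas matches. Here that means a row either names an `alignment` tool, or supplies both `guidetree` and `treealign`, but never satisfies both conditions at once. The Python sketch below illustrates that rule only; the helper name `row_mode` is hypothetical, and treating an empty CSV cell as an absent field is an assumption about how the sheet is parsed, not something the schema states.

```python
def row_mode(row):
    """Classify a toolsheet row under the schema's oneOf rule.

    Assumption: an empty or whitespace-only cell counts as "not provided".
    """
    present = {key for key, value in row.items()
               if value is not None and str(value).strip()}

    has_alignment = "alignment" in present
    has_tree_pair = {"guidetree", "treealign"} <= present

    # oneOf means exactly one branch may match: both or neither is an error.
    if has_alignment == has_tree_pair:
        raise ValueError(
            "row must set either 'alignment' or both 'guidetree' and 'treealign'"
        )
    return "alignment" if has_alignment else "treealign"


if __name__ == "__main__":
    print(row_mode({"alignment": "famsa_align"}))
    print(row_mode({"guidetree": "famsa_guidetree", "treealign": "famsa_treealign"}))
```

A row that fills in all three of `guidetree`, `treealign` and `alignment` would be rejected under this reading, as would a completely empty row.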
21 changes: 21 additions & 0 deletions assets/test_toolsheet.csv
@@ -0,0 +1,21 @@
guidetree,args_guidetree,treealign,args_treealign,alignment,args_alignment
,,,,clustalo_align,
,,,,famsa_align,
,,,,foldmason_align,
,,,,kalign_align,
,,,,learnmsa_align,
,,,,magus_align,
,,,,mafft,
,,,,mafft, --dpparttree
,,,,muscle5_super5,
,,,,MTMALIGN,
,,,,REGRESSIVE,
,,,,REGRESSIVE,-reg_nseq 3
,,,,tcoffee_align,
,,,,UPP,
,,,,3DCOFFEE,
,,,,3DCOFFEE,-method TMalign_pair
famsa_guidetree,-gt upgma -medoidtree,famsa_treealign,
famsa_guidetree,,magus_treealign,
clustalo_align,,REGRESSIVE,
mafft,,FOLDMASON,
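
Note on the last four rows of this sheet: they carry four fields against a six-column header, so a reader has to decide what the missing `alignment` and `args_alignment` cells mean. The sketch below shows one way to load such a sheet with Python's standard `csv.DictReader`, filling the missing trailing columns with empty strings via `restval`. The function name `read_toolsheet` and the whitespace stripping are illustrative assumptions; this is not how the pipeline itself parses the file.

```python
import csv
import io

# Two rows copied from the test toolsheet above, plus the header.
SHEET = """guidetree,args_guidetree,treealign,args_treealign,alignment,args_alignment
,,,,famsa_align,
famsa_guidetree,-gt upgma -medoidtree,famsa_treealign,
"""


def read_toolsheet(text):
    """Parse toolsheet text into a list of dicts, one per row.

    restval="" fills columns that a short row does not supply.
    """
    reader = csv.DictReader(io.StringIO(text), restval="")
    return [{key: value.strip() for key, value in row.items()} for row in reader]


if __name__ == "__main__":
    for row in read_toolsheet(SHEET):
        print(row)
```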
2 changes: 1 addition & 1 deletion assets/toolsheet.csv
@@ -1,3 +1,3 @@
tree,args_tree,aligner,args_aligner
tree,args_guidetree,aligner,args_aligner
FAMSA,,FAMSA,
,,MAFFT,--dpparttree
2 changes: 1 addition & 1 deletion bin/merge_scores.py
@@ -5,7 +5,7 @@
import sys
import pandas as pd

merging_cols = ["id", "tree", "args_tree", "aligner", "args_aligner"]
merging_cols = ["id", "tree", "args_guidetree", "aligner", "args_aligner"]
scores_files = sys.argv[2:]
outfile = sys.argv[1]

4 changes: 2 additions & 2 deletions bin/shiny_app/shiny_app.py
@@ -24,10 +24,10 @@
def merge_tree_args(row):
if str(row["tree"]) == "DEFAULT":
return "None"
elif str(row["args_tree"]) == "default":
elif str(row["args_guidetree"]) == "default":
return str(row["tree"]) + " ()"
else:
return str(row["tree"]) + " (" + str(row["args_tree"]) + ")"
return str(row["tree"]) + " (" + str(row["args_guidetree"]) + ")"

inputfile["tree_args"] = inputfile.apply(merge_tree_args, axis=1)

70 changes: 43 additions & 27 deletions conf/modules.config
@@ -64,62 +64,78 @@
}

//
// Tree building
// Tree building (guidetree)
//

withName: "CLUSTALO_GUIDETREE|FAMSA_GUIDETREE" {
withName: "CLUSTALO_GUIDETREE|FAMSA_GUIDETREE|MAGUS_GUIDETREE" {
tag = {
[
"${meta.id}",
meta.args_tree ? "args: ${meta.args_tree}" : ""
meta.args_guidetree ? "args: ${meta.args_guidetree}" : ""
].join(' ').trim()
}
ext.prefix = { "${meta.id}_${meta.tree}-args-${meta.args_tree_clean}" }
ext.args = { "${meta.args_tree}" == "null" ? '' : "${meta.args_tree}" }
ext.prefix = { "${meta.id}_${meta.guidetree}-args-${meta.args_guidetree_clean}" }
ext.args = { "${meta.args_guidetree}" == "null" ? '' : "${meta.args_guidetree}" }
publishDir = [
path: { "${params.outdir}/trees/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

//
// Alignment from a tree (treealign)
//

withName: "CLUSTALO_TREEALIGN|FAMSA_TREEALIGN|MAGUS_TREEALIGN|TCOFFEE_TREEALIGN"{
tag = {
[
"${meta.id}",
meta.args_treealign ? "args: ${meta.args_treealign}" : ""
].join(' ').trim()
}
ext.prefix = { "${meta.id}_${meta.treealign}-args-${meta.args_treealign_clean}_${meta.guidetree}-args-${meta.args_guidetree_clean}" }
ext.args = { "${meta.args_treealign}" == "null" ? '' : "${meta.args_treealign}" }
publishDir = [
path: { "${params.outdir}/alignment/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

//
// Alignment
//

withName: "CREATE_TCOFFEETEMPLATE" {
ext.prefix = { "${meta.id}" }
}
withName: "CLUSTALO_ALIGN|FAMSA_ALIGN|FOLDMASON_EASYMSA|KALIGN_ALIGN|LEARNMSA_ALIGN|MAFFT_ALIGN|MAGUS_ALIGN|MUSCLE5_SUPER5|TCOFFEE_REGRESSIVE|TCOFFEE_ALIGN|TCOFFEE3D_ALIGN|UPP_ALIGN" {
withName: "CLUSTALO_ALIGN|FAMSA_ALIGN|FOLDMASON_EASYMSA|KALIGN_ALIGN|LEARNMSA_ALIGN|MAFFT|MAGUS_ALIGN|MUSCLE5_SUPER5|TCOFFEE_REGRESSIVE|TCOFFEE_ALIGN|TCOFFEE3D_ALIGN|UPP_ALIGN" {
tag = {
[
"${meta.id}",
meta.tree ? "tree: ${meta.tree}" : "",
meta.args_tree ? "argstree: ${meta.args_tree}" : "",
meta.args_aligner ? "args: ${meta.args_aligner}" : ""
meta.args_alignment ? "args: ${meta.args_alignment}" : ""
].join(' ').trim()
}
ext.prefix = { "${meta.id}_${meta.tree}-args-${meta.args_tree_clean}_${meta.aligner}-args-${meta.args_aligner_clean}" }
ext.args = { "${meta.args_aligner}" == "null" ? '' : "${meta.args_aligner}" }
if(params.skip_compression){
publishDir = [
path: { "${params.outdir}/alignment/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}
ext.prefix = { "${meta.id}_${meta.alignment}-args-${meta.args_alignment_clean}" }
ext.args = { "${meta.args_alignment}" == "null" ? '' : "${meta.args_alignment}" }
publishDir = [
path: { "${params.outdir}/trees/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

withName: "MTMALIGN_ALIGN" {
tag = {
[
"${meta.id}",
meta.tree ? "tree: ${meta.tree}" : "",
meta.args_tree ? "argstree: ${meta.args_tree}" : "",
meta.args_guidetree ? "argstree: ${meta.args_guidetree}" : "",
meta.args_aligner ? "args: ${meta.args_aligner}" : ""
].join(' ').trim()
}
ext.prefix = { "${meta.id}_${meta.tree}-args-${meta.args_tree_clean}_${meta.aligner}-args-${meta.args_aligner_clean}" }
ext.prefix = { "${meta.id}_${meta.guidetree}-args-${meta.args_guidetree_clean}_${meta.alignment ?: meta.treealign}-args-${meta.args_alignment_clean ?: meta.args_treealign_clean}" }
ext.args = { "${meta.args_aligner}" == "null" ? '' : "${meta.args_aligner}" }
if(params.skip_compression){
publishDir = [
@@ -174,21 +190,21 @@
//

withName: 'PARSE_IRMSD' {
ext.prefix = { "${meta.id}_${meta.tree}-args-${meta.args_tree_clean}_${meta.aligner}-args-${meta.args_aligner_clean}_irmsd" }
ext.prefix = { "${meta.id}_${meta.guidetree}-args-${meta.args_guidetree_clean}_${meta.alignment ?: meta.treealign}-args-${meta.args_alignment_clean ?: meta.args_treealign_clean}_irmsd" }
}

withName: 'TCOFFEE_ALNCOMPARE_SP' {
ext.prefix = { "${meta.id}_${meta.tree}-args-${meta.args_tree_clean}_${meta.aligner}-args-${meta.args_aligner_clean}_sp" }
ext.prefix = { "${meta.id}_${meta.guidetree}-args-${meta.args_guidetree_clean}_${meta.alignment ?: meta.treealign}-args-${meta.args_alignment_clean ?: meta.args_treealign_clean}_sp" }
ext.args = "-compare_mode sp"
}

withName: 'TCOFFEE_ALNCOMPARE_TC' {
ext.prefix = { "${meta.id}_${meta.tree}-args-${meta.args_tree_clean}_${meta.aligner}-args-${meta.args_aligner_clean}_tc" }
ext.prefix = { "${meta.id}_${meta.guidetree}-args-${meta.args_guidetree_clean}_${meta.alignment ?: meta.treealign}-args-${meta.args_alignment_clean ?: meta.args_treealign_clean}_tc" }
ext.args = "-compare_mode tc"
}

withName: 'TCOFFEE_IRMSD' {
ext.prefix = { "${meta.id}_${meta.tree}-args-${meta.args_tree_clean}_${meta.aligner}-args-${meta.args_aligner_clean}_irmsd" }
ext.prefix = { "${meta.id}_${meta.guidetree}-args-${meta.args_guidetree_clean}_${meta.alignment ?: meta.treealign}-args-${meta.args_alignment_clean ?: meta.args_treealign_clean}_irmsd" }
publishDir = [
path: { "${params.outdir}/evaluation/${task.process.tokenize(':')[-1].toLowerCase()}" },
mode: params.publish_dir_mode,
@@ -198,7 +214,7 @@
}

withName: "CALC_GAPS" {
ext.prefix = { "${meta.id}_${meta.tree}-args-${meta.args_tree_clean}_${meta.aligner}-args-${meta.args_aligner_clean}_gaps" }
ext.prefix = { "${meta.id}_${meta.guidetree}-args-${meta.args_guidetree_clean}_${meta.alignment ?: meta.treealign}-args-${meta.args_alignment_clean ?: meta.args_treealign_clean}_gaps" }
}

withName: "CONCAT_IRMSD" {
@@ -222,7 +238,7 @@
}

withName: 'TCOFFEE_TCS' {
ext.prefix = { "${meta.id}_${meta.tree}-args-${meta.args_tree_clean}_${meta.aligner}-args-${meta.args_aligner_clean}_tcs" }
ext.prefix = { "${meta.id}_${meta.guidetree}-args-${meta.args_guidetree_clean}_${meta.alignment ?: meta.treealign}-args-${meta.args_alignment_clean ?: meta.args_treealign_clean}_tcs" }
publishDir = [
path: { "${params.outdir}/evaluation/${task.process.tokenize(':')[-1].toLowerCase()}" },
mode: params.publish_dir_mode,
@@ -274,7 +290,7 @@
// Visualization
//
withName: 'FOLDMASON_MSA2LDDTREPORT' {
ext.prefix = { "${meta.id}_${meta.tree}-args-${meta.args_tree_clean}_${meta.aligner}-args-${meta.args_aligner_clean}" }
ext.prefix = { "${meta.id}_${meta.guidetree}-args-${meta.args_guidetree_clean}_${meta.alignment ?: meta.treealign}-args-${meta.args_alignment_clean ?: meta.args_treealign_clean}" }
publishDir = [
path: { "${params.outdir}/visualization" },
mode: params.publish_dir_mode,
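
Note on the new `ext.prefix` closures in this file: `meta.alignment ?: meta.treealign` uses Groovy's Elvis operator, which yields the left operand unless it is null or empty, in which case the right operand is used. The Python sketch below mirrors that fallback so the resulting file prefix is easy to predict. The names `groovy_str` and `output_prefix` are hypothetical, and the sample `meta` values in the usage lines (including `default` for the cleaned arguments) are illustrative assumptions rather than values taken from the pipeline.

```python
def groovy_str(value):
    """Render a value as Groovy string interpolation would (None becomes 'null')."""
    return "null" if value is None else str(value)


def output_prefix(meta, suffix=""):
    """Build the prefix used by the evaluation steps, with Elvis-style fallback."""
    # Python's `or` falls back on None or "" just like Groovy's `?:` does here.
    tool = meta.get("alignment") or meta.get("treealign")
    tool_args = meta.get("args_alignment_clean") or meta.get("args_treealign_clean")

    parts = [
        groovy_str(meta.get("id")),
        f"{groovy_str(meta.get('guidetree'))}-args-{groovy_str(meta.get('args_guidetree_clean'))}",
        f"{groovy_str(tool)}-args-{groovy_str(tool_args)}",
    ]
    return "_".join(parts) + suffix


if __name__ == "__main__":
    tree_row = {
        "id": "seatoxin",
        "guidetree": "famsa_guidetree",
        "args_guidetree_clean": "default",
        "treealign": "famsa_treealign",
        "args_treealign_clean": "default",
    }
    print(output_prefix(tree_row, "_sp"))
```

Under these assumptions, a row that sets only `alignment` leaves `guidetree` unset, so the guide tree part of the prefix renders as `null-args-null`.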
2 changes: 1 addition & 1 deletion conf/test.config
@@ -36,5 +36,5 @@ params {

// Input data
input = params.pipelines_testdata_base_path + 'multiplesequencealign/samplesheet/v1.1/samplesheet_test_af2.csv'
tools = params.pipelines_testdata_base_path + 'multiplesequencealign/toolsheet/v1.0/toolsheet_full.csv'
tools = "${projectDir}/assets/test_toolsheet.csv"
}
2 changes: 1 addition & 1 deletion conf/test_full.config
@@ -37,5 +37,5 @@ params {

// Input data for full size test
input = params.pipelines_testdata_base_path + 'multiplesequencealign/samplesheet/v1.1/samplesheet_full.csv'
tools = params.pipelines_testdata_base_path + 'multiplesequencealign/toolsheet/v1.0/toolsheet_full.csv'
tools = "${projectDir}/assets/test_toolsheet.csv"
}
2 changes: 1 addition & 1 deletion conf/test_parameters.config
@@ -26,5 +26,5 @@ params {

// Input data
input = params.pipelines_testdata_base_path + 'multiplesequencealign/samplesheet/v1.1/samplesheet_test_af2.csv'
tools = params.pipelines_testdata_base_path + 'multiplesequencealign/toolsheet/v1.0/toolsheet_full.csv'
tools = "${projectDir}/assets/test_toolsheet.csv"
}
18 changes: 9 additions & 9 deletions docs/usage.md
@@ -131,14 +131,14 @@ Each line of the toolsheet defines a combination of guide tree and multiple sequ
A typical toolsheet should look at follows:

```csv title="toolsheet.csv"
tree,args_tree,aligner,args_aligner,
tree,args_guidetree,aligner,args_aligner,
FAMSA, -gt upgma -medoidtree, FAMSA,
, ,TCOFFEE,
FAMSA,,REGRESSIVE,
```

:::note
Each of the trees and aligners are available as standalones. You can leave `args_tree` and `args_aligner` empty if you are cool with the default settings of each method. Alternatively, you can leave `args_tree` empty to use the default guide tree with each aligner.
Each of the trees and aligners are available as standalones. You can leave `args_guidetree` and `args_aligner` empty if you are cool with the default settings of each method. Alternatively, you can leave `args_guidetree` empty to use the default guide tree with each aligner.
:::

:::note
@@ -147,18 +147,18 @@ use the exact spelling as listed above in [align](#3-align) and [guide trees](#2

`tree` is the tool used to build the tree (optional).

Arguments to the tree tool can be provided using `args_tree`. Please refer to each tool's documentation (optional).
Arguments to the tree tool can be provided using `args_guidetree`. Please refer to each tool's documentation (optional).

The `aligner` column contains the tool to run the alignment (optional).

Finally, the arguments to the aligner tool can be set by using the `args_aligner` column (optional).

| Column | Description |
| -------------- | -------------------------------------------------------------------------------- |
| `tree` | Optional. Tool used to build the tree. |
| `args_tree` | Optional. Arguments to the tree tool. Please refer to each tool's documentation. |
| `aligner` | Required. Tool to run the alignment. Available options listed above. |
| `args_aligner` | Optional. Arguments to the alignment tool. |
| Column | Description |
| ---------------- | -------------------------------------------------------------------------------- |
| `tree` | Optional. Tool used to build the tree. |
| `args_guidetree` | Optional. Arguments to the tree tool. Please refer to each tool's documentation. |
| `aligner` | Required. Tool to run the alignment. Available options listed above. |
| `args_aligner` | Optional. Arguments to the alignment tool. |

## Running the pipeline
