Add support for generating taxprofiler/funcscan input samplesheets for preprocessed FASTQs/FASTAs #688
base: dev
Also, since the FastQ files are being pulled from the publishDir, it might be a good idea to include options that override the user's inputs for params.publish_dir_mode (so that it is always 'copy' when a samplesheet is generated) and for params.save_clipped_reads, params.save_phixremoved_reads, etc., so that the preprocessed FastQ files are always published to params.outdir when a downstream samplesheet is generated. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,8 +7,7 @@ workflow SAMPLESHEET_TAXPROFILER { | |
ch_reads | ||
|
||
main: | ||
format = 'csv' // most common format in nf-core | ||
format_sep = ',' | ||
format = 'csv' | ||
|
||
def fastq_rel_path = '/' | ||
if (params.bbnorm) { | ||
|
@@ -36,7 +35,7 @@ workflow SAMPLESHEET_TAXPROFILER { | |
} | ||
.tap{ ch_colnames } | ||
jfy133 marked this conversation as resolved.
|
||
|
||
channelToSamplesheet(ch_colnames, ch_list_for_samplesheet, 'downstream_samplesheets', 'taxprofiler', format, format_sep) | ||
channelToSamplesheet(ch_list_for_samplesheet, "${params.outdir}/downstream_samplesheets/mag", format) | ||
|
||
} | ||
|
||
|
@@ -45,8 +44,7 @@ workflow SAMPLESHEET_FUNCSCAN { | |
ch_assemblies | ||
|
||
main: | ||
format = 'csv' // most common format in nf-core | ||
format_sep = ',' | ||
format = 'csv' | ||
|
||
ch_list_for_samplesheet = ch_assemblies | ||
The next thing, which I don't think will be too complicated, is to add another input channel for bins, and to add an if/else statement here depending on whether they want to send the raw assemblies (all contigs) or the binned contigs to the samplesheet. It will need another pipeline-level parameter too, though. |
||
.map { | ||
|
@@ -57,8 +55,7 @@ workflow SAMPLESHEET_FUNCSCAN { | |
} | ||
.tap{ ch_colnames } | ||
|
||
channelToSamplesheet(ch_colnames, ch_list_for_samplesheet, 'downstream_samplesheets', 'funcscan', format, format_sep) | ||
|
||
channelToSamplesheet(ch_list_for_samplesheet, "${params.outdir}/downstream_samplesheets/funcscan", format) | ||
} | ||
|
||
workflow GENERATE_DOWNSTREAM_SAMPLESHEETS { | ||
|
@@ -78,14 +75,17 @@ workflow GENERATE_DOWNSTREAM_SAMPLESHEETS { | |
} | ||
} | ||
|
||
// Constructs the header string and then the strings of each row, and | ||
def channelToSamplesheet(ch_header, ch_list_for_samplesheet, outdir_subdir, pipeline, format, format_sep) { | ||
def channelToSamplesheet(ch_list_for_samplesheet, path, format) { | ||
def format_sep = [csv: ",", tsv: "\t", txt: "\t"][format] | ||
|
||
def ch_header = ch_list_for_samplesheet | ||
|
||
ch_header | ||
.first() | ||
.map{ it.keySet().join(format_sep) } | ||
.concat( ch_list_for_samplesheet.map{ it.values().join(format_sep) }) | ||
.map { it.keySet().join(format_sep) } | ||
.concat(ch_list_for_samplesheet.map { it.values().join(format_sep) }) | ||
.collectFile( | ||
name:"${params.outdir}/${outdir_subdir}/${pipeline}.${format}", | ||
name: "${path}.${format}", | ||
newLine: true, | ||
sort: false | ||
) | ||
|
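For readers following the refactored `channelToSamplesheet` above, here is a minimal plain-Groovy sketch of the same header-plus-rows idea, applied to an ordinary list of maps instead of a Nextflow channel. The helper name `rowsToSamplesheet` and the example rows are hypothetical and not part of this PR. The sketch also raises an error for an unknown format, which is an added assumption: in the diff, the separator lookup would simply return `null` in that case.

```groovy
// Hypothetical helper (not part of the PR): builds samplesheet text from a
// list of maps, mirroring the keySet()/values() joins used in the diff.
def rowsToSamplesheet(List<Map> rows, String format) {
    // Same separator lookup as in the diff
    def separators = [csv: ',', tsv: '\t', txt: '\t']
    def sep = separators[format]
    if (sep == null) {
        // Assumption: fail loudly instead of joining with a null separator
        throw new IllegalArgumentException("Unsupported samplesheet format: ${format}")
    }
    if (rows.isEmpty()) {
        return ''
    }
    // Header comes from the keys of the first row; map literals keep insertion order
    def header = rows.first().keySet().join(sep)
    // One line per row, values in the same key order
    def body = rows.collect { it.values().join(sep) }
    return ([header] + body).join('\n')
}

// Example rows with made-up sample names and paths
def rows = [
    [sample: 'sample1', fasta: 'assemblies/sample1.fa.gz'],
    [sample: 'sample2', fasta: 'assemblies/sample2.fa.gz'],
]
println rowsToSamplesheet(rows, 'csv')
```

As in the diff, the header is taken from the first row only, so this only produces a consistent sheet if every row has the same keys in the same order.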
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -118,6 +118,11 @@ workflow PIPELINE_INITIALISATION { | |
// | ||
validateInputParameters( | ||
hybrid | ||
jfy133 marked this conversation as resolved.
|
||
|
||
// Validate samplesheet generation parameters | ||
if (params.generate_downstream_samplesheets && !params.generate_pipeline_samplesheets) { | ||
error('[nf-core/createtaxdb] If supplying `--generate_downstream_samplesheets`, you must also specify which pipeline to generate for with `--generate_pipeline_samplesheets! Check input.') | ||
Should this be nf-core/mag? |
||
} | ||
) | ||
|
||
// Validate PRE-ASSEMBLED CONTIG input when supplied | ||
|
It looks like @jfy133 used only one workflow, which will selectively generate samplesheets based on params.generate_pipeline_samplesheets. Do you think it would be best to keep that consistent?