Enable multi-lane support #20
Comments
Hi Marco, I think this is important (more so for WGS samples). We use an approach where an input YAML file is created that consists of a [...]. The input channel is then set up as follows:
def GetReadPair = { sk, rk ->
tuple(file(params.samples[sk].readunits[rk]['fq1']),
file(params.samples[sk].readunits[rk]['fq2']))
}
def GetReadUnitKeys = { sk ->
params.samples[sk].readunits.keySet()
}
Channel
.from(sample_keys)
.map { sk -> tuple(sk, GetReadUnitKeys(sk).collect{GetReadPair(sk, it)}.flatten()) }
.set { fastq_ch }
There might be more elegant ways. It probably makes sense to discuss this on Gitter...
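To make the data flow of the snippet above easier to follow, here is a minimal plain-Groovy sketch of the structure the closures appear to expect. The sample names, read-unit keys, and file names are hypothetical, the `samples` map stands in for `params.samples` as loaded from the YAML input, and plain strings replace Nextflow's `file(...)` objects so the sketch runs without Nextflow.

```groovy
// Hypothetical stand-in for params.samples; all names and paths are made up.
def samples = [
    sampleA: [readunits: [
        lane1: [fq1: 'sampleA_L001_R1.fastq.gz', fq2: 'sampleA_L001_R2.fastq.gz'],
        lane2: [fq1: 'sampleA_L002_R1.fastq.gz', fq2: 'sampleA_L002_R2.fastq.gz']]],
    sampleB: [readunits: [
        lane1: [fq1: 'sampleB_L001_R1.fastq.gz', fq2: 'sampleB_L001_R2.fastq.gz']]]
]

// Same shape as GetReadPair / GetReadUnitKeys, but returning plain strings.
def getReadPair = { sk, rk ->
    [samples[sk].readunits[rk]['fq1'], samples[sk].readunits[rk]['fq2']]
}
def getReadUnitKeys = { sk -> samples[sk].readunits.keySet() }

// One entry per sample: [sampleKey, flat list of FASTQ files across all read units].
def perSample = samples.keySet().collect { sk ->
    [sk, getReadUnitKeys(sk).collect { getReadPair(sk, it) }.flatten()]
}

perSample.each { println it }
```

Each sample ends up with a single flat list of files, so a downstream process receives all lanes of a sample at once; the read-unit keys themselves are dropped in this layout.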
In Sarek, we have one BWA-mem | samtools sort process
For read pairs we do this:
// Define channel for raw reads
if (pairedEnd) {
rawReads = designFilePaths
.splitCsv(sep: '\t', header: true)
.map { row -> [ row.sample_id, [row.fastq_read1, row.fastq_read2], row.experiment_id, row.biosample, row.factor, row.treatment, row.replicate, row.control_id ] }
} else {
rawReads = designFilePaths
.splitCsv(sep: '\t', header: true)
.map { row -> [ row.sample_id, [row.fastq_read1], row.experiment_id, row.biosample, row.factor, row.treatment, row.replicate, row.control_id ] }
}
Thanks @vsmalladi and @maxulysse for the references!
Some sequencing setups will split libraries across lanes. This is currently not modeled in the pipeline.
Using a CSV to keep track of IndividualID and sampleID, we could do something along these lines:
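One possible shape for this, as a minimal sketch rather than the pipeline's actual implementation: it assumes a hypothetical `params.design` CSV with one row per lane and the columns `IndividualID`, `sampleID`, `fastq_read1`, and `fastq_read2` (the last two names are borrowed from the design-file example in the comments). Rows sharing the same individual and sample are collected with Nextflow's `groupTuple` operator.

```groovy
// Assumed CSV layout (hypothetical): IndividualID,sampleID,fastq_read1,fastq_read2
// with one row per lane of a library.
Channel
    .fromPath(params.design)   // params.design is an assumed parameter name
    .splitCsv(header: true)
    .map { row ->
        tuple(row.IndividualID, row.sampleID,
              file(row.fastq_read1), file(row.fastq_read2))
    }
    // Group all lanes that share the same (IndividualID, sampleID) pair.
    .groupTuple(by: [0, 1])
    .set { reads_by_sample }

// Each emitted item has the form:
// [IndividualID, sampleID, [R1 files, one per lane], [R2 files, one per lane]]
reads_by_sample.view()
```

Whether lanes are concatenated before alignment or aligned per lane and merged afterwards is a separate design choice that this sketch does not settle.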