[WIP] Add fastq_util tool fastq_pre_barcodes to qc dir #252
Open
irisdianauy wants to merge 21 commits into develop from feature/fastq_utils_xml
Changes from 1 commit
Commits
81bb001 Add fastq_util tool fastq_pre_barcodes to qc dir (irisdianauy)
519d130 Create .shed.yml (pcm32)
51b56db Cleanup and add help and tests (irisdianauy)
944d448 Add tool profile, version, and citation (irisdianauy)
5cbca26 Remove comment (irisdianauy)
3896d3d Add review suggestions (irisdianauy)
74f94c1 Edit info in .shed.yml (irisdianauy)
d3b9367 Add script to fetch test data (irisdianauy)
cc6ffb9 Add edits from planemo test debugging (irisdianauy)
1be446b Add working tests (irisdianauy)
b46612e Add test data (irisdianauy)
2a0db33 Edit link in get_test_data.sh (irisdianauy)
7fa3a6c Cleanup (irisdianauy)
e8d4ca2 Correct base link (irisdianauy)
c40fc6a Correct package version, remove not needed packages (irisdianauy)
e9f8adb Delete test files (irisdianauy)
36546ac Add fastqsanger.gz format (irisdianauy)
a01f3a4 Change output compare to sim_size (irisdianauy)
99f06b9 Correct a file link (irisdianauy)
ca725ae Revert back to output compare diff (irisdianauy)
a47d961 Compare outputs by md5 (irisdianauy)
@@ -0,0 +1,145 @@
<tool id="fastq_pre_barcodes" name="FASTQ barcodes preprocessor" profile="10" version="conda-package-version+galaxy0">
    <description>Preprocesses the reads to move the barcodes (UMI, Cell, ...) to the respective readname, optionally discarding reads with bases in the barcode regions below a given threshold.</description>
    <command detect_errors="exit_code"><![CDATA[
#set params_optional = []

#if $read2:
    ${params_optional}.append($read2)
#end if

#if $index1:
    ${params_optional}.append($index1)
#end if
pcm32 marked this conversation as resolved.

#if $index2:
    ${params_optional}.append($index2)
#end if

#if $index3:
    ${params_optional}.append($index3)
#end if

#if $phred_encoding:
    ${params_optional}.append($phred_encoding)
#end if

#if $min_qual:
    ${params_optional}.append($min_qual)
#end if

#if $outfile2:
    ${params_optional}.append($outfile2)
#end if

#if $outfile3:
    ${params_optional}.append($outfile3)
#end if

#if $interleaved:
    ${params_optional}.append($interleaved)
#end if

#if $umi_read:
    ${params_optional}.append($umi_read)
#end if

#if $umi_offset:
    ${params_optional}.append($umi_offset)
#end if

#if $umi_size:
    ${params_optional}.append($umi_size)
#end if

#if $Cell_read:
    ${params_optional}.append($Cell_read)
#end if

#if $Cell_offset:
    ${params_optional}.append($Cell_offset)
#end if

#if $Cell_size:
    ${params_optional}.append($Cell_size)
#end if

#if $sample_read:
    ${params_optional}.append($sample_read)
#end if

#if $sample_offset:
    ${params_optional}.append($sample_offset)
#end if

#if $sample_size:
    ${params_optional}.append($sample_size)
#end if

#if $read1_offset:
    ${params_optional}.append($read1_offset)
#end if

#if $read1_size:
    ${params_optional}.append($read1_size)
#end if

#if $read2_offset:
    ${params_optional}.append($read2_offset)
#end if

#if $read2_size:
    ${params_optional}.append($read2_size)
#end if

#if $use_10x:
    ${params_optional}.append($use_10x)
#end if

#if $brief:
    ${params_optional}.append($brief)
#elif $verbose:
    ${params_optional}.append($verbose)
#end if

#set params_optional_str = " ".join($params_optional)

fastq_pre_barcodes --read1 $read1 --outfile $outfile1 $params_optional_str
pcm32 marked this conversation as resolved.

    ]]></command>
    <inputs>
        <param label="Verbose" optional='true' value='false' name="verbose" argument="--verbose" type="boolean" truevalue='--verbose' falsevalue='' checked='true' help="Increase level of messages printed to stderr"/>
        <param label="Brief" optional='true' value='true' name="brief" argument="--brief" type="boolean" truevalue='--brief' falsevalue='' checked='true' help="Decrease level of messages printed to stderr"/>
        <param label="Read1" name="read1" argument="--read1" type="data" format='?' help="fastq (optional gzipped) file name"/>
        <param label="Read2" name="read2" argument="--read2" type="data" format='?' help="fastq (optional gzipped) file name"/>
        <param label="Index1" name="index1" argument="--index1" type="data" format='?' help="fastq (optional gzipped) file name"/>
        <param label="Index2" name="index2" argument="--index2" type="data" format='?' help="fastq (optional gzipped) file name"/>
        <param label="Index3" name="index3" argument="--index3" type="data" format='?' help="fastq (optional gzipped) file name"/>
        <param label="PHRED Encoding" name="phred_encoding" argument="--phred_encoding" type="select" help="PHRED encoding used in the input files">
            <option value="33" selected="true">33</option>
            <option value="64">64</option>
        </param>
        <param label="Minimum Quality" optional='true' value='' name="min_qual" argument="--min_qual" type="integer" min="0" max="40" help="[0-40]. Defines the minimum quality that all bases in the UMI, Cell or Sample should have (reads that do not pass the criteria are discarded). 0 disables the filter."/>
        <param label="Interleaved Data" name="interleaved" argument="--interleaved" type="text" help="Interleaved data, in this format: (read1|read2|index1|index2|index3),(read1|read2|index1|index2|index3)"/>
        <param label="UMI read" name="umi_read" argument="--umi_read" type="text" help="File in which UMI read can be found, in this format: (read1|read2|index1|index2|index3)"/>
        <param label="UMI offset" name="umi_offset" argument="--umi_offset" type="integer" help="Offset (integer)"/>
        <param label="UMI Size" name="umi_size" argument="--umi_size" type="integer" help="Number of bases after the offset"/>
        <param label="Cell Read" name="Cell_read" argument="--Cell_read" type="text" help="File in which Cell can be found, in this format: (read1|read2|index1|index2|index3)"/>
        <param label="Cell Offset" name="Cell_offset" argument="--Cell_offset" type="integer" help="Offset"/>
        <param label="Cell Size" name="Cell_size" argument="--Cell_size" type="integer" help="Number of bases after the offset"/>
        <param label="Sample Read" name="sample_read" argument="--sample_read" type="text" help="File in which sample barcode can be found, in this format: (read1|read2|index1|index2|index3)"/>
        <param label="Sample Offset" name="sample_offset" argument="--sample_offset" type="integer" help="Offset"/>
        <param label="Sample Size" name="sample_size" argument="--sample_size" type="integer" help="Number of bases after the offset"/>
        <param label="read1 Offset" name="read1_offset" argument="--read1_offset" type="integer" help="None"/>
        <param label="read1 Size" name="read1_size" argument="--read1_size" type="integer" help="None"/>
        <param label="read2 Offset" name="read2_offset" argument="--read2_offset" type="integer" help="None"/>
        <param label="read2 Size" name="read2_size" argument="--read2_size" type="integer" help="None"/>
        <param label="Use 10x tags" name="use_10x" argument="--10x" type="text" help="Use 10X UMI tags (UB and UY) instead of the default tags defined in the SAM specification"/>
    </inputs>
    <outputs>
        <data label="${tool.name} on ${on_string}: Output file 1" name="outfile1" format='?' />
        <data label="${tool.name} on ${on_string}: Output file 2" name="outfile2" format='?' />
        <data label="${tool.name} on ${on_string}: Output file 3" name="outfile3" format='?' />
Also, the format needs to be set here; please see Galaxy datatypes in the Galaxy docs.
Done in 3896d3d.

    </outputs>
    <tests></tests>
    <help></help>
</tool>
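For readers following the resolved conversations on the command block and the format comment on the outputs, here is a minimal sketch of one common Galaxy pattern. It is not the code that later commits in this PR introduced. Each optional flag is emitted inside its own Cheetah `#if` block instead of being collected in a list, and data parameters and outputs name a concrete datatype. The flag names come from the `argument` attributes in the diff, and `--outfile` is kept as written there. The tool id, the `optional="true"` attributes, and the choice of `fastqsanger,fastqsanger.gz` (suggested only by the commit message "Add fastqsanger.gz format") are assumptions.

```xml
<!-- Hypothetical, trimmed-down variant of the tool; not the merged version. -->
<tool id="fastq_pre_barcodes_sketch" name="FASTQ barcodes preprocessor (sketch)" version="0.1.0">
    <command detect_errors="exit_code"><![CDATA[
fastq_pre_barcodes
    --read1 '$read1'
    --outfile '$outfile1'
    ## Emit each optional flag only when the user supplied a value.
    #if $read2
        --read2 '$read2'
    #end if
    #if str($min_qual) != ''
        --min_qual $min_qual
    #end if
    #if $umi_read
        --umi_read '$umi_read'
    #end if
    ## A boolean parameter expands to its truevalue or falsevalue.
    $brief
    ]]></command>
    <inputs>
        <!-- Datatype names are assumptions based on the commit "Add fastqsanger.gz format". -->
        <param name="read1" argument="--read1" type="data" format="fastqsanger,fastqsanger.gz" label="Read1"/>
        <param name="read2" argument="--read2" type="data" format="fastqsanger,fastqsanger.gz" optional="true" label="Read2"/>
        <param name="min_qual" argument="--min_qual" type="integer" min="0" max="40" optional="true" label="Minimum Quality"/>
        <param name="umi_read" argument="--umi_read" type="text" optional="true" label="UMI read"/>
        <param name="brief" argument="--brief" type="boolean" truevalue="--brief" falsevalue="" checked="false" label="Brief"/>
    </inputs>
    <outputs>
        <data name="outfile1" format="fastqsanger.gz" label="${tool.name} on ${on_string}: Output file 1"/>
    </outputs>
</tool>
```

Writing the flags inline keeps the generated command line readable in the job details and avoids building strings inside Cheetah. The remaining parameters from the diff would follow the same `#if` pattern.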
Missing the requirements as well (the bioconda package that this will use to run)
Done in 51b56db but I'm not sure if I referenced the correct version for samtools.
I suspect that will be hard to know. @pinin4fjords might be able to point you to where IRAP is installed on Noah so you can check the version used. In principle we could try a few runs with this version (which I suspect is the most up to date) and, if the results are equivalent, keep the newest version. For a start, though, it might be better to go with the version currently used on Noah, if possible.
I tried here: https://wwwdev.ebi.ac.uk/gxa/experiments/E-MTAB-2706/Supplementary%20Information but it is not mentioned.
IRAP has samtools 1.9 and fastq_utils 0.16.3.
Changed in 3896d3d, but the test log says it's still using fastq_utils 0.25.1. Any idea how I might force it to use the correct fastq_utils version?
I'm not seeing what you mean in the logs right now, but it may be because that version isn't available in Conda; see https://anaconda.org/bioconda/fastq_utils/files. You could try picking the oldest version available for now, but since we can't easily match versions, maybe we should bite the bullet and use the latest. Okay with you, @pcm32?
The HTML output of the local planemo test that I ran says "fastq_utils 0.25.1" in its report. I'm not sure how to view the HTML here, but maybe they're using the same version.
According to the fastq_utils repo, these are the dependencies:
"samtools (version 0.1.19) and zlib (http://zlib.net/) version 1.2.11 or latest are required to compile fastq_utils. ... The bam_annotate.sh script requires samtools (version 1.5 or higher)."
Changed to latest version in c40fc6a
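For reference, pinning the Bioconda package in a Galaxy tool wrapper is usually done with a `<requirements>` block, often with the version held in a macro token so the tool version and the package version stay in sync. The sketch below assumes version 0.25.1, which is the one the local planemo report showed. This thread does not state whether c40fc6a committed that same version, and the tool id is hypothetical.

```xml
<tool id="fastq_pre_barcodes_sketch" name="FASTQ barcodes preprocessor (sketch)" version="@TOOL_VERSION@+galaxy0">
    <macros>
        <!-- Assumed version: the one reported by the local planemo test in this thread. -->
        <token name="@TOOL_VERSION@">0.25.1</token>
    </macros>
    <requirements>
        <!-- Typically resolved through Conda from the Bioconda channel. -->
        <requirement type="package" version="@TOOL_VERSION@">fastq_utils</requirement>
    </requirements>
    <command detect_errors="exit_code"><![CDATA[
fastq_pre_barcodes --read1 '$read1' --outfile '$outfile1'
    ]]></command>
    <inputs>
        <param name="read1" argument="--read1" type="data" format="fastqsanger,fastqsanger.gz" label="Read1"/>
    </inputs>
    <outputs>
        <data name="outfile1" format="fastqsanger.gz" label="${tool.name} on ${on_string}: Output file 1"/>
    </outputs>
</tool>
```

With the token in place, moving to a different fastq_utils release means editing a single line, and the `+galaxy0` suffix can be bumped separately for wrapper-only changes.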