GTF filter error in 3.13.2 but works in 3.12.0 #1147
Comments
Hi @heathfuqua! It would be great if you could provide links to where we can download the GTF and FASTA files so we can reproduce this. It looks like an encoding issue, but it would be good to confirm.
Yes, that GTF sanity check is a recent addition in version 3.13 of the pipeline, so it makes sense that version 3.12 runs without issues. Since the decode error occurs in position 1 and the invalid byte happens to be Just
(or rename the file to add a .gz suffix, so that the pipeline recognises the GTF as compressed and decompresses it)
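For anyone hitting the same symptom: a gzip stream begins with the two bytes 0x1f 0x8b, and 0x8b is not a valid UTF-8 start byte, so reading a compressed GTF as plain text would plausibly fail at position 1. The sketch below is one way to check a file before launching the pipeline. The helper names `is_gzipped` and `open_text` are hypothetical and are not part of nf-core/rnaseq.

```python
import gzip

# First two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the gzip magic number,
    regardless of what its file extension claims."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def open_text(path):
    """Open a GTF/FASTA as UTF-8 text, decompressing on the fly
    when the content (not the name) is gzip-compressed."""
    if is_gzipped(path):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")
```

If `is_gzipped("genes.gtf")` returns True for a file without a `.gz` suffix, either decompress it or rename it so the suffix matches the content.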
Thanks for your help.
Error executing process > 'NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_GTF (Danio_rerio.GRCz11.110.gtf.gz)'
--
Caused by:
Command executed: gunzip cat <<-END_VERSIONS > versions.yml
Command exit status:
--
I'd gotten a similar error with an unzipped fasta file. See below. This file I also verified as unzipped just now.
The exit status of the task that caused the workflow execution to fail was: 1
--
Error executing process > 'NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GTF_FILTER (Danio_rerio.GRCz11.dna.fa)'
Caused by:
Command executed: filter_gtf.py cat <<-END_VERSIONS > versions.yml
Command exit status:
Command output:
Command error:
Thanks @heathfuqua! Are you able to share the file with us somehow, or point us to where we can download it?
Sure thing.
The gtf: https://ftp.ensembl.org/pub/release-110/gtf/danio_rerio/Danio_rerio.GRCz11.110.gtf.gz
The fasta file was created from a zcat of the non-masked files (Danio_rerio.GRCz11.dna.chromosome.*.fa.gz) available here: https://ftp.ensembl.org/pub/release-110/fasta/danio_rerio/dna/
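For reference, the `zcat` step described above can be sketched in Python as follows. This is only an illustration of concatenating gzipped per-chromosome files into one uncompressed FASTA; `concat_gzipped_fasta` is a hypothetical helper name, and the order of the parts is whatever the caller passes in.

```python
import gzip
import shutil


def concat_gzipped_fasta(parts, out_path):
    """Decompress each gzipped FASTA in `parts` and append it to
    `out_path`, producing a single uncompressed FASTA file."""
    with open(out_path, "wb") as out:
        for part in parts:
            with gzip.open(part, "rb") as src:
                # Stream the decompressed bytes so whole chromosomes
                # are never held in memory at once.
                shutil.copyfileobj(src, out)
```

A call such as `concat_gzipped_fasta(sorted(paths), "genome.fa")` writes plain text, so the output should not start with the gzip magic bytes 0x1f 0x8b.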
The gunzip error is suspicious; that should work just fine. In any case, I can't replicate the issue by downloading that GTF, gunzipping it, and running the script in a conda env. If your data are public, could you provide your sample sheet? That would allow us to run the workflow and see whether we can replicate things at that level.
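A quick local check along these lines can be sketched without the pipeline. What `filter_gtf.py` does internally is not shown in this thread, so the code below is an assumption about the kind of consistency check involved: it reports GTF sequence names that have no matching FASTA header. Both function names are hypothetical.

```python
def fasta_names(fasta_lines):
    """Collect sequence names from FASTA header lines: the first
    whitespace-delimited token after '>'."""
    return {
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and len(line.strip()) > 1
    }


def gtf_seqnames_missing_from_fasta(gtf_lines, names):
    """Return GTF seqnames (column 1) that are absent from `names`.
    Comment lines starting with '#' and blank lines are skipped."""
    missing = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        seqname = line.split("\t", 1)[0]
        if seqname not in names:
            missing.add(seqname)
    return missing
```

Both functions accept any iterable of text lines, so they can be fed open file handles for the uncompressed GTF and FASTA.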
Also:
OK, a colleague just launched a run in 3.13.2 using the re-downloaded files and had no problems, so clearly I made an error somewhere along the way and misattributed it to a bug.
No worries, thanks for letting us know.
Description of the bug
When running zebrafish samples through the rnaseq pipeline on 3.13.2, a failure occurs at the GTF_FILTER job. The failure does not occur with the same settings on 3.12.0.
Command used and terminal output
Relevant files
rnaseq.params.json
System information
Nextflow version: 23.04.3
Hardware: AWS cloud
Executor: awsbatch
Container engine: Docker
Version of nf-core/rnaseq: 3.13.2