You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I am trying to replicate your results using CHEUI, but I'm encountering issues with nanopolish during the preprocessing steps.
When generating the index, I receive the following error (example for one sample):
[readdb] indexing ${WORKDIR}/fast5/HeLa_all_fast5_M11A_KO/data/xc17/pm1122/SRA/HeLa/fast5s/all_fast5_M11A_KO/
[readdb] num reads: 1260168, num reads with path to fast5: 630084
fast5 files could not be located for 630084 reads
If I then run nanopolish eventalign, the output file is empty. Do you know what might be causing this issue? I'm unsure whether it’s related to the downloaded samples or an error in my code. Below, I’ve included the code I used, in case it helps identify any potential problems. Thank you very much for your help!
For the alignment step, I followed the instructions in your documentation. For nanopolish, I used version 0.14.0 (this is the only version available on the cluster that I'm using):
#!/bin/bash
#SBATCH -t 36:00:00 # execution time
#SBATCH --mem=30G
#SBATCH -c 24
module load cesga/2020 gcccore/system nanopolish/0.14.0
FASTQ_DIR=${WORKDIR}/fastq
FAST5_DIR=${WORKDIR}/fast5
BAM_DIR=${WORKDIR}/01_aligments
REF=${WORKDIR}/reference/gencode.v38.transcripts.fa.gz
OUTDIR_BASE=${WORKDIR}/02_nanopolish
mkdir -p ${OUTDIR_BASE}
# index for nanopolish
nanopolish index -d ${FAST5_DIR}/HeLa_all_fast5_M11A_KO/data/xc17/pm1122/SRA/HeLa/fast5s/all_fast5_M11A_KO/ ${FASTQ_DIR}/HeLa_M11A_KO.fastq
nanopolish index -d ${FAST5_DIR}/HeLa_all_fast5_M11A_WT/data/xc17/pm1122/SRA/HeLa/fast5s/all_fast5_M11A_WT/ ${FASTQ_DIR}/HeLa_M11A_WT.fastq
nanopolish index -d ${FAST5_DIR}/HeLa_all_fast5_M11B_KO/data/xc17/pm1122/SRA/HeLa/fast5s/all_fast5_M11B_KO/ ${FASTQ_DIR}/HeLa_M11B_KO.fastq
nanopolish index -d ${FAST5_DIR}/HeLa_all_fast5_M11B_WT/data/xc17/pm1122/SRA/HeLa/fast5s/all_fast5_M11B_WT/ ${FASTQ_DIR}/HeLa_M11B_WT.fastq
nanopolish index -d ${FAST5_DIR}/HeLa_all_fast5_M11C_KO/data/xc17/pm1122/SRA/HeLa/fast5s/all_fast5_M11C_KO/ ${FASTQ_DIR}/HeLa_M11C_KO.fastq
nanopolish index -d ${FAST5_DIR}/HeLa_all_fast5_M11C_WT/data/xc17/pm1122/SRA/HeLa/fast5s/all_fast5_M11C_WT/ ${FASTQ_DIR}/HeLa_M11C_WT.fastq
if compgen -G "${FASTQ_DIR}/*.fastq" > /dev/null; then
for FASTQ_FILE in "${FASTQ_DIR}"/*.fastq; do
SAMPLE_NAME=$(basename "${FASTQ_FILE}" .fastq)
BAM_FILE="${BAM_DIR}/${SAMPLE_NAME}_s.bam"
if [[ -f "${BAM_FILE}" ]]; then
OUTFILE="${OUTDIR_BASE}/${SAMPLE_NAME}.txt"
mkdir -p "$(dirname "${OUTFILE}")"
# Launch nanopolish
echo "Running Nanopolish for $SAMPLE_NAME"
nanopolish eventalign\
-t 24 \
--reads "${FASTQ_FILE}" \
--bam "${BAM_FILE}" \
--genome "${REF}" \
--scale-events --signal-index --samples --print-read-names > "${OUTFILE}" 2> "${OUTFILE}.err"
echo "Nanopolish eventalign ended"
fi
done
fi
The text was updated successfully, but these errors were encountered:
it seems you are using the wrong path (${WORKDIR}/fast5/HeLa_all_fast5_M11A_KO/data/xc17/pm1122/SRA/HeLa/fast5s/all_fast5_M11A_KO/ ) . This is just an example path provided. You need to use the path where you downloaded the fast5s.
I double-checked the path I provided and confirmed that it points to the correct location of the downloaded fast5 files (I used ls ${FAST5_DIR}/HeLa_all_fast5_M11A_KO/data/xc17/pm1122/SRA/HeLa/fast5s/all_fast5_M11A_KO/, and the fast5 files were listed).
The directory was created when I downloaded and decompressed the files. Since the files are indeed present in the directory, I’m unsure why nanopolish is unable to locate them for half of the reads and why, when I use nanopolish eventalign, the output is empty.
Hello,
I am trying to replicate your results using CHEUI, but I'm encountering issues with nanopolish during the preprocessing steps.
When generating the index, I receive the following error (example for one sample):
If I then run nanopolish eventalign, the output file is empty. Do you know what might be causing this issue? I'm unsure whether it’s related to the downloaded samples or an error in my code. Below, I’ve included the code I used, in case it helps identify any potential problems. Thank you very much for your help!
I downloaded both the fastq and fast5 files (HeLa WT and KO samples) from this link: https://trace.ncbi.nlm.nih.gov/Traces/study/?acc=SRP393373&o=acc_s%3Aa
I also downloaded the transcriptome version you used:
https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_38/gencode.v38.transcripts.fa.gz
For the alignment step, I followed the instructions in your documentation. For nanopolish, I used version 0.14.0 (this is the only version available on the cluster that I'm using):
The text was updated successfully, but these errors were encountered: