Hi-C Scaffolding Error #317

Open
emilytrybulec opened this issue Jan 23, 2025 · 3 comments

@emilytrybulec

Hello,

I've run verkko for a particular genome assembly and recently got Omni-C data to improve the quality of my assembly. Using this command,

verkko -d verkko2.2_omnic_persephone_2 \
    --hifi /core/projects/colossalanalyses/Marsupial_Assemblies/7_Petrogale_Persephone/prep/HerroCorrected_PPer_ULandStd.fasta \
    --nano /seqdata/EBP/animal/vertebrate/petrogale/nanopore/2024NOV06_RON_Ppersephone_UHMW.fastq.gz \
    --min-ont-length 1000 \
    --hic1 /core/projects/colossalanalyses/Marsupial_Assemblies/7_Petrogale_Persephone/argonaut/p_persephone_assembly/01_read_qc/fastp/pper_ill_T1_1.fastp.fastq.gz \
    --hic2 /core/projects/colossalanalyses/Marsupial_Assemblies/7_Petrogale_Persephone/argonaut/p_persephone_assembly/01_read_qc/fastp/pper_ill_T1_2.fastp.fastq.gz \
    --slurm

I run into this error:

Error in rule HiC_rdnascaff:
    jobid: 0
    input: 8-hicPipeline/unitigs.hpc.noseq.gfa, 8-hicPipeline/prescaf_rukki.paths.tsv, 8-hicPipeline/prescaf_rukki.paths.gaf, 8-hicPipeline/unitigs.telo, 8-hicPipeline/unitigs_nonhpc50.mashmap, 8-hicPipeline/paths2ref.mashmap, 8-hicPipeline/hic_nodefiltered.bam
    output: 8-hicPipeline/scaff_rukki.paths.tsv, 8-hicPipeline/scaff_rukki.paths.gaf, 8-hicPipeline/rukki.paths.tsv, 8-hicPipeline/rukki.paths.gaf
    log: 8-hicPipeline/hic_scaff.err (check log file(s) for error details)
    shell:
        
cd 8-hicPipeline

cat > ./hic_scaff.sh <<EOF
#!/bin/sh
set -e
 /home/FCAM/etrybulec/yes/envs/verkko2.2/lib/verkko/scripts/launch_scaffolding.py . False


cp ../8-hicPipeline/scaff_rukki.paths.tsv ../8-hicPipeline/rukki.paths.tsv
cp ../8-hicPipeline/scaff_rukki.paths.gaf ../8-hicPipeline/rukki.paths.gaf

cp ../8-hicPipeline/rukki.paths.tsv ../assembly.paths.tsv
EOF

chmod +x ./hic_scaff.sh

./hic_scaff.sh > ../8-hicPipeline/hic_scaff.err 2>&1
        
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
    cluster_jobid: 206690

Error executing rule HiC_rdnascaff on cluster (jobid: 0, external: 206690, jobscript: /core/projects/colossalanalyses/Marsupial_Assemblies/7_Petrogale_Persephone/verkko/verkko2.2_omnic_persephone_2/.snakemake/tmp.rll_hk5h/verkko.HiC_rdnascaff.0.sh). For error details see the cluster log and the log files of the involved rule(s).
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2025-01-17T091903.999612.snakemake.log

The end of hic_scaff.err contains the following:

00:00:57 INFO:  ScaffoldGraph - Loading Hi-C alignments
00:03:29 INFO:  ScaffoldGraph - HomologyStorage - Loaded 6086 out of 24290 mashmap lines
00:03:29 INFO:  ScaffoldGraph - HomologyStorage - 1949 nodes have at least one used homology
00:13:28 INFO:  ScaffoldGraph - Pairwise distances in assembly graph calculated
00:13:28 INFO:  ScaffoldGraph - Found haploid path haplotype1_from_utig4-13551 with homology 101590272 and len 209842619 
00:13:28 INFO:  ScaffoldGraph - Found haploid path haplotype2_from_utig4-21644 with homology 8011849 and len 19200743 
Killed

I have ensured that rukki is part of my verkko environment, and I've also attempted to allocate more memory to these jobs, which resulted in batch job submission errors. Any suggestions would be greatly appreciated.

@skoren
Member

skoren commented Jan 23, 2025

It looks like the job is being killed by the cluster. Can you post the usage stats for job ID 206690 from your cluster (how much memory/time it requested and used before being killed)? If you have larger-memory nodes (over 128 GB), you could add --shc-run 8 240 48, which would let it use 240 GB and run for 48 hours instead of 24, and then restart verkko to see if it completes.

@Dmitry-Antipov can suggest files you can share for him to take a look locally at what resources it is using and/or if it can be reduced.
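
The two steps suggested above could look roughly like the sketch below. It assumes a Slurm cluster where sacct is available, and that the three values after --shc-run are CPUs, memory in GB, and hours (the thread confirms only the 240 GB and 48 hours). The function names job_usage and rerun_verkko are made up for illustration; the verkko options are copied from the original command with the override appended.

```shell
#!/bin/sh

# Show requested vs. used resources for a finished Slurm job.
# ReqMem is the requested memory, MaxRSS the peak memory of each job step.
job_usage() {
    sacct -j "$1" --format=JobID,State,Elapsed,ReqMem,MaxRSS
}

# Re-run the original command in the same output directory, adding the
# suggested resource override for the Hi-C scaffolding step.
rerun_verkko() {
    verkko -d verkko2.2_omnic_persephone_2 \
        --hifi /core/projects/colossalanalyses/Marsupial_Assemblies/7_Petrogale_Persephone/prep/HerroCorrected_PPer_ULandStd.fasta \
        --nano /seqdata/EBP/animal/vertebrate/petrogale/nanopore/2024NOV06_RON_Ppersephone_UHMW.fastq.gz \
        --min-ont-length 1000 \
        --hic1 /core/projects/colossalanalyses/Marsupial_Assemblies/7_Petrogale_Persephone/argonaut/p_persephone_assembly/01_read_qc/fastp/pper_ill_T1_1.fastp.fastq.gz \
        --hic2 /core/projects/colossalanalyses/Marsupial_Assemblies/7_Petrogale_Persephone/argonaut/p_persephone_assembly/01_read_qc/fastp/pper_ill_T1_2.fastp.fastq.gz \
        --slurm \
        --shc-run 8 240 48
}

# Usage (run on the cluster):
#   job_usage 206690
#   rerun_verkko
```

Nothing runs until one of the functions is called, so the file can be sourced and inspected first.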

@emilytrybulec
Author

emilytrybulec commented Jan 23, 2025

I'll give that a go now, thank you!
These are the stats for the job, and it does look like the memory maxed out:

Job ID: 206690
Cluster: mantis
User/Group: etrybulec/oneilllab
State: FAILED (exit code 1)
Nodes: 1
Cores per node: 8
CPU Utilized: 13:18:16
CPU Efficiency: 12.38% of 4-11:26:48 core-walltime
Job Wall-clock time: 13:25:51
Memory Utilized: 387.56 GB
Memory Efficiency: 242.23% of 160.00 GB
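
As a sanity check on these numbers, the percentages can be recomputed from the raw values in the report. The sketch below uses only figures quoted above; the helper names to_seconds, percent and over_limit are hypothetical. It also compares the reported peak with the 240 GB limit suggested earlier in the thread, which is a reading of the numbers rather than something the participants stated.

```shell
#!/bin/sh

# Convert a Slurm-style duration ([D-]HH:MM:SS) to seconds.
to_seconds() {
    echo "$1" | awk -F'[-:]' '{
        if (NF == 4) print $1 * 86400 + $2 * 3600 + $3 * 60 + $4
        else         print $1 * 3600 + $2 * 60 + $3
    }'
}

# Print a as a percentage of b, with two decimals.
percent() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", 100 * a / b }'
}

# Report whether a usage value exceeds a limit (same units for both).
over_limit() {
    awk -v used="$1" -v limit="$2" \
        'BEGIN { print ((used > limit) ? "over limit" : "within limit") }'
}

cpu_used=$(to_seconds 13:18:16)      # CPU Utilized
wall=$(to_seconds 13:25:51)          # Job Wall-clock time
core_wall=$(to_seconds 4-11:26:48)   # core-walltime

# 8 cores times the wall-clock time should equal the core-walltime.
[ "$((8 * wall))" -eq "$core_wall" ] && echo "core-walltime consistent"

percent "$cpu_used" "$core_wall"     # CPU efficiency, reported as 12.38%
percent 387.56 160.00                # memory efficiency, reported as 242.23%
over_limit 387.56 240                # peak usage vs. a 240 GB limit
```

If the last line reports "over limit", a 240 GB request may still be too small for this step, assuming the rerun needs as much memory as the failed job used.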

@Dmitry-Antipov
Contributor

What is the assembly graph size? It can be counted as the number of contigs in 8-hicPipeline/unitigs.fasta.
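
One way to get that count is sketched below: every FASTA record starts with a '>' header line, so counting those lines gives the number of contigs. The second helper counts segment (S) lines in the GFA listed among the rule's inputs. Treating that as a cross-check on graph size is an assumption, not something requested in the thread, and both function names are hypothetical.

```shell
#!/bin/sh

# Count sequences in a FASTA file (one '>' header line per record).
count_fasta_records() {
    grep -c '^>' "$1"
}

# Count segment lines in a GFA file (first tab-separated field is "S").
count_gfa_segments() {
    awk -F'\t' '$1 == "S" { n++ } END { print n + 0 }' "$1"
}

# Usage, from the verkko output directory:
#   count_fasta_records 8-hicPipeline/unitigs.fasta
#   count_gfa_segments  8-hicPipeline/unitigs.hpc.noseq.gfa
```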
