Adding instructions for running with limited cpus and/or memory #29

mikemc · 2024-06-27T21:06:16Z

In the previous version of the pipeline, there were options in the config file for limiting the cpus and memory usage, but these parameters seem to be gone. In run.nf, I see how we can set params for fastqc, but not for other steps.

Currently, if I run the test set on an instance with 8 cpus, I fail at the 'RUN:CLEAN:CUTADAPT' step (output below). Somewhere, 'task.cpus' is being set to 16 despite running on an instance with just 8 cpus in 'ec2_local' mode.

ERROR ~ Error executing process > 'RUN:CLEAN:CUTADAPT (1)'

Caused by:
  Process requirement exceeds available CPUs -- req: 16; avail: 8


Command executed:

  par="-b file:adapters.fasta -B file:adapters.fasta -j 16 -m 20 -e 0.33 --action=trim"
  out="-o 6A_cutadapt_1.fastq.gz -p 6A_cutadapt_2.fastq.gz"
  log="6A_cutadapt_log.txt"
  cutadapt ${par} ${out} 6A_1.fastq.gz 6A_2.fastq.gz > ${log}


mikemc · 2024-06-27T21:15:59Z

Ok, I see that these are now in https://github.com/naobservatory/mgs-workflow/blob/master/configs/resources.config; perhaps the README can point people there, and maybe we could start the ec2 modes with a check that these resources are available, and if not give a message informing the user to edit this file.
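The start-up check suggested above could be sketched roughly as follows. This is only an illustration: `check_cpus` is a hypothetical helper, not something that exists in mgs-workflow, and the requested CPU count is passed in by hand rather than read from `configs/resources.config`.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper (not part of mgs-workflow): compare the CPUs a
# process will request against what the machine offers, and point the user at
# the config file when the request is too large.
check_cpus() {
  local requested="$1"
  # Second argument is optional; fall back to the CPU count reported by nproc.
  local available="${2:-$(nproc)}"
  if [ "$requested" -gt "$available" ]; then
    echo "Requested ${requested} CPUs but only ${available} are available;" \
         "lower the limits in configs/resources.config" >&2
    return 1
  fi
  return 0
}
```

For the case reported in this issue, `check_cpus 16 8` would return a non-zero status and print the hint, while `check_cpus 7 8` would pass silently.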

mikemc · 2024-06-27T21:48:54Z

I'm running on an instance with 8 cpus and 16G memory. After setting the cpu limit to 7 and the memory limit to 14G on all the non-single processes in the 'resources.config' file, things work until 'RUN:DEDUP:CLUMPIFY_PAIRED':

  java -ea -Xmx30g -Xms30g -cp /opt/bbmap/current/ clump.Clumpify in=SS1_fastp_1.fastq.gz in2=SS1_fastp_2.fastq.gz out=SS1_dedup_1.fastq.gz out2=SS1_dedup_2.fastq.gz reorder dedupe containment t=7 -Xmx30g
  OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000080000000, 21474836480, 0) failed; error='Cannot allocate memory' (errno=12)
  #
  # There is insufficient memory for the Java Runtime Environment to continue.
  # Native memory allocation (mmap) failed to map 21474836480 bytes for committing reserved memory.
  # An error report file with more information is saved as:
  # hs_err_pid17.log

It looks like this step is not using the memory parameter. I guess in the CLUMPIFY_PAIRED process we need to change the hard-coded -Xmx30g in this line:

        par="reorder dedupe containment t=!{task.cpus} -Xmx30g"

If I manually change the memory flag to Xmx15g, I get an error:

  java -ea -Xmx15 -Xms15 -cp /opt/bbmap/current/ clump.Clumpify in=SS2_fastp_1.fastq.gz in2=SS2_fastp_2.fastq.gz out=SS2_dedup_1.fastq.gz out2=SS2_dedup_2.fastq.gz reorder dedupe containment t=7 -Xmx15
  Error occurred during initialization of VM
  Too small initial heap
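One thing worth noting about the command echoed above: it shows `-Xmx15` and `-Xms15` with no unit suffix. The JVM reads an unsuffixed heap size as bytes, which would explain the "Too small initial heap" message. A small guard along these lines could catch that before Java is launched; `heap_flag_has_unit` is a hypothetical helper written for this illustration, not part of BBTools or the pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only for JVM heap flags of the form
# -Xmx<digits><unit> or -Xms<digits><unit>, where unit is one of k, m, g
# (either case). A bare number such as -Xmx15 is rejected because the JVM
# would interpret it as 15 bytes.
heap_flag_has_unit() {
  local value="${1#-Xm[sx]}"              # drop the leading -Xmx or -Xms
  [ "$value" != "$1" ] || return 1        # prefix missing: not a heap flag
  local digits="${value%[kKmMgG]}"        # drop one trailing unit letter
  [ "$digits" != "$value" ] || return 1   # nothing dropped: no unit suffix
  case "$digits" in
    ''|*[!0-9]*) return 1 ;;              # empty or contains a non-digit
  esac
  return 0
}
```

With this, `heap_flag_has_unit -Xmx15g` succeeds and `heap_flag_has_unit -Xmx15` fails.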

willbradshaw · 2024-07-01T18:16:00Z

Yeah, these are now all in configs/resources.config. Agree it would be good to point to this in the README.

BBTools modules, as you've seen, have their own memory allocation syntax that doesn't play super nicely with Nextflow's. I'd be very happy for someone to figure out how to make them play well together (e.g. automatically reformatting task.memory into an -Xmx flag via string parsing). Though as you've seen, some BBTools steps will just fail if you don't give them pretty generous memory allocations.
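A minimal sketch of the string-parsing idea, assuming the memory setting reaches the script as text such as `14 GB` (a whole number, an optional space, then a unit). `mem_to_xmx` is a hypothetical name, and fractional values are deliberately rejected rather than rounded.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a memory string such as "14 GB" into a JVM heap
# flag such as "-Xmx14g". Assumes an integer amount followed by KB, MB or GB.
mem_to_xmx() {
  local mem="$1" number unit
  number="${mem%%[!0-9]*}"   # leading run of digits
  unit="${mem#"$number"}"    # everything after the digits
  unit="${unit# }"           # drop one separating space, if present
  if [ -z "$number" ]; then
    echo "cannot parse memory value: '${mem}'" >&2
    return 1
  fi
  case "$unit" in
    GB) echo "-Xmx${number}g" ;;
    MB) echo "-Xmx${number}m" ;;
    KB) echo "-Xmx${number}k" ;;
    *)  echo "unsupported memory unit in: '${mem}'" >&2; return 1 ;;
  esac
}
```

The resulting flag could then stand in for the hard-coded `-Xmx30g`. Inside a Nextflow process, numeric accessors such as `task.memory.toGiga()` may make the string parsing unnecessary. Either way, this does not address the point above that some BBTools steps fail without a generous allocation.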

mikemc · 2024-07-01T18:50:48Z

> automatically reformatting task.memory into an -Xmx flag via string parsing

I can do this - I'll open a new issue specifically to track this fix

willbradshaw · 2024-12-17T19:34:05Z

@harmonbhasin I think something on this would be good to include in our planned expanded documentation next quarter.

harmonbhasin · 2024-12-18T14:37:47Z

Clarified this issue with Will: add a section to the wiki on resources, specifically on what to do when you don't have enough of them. The suggested plan of action will most likely be to change resources/config.yml.

harmonbhasin · 2025-01-22T19:26:14Z

This will be addressed in #152

mikemc mentioned this issue Jul 1, 2024

Respect given memory parameter in BBTools processes #36

Closed

willbradshaw added the documentation and priority_3 labels Sep 18, 2024

willbradshaw added priority_2 and removed priority_3 labels Dec 17, 2024

harmonbhasin self-assigned this Jan 22, 2025

harmonbhasin added the done label (issues that have been addressed in dev branch, but not reflected in master branch) Jan 22, 2025
