Using Trinity with grid_exec #38

Open
avani-bhojwani opened this issue Jan 31, 2025 · 0 comments
Labels
bug Something isn't working

avani-bhojwani (Collaborator) commented Jan 31, 2025

Description of the bug

From Isaac:
I'm running this pipeline on a Slurm cluster with Singularity and am trying to set it up to use --grid_exec.
If I configure the parameter for extra Trinity arguments as:
"extra_trinity_args": "--SS_lib_type RF --grid_exec \"sbatch --qos shortq --time 0:29:59 --mem 1GB --cpus-per-task 1\"",
I get the error: bash: line 1: sbatch: command not found.
Reading further, I've seen that the recommended approach is to use HpcGridRunner to distribute the commands in recursive_trinity.cmds as jobs. I tried this, but got the following error (paths shortened):

	*** Dispatching parallel commands to the compute farm:
Friday, January 17, 2025: 16:11:21	CMD: .../hpc_cmds_GridRunner.pl -G .../hpc_conf/SLURM.Monash.conf -c recursive_trinity.cmds
SERVER: m3v117, PID: 24318
FARMIT failed to accept job: sbatch --qos shortq --time 0:29:59 --mem 1GB --cpus-per-task 1 -e .../pooled_reads_trinity/farmit.J24318.m3v117.24318.1737090681/cmds/J24318.S0.sh.stderr -o .../pooled_reads_trinity/farmit.J24318.m3v117.24318.1737090681/cmds/J24318.S0.sh.stdout .../pooled_reads_trinity/farmit.J24318.m3v117.24318.1737090681/cmds/J24318.S0.sh 2>&1 
 (ret -1)
FARMIT failed to accept job.  Will try again shortly.

I assumed the attempts above fail because Trinity runs inside a Singularity container, where sbatch isn't available. However, the error message mentioning FARMIT comes from HpcGridRunner, and I'm not sure how that would be reached if the execution environment were inside the container.
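
One way to test the container hypothesis is to check whether sbatch resolves on PATH in each environment. The helper below is a minimal sketch; `have_cmd` is an illustrative name and `trinity.sif` is a hypothetical image filename, neither of which comes from the pipeline.

```shell
# Report whether a command is resolvable on PATH in the current environment.
# Returns 0 if found, 1 otherwise.
have_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
        return 1
    fi
}

# On the host (for example, a login node):
#   have_cmd sbatch
#
# Inside the container (trinity.sif is a hypothetical image name):
#   singularity exec trinity.sif sh -c 'command -v sbatch || echo "sbatch: not found"'
```

If sbatch is found on the host but not inside the image, that would be consistent with both errors above, although it would not by itself explain where HpcGridRunner is being launched from.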
I've also tried running HpcGridRunner separately on recursive_trinity.cmds. It distributed the jobs, but they all failed because the commands use the path to Trinity inside the container.
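
If a matching Trinity installation exists on the host, one possible workaround is to rewrite the container path prefix in the commands file before handing it to HpcGridRunner. This is an untested sketch, not something the linked threads confirm. Both prefixes in the usage comment are hypothetical, since the issue does not state the actual container path, and `rewrite_prefix` is an illustrative helper name.

```shell
# Replace every literal occurrence of one path prefix with another in a
# commands file, writing the result to stdout. The match is a plain string
# match (awk index), so dots and slashes in paths need no escaping.
rewrite_prefix() {
    # $1 = prefix used inside the container, $2 = prefix on the host, $3 = cmds file
    awk -v from="$1" -v to="$2" '{
        rest = $0
        out = ""
        # Consume the line left to right, swapping each occurrence of "from".
        while ((i = index(rest, from)) > 0) {
            out = out substr(rest, 1, i - 1) to
            rest = substr(rest, i + length(from))
        }
        print out rest
    }' "$3"
}

# Usage (both prefixes are hypothetical placeholders):
#   rewrite_prefix /usr/local/bin /opt/trinity recursive_trinity.cmds > recursive_trinity.host.cmds
```

The rewritten file only helps if the host installation is the same Trinity version and the input and output paths in each command are also valid outside the container.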
I came across these threads, in which Brian's responses seem to address this issue, but it's not clear to me whether his changes have made it into the current version of Trinity, v2.15.2: trinityrnaseq/trinityrnaseq#952 and https://groups.google.com/g/trinityrnaseq-users/c/-DnnDYKe0xo/m/Lp6ZUF-QAwAJ

When we get around to troubleshooting, we plan to try a local Trinity installation and skip the Singularity container for that stage/process. That way it should execute in an environment that can run sbatch.

Command used and terminal output

Relevant files

No response

System information

No response

avani-bhojwani added the bug label Jan 31, 2025