Report warning message when analyzing data with two replicates #7289

Xinlei-Gao · 2025-01-10T14:52:48Z

The updated version handles two cases:

1. If there are two conditions with two replicates, skip running anota2seq and report a warning message in the log file.
2. If there are more than two conditions with two replicates, perform the analysis with the parameter onlyGroup = TRUE and report a warning message in the log file.
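The two cases can be sketched as a small decision function. This is a minimal Python illustration, not the R code in the template: the names `Mode` and `choose_mode` are hypothetical, "replicates" is taken to mean the smallest number of samples in any condition, and skipping when a condition has fewer than two replicates is an assumption the description does not cover.

```python
from collections import Counter
from enum import Enum


class Mode(Enum):
    FULL = "full"              # three or more replicates everywhere: run as before
    ONLY_GROUP = "only_group"  # two replicates, more than two conditions
    SKIP = "skip"              # two replicates, two conditions


def choose_mode(treatments):
    """Pick an analysis mode from the treatment label of every sample.

    Returns (mode, warning); warning is None when no message is needed.
    """
    counts = Counter(treatments)
    if len(counts) < 2:
        raise ValueError("need at least two conditions to compare")
    min_reps = min(counts.values())  # assumption: the smallest group decides

    if min_reps >= 3:
        return Mode.FULL, None
    if min_reps == 2 and len(counts) > 2:
        return Mode.ONLY_GROUP, (
            "Only two replicates per condition: running with onlyGroup = TRUE."
        )
    # Two conditions with two replicates; fewer than two replicates also
    # lands here, which is an assumption rather than documented behavior.
    return Mode.SKIP, "Too few replicates: skipping anota2seq."


mode, warning = choose_mode(["control", "control", "treatment", "treatment"])
print(mode.value, "-", warning)
```

Returning the warning alongside the mode keeps the decision separate from how the message is written to the log file.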

PR checklist

Closes #XXX


lpantano · 2025-01-10T18:57:20Z

Hi @Xinlei-Gao, were you able to run nf-core modules test <MODULE> --profile docker successfully locally? I think you need to update your snapshots, which is done by running the modules test twice (I believe).

Xinlei-Gao · 2025-01-10T19:36:21Z

Hi Lorena, I haven't done it but I will run the module test twice and let you know how it goes. Thank you, Xinlei


Xinlei-Gao · 2025-01-10T20:42:46Z

Hi Lorena,

I tried running the nf-core modules test on my laptop and it failed. There are three tests, and each failed for the reasons below.

1. The first test, 'human-tsv', uses three replicates in two conditions by default. Nextflow stdout:

    ERROR ~ Error executing process > 'ANOTA2SEQ_ANOTA2SEQRUN ([id:treatment_vs_control])'

    Caused by:
      Process `ANOTA2SEQ_ANOTA2SEQRUN ([id:treatment_vs_control])` terminated with an error exit status (137)

    Command executed:
      Rscript /Users/barc/Documents/GitHub/nf-core-modules/modules/nf-core/anota2seq/anota2seqrun/templates/anota2seqrun.r --output_prefix treatment_vs_control --sample_treatment_col treatment --reference_level control --target_level treatment --sample_file samplesheet.csv --count_file salmon.merged.gene_counts_length_scaled.tsv

    Command exit status:
      137

    Command output:
      (empty)

    Command error:
      .command.run: line 303: 99175 Killed: 9 docker run -i --cpu-shares 2048 --memory 4096m -e "NXF_TASK_WORKDIR" -e "NXF_DEBUG=${NXF_DEBUG:=0}" -v /Users/barc/Documents/GitHub/nf-core-modules/.nf-test/tests/ff19134b301656f70715dcf80deb5bd3/work:/Users/barc/Documents/GitHub/nf-core-modules/.nf-test/tests/ff19134b301656f70715dcf80deb5bd3/work -w "$NXF_TASK_WORKDIR" -u $(id -u):$(id -g) --platform=linux/amd64 --name $NXF_BOXID quay.io/biocontainers/bioconductor-anota2seq:1.24.0--r43hdfd78af_0 /bin/bash .command.run nxf_trace
      .command.run: line 290: 99176 Killed: 9 return

It seems the process was killed by an out-of-memory error, but I am not sure why. It looks like docker run requires 4096m of memory, and my laptop has 16 GB, which should be sufficient. Please correct me if I am wrong.

2. The second test, 'two conditions with two replicates', failed because it could not access the modified sample sheet file and count file. The files required for this test have been submitted in a separate PR to nf-core test-datasets. So I suppose that PR has to be processed first for this test to run?

3. The third test, 'three conditions with two replicates', failed for the same reason as the second: no access to the required modified sample sheet files on nf-core test-datasets.

Do you have any advice on how to proceed with the tests? I appreciate your insight very much!

Best,
Xinlei
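On the exit status in the first test: by shell convention, a status above 128 means the process was terminated by signal number (status - 128), so 137 corresponds to signal 9 (SIGKILL), which is the signal an out-of-memory kill delivers. A minimal Python sketch of that decoding, assuming a POSIX system; the helper name `describe_exit_status` is hypothetical and not part of Nextflow or nf-core:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Explain a shell-style exit status.

    Statuses above 128 conventionally mean "terminated by signal
    (status - 128)"; anything else is an ordinary exit code.
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not defined on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with code {status}"


print(describe_exit_status(137))
```

Note that docker's `--memory 4096m` is an upper limit on the container rather than a request, so a 16 GB host does not by itself rule out the container reaching that cap. On macOS, Docker also runs inside a VM with its own memory allocation, which may be smaller than the host total.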


SPPearce · 2025-01-13T12:50:41Z

I'm confused by this module, what exactly is it doing?

Xinlei-Gao · 2025-01-13T14:57:46Z

Hi Simon,

This change generalizes the current anota2seq module so that it handles datasets with two replicates without throwing an unexpected error.

In detail, the current anota2seq module can only analyze datasets with at least three replicates per condition. If the input data does not satisfy this requirement, the module raises an error and the nf-core pipeline fails. The modified code adds a step that checks the number of replicates and the number of conditions, and handles these situations as follows:

1. If there are only two replicates in two conditions, give a warning message that the analysis cannot be performed and skip to the next step without error.
2. If there are two replicates in more than two conditions, call the function with the parameter 'onlyGroup = TRUE' so that the translational efficiency analysis can be performed. This parameter setting is based on the 'anota2seq' documentation.

In brief, I added these changes to help the nf-core anota2seq module handle datasets with different numbers of replicates more robustly. I hope this makes sense.

Thank you,
Xinlei


SPPearce

OK, I think I understand what the module is doing now (the module itself, not the PR). I was confused by the fact that a samplesheet was being passed.

SPPearce · 2025-01-13T17:18:49Z

modules/nf-core/anota2seq/anota2seqrun/main.nf

@@ -33,5 +33,7 @@ process ANOTA2SEQ_ANOTA2SEQRUN {
    task.ext.when == null || task.ext.when

    script:
-    template 'anota2seqrun.r'
+    """
+    Rscript ${projectDir}/modules/nf-core/anota2seq/anota2seqrun/templates/anota2seqrun.r --output_prefix ${task.ext.prefix ?: meta.id} --sample_treatment_col ${sample_treatment_col} --reference_level ${reference} --target_level ${target} --sample_file ${samplesheet} --count_file ${counts}


You can't do this, it won't work on cloud providers. Why are you trying to move away from the use of the template?

Xinlei-Gao · 2025-01-13T17:36:58Z

I tried to keep using the template after modifying the R code in anota2seqrun.r, but it failed when I ran the nf-core Ribo-seq pipeline (which calls this module), because parameters such as $task.ext.*** were not passed correctly. I therefore replaced the template command with a direct call to Rscript and passed the parameters on the command line. The template R script itself reads an input samplesheet; the samplesheet file is provided in an nf-core test dataset. I didn't know this could affect cloud platforms. If it does, do you have any suggestions? Thanks! Xinlei


SPPearce · 2025-01-15T08:36:16Z


The guidelines are here: https://nf-co.re/docs/guidelines/components/modules#script-inclusion
I think we can fix the template issue; we should be able to use task.ext.args with the template.
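As an illustration of what using task.ext.args with the template could involve, here is a minimal sketch of turning an args-style string into named options. It is written in Python for brevity, whereas the actual template is R. The function name `parse_ext_args` and the `--only_group` flag are hypothetical; the other option names are taken from the Rscript command line in the diff.

```python
import shlex


def parse_ext_args(args: str) -> dict:
    """Parse a string such as "--key value --flag" into a dict.

    Options followed by a value map to that value (as a string);
    options with no value are treated as boolean flags set to True.
    """
    tokens = shlex.split(args)  # respects quoting, e.g. --name "a b"
    options = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if not token.startswith("--"):
            raise ValueError(f"expected an option, got {token!r}")
        key = token[2:]
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("--")
        if has_value:
            options[key] = tokens[i + 1]
            i += 2
        else:
            options[key] = True
            i += 1
    return options


# Defaults the script would otherwise use; parsed args override them.
defaults = {"reference_level": "control", "target_level": "treatment"}
overrides = parse_ext_args("--target_level drug --only_group")
print({**defaults, **overrides})
```

Merging parsed options over a dictionary of defaults lets the script keep working when the args string is empty.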
