Name of blocking variable not correctly recognized #245
Comments
This bug is due to this line of code:
It probably needs to be changed to: it.blocking = it.blocking.replaceAll('^NA$', '')
The comment directly before this code block reads:
Why would this code block try to replace NA sub-strings in column headers of the samplesheet.csv file? Maybe you apply it to samplesheet.csv, too?
The issue is that the current code replaces any "NA" in the string. Mine only replaces it if it's the whole string. However, come to think of it, IDK why it's replacing it in the variable name anyway, and whether that's desired behavior.
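To make the difference concrete, here is a minimal sketch in Java. Groovy strings are java.lang.String, so replaceAll follows the same regex semantics as the line quoted above. The class and method names are hypothetical and not part of the pipeline.

```java
class NaReplaceDemo {
    // Unanchored pattern: removes every "NA" substring, wherever it occurs.
    static String stripAnyNa(String value) {
        return value.replaceAll("NA", "");
    }

    // Anchored pattern: only a value that is exactly "NA" becomes empty.
    static String stripWholeNa(String value) {
        return value.replaceAll("^NA$", "");
    }

    public static void main(String[] args) {
        String[] samples = {"RNA_extraction_date", "NA", "ANALYSIS_OUTCOME"};
        for (String s : samples) {
            System.out.println(s + " -> any: '" + stripAnyNa(s)
                    + "', whole: '" + stripWholeNa(s) + "'");
        }
    }
}
```

With the unanchored pattern, RNA_extraction_date turns into R_extraction_date, which matches the error in the bug report. With the anchored pattern it is left untouched, and only a bare NA is emptied.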
I assume (carefully) that .splitCsv might add NAs when it finds an empty column; is that correct, @pinin4fjords? Edit: No it doesn't, at least not in a little test I ran. Why might NAs sneak in? 🤔
Thanks for the bug report! This was definitely done in response to a bug, possibly people using NA in the input contrasts files to indicate missing values. So I don't want to remove this entirely. @BEFH - could you PR your regex fix please, since it's so concise? Please add a changelog entry in the same style as the others there when you do so. We should also document the special meaning of 'NA'.
Hey, checking for and dealing with NA entries in contrasts.csv seems reasonable. But replacing every match of 'NA' with the empty string will destroy meaningful entries like 'RNA_CONCENTRATION' or 'ANALYSIS_OUTCOME' and so on. Would it be possible to replace 'NA' with the empty string only if the complete entry is 'NA' (in Perl, ^NA$) and otherwise leave NAs alone?
Yes, that's the fix proposed by @BEFH
Closed via #344
Description of the bug
When samplesheet.csv contains a variable (column header) named RNA_extraction_date and I use it as a blocking variable in contrasts.csv, the RNA_extraction_date from contrasts.csv is recognized as R_extraction_date, and I get an error that R_extraction_date is not in samplesheet.csv.
To me this looks like an issue with the 'NA' in RNA_extraction_date. NA is also the default in R for specifying 'Not Available', and this might cause the 'NA' in RNA_extraction_date to be replaced by the empty string, leading to R_extraction_date. When I change RNA_extraction_date to Rna_extraction_date in samplesheet.csv and contrasts.csv, it seems to work.
Interestingly, the process NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:VALIDATOR (samplesheet.csv) finishes successfully, but it looks as if this validator checks only samplesheet.csv and not contrasts.csv. The error occurs in process NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:DESEQ2_DIFFERENTIAL, which runs after the validator has finished successfully.
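A check along the following lines could catch such a mismatch before the DESeq2 step runs. This is only a sketch under assumptions: the class and method names are hypothetical, the column names other than RNA_extraction_date are made up for illustration, and it does not reflect how the pipeline's validator actually works.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class BlockingCheck {
    // Returns the blocking variable names that are not samplesheet columns.
    static List<String> missingBlockingVariables(Set<String> samplesheetColumns,
                                                 List<String> blockingVariables) {
        List<String> missing = new ArrayList<>();
        for (String name : blockingVariables) {
            // Assumption: an empty entry means "no blocking variable".
            if (!name.isEmpty() && !samplesheetColumns.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical samplesheet header.
        Set<String> columns = Set.of("sample", "condition", "RNA_extraction_date");
        // Name as written in contrasts.csv.
        System.out.println(missingBlockingVariables(columns, List.of("RNA_extraction_date")));
        // Name after the unanchored NA replacement has mangled it.
        System.out.println(missingBlockingVariables(columns, List.of("R_extraction_date")));
    }
}
```

Run against the mangled name, the check reports R_extraction_date as missing, so the problem would surface at validation time rather than inside DESEQ2_DIFFERENTIAL.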
Command used and terminal output
Relevant files
No response
System information