[TheiaProk wfs] upgrade StxTyper version and OPERON outputs #750
This PR closes #694
🗑️ This dev branch should be deleted after merging to main.
🧠 Summary
This PR updates the StxTyper WDL task across all TheiaProk workflows by upgrading the Docker image to version 1.0.40 and updating the parsing logic to include two new OPERON output types: EXTENDED and AMBIGUOUS. Additionally, the no-hits branch has been updated to create placeholder files for these new outputs, ensuring the workflow does not fail when no hits are found.
Documentation updated to include new outputs
see new release notes here: https://github.com/ncbi/stxtyper/releases/tag/v1.0.40
⚡ Impacted Workflows/Tasks
TheiaProk workflows utilizing the StxTyper task within
merlin_magic.wdl
stxtyper.wdl
This PR may lead to different results in pre-existing outputs: Yes
This PR uses an element that could cause duplicate runs to have different results: No
🛠️ Changes
⚙️ Algorithm
Updated only the parsing logic for the new OPERON types and the conditional logic that handles outputs when there are no hits or when the output is aggregated.
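For reviewers who want a feel for the parsing change, here is a minimal sketch of grouping stx types by OPERON status. It is not the task's actual code. The column names `stx_type` and `operon`, the helper names, and the example rows are all assumptions made for illustration and should be checked against a real StxTyper report.

```python
import csv
import io
from collections import defaultdict

# Assumed column names; check them against the header of a real StxTyper report.
TYPE_COL = "stx_type"
OPERON_COL = "operon"


def group_types_by_operon(report_text):
    """Map each operon status (e.g. EXTENDED) to the sorted stx types reported with it."""
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t")
    grouped = defaultdict(set)
    for row in reader:
        grouped[row[OPERON_COL]].add(row[TYPE_COL])
    return {status: sorted(types) for status, types in grouped.items()}


def summarize(grouped, status):
    """Return a comma-joined string for one status, or "None" when it has no hits."""
    return ", ".join(grouped.get(status, [])) or "None"


# Made-up report rows, for illustration only.
example = (
    "target_contig\tstx_type\toperon\n"
    "contig_1\tstx2a\tCOMPLETE\n"
    "contig_2\tstx1\tEXTENDED\n"
    "contig_3\tstx2\tAMBIGUOUS\n"
)
grouped = group_types_by_operon(example)
print(summarize(grouped, "EXTENDED"))
print(summarize(grouped, "AMBIGUOUS"))
print(summarize(grouped, "PARTIAL"))
```

A status with no matching rows falls back to the string "None", which mirrors the placeholder behavior described for the no-hits case.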
➡️ Inputs
NA
⬅️ Outputs
Added two new outputs:
🧪 Testing
Tested the updated task using a local test assembly through miniwdl to verify that:
- The Docker image is correctly updated and outputs the correct version (1.0.40).
- The script correctly parses hits when present.
- In cases where no hits are found, all output files (including the new ones) are created with placeholder content ("None") to prevent delocalization failures.
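The placeholder behavior in the last point can be sketched as follows. This is an illustration under stated assumptions, not the task's implementation: the file names and the `write_outputs` helper are hypothetical, and the real task defines its own output names.

```python
import tempfile
from pathlib import Path

# Hypothetical output file names; the real task defines its own.
OUTPUT_FILES = [
    "stxtyper_extended_operons.txt",
    "stxtyper_ambiguous_operons.txt",
]


def write_outputs(out_dir, values):
    """Write one file per expected output, using "None" when a value is missing or empty."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name in OUTPUT_FILES:
        path = out_dir / name
        path.write_text((values.get(name) or "None") + "\n")
        written.append(path)
    return written


# No-hits case: nothing was parsed, yet every expected file still exists.
with tempfile.TemporaryDirectory() as tmp:
    paths = write_outputs(tmp, {})
    contents = [p.read_text().strip() for p in paths]
print(contents)
```

Writing every expected file unconditionally means the workflow engine always finds the declared outputs, whether or not any hits were reported.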
ILMN PE with E. coli samples known to be stx positive
Repeated the submissions from Curtis's testing on a previous PR that updated StxTyper. The workflows ran successfully, and StxTyper outputs matched expectations for all but one of the 161 samples compared against the last validation run here.
I re-ran the same sample, M22F001452, and confirmed the same result. I also ran the other TheiaProk workflows and received results identical to Curtis's last validation runs.
ONT on some random Shigella that are likely stx negative
FASTA with same E. coli dataset that are stx positive
TheiaProk_Illumina_SE
Suggested Scenarios for Reviewer to Test
Test any of the TheiaProk workflows with E. coli and Shigella spp. samples.
Run the task on assemblies with no StxTyper hits to verify that all required outputs are still created.
🔬 Final Developer Checklist
workflows_overview
tables to be the tag for the next upcoming release. If you do not know the tag, please put "vX.X.X"
🎯 Reviewer Checklist