Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixup config syntax #146

Merged
merged 3 commits into from
Mar 6, 2025
Merged

Fixup config syntax #146

merged 3 commits into from
Mar 6, 2025

Conversation

jameshadfield
Copy link
Member

Little improvements identified in post-merge review of #104

Copy link
Contributor

@joverlee521 joverlee521 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for following up on post-merge review!

rules/config.smk Outdated
the subtype query value does not vary across segment or time wildcards.
"""
def resolve(wildcards):
return resolve_config_value('subtype_query')({"subtype": wildcards.subtype, 'segment': '*', 'time': '*'})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know that this enforces that the subtype_query must be <subtype>/*/*.
When I tried modifying the config to

subtype_query:
    "h5n1-cattle-outbreak/ha/*": "genoflu in 'B3.13'"

I get the following error:

$ nextstrain build . -s genome-focused/Snakefile -n
Building DAG of jobs...
InputFunctionException in rule filter_sequences_by_subtype in file /nextstrain/build/genome-focused/../rules/main.smk, line 179:
Error:
  InvalidConfigError: Config structure incorrect or incomplete for config["subtype_query"]
	The dictionary is missing a matching key for the current target of 'h5n1-cattle-outbreak/*/*', or a fuzzy match (i.e. using "*" placeholders)
	P.S. If you want to use a single value across all builds then set a scalar value (number, string, boolean)
Wildcards:
  subtype=h5n1-cattle-outbreak
  segment=pb2
Traceback:
  File "/nextstrain/build/genome-focused/../rules/config.smk", line 261, in resolve
  File "/nextstrain/build/genome-focused/../rules/config.smk", line 147, in resolve

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relatedly, I want (wanted?) to change the workflow for the genome builds such that (conceptually) each segment had a different subtype query, kind of like you've tried here, but it required too much snakemake surgery so I kicked it down the road. This would be useful for us because, e.g., the B3.13 constellation is PB2:am2.2, PB1:am4, ... so we could build the avian-flu/h5n1-cattle-outbreak/pb2 tree filtering to am2.2 and the entire genome tree filtering to B3.13

Brings the config structure in line with all of the other config values,
but this necessitates a more complicated usage of
`resolve_config_value`. Hopefully the added comments are instructive to
people in the future!

This also greatly improves the error handling if we forget to add the
appropriate subtype to the config, or leave out 'subtype_query'
altogether.

Inspired by <#104 (comment)>
and <#146 (comment)>
@jameshadfield jameshadfield force-pushed the james/fixup-config-syntax branch from 43531bb to 7205647 Compare March 6, 2025 20:48
@jameshadfield jameshadfield merged commit 77074cf into master Mar 6, 2025
6 checks passed
@jameshadfield jameshadfield deleted the james/fixup-config-syntax branch March 6, 2025 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants