-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixup config syntax #146
Fixup config syntax #146
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for following up on post-merge review!
rules/config.smk
Outdated
the subtype query value does not vary across segment or time wildcards. | ||
""" | ||
def resolve(wildcards): | ||
return resolve_config_value('subtype_query')({"subtype": wildcards.subtype, 'segment': '*', 'time': '*'}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know that this enforces that the subtype_query
must be <subtype>/*/*
.
When I tried modifying the config to
subtype_query:
"h5n1-cattle-outbreak/ha/*": "genoflu in 'B3.13'"
I get the following error:
$ nextstrain build . -s genome-focused/Snakefile -n
Building DAG of jobs...
InputFunctionException in rule filter_sequences_by_subtype in file /nextstrain/build/genome-focused/../rules/main.smk, line 179:
Error:
InvalidConfigError: Config structure incorrect or incomplete for config["subtype_query"]
The dictionary is missing a matching key for the current target of 'h5n1-cattle-outbreak/*/*', or a fuzzy match (i.e. using "*" placeholders)
P.S. If you want to use a single value across all builds then set a scalar value (number, string, boolean)
Wildcards:
subtype=h5n1-cattle-outbreak
segment=pb2
Traceback:
File "/nextstrain/build/genome-focused/../rules/config.smk", line 261, in resolve
File "/nextstrain/build/genome-focused/../rules/config.smk", line 147, in resolve
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relatedly, I want (wanted?) to change the workflow for the genome builds such that (conceptually) each segment had a different subtype query, kind of like you've tried here, but it required too much snakemake surgery so I kicked it down the road. This would be useful for us because, e.g., the B3.13
constellation is PB2:am2.2, PB1:am4, ...
so we could build the avian-flu/h5n1-cattle-outbreak/pb2 tree filtering to am2.2
and the entire genome tree filtering to B3.13
Brings the config structure in line with all of the other config values, but this necessitates a more complicated usage of `resolve_config_value`. Hopefully the added comments are instructive to people in the future! This also greatly improves the error handling if we forget to add the appropriate subtype to the config, or leave out 'subtype_query' altogether. Inspired by <#104 (comment)> and <#146 (comment)>
43531bb
to
7205647
Compare
Little improvements identified in post-merge review of #104