Confirm regex is still valid for non-enum selected envo terms #305

Closed
mslarae13 opened this issue Jan 29, 2025 · 3 comments

@mslarae13
Contributor

The Env Triad squad developed a small selection of terms for submitters to pick their env triad terms for soil, sediment, water and plant.

As part of enabling these enumerations in the submission portal we need to update the submission schema.
However, we should also ensure that the regular expression still works.
Submitters can discover ENVO terms outside of the provided enums, and this is important for our learning which other terms the enums should include.

Criteria for completion

  • Confirm and test that the regex is still valid in the submission schema & submission portal for all extensions and all environmental triad terms (broad, local, medium)
@turbomam
Member

turbomam commented Feb 3, 2025

Check for patterns in the environmental triad slot_usages.

Print out the any_ofs first to get oriented, then the patterns.

make squeaky-clean all test

yq '.classes | to_entries | map(select(.value.slot_usage.env_broad_scale)) | from_entries | .[].slot_usage.env_broad_scale.any_of' src/nmdc_submission_schema/schema/nmdc_submission_schema.yaml
null
null
null
null
null
null
null
- range: EnvBroadScalePlantAssociatedEnum
- range: string
- range: EnvBroadScaleSedimentEnum
- range: string
- range: EnvBroadScaleSoilEnum
- range: string
null
- range: EnvBroadScaleWaterEnum
- range: string

yq '.classes | to_entries | map(select(.value.slot_usage.env_broad_scale)) | from_entries | .[].slot_usage.env_broad_scale.pattern' src/nmdc_submission_schema/schema/nmdc_submission_schema.yaml
^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$
^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$
^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$
^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$
^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$
^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$
^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$
^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$
^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$
^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$
^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$
^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$
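
For anyone who prefers Python to yq, the same lookup can be sketched roughly as below. This is only an illustration: it assumes the schema YAML has already been parsed into a plain dict (with whatever YAML loader you use), and the class names in the sample dict are hypothetical placeholders, not real classes from the submission schema.

```python
# Pattern string as printed by the yq query above.
TRIAD_PATTERN = r"^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$"


def slot_usage_field(schema, slot_name, field):
    """Map class name -> slot_usage[slot_name][field] for every class
    that has a slot_usage entry for slot_name (None if the field is absent)."""
    found = {}
    for class_name, class_def in schema.get("classes", {}).items():
        usage = (class_def.get("slot_usage") or {}).get(slot_name)
        if usage:
            found[class_name] = usage.get(field)
    return found


# Hypothetical, hand-written stand-in for the parsed schema.
sample_schema = {
    "classes": {
        "ExampleSoilClass": {
            "slot_usage": {
                "env_broad_scale": {
                    "any_of": [
                        {"range": "EnvBroadScaleSoilEnum"},
                        {"range": "string"},
                    ],
                    "pattern": TRIAD_PATTERN,
                }
            }
        },
        "ExampleOtherClass": {
            "slot_usage": {"env_broad_scale": {"pattern": TRIAD_PATTERN}}
        },
        "ExampleNoTriadClass": {"slot_usage": {}},
    }
}

any_ofs = slot_usage_field(sample_schema, "env_broad_scale", "any_of")
patterns = slot_usage_field(sample_schema, "env_broad_scale", "pattern")

for name in any_ofs:
    print(name, any_ofs[name], patterns[name])
```

A class with a pattern but no any_of shows up as None in the first mapping, which corresponds to the null lines in the yq output.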

@turbomam
Member

turbomam commented Feb 3, 2025

The same pattern is asserted on all of the env_local_scale and env_medium slot_usages too

@mslarae13 can you say more about what makes the regular expression in the pattern valid or not valid? These patterns will accept all of these, even though only the first one is a valid pairing of an OBO Foundry label and ID. See https://regexr.com/8bn75

  • soil [ENVO:00001998]
  • banana [ENVO:00001998]
  • Jenny [ENVO:8675309]

It won't accept

  • soil
  • ENVO:00001998
  • soil [ENVO_00001998]
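
The accept/reject behavior listed above can be reproduced locally with a short script. This is a minimal sketch using Python's re module and the pattern string copied from the slot_usages; the portal's own regex engine may differ in edge cases.

```python
import re

# Pattern copied from the env triad slot_usages.
ENV_TRIAD_PATTERN = re.compile(r"^\S+.*\S+ \[[a-zA-Z]{2,}:\d+\]$")

candidates = [
    "soil [ENVO:00001998]",
    "banana [ENVO:00001998]",
    "Jenny [ENVO:8675309]",
    "soil",
    "ENVO:00001998",
    "soil [ENVO_00001998]",
]

# True means the pattern accepts the string; it says nothing about
# whether the label and the ID actually belong together.
results = {text: bool(ENV_TRIAD_PATTERN.match(text)) for text in candidates}

for text, accepted in results.items():
    print(f"{'accept' if accepted else 'reject'}: {text}")
```

The pattern only checks the shape "label [PREFIX:digits]", which is why the mismatched and made-up pairings pass.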

@mslarae13
Contributor Author

That's correct. We currently have no additional validation for the label/ID pairing, and that is how it will stay until the ontology squad we're kicking off can provide that validation, or at least that check.
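
For illustration only, a pairing check of the kind discussed here might look roughly like the sketch below. Nothing like this exists in the schema or portal today. The KNOWN_LABELS table and the check_pairing function are hypothetical; a real check would look labels up in the ontology itself rather than in a hard-coded dict. The one entry shown comes from the example in this thread.

```python
import re

# Same pattern as the slot_usages, with named groups added for the
# label and the ID (the groups are an addition for this sketch).
PAIR_PATTERN = re.compile(
    r"^(?P<label>\S+.*\S+) \[(?P<curie>[a-zA-Z]{2,}:\d+)\]$"
)

# Hypothetical lookup table standing in for a real ontology lookup.
KNOWN_LABELS = {"ENVO:00001998": "soil"}


def check_pairing(value, known_labels=KNOWN_LABELS):
    """Classify a 'label [PREFIX:digits]' string (hypothetical helper)."""
    match = PAIR_PATTERN.match(value)
    if match is None:
        return "bad syntax"
    expected = known_labels.get(match.group("curie"))
    if expected is None:
        return "unknown id"
    if expected != match.group("label"):
        return "label mismatch"
    return "ok"


for value in ("soil [ENVO:00001998]", "banana [ENVO:00001998]",
              "Jenny [ENVO:8675309]", "soil"):
    print(value, "->", check_pairing(value))
```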

@turbomam turbomam closed this as completed Feb 4, 2025