UO term duplication in PSI-MS-CV effects minor (or major?) issue with semantic validation #245

Open
mwalzer opened this issue Nov 28, 2024 · 3 comments
@mwalzer
Collaborator

mwalzer commented Nov 28, 2024

If an .mzQC file uses unit terms and lists UO as one of its controlledVocabularies, e.g.

    "controlledVocabularies": [
      {
        "name": "Proteomics Standards Initiative Mass Spectrometry Ontology",
        "uri": "https://github.com/HUPO-PSI/psi-ms-CV/releases/download/v4.1.186/psi-ms.obo",
        "version": "4.1.186"
      },
      {
        "name": "Unit Ontology",
        "uri": "https://raw.githubusercontent.com/bio-ontology-research-group/unit-ontology/v2023-05-23/unit-ontology.obo",
        "version": "v2023-05-23"
      }
    ]

then the offline validator reports the unit terms as ambiguous, because it finds them in both vocabularies:

    vscode@dab2d5d09534:/workspaces/pymzqc$ python mzqcaccessories/offlinevalidator/mzqc_offline_validator.py tests/examples/individual-runs.mzQC
    {
        "input files": [],
        "label uniqueness": [],
        "metric use": [],
        "ontology load errors": [],
        "ontology term errors": [
            "Ambiguous CVTerms of severity 6 and message: term found in multiple vocabularies = Term('UO:0000189', name='count unit'),Term('UO:0000189', name='count unit')",
            "Ambiguous CVTerms of severity 6 and message: term found in multiple vocabularies = Term('UO:0000189', name='count unit'),Term('UO:0000189', name='count unit')",
            "Ambiguous CVTerms of severity 6 and message: term found in multiple vocabularies = Term('UO:0000189', name='count unit'),Term('UO:0000189', name='count unit')",
            "Ambiguous CVTerms of severity 6 and message: term found in multiple vocabularies = Term('UO:0000010', name='second'),Term('UO:0000010', name='second')"
        ],
        "schema validation": "success"
    }
@mwalzer
Collaborator Author

mwalzer commented Nov 28, 2024

In general, if a file uses vocabularies with overlapping terms (same name and id), I'd consider this a major issue, but for the unit field of our schema, not so much.
Still, I would not want to hardcode an exception for these duplications in the validator.
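
One way to avoid a hardcoded exception list might be a general rule: a term found in several vocabularies is only a real conflict if its definitions differ. The following is a minimal sketch of that idea, not what the pymzqc validator does. The function name `find_term_conflicts`, the plain-dict representation of a vocabulary, and the `EX:0000001` accession are all hypothetical and only serve the illustration.

```python
from collections import defaultdict


def find_term_conflicts(vocabularies):
    """Split accessions found in more than one vocabulary into two groups.

    `vocabularies` maps a CV label to a dict of accession -> term name.
    This plain-dict layout is an assumption made for the sketch.
    """
    seen = defaultdict(dict)  # accession -> {cv label: term name}
    for cv_label, terms in vocabularies.items():
        for accession, name in terms.items():
            seen[accession][cv_label] = name

    replicated, conflicting = {}, {}
    for accession, by_cv in seen.items():
        if len(by_cv) < 2:
            continue  # only one vocabulary defines it, nothing to report
        if len(set(by_cv.values())) == 1:
            # Same id and same name everywhere: a harmless replica.
            replicated[accession] = sorted(by_cv)
        else:
            # Same id but different names: genuinely ambiguous.
            conflicting[accession] = dict(by_cv)
    return replicated, conflicting


# The two UO terms are taken from the validator output above;
# EX:0000001 is a made-up accession that shows a real conflict.
cvs = {
    "MS": {
        "UO:0000189": "count unit",
        "UO:0000010": "second",
        "EX:0000001": "hypothetical term",
    },
    "UO": {
        "UO:0000189": "count unit",
        "UO:0000010": "second",
        "EX:0000001": "a different definition",
    },
}

replicated, conflicting = find_term_conflicts(cvs)
print("replicated:", sorted(replicated))
print("conflicting:", sorted(conflicting))
```

With such a rule, replicated terms could be reported at a lower severity than conflicting ones, although comparing names alone is a simplification and a real check would probably compare more of the term definition.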

@mwalzer
Collaborator Author

mwalzer commented Nov 28, 2024

Omitting UO from controlledVocabularies yields validation success, of course.

@bittremieux
Collaborator

We should document on the website that UO terms are replicated in the PSI-MS CV and that those should be preferred if available to avoid the issue with duplicate definitions.
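
Until that is documented, a writer-side check along the following lines could apply the recommendation automatically. This is only a sketch under several assumptions: the helper names are hypothetical, the document layout (`mzQC` > `runQualities`/`setQualities` > `qualityMetrics` > `unit`) is assumed, and `psi_ms_ids` is expected to be a set of accessions that the caller has already extracted from the PSI-MS CV release in use.

```python
import copy


def unit_accessions(mzqc):
    """Collect the accessions of all unit terms used in an mzQC document (as a dict)."""
    body = mzqc.get("mzQC", {})
    found = set()
    for key in ("runQualities", "setQualities"):
        for quality in body.get(key, []):
            for metric in quality.get("qualityMetrics", []):
                unit = metric.get("unit")
                if unit is None:
                    continue
                # Assumption: a unit is a single term object or a list of them.
                units = unit if isinstance(unit, list) else [unit]
                found.update(u["accession"] for u in units if "accession" in u)
    return found


def without_redundant_uo(mzqc, psi_ms_ids, uo_name="Unit Ontology"):
    """Return a copy without the UO entry if PSI-MS already covers every UO unit used.

    If some UO term in use is not replicated in PSI-MS, the input is returned unchanged.
    """
    used_uo = {acc for acc in unit_accessions(mzqc) if acc.startswith("UO:")}
    if not used_uo <= set(psi_ms_ids):
        return mzqc
    pruned = copy.deepcopy(mzqc)
    body = pruned.get("mzQC", {})
    body["controlledVocabularies"] = [
        cv for cv in body.get("controlledVocabularies", []) if cv.get("name") != uo_name
    ]
    return pruned


# Minimal illustrative document; the metric entry is a placeholder, not a real metric.
example = {
    "mzQC": {
        "controlledVocabularies": [
            {"name": "Proteomics Standards Initiative Mass Spectrometry Ontology"},
            {"name": "Unit Ontology"},
        ],
        "runQualities": [
            {
                "qualityMetrics": [
                    {
                        "name": "placeholder metric",
                        "value": 42,
                        "unit": {"accession": "UO:0000189", "name": "count unit"},
                    }
                ]
            }
        ],
    }
}

pruned = without_redundant_uo(example, psi_ms_ids={"UO:0000189", "UO:0000010"})
print([cv["name"] for cv in pruned["mzQC"]["controlledVocabularies"]])
```

The input document is left untouched, so the caller can still write the original if a UO term turns out not to be replicated in the PSI-MS CV.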
