Merge pull request #10 from HUPO-PSI/add-sphinx-docs
Extend documentation
RalfG authored Nov 15, 2024
2 parents bfdb59d + b943138 commit 237bc25
Showing 18 changed files with 415 additions and 30 deletions.
5 changes: 5 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
.python-version
.venv/
.github/
docs/_build/
specification/annotation-schema.md
55 changes: 55 additions & 0 deletions CONTRIBUTING.rst
@@ -0,0 +1,55 @@
############
Contributing
############

This document briefly describes how to contribute to
`mzPAF <https://github.com/hupo-psi/mzPAF>`_.



Before you begin
################

If you have an idea for a feature, a use case to add, or an approach for a bugfix,
you are welcome to share it with the community by opening a
thread in `GitHub Issues <https://github.com/hupo-psi/mzPAF/issues>`_.



Documentation local setup
#########################

To work on the documentation and get a live preview, install the requirements
and run ``sphinx-autobuild``:

.. code-block:: sh

    pip install -r ./docs/requirements.txt
    sphinx-autobuild ./docs/ ./docs/_build/

Then browse to http://localhost:8000 to watch the live preview.



How to contribute
#################

- Fork `mzPAF <https://github.com/hupo-psi/mzPAF>`_ on GitHub to
  make your changes.
- Commit and push your changes to your
  `fork <https://help.github.com/articles/pushing-to-a-remote/>`_.
- Ensure that the tests and documentation (both Python docstrings and files in
  ``/docs/``) have been updated according to your changes. Python
  docstrings are formatted in the
  `numpydoc style <https://numpydoc.readthedocs.io/en/latest/format.html>`_.
- Open a
  `pull request <https://help.github.com/articles/creating-a-pull-request/>`_
  with these changes. Your pull request message should ideally include:

  - A description of why the changes should be made.
  - A description of the implementation of the changes.
  - A description of how to test the changes.

- The pull request should pass all the continuous integration tests, which are
  automatically run by
  `GitHub Actions <https://github.com/hupo-psi/mzPAF/actions>`_.
116 changes: 108 additions & 8 deletions README.md
@@ -1,18 +1,118 @@
# mzPAF Peak Annotation Format

The mzPAF proposed standard is a specification for a fragment ion peak annotation format for mass spectra, focused on peptides. This provides for a standardized format for describing the origin of fragment ions to be used in spectral libraries, other formats that aim to describe fragment ions, and software tools that annotate fragment ions.
## About

The main home page for mzPAF is at the PSI web site: [https://psidev.info/mzPAF](https://psidev.info/mzPAF)
mzPAF is a specification for a fragment ion peak annotation format for mass spectra, focused on
peptides. This provides for a standardized format for describing the origin of fragment ions to be
used in spectral libraries, other formats that aim to describe fragment ions, and software tools
that annotate fragment ions.

# Status
- Official mzPAF homepage: [psidev.info/mzPAF](https://psidev.info/mzPAF)
- mzPAF documentation: [mzpaf.readthedocs.io](https://mzpaf.readthedocs.io)

Updated: 2024-10-15
## Status

The specification has been resubmitted to the PSI Document Process and is undergoing final community review. It is anticipated to become a formal PSI standard near the end of 2024.
_Updated: 2024-10-15_

The specification has been resubmitted to the PSI Document Process and is undergoing final
community review. It is anticipated to become a formal PSI standard near the end of 2024.

# Available Materials
- The current DRAFT specification: [mzPAF_specification_v1.0-draft15.pdf](https://github.com/HUPO-PSI/mzPAF/blob/main/specification/mzPAF_specification_v1.0-draft15.pdf?raw=true)
- Example annotated spectra: [Examples](https://github.com/HUPO-PSI/mzPAF/tree/main/examples)
- The GitHub repo associated with mzPAF: [https://github.com/HUPO-PSI/mzPAF](https://github.com/HUPO-PSI/mzPAF)
- The GitHub repo assocated with the related mzSpecLib standard: [https://github.com/HUPO-PSI/mzSpecLib](https://github.com/HUPO-PSI/mzSpecLib)

## In short

- mzPAF is a single string of characters, case sensitive, without length limit
- Multiple possible explanations are comma-separated
- Deltas of observed – theoretical _m/z_ values are prefixed with a slash (`/`)
- Confidence values for annotations are prefixed with an asterisk (`*`)

The basic format of each annotation is:

```
annotation1/delta,annotation2/delta,...
```

or:

```
annotation1/delta*confidence,annotation2/delta*confidence,...
```

For example:

```
b2-H2O/3.2ppm,b4-H2O^2/3.2ppm
```

or:

```
b2-H2O/3.2ppm*0.75,b4-H2O^2/3.2ppm*0.25
```
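
As a rough illustration of the rules above, the following sketch splits an mzPAF string on the comma, slash, and asterisk separators using plain string operations. The function name `split_mzpaf` is hypothetical and not part of the mzPAF Python package. The sketch deliberately ignores separators that may appear inside braces or brackets, so it is not a substitute for the reference parser or the specification's regular expressions.

```python
def split_mzpaf(text):
    """Naively split an mzPAF string into (annotation, delta, confidence) tuples.

    Hypothetical helper for illustration only: it assumes that ',', '/' and '*'
    never occur inside braces or brackets within an annotation.
    """
    results = []
    for part in text.split(","):
        # Everything after '*' is the confidence, if present.
        body, _, confidence = part.partition("*")
        # Everything after '/' is the mass delta, if present.
        annotation, _, delta = body.partition("/")
        results.append(
            (
                annotation,
                delta or None,
                float(confidence) if confidence else None,
            )
        )
    return results


print(split_mzpaf("b2-H2O/3.2ppm*0.75,b4-H2O^2/3.2ppm*0.25"))
# [('b2-H2O', '3.2ppm', 0.75), ('b4-H2O^2', '3.2ppm', 0.25)]
```

For anything beyond simple cases, a full parser is likely the safer choice.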

mzPAF supports:

- Annotations of multiple analytes: `1@y12/0.13,2@b9-NH3/0.23`
- Mass deltas in ppm instead of _m/z_ unit: `y1/-1.4ppm`
- Confidence levels per annotation: `y1/-1.4ppm*0.75`
- Advanced ion notation: `[ion type](neutral loss)(isotope)(adduct type)(charge)`, e.g.: `y4-H2O+2i[M+H+Na]^2`:
  - Ion types:
    - Peptide ion series (a, b, c, x, y, z): `y4`
    - Unknown ions: `?`
    - Immonium ions: `IY`
    - Internal fragment ions: `m3:6`
    - Intact precursor ions: `p^2`
    - A set of reference ions: `r[TMT127N]`
    - Named compounds: `_{Urocanic Acid}`
    - Chemical formulas: `f{C16H22O}`
    - SMILES: `s{CN=C=O}[M+H]`
    - Embedded ProForma annotations: `0@b2{LC[Carbamidomethyl]}`
  - Neutral gains and losses: `y2+CO-H2O`
  - Isotopes: `y2+2i`
  - Adduct types: `y2[M+H]`
  - Charge states: `^2`
- Multiple peaks per annotation: `&y7/-0.001` and `y7/0.000*0.95`
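
Two of the features listed above, analyte references (`1@y12`) and mass deltas with an optional `ppm` suffix, can be sketched with a couple of small helpers. The names `split_analyte` and `parse_delta` are hypothetical, and the `"m/z"` unit label returned for deltas without a suffix is an assumption made for this example rather than a value taken from the specification.

```python
def split_analyte(annotation):
    """Split an optional 'N@' analyte reference from an annotation.

    Hypothetical helper: returns (analyte_number_or_None, remaining_annotation).
    """
    prefix, sep, rest = annotation.partition("@")
    if sep and prefix.isdigit():
        return int(prefix), rest
    return None, annotation


def parse_delta(delta):
    """Return (value, unit) for a mass delta such as '-1.4ppm' or '0.13'.

    The 'm/z' label for unit-less deltas is an assumption for illustration.
    """
    if delta.endswith("ppm"):
        return float(delta[:-3]), "ppm"
    return float(delta), "m/z"


print(split_analyte("1@y12"))   # (1, 'y12')
print(split_analyte("y1"))      # (None, 'y1')
print(parse_delta("-1.4ppm"))   # (-1.4, 'ppm')
print(parse_delta("0.23"))      # (0.23, 'm/z')
```

These helpers cover only the simple forms shown in the list; the multiple-peak marker (`&`) and the other notations are not handled here.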

Read the
[full DRAFT specification](https://github.com/HUPO-PSI/mzPAF/blob/main/specification/mzPAF_specification_v1.0-draft14.docx?raw=true)
for more details and examples.

## Getting started

### mzPAF in Python

The [mzPAF Python package](https://mzpaf.readthedocs.io/en/latest/implementations/python/) can
parse mzPAF strings into their components, convert to the JSON representation, or serialize back
to an mzPAF string.

```python
>>> import mzpaf
>>> annotations = mzpaf.parse_annotation("b2-H2O/3.2ppm*0.75,b4-H2O^2/3.2ppm*0.25")
>>> print(annotations[0].to_json())
{'neutral_losses': ['-H2O'], 'isotope': 0, 'adducts': [], 'charge': 1, 'analyte_reference': None, 'mass_error': {'value': 3.2, 'unit': 'ppm'}, 'confidence': 0.75, 'molecule_description': {'series_label': 'peptide', 'series': 'b', 'position': 2, 'sequence': None}}
>>> annotations[0].serialize()
'b2-H2O/3.2ppm*0.75'
```

Learn more at the
[package documentation](https://mzpaf.readthedocs.io/en/latest/implementations/python/).

### mzPAF regular expressions

The mzPAF specification includes regular expressions for parsing mzPAF strings. These can be used
in any programming language that supports regular expressions.

Learn more at the
[mzPAF regex documentation](https://mzpaf.readthedocs.io/en/latest/implementations/regex/).

### mzPAF Lark grammar

mzPAF has also been defined as a
[Lark grammar](https://mzpaf.readthedocs.io/en/latest/implementations/lark/).

### Links

- The mzPAF GitHub repo: [github.com/HUPO-PSI/mzPAF](https://github.com/HUPO-PSI/mzPAF)
- The GitHub repo for the related mzSpecLib standard: [github.com/HUPO-PSI/mzSpecLib](https://github.com/HUPO-PSI/mzSpecLib)
- HUPO-PSI homepage: [psidev.info](https://www.psidev.info/)
1 change: 1 addition & 0 deletions docs/.readthedocs.yaml
@@ -19,3 +19,4 @@ python:
      path: implementations/python
      extra_requirements:
        - docs
    - requirements: docs/requirements.txt
51 changes: 50 additions & 1 deletion docs/conf.py
@@ -1,5 +1,53 @@
"""Configuration file for the Sphinx documentation builder."""

# Scripts
import json
import shutil
from pathlib import Path

import jsonschema2md
import pandas as pd


def get_jsonschema_docs(input_json, output_markdown):
    """Generate markdown documentation from a JSON schema."""
    parser = jsonschema2md.Parser()
    with open(input_json, encoding="utf-8") as f_in:
        output_md = parser.parse_schema(json.load(f_in))

    with open(output_markdown, "w", encoding="utf-8") as f_out:
        f_out.writelines(output_md)


def get_reference_molecules_md(input_json, output_markdown):
    """Generate a markdown table of reference molecules."""
    df = pd.read_json(input_json).T
    buf = df.to_markdown().replace(' nan ', ' ')
    with open(output_markdown, 'wt') as fh:
        fh.write(buf)


get_jsonschema_docs(
    "../specification/annotation-schema.json",
    "../specification/annotation-schema.md"
)
get_jsonschema_docs(
    "../specification/reference_data/reference_molecule_schema.json",
    "../specification/reference_data/reference_molecule_schema.md"
)

get_reference_molecules_md(
    "../specification/reference_data/reference_molecules.json",
    "../specification/reference_data/reference_molecules.md"
)

if not Path("_static/img/lark-railroad-diagram.svg").exists():
    shutil.copy(
        "../specification/grammars/schema_images/Annotation.svg",
        "_static/img/lark-railroad-diagram.svg"
    )


# Project information
project = "mzPAF"
author = "HUPO-PSI"
@@ -16,7 +64,7 @@
"sphinx_click.ext",
"myst_parser",
]
source_suffix = [".rst"]
source_suffix = [".rst", ".md"]
master_doc = "index"
exclude_patterns = ["_build"]

@@ -46,6 +94,7 @@
"python": ("https://docs.python.org/3", None),
"psims": ("https://mobiusklein.github.io/psims/docs/build/html/", None),
"pyteomics": ("https://pyteomics.readthedocs.io/en/stable/", None),
"mzspeclib": ("https://mzspeclib.readthedocs.io/en/latest/", None),
}


1 change: 1 addition & 0 deletions docs/contributing.rst
@@ -0,0 +1 @@
.. include:: ../CONTRIBUTING.rst
36 changes: 36 additions & 0 deletions docs/implementations/json/index.rst
@@ -0,0 +1,36 @@
###########
JSON Schema
###########

About
=====

An mzPAF annotation can also be expressed as a JSON object instead of a single string.
This format is better suited to inter-program communication, especially through web
APIs. You can find the JSON schema for mzPAF on GitHub via the following link:

https://raw.githubusercontent.com/HUPO-PSI/mzPAF/main/specification/annotation-schema.json

Replace ``main`` in the URL with the desired version tag to access the schema for a particular
version.

Examples
========

.. literalinclude:: ../../../specification/annotation-example-1.json
   :language: json

.. literalinclude:: ../../../specification/annotation-example-2.json
   :language: json

.. literalinclude:: ../../../specification/annotation-example-3.json
   :language: json



Full schema documentation
=========================

.. include:: ../../../specification/annotation-schema.md
   :parser: myst_parser.sphinx_
   :start-line: 4
17 changes: 17 additions & 0 deletions docs/implementations/lark/index.rst
@@ -0,0 +1,17 @@
############
Lark grammar
############


About
=====

[todo]


Railroad diagram
================

.. figure:: ../../_static/img/lark-railroad-diagram.svg
   :alt: Lark grammar

2 changes: 2 additions & 0 deletions docs/implementations/python/api.rst
@@ -7,6 +7,8 @@ Python API
:imported-members:


.. manually documented as parse_annotation is undocumented
.. autofunction:: parse_annotation

   Parse a string into one or more :class:`IonAnnotationBase` instances.
11 changes: 10 additions & 1 deletion docs/implementations/python/index.rst
@@ -2,10 +2,19 @@
Python implementation
#####################

About
=====

.. include:: ../../../implementations/python/README.md
   :parser: myst_parser.sphinx_


Full API documentation
======================

.. toctree::
   :caption: Contents
   :maxdepth: 2
   :glob:

   *

25 changes: 25 additions & 0 deletions docs/implementations/regex/index.rst
@@ -0,0 +1,25 @@
###################
Regular expressions
###################

mzPAF has been defined in several regular expression dialects.

.. tip::

   Regex101.com is a great tool to test regular expressions. Try out the mzPAF regex there:
   `regex101.com/r/gDPlJu/1 <https://regex101.com/r/gDPlJu/1>`_.

Python
======

.. literalinclude:: ../../../specification/grammars/regex_sre.py
   :language: python
   :linenos:


JavaScript ECMA
===============

.. literalinclude:: ../../../specification/grammars/regex_ecma.js
   :language: javascript
   :linenos:
6 changes: 3 additions & 3 deletions docs/index.rst
@@ -1,13 +1,13 @@
.. include:: ../README.md
   :parser: myst_parser.sphinx_


.. toctree::
   :caption: About
   :hidden:
   :includehidden:
   :glob:

   Home <self>
   implementations/index
   specification/index
   Specification <specification/index>
   Implementations <implementations/index>
   Contributing <contributing>