The following monosaccharides are indexed by GNOme:
- Hex
- HexNAc
- dHex
- NeuAc
- NeuGc
- Pen
- Fuc
The following "free" substituents are also included:
- S - sulfate
- P - phosphate
A glycan composition can be mapped to an entry in GNOme only if it is composed solely of these components. Glycans containing other monosaccharides may be included, but those monosaccharides will be encoded as "Xxx".
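The component rule above can be sketched as a small normalization step. This is only an illustration under stated assumptions: the function name `normalize_composition` and the dictionary-of-counts input format are hypothetical, not part of GNOme, and the sketch treats every unlisted symbol as "Xxx".

```python
from collections import Counter

# Component symbols taken from the lists above.
MONOSACCHARIDES = {"Hex", "HexNAc", "dHex", "NeuAc", "NeuGc", "Pen", "Fuc"}
SUBSTITUENTS = {"S", "P"}
INDEXED = MONOSACCHARIDES | SUBSTITUENTS


def normalize_composition(composition):
    """Collapse unlisted components to "Xxx" (hypothetical helper).

    `composition` maps a component symbol to its count.
    Returns (normalized counts, True if every component is indexed).
    """
    normalized = Counter()
    for symbol, count in composition.items():
        # Assumption: any symbol outside the indexed set becomes "Xxx".
        key = symbol if symbol in INDEXED else "Xxx"
        normalized[key] += count
    return dict(normalized), "Xxx" not in normalized
```

A composition for which the second return value is `False` contains at least one component outside the indexed set.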
- Construct a directed graph where all terms are nodes.
- For each term p, find the set C of all terms that have an `is_a` relationship with p, and create an edge from p to c for each c in C.
- For each child term `c` of `GNO:00000001`, if the `name` of `c` matches the pattern `/molecular weight (\d+(\.\d+)?) Da/`, then add `c` to a list of `(term, mass)` pairs `I`.
- Sort `I` by `mass`.
- For query glycan `G`, compute the neutral monoisotopic mass of `G`. Search the index `I` from Step 2 for the mass of `G`, finding the nearest available term within 0.1 Da.
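The steps above can be sketched in Python as follows. This is a minimal illustration, not the GNOme implementation: the in-memory `terms` layout (a dict mapping each term id to a `name` and a list of `is_a` parent ids) and the function names are assumptions, and the query mass is taken as an input rather than computed from the glycan.

```python
import bisect
import re

MASS_PATTERN = re.compile(r"molecular weight (\d+(\.\d+)?) Da")


def build_children(terms):
    """Build parent -> children edges from each term's `is_a` parents."""
    children = {term_id: [] for term_id in terms}
    for term_id, term in terms.items():
        for parent_id in term.get("is_a", []):
            children.setdefault(parent_id, []).append(term_id)
    return children


def build_mass_index(terms, children, root="GNO:00000001"):
    """Collect (term, mass) pairs for direct children of `root`, sorted by mass."""
    index = []
    for child_id in children.get(root, []):
        match = MASS_PATTERN.search(terms[child_id]["name"])
        if match:
            index.append((child_id, float(match.group(1))))
    index.sort(key=lambda pair: pair[1])
    return index


def find_by_mass(index, mass, tolerance=0.1):
    """Return the term whose mass is nearest to `mass`, or None if outside tolerance."""
    masses = [m for _, m in index]
    pos = bisect.bisect_left(masses, mass)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = index[max(pos - 1, 0):pos + 1]
    best = min(candidates, key=lambda pair: abs(pair[1] - mass), default=None)
    if best is None or abs(best[1] - mass) > tolerance:
        return None
    return best[0]
```

Because the index is sorted by mass, a binary search is enough to find the candidate neighbours; rebuilding the `masses` list on every call keeps the sketch short, and a real implementation would likely cache it.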