Our current pipeline identifies species occurrences by using the WoRMS taxonomy as the entity set for NER analysis of the student papers. GBIF offers a tool to check species names (https://www.gbif.org/tools/species-lookup). Could we use something like this to improve our accuracy in identifying species in our text?
A first step might be to compare the initial output of our NER process, a list of genus and species names, with the species that exist in GBIF. Is everything we found via WoRMS in GBIF? If so, is there a different pipeline for pulling species names from papers via GBIF that would be more accurate? Consider scenarios where the name is misspelled or the OCR botched a few letters. GBIF offers fuzzy string matching for cases like these.
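As a starting point for that comparison, here is a minimal sketch that checks a single NER-extracted name against GBIF's public species match endpoint (`https://api.gbif.org/v1/species/match`), which is the API counterpart to the lookup tool linked above. The function names and the confidence threshold of 90 are assumptions chosen for illustration, not part of our pipeline, and the response fields used here (`matchType`, `confidence`, `canonicalName`) should be checked against the current GBIF API documentation before we rely on them.

```python
import json
import urllib.parse
import urllib.request

GBIF_MATCH_URL = "https://api.gbif.org/v1/species/match"


def build_match_url(name, strict=False):
    """Build the query URL for one name; strict=False permits fuzzy matches."""
    query = urllib.parse.urlencode({"name": name, "strict": str(strict).lower()})
    return f"{GBIF_MATCH_URL}?{query}"


def fetch_match(name, timeout=10):
    """Call GBIF and return the parsed JSON response (requires network access)."""
    with urllib.request.urlopen(build_match_url(name), timeout=timeout) as resp:
        return json.load(resp)


def classify_match(name, match, min_confidence=90):
    """Sort a GBIF response into confirmed / corrected / review.

    min_confidence is a hypothetical cutoff; tune it on a hand-checked sample.
    """
    match_type = match.get("matchType", "NONE")
    confidence = match.get("confidence", 0)
    if match_type == "EXACT":
        status = "confirmed"
    elif match_type == "FUZZY" and confidence >= min_confidence:
        # Likely a misspelling or OCR error that GBIF could resolve.
        status = "corrected"
    else:
        # No match, a match only at a higher rank, or a low-confidence match.
        status = "review"
    return {
        "input": name,
        "status": status,
        "gbif_name": match.get("canonicalName"),
        "match_type": match_type,
        "confidence": confidence,
    }
```

Running `classify_match(name, fetch_match(name))` over the NER output would give a per-name table showing which WoRMS-derived names GBIF confirms, which it corrects, and which need manual review.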