From 5341b2adf6049db47961f28838e6774d5320f579 Mon Sep 17 00:00:00 2001 From: jenniferjiangkells Date: Tue, 23 Jul 2024 15:27:26 +0100 Subject: [PATCH] Deployed 3c63f74 with MkDocs version: 1.6.0 --- contributing/index.html | 2 +- search/search_index.json | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/contributing/index.html b/contributing/index.html index ecc7a9a..8530ad2 100644 --- a/contributing/index.html +++ b/contributing/index.html @@ -963,7 +963,7 @@

Contributing

-

Contribute to MiADE!

+

Contribute to MiADE! Contribution guide

diff --git a/search/search_index.json b/search/search_index.json index 18ec043..7d32a97 100644 --- a/search/search_index.json +++ b/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Welcome to the MiADE Documentation","text":"

MiADE (Medical information AI Data Extractor) is a set of tools for extracting formattable data from clinical notes stored in electronic health record systems (EHRs). It is built with CogStack's MedCAT package.

"},{"location":"#installing","title":"Installing","text":"

To install MiADE, first download the spaCy base model and the Med7 model:

pip install https://huggingface.co/kormilitzin/en_core_med7_lg/resolve/main/en_core_med7_lg-any-py3-none-any.whl\npython -m spacy download en_core_web_md\n
Then, install MiADE:

pip install miade\n
"},{"location":"#license","title":"License","text":"

MiADE is licensed under Elastic License 2.0.

The Elastic License 2.0 is a flexible license that allows you to use, copy, distribute, make available, and prepare derivative works of the software, as long as you do not provide the software to others as a managed service or include it in a free software directory. For the full license text, see our license page.

"},{"location":"contributing/","title":"Contributing","text":"

Contribute to MiADE!

"},{"location":"about/overview/","title":"Project Overview","text":""},{"location":"about/overview/#background","title":"Background","text":"

Data about people\u2019s health stored in electronic health records (EHRs) can play an important role in improving the quality of patient care. Much of the information in EHRs is recorded in ordinary language without any restriction on format ('free text'), as this is the natural way in which people communicate. However, if this information were stored in a standardised, structured format, computers would also be able to process it to help clinicians find and interpret information for better and safer decision making. This would enable EHR systems such as Epic, the system in place at UCLH since April 2019, to support clinical decision making. For instance, the system may be able to ensure that a patient is not prescribed medicine that would give them an allergic reaction.

"},{"location":"about/overview/#the-challenge","title":"The challenge","text":"

Free text may contain words and abbreviations which may be interpreted in more than one way, such as 'HR', which can mean 'Hour' or 'Heart Rate'. Free text may also contain negations; for example, a diagnosis may be mentioned in the text but the rest of the sentence might say that it was ruled out. Although computers can be used to interpret free text, they cannot always get it right, so clinicians will always have to check the results to ensure patient safety. Expressing information in a structured way can avoid this problem, but has a big disadvantage - it can be time-consuming for clinicians to enter the information. This can mean that information is incomplete, or clinicians are so busy on the computer that they do not have time to listen to their patients.

"},{"location":"about/overview/#meeting-the-need","title":"Meeting the need","text":"

The aim of MiADE is to develop a system to support automatic conversion of the clinician\u2019s free text into a structured format. The clinician can check the structured data immediately, before making it a formal part of the patient\u2019s record. The system will record a patient\u2019s diagnoses, medications and allergies in a structured way, using NHS-endorsed clinical data standards (e.g. FHIR and SNOMED CT). It will use a technique called Natural Language Processing (NLP). NLP has been used by research teams to extract information from existing EHRs but has rarely been used to improve the way information is entered in the first place. Our NLP system will continuously learn and improve as more text is analysed and checked by clinicians.

We will first test the system in University College London Hospitals, where a new EHR system called Epic is in place. We will study how effective it is, and how clinicians and patients find it when it is used in consultations. Based on feedback, we will make improvements and install it for testing at a second site (Great Ormond Street Hospital). Our aim is for the system to be eventually rolled out to more hospitals and doctors\u2019 surgeries across the NHS.

"},{"location":"about/team/","title":"Team","text":"

The MiADE project is developed by a team of clinicians, developers, AI researchers, and data standard experts at University College London (UCL) and University College London Hospitals (UCLH), in collaboration with the CogStack team at King's College London (KCL).

"},{"location":"api-reference/annotator/","title":"Annotator","text":"

Bases: ABC

An abstract base class for annotators.

Annotators are responsible for processing medical notes and extracting relevant concepts from them.

Attributes:

Name Type Description cat CAT

The MedCAT instance used for concept extraction.

config AnnotatorConfig

The configuration for the annotator.

Source code in src/miade/annotators.py
class Annotator(ABC):\n    \"\"\"\n    An abstract base class for annotators.\n\n    Annotators are responsible for processing medical notes and extracting relevant concepts from them.\n\n    Attributes:\n        cat (CAT): The MedCAT instance used for concept extraction.\n        config (AnnotatorConfig): The configuration for the annotator.\n    \"\"\"\n\n    def __init__(self, cat: CAT, config: AnnotatorConfig = None):\n        self.cat = cat\n        self.config = config if config is not None else AnnotatorConfig()\n\n        if self.config.negation_detection == \"negex\":\n            self._add_negex_pipeline()\n\n        # TODO make paragraph processing params configurable\n        self.structured_prob_lists = {\n            ParagraphType.prob: Relevance.PRESENT,\n            ParagraphType.imp: Relevance.PRESENT,\n            ParagraphType.pmh: Relevance.HISTORIC,\n        }\n        self.structured_med_lists = {\n            ParagraphType.med: SubstanceCategory.TAKING,\n            ParagraphType.allergy: SubstanceCategory.ADVERSE_REACTION,\n        }\n        self.irrelevant_paragraphs = [ParagraphType.ddx, ParagraphType.exam, ParagraphType.plan]\n\n    def _add_negex_pipeline(self) -> None:\n        \"\"\"\n        Adds the negex pipeline to the MedCAT instance.\n        \"\"\"\n        self.cat.pipe.spacy_nlp.add_pipe(\"sentencizer\")\n        self.cat.pipe.spacy_nlp.enable_pipe(\"sentencizer\")\n        self.cat.pipe.spacy_nlp.add_pipe(\"negex\")\n\n    @property\n    @abstractmethod\n    def concept_types(self):\n        \"\"\"\n        Abstract property that should return a list of concept types supported by the annotator.\n        \"\"\"\n        pass\n\n    @property\n    @abstractmethod\n    def pipeline(self):\n        \"\"\"\n        Abstract property that should return a list of pipeline steps for the annotator.\n        \"\"\"\n        pass\n\n    @abstractmethod\n    def process_paragraphs(self):\n        \"\"\"\n        Abstract method that 
should implement the logic for processing paragraphs in a note.\n        \"\"\"\n        pass\n\n    @abstractmethod\n    def postprocess(self):\n        \"\"\"\n        Abstract method that should implement the logic for post-processing extracted concepts.\n        \"\"\"\n        pass\n\n    def run_pipeline(self, note: Note, record_concepts: List[Concept]) -> List[Concept]:\n        \"\"\"\n        Runs the annotation pipeline on a given note and returns the extracted concepts.\n\n        Args:\n            note (Note): The input note to process.\n            record_concepts (List[Concept]): The list of concepts from existing EHR records.\n\n        Returns:\n            The extracted concepts from the note.\n        \"\"\"\n        concepts: List[Concept] = []\n\n        for pipe in self.pipeline:\n            if pipe not in self.config.disable:\n                if pipe == \"preprocessor\":\n                    note = self.preprocess(note)\n                elif pipe == \"medcat\":\n                    concepts = self.get_concepts(note)\n                elif pipe == \"paragrapher\":\n                    concepts = self.process_paragraphs(note, concepts)\n                elif pipe == \"postprocessor\":\n                    concepts = self.postprocess(concepts)\n                elif pipe == \"deduplicator\":\n                    concepts = self.deduplicate(concepts, record_concepts)\n\n        return concepts\n\n    def get_concepts(self, note: Note) -> List[Concept]:\n        \"\"\"\n        Extracts concepts from a note using the MedCAT instance.\n\n        Args:\n            note (Note): The input note to extract concepts from.\n\n        Returns:\n            The extracted concepts from the note.\n        \"\"\"\n        concepts: List[Concept] = []\n        for entity in self.cat.get_entities(note)[\"entities\"].values():\n            try:\n                concepts.append(Concept.from_entity(entity))\n                log.debug(f\"Detected concept 
({concepts[-1].id} | {concepts[-1].name})\")\n            except ValueError as e:\n                log.warning(f\"Concept skipped: {e}\")\n\n        return concepts\n\n    @staticmethod\n    def preprocess(note: Note) -> Note:\n        \"\"\"\n        Preprocesses a note by cleaning its text and splitting it into paragraphs.\n\n        Args:\n            note (Note): The input note to preprocess.\n\n        Returns:\n            The preprocessed note.\n        \"\"\"\n        note.clean_text()\n        note.get_paragraphs()\n\n        return note\n\n    @staticmethod\n    def deduplicate(concepts: List[Concept], record_concepts: Optional[List[Concept]]) -> List[Concept]:\n        \"\"\"\n        Removes duplicate concepts from the extracted concepts list by strict ID matching.\n\n        Args:\n            concepts (List[Concept]): The list of extracted concepts.\n            record_concepts (Optional[List[Concept]]): The list of concepts from existing EHR records.\n\n        Returns:\n            The deduplicated list of concepts.\n        \"\"\"\n        if record_concepts is not None:\n            record_ids = {record_concept.id for record_concept in record_concepts}\n            record_names = {record_concept.name for record_concept in record_concepts}\n        else:\n            record_ids = set()\n            record_names = set()\n\n        # Use an OrderedDict to keep track of ids as it preserves original MedCAT order (the order it appears in text)\n        filtered_concepts: List[Concept] = []\n        existing_concepts = OrderedDict()\n\n        # Filter concepts that are in record or exist in concept list\n        for concept in concepts:\n            if concept.id is not None and (concept.id in record_ids or concept.id in existing_concepts):\n                log.debug(f\"Removed concept ({concept.id} | {concept.name}): concept id exists in record\")\n            # check name match for null ids - VTM deduplication\n            elif concept.id is None and 
(concept.name in record_names or concept.name in existing_concepts.values()):\n                log.debug(f\"Removed concept ({concept.id} | {concept.name}): concept name exists in record\")\n            else:\n                filtered_concepts.append(concept)\n                existing_concepts[concept.id] = concept.name\n\n        return filtered_concepts\n\n    @staticmethod\n    def add_numbering_to_name(concepts: List[Concept]) -> List[Concept]:\n        \"\"\"\n        Adds numbering to the names of problem concepts to control output ordering.\n\n        Args:\n            concepts (List[Concept]): The list of concepts to add numbering to.\n\n        Returns:\n            The list of concepts with numbering added to their names.\n        \"\"\"\n        # Prepend numbering to problem concepts e.g. 00 asthma, 01 stroke...\n        for i, concept in enumerate(concepts):\n            concept.name = f\"{i:02} {concept.name}\"\n\n        return concepts\n\n    def __call__(\n        self,\n        note: Note,\n        record_concepts: Optional[List[Concept]] = None,\n    ) -> List[Concept]:\n        \"\"\"\n        Runs the annotation pipeline on a given note and returns the extracted concepts.\n\n        Args:\n            note (Note): The input note to process.\n            record_concepts (Optional[List[Concept]]): The list of concepts from existing EHR records.\n\n        Returns:\n            The extracted concepts from the note.\n        \"\"\"\n        concepts = self.run_pipeline(note, record_concepts)\n\n        if self.config.add_numbering:\n            concepts = self.add_numbering_to_name(concepts)\n\n        return concepts\n
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.concept_types","title":"concept_types abstractmethod property","text":"

Abstract property that should return a list of concept types supported by the annotator.

"},{"location":"api-reference/annotator/#miade.annotators.Annotator.pipeline","title":"pipeline abstractmethod property","text":"

Abstract property that should return a list of pipeline steps for the annotator.

"},{"location":"api-reference/annotator/#miade.annotators.Annotator.__call__","title":"__call__(note, record_concepts=None)","text":"

Runs the annotation pipeline on a given note and returns the extracted concepts.

Parameters:

Name Type Description Default note Note

The input note to process.

required record_concepts Optional[List[Concept]]

The list of concepts from existing EHR records.

None

Returns:

Type Description List[Concept]

The extracted concepts from the note.

Source code in src/miade/annotators.py
def __call__(\n    self,\n    note: Note,\n    record_concepts: Optional[List[Concept]] = None,\n) -> List[Concept]:\n    \"\"\"\n    Runs the annotation pipeline on a given note and returns the extracted concepts.\n\n    Args:\n        note (Note): The input note to process.\n        record_concepts (Optional[List[Concept]]): The list of concepts from existing EHR records.\n\n    Returns:\n        The extracted concepts from the note.\n    \"\"\"\n    concepts = self.run_pipeline(note, record_concepts)\n\n    if self.config.add_numbering:\n        concepts = self.add_numbering_to_name(concepts)\n\n    return concepts\n
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.add_numbering_to_name","title":"add_numbering_to_name(concepts) staticmethod","text":"

Adds numbering to the names of problem concepts to control output ordering.

Parameters:

Name Type Description Default concepts List[Concept]

The list of concepts to add numbering to.

required

Returns:

Type Description List[Concept]

The list of concepts with numbering added to their names.

Source code in src/miade/annotators.py
@staticmethod\ndef add_numbering_to_name(concepts: List[Concept]) -> List[Concept]:\n    \"\"\"\n    Adds numbering to the names of problem concepts to control output ordering.\n\n    Args:\n        concepts (List[Concept]): The list of concepts to add numbering to.\n\n    Returns:\n        The list of concepts with numbering added to their names.\n    \"\"\"\n    # Prepend numbering to problem concepts e.g. 00 asthma, 01 stroke...\n    for i, concept in enumerate(concepts):\n        concept.name = f\"{i:02} {concept.name}\"\n\n    return concepts\n
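The numbering behaviour can be tried in isolation. The following is a minimal sketch, assuming a hypothetical SimpleConcept stand-in that carries only a name; it is not the library's Concept class.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SimpleConcept:
    # Hypothetical stand-in for miade's Concept; only the name is needed here.
    name: str


def add_numbering(concepts: List[SimpleConcept]) -> List[SimpleConcept]:
    # Prefix each name with a zero-padded, two-digit index, in list order.
    for i, concept in enumerate(concepts):
        concept.name = f'{i:02} {concept.name}'
    return concepts


numbered = add_numbering([SimpleConcept('asthma'), SimpleConcept('stroke')])
names = [c.name for c in numbered]
```

Two things are worth noting: the names are modified in place, so numbering the same list twice prefixes it twice, and two-digit padding only keeps the lexical order aligned with the list order for the first 100 concepts.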
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.deduplicate","title":"deduplicate(concepts, record_concepts) staticmethod","text":"

Removes duplicate concepts from the extracted concepts list by strict ID matching.

Parameters:

Name Type Description Default concepts List[Concept]

The list of extracted concepts.

required record_concepts Optional[List[Concept]]

The list of concepts from existing EHR records.

required

Returns:

Type Description List[Concept]

The deduplicated list of concepts.

Source code in src/miade/annotators.py
@staticmethod\ndef deduplicate(concepts: List[Concept], record_concepts: Optional[List[Concept]]) -> List[Concept]:\n    \"\"\"\n    Removes duplicate concepts from the extracted concepts list by strict ID matching.\n\n    Args:\n        concepts (List[Concept]): The list of extracted concepts.\n        record_concepts (Optional[List[Concept]]): The list of concepts from existing EHR records.\n\n    Returns:\n        The deduplicated list of concepts.\n    \"\"\"\n    if record_concepts is not None:\n        record_ids = {record_concept.id for record_concept in record_concepts}\n        record_names = {record_concept.name for record_concept in record_concepts}\n    else:\n        record_ids = set()\n        record_names = set()\n\n    # Use an OrderedDict to keep track of ids as it preserves original MedCAT order (the order it appears in text)\n    filtered_concepts: List[Concept] = []\n    existing_concepts = OrderedDict()\n\n    # Filter concepts that are in record or exist in concept list\n    for concept in concepts:\n        if concept.id is not None and (concept.id in record_ids or concept.id in existing_concepts):\n            log.debug(f\"Removed concept ({concept.id} | {concept.name}): concept id exists in record\")\n        # check name match for null ids - VTM deduplication\n        elif concept.id is None and (concept.name in record_names or concept.name in existing_concepts.values()):\n            log.debug(f\"Removed concept ({concept.id} | {concept.name}): concept name exists in record\")\n        else:\n            filtered_concepts.append(concept)\n            existing_concepts[concept.id] = concept.name\n\n    return filtered_concepts\n
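The matching rule (by id when an id is present, by name when the id is null) can be illustrated with a simplified sketch. It assumes a hypothetical SimpleConcept stand-in and placeholder ids, and it tracks ids and names in two sets rather than an OrderedDict, so it is a variation on the idea and not the library's implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SimpleConcept:
    # Hypothetical stand-in for miade's Concept: an optional id plus a name.
    id: Optional[str]
    name: str


def dedupe(concepts: List[SimpleConcept],
           record_concepts: Optional[List[SimpleConcept]] = None) -> List[SimpleConcept]:
    record = record_concepts or []
    seen_ids = {c.id for c in record if c.id is not None}
    seen_names = {c.name for c in record}
    kept: List[SimpleConcept] = []
    for c in concepts:
        # Concepts with an id are matched by id; null-id concepts by name.
        duplicate = c.id in seen_ids if c.id is not None else c.name in seen_names
        if duplicate:
            continue
        kept.append(c)  # a plain list preserves the order of first appearance
        if c.id is not None:
            seen_ids.add(c.id)
        seen_names.add(c.name)
    return kept


record = [SimpleConcept('101', 'asthma')]  # ids here are placeholders
extracted = [
    SimpleConcept('101', 'asthma'),        # already in the record
    SimpleConcept('102', 'stroke'),
    SimpleConcept('102', 'CVA'),           # same id as the previous concept
    SimpleConcept(None, 'paracetamol'),
    SimpleConcept(None, 'paracetamol'),    # null id, repeated name
]
result = [(c.id, c.name) for c in dedupe(extracted, record)]
```

In this sketch only the first concept with a given id survives, so 'CVA' is dropped even though its name differs from 'stroke'.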
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.get_concepts","title":"get_concepts(note)","text":"

Extracts concepts from a note using the MedCAT instance.

Parameters:

Name Type Description Default note Note

The input note to extract concepts from.

required

Returns:

Type Description List[Concept]

The extracted concepts from the note.

Source code in src/miade/annotators.py
def get_concepts(self, note: Note) -> List[Concept]:\n    \"\"\"\n    Extracts concepts from a note using the MedCAT instance.\n\n    Args:\n        note (Note): The input note to extract concepts from.\n\n    Returns:\n        The extracted concepts from the note.\n    \"\"\"\n    concepts: List[Concept] = []\n    for entity in self.cat.get_entities(note)[\"entities\"].values():\n        try:\n            concepts.append(Concept.from_entity(entity))\n            log.debug(f\"Detected concept ({concepts[-1].id} | {concepts[-1].name})\")\n        except ValueError as e:\n            log.warning(f\"Concept skipped: {e}\")\n\n    return concepts\n
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.postprocess","title":"postprocess() abstractmethod","text":"

Abstract method that should implement the logic for post-processing extracted concepts.

Source code in src/miade/annotators.py
@abstractmethod\ndef postprocess(self):\n    \"\"\"\n    Abstract method that should implement the logic for post-processing extracted concepts.\n    \"\"\"\n    pass\n
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.preprocess","title":"preprocess(note) staticmethod","text":"

Preprocesses a note by cleaning its text and splitting it into paragraphs.

Parameters:

Name Type Description Default note Note

The input note to preprocess.

required

Returns:

Type Description Note

The preprocessed note.

Source code in src/miade/annotators.py
@staticmethod\ndef preprocess(note: Note) -> Note:\n    \"\"\"\n    Preprocesses a note by cleaning its text and splitting it into paragraphs.\n\n    Args:\n        note (Note): The input note to preprocess.\n\n    Returns:\n        The preprocessed note.\n    \"\"\"\n    note.clean_text()\n    note.get_paragraphs()\n\n    return note\n
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.process_paragraphs","title":"process_paragraphs() abstractmethod","text":"

Abstract method that should implement the logic for processing paragraphs in a note.

Source code in src/miade/annotators.py
@abstractmethod\ndef process_paragraphs(self):\n    \"\"\"\n    Abstract method that should implement the logic for processing paragraphs in a note.\n    \"\"\"\n    pass\n
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.run_pipeline","title":"run_pipeline(note, record_concepts)","text":"

Runs the annotation pipeline on a given note and returns the extracted concepts.

Parameters:

Name Type Description Default note Note

The input note to process.

required record_concepts List[Concept]

The list of concepts from existing EHR records.

required

Returns:

Type Description List[Concept]

The extracted concepts from the note.

Source code in src/miade/annotators.py
def run_pipeline(self, note: Note, record_concepts: List[Concept]) -> List[Concept]:\n    \"\"\"\n    Runs the annotation pipeline on a given note and returns the extracted concepts.\n\n    Args:\n        note (Note): The input note to process.\n        record_concepts (List[Concept]): The list of concepts from existing EHR records.\n\n    Returns:\n        The extracted concepts from the note.\n    \"\"\"\n    concepts: List[Concept] = []\n\n    for pipe in self.pipeline:\n        if pipe not in self.config.disable:\n            if pipe == \"preprocessor\":\n                note = self.preprocess(note)\n            elif pipe == \"medcat\":\n                concepts = self.get_concepts(note)\n            elif pipe == \"paragrapher\":\n                concepts = self.process_paragraphs(note, concepts)\n            elif pipe == \"postprocessor\":\n                concepts = self.postprocess(concepts)\n            elif pipe == \"deduplicator\":\n                concepts = self.deduplicate(concepts, record_concepts)\n\n    return concepts\n
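The way the disable setting skips steps can be sketched outside the class. The example below is an illustration only: run_steps is a hypothetical helper, the step functions are toy string operations that merely borrow the step names, and a dictionary lookup stands in for the if/elif chain.

```python
from typing import Callable, Dict, List

Step = Callable[[List[str]], List[str]]


def run_steps(pipeline: List[str], disabled: List[str],
              steps: Dict[str, Step], data: List[str]) -> List[str]:
    # Apply each named step in order, skipping disabled or unknown names.
    for name in pipeline:
        if name in disabled:
            continue
        step = steps.get(name)
        if step is not None:
            data = step(data)
    return data


pipeline = ['preprocessor', 'medcat', 'postprocessor', 'deduplicator']
steps = {
    # Toy behaviours, not what the real steps do.
    'preprocessor': lambda words: [w.strip().lower() for w in words],
    'medcat': lambda words: [w for w in words if w],
    'postprocessor': lambda words: sorted(words),
    'deduplicator': lambda words: list(dict.fromkeys(words)),
}
words = [' Asthma', 'stroke ', '', 'asthma']
full = run_steps(pipeline, [], steps, words)
partial = run_steps(pipeline, ['deduplicator'], steps, words)
```

With every step enabled the duplicate is removed; disabling 'deduplicator' leaves the output of the previous step untouched.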
"},{"location":"api-reference/concept/","title":"Concept","text":"

Bases: object

Represents a concept in the system.

Attributes:

Name Type Description id str

The unique identifier of the concept.

name str

The name of the concept.

category Optional[Enum]

The category of the concept (optional).

start Optional[int]

The start position of the concept (optional).

end Optional[int]

The end position of the concept (optional).

dosage Optional[Dosage]

The dosage of the concept (optional).

linked_concepts Optional[List[Concept]]

The linked concepts of the concept (optional).

negex Optional[bool]

The negex value of the concept (optional).

meta_anns Optional[List[MetaAnnotations]]

The meta annotations of the concept (optional).

debug_dict Optional[Dict]

The debug dictionary of the concept (optional).

Source code in src/miade/concept.py
class Concept(object):\n    \"\"\"Represents a concept in the system.\n\n    Attributes:\n        id (str): The unique identifier of the concept.\n        name (str): The name of the concept.\n        category (Optional[Enum]): The category of the concept (optional).\n        start (Optional[int]): The start position of the concept (optional).\n        end (Optional[int]): The end position of the concept (optional).\n        dosage (Optional[Dosage]): The dosage of the concept (optional).\n        linked_concepts (Optional[List[Concept]]): The linked concepts of the concept (optional).\n        negex (Optional[bool]): The negex value of the concept (optional).\n        meta_anns (Optional[List[MetaAnnotations]]): The meta annotations of the concept (optional).\n        debug_dict (Optional[Dict]): The debug dictionary of the concept (optional).\n    \"\"\"\n\n    def __init__(\n        self,\n        id: str,\n        name: str,\n        category: Optional[Enum] = None,\n        start: Optional[int] = None,\n        end: Optional[int] = None,\n        dosage: Optional[Dosage] = None,\n        linked_concepts: Optional[List[Concept]] = None,\n        negex: Optional[bool] = None,\n        meta_anns: Optional[List[MetaAnnotations]] = None,\n        debug_dict: Optional[Dict] = None,\n    ):\n        self.name = name\n        self.id = id\n        self.category = category\n        self.start = start\n        self.end = end\n        self.dosage = dosage\n        self.linked_concepts = linked_concepts\n        self.negex = negex\n        self.meta = meta_anns\n        self.debug = debug_dict\n\n        if linked_concepts is None:\n            self.linked_concepts = []\n\n    @classmethod\n    def from_entity(cls, entity: Dict) -> Concept:\n        \"\"\"\n        Converts an entity dictionary into a Concept object.\n\n        Args:\n            entity (Dict): The entity dictionary containing the necessary information.\n\n        Returns:\n            The Concept object 
created from the entity dictionary.\n        \"\"\"\n        meta_anns = None\n        if entity[\"meta_anns\"]:\n            meta_anns = [MetaAnnotations(**value) for value in entity[\"meta_anns\"].values()]\n\n        return Concept(\n            id=entity[\"cui\"],\n            name=entity[\n                \"source_value\"\n            ],  # can also use detected_name which is spell checked but delimited by ~ e.g. liver~failure\n            category=None,\n            start=entity[\"start\"],\n            end=entity[\"end\"],\n            negex=entity[\"negex\"] if \"negex\" in entity else None,\n            meta_anns=meta_anns,\n        )\n\n    def __str__(self):\n        return (\n            f\"{{name: {self.name}, id: {self.id}, category: {self.category}, start: {self.start}, end: {self.end},\"\n            f\" dosage: {self.dosage}, linked_concepts: {self.linked_concepts}, negex: {self.negex}, meta: {self.meta}}} \"\n        )\n\n    def __hash__(self):\n        return hash((self.id, self.name, self.category))\n\n    def __eq__(self, other):\n        return self.id == other.id and self.name == other.name and self.category == other.category\n\n    def __lt__(self, other):\n        return int(self.id) < int(other.id)\n\n    def __gt__(self, other):\n        return int(self.id) > int(other.id)\n
"},{"location":"api-reference/concept/#miade.concept.Concept.from_entity","title":"from_entity(entity) classmethod","text":"

Converts an entity dictionary into a Concept object.

Parameters:

Name Type Description Default entity Dict

The entity dictionary containing the necessary information.

required

Returns:

Type Description Concept

The Concept object created from the entity dictionary.

Source code in src/miade/concept.py
@classmethod\ndef from_entity(cls, entity: Dict) -> Concept:\n    \"\"\"\n    Converts an entity dictionary into a Concept object.\n\n    Args:\n        entity (Dict): The entity dictionary containing the necessary information.\n\n    Returns:\n        The Concept object created from the entity dictionary.\n    \"\"\"\n    meta_anns = None\n    if entity[\"meta_anns\"]:\n        meta_anns = [MetaAnnotations(**value) for value in entity[\"meta_anns\"].values()]\n\n    return Concept(\n        id=entity[\"cui\"],\n        name=entity[\n            \"source_value\"\n        ],  # can also use detected_name which is spell checked but delimited by ~ e.g. liver~failure\n        category=None,\n        start=entity[\"start\"],\n        end=entity[\"end\"],\n        negex=entity[\"negex\"] if \"negex\" in entity else None,\n        meta_anns=meta_anns,\n    )\n
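The key mapping can be seen on a plain dictionary. This is a minimal sketch, assuming a hand-written entity with placeholder values; entity_to_fields is a hypothetical helper that returns a dict instead of a Concept and leaves out the meta annotations.

```python
from typing import Any, Dict


def entity_to_fields(entity: Dict[str, Any]) -> Dict[str, Any]:
    # Map entity keys onto the concept fields they populate.
    return {
        'id': entity['cui'],
        'name': entity['source_value'],
        'start': entity['start'],
        'end': entity['end'],
        # negex is optional: absent unless negation detection added it.
        'negex': entity.get('negex'),
    }


# Placeholder entity; the cui value is made up for illustration.
entity = {'cui': '101', 'source_value': 'liver failure', 'start': 0, 'end': 13}
fields = entity_to_fields(entity)
```

In the method shown above, meta_anns is read with entity[...] and so must be present in the entity, whereas negex is optional.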
"},{"location":"api-reference/dosage/","title":"Dosage","text":"

Bases: object

Container for drug dosage information

Source code in src/miade/dosage.py
class Dosage(object):\n    \"\"\"\n    Container for drug dosage information\n    \"\"\"\n\n    def __init__(\n        self,\n        dose: Optional[Dose],\n        duration: Optional[Duration],\n        frequency: Optional[Frequency],\n        route: Optional[Route],\n        text: Optional[str] = None,\n    ):\n        self.text = text\n        self.dose = dose\n        self.duration = duration\n        self.frequency = frequency\n        self.route = route\n\n    @classmethod\n    def from_doc(cls, doc: Doc, calculate: bool = True):\n        \"\"\"\n        Parses dosage from a spacy doc object.\n\n        Args:\n            doc (Doc): Spacy doc object with processed dosage text.\n            calculate (bool, optional): Whether to calculate duration if total and daily dose is given. Defaults to True.\n\n        Returns:\n            An instance of the class with the parsed dosage information.\n\n        \"\"\"\n        quantities = []\n        units = []\n        dose_start = 1000\n        dose_end = 0\n        daily_dose = None\n        total_dose = None\n        route_text = None\n        duration_text = None\n\n        for ent in doc.ents:\n            if ent.label_ == \"DOSAGE\":\n                if ent._.total_dose:\n                    total_dose = float(ent.text)\n                else:\n                    quantities.append(ent.text)\n                    # get span of full dosage string - not strictly needed but nice to have\n                    if ent.start < dose_start:\n                        dose_start = ent.start\n                    if ent.end > dose_end:\n                        dose_end = ent.end\n            elif ent.label_ == \"FORM\":\n                if ent._.total_dose:\n                    # de facto unit is in total dose\n                    units = [ent.text]\n                else:\n                    units.append(ent.text)\n                    if ent.start < dose_start:\n                        dose_start = ent.start\n                   
 if ent.end > dose_end:\n                        dose_end = ent.end\n            elif ent.label_ == \"DURATION\":\n                duration_text = ent.text\n            elif ent.label_ == \"ROUTE\":\n                route_text = ent.text\n\n        dose = parse_dose(\n            text=\" \".join(doc.text.split()[dose_start:dose_end]),\n            quantities=quantities,\n            units=units,\n            results=doc._.results,\n        )\n\n        frequency = parse_frequency(text=doc.text, results=doc._.results)\n\n        route = parse_route(text=route_text, dose=dose)\n\n        # technically not information recorded so will keep as an option\n        if calculate:\n            # if duration not given in text could extract this from total dose if given\n            if total_dose is not None and dose is not None and doc._.results[\"freq\"]:\n                if dose.value is not None:\n                    daily_dose = float(dose.value) * (round(doc._.results[\"freq\"] / doc._.results[\"time\"]))\n                elif dose.high is not None:\n                    daily_dose = float(dose.high) * (round(doc._.results[\"freq\"] / doc._.results[\"time\"]))\n\n        duration = parse_duration(\n            text=duration_text,\n            results=doc._.results,\n            total_dose=total_dose,\n            daily_dose=daily_dose,\n        )\n\n        return cls(\n            text=doc._.original_text,\n            dose=dose,\n            duration=duration,\n            frequency=frequency,\n            route=route,\n        )\n\n    def __str__(self):\n        return f\"{self.__dict__}\"\n\n    def __eq__(self, other):\n        return self.__dict__ == other.__dict__\n
"},{"location":"api-reference/dosage/#miade.dosage.Dosage.from_doc","title":"from_doc(doc, calculate=True) classmethod","text":"

Parses dosage from a spacy doc object.

Parameters:

doc (Doc, required): Spacy doc object with processed dosage text.

calculate (bool, default True): Whether to calculate the duration when both total and daily dose are given.

Returns:

Type Description

An instance of the class with the parsed dosage information.

Source code in src/miade/dosage.py
@classmethod\ndef from_doc(cls, doc: Doc, calculate: bool = True):\n    \"\"\"\n    Parses dosage from a spacy doc object.\n\n    Args:\n        doc (Doc): Spacy doc object with processed dosage text.\n        calculate (bool, optional): Whether to calculate duration if total and daily dose is given. Defaults to True.\n\n    Returns:\n        An instance of the class with the parsed dosage information.\n\n    \"\"\"\n    quantities = []\n    units = []\n    dose_start = 1000\n    dose_end = 0\n    daily_dose = None\n    total_dose = None\n    route_text = None\n    duration_text = None\n\n    for ent in doc.ents:\n        if ent.label_ == \"DOSAGE\":\n            if ent._.total_dose:\n                total_dose = float(ent.text)\n            else:\n                quantities.append(ent.text)\n                # get span of full dosage string - not strictly needed but nice to have\n                if ent.start < dose_start:\n                    dose_start = ent.start\n                if ent.end > dose_end:\n                    dose_end = ent.end\n        elif ent.label_ == \"FORM\":\n            if ent._.total_dose:\n                # de facto unit is in total dose\n                units = [ent.text]\n            else:\n                units.append(ent.text)\n                if ent.start < dose_start:\n                    dose_start = ent.start\n                if ent.end > dose_end:\n                    dose_end = ent.end\n        elif ent.label_ == \"DURATION\":\n            duration_text = ent.text\n        elif ent.label_ == \"ROUTE\":\n            route_text = ent.text\n\n    dose = parse_dose(\n        text=\" \".join(doc.text.split()[dose_start:dose_end]),\n        quantities=quantities,\n        units=units,\n        results=doc._.results,\n    )\n\n    frequency = parse_frequency(text=doc.text, results=doc._.results)\n\n    route = parse_route(text=route_text, dose=dose)\n\n    # technically not information recorded so will keep as an option\n    if 
calculate:\n        # if duration not given in text could extract this from total dose if given\n        if total_dose is not None and dose is not None and doc._.results[\"freq\"]:\n            if dose.value is not None:\n                daily_dose = float(dose.value) * (round(doc._.results[\"freq\"] / doc._.results[\"time\"]))\n            elif dose.high is not None:\n                daily_dose = float(dose.high) * (round(doc._.results[\"freq\"] / doc._.results[\"time\"]))\n\n    duration = parse_duration(\n        text=duration_text,\n        results=doc._.results,\n        total_dose=total_dose,\n        daily_dose=daily_dose,\n    )\n\n    return cls(\n        text=doc._.original_text,\n        dose=dose,\n        duration=duration,\n        frequency=frequency,\n        route=route,\n    )\n
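The `calculate` branch of `from_doc` reduces to simple arithmetic. The sketch below reproduces the daily-dose step with plain numbers. The final division of total dose by daily dose is an assumption about what `parse_duration` does with these values, since its source is not shown here, and both function names are hypothetical.

```python
from typing import Optional


def daily_dose_from(dose_value: float, freq: float, time: float) -> float:
    """Mirror the daily-dose step in from_doc: dose * round(freq / time)."""
    return float(dose_value) * round(freq / time)


def estimate_duration(total_dose: Optional[float], daily_dose: Optional[float]) -> Optional[float]:
    """Assumed behaviour (parse_duration is not shown): total dose / daily dose."""
    if total_dose is None or not daily_dose:
        return None
    return total_dose / daily_dose


# Hypothetical input: dose of 2, taken 3 times per 1 time unit, total dose of 30
daily = daily_dose_from(dose_value=2, freq=3, time=1)
duration = estimate_duration(total_dose=30.0, daily_dose=daily)
print(daily, duration)
```

One detail worth knowing: Python 3's `round` rounds halves to the nearest even integer, so `round(1 / 2)` is `0`. With one dose every two time units, the expression above therefore yields a daily dose of `0.0`.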
"},{"location":"api-reference/dosageextractor/","title":"DosageExtractor","text":"

Parses and extracts drug dosage

Attributes:

model (str): The name of the model to be used for dosage extraction.

dosage_extractor (Language): The Spacy pipeline for dosage extraction.

Source code in src/miade/dosageextractor.py
class DosageExtractor:\n    \"\"\"\n    Parses and extracts drug dosage\n\n    Attributes:\n        model (str): The name of the model to be used for dosage extraction.\n        dosage_extractor (Language): The Spacy pipeline for dosage extraction.\n    \"\"\"\n\n    def __init__(self, model: str = \"en_core_med7_lg\"):\n        self.model = model\n        self.dosage_extractor = self._create_drugdoseade_pipeline()\n\n    def _create_drugdoseade_pipeline(self) -> Language:\n        \"\"\"\n        Creates a spacy pipeline with given model (default med7)\n        and customised pipeline components for dosage extraction\n\n        Returns:\n            nlp (spacy.Language): The Spacy pipeline for dosage extraction.\n        \"\"\"\n        nlp = spacy.load(self.model)\n        nlp.add_pipe(\"preprocessor\", first=True)\n        nlp.add_pipe(\"pattern_matcher\", before=\"ner\")\n        nlp.add_pipe(\"entities_refiner\", after=\"ner\")\n\n        log.info(f\"Loaded drug dosage extractor with model {self.model}\")\n\n        return nlp\n\n    def extract(self, text: str, calculate: bool = True) -> Optional[Dosage]:\n        \"\"\"\n        Processes a string that contains dosage instructions (excluding drug concept as this is handled by core)\n\n        Args:\n            text (str): The string containing dosage instructions.\n            calculate (bool): Whether to calculate duration from total and daily dose, if given.\n\n        Returns:\n            The dosage object with parsed dosages in CDA format.\n        \"\"\"\n        doc = self.dosage_extractor(text)\n\n        log.debug(f\"NER results: {[(e.text, e.label_, e._.total_dose) for e in doc.ents]}\")\n        log.debug(f\"Lookup results: {doc._.results}\")\n\n        dosage = Dosage.from_doc(doc=doc, calculate=calculate)\n\n        if all(v is None for v in [dosage.dose, dosage.frequency, dosage.route, dosage.duration]):\n            return None\n\n        return dosage\n\n    def __call__(self, text: str, 
calculate: bool = True):\n        return self.extract(text, calculate)\n
"},{"location":"api-reference/dosageextractor/#miade.dosageextractor.DosageExtractor.extract","title":"extract(text, calculate=True)","text":"

Processes a string that contains dosage instructions (excluding drug concept as this is handled by core)

Parameters:

text (str, required): The string containing dosage instructions.

calculate (bool, default True): Whether to calculate duration from total and daily dose, if given.

Returns:

Optional[Dosage]: The dosage object with parsed dosages in CDA format, or None if no dose, frequency, route, or duration could be parsed.

Source code in src/miade/dosageextractor.py
def extract(self, text: str, calculate: bool = True) -> Optional[Dosage]:\n    \"\"\"\n    Processes a string that contains dosage instructions (excluding drug concept as this is handled by core)\n\n    Args:\n        text (str): The string containing dosage instructions.\n        calculate (bool): Whether to calculate duration from total and daily dose, if given.\n\n    Returns:\n        The dosage object with parsed dosages in CDA format.\n    \"\"\"\n    doc = self.dosage_extractor(text)\n\n    log.debug(f\"NER results: {[(e.text, e.label_, e._.total_dose) for e in doc.ents]}\")\n    log.debug(f\"Lookup results: {doc._.results}\")\n\n    dosage = Dosage.from_doc(doc=doc, calculate=calculate)\n\n    if all(v is None for v in [dosage.dose, dosage.frequency, dosage.route, dosage.duration]):\n        return None\n\n    return dosage\n
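The last step of `extract` decides whether anything useful was parsed. The sketch below isolates that check using a hypothetical stand-in class, `ParsedDosage`, which carries only the four fields that `extract` inspects. It is not the real `Dosage` class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedDosage:
    """Hypothetical stand-in for miade's Dosage with only the four checked fields."""
    dose: Optional[str] = None
    frequency: Optional[str] = None
    route: Optional[str] = None
    duration: Optional[str] = None


def none_if_empty(dosage: ParsedDosage) -> Optional[ParsedDosage]:
    """Return None when every field is None, as the final check in extract() does."""
    fields = [dosage.dose, dosage.frequency, dosage.route, dosage.duration]
    if all(v is None for v in fields):
        return None
    return dosage


empty = none_if_empty(ParsedDosage())
partial = none_if_empty(ParsedDosage(route="oral"))
print(empty is None, partial)
```

Callers should therefore treat the return value as optional: a single parsed field is enough to get an object back, while a string with no recognisable dosage information yields `None`.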
"},{"location":"api-reference/medsallergiesannotator/","title":"MedsAllergiesAnnotator","text":"

Bases: Annotator

Annotator class for medication and allergy concepts.

This class extends the Annotator base class and provides methods for running a pipeline of annotation tasks on a given note, as well as validating and converting concepts related to medications and allergies.
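A minimal sketch of how an ordered pipeline with a disable list can be dispatched. The step order is taken from the `pipeline` property of this class; `run_steps` and `make_step` are hypothetical helpers that only record which steps ran, and they do not call any MiADE code.

```python
from typing import Callable, Dict, List

# Order taken from the `pipeline` property of MedsAllergiesAnnotator
PIPELINE = [
    "preprocessor", "medcat", "paragrapher", "postprocessor",
    "dosage_extractor", "vtm_converter", "deduplicator",
]

Step = Callable[[List[str]], List[str]]


def run_steps(steps: Dict[str, Step], disable: List[str]) -> List[str]:
    """Run each enabled step in pipeline order, threading the result through."""
    trace: List[str] = []
    for name in PIPELINE:
        if name in disable or name not in steps:
            continue
        trace = steps[name](trace)
    return trace


def make_step(name: str) -> Step:
    """Hypothetical step that only records that it ran."""
    return lambda trace: trace + [name]


steps = {name: make_step(name) for name in PIPELINE}
ran = run_steps(steps, disable=["paragrapher", "vtm_converter"])
print(ran)
```

Disabled steps are skipped without disturbing the order of the remaining ones, which is the same effect as listing a component in `config.disable`.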

Attributes:

valid_meds (List[int]): A list of valid medication IDs.

reactions_subset_lookup (Dict[int, str]): A dictionary mapping reaction IDs to their corresponding subset IDs.

allergens_subset_lookup (Dict[int, str]): A dictionary mapping allergen IDs to their corresponding subset IDs.

allergy_type_lookup (Dict[str, List[str]]): A dictionary mapping allergen types to their corresponding codes.

vtm_to_vmp_lookup (Dict[str, str]): A dictionary mapping VTM (Virtual Therapeutic Moiety) IDs to VMP (Virtual Medicinal Product) IDs.

vtm_to_text_lookup (Dict[str, str]): A dictionary mapping VTM IDs to their corresponding text.
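To illustrate how a subset lookup such as `reactions_subset_lookup` is used, here is a small sketch of an ID conversion by dictionary lookup. `SimpleConcept` is a hypothetical stand-in for the real `Concept` class, and the IDs are made up for illustration; the real values are loaded from the configured lookup CSV files.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SimpleConcept:
    """Hypothetical stand-in for miade's Concept with only id and name."""
    id: str
    name: str


# Made-up IDs for illustration only
reactions_subset_lookup: Dict[int, str] = {111: "999"}


def convert_reaction(concept: SimpleConcept, lookup: Dict[int, str]) -> bool:
    """Swap the concept ID for its subset ID if present; report success."""
    subset_id = lookup.get(int(concept.id))
    if subset_id is None:
        return False
    concept.id = str(subset_id)
    return True


rash = SimpleConcept(id="111", name="rash")
unknown = SimpleConcept(id="222", name="unlisted reaction")
rash_converted = convert_reaction(rash, reactions_subset_lookup)
unknown_converted = convert_reaction(unknown, reactions_subset_lookup)
print(rash_converted, rash.id)
print(unknown_converted, unknown.id)
```

A concept that is missing from the lookup keeps its original ID, and the boolean result lets the caller decide whether to assign a category.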

Source code in src/miade/annotators.py
class MedsAllergiesAnnotator(Annotator):\n    \"\"\"\n    Annotator class for medication and allergy concepts.\n\n    This class extends the `Annotator` base class and provides methods for running a pipeline of\n    annotation tasks on a given note, as well as validating and converting concepts related to\n    medications and allergies.\n\n    Attributes:\n        valid_meds (List[int]): A list of valid medication IDs.\n        reactions_subset_lookup (Dict[int, str]): A dictionary mapping reaction IDs to their corresponding subset IDs.\n        allergens_subset_lookup (Dict[int, str]): A dictionary mapping allergen IDs to their corresponding subset IDs.\n        allergy_type_lookup (Dict[str, List[str]]): A dictionary mapping allergen types to their corresponding codes.\n        vtm_to_vmp_lookup (Dict[str, str]): A dictionary mapping VTM (Virtual Therapeutic Moiety) IDs to VMP (Virtual Medicinal Product) IDs.\n        vtm_to_text_lookup (Dict[str, str]): A dictionary mapping VTM IDs to their corresponding text.\n    \"\"\"\n\n    def __init__(self, cat: CAT, config: AnnotatorConfig = None):\n        super().__init__(cat, config)\n        self._load_med_allergy_lookup_data()\n\n    @property\n    def concept_types(self) -> List[Category]:\n        \"\"\"\n        Returns a list of concept types.\n\n        Returns:\n            [Category.MEDICATION, Category.ALLERGY, Category.REACTION]\n        \"\"\"\n        return [Category.MEDICATION, Category.ALLERGY, Category.REACTION]\n\n    @property\n    def pipeline(self) -> List[str]:\n        \"\"\"\n        Returns a list of annotators in the pipeline.\n\n        The annotators are executed in the order they appear in the list.\n\n        Returns:\n            [\"preprocessor\", \"medcat\", \"paragrapher\", \"postprocessor\", \"dosage_extractor\", \"vtm_converter\", \"deduplicator\"]\n        \"\"\"\n        return [\n            \"preprocessor\",\n            \"medcat\",\n            \"paragrapher\",\n            
\"postprocessor\",\n            \"dosage_extractor\",\n            \"vtm_converter\",\n            \"deduplicator\",\n        ]\n\n    def run_pipeline(\n        self, note: Note, record_concepts: List[Concept], dosage_extractor: Optional[DosageExtractor]\n    ) -> List[Concept]:\n        \"\"\"\n        Runs the annotation pipeline on the given note.\n\n        Args:\n            note (Note): The input note to run the pipeline on.\n            record_concepts (List[Concept]): The list of previously recorded concepts.\n            dosage_extractor (Optional[DosageExtractor]): The dosage extractor function.\n\n        Returns:\n            The list of annotated concepts.\n        \"\"\"\n        concepts: List[Concept] = []\n\n        for pipe in self.pipeline:\n            if pipe not in self.config.disable:\n                if pipe == \"preprocessor\":\n                    note = self.preprocess(note)\n                elif pipe == \"medcat\":\n                    concepts = self.get_concepts(note)\n                elif pipe == \"paragrapher\":\n                    concepts = self.process_paragraphs(note, concepts)\n                elif pipe == \"postprocessor\":\n                    concepts = self.postprocess(concepts, note)\n                elif pipe == \"deduplicator\":\n                    concepts = self.deduplicate(concepts, record_concepts)\n                elif pipe == \"vtm_converter\":\n                    concepts = self.convert_VTM_to_VMP_or_text(concepts)\n                elif pipe == \"dosage_extractor\" and dosage_extractor is not None:\n                    concepts = self.add_dosages_to_concepts(dosage_extractor, concepts, note)\n\n        return concepts\n\n    def _load_med_allergy_lookup_data(self) -> None:\n        \"\"\"\n        Loads the medication and allergy lookup data.\n        \"\"\"\n        if not os.path.isdir(self.config.lookup_data_path):\n            raise RuntimeError(f\"No lookup data configured: {self.config.lookup_data_path} 
does not exist!\")\n        else:\n            self.valid_meds = load_lookup_data(self.config.lookup_data_path + \"valid_meds.csv\", no_header=True)\n            self.reactions_subset_lookup = load_lookup_data(\n                self.config.lookup_data_path + \"reactions_subset.csv\", as_dict=True\n            )\n            self.allergens_subset_lookup = load_lookup_data(\n                self.config.lookup_data_path + \"allergens_subset.csv\", as_dict=True\n            )\n            self.allergy_type_lookup = load_allergy_type_combinations(self.config.lookup_data_path + \"allergy_type.csv\")\n            self.vtm_to_vmp_lookup = load_lookup_data(self.config.lookup_data_path + \"vtm_to_vmp.csv\")\n            self.vtm_to_text_lookup = load_lookup_data(self.config.lookup_data_path + \"vtm_to_text.csv\", as_dict=True)\n\n    def _validate_meds(self, concept) -> bool:\n        \"\"\"\n        Validates if the concept is a valid medication.\n\n        Args:\n            concept: The concept to validate.\n\n        Returns:\n            True if the concept is a valid medication, False otherwise.\n        \"\"\"\n        # check if substance is valid med\n        if int(concept.id) in self.valid_meds.values:\n            return True\n        return False\n\n    def _validate_and_convert_substance(self, concept) -> bool:\n        \"\"\"\n        Validates and converts a substance concept for allergy.\n\n        Args:\n            concept: The substance concept to be validated and converted.\n\n        Returns:\n            True if the substance is valid and converted successfully, False otherwise.\n        \"\"\"\n        # check if substance is valid substance for allergy - if it is, convert it to Epic subset and return that concept\n        lookup_result = self.allergens_subset_lookup.get(int(concept.id))\n        if lookup_result is not None:\n            log.debug(\n                f\"Converted concept ({concept.id} | {concept.name}) to \"\n                
f\"({lookup_result['subsetId']} | {concept.name}): valid Epic allergen subset\"\n            )\n            concept.id = str(lookup_result[\"subsetId\"])\n\n            # then check the allergen type from lookup result - e.g. drug, food\n            try:\n                concept.category = AllergenType(str(lookup_result[\"allergenType\"]).lower())\n                log.debug(\n                    f\"Assigned substance concept ({concept.id} | {concept.name}) \"\n                    f\"to allergen type category {concept.category}\"\n                )\n            except ValueError as e:\n                log.warning(f\"Allergen type not found for {concept.__str__()}: {e}\")\n\n            return True\n        else:\n            log.warning(f\"No lookup subset found for substance ({concept.id} | {concept.name})\")\n            return False\n\n    def _validate_and_convert_reaction(self, concept) -> bool:\n        \"\"\"\n        Validates and converts a reaction concept to the Epic subset.\n\n        Args:\n            concept: The concept to be validated and converted.\n\n        Returns:\n            True if the concept is a valid reaction and successfully converted to the Epic subset,\n                  False otherwise.\n        \"\"\"\n        # check if substance is valid reaction - if it is, convert it to Epic subset and return that concept\n        lookup_result = self.reactions_subset_lookup.get(int(concept.id), None)\n        if lookup_result is not None:\n            log.debug(\n                f\"Converted concept ({concept.id} | {concept.name}) to \"\n                f\"({lookup_result} | {concept.name}): valid Epic reaction subset\"\n            )\n            concept.id = str(lookup_result)\n            return True\n        else:\n            log.warning(f\"Reaction not found in Epic subset conversion for concept {concept.__str__()}\")\n            return False\n\n    def _validate_and_convert_concepts(self, concept: Concept) -> Concept:\n        \"\"\"\n  
      Validates and converts the given concept based on its metadata annotations.\n\n        Args:\n            concept (Concept): The concept to be validated and converted.\n\n        Returns:\n            The validated and converted concept.\n\n        \"\"\"\n        meta_ann_values = [meta_ann.value for meta_ann in concept.meta] if concept.meta is not None else []\n\n        # assign categories\n        if SubstanceCategory.ADVERSE_REACTION in meta_ann_values:\n            if self._validate_and_convert_substance(concept):\n                self._convert_allergy_type_to_code(concept)\n                self._convert_allergy_severity_to_code(concept)\n                concept.category = Category.ALLERGY\n            else:\n                log.warning(f\"Double-checking if concept ({concept.id} | {concept.name}) is in reaction subset\")\n                if self._validate_and_convert_reaction(concept) and (\n                    ReactionPos.BEFORE_SUBSTANCE in meta_ann_values or ReactionPos.AFTER_SUBSTANCE in meta_ann_values\n                ):\n                    concept.category = Category.REACTION\n                else:\n                    log.warning(\n                        f\"Reaction concept ({concept.id} | {concept.name}) not in subset or reaction_pos is NOT_REACTION\"\n                    )\n        if SubstanceCategory.TAKING in meta_ann_values:\n            if self._validate_meds(concept):\n                concept.category = Category.MEDICATION\n        if SubstanceCategory.NOT_SUBSTANCE in meta_ann_values and (\n            ReactionPos.BEFORE_SUBSTANCE in meta_ann_values or ReactionPos.AFTER_SUBSTANCE in meta_ann_values\n        ):\n            if self._validate_and_convert_reaction(concept):\n                concept.category = Category.REACTION\n\n        return concept\n\n    @staticmethod\n    def add_dosages_to_concepts(\n        dosage_extractor: DosageExtractor, concepts: List[Concept], note: Note\n    ) -> List[Concept]:\n        \"\"\"\n        
Gets dosages for medication concepts\n\n        Args:\n            dosage_extractor (DosageExtractor): The dosage extractor object\n            concepts (List[Concept]): List of concepts extracted\n            note (Note): The input note\n\n        Returns:\n            List of concepts with dosages for medication concepts\n        \"\"\"\n\n        for ind, concept in enumerate(concepts):\n            next_med_concept = concepts[ind + 1] if len(concepts) > ind + 1 else None\n            dosage_string = get_dosage_string(concept, next_med_concept, note.text)\n            if len(dosage_string.split()) > 2:\n                concept.dosage = dosage_extractor(dosage_string)\n                concept.category = Category.MEDICATION if concept.dosage is not None else None\n                if concept.dosage is not None:\n                    log.debug(\n                        f\"Extracted dosage for medication concept \"\n                        f\"({concept.id} | {concept.name}): {concept.dosage.text} {concept.dosage.dose}\"\n                    )\n\n        return concepts\n\n    @staticmethod\n    def _link_reactions_to_allergens(concept_list: List[Concept], note: Note, link_distance: int = 5) -> List[Concept]:\n        \"\"\"\n        Links reaction concepts to allergen concepts based on their proximity in the given concept list.\n\n        Args:\n            concept_list (List[Concept]): The list of concepts to search for reaction and allergen concepts.\n            note (Note): The note object containing the text.\n            link_distance (int, optional): The maximum distance between a reaction and an allergen to be considered linked.\n                Defaults to 5.\n\n        Returns:\n            The updated concept list with reaction concepts removed and linked to their corresponding allergen concepts.\n        \"\"\"\n        allergy_concepts = [concept for concept in concept_list if concept.category == Category.ALLERGY]\n        reaction_concepts = [concept for 
concept in concept_list if concept.category == Category.REACTION]\n\n        for reaction_concept in reaction_concepts:\n            nearest_allergy_concept = None\n            min_distance = inf\n            meta_ann_values = (\n                [meta_ann.value for meta_ann in reaction_concept.meta] if reaction_concept.meta is not None else []\n            )\n\n            for allergy_concept in allergy_concepts:\n                # skip if allergy is after and meta is before_substance\n                if ReactionPos.BEFORE_SUBSTANCE in meta_ann_values and allergy_concept.start < reaction_concept.start:\n                    continue\n                # skip if allergy is before and meta is after_substance\n                elif ReactionPos.AFTER_SUBSTANCE in meta_ann_values and allergy_concept.start > reaction_concept.start:\n                    continue\n                else:\n                    distance = calculate_word_distance(\n                        reaction_concept.start, reaction_concept.end, allergy_concept.start, allergy_concept.end, note\n                    )\n                    log.debug(\n                        f\"Calculated distance between reaction {reaction_concept.name} \"\n                        f\"and allergen {allergy_concept.name}: {distance}\"\n                    )\n                    if distance == -1:\n                        log.warning(\n                            f\"Indices for {reaction_concept.name} or {allergy_concept.name} invalid: \"\n                            f\"({reaction_concept.start}, {reaction_concept.end})\"\n                            f\"({allergy_concept.start}, {allergy_concept.end})\"\n                        )\n                        continue\n\n                    if distance <= link_distance and distance < min_distance:\n                        min_distance = distance\n                        nearest_allergy_concept = allergy_concept\n\n            if nearest_allergy_concept is not None:\n                
nearest_allergy_concept.linked_concepts.append(reaction_concept)\n                log.debug(\n                    f\"Linked reaction concept {reaction_concept.name} to \"\n                    f\"allergen concept {nearest_allergy_concept.name}\"\n                )\n\n        # Remove the linked REACTION concepts from the main list\n        updated_concept_list = [concept for concept in concept_list if concept.category != Category.REACTION]\n\n        return updated_concept_list\n\n    @staticmethod\n    def _convert_allergy_severity_to_code(concept: Concept) -> bool:\n        \"\"\"\n        Converts allergy severity to corresponding codes and links them to the concept.\n\n        Args:\n            concept (Concept): The concept to convert severity for.\n\n        Returns:\n            True if the conversion is successful, False otherwise.\n        \"\"\"\n        meta_ann_values = [meta_ann.value for meta_ann in concept.meta] if concept.meta is not None else []\n        if Severity.MILD in meta_ann_values:\n            concept.linked_concepts.append(Concept(id=\"L\", name=\"Low\", category=Category.SEVERITY))\n        elif Severity.MODERATE in meta_ann_values:\n            concept.linked_concepts.append(Concept(id=\"M\", name=\"Moderate\", category=Category.SEVERITY))\n        elif Severity.SEVERE in meta_ann_values:\n            concept.linked_concepts.append(Concept(id=\"H\", name=\"High\", category=Category.SEVERITY))\n        elif Severity.UNSPECIFIED in meta_ann_values:\n            return True\n        else:\n            log.warning(f\"No severity annotation associated with ({concept.id} | {concept.name})\")\n            return False\n\n        log.debug(\n            f\"Linked severity concept ({concept.linked_concepts[-1].id} | {concept.linked_concepts[-1].name}) \"\n            f\"to allergen concept ({concept.id} | {concept.name}): valid meta model output\"\n        )\n\n        return True\n\n    def _convert_allergy_type_to_code(self, concept: Concept) 
-> bool:\n        \"\"\"\n        Converts the allergy type of a concept to a code and adds it as a linked concept.\n\n        Args:\n            concept (Concept): The concept whose allergy type needs to be converted.\n\n        Returns:\n            True if the conversion and linking were successful, False otherwise.\n        \"\"\"\n        # get the ALLERGYTYPE meta-annotation\n        allergy_type = [meta_ann for meta_ann in concept.meta if meta_ann.name == \"allergy_type\"]\n        if len(allergy_type) != 1:\n            log.warning(\n                f\"Unable to map allergy type code: allergy_type meta-annotation \"\n                f\"not found for concept {concept.__str__()}\"\n            )\n            return False\n        else:\n            allergy_type = allergy_type[0].value\n\n        # perform lookup with ALLERGYTYPE and AllergenType combination\n        lookup_combination: Tuple[str, str] = (concept.category.value, allergy_type.value)\n        allergy_type_lookup_result = self.allergy_type_lookup.get(lookup_combination)\n\n        # add resulting allergy type concept as to linked_concept\n        if allergy_type_lookup_result is not None:\n            concept.linked_concepts.append(\n                Concept(\n                    id=str(allergy_type_lookup_result[0]),\n                    name=allergy_type_lookup_result[1],\n                    category=Category.ALLERGY_TYPE,\n                )\n            )\n            log.debug(\n                f\"Linked allergy_type concept ({allergy_type_lookup_result[0]} | {allergy_type_lookup_result[1]})\"\n                f\" to allergen concept ({concept.id} | {concept.name}): valid meta model output + allergytype lookup\"\n            )\n        else:\n            log.warning(f\"Allergen and adverse reaction type combination not found: {lookup_combination}\")\n\n        return True\n\n    def _process_meta_ann_by_paragraph(self, concept: Concept, paragraph: Paragraph):\n        \"\"\"\n        Process 
the meta annotations for a given concept and paragraph.\n\n        Args:\n            concept (Concept): The concept object.\n            paragraph (Paragraph): The paragraph object.\n\n        Returns:\n            None\n        \"\"\"\n        # if paragraph is structured meds to convert to corresponding relevance\n        if paragraph.type in self.structured_med_lists:\n            for meta in concept.meta:\n                if meta.name == \"substance_category\" and meta.value in [\n                    SubstanceCategory.TAKING,\n                    SubstanceCategory.IRRELEVANT,\n                ]:\n                    new_relevance = self.structured_med_lists[paragraph.type]\n                    if meta.value != new_relevance:\n                        log.debug(\n                            f\"Converted {meta.value} to \"\n                            f\"{new_relevance} for concept ({concept.id} | {concept.name}): \"\n                            f\"paragraph is {paragraph.type}\"\n                        )\n                        meta.value = new_relevance\n        # if paragraph is probs or irrelevant section, convert substance to irrelevant\n        elif paragraph.type in self.structured_prob_lists or paragraph.type in self.irrelevant_paragraphs:\n            for meta in concept.meta:\n                if meta.name == \"substance_category\" and meta.value != SubstanceCategory.IRRELEVANT:\n                    log.debug(\n                        f\"Converted {meta.value} to \"\n                        f\"{SubstanceCategory.IRRELEVANT} for concept ({concept.id} | {concept.name}): \"\n                        f\"paragraph is {paragraph.type}\"\n                    )\n                    meta.value = SubstanceCategory.IRRELEVANT\n\n    def process_paragraphs(self, note: Note, concepts: List[Concept]) -> List[Concept]:\n        \"\"\"\n        Process the paragraphs in a note and update the list of concepts.\n\n        Args:\n            note (Note): The note object 
containing the paragraphs.\n            concepts (List[Concept]): The list of concepts to be updated.\n\n        Returns:\n            The updated list of concepts.\n        \"\"\"\n        for paragraph in note.paragraphs:\n            for concept in concepts:\n                if concept.start >= paragraph.start and concept.end <= paragraph.end:\n                    # log.debug(f\"({concept.name} | {concept.id}) is in {paragraph.type}\")\n                    if concept.meta:\n                        self._process_meta_ann_by_paragraph(concept, paragraph)\n\n        return concepts\n\n    def postprocess(self, concepts: List[Concept], note: Note) -> List[Concept]:\n        \"\"\"\n        Postprocesses a list of concepts and links reactions to allergens.\n\n        Args:\n            concepts (List[Concept]): The list of concepts to be postprocessed.\n            note (Note): The note object associated with the concepts.\n\n        Returns:\n           The postprocessed list of concepts.\n        \"\"\"\n        # deepcopy so we still have reference to original list of concepts\n        all_concepts = deepcopy(concepts)\n        processed_concepts = []\n\n        for concept in all_concepts:\n            concept = self._validate_and_convert_concepts(concept)\n            processed_concepts.append(concept)\n\n        processed_concepts = self._link_reactions_to_allergens(processed_concepts, note)\n\n        return processed_concepts\n\n    def convert_VTM_to_VMP_or_text(self, concepts: List[Concept]) -> List[Concept]:\n        \"\"\"\n        Converts medication concepts from VTM (Virtual Therapeutic Moiety) to VMP (Virtual Medicinal Product) or text.\n\n        Args:\n            concepts (List[Concept]): A list of medication concepts.\n\n        Returns:\n            A list of medication concepts with updated IDs, names, and dosages.\n\n        \"\"\"\n        # Get medication concepts\n        med_concepts = [concept for concept in concepts if concept.category == 
Category.MEDICATION]\n        self.vtm_to_vmp_lookup[\"dose\"] = self.vtm_to_vmp_lookup[\"dose\"].astype(float)\n\n        med_concepts_with_dose = []\n        # I don't know man...Need to improve dosage methods\n        for concept in med_concepts:\n            if concept.dosage is not None:\n                if concept.dosage.dose:\n                    if concept.dosage.dose.value is not None and concept.dosage.dose.unit is not None:\n                        med_concepts_with_dose.append(concept)\n\n        med_concepts_no_dose = [concept for concept in concepts if concept not in med_concepts_with_dose]\n\n        # Create a temporary DataFrame to match vtmId, dose, and unit\n        temp_df = pd.DataFrame(\n            {\n                \"vtmId\": [int(concept.id) for concept in med_concepts_with_dose],\n                \"dose\": [float(concept.dosage.dose.value) for concept in med_concepts_with_dose],\n                \"unit\": [concept.dosage.dose.unit for concept in med_concepts_with_dose],\n            }\n        )\n\n        # Merge with the lookup df to get vmpId\n        merged_df = temp_df.merge(self.vtm_to_vmp_lookup, on=[\"vtmId\", \"dose\", \"unit\"], how=\"left\")\n\n        # Update id in the concepts list\n        for index, concept in enumerate(med_concepts_with_dose):\n            # Convert VTM to VMP id\n            vmp_id = merged_df.at[index, \"vmpId\"]\n            if not pd.isna(vmp_id):\n                log.debug(\n                    f\"Converted ({concept.id} | {concept.name}) to \"\n                    f\"({int(vmp_id)} | {concept.name + ' ' + str(int(concept.dosage.dose.value)) + concept.dosage.dose.unit} \"\n                    f\"tablets): valid extracted dosage + VMP lookup\"\n                )\n                concept.id = str(int(vmp_id))\n                concept.name += \" \" + str(int(concept.dosage.dose.value)) + str(concept.dosage.dose.unit) + \" tablets\"\n                # If found VMP match change the dosage to 1 tablet\n    
            concept.dosage.dose.value = 1\n                concept.dosage.dose.unit = \"{tbl}\"\n            else:\n                # If no match with dose convert to text\n                lookup_result = self.vtm_to_text_lookup.get(int(concept.id))\n                if lookup_result is not None:\n                    log.debug(\n                        f\"Converted ({concept.id} | {concept.name}) to (None | {lookup_result}: no match to VMP dosage lookup)\"\n                    )\n                    concept.id = None\n                    concept.name = lookup_result\n\n        # Convert rest of VTMs that have no dose for VMP conversion to text\n        for concept in med_concepts_no_dose:\n            lookup_result = self.vtm_to_text_lookup.get(int(concept.id))\n            if lookup_result is not None:\n                log.debug(f\"Converted ({concept.id} | {concept.name}) to (None | {lookup_result}): no dosage detected\")\n                concept.id = None\n                concept.name = lookup_result\n\n        return concepts\n\n    def __call__(\n        self,\n        note: Note,\n        record_concepts: Optional[List[Concept]] = None,\n        dosage_extractor: Optional[DosageExtractor] = None,\n    ) -> List[Concept]:\n        \"\"\"\n        Annotates the given note with concepts using the pipeline.\n\n        Args:\n            note (Note): The note to be annotated.\n            record_concepts (Optional[List[Concept]]): A list of concepts to be recorded.\n            dosage_extractor (Optional[DosageExtractor]): A dosage extractor to be used.\n\n        Returns:\n            The annotated concepts.\n        \"\"\"\n        concepts = self.run_pipeline(note, record_concepts, dosage_extractor)\n\n        if self.config.add_numbering:\n            concepts = self.add_numbering_to_name(concepts)\n\n        return concepts\n
"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.concept_types","title":"concept_types: List[Category] property","text":"

Returns a list of concept types.

Returns:

List[Category]: [Category.MEDICATION, Category.ALLERGY, Category.REACTION]

"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.pipeline","title":"pipeline: List[str] property","text":"

Returns a list of annotators in the pipeline.

The annotators are executed in the order they appear in the list.

Returns:

List[str]: [\"preprocessor\", \"medcat\", \"paragrapher\", \"postprocessor\", \"dosage_extractor\", \"vtm_converter\", \"deduplicator\"]
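
As a rough illustration of how an ordered pipeline with a disable list behaves, the sketch below walks the step names in order and skips any that are disabled. The `run` and `make_step` helpers are hypothetical stand-ins that only record which steps ran; they are not MiADE's implementation.

```python
from typing import Callable, Dict, List

# Step names as listed by the pipeline property.
PIPELINE: List[str] = [
    'preprocessor', 'medcat', 'paragrapher', 'postprocessor',
    'dosage_extractor', 'vtm_converter', 'deduplicator',
]


def run(pipeline: List[str], disable: List[str],
        steps: Dict[str, Callable[[List[str]], List[str]]]) -> List[str]:
    # Hypothetical runner: apply each enabled step in list order.
    trace: List[str] = []
    for name in pipeline:
        if name in disable:
            continue  # disabled steps are skipped entirely
        trace = steps[name](trace)
    return trace


def make_step(name: str) -> Callable[[List[str]], List[str]]:
    # Each toy step just records that it ran.
    return lambda trace: trace + [name]


steps = {name: make_step(name) for name in PIPELINE}
executed = run(PIPELINE, disable=['vtm_converter'], steps=steps)
print(executed)
```

Because the order comes from the list itself, disabling a step never reorders the remaining ones.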

"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.__call__","title":"__call__(note, record_concepts=None, dosage_extractor=None)","text":"

Annotates the given note with concepts using the pipeline.

Parameters:

note (Note): The note to be annotated. Required.
record_concepts (Optional[List[Concept]]): A list of concepts to be recorded. Defaults to None.
dosage_extractor (Optional[DosageExtractor]): A dosage extractor to be used. Defaults to None.

Returns:

List[Concept]: The annotated concepts.

Source code in src/miade/annotators.py
def __call__(\n    self,\n    note: Note,\n    record_concepts: Optional[List[Concept]] = None,\n    dosage_extractor: Optional[DosageExtractor] = None,\n) -> List[Concept]:\n    \"\"\"\n    Annotates the given note with concepts using the pipeline.\n\n    Args:\n        note (Note): The note to be annotated.\n        record_concepts (Optional[List[Concept]]): A list of concepts to be recorded.\n        dosage_extractor (Optional[DosageExtractor]): A dosage extractor to be used.\n\n    Returns:\n        The annotated concepts.\n    \"\"\"\n    concepts = self.run_pipeline(note, record_concepts, dosage_extractor)\n\n    if self.config.add_numbering:\n        concepts = self.add_numbering_to_name(concepts)\n\n    return concepts\n
"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.add_dosages_to_concepts","title":"add_dosages_to_concepts(dosage_extractor, concepts, note) staticmethod","text":"

Gets dosages for medication concepts

Parameters:

dosage_extractor (DosageExtractor): The dosage extractor object. Required.
concepts (List[Concept]): List of concepts extracted. Required.
note (Note): The input note. Required.

Returns:

List[Concept]: List of concepts with dosages for medication concepts.

Source code in src/miade/annotators.py
@staticmethod\ndef add_dosages_to_concepts(\n    dosage_extractor: DosageExtractor, concepts: List[Concept], note: Note\n) -> List[Concept]:\n    \"\"\"\n    Gets dosages for medication concepts\n\n    Args:\n        dosage_extractor (DosageExtractor): The dosage extractor object\n        concepts (List[Concept]): List of concepts extracted\n        note (Note): The input note\n\n    Returns:\n        List of concepts with dosages for medication concepts\n    \"\"\"\n\n    for ind, concept in enumerate(concepts):\n        next_med_concept = concepts[ind + 1] if len(concepts) > ind + 1 else None\n        dosage_string = get_dosage_string(concept, next_med_concept, note.text)\n        if len(dosage_string.split()) > 2:\n            concept.dosage = dosage_extractor(dosage_string)\n            concept.category = Category.MEDICATION if concept.dosage is not None else None\n            if concept.dosage is not None:\n                log.debug(\n                    f\"Extracted dosage for medication concept \"\n                    f\"({concept.id} | {concept.name}): {concept.dosage.text} {concept.dosage.dose}\"\n                )\n\n    return concepts\n
"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.convert_VTM_to_VMP_or_text","title":"convert_VTM_to_VMP_or_text(concepts)","text":"

Converts medication concepts from VTM (Virtual Therapeutic Moiety) to VMP (Virtual Medicinal Product) or text.

Parameters:

concepts (List[Concept]): A list of medication concepts. Required.

Returns:

List[Concept]: A list of medication concepts with updated IDs, names, and dosages.

Source code in src/miade/annotators.py
def convert_VTM_to_VMP_or_text(self, concepts: List[Concept]) -> List[Concept]:\n    \"\"\"\n    Converts medication concepts from VTM (Virtual Therapeutic Moiety) to VMP (Virtual Medicinal Product) or text.\n\n    Args:\n        concepts (List[Concept]): A list of medication concepts.\n\n    Returns:\n        A list of medication concepts with updated IDs, names, and dosages.\n\n    \"\"\"\n    # Get medication concepts\n    med_concepts = [concept for concept in concepts if concept.category == Category.MEDICATION]\n    self.vtm_to_vmp_lookup[\"dose\"] = self.vtm_to_vmp_lookup[\"dose\"].astype(float)\n\n    med_concepts_with_dose = []\n    # I don't know man...Need to improve dosage methods\n    for concept in med_concepts:\n        if concept.dosage is not None:\n            if concept.dosage.dose:\n                if concept.dosage.dose.value is not None and concept.dosage.dose.unit is not None:\n                    med_concepts_with_dose.append(concept)\n\n    med_concepts_no_dose = [concept for concept in concepts if concept not in med_concepts_with_dose]\n\n    # Create a temporary DataFrame to match vtmId, dose, and unit\n    temp_df = pd.DataFrame(\n        {\n            \"vtmId\": [int(concept.id) for concept in med_concepts_with_dose],\n            \"dose\": [float(concept.dosage.dose.value) for concept in med_concepts_with_dose],\n            \"unit\": [concept.dosage.dose.unit for concept in med_concepts_with_dose],\n        }\n    )\n\n    # Merge with the lookup df to get vmpId\n    merged_df = temp_df.merge(self.vtm_to_vmp_lookup, on=[\"vtmId\", \"dose\", \"unit\"], how=\"left\")\n\n    # Update id in the concepts list\n    for index, concept in enumerate(med_concepts_with_dose):\n        # Convert VTM to VMP id\n        vmp_id = merged_df.at[index, \"vmpId\"]\n        if not pd.isna(vmp_id):\n            log.debug(\n                f\"Converted ({concept.id} | {concept.name}) to \"\n                f\"({int(vmp_id)} | {concept.name + ' ' + 
str(int(concept.dosage.dose.value)) + concept.dosage.dose.unit} \"\n                f\"tablets): valid extracted dosage + VMP lookup\"\n            )\n            concept.id = str(int(vmp_id))\n            concept.name += \" \" + str(int(concept.dosage.dose.value)) + str(concept.dosage.dose.unit) + \" tablets\"\n            # If found VMP match change the dosage to 1 tablet\n            concept.dosage.dose.value = 1\n            concept.dosage.dose.unit = \"{tbl}\"\n        else:\n            # If no match with dose convert to text\n            lookup_result = self.vtm_to_text_lookup.get(int(concept.id))\n            if lookup_result is not None:\n                log.debug(\n                    f\"Converted ({concept.id} | {concept.name}) to (None | {lookup_result}: no match to VMP dosage lookup)\"\n                )\n                concept.id = None\n                concept.name = lookup_result\n\n    # Convert rest of VTMs that have no dose for VMP conversion to text\n    for concept in med_concepts_no_dose:\n        lookup_result = self.vtm_to_text_lookup.get(int(concept.id))\n        if lookup_result is not None:\n            log.debug(f\"Converted ({concept.id} | {concept.name}) to (None | {lookup_result}): no dosage detected\")\n            concept.id = None\n            concept.name = lookup_result\n\n    return concepts\n
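
The lookup logic above can be pictured without pandas as a dictionary keyed on (vtmId, dose, unit), with a text fallback when no product matches. The sketch below is a simplified analogue rather than MiADE's code: the `Med` dataclass, the `convert` helper, and every ID and name in the two tables are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Med:
    # Hypothetical stand-in for a medication concept.
    id: Optional[str]
    name: str
    dose: Optional[float] = None
    unit: Optional[str] = None


# Made-up lookup tables; these are not real VTM or VMP identifiers.
VTM_TO_VMP: Dict[Tuple[int, float, str], int] = {(1001, 500.0, 'mg'): 2001}
VTM_TO_TEXT: Dict[int, str] = {1001: 'EXAMPLE DRUG (free text)'}


def convert(med: Med) -> Med:
    # Try an exact (vtmId, dose, unit) match first, like the left merge.
    if med.dose is not None and med.unit is not None:
        vmp_id = VTM_TO_VMP.get((int(med.id), float(med.dose), med.unit))
        if vmp_id is not None:
            med.name += ' ' + str(int(med.dose)) + med.unit + ' tablets'
            med.id = str(vmp_id)
            med.dose, med.unit = 1, '{tbl}'  # dose becomes one tablet
            return med
    # Otherwise fall back to the text lookup, if an entry exists.
    text = VTM_TO_TEXT.get(int(med.id))
    if text is not None:
        med.id, med.name = None, text
    return med


matched = convert(Med(id='1001', name='example drug', dose=500, unit='mg'))
unmatched = convert(Med(id='1001', name='example drug', dose=250, unit='mg'))
no_dose = convert(Med(id='1001', name='example drug'))
print(matched, unmatched, no_dose)
```

Only the first call finds a product match; the other two end up with a text name and no ID, mirroring the two fallback branches in the source.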
"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.postprocess","title":"postprocess(concepts, note)","text":"

Postprocesses a list of concepts and links reactions to allergens.

Parameters:

concepts (List[Concept]): The list of concepts to be postprocessed. Required.
note (Note): The note object associated with the concepts. Required.

Returns:

List[Concept]: The postprocessed list of concepts.

Source code in src/miade/annotators.py
def postprocess(self, concepts: List[Concept], note: Note) -> List[Concept]:\n    \"\"\"\n    Postprocesses a list of concepts and links reactions to allergens.\n\n    Args:\n        concepts (List[Concept]): The list of concepts to be postprocessed.\n        note (Note): The note object associated with the concepts.\n\n    Returns:\n       The postprocessed list of concepts.\n    \"\"\"\n    # deepcopy so we still have reference to original list of concepts\n    all_concepts = deepcopy(concepts)\n    processed_concepts = []\n\n    for concept in all_concepts:\n        concept = self._validate_and_convert_concepts(concept)\n        processed_concepts.append(concept)\n\n    processed_concepts = self._link_reactions_to_allergens(processed_concepts, note)\n\n    return processed_concepts\n
"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.process_paragraphs","title":"process_paragraphs(note, concepts)","text":"

Process the paragraphs in a note and update the list of concepts.

Parameters:

note (Note): The note object containing the paragraphs. Required.
concepts (List[Concept]): The list of concepts to be updated. Required.

Returns:

List[Concept]: The updated list of concepts.

Source code in src/miade/annotators.py
def process_paragraphs(self, note: Note, concepts: List[Concept]) -> List[Concept]:\n    \"\"\"\n    Process the paragraphs in a note and update the list of concepts.\n\n    Args:\n        note (Note): The note object containing the paragraphs.\n        concepts (List[Concept]): The list of concepts to be updated.\n\n    Returns:\n        The updated list of concepts.\n    \"\"\"\n    for paragraph in note.paragraphs:\n        for concept in concepts:\n            if concept.start >= paragraph.start and concept.end <= paragraph.end:\n                # log.debug(f\"({concept.name} | {concept.id}) is in {paragraph.type}\")\n                if concept.meta:\n                    self._process_meta_ann_by_paragraph(concept, paragraph)\n\n    return concepts\n
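
The containment test in the loop above treats a concept as belonging to a paragraph only when its whole span lies inside the paragraph's character range. A minimal sketch of that check follows, using a hypothetical `Span` class and `containing_paragraph` helper rather than MiADE's `Concept` and `Paragraph` classes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    # Hypothetical stand-in carrying only character offsets and a label.
    label: str
    start: int
    end: int


def containing_paragraph(concept: Span, paragraphs: List[Span]) -> Optional[str]:
    # Return the label of the first paragraph that fully contains the concept.
    for paragraph in paragraphs:
        if concept.start >= paragraph.start and concept.end <= paragraph.end:
            return paragraph.label
    return None


paragraphs = [Span('prose', 0, 40), Span('meds', 42, 90)]
inside = containing_paragraph(Span('aspirin', 50, 57), paragraphs)
straddling = containing_paragraph(Span('odd', 38, 45), paragraphs)
print(inside, straddling)
```

A span that straddles a paragraph boundary matches no paragraph, so it would receive no paragraph-based processing under this check.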
"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.run_pipeline","title":"run_pipeline(note, record_concepts, dosage_extractor)","text":"

Runs the annotation pipeline on the given note.

Parameters:

note (Note): The input note to run the pipeline on. Required.
record_concepts (List[Concept]): The list of previously recorded concepts. Required.
dosage_extractor (Optional[DosageExtractor]): The dosage extractor function. Required.

Returns:

List[Concept]: The list of annotated concepts.

Source code in src/miade/annotators.py
def run_pipeline(\n    self, note: Note, record_concepts: List[Concept], dosage_extractor: Optional[DosageExtractor]\n) -> List[Concept]:\n    \"\"\"\n    Runs the annotation pipeline on the given note.\n\n    Args:\n        note (Note): The input note to run the pipeline on.\n        record_concepts (List[Concept]): The list of previously recorded concepts.\n        dosage_extractor (Optional[DosageExtractor]): The dosage extractor function.\n\n    Returns:\n        The list of annotated concepts.\n    \"\"\"\n    concepts: List[Concept] = []\n\n    for pipe in self.pipeline:\n        if pipe not in self.config.disable:\n            if pipe == \"preprocessor\":\n                note = self.preprocess(note)\n            elif pipe == \"medcat\":\n                concepts = self.get_concepts(note)\n            elif pipe == \"paragrapher\":\n                concepts = self.process_paragraphs(note, concepts)\n            elif pipe == \"postprocessor\":\n                concepts = self.postprocess(concepts, note)\n            elif pipe == \"deduplicator\":\n                concepts = self.deduplicate(concepts, record_concepts)\n            elif pipe == \"vtm_converter\":\n                concepts = self.convert_VTM_to_VMP_or_text(concepts)\n            elif pipe == \"dosage_extractor\" and dosage_extractor is not None:\n                concepts = self.add_dosages_to_concepts(dosage_extractor, concepts, note)\n\n    return concepts\n
"},{"location":"api-reference/metaannotations/","title":"MetaAnnotations","text":"

Bases: BaseModel

Represents a meta annotation with a name, value, and optional confidence.

Attributes:

name (str): The name of the meta annotation.
value (Enum): The value of the meta annotation.
confidence (float): The confidence level of the meta annotation.

Source code in src/miade/metaannotations.py
class MetaAnnotations(BaseModel):\n    \"\"\"\n    Represents a meta annotation with a name, value, and optional confidence.\n\n    Attributes:\n        name (str): The name of the meta annotation.\n        value (Enum): The value of the meta annotation.\n        confidence (float, optional): The confidence level of the meta annotation.\n    \"\"\"\n\n    name: str\n    value: Enum\n    confidence: Optional[float]\n\n    @validator(\"value\", pre=True)\n    def validate_value(cls, value, values):\n        enum_dict = META_ANNS_DICT\n        if isinstance(value, str):\n            enum_type = enum_dict.get(values[\"name\"])\n            if enum_type is not None:\n                try:\n                    return enum_type(value)\n                except ValueError:\n                    raise ValueError(f\"Invalid value: {value}\")\n            else:\n                raise ValueError(f\"Invalid mapping for {values['name']}\")\n\n        return value\n\n    def __eq__(self, other):\n        return self.name == other.name and self.value == other.value\n
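
One consequence of the `__eq__` shown above is that two meta annotations compare equal when their name and value match, regardless of confidence. The sketch below mimics that behaviour with a plain dataclass; `Presence` and `MetaAnn` are hypothetical stand-ins, not MiADE classes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Presence(Enum):
    # Hypothetical value enum for illustration only.
    CONFIRMED = 'confirmed'
    NEGATED = 'negated'


@dataclass(eq=False)
class MetaAnn:
    name: str
    value: Enum
    confidence: Optional[float] = None

    def __eq__(self, other):
        # Confidence is deliberately left out of the comparison.
        return self.name == other.name and self.value == other.value


a = MetaAnn('presence', Presence.CONFIRMED, confidence=0.9)
b = MetaAnn('presence', Presence.CONFIRMED, confidence=0.4)
c = MetaAnn('presence', Presence.NEGATED, confidence=0.9)
print(a == b, a == c)
```

Here `a` and `b` are equal despite different confidence values, while `a` and `c` differ because their values differ.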
"},{"location":"api-reference/note/","title":"Note","text":"

Bases: object

Represents a note object.

Attributes:

text (str): The text content of the note.
raw_text (str): The raw text content of the note.
regex_config (str): The path to the regex configuration file.
paragraphs (Optional[List[Paragraph]]): A list of paragraphs in the note.

Source code in src/miade/note.py
class Note(object):\n    \"\"\"\n    Represents a note object.\n\n    Attributes:\n        text (str): The text content of the note.\n        raw_text (str): The raw text content of the note.\n        regex_config (str): The path to the regex configuration file.\n        paragraphs (Optional[List[Paragraph]]): A list of paragraphs in the note.\n    \"\"\"\n\n    def __init__(self, text: str, regex_config_path: str = \"./data/regex_para_chunk.csv\"):\n        self.text = text\n        self.raw_text = text\n        self.regex_config = load_regex_config_mappings(regex_config_path)\n        self.paragraphs: Optional[List[Paragraph]] = []\n\n    def clean_text(self) -> None:\n        \"\"\"\n        Cleans the text content of the note.\n\n        This method performs various cleaning operations on the text content of the note,\n        such as replacing spaces, removing punctuation, and removing empty lines.\n        \"\"\"\n\n        # Replace all types of spaces with a single normal space, preserving \"\\n\"\n        self.text = re.sub(r\"(?:(?!\\n)\\s)+\", \" \", self.text)\n\n        # Remove en dashes that are not between two numbers\n        self.text = re.sub(r\"(?<![0-9])-(?![0-9])\", \"\", self.text)\n\n        # Remove all punctuation except full stops, question marks, dash and line breaks\n        self.text = re.sub(r\"[^\\w\\s.,?\\n-]\", \"\", self.text)\n\n        # Remove spaces if the entire line (between two line breaks) is just spaces\n        self.text = re.sub(r\"(?<=\\n)\\s+(?=\\n)\", \"\", self.text)\n\n    def get_paragraphs(self) -> None:\n        \"\"\"\n        Splits the note into paragraphs.\n\n        This method splits the text content of the note into paragraphs based on double line breaks.\n        It also assigns a paragraph type to each paragraph based on matching patterns in the heading.\n        \"\"\"\n\n        paragraphs = re.split(r\"\\n\\n+\", self.text)\n        start = 0\n\n        for text in paragraphs:\n            # Default 
to prose\n            paragraph_type = ParagraphType.prose\n\n            # Use re.search to find everything before first \\n\n            match = re.search(r\"^(.*?)(?:\\n|$)([\\s\\S]*)\", text)\n\n            # Check if a match is found\n            if match:\n                heading = match.group(1)\n                body = match.group(2)\n            else:\n                heading = text\n                body = \"\"\n\n            end = start + len(text)\n            paragraph = Paragraph(heading=heading, body=body, type=paragraph_type, start=start, end=end)\n            start = end + 2  # Account for the two newline characters\n\n            # Convert the heading to lowercase for case-insensitive matching\n            if heading:\n                heading = heading.lower()\n                # Iterate through the dictionary items and patterns\n                for paragraph_type, pattern in self.regex_config.items():\n                    if re.search(pattern, heading):\n                        paragraph.type = paragraph_type\n                        break  # Exit the loop if a match is found\n\n            self.paragraphs.append(paragraph)\n\n    def __str__(self):\n        return self.text\n
"},{"location":"api-reference/note/#miade.note.Note.clean_text","title":"clean_text()","text":"

Cleans the text content of the note.

This method performs various cleaning operations on the text content of the note, such as replacing spaces, removing punctuation, and removing empty lines.

Source code in src/miade/note.py
def clean_text(self) -> None:\n    \"\"\"\n    Cleans the text content of the note.\n\n    This method performs various cleaning operations on the text content of the note,\n    such as replacing spaces, removing punctuation, and removing empty lines.\n    \"\"\"\n\n    # Replace all types of spaces with a single normal space, preserving \"\\n\"\n    self.text = re.sub(r\"(?:(?!\\n)\\s)+\", \" \", self.text)\n\n    # Remove en dashes that are not between two numbers\n    self.text = re.sub(r\"(?<![0-9])-(?![0-9])\", \"\", self.text)\n\n    # Remove all punctuation except full stops, question marks, dash and line breaks\n    self.text = re.sub(r\"[^\\w\\s.,?\\n-]\", \"\", self.text)\n\n    # Remove spaces if the entire line (between two line breaks) is just spaces\n    self.text = re.sub(r\"(?<=\\n)\\s+(?=\\n)\", \"\", self.text)\n
"},{"location":"api-reference/note/#miade.note.Note.get_paragraphs","title":"get_paragraphs()","text":"

Splits the note into paragraphs.

This method splits the text content of the note into paragraphs based on double line breaks. It also assigns a paragraph type to each paragraph based on matching patterns in the heading.

Source code in src/miade/note.py
def get_paragraphs(self) -> None:\n    \"\"\"\n    Splits the note into paragraphs.\n\n    This method splits the text content of the note into paragraphs based on double line breaks.\n    It also assigns a paragraph type to each paragraph based on matching patterns in the heading.\n    \"\"\"\n\n    paragraphs = re.split(r\"\\n\\n+\", self.text)\n    start = 0\n\n    for text in paragraphs:\n        # Default to prose\n        paragraph_type = ParagraphType.prose\n\n        # Use re.search to find everything before first \\n\n        match = re.search(r\"^(.*?)(?:\\n|$)([\\s\\S]*)\", text)\n\n        # Check if a match is found\n        if match:\n            heading = match.group(1)\n            body = match.group(2)\n        else:\n            heading = text\n            body = \"\"\n\n        end = start + len(text)\n        paragraph = Paragraph(heading=heading, body=body, type=paragraph_type, start=start, end=end)\n        start = end + 2  # Account for the two newline characters\n\n        # Convert the heading to lowercase for case-insensitive matching\n        if heading:\n            heading = heading.lower()\n            # Iterate through the dictionary items and patterns\n            for paragraph_type, pattern in self.regex_config.items():\n                if re.search(pattern, heading):\n                    paragraph.type = paragraph_type\n                    break  # Exit the loop if a match is found\n\n        self.paragraphs.append(paragraph)\n
"},{"location":"api-reference/noteprocessor/","title":"NoteProcessor","text":"

Main processor of MiADE, which extracts, postprocesses, and deduplicates concepts given annotators (MedCAT models), a Note, and existing concepts.

Parameters:

model_directory (Path): Path to directory that contains medcat models and a config.yaml file. Required.
model_config_path (Path): Path to the model config file. Defaults to None.
log_level (int): Log level. Defaults to logging.INFO.
dosage_extractor_log_level (int): Log level for dosage extractor. Defaults to logging.INFO.
device (str): Device to run inference on (cpu or gpu). Defaults to \"cpu\".
custom_annotators (List[Annotator]): List of custom annotators. Defaults to None.

Source code in src/miade/core.py
class NoteProcessor:\n    \"\"\"\n    Main processor of MiADE which extract, postprocesses, and deduplicates concepts given\n    annotators (MedCAT models), Note, and existing concepts\n\n    Args:\n        model_directory (Path): Path to directory that contains medcat models and a config.yaml file\n        model_config_path (Path, optional): Path to the model config file. Defaults to None.\n        log_level (int, optional): Log level. Defaults to logging.INFO.\n        dosage_extractor_log_level (int, optional): Log level for dosage extractor. Defaults to logging.INFO.\n        device (str, optional): Device to run inference on (cpu or gpu). Defaults to \"cpu\".\n        custom_annotators (List[Annotator], optional): List of custom annotators. Defaults to None.\n    \"\"\"\n\n    def __init__(\n        self,\n        model_directory: Path,\n        model_config_path: Path = None,\n        log_level: int = logging.INFO,\n        dosage_extractor_log_level: int = logging.INFO,\n        device: str = \"cpu\",\n        custom_annotators: Optional[List[Annotator]] = None,\n    ):\n        logging.getLogger(\"miade\").setLevel(log_level)\n        logging.getLogger(\"miade.dosageextractor\").setLevel(dosage_extractor_log_level)\n        logging.getLogger(\"miade.drugdoseade\").setLevel(dosage_extractor_log_level)\n\n        self.device: str = device\n\n        self.annotators: List[Annotator] = []\n        self.model_directory: Path = model_directory\n        self.model_config_path: Path = model_config_path\n        self.model_factory: ModelFactory = self._load_model_factory(custom_annotators)\n        self.dosage_extractor: DosageExtractor = DosageExtractor()\n\n    def _load_config(self) -> Dict:\n        \"\"\"\n        Loads the configuration file (config.yaml) in the configured model path.\n        If the model path is not explicitly passed, it defaults to the model directory.\n\n        Returns:\n            A dictionary containing the loaded config file.\n        
\"\"\"\n        if self.model_config_path is None:\n            config_path = os.path.join(self.model_directory, \"config.yaml\")\n        else:\n            config_path = self.model_config_path\n\n        if os.path.isfile(config_path):\n            log.info(f\"Found config file {config_path}\")\n        else:\n            log.error(f\"No model config file found at {config_path}\")\n\n        with open(config_path, \"r\") as f:\n            config = yaml.safe_load(f)\n\n        return config\n\n    def _load_model_factory(self, custom_annotators: Optional[List[Annotator]] = None) -> ModelFactory:\n        \"\"\"\n        Loads the model factory which maps model aliases to MedCAT model IDs and MiADE annotators.\n\n        Args:\n            custom_annotators (List[Annotators], optional): List of custom annotators to initialize. Defaults to None.\n\n        Returns:\n            The initialized ModelFactory object.\n\n        Raises:\n            Exception: If there is an error loading MedCAT models.\n\n        \"\"\"\n        meta_cat_config_dict = {\"general\": {\"device\": self.device}}\n        config_dict = self._load_config()\n        loaded_models = {}\n\n        # get model {id: cat_model}\n        log.info(f\"Loading MedCAT models from {self.model_directory}\")\n        for model_pack_filepath in self.model_directory.glob(\"*.zip\"):\n            try:\n                cat = MiADE_CAT.load_model_pack(str(model_pack_filepath), meta_cat_config_dict=meta_cat_config_dict)\n                # temp fix reload to load stop words\n                cat.pipe._nlp = spacy.load(\n                    cat.config.general.spacy_model, disable=cat.config.general.spacy_disabled_components\n                )\n                cat._create_pipeline(config=cat.config)\n                cat_id = cat.config.version[\"id\"]\n                loaded_models[cat_id] = cat\n            except Exception as e:\n                raise Exception(f\"Error loading MedCAT models: {e}\")\n\n        
mapped_models = {}\n        # map to name if given {name: <class CAT>}\n        if \"models\" in config_dict:\n            for name, model_id in config_dict[\"models\"].items():\n                cat_model = loaded_models.get(model_id)\n                if cat_model is None:\n                    log.warning(f\"No match for model id {model_id} in {self.model_directory}, skipping\")\n                    continue\n                mapped_models[name] = cat_model\n        else:\n            log.warning(\"No model ids configured!\")\n\n        mapped_annotators = {}\n        # {name: <class Annotator>}\n        if \"annotators\" in config_dict:\n            for name, annotator_string in config_dict[\"annotators\"].items():\n                if custom_annotators is not None:\n                    for annotator_class in custom_annotators:\n                        if annotator_class.__name__ == annotator_string:\n                            mapped_annotators[name] = annotator_class\n                            break\n                if name not in mapped_annotators:\n                    try:\n                        annotator_class = getattr(sys.modules[__name__], annotator_string)\n                        mapped_annotators[name] = annotator_class\n                    except AttributeError as e:\n                        log.warning(f\"{annotator_string} not found: {e}\")\n        else:\n            log.warning(\"No annotators configured!\")\n\n        mapped_configs = {}\n        if \"general\" in config_dict:\n            for name, config in config_dict[\"general\"].items():\n                try:\n                    mapped_configs[name] = AnnotatorConfig(**config)\n                except Exception as e:\n                    log.error(f\"Error processing config for '{name}': {str(e)}\")\n        else:\n            log.warning(\"No general settings configured, using default settings.\")\n\n        model_factory_config = {\"models\": mapped_models, \"annotators\": 
mapped_annotators, \"configs\": mapped_configs}\n\n        return ModelFactory(**model_factory_config)\n\n    def add_annotator(self, name: str) -> None:\n        \"\"\"\n        Adds an annotator to the processor.\n\n        Args:\n            name (str): The alias of the annotator to add.\n\n        Returns:\n            None\n\n        Raises:\n            Exception: If there is an error creating the annotator.\n        \"\"\"\n        try:\n            annotator = create_annotator(name, self.model_factory)\n            log.info(\n                f\"Added {type(annotator).__name__} to processor with config {self.model_factory.configs.get(name)}\"\n            )\n        except Exception as e:\n            raise Exception(f\"Error creating annotator: {e}\")\n\n        self.annotators.append(annotator)\n\n    def remove_annotator(self, name: str) -> None:\n        \"\"\"\n        Removes an annotator from the processor.\n\n        Args:\n            name (str): The alias of the annotator to remove.\n\n        Returns:\n            None\n        \"\"\"\n        annotator_found = False\n        annotator_name = self.model_factory.annotators[name]\n\n        for annotator in self.annotators:\n            if type(annotator).__name__ == annotator_name.__name__:\n                self.annotators.remove(annotator)\n                annotator_found = True\n                log.info(f\"Removed {type(annotator).__name__} from processor\")\n                break\n\n        if not annotator_found:\n            log.warning(f\"Annotator {type(name).__name__} not found in processor\")\n\n    def print_model_cards(self) -> None:\n        \"\"\"\n        Prints the model cards for each annotator in the `annotators` list.\n\n        Each model card includes the name of the annotator's class and its category.\n        \"\"\"\n        for annotator in self.annotators:\n            print(f\"{type(annotator).__name__}: {annotator.cat}\")\n\n    def process(self, note: Note, 
record_concepts: Optional[List[Concept]] = None) -> List[Concept]:\n        \"\"\"\n        Process the given note and extract concepts using the loaded annotators.\n\n        Args:\n            note (Note): The note to be processed.\n            record_concepts (Optional[List[Concept]]): A list of existing concepts in the EHR record.\n\n        Returns:\n            A list of extracted concepts.\n\n        \"\"\"\n        if not self.annotators:\n            log.warning(\"No annotators loaded, use .add_annotator() to load annotators\")\n            return []\n\n        concepts: List[Concept] = []\n\n        for annotator in self.annotators:\n            log.debug(f\"Processing concepts with {type(annotator).__name__}\")\n            if Category.MEDICATION in annotator.concept_types:\n                detected_concepts = annotator(note, record_concepts, self.dosage_extractor)\n                concepts.extend(detected_concepts)\n            else:\n                detected_concepts = annotator(note, record_concepts)\n                concepts.extend(detected_concepts)\n\n        return concepts\n\n    def get_concept_dicts(\n        self, note: Note, filter_uncategorized: bool = True, record_concepts: Optional[List[Concept]] = None\n    ) -> List[Dict]:\n        \"\"\"\n        Returns concepts in dictionary format.\n\n        Args:\n            note (Note): Note containing text to extract concepts from.\n            filter_uncategorized (bool): If True, does not return concepts where category=None. 
Default is True.\n            record_concepts (Optional[List[Concept]]): List of concepts in existing record.\n\n        Returns:\n            Extracted concepts in JSON-compatible dictionary format.\n        \"\"\"\n        concepts = self.process(note, record_concepts)\n        concept_list = []\n        for concept in concepts:\n            if filter_uncategorized and concept.category is None:\n                continue\n            concept_dict = concept.__dict__\n            if concept.dosage is not None:\n                concept_dict[\"dosage\"] = {\n                    \"dose\": concept.dosage.dose.dict() if concept.dosage.dose else None,\n                    \"duration\": concept.dosage.duration.dict() if concept.dosage.duration else None,\n                    \"frequency\": concept.dosage.frequency.dict() if concept.dosage.frequency else None,\n                    \"route\": concept.dosage.route.dict() if concept.dosage.route else None,\n                }\n            if concept.meta is not None:\n                meta_anns = []\n                for meta in concept.meta:\n                    meta_dict = meta.__dict__\n                    meta_dict[\"value\"] = meta.value.name\n                    meta_anns.append(meta_dict)\n                concept_dict[\"meta\"] = meta_anns\n            if concept.category is not None:\n                concept_dict[\"category\"] = concept.category.name\n            concept_list.append(concept_dict)\n\n        return concept_list\n
"},{"location":"api-reference/noteprocessor/#miade.core.NoteProcessor.add_annotator","title":"add_annotator(name)","text":"

Adds an annotator to the processor.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| name | str | The alias of the annotator to add. | required |

Returns:

| Type | Description |
| --- | --- |
| None | None |

Raises:

| Type | Description |
| --- | --- |
| Exception | If there is an error creating the annotator. |

Source code in src/miade/core.py
def add_annotator(self, name: str) -> None:\n    \"\"\"\n    Adds an annotator to the processor.\n\n    Args:\n        name (str): The alias of the annotator to add.\n\n    Returns:\n        None\n\n    Raises:\n        Exception: If there is an error creating the annotator.\n    \"\"\"\n    try:\n        annotator = create_annotator(name, self.model_factory)\n        log.info(\n            f\"Added {type(annotator).__name__} to processor with config {self.model_factory.configs.get(name)}\"\n        )\n    except Exception as e:\n        raise Exception(f\"Error creating annotator: {e}\")\n\n    self.annotators.append(annotator)\n
"},{"location":"api-reference/noteprocessor/#miade.core.NoteProcessor.get_concept_dicts","title":"get_concept_dicts(note, filter_uncategorized=True, record_concepts=None)","text":"

Returns concepts in dictionary format.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| note | Note | Note containing text to extract concepts from. | required |
| filter_uncategorized | bool | If True, does not return concepts where category=None. Default is True. | True |
| record_concepts | Optional[List[Concept]] | List of concepts in existing record. | None |

Returns:

| Type | Description |
| --- | --- |
| List[Dict] | Extracted concepts in JSON-compatible dictionary format. |

Source code in src/miade/core.py
def get_concept_dicts(\n    self, note: Note, filter_uncategorized: bool = True, record_concepts: Optional[List[Concept]] = None\n) -> List[Dict]:\n    \"\"\"\n    Returns concepts in dictionary format.\n\n    Args:\n        note (Note): Note containing text to extract concepts from.\n        filter_uncategorized (bool): If True, does not return concepts where category=None. Default is True.\n        record_concepts (Optional[List[Concept]]): List of concepts in existing record.\n\n    Returns:\n        Extracted concepts in JSON-compatible dictionary format.\n    \"\"\"\n    concepts = self.process(note, record_concepts)\n    concept_list = []\n    for concept in concepts:\n        if filter_uncategorized and concept.category is None:\n            continue\n        concept_dict = concept.__dict__\n        if concept.dosage is not None:\n            concept_dict[\"dosage\"] = {\n                \"dose\": concept.dosage.dose.dict() if concept.dosage.dose else None,\n                \"duration\": concept.dosage.duration.dict() if concept.dosage.duration else None,\n                \"frequency\": concept.dosage.frequency.dict() if concept.dosage.frequency else None,\n                \"route\": concept.dosage.route.dict() if concept.dosage.route else None,\n            }\n        if concept.meta is not None:\n            meta_anns = []\n            for meta in concept.meta:\n                meta_dict = meta.__dict__\n                meta_dict[\"value\"] = meta.value.name\n                meta_anns.append(meta_dict)\n            concept_dict[\"meta\"] = meta_anns\n        if concept.category is not None:\n            concept_dict[\"category\"] = concept.category.name\n        concept_list.append(concept_dict)\n\n    return concept_list\n
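The docstring describes the output as JSON-compatible, so the dictionaries should be usable with the standard json module. A minimal sketch, assuming a hand-written sample dictionary modelled on the quickstart output rather than a live call to get_concept_dicts (the variable names are illustrative only):

```python
import json

# Hand-written sample shaped like one entry returned by get_concept_dicts();
# the values are copied from the quickstart output, not produced by a model.
sample_concepts = [
    {
        'name': 'hypothyroidism (historic)',
        'id': '161443002',
        'category': 'PROBLEM',
        'start': 46,
        'end': 60,
        'dosage': None,
        'negex': False,
        'meta': [{'name': 'relevance', 'value': 'HISTORIC', 'confidence': 0.999841570854187}],
    }
]

# Serialise to a JSON string, then read it back to confirm nothing is lost.
payload = json.dumps(sample_concepts, indent=2)
round_tripped = json.loads(payload)
print(round_tripped[0]['name'])
```

Enum-valued fields such as category and the meta values are converted to their names inside get_concept_dicts, which is what makes this kind of round trip possible.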
"},{"location":"api-reference/noteprocessor/#miade.core.NoteProcessor.print_model_cards","title":"print_model_cards()","text":"

Prints the model cards for each annotator in the annotators list.

Each model card includes the name of the annotator's class and its category.

Source code in src/miade/core.py
def print_model_cards(self) -> None:\n    \"\"\"\n    Prints the model cards for each annotator in the `annotators` list.\n\n    Each model card includes the name of the annotator's class and its category.\n    \"\"\"\n    for annotator in self.annotators:\n        print(f\"{type(annotator).__name__}: {annotator.cat}\")\n
"},{"location":"api-reference/noteprocessor/#miade.core.NoteProcessor.process","title":"process(note, record_concepts=None)","text":"

Process the given note and extract concepts using the loaded annotators.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| note | Note | The note to be processed. | required |
| record_concepts | Optional[List[Concept]] | A list of existing concepts in the EHR record. | None |

Returns:

| Type | Description |
| --- | --- |
| List[Concept] | A list of extracted concepts. |

Source code in src/miade/core.py
def process(self, note: Note, record_concepts: Optional[List[Concept]] = None) -> List[Concept]:\n    \"\"\"\n    Process the given note and extract concepts using the loaded annotators.\n\n    Args:\n        note (Note): The note to be processed.\n        record_concepts (Optional[List[Concept]]): A list of existing concepts in the EHR record.\n\n    Returns:\n        A list of extracted concepts.\n\n    \"\"\"\n    if not self.annotators:\n        log.warning(\"No annotators loaded, use .add_annotator() to load annotators\")\n        return []\n\n    concepts: List[Concept] = []\n\n    for annotator in self.annotators:\n        log.debug(f\"Processing concepts with {type(annotator).__name__}\")\n        if Category.MEDICATION in annotator.concept_types:\n            detected_concepts = annotator(note, record_concepts, self.dosage_extractor)\n            concepts.extend(detected_concepts)\n        else:\n            detected_concepts = annotator(note, record_concepts)\n            concepts.extend(detected_concepts)\n\n    return concepts\n
"},{"location":"api-reference/noteprocessor/#miade.core.NoteProcessor.remove_annotator","title":"remove_annotator(name)","text":"

Removes an annotator from the processor.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| name | str | The alias of the annotator to remove. | required |

Returns:

| Type | Description |
| --- | --- |
| None | None |

Source code in src/miade/core.py
def remove_annotator(self, name: str) -> None:\n    \"\"\"\n    Removes an annotator from the processor.\n\n    Args:\n        name (str): The alias of the annotator to remove.\n\n    Returns:\n        None\n    \"\"\"\n    annotator_found = False\n    annotator_name = self.model_factory.annotators[name]\n\n    for annotator in self.annotators:\n        if type(annotator).__name__ == annotator_name.__name__:\n            self.annotators.remove(annotator)\n            annotator_found = True\n            log.info(f\"Removed {type(annotator).__name__} from processor\")\n            break\n\n    if not annotator_found:\n        log.warning(f\"Annotator {type(name).__name__} not found in processor\")\n
"},{"location":"api-reference/problemsannotator/","title":"ProblemsAnnotator","text":"

Bases: Annotator

Annotator class for identifying and processing problems in medical notes.

This class extends the base Annotator class and provides specific functionality for identifying and processing problems in medical notes. It implements methods for loading problem lookup data, processing meta annotations, filtering concepts, and post-processing the annotated concepts.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| cat | CAT | The CAT (Concept Annotation Tool) instance used for annotation. |
| config | AnnotatorConfig | The configuration object for the annotator. |

Properties:

concept_types (list): A list of concept types supported by this annotator.
pipeline (list): The list of processing steps in the annotation pipeline.

Source code in src/miade/annotators.py
class ProblemsAnnotator(Annotator):\n    \"\"\"\n    Annotator class for identifying and processing problems in medical notes.\n\n    This class extends the base `Annotator` class and provides specific functionality\n    for identifying and processing problems in medical notes. It implements methods\n    for loading problem lookup data, processing meta annotations, filtering concepts,\n    and post-processing the annotated concepts.\n\n    Attributes:\n        cat (CAT): The CAT (Concept Annotation Tool) instance used for annotation.\n        config (AnnotatorConfig): The configuration object for the annotator.\n\n    Properties:\n        concept_types (list): A list of concept types supported by this annotator.\n        pipeline (list): The list of processing steps in the annotation pipeline.\n    \"\"\"\n\n    def __init__(self, cat: CAT, config: AnnotatorConfig = None):\n        super().__init__(cat, config)\n        self._load_problems_lookup_data()\n\n    @property\n    def concept_types(self) -> List[Category]:\n        \"\"\"\n        Get the list of concept types supported by this annotator.\n\n        Returns:\n            [Category.PROBLEM]\n        \"\"\"\n        return [Category.PROBLEM]\n\n    @property\n    def pipeline(self) -> List[str]:\n        \"\"\"\n        Get the list of processing steps in the annotation pipeline.\n\n        Returns:\n            [\"preprocessor\", \"medcat\", \"paragrapher\", \"postprocessor\", \"deduplicator\"]\n        \"\"\"\n        return [\"preprocessor\", \"medcat\", \"paragrapher\", \"postprocessor\", \"deduplicator\"]\n\n    def _load_problems_lookup_data(self) -> None:\n        \"\"\"\n        Load the problem lookup data.\n\n        Raises:\n            RuntimeError: If the lookup data directory does not exist.\n        \"\"\"\n        if not os.path.isdir(self.config.lookup_data_path):\n            raise RuntimeError(f\"No lookup data configured: {self.config.lookup_data_path} does not exist!\")\n        
else:\n            self.negated_lookup = load_lookup_data(self.config.lookup_data_path + \"negated.csv\", as_dict=True)\n            self.historic_lookup = load_lookup_data(self.config.lookup_data_path + \"historic.csv\", as_dict=True)\n            self.suspected_lookup = load_lookup_data(self.config.lookup_data_path + \"suspected.csv\", as_dict=True)\n            self.filtering_blacklist = load_lookup_data(\n                self.config.lookup_data_path + \"problem_blacklist.csv\", no_header=True\n            )\n\n    def _process_meta_annotations(self, concept: Concept) -> Optional[Concept]:\n        \"\"\"\n        Process the meta annotations for a concept.\n\n        Args:\n            concept (Concept): The concept to process.\n\n        Returns:\n           The processed concept, or None if it should be removed.\n\n        Raises:\n            ValueError: If the concept has an invalid negex value.\n        \"\"\"\n        # Add, convert, or ignore concepts\n        meta_ann_values = [meta_ann.value for meta_ann in concept.meta] if concept.meta is not None else []\n\n        convert = False\n        tag = \"\"\n        # only get meta model results if negex is false\n        if concept.negex is not None:\n            if concept.negex:\n                convert = self.negated_lookup.get(int(concept.id), False)\n                tag = \" (negated)\"\n            elif Presence.SUSPECTED in meta_ann_values:\n                convert = self.suspected_lookup.get(int(concept.id), False)\n                tag = \" (suspected)\"\n            elif Relevance.HISTORIC in meta_ann_values:\n                convert = self.historic_lookup.get(int(concept.id), False)\n                tag = \" (historic)\"\n        else:\n            if Presence.NEGATED in meta_ann_values:\n                convert = self.negated_lookup.get(int(concept.id), False)\n                tag = \" (negated)\"\n            elif Presence.SUSPECTED in meta_ann_values:\n                convert = 
self.suspected_lookup.get(int(concept.id), False)\n                tag = \" (suspected)\"\n            elif Relevance.HISTORIC in meta_ann_values:\n                convert = self.historic_lookup.get(int(concept.id), False)\n                tag = \" (historic)\"\n\n        if convert:\n            if tag == \" (negated)\" and concept.negex:\n                log.debug(\n                    f\"Converted concept ({concept.id} | {concept.name}) to ({str(convert)} | {concept.name + tag}): \"\n                    f\"negation detected by negex\"\n                )\n            else:\n                log.debug(\n                    f\"Converted concept ({concept.id} | {concept.name}) to ({str(convert)} | {concept.name + tag}):\"\n                    f\"detected by meta model\"\n                )\n            concept.id = str(convert)\n            concept.name += tag\n        else:\n            if concept.negex:\n                log.debug(f\"Removed concept ({concept.id} | {concept.name}): negation (negex) with no conversion match\")\n                return None\n            if concept.negex is None and Presence.NEGATED in meta_ann_values:\n                log.debug(\n                    f\"Removed concept ({concept.id} | {concept.name}): negation (meta model) with no conversion match\"\n                )\n                return None\n            if Presence.SUSPECTED in meta_ann_values:\n                log.debug(f\"Removed concept ({concept.id} | {concept.name}): suspected with no conversion match\")\n                return None\n            if Relevance.IRRELEVANT in meta_ann_values:\n                log.debug(f\"Removed concept ({concept.id} | {concept.name}): irrelevant concept\")\n                return None\n            if Relevance.HISTORIC in meta_ann_values:\n                log.debug(f\"No change to concept ({concept.id} | {concept.name}): historic with no conversion match\")\n\n        concept.category = Category.PROBLEM\n\n        return concept\n\n    def 
_is_blacklist(self, concept):\n        \"\"\"\n        Check if a concept is in the filtering blacklist.\n\n        Args:\n            concept: The concept to check.\n\n        Returns:\n            True if the concept is in the blacklist, False otherwise.\n        \"\"\"\n        # filtering blacklist\n        if int(concept.id) in self.filtering_blacklist.values:\n            log.debug(f\"Removed concept ({concept.id} | {concept.name}): concept in problems blacklist\")\n            return True\n        return False\n\n    def _process_meta_ann_by_paragraph(\n        self, concept: Concept, paragraph: Paragraph, prob_concepts_in_structured_sections: List[Concept]\n    ):\n        \"\"\"\n        Process the meta annotations for a concept based on the paragraph type.\n\n        Args:\n            concept (Concept): The concept to process.\n            paragraph (Paragraph): The paragraph containing the concept.\n            prob_concepts_in_structured_sections (List[Concept]): The list of problem concepts in structured sections.\n        \"\"\"\n        # if paragraph is structured problems section, add to prob list and convert to corresponding relevance\n        if paragraph.type in self.structured_prob_lists:\n            prob_concepts_in_structured_sections.append(concept)\n            for meta in concept.meta:\n                if meta.name == \"relevance\" and meta.value == Relevance.IRRELEVANT:\n                    new_relevance = self.structured_prob_lists[paragraph.type]\n                    log.debug(\n                        f\"Converted {meta.value} to \"\n                        f\"{new_relevance} for concept ({concept.id} | {concept.name}): \"\n                        f\"paragraph is {paragraph.type}\"\n                    )\n                    meta.value = new_relevance\n        # if paragraph is meds or irrelevant section, convert problems to irrelevant\n        elif paragraph.type in self.structured_med_lists or paragraph.type in 
self.irrelevant_paragraphs:\n            for meta in concept.meta:\n                if meta.name == \"relevance\" and meta.value != Relevance.IRRELEVANT:\n                    log.debug(\n                        f\"Converted {meta.value} to \"\n                        f\"{Relevance.IRRELEVANT} for concept ({concept.id} | {concept.name}): \"\n                        f\"paragraph is {paragraph.type}\"\n                    )\n                    meta.value = Relevance.IRRELEVANT\n\n    def process_paragraphs(self, note: Note, concepts: List[Concept]) -> List[Concept]:\n        \"\"\"\n        Process the paragraphs in a note and filter the concepts.\n\n        Args:\n            note (Note): The note to process.\n            concepts (List[Concept]): The list of concepts to filter.\n\n        Returns:\n            The filtered list of concepts.\n        \"\"\"\n        prob_concepts_in_structured_sections: List[Concept] = []\n\n        for paragraph in note.paragraphs:\n            for concept in concepts:\n                if concept.start >= paragraph.start and concept.end <= paragraph.end:\n                    # log.debug(f\"({concept.name} | {concept.id}) is in {paragraph.type}\")\n                    if concept.meta:\n                        self._process_meta_ann_by_paragraph(concept, paragraph, prob_concepts_in_structured_sections)\n\n        # if more than set no. 
concepts in prob or imp or pmh sections, return only those and ignore all other concepts\n        if len(prob_concepts_in_structured_sections) > self.config.structured_list_limit:\n            log.debug(\n                f\"Ignoring concepts elsewhere in the document because \"\n                f\"more than {self.config.structured_list_limit} concepts exist \"\n                f\"in prob, imp, pmh structured sections: {len(prob_concepts_in_structured_sections)}\"\n            )\n            return prob_concepts_in_structured_sections\n\n        return concepts\n\n    def postprocess(self, concepts: List[Concept]) -> List[Concept]:\n        \"\"\"\n        Post-process the concepts and filter out irrelevant concepts.\n\n        Args:\n            concepts (List[Concept]): The list of concepts to post-process.\n\n        Returns:\n            The filtered list of concepts.\n        \"\"\"\n        # deepcopy so we still have reference to original list of concepts\n        all_concepts = deepcopy(concepts)\n        filtered_concepts = []\n        for concept in all_concepts:\n            if self._is_blacklist(concept):\n                continue\n            # meta annotations\n            concept = self._process_meta_annotations(concept)\n            # ignore concepts filtered by meta-annotations\n            if concept is None:\n                continue\n            filtered_concepts.append(concept)\n\n        return filtered_concepts\n
"},{"location":"api-reference/problemsannotator/#miade.annotators.ProblemsAnnotator.concept_types","title":"concept_types: List[Category] property","text":"

Get the list of concept types supported by this annotator.

Returns:

| Type | Description |
| --- | --- |
| List[Category] | [Category.PROBLEM] |

"},{"location":"api-reference/problemsannotator/#miade.annotators.ProblemsAnnotator.pipeline","title":"pipeline: List[str] property","text":"

Get the list of processing steps in the annotation pipeline.

Returns:

| Type | Description |
| --- | --- |
| List[str] | [\"preprocessor\", \"medcat\", \"paragrapher\", \"postprocessor\", \"deduplicator\"] |

"},{"location":"api-reference/problemsannotator/#miade.annotators.ProblemsAnnotator.postprocess","title":"postprocess(concepts)","text":"

Post-process the concepts and filter out irrelevant concepts.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| concepts | List[Concept] | The list of concepts to post-process. | required |

Returns:

| Type | Description |
| --- | --- |
| List[Concept] | The filtered list of concepts. |

Source code in src/miade/annotators.py
def postprocess(self, concepts: List[Concept]) -> List[Concept]:\n    \"\"\"\n    Post-process the concepts and filter out irrelevant concepts.\n\n    Args:\n        concepts (List[Concept]): The list of concepts to post-process.\n\n    Returns:\n        The filtered list of concepts.\n    \"\"\"\n    # deepcopy so we still have reference to original list of concepts\n    all_concepts = deepcopy(concepts)\n    filtered_concepts = []\n    for concept in all_concepts:\n        if self._is_blacklist(concept):\n            continue\n        # meta annotations\n        concept = self._process_meta_annotations(concept)\n        # ignore concepts filtered by meta-annotations\n        if concept is None:\n            continue\n        filtered_concepts.append(concept)\n\n    return filtered_concepts\n
"},{"location":"api-reference/problemsannotator/#miade.annotators.ProblemsAnnotator.process_paragraphs","title":"process_paragraphs(note, concepts)","text":"

Process the paragraphs in a note and filter the concepts.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| note | Note | The note to process. | required |
| concepts | List[Concept] | The list of concepts to filter. | required |

Returns:

| Type | Description |
| --- | --- |
| List[Concept] | The filtered list of concepts. |

Source code in src/miade/annotators.py
def process_paragraphs(self, note: Note, concepts: List[Concept]) -> List[Concept]:\n    \"\"\"\n    Process the paragraphs in a note and filter the concepts.\n\n    Args:\n        note (Note): The note to process.\n        concepts (List[Concept]): The list of concepts to filter.\n\n    Returns:\n        The filtered list of concepts.\n    \"\"\"\n    prob_concepts_in_structured_sections: List[Concept] = []\n\n    for paragraph in note.paragraphs:\n        for concept in concepts:\n            if concept.start >= paragraph.start and concept.end <= paragraph.end:\n                # log.debug(f\"({concept.name} | {concept.id}) is in {paragraph.type}\")\n                if concept.meta:\n                    self._process_meta_ann_by_paragraph(concept, paragraph, prob_concepts_in_structured_sections)\n\n    # if more than set no. concepts in prob or imp or pmh sections, return only those and ignore all other concepts\n    if len(prob_concepts_in_structured_sections) > self.config.structured_list_limit:\n        log.debug(\n            f\"Ignoring concepts elsewhere in the document because \"\n            f\"more than {self.config.structured_list_limit} concepts exist \"\n            f\"in prob, imp, pmh structured sections: {len(prob_concepts_in_structured_sections)}\"\n        )\n        return prob_concepts_in_structured_sections\n\n    return concepts\n
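The effect of structured_list_limit can be illustrated in isolation. The sketch below is a simplification under stated assumptions: Span is a hypothetical stand-in for both Concept and Paragraph, the structured flag replaces the paragraph type lookup, and the meta annotation handling of the real method is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    # Hypothetical stand-in for Concept and Paragraph: only the fields the sketch needs.
    name: str
    start: int
    end: int
    structured: bool = False  # paragraphs only: True for a structured problem section


def select_concepts(paragraphs: List[Span], concepts: List[Span], limit: int) -> List[Span]:
    # Collect concepts whose character span falls inside a structured paragraph.
    in_structured = [
        c
        for p in paragraphs
        if p.structured
        for c in concepts
        if c.start >= p.start and c.end <= p.end
    ]
    # More than `limit` structured concepts: keep only those and drop the rest.
    if len(in_structured) > limit:
        return in_structured
    return concepts


paragraphs = [Span('prose', 0, 50), Span('pmh', 51, 120, structured=True)]
concepts = [Span('heart failure', 10, 23), Span('hypothyroidism', 70, 84)]

print([c.name for c in select_concepts(paragraphs, concepts, limit=0)])
print([c.name for c in select_concepts(paragraphs, concepts, limit=1)])
```

With a limit of 0, a single concept in a structured section is enough to discard everything found in prose; with a limit of 1, the same note keeps both concepts.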
"},{"location":"user-guide/configuration/","title":"Configurations","text":""},{"location":"user-guide/configuration/#annotator","title":"Annotator","text":"

The MiADE processor is configured by a YAML file that maps a human-readable key for each of your models to a MedCAT model ID and a MiADE annotator class. The config file must be in the same folder as the MedCAT models.

config.yaml
models:\n  problems: f25ec9423958e8d6\n  meds/allergies: a146c741501cf1f7\nannotators:\n  problems: ProblemsAnnotator\n  meds/allergies: MedsAllergiesAnnotator\ngeneral:\n  problems:\n    lookup_data_path: ./lookup_data/\n    negation_detection: None\n    structured_list_limit: 0  # if more than this number of concepts in structure section, ignore concepts in prose\n    disable: []\n    add_numbering: True\n  meds/allergies:\n    lookup_data_path: ./lookup_data/\n    negation_detection: None\n    disable: []\n    add_numbering: False\n
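Each alias appears under models, annotators and general. A small consistency check can catch a mistyped alias before the processor is created. This is a sketch only: it assumes every alias should appear under both models and annotators, it works on a plain dictionary that mirrors the YAML above (loading the file is left out), and check_aliases is a hypothetical helper, not part of MiADE.

```python
# Plain-dict mirror of config.yaml; in practice this would come from a YAML loader.
config = {
    'models': {'problems': 'f25ec9423958e8d6', 'meds/allergies': 'a146c741501cf1f7'},
    'annotators': {'problems': 'ProblemsAnnotator', 'meds/allergies': 'MedsAllergiesAnnotator'},
    'general': {
        'problems': {'lookup_data_path': './lookup_data/', 'structured_list_limit': 0},
        'meds/allergies': {'lookup_data_path': './lookup_data/'},
    },
}


def check_aliases(cfg: dict) -> list:
    # Hypothetical helper: report aliases that are missing from one of the sections.
    issues = []
    models = set(cfg.get('models', {}))
    annotators = set(cfg.get('annotators', {}))
    for alias in sorted(models - annotators):
        issues.append(f'{alias}: model ID but no annotator')
    for alias in sorted(annotators - models):
        issues.append(f'{alias}: annotator but no model ID')
    for alias in sorted(set(cfg.get('general', {})) - models):
        issues.append(f'{alias}: settings for an unknown alias')
    return issues


print(check_aliases(config))
```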
"},{"location":"user-guide/configuration/#lookup-table","title":"Lookup Table","text":"

Lookup tables are not packaged with the main MiADE package by default, so that the postprocessing steps can be customised. We provide example lookup data in miade-datasets, which you can download and use.

git clone https://github.com/uclh-criu/miade-datasets.git\n
"},{"location":"user-guide/quickstart/","title":"Quickstart","text":""},{"location":"user-guide/quickstart/#extract-concepts-and-dosages-from-a-note-using-miade","title":"Extract concepts and dosages from a Note using MiADE","text":""},{"location":"user-guide/quickstart/#configuring-the-miade-processor","title":"Configuring the MiADE Processor","text":"

NoteProcessor is the core of MiADE. It is initialised with a model directory path containing all the MedCAT model pack .zip files we would like to use in our pipeline, together with a config file that maps each alias to a model ID and to the annotator we would like to use. Model IDs can be found in the MedCAT model_cards and usually also appear in the model pack name:

config.yaml

models:\n  problems: f25ec9423958e8d6\n  meds/allergies: a146c741501cf1f7\nannotators:\n  problems: ProblemsAnnotator\n  meds/allergies: MedsAllergiesAnnotator\n
We can initialise a MiADE NoteProcessor object by passing in the model directory which contains our MedCAT models and config.yaml file:

miade = NoteProcessor(Path(\"path/to/model/dir\"))\n
Once NoteProcessor is initialised, we can add annotators by the aliases we have specified in config.yaml to our processor:

miade.add_annotator(\"problems\", use_negex=True)\nmiade.add_annotator(\"meds/allergies\")\n

When adding annotators, we have the option to add NegSpacy to the MedCAT spaCy pipeline, which implements the NegEx algorithm (Chapman et al. 2001) for negation detection. This allows the models to perform simple rule-based negation detection in the absence of MetaCAT models.

"},{"location":"user-guide/quickstart/#creating-a-note","title":"Creating a Note","text":"

Create a Note object which contains the text we would like to extract concepts and dosages from:

text = \"\"\"\nSuspected heart failure\n\nPMH:\nprev history of Hypothyroidism\nMI 10 years ago\n\n\nCurrent meds:\nLosartan 100mg daily\nAtorvastatin 20mg daily\nParacetamol 500mg tablets 2 tabs qds prn\n\nAllergies:\nPenicillin - rash\n\nReferred with swollen ankles and shortness of breath since 2 weeks.\n\"\"\"\n\nnote = Note(text)\n
"},{"location":"user-guide/quickstart/#extracting-concepts-and-dosages","title":"Extracting Concepts and Dosages","text":"

MiADE currently extracts concepts in SNOMED CT. As the example output below shows, each concept carries a name, id, category, start and end offsets, dosage, negex and meta.

The dosages associated with medication concepts are extracted by the built-in MiADE DosageExtractor, using a combination of the NER model Med7 and the CALIBER rule-based drug dose lookup algorithm. A dosage can include dose, duration, frequency and route. The output format is directly translatable to HL7 CDA and can also easily be converted to FHIR.

Putting it all together, we can now extract concepts from our Note object:

The same extraction as Concept objects (first block) and as dictionaries (second block):
concepts = miade.process(note)\nfor concept in concepts:\n    print(concept)\n\n# {name: breaking out - eruption, id: 271807003, category: Category.REACTION, start: 204, end: 208, dosage: None, negex: False, meta: None} \n# {name: penicillin, id: 764146007, category: Category.ALLERGY, start: 191, end: 201, dosage: None, negex: False, meta: None} \n
concepts = miade.get_concept_dicts(note)\nprint(concepts)\n\n# [{'name': 'hypothyroidism (historic)',\n# 'id': '161443002',\n# 'category': 'PROBLEM',\n# 'start': 46,\n# 'end': 60,\n# 'dosage': None,\n# 'negex': False,\n# 'meta': [{'name': 'relevance',\n#           'value': 'HISTORIC',\n#           'confidence': 0.999841570854187},\n# ...\n
"},{"location":"user-guide/quickstart/#handling-existing-records-deduplication","title":"Handling existing records: deduplication","text":"

MiADE is built to handle existing records from EHR systems that are sent alongside the note. It performs basic deduplication by matching extracted concepts against existing record concepts on id.

# create list of concepts that already exists in patient record\nrecord_concepts = [\n    Concept(id=\"161443002\", name=\"hypothyroidism (historic)\", category=Category.PROBLEM),\n    Concept(id=\"267039000\", name=\"swollen ankle\", category=Category.PROBLEM)\n]\n

We can pass in a list of existing concepts from the EHR to MiADE at runtime:

miade.process(note=note, record_concepts=record_concepts)\n
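The idea of matching on id can be sketched without MiADE installed. SimpleConcept and drop_known below are hypothetical stand-ins written for illustration; the deduplication inside MiADE may apply further rules.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SimpleConcept:
    # Hypothetical stand-in for the MiADE Concept, reduced to the two fields used here.
    id: str
    name: str


def drop_known(
    extracted: List[SimpleConcept], record: Optional[List[SimpleConcept]]
) -> List[SimpleConcept]:
    # Keep only extracted concepts whose id is not already present in the record.
    known_ids = {c.id for c in record or []}
    return [c for c in extracted if c.id not in known_ids]


record = [SimpleConcept('161443002', 'hypothyroidism (historic)')]
extracted = [
    SimpleConcept('161443002', 'hypothyroidism (historic)'),
    SimpleConcept('764146007', 'penicillin'),
]

print([c.name for c in drop_known(extracted, record)])
```

Passing None for the record leaves the extracted list unchanged, which mirrors record_concepts being optional in process.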
"},{"location":"user-guide/quickstart/#customising-miade","title":"Customising MiADE","text":""},{"location":"user-guide/quickstart/#training-custom-medcat-models","title":"Training Custom MedCAT Models","text":"

MiADE provides command line interface scripts for automatically building MedCAT model packs, unsupervised training, supervised training steps, and the creation and training of MetaCAT models. For more information on MedCAT models, see MedCAT documentation and paper.

The --synthetic-data-path option allows you to add synthetically generated training data in CSV format to the supervised and MetaCAT training steps. The CSV should have the following format:

| text | cui | name | start | end | relevance | presence | laterality |
| --- | --- | --- | --- | --- | --- | --- | --- |
| no history of liver failure | 59927004 | hepatic failure | 14 | 26 | historic | negated | none |
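Before passing a file to --synthetic-data-path, it can help to check that it has the expected columns. The sketch below assumes a comma-delimited file with a header row; check_rows is a hypothetical helper, and the only checks it makes are that the header matches and that start and end are integers lying inside the text.

```python
import csv
import io

EXPECTED_COLUMNS = ['text', 'cui', 'name', 'start', 'end', 'relevance', 'presence', 'laterality']


def check_rows(handle) -> list:
    # Hypothetical helper: validate the header, then each row's character offsets.
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f'unexpected header: {reader.fieldnames}')
    rows = []
    for row in reader:
        text = row['text']
        start, end = int(row['start']), int(row['end'])
        if not 0 <= start < end <= len(text):
            raise ValueError(f'span {start}-{end} does not fit text: {text}')
        rows.append(row)
    return rows


# In-memory stand-in for a CSV file, using the example row from the table above.
sample = io.StringIO('''text,cui,name,start,end,relevance,presence,laterality
no history of liver failure,59927004,hepatic failure,14,26,historic,negated,none
''')

rows = check_rows(sample)
print(len(rows), rows[0]['cui'])
```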

# Trains unsupervised training step of MedCAT model\nmiade train $MODEL_PACK_PATH $TEXT_DATA_PATH --tag \"miade-example\"\n
# Trains supervised training step of MedCAT model\nmiade train-supervised $MODEL_PACK_PATH $MEDCAT_JSON_EXPORT --synthetic-data-path $SYNTHETIC_CSV_PATH\n
# Creates BBPE tokenizer for MetaCAT\nmiade create-bbpe-tokenizer $TEXT_DATA_PATH\n
# Initialises MetaCAT models to do training on\nmiade create-metacats $TOKENIZER_PATH $CATEGORY_NAMES\n
# Trains the MetaCAT Bi-LSTM models\nmiade train-metacats $METACAT_MODEL_PATH $MEDCAT_JSON_EXPORT --synthetic-data-path $SYNTHETIC_CSV_PATH\n
# Packages MetaCAT models with the main MedCAT model pack\nmiade add_metacat_models $MODEL_PACK_PATH $METACAT_MODEL_PATH\n

"},{"location":"user-guide/quickstart/#creating-custom-miade-annotators","title":"Creating Custom MiADE Annotators","text":"

We can add custom annotators with more specialised postprocessing steps to MiADE by subclassing Annotator and initialising NoteProcessor with a list of custom annotators.

Annotator methods include:

An example custom Annotator class might look like this:

class CustomAnnotator(Annotator):\n    def __init__(self, cat: MiADE_CAT):\n        super().__init__(cat)\n        # we need to include MEDICATIONS in concept types so MiADE processor will also extract dosages\n        self.concept_types = [Category.MEDICATION, Category.ALLERGY]\n\n    def postprocess(self, concepts: List[Concept]) -> List[Concept]:\n        # some example post-processing code\n        reactions = [\"271807003\"]\n        allergens = [\"764146007\"]\n        for concept in concepts:\n            if concept.id in reactions:\n                concept.category = Category.REACTION\n            elif concept.id in allergens:\n                concept.category = Category.ALLERGY\n        return concepts\n\n    def __call__(\n        self,\n        note: Note,\n        record_concepts: Optional[List[Concept]] = None,\n        dosage_extractor: Optional[DosageExtractor] = None,\n    ):\n        concepts = self.get_concepts(note)\n        concepts = self.postprocess(concepts)\n        # run dosage extractor if given\n        if dosage_extractor is not None:\n            concepts = self.add_dosages_to_concepts(dosage_extractor, concepts, note)\n        concepts = self.deduplicate(concepts, record_concepts)\n\n        return concepts\n

Add custom annotator to config file:

config.yaml
models:\n  problems: f25ec9423958e8d6\n  meds/allergies: a146c741501cf1f7\n  custom: a146c741501cf1f7\nannotators:\n  problems: ProblemsAnnotator\n  meds/allergies: MedsAllergiesAnnotator\n  custom: CustomAnnotator\n

Initialise MiADE with the custom annotator:

miade = NoteProcessor(Path(MODEL_DIR), custom_annotators=[CustomAnnotator])\nmiade.add_annotator(\"custom\")\n
"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Welcome to the MiADE Documentation","text":"

MiADE (Medical information AI Data Extractor) is a set of tools for extracting formattable data from clinical notes stored in electronic health record systems (EHRs). Built with Cogstack's MedCAT package.

"},{"location":"#installing","title":"Installing","text":"

To install MiADE, you need to download the spaCy base model and Med7 model first:

pip install https://huggingface.co/kormilitzin/en_core_med7_lg/resolve/main/en_core_med7_lg-any-py3-none-any.whl\npython -m spacy download en_core_web_md\n
Then, install MiADE:

pip install miade\n
"},{"location":"#license","title":"License","text":"

MiADE is licensed under Elastic License 2.0.

The Elastic License 2.0 is a flexible license that allows you to use, copy, distribute, make available, and prepare derivative works of the software, as long as you do not provide the software to others as a managed service or include it in a free software directory. For the full license text, see our license page.

"},{"location":"contributing/","title":"Contributing","text":"

Contribute to MiADE! Contribution guide

"},{"location":"about/overview/","title":"Project Overview","text":""},{"location":"about/overview/#background","title":"Background","text":"

Data about people\u2019s health stored in electronic health records (EHRs) can play an important role in improving the quality of patient care. Much of the information in EHRs is recorded in ordinary language without any restriction on format ('free text'), as this is the natural way in which people communicate. However, if this information were stored in a standardised, structured format, computers would also be able to process it to help clinicians find and interpret information for better and safer decision making. This would enable EHR systems such as Epic, the system in place at UCLH since April 2019, to support clinical decision making. For instance, the system may be able to ensure that a patient is not prescribed medicine that would give them an allergic reaction.

"},{"location":"about/overview/#the-challenge","title":"The challenge","text":"

Free text may contain words and abbreviations which may be interpreted in more than one way, such as 'HR', which can mean 'Hour' or 'Heart Rate'. Free text may also contain negations; for example, a diagnosis may be mentioned in the text but the rest of the sentence might say that it was ruled out. Although computers can be used to interpret free text, they cannot always get it right, so clinicians will always have to check the results to ensure patient safety. Expressing information in a structured way can avoid this problem, but has a big disadvantage - it can be time-consuming for clinicians to enter the information. This can mean that information is incomplete, or clinicians are so busy on the computer that they do not have time to listen to their patients.

"},{"location":"about/overview/#meeting-the-need","title":"Meeting the need","text":"

The aim of MiADE is to develop a system to support automatic conversion of the clinician\u2019s free text into a structured format. The clinician can check the structured data immediately, before making it a formal part of the patient\u2019s record. The system will record a patient\u2019s diagnoses, medications and allergies in a structured way, using NHS-endorsed clinical data standards (e.g. FHIR and SNOMED CT). It will use a technique called Natural Language Processing (NLP). NLP has been used by research teams to extract information from existing EHRs but has rarely been used to improve the way information is entered in the first place. Our NLP system will continuously learn and improve as more text is analysed and checked by clinicians.

We will first test the system in University College London Hospitals, where a new EHR system called Epic is in place. We will study how effective it is, and how clinicians and patients find it when it is used in consultations. Based on feedback, we will make improvements and install it for testing at a second site (Great Ormond Street Hospital). Our aim is for the system to be eventually rolled out to more hospitals and doctors\u2019 surgeries across the NHS.

"},{"location":"about/team/","title":"Team","text":"

The MiADE project is developed by a team of clinicians, developers, AI researchers, and data standard experts at University College London (UCL) and University College London Hospitals (UCLH), in collaboration with the Cogstack team at King's College London (KCL).

"},{"location":"api-reference/annotator/","title":"Annotator","text":"

Bases: ABC

An abstract base class for annotators.

Annotators are responsible for processing medical notes and extracting relevant concepts from them.

Attributes:

Name Type Description cat CAT

The MedCAT instance used for concept extraction.

config AnnotatorConfig

The configuration for the annotator.

Source code in src/miade/annotators.py
class Annotator(ABC):\n    \"\"\"\n    An abstract base class for annotators.\n\n    Annotators are responsible for processing medical notes and extracting relevant concepts from them.\n\n    Attributes:\n        cat (CAT): The MedCAT instance used for concept extraction.\n        config (AnnotatorConfig): The configuration for the annotator.\n    \"\"\"\n\n    def __init__(self, cat: CAT, config: AnnotatorConfig = None):\n        self.cat = cat\n        self.config = config if config is not None else AnnotatorConfig()\n\n        if self.config.negation_detection == \"negex\":\n            self._add_negex_pipeline()\n\n        # TODO make paragraph processing params configurable\n        self.structured_prob_lists = {\n            ParagraphType.prob: Relevance.PRESENT,\n            ParagraphType.imp: Relevance.PRESENT,\n            ParagraphType.pmh: Relevance.HISTORIC,\n        }\n        self.structured_med_lists = {\n            ParagraphType.med: SubstanceCategory.TAKING,\n            ParagraphType.allergy: SubstanceCategory.ADVERSE_REACTION,\n        }\n        self.irrelevant_paragraphs = [ParagraphType.ddx, ParagraphType.exam, ParagraphType.plan]\n\n    def _add_negex_pipeline(self) -> None:\n        \"\"\"\n        Adds the negex pipeline to the MedCAT instance.\n        \"\"\"\n        self.cat.pipe.spacy_nlp.add_pipe(\"sentencizer\")\n        self.cat.pipe.spacy_nlp.enable_pipe(\"sentencizer\")\n        self.cat.pipe.spacy_nlp.add_pipe(\"negex\")\n\n    @property\n    @abstractmethod\n    def concept_types(self):\n        \"\"\"\n        Abstract property that should return a list of concept types supported by the annotator.\n        \"\"\"\n        pass\n\n    @property\n    @abstractmethod\n    def pipeline(self):\n        \"\"\"\n        Abstract property that should return a list of pipeline steps for the annotator.\n        \"\"\"\n        pass\n\n    @abstractmethod\n    def process_paragraphs(self):\n        \"\"\"\n        Abstract method that 
should implement the logic for processing paragraphs in a note.\n        \"\"\"\n        pass\n\n    @abstractmethod\n    def postprocess(self):\n        \"\"\"\n        Abstract method that should implement the logic for post-processing extracted concepts.\n        \"\"\"\n        pass\n\n    def run_pipeline(self, note: Note, record_concepts: List[Concept]) -> List[Concept]:\n        \"\"\"\n        Runs the annotation pipeline on a given note and returns the extracted concepts.\n\n        Args:\n            note (Note): The input note to process.\n            record_concepts (List[Concept]): The list of concepts from existing EHR records.\n\n        Returns:\n            The extracted concepts from the note.\n        \"\"\"\n        concepts: List[Concept] = []\n\n        for pipe in self.pipeline:\n            if pipe not in self.config.disable:\n                if pipe == \"preprocessor\":\n                    note = self.preprocess(note)\n                elif pipe == \"medcat\":\n                    concepts = self.get_concepts(note)\n                elif pipe == \"paragrapher\":\n                    concepts = self.process_paragraphs(note, concepts)\n                elif pipe == \"postprocessor\":\n                    concepts = self.postprocess(concepts)\n                elif pipe == \"deduplicator\":\n                    concepts = self.deduplicate(concepts, record_concepts)\n\n        return concepts\n\n    def get_concepts(self, note: Note) -> List[Concept]:\n        \"\"\"\n        Extracts concepts from a note using the MedCAT instance.\n\n        Args:\n            note (Note): The input note to extract concepts from.\n\n        Returns:\n            The extracted concepts from the note.\n        \"\"\"\n        concepts: List[Concept] = []\n        for entity in self.cat.get_entities(note)[\"entities\"].values():\n            try:\n                concepts.append(Concept.from_entity(entity))\n                log.debug(f\"Detected concept 
({concepts[-1].id} | {concepts[-1].name})\")\n            except ValueError as e:\n                log.warning(f\"Concept skipped: {e}\")\n\n        return concepts\n\n    @staticmethod\n    def preprocess(note: Note) -> Note:\n        \"\"\"\n        Preprocesses a note by cleaning its text and splitting it into paragraphs.\n\n        Args:\n            note (Note): The input note to preprocess.\n\n        Returns:\n            The preprocessed note.\n        \"\"\"\n        note.clean_text()\n        note.get_paragraphs()\n\n        return note\n\n    @staticmethod\n    def deduplicate(concepts: List[Concept], record_concepts: Optional[List[Concept]]) -> List[Concept]:\n        \"\"\"\n        Removes duplicate concepts from the extracted concepts list by strict ID matching.\n\n        Args:\n            concepts (List[Concept]): The list of extracted concepts.\n            record_concepts (Optional[List[Concept]]): The list of concepts from existing EHR records.\n\n        Returns:\n            The deduplicated list of concepts.\n        \"\"\"\n        if record_concepts is not None:\n            record_ids = {record_concept.id for record_concept in record_concepts}\n            record_names = {record_concept.name for record_concept in record_concepts}\n        else:\n            record_ids = set()\n            record_names = set()\n\n        # Use an OrderedDict to keep track of ids as it preservers original MedCAT order (the order it appears in text)\n        filtered_concepts: List[Concept] = []\n        existing_concepts = OrderedDict()\n\n        # Filter concepts that are in record or exist in concept list\n        for concept in concepts:\n            if concept.id is not None and (concept.id in record_ids or concept.id in existing_concepts):\n                log.debug(f\"Removed concept ({concept.id} | {concept.name}): concept id exists in record\")\n            # check name match for null ids - VTM deduplication\n            elif concept.id is None and 
(concept.name in record_names or concept.name in existing_concepts.values()):\n                log.debug(f\"Removed concept ({concept.id} | {concept.name}): concept name exists in record\")\n            else:\n                filtered_concepts.append(concept)\n                existing_concepts[concept.id] = concept.name\n\n        return filtered_concepts\n\n    @staticmethod\n    def add_numbering_to_name(concepts: List[Concept]) -> List[Concept]:\n        \"\"\"\n        Adds numbering to the names of problem concepts to control output ordering.\n\n        Args:\n            concepts (List[Concept]): The list of concepts to add numbering to.\n\n        Returns:\n            The list of concepts with numbering added to their names.\n        \"\"\"\n        # Prepend numbering to problem concepts e.g. 00 asthma, 01 stroke...\n        for i, concept in enumerate(concepts):\n            concept.name = f\"{i:02} {concept.name}\"\n\n        return concepts\n\n    def __call__(\n        self,\n        note: Note,\n        record_concepts: Optional[List[Concept]] = None,\n    ) -> List[Concept]:\n        \"\"\"\n        Runs the annotation pipeline on a given note and returns the extracted concepts.\n\n        Args:\n            note (Note): The input note to process.\n            record_concepts (Optional[List[Concept]]): The list of concepts from existing EHR records.\n\n        Returns:\n            The extracted concepts from the note.\n        \"\"\"\n        concepts = self.run_pipeline(note, record_concepts)\n\n        if self.config.add_numbering:\n            concepts = self.add_numbering_to_name(concepts)\n\n        return concepts\n
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.concept_types","title":"concept_types abstractmethod property","text":"

Abstract property that should return a list of concept types supported by the annotator.

"},{"location":"api-reference/annotator/#miade.annotators.Annotator.pipeline","title":"pipeline abstractmethod property","text":"

Abstract property that should return a list of pipeline steps for the annotator.

"},{"location":"api-reference/annotator/#miade.annotators.Annotator.__call__","title":"__call__(note, record_concepts=None)","text":"

Runs the annotation pipeline on a given note and returns the extracted concepts.

Parameters:

Name Type Description Default note Note

The input note to process.

required record_concepts Optional[List[Concept]]

The list of concepts from existing EHR records.

None

Returns:

Type Description List[Concept]

The extracted concepts from the note.

Source code in src/miade/annotators.py
def __call__(\n    self,\n    note: Note,\n    record_concepts: Optional[List[Concept]] = None,\n) -> List[Concept]:\n    \"\"\"\n    Runs the annotation pipeline on a given note and returns the extracted concepts.\n\n    Args:\n        note (Note): The input note to process.\n        record_concepts (Optional[List[Concept]]): The list of concepts from existing EHR records.\n\n    Returns:\n        The extracted concepts from the note.\n    \"\"\"\n    concepts = self.run_pipeline(note, record_concepts)\n\n    if self.config.add_numbering:\n        concepts = self.add_numbering_to_name(concepts)\n\n    return concepts\n
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.add_numbering_to_name","title":"add_numbering_to_name(concepts) staticmethod","text":"

Adds numbering to the names of problem concepts to control output ordering.

Parameters:

Name Type Description Default concepts List[Concept]

The list of concepts to add numbering to.

required

Returns:

Type Description List[Concept]

The list of concepts with numbering added to their names.

Source code in src/miade/annotators.py
@staticmethod\ndef add_numbering_to_name(concepts: List[Concept]) -> List[Concept]:\n    \"\"\"\n    Adds numbering to the names of problem concepts to control output ordering.\n\n    Args:\n        concepts (List[Concept]): The list of concepts to add numbering to.\n\n    Returns:\n        The list of concepts with numbering added to their names.\n    \"\"\"\n    # Prepend numbering to problem concepts e.g. 00 asthma, 01 stroke...\n    for i, concept in enumerate(concepts):\n        concept.name = f\"{i:02} {concept.name}\"\n\n    return concepts\n
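As a rough illustration of the numbering format, the sketch below applies the same zero-padded `f"{i:02}"` prefix to plain strings. `number_names` is a hypothetical helper written for this example and is not part of MiADE.

```python
from typing import List


def number_names(names: List[str]) -> List[str]:
    # Hypothetical stand-in for add_numbering_to_name: it works on plain
    # strings instead of Concept objects, but uses the same index prefix.
    return [f"{i:02} {name}" for i, name in enumerate(names)]


numbered = number_names(["asthma", "stroke"])
print(numbered)
```

The two-digit padding keeps alphabetical and numeric order aligned for up to 100 concepts; from index 100 onwards the prefix grows to three digits.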
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.deduplicate","title":"deduplicate(concepts, record_concepts) staticmethod","text":"

Removes duplicate concepts from the extracted concepts list by strict ID matching.

Parameters:

Name Type Description Default concepts List[Concept]

The list of extracted concepts.

required record_concepts Optional[List[Concept]]

The list of concepts from existing EHR records.

required

Returns:

Type Description List[Concept]

The deduplicated list of concepts.

Source code in src/miade/annotators.py
@staticmethod\ndef deduplicate(concepts: List[Concept], record_concepts: Optional[List[Concept]]) -> List[Concept]:\n    \"\"\"\n    Removes duplicate concepts from the extracted concepts list by strict ID matching.\n\n    Args:\n        concepts (List[Concept]): The list of extracted concepts.\n        record_concepts (Optional[List[Concept]]): The list of concepts from existing EHR records.\n\n    Returns:\n        The deduplicated list of concepts.\n    \"\"\"\n    if record_concepts is not None:\n        record_ids = {record_concept.id for record_concept in record_concepts}\n        record_names = {record_concept.name for record_concept in record_concepts}\n    else:\n        record_ids = set()\n        record_names = set()\n\n    # Use an OrderedDict to keep track of ids as it preservers original MedCAT order (the order it appears in text)\n    filtered_concepts: List[Concept] = []\n    existing_concepts = OrderedDict()\n\n    # Filter concepts that are in record or exist in concept list\n    for concept in concepts:\n        if concept.id is not None and (concept.id in record_ids or concept.id in existing_concepts):\n            log.debug(f\"Removed concept ({concept.id} | {concept.name}): concept id exists in record\")\n        # check name match for null ids - VTM deduplication\n        elif concept.id is None and (concept.name in record_names or concept.name in existing_concepts.values()):\n            log.debug(f\"Removed concept ({concept.id} | {concept.name}): concept name exists in record\")\n        else:\n            filtered_concepts.append(concept)\n            existing_concepts[concept.id] = concept.name\n\n    return filtered_concepts\n
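The matching rules above can be tried out without a MedCAT model. The following is a minimal sketch that mirrors the logic of the source shown; `SimpleConcept` and `dedupe` are hypothetical stand-ins introduced for illustration, not MiADE classes or functions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SimpleConcept:
    # Hypothetical stand-in for miade's Concept: only id and name are kept.
    id: Optional[str]
    name: str


def dedupe(
    concepts: List[SimpleConcept],
    record_concepts: Optional[List[SimpleConcept]] = None,
) -> List[SimpleConcept]:
    record = record_concepts or []
    record_ids = {c.id for c in record}
    record_names = {c.name for c in record}

    seen: Dict[Optional[str], str] = {}  # id -> name, in order of appearance
    kept: List[SimpleConcept] = []
    for concept in concepts:
        # Concepts with an id are matched strictly on that id.
        if concept.id is not None and (concept.id in record_ids or concept.id in seen):
            continue
        # Concepts without an id fall back to matching on name.
        if concept.id is None and (concept.name in record_names or concept.name in seen.values()):
            continue
        kept.append(concept)
        seen[concept.id] = concept.name
    return kept


extracted = [
    SimpleConcept("1", "asthma"),
    SimpleConcept("2", "stroke"),       # already in the record
    SimpleConcept("1", "asthma"),       # repeated id within the note
    SimpleConcept(None, "paracetamol"),
    SimpleConcept(None, "paracetamol"), # repeated name with a null id
]
record = [SimpleConcept("2", "stroke")]

result = dedupe(extracted, record)
print([c.name for c in result])
```

Note that, as in the source shown, the tracking dictionary is keyed by id, so all null-id concepts share a single key and only the most recent null-id name is remembered.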
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.get_concepts","title":"get_concepts(note)","text":"

Extracts concepts from a note using the MedCAT instance.

Parameters:

Name Type Description Default note Note

The input note to extract concepts from.

required

Returns:

Type Description List[Concept]

The extracted concepts from the note.

Source code in src/miade/annotators.py
def get_concepts(self, note: Note) -> List[Concept]:\n    \"\"\"\n    Extracts concepts from a note using the MedCAT instance.\n\n    Args:\n        note (Note): The input note to extract concepts from.\n\n    Returns:\n        The extracted concepts from the note.\n    \"\"\"\n    concepts: List[Concept] = []\n    for entity in self.cat.get_entities(note)[\"entities\"].values():\n        try:\n            concepts.append(Concept.from_entity(entity))\n            log.debug(f\"Detected concept ({concepts[-1].id} | {concepts[-1].name})\")\n        except ValueError as e:\n            log.warning(f\"Concept skipped: {e}\")\n\n    return concepts\n
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.postprocess","title":"postprocess() abstractmethod","text":"

Abstract method that should implement the logic for post-processing extracted concepts.

Source code in src/miade/annotators.py
@abstractmethod\ndef postprocess(self):\n    \"\"\"\n    Abstract method that should implement the logic for post-processing extracted concepts.\n    \"\"\"\n    pass\n
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.preprocess","title":"preprocess(note) staticmethod","text":"

Preprocesses a note by cleaning its text and splitting it into paragraphs.

Parameters:

Name Type Description Default note Note

The input note to preprocess.

required

Returns:

Type Description Note

The preprocessed note.

Source code in src/miade/annotators.py
@staticmethod\ndef preprocess(note: Note) -> Note:\n    \"\"\"\n    Preprocesses a note by cleaning its text and splitting it into paragraphs.\n\n    Args:\n        note (Note): The input note to preprocess.\n\n    Returns:\n        The preprocessed note.\n    \"\"\"\n    note.clean_text()\n    note.get_paragraphs()\n\n    return note\n
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.process_paragraphs","title":"process_paragraphs() abstractmethod","text":"

Abstract method that should implement the logic for processing paragraphs in a note.

Source code in src/miade/annotators.py
@abstractmethod\ndef process_paragraphs(self):\n    \"\"\"\n    Abstract method that should implement the logic for processing paragraphs in a note.\n    \"\"\"\n    pass\n
"},{"location":"api-reference/annotator/#miade.annotators.Annotator.run_pipeline","title":"run_pipeline(note, record_concepts)","text":"

Runs the annotation pipeline on a given note and returns the extracted concepts.

Parameters:

Name Type Description Default note Note

The input note to process.

required record_concepts List[Concept]

The list of concepts from existing EHR records.

required

Returns:

Type Description List[Concept]

The extracted concepts from the note.

Source code in src/miade/annotators.py
def run_pipeline(self, note: Note, record_concepts: List[Concept]) -> List[Concept]:\n    \"\"\"\n    Runs the annotation pipeline on a given note and returns the extracted concepts.\n\n    Args:\n        note (Note): The input note to process.\n        record_concepts (List[Concept]): The list of concepts from existing EHR records.\n\n    Returns:\n        The extracted concepts from the note.\n    \"\"\"\n    concepts: List[Concept] = []\n\n    for pipe in self.pipeline:\n        if pipe not in self.config.disable:\n            if pipe == \"preprocessor\":\n                note = self.preprocess(note)\n            elif pipe == \"medcat\":\n                concepts = self.get_concepts(note)\n            elif pipe == \"paragrapher\":\n                concepts = self.process_paragraphs(note, concepts)\n            elif pipe == \"postprocessor\":\n                concepts = self.postprocess(concepts)\n            elif pipe == \"deduplicator\":\n                concepts = self.deduplicate(concepts, record_concepts)\n\n    return concepts\n
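The dispatch pattern used by run_pipeline can be sketched in isolation. This is a simplified illustration, not MiADE code: `run_steps` and the lambda steps are hypothetical, and every step here transforms a list of strings, whereas the real preprocessor step operates on the note.

```python
from typing import Callable, Dict, List

Step = Callable[[List[str]], List[str]]


def run_steps(pipeline: List[str], disable: List[str], steps: Dict[str, Step]) -> List[str]:
    items: List[str] = []
    for pipe in pipeline:
        if pipe in disable:
            continue  # disabled steps are skipped entirely
        step = steps.get(pipe)
        if step is None:
            continue  # unrecognised names fall through, as in the if/elif chain
        items = step(items)
    return items


# Hypothetical steps standing in for concept extraction and clean-up.
steps: Dict[str, Step] = {
    "medcat": lambda _: ["Asthma", "asthma", "Stroke"],
    "postprocessor": lambda xs: [x.lower() for x in xs],
    "deduplicator": lambda xs: list(dict.fromkeys(xs)),
}
pipeline = ["preprocessor", "medcat", "postprocessor", "deduplicator"]

full = run_steps(pipeline, [], steps)
without_dedup = run_steps(pipeline, ["deduplicator"], steps)
print(full, without_dedup)
```

The order of execution comes from the `pipeline` list, so reordering that list changes the order in which steps run, while `disable` only removes steps.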
"},{"location":"api-reference/concept/","title":"Concept","text":"

Bases: object

Represents a concept in the system.

Attributes:

Name Type Description id str

The unique identifier of the concept.

name str

The name of the concept.

category Optional[Enum]

The category of the concept (optional).

start Optional[int]

The start position of the concept (optional).

end Optional[int]

The end position of the concept (optional).

dosage Optional[Dosage]

The dosage of the concept (optional).

linked_concepts Optional[List[Concept]]

The linked concepts of the concept (optional).

negex Optional[bool]

The negex value of the concept (optional).

meta_anns Optional[List[MetaAnnotations]]

The meta annotations of the concept (optional).

debug_dict Optional[Dict]

The debug dictionary of the concept (optional).

Source code in src/miade/concept.py
class Concept(object):\n    \"\"\"Represents a concept in the system.\n\n    Attributes:\n        id (str): The unique identifier of the concept.\n        name (str): The name of the concept.\n        category (Optional[Enum]): The category of the concept (optional).\n        start (Optional[int]): The start position of the concept (optional).\n        end (Optional[int]): The end position of the concept (optional).\n        dosage (Optional[Dosage]): The dosage of the concept (optional).\n        linked_concepts (Optional[List[Concept]]): The linked concepts of the concept (optional).\n        negex (Optional[bool]): The negex value of the concept (optional).\n        meta_anns (Optional[List[MetaAnnotations]]): The meta annotations of the concept (optional).\n        debug_dict (Optional[Dict]): The debug dictionary of the concept (optional).\n    \"\"\"\n\n    def __init__(\n        self,\n        id: str,\n        name: str,\n        category: Optional[Enum] = None,\n        start: Optional[int] = None,\n        end: Optional[int] = None,\n        dosage: Optional[Dosage] = None,\n        linked_concepts: Optional[List[Concept]] = None,\n        negex: Optional[bool] = None,\n        meta_anns: Optional[List[MetaAnnotations]] = None,\n        debug_dict: Optional[Dict] = None,\n    ):\n        self.name = name\n        self.id = id\n        self.category = category\n        self.start = start\n        self.end = end\n        self.dosage = dosage\n        self.linked_concepts = linked_concepts\n        self.negex = negex\n        self.meta = meta_anns\n        self.debug = debug_dict\n\n        if linked_concepts is None:\n            self.linked_concepts = []\n\n    @classmethod\n    def from_entity(cls, entity: Dict) -> Concept:\n        \"\"\"\n        Converts an entity dictionary into a Concept object.\n\n        Args:\n            entity (Dict): The entity dictionary containing the necessary information.\n\n        Returns:\n            The Concept object 
created from the entity dictionary.\n        \"\"\"\n        meta_anns = None\n        if entity[\"meta_anns\"]:\n            meta_anns = [MetaAnnotations(**value) for value in entity[\"meta_anns\"].values()]\n\n        return Concept(\n            id=entity[\"cui\"],\n            name=entity[\n                \"source_value\"\n            ],  # can also use detected_name which is spell checked but delimited by ~ e.g. liver~failure\n            category=None,\n            start=entity[\"start\"],\n            end=entity[\"end\"],\n            negex=entity[\"negex\"] if \"negex\" in entity else None,\n            meta_anns=meta_anns,\n        )\n\n    def __str__(self):\n        return (\n            f\"{{name: {self.name}, id: {self.id}, category: {self.category}, start: {self.start}, end: {self.end},\"\n            f\" dosage: {self.dosage}, linked_concepts: {self.linked_concepts}, negex: {self.negex}, meta: {self.meta}}} \"\n        )\n\n    def __hash__(self):\n        return hash((self.id, self.name, self.category))\n\n    def __eq__(self, other):\n        return self.id == other.id and self.name == other.name and self.category == other.category\n\n    def __lt__(self, other):\n        return int(self.id) < int(other.id)\n\n    def __gt__(self, other):\n        return int(self.id) > int(other.id)\n
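The comparison methods have a few consequences that are easy to miss. The sketch below copies only `__hash__`, `__eq__` and `__lt__` from the source shown into a hypothetical `MiniConcept` class, so it runs without MiADE installed; the ids are made-up values.

```python
from typing import Optional


class MiniConcept:
    # Hypothetical, cut-down copy of the comparison logic shown above.
    def __init__(self, id: str, name: str, category: Optional[str] = None,
                 start: Optional[int] = None, end: Optional[int] = None):
        self.id = id
        self.name = name
        self.category = category
        self.start = start
        self.end = end

    def __hash__(self):
        return hash((self.id, self.name, self.category))

    def __eq__(self, other):
        return self.id == other.id and self.name == other.name and self.category == other.category

    def __lt__(self, other):
        return int(self.id) < int(other.id)


first_mention = MiniConcept("20", "asthma", start=0, end=6)
second_mention = MiniConcept("20", "asthma", start=40, end=46)
other = MiniConcept("3", "stroke")

same = first_mention == second_mention           # positions are not compared
unique_count = len({first_mention, second_mention, other})
ordered_ids = [c.id for c in sorted([first_mention, other])]
print(same, unique_count, ordered_ids)
```

Two mentions of the same concept at different positions compare equal and collapse in a set, and sorting is numeric on the id. Because `__lt__` calls `int()` on the id, ordering only works for concepts whose id is a numeric string.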
"},{"location":"api-reference/concept/#miade.concept.Concept.from_entity","title":"from_entity(entity) classmethod","text":"

Converts an entity dictionary into a Concept object.

Parameters:

Name Type Description Default entity Dict

The entity dictionary containing the necessary information.

required

Returns:

Type Description Concept

The Concept object created from the entity dictionary.

Source code in src/miade/concept.py
@classmethod\ndef from_entity(cls, entity: Dict) -> Concept:\n    \"\"\"\n    Converts an entity dictionary into a Concept object.\n\n    Args:\n        entity (Dict): The entity dictionary containing the necessary information.\n\n    Returns:\n        The Concept object created from the entity dictionary.\n    \"\"\"\n    meta_anns = None\n    if entity[\"meta_anns\"]:\n        meta_anns = [MetaAnnotations(**value) for value in entity[\"meta_anns\"].values()]\n\n    return Concept(\n        id=entity[\"cui\"],\n        name=entity[\n            \"source_value\"\n        ],  # can also use detected_name which is spell checked but delimited by ~ e.g. liver~failure\n        category=None,\n        start=entity[\"start\"],\n        end=entity[\"end\"],\n        negex=entity[\"negex\"] if \"negex\" in entity else None,\n        meta_anns=meta_anns,\n    )\n
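To make the key mapping concrete, here is a minimal sketch that performs the same lookups on a plain dictionary. `entity_to_fields` is a hypothetical helper, and the entity values are illustrative only rather than real MedCAT output; meta annotations are left out for brevity.

```python
from typing import Any, Dict


def entity_to_fields(entity: Dict[str, Any]) -> Dict[str, Any]:
    # Mirrors the key mapping in from_entity, returning a plain dict
    # instead of a Concept.
    return {
        "id": entity["cui"],
        "name": entity["source_value"],
        "start": entity["start"],
        "end": entity["end"],
        # "negex" is only present when the negex pipeline has been added.
        "negex": entity.get("negex"),
    }


# Illustrative entity dictionary (values are made up for this example).
entity = {"cui": "12345", "source_value": "asthma", "start": 12, "end": 18, "meta_anns": {}}

fields = entity_to_fields(entity)
print(fields)
```

The name comes from `source_value`, the text as it appears in the note, and `negex` falls back to `None` when the key is absent.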
"},{"location":"api-reference/dosage/","title":"Dosage","text":"

Bases: object

Container for drug dosage information

Source code in src/miade/dosage.py
class Dosage(object):\n    \"\"\"\n    Container for drug dosage information\n    \"\"\"\n\n    def __init__(\n        self,\n        dose: Optional[Dose],\n        duration: Optional[Duration],\n        frequency: Optional[Frequency],\n        route: Optional[Route],\n        text: Optional[str] = None,\n    ):\n        self.text = text\n        self.dose = dose\n        self.duration = duration\n        self.frequency = frequency\n        self.route = route\n\n    @classmethod\n    def from_doc(cls, doc: Doc, calculate: bool = True):\n        \"\"\"\n        Parses dosage from a spacy doc object.\n\n        Args:\n            doc (Doc): Spacy doc object with processed dosage text.\n            calculate (bool, optional): Whether to calculate duration if total and daily dose is given. Defaults to True.\n\n        Returns:\n            An instance of the class with the parsed dosage information.\n\n        \"\"\"\n        quantities = []\n        units = []\n        dose_start = 1000\n        dose_end = 0\n        daily_dose = None\n        total_dose = None\n        route_text = None\n        duration_text = None\n\n        for ent in doc.ents:\n            if ent.label_ == \"DOSAGE\":\n                if ent._.total_dose:\n                    total_dose = float(ent.text)\n                else:\n                    quantities.append(ent.text)\n                    # get span of full dosage string - not strictly needed but nice to have\n                    if ent.start < dose_start:\n                        dose_start = ent.start\n                    if ent.end > dose_end:\n                        dose_end = ent.end\n            elif ent.label_ == \"FORM\":\n                if ent._.total_dose:\n                    # de facto unit is in total dose\n                    units = [ent.text]\n                else:\n                    units.append(ent.text)\n                    if ent.start < dose_start:\n                        dose_start = ent.start\n                   
 if ent.end > dose_end:\n                        dose_end = ent.end\n            elif ent.label_ == \"DURATION\":\n                duration_text = ent.text\n            elif ent.label_ == \"ROUTE\":\n                route_text = ent.text\n\n        dose = parse_dose(\n            text=\" \".join(doc.text.split()[dose_start:dose_end]),\n            quantities=quantities,\n            units=units,\n            results=doc._.results,\n        )\n\n        frequency = parse_frequency(text=doc.text, results=doc._.results)\n\n        route = parse_route(text=route_text, dose=dose)\n\n        # technically not information recorded so will keep as an option\n        if calculate:\n            # if duration not given in text could extract this from total dose if given\n            if total_dose is not None and dose is not None and doc._.results[\"freq\"]:\n                if dose.value is not None:\n                    daily_dose = float(dose.value) * (round(doc._.results[\"freq\"] / doc._.results[\"time\"]))\n                elif dose.high is not None:\n                    daily_dose = float(dose.high) * (round(doc._.results[\"freq\"] / doc._.results[\"time\"]))\n\n        duration = parse_duration(\n            text=duration_text,\n            results=doc._.results,\n            total_dose=total_dose,\n            daily_dose=daily_dose,\n        )\n\n        return cls(\n            text=doc._.original_text,\n            dose=dose,\n            duration=duration,\n            frequency=frequency,\n            route=route,\n        )\n\n    def __str__(self):\n        return f\"{self.__dict__}\"\n\n    def __eq__(self, other):\n        return self.__dict__ == other.__dict__\n
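The daily dose arithmetic in the `calculate` branch can be checked by hand. The sketch below reproduces only that expression; `estimate_daily_dose` is a hypothetical helper, and it assumes that `results["freq"]` counts administrations and `results["time"]` is the number of days they span, which the source does not state explicitly.

```python
def estimate_daily_dose(dose_value: float, freq: float, time: float) -> float:
    # Same arithmetic as the calculate branch: dose * round(freq / time).
    return float(dose_value) * round(freq / time)


# 2 units, 3 times over 1 unit of time
three_times_daily = estimate_daily_dose(2, 3, 1)

# 1 unit, once over 2 units of time
every_other_day = estimate_daily_dose(1, 1, 2)

print(three_times_daily, every_other_day)
```

Python's built-in `round()` rounds halves to the nearest even integer, so a ratio of 0.5 rounds to 0. In this sketch, a dose taken less often than once per unit of time can therefore produce a daily dose of zero.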
"},{"location":"api-reference/dosage/#miade.dosage.Dosage.from_doc","title":"from_doc(doc, calculate=True) classmethod","text":"

Parses a dosage from a spaCy Doc object.

Parameters:

doc (Doc, required): spaCy Doc object with processed dosage text.

calculate (bool, default True): Whether to calculate the duration if the total and daily dose are given.

Returns:

An instance of the class with the parsed dosage information.

Source code in src/miade/dosage.py
@classmethod\ndef from_doc(cls, doc: Doc, calculate: bool = True):\n    \"\"\"\n    Parses dosage from a spacy doc object.\n\n    Args:\n        doc (Doc): Spacy doc object with processed dosage text.\n        calculate (bool, optional): Whether to calculate duration if total and daily dose is given. Defaults to True.\n\n    Returns:\n        An instance of the class with the parsed dosage information.\n\n    \"\"\"\n    quantities = []\n    units = []\n    dose_start = 1000\n    dose_end = 0\n    daily_dose = None\n    total_dose = None\n    route_text = None\n    duration_text = None\n\n    for ent in doc.ents:\n        if ent.label_ == \"DOSAGE\":\n            if ent._.total_dose:\n                total_dose = float(ent.text)\n            else:\n                quantities.append(ent.text)\n                # get span of full dosage string - not strictly needed but nice to have\n                if ent.start < dose_start:\n                    dose_start = ent.start\n                if ent.end > dose_end:\n                    dose_end = ent.end\n        elif ent.label_ == \"FORM\":\n            if ent._.total_dose:\n                # de facto unit is in total dose\n                units = [ent.text]\n            else:\n                units.append(ent.text)\n                if ent.start < dose_start:\n                    dose_start = ent.start\n                if ent.end > dose_end:\n                    dose_end = ent.end\n        elif ent.label_ == \"DURATION\":\n            duration_text = ent.text\n        elif ent.label_ == \"ROUTE\":\n            route_text = ent.text\n\n    dose = parse_dose(\n        text=\" \".join(doc.text.split()[dose_start:dose_end]),\n        quantities=quantities,\n        units=units,\n        results=doc._.results,\n    )\n\n    frequency = parse_frequency(text=doc.text, results=doc._.results)\n\n    route = parse_route(text=route_text, dose=dose)\n\n    # technically not information recorded so will keep as an option\n    if 
calculate:\n        # if duration not given in text could extract this from total dose if given\n        if total_dose is not None and dose is not None and doc._.results[\"freq\"]:\n            if dose.value is not None:\n                daily_dose = float(dose.value) * (round(doc._.results[\"freq\"] / doc._.results[\"time\"]))\n            elif dose.high is not None:\n                daily_dose = float(dose.high) * (round(doc._.results[\"freq\"] / doc._.results[\"time\"]))\n\n    duration = parse_duration(\n        text=duration_text,\n        results=doc._.results,\n        total_dose=total_dose,\n        daily_dose=daily_dose,\n    )\n\n    return cls(\n        text=doc._.original_text,\n        dose=dose,\n        duration=duration,\n        frequency=frequency,\n        route=route,\n    )\n
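As a rough illustration of the calculate branch above, the sketch below reproduces the daily dose expression from from_doc in isolation. The helper names daily_dose_from and duration_from are hypothetical, and dividing the total dose by the daily dose is only an assumption about what parse_duration does with these values; its implementation is not shown here.

```python
from typing import Optional


def daily_dose_from(dose_value: float, freq: float, time: float) -> float:
    # Same expression as in from_doc: dose multiplied by the rounded freq/time ratio.
    return float(dose_value) * round(freq / time)


def duration_from(total_dose: Optional[float], daily_dose: Optional[float]) -> Optional[float]:
    # Assumption: duration = total dose / daily dose (parse_duration may differ).
    if total_dose is None or not daily_dose:
        return None
    return total_dose / daily_dose


# Hypothetical values: a dose of 2 units taken 3 times per time unit, 42 units in total.
daily = daily_dose_from(2, freq=3, time=1)
duration = duration_from(42, daily)
print(daily, duration)
```

Note that Python's built-in round uses round-half-to-even, so a freq/time ratio of exactly 0.5 rounds to 0 and would yield a daily dose of 0 in this expression.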
"},{"location":"api-reference/dosageextractor/","title":"DosageExtractor","text":"

Parses and extracts drug dosage information.

Attributes:

model (str): The name of the model to be used for dosage extraction.

dosage_extractor (Language): The spaCy pipeline for dosage extraction.

Source code in src/miade/dosageextractor.py
class DosageExtractor:\n    \"\"\"\n    Parses and extracts drug dosage\n\n    Attributes:\n        model (str): The name of the model to be used for dosage extraction.\n        dosage_extractor (Language): The Spacy pipeline for dosage extraction.\n    \"\"\"\n\n    def __init__(self, model: str = \"en_core_med7_lg\"):\n        self.model = model\n        self.dosage_extractor = self._create_drugdoseade_pipeline()\n\n    def _create_drugdoseade_pipeline(self) -> Language:\n        \"\"\"\n        Creates a spacy pipeline with given model (default med7)\n        and customised pipeline components for dosage extraction\n\n        Returns:\n            nlp (spacy.Language): The Spacy pipeline for dosage extraction.\n        \"\"\"\n        nlp = spacy.load(self.model)\n        nlp.add_pipe(\"preprocessor\", first=True)\n        nlp.add_pipe(\"pattern_matcher\", before=\"ner\")\n        nlp.add_pipe(\"entities_refiner\", after=\"ner\")\n\n        log.info(f\"Loaded drug dosage extractor with model {self.model}\")\n\n        return nlp\n\n    def extract(self, text: str, calculate: bool = True) -> Optional[Dosage]:\n        \"\"\"\n        Processes a string that contains dosage instructions (excluding drug concept as this is handled by core)\n\n        Args:\n            text (str): The string containing dosage instructions.\n            calculate (bool): Whether to calculate duration from total and daily dose, if given.\n\n        Returns:\n            The dosage object with parsed dosages in CDA format.\n        \"\"\"\n        doc = self.dosage_extractor(text)\n\n        log.debug(f\"NER results: {[(e.text, e.label_, e._.total_dose) for e in doc.ents]}\")\n        log.debug(f\"Lookup results: {doc._.results}\")\n\n        dosage = Dosage.from_doc(doc=doc, calculate=calculate)\n\n        if all(v is None for v in [dosage.dose, dosage.frequency, dosage.route, dosage.duration]):\n            return None\n\n        return dosage\n\n    def __call__(self, text: str, 
calculate: bool = True):\n        return self.extract(text, calculate)\n
"},{"location":"api-reference/dosageextractor/#miade.dosageextractor.DosageExtractor.extract","title":"extract(text, calculate=True)","text":"

Processes a string containing dosage instructions (excluding the drug concept, which is handled by core).

Parameters:

text (str, required): The string containing dosage instructions.

calculate (bool, default True): Whether to calculate duration from total and daily dose, if given.

Returns:

Optional[Dosage]: The dosage object with parsed dosages in CDA format, or None if no dose, frequency, route, or duration could be parsed.

Source code in src/miade/dosageextractor.py
def extract(self, text: str, calculate: bool = True) -> Optional[Dosage]:\n    \"\"\"\n    Processes a string that contains dosage instructions (excluding drug concept as this is handled by core)\n\n    Args:\n        text (str): The string containing dosage instructions.\n        calculate (bool): Whether to calculate duration from total and daily dose, if given.\n\n    Returns:\n        The dosage object with parsed dosages in CDA format, or None if nothing could be parsed.\n    \"\"\"\n    doc = self.dosage_extractor(text)\n\n    log.debug(f\"NER results: {[(e.text, e.label_, e._.total_dose) for e in doc.ents]}\")\n    log.debug(f\"Lookup results: {doc._.results}\")\n\n    dosage = Dosage.from_doc(doc=doc, calculate=calculate)\n\n    # nothing was parsed from the text, so there is no dosage to return\n    if all(v is None for v in [dosage.dose, dosage.frequency, dosage.route, dosage.duration]):\n        return None\n\n    return dosage\n
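Because extract returns None when nothing was parsed, callers need to handle that case. The sketch below isolates the final check using a hypothetical stand-in class, ParsedDosage, with simplified string fields; it does not call miade and is not the library's Dosage class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedDosage:
    # Hypothetical stand-in for the Dosage object; fields are simplified to strings.
    dose: Optional[str] = None
    frequency: Optional[str] = None
    route: Optional[str] = None
    duration: Optional[str] = None


def none_if_empty(dosage: ParsedDosage) -> Optional[ParsedDosage]:
    # Same check as at the end of extract: discard results with nothing parsed.
    fields = [dosage.dose, dosage.frequency, dosage.route, dosage.duration]
    if all(v is None for v in fields):
        return None
    return dosage


empty = none_if_empty(ParsedDosage())
partial = none_if_empty(ParsedDosage(route='oral'))
```

A single parsed field is enough for the object to be returned, so the remaining fields of a returned dosage may still be None.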
"},{"location":"api-reference/medsallergiesannotator/","title":"MedsAllergiesAnnotator","text":"

Bases: Annotator

Annotator class for medication and allergy concepts.

This class extends the Annotator base class and provides methods for running a pipeline of annotation tasks on a given note, as well as validating and converting concepts related to medications and allergies.

Attributes:

valid_meds (List[int]): A list of valid medication IDs.

reactions_subset_lookup (Dict[int, str]): A dictionary mapping reaction IDs to their corresponding subset IDs.

allergens_subset_lookup (Dict[int, str]): A dictionary mapping allergen IDs to their corresponding subset IDs.

allergy_type_lookup (Dict[str, List[str]]): A dictionary mapping allergen types to their corresponding codes.

vtm_to_vmp_lookup (Dict[str, str]): A dictionary mapping VTM (Virtual Therapeutic Moiety) IDs to VMP (Virtual Medicinal Product) IDs.

vtm_to_text_lookup (Dict[str, str]): A dictionary mapping VTM IDs to their corresponding text.

Source code in src/miade/annotators.py
class MedsAllergiesAnnotator(Annotator):\n    \"\"\"\n    Annotator class for medication and allergy concepts.\n\n    This class extends the `Annotator` base class and provides methods for running a pipeline of\n    annotation tasks on a given note, as well as validating and converting concepts related to\n    medications and allergies.\n\n    Attributes:\n        valid_meds (List[int]): A list of valid medication IDs.\n        reactions_subset_lookup (Dict[int, str]): A dictionary mapping reaction IDs to their corresponding subset IDs.\n        allergens_subset_lookup (Dict[int, str]): A dictionary mapping allergen IDs to their corresponding subset IDs.\n        allergy_type_lookup (Dict[str, List[str]]): A dictionary mapping allergen types to their corresponding codes.\n        vtm_to_vmp_lookup (Dict[str, str]): A dictionary mapping VTM (Virtual Therapeutic Moiety) IDs to VMP (Virtual Medicinal Product) IDs.\n        vtm_to_text_lookup (Dict[str, str]): A dictionary mapping VTM IDs to their corresponding text.\n    \"\"\"\n\n    def __init__(self, cat: CAT, config: AnnotatorConfig = None):\n        super().__init__(cat, config)\n        self._load_med_allergy_lookup_data()\n\n    @property\n    def concept_types(self) -> List[Category]:\n        \"\"\"\n        Returns a list of concept types.\n\n        Returns:\n            [Category.MEDICATION, Category.ALLERGY, Category.REACTION]\n        \"\"\"\n        return [Category.MEDICATION, Category.ALLERGY, Category.REACTION]\n\n    @property\n    def pipeline(self) -> List[str]:\n        \"\"\"\n        Returns a list of annotators in the pipeline.\n\n        The annotators are executed in the order they appear in the list.\n\n        Returns:\n            [\"preprocessor\", \"medcat\", \"paragrapher\", \"postprocessor\", \"dosage_extractor\", \"vtm_converter\", \"deduplicator\"]\n        \"\"\"\n        return [\n            \"preprocessor\",\n            \"medcat\",\n            \"paragrapher\",\n            
\"postprocessor\",\n            \"dosage_extractor\",\n            \"vtm_converter\",\n            \"deduplicator\",\n        ]\n\n    def run_pipeline(\n        self, note: Note, record_concepts: List[Concept], dosage_extractor: Optional[DosageExtractor]\n    ) -> List[Concept]:\n        \"\"\"\n        Runs the annotation pipeline on the given note.\n\n        Args:\n            note (Note): The input note to run the pipeline on.\n            record_concepts (List[Concept]): The list of previously recorded concepts.\n            dosage_extractor (Optional[DosageExtractor]): The dosage extractor function.\n\n        Returns:\n            The list of annotated concepts.\n        \"\"\"\n        concepts: List[Concept] = []\n\n        for pipe in self.pipeline:\n            if pipe not in self.config.disable:\n                if pipe == \"preprocessor\":\n                    note = self.preprocess(note)\n                elif pipe == \"medcat\":\n                    concepts = self.get_concepts(note)\n                elif pipe == \"paragrapher\":\n                    concepts = self.process_paragraphs(note, concepts)\n                elif pipe == \"postprocessor\":\n                    concepts = self.postprocess(concepts, note)\n                elif pipe == \"deduplicator\":\n                    concepts = self.deduplicate(concepts, record_concepts)\n                elif pipe == \"vtm_converter\":\n                    concepts = self.convert_VTM_to_VMP_or_text(concepts)\n                elif pipe == \"dosage_extractor\" and dosage_extractor is not None:\n                    concepts = self.add_dosages_to_concepts(dosage_extractor, concepts, note)\n\n        return concepts\n\n    def _load_med_allergy_lookup_data(self) -> None:\n        \"\"\"\n        Loads the medication and allergy lookup data.\n        \"\"\"\n        if not os.path.isdir(self.config.lookup_data_path):\n            raise RuntimeError(f\"No lookup data configured: {self.config.lookup_data_path} 
does not exist!\")\n        else:\n            self.valid_meds = load_lookup_data(self.config.lookup_data_path + \"valid_meds.csv\", no_header=True)\n            self.reactions_subset_lookup = load_lookup_data(\n                self.config.lookup_data_path + \"reactions_subset.csv\", as_dict=True\n            )\n            self.allergens_subset_lookup = load_lookup_data(\n                self.config.lookup_data_path + \"allergens_subset.csv\", as_dict=True\n            )\n            self.allergy_type_lookup = load_allergy_type_combinations(self.config.lookup_data_path + \"allergy_type.csv\")\n            self.vtm_to_vmp_lookup = load_lookup_data(self.config.lookup_data_path + \"vtm_to_vmp.csv\")\n            self.vtm_to_text_lookup = load_lookup_data(self.config.lookup_data_path + \"vtm_to_text.csv\", as_dict=True)\n\n    def _validate_meds(self, concept) -> bool:\n        \"\"\"\n        Validates if the concept is a valid medication.\n\n        Args:\n            concept: The concept to validate.\n\n        Returns:\n            True if the concept is a valid medication, False otherwise.\n        \"\"\"\n        # check if substance is valid med\n        if int(concept.id) in self.valid_meds.values:\n            return True\n        return False\n\n    def _validate_and_convert_substance(self, concept) -> bool:\n        \"\"\"\n        Validates and converts a substance concept for allergy.\n\n        Args:\n            concept: The substance concept to be validated and converted.\n\n        Returns:\n            True if the substance is valid and converted successfully, False otherwise.\n        \"\"\"\n        # check if substance is valid substance for allergy - if it is, convert it to Epic subset and return that concept\n        lookup_result = self.allergens_subset_lookup.get(int(concept.id))\n        if lookup_result is not None:\n            log.debug(\n                f\"Converted concept ({concept.id} | {concept.name}) to \"\n                
f\"({lookup_result['subsetId']} | {concept.name}): valid Epic allergen subset\"\n            )\n            concept.id = str(lookup_result[\"subsetId\"])\n\n            # then check the allergen type from lookup result - e.g. drug, food\n            try:\n                concept.category = AllergenType(str(lookup_result[\"allergenType\"]).lower())\n                log.debug(\n                    f\"Assigned substance concept ({concept.id} | {concept.name}) \"\n                    f\"to allergen type category {concept.category}\"\n                )\n            except ValueError as e:\n                log.warning(f\"Allergen type not found for {concept.__str__()}: {e}\")\n\n            return True\n        else:\n            log.warning(f\"No lookup subset found for substance ({concept.id} | {concept.name})\")\n            return False\n\n    def _validate_and_convert_reaction(self, concept) -> bool:\n        \"\"\"\n        Validates and converts a reaction concept to the Epic subset.\n\n        Args:\n            concept: The concept to be validated and converted.\n\n        Returns:\n            True if the concept is a valid reaction and successfully converted to the Epic subset,\n                  False otherwise.\n        \"\"\"\n        # check if substance is valid reaction - if it is, convert it to Epic subset and return that concept\n        lookup_result = self.reactions_subset_lookup.get(int(concept.id), None)\n        if lookup_result is not None:\n            log.debug(\n                f\"Converted concept ({concept.id} | {concept.name}) to \"\n                f\"({lookup_result} | {concept.name}): valid Epic reaction subset\"\n            )\n            concept.id = str(lookup_result)\n            return True\n        else:\n            log.warning(f\"Reaction not found in Epic subset conversion for concept {concept.__str__()}\")\n            return False\n\n    def _validate_and_convert_concepts(self, concept: Concept) -> Concept:\n        \"\"\"\n  
      Validates and converts the given concept based on its metadata annotations.\n\n        Args:\n            concept (Concept): The concept to be validated and converted.\n\n        Returns:\n            The validated and converted concept.\n\n        \"\"\"\n        meta_ann_values = [meta_ann.value for meta_ann in concept.meta] if concept.meta is not None else []\n\n        # assign categories\n        if SubstanceCategory.ADVERSE_REACTION in meta_ann_values:\n            if self._validate_and_convert_substance(concept):\n                self._convert_allergy_type_to_code(concept)\n                self._convert_allergy_severity_to_code(concept)\n                concept.category = Category.ALLERGY\n            else:\n                log.warning(f\"Double-checking if concept ({concept.id} | {concept.name}) is in reaction subset\")\n                if self._validate_and_convert_reaction(concept) and (\n                    ReactionPos.BEFORE_SUBSTANCE in meta_ann_values or ReactionPos.AFTER_SUBSTANCE in meta_ann_values\n                ):\n                    concept.category = Category.REACTION\n                else:\n                    log.warning(\n                        f\"Reaction concept ({concept.id} | {concept.name}) not in subset or reaction_pos is NOT_REACTION\"\n                    )\n        if SubstanceCategory.TAKING in meta_ann_values:\n            if self._validate_meds(concept):\n                concept.category = Category.MEDICATION\n        if SubstanceCategory.NOT_SUBSTANCE in meta_ann_values and (\n            ReactionPos.BEFORE_SUBSTANCE in meta_ann_values or ReactionPos.AFTER_SUBSTANCE in meta_ann_values\n        ):\n            if self._validate_and_convert_reaction(concept):\n                concept.category = Category.REACTION\n\n        return concept\n\n    @staticmethod\n    def add_dosages_to_concepts(\n        dosage_extractor: DosageExtractor, concepts: List[Concept], note: Note\n    ) -> List[Concept]:\n        \"\"\"\n        
Gets dosages for medication concepts\n\n        Args:\n            dosage_extractor (DosageExtractor): The dosage extractor object\n            concepts (List[Concept]): List of concepts extracted\n            note (Note): The input note\n\n        Returns:\n            List of concepts with dosages for medication concepts\n        \"\"\"\n\n        for ind, concept in enumerate(concepts):\n            next_med_concept = concepts[ind + 1] if len(concepts) > ind + 1 else None\n            dosage_string = get_dosage_string(concept, next_med_concept, note.text)\n            if len(dosage_string.split()) > 2:\n                concept.dosage = dosage_extractor(dosage_string)\n                concept.category = Category.MEDICATION if concept.dosage is not None else None\n                if concept.dosage is not None:\n                    log.debug(\n                        f\"Extracted dosage for medication concept \"\n                        f\"({concept.id} | {concept.name}): {concept.dosage.text} {concept.dosage.dose}\"\n                    )\n\n        return concepts\n\n    @staticmethod\n    def _link_reactions_to_allergens(concept_list: List[Concept], note: Note, link_distance: int = 5) -> List[Concept]:\n        \"\"\"\n        Links reaction concepts to allergen concepts based on their proximity in the given concept list.\n\n        Args:\n            concept_list (List[Concept]): The list of concepts to search for reaction and allergen concepts.\n            note (Note): The note object containing the text.\n            link_distance (int, optional): The maximum distance between a reaction and an allergen to be considered linked.\n                Defaults to 5.\n\n        Returns:\n            The updated concept list with reaction concepts removed and linked to their corresponding allergen concepts.\n        \"\"\"\n        allergy_concepts = [concept for concept in concept_list if concept.category == Category.ALLERGY]\n        reaction_concepts = [concept for 
concept in concept_list if concept.category == Category.REACTION]\n\n        for reaction_concept in reaction_concepts:\n            nearest_allergy_concept = None\n            min_distance = inf\n            meta_ann_values = (\n                [meta_ann.value for meta_ann in reaction_concept.meta] if reaction_concept.meta is not None else []\n            )\n\n            for allergy_concept in allergy_concepts:\n                # skip if allergy is after and meta is before_substance\n                if ReactionPos.BEFORE_SUBSTANCE in meta_ann_values and allergy_concept.start < reaction_concept.start:\n                    continue\n                # skip if allergy is before and meta is after_substance\n                elif ReactionPos.AFTER_SUBSTANCE in meta_ann_values and allergy_concept.start > reaction_concept.start:\n                    continue\n                else:\n                    distance = calculate_word_distance(\n                        reaction_concept.start, reaction_concept.end, allergy_concept.start, allergy_concept.end, note\n                    )\n                    log.debug(\n                        f\"Calculated distance between reaction {reaction_concept.name} \"\n                        f\"and allergen {allergy_concept.name}: {distance}\"\n                    )\n                    if distance == -1:\n                        log.warning(\n                            f\"Indices for {reaction_concept.name} or {allergy_concept.name} invalid: \"\n                            f\"({reaction_concept.start}, {reaction_concept.end})\"\n                            f\"({allergy_concept.start}, {allergy_concept.end})\"\n                        )\n                        continue\n\n                    if distance <= link_distance and distance < min_distance:\n                        min_distance = distance\n                        nearest_allergy_concept = allergy_concept\n\n            if nearest_allergy_concept is not None:\n                
nearest_allergy_concept.linked_concepts.append(reaction_concept)\n                log.debug(\n                    f\"Linked reaction concept {reaction_concept.name} to \"\n                    f\"allergen concept {nearest_allergy_concept.name}\"\n                )\n\n        # Remove the linked REACTION concepts from the main list\n        updated_concept_list = [concept for concept in concept_list if concept.category != Category.REACTION]\n\n        return updated_concept_list\n\n    @staticmethod\n    def _convert_allergy_severity_to_code(concept: Concept) -> bool:\n        \"\"\"\n        Converts allergy severity to corresponding codes and links them to the concept.\n\n        Args:\n            concept (Concept): The concept to convert severity for.\n\n        Returns:\n            True if the conversion is successful, False otherwise.\n        \"\"\"\n        meta_ann_values = [meta_ann.value for meta_ann in concept.meta] if concept.meta is not None else []\n        if Severity.MILD in meta_ann_values:\n            concept.linked_concepts.append(Concept(id=\"L\", name=\"Low\", category=Category.SEVERITY))\n        elif Severity.MODERATE in meta_ann_values:\n            concept.linked_concepts.append(Concept(id=\"M\", name=\"Moderate\", category=Category.SEVERITY))\n        elif Severity.SEVERE in meta_ann_values:\n            concept.linked_concepts.append(Concept(id=\"H\", name=\"High\", category=Category.SEVERITY))\n        elif Severity.UNSPECIFIED in meta_ann_values:\n            return True\n        else:\n            log.warning(f\"No severity annotation associated with ({concept.id} | {concept.name})\")\n            return False\n\n        log.debug(\n            f\"Linked severity concept ({concept.linked_concepts[-1].id} | {concept.linked_concepts[-1].name}) \"\n            f\"to allergen concept ({concept.id} | {concept.name}): valid meta model output\"\n        )\n\n        return True\n\n    def _convert_allergy_type_to_code(self, concept: Concept) 
-> bool:\n        \"\"\"\n        Converts the allergy type of a concept to a code and adds it as a linked concept.\n\n        Args:\n            concept (Concept): The concept whose allergy type needs to be converted.\n\n        Returns:\n            True if the conversion and linking were successful, False otherwise.\n        \"\"\"\n        # get the ALLERGYTYPE meta-annotation\n        allergy_type = [meta_ann for meta_ann in concept.meta if meta_ann.name == \"allergy_type\"]\n        if len(allergy_type) != 1:\n            log.warning(\n                f\"Unable to map allergy type code: allergy_type meta-annotation \"\n                f\"not found for concept {concept.__str__()}\"\n            )\n            return False\n        else:\n            allergy_type = allergy_type[0].value\n\n        # perform lookup with ALLERGYTYPE and AllergenType combination\n        lookup_combination: Tuple[str, str] = (concept.category.value, allergy_type.value)\n        allergy_type_lookup_result = self.allergy_type_lookup.get(lookup_combination)\n\n        # add resulting allergy type concept as to linked_concept\n        if allergy_type_lookup_result is not None:\n            concept.linked_concepts.append(\n                Concept(\n                    id=str(allergy_type_lookup_result[0]),\n                    name=allergy_type_lookup_result[1],\n                    category=Category.ALLERGY_TYPE,\n                )\n            )\n            log.debug(\n                f\"Linked allergy_type concept ({allergy_type_lookup_result[0]} | {allergy_type_lookup_result[1]})\"\n                f\" to allergen concept ({concept.id} | {concept.name}): valid meta model output + allergytype lookup\"\n            )\n        else:\n            log.warning(f\"Allergen and adverse reaction type combination not found: {lookup_combination}\")\n\n        return True\n\n    def _process_meta_ann_by_paragraph(self, concept: Concept, paragraph: Paragraph):\n        \"\"\"\n        Process 
the meta annotations for a given concept and paragraph.\n\n        Args:\n            concept (Concept): The concept object.\n            paragraph (Paragraph): The paragraph object.\n\n        Returns:\n            None\n        \"\"\"\n        # if paragraph is structured meds to convert to corresponding relevance\n        if paragraph.type in self.structured_med_lists:\n            for meta in concept.meta:\n                if meta.name == \"substance_category\" and meta.value in [\n                    SubstanceCategory.TAKING,\n                    SubstanceCategory.IRRELEVANT,\n                ]:\n                    new_relevance = self.structured_med_lists[paragraph.type]\n                    if meta.value != new_relevance:\n                        log.debug(\n                            f\"Converted {meta.value} to \"\n                            f\"{new_relevance} for concept ({concept.id} | {concept.name}): \"\n                            f\"paragraph is {paragraph.type}\"\n                        )\n                        meta.value = new_relevance\n        # if paragraph is probs or irrelevant section, convert substance to irrelevant\n        elif paragraph.type in self.structured_prob_lists or paragraph.type in self.irrelevant_paragraphs:\n            for meta in concept.meta:\n                if meta.name == \"substance_category\" and meta.value != SubstanceCategory.IRRELEVANT:\n                    log.debug(\n                        f\"Converted {meta.value} to \"\n                        f\"{SubstanceCategory.IRRELEVANT} for concept ({concept.id} | {concept.name}): \"\n                        f\"paragraph is {paragraph.type}\"\n                    )\n                    meta.value = SubstanceCategory.IRRELEVANT\n\n    def process_paragraphs(self, note: Note, concepts: List[Concept]) -> List[Concept]:\n        \"\"\"\n        Process the paragraphs in a note and update the list of concepts.\n\n        Args:\n            note (Note): The note object 
containing the paragraphs.\n            concepts (List[Concept]): The list of concepts to be updated.\n\n        Returns:\n            The updated list of concepts.\n        \"\"\"\n        for paragraph in note.paragraphs:\n            for concept in concepts:\n                if concept.start >= paragraph.start and concept.end <= paragraph.end:\n                    # log.debug(f\"({concept.name} | {concept.id}) is in {paragraph.type}\")\n                    if concept.meta:\n                        self._process_meta_ann_by_paragraph(concept, paragraph)\n\n        return concepts\n\n    def postprocess(self, concepts: List[Concept], note: Note) -> List[Concept]:\n        \"\"\"\n        Postprocesses a list of concepts and links reactions to allergens.\n\n        Args:\n            concepts (List[Concept]): The list of concepts to be postprocessed.\n            note (Note): The note object associated with the concepts.\n\n        Returns:\n           The postprocessed list of concepts.\n        \"\"\"\n        # deepcopy so we still have reference to original list of concepts\n        all_concepts = deepcopy(concepts)\n        processed_concepts = []\n\n        for concept in all_concepts:\n            concept = self._validate_and_convert_concepts(concept)\n            processed_concepts.append(concept)\n\n        processed_concepts = self._link_reactions_to_allergens(processed_concepts, note)\n\n        return processed_concepts\n\n    def convert_VTM_to_VMP_or_text(self, concepts: List[Concept]) -> List[Concept]:\n        \"\"\"\n        Converts medication concepts from VTM (Virtual Therapeutic Moiety) to VMP (Virtual Medicinal Product) or text.\n\n        Args:\n            concepts (List[Concept]): A list of medication concepts.\n\n        Returns:\n            A list of medication concepts with updated IDs, names, and dosages.\n\n        \"\"\"\n        # Get medication concepts\n        med_concepts = [concept for concept in concepts if concept.category == 
Category.MEDICATION]\n        self.vtm_to_vmp_lookup[\"dose\"] = self.vtm_to_vmp_lookup[\"dose\"].astype(float)\n\n        med_concepts_with_dose = []\n        # I don't know man...Need to improve dosage methods\n        for concept in med_concepts:\n            if concept.dosage is not None:\n                if concept.dosage.dose:\n                    if concept.dosage.dose.value is not None and concept.dosage.dose.unit is not None:\n                        med_concepts_with_dose.append(concept)\n\n        med_concepts_no_dose = [concept for concept in concepts if concept not in med_concepts_with_dose]\n\n        # Create a temporary DataFrame to match vtmId, dose, and unit\n        temp_df = pd.DataFrame(\n            {\n                \"vtmId\": [int(concept.id) for concept in med_concepts_with_dose],\n                \"dose\": [float(concept.dosage.dose.value) for concept in med_concepts_with_dose],\n                \"unit\": [concept.dosage.dose.unit for concept in med_concepts_with_dose],\n            }\n        )\n\n        # Merge with the lookup df to get vmpId\n        merged_df = temp_df.merge(self.vtm_to_vmp_lookup, on=[\"vtmId\", \"dose\", \"unit\"], how=\"left\")\n\n        # Update id in the concepts list\n        for index, concept in enumerate(med_concepts_with_dose):\n            # Convert VTM to VMP id\n            vmp_id = merged_df.at[index, \"vmpId\"]\n            if not pd.isna(vmp_id):\n                log.debug(\n                    f\"Converted ({concept.id} | {concept.name}) to \"\n                    f\"({int(vmp_id)} | {concept.name + ' ' + str(int(concept.dosage.dose.value)) + concept.dosage.dose.unit} \"\n                    f\"tablets): valid extracted dosage + VMP lookup\"\n                )\n                concept.id = str(int(vmp_id))\n                concept.name += \" \" + str(int(concept.dosage.dose.value)) + str(concept.dosage.dose.unit) + \" tablets\"\n                # If found VMP match change the dosage to 1 tablet\n    
            concept.dosage.dose.value = 1\n                concept.dosage.dose.unit = \"{tbl}\"\n            else:\n                # If no match with dose convert to text\n                lookup_result = self.vtm_to_text_lookup.get(int(concept.id))\n                if lookup_result is not None:\n                    log.debug(\n                        f\"Converted ({concept.id} | {concept.name}) to (None | {lookup_result}: no match to VMP dosage lookup)\"\n                    )\n                    concept.id = None\n                    concept.name = lookup_result\n\n        # Convert rest of VTMs that have no dose for VMP conversion to text\n        for concept in med_concepts_no_dose:\n            lookup_result = self.vtm_to_text_lookup.get(int(concept.id))\n            if lookup_result is not None:\n                log.debug(f\"Converted ({concept.id} | {concept.name}) to (None | {lookup_result}): no dosage detected\")\n                concept.id = None\n                concept.name = lookup_result\n\n        return concepts\n\n    def __call__(\n        self,\n        note: Note,\n        record_concepts: Optional[List[Concept]] = None,\n        dosage_extractor: Optional[DosageExtractor] = None,\n    ) -> List[Concept]:\n        \"\"\"\n        Annotates the given note with concepts using the pipeline.\n\n        Args:\n            note (Note): The note to be annotated.\n            record_concepts (Optional[List[Concept]]): A list of concepts to be recorded.\n            dosage_extractor (Optional[DosageExtractor]): A dosage extractor to be used.\n\n        Returns:\n            The annotated concepts.\n        \"\"\"\n        concepts = self.run_pipeline(note, record_concepts, dosage_extractor)\n\n        if self.config.add_numbering:\n            concepts = self.add_numbering_to_name(concepts)\n\n        return concepts\n
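The nearest-allergen rule used in _link_reactions_to_allergens can be sketched in isolation as follows. This is a simplified illustration rather than the library code: positions are hypothetical word indices, the absolute difference stands in for calculate_word_distance, and the before/after reaction position checks are left out.

```python
from math import inf
from typing import Dict, Optional


def nearest_allergen(
    reaction_pos: int, allergen_positions: Dict[str, int], link_distance: int = 5
) -> Optional[str]:
    # Pick the closest allergen within link_distance words, as in the loop above.
    nearest = None
    min_distance = inf
    for name, pos in allergen_positions.items():
        distance = abs(pos - reaction_pos)  # stand-in for calculate_word_distance
        if distance <= link_distance and distance < min_distance:
            min_distance = distance
            nearest = name
    return nearest


linked = nearest_allergen(10, {'penicillin': 8, 'peanuts': 30})
unlinked = nearest_allergen(10, {'peanuts': 30})
```

A reaction with no allergen inside the link distance stays unlinked; in the source above it is still removed from the returned concept list.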
"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.concept_types","title":"concept_types: List[Category] property","text":"

Returns a list of concept types.

Returns:

Type Description List[Category]

[Category.MEDICATION, Category.ALLERGY, Category.REACTION]

"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.pipeline","title":"pipeline: List[str] property","text":"

Returns a list of annotators in the pipeline.

The annotators are executed in the order they appear in the list.

Returns:

Type Description List[str]

[\"preprocessor\", \"medcat\", \"paragrapher\", \"postprocessor\", \"dosage_extractor\", \"vtm_converter\", \"deduplicator\"]

"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.__call__","title":"__call__(note, record_concepts=None, dosage_extractor=None)","text":"

Annotates the given note with concepts using the pipeline.

Parameters:

Name Type Description Default note Note

The note to be annotated.

required record_concepts Optional[List[Concept]]

A list of concepts already present in the record, used for deduplication.

None dosage_extractor Optional[DosageExtractor]

A dosage extractor to be used.

None

Returns:

Type Description List[Concept]

The annotated concepts.

Source code in src/miade/annotators.py
def __call__(\n    self,\n    note: Note,\n    record_concepts: Optional[List[Concept]] = None,\n    dosage_extractor: Optional[DosageExtractor] = None,\n) -> List[Concept]:\n    \"\"\"\n    Annotates the given note with concepts using the pipeline.\n\n    Args:\n        note (Note): The note to be annotated.\n        record_concepts (Optional[List[Concept]]): A list of concepts to be recorded.\n        dosage_extractor (Optional[DosageExtractor]): A dosage extractor to be used.\n\n    Returns:\n        The annotated concepts.\n    \"\"\"\n    concepts = self.run_pipeline(note, record_concepts, dosage_extractor)\n\n    if self.config.add_numbering:\n        concepts = self.add_numbering_to_name(concepts)\n\n    return concepts\n
"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.add_dosages_to_concepts","title":"add_dosages_to_concepts(dosage_extractor, concepts, note) staticmethod","text":"

Gets dosages for medication concepts

Parameters:

Name Type Description Default dosage_extractor DosageExtractor

The dosage extractor object

required concepts List[Concept]

List of concepts extracted

required note Note

The input note

required

Returns:

Type Description List[Concept]

List of concepts with dosages for medication concepts

Source code in src/miade/annotators.py
@staticmethod\ndef add_dosages_to_concepts(\n    dosage_extractor: DosageExtractor, concepts: List[Concept], note: Note\n) -> List[Concept]:\n    \"\"\"\n    Gets dosages for medication concepts\n\n    Args:\n        dosage_extractor (DosageExtractor): The dosage extractor object\n        concepts (List[Concept]): List of concepts extracted\n        note (Note): The input note\n\n    Returns:\n        List of concepts with dosages for medication concepts\n    \"\"\"\n\n    for ind, concept in enumerate(concepts):\n        next_med_concept = concepts[ind + 1] if len(concepts) > ind + 1 else None\n        dosage_string = get_dosage_string(concept, next_med_concept, note.text)\n        if len(dosage_string.split()) > 2:\n            concept.dosage = dosage_extractor(dosage_string)\n            concept.category = Category.MEDICATION if concept.dosage is not None else None\n            if concept.dosage is not None:\n                log.debug(\n                    f\"Extracted dosage for medication concept \"\n                    f\"({concept.id} | {concept.name}): {concept.dosage.text} {concept.dosage.dose}\"\n                )\n\n    return concepts\n
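The loop above pairs each concept with its successor and passes both to get_dosage_string, whose implementation is not shown in this section. As a minimal sketch, assuming the dosage string is simply the note text running from the start of one concept to the start of the next, the windowing could look like the following. `Span`, `dosage_window`, and `windows` are hypothetical names for illustration, not part of MiADE.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """Hypothetical stand-in for a concept: a name plus character offsets."""
    name: str
    start: int
    end: int


def dosage_window(current: Span, following: Optional[Span], text: str) -> str:
    """Return the text from the current span up to the next span (or end of text)."""
    stop = following.start if following is not None else len(text)
    return text[current.start:stop].strip()


def windows(spans: List[Span], text: str) -> List[str]:
    """Pair each span with its successor, mirroring the enumerate loop above."""
    result = []
    for ind, span in enumerate(spans):
        following = spans[ind + 1] if len(spans) > ind + 1 else None
        result.append(dosage_window(span, following, text))
    return result


text = "paracetamol 500mg twice daily, ibuprofen"
second = text.index("ibuprofen")
spans = [
    Span("paracetamol", 0, len("paracetamol")),
    Span("ibuprofen", second, second + len("ibuprofen")),
]
for window in windows(spans, text):
    # Only windows with more than two tokens would be sent to the extractor.
    print(repr(window), len(window.split()) > 2)
```

In this toy input only the first window has more than two tokens, so only that one would reach the dosage extractor under the `len(dosage_string.split()) > 2` check.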
"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.convert_VTM_to_VMP_or_text","title":"convert_VTM_to_VMP_or_text(concepts)","text":"

Converts medication concepts from VTM (Virtual Therapeutic Moiety) to VMP (Virtual Medicinal Product) or text.

Parameters:

Name Type Description Default concepts List[Concept]

A list of medication concepts.

required

Returns:

Type Description List[Concept]

A list of medication concepts with updated IDs, names, and dosages.

Source code in src/miade/annotators.py
def convert_VTM_to_VMP_or_text(self, concepts: List[Concept]) -> List[Concept]:\n    \"\"\"\n    Converts medication concepts from VTM (Virtual Therapeutic Moiety) to VMP (Virtual Medicinal Product) or text.\n\n    Args:\n        concepts (List[Concept]): A list of medication concepts.\n\n    Returns:\n        A list of medication concepts with updated IDs, names, and dosages.\n\n    \"\"\"\n    # Get medication concepts\n    med_concepts = [concept for concept in concepts if concept.category == Category.MEDICATION]\n    self.vtm_to_vmp_lookup[\"dose\"] = self.vtm_to_vmp_lookup[\"dose\"].astype(float)\n\n    med_concepts_with_dose = []\n    # I don't know man...Need to improve dosage methods\n    for concept in med_concepts:\n        if concept.dosage is not None:\n            if concept.dosage.dose:\n                if concept.dosage.dose.value is not None and concept.dosage.dose.unit is not None:\n                    med_concepts_with_dose.append(concept)\n\n    med_concepts_no_dose = [concept for concept in concepts if concept not in med_concepts_with_dose]\n\n    # Create a temporary DataFrame to match vtmId, dose, and unit\n    temp_df = pd.DataFrame(\n        {\n            \"vtmId\": [int(concept.id) for concept in med_concepts_with_dose],\n            \"dose\": [float(concept.dosage.dose.value) for concept in med_concepts_with_dose],\n            \"unit\": [concept.dosage.dose.unit for concept in med_concepts_with_dose],\n        }\n    )\n\n    # Merge with the lookup df to get vmpId\n    merged_df = temp_df.merge(self.vtm_to_vmp_lookup, on=[\"vtmId\", \"dose\", \"unit\"], how=\"left\")\n\n    # Update id in the concepts list\n    for index, concept in enumerate(med_concepts_with_dose):\n        # Convert VTM to VMP id\n        vmp_id = merged_df.at[index, \"vmpId\"]\n        if not pd.isna(vmp_id):\n            log.debug(\n                f\"Converted ({concept.id} | {concept.name}) to \"\n                f\"({int(vmp_id)} | {concept.name + ' ' + 
str(int(concept.dosage.dose.value)) + concept.dosage.dose.unit} \"\n                f\"tablets): valid extracted dosage + VMP lookup\"\n            )\n            concept.id = str(int(vmp_id))\n            concept.name += \" \" + str(int(concept.dosage.dose.value)) + str(concept.dosage.dose.unit) + \" tablets\"\n            # If found VMP match change the dosage to 1 tablet\n            concept.dosage.dose.value = 1\n            concept.dosage.dose.unit = \"{tbl}\"\n        else:\n            # If no match with dose convert to text\n            lookup_result = self.vtm_to_text_lookup.get(int(concept.id))\n            if lookup_result is not None:\n                log.debug(\n                    f\"Converted ({concept.id} | {concept.name}) to (None | {lookup_result}: no match to VMP dosage lookup)\"\n                )\n                concept.id = None\n                concept.name = lookup_result\n\n    # Convert rest of VTMs that have no dose for VMP conversion to text\n    for concept in med_concepts_no_dose:\n        lookup_result = self.vtm_to_text_lookup.get(int(concept.id))\n        if lookup_result is not None:\n            log.debug(f\"Converted ({concept.id} | {concept.name}) to (None | {lookup_result}): no dosage detected\")\n            concept.id = None\n            concept.name = lookup_result\n\n    return concepts\n
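The decision logic of the conversion can be sketched without pandas. The example below swaps the DataFrame left merge for a plain dictionary keyed on (vtmId, dose, unit), which gives the same three outcomes: a VMP match, a text fallback, or no change. All IDs and names are made up, and `convert` is a hypothetical helper rather than MiADE's API.

```python
from typing import Dict, Optional, Tuple

# Hypothetical lookup rows: (vtmId, dose, unit) -> vmpId. The IDs are invented.
VTM_TO_VMP: Dict[Tuple[int, float, str], int] = {
    (1001, 500.0, "mg"): 2001,
    (1001, 250.0, "mg"): 2002,
}
# Hypothetical fallback table: vtmId -> free-text name.
VTM_TO_TEXT: Dict[int, str] = {1001: "EXAMPLE DRUG (text)"}


def convert(vtm_id: int, dose: Optional[float], unit: Optional[str]) -> Tuple[str, str]:
    """Return (kind, value) where kind is 'vmp', 'text', or 'unchanged'."""
    if dose is not None and unit is not None:
        # Equivalent of the left merge: a missing key plays the role of NaN.
        vmp_id = VTM_TO_VMP.get((vtm_id, float(dose), unit))
        if vmp_id is not None:
            return ("vmp", str(vmp_id))
    # No usable dose, or no VMP row for this dose: fall back to text.
    text = VTM_TO_TEXT.get(vtm_id)
    if text is not None:
        return ("text", text)
    return ("unchanged", str(vtm_id))


print(convert(1001, 500, "mg"))
print(convert(1001, 75, "mg"))
print(convert(1001, None, None))
print(convert(9999, None, None))
```

The float cast mirrors the `astype(float)` and `float(...)` calls in the source, so that an integer dose still matches a lookup row.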
"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.postprocess","title":"postprocess(concepts, note)","text":"

Postprocesses a list of concepts and links reactions to allergens.

Parameters:

Name Type Description Default concepts List[Concept]

The list of concepts to be postprocessed.

required note Note

The note object associated with the concepts.

required

Returns:

Type Description List[Concept]

The postprocessed list of concepts.

Source code in src/miade/annotators.py
def postprocess(self, concepts: List[Concept], note: Note) -> List[Concept]:\n    \"\"\"\n    Postprocesses a list of concepts and links reactions to allergens.\n\n    Args:\n        concepts (List[Concept]): The list of concepts to be postprocessed.\n        note (Note): The note object associated with the concepts.\n\n    Returns:\n       The postprocessed list of concepts.\n    \"\"\"\n    # deepcopy so we still have reference to original list of concepts\n    all_concepts = deepcopy(concepts)\n    processed_concepts = []\n\n    for concept in all_concepts:\n        concept = self._validate_and_convert_concepts(concept)\n        processed_concepts.append(concept)\n\n    processed_concepts = self._link_reactions_to_allergens(processed_concepts, note)\n\n    return processed_concepts\n
"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.process_paragraphs","title":"process_paragraphs(note, concepts)","text":"

Process the paragraphs in a note and update the list of concepts.

Parameters:

Name Type Description Default note Note

The note object containing the paragraphs.

required concepts List[Concept]

The list of concepts to be updated.

required

Returns:

Type Description List[Concept]

The updated list of concepts.

Source code in src/miade/annotators.py
def process_paragraphs(self, note: Note, concepts: List[Concept]) -> List[Concept]:\n    \"\"\"\n    Process the paragraphs in a note and update the list of concepts.\n\n    Args:\n        note (Note): The note object containing the paragraphs.\n        concepts (List[Concept]): The list of concepts to be updated.\n\n    Returns:\n        The updated list of concepts.\n    \"\"\"\n    for paragraph in note.paragraphs:\n        for concept in concepts:\n            if concept.start >= paragraph.start and concept.end <= paragraph.end:\n                # log.debug(f\"({concept.name} | {concept.id}) is in {paragraph.type}\")\n                if concept.meta:\n                    self._process_meta_ann_by_paragraph(concept, paragraph)\n\n    return concepts\n
"},{"location":"api-reference/medsallergiesannotator/#miade.annotators.MedsAllergiesAnnotator.run_pipeline","title":"run_pipeline(note, record_concepts, dosage_extractor)","text":"

Runs the annotation pipeline on the given note.

Parameters:

Name Type Description Default note Note

The input note to run the pipeline on.

required record_concepts List[Concept]

The list of previously recorded concepts.

required dosage_extractor Optional[DosageExtractor]

The dosage extractor function.

required

Returns:

Type Description List[Concept]

The list of annotated concepts.

Source code in src/miade/annotators.py
def run_pipeline(\n    self, note: Note, record_concepts: List[Concept], dosage_extractor: Optional[DosageExtractor]\n) -> List[Concept]:\n    \"\"\"\n    Runs the annotation pipeline on the given note.\n\n    Args:\n        note (Note): The input note to run the pipeline on.\n        record_concepts (List[Concept]): The list of previously recorded concepts.\n        dosage_extractor (Optional[DosageExtractor]): The dosage extractor function.\n\n    Returns:\n        The list of annotated concepts.\n    \"\"\"\n    concepts: List[Concept] = []\n\n    for pipe in self.pipeline:\n        if pipe not in self.config.disable:\n            if pipe == \"preprocessor\":\n                note = self.preprocess(note)\n            elif pipe == \"medcat\":\n                concepts = self.get_concepts(note)\n            elif pipe == \"paragrapher\":\n                concepts = self.process_paragraphs(note, concepts)\n            elif pipe == \"postprocessor\":\n                concepts = self.postprocess(concepts, note)\n            elif pipe == \"deduplicator\":\n                concepts = self.deduplicate(concepts, record_concepts)\n            elif pipe == \"vtm_converter\":\n                concepts = self.convert_VTM_to_VMP_or_text(concepts)\n            elif pipe == \"dosage_extractor\" and dosage_extractor is not None:\n                concepts = self.add_dosages_to_concepts(dosage_extractor, concepts, note)\n\n    return concepts\n
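The control flow above is an ordered list of step names filtered by a disable list. A toy sketch of that pattern follows, using a dispatch dictionary in place of the if/elif chain. The step names and functions here are hypothetical and operate on plain strings, not on MiADE concepts.

```python
from typing import Callable, Dict, List

Step = Callable[[List[str]], List[str]]

# Hypothetical pipeline: the order of this list is the execution order.
PIPELINE: List[str] = ["preprocessor", "filter", "deduplicator"]

STEPS: Dict[str, Step] = {
    "preprocessor": lambda items: [item.strip().lower() for item in items],
    "filter": lambda items: [item for item in items if item],
    # dict.fromkeys keeps the first occurrence and preserves order.
    "deduplicator": lambda items: list(dict.fromkeys(items)),
}


def run(items: List[str], disable: List[str]) -> List[str]:
    """Run every step that is not disabled, in pipeline order."""
    for name in PIPELINE:
        if name not in disable:
            items = STEPS[name](items)
    return items


print(run([" Aspirin", "aspirin", ""], disable=[]))
print(run([" Aspirin", "aspirin", ""], disable=["deduplicator"]))
```

Disabling a step only skips it; the remaining steps still run in their listed order.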
"},{"location":"api-reference/metaannotations/","title":"MetaAnnotations","text":"

Bases: BaseModel

Represents a meta annotation with a name, value, and optional confidence.

Attributes:

Name Type Description name str

The name of the meta annotation.

value Enum

The value of the meta annotation.

confidence float

The confidence level of the meta annotation.

Source code in src/miade/metaannotations.py
class MetaAnnotations(BaseModel):\n    \"\"\"\n    Represents a meta annotation with a name, value, and optional confidence.\n\n    Attributes:\n        name (str): The name of the meta annotation.\n        value (Enum): The value of the meta annotation.\n        confidence (float, optional): The confidence level of the meta annotation.\n    \"\"\"\n\n    name: str\n    value: Enum\n    confidence: Optional[float]\n\n    @validator(\"value\", pre=True)\n    def validate_value(cls, value, values):\n        enum_dict = META_ANNS_DICT\n        if isinstance(value, str):\n            enum_type = enum_dict.get(values[\"name\"])\n            if enum_type is not None:\n                try:\n                    return enum_type(value)\n                except ValueError:\n                    raise ValueError(f\"Invalid value: {value}\")\n            else:\n                raise ValueError(f\"Invalid mapping for {values['name']}\")\n\n        return value\n\n    def __eq__(self, other):\n        return self.name == other.name and self.value == other.value\n
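The validator above coerces a string into the Enum type registered for the annotation's name. The same idea can be sketched with the standard library only. `Presence`, `META_ENUMS`, and `coerce` are hypothetical stand-ins, and the real contents of META_ANNS_DICT are not shown in this section.

```python
from enum import Enum
from typing import Any, Dict, Type


class Presence(Enum):
    """Hypothetical meta annotation values."""
    CONFIRMED = "confirmed"
    NEGATED = "negated"


# Hypothetical mapping from annotation name to its Enum type.
META_ENUMS: Dict[str, Type[Enum]] = {"presence": Presence}


def coerce(name: str, value: Any) -> Any:
    """Turn a string into the Enum member registered for this name."""
    if not isinstance(value, str):
        return value  # already an Enum member (or something else): pass through
    enum_type = META_ENUMS.get(name)
    if enum_type is None:
        raise ValueError(f"Invalid mapping for {name}")
    try:
        return enum_type(value)
    except ValueError:
        raise ValueError(f"Invalid value: {value}") from None


print(coerce("presence", "negated"))
print(coerce("presence", Presence.CONFIRMED))
```

Note that `__eq__` in the class above compares only name and value, so two annotations that differ only in confidence are treated as equal.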
"},{"location":"api-reference/note/","title":"Note","text":"

Bases: object

Represents a note object.

Attributes:

Name Type Description text str

The text content of the note.

raw_text str

The raw text content of the note.

regex_config str

The paragraph-type regex mappings loaded from the regex configuration file.

paragraphs Optional[List[Paragraph]]

A list of paragraphs in the note.

Source code in src/miade/note.py
class Note(object):\n    \"\"\"\n    Represents a note object.\n\n    Attributes:\n        text (str): The text content of the note.\n        raw_text (str): The raw text content of the note.\n        regex_config (str): The path to the regex configuration file.\n        paragraphs (Optional[List[Paragraph]]): A list of paragraphs in the note.\n    \"\"\"\n\n    def __init__(self, text: str, regex_config_path: str = \"./data/regex_para_chunk.csv\"):\n        self.text = text\n        self.raw_text = text\n        self.regex_config = load_regex_config_mappings(regex_config_path)\n        self.paragraphs: Optional[List[Paragraph]] = []\n\n    def clean_text(self) -> None:\n        \"\"\"\n        Cleans the text content of the note.\n\n        This method performs various cleaning operations on the text content of the note,\n        such as replacing spaces, removing punctuation, and removing empty lines.\n        \"\"\"\n\n        # Replace all types of spaces with a single normal space, preserving \"\\n\"\n        self.text = re.sub(r\"(?:(?!\\n)\\s)+\", \" \", self.text)\n\n        # Remove en dashes that are not between two numbers\n        self.text = re.sub(r\"(?<![0-9])-(?![0-9])\", \"\", self.text)\n\n        # Remove all punctuation except full stops, question marks, dash and line breaks\n        self.text = re.sub(r\"[^\\w\\s.,?\\n-]\", \"\", self.text)\n\n        # Remove spaces if the entire line (between two line breaks) is just spaces\n        self.text = re.sub(r\"(?<=\\n)\\s+(?=\\n)\", \"\", self.text)\n\n    def get_paragraphs(self) -> None:\n        \"\"\"\n        Splits the note into paragraphs.\n\n        This method splits the text content of the note into paragraphs based on double line breaks.\n        It also assigns a paragraph type to each paragraph based on matching patterns in the heading.\n        \"\"\"\n\n        paragraphs = re.split(r\"\\n\\n+\", self.text)\n        start = 0\n\n        for text in paragraphs:\n            # Default 
to prose\n            paragraph_type = ParagraphType.prose\n\n            # Use re.search to find everything before first \\n\n            match = re.search(r\"^(.*?)(?:\\n|$)([\\s\\S]*)\", text)\n\n            # Check if a match is found\n            if match:\n                heading = match.group(1)\n                body = match.group(2)\n            else:\n                heading = text\n                body = \"\"\n\n            end = start + len(text)\n            paragraph = Paragraph(heading=heading, body=body, type=paragraph_type, start=start, end=end)\n            start = end + 2  # Account for the two newline characters\n\n            # Convert the heading to lowercase for case-insensitive matching\n            if heading:\n                heading = heading.lower()\n                # Iterate through the dictionary items and patterns\n                for paragraph_type, pattern in self.regex_config.items():\n                    if re.search(pattern, heading):\n                        paragraph.type = paragraph_type\n                        break  # Exit the loop if a match is found\n\n            self.paragraphs.append(paragraph)\n\n    def __str__(self):\n        return self.text\n
"},{"location":"api-reference/note/#miade.note.Note.clean_text","title":"clean_text()","text":"

Cleans the text content of the note.

This method performs various cleaning operations on the text content of the note, such as replacing spaces, removing punctuation, and removing empty lines.

Source code in src/miade/note.py
def clean_text(self) -> None:\n    \"\"\"\n    Cleans the text content of the note.\n\n    This method performs various cleaning operations on the text content of the note,\n    such as replacing spaces, removing punctuation, and removing empty lines.\n    \"\"\"\n\n    # Replace all types of spaces with a single normal space, preserving \"\\n\"\n    self.text = re.sub(r\"(?:(?!\\n)\\s)+\", \" \", self.text)\n\n    # Remove hyphens that have no digit on either side\n    self.text = re.sub(r\"(?<![0-9])-(?![0-9])\", \"\", self.text)\n\n    # Remove all punctuation except full stops, commas, question marks, hyphens and line breaks\n    self.text = re.sub(r\"[^\\w\\s.,?\\n-]\", \"\", self.text)\n\n    # Remove spaces if the entire line (between two line breaks) is just spaces\n    self.text = re.sub(r\"(?<=\\n)\\s+(?=\\n)\", \"\", self.text)\n
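To see what the four substitutions do in sequence, they can be applied to a small sample as a standalone function. The regular expressions are copied from the source above, while the function wrapper and the sample text are illustrative assumptions.

```python
import re


def clean_text(text: str) -> str:
    """Apply the same four substitutions as Note.clean_text, in the same order."""
    # 1. Collapse runs of non-newline whitespace into a single space.
    text = re.sub(r"(?:(?!\n)\s)+", " ", text)
    # 2. Drop hyphens that have no digit on either side.
    text = re.sub(r"(?<![0-9])-(?![0-9])", "", text)
    # 3. Drop everything except word characters, whitespace, '.', ',', '?', '-'.
    text = re.sub(r"[^\w\s.,?\n-]", "", text)
    # 4. Empty out lines that contain only whitespace.
    text = re.sub(r"(?<=\n)\s+(?=\n)", "", text)
    return text


sample = "Pain - mild!\n   \nDose 5-10 mg (oral)"
print(repr(clean_text(sample)))
```

The hyphen in a numeric range such as 5-10 survives, while the free-standing hyphen is removed and leaves a double space behind, because step 1 has already run by then.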
"},{"location":"api-reference/note/#miade.note.Note.get_paragraphs","title":"get_paragraphs()","text":"

Splits the note into paragraphs.

This method splits the text content of the note into paragraphs based on double line breaks. It also assigns a paragraph type to each paragraph based on matching patterns in the heading.

Source code in src/miade/note.py
def get_paragraphs(self) -> None:\n    \"\"\"\n    Splits the note into paragraphs.\n\n    This method splits the text content of the note into paragraphs based on double line breaks.\n    It also assigns a paragraph type to each paragraph based on matching patterns in the heading.\n    \"\"\"\n\n    paragraphs = re.split(r\"\\n\\n+\", self.text)\n    start = 0\n\n    for text in paragraphs:\n        # Default to prose\n        paragraph_type = ParagraphType.prose\n\n        # Use re.search to find everything before first \\n\n        match = re.search(r\"^(.*?)(?:\\n|$)([\\s\\S]*)\", text)\n\n        # Check if a match is found\n        if match:\n            heading = match.group(1)\n            body = match.group(2)\n        else:\n            heading = text\n            body = \"\"\n\n        end = start + len(text)\n        paragraph = Paragraph(heading=heading, body=body, type=paragraph_type, start=start, end=end)\n        start = end + 2  # Account for the two newline characters\n\n        # Convert the heading to lowercase for case-insensitive matching\n        if heading:\n            heading = heading.lower()\n            # Iterate through the dictionary items and patterns\n            for paragraph_type, pattern in self.regex_config.items():\n                if re.search(pattern, heading):\n                    paragraph.type = paragraph_type\n                    break  # Exit the loop if a match is found\n\n        self.paragraphs.append(paragraph)\n
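The offset bookkeeping above adds 2 to the running position after each paragraph, which appears to assume exactly two newline characters between paragraphs, even though the split pattern also accepts longer runs. A possible alternative, sketched here as an assumption rather than as MiADE's method, is to take the offsets directly from the separator matches. `split_with_offsets` is a hypothetical helper.

```python
import re
from typing import List, Tuple


def split_with_offsets(text: str) -> List[Tuple[int, int, str]]:
    """Split on runs of two or more newlines, returning (start, end, paragraph)."""
    spans = []
    start = 0
    for separator in re.finditer(r"\n\n+", text):
        spans.append((start, separator.start(), text[start:separator.start()]))
        start = separator.end()  # skip the whole run, however long it is
    spans.append((start, len(text), text[start:]))
    return spans


text = "Meds\naspirin\n\n\nAllergies\nnone"
for start, end, paragraph in split_with_offsets(text):
    print(start, end, repr(paragraph))
```

Because each start comes from the end of the separator match, slicing the original text with the returned offsets always reproduces the paragraph, regardless of how many blank lines separate paragraphs.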
"},{"location":"api-reference/noteprocessor/","title":"NoteProcessor","text":"

Main processor of MiADE, which extracts, postprocesses, and deduplicates concepts given annotators (MedCAT models), a Note, and existing concepts.

Parameters:

Name Type Description Default model_directory Path

Path to the directory that contains MedCAT models and a config.yaml file.

required model_config_path Path

Path to the model config file. Defaults to None.

None log_level int

Log level. Defaults to logging.INFO.

INFO dosage_extractor_log_level int

Log level for dosage extractor. Defaults to logging.INFO.

INFO device str

Device to run inference on (cpu or gpu). Defaults to \"cpu\".

'cpu' custom_annotators List[Annotator]

List of custom annotators. Defaults to None.

None Source code in src/miade/core.py
class NoteProcessor:\n    \"\"\"\n    Main processor of MiADE which extract, postprocesses, and deduplicates concepts given\n    annotators (MedCAT models), Note, and existing concepts\n\n    Args:\n        model_directory (Path): Path to directory that contains medcat models and a config.yaml file\n        model_config_path (Path, optional): Path to the model config file. Defaults to None.\n        log_level (int, optional): Log level. Defaults to logging.INFO.\n        dosage_extractor_log_level (int, optional): Log level for dosage extractor. Defaults to logging.INFO.\n        device (str, optional): Device to run inference on (cpu or gpu). Defaults to \"cpu\".\n        custom_annotators (List[Annotator], optional): List of custom annotators. Defaults to None.\n    \"\"\"\n\n    def __init__(\n        self,\n        model_directory: Path,\n        model_config_path: Path = None,\n        log_level: int = logging.INFO,\n        dosage_extractor_log_level: int = logging.INFO,\n        device: str = \"cpu\",\n        custom_annotators: Optional[List[Annotator]] = None,\n    ):\n        logging.getLogger(\"miade\").setLevel(log_level)\n        logging.getLogger(\"miade.dosageextractor\").setLevel(dosage_extractor_log_level)\n        logging.getLogger(\"miade.drugdoseade\").setLevel(dosage_extractor_log_level)\n\n        self.device: str = device\n\n        self.annotators: List[Annotator] = []\n        self.model_directory: Path = model_directory\n        self.model_config_path: Path = model_config_path\n        self.model_factory: ModelFactory = self._load_model_factory(custom_annotators)\n        self.dosage_extractor: DosageExtractor = DosageExtractor()\n\n    def _load_config(self) -> Dict:\n        \"\"\"\n        Loads the configuration file (config.yaml) in the configured model path.\n        If the model path is not explicitly passed, it defaults to the model directory.\n\n        Returns:\n            A dictionary containing the loaded config file.\n        
\"\"\"\n        if self.model_config_path is None:\n            config_path = os.path.join(self.model_directory, \"config.yaml\")\n        else:\n            config_path = self.model_config_path\n\n        if os.path.isfile(config_path):\n            log.info(f\"Found config file {config_path}\")\n        else:\n            log.error(f\"No model config file found at {config_path}\")\n\n        with open(config_path, \"r\") as f:\n            config = yaml.safe_load(f)\n\n        return config\n\n    def _load_model_factory(self, custom_annotators: Optional[List[Annotator]] = None) -> ModelFactory:\n        \"\"\"\n        Loads the model factory which maps model aliases to MedCAT model IDs and MiADE annotators.\n\n        Args:\n            custom_annotators (List[Annotators], optional): List of custom annotators to initialize. Defaults to None.\n\n        Returns:\n            The initialized ModelFactory object.\n\n        Raises:\n            Exception: If there is an error loading MedCAT models.\n\n        \"\"\"\n        meta_cat_config_dict = {\"general\": {\"device\": self.device}}\n        config_dict = self._load_config()\n        loaded_models = {}\n\n        # get model {id: cat_model}\n        log.info(f\"Loading MedCAT models from {self.model_directory}\")\n        for model_pack_filepath in self.model_directory.glob(\"*.zip\"):\n            try:\n                cat = MiADE_CAT.load_model_pack(str(model_pack_filepath), meta_cat_config_dict=meta_cat_config_dict)\n                # temp fix reload to load stop words\n                cat.pipe._nlp = spacy.load(\n                    cat.config.general.spacy_model, disable=cat.config.general.spacy_disabled_components\n                )\n                cat._create_pipeline(config=cat.config)\n                cat_id = cat.config.version[\"id\"]\n                loaded_models[cat_id] = cat\n            except Exception as e:\n                raise Exception(f\"Error loading MedCAT models: {e}\")\n\n        
mapped_models = {}\n        # map to name if given {name: <class CAT>}\n        if \"models\" in config_dict:\n            for name, model_id in config_dict[\"models\"].items():\n                cat_model = loaded_models.get(model_id)\n                if cat_model is None:\n                    log.warning(f\"No match for model id {model_id} in {self.model_directory}, skipping\")\n                    continue\n                mapped_models[name] = cat_model\n        else:\n            log.warning(\"No model ids configured!\")\n\n        mapped_annotators = {}\n        # {name: <class Annotator>}\n        if \"annotators\" in config_dict:\n            for name, annotator_string in config_dict[\"annotators\"].items():\n                if custom_annotators is not None:\n                    for annotator_class in custom_annotators:\n                        if annotator_class.__name__ == annotator_string:\n                            mapped_annotators[name] = annotator_class\n                            break\n                if name not in mapped_annotators:\n                    try:\n                        annotator_class = getattr(sys.modules[__name__], annotator_string)\n                        mapped_annotators[name] = annotator_class\n                    except AttributeError as e:\n                        log.warning(f\"{annotator_string} not found: {e}\")\n        else:\n            log.warning(\"No annotators configured!\")\n\n        mapped_configs = {}\n        if \"general\" in config_dict:\n            for name, config in config_dict[\"general\"].items():\n                try:\n                    mapped_configs[name] = AnnotatorConfig(**config)\n                except Exception as e:\n                    log.error(f\"Error processing config for '{name}': {str(e)}\")\n        else:\n            log.warning(\"No general settings configured, using default settings.\")\n\n        model_factory_config = {\"models\": mapped_models, \"annotators\": 
mapped_annotators, \"configs\": mapped_configs}\n\n        return ModelFactory(**model_factory_config)\n\n    def add_annotator(self, name: str) -> None:\n        \"\"\"\n        Adds an annotator to the processor.\n\n        Args:\n            name (str): The alias of the annotator to add.\n\n        Returns:\n            None\n\n        Raises:\n            Exception: If there is an error creating the annotator.\n        \"\"\"\n        try:\n            annotator = create_annotator(name, self.model_factory)\n            log.info(\n                f\"Added {type(annotator).__name__} to processor with config {self.model_factory.configs.get(name)}\"\n            )\n        except Exception as e:\n            raise Exception(f\"Error creating annotator: {e}\")\n\n        self.annotators.append(annotator)\n\n    def remove_annotator(self, name: str) -> None:\n        \"\"\"\n        Removes an annotator from the processor.\n\n        Args:\n            name (str): The alias of the annotator to remove.\n\n        Returns:\n            None\n        \"\"\"\n        annotator_found = False\n        annotator_name = self.model_factory.annotators[name]\n\n        for annotator in self.annotators:\n            if type(annotator).__name__ == annotator_name.__name__:\n                self.annotators.remove(annotator)\n                annotator_found = True\n                log.info(f\"Removed {type(annotator).__name__} from processor\")\n                break\n\n        if not annotator_found:\n            log.warning(f\"Annotator {type(name).__name__} not found in processor\")\n\n    def print_model_cards(self) -> None:\n        \"\"\"\n        Prints the model cards for each annotator in the `annotators` list.\n\n        Each model card includes the name of the annotator's class and its category.\n        \"\"\"\n        for annotator in self.annotators:\n            print(f\"{type(annotator).__name__}: {annotator.cat}\")\n\n    def process(self, note: Note, 
record_concepts: Optional[List[Concept]] = None) -> List[Concept]:\n        \"\"\"\n        Process the given note and extract concepts using the loaded annotators.\n\n        Args:\n            note (Note): The note to be processed.\n            record_concepts (Optional[List[Concept]]): A list of existing concepts in the EHR record.\n\n        Returns:\n            A list of extracted concepts.\n\n        \"\"\"\n        if not self.annotators:\n            log.warning(\"No annotators loaded, use .add_annotator() to load annotators\")\n            return []\n\n        concepts: List[Concept] = []\n\n        for annotator in self.annotators:\n            log.debug(f\"Processing concepts with {type(annotator).__name__}\")\n            if Category.MEDICATION in annotator.concept_types:\n                detected_concepts = annotator(note, record_concepts, self.dosage_extractor)\n                concepts.extend(detected_concepts)\n            else:\n                detected_concepts = annotator(note, record_concepts)\n                concepts.extend(detected_concepts)\n\n        return concepts\n\n    def get_concept_dicts(\n        self, note: Note, filter_uncategorized: bool = True, record_concepts: Optional[List[Concept]] = None\n    ) -> List[Dict]:\n        \"\"\"\n        Returns concepts in dictionary format.\n\n        Args:\n            note (Note): Note containing text to extract concepts from.\n            filter_uncategorized (bool): If True, does not return concepts where category=None. 
Default is True.\n            record_concepts (Optional[List[Concept]]): List of concepts in existing record.\n\n        Returns:\n            Extracted concepts in JSON-compatible dictionary format.\n        \"\"\"\n        concepts = self.process(note, record_concepts)\n        concept_list = []\n        for concept in concepts:\n            if filter_uncategorized and concept.category is None:\n                continue\n            concept_dict = concept.__dict__\n            if concept.dosage is not None:\n                concept_dict[\"dosage\"] = {\n                    \"dose\": concept.dosage.dose.dict() if concept.dosage.dose else None,\n                    \"duration\": concept.dosage.duration.dict() if concept.dosage.duration else None,\n                    \"frequency\": concept.dosage.frequency.dict() if concept.dosage.frequency else None,\n                    \"route\": concept.dosage.route.dict() if concept.dosage.route else None,\n                }\n            if concept.meta is not None:\n                meta_anns = []\n                for meta in concept.meta:\n                    meta_dict = meta.__dict__\n                    meta_dict[\"value\"] = meta.value.name\n                    meta_anns.append(meta_dict)\n                concept_dict[\"meta\"] = meta_anns\n            if concept.category is not None:\n                concept_dict[\"category\"] = concept.category.name\n            concept_list.append(concept_dict)\n\n        return concept_list\n
"},{"location":"api-reference/noteprocessor/#miade.core.NoteProcessor.add_annotator","title":"add_annotator(name)","text":"

Adds an annotator to the processor.

Parameters:

name (str, required): The alias of the annotator to add.

Returns:

None

Raises:

Exception: If there is an error creating the annotator.

Source code in src/miade/core.py
def add_annotator(self, name: str) -> None:\n    \"\"\"\n    Adds an annotator to the processor.\n\n    Args:\n        name (str): The alias of the annotator to add.\n\n    Returns:\n        None\n\n    Raises:\n        Exception: If there is an error creating the annotator.\n    \"\"\"\n    try:\n        annotator = create_annotator(name, self.model_factory)\n        log.info(\n            f\"Added {type(annotator).__name__} to processor with config {self.model_factory.configs.get(name)}\"\n        )\n    except Exception as e:\n        raise Exception(f\"Error creating annotator: {e}\")\n\n    self.annotators.append(annotator)\n
"},{"location":"api-reference/noteprocessor/#miade.core.NoteProcessor.get_concept_dicts","title":"get_concept_dicts(note, filter_uncategorized=True, record_concepts=None)","text":"

Returns concepts in dictionary format.

Parameters:

note (Note, required): Note containing text to extract concepts from.

filter_uncategorized (bool, default True): If True, does not return concepts where category=None.

record_concepts (Optional[List[Concept]], default None): List of concepts in existing record.

Returns:

List[Dict]: Extracted concepts in JSON-compatible dictionary format.

Source code in src/miade/core.py
def get_concept_dicts(\n    self, note: Note, filter_uncategorized: bool = True, record_concepts: Optional[List[Concept]] = None\n) -> List[Dict]:\n    \"\"\"\n    Returns concepts in dictionary format.\n\n    Args:\n        note (Note): Note containing text to extract concepts from.\n        filter_uncategorized (bool): If True, does not return concepts where category=None. Default is True.\n        record_concepts (Optional[List[Concept]]): List of concepts in existing record.\n\n    Returns:\n        Extracted concepts in JSON-compatible dictionary format.\n    \"\"\"\n    concepts = self.process(note, record_concepts)\n    concept_list = []\n    for concept in concepts:\n        if filter_uncategorized and concept.category is None:\n            continue\n        concept_dict = concept.__dict__\n        if concept.dosage is not None:\n            concept_dict[\"dosage\"] = {\n                \"dose\": concept.dosage.dose.dict() if concept.dosage.dose else None,\n                \"duration\": concept.dosage.duration.dict() if concept.dosage.duration else None,\n                \"frequency\": concept.dosage.frequency.dict() if concept.dosage.frequency else None,\n                \"route\": concept.dosage.route.dict() if concept.dosage.route else None,\n            }\n        if concept.meta is not None:\n            meta_anns = []\n            for meta in concept.meta:\n                meta_dict = meta.__dict__\n                meta_dict[\"value\"] = meta.value.name\n                meta_anns.append(meta_dict)\n            concept_dict[\"meta\"] = meta_anns\n        if concept.category is not None:\n            concept_dict[\"category\"] = concept.category.name\n        concept_list.append(concept_dict)\n\n    return concept_list\n
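The conversion above swaps enum members for their names so that the result can be serialised. A minimal, self-contained sketch of that idea follows. The `Category` and `Concept` types here are simplified stand-ins rather than MiADE's classes, the second concept and its id are made up for illustration, and copying the attribute dictionary is a choice of this sketch, not something the source above does.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Category(Enum):  # stand-in for MiADE's Category enum
    PROBLEM = 1
    MEDICATION = 2


@dataclass
class Concept:  # simplified stand-in, not the real MiADE Concept
    id: str
    name: str
    category: Optional[Category] = None


def to_dicts(concepts: List[Concept], filter_uncategorized: bool = True) -> List[Dict]:
    out = []
    for concept in concepts:
        if filter_uncategorized and concept.category is None:
            continue
        # Shallow copy, so the concept object itself keeps its enum value.
        concept_dict = dict(vars(concept))
        if concept.category is not None:
            concept_dict['category'] = concept.category.name
        out.append(concept_dict)
    return out


concepts = [
    Concept('161443002', 'hypothyroidism (historic)', Category.PROBLEM),
    Concept('000', 'uncategorised example'),  # made-up entry with no category
]
result = to_dicts(concepts)
serialised = json.dumps(result)  # works because no enum members remain
```

Working on a copy means the same concepts can be converted more than once without the earlier call changing what the later call sees.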
"},{"location":"api-reference/noteprocessor/#miade.core.NoteProcessor.print_model_cards","title":"print_model_cards()","text":"

Prints the model cards for each annotator in the annotators list.

Each model card includes the name of the annotator's class and its category.

Source code in src/miade/core.py
def print_model_cards(self) -> None:\n    \"\"\"\n    Prints the model cards for each annotator in the `annotators` list.\n\n    Each model card includes the name of the annotator's class and its category.\n    \"\"\"\n    for annotator in self.annotators:\n        print(f\"{type(annotator).__name__}: {annotator.cat}\")\n
"},{"location":"api-reference/noteprocessor/#miade.core.NoteProcessor.process","title":"process(note, record_concepts=None)","text":"

Process the given note and extract concepts using the loaded annotators.

Parameters:

note (Note, required): The note to be processed.

record_concepts (Optional[List[Concept]], default None): A list of existing concepts in the EHR record.

Returns:

List[Concept]: A list of extracted concepts.

Source code in src/miade/core.py
def process(self, note: Note, record_concepts: Optional[List[Concept]] = None) -> List[Concept]:\n    \"\"\"\n    Process the given note and extract concepts using the loaded annotators.\n\n    Args:\n        note (Note): The note to be processed.\n        record_concepts (Optional[List[Concept]]): A list of existing concepts in the EHR record.\n\n    Returns:\n        A list of extracted concepts.\n\n    \"\"\"\n    if not self.annotators:\n        log.warning(\"No annotators loaded, use .add_annotator() to load annotators\")\n        return []\n\n    concepts: List[Concept] = []\n\n    for annotator in self.annotators:\n        log.debug(f\"Processing concepts with {type(annotator).__name__}\")\n        if Category.MEDICATION in annotator.concept_types:\n            detected_concepts = annotator(note, record_concepts, self.dosage_extractor)\n            concepts.extend(detected_concepts)\n        else:\n            detected_concepts = annotator(note, record_concepts)\n            concepts.extend(detected_concepts)\n\n    return concepts\n
"},{"location":"api-reference/noteprocessor/#miade.core.NoteProcessor.remove_annotator","title":"remove_annotator(name)","text":"

Removes an annotator from the processor.

Parameters:

name (str, required): The alias of the annotator to remove.

Returns:

None

Source code in src/miade/core.py
def remove_annotator(self, name: str) -> None:\n    \"\"\"\n    Removes an annotator from the processor.\n\n    Args:\n        name (str): The alias of the annotator to remove.\n\n    Returns:\n        None\n    \"\"\"\n    annotator_found = False\n    annotator_name = self.model_factory.annotators[name]\n\n    for annotator in self.annotators:\n        if type(annotator).__name__ == annotator_name.__name__:\n            self.annotators.remove(annotator)\n            annotator_found = True\n            log.info(f\"Removed {type(annotator).__name__} from processor\")\n            break\n\n    if not annotator_found:\n        # report the annotator class that was looked up; type(name) would always be str\n        log.warning(f\"Annotator {annotator_name.__name__} not found in processor\")\n
"},{"location":"api-reference/problemsannotator/","title":"ProblemsAnnotator","text":"

Bases: Annotator

Annotator class for identifying and processing problems in medical notes.

This class extends the base Annotator class and provides specific functionality for identifying and processing problems in medical notes. It implements methods for loading problem lookup data, processing meta annotations, filtering concepts, and post-processing the annotated concepts.

Attributes:

cat (CAT): The CAT (Concept Annotation Tool) instance used for annotation.

config (AnnotatorConfig): The configuration object for the annotator.

Properties:

concept_types (list): A list of concept types supported by this annotator.

pipeline (list): The list of processing steps in the annotation pipeline.

Source code in src/miade/annotators.py
class ProblemsAnnotator(Annotator):\n    \"\"\"\n    Annotator class for identifying and processing problems in medical notes.\n\n    This class extends the base `Annotator` class and provides specific functionality\n    for identifying and processing problems in medical notes. It implements methods\n    for loading problem lookup data, processing meta annotations, filtering concepts,\n    and post-processing the annotated concepts.\n\n    Attributes:\n        cat (CAT): The CAT (Concept Annotation Tool) instance used for annotation.\n        config (AnnotatorConfig): The configuration object for the annotator.\n\n    Properties:\n        concept_types (list): A list of concept types supported by this annotator.\n        pipeline (list): The list of processing steps in the annotation pipeline.\n    \"\"\"\n\n    def __init__(self, cat: CAT, config: AnnotatorConfig = None):\n        super().__init__(cat, config)\n        self._load_problems_lookup_data()\n\n    @property\n    def concept_types(self) -> List[Category]:\n        \"\"\"\n        Get the list of concept types supported by this annotator.\n\n        Returns:\n            [Category.PROBLEM]\n        \"\"\"\n        return [Category.PROBLEM]\n\n    @property\n    def pipeline(self) -> List[str]:\n        \"\"\"\n        Get the list of processing steps in the annotation pipeline.\n\n        Returns:\n            [\"preprocessor\", \"medcat\", \"paragrapher\", \"postprocessor\", \"deduplicator\"]\n        \"\"\"\n        return [\"preprocessor\", \"medcat\", \"paragrapher\", \"postprocessor\", \"deduplicator\"]\n\n    def _load_problems_lookup_data(self) -> None:\n        \"\"\"\n        Load the problem lookup data.\n\n        Raises:\n            RuntimeError: If the lookup data directory does not exist.\n        \"\"\"\n        if not os.path.isdir(self.config.lookup_data_path):\n            raise RuntimeError(f\"No lookup data configured: {self.config.lookup_data_path} does not exist!\")\n        
else:\n            self.negated_lookup = load_lookup_data(self.config.lookup_data_path + \"negated.csv\", as_dict=True)\n            self.historic_lookup = load_lookup_data(self.config.lookup_data_path + \"historic.csv\", as_dict=True)\n            self.suspected_lookup = load_lookup_data(self.config.lookup_data_path + \"suspected.csv\", as_dict=True)\n            self.filtering_blacklist = load_lookup_data(\n                self.config.lookup_data_path + \"problem_blacklist.csv\", no_header=True\n            )\n\n    def _process_meta_annotations(self, concept: Concept) -> Optional[Concept]:\n        \"\"\"\n        Process the meta annotations for a concept.\n\n        Args:\n            concept (Concept): The concept to process.\n\n        Returns:\n           The processed concept, or None if it should be removed.\n\n        Raises:\n            ValueError: If the concept has an invalid negex value.\n        \"\"\"\n        # Add, convert, or ignore concepts\n        meta_ann_values = [meta_ann.value for meta_ann in concept.meta] if concept.meta is not None else []\n\n        convert = False\n        tag = \"\"\n        # only get meta model results if negex is false\n        if concept.negex is not None:\n            if concept.negex:\n                convert = self.negated_lookup.get(int(concept.id), False)\n                tag = \" (negated)\"\n            elif Presence.SUSPECTED in meta_ann_values:\n                convert = self.suspected_lookup.get(int(concept.id), False)\n                tag = \" (suspected)\"\n            elif Relevance.HISTORIC in meta_ann_values:\n                convert = self.historic_lookup.get(int(concept.id), False)\n                tag = \" (historic)\"\n        else:\n            if Presence.NEGATED in meta_ann_values:\n                convert = self.negated_lookup.get(int(concept.id), False)\n                tag = \" (negated)\"\n            elif Presence.SUSPECTED in meta_ann_values:\n                convert = 
self.suspected_lookup.get(int(concept.id), False)\n                tag = \" (suspected)\"\n            elif Relevance.HISTORIC in meta_ann_values:\n                convert = self.historic_lookup.get(int(concept.id), False)\n                tag = \" (historic)\"\n\n        if convert:\n            if tag == \" (negated)\" and concept.negex:\n                log.debug(\n                    f\"Converted concept ({concept.id} | {concept.name}) to ({str(convert)} | {concept.name + tag}): \"\n                    f\"negation detected by negex\"\n                )\n            else:\n                log.debug(\n                    f\"Converted concept ({concept.id} | {concept.name}) to ({str(convert)} | {concept.name + tag}):\"\n                    f\"detected by meta model\"\n                )\n            concept.id = str(convert)\n            concept.name += tag\n        else:\n            if concept.negex:\n                log.debug(f\"Removed concept ({concept.id} | {concept.name}): negation (negex) with no conversion match\")\n                return None\n            if concept.negex is None and Presence.NEGATED in meta_ann_values:\n                log.debug(\n                    f\"Removed concept ({concept.id} | {concept.name}): negation (meta model) with no conversion match\"\n                )\n                return None\n            if Presence.SUSPECTED in meta_ann_values:\n                log.debug(f\"Removed concept ({concept.id} | {concept.name}): suspected with no conversion match\")\n                return None\n            if Relevance.IRRELEVANT in meta_ann_values:\n                log.debug(f\"Removed concept ({concept.id} | {concept.name}): irrelevant concept\")\n                return None\n            if Relevance.HISTORIC in meta_ann_values:\n                log.debug(f\"No change to concept ({concept.id} | {concept.name}): historic with no conversion match\")\n\n        concept.category = Category.PROBLEM\n\n        return concept\n\n    def 
_is_blacklist(self, concept):\n        \"\"\"\n        Check if a concept is in the filtering blacklist.\n\n        Args:\n            concept: The concept to check.\n\n        Returns:\n            True if the concept is in the blacklist, False otherwise.\n        \"\"\"\n        # filtering blacklist\n        if int(concept.id) in self.filtering_blacklist.values:\n            log.debug(f\"Removed concept ({concept.id} | {concept.name}): concept in problems blacklist\")\n            return True\n        return False\n\n    def _process_meta_ann_by_paragraph(\n        self, concept: Concept, paragraph: Paragraph, prob_concepts_in_structured_sections: List[Concept]\n    ):\n        \"\"\"\n        Process the meta annotations for a concept based on the paragraph type.\n\n        Args:\n            concept (Concept): The concept to process.\n            paragraph (Paragraph): The paragraph containing the concept.\n            prob_concepts_in_structured_sections (List[Concept]): The list of problem concepts in structured sections.\n        \"\"\"\n        # if paragraph is structured problems section, add to prob list and convert to corresponding relevance\n        if paragraph.type in self.structured_prob_lists:\n            prob_concepts_in_structured_sections.append(concept)\n            for meta in concept.meta:\n                if meta.name == \"relevance\" and meta.value == Relevance.IRRELEVANT:\n                    new_relevance = self.structured_prob_lists[paragraph.type]\n                    log.debug(\n                        f\"Converted {meta.value} to \"\n                        f\"{new_relevance} for concept ({concept.id} | {concept.name}): \"\n                        f\"paragraph is {paragraph.type}\"\n                    )\n                    meta.value = new_relevance\n        # if paragraph is meds or irrelevant section, convert problems to irrelevant\n        elif paragraph.type in self.structured_med_lists or paragraph.type in 
self.irrelevant_paragraphs:\n            for meta in concept.meta:\n                if meta.name == \"relevance\" and meta.value != Relevance.IRRELEVANT:\n                    log.debug(\n                        f\"Converted {meta.value} to \"\n                        f\"{Relevance.IRRELEVANT} for concept ({concept.id} | {concept.name}): \"\n                        f\"paragraph is {paragraph.type}\"\n                    )\n                    meta.value = Relevance.IRRELEVANT\n\n    def process_paragraphs(self, note: Note, concepts: List[Concept]) -> List[Concept]:\n        \"\"\"\n        Process the paragraphs in a note and filter the concepts.\n\n        Args:\n            note (Note): The note to process.\n            concepts (List[Concept]): The list of concepts to filter.\n\n        Returns:\n            The filtered list of concepts.\n        \"\"\"\n        prob_concepts_in_structured_sections: List[Concept] = []\n\n        for paragraph in note.paragraphs:\n            for concept in concepts:\n                if concept.start >= paragraph.start and concept.end <= paragraph.end:\n                    # log.debug(f\"({concept.name} | {concept.id}) is in {paragraph.type}\")\n                    if concept.meta:\n                        self._process_meta_ann_by_paragraph(concept, paragraph, prob_concepts_in_structured_sections)\n\n        # if more than set no. 
concepts in prob or imp or pmh sections, return only those and ignore all other concepts\n        if len(prob_concepts_in_structured_sections) > self.config.structured_list_limit:\n            log.debug(\n                f\"Ignoring concepts elsewhere in the document because \"\n                f\"more than {self.config.structured_list_limit} concepts exist \"\n                f\"in prob, imp, pmh structured sections: {len(prob_concepts_in_structured_sections)}\"\n            )\n            return prob_concepts_in_structured_sections\n\n        return concepts\n\n    def postprocess(self, concepts: List[Concept]) -> List[Concept]:\n        \"\"\"\n        Post-process the concepts and filter out irrelevant concepts.\n\n        Args:\n            concepts (List[Concept]): The list of concepts to post-process.\n\n        Returns:\n            The filtered list of concepts.\n        \"\"\"\n        # deepcopy so we still have reference to original list of concepts\n        all_concepts = deepcopy(concepts)\n        filtered_concepts = []\n        for concept in all_concepts:\n            if self._is_blacklist(concept):\n                continue\n            # meta annotations\n            concept = self._process_meta_annotations(concept)\n            # ignore concepts filtered by meta-annotations\n            if concept is None:\n                continue\n            filtered_concepts.append(concept)\n\n        return filtered_concepts\n
"},{"location":"api-reference/problemsannotator/#miade.annotators.ProblemsAnnotator.concept_types","title":"concept_types: List[Category] property","text":"

Get the list of concept types supported by this annotator.

Returns:

List[Category]: [Category.PROBLEM]

"},{"location":"api-reference/problemsannotator/#miade.annotators.ProblemsAnnotator.pipeline","title":"pipeline: List[str] property","text":"

Get the list of processing steps in the annotation pipeline.

Returns:

List[str]: [\"preprocessor\", \"medcat\", \"paragrapher\", \"postprocessor\", \"deduplicator\"]

"},{"location":"api-reference/problemsannotator/#miade.annotators.ProblemsAnnotator.postprocess","title":"postprocess(concepts)","text":"

Post-process the concepts and filter out irrelevant concepts.

Parameters:

concepts (List[Concept], required): The list of concepts to post-process.

Returns:

List[Concept]: The filtered list of concepts.

Source code in src/miade/annotators.py
def postprocess(self, concepts: List[Concept]) -> List[Concept]:\n    \"\"\"\n    Post-process the concepts and filter out irrelevant concepts.\n\n    Args:\n        concepts (List[Concept]): The list of concepts to post-process.\n\n    Returns:\n        The filtered list of concepts.\n    \"\"\"\n    # deepcopy so we still have reference to original list of concepts\n    all_concepts = deepcopy(concepts)\n    filtered_concepts = []\n    for concept in all_concepts:\n        if self._is_blacklist(concept):\n            continue\n        # meta annotations\n        concept = self._process_meta_annotations(concept)\n        # ignore concepts filtered by meta-annotations\n        if concept is None:\n            continue\n        filtered_concepts.append(concept)\n\n    return filtered_concepts\n
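The filtering above follows a convert-or-drop rule: blacklisted concepts are skipped, and a negated concept is kept only if a lookup table supplies a replacement code. The sketch below illustrates that rule in isolation. The `Concept` stand-in, the single `negated` flag, and the placeholder ids (not real SNOMED CT codes) are all assumptions made for the example; the real method also handles suspected, historic and irrelevant meta-annotations.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Set


@dataclass
class Concept:  # simplified stand-in, not the real MiADE Concept
    id: str
    name: str
    negated: bool = False


def postprocess(concepts: List[Concept], negated_lookup: Dict[int, int],
                blacklist: Set[int]) -> List[Concept]:
    kept = []
    for concept in concepts:
        if int(concept.id) in blacklist:
            continue  # blacklisted concepts are never returned
        if concept.negated:
            converted = negated_lookup.get(int(concept.id))
            if converted is None:
                continue  # negated with no conversion match: drop
            # Build a converted copy so the input list stays unchanged.
            concept = replace(concept, id=str(converted), name=concept.name + ' (negated)')
        kept.append(concept)
    return kept


# Placeholder ids for illustration only.
concepts = [
    Concept('1', 'fever', negated=True),
    Concept('2', 'cough', negated=True),
    Concept('3', 'asthma'),
    Concept('4', 'family history'),
]
result = postprocess(concepts, negated_lookup={1: 10}, blacklist={4})
```

Here `fever` is converted to the negated code, `cough` is dropped because no conversion exists, `asthma` passes through, and `family history` is removed by the blacklist.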
"},{"location":"api-reference/problemsannotator/#miade.annotators.ProblemsAnnotator.process_paragraphs","title":"process_paragraphs(note, concepts)","text":"

Process the paragraphs in a note and filter the concepts.

Parameters:

note (Note, required): The note to process.

concepts (List[Concept], required): The list of concepts to filter.

Returns:

List[Concept]: The filtered list of concepts.

Source code in src/miade/annotators.py
def process_paragraphs(self, note: Note, concepts: List[Concept]) -> List[Concept]:\n    \"\"\"\n    Process the paragraphs in a note and filter the concepts.\n\n    Args:\n        note (Note): The note to process.\n        concepts (List[Concept]): The list of concepts to filter.\n\n    Returns:\n        The filtered list of concepts.\n    \"\"\"\n    prob_concepts_in_structured_sections: List[Concept] = []\n\n    for paragraph in note.paragraphs:\n        for concept in concepts:\n            if concept.start >= paragraph.start and concept.end <= paragraph.end:\n                # log.debug(f\"({concept.name} | {concept.id}) is in {paragraph.type}\")\n                if concept.meta:\n                    self._process_meta_ann_by_paragraph(concept, paragraph, prob_concepts_in_structured_sections)\n\n    # if more than set no. concepts in prob or imp or pmh sections, return only those and ignore all other concepts\n    if len(prob_concepts_in_structured_sections) > self.config.structured_list_limit:\n        log.debug(\n            f\"Ignoring concepts elsewhere in the document because \"\n            f\"more than {self.config.structured_list_limit} concepts exist \"\n            f\"in prob, imp, pmh structured sections: {len(prob_concepts_in_structured_sections)}\"\n        )\n        return prob_concepts_in_structured_sections\n\n    return concepts\n
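Two ideas carry this method: a concept belongs to a paragraph when its character span lies fully inside the paragraph's span, and once a structured section holds more concepts than the configured limit, only those concepts are returned. The sketch below shows both with a single hypothetical `Span` type standing in for concepts and paragraphs. It leaves out the meta-annotation rewriting, and the offsets are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Span:
    # Character offsets into the note text; stands in for concepts and paragraphs.
    start: int
    end: int
    label: str


def select_concepts(concepts: List[Span], paragraphs: List[Span],
                    structured_types: Set[str], limit: int) -> List[Span]:
    in_structured = []
    for paragraph in paragraphs:
        if paragraph.label not in structured_types:
            continue
        for concept in concepts:
            # Containment test: the concept span must sit inside the paragraph span.
            if concept.start >= paragraph.start and concept.end <= paragraph.end:
                in_structured.append(concept)
    # Above the limit, keep only the structured-section concepts.
    if len(in_structured) > limit:
        return in_structured
    return concepts


paragraphs = [Span(0, 50, 'prose'), Span(51, 120, 'pmh')]
concepts = [
    Span(5, 18, 'heart failure'),
    Span(60, 74, 'hypothyroidism'),
    Span(80, 82, 'MI'),
]

only_structured = select_concepts(concepts, paragraphs, {'pmh'}, limit=0)
everything = select_concepts(concepts, paragraphs, {'pmh'}, limit=2)
```

With a limit of 0, any concept in a structured section is enough to suppress concepts found in prose; with a limit of 2, the two structured concepts do not exceed the threshold and the full list is returned.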
"},{"location":"user-guide/configuration/","title":"Configurations","text":""},{"location":"user-guide/configuration/#annotator","title":"Annotator","text":"

The MiADE processor is configured by a YAML file that maps a human-readable alias for each of your models to a MedCAT model ID and a MiADE annotator class. The config file must be in the same folder as the MedCAT models.

config.yaml
models:\n  problems: f25ec9423958e8d6\n  meds/allergies: a146c741501cf1f7\nannotators:\n  problems: ProblemsAnnotator\n  meds/allergies: MedsAllergiesAnnotator\ngeneral:\n  problems:\n    lookup_data_path: ./lookup_data/\n    negation_detection: None\n    structured_list_limit: 0  # if a structured section has more than this number of concepts, ignore concepts in prose\n    disable: []\n    add_numbering: True\n  meds/allergies:\n    lookup_data_path: ./lookup_data/\n    negation_detection: None\n    disable: []\n    add_numbering: False\n
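The three sections of the config line up by alias. The sketch below shows that lookup on an already-parsed config, written as a Python dictionary so that no YAML library is needed. The `resolve` helper and its return keys are hypothetical, not part of MiADE, and only a subset of the settings is reproduced.

```python
from typing import Any, Dict

# The config above, as it might look once parsed (subset of settings).
config: Dict[str, Dict[str, Any]] = {
    'models': {
        'problems': 'f25ec9423958e8d6',
        'meds/allergies': 'a146c741501cf1f7',
    },
    'annotators': {
        'problems': 'ProblemsAnnotator',
        'meds/allergies': 'MedsAllergiesAnnotator',
    },
    'general': {
        'problems': {'lookup_data_path': './lookup_data/', 'structured_list_limit': 0},
        'meds/allergies': {'lookup_data_path': './lookup_data/'},
    },
}


def resolve(config: Dict[str, Dict[str, Any]], alias: str) -> Dict[str, Any]:
    # An alias is usable only if it names both a model and an annotator.
    if alias not in config['models'] or alias not in config['annotators']:
        raise KeyError(f'alias {alias!r} must appear under both models and annotators')
    return {
        'model_id': config['models'][alias],
        'annotator': config['annotators'][alias],
        # Settings are optional in this sketch; a missing entry gives an empty dict.
        'settings': config.get('general', {}).get(alias, {}),
    }


problems = resolve(config, 'problems')
```

Keeping the alias as the only join key means a model can be swapped by changing one ID without touching the annotator mapping.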
"},{"location":"user-guide/configuration/#lookup-table","title":"Lookup Table","text":"

Lookup tables are not packaged with the main MiADE package by default, so that the postprocessing steps can be customised. We provide example lookup data in miade-datasets, which you can download and use.

git clone https://github.com/uclh-criu/miade-datasets.git\n
"},{"location":"user-guide/quickstart/","title":"Quickstart","text":""},{"location":"user-guide/quickstart/#extract-concepts-and-dosages-from-a-note-using-miade","title":"Extract concepts and dosages from a Note using MiADE","text":""},{"location":"user-guide/quickstart/#configuring-the-miade-processor","title":"Configuring the MiADE Processor","text":"

NoteProcessor is the MiADE core. It is initialised with the path to a model directory that contains the MedCAT model pack .zip files we would like to use in our pipeline, together with a config file. The config file maps an alias to each model ID and to the annotator we would like to use with it. Model IDs can be found in the MedCAT model_cards and are usually also part of the model pack name:

config.yaml

models:\n  problems: f25ec9423958e8d6\n  meds/allergies: a146c741501cf1f7\nannotators:\n  problems: ProblemsAnnotator\n  meds/allergies: MedsAllergiesAnnotator\n
We can initialise a MiADE NoteProcessor object by passing in the model directory which contains our MedCAT models and config.yaml file:

miade = NoteProcessor(Path(\"path/to/model/dir\"))\n
Once NoteProcessor is initialised, we can add annotators by the aliases we have specified in config.yaml to our processor:

miade.add_annotator(\"problems\")\nmiade.add_annotator(\"meds/allergies\")\n

add_annotator takes only the alias; negation detection is configured per annotator through the negation_detection setting in config.yaml. Annotators have the option to add NegSpacy to the MedCAT spaCy pipeline, which implements the NegEx algorithm (Chapman et al. 2001) for negation detection. This allows the models to perform simple rule-based negation detection in the absence of MetaCAT models.

"},{"location":"user-guide/quickstart/#creating-a-note","title":"Creating a Note","text":"

Create a Note object which contains the text we would like to extract concepts and dosages from:

text = \"\"\"\nSuspected heart failure\n\nPMH:\nprev history of Hypothyroidism\nMI 10 years ago\n\n\nCurrent meds:\nLosartan 100mg daily\nAtorvastatin 20mg daily\nParacetamol 500mg tablets 2 tabs qds prn\n\nAllergies:\nPenicillin - rash\n\nReferred with swollen ankles and shortness of breath since 2 weeks.\n\"\"\"\n\nnote = Note(text)\n
"},{"location":"user-guide/quickstart/#extracting-concepts-and-dosages","title":"Extracting Concepts and Dosages","text":"

MiADE currently extracts concepts in SNOMED CT. Each concept contains a name, id, category, start, end, dosage, negex and meta field, as shown in the example output.

The dosages associated with medication concepts are extracted by the built-in MiADE DosageExtractor, using a combination of the NER model Med7 and the CALIBER rule-based drug dose lookup algorithm. It returns the dose, duration, frequency and route where they are found. The output format is directly translatable to HL7 CDA and can also easily be converted to FHIR.

Putting it all together, we can now extract concepts from our Note object:

As Concept objects, then as dictionaries:
concepts = miade.process(note)\nfor concept in concepts:\n    print(concept)\n\n# {name: breaking out - eruption, id: 271807003, category: Category.REACTION, start: 204, end: 208, dosage: None, negex: False, meta: None} \n# {name: penicillin, id: 764146007, category: Category.ALLERGY, start: 191, end: 201, dosage: None, negex: False, meta: None} \n
concepts = miade.get_concept_dicts(note)\nprint(concepts)\n\n# [{'name': 'hypothyroidism (historic)',\n# 'id': '161443002',\n# 'category': 'PROBLEM',\n# 'start': 46,\n# 'end': 60,\n# 'dosage': None,\n# 'negex': False,\n# 'meta': [{'name': 'relevance',\n#           'value': 'HISTORIC',\n#           'confidence': 0.999841570854187},\n# ...\n
"},{"location":"user-guide/quickstart/#handling-existing-records-deduplication","title":"Handling existing records: deduplication","text":"

MiADE is built to handle existing records from EHR systems that are sent alongside the note. It performs basic deduplication by matching extracted concepts against the existing record concepts on id.

# create list of concepts that already exist in the patient record\nrecord_concepts = [\n    Concept(id=\"161443002\", name=\"hypothyroidism (historic)\", category=Category.PROBLEM),\n    Concept(id=\"267039000\", name=\"swollen ankle\", category=Category.PROBLEM)\n]\n

We can pass in a list of existing concepts from the EHR to MiADE at runtime:

miade.process(note=note, record_concepts=record_concepts)\n
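Matching on id can be pictured as a set-membership filter. The sketch below is an illustration under that assumption, with a simplified stand-in `Concept` class; MiADE's own deduplicator may apply further rules that are not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Concept:  # simplified stand-in, not the real MiADE Concept
    id: str
    name: str


def deduplicate(extracted: List[Concept], record: List[Concept]) -> List[Concept]:
    # Collect the ids already present in the patient record.
    record_ids = {concept.id for concept in record}
    # Keep only extracted concepts whose id is not in the record.
    return [concept for concept in extracted if concept.id not in record_ids]


record_concepts = [
    Concept('161443002', 'hypothyroidism (historic)'),
    Concept('267039000', 'swollen ankle'),
]
extracted = [
    Concept('161443002', 'hypothyroidism (historic)'),
    Concept('764146007', 'penicillin'),
]
new_concepts = deduplicate(extracted, record_concepts)
```

Only the penicillin concept survives, because the hypothyroidism id is already in the record.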
"},{"location":"user-guide/quickstart/#customising-miade","title":"Customising MiADE","text":""},{"location":"user-guide/quickstart/#training-custom-medcat-models","title":"Training Custom MedCAT Models","text":"

MiADE provides command line interface scripts for automatically building MedCAT model packs, running the unsupervised and supervised training steps, and creating and training MetaCAT models. For more information on MedCAT models, see the MedCAT documentation and paper.

The --synthetic-data-path option allows you to add synthetically generated training data in CSV format to the supervised and MetaCAT training steps. The CSV should have the following format:

text,cui,name,start,end,relevance,presence,laterality
no history of liver failure,59927004,hepatic failure,14,26,historic,negated,none
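A quick way to sanity-check a synthetic data file before training is to load it and confirm the columns and types. The sketch below reads the example row with the standard library `csv` module. The in-memory sample and the integer conversion are assumptions of this sketch, and it does not reproduce how MiADE itself reads the file.

```python
import csv
import io

# The example row from above, held in memory instead of a file.
sample = '''text,cui,name,start,end,relevance,presence,laterality
no history of liver failure,59927004,hepatic failure,14,26,historic,negated,none
'''

expected_columns = ['text', 'cui', 'name', 'start', 'end',
                    'relevance', 'presence', 'laterality']

reader = csv.DictReader(io.StringIO(sample))
columns = reader.fieldnames

rows = []
for row in reader:
    # Offsets arrive as strings; convert them so they can be used as indices.
    row['start'] = int(row['start'])
    row['end'] = int(row['end'])
    rows.append(row)

columns_ok = columns == expected_columns
```

The same loop works on a real file by replacing the `io.StringIO` object with an opened file handle.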

# Trains unsupervised training step of MedCAT model\nmiade train $MODEL_PACK_PATH $TEXT_DATA_PATH --tag \"miade-example\"\n
# Trains supervised training step of MedCAT model\nmiade train-supervised $MODEL_PACK_PATH $MEDCAT_JSON_EXPORT --synthetic-data-path $SYNTHETIC_CSV_PATH\n
# Creates BBPE tokenizer for MetaCAT\nmiade create-bbpe-tokenizer $TEXT_DATA_PATH\n
# Initialises MetaCAT models to do training on\nmiade create-metacats $TOKENIZER_PATH $CATEGORY_NAMES\n
# Trains the MetaCAT Bi-LSTM models\nmiade train-metacats $METACAT_MODEL_PATH $MEDCAT_JSON_EXPORT --synthetic-data-path $SYNTHETIC_CSV_PATH\n
# Packages MetaCAT models with the main MedCAT model pack\nmiade add_metacat_models $MODEL_PACK_PATH $METACAT_MODEL_PATH\n

"},{"location":"user-guide/quickstart/#creating-custom-miade-annotators","title":"Creating Custom MiADE Annotators","text":"

We can add custom annotators with more specialised postprocessing steps to MiADE by subclassing Annotator and initialising NoteProcessor with a list of custom annotators.

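Annotator methods include get_concepts, postprocess, add_dosages_to_concepts, deduplicate and __call__, as used in the example below.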

An example custom Annotator class might look like this:

class CustomAnnotator(Annotator):\n    def __init__(self, cat: MiADE_CAT):\n        super().__init__(cat)\n        # we need to include MEDICATIONS in concept types so MiADE processor will also extract dosages\n        self.concept_types = [Category.MEDICATION, Category.ALLERGY]\n\n    def postprocess(self, concepts: List[Concept]) -> List[Concept]:\n        # some example post-processing code\n        reactions = [\"271807003\"]\n        allergens = [\"764146007\"]\n        for concept in concepts:\n            if concept.id in reactions:\n                concept.category = Category.REACTION\n            elif concept.id in allergens:\n                concept.category = Category.ALLERGY\n        return concepts\n\n    def __call__(\n        self,\n        note: Note,\n        record_concepts: Optional[List[Concept]] = None,\n        dosage_extractor: Optional[DosageExtractor] = None,\n    ):\n        concepts = self.get_concepts(note)\n        concepts = self.postprocess(concepts)\n        # run dosage extractor if given\n        if dosage_extractor is not None:\n            concepts = self.add_dosages_to_concepts(dosage_extractor, concepts, note)\n        concepts = self.deduplicate(concepts, record_concepts)\n\n        return concepts\n

Add custom annotator to config file:

config.yaml
models:\n  problems: f25ec9423958e8d6\n  meds/allergies: a146c741501cf1f7\n  custom: a146c741501cf1f7\nannotators:\n  problems: ProblemsAnnotator\n  meds/allergies: MedsAllergiesAnnotator\n  custom: CustomAnnotator\n

Initialise MiADE with the custom annotator:

miade = NoteProcessor(Path(MODEL_DIR), custom_annotators=[CustomAnnotator])\nmiade.add_annotator(\"custom\")\n
"}]} \ No newline at end of file