Adding CS Linelist MTCT for assessing PMTCT gaps #549

Open
wants to merge 7 commits into
base: dev

Conversation

@Marymary-dev Marymary-dev commented Jan 27, 2025

Summary by Sourcery

New Features:

  • Added a new table CsLinelistMTCT to the HIVCaseSurveillance schema, which contains data for assessing PMTCT gaps.


sourcery-ai bot commented Jan 27, 2025

Reviewer's Guide by Sourcery

This pull request introduces a new SQL script to generate a linelist for MTCT (Mother-to-Child Transmission) analysis. The script extracts data from various tables in the NDWH database, including viral load, HEI (HIV-exposed infants), and PBFW (Pregnant and Breastfeeding Women) data, to create a comprehensive view of MTCT-related indicators. The script also identifies key populations and those lost to follow-up.

ER diagram for MTCT Linelist Data Sources

erDiagram
    FactViralLoad_Historical ||--o{ CsLinelistMTCT : provides_vl_data
    FactHEI ||--o{ CsLinelistMTCT : provides_hei_data
    FactPBFW ||--o{ CsLinelistMTCT : provides_pbfw_data
    FactARTHistory ||--o{ CsLinelistMTCT : provides_art_history

    FactViralLoad_Historical {
        int PatientKey
        string PatientPKHash
        date OrderedbyDate
        int FacilityKey
        string TestResult
        boolean IsPBFW
    }

    FactHEI {
        int PatientKey
        date DOB
        string MothersPatientPkHash
        boolean Paired
        boolean MotherOnART
        boolean OnProhylaxis
        boolean InfectedAt24mnths
    }

    FactPBFW {
        int PatientKey
        string PatientPKHash
        date AncDate1
        date AncDate2
        date AncDate3
        date AncDate4
    }

    CsLinelistMTCT {
        int PatientKey
        date DOB
        string MFLCode
        string MothersPatientPkHash
        boolean Paired
        boolean MotherOnART
        boolean InfantGivenProphylaxis
        boolean PositiveInfants
        boolean MotherUnsuppressesDuringPBF
        date FirstVisitDuringPBFW
        boolean IITEpisode
    }

Flow diagram for MTCT Linelist Generation Process

flowchart TD
    A[Start] --> B[Get Latest Viral Loads for PBFW]
    B --> C[Get HEI Information]
    C --> D[Get PBFW First Visit Dates]
    D --> E[Get Loss to Follow-up Information]
    E --> F[Combine All Data]
    F --> G[Create Final MTCT Linelist]
    G --> H[End]

    subgraph Data Sources
    VL[FactViralLoad_Historical]
    HEI[FactHEI]
    PBFW[FactPBFW]
    ART[FactARTHistory]
    end

    VL --> B
    HEI --> C
    PBFW --> D
    ART --> E

File-Level Changes

Change Details Files
Creation of the CsLinelistMTCT table.
  • The script creates a new table named CsLinelistMTCT in the HIVCaseSurveillance database.
  • The table is populated with data from multiple CTEs (Common Table Expressions) and joins.
  • The script includes logic to identify mothers not on ART, infants not given prophylaxis, and mother-infant pairs.
  • The script includes logic to identify mothers with unsuppressed viral load during pregnancy and breastfeeding.
  • The script includes logic to identify infants who are infected at 24 months.
  • The script includes logic to identify infants with an IIT (interruption in treatment) episode.
Scripts/REPORTING/HIVCaseSurveillance/load_cs_LinelistMTCT.sql
Extraction of viral load data for PBFW.
  • A CTE named Viralloads is created to extract viral load data for patients who are pregnant or breastfeeding (IsPBFW=1).
  • The CTE filters for viral load results greater than or equal to 200.
  • The CTE uses a window function to rank viral load results by date for each patient.
  • A CTE named Viralloads_LatestRecord is created to get the latest viral load record for each patient.
Scripts/REPORTING/HIVCaseSurveillance/load_cs_LinelistMTCT.sql
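The two viral load CTEs described above can be sketched roughly as follows. This is a minimal illustration, not the PR's exact code: the selected columns, the `TestResultNumeric` and `RowNum` aliases, and the descending sort used to pick the latest record are assumptions; the table name, the `IsPBFW` filter and the `TRY_CAST` threshold come from the diff under review.

```sql
-- Sketch only: column list and aliases are assumptions, not the PR's exact code.
WITH Viralloads AS (
    SELECT
        vlhist.PatientKey,
        vlhist.OrderedbyDate,
        TRY_CAST(REPLACE(vlhist.TestResult, ',', '') AS FLOAT) AS TestResultNumeric,
        -- Rank each patient's qualifying results, newest first (assumed direction).
        ROW_NUMBER() OVER (
            PARTITION BY vlhist.PatientKey
            ORDER BY vlhist.OrderedbyDate DESC
        ) AS RowNum
    FROM NDWH.Fact.FactViralLoad_Historical AS vlhist
    WHERE vlhist.IsPBFW = 1
      AND TRY_CAST(REPLACE(vlhist.TestResult, ',', '') AS FLOAT) >= 200.00
),
Viralloads_LatestRecord AS (
    -- Keep only the top-ranked row per patient.
    SELECT PatientKey, OrderedbyDate, TestResultNumeric
    FROM Viralloads
    WHERE RowNum = 1
)
SELECT PatientKey, OrderedbyDate, TestResultNumeric
FROM Viralloads_LatestRecord;
```

Because the threshold filter runs before the ranking, "latest" in this sketch means the latest unsuppressed result for a patient, not the latest result overall. Non-numeric results make `TRY_CAST` return NULL, so those rows drop out of the comparison.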
Extraction of HEI data.
  • A CTE named HEIs is created to extract data from the FactHEI table.
  • The CTE joins with other tables to get facility, agency, partner, and age group information.
  • The CTE calculates the cohort year and month based on the patient's date of birth.
  • The CTE identifies if the mother had an unsuppressed viral load during pregnancy and breastfeeding.
Scripts/REPORTING/HIVCaseSurveillance/load_cs_LinelistMTCT.sql
Extraction of PBFW start date.
  • A CTE named PBFW_StartDate is created to extract the first visit date during pregnancy and breastfeeding.
  • The CTE uses a window function to rank the visit dates for each patient.
  • The CTE coalesces the different ANC visit dates to get the first visit date.
Scripts/REPORTING/HIVCaseSurveillance/load_cs_LinelistMTCT.sql
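A rough sketch of the PBFW start date step, assuming `FactPBFW` lives in `NDWH.Fact` like the viral load table and that the four `AncDate` columns from the ER diagram are the ones being coalesced. The `PBFW_Ranked` CTE and the `FirstAncDate` and `RowNum` aliases are hypothetical names introduced for illustration.

```sql
-- Sketch only: schema location and helper names are assumptions.
WITH PBFW_Ranked AS (
    SELECT
        pbfw.PatientKey,
        anc.FirstAncDate AS FirstVisitDuringPBFW,
        -- Rank each patient's rows by the coalesced date, earliest first.
        ROW_NUMBER() OVER (
            PARTITION BY pbfw.PatientKey
            ORDER BY anc.FirstAncDate ASC
        ) AS RowNum
    FROM NDWH.Fact.FactPBFW AS pbfw
    CROSS APPLY (
        -- First non-NULL ANC date in column order.
        SELECT COALESCE(pbfw.AncDate1, pbfw.AncDate2, pbfw.AncDate3, pbfw.AncDate4) AS FirstAncDate
    ) AS anc
    -- SQL Server sorts NULLs first in ascending order, so exclude them before ranking.
    WHERE anc.FirstAncDate IS NOT NULL
),
PBFW_StartDate AS (
    SELECT PatientKey, FirstVisitDuringPBFW
    FROM PBFW_Ranked
    WHERE RowNum = 1
)
SELECT PatientKey, FirstVisitDuringPBFW
FROM PBFW_StartDate;
```

`COALESCE` returns the first non-NULL column, which is the earliest date only if `AncDate1` through `AncDate4` are recorded in chronological order.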
Identification of IIT episodes.
  • A CTE named IIT is created to identify patients with an interruption in treatment.
  • The CTE filters for patients with ART outcomes of 'UNDOCUMENTED LOSS' or 'LOSS TO FOLLOW UP'.
  • The CTE restricts matches to records whose AsOfDate falls in a range bounded by FirstVisitDuringPBFW, the first visit date during pregnancy and breastfeeding.
Scripts/REPORTING/HIVCaseSurveillance/load_cs_LinelistMTCT.sql

@sourcery-ai sourcery-ai bot left a comment

Hey @Marymary-dev - I've reviewed your changes - here's some feedback:

Overall Comments:

  • The date range comparison in the IIT CTE appears incorrect - it's comparing AsOfDate to the same date (FirstVisitDuringPBFW) for both start and end of range, which likely isn't the intended behavior.
  • Consider adding more robust error handling beyond the simple DROP TABLE check, such as TRY/CATCH blocks and validation of critical data points, given this deals with sensitive healthcare data.
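The TRY/CATCH suggestion could look something like the sketch below. It assumes the target is addressed as `HIVCaseSurveillance.dbo.CsLinelistMTCT` (the `dbo` schema is a guess), uses a placeholder `SELECT` from `FactHEI` in place of the real linelist query, and treats an empty result as the only validation rule; the error number and message are hypothetical.

```sql
-- Sketch only: target name, placeholder SELECT and validation rule are assumptions.
BEGIN TRY
    BEGIN TRANSACTION;

    DROP TABLE IF EXISTS HIVCaseSurveillance.dbo.CsLinelistMTCT;

    -- Placeholder for the real linelist query.
    SELECT hei.PatientKey, hei.DOB
    INTO HIVCaseSurveillance.dbo.CsLinelistMTCT
    FROM NDWH.Fact.FactHEI AS hei;

    -- Minimal validation: fail the load if nothing was written.
    IF NOT EXISTS (SELECT 1 FROM HIVCaseSurveillance.dbo.CsLinelistMTCT)
    BEGIN
        THROW 50001, 'CsLinelistMTCT load produced no rows.', 1;
    END;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo the drop and the partial load, then surface the original error.
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    THROW;
END CATCH;
```

Wrapping the drop and the reload in one transaction means a failed load leaves the previous table in place rather than an empty or missing one.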
Here's what I looked at during the review
  • 🟡 General issues: 2 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good


from NDWH.Fact.FactViralLoad_Historical as vlhist
left join NDWH.Dim.DimPatient as pat on pat.PatientKey=vlhist.Patientkey
left join NDWH.Dim.DimFacility as fac on fac.FacilityKey=vlhist.Facilitykey
where IsPBFW=1 and TRY_CAST(REPLACE(TestResult, ',', '') AS FLOAT) >= 200.00

suggestion (performance): Consider reordering WHERE clause conditions for better performance

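Filtering on IsPBFW=1 before the more expensive TRY_CAST expression could let rows be discarded earlier. Note that IsPBFW=1 already comes first in the original line, and SQL Server does not guarantee that predicates are evaluated in the order they are written, so the suggested change mainly affects readability.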

Suggested change
- where IsPBFW=1 and TRY_CAST(REPLACE(TestResult, ',', '') AS FLOAT) >= 200.00
+ where IsPBFW=1
+ and TRY_CAST(REPLACE(TestResult, ',', '') AS FLOAT) >= 200.00

ARTOutcome
from NDWH.Fact.FactARTHistory as arthist
left join PBFW_StartDate on PBFW_StartDate.Patientkey=arthist.PatientKey
where ARTOutcome in ('UNDOCUMENTED LOSS','LOSS TO FOLLOW UP') and AsOfDate BETWEEN FirstVisitDuringPBFW AND FirstVisitDuringPBFW

issue (bug_risk): The BETWEEN clause is comparing against the same value, which is likely incorrect

This condition will only match when AsOfDate exactly equals FirstVisitDuringPBFW. Is there supposed to be a different end date for the range?
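One possible shape for a corrected range is sketched below. The PR does not state what the end of the range should be, so the 24-month window and the `@FollowUpMonths` variable are purely hypothetical. The simplified `PBFW_StartDate` CTE is also an assumption, included only to keep the example self-contained.

```sql
-- Hypothetical window length; the PR does not say what the end of the range should be.
DECLARE @FollowUpMonths INT = 24;

WITH PBFW_StartDate AS (
    -- Simplified stand-in for the PR's PBFW_StartDate CTE.
    SELECT
        pbfw.PatientKey,
        MIN(COALESCE(pbfw.AncDate1, pbfw.AncDate2, pbfw.AncDate3, pbfw.AncDate4)) AS FirstVisitDuringPBFW
    FROM NDWH.Fact.FactPBFW AS pbfw
    GROUP BY pbfw.PatientKey
),
IIT AS (
    SELECT arthist.PatientKey, arthist.AsOfDate, arthist.ARTOutcome
    FROM NDWH.Fact.FactARTHistory AS arthist
    INNER JOIN PBFW_StartDate AS sd
        ON sd.PatientKey = arthist.PatientKey
    WHERE arthist.ARTOutcome IN ('UNDOCUMENTED LOSS', 'LOSS TO FOLLOW UP')
      -- Half-open range: from the first PBFW visit up to, but excluding, the end of the window.
      AND arthist.AsOfDate >= sd.FirstVisitDuringPBFW
      AND arthist.AsOfDate < DATEADD(MONTH, @FollowUpMonths, sd.FirstVisitDuringPBFW)
)
SELECT PatientKey, AsOfDate, ARTOutcome
FROM IIT;
```

The sketch uses an inner join because a filter on `FirstVisitDuringPBFW` in the `WHERE` clause already discards the unmatched rows a left join would keep.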

@Marymary-dev Marymary-dev self-assigned this Feb 18, 2025
@nobert-mumo nobert-mumo self-requested a review February 23, 2025 21:31
4 participants