-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding CS Linelist MTCT for assessing PMTCT gaps #549
base: dev
Are you sure you want to change the base?
Conversation
Reviewer's Guide by SourceryThis pull request introduces a new SQL script to generate a linelist for MTCT (Mother-to-Child Transmission) analysis. The script extracts data from various tables in the NDWH database, including viral load, HEI (HIV-exposed infants), and PBFW (Pregnant and Breastfeeding Women) data, to create a comprehensive view of MTCT-related indicators. The script also identifies key populations and those lost to follow-up. ER diagram for MTCT Linelist Data SourceserDiagram
FactViralLoad_Historical ||--o{ CsLinelistMTCT : provides_vl_data
FactHEI ||--o{ CsLinelistMTCT : provides_hei_data
FactPBFW ||--o{ CsLinelistMTCT : provides_pbfw_data
FactARTHistory ||--o{ CsLinelistMTCT : provides_art_history
FactViralLoad_Historical {
int PatientKey
string PatientPKHash
date OrderedbyDate
int FacilityKey
string TestResult
boolean IsPBFW
}
FactHEI {
int PatientKey
date DOB
string MothersPatientPkHash
boolean Paired
boolean MotherOnART
boolean OnProhylaxis
boolean InfectedAt24mnths
}
FactPBFW {
int PatientKey
string PatientPKHash
date AncDate1
date AncDate2
date AncDate3
date AncDate4
}
CsLinelistMTCT {
int PatientKey
date DOB
string MFLCode
string MothersPatientPkHash
boolean Paired
boolean MotherOnART
boolean InfantGivenProphylaxis
boolean PositiveInfants
boolean MotherUnsuppressesDuringPBF
date FirstVisitDuringPBFW
boolean IITEpisode
}
Flow diagram for MTCT Linelist Generation Processflowchart TD
A[Start] --> B[Get Latest Viral Loads for PBFW]
B --> C[Get HEI Information]
C --> D[Get PBFW First Visit Dates]
D --> E[Get Loss to Follow-up Information]
E --> F[Combine All Data]
F --> G[Create Final MTCT Linelist]
G --> H[End]
subgraph Data Sources
VL[FactViralLoad_Historical]
HEI[FactHEI]
PBFW[FactPBFW]
ART[FactARTHistory]
end
VL --> B
HEI --> C
PBFW --> D
ART --> E
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @Marymary-dev - I've reviewed your changes - here's some feedback:
Overall Comments:
- The date range comparison in the IIT CTE appears incorrect - it's comparing AsOfDate to the same date (FirstVisitDuringPBFW) for both start and end of range, which likely isn't the intended behavior.
- Consider adding more robust error handling beyond the simple DROP TABLE check, such as TRY/CATCH blocks and validation of critical data points, given this deals with sensitive healthcare data.
Here's what I looked at during the review
- 🟡 General issues: 2 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
from NDWH.Fact.FactViralLoad_Historical as vlhist | ||
left join NDWH.Dim.DimPatient as pat on pat.PatientKey=vlhist.Patientkey | ||
left join NDWH.Dim.DimFacility as fac on fac.FacilityKey=vlhist.Facilitykey | ||
where IsPBFW=1 and TRY_CAST(REPLACE(TestResult, ',', '') AS FLOAT) >= 200.00 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (performance): Consider reordering WHERE clause conditions for better performance
Moving the IsPBFW=1 condition before the expensive TRY_CAST operation would allow rows to be filtered out earlier, potentially improving query performance.
where IsPBFW=1 and TRY_CAST(REPLACE(TestResult, ',', '') AS FLOAT) >= 200.00 | |
where IsPBFW=1 | |
and TRY_CAST(REPLACE(TestResult, ',', '') AS FLOAT) >= 200.00 |
ARTOutcome | ||
from NDWH.Fact.FactARTHistory as arthist | ||
left join PBFW_StartDate on PBFW_StartDate.Patientkey=arthist.PatientKey | ||
where ARTOutcome in ('UNDOCUMENTED LOSS','LOSS TO FOLLOW UP') and AsOfDate BETWEEN FirstVisitDuringPBFW AND FirstVisitDuringPBFW |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): The BETWEEN clause is comparing against the same value, which is likely incorrect
This condition will only match when AsOfDate exactly equals FirstVisitDuringPBFW. Is there supposed to be a different end date for the range?
Summary by Sourcery
New Features:
CsLinelistMTCT
to theHIVCaseSurveillance
schema, which contains data for assessing PMTCT gaps.