Commit
* created partitions on intermediate dataframes
* test new partition
* try broadcasting visit_occurrence_bound
* testing using a refreshed visit_occurrence data frame
* cache ehr records
* testing a new strategy
* updated ehrshot omop conversion
* updated pyspark dependency
* re-implemented generate_visit_id for ehrshot omop conversion
* fixed the visit_id data type in ehrshot data because it needs to be loaded as Float, otherwise the entire column would be populated with null
* fixed the logic for creating the artificial visits
* fixed a bug in creating the end date of the artificial visits
* test visit_end_date bounding with the new OMOP
* try broadcasting cohort_visit_occurrence
* try broadcasting visit_index_date
* broadcast visit_occurrence_person
* try repartitioning
* try a different partition strategy
* randomly shuffle visit_occurrence_person
* created the order and reshuffled the dataframe using the order afterwards
* removed an extra comma from a query
* removed event_group_ids from the cehrgpt input data
* upgraded pyspark
* replaced event_group_id with N/A instead of NULL
* Revert "removed event_group_ids from the cehrgpt input data". This reverts commit 38742ec.
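The `generate_visit_id` re-implementation above pairs with the `max_visit_id` cross-join mentioned later in the log: new artificial visits get ids offset from the maximum existing visit_id so they cannot collide with real ones. A minimal stdlib sketch of that idea (the function name and signature are illustrative, not the project's actual API):

```python
def generate_visit_ids(existing_visit_ids, n_artificial):
    """Assign ids to artificial visits that cannot collide with real ones.

    Sketch of the offset-from-max approach: find the maximum existing
    visit_id and number the artificial visits sequentially above it.
    """
    max_id = max(existing_visit_ids, default=0)
    return [max_id + i for i in range(1, n_artificial + 1)]
```

In Spark the same idea would be an aggregation for the max id cross-joined onto the artificial-visit rows, with a row number added per row; the sketch keeps only the arithmetic.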
* do not take person records into account when creating artificial visits
* invalidate the records that fall outside the visits
* downgrade pyspark to 3.1.2
* resolve the ambiguous visit_id column
* set the ehrshot visit_id to string type when loading the csv file
* use max_visit_id df to cross join
* added another assertion to test whether the patient count equals 1
* updated the visit construction logic to ensure its uniqueness
* cache domain_records so record_id is fixed
* changed death_concept_id to cause_concept_id
* fixed the death token priority column
* fixed the unit test
* added an option to exclude features and store the cohorts in the meds format (#20)
* convert prediction_time to timestamp when meds_format is enabled (#22)
* added number_as_value and concept_as_value to the spark dataframes
* set the default value for number_as_value and concept_as_value to None
* insert an hour token between the visit type token and the first medical event in the inpatient visit
* added include_inpatient_hour_token to extract_features
* calculate the first inpatient hour token using the date part of visit_start_datetime
* added aggregate_by_hour to perform the average aggregation over the lab values that occurred within the same hour
* fixed a bug in the lab_hour
* fixed a union bug
* try caching the inpatient events in att decorator
* fixed a bug in generating the hour tokens
* fixed a bug in generating the hour tokens
Showing 15 changed files with 371 additions and 185 deletions.