fixed a bug where the normalized value should be bounded by a multiple of the standard deviation because the normalized value is assumed to follow the standard normal
ChaoPang committed Sep 13, 2024
1 parent a495edf commit 179ee21
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions src/cehrbert/models/hf_models/tokenization_hf_cehrbert.py
@@ -356,13 +356,13 @@ def normalize(self, concept_id, concept_value) -> float:
             mean_ = concept_value - self._lab_stat_mapping[concept_id]["mean"]
             std = self._lab_stat_mapping[concept_id]["std"]
             if std > 0:
+                value_outlier_std = self._lab_stat_mapping[concept_id]["value_outlier_std"]
                 normalized_value = mean_ / self._lab_stat_mapping[concept_id]["std"]
                 # Clip the value between the lower and upper bounds of the corresponding lab
-                normalized_value = max(
-                    self._lab_stat_mapping[concept_id]["lower_bound"],
-                    min(self._lab_stat_mapping[concept_id]["upper_bound"], normalized_value),
-                )
+                normalized_value = max(-value_outlier_std, min(value_outlier_std, normalized_value))
             else:
-                normalized_value = mean_
+                # If there is not a valid standard deviation,
+                # we just set the normalized value to the mean of the standard normal
+                normalized_value = 0.0
             return normalized_value
         return concept_value
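
The behavior after this change can be illustrated with a minimal standalone sketch. It is not the repository's code: the `LAB_STATS` table, its concept ids, and its numbers are hypothetical, and the lookup guard for unknown concepts is an assumption, since the diff hunk does not show the enclosing condition. Only the keys `mean`, `std`, and `value_outlier_std` come from the diff.

```python
from typing import Dict

# Hypothetical per-concept statistics. The keys mirror those used in the diff;
# the concept ids and numbers are made up for illustration.
LAB_STATS: Dict[str, Dict[str, float]] = {
    "glucose": {"mean": 100.0, "std": 20.0, "value_outlier_std": 3.0},
    "constant_lab": {"mean": 5.0, "std": 0.0, "value_outlier_std": 3.0},
}


def normalize(concept_id: str, concept_value: float) -> float:
    """Return a clipped z-score, or the raw value if the concept has no statistics."""
    stats = LAB_STATS.get(concept_id)
    if stats is None:
        # Assumed guard: concepts without statistics pass through unchanged.
        return concept_value

    std = stats["std"]
    if std > 0:
        # The z-score is in standard-deviation units, so the clipping bounds
        # must be too: +/- value_outlier_std, not raw-scale lab bounds.
        k = stats["value_outlier_std"]
        z = (concept_value - stats["mean"]) / std
        return max(-k, min(k, z))

    # No valid standard deviation: fall back to the mean of the standard normal.
    return 0.0
```

With these made-up statistics, `normalize("glucose", 120.0)` gives `1.0`, while an extreme reading such as `normalize("glucose", 1000.0)` is clipped to `3.0`. The removed code compared the z-score against `lower_bound` and `upper_bound`, which, going by the commit message, were not expressed as a multiple of the standard deviation and so were not on the same scale as the z-score.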