Currently, the vaccine data shows upward and downward fluctuations, which doesn't make sense, as we can't suddenly have fewer vaccinations than we did previously.
I imagine it is due to differences between reporting locations, but the online schema does not provide any details there. Our best route would be to confirm with the team that supports the data. I've corresponded with them a few times at [email protected].
Data source:
https://data-cdphe.opendata.arcgis.com/datasets/cdphe-covid19-vaccine-daily-summary-statistics
Data currently goes through heavy normalization within getVaccineStatistics in apiClient.js. The raw data was not grouped in any functional way, was not sorted, and contained many duplicates. Handling was put in place to fix this, but we currently just take the first point available for each day, which is not the most skillful approach.
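To make the problem concrete, here is a minimal sketch of the "first point per day" behavior. It is not the actual code in apiClient.js: the record shape (`date`, `value`), the function name `firstPointPerDay`, and the sample numbers are all assumptions for illustration, and it uses plain JavaScript rather than lodash so it runs without dependencies.

```javascript
// Hypothetical record shape: { date: "YYYY-MM-DD", value: number }.
// The real field names in the CDPHE dataset may differ.
function firstPointPerDay(records) {
  const byDay = new Map();
  for (const record of records) {
    // Keep only the first record encountered for each day.
    if (!byDay.has(record.date)) {
      byDay.set(record.date, record);
    }
  }
  // ISO dates sort chronologically as plain strings.
  return [...byDay.values()].sort((a, b) => a.date.localeCompare(b.date));
}

// Made-up sample: unsorted, with two points per day.
const sample = [
  { date: "2021-01-02", value: 90 },
  { date: "2021-01-01", value: 100 },
  { date: "2021-01-02", value: 150 },
  { date: "2021-01-01", value: 80 },
];

const firstPerDayValues = firstPointPerDay(sample).map((r) => r.value);
console.log(firstPerDayValues);
// Values are [100, 90]: the series dips on the second day
```

Because the input order is arbitrary, whichever duplicate happens to come first wins, which is one plausible way a downward step could end up in the chart.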
One approach to improving normalization here may be to keep the grouping (the _groupBy and mapValues) but drop the duplicate-removal step. Instead, we might want a more clever way of choosing the best fit from all the data points available for each day: by best fit to a trend line, by maximum, or by some other metric. There doesn't seem to be any useful metadata, such as reporting site, that would let us differentiate one data point from another, but again, maybe the team managing the data can elucidate there.
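A rough sketch of the "group, then pick one point per day" idea, using the maximum as the metric. The names `selectPerDay` and `maxValue`, the record shape (`date`, `value`), and the sample data are hypothetical, and plain JavaScript stands in for the lodash calls. Picking the maximum also assumes the duplicates under-report rather than over-report, which we would still need to confirm with the data team.

```javascript
// Group records by day, then reduce each group with a pluggable selector.
// Hypothetical record shape: { date: "YYYY-MM-DD", value: number }.
function selectPerDay(records, selector) {
  const groups = new Map();
  for (const record of records) {
    if (!groups.has(record.date)) {
      groups.set(record.date, []);
    }
    groups.get(record.date).push(record);
  }
  return [...groups.entries()]
    .sort(([a], [b]) => a.localeCompare(b)) // ISO dates sort as strings
    .map(([date, points]) => ({ date, value: selector(points) }));
}

// Selector: the largest value reported for the day.
const maxValue = (points) => Math.max(...points.map((p) => p.value));

// Made-up sample: unsorted, with two points per day.
const rawPoints = [
  { date: "2021-01-02", value: 90 },
  { date: "2021-01-01", value: 100 },
  { date: "2021-01-02", value: 150 },
  { date: "2021-01-01", value: 80 },
];

const series = selectPerDay(rawPoints, maxValue);
console.log(series);
// One point per day: 100 for 2021-01-01, 150 for 2021-01-02
```

Keeping the selector as a parameter means a trend-line fit or any other metric could be swapped in later without touching the grouping logic.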