Filter FERC714 ETL by year #2649
Codecov Report

Coverage diff, dev vs. #2649:

            dev     #2649   +/-
Coverage    87.1%   87.1%   -0.1%
Files       86      86
Lines       10001   10004   +3
Hits        8716    8717    +1
Misses      1285    1287    +2
Looks good!
ds = context.resources.datastore
ferc714_settings = context.resources.dataset_settings.ferc714
years = ", ".join(map(str, ferc714_settings.years))
Very satisfying line to me
This PR partially addresses issue #2628 and the blocking issues found in #2550. FERC 714 data is not currently subset by time: the extractor reads in CSV files that are organized on a per-table basis. This PR changes the extraction step so that it can filter by year on the `record_yr` column, which is present in all tables except the `respondent_id_ferc714` table. It also updates the fast ETL to include two years of data for the FERC 714 run: 2019 and 2020.

While this should not yield large performance improvements in the current ETL, it should help with some of the memory constraints encountered in #2550.
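
For readers who want a concrete picture of the filtering described above, here is a minimal sketch, not the code from this PR. It uses only the standard library `csv` module as a stand-in for the real extractor, and the function name `read_rows_for_years`, the sample tables, and every column other than `record_yr` are hypothetical. It keeps rows whose `record_yr` is in the requested years and passes a table through untouched when that column is absent, as would be needed for `respondent_id_ferc714`.

```python
import csv
import io


def read_rows_for_years(csv_text, years, year_col="record_yr"):
    """Parse CSV text and keep only rows whose year column is in `years`.

    Hypothetical helper for illustration. Tables that lack `year_col`
    are returned unfiltered.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    # No year column (e.g. a respondent ID table): nothing to filter on.
    if year_col not in (reader.fieldnames or []):
        return rows
    wanted = set(years)
    return [row for row in rows if int(row[year_col]) in wanted]


# Made-up sample data; only the `record_yr` column name comes from the PR text.
SAMPLE_TABLE = (
    "respondent_id,record_yr,value\n"
    "1,2018,10\n"
    "1,2019,11\n"
    "2,2020,12\n"
    "2,2021,13\n"
)
RESPONDENT_TABLE = "respondent_id,respondent_name\n1,A\n2,B\n"

subset = read_rows_for_years(SAMPLE_TABLE, [2019, 2020])
respondents = read_rows_for_years(RESPONDENT_TABLE, [2019, 2020])
```

With the fast ETL years of 2019 and 2020, `subset` holds the two matching rows, while `respondents` keeps both of its rows because there is no year column to filter on.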