Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document new transforms & organize imports/logging #1962

Merged
merged 20 commits into from
Oct 20, 2022

Conversation

zaneselvans
Copy link
Member

@zaneselvans zaneselvans commented Sep 30, 2022

  • Use relative hierarchical imports across all of the package & subpackage __init__.py files, rather than importing every module absolutely in the top level __init__.py
  • Remove the absolufy-imports pre-commit hook.
  • Isolate logging functions in pudl.logging to avoid circular dependencies in pudl.helpers
  • Fix a dangling reference to the no-longer used Ferc1DbfSettings
  • Flesh out module / class / function docstrings to document the new transform setup.

Closes #1924

Rather than importing all modules using absolute paths in the top-level
__init__.py, manage package and sub-package imports hierarchically
within all of the various __init__.py files throughout the package using
relative imports, with each subpackage repsonsible for importing its own
modules and subpackages into the package / subpackage namespace.

This re-organization surfaced some circular dependencies resulting from
the logging functions being defined in pudl.helpers (which has all kinds
of other dependencies in it.

I removed the absolufy-imports pre-commit hook since there's no way to
turn it off in the __init__.py modules, which makes it impossible to
do imports that construct a clean namespace.

Also made sure that all modules use the root logger configured in
pudl.logging via get_logger.

There was also one lingering reference to pudl.settings.Ferc1DbfSettings
that hadn't been noticed because the testes still aren't passing, which
I swapped out for Ferc1Settings.
@zaneselvans zaneselvans marked this pull request as draft September 30, 2022 03:51
@zaneselvans zaneselvans added ferc1 Anything having to do with FERC Form 1 xbrl Related to the FERC XBRL transition docs Documentation for users and contributors. labels Sep 30, 2022
@codecov
Copy link

codecov bot commented Oct 4, 2022

Codecov Report

Base: 83.4% // Head: 83.5% // Increases project coverage by +0.1% 🎉

Coverage data is based on head (9ff08ad) compared to base (ee811ea).
Patch coverage: 90.3% of modified lines in pull request are covered.

Additional details and impacted files
@@                Coverage Diff                 @@
##           xbrl_integration   #1962     +/-   ##
==================================================
+ Coverage              83.4%   83.5%   +0.1%     
==================================================
  Files                    71      72      +1     
  Lines                  7769    7761      -8     
==================================================
+ Hits                   6484    6486      +2     
+ Misses                 1285    1275     -10     
Impacted Files Coverage Δ
src/pudl/metadata/fields.py 100.0% <ø> (ø)
src/pudl/cli.py 69.6% <33.3%> (+1.1%) ⬆️
src/pudl/analysis/service_territory.py 28.1% <50.0%> (-0.8%) ⬇️
src/pudl/convert/censusdp1tract_to_sqlite.py 72.3% <50.0%> (+6.2%) ⬆️
src/pudl/convert/epacems_to_parquet.py 65.5% <50.0%> (-1.2%) ⬇️
src/pudl/convert/ferc_to_sqlite.py 58.9% <50.0%> (-1.1%) ⬇️
src/pudl/workspace/datastore.py 68.2% <50.0%> (+0.1%) ⬆️
src/pudl/logging_helpers.py 76.9% <76.9%> (ø)
src/pudl/extract/ferc1.py 87.3% <80.0%> (+1.8%) ⬆️
src/pudl/analysis/allocate_net_gen.py 97.2% <100.0%> (-0.1%) ⬇️
... and 37 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@zaneselvans zaneselvans linked an issue Oct 6, 2022 that may be closed by this pull request
9 tasks
@zaneselvans zaneselvans marked this pull request as ready for review October 8, 2022 03:34
@zaneselvans zaneselvans requested a review from cmgosnell October 8, 2022 03:34
@zaneselvans zaneselvans self-assigned this Oct 8, 2022
@zaneselvans
Copy link
Member Author

@cmgosnell definitely open to cutting out text if you think it's too much. Or if there's duplication and you have a preference for the one place that the content should be (we can easily link between different parts of the docs)

@zaneselvans
Copy link
Member Author

zaneselvans commented Oct 11, 2022

@zschira @cmgosnell The CI is passing! 🎉 Also the CI is taking 70-80 minutes 😭!

I assume that a lot of the extra time is extracting all the different XBRL databases, now that we have updated (and much larger) For the ones that aren't actually integrated into PUDL (yet) I wonder if it would be more appropriate to test the XBRL to SQLite extraction in the ferc-xbrl-extractor repository instead? This would also provide concrete examples of how to actually run the extractions (which didn't seem very clear to me in the current docs) and would allow us to test extracting all of the XBRL data sources, even if they haven't been integrated with the rest of PUDL yet.

@zschira
Copy link
Member

zschira commented Oct 11, 2022

The CI is passing!

Awesome!!

As for the 70-80 minutes, I think that removing the need to extract all the XBRL every time makes sense. If we do move that testing to the extractor repo, we'll need to have access to raw XBRL filings there somehow. I'm sure we can just put a zipfile with a years worth of data in that repo or something.

Copy link
Member

@cmgosnell cmgosnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks pretty good overall! I left lots of little questions and requests for either clarifications, simplifications or moving explanations closer to the code/user.

a high level question/comment... were these import fixes actually related to the docs? if they weren't, I think it would have been easier for me as a reviewer to digest one of these types of changes at a time.

src/pudl/analysis/allocate_net_gen.py Show resolved Hide resolved
src/pudl/extract/ferc1.py Show resolved Hide resolved
src/pudl/extract/ferc1.py Show resolved Hide resolved
src/pudl/extract/ferc1.py Outdated Show resolved Hide resolved
src/pudl/transform/classes.py Show resolved Hide resolved
src/pudl/transform/classes.py Outdated Show resolved Hide resolved
src/pudl/transform/ferc1.py Outdated Show resolved Hide resolved
src/pudl/transform/ferc1.py Show resolved Hide resolved
src/pudl/transform/ferc1.py Show resolved Hide resolved
src/pudl/transform/params/__init__.py Show resolved Hide resolved
@zaneselvans
Copy link
Member Author

a high level question/comment... were these import fixes actually related to the docs? if they weren't, I think it would have been easier for me as a reviewer to digest one of these types of changes at a time.

No, it really shouldn't have been mixed in here. I screwed this up. The imports and the logging should have each been a separate discrete PR directly into dev (and then merged down into these branches) rather than being mingled in here. It started as an attempt to address a circular import that was causing problems in this PR, and then turned into something repository-wide, and changed directions after our discussion of #1925 and the decision not to change the imports / public facing API just in the pudl.transform module. But by the time I realized the right way to do it, it was entangled with other stuff. Sorry!

@zaneselvans
Copy link
Member Author

@cmgosnell I think I've at least responded to all of your comments, but I'm not sure if it'll all be satisfying. If you want to respond in comments go for it! Or we can talk on Friday maybe?

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@zaneselvans zaneselvans changed the base branch from xbrl_steam to xbrl_integration October 19, 2022 16:00
* Hard-coded the steam & hydro plant types in the field definitions to
  avoid a cross-subpackage circular import error. This isn't an ideal
  solution, but it works. Need to figure out the right way to deal with
  subpackage-level (rather than module-level) import dependencies
  appropriately.
* Changed hydro plant type categories to use snake_case as the steam
  plant type categories do.
* Renamed the pudl.logging module to pudl.logging_helpers since it
  mysteriously started colliding (apparently?) with the Python standard
  library logging module (even though it hadn't previously)
@zaneselvans zaneselvans merged commit a74f173 into xbrl_integration Oct 20, 2022
@zaneselvans zaneselvans deleted the xbrl-docs branch October 26, 2022 14:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation for users and contributors. ferc1 Anything having to do with FERC Form 1 xbrl Related to the FERC XBRL transition
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Document refactored transform infrastructure
3 participants