Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misc fixes/improvements #573

Merged
merged 8 commits into from
Dec 4, 2023
Merged

Misc fixes/improvements #573

merged 8 commits into from
Dec 4, 2023

Conversation

skrawcz
Copy link
Collaborator

@skrawcz skrawcz commented Dec 1, 2023

Fixes/improvees a few things:

  1. Fixes bug where passing in graph_attrs would override all values
  2. Captures more metadata from loaded dataframes.
  3. updates package.json for hub.
  4. Fixes Feast integration example type error #575 .

Changes

  1. graph.py graphviz viz fix. Updates tests
  2. updates all the dataframe savers & loaders to pull extra metadata. Updates tests.
  3. updates package.json for contrib site.
  4. graph.py changes for input node type checking & graphviz change.

How I tested this

  • locally

Notes

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

So if you used `orient="LR"` it would get lost if you passed in
graph_attrs and it didn't have the appropriate value.

This fixes that and fixes the unit test and makes sure it's
deterministic.
We introspect the dataframe schema a little to capture
those aspects. The datatypes are converted to strings to
avert any serialization issues.
To have proper project name and version.
@skrawcz skrawcz requested a review from elijahbenizzy December 1, 2023 07:35
Copy link
Contributor

sweep-ai bot commented Dec 1, 2023

Apply Sweep Rules to your PR?

  • Apply: All new business logic should have corresponding unit tests.
  • Apply: Refactor large functions to be more modular.
  • Apply: Add docstrings to all functions and file headers.

Copy link
Collaborator

@elijahbenizzy elijahbenizzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- a few quick fixes

hamilton/io/utils.py Show resolved Hide resolved
contrib/docs/docusaurus.config.js Outdated Show resolved Hide resolved
hamilton/graph.py Show resolved Hide resolved
setup.cfg Show resolved Hide resolved
So that people can Try Hamilton :)
Makes it explicit that Hamilton is a first party package.
The problem here was that when we pushed materializers we introduced
a regression where somehow input node types got treated differently.

The bug here was that "input" nodes are created at add dependency time,
thus if two functions request the same input, but provide different types
things would correctly error, but in this case in #575 the types were compatible.

So the design choice here was to:

1. Check that if one is the subset of the other, then we allow it.
2. We then make the subset type the type of the node.
3. We attach originating functions to ensure we can create a good error message.
This has the side-effect of propagating all the way through to Variables and such.
4. The mutating node functions are scoped to only work if the Node is deemed External, that
way we don't use that code path inadvertently in the future.
5. We fix an assumption in visualization that assumed inputs didn't have functions.

Note: the input/external/user-defined node creation should really be pulled out into a separate
step, and not created dynamically in the add_dependency part.
Copy link
Collaborator

@elijahbenizzy elijahbenizzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits

hamilton/graph.py Show resolved Hide resolved
hamilton/graph.py Show resolved Hide resolved
tests/test_graph.py Show resolved Hide resolved
tests/test_graph.py Show resolved Hide resolved
tests/test_hamilton_driver.py Show resolved Hide resolved
So the git commit hook for isort was running with
different configuration from the pre-commit CI run.

This (a) fixes files to be consistent with (b) the configuration
now provided -- e.g. what is first party, and what is local.

This should hopefully now prevent any conflicts from reoccurring.
Copy link
Collaborator

@elijahbenizzy elijahbenizzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@skrawcz skrawcz merged commit 67278b9 into main Dec 4, 2023
2 checks passed
@skrawcz skrawcz deleted the misc_fixes branch December 4, 2023 05:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feast integration example type error
2 participants