Tests and canonicalization for ConfigWorkflow #89

Merged
merged 11 commits into from
Jan 21, 2025

Conversation

DropD
Collaborator

@DropD DropD commented Jan 14, 2025

First step towards clarifying and testing the data layout from which we can reliably build workflows.

Problem

See #88. This covers steps 1-8 for ConfigWorkflow

Changes

  • Doctests
    • enabled by default (required fixing some pretty_printer doctests)
    • Doctests for ConfigWorkflow, outlining the smallest possible valid yaml snippet
  • Unit tests
    • Check that we can build a CanonicalWorkflow from code as well as from the minimal ConfigWorkflow instance
    • Check that we can use the intended load_workflow_config method to load the minimal yaml snippet
    • Check that we can convert the minimum valid CanonicalWorkflow into a core.Workflow

@DropD DropD requested a review from leclairm January 15, 2025 09:06
>>> empty_wf = ConfigWorkflow(cycles=[], tasks=[], data={})

"""

Collaborator Author

Question: are we ok with this yaml snippet parsing without validation errors?

If so, should it fail with a clear message down the line or should it create a viable WorkGraph, which runs nothing?

Collaborator

We agreed in the meeting to produce an error. Maybe it is easier for merging to do this in a subsequent PR.

Collaborator Author

Once I merge #90 into this, the error will be raised during canonicalization.
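
For illustration, a minimal sketch of what "raised during canonicalization" could look like. Plain dataclasses stand in for the real models here; `canonicalize_workflow`, the shape of `CanonicalWorkflow`, and the error message are hypothetical and not taken from #90.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigWorkflow:
    """Stand-in for the yaml-parsed model; permissive on purpose."""

    cycles: list = field(default_factory=list)
    tasks: list = field(default_factory=list)
    data: dict = field(default_factory=dict)


@dataclass(frozen=True)
class CanonicalWorkflow:
    """Stand-in for the canonical model; only built from non-empty input."""

    cycles: tuple
    tasks: tuple
    data: dict


def canonicalize_workflow(config: ConfigWorkflow) -> CanonicalWorkflow:
    # Hypothetical check: parsing stays permissive, canonicalization
    # rejects a workflow that would run nothing.
    missing = [name for name in ("cycles", "tasks") if not getattr(config, name)]
    if missing:
        raise ValueError(f"workflow defines no {' and no '.join(missing)}")
    return CanonicalWorkflow(
        cycles=tuple(config.cycles),
        tasks=tuple(config.tasks),
        data=dict(config.data),
    )
```

With this split, `ConfigWorkflow(cycles=[], tasks=[], data={})` still parses, and the failure surfaces with a clear message at the one place where the config is turned into something the rest of the code relies on.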


pretty_print.PrettyPrinter().format(testee)

assert testee.name is None
Collaborator Author

Question: What does it mean to have a core.Workflow with name=None, rootdir=None?

Collaborator Author

Realistically, this cannot happen in normal usage (as tested in test_yaml_data_models.py). However, downstream code has to either deal with the possibility or forgo the help of static type checking.

Collaborator

So we need to check that they are not None in core.Workflow, right? Should we open an issue to keep this in mind?

Contributor

@leclairm leclairm Jan 17, 2025

Eventually, name and rootdir are compulsory for core objects, but the reason they may be None in the user-provided information differs between the two:

  • name can be None, in which case the core object defaults it to a value derived from the yaml config path.
  • rootdir should actually NOT be given by the user and should only be inferred from the yaml config path.

So if you create a canonical class:

  • name would stay str | None in ConfigWorkflow, defaulting to None, and become a non-default str in the canonical class
  • rootdir would belong to the canonical class as a non-default str and disappear from ConfigWorkflow
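
A minimal sketch of that split, using plain dataclasses as stand-ins for the real models. The `canonicalize` signature is hypothetical, and taking the file stem as the default name and the parent directory as rootdir is an assumption; the comment above only says both are derived from the yaml config path.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ConfigWorkflow:
    """User-facing model: name is optional, rootdir is not a field at all."""

    name: Optional[str] = None


@dataclass(frozen=True)
class CanonicalWorkflow:
    """Canonical model: both values are required, so neither can be None."""

    name: str
    rootdir: str


def canonicalize(config: ConfigWorkflow, config_path: Path) -> CanonicalWorkflow:
    # Assumption: default name = stem of the yaml file,
    # rootdir = directory containing the yaml file.
    name = config.name if config.name is not None else config_path.stem
    return CanonicalWorkflow(name=name, rootdir=str(config_path.parent))
```

Downstream code that takes a `CanonicalWorkflow` then never has to handle `None` for either attribute, which is what lets static type checking help again.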

Collaborator Author

Ok, I will adapt this in #90

Collaborator Author

I will need to think about how to adapt the canonicalize function to allow passing in additional information, though, because the additional information, if any, will differ per class (not sure singledispatch is still the right fit then).
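
One possible shape for this, sketched with stand-in classes: `functools.singledispatch` only inspects the type of the first positional argument, so each registered implementation can accept its own keyword-only extras. The class bodies, the `config_path` parameter, and the dict return values are hypothetical and not the project's actual design.

```python
from dataclasses import dataclass
from functools import singledispatch
from pathlib import Path
from typing import Any, Optional


@dataclass
class ConfigWorkflow:
    name: Optional[str] = None


@dataclass
class ConfigTask:
    plugin: str


@singledispatch
def canonicalize(config: Any, **extra: Any) -> Any:
    # Fallback for types without a registered implementation.
    raise TypeError(f"no canonical form for {type(config).__name__}")


@canonicalize.register
def _(config: ConfigWorkflow, *, config_path: Path) -> dict:
    # Only the workflow needs the location of the yaml file.
    return {
        "name": config.name or config_path.stem,
        "rootdir": str(config_path.parent),
    }


@canonicalize.register
def _(config: ConfigTask) -> dict:
    # A task needs no additional information in this sketch.
    return {"plugin": config.plugin}
```

The trade-off is that the required extras are only visible at the registered implementations: a missing `config_path` fails at call time with a `TypeError` rather than being caught by the signature of `canonicalize` itself.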

def test_workflow_test_internal_dicts():
    testee = models.ConfigWorkflow(
        cycles=[],
        tasks=[{"some_task": {"plugin": "shell"}}],
Collaborator Author

Question: is it a problem that we currently cannot (for testing purposes) pass actual ConfigXyzTask instances into the constructor of ConfigWorkflow, but have to go through this dictionary form?

Collaborator

I think yes. As you pointed out in the last meeting, we need to decouple the components for effective testing, so we need to test YAML <-> CONFIG and CONFIG <-> GRAPH separately, which requires creating the config objects without a yaml string.

Collaborator Author

What do you think of the drafted solution in #90, splitting into a "canonical" and a "yaml-parsed" class? Note that:
a) Some of this would be alleviated by idempotent validators, which that approach would introduce gradually.
b) The full benefits would only kick in once all the config classes have undergone the same process. However, testability will increase incrementally.
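
To make "idempotent validator" concrete, a minimal sketch: a validator that converts the dictionary form once and passes an already-built instance through unchanged, so both forms are accepted by the same constructor path. `ConfigShellTask` and `validate_task` are hypothetical stand-ins, not the classes or validators from #90.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class ConfigShellTask:
    """Hypothetical stand-in for one of the ConfigXyzTask classes."""

    name: str
    plugin: str = "shell"


def validate_task(value: Union[ConfigShellTask, dict]) -> ConfigShellTask:
    # Idempotent: an already-built instance is returned as-is.
    if isinstance(value, ConfigShellTask):
        return value
    # The yaml form is a single-entry mapping {"task_name": {spec}};
    # the unpacking fails loudly if there is not exactly one entry.
    ((name, spec),) = value.items()
    return ConfigShellTask(name=name, **spec)
```

A test can then build tasks directly in code, while the dictionary form used above (`{"some_task": {"plugin": "shell"}}`) keeps working for input parsed from yaml.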

Collaborator

Made a comment there about the canonical workflow.

Contributor

I agree with the creation of canonical classes to remove the type ambiguities. Same as rootdir in WorkFlowConfig.

@DropD DropD changed the title from "add tests related to ConfigWorkflow" to "Tests and questions related to ConfigWorkflow" Jan 15, 2025
@DropD DropD requested a review from agoscinski January 16, 2025 09:09
Collaborator

@agoscinski agoscinski left a comment

The questions you point out would need some further changes to be implemented, but maybe we implement them in another PR to avoid merge conflicts and focus on the canonicalization? I approve already.

DropD added 2 commits January 17, 2025 15:52
* finalize ConfigWorkflow canonicalization
* improve type hints
@DropD DropD changed the title from "Tests and questions related to ConfigWorkflow" to "Tests and questions canonicalization for ConfigWorkflow" Jan 17, 2025
@DropD DropD changed the title from "Tests and questions canonicalization for ConfigWorkflow" to "Tests and canonicalization for ConfigWorkflow" Jan 20, 2025
@DropD DropD merged commit 2c0ee10 into main Jan 21, 2025
3 checks passed
@DropD DropD deleted the test-config-workflow branch January 21, 2025 14:26
3 participants