Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Waiting for handlers to finish in all exit cases + abort and compensation in a message handler #144

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

dandavison
Copy link
Contributor

@dandavison dandavison commented Oct 2, 2024

From the README:

This sample demonstrates how to write a workflow that:

1. Ensures that an update handler is finished before a successful workflow return, and on workflow cancellation and failure
2. Execute a compensation/cleanup activity in the update handler when the workflow is cancelled or fails, and also in the main workflow method.

@dandavison dandavison force-pushed the message-handler-waiting-compensation-cleanup branch from c566b62 to 9355bc2 Compare October 2, 2024 17:34
@dandavison dandavison force-pushed the message-handler-waiting-compensation-cleanup branch 3 times, most recently from ae0a6e9 to 49f342f Compare October 4, 2024 12:28
@dandavison dandavison force-pushed the message-handler-waiting-compensation-cleanup branch 3 times, most recently from ed6b996 to 1ea0905 Compare October 4, 2024 16:31
@dandavison dandavison force-pushed the message-handler-waiting-compensation-cleanup branch from af64650 to 5867812 Compare October 7, 2024 18:23
Copy link
Contributor

@drewhoskins-temporal drewhoskins-temporal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels to me like there's two main things going on here.

  • cleaning up the workflow's handlers
  • aborting updates and compensating for them.

Both seem valid. The first feels mainstream, the second power user (though we may start to get feedback telling us otherwise).

I wonder if they should be separate samples, or at least more clearly delineate this one so people can just pick out what they need. (I don't want to scare people off of #1 because of the code for #2.)

I can imagine pretty achievable framework improvements we could do for both scenarios. Given that we've heard a lot more feedback about the first scenario, that'd be my guess as to where to invest for now.

Comment on lines 97 to 102
# 👉 "Race" the workflow_exit future against the handler's own application
# logic. Always use `workflow.wait` instead of `asyncio.wait` in
# Workflow code: asyncio's version is non-deterministic.
await workflow.wait( # type: ignore
[update_task, self.workflow_exit], return_when=asyncio.FIRST_EXCEPTION
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense, and looks like a pain, but I'm not sure how often people want to cancel in-process update handlers in response to a workflow exit as opposed to simply waiting for the handlers to complete.
My guess is that most people are happy to wait because such cases are not particularly latency sensitive.
It may be okay that this is more of a power user struggle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I agree with this. The point isn't the latency -- it's that if the workflow is exiting due to a failure/cancellation then the update may need to perform a rollback / cancellation. So they can't "simply wait for the handler to complete" because the handler has to learn whether it should proceed to the end of its happy path, or perform rollback / compensation.

) from cast(BaseException, self.workflow_exit.exception())
# 👉 Catch BaseException since asyncio.CancelledError does not inherit
# from Exception.
except BaseException as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't thought deeply, but I'm wondering if using a child workflow would work better for use cases in which there are compensations. Would it be more straightforward? Would it work better, e.g. by letting the workflow exit and letting the child workflow finish up the compensations?
Obviously that's extra latency which some might not be able to afford. And it's a lot of code, though it might be easier than this.

) from e
raise

async def my_update_compensation(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be cool if folks could register cancellation handlers with an annotation similar to validators.

raise AssertionError("unreachable")


def is_workflow_exit_exception(e: BaseException) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 , This feels like a concept worth highlighting (and maybe adding a helper?) that I think lots of users would benefit from. Good to take every opportunity to teach users about which exceptions fail the workflow and the fact that some don't fail it (and that this is a good thing), and then show how that intersects with cleanup code like this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants