Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Errors are escaping the error zone when thrown from wrapped register* callbacks. #59913

Open
jakemac53 opened this issue Jan 15, 2025 · 7 comments
Assignees
Labels
area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. library-async type-bug Incorrect behavior (everything from a crash to more subtle misbehavior)

Comments

@jakemac53
Copy link
Contributor

See https://dartpad.dev/?id=8d902660cb95b6e9eb87343b6bf1f198

I was playing around with "cancelling" async tasks by running them in zones, and wrapping all registered callbacks with a function that throws an exception if the task has been cancelled. Then I want to catch those specific exceptions and not bubble them up outside the zone. I am sure more needs to happen here to deal with edge cases and timers, but I was hoping this could work as a proof of concept.

However, I am unable to handle the errors thrown inside these callbacks, they don't reach the handleUncaughtError handler in the zone, even though I can see that I am currently running inside the forked zone (and its also the error zone) right before throwing.

Is this intended behavior? Is it not possible to do what I want here?

@dart-github-bot
Copy link
Collaborator

Summary: Zone's handleUncaughtError doesn't catch errors from wrapped register* callbacks. The provided DartPad example demonstrates this unexpected behavior.

@dart-github-bot dart-github-bot added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. triage-automation See https://github.com/dart-lang/ecosystem/tree/main/pkgs/sdk_triage_bot. type-bug Incorrect behavior (everything from a crash to more subtle misbehavior) labels Jan 15, 2025
@jakemac53 jakemac53 added area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. and removed area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. labels Jan 15, 2025
@jakemac53
Copy link
Contributor Author

Basically the use case here is I have some long running async work which I might need to redo at any point. If I know I need to redo it, I want to cancel the existing work before re-starting it with the new state. These are user provided async tasks and shouldn't need to constantly check if they have been cancelled, I just want to be able to have them automatically abandon work before starting to execute any new async task, if they have been cancelled.

@natebosch
Copy link
Member

they don't reach the handleUncaughtError handler in the zone, even though I can see that I am currently running inside the forked zone

I found that the error does reach the handleUncaughtError in addition to reaching whatever other error handler if you make the call to f(), so something about actually running the wrapped callback unblocks the error handling.

@devoncarew devoncarew removed the triage-automation See https://github.com/dart-lang/ecosystem/tree/main/pkgs/sdk_triage_bot. label Jan 15, 2025
@lrhn
Copy link
Member

lrhn commented Jan 15, 2025

Without seeing the code, my best guess is that the registerCallback or the run functions are running the throw in the wing zone.

Zone function handlers should not assume that they are run in a particular zone. They're most likely running in the zone that triggered them, and it's about job to make sure callbacks do run in the correct come.
Fx. if you "override" run, it should not call the callback itself, it should call parent.run (parent delegate is second argument) with the desired actual zone as (third) argument, which will actually change the current zone before running the function.

(Tried reading the code on a very small screen. Doesn't seem to override run, the registerCallbacks look fine. The handleUncaughtError should probably call the parent instead of throwing, but I might have made throwing work to reuse the original stack. Don't remember the details.
EDIT*2: I checked. If the handleUncaughtError throws, it gets passed to the parent zone's handleUncaughtError, and it retains the stack trace if it throws the same object again. It may be more or less expensive than calling parent.handleUncaughtError(zone, error, stackTrace);, depending on whether throwing is more expensive than extra recursion layers on the stack..)

Another thing that can go wrong is Future errors getting swallowed at error zone boundaries. Can't see where that should happen, but you often can't.

I'll look at it more when I get to a desktop.

@lrhn
Copy link
Member

lrhn commented Jan 16, 2025

Deleted. Ignore this, the bug was in the example code. (I didn't have a registerUnaryCallback, that's why Timer.periodic wasn't intercepted.)

So, to make up for it, here is a different approach to a cancellable zone:
https://dartpad.dev/?60f2e18e59895dbfdeea84e001703d75

Intercepts only run*, handleUncaughtError and the timers (because timers can be cancelled directly).
It could also implement its own microtask queue to eagerly cancel microtasks, but that's probably more work than it's worth.

@jakemac53
Copy link
Contributor Author

I found that the error does reach the handleUncaughtError in addition to reaching whatever other error handler if you make the call to f(), so something about actually running the wrapped callback unblocks the error handling.

Can you share an example? I am not quite understanding what you are doing here

@lrhn
Copy link
Member

lrhn commented Jan 16, 2025

I think I have an idea why it behaves the way it does.

The Timer actually registers a callback twice, one in the public Timer constructor, and once at the bottom of the stack in _rootCreateTimer.
The first one uses .bindCallbackGuarded which ensures that the callback throwing will be reported in the zone.
The last one just uses .bindCallback, so if the callback throws, it ends up back at the event loop. And timer event callbacks run on the root zone until they call the callback bound using bindCallback.

That has generally worked because the first bind protects the second from throwing functions.
It doesn't work when the second bind's registration introduces throwing.
This code breaks that, so the Timer callback, one step above the event loop, runs a function that throws synchronously, and the error goes directly to the event loop.

I know how to fix it, and maybe only register callbacks once.
I'll take a look through other async functions to see if they do something similar.

(It might make things a little harder that we've not been consistent. Currently both Timer and Zone.createTimer registers the function. The Zone.createTimer really shouldn't, it's a low-level function that higher level functions are built on top of, so it shouldn't be calling other low-level functions. I'll see if I can get away with making it not do so, so the function is registered in the high-level functions only.)

CL: https://dart-review.googlesource.com/c/sdk/+/405020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. library-async type-bug Incorrect behavior (everything from a crash to more subtle misbehavior)
Projects
None yet
Development

No branches or pull requests

5 participants