Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: handle panics inside user handlers #67

Merged
merged 7 commits into from
Jul 29, 2024
Merged

chore: handle panics inside user handlers #67

merged 7 commits into from
Jul 29, 2024

Conversation

yhl25
Copy link
Contributor

@yhl25 yhl25 commented Jul 22, 2024

No description provided.

@yhl25 yhl25 requested review from vigith and BulkBeing July 22, 2024 05:44
yhl25 added 2 commits July 22, 2024 11:21
Signed-off-by: Yashash H L <[email protected]>
Signed-off-by: Yashash H L <[email protected]>
@yhl25 yhl25 marked this pull request as ready for review July 22, 2024 14:11
src/sink.rs Outdated
match next_message {
Ok(Some(message)) => {
// If sending fails, it means the receiver is dropped, and we should stop the task.
if tx.send(message.into()).await.is_err() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please log this error

src/sink.rs Outdated
}
},
// If there's an error or the stream ends, break the loop to stop the task.
Ok(None) | Err(_) => break,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log error

src/sink.rs Outdated
Comment on lines 210 to 212
_ = writer_cln_token.cancelled() => {
break;
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not require this

src/sink.rs Outdated

// write to the user-defined channel
// spawn a task to read messages from the stream and send them to the user's sink handle
tokio::spawn(async move {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get the handle and check explicitly for status or abort of cancellation token has been invoked.

Copy link
Member

@vigith vigith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please update

numaflow-rs/src/lib.rs

Lines 35 to 71 in fb73fca

// Error handling on Numaflow SDKs!
//
// Any non-recoverable error will cause the process to shutdown with a non-zero exit status. All errors are non-recoverable.
// If there are errors that are retriable, we (gRPC or Numaflow SDK) would have already retried it (hence not an error), that means,
// all errors raised by the SDK are non-recoverable.
//
// Task Ordering and error propagation.
//
// level-1 level-2 level-3
//
// +---> (service_fn) ->
// |
// |
// | +---> (task)
// | |
// | |
// (gRPC Service) ---+---> (service_fn) ---+---> (task)
// ^ | |
// | | |
// | | +---> (task)
// | |
// (shutdown) |
// | +---> (service_fn) ->
// |
// |
// (user)
//
// If a task at level-3 has an error, then that error will be propagated to level-2 (service_fn) via an mpsc::channel using the response channel.
// The Response channel passes a Result type and by returning Err() in response channel, it notifies top service_fn that the task wants to abort itself.
// service_fn (level-2) will now use another mpsc::channel to tell the gRPC service to cancel all the service_fns. gRPC service will
// will ask all the level-2 service_fns to abort using the CancellationToken. service_fn will call abort on all the tasks it created using internal
// mpsc::channel when CancellationToken has been dropped/cancelled.
//
// User can directly send shutdown request to the gRPC server which inturn cancels the CancellationToken.
//
// The above 3 level task ordering is only for complex cases like reduce, but for simpler endpoints like `map`, it only has 2 levels but
// the error propagation is handled the same way.

yhl25 added 3 commits July 23, 2024 13:57
Signed-off-by: Yashash H L <[email protected]>
Signed-off-by: Yashash H L <[email protected]>
@yhl25 yhl25 merged commit f02acf9 into main Jul 29, 2024
2 checks passed
@yhl25 yhl25 deleted the panic branch July 29, 2024 06:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants