Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aggregator setup pipelining #487

Draft
wants to merge 3 commits into
base: dev
Choose a base branch
from
Draft

Aggregator setup pipelining #487

wants to merge 3 commits into from

Conversation

doscortados
Copy link

Closes #460

@doscortados doscortados requested a review from a team January 28, 2025 14:39
@doscortados doscortados changed the base branch from main to dev January 28, 2025 14:40
@ozankaymak ozankaymak requested review from mmtftr, ekrembal, ceyhunsen and ozankaymak and removed request for a team January 28, 2025 14:42
Copy link

@mmtftr mmtftr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the changes.

I believe the original issue also wants the try_join_alls to happen in background tasks so that they're happening in parallel (you can check aggregator for details)

We're also going to be working on moving the business logic out of the gRPC handlers, and moving parsing/validation logic out into trait impls so that they're more easily tested and the code has less clutter.

@@ -284,117 +286,122 @@ impl Aggregator {
}
}

async fn collect<C, T, E, Fut, F>(clients: &[C], mut f: F) -> Result<Vec<T>, E>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, these will reduce boilerplate by a lot.

I think the naming needs to change though, this seems like a utility function that calls all clients with a given func and returns the inner response. Something like: map_clients_collect?

Also maybe #[inline] and/or #[track_caller] are appropriate here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this definitely needs refining, thank you for pointing it out. I didn't find an elegant way to generalize it over streaming (returning Streaming<T>) and regular/non-streaming (returning Response<T>) requests 🤷 .

.await
}

async fn pull<T>(mut stream: Streaming<T>) -> Result<Vec<T>, Status> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need docs on the behavior. This currently collects until end of stream unless an error happens, in which case we prematurely return the error. I think collect fits better here. This is also the behavior of collect when doing a Iter<Item = Result<T, E>> -> Result<Vec<T>, E>

Also can we move these two into another module?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, this was just a first attempt to generalize stream collection, probably just .collect() on a stream would have been enough and this function is unnecessary.

@doscortados
Copy link
Author

Thank you for the comments @mmtftr , and sorry for my late reply!
This is rather initial version and significant re-work will follow in further commits.

At this point I see that mpsc channel isn't working out for multiplexing N-over-M streams, and I am trying out mpmc one to avoid pulling all the params in memory before streaming it to consumers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pipeline for aggregator setup
2 participants