Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track in/out pages in exchange #120867

Merged
merged 1 commit into from
Jan 27, 2025
Merged

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Jan 26, 2025

This is a spin-off of the "retry node requests on shard-level failures" work.

Currently, a driver can execute against multiple shards simultaneously. If the execution fails and no pages are added to the sink, we can retry the failed shards on another node. In another scenario, if no pages are fetched or added to the exchange source and the entire data node request fails, we can also retry the entire request. This change adds callbacks to RemoteSink and ExchangeSink, allowing for tracking of in/out pages.

@dnhatn dnhatn added the auto-backport Automatically create backport pull requests when merged label Jan 26, 2025
@dnhatn dnhatn marked this pull request as ready for review January 26, 2025 15:57
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jan 26, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

exchangeSource,
exchangeSink
exchangeSource::createExchangeSource,
() -> exchangeSink.createExchangeSink(() -> {})
Copy link
Contributor

@idegtiarenko idegtiarenko Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems all prod implementations are passing a noop runnable. Could you please point me to the actual usage? Or is it going to be added in a later prs?

Copy link
Member Author

@dnhatn dnhatn Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its usage will be added later in "retry node requests on shard-level failures" work.

@quux00
Copy link
Contributor

quux00 commented Jan 27, 2025

For the use case cited, this looks fine, as far as I can tell, as I'm not exactly sure how you'll use this in the next PR. But are there other use cases for this callback? If so, should we use something other than Runnable, in favor of something that returns metadata about the block? For example, would it be useful know things like: 1) which node or cluster the block came from; 2) whether this is the last block and no other blocks will be coming. Those could be useful for metadata accounting, especially around CCS work or maybe incremental results work that is planned for later this yar.

@dnhatn
Copy link
Member Author

dnhatn commented Jan 27, 2025

@quux00 For my use case, the callback updates a shared atomic boolean, while others might need a page count. Therefore, I chose to pass a Runnable to allow callers to manage their metadata externally.

@dnhatn dnhatn requested a review from idegtiarenko January 27, 2025 16:43
@dnhatn
Copy link
Member Author

dnhatn commented Jan 27, 2025

Thanks everyone!

@dnhatn dnhatn merged commit c971460 into elastic:main Jan 27, 2025
16 checks passed
@dnhatn dnhatn deleted the exchange-tracking-pages branch January 27, 2025 17:29
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jan 27, 2025
This is a spin-off of the "retry node requests on shard-level failures" work.

Currently, a driver can execute against multiple shards simultaneously. 
If the execution fails and no pages are added to the sink, we can retry
the failed shards on another node. In another scenario, if no pages are
fetched or added to the exchange source and the entire data node request
fails, we can also retry the entire request. This change adds callbacks
to RemoteSink and ExchangeSink, allowing for tracking of in/out pages.
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.x

elasticsearchmachine pushed a commit that referenced this pull request Jan 29, 2025
This is a spin-off of the "retry node requests on shard-level failures" work.

Currently, a driver can execute against multiple shards simultaneously. 
If the execution fails and no pages are added to the sink, we can retry
the failed shards on another node. In another scenario, if no pages are
fetched or added to the exchange source and the entire data node request
fails, we can also retry the entire request. This change adds callbacks
to RemoteSink and ExchangeSink, allowing for tracking of in/out pages.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants