-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Support On-Demand Repartition #14411
base: main
Are you sure you want to change the base?
Conversation
54db067
to
6ffe62c
Compare
@Weijun-H has been working on this with the Synnada team for a while. The initial benchmark results were promising, so we decided to continue development while receiving community feedback 🚀 |
This is still in somewhat early stages, and there is work to do. But it might be good to get feedback early on from the community as the performance of this code is somewhat sensitive to idioms used with channels etc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the work! Just put some comments
datafusion/physical-plan/src/repartition/on_demand_repartition.rs
Outdated
Show resolved
Hide resolved
datafusion/physical-plan/src/repartition/on_demand_repartition.rs
Outdated
Show resolved
Hide resolved
datafusion/physical-plan/src/repartition/on_demand_repartition.rs
Outdated
Show resolved
Hide resolved
datafusion/physical-plan/src/repartition/on_demand_repartition.rs
Outdated
Show resolved
Hide resolved
datafusion/physical-plan/src/repartition/on_demand_repartition.rs
Outdated
Show resolved
Hide resolved
datafusion/physical-plan/src/repartition/on_demand_repartition.rs
Outdated
Show resolved
Hide resolved
69a3c4f
to
f6934d1
Compare
Maybe I am missing something, but the benchmark numbers reported above don't really show much of an improvement For example, this branch appears to be basically the same
Are there any benchmarks that show a performance benefit of all this new code? |
@Weijun-H did some benchmarks a while back and the approach seemed promising in TPCH/SF50. @mertak-synnada will do a detailed review of this tomorrow and then @Weijun-H can run the latest benchmarks for us to see how the performance changes |
this might be a silly question but, did you set the config flag for |
…obin RepartitionExec
fa91ea3
to
beacced
Compare
I updated the latest benchmark results. It appears that the |
62726ec
to
3d4ee00
Compare
2ac6849
to
df119c3
Compare
Which issue does this PR close?
Closes #14287
Rationale for this change
prefer_round_robin_repartititon
in optimizer config, when it is false, replace allRoundRobinBatch
toOnDemandRepartition
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?
Benchmark