feat: SQL Inner Join Operation #394

Open
Lengxiaoyi opened this issue Nov 26, 2024 · 2 comments
Labels: enhancement (New feature or request)

Comments

@Lengxiaoyi

Hi, I came across zkSQL while reading the paper recently and re-ran TPC-H using the code from the paper. I found that performing a join operation, which the paper does not mention, leads to memory overflow. I suggest that, once the SQL inner join operator is implemented, this project measure the program's peak memory usage at run time. That would be more convincing evidence that the chosen approach is sound. Good luck!

@Lengxiaoyi Lengxiaoyi added the enhancement New feature or request label Nov 26, 2024
@JayWhite2357 JayWhite2357 changed the title Testing the join operator feat: SQL Inner Join Operation Nov 26, 2024
@JayWhite2357
Contributor

Hey @Lengxiaoyi. Joins are the next major feature that we are looking to add. @iajoiner is working on them.

@Lengxiaoyi
Author

Thank you very much for your reply. Am I right in thinking that the main optimization challenge after implementing the join operator is that the Cartesian product generated by the join is too large, which makes the proof size too large? It seems that using VOLE-in-the-head could reduce memory usage, since only one calculation is needed. Of course, Proof of SQL will probably end up with a better implementation. I look forward to the project's continued updates; I can learn a lot from Proof of SQL. Thank you~

Suggestion:
Once the join algorithm has been added to the query plan, measure the peak memory usage and the proof size when generating the corresponding proof.
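
One possible way to take the peak-memory measurement suggested above, sketched in plain Rust. This is not part of Proof of SQL: `PeakAlloc`, `measure_peak`, and the statics are hypothetical names introduced for illustration, and the closure passed to `measure_peak` stands in for whatever call generates the join proof. The sketch wraps the system allocator and records the high-water mark of live heap bytes, so it counts heap allocations across all threads of the process and nothing else.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper around the system allocator that tracks heap usage.
pub struct PeakAlloc;

/// Bytes currently allocated through `PeakAlloc`.
static CURRENT: AtomicUsize = AtomicUsize::new(0);
/// Highest value `CURRENT` has reached since the last reset.
static PEAK: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for PeakAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            let now = CURRENT.fetch_add(layout.size(), Ordering::Relaxed) + layout.size();
            PEAK.fetch_max(now, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        CURRENT.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

#[global_allocator]
static GLOBAL: PeakAlloc = PeakAlloc;

/// Runs `f` and returns its result together with the peak number of heap
/// bytes allocated above the level that was live when `f` started.
pub fn measure_peak<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let before = CURRENT.load(Ordering::Relaxed);
    PEAK.store(before, Ordering::Relaxed);
    let out = f();
    let peak = PEAK.load(Ordering::Relaxed);
    (out, peak.saturating_sub(before))
}
```

Wrapping the proof-generation call in `measure_peak` would give one number per query, which could be reported next to the serialized proof size.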

@iajoiner iajoiner mentioned this issue Dec 13, 2024
5 tasks
iajoiner added a commit that referenced this issue Dec 18, 2024
Please be sure to look over the pull request guidelines here:
https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here:
https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md.
In particular `!` is used if and only if at least one breaking change
has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.
- The following upstream PRs have been approved and merged:
  - [x] #391 
  - [x] #396


# Rationale for this change
This PR adds the actual sort-merge join process which completes a part
of #394.

# What changes are included in this PR?
- add `sort_merge_join`.
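
For readers unfamiliar with the technique, here is a minimal sketch of a classic (non-provable) sort-merge inner join in Rust. It is not the code from this PR: `sort_merge_inner_join` and the `(key, payload)` row layout are assumptions made for illustration, and the proof machinery is left out entirely. Both inputs are sorted by key, then merged with two cursors; each run of equal keys on the left is paired with the matching run on the right.

```rust
/// Hypothetical helper: inner-joins two tables of `(key, payload)` rows
/// on equal keys using the sort-merge strategy.
pub fn sort_merge_inner_join<L: Clone, R: Clone>(
    mut left: Vec<(i64, L)>,
    mut right: Vec<(i64, R)>,
) -> Vec<(i64, L, R)> {
    // Sort phase: order both sides by the join key (stable sort).
    left.sort_by_key(|row| row.0);
    right.sort_by_key(|row| row.0);

    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();

    // Merge phase: advance whichever cursor points at the smaller key.
    while i < left.len() && j < right.len() {
        let (lk, rk) = (left[i].0, right[j].0);
        if lk < rk {
            i += 1;
        } else if lk > rk {
            j += 1;
        } else {
            // Find the run of rows sharing this key on each side.
            let i_end = i + left[i..].iter().take_while(|row| row.0 == lk).count();
            let j_end = j + right[j..].iter().take_while(|row| row.0 == lk).count();
            // Emit every left/right pairing within the two runs.
            for l in &left[i..i_end] {
                for r in &right[j..j_end] {
                    out.push((lk, l.1.clone(), r.1.clone()));
                }
            }
            i = i_end;
            j = j_end;
        }
    }
    out
}
```

Only rows with matching keys are paired, so the output has one row per matching pair rather than one per element of the full Cartesian product. A key that appears m times on the left and n times on the right still contributes m × n output rows.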

# Are these changes tested?
Yes.