Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Incorrect SUM calculation in CTE with correlated subquery #8724

Closed
samjewell opened this issue Jan 9, 2025 · 3 comments · Fixed by dolthub/go-mysql-server#2812
Closed
Labels
analyzer bug Something isn't working customer issue good repro Easily reproducible bugs sql Issue with SQL

Comments

@samjewell
Copy link

samjewell commented Jan 9, 2025

I got some help from an LLM to write this description! So forgive me if it's a little verbose :-)

Description

I've discovered a bug in the go-mysql-server implementation where a correlated subquery within a CTE produces incorrect results for the SUM function.

Steps to Reproduce

  1. Set up a table and insert some data:
CREATE TABLE Product (
  Name VARCHAR(100),
  Price INT
);

INSERT INTO Product(Name, Price) VALUES ('Janos', 45);
INSERT INTO Product(Name, Price) VALUES ('Sam', 25);
  1. Execute the following query:
WITH
  cte AS (
    SELECT * FROM Product
  )
  
SELECT
  *,
  (SELECT SUM(Price) FROM cte) AS PriceTotal
FROM cte;

Expected Behavior

The query should return two rows, each with a PriceTotal of 70 (45 + 25).

This SQL Fiddle shows the expected behaviour: https://sqlfiddle.com/mysql/online-compiler?id=3885380d-b06e-4334-9a5f-d528dbbaac33

Actual Behavior

The query returns two rows, but each has a PriceTotal of 90. It appears that the SUM function is incorrectly adding the price of the first row twice (45 + 45) instead of summing all rows correctly.

Environment

  • go-mysql-server versions: v0.18.1 and v0.19.0 both have this issue

Additional Context

I've verified that this query produces the correct results (PriceTotal of 70) when run against a standard MySQL server (see the SQL Fiddle linked above). The issue seems to be specific to the go-mysql-server implementation.

I've also verified that the issue only occurs once the CTE is introduced. Without the CTE the Subquery works fine.

Possible Cause

There might be an issue with how correlated subqueries are handled within CTEs, specifically when aggregating functions like SUM are involved.

Suggested Fix

The implementation of correlated subqueries within CTEs may need to be reviewed, particularly focusing on how aggregations are computed in this context.

@max-hoffman
Copy link
Contributor

This is a naming edge case where aren't differentiating between two cte.price variable definitions, and are defaulting to the outer scope reference. Minimal change that shows this:

SELECT
  *,
  (SELECT SUM(a.Price) FROM cte a) AS PriceTotal
FROM cte;
+-------+-------+------------+
| Name  | Price | PriceTotal |
+-------+-------+------------+
| Sam   | 25    | 70         |
| Janos | 45    | 70         |
+-------+-------+------------+

I'd recommend window functions as the most natural way to express the original query pattern:

SELECT name, price, sum(price) over (range between unbounded preceding and unbounded following) as PriceTotal FROM product;
+-------+-------+------------+
| name  | price | PriceTotal |
+-------+-------+------------+
| Sam   | 25    | 70         |
| Janos | 45    | 70         |
+-------+-------+------------+

@max-hoffman max-hoffman transferred this issue from dolthub/go-mysql-server Jan 9, 2025
@max-hoffman max-hoffman added bug Something isn't working sql Issue with SQL analyzer good repro Easily reproducible bugs labels Jan 9, 2025
@samjewell
Copy link
Author

samjewell commented Jan 10, 2025

Thanks @max-hoffman for your speedy reply, and workaround suggestions. The first workaround looks great.

I'd recommend window functions as the most natural way to express the original query pattern:

I had a GROUP BY which prevented me from using a window function unfortunately. So I introduced an additional CTE and then a CROSS JOIN in the end, to get what I wanted.

@bpf120
Copy link

bpf120 commented Jan 10, 2025

@samjewell , thanks for filing this issue and using Dolt. We'd love to learn more about your use case. You can email me or swing by our Discord if you'd like to share.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
analyzer bug Something isn't working customer issue good repro Easily reproducible bugs sql Issue with SQL
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants