Truncate MCVs #8041

Merged: 9 commits, Jun 21, 2024

@max-hoffman max-hoffman commented Jun 20, 2024

Sort and truncate MCVs (most common values). Only keep values whose frequency is more than twice the uniform frequency. This avoids manually summing non-outliers, which is expensive.
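A minimal sketch of the truncation rule described above, not the code from this PR. The `mcv` type, the `truncateMCVs` function, and the definition of uniform frequency as row count divided by distinct count are assumptions made for illustration. The sketch follows the description's strict `>` comparison, whereas the diff excerpt quoted in the review compares with `>=`.

```go
package main

import (
	"fmt"
	"sort"
)

// mcv is a hypothetical (value, count) pair; the real type in the PR may differ.
type mcv struct {
	val string
	cnt int
}

// truncateMCVs sorts candidates by descending count and keeps only those whose
// count is strictly greater than twice the uniform frequency, which is assumed
// here to be rowCount / distinctCount.
func truncateMCVs(cands []mcv, rowCount, distinctCount int) []mcv {
	if distinctCount == 0 {
		return nil
	}
	cutoff := 2 * float64(rowCount) / float64(distinctCount)

	// Copy so the caller's slice is not reordered.
	sorted := append([]mcv(nil), cands...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].cnt > sorted[j].cnt })

	// Counts are descending, so "cnt <= cutoff" flips from false to true once;
	// sort.Search returns the first index where it is true.
	end := sort.Search(len(sorted), func(i int) bool {
		return float64(sorted[i].cnt) <= cutoff
	})
	return sorted[:end]
}

func main() {
	cands := []mcv{{"a", 3}, {"b", 50}, {"c", 2}, {"d", 30}}
	// 100 rows over 10 distinct values: uniform frequency 10, cutoff 20.
	fmt.Println(truncateMCVs(cands, 100, 10)) // prints [{b 50} {d 30}]
}
```

With the cutoff applied up front, only the outliers need to be tracked individually; everything else can be estimated from the uniform frequency.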

@max-hoffman
Contributor Author

#benchmark

@coffeegoddd
Contributor

@max-hoffman DOLT

| comparing_percentages |
|---|
| 100.000000 to 100.000000 |

| version | result | total |
|---|---|---|
| 0bc232f | ok | 5937457 |

| version | total_tests |
|---|---|
| 0bc232f | 5937457 |

| correctness_percentage |
|---|
| 100.0 |

@coffeegoddd
Contributor

@max-hoffman DOLT

| test_name | from_latency_p95 | to_latency_p95 | is_faster |
|---|---|---|---|
| tpcc-scale-factor-1 | 74.46 | 74.46 | 0 |

| test_name | server_name | server_version | tps | test_name | server_name | server_version | tps | is_faster |
|---|---|---|---|---|---|---|---|---|
| tpcc-scale-factor-1 | dolt | b4dc473 | 33.54 | tpcc-scale-factor-1 | dolt | 0bc232f | 32.85 | 0 |

@coffeegoddd
Contributor

@coffeegoddd DOLT

| comparing_percentages |
|---|
| 100.000000 to 100.000000 |

| version | result | total |
|---|---|---|
| 39bcc83 | ok | 5937457 |

| version | total_tests |
|---|---|
| 39bcc83 | 5937457 |

| correctness_percentage |
|---|
| 100.0 |

@coffeegoddd
Contributor

@max-hoffman DOLT

| read_tests | from_latency_median | to_latency_median | is_faster |
|---|---|---|---|
| covering_index_scan | 2.81 | 2.81 | 0 |
| groupby_scan | 17.01 | 17.32 | 0 |
| index_join | 5.28 | 5.28 | 0 |
| index_join_scan | 3.25 | 2.52 | 1 |
| index_scan | 53.85 | 53.85 | 0 |
| oltp_point_select | 0.46 | 0.46 | 0 |
| oltp_read_only | 7.56 | 7.56 | 0 |
| select_random_points | 0.75 | 0.75 | 0 |
| select_random_ranges | 0.9 | 0.9 | 0 |
| table_scan | 54.83 | 54.83 | 0 |
| types_table_scan | 139.85 | 139.85 | 0 |

| write_tests | from_latency_median | to_latency_median | is_faster |
|---|---|---|---|
| oltp_delete_insert | 6.09 | 6.09 | 0 |
| oltp_insert | 3.02 | 3.02 | 0 |
| oltp_read_write | 13.95 | 13.95 | 0 |
| oltp_update_index | 3.07 | 3.07 | 0 |
| oltp_update_non_index | 3.02 | 3.02 | 0 |
| oltp_write_only | 6.43 | 6.43 | 0 |
| types_delete_insert | 6.67 | 6.67 | 0 |

@max-hoffman max-hoffman changed the title from "[no-release-notes] test smaller set of mcv changes" to "Truncate MCVs" Jun 21, 2024
@max-hoffman max-hoffman requested a review from nicktobey June 21, 2024 16:58
@coffeegoddd
Contributor

@max-hoffman DOLT

| comparing_percentages |
|---|
| 100.000000 to 100.000000 |

| version | result | total |
|---|---|---|
| c9875b3 | ok | 5937457 |

| version | total_tests |
|---|---|
| c9875b3 | 5937457 |

| correctness_percentage |
|---|
| 100.0 |

@coffeegoddd
Contributor

@max-hoffman DOLT

| comparing_percentages |
|---|
| 100.000000 to 100.000000 |

| version | result | total |
|---|---|---|
| ba8e7f4 | ok | 5937457 |

| version | total_tests |
|---|---|
| ba8e7f4 | 5937457 |

| correctness_percentage |
|---|
| 100.0 |

@jycor jycor left a comment

LGTM

@max-hoffman max-hoffman merged commit 082a398 into main Jun 21, 2024
21 checks passed
@max-hoffman max-hoffman deleted the max/mcv-perf-2 branch June 21, 2024 21:46

@nicktobey nicktobey left a comment

A bunch of tests now return no MCV information. I'd make sure that we're not losing any test coverage as a result. Otherwise, this looks good.

	start := m.Len()
	for i, v := range *m {
		if float64(v.cnt) >= cutoff {
			start = i

This should break out of the loop here, right?

Contributor Author

Follow-up bug fixes are in #8049.
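To illustrate why the missing `break` matters, here is a hedged sketch that contrasts the two loop shapes. It assumes the entries are ordered by ascending count, which the excerpt does not confirm, and the `entry` type and both function names are hypothetical. The actual fix is in #8049 and may differ.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for the element type iterated in the excerpt.
type entry struct {
	cnt int
}

// firstAtOrAbove returns the first index whose count reaches the cutoff,
// or len(m) if none does. It stops at the first match.
func firstAtOrAbove(m []entry, cutoff float64) int {
	start := len(m)
	for i, v := range m {
		if float64(v.cnt) >= cutoff {
			start = i
			break
		}
	}
	return start
}

// lastAtOrAbove mirrors the excerpt's loop without a break: every later match
// overwrites start, so the result is the last matching index.
func lastAtOrAbove(m []entry, cutoff float64) int {
	start := len(m)
	for i, v := range m {
		if float64(v.cnt) >= cutoff {
			start = i
		}
	}
	return start
}

func main() {
	// Assumed ascending order by count.
	m := []entry{{1}, {2}, {20}, {30}, {50}}
	fmt.Println(firstAtOrAbove(m, 20), lastAtOrAbove(m, 20)) // prints 2 4
}
```

Under the ascending-order assumption, slicing from the last matching index would keep only the single largest value instead of every value at or above the cutoff.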

			BoundVal: sql.Row{"i", int64(1)},
			BoundCnt: 2,
		}},
	},
	{
		name: "mcvs",

Let's also add a test with multiple MCVs.


# setting variables doesn't hang or error
dolt sql -q "SET @@persist.dolt_stats_auto_refresh_enabled = 1;"

Why were these deleted?

Contributor Author

Replaced with the analyze below. We already test background loading, so I figured this was overkill.
