Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GH-3146] Optimize the binaryToDecimal function #3147

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

qian0817
Copy link

@qian0817 qian0817 commented Feb 6, 2025

Rationale for this change

#3146
If precision is less than 18, the condition unscaledNew <= -pow(10, 18) || unscaledNew >= pow(10, 18) can not be true, so we can remove the judgment logic here. Additionally, using BigDecimal.valueOf(unscaledNew, scale) is preferable over using BigDecimal.valueOf(unscaledNew / pow(10, scale)), as it does not convert the unscaled value to double.

What changes are included in this PR?

Optimize the binaryToDecimal function

Are these changes tested?

pass unit test.

Are there any user-facing changes?

No

Closes #3146

Copy link
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parquet-pig has been discussed to be removed: https://lists.apache.org/thread/vh1twzdbvm4fr4sl2wt8swqgq92k8369

Is it actually used in your case @qian0817?

cc @Fokko

@@ -60,12 +60,12 @@ public void testBinaryToDecimal() throws Exception {
// Test LONG
testDecimalConversion(Long.MAX_VALUE, 19, 0, "9223372036854775807");
testDecimalConversion(Long.MIN_VALUE, 19, 0, "-9223372036854775808");
testDecimalConversion(0L, 0, 0, "0.0");
testDecimalConversion(0L, 0, 0, "0");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do these two lines need change?

Copy link
Author

@qian0817 qian0817 Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this use case, 0 is the correct value. Using double to construct BigDecimal previously may lead to potential incorrect behavior.

@qian0817
Copy link
Author

qian0817 commented Feb 7, 2025

parquet-pig has been discussed to be removed: https://lists.apache.org/thread/vh1twzdbvm4fr4sl2wt8swqgq92k8369

Is it actually used in your case @qian0817?

cc @Fokko

I did not directly use the parquet-pig module; while writing my own parquet converter, I referenced some code from parquet-pig and then discovered the optimization points here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize the binaryToDecimal function in the DecimalUtils class
2 participants