[GH-3146] Optimize the binaryToDecimal function #3147

qian0817 · 2025-02-06T07:22:24Z

Rationale for this change

#3146
If precision is less than 18, the condition unscaledNew <= -pow(10, 18) || unscaledNew >= pow(10, 18) can never be true, so the check can be removed. In addition, BigDecimal.valueOf(unscaledNew, scale) is preferable to BigDecimal.valueOf(unscaledNew / pow(10, scale)) because it does not convert the unscaled value to a double.
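To make the difference between the two construction paths concrete, here is a minimal sketch. It is not the code from this PR: the class name, method names, and the sample value are hypothetical, and Math.pow stands in for the pow helper mentioned above.

```java
import java.math.BigDecimal;

// Hypothetical illustration class; not part of parquet-pig.
final class UnscaledToDecimalSketch {

    // Double path: the division runs in double arithmetic, so digits beyond
    // roughly 17 significant figures can be lost before BigDecimal sees them.
    static BigDecimal viaDouble(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled / Math.pow(10, scale));
    }

    // Unscaled-long path: the long is kept exactly and only the scale is attached.
    static BigDecimal viaUnscaledLong(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    public static void main(String[] args) {
        long unscaled = 123456789012345678L; // assumed 18-digit sample value
        int scale = 2;

        System.out.println(viaUnscaledLong(unscaled, scale)); // 1234567890123456.78
        // The double path cannot represent all 18 digits, so its result differs.
        System.out.println(viaDouble(unscaled, scale));
    }
}
```

For an 18-digit unscaled value, the double path cannot reproduce the exact decimal, while the unscaled-long path keeps every digit.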

What changes are included in this PR?

Optimize the binaryToDecimal function

Are these changes tested?

Yes. The unit tests pass.

Are there any user-facing changes?

No

Closes #3146

wgtmac

parquet-pig has been discussed to be removed: https://lists.apache.org/thread/vh1twzdbvm4fr4sl2wt8swqgq92k8369

Is it actually used in your case @qian0817?

cc @Fokko

wgtmac · 2025-02-07T06:06:33Z

parquet-pig/src/test/java/org/apache/parquet/pig/TestDecimalUtils.java

@@ -60,12 +60,12 @@ public void testBinaryToDecimal() throws Exception {
    // Test LONG
    testDecimalConversion(Long.MAX_VALUE, 19, 0, "9223372036854775807");
    testDecimalConversion(Long.MIN_VALUE, 19, 0, "-9223372036854775808");
-    testDecimalConversion(0L, 0, 0, "0.0");
+    testDecimalConversion(0L, 0, 0, "0");


Why do these two lines need to change?

For this case, 0 is the correct value. The previous code built the BigDecimal from a double, which could lead to incorrect results.
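The change in the expected string can be reproduced in isolation. The sketch below is a hypothetical standalone example rather than the test helper itself, and it assumes the old code path amounts to BigDecimal.valueOf(double) while the new one amounts to BigDecimal.valueOf(long, int).

```java
import java.math.BigDecimal;

// Hypothetical illustration class; not part of parquet-pig.
final class ZeroDecimalSketch {

    // Double path: 0 / 10^0 is the double 0.0, whose canonical string is "0.0",
    // so the resulting BigDecimal carries a scale of 1.
    static BigDecimal zeroViaDouble() {
        return BigDecimal.valueOf(0L / Math.pow(10, 0));
    }

    // Unscaled-long path: unscaled value 0 with the declared scale 0.
    static BigDecimal zeroViaLong() {
        return BigDecimal.valueOf(0L, 0);
    }

    public static void main(String[] args) {
        BigDecimal fromDouble = zeroViaDouble();
        BigDecimal fromLong = zeroViaLong();

        System.out.println(fromDouble);                          // 0.0
        System.out.println(fromLong);                            // 0
        System.out.println(fromDouble.compareTo(fromLong) == 0); // true: same numeric value
        System.out.println(fromDouble.equals(fromLong));         // false: scales differ
    }
}
```

The two values are numerically equal but carry different scales, which is why a string-based assertion sees "0.0" on one path and "0" on the other.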

qian0817 · 2025-02-07T06:40:56Z

> parquet-pig has been discussed to be removed: https://lists.apache.org/thread/vh1twzdbvm4fr4sl2wt8swqgq92k8369
>
> Is it actually used in your case @qian0817?
>
> cc @Fokko

I do not use the parquet-pig module directly. While writing my own Parquet converter, I referenced some code from parquet-pig and noticed these possible optimizations.

optimizing the binaryToDecimal function

c1b3473

wgtmac reviewed Feb 7, 2025
