Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added ordering for dplyr locale issue. #27

Merged
merged 2 commits into from
Nov 1, 2024
Merged

Added ordering for dplyr locale issue. #27

merged 2 commits into from
Nov 1, 2024

Conversation

mbannick
Copy link
Owner

@mbannick mbannick commented Nov 1, 2024

In version 1.1.0 of dplyr, they switched to ordering group_by by the system locale (e.g., "English") to "C". This means that something like c("a", "b", "A", "B") would be reordered to c("A", "B", "a", "b"). This caused issues internally and produced incorrect results, when someone did not use a constant case (like "a" or "A") in their treatment variable naming.

The implemented fix uses the already factored variable, rather than a version of the variable that's not a factor. So group_by does not have to rely on a locale to figure out the ordering.

The tests for this have to intentionally set the locale to be English. This is because the C locale is automatically set for the testthat suite, meaning the test could pass, but the undesirable behavior could still occur when it's run outside of the testing suite.

@mbannick mbannick requested a review from tye27 November 1, 2024 22:18
@mbannick mbannick merged commit 112f2b7 into main Nov 1, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants