Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: fix compatibility issue of df.sort_index() and df.groupby(sort=True) #776

Merged
merged 5 commits into from
Jun 16, 2024

Conversation

luweizheng
Copy link
Collaborator

@luweizheng luweizheng commented Jun 6, 2024

What do these changes do?

sort_index

Xorbits' DataFrame sort_index is inconsistent with pandas when axis=1 and ignore_index=True.
pandas does it by sorting the index as usual and replacing index (when axis=0) or columns (when axis=1) with RangeIndex.
Here is the reference code.

So here we follow pandas and replace index or columns when ignore_index=True is passed.

groupby(sort=True).agg()

python/xorbits/_mars/dataframe/groupby/sort.py:

out_df = in_df.loc[pivots[p_index - 1] :].drop(
                    index=pivots[p_index - 1], errors="ignore"
                )

In the latest version, during the map phase of DataFrameGroupbySortShuffle, if in_df does not contain the index from pivots, an empty out_df is returned.

Related issue number

Check code requirements

  • tests added / passed (if needed)
  • Ensure all linting tests pass

@XprobeBot XprobeBot added the bug Something isn't working label Jun 6, 2024
@XprobeBot XprobeBot added this to the v0.7.3 milestone Jun 6, 2024
@luweizheng luweizheng changed the title BUG: fix df sort_index with axis=1 & ignore_index=True BUG: fix compatibility issue of df.sort_index() and df.groupby(sort=True) Jun 14, 2024
@qinxuye qinxuye merged commit 79c3bc8 into xorbitsai:main Jun 16, 2024
15 of 26 checks passed
@luweizheng luweizheng deleted the ci branch June 16, 2024 06:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants