BUG: column slicing does not work with period[h] dtype #60273

Open
3 tasks done
oxygenbilly opened this issue Nov 10, 2024 · 7 comments
Labels
Bug · Indexing (Related to indexing on series/frames, not to indexes themselves) · Period (Period data type)

Comments

@oxygenbilly

oxygenbilly commented Nov 10, 2024

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

pr = pd.period_range('2024-01-01 00:00:00', '2024-01-01 02:00:00', freq='h')
df = pd.DataFrame(index=pr)
df['date'] = df.index.to_timestamp().floor('D')
df['hour'] = df.index.hour
df.index.name = 'value'
df = df.reset_index()
df = df.pivot(index='date', columns='hour', values='value')

print(df)
# hour                       0                 1                 2
# date                                                            
# 2024-01-01  2024-01-01 00:00  2024-01-01 01:00  2024-01-01 02:00

print(df[[0,1,2]])
# hour                       0                 1                 2
# date                                                            
# 2024-01-01  2024-01-01 00:00  2024-01-01 00:00  2024-01-01 00:00

Issue Description

When the column dtype is period[h], selecting columns with a list (df[[0, 1, 2]]) does not produce the correct result.
If the dtype is changed to object, the result is correct.
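
A minimal sketch of the object-dtype workaround described above. The single-row frame is a simplified stand-in for the pivoted frame in the example, and the variable names are chosen only for illustration:

```python
import pandas as pd

periods = [
    pd.Period("2024-01-01 00:00", freq="h"),
    pd.Period("2024-01-01 01:00", freq="h"),
    pd.Period("2024-01-01 02:00", freq="h"),
]
# One row, three columns holding hourly periods (same shape as the pivot result).
df = pd.DataFrame(data=periods).T

# Workaround: cast to object before selecting columns with a list.
selected = df.astype(object)[[0, 1, 2]]
print(selected)
```

The selected frame has object dtype, so this is a stopgap rather than a fix: any later step that relies on the period dtype would need the columns converted back.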

Expected Behavior

The expected behavior is that df[[0, 1, 2]] returns the columns for hours 0, 1, and 2, each with its own values.
Instead, the result above repeats the values of column 0 in all three columns.

Installed Versions

INSTALLED VERSIONS

commit : d9cdd2e
python : 3.10.12.final.0
python-bits : 64
OS : Linux
OS-release : 6.1.85+
Version : #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.2.2
numpy : 1.26.4
pytz : 2024.2
dateutil : 2.8.2
setuptools : 75.1.0
pip : 24.1.2
Cython : 3.0.11
pytest : 7.4.4
hypothesis : None
sphinx : 5.0.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 5.3.0
html5lib : 1.1
pymysql : None
psycopg2 : 2.9.10
jinja2 : 3.1.4
IPython : 7.34.0
pandas_datareader : 0.10.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : 1.4.2
dataframe-api-compat : None
fastparquet : None
fsspec : 2024.10.0
gcsfs : 2024.10.0
matplotlib : 3.8.0
numba : 0.60.0
numexpr : 2.10.1
odfpy : None
openpyxl : 3.1.5
pandas_gbq : 0.24.0
pyarrow : 17.0.0
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.13.1
sqlalchemy : 2.0.36
tables : 3.8.0
tabulate : 0.9.0
xarray : 2024.10.0
xlrd : 2.0.1
zstandard : None
tzdata : 2024.2
qtpy : None
pyqt5 : None

@oxygenbilly added the Bug and Needs Triage (Issue that has not been reviewed by a pandas team member) labels Nov 10, 2024
@rhshadrach
Member

Thanks for the report! Confirmed on main. Further investigations and PRs to fix are welcome!

@rhshadrach added the Indexing (Related to indexing on series/frames, not to indexes themselves) and Period (Period data type) labels and removed the Needs Triage label Nov 10, 2024
@DhruvBShetty
Contributor

Hello, @oxygenbilly, I would like to investigate this issue, if you aren't working on it.

@veronicabenedict

@DhruvBShetty Hi! If you are no longer working on this, I would like to take this issue.

@veronicabenedict

take

@DhruvBShetty
Contributor

@veronicabenedict Hi, sure, go ahead.

@yuanx749
Contributor

Column indexing with a list also fails for other period dtypes, not only period[h].
A simpler reproducible example:

import pandas as pd

df = pd.DataFrame(
    data=[
        pd.Period("2024-01-01", freq="D"),
        pd.Period("2024-01-02", freq="D"),
        pd.Period("2024-01-03", freq="D"),
    ]
).T

print(df)
#             0           1           2
# 0  2024-01-01  2024-01-02  2024-01-03

print(df[[0, 1, 2]])
#             0           1           2
# 0  2024-01-01  2024-01-01  2024-01-01
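
To check several period frequencies at once, the comparison can be wrapped in a small helper. This is only a sketch: `list_selection_matches` is a hypothetical name, the three frequencies are an arbitrary sample, and it assumes that scalar access through `.iat` is not affected by the bug.

```python
import pandas as pd


def list_selection_matches(freq: str) -> bool:
    """Return True if df[[0, 1, 2]] agrees with scalar access for this frequency."""
    periods = pd.period_range("2024-01-01", periods=3, freq=freq)
    # One row, three columns, built the same way as the example above.
    df = pd.DataFrame(data=list(periods)).T
    selected = df[[0, 1, 2]]
    # Compare each selected value with the value read directly from the original frame.
    return all(selected.iat[0, i] == df.iat[0, i] for i in range(3))


# Hourly, daily, and monthly periods; False marks a frequency where selection is wrong.
results = {freq: list_selection_matches(freq) for freq in ("h", "D", "M")}
print(results)
```

On a pandas build that still has the bug, the affected frequencies should report False; once fixed, every entry should be True.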

@snitish
Contributor

snitish commented Dec 6, 2024

@veronicabenedict any luck investigating this?

6 participants