The documentation mentions two different approaches.
First is using StringDtype:
```python
import pandas as pd
import pyarrow as pa

df = pd.DataFrame({"x": ["foo", "bar", "baz"]}, dtype=pd.StringDtype("pyarrow"))
df_pa = pa.Table.from_pandas(df).to_pandas()
pd.testing.assert_frame_equal(df, df_pa)
```
Second is using ArrowDtype:
```python
import pandas as pd
import pyarrow as pa

df = pd.DataFrame({"x": ["foo", "bar", "baz"]}, dtype=pd.ArrowDtype(pa.string()))
df_pa = pa.Table.from_pandas(df).to_pandas()
pd.testing.assert_frame_equal(df, df_pa)
```
However, both of these fail with an assertion error in `pd.testing.assert_frame_equal`.
Casting the round-tripped frame back with `astype`, as shown below, avoids the assertion error.
```python
import pandas as pd
import pyarrow as pa

df = pd.DataFrame({"x": ["foo", "bar", "baz"]}, dtype="string[pyarrow]")
df_pa = pa.Table.from_pandas(df).to_pandas().astype("string[pyarrow]")
pd.testing.assert_frame_equal(df, df_pa)
```
The two approaches mentioned in the documentation are also given in the issue from 2022 as working fixes. However, I think they may no longer work with the current version of pandas.
Suggested fix for documentation
The documentation should be updated to present `.astype("string[pyarrow]")` as the likely best-practice approach for this situation.
Pandas version checks
Checked against the documentation on main.
Location of the documentation
https://pandas.pydata.org/pandas-docs/stable/reference/arrays.html#pyarrow
Documentation problem
This issue was first raised in #50074.