This repository was archived by the owner on Nov 1, 2024. It is now read-only.

Delegate IColumn.fill_null/drop_null to Arrow (#96)
Summary:
Pull Request resolved: #96

As title

NOTE: this diff delegates only IColumn.fill_null to Arrow. TorchArrow currently uses Apache Arrow 2.0.0, and `drop_null` is not available until Apache Arrow 6: https://arrow.apache.org/docs/python/api/compute.html#selections

Reviewed By: wenleix

Differential Revision: D32770009

fbshipit-source-id: 2fbdbf6529ba677292acbcfd78f7d256282e3561
Bo Huang authored and facebook-github-bot committed Dec 6, 2021
1 parent fb729a9 commit 5756613
Showing 1 changed file with 6 additions and 9 deletions.
15 changes: 6 additions & 9 deletions torcharrow/icolumn.py
@@ -19,6 +19,7 @@
 
 from .dispatcher import Device
 from .expression import expression
+from .interop import from_arrow
 from .scope import Scope
 from .trace import trace, traceproperty

@@ -1129,18 +1130,14 @@ def fill_null(self, fill_value: ty.Union[dt.ScalarTypes, ty.Dict]):
         dtype: int64, length: 4, null_count: 0
         """
         self._prototype_support_warning("fill_null")
 
-        if not isinstance(fill_value, IColumn._scalar_types):
-            raise TypeError(f"fill_null with {type(fill_value)} is not supported")
         if isinstance(fill_value, IColumn._scalar_types):
-            res = Scope._EmptyColumn(self.dtype.constructor(nullable=False))
-            for m, i in self._items():
-                if not m:
-                    res._append_value(i)
-                else:
-                    res._append_value(fill_value)
-            return res._finalize()
+            import pyarrow.compute as pc
+
+            arr = pc.fill_null(self.to_arrow(), fill_value)
+            arr_dtype = self.dtype.with_null(nullable=False)
+            return from_arrow(arr, dtype=arr_dtype, device=self.device)
         else:
             raise TypeError(f"fill_null with {type(fill_value)} is not supported")
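
For readers without a TorchArrow checkout, the change can be sketched on plain Python lists. This is a rough illustration, not the TorchArrow implementation: `fill_null_reference` and `fill_null_arrow` are hypothetical helper names, and the `_SCALAR_TYPES` tuple is an assumed stand-in for `IColumn._scalar_types`. The first helper mirrors the removed element-wise loop; the second delegates to `pyarrow.compute.fill_null`, importing pyarrow lazily as the diff does.

```python
from typing import Any, List, Optional

# Assumed stand-in for IColumn._scalar_types (not the real TorchArrow tuple).
_SCALAR_TYPES = (int, float, str, bool)


def _check_fill_value(fill_value: Any) -> None:
    # Same error message as in the diff, raised for non-scalar fill values.
    if not isinstance(fill_value, _SCALAR_TYPES):
        raise TypeError(f"fill_null with {type(fill_value)} is not supported")


def fill_null_reference(values: List[Optional[Any]], fill_value: Any) -> List[Any]:
    """Element-wise replacement, in the spirit of the removed Python loop."""
    _check_fill_value(fill_value)
    return [fill_value if v is None else v for v in values]


def fill_null_arrow(values: List[Optional[Any]], fill_value: Any) -> List[Any]:
    """Delegate the replacement to Arrow's compute kernel."""
    _check_fill_value(fill_value)
    # Lazy import, as in the diff, so pyarrow is only needed on this path.
    import pyarrow as pa
    import pyarrow.compute as pc

    arr = pc.fill_null(pa.array(values), fill_value)
    return arr.to_pylist()


if __name__ == "__main__":
    data = [1, None, 3, None]
    print(fill_null_reference(data, 0))
    print(fill_null_arrow(data, 0))  # requires pyarrow to be installed
```

Both helpers should agree on scalar fill values. The Arrow path avoids a per-element Python loop, which is the presumable motivation for the delegation.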

