Per #37, scipy.sparse.hstack is called whenever a sparse matrix is among the extracted features. However, scipy.sparse.hstack cannot upcast dtype=object, so even if sparse=False for the mapper object, the hstack will fail whenever a np.ndarray of dtype=object is involved.
Passing example; note that int64/float64 is upcast to float64.
In [432]:
import pandas as pd
import sklearn.preprocessing
import sklearn_pandas

df = pd.DataFrame({'int': [1, 2, 3],
                   'flt': [2., 3, 4],
                   'obj': ['r', 'w', 'b']})
mapper = sklearn_pandas.DataFrameMapper([
    (['int'], [sklearn.preprocessing.OneHotEncoder()]),
    (['flt'], [sklearn.preprocessing.OneHotEncoder()])
], sparse=True)
mapper.fit_transform(df)
Out[432]:
<3x6 sparse matrix of type '<type 'numpy.float64'>'
with 6 stored elements in Compressed Sparse Row format>
Failing example: int64/object cannot be upcast; see scipy\sparse\sputils.pyc for the upcast code.
In [434]:
mapper = sklearn_pandas.DataFrameMapper([
    (['int'], [sklearn.preprocessing.OneHotEncoder()]),
    ('obj', None)])
mapper.fit_transform(df)
TypeError: no supported conversion for types: (dtype('float64'), dtype('O'))
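The failure above can likely be reproduced without sklearn-pandas at all. The following is a minimal sketch, assuming a reasonably recent NumPy and SciPy; the variable names are illustrative, and the exact exception type and message may differ between SciPy versions, so both TypeError and ValueError are caught.

```python
import numpy as np
from scipy import sparse

# A float64 sparse block, standing in for OneHotEncoder output.
one_hot = sparse.csr_matrix(np.eye(3))
# An untransformed column of strings, stored as dtype=object.
obj_col = np.array([["r"], ["w"], ["b"]], dtype=object)

try:
    sparse.hstack([one_hot, obj_col])
    failed = False
except (TypeError, ValueError) as exc:
    # Sparse containers have no supported conversion involving dtype=object.
    failed = True
    print(type(exc).__name__, exc)

# NumPy's own promotion rules do have an answer for the same pair of dtypes.
promoted = np.result_type(one_hot.dtype, obj_col.dtype)
print(promoted)
```

Dense stacking has a common dtype to fall back on (object), while sparse stacking does not, which is why the choice of hstack matters when sparse=False.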
I think it's ok if an error is thrown when sparse=True and an array of type object is involved, but not if sparse=False.
I'll submit a pull request with a recommended fix.
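One possible shape for such a fix, as a sketch only and not the contents of the pull request: stack with scipy.sparse.hstack only when sparse output was requested, and otherwise densify any sparse blocks and use np.hstack. The helper name `stack_columns` and its signature are hypothetical and not part of sklearn-pandas; it assumes every extracted block is already two-dimensional.

```python
import numpy as np
from scipy import sparse as sp


def stack_columns(extracted, sparse=False):
    """Hypothetical helper: horizontally stack 2-D feature blocks."""
    if sparse:
        # Sparse output requested: object columns will still raise here,
        # which matches the behaviour the issue considers acceptable.
        return sp.hstack(extracted).tocsr()
    # Dense output requested: convert sparse blocks so that NumPy can
    # promote mixed dtypes (including object) on its own.
    dense = [x.toarray() if sp.issparse(x) else np.asarray(x) for x in extracted]
    return np.hstack(dense)


blocks = [
    sp.csr_matrix(np.eye(3)),                          # one-hot encoded column
    np.array([["r"], ["w"], ["b"]], dtype=object),     # untouched object column
]
dense_out = stack_columns(blocks, sparse=False)

numeric_blocks = [sp.csr_matrix(np.eye(3)), np.array([[1.0], [2.0], [3.0]])]
sparse_out = stack_columns(numeric_blocks, sparse=True)
```

With this split, the dense path returns an object array when an object column is present, and the sparse path keeps its current behaviour for numeric input.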
bmweiner added a commit to bmweiner/sklearn-pandas that referenced this issue Jan 10, 2016