Hello! Thank you for this awesome library that helps me to use sklearn models in highly loaded systems.
I think I found a problem with the OrdinalEncoder conversion. The current implementation doesn't work with an encoder that was fitted on data with different column datatypes. For example, if I pass a pandas DataFrame with different column datatypes (np.int64, object) to fit(), I get an error when I try to run the ONNX model with onnxruntime, because the ONNX model contains inconsistent attributes for the LabelEncoder operator.
My suggestion is to remove the cast to string for each input variable and to pay attention not only to the categories_ array dtype but also to the type of the categories_ values, because in your test_ordinal_encoder_mixed_string_int_drop() function the categories_ array has object dtype, but the values are of type int.

Finally, I want to attach a demo Jupyter notebook and ONNX models with the old and new conversions to demonstrate the problem.
ordinal_encoder.zip
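To make the suggestion concrete, here is a minimal sketch of the idea of choosing the key type per column from the category values themselves instead of from the array dtype. It is not the converter code from this pull request: `infer_key_kind` and `KEY_ATTRIBUTE` are hypothetical names used only for illustration, the plain lists stand in for the per-column arrays of `categories_`, and the attribute names reflect my understanding of the ONNX-ML LabelEncoder operator.

```python
import numbers

# Hypothetical mapping from the inferred kind to the LabelEncoder key
# attribute (assumption: attribute names as in the ONNX-ML operator spec).
KEY_ATTRIBUTE = {
    "int64": "keys_int64s",
    "float": "keys_floats",
    "string": "keys_strings",
}


def infer_key_kind(categories):
    """Infer the key kind from the values, not from the container dtype.

    An object-dtype array can still hold only integers, so every value
    is inspected. bool is excluded because it is a subclass of int.
    """
    values = list(categories)
    if not values:
        return "string"
    if all(isinstance(v, numbers.Integral) and not isinstance(v, bool)
           for v in values):
        return "int64"
    if all(isinstance(v, numbers.Real) and not isinstance(v, bool)
           for v in values):
        return "float"
    return "string"


# Stand-in for categories_ of an encoder fitted on an int column and an
# object (string) column.
categories_per_column = [
    [1, 2, 3],
    ["a", "b"],
]

kinds = [infer_key_kind(c) for c in categories_per_column]
attributes = [KEY_ATTRIBUTE[k] for k in kinds]
print(attributes)
```

With this approach each column gets its own key attribute, so an integer column is no longer forced through a string cast just because another column holds strings.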
I look forward to your comments on this pull request.