OrdinalEncoder conversion fixes #1044

max-509 · 2023-11-11T13:10:13Z

Hello! Thank you for this awesome library, which helps me use sklearn models in highly loaded systems.

I think I found a problem in the OrdinalEncoder conversion. The current implementation does not work with an encoder that was fitted on data with mixed column dtypes. For example, if I pass a pandas DataFrame whose columns have different dtypes (np.int64, object) to fit(), I get an error when I run the ONNX model with onnxruntime, because the ONNX model contains inconsistent attributes for the LabelEncoder operator.
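A minimal sketch of the setup described above, assuming a two-column DataFrame with made-up column names (`n_items`, `color`). It only shows that the fitted encoder ends up with per-column `categories_` arrays of different dtypes; the skl2onnx conversion and the onnxruntime run are deliberately left out, since their outcome depends on the installed converter version.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical data: one integer column and one object (string) column.
df = pd.DataFrame(
    {
        "n_items": np.array([10, 20, 30], dtype=np.int64),
        "color": np.array(["red", "blue", "red"], dtype=object),
    }
)

encoder = OrdinalEncoder().fit(df)

# For a DataFrame, categories are learned column by column, so each
# categories_ array keeps the dtype of its own column.
category_kinds = [cats.dtype.kind for cats in encoder.categories_]
print(category_kinds)
```

A converter that assumes a single key type for all columns has to reconcile these two dtypes somehow, which is where the inconsistent LabelEncoder attributes can come from.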

My suggestion is to remove the cast to string for each input variable and to consider not only the dtype of each categories_ array but also the type of its values, because in your test_ordinal_encoder_mixed_string_int_drop() test the categories_ array has object dtype while its values are ints.
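The distinction between the array dtype and the type of its values can be illustrated with a small sketch. The helper name `infer_category_kind` is hypothetical and is not taken from the pull request or from skl2onnx; it only shows one way to look past an object dtype at the values themselves.

```python
import numpy as np


def infer_category_kind(categories):
    """Hypothetical helper: classify a categories array as int, float, str or mixed."""
    arr = np.asarray(categories)
    # A concrete numeric or string dtype already answers the question.
    if arr.dtype.kind in "iu":
        return "int"
    if arr.dtype.kind == "f":
        return "float"
    if arr.dtype.kind in "US":
        return "str"
    # Otherwise (typically object dtype) the dtype says nothing about
    # the values, so inspect them directly.
    values = list(arr)
    if all(
        isinstance(v, (int, np.integer)) and not isinstance(v, bool)
        for v in values
    ):
        return "int"
    if all(isinstance(v, str) for v in values):
        return "str"
    return "mixed"


# An object-dtype array can still hold only integers.
object_ints = np.array([1, 2, 3], dtype=object)
print(object_ints.dtype, infer_category_kind(object_ints))
```

Checking only `arr.dtype` would treat `object_ints` like a string column, while checking the values identifies it as integer categories.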

Finally, I am attaching a demo Jupyter notebook and ONNX models produced by the old and new conversions to demonstrate the problem.
ordinal_encoder.zip

I look forward to comments on my pull request.

…lEncoder is determined by the type of the input variable, not by the type of the categories in OrdinalEncoder
- Fixed FunctionTransformer converter. Added axis to the Concat operator
- Fixed Imputer shape calculator. The number of inputs can now be >1
- Fixed SklearnMultiply converter. The initializer type now equals the input type
- Fixed Pipeline converter. A Cast operator is now applied if the pipeline outputs differ from the last stage outputs
- Changed VotingClassifier and VotingRegressor converters. VotingClassifier can now accept more than one input

max-509 · 2023-12-05T15:41:26Z

@xadupre I have reformatted the code with black.

Vershinin Maxim WX1123714 added 2 commits November 11, 2023 19:32

OrdinalEncoder conversion fixes

9ab770f

Test fixes; added conversions for OrdinalEncoder input variables

bc4e95c

xadupre reviewed Nov 13, 2023

tests/test_sklearn_voting_classifier_converter.py (resolved)

max-509 added 2 commits November 19, 2023 15:25

CI fix

51137e2

github-advanced-security bot found potential problems Nov 30, 2023

tests/test_sklearn_ordinal_encoder.py (fixed)

max-509 added 2 commits December 5, 2023 20:59

Run black formatter

592f3b6

Changed assertTrue to assertIsNotNone

99d9406

xadupre approved these changes Dec 7, 2023

Merge branch 'main' into ordinal_encoder_fixes

a74389b

xadupre merged commit 78933db into onnx:main Dec 7, 2023
33 checks passed
