Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OrdinalEncoder conversion fixes #1044

Merged
merged 7 commits into from
Dec 7, 2023
Merged

Conversation

max-509
Copy link
Contributor

@max-509 max-509 commented Nov 11, 2023

Hello! Thank you for this awesome library that helps me to use sklearn models in highly loaded systems.

I think I found some problems for OrdinalEncoder conversion. Current implementation doesn't work with encoder that was fitted on data with different column datatypes. For example, if I pass to fit() function pandas dataframe with different column datatypes (np.int64, object), I get error when I try run ONNX model using onnxruntime framework because ONNX model contains inconsistent attributes for LabelEncoder operator.

My suggestion is removing cast to string for each input variable and pay attention not only to the categories_ array dtype but also to the type of categories_ values because in your test_ordinal_encoder_mixed_string_int_drop() function categories_ array is object dtype, but values type is int.

In the end I want attach demo jupyter notebook and ONNX models with old and new conversions for problem demonstration.
ordinal_encoder.zip

I will be waiting for comments on my pull request

…lEncoder is determined by the type of input variable, and not by the type of categories in OrdinalEncoder

- Fixed FunctionTransformer converter. Added axis to Concat operator
- Fixed Imputer shapes calculator. Now number of inputs can be >1
- Fixed SklearnMultiply converter. Now initializer type is equal to input type
- Fixed Pipeline converter. Now Cast operator applies if pipeline outputs are different with last stage outputs
- Changed VotingClassifier and VotingRegressor converter. Now VotingClassifier can accept number of inputs >1
@max-509
Copy link
Contributor Author

max-509 commented Dec 5, 2023

@xadupre I have refactored the code via black

@xadupre xadupre merged commit 78933db into onnx:main Dec 7, 2023
33 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants