Release 0.6.0 - Regression! #113
nnansters announced in Announcements
@nnansters, Releases isn't updated to v0.6.0? It is still on v0.5.3
Hey all,
Niels from NannyML engineering here to deliver you our 0.6.0 release 📦 🚴♂️
Installing / upgrading
You can get this latest version by using pip:
pip install -U nannyml
Or if you're using Conda:
conda install -c conda-forge nannyml
What's new?
Oh boy, am I excited about this release: we have some big news. So without any digression, we're introducing support for regression! 🎉
Yes, you can now use regression models with all of our existing functionality: detecting data drift on model inputs, outputs and targets, calculating realized performance metrics and even... estimating performance!
Covariate shift: both univariate and multivariate covariate shift detection work just like before. Since they only use your model's feature values, they actually already worked with regression models! 🙈
Model output drift: check the KS stat and the evolution over time of your regression predictions.
Target drift: check the KS stat and the evolution over time of your regression target values.
Calculating realized performance: easily calculate and plot the performance of your regression model using the following metrics: mae, mape, mse, msle, rmse and rmsle.
Estimating performance: estimate those same metrics in the absence of target values using our Direct Loss Estimator: mae, mape, mse, msle, rmse and rmsle.
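As a quick refresher on what those metric names mean, here is a minimal sketch computing them directly from targets and predictions in plain Python (illustrative values; this is not the NannyML implementation):

```python
import math

# Hypothetical targets and regression predictions.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

n = len(y_true)
errors = [p - t for t, p in zip(y_true, y_pred)]

mae = sum(abs(e) for e in errors) / n                      # mean absolute error
mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n  # mean absolute percentage error
mse = sum(e * e for e in errors) / n                       # mean squared error
rmse = math.sqrt(mse)                                      # root mean squared error

# msle / rmsle operate on log(1 + x), so they require non-negative values.
msle = sum((math.log1p(t) - math.log1p(p)) ** 2
           for t, p in zip(y_true, y_pred)) / n
rmsle = math.sqrt(msle)
```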
The latest product of our research labs is the Direct Loss Estimator (DLE), which allows us to estimate performance metrics for your regression model in the absence of target values.
To learn more about how it works, check out the in-depth documentation.
This quick snippet demonstrates how to use it:
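To give a feel for the idea behind DLE, here is a minimal, self-contained sketch in plain Python (all names and data below are hypothetical, and this is not the NannyML API): a second "nanny" model is fitted on reference data to predict the monitored model's loss from its inputs, and on new data without targets the predicted losses are averaged into a performance estimate.

```python
# Conceptual sketch of Direct Loss Estimation -- NOT the NannyML API.

def monitored_model(x):
    # The regression model being monitored (hypothetical: predicts 2*x).
    return 2.0 * x

def fit_loss_model(xs, ys):
    # On reference data (targets available), learn a trivial loss model:
    # the mean absolute error per input bucket (x < 5 vs x >= 5).
    buckets = {False: [], True: []}
    for x, y in zip(xs, ys):
        buckets[x >= 5].append(abs(monitored_model(x) - y))
    return {k: sum(v) / len(v) for k, v in buckets.items() if v}

def estimate_mae(loss_model, xs):
    # On analysis data (no targets), estimate MAE by averaging the
    # loss predicted for each input.
    losses = [loss_model[x >= 5] for x in xs]
    return sum(losses) / len(losses)

# Reference period: inputs and targets.
ref_x = [1, 2, 3, 6, 7, 8]
ref_y = [2.5, 4.5, 6.5, 13.0, 15.0, 17.0]
loss_model = fit_loss_model(ref_x, ref_y)

# Analysis period: inputs only; performance is estimated without targets.
est = estimate_mae(loss_model, [2, 3, 7])
```

The real DLE uses a proper machine learning model as the loss estimator rather than per-bucket averages; the in-depth documentation covers the details.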
The introduction of regression made it necessary for us to break some existing interfaces 💔. Model output drift calculation, target drift calculation and realized performance calculation now require an additional problem_type parameter.
We could have tried to infer this information from the data that is provided, but making it explicit is more transparent and future-proof, even if it comes at a small cost.
As an example, this is how to create a target drift calculator for multiclass classification:
Note the new problem_type parameter, which can be set to a fixed string value or a ProblemType enum value.
What's changed?
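The string-or-enum pattern can be sketched like this in plain Python (hypothetical names; not the NannyML implementation): a plain string is looked up by its enum value, while an enum member passes through unchanged.

```python
from enum import Enum

class ProblemType(Enum):
    # Hypothetical problem-type values for illustration.
    REGRESSION = 'regression'
    CLASSIFICATION_BINARY = 'classification_binary'
    CLASSIFICATION_MULTICLASS = 'classification_multiclass'

    @classmethod
    def parse(cls, value):
        # Accept an enum member as-is, or map a string onto its member.
        if isinstance(value, cls):
            return value
        return cls(value)

# Both spellings resolve to the same enum member:
a = ProblemType.parse('classification_multiclass')
b = ProblemType.parse(ProblemType.CLASSIFICATION_MULTICLASS)
```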
We've had some new people reaching out and helping us improve, for which we are grateful!
More consistent error handling for our very minimal IO classes
Speeding up the tox build steps
Fixing a leak of helper visualization columns into reference results
Refactored a lot of the documentation to use new internal tooling. Speeding up testing and updating the documentation is paramount; it is often the thing we need to spend the most time on before being able to release.
What's up next?
We'll be tackling time itself as we prepare to make the timestamp_column_name data requirement optional!
We hope you'll love the new release as much as we do! Your feedback is most welcome and appreciated!
I would also apologize for the silly puns, but meh, I have no regress. 🥁
Niels