Releases: JuliaAI/MLJ.jl
v0.5.3 (2019-11-13)
Closed issues:
- Get Started Examples doesn't work on MLJ 0.5.2 (#324)
- Streamline tests (#323)
- Can't use FillImputer with @pipeline? (#320)
- DecisionTreeClassifier producing strange results after upgrade (#319)
- Move the MLJ manual to new repo MLJManual (#316)
- Example of `unpack` in `?unpack` doesn't work: ERROR: MethodError: no method matching !=(::Symbol) (#313)
- Scitype check after dropping missing values (#306)
- train_test_pairs method in resampling interface needs extra arguments (#297)
- Comments in manual on multivariate targets need updating (#295)
- CV(shuffle=true) does not seem to properly shuffle the data (#289)
- Overload `mean`, `mode` and `median` for `Node`s (#288)
- @load is too slow (#280)
- Towards 0.5.2 (#273)
- Create/utilize a style guide (#243)
v0.5.2
- (Bug fix) Ensure `CV(shuffle=true)` actually does shuffle the data (#289)
- (Enhancement) Allow resampling strategies to see the data by adding arguments to the `train_test_pairs` method implemented by new strategies. Refer to the updated manual under "Custom resampling strategies" for details (#297, PR #299). A sketch follows these notes.
- (Bug fix) Update requirements on MLJBase and MLJModels to resolve some issues with MLJModels 0.5.3 and MLJBase 0.7.2
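For illustration, a minimal sketch of a custom strategy under the extended interface (the `FixedSplit` name and its field are hypothetical, not part of MLJ; the signature follows the "Custom resampling strategies" section of the manual):

```julia
using MLJ

# Illustrative strategy: a single train/test split at a fixed row index.
struct FixedSplit <: MLJ.ResamplingStrategy
    cutoff::Int
end

# The extra arguments let the strategy inspect the data `X` and `y`
# when generating its train/test row pairs:
function MLJ.train_test_pairs(strategy::FixedSplit, rows, X, y)
    train = rows[1:strategy.cutoff]
    test  = rows[(strategy.cutoff + 1):end]
    return [(train, test),]
end
```

One would then call, for example, `evaluate!(mach, resampling=FixedSplit(80), measure=rms)`.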
v0.5.1
- (Enhancement) Update requirements for MLJBase and MLJModels to eliminate the current cap of 0.5.2 on CategoricalArrays. Among other things, this allows more recent versions of CSV and DataFrames to be used with MLJ, and eliminates some warnings. (JuliaAI/MLJBase.jl#44, PR #275)
- (Enhancement) The MLJBase update also adds the Brier score for probabilistic classifiers.
- (Bug) Fix a bug with `|>` syntax for building learning networks (julia >= 1.3) (#253, PR #263)
- (Bug) Fix a problem with loading most ScikitLearn classifiers (#252)
- (Enhancement) Allow specification of a different resolution for each dimension in a grid search (#269, PR #278). Do `?Grid` for details.
- (Enhancement) Allow `selectcols(X, c)` to work on nodes, in addition to tables/matrices/vectors `X`. So, if `X` is a node and `N = selectcols(X, c)`, then `N()` is the same as `selectcols(X(), c)` (#271). See the sketch after this list.
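A small sketch of the node-aware `selectcols` (the table and column names are illustrative):

```julia
using MLJ

X  = (a = rand(5), b = rand(5), c = rand(5))  # any Tables.jl table
Xs = source(X)                     # wrap the table in a source node
N  = selectcols(Xs, [:a, :b])      # a new node; evaluation is lazy
N() == selectcols(X, [:a, :b])     # true: calling the node applies the selection
```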
v0.5.0
- (Enhancement) Add `|>` syntactic sugar for building learning networks. Requires julia 1.3 (#228, #231)
- (Enhancement) Add a `matching` method to streamline model search, without constructing `MLJTask` objects with `supervised` or `unsupervised` (the existing task constructors are to remain, but will ultimately be deprecated or replaced) (#236, PR #238)
- (Mildly breaking) Change the method name `train_eval_pairs` for custom resampling strategies to `train_test_pairs`, which is less confusing. Update the manual accordingly. Unlikely to affect any users yet (#244)
- (Enhancement, mildly breaking) Update the MLJModels requirement to v0.5.0 to make available most scikit-learn classifiers, `KNNClassifier`, and a new, improved version of `KNNRegressor`. As `KNNRegressor` is no longer an MLJ built-in model, its code must be explicitly imported with `@load KNNRegressor` before instantiating. Do `models()` to get an up-to-date list of all models (MLJModels PR #60, MLJModels PR #47). A workflow sketch follows these notes.
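A sketch of the resulting search-and-load workflow (the data and the model choice are illustrative):

```julia
using MLJ

X = (x1 = rand(100), x2 = rand(100))
y = 2 .* X.x1 .+ 0.1 .* rand(100)

# `matching(X, y)` is a callable test of compatibility with the data:
models(matching(X, y))      # metadata entries for all applicable models

knn = @load KNNRegressor    # load the code; returns a default instance
mach = machine(knn, X, y)
fit!(mach)
```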
v0.4.0
- (Enhancement) Update to MLJBase 0.5.0 and MLJModels 0.4.0. The following new scikit-learn models are thereby made available:
  - ScikitLearn.jl:
    - SVM: `SVMClassifier`, `SVMRegressor`, `SVMNuClassifier`, `SVMNuRegressor`, `SVMLClassifier`, `SVMLRegressor`
    - Linear models (regressors): `ARDRegressor`, `BayesianRidgeRegressor`, `ElasticNetRegressor`, `ElasticNetCVRegressor`, `HuberRegressor`, `LarsRegressor`, `LarsCVRegressor`, `LassoRegressor`, `LassoCVRegressor`, `LassoLarsRegressor`, `LassoLarsCVRegressor`, `LassoLarsICRegressor`, `LinearRegressor`, `OrthogonalMatchingPursuitRegressor`, `OrthogonalMatchingPursuitCVRegressor`, `PassiveAggressiveRegressor`, `RidgeRegressor`, `RidgeCVRegressor`, `SGDRegressor`, `TheilSenRegressor`
- (New feature) The macro `@pipeline` allows one to construct linear (non-branching) pipeline composite models with one line of code. One may include static transformations (ordinary functions) in the pipeline, as well as target transformations for the supervised case (when one component model is supervised). See the first sketch after these notes.
- (Breaking) Source nodes (type `Source`) now have a `kind` field, which is either `:input`, `:target` or `:other`, with `:input` the default value in the `source` constructor. If building a learning network, and the network is to be exported as a standalone model, then it is now necessary to tag the source nodes accordingly, as in `Xs = source(X)` and `ys = source(y, kind=:target)`. See the learning-network sketch after these notes.
- (Breaking) By virtue of the preceding change, the syntax for exporting a learning network is simplified. Do `?@from_network` for details. Also, one now uses `fitresults(N)` instead of `fitresults(N, X, y)` and `fitresults(N, X)` when exporting a learning network `N` "by hand"; see the updated manual for details.
- (Breaking) One must explicitly state if a supervised learning network being exported with `@from_network` is probabilistic, by adding `is_probabilistic=true` to the macro expression. Before, this information was unreliably inferred from the network.
- (Enhancement) Add a macro-free method for loading model code into an arbitrary module. Do `?load` for details.
- (Enhancement) `@load` now returns a model instance with default hyperparameters (instead of `nothing`), as in `tree_model = @load DecisionTreeRegressor`.
- (Breaking) `info("PCA")` now returns a named tuple, instead of a dictionary, of the properties of the model named "PCA".
- (Breaking) The list returned by `models(conditional)` is now a list of complete metadata entries (named tuples, as returned by `info`). An entry `proxy` appears in the list exactly when `conditional(proxy) == true`. Model queries are simplified; for example, `models() do model model.is_supervised && model.is_pure_julia end` finds all pure-julia supervised models. (This query appears as a runnable snippet after these notes.)
- (Bug fix) Introduce new private methods to avoid relying on MLJBase type piracy (MLJBase #30).
- (Enhancement) If `composite` is a learning network exported as a model, and `m = machine(composite, args...)`, then `report(m)` returns the reports for each machine in the learning network, and similarly for `fitted_params(m)`.
- (Enhancement) `MLJ.table`, `vcat` and `hcat` are now overloaded for `AbstractNode`, so that they can immediately be used in defining learning networks. For example, if `X = source(rand(20, 3))` and `y = source(rand(20))`, then `MLJ.table(X)` and `vcat(y, y)` both make sense and define new nodes.
- (Enhancement) `pretty(X)` prints a pretty version of any table `X`, complete with types and scitype annotations. Do `?pretty` for options. A wrap of `pretty_table` from PrettyTables.jl.
- (Enhancement) `std` is re-exported from `Statistics`.
- (Enhancement) The manual and MLJ cheatsheet have been updated.
- Performance measures have been migrated to MLJBase, while the model registry and model load/search facilities have migrated to MLJModels. As relevant methods are re-exported to MLJ, this is unlikely to affect many users.
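A sketch of the one-line pipeline syntax, assuming the `@pipeline` form of this era (later releases changed the interface); `MyPipe`, the `:age` column and the component choices are illustrative:

```julia
using MLJ

@load KNNRegressor  # brings the KNNRegressor code into scope

pipe = @pipeline MyPipe(X -> coerce(X, :age => Continuous),  # static transformation
                        hot = OneHotEncoder(),
                        knn = KNNRegressor(K=3),
                        target = UnivariateStandardizer())   # target transformation
```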
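A sketch of a learning network with tagged source nodes, as now required for export (data and model choices are illustrative):

```julia
using MLJ

@load KNNRegressor

X = (x1 = rand(30), x2 = rand(30))
y = rand(30)

Xs = source(X)                  # kind=:input is the default
ys = source(y, kind=:target)    # tag the target for export

stand = machine(Standardizer(), Xs)
W     = transform(stand, Xs)
rgs   = machine(KNNRegressor(K=3), W, ys)
yhat  = predict(rgs, W)

fit!(yhat)    # trains every machine in the network
yhat()        # predictions on the training data
# The network can now be exported as a standalone model with @from_network.
```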
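And the model-query idiom quoted above, as a runnable snippet:

```julia
using MLJ

# all pure-julia supervised models, as complete metadata entries:
models() do model
    model.is_supervised && model.is_pure_julia
end
```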
v0.3.0 (2019-08-21)
- Introduction of traits for measures (loss functions, etc.); see the top of /src/measures.jl for definitions. This:
  - allows the user to use loss functions from LossFunctions.jl
  - enables improved measure checks and error-message reporting for measures
  - allows `evaluate!` to report per-observation measures when available (for later use by Bayesian optimisers, for example)
  - allows support for sample-weighted measures playing nicely with the rest of the API
- Improvements to resampling (see the evaluation sketch after this list):
  - the `evaluate!` method now reports per-observation measures when available
  - sample weights can be passed to `evaluate!` for use by measures that support weights
  - the user can pass a list of train/evaluation pairs of row indices directly to `evaluate!`, in place of a `ResamplingStrategy` object
  - implementing a new `ResamplingStrategy` is now straightforward (see the docs)
  - one can call `evaluate` (no exclamation mark) directly on model + data, without first constructing a machine, if desired
- Doc strings and the manual have been revised and updated. The manual includes a new section "Tuning models", and extra material under "Learning networks" explaining how to export learning networks as stand-alone models using the `@from_network` macro.
- Improved checks and error reporting for binding models to data in machines.
- (Breaking) CSV is now an optional dependency, which means you now need to import CSV before you can load tasks with `load_boston()`, `load_iris()`, `load_crabs()`, `load_ames()` and `load_reduced_ames()`.
- Added a `schema` method for tables (re-exported from ScientificTypes.jl). Returns a named tuple with keys `:names`, `:types`, `:scitypes` and `:nrows`. See the schema sketch after this list.
- (Breaking) Eliminate the `scitypes` method. The scientific types of a table are now returned as part of the ScientificTypes `schema` method (see above).
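A sketch of the extended evaluation API (data and the model choice are illustrative; hyperparameters are defaults):

```julia
using MLJ

@load DecisionTreeRegressor  # 0.3-era: @load brings the code into scope
tree = DecisionTreeRegressor()

X = (x = rand(100),)
y = rand(100)
mach = machine(tree, X, y)

# per-observation measures are reported when available:
evaluate!(mach, resampling=CV(nfolds=5), measure=rms)

# explicit train/evaluation row pairs, in place of a ResamplingStrategy:
evaluate!(mach, resampling=[(1:80, 81:100)], measure=rms)

# the machine-free variant:
evaluate(tree, X, y, resampling=Holdout(fraction_train=0.8), measure=rms)
```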
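And a sketch of the `schema` method (column names and values are illustrative):

```julia
using MLJ

X = (age = [23, 41, 35], height = [1.80, 1.65, 1.72])
s = schema(X)
s.names      # (:age, :height)
s.types      # the raw Julia types of the columns
s.scitypes   # the scientific types, e.g. Count, Continuous
s.nrows      # 3
```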
Closed issues:
- Migrate @load macro to MLJBase.jl? (#208)
- Loss functions in MLJ (#205)
- Missing package dependency? (#204)
- Test for MLJModels/Clustering.jl gives warning "implicit `dims=2`" ... (#202)
- TunedModel objects not displaying correctly (#197)
- DecisionTreeRegressor fails to predict (#193)
- Control verbosity of @load macro (#192)
- How to know which models are regression models? (#191)
- Error loading the package (#190)
- Data science and ML ontologies in MLJ (#189)
- Machine `fit!` from the model not working (#187)
- How can I extract a `fitresult` that does not contain any of the original data? (#186)
- @from_network not working if MLJBase not in load path. (#184)
- Improve nested parameter specification in tuning (#180)
- Resampling strategies should have option for independent RNG (#178)
- Fitting SVC machine changes hyperparameters (#172)
- range(SVC, :gamma, ...) returns NominalRange instead of NumericRange (#170)
- Local support at the ATI? (#169)
- range(tree, :n, ...) not working for tree=DecisionTreeRegressorClassfier (#168)
- Add some transformers from MultivariateStats.jl (#167)
- Use of MLJRegistry (#165)
- No method matching build_tree (#164)
- Multiple learning curves just repeating the first curve (#163)
- Register v0.2.5 (#162)
- Register v0.2.4 (#160)
- Issue with Documentation/Example - DecisionTreeClassifier again... (#156)
- Convert example/xgboost.jl into notebook (#148)
- GSoC Project Proposal (#78)
- Implement MLJ interface for linear models (#35)
Merged pull requests:
- Update to MLJBase 0.4.0 (#212) (ablaom)
- Improve resampling/evaluation; add measures API (incl. LossFunctions) (#206) (ablaom)
- Fix link to ipynb tour (#203) (Kryohi)
- Fix197b (#201) (tlienart)
- Fix #192: Add verbosity option to @load macro (#196) (juliohm)
- Update bug_report.md (#194) (juliohm)
- Make CSV a test dependency (#185) (DilumAluthge)
- Minor fixes + docstrings (#183) (tlienart)
- typo fix in learning networks docs (#182) (tlienart)
- Add rng to resampling methods (cv and holdout) (#179) (ayush-1506)
- Change get_type implementation (#171) (oleskiewicz)
v0.2.5
v0.2.4 (2019-06-13)
Closed issues:
- Allow ability to call `predict(mach, *)` on a task (as well as new input data)? (#158)
- misclassification_rate StackOverflow error (#133)
- Add task interface (#68)
- Treatment of supervised models predicting an ordered categorical (#48)
- Proposal for metadata (#22)
- Boosting Packages (#21)
- Literature discussion (#10)
v0.2.3
v0.2.2 (2019-05-30)
Closed issues:
- specifying new rows in calls to `fit!` on a Node not triggering retraining. (#147)
- fit! of Node sometimes calls `update` on model when it should call `fit` on model (#146)
- MultivariateStats.PCA has wrong load_path (#141)
- Error running the tour.ipynb notebook (#140)
- For reproducibility, include a Manifest.toml file with all examples. (#137)
- Coveralls activation (#131)
- Register v0.2.1 (#127)
- Migrate MultivariateStats: MLJ -> MLJModels (#125)
- Register v0.2.0 (#124)
- Wrap Scikitlearn.jl Elastic Net algorithms (#112)
Merged pull requests:
- Improve coverage of other files (#145) (giordano)
- add manifest.toml, project.toml for example notebooks, update tour.ipynb (#142) (ayush-1506)
- Simplify methods of load macro by reducing code duplication (#136) (giordano)
- Improve test coverage of tasks.jl (#135) (giordano)
- Move docs dependencies to Project.toml (#130) (giordano)
- Adding SimpleRidgeRegressor (#129) (ayush-1506)
- Send code coverage to Coveralls and Codecov (#128) (giordano)