From 294a42c245a892eeffdf51241c99ed9ecbf98a20 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Mon, 30 Oct 2023 10:41:53 +1300 Subject: [PATCH 1/4] update docs on implementing unsupervised models --- docs/src/adding_models_for_general_use.md | 46 ++++++++++++----------- 1 file changed, 25 insertions(+), 21 deletions(-) diff --git a/docs/src/adding_models_for_general_use.md b/docs/src/adding_models_for_general_use.md index 54d783d82..a30ad96f6 100755 --- a/docs/src/adding_models_for_general_use.md +++ b/docs/src/adding_models_for_general_use.md @@ -1209,35 +1209,39 @@ Your document string must include the following components, in order: Unsupervised models implement the MLJ model interface in a very similar fashion. The main differences are: -- The `fit` method has only one training argument `X`, as in `MLJModelInterface.fit(model, - verbosity, X)`. However, it has the same return value `(fitresult, cache, report)`. An - `update` method (e.g., for iterative models) can be optionally implemented in the same - way. For models that subtype `Static <: Unsupervised` (see also [Static - transformers](@ref) `fit` has no training arguments but does not need to be implemented - as a fallback returns `(nothing, nothing, nothing)`. - -- A `transform` method is compulsory and has the same signature as - `predict`, as in `MLJModelInterface.transform(model, fitresult, Xnew)`. +- The `fit` method, which still returns `(fitresult, cache, report)`, will typically have + only one training argument `X`, as in `MLJModelInterface.fit(model, verbosity, X)`, + although this is not a hard requirement. For example, a feature selection tool (wrapping + some supervised model) might also include a target `y` as input. Furthermore, in the + case of models that subtype `Static <: Unsupervised` (see also [Static + transformers](@ref)), `fit` has no training arguments at all, but does not need to be + implemented, as a fallback returns `(nothing, nothing, nothing)`. 
+ +- A `transform` and/or `predict` method is implemented, and has the same signature as + `predict` does in the supervised case, as in `MLJModelInterface.transform(model, + fitresult, Xnew)`. However, it may only have one data argument `Xnew`, unless `model <: + Static`, in which case there is no restriction. A use-case for `predict` is K-means + clustering that `predict`s labels and `transform`s + input features into a space of lower dimension. See [Transformers + that also predict](@ref) for an example. -- Instead of defining the `target_scitype` trait, one declares an - `output_scitype` trait (see above for the meaning). +- The `target_scitype` trait continues to refer to the output of `predict`, if + implemented, while a trait, `output_scitype`, is for the output of `transform`. - An `inverse_transform` can be optionally implemented. The signature is the same as `transform`, as in `MLJModelInterface.inverse_transform(model, fitresult, Xout)`, which: - - must make sense for any `Xout` for which `scitype(Xout) <: - output_scitype(SomeSupervisedModel)` (see below); and - + output_scitype(SomeSupervisedModel)` (see below); and - must return an object `Xin` satisfying `scitype(Xin) <: - input_scitype(SomeSupervisedModel)`. + input_scitype(SomeSupervisedModel)`. + +For sample implementations, see MLJ's [built-in +transformers](https://github.com/JuliaAI/MLJModels.jl/blob/dev/src/builtins/Transformers.jl) +and the clustering models at +[MLJClusteringInterface.jl](https://github.com/jbrea/MLJClusteringInterface.jl). -- A `predict` method may be optionally implemented, and has the same - signature as for supervised models, as in - `MLJModelInterface.predict(model, fitresult, Xnew)`. A use-case is - clustering algorithms that `predict` labels and `transform` new - input features into a space of lower dimension. See [Transformers - that also predict](@ref) for an example. 
## Static models (models that do not generalize) From c0ad363e4a90b2c1124a83b27929cd00e7aa2f5f Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 21 Nov 2023 14:27:26 +1300 Subject: [PATCH 2/4] bump compat MLJFlow = "0.3" --- Project.toml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Project.toml b/Project.toml index 639553429..733bc97a0 100644 --- a/Project.toml +++ b/Project.toml @@ -34,7 +34,7 @@ Distributions = "0.21,0.22,0.23, 0.24, 0.25" MLJBase = "1" MLJBalancing = "0.1" MLJEnsembles = "0.4" -MLJFlow = "0.2" +MLJFlow = "0.3" MLJIteration = "0.6" MLJModels = "0.16" MLJTuning = "0.8" From d619f7952ad32b689da0414244dfb46b3ac17a82 Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 21 Nov 2023 14:28:04 +1300 Subject: [PATCH 3/4] bump 0.20.2 --- Project.toml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Project.toml b/Project.toml index 733bc97a0..1af396687 100644 --- a/Project.toml +++ b/Project.toml @@ -1,7 +1,7 @@ name = "MLJ" uuid = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7" authors = ["Anthony D. Blaom "] -version = "0.20.1" +version = "0.20.2" [deps] CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597" From 16a91b17b55ff0c02430c4ca887c1d959b976e8e Mon Sep 17 00:00:00 2001 From: "Anthony D. Blaom" Date: Tue, 21 Nov 2023 14:49:35 +1300 Subject: [PATCH 4/4] fix test for MLJFlow breakages --- test/exported_names.jl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/test/exported_names.jl b/test/exported_names.jl index c3e2dc8fb..6a9d6743e 100644 --- a/test/exported_names.jl +++ b/test/exported_names.jl @@ -35,7 +35,7 @@ Save() # MLJFlow -MLFlowLogger +MLJFlow.Logger # StatisticalMeasures