diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index 631211cd9..bb6960a62 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-06-06T02:27:12","documenter_version":"1.4.1"}} \ No newline at end of file +{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-06-10T01:25:55","documenter_version":"1.4.1"}} \ No newline at end of file diff --git a/dev/about_mlj/index.html b/dev/about_mlj/index.html index 68ec3da4f..a991e8f4c 100644 --- a/dev/about_mlj/index.html +++ b/dev/about_mlj/index.html @@ -1,5 +1,5 @@ -About MLJ · MLJ

About MLJ

MLJ (Machine Learning in Julia) is a toolbox written in Julia providing a common interface and meta-algorithms for selecting, tuning, evaluating, composing and comparing over 180 machine learning models written in Julia and other languages. In particular, MLJ wraps a large number of scikit-learn models.

MLJ is released under the MIT license.

Lightning tour

For help learning to use MLJ, see Learning MLJ.

A self-contained notebook and Julia script of this demonstration is also available here.

The first code snippet below creates a new Julia environment MLJ_tour and installs just those packages needed for the tour. See Installation for more on creating a Julia environment for use with MLJ.

Julia installation instructions are here.

using Pkg
+About MLJ · MLJ

About MLJ

MLJ (Machine Learning in Julia) is a toolbox written in Julia providing a common interface and meta-algorithms for selecting, tuning, evaluating, composing and comparing over 180 machine learning models written in Julia and other languages. In particular, MLJ wraps a large number of scikit-learn models.

MLJ is released under the MIT license.

Lightning tour

For help learning to use MLJ, see Learning MLJ.

A self-contained notebook and Julia script of this demonstration is also available here.

The first code snippet below creates a new Julia environment MLJ_tour and installs just those packages needed for the tour. See Installation for more on creating a Julia environment for use with MLJ.

Julia installation instructions are here.

using Pkg
 Pkg.activate("MLJ_tour", shared=true)
 Pkg.add("MLJ")
 Pkg.add("MLJIteration")
@@ -54,4 +54,4 @@
       eprint={2012.15505},
       archivePrefix={arXiv},
       primaryClass={cs.LG}
-}
+}
diff --git a/dev/acceleration_and_parallelism/index.html b/dev/acceleration_and_parallelism/index.html index 97f76ced3..e066e4c64 100644 --- a/dev/acceleration_and_parallelism/index.html +++ b/dev/acceleration_and_parallelism/index.html @@ -1,2 +1,2 @@ -Acceleration and Parallelism · MLJ

Acceleration and Parallelism

User-facing interface

To enable composable, extensible acceleration of core MLJ methods, ComputationalResources.jl is utilized to provide some basic types and functions to make implementing acceleration easy. However, ambitious users or package authors have the option to define their own types to be passed as resources to acceleration, which must be <:ComputationalResources.AbstractResource.

Methods which support some form of acceleration support the acceleration keyword argument, which can be passed a "resource" from ComputationalResources. For example, passing acceleration=CPUProcesses() will utilize Distributed's multiprocessing functionality to accelerate the computation, while acceleration=CPUThreads() will use Julia's PARTR threading model to perform acceleration.

The default computational resource is CPU1(), which is simply serial processing via CPU. The default resource can be changed as in this example: MLJ.default_resource(CPUProcesses()). The argument must always have type <:ComputationalResources.AbstractResource. To inspect the current default, use MLJ.default_resource().

Note

You cannot use CPUThreads() with models wrapping Python code.

+Acceleration and Parallelism · MLJ

Acceleration and Parallelism

User-facing interface

To enable composable, extensible acceleration of core MLJ methods, ComputationalResources.jl is utilized to provide some basic types and functions to make implementing acceleration easy. However, ambitious users or package authors have the option to define their own types to be passed as resources to acceleration, which must be <:ComputationalResources.AbstractResource.

Methods which support some form of acceleration support the acceleration keyword argument, which can be passed a "resource" from ComputationalResources. For example, passing acceleration=CPUProcesses() will utilize Distributed's multiprocessing functionality to accelerate the computation, while acceleration=CPUThreads() will use Julia's PARTR threading model to perform acceleration.

The default computational resource is CPU1(), which is simply serial processing via CPU. The default resource can be changed as in this example: MLJ.default_resource(CPUProcesses()). The argument must always have type <:ComputationalResources.AbstractResource. To inspect the current default, use MLJ.default_resource().

Note

You cannot use CPUThreads() with models wrapping Python code.
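For illustration only (a minimal sketch; the use of @load_iris and MLJDecisionTreeInterface here is an assumption, not part of the text above), the default resource and the per-call acceleration keyword can be used as follows:

using MLJ

MLJ.default_resource()                # inspect the current default (CPU1() unless changed)
MLJ.default_resource(CPUProcesses())  # make multiprocessing the new default

X, y = @load_iris
Tree = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0

# per-call override of the default resource:
evaluate(Tree(), X, y, resampling=CV(nfolds=6), measure=log_loss, acceleration=CPUThreads())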

diff --git a/dev/adding_models_for_general_use/index.html b/dev/adding_models_for_general_use/index.html index 36b4e8024..73fb7db7a 100644 --- a/dev/adding_models_for_general_use/index.html +++ b/dev/adding_models_for_general_use/index.html @@ -1,2 +1,2 @@ -Adding Models for General Use · MLJ
+Adding Models for General Use · MLJ
diff --git a/dev/api/index.html b/dev/api/index.html index 498fee261..557fe929b 100644 --- a/dev/api/index.html +++ b/dev/api/index.html @@ -1,2 +1,2 @@ -Index of Methods · MLJ

Index of Methods

+Index of Methods · MLJ

Index of Methods

diff --git a/dev/benchmarking/index.html b/dev/benchmarking/index.html index b021b6b3b..e77b5130f 100644 --- a/dev/benchmarking/index.html +++ b/dev/benchmarking/index.html @@ -1,2 +1,2 @@ -Benchmarking · MLJ
+Benchmarking · MLJ
diff --git a/dev/common_mlj_workflows/index.html b/dev/common_mlj_workflows/index.html index d590c323e..faa4b7cc5 100644 --- a/dev/common_mlj_workflows/index.html +++ b/dev/common_mlj_workflows/index.html @@ -1,5 +1,5 @@ -Common MLJ Workflows · MLJ

Common MLJ Workflows

This demo assumes you have certain packages in your active package environment. To activate a new environment, "MyNewEnv", with just these packages, do this in a new REPL session:

using Pkg
+Common MLJ Workflows · MLJ

Common MLJ Workflows

This demo assumes you have certain packages in your active package environment. To activate a new environment, "MyNewEnv", with just these packages, do this in a new REPL session:

using Pkg
 Pkg.activate("MyNewEnv")
 Pkg.add(["MLJ", "RDatasets", "DataFrames", "MLJDecisionTreeInterface",
     "MLJMultivariateStatsInterface", "NearestNeighborModels", "MLJGLMInterface",
@@ -256,8 +256,8 @@
   rng = Random._GLOBAL_RNG())

Bind the model and data together in a machine, which will additionally store the learned parameters (fitresults) when fit:

mach = machine(tree, X, y)
untrained Machine; caches model-specific representations of data
   model: DecisionTreeClassifier(max_depth = 2, …)
   args: 
-    1:	Source @906 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @359 ⏎ AbstractVector{Multiclass{2}}
+    1:	Source @822 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @362 ⏎ AbstractVector{Multiclass{2}}
 

Split row indices into training and evaluation rows:

train, test = partition(eachindex(y), 0.7); # 70:30 split
([1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  131, 132, 133, 134, 135, 136, 137, 138, 139, 140], [141, 142, 143, 144, 145, 146, 147, 148, 149, 150  …  191, 192, 193, 194, 195, 196, 197, 198, 199, 200])

Fit on the train data set and evaluate on the test data set:

fit!(mach, rows=train)
 yhat = predict(mach, X[test,:])
 LogLoss(tol=1e-4)(yhat, y[test])
1.0788055664326648

Note LogLoss() has aliases log_loss and cross_entropy.

Predict on the new data set:

Xnew = (FL = rand(3), RW = rand(3), CL = rand(3), CW = rand(3), BD = rand(3))
@@ -330,14 +330,14 @@
 ┌───┬──────────────────────┬──────────────┬─────────────┐
 │   │ measure              │ operation    │ measurement │
 ├───┼──────────────────────┼──────────────┼─────────────┤
-│ A │ LogLoss(             │ predict      │ 4.81        │
+│ A │ LogLoss(             │ predict      │ 4.3         │
 │   │   tol = 2.22045e-16) │              │             │
 │ B │ Accuracy()           │ predict_mode │ 0.736       │
 └───┴──────────────────────┴──────────────┴─────────────┘
 ┌───┬───────────────────────┬─────────┐
 │   │ per_fold              │ 1.96*SE │
 ├───┼───────────────────────┼─────────┤
-│ A │ [5.1, 6.48, 3.07]     │ 2.38    │
+│ A │ [5.1, 4.97, 3.01]     │ 1.62    │
 │ B │ [0.696, 0.739, 0.769] │ 0.0513  │
 └───┴───────────────────────┴─────────┘
 

Changing a hyperparameter and re-evaluating:

tree.max_depth = 3
@@ -373,20 +373,20 @@
 mach =  machine(ols, X, y) |> fit!
trained Machine; caches model-specific representations of data
   model: LinearRegressor(fit_intercept = true, …)
   args: 
-    1:	Source @645 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @770 ⏎ AbstractVector{Continuous}
+    1:	Source @420 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @896 ⏎ AbstractVector{Continuous}
 

Get a named tuple representing the learned parameters, human-readable if appropriate:

fitted_params(mach)
(features = [:x1, :x2],
- coef = [1.0019484491170485, -2.0124583022683673],
- intercept = 0.05839955441025016,)

Get other training-related information:

report(mach)
(stderror = [0.007289046266540614, 0.009314702321351547, 0.009664751997931865],
+ coef = [0.9991493052514759, -2.000770727737916],
+ intercept = 0.04558920911757358,)

Get other training-related information:

report(mach)
(stderror = [0.0075716576252445695, 0.010270084681692026, 0.009571713656806065],
  dof_residual = 97.0,
- vcov = [5.3130195475769666e-5 -4.737168085144333e-5 -4.924311372223852e-5; -4.737168085144333e-5 8.676367933539189e-5 1.3161433444949447e-5; -4.924311372223852e-5 1.3161433444949447e-5 9.340743118152798e-5],
- deviance = 0.07678443409575168,
+ vcov = [5.7329999193924235e-5 -5.429443842036848e-5 -4.7225605422306874e-5; -5.429443842036848e-5 0.0001054746393691252 5.6999071938576035e-6; -4.7225605422306874e-5 5.6999071938576035e-6 9.161770232788773e-5],
+ deviance = 0.07659888168821351,
  coef_table = ──────────────────────────────────────────────────────────────────────────────
                   Coef.  Std. Error        t  Pr(>|t|)   Lower 95%   Upper 95%
 ──────────────────────────────────────────────────────────────────────────────
-(Intercept)   0.0583996  0.00728905     8.01    <1e-11   0.0439328   0.0728663
-x1            1.00195    0.0093147    107.57    <1e-99   0.983461    1.02044
-x2           -2.01246    0.00966475  -208.23    <1e-99  -2.03164    -1.99328
+(Intercept)   0.0455892  0.00757166     6.02    <1e-07   0.0305616   0.0606169
+x1            0.999149   0.0102701     97.29    <1e-97   0.978766    1.01953
+x2           -2.00077    0.00957171  -209.03    <1e-99  -2.01977    -1.98177
 ──────────────────────────────────────────────────────────────────────────────,)

Basic fit/transform for unsupervised models

Load data:

X, y = @load_iris  # a table and a vector
 train, test = partition(eachindex(y), 0.97, shuffle=true, rng=123)
([125, 100, 130, 9, 70, 148, 39, 64, 6, 107  …  110, 59, 139, 21, 112, 144, 140, 72, 109, 41], [106, 147, 47, 5])

Instantiate and fit the model/machine:

PCA = @load PCA
 pca = PCA(maxoutdim=2)
@@ -394,12 +394,12 @@
 fit!(mach, rows=train)
trained Machine; caches model-specific representations of data
   model: PCA(maxoutdim = 2, …)
   args: 
-    1:	Source @053 ⏎ Table{AbstractVector{Continuous}}
+    1:	Source @625 ⏎ Table{AbstractVector{Continuous}}
 

Transform selected data bound to the machine:

transform(mach, rows=test);
(x1 = [-3.394282685448322, -1.5219827578765053, 2.53824745518522, 2.7299639893931382],
  x2 = [0.547245022374522, -0.36842368617126425, 0.5199299511335688, 0.3448466122232349],)

Transform new data:

Xnew = (sepal_length=rand(3), sepal_width=rand(3),
         petal_length=rand(3), petal_width=rand(3));
-transform(mach, Xnew)
(x1 = [4.60254619833418, 4.963408439322138, 4.73352667809396],
- x2 = [-4.450747224690028, -4.340052887208079, -4.323758570369482],)

Inverting learned transformations

y = rand(100);
+transform(mach, Xnew)
(x1 = [4.932980176376836, 4.673447918876899, 5.286789315108594],
+ x2 = [-4.587828781511142, -4.427755497747251, -5.031367248586764],)

Inverting learned transformations

y = rand(100);
 stand = Standardizer()
 mach = machine(stand, y)
 fit!(mach)
@@ -462,13 +462,13 @@
   logger = nothing)

Bound the wrapped model to data:

mach = machine(tuned_forest, X, y)
untrained Machine; does not cache data
   model: ProbabilisticTunedModel(model = ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), …)
   args: 
-    1:	Source @313 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @689 ⏎ AbstractVector{Multiclass{3}}
+    1:	Source @176 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @073 ⏎ AbstractVector{Multiclass{3}}
 

Fitting the resultant machine optimizes the hyperparameters specified in range, using the specified tuning and resampling strategies and performance measure (possibly a vector of measures), and retrains on all data bound to the machine:

fit!(mach)
trained Machine; does not cache data
   model: ProbabilisticTunedModel(model = ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), …)
   args: 
-    1:	Source @313 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @689 ⏎ AbstractVector{Multiclass{3}}
+    1:	Source @176 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @073 ⏎ AbstractVector{Multiclass{3}}
 

Inspecting the optimal model:

F = fitted_params(mach)
(best_model = ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …),
  best_fitted_params = (fitresult = WrappedEnsemble(atom = DecisionTreeClassifier(max_depth = -1, …), …),),)
F.best_model
ProbabilisticEnsembleModel(
   model = DecisionTreeClassifier(
@@ -476,7 +476,7 @@
         min_samples_leaf = 1, 
         min_samples_split = 2, 
         min_purity_increase = 0.0, 
-        n_subfeatures = 4, 
+        n_subfeatures = 3, 
         post_prune = false, 
         merge_purity_threshold = 1.0, 
         display_depth = 5, 
@@ -489,12 +489,12 @@
   acceleration = CPU1{Nothing}(nothing), 
   out_of_bag_measure = Any[])

Inspecting details of tuning procedure:

r = report(mach);
 keys(r)
(:best_model, :best_history_entry, :history, :best_report, :plotting)
r.history[[1,end]]
2-element Vector{@NamedTuple{model::MLJEnsembles.ProbabilisticEnsembleModel{MLJDecisionTreeInterface.DecisionTreeClassifier}, measure::Vector{StatisticalMeasuresBase.RobustMeasure{StatisticalMeasuresBase.FussyMeasure{StatisticalMeasuresBase.RobustMeasure{StatisticalMeasures._BrierLossType}, typeof(StatisticalMeasures.l2_check)}}}, measurement::Vector{Float64}, per_fold::Vector{Vector{Float64}}, evaluation::CompactPerformanceEvaluation{MLJEnsembles.ProbabilisticEnsembleModel{MLJDecisionTreeInterface.DecisionTreeClassifier}, Vector{StatisticalMeasuresBase.RobustMeasure{StatisticalMeasuresBase.FussyMeasure{StatisticalMeasuresBase.RobustMeasure{StatisticalMeasures._BrierLossType}, typeof(StatisticalMeasures.l2_check)}}}, Vector{Float64}, Vector{typeof(predict)}, Vector{Vector{Float64}}, Vector{Vector{Vector{Float64}}}, CV}}}:
- (model = ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), measure = [BrierLoss()], measurement = [0.11061688888888872], per_fold = [[0.008769777777777862, 0.00018311111111112943, 0.13994577777777764, 0.15614133333333288, 0.14898399999999967, 0.20967733333333313]], evaluation = CompactPerformanceEvaluation(0.111,))
- (model = ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), measure = [BrierLoss()], measurement = [0.12125549176954746], per_fold = [[0.02781777777777793, 0.007603555555555701, 0.19223187037037057, 0.1535252222222222, 0.1663280555555555, 0.18002646913580272]], evaluation = CompactPerformanceEvaluation(0.121,))

Visualizing these results:

using Plots
+ (model = ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), measure = [BrierLoss()], measurement = [0.10329451851851834], per_fold = [[-0.0, -0.0, 0.12643466666666656, 0.15470222222222174, 0.13779822222222193, 0.20083199999999976]], evaluation = CompactPerformanceEvaluation(0.103,))
+ (model = ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), measure = [BrierLoss()], measurement = [0.11934060905349804], per_fold = [[0.026442666666666767, 0.005732444444444598, 0.1926373333333334, 0.14254809876543217, 0.1626662222222222, 0.1860168888888891]], evaluation = CompactPerformanceEvaluation(0.119,))

Visualizing these results:

using Plots
 plot(mach)

Predicting on new data using the optimized model trained on all data:

predict(mach, Xnew)
3-element UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:
  UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)
  UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)
- UnivariateFinite{Multiclass{3}}(setosa=>0.767, versicolor=>0.213, virginica=>0.02)

Constructing linear pipelines

Reference: Linear Pipelines

Constructing a linear (unbranching) pipeline with a learned target transformation/inverse transformation:

X, y = @load_reduced_ames
+ UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)

Constructing linear pipelines

Reference: Linear Pipelines

Constructing a linear (unbranching) pipeline with a learned target transformation/inverse transformation:

X, y = @load_reduced_ames
 KNN = @load KNNRegressor
 knn_with_target = TransformedTargetModel(model=KNN(K=3), transformer=Standardizer())
TransformedTargetModelDeterministic(
   model = KNNRegressor(
@@ -558,14 +558,14 @@
 ┌──────────────────────┬───────────┬─────────────┐
 │ measure              │ operation │ measurement │
 ├──────────────────────┼───────────┼─────────────┤
-│ LogLoss(             │ predict   │ 0.428       │
+│ LogLoss(             │ predict   │ 0.429       │
 │   tol = 2.22045e-16) │           │             │
 └──────────────────────┴───────────┴─────────────┘
-┌────────────────────────────────────────────────┬─────────┐
-│ per_fold                                       │ 1.96*SE │
-├────────────────────────────────────────────────┼─────────┤
-│ [3.89e-15, 3.89e-15, 0.294, 0.41, 1.56, 0.299] │ 0.51    │
-└────────────────────────────────────────────────┴─────────┘
+┌─────────────────────────────────────────────────┬─────────┐
+│ per_fold                                        │ 1.96*SE │
+├─────────────────────────────────────────────────┼─────────┤
+│ [3.89e-15, 3.89e-15, 0.302, 0.381, 1.56, 0.329] │ 0.507   │
+└─────────────────────────────────────────────────┴─────────┘
 

Performance curves

Generate a plot of performance, as a function of some hyperparameter (building on the preceding example)

Single performance curve:

r = range(forest, :n, lower=1, upper=1000, scale=:log10)
 curve = learning_curve(mach,
                        range=r,
@@ -575,7 +575,7 @@
                        verbosity=0)
(parameter_name = "n",
  parameter_scale = :log10,
  parameter_values = [1, 2, 3, 4, 5, 6, 7, 8, 10, 11  …  281, 324, 373, 429, 494, 569, 655, 754, 869, 1000],
- measurements = [4.004850376568572, 4.1126732713223415, 4.067922726718731, 4.123999873775369, 4.150105956717014, 2.688089225524209, 2.715285824731319, 2.7309139415415857, 2.7444858783511297, 2.7476450089856033  …  1.269185619048552, 1.2786364928754186, 1.2725212042652867, 1.2789570911204242, 1.2797130430389276, 1.2768033472128724, 1.2644056972193418, 1.2598962094386172, 1.2612790173706743, 1.2557508210679436],)
using Plots
+ measurements = [8.009700753137146, 7.3165535725772, 4.165577152378119, 2.7016641697125308, 2.7264652068796558, 2.667200175335509, 2.679693430839872, 2.6990484091188085, 2.711284561225735, 1.95524844163632  …  1.2474446228963525, 1.2455088836705839, 1.243424421444324, 1.2363329736702997, 1.239539419310721, 1.2384777558609936, 1.2373480020980578, 1.243692344943664, 1.2429655812800875, 1.2395704269170391],)
using Plots
 plot(curve.parameter_values, curve.measurements,
      xlab=curve.parameter_name, xscale=curve.parameter_scale)

Multiple curves:

curve = learning_curve(mach,
                        range=r,
@@ -587,5 +587,5 @@
                        verbosity=0)
(parameter_name = "n",
  parameter_scale = :log10,
  parameter_values = [1, 2, 3, 4, 5, 6, 7, 8, 10, 11  …  281, 324, 373, 429, 494, 569, 655, 754, 869, 1000],
- measurements = [4.004850376568572 8.009700753137146 16.820371581588002 9.611640903764574; 4.004850376568572 8.009700753137146 9.087929700674836 9.611640903764574; … ; 1.2099979316961877 1.2316766858863117 1.266241881645686 1.274322191002287; 1.214989736207193 1.2334567682916915 1.2684272251885533 1.2728908797309264],)
plot(curve.parameter_values, curve.measurements,
-     xlab=curve.parameter_name, xscale=curve.parameter_scale)

+ measurements = [4.004850376568572 9.611640903764574 16.820371581588002 9.611640903764574; 4.004850376568572 8.040507294495367 16.820371581588002 9.611640903764574; … ; 1.2065681074574945 1.2347751366582833 1.264175918714098 1.2769557003939005; 1.2117643002923562 1.233755659045757 1.2660230330657796 1.2763335085717247],)
plot(curve.parameter_values, curve.measurements,
+     xlab=curve.parameter_name, xscale=curve.parameter_scale)

diff --git a/dev/composing_models/index.html b/dev/composing_models/index.html index 208e8be61..0757e7dd5 100644 --- a/dev/composing_models/index.html +++ b/dev/composing_models/index.html @@ -1,2 +1,2 @@ -Composing Models · MLJ

Composing Models

Three common ways of combining multiple models together have out-of-the-box implementations in MLJ:

  • Linear Pipelines (Pipeline) - for unbranching chains that take the output of one model (e.g., dimension reduction, such as PCA) and make it the input of the next model in the chain (e.g., a classification model, such as EvoTreeClassifier). To include transformations of the target variable in a supervised pipeline model, see Target Transformations.
  • Homogeneous Ensembles (EnsembleModel) - for blending the predictions of multiple supervised models all of the same type, but which receive different views of the training data to reduce overall variance. The technique implemented here is known as observation bagging.
  • Model Stacking - (Stack) for combining the predictions of a smaller number of models of possibly different types, with the help of an adjudicating model.

Additionally, more complicated model compositions are possible using:

  • Learning Networks - "blueprints" for combining models in flexible ways; these are simple transformations of your existing workflows which can be "exported" to define new, stand-alone model types.
+Composing Models · MLJ

Composing Models

Three common ways of combining multiple models together have out-of-the-box implementations in MLJ:

  • Linear Pipelines (Pipeline) - for unbranching chains that take the output of one model (e.g., dimension reduction, such as PCA) and make it the input of the next model in the chain (e.g., a classification model, such as EvoTreeClassifier); a short pipeline sketch appears below. To include transformations of the target variable in a supervised pipeline model, see Target Transformations.
  • Homogeneous Ensembles (EnsembleModel) - for blending the predictions of multiple supervised models all of the same type, but which receive different views of the training data to reduce overall variance. The technique implemented here is known as observation bagging.
  • Model Stacking - (Stack) for combining the predictions of a smaller number of models of possibly different types, with the help of an adjudicating model.

Additionally, more complicated model compositions are possible using:

  • Learning Networks - "blueprints" for combining models in flexible ways; these are simple transformations of your existing workflows which can be "exported" to define new, stand-alone model types.
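As a minimal sketch of the first of these options (the classifier choice and the presence of MLJDecisionTreeInterface in the environment are assumptions), a linear pipeline is built with |> and then behaves like any other model:

using MLJ
Tree = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0

pipe = Standardizer() |> Tree(max_depth=3)  # a Pipeline: standardize features, then classify

X, y = @load_iris
evaluate(pipe, X, y, resampling=CV(nfolds=3), measure=log_loss)

The wrapped pipeline can itself be tuned, iterated or ensembled, exactly as for an atomic model.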
diff --git a/dev/controlling_iterative_models/index.html b/dev/controlling_iterative_models/index.html index 89bdf8582..c51fccf44 100644 --- a/dev/controlling_iterative_models/index.html +++ b/dev/controlling_iterative_models/index.html @@ -1,5 +1,5 @@ -Controlling Iterative Models · MLJ

Controlling Iterative Models

Iterative supervised machine learning models are usually trained until an out-of-sample estimate of the performance satisfies some stopping criterion, such as k consecutive deteriorations of the performance (see Patience below). A more sophisticated kind of control might dynamically mutate parameters, such as a learning rate, in response to the behavior of these estimates.

Some iterative model implementations enable some form of automated control, with the method and options for doing so varying from model to model. But sometimes it is up to the user to arrange control, which in the crudest case reduces to manually experimenting with the iteration parameter.

In response to this ad hoc state of affairs, MLJ provides a uniform and feature-rich interface for controlling any iterative model that exposes its iteration parameter as a hyper-parameter, and which implements the "warm restart" behavior described in Machines.

Basic use

As in Tuning Models, iteration control in MLJ is implemented as a model wrapper, which allows composition with other meta-algorithms. Ordinarily, the wrapped model behaves just like the original model, but with the training occurring on a subset of the provided data (to allow computation of an out-of-sample loss) and with the iteration parameter automatically determined by the controls specified in the wrapper.

By setting retrain=true one can ask that the wrapped model retrain on all supplied data, after learning the appropriate number of iterations from the controlled training phase:

using MLJ
+Controlling Iterative Models · MLJ

Controlling Iterative Models

Iterative supervised machine learning models are usually trained until an out-of-sample estimate of the performance satisfies some stopping criterion, such as k consecutive deteriorations of the performance (see Patience below). A more sophisticated kind of control might dynamically mutate parameters, such as a learning rate, in response to the behavior of these estimates.

Some iterative model implementations enable some form of automated control, with the method and options for doing so varying from model to model. But sometimes it is up to the user to arrange control, which in the crudest case reduces to manually experimenting with the iteration parameter.

In response to this ad hoc state of affairs, MLJ provides a uniform and feature-rich interface for controlling any iterative model that exposes its iteration parameter as a hyper-parameter, and which implements the "warm restart" behavior described in Machines.

Basic use

As in Tuning Models, iteration control in MLJ is implemented as a model wrapper, which allows composition with other meta-algorithms. Ordinarily, the wrapped model behaves just like the original model, but with the training occurring on a subset of the provided data (to allow computation of an out-of-sample loss) and with the iteration parameter automatically determined by the controls specified in the wrapper.

By setting retrain=true one can ask that the wrapped model retrain on all supplied data, after learning the appropriate number of iterations from the controlled training phase:

using MLJ
 
 X, y = make_moons(100, rng=123, noise=0.5)
 EvoTreeClassifier = @load EvoTreeClassifier verbosity=0
@@ -49,8 +49,8 @@
  - rng: Random.MersenneTwister(123)
 , …)
   args:
-    1:	Source @303 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @103 ⏎ AbstractVector{Multiclass{2}}

As detailed under IteratedModel below, the specified controls are repeatedly applied in sequence to a training machine, constructed under the hood, until one of the controls triggers a stop. Here Step(5) means "Compute 5 more iterations" (in this case starting from none); Patience(2) means "Stop at the end of the control cycle if there have been 2 consecutive drops in the log loss"; and NumberLimit(100) is a safeguard ensuring a stop after 100 control cycles (500 iterations). See Controls provided below for a complete list.

Because iteration is implemented as a wrapper, the "self-iterating" model can be evaluated using cross-validation, say, and the number of iterations on each fold will generally be different:

e = evaluate!(mach, resampling=CV(nfolds=3), measure=log_loss, verbosity=0);
+    1:	Source @119 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @599 ⏎ AbstractVector{Multiclass{2}}

As detailed under IteratedModel below, the specified controls are repeatedly applied in sequence to a training machine, constructed under the hood, until one of the controls triggers a stop. Here Step(5) means "Compute 5 more iterations" (in this case starting from none); Patience(2) means "Stop at the end of the control cycle if there have been 2 consecutive drops in the log loss"; and NumberLimit(100) is a safeguard ensuring a stop after 100 control cycles (500 iterations). See Controls provided below for a complete list.

Because iteration is implemented as a wrapper, the "self-iterating" model can be evaluated using cross-validation, say, and the number of iterations on each fold will generally be different:

e = evaluate!(mach, resampling=CV(nfolds=3), measure=log_loss, verbosity=0);
 map(e.report_per_fold) do r
     r.n_iterations
 end
3-element Vector{Int64}:
@@ -79,8 +79,8 @@
 trained Machine; does not cache data
   model: DeterministicIteratedModel(model = DeterministicTunedModel(model = RidgeRegressor(lambda = 1.0, …), …), …)
   args:
-    1:	Source @948 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @268 ⏎ AbstractVector{Continuous}
julia> report(mach).model_report.best_modelRidgeRegressor(
+    1:	Source @595 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @722 ⏎ AbstractVector{Continuous}
julia> report(mach).model_report.best_modelRidgeRegressor(
   lambda = 0.4243170708090101,
   fit_intercept = true,
   penalize_intercept = false,
@@ -164,4 +164,4 @@
 fit!(mach) # train for 100 iterations
 iterated_model.controls = [Step(1), NumberLimit(50)]
 fit!(mach) # train for an *extra* 50 iterations

More generally, if iterated_model is mutated and fit!(mach) is called again, then a warm restart is attempted if the only parameters to change are model or controls or both.

Specifically, train_mach.model is mutated to match the current value of iterated_model.model and the iteration parameter of the latter is updated to the last value used in the preceding fit!(mach) call. Then repeated application of the (updated) controls begin anew.

source

Controls

IterationControl.StepType
Step(; n=1)

An iteration control, as in, Step(2).

Train for n more iterations. Will never trigger a stop.

source
EarlyStopping.TimeLimitType
TimeLimit(; t=0.5)

An early stopping criterion for loss-reporting iterative algorithms.

Stopping is triggered after t hours have elapsed since the stopping criterion was initiated.

Any Julia built-in Real type can be used for t. Subtypes of Period may also be used, as in TimeLimit(t=Minute(30)).

Internally, t is rounded to the nearest millisecond.

source
EarlyStopping.NumberLimitType
NumberLimit(; n=100)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered by n consecutive loss updates, excluding "training" loss updates.

If wrapped in a stopper::EarlyStopper, this is the number of calls to done!(stopper).

source
EarlyStopping.NumberSinceBestType
NumberSinceBest(; n=6)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered when the number of calls to the control, since the lowest value of the loss so far, is n.

For a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option.

source
EarlyStopping.InvalidValueType
InvalidValue()

An early stopping criterion for loss-reporting iterative algorithms.

Stop if a loss (or training loss) is NaN, Inf or -Inf (or, more precisely, if isnan(loss) or isinf(loss) is true).

For a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option.

source
EarlyStopping.ThresholdType
Threshold(; value=0.0)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered as soon as the loss drops below value.

For a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option.

source
EarlyStopping.GLType
GL(; alpha=2.0)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered when the (rescaled) generalization loss exceeds the threshold alpha.

Terminology. Suppose $E_1, E_2, ..., E_t$ are a sequence of losses, for example, out-of-sample estimates of the loss associated with some iterative machine learning algorithm. Then the generalization loss at time t, is given by

$GL_t = 100 \frac{E_t - E_{opt}}{|E_{opt}|}$

where $E_{opt}$ is the minimum value of the sequence.

Reference: Prechelt, Lutz (1998): "Early Stopping- But When?", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer..

source
EarlyStopping.PQType
PQ(; alpha=0.75, k=5, tol=eps(Float64))

A stopping criterion for training iterative supervised learners.

A stop is triggered when Prechelt's progress-modified generalization loss exceeds the threshold $PQ_t > alpha$, or if the training progress drops below $P_j ≤ tol$. Here k is the number of training (in-sample) losses used to estimate the training progress.

Context and explanation of terminology

The training progress at time $j$ is defined by

$P_j = 1000 |M - m|/|m|$

where $M$ is the mean of the last k training losses $F_1, F_2, …, F_k$ and $m$ is the minimum value of those losses.

The progress-modified generalization loss at time $t$ is then given by

$PQ_t = GL_t / P_t$

where $GL_t$ is the generalization loss at time $t$; see GL.

PQ will stop when the following are true:

  1. At least k training samples have been collected via done!(c::PQ, loss; training = true) or update_training(c::PQ, loss, state)
  2. The last update was an out-of-sample update. (done!(::PQ, loss; training=true) is always false)
  3. The progress-modified generalization loss exceeds the threshold $PQ_t > alpha$ OR the training progress stalls $P_j ≤ tol$.

Reference: Prechelt, Lutz (1998): "Early Stopping- But When?", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer..

source
IterationControl.InfoType
Info(f=identity)

An iteration control, as in, Info(my_loss_function).

Log to Info the value of f(m), where m is the object being iterated. If IterationControl.expose(m) has been overloaded, then log f(expose(m)) instead.

Can be suppressed by setting the global verbosity level sufficiently low.

See also Warn, Error.

source
IterationControl.WarnType
Warn(predicate; f="")

An iteration control, as in, Warn(m -> length(m.cache) > 100, f="Memory low").

If predicate(m) is true, then log to Warn the value of f (or f(IterationControl.expose(m)) if f is a function). Here m is the object being iterated.

Can be suppressed by setting the global verbosity level sufficiently low.

See also Info, Error.

source
IterationControl.ErrorType
Error(predicate; f="", exception=nothing)

An iteration control, as in, Error(m -> isnan(m.bias), f="Bias overflow!").

If predicate(m) is true, then log at the Error level the value of f (or f(IterationControl.expose(m)) if f is a function) and stop iteration at the end of the current control cycle. Here m is the object being iterated.

Specify exception=... to throw an immediate exception, without waiting until the end of the control cycle.

See also Info, Warn.

source
IterationControl.CallbackType
Callback(f=_->nothing, stop_if_true=false, stop_message=nothing, raw=false)

An iteration control, as in, Callback(m->put!(v, my_loss_function(m))).

Call f(IterationControl.expose(m)), where m is the object being iterated, unless raw=true, in which case call f(m) (guaranteed if expose has not been overloaded). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
IterationControl.WithNumberDoType
WithNumberDo(f=n->@info("number: $n"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithNumberDo(n->put!(my_channel, n)).

Call f(n + 1), where n is the number of complete control cycles of the control (so, n = 1, 2, 3, ..., unless the control is wrapped in IterationControl.skip).

If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithIterationsDoType
WithIterationsDo(f=x->@info("iterations: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithIterationsDo(x->put!(my_channel, x)).

Call f(x), where x is the current number of model iterations (generally more than the number of control cycles). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
IterationControl.WithLossDoType
WithLossDo(f=x->@info("loss: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithLossDo(x->put!(my_losses, x)).

Call f(loss), where loss is current loss.

If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
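For instance (a hypothetical sketch, reusing X, y and EvoTreeClassifier from the Basic use example above; the control settings are arbitrary), WithLossDo can record every out-of-sample loss computed during iteration:

losses = Float64[]
iterated_model = IteratedModel(model=EvoTreeClassifier(),
                               resampling=Holdout(),
                               measure=log_loss,
                               controls=[Step(5),
                                         WithLossDo(loss -> push!(losses, loss)),
                                         NumberLimit(20)])
mach = machine(iterated_model, X, y) |> fit!
losses  # one recorded loss per control cycle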
IterationControl.WithTrainingLossesDoType
WithTrainingLossesDo(f=v->@info("training: $v"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithTrainingLossesDo(v->put!(my_losses, last(v)).

Call f(training_losses), where training_losses is the vector of the most recent batch of training losses.

If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithEvaluationDoType
WithEvaluationDo(f=x->@info("evaluation: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithEvaluationDo(x->put!(my_channel, x)).

Call f(x), where x is the latest performance evaluation, as returned by evaluate!(train_mach, resampling=..., ...). Not valid if resampling=nothing. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithFittedParamsDoType
WithFittedParamsDo(f=x->@info("fitted_params: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithFittedParamsDo(x->put!(my_channel, x)).

Call f(x), where x = fitted_params(mach) is the fitted parameters of the training machine, mach, in its current state. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithReportDoType
WithReportDo(f=x->@info("report: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithReportDo(x->put!(my_channel, x)).

Call f(x), where x = report(mach) is the report associated with the training machine, mach, in its current state. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithModelDoType
WithModelDo(f=x->@info("model: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithModelDo(x->put!(my_channel, x)).

Call f(x), where x is the model associated with the training machine; f may mutate x, as in f(x) = (x.learning_rate *= 0.9). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.WithMachineDoType
WithMachineDo(f=x->@info("machine: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithMachineDo(x->put!(my_channel, x)).

Call f(x), where x is the training machine in its current state. If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

source
MLJIteration.SaveType
Save(filename="machine.jls")

An iteration control, as in, Save("run3/machine.jls").

Save the current state of the machine being iterated to disk, using the provided filename, decorated with a number, as in "run3/machine42.jls". The default behaviour uses the Serialization module but this can be changed by setting the method=save_fn(::String, ::Any) argument where save_fn is any serialization method. For more on what is meant by "the machine being iterated", see IteratedModel.

source

Control wrappers

IterationControl.skipFunction
IterationControl.skip(control, predicate=1)

An iteration control wrapper.

If predicate is an integer, k: Apply control on every kth call to apply the wrapped control, starting with the kth call.

If predicate is a function: Apply control as usual when predicate(n + 1) is true but otherwise skip. Here n is the number of control cycles applied so far.

source
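As a hedged illustration (the filename and frequency are arbitrary, and predicate is passed as a keyword matching the signature above), one might save a machine snapshot only every tenth control cycle:

controls = [Step(5),
            IterationControl.skip(Save("snapshot.jls"), predicate=10),
            NumberLimit(100)]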
IterationControl.louderFunction
IterationControl.louder(control, by=1)

Wrap control to make it more (or less) verbose. The same as control, but as if the global verbosity were increased by the value by.

source
IterationControl.with_state_doFunction
IterationControl.with_state_do(control,
-                              f=x->@info "$(typeof(control)) state: $x")

Wrap control to give access to its internal state. Acts exactly like control except that f is called on the internal state of control. If f is not specified, the control type and state are logged to Info at every update (useful for debugging new controls).

Warning. The internal state of a control is not yet considered part of the public interface and could change in any pre-1.0 release of IterationControl.jl.

source
+ f=x->@info "$(typeof(control)) state: $x")

Wrap control to give access to its internal state. Acts exactly like control except that f is called on the internal state of control. If f is not specified, the control type and state are logged to Info at every update (useful for debugging new controls).

Warning. The internal state of a control is not yet considered part of the public interface and could change in any pre-1.0 release of IterationControl.jl.

source
IterationControl.compositeFunction
composite(controls...)

Construct an iteration control that applies the specified controls in sequence.

source
diff --git a/dev/correcting_class_imbalance/index.html b/dev/correcting_class_imbalance/index.html index 1910fcb9a..fb4444e5d 100644 --- a/dev/correcting_class_imbalance/index.html +++ b/dev/correcting_class_imbalance/index.html @@ -1,5 +1,5 @@ -Correcting Class Imbalance · MLJ

Correcting Class Imbalance

Oversampling and undersampling methods

Models providing oversampling or undersampling methods, to correct for class imbalance, are listed under Class Imbalance. In particular, several popular algorithms are provided by the Imbalance.jl package, which includes detailed documentation and tutorials.

Incorporating class imbalance in supervised learning pipelines

One or more oversampling/undersampling algorithms can be fused with an MLJ classifier using the BalancedModel wrapper. This creates a new classifier which can be treated like any other; resampling to correct for class imbalance, relevant only for training of the atomic classifier, is then carried out internally. If, for example, one applies cross-validation to the wrapped classifier (using evaluate!, say), then over/undersampling is repeated for each training fold automatically.

Refer to the MLJBalancing.jl documentation for further details.

MLJBalancing.BalancedModelFunction
BalancedModel(; model=nothing, balancer1=balancer_model1, balancer2=balancer_model2, ...)
+Correcting Class Imbalance · MLJ

Correcting Class Imbalance

Oversampling and undersampling methods

Models providing oversampling or undersampling methods, to correct for class imbalance, are listed under Class Imbalance. In particular, several popular algorithms are provided by the Imbalance.jl package, which includes detailed documentation and tutorials.

Incorporating class imbalance in supervised learning pipelines

One or more oversampling/undersampling algorithms can be fused with an MLJ classifier using the BalancedModel wrapper. This creates a new classifier which can be treated like any other; resampling to correct for class imbalance, relevant only for training of the atomic classifier, is then carried out internally. If, for example, one applies cross-validation to the wrapped classifier (using evaluate!, say), then over/undersampling is repeated for each training fold automatically.

Refer to the MLJBalancing.jl documentation for further details.

MLJBalancing.BalancedModelFunction
BalancedModel(; model=nothing, balancer1=balancer_model1, balancer2=balancer_model2, ...)
 BalancedModel(model;  balancer1=balancer_model1, balancer2=balancer_model2, ...)

Given a classification model, and one or more balancer models that all implement the MLJModelInterface, BalancedModel wraps an arbitrary number of balancing models and a classifier together in a sequential pipeline.

Operation

  • During training, data is first passed to balancer1, the result is passed to balancer2, and so on; the result from the final balancer is then passed to the classifier for training.
  • During prediction, the balancers have no effect.

Arguments

  • model::Supervised: A classification model that implements the MLJModelInterface.
  • balancer1::Static=...: The first balancer model to pass the data to. This keyword argument can have any name.
  • balancer2::Static=...: The second balancer model to pass the data to. This keyword argument can have any name.
  • and so on for an arbitrary number of balancers.

Returns

  • An instance of type ProbabilisticBalancedModel or DeterministicBalancedModel, depending on the prediction type of model.

Example

using MLJ
 using Imbalance
 
@@ -20,4 +20,4 @@
 
 # now this behaves as a unified model that can be trained, validated, fine-tuned, etc.
 mach = machine(balanced_model, X, y)
-fit!(mach)
source
+fit!(mach)
source
diff --git a/dev/evaluating_model_performance/index.html b/dev/evaluating_model_performance/index.html index 2929f9259..135bf19da 100644 --- a/dev/evaluating_model_performance/index.html +++ b/dev/evaluating_model_performance/index.html @@ -1,5 +1,5 @@ -Evaluating Model Performance · MLJ

Evaluating Model Performance

MLJ allows quick evaluation of a supervised model's performance against a battery of selected losses or scores. For more on available performance measures, see Performance Measures.

In addition to hold-out and cross-validation, the user can specify an explicit list of train/test pairs of row indices for resampling, or define new resampling strategies.

For simultaneously evaluating multiple models, see Comparing models of different type and nested cross-validation.

For externally logging the outcomes of performance evaluation experiments, see Logging Workflows

Evaluating against a single measure

julia> using MLJ
julia> X = (a=rand(12), b=rand(12), c=rand(12));
julia> y = X.a + 2X.b + 0.05*rand(12);
julia> model = (@load RidgeRegressor pkg=MultivariateStats verbosity=0)()RidgeRegressor( +Evaluating Model Performance · MLJ

Evaluating Model Performance

MLJ allows quick evaluation of a supervised model's performance against a battery of selected losses or scores. For more on available performance measures, see Performance Measures.

In addition to hold-out and cross-validation, the user can specify an explicit list of train/test pairs of row indices for resampling, or define new resampling strategies.

For simultaneously evaluating multiple models, see Comparing models of different type and nested cross-validation.

For externally logging the outcomes of performance evaluation experiments, see Logging Workflows

Evaluating against a single measure

julia> using MLJ
julia> X = (a=rand(12), b=rand(12), c=rand(12));
julia> y = X.a + 2X.b + 0.05*rand(12);
julia> model = (@load RidgeRegressor pkg=MultivariateStats verbosity=0)()RidgeRegressor( lambda = 1.0, bias = true)
julia> cv = CV(nfolds=3)CV( nfolds = 3, @@ -13,18 +13,18 @@ ┌──────────┬───────────┬─────────────┐ │ measure │ operation │ measurement │ ├──────────┼───────────┼─────────────┤ -│ LPLoss( │ predict │ 0.2 │ +│ LPLoss( │ predict │ 0.184 │ │ p = 2) │ │ │ └──────────┴───────────┴─────────────┘ -┌───────────────────────┬─────────┐ -│ per_fold │ 1.96*SE │ -├───────────────────────┼─────────┤ -│ [0.249, 0.133, 0.219] │ 0.0837 │ -└───────────────────────┴─────────┘

Alternatively, instead of applying evaluate to a model + data, one may call evaluate! on an existing machine wrapping the model in data:

julia> mach = machine(model, X, y)untrained Machine; caches model-specific representations of data
+┌────────────────────────┬─────────┐
+│ per_fold               │ 1.96*SE │
+├────────────────────────┼─────────┤
+│ [0.298, 0.163, 0.0892] │ 0.147   │
+└────────────────────────┴─────────┘

Alternatively, instead of applying evaluate to a model + data, one may call evaluate! on an existing machine wrapping the model in data:

julia> mach = machine(model, X, y)untrained Machine; caches model-specific representations of data
   model: RidgeRegressor(lambda = 1.0, …)
   args:
-    1:	Source @196 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @870 ⏎ AbstractVector{Continuous}
julia> evaluate!(mach, resampling=cv, measure=l2, verbosity=0)PerformanceEvaluation object with these fields: + 1: Source @420 ⏎ Table{AbstractVector{Continuous}} + 2: Source @983 ⏎ AbstractVector{Continuous}
julia> evaluate!(mach, resampling=cv, measure=l2, verbosity=0)PerformanceEvaluation object with these fields: model, measure, operation, measurement, per_fold, per_observation, fitted_params_per_fold, report_per_fold, @@ -33,14 +33,14 @@ ┌──────────┬───────────┬─────────────┐ │ measure │ operation │ measurement │ ├──────────┼───────────┼─────────────┤ -│ LPLoss( │ predict │ 0.2 │ +│ LPLoss( │ predict │ 0.184 │ │ p = 2) │ │ │ └──────────┴───────────┴─────────────┘ -┌───────────────────────┬─────────┐ -│ per_fold │ 1.96*SE │ -├───────────────────────┼─────────┤ -│ [0.249, 0.133, 0.219] │ 0.0837 │ -└───────────────────────┴─────────┘

(The latter call is a mutating call, as the learned parameters stored in the machine potentially change.)

Multiple measures

Multiple measures are specified as a vector:

julia> evaluate!(
+┌────────────────────────┬─────────┐
+│ per_fold               │ 1.96*SE │
+├────────────────────────┼─────────┤
+│ [0.298, 0.163, 0.0892] │ 0.147   │
+└────────────────────────┴─────────┘

(The latter call is a mutating call, as the learned parameters stored in the machine potentially change.)

Multiple measures

Multiple measures are specified as a vector:

julia> evaluate!(
            mach,
            resampling=cv,
            measures=[l1, rms, rmslp1],
@@ -54,18 +54,18 @@
 ┌───┬──────────────────────────────────────┬───────────┬─────────────┐
 │   │ measure                              │ operation │ measurement │
 ├───┼──────────────────────────────────────┼───────────┼─────────────┤
-│ A │ LPLoss(                              │ predict   │ 0.396       │
+│ A │ LPLoss(                              │ predict   │ 0.38        │
 │   │   p = 1)                             │           │             │
-│ B │ RootMeanSquaredError()               │ predict   │ 0.447       │
-│ C │ RootMeanSquaredLogProportionalError( │ predict   │ 0.201       │
+│ B │ RootMeanSquaredError()               │ predict   │ 0.429       │
+│ C │ RootMeanSquaredLogProportionalError( │ predict   │ 0.18        │
 │   │   offset = 1)                        │           │             │
 └───┴──────────────────────────────────────┴───────────┴─────────────┘
 ┌───┬───────────────────────┬─────────┐
 │   │ per_fold              │ 1.96*SE │
 ├───┼───────────────────────┼─────────┤
-│ A │ [0.419, 0.337, 0.43]  │ 0.0702  │
-│ B │ [0.499, 0.364, 0.468] │ 0.0978  │
-│ C │ [0.254, 0.168, 0.168] │ 0.0692  │
+│ A │ [0.526, 0.319, 0.294] │ 0.176   │
+│ B │ [0.546, 0.404, 0.299] │ 0.172   │
+│ C │ [0.207, 0.203, 0.116] │ 0.0714  │
 └───┴───────────────────────┴─────────┘

Custom measures can also be provided.

Specifying weights

Per-observation weights can be passed to measures. If a measure does not support weights, the weights are ignored:

julia> holdout = Holdout(fraction_train=0.8)Holdout(
   fraction_train = 0.8,
   shuffle = false,
@@ -87,15 +87,15 @@
 ┌───┬────────────┬───────────┬─────────────┐
 │   │ measure    │ operation │ measurement │
 ├───┼────────────┼───────────┼─────────────┤
-│ A │ LPLoss(    │ predict   │ 0.308       │
+│ A │ LPLoss(    │ predict   │ 0.269       │
 │   │   p = 2)   │           │             │
-│ B │ RSquared() │ predict   │ 0.226       │
+│ B │ RSquared() │ predict   │ 0.471       │
 └───┴────────────┴───────────┴─────────────┘
 ┌───┬───────────────────────┬─────────┐
 │   │ per_fold              │ 1.96*SE │
 ├───┼───────────────────────┼─────────┤
-│ A │ [0.382, 0.231, 0.311] │ 0.105   │
-│ B │ [0.294, 0.495, -0.11] │ 0.427   │
+│ A │ [0.33, 0.344, 0.133]  │ 0.164   │
+│ B │ [0.464, 0.365, 0.583] │ 0.151   │
 └───┴───────────────────────┴─────────┘

In classification problems, use class_weights=... to specify a class weight dictionary.

MLJBase.evaluate!Function
evaluate!(mach; resampling=CV(), measure=nothing, options...)

Estimate the performance of a machine mach wrapping a supervised model in data, using the specified resampling strategy (defaulting to 6-fold cross-validation) and measure, which can be a single measure or vector. Returns a PerformanceEvaluation object.

Available resampling strategies are CV, Holdout, InSample, StratifiedCV and TimeSeriesCV. If resampling is not an instance of one of these, then a vector of tuples of the form (train_rows, test_rows) is expected. For example, setting

resampling = [((1:100), (101:200)),
               ((101:200), (1:100))]

gives two-fold cross-validation using the first 200 rows of data.
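As a usage sketch (assuming mach wraps at least 200 rows of data and rms is an appropriate measure for the target):

evaluate!(mach,
          resampling=[((1:100), (101:200)), ((101:200), (1:100))],
          measure=rms,
          verbosity=0)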

Any measure conforming to the StatisticalMeasuresBase.jl API can be provided, assuming it can consume multiple observations.

Although evaluate! is mutating, mach.model and mach.args are not mutated.

Additional keyword options

  • rows - vector of observation indices from which both train and test folds are constructed (default is all observations)

  • operation/operations=nothing - One of predict, predict_mean, predict_mode, predict_median, or predict_joint, or a vector of these of the same length as measure/measures. Automatically inferred if left unspecified. For example, predict_mode will be used for a Multiclass target, if model is a probabilistic predictor but measure expects literal (point) target predictions. Operations actually applied can be inspected from the operation field of the object returned.

  • weights - per-sample Real weights for measures that support them (not to be confused with weights used in training, such as the w in mach = machine(model, X, y, w)).

  • class_weights - dictionary of Real per-class weights for use with measures that support these, in classification problems (not to be confused with weights used in training, such as the w in mach = machine(model, X, y, w)).

  • repeats::Int=1: set to a higher value for repeated (Monte Carlo) resampling. For example, if repeats = 10, then resampling = CV(nfolds=5, shuffle=true) generates a total of 50 (train, test) pairs for evaluation and subsequent aggregation.

  • acceleration=CPU1(): acceleration/parallelization option; can be any instance of CPU1 (single-threaded computation), CPUThreads (multi-threaded computation) or CPUProcesses (multi-process computation); default is default_resource(). These types are owned by ComputationalResources.jl.

  • force=false: set to true to force cold-restart of each training event

  • verbosity::Int=1 logging level; can be negative

  • check_measure=true: whether to screen measures for possible incompatibility with the model. Will not catch all incompatibilities.

  • per_observation=true: whether to calculate estimates for individual observations; if false the per_observation field of the returned object is populated with missings. Setting to false may reduce compute time and allocations.

  • logger - a logger object (see MLJBase.log_evaluation)

  • compact=false - if true, the returned evaluation object excludes these fields: fitted_params_per_fold, report_per_fold, train_test_rows.

See also evaluate, PerformanceEvaluation, CompactPerformanceEvaluation.

source
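
For instance, here is a minimal sketch exercising some of these options, using synthetic data and the built-in ConstantRegressor baseline (assumed available after using MLJ):

using MLJ
X, y = make_regression(100, 3)
mach = machine(ConstantRegressor(), X, y)
evaluate!(
    mach;
    resampling=CV(nfolds=5, shuffle=true, rng=123),
    measures=[l1, l2],         # LPLoss(1) and LPLoss(2); operations inferred automatically
    repeats=2,                 # 2 × 5 = 10 (train, test) pairs in total
    acceleration=CPUThreads(), # multi-threaded evaluation
    verbosity=0,
)
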
MLJBase.PerformanceEvaluationType
PerformanceEvaluation <: AbstractPerformanceEvaluation

Type of object returned by evaluate (for models plus data) or evaluate! (for machines). Such objects encode estimates of the performance (generalization error) of a supervised model or outlier detection model, and store other information ancillary to the computation.

If evaluate or evaluate! is called with the compact=true option, then a CompactPerformanceEvaluation object is returned instead.

When evaluate/evaluate! is called, a number of train/test pairs ("folds") of row indices are generated, according to the options provided, which are discussed in the evaluate! doc-string. Rows correspond to observations. The generated train/test pairs are recorded in the train_test_rows field of the PerformanceEvaluation struct, and the corresponding estimates, aggregated over all train/test pairs, are recorded in measurement, a vector with one entry for each measure (metric) recorded in measure.

When displayed, a PerformanceEvaluation object includes a value under the heading 1.96*SE, derived from the standard error of the per_fold entries. This value is suitable for constructing a formal 95% confidence interval for the given measurement. Such intervals should be interpreted with caution. See, for example, Bates et al. (2021).

Fields

These fields are part of the public API of the PerformanceEvaluation struct.

  • model: model used to create the performance evaluation. In the case of a tuning model, this is the best model found.

  • measure: vector of measures (metrics) used to evaluate performance

  • measurement: vector of measurements - one for each element of measure - aggregating the performance measurements over all train/test pairs (folds). The aggregation method applied for a given measure m is StatisticalMeasuresBase.external_aggregation_mode(m) (commonly Mean() or Sum())

  • operation (e.g., predict_mode): the operations applied for each measure to generate predictions to be evaluated. Possibilities are: predict, predict_mean, predict_mode, predict_median, or predict_joint.

  • per_fold: a vector of vectors of individual test fold evaluations (one vector per measure). Useful for obtaining a rough estimate of the variance of the performance estimate.

  • per_observation: a vector of vectors of vectors containing individual per-observation measurements: for an evaluation e, e.per_observation[m][f][i] is the measurement for the ith observation in the fth test fold, evaluated using the mth measure. Useful for some forms of hyper-parameter optimization. Note that an aggregated measurement for some measure measure is repeated across all observations in a fold if StatisticalMeasures.can_report_unaggregated(measure) == false. If e has been computed with the per_observation=false option, then e.per_observation is a vector of missings.

  • fitted_params_per_fold: a vector containing fitted_params(mach) for each machine mach trained during resampling - one machine per train/test pair. Use this to extract the learned parameters for each individual training event.

  • report_per_fold: a vector containing report(mach) for each machine mach trained during resampling - one machine per train/test pair.

  • train_test_rows: a vector of tuples, each of the form (train, test), where train and test are vectors of row (observation) indices for training and evaluation respectively.

  • resampling: the user-specified resampling strategy to generate the train/test pairs (or literal train/test pairs if that was directly specified).

  • repeats: the number of times the resampling strategy was repeated.

See also CompactPerformanceEvaluation.

source
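
For illustration, a sketch of inspecting such an object (X, y as in the evaluate! sketch above):

e = evaluate(ConstantRegressor(), X, y, resampling=CV(nfolds=3), measure=l2, verbosity=0)
e.measure          # vector of measures applied
e.measurement      # aggregated measurement, one entry per measure
e.per_fold         # per-measure vectors of per-fold measurements
e.train_test_rows  # the (train, test) row indices actually used
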

User-specified train/test sets

Users can either provide an explicit list of train/test pairs of row indices for resampling, as in this example:

julia> fold1 = 1:6; fold2 = 7:12;
julia> evaluate!( mach,
@@ -111,16 +111,16 @@
 ┌───┬──────────┬───────────┬─────────────┐
 │   │ measure  │ operation │ measurement │
 ├───┼──────────┼───────────┼─────────────┤
-│ A │ LPLoss(  │ predict   │ 0.663       │
+│ A │ LPLoss(  │ predict   │ 0.45        │
 │   │ p = 1)   │           │             │
-│ B │ LPLoss(  │ predict   │ 0.57        │
+│ B │ LPLoss(  │ predict   │ 0.264       │
 │   │ p = 2)   │           │             │
 └───┴──────────┴───────────┴─────────────┘
 ┌───┬────────────────┬─────────┐
 │   │ per_fold       │ 1.96*SE │
 ├───┼────────────────┼─────────┤
-│ A │ [0.621, 0.705] │ 0.115   │
-│ B │ [0.459, 0.68]  │ 0.306   │
+│ A │ [0.289, 0.612] │ 0.448   │
+│ B │ [0.1, 0.429]   │ 0.455   │
 └───┴────────────────┴─────────┘

Or the user can define their own re-usable ResamplingStrategy objects; see Custom resampling strategies below.

Built-in resampling strategies

MLJBase.HoldoutType
holdout = Holdout(; fraction_train=0.7, shuffle=nothing, rng=nothing)

Instantiate a Holdout resampling strategy, for use in evaluate!, evaluate and in tuning.

train_test_pairs(holdout, rows)

Returns the pair [(train, test)], where train and test are vectors such that rows=vcat(train, test) and length(train)/length(rows) is approximately equal to fraction_train.

Pre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the Holdout keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.

If rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is specified.

source
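
For illustration, a sketch of inspecting the single split a Holdout instance generates (train_test_pairs is qualified here, as in the TimeSeriesCV example below):

holdout = Holdout(fraction_train=0.8, shuffle=true, rng=42)
MLJBase.train_test_pairs(holdout, 1:10)  # one (train, test) pair: 8 train rows, 2 test rows
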
MLJBase.CVType
cv = CV(; nfolds=6,  shuffle=nothing, rng=nothing)

Cross-validation resampling strategy, for use in evaluate!, evaluate and tuning.

train_test_pairs(cv, rows)

Returns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. Each train vector is the complement of the corresponding test vector. With no row pre-shuffling, the order of rows is preserved, in the sense that rows coincides precisely with the concatenation of the test vectors, in the order they are generated. The first r test vectors have length n + 1, where n, r = divrem(length(rows), nfolds), and the remaining test vectors have length n.

Pre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the CV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.

If rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.

source
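
A small sketch of the fold sizes just described, with no pre-shuffling:

MLJBase.train_test_pairs(CV(nfolds=3), 1:10)
# here n, r = divrem(10, 3) = (3, 1), so expect test folds
# [1, 2, 3, 4], [5, 6, 7] and [8, 9, 10], each train fold being the complement
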
MLJBase.StratifiedCVType
stratified_cv = StratifiedCV(; nfolds=6,
                                shuffle=false,
                                rng=Random.GLOBAL_RNG)

Stratified cross-validation resampling strategy, for use in evaluate!, evaluate and in tuning. Applies only to classification problems (OrderedFactor or Multiclass targets).

train_test_pairs(stratified_cv, rows, y)

Returns an nfolds-length iterator of (train, test) pairs of vectors (row indices) where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. Each train vector is the complement of the corresponding test vector.

Unlike regular cross-validation, the distribution of the levels of the target y corresponding to each train and test is constrained, as far as possible, to replicate that of y[rows] as a whole.

The stratified train_test_pairs algorithm is invariant to label renaming. For example, if you run replace!(y, 'a' => 'b', 'b' => 'a') and then re-run train_test_pairs, the returned (train, test) pairs will be the same.

Pre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the StratifiedCV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.

If rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.

source
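
A small sketch of the stratification just described, using a toy (hypothetical) target:

y = coerce(["a", "a", "a", "a", "b", "b", "b", "b"], Multiclass)
MLJBase.train_test_pairs(StratifiedCV(nfolds=2), 1:8, y)
# each test fold should contain about as many "a"s as "b"s, mirroring y
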
MLJBase.TimeSeriesCVType
tscv = TimeSeriesCV(; nfolds=4)

Cross-validation resampling strategy, for use in evaluate!, evaluate and tuning, when observations are chronological and not expected to be independent.

train_test_pairs(tscv, rows)

Returns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The rows are partitioned sequentially into nfolds + 1 approximately equal length partitions, where the first partition is the first train set, and the second partition is the first test set. The second train set consists of the first two partitions, and the second test set consists of the third partition, and so on for each fold.

The first partition (which is the first train set) has length n + r, where n, r = divrem(length(rows), nfolds + 1), and the remaining partitions (all of the test folds) have length n.

Examples

julia> MLJBase.train_test_pairs(TimeSeriesCV(nfolds=3), 1:10)
@@ -181,4 +181,4 @@
     train, test = partition(rows, holdout.fraction_train,
                           shuffle=holdout.shuffle, rng=holdout.rng)
     return [(train, test),]
-end
+end
diff --git a/dev/frequently_asked_questions/index.html b/dev/frequently_asked_questions/index.html index ae63fd806..4158795f2 100644 --- a/dev/frequently_asked_questions/index.html +++ b/dev/frequently_asked_questions/index.html @@ -1,2 +1,2 @@ -FAQ · MLJ

Frequently Asked Questions

Julia already has a great machine learning toolbox, ScikitLearn.jl. Why MLJ?

An alternative machine learning toolbox for Julia users is ScikitLearn.jl. Initially intended as a Julia wrapper for the popular Python library scikit-learn, ScikitLearn.jl also allows ML algorithms written in Julia to implement its API. Meta-algorithms (systematic tuning, pipelining, etc.) remain Python-wrapped code, however.

While ScikitLearn.jl provides the Julia user with access to a mature and large library of machine learning models, the scikit-learn API on which it is modeled, dating back to 2007, is not likely to evolve significantly in the future. MLJ enjoys (or will enjoy) several features that should make it an attractive alternative in the longer term:

  • One language. ScikitLearn.jl wraps Python code, which in turn wraps C code for performance-critical routines. A Julia machine learning algorithm that implements the MLJ model interface is 100% Julia. Writing code in Julia is almost as fast as Python and well-written Julia code runs almost as fast as C. Additionally, a single language design provides superior interoperability. For example, one can implement: (i) gradient-descent tuning of hyperparameters, using automatic differentiation libraries such as Flux.jl; and (ii) GPU performance boosts without major code refactoring, using CuArrays.jl.

  • Registry for model metadata. In ScikitLearn.jl the list of available models, as well as model metadata (whether a model handles categorical inputs, whether it can make probabilistic predictions, etc) must be gleaned from the documentation. In MLJ, this information is more structured and is accessible to MLJ via a searchable model registry (without the models needing to be loaded).

  • Flexible API for model composition. Pipelines in scikit-learn are more of an afterthought than an integral part of the original design. By contrast, MLJ's user-interaction API was predicated on the requirements of a flexible "learning network" API, one that allows models to be connected in essentially arbitrary ways (such as Wolpert model stacks). Networks can be built and tested in stages before being exported as first-class stand-alone models. Networks feature "smart" training (only necessary components are retrained after parameter changes) and will eventually be trainable using a DAG scheduler.

  • Clean probabilistic API. The scikit-learn API does not specify a universal standard for the form of probabilistic predictions. By fixing a probabilistic API along the lines of the skpro project, MLJ aims to improve support for Bayesian statistics and probabilistic graphical models.

  • Universal adoption of categorical data types. Python's scientific array library NumPy has no dedicated data type for representing categorical data (i.e., no type that tracks the pool of all possible classes). Generally, scikit-learn models deal with this by requiring data to be relabeled as integers. However, the naive user trains a model on relabeled categorical data only to discover that evaluation on a test set crashes their code because a categorical feature takes on a value not observed in training. MLJ mitigates such issues by insisting on the use of categorical data types, and by insisting that MLJ model implementations preserve the class pools. If, for example, a training target contains classes in the pool that do not appear in the training set, a probabilistic prediction will nevertheless predict a distribution whose support includes the missing class, but which is appropriately weighted with probability zero.

Finally, we note that a large number of ScikitLearn.jl models are now wrapped for use in MLJ.

+FAQ · MLJ

Frequently Asked Questions

Julia already has a great machine learning toolbox, ScikitLearn.jl. Why MLJ?

An alternative machine learning toolbox for Julia users is ScikitLearn.jl. Initially intended as a Julia wrapper for the popular Python library scikit-learn, ScikitLearn.jl also allows ML algorithms written in Julia to implement its API. Meta-algorithms (systematic tuning, pipelining, etc.) remain Python-wrapped code, however.

While ScikitLearn.jl provides the Julia user with access to a mature and large library of machine learning models, the scikit-learn API on which it is modeled, dating back to 2007, is not likely to evolve significantly in the future. MLJ enjoys (or will enjoy) several features that should make it an attractive alternative in the longer term:

  • One language. ScikitLearn.jl wraps Python code, which in turn wraps C code for performance-critical routines. A Julia machine learning algorithm that implements the MLJ model interface is 100% Julia. Writing code in Julia is almost as fast as Python and well-written Julia code runs almost as fast as C. Additionally, a single language design provides superior interoperability. For example, one can implement: (i) gradient-descent tuning of hyperparameters, using automatic differentiation libraries such as Flux.jl; and (ii) GPU performance boosts without major code refactoring, using CuArrays.jl.

  • Registry for model metadata. In ScikitLearn.jl the list of available models, as well as model metadata (whether a model handles categorical inputs, whether it can make probabilistic predictions, etc) must be gleaned from the documentation. In MLJ, this information is more structured and is accessible to MLJ via a searchable model registry (without the models needing to be loaded).

  • Flexible API for model composition. Pipelines in scikit-learn are more of an afterthought than an integral part of the original design. By contrast, MLJ's user-interaction API was predicated on the requirements of a flexible "learning network" API, one that allows models to be connected in essentially arbitrary ways (such as Wolpert model stacks). Networks can be built and tested in stages before being exported as first-class stand-alone models. Networks feature "smart" training (only necessary components are retrained after parameter changes) and will eventually be trainable using a DAG scheduler.

  • Clean probabilistic API. The scikit-learn API does not specify a universal standard for the form of probabilistic predictions. By fixing a probabilistic API along the lines of the skpro project, MLJ aims to improve support for Bayesian statistics and probabilistic graphical models.

  • Universal adoption of categorical data types. Python's scientific array library NumPy has no dedicated data type for representing categorical data (i.e., no type that tracks the pool of all possible classes). Generally, scikit-learn models deal with this by requiring data to be relabeled as integers. However, the naive user trains a model on relabeled categorical data only to discover that evaluation on a test set crashes their code because a categorical feature takes on a value not observed in training. MLJ mitigates such issues by insisting on the use of categorical data types, and by insisting that MLJ model implementations preserve the class pools. If, for example, a training target contains classes in the pool that do not appear in the training set, a probabilistic prediction will nevertheless predict a distribution whose support includes the missing class, but which is appropriately weighted with probability zero.

Finally, we note that a large number of ScikitLearn.jl models are now wrapped for use in MLJ.

diff --git a/dev/generating_synthetic_data/index.html b/dev/generating_synthetic_data/index.html index 8330b02eb..8d77c67fd 100644 --- a/dev/generating_synthetic_data/index.html +++ b/dev/generating_synthetic_data/index.html @@ -1,21 +1,21 @@ -Generating Synthetic Data · MLJ

Generating Synthetic Data

Here synthetic data means artificially generated data, with no reference to a "real world" data set. Not to be confused with "fake data" obtained by resampling from a distribution fit to some actual real data.

MLJ has a set of functions - make_blobs, make_circles, make_moons and make_regression (closely resembling functions in scikit-learn of the same name) - for generating synthetic data sets. These are useful for testing machine learning models (e.g., testing user-defined composite models; see Composing Models).

Generating Gaussian blobs

MLJBase.make_blobsFunction
X, y = make_blobs(n=100, p=2; kwargs...)

Generate Gaussian blobs for clustering and classification problems.

Return value

By default, a table X with p columns (features) and n rows (observations), together with a corresponding vector of n Multiclass target observations y, indicating blob membership.

Keyword arguments

  • shuffle=true: whether to shuffle the resulting points,

  • centers=3: either a number of centers or a c x p matrix with c pre-determined centers,

  • cluster_std=1.0: the standard deviation(s) of each blob,

  • center_box=(-10. => 10.): the limits of the p-dimensional cube within which the cluster centers are drawn if they are not provided,

  • eltype=Float64: machine type of points (any subtype of AbstractFloat).

  • rng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).

  • as_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type.

Example

X, y = make_blobs(100, 3; centers=2, cluster_std=[1.0, 3.0])
source
using MLJ, DataFrames
+Generating Synthetic Data · MLJ

Generating Synthetic Data

Here synthetic data means artificially generated data, with no reference to a "real world" data set. Not to be confused with "fake data" obtained by resampling from a distribution fit to some actual real data.

MLJ has a set of functions - make_blobs, make_circles, make_moons and make_regression (closely resembling functions in scikit-learn of the same name) - for generating synthetic data sets. These are useful for testing machine learning models (e.g., testing user-defined composite models; see Composing Models).

Generating Gaussian blobs

MLJBase.make_blobsFunction
X, y = make_blobs(n=100, p=2; kwargs...)

Generate Gaussian blobs for clustering and classification problems.

Return value

By default, a table X with p columns (features) and n rows (observations), together with a corresponding vector of n Multiclass target observations y, indicating blob membership.

Keyword arguments

  • shuffle=true: whether to shuffle the resulting points,

  • centers=3: either a number of centers or a c x p matrix with c pre-determined centers,

  • cluster_std=1.0: the standard deviation(s) of each blob,

  • center_box=(-10. => 10.): the limits of the p-dimensional cube within which the cluster centers are drawn if they are not provided,

  • eltype=Float64: machine type of points (any subtype of AbstractFloat).

  • rng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).

  • as_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type.

Example

X, y = make_blobs(100, 3; centers=2, cluster_std=[1.0, 3.0])
source
using MLJ, DataFrames
 X, y = make_blobs(100, 3; centers=2, cluster_std=[1.0, 3.0])
 dfBlobs = DataFrame(X)
 dfBlobs.y = y
-first(dfBlobs, 3)
3×4 DataFrame
 Row │ x1       x2       x3        y
     │ Float64  Float64  Float64   Cat…
─────┼──────────────────────────────────
   1 │ 3.79477  6.2184    8.19074  1
   2 │ 2.48278  6.83726  10.2252   1
   3 │ 2.11843  1.24768   5.35417  2
using VegaLite
+first(dfBlobs, 3)
3×4 DataFrame
 Row │ x1        x2       x3        y
     │ Float64   Float64  Float64   Cat…
─────┼───────────────────────────────────
   1 │ -7.40788  7.00114  -3.93569  1
   2 │  7.31825  -6.8574    7.09661  2
   3 │ -7.79201  5.2932    -2.41734  1
using VegaLite
 dfBlobs |> @vlplot(:point, x=:x1, y=:x2, color = :"y:n") 

[svg scatter plot]

dfBlobs |> @vlplot(:point, x=:x1, y=:x3, color = :"y:n") 

[svg scatter plot]

Generating concentric circles

MLJBase.make_circlesFunction
X, y = make_circles(n=100; kwargs...)

Generate n labeled points close to two concentric circles for classification and clustering models.

Return value

By default, a table X with 2 columns and n rows (observations), together with a corresponding vector of n Multiclass target observations y. The target is either 0 or 1, corresponding to membership to the smaller or larger circle, respectively.

Keyword arguments

  • shuffle=true: whether to shuffle the resulting points,

  • noise=0: standard deviation of the Gaussian noise added to the data,

  • factor=0.8: ratio of the smaller radius over the larger one,

  • eltype=Float64: machine type of points (any subtype of AbstractFloat).

  • rng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).

  • as_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type.

Example

X, y = make_circles(100; noise=0.5, factor=0.3)
source
using MLJ, DataFrames
 X, y = make_circles(100; noise=0.05, factor=0.3)
 dfCircles = DataFrame(X)
 dfCircles.y = y
-first(dfCircles, 3)
3×3 DataFrame
 Row │ x1         x2         y
     │ Float64    Float64    Cat…
─────┼────────────────────────────
   1 │ -0.866122   0.409665  1
   2 │  0.211735  -0.983394  1
   3 │  0.493587  -0.95824   1
using VegaLite
+first(dfCircles, 3)
3×3 DataFrame
 Row │ x1          x2         y
     │ Float64     Float64    Cat…
─────┼─────────────────────────────
   1 │ 0.309732    0.0219418  0
   2 │ 0.0718151  -0.20839    0
   3 │ 0.00434955  0.99656    1
using VegaLite
 dfCircles |> @vlplot(:circle, x=:x1, y=:x2, color = :"y:n") 

[svg scatter plot]

Sampling from two interleaved half-circles

MLJBase.make_moonsFunction
make_moons(n::Int=100; kwargs...)

Generates labeled two-dimensional points lying close to two interleaved semi-circles, for use with classification and clustering models.

Return value

By default, a table X with 2 columns and n rows (observations), together with a corresponding vector of n Multiclass target observations y. The target is either 0 or 1, corresponding to membership to the left or right semi-circle.

Keyword arguments

  • shuffle=true: whether to shuffle the resulting points,

  • noise=0.1: standard deviation of the Gaussian noise added to the data,

  • xshift=1.0: horizontal translation of the second center with respect to the first one.

  • yshift=0.3: vertical translation of the second center with respect to the first one.

  • eltype=Float64: machine type of points (any subtype of AbstractFloat).

  • rng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).

  • as_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type.

Example

X, y = make_moons(100; noise=0.5)
source
using MLJ, DataFrames
 X, y = make_moons(100; noise=0.05)
 dfHalfCircles = DataFrame(X)
 dfHalfCircles.y = y
-first(dfHalfCircles, 3)
3×3 DataFrame
 Row │ x1          x2         y
     │ Float64     Float64    Cat…
─────┼─────────────────────────────
   1 │ 0.849042    -0.649109  1
   2 │ 0.2489      -0.297656  1
   3 │ 0.00246866   0.131994  1
using VegaLite
+first(dfHalfCircles, 3)
3×3 DataFrame
 Row │ x1         x2        y
     │ Float64    Float64   Cat…
─────┼───────────────────────────
   1 │ -0.254554  0.907187  0
   2 │  0.882208  0.504957  0
   3 │  0.975568  0.365963  0
using VegaLite
 dfHalfCircles |> @vlplot(:circle, x=:x1, y=:x2, color = :"y:n") 

[svg scatter plot]

Regression data generated from noisy linear models

MLJBase.make_regressionFunction
make_regression(n, p; kwargs...)

Generate Gaussian input features and a linear response with Gaussian noise, for use with regression models.

Return value

By default, a tuple (X, y) where table X has p columns and n rows (observations), together with a corresponding vector of n Continuous target observations y.

Keywords

  • intercept=true: Whether to generate data from a model with intercept.

  • n_targets=1: Number of columns in the target.

  • sparse=0: Proportion of the generating weight vector that is sparse.

  • noise=0.1: Standard deviation of the Gaussian noise added to the response (target).

  • outliers=0: Proportion of the response vector to make as outliers by adding a random quantity with high variance. (Only applied if binary is false.)

  • as_table=true: Whether X (and y, if n_targets > 1) should be a table or a matrix.

  • eltype=Float64: Element type for X and y. Must subtype AbstractFloat.

  • binary=false: Whether the target should be binarized (via a sigmoid).

  • rng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).

Example

X, y = make_regression(100, 5; noise=0.5, sparse=0.2, outliers=0.1)
source
using MLJ, DataFrames
 X, y = make_regression(100, 5; noise=0.5, sparse=0.2, outliers=0.1)
 dfRegression = DataFrame(X)
 dfRegression.y = y
-first(dfRegression, 3)
3×6 DataFrame
 Row │ x1         x2         x3        x4         x5        y
     │ Float64    Float64    Float64   Float64    Float64   Float64
─────┼───────────────────────────────────────────────────────────────
   1 │ -0.102032   0.358267  0.298044   1.50412   0.414506  0.189413
   2 │  1.36867   -1.50025   1.79527   -0.161216  0.447436  4.02934
   3 │  0.618891  -0.451066  0.786466   0.17342   0.744419  2.1948
+first(dfRegression, 3)
3×6 DataFrame
 Row │ x1         x2         x3         x4        x5         y
     │ Float64    Float64    Float64    Float64   Float64    Float64
─────┼─────────────────────────────────────────────────────────────────
   1 │  0.459794   0.491532  -0.949771  -1.26997  -0.345177   0.36311
   2 │  1.97523    0.534939  -0.902431   1.79118   0.369042  -0.200509
   3 │ -1.83557   -0.525987   0.196998   1.09222   0.469514  -0.149277
diff --git a/dev/getting_started/index.html b/dev/getting_started/index.html index b1649c62b..2b7857dcc 100644 --- a/dev/getting_started/index.html +++ b/dev/getting_started/index.html @@ -1,5 +1,5 @@ -Getting Started · MLJ

Getting Started

For an outline of MLJ's goals and features, see About MLJ.

This page introduces some MLJ basics, assuming some familiarity with machine learning. For a complete list of other MLJ learning resources, see Learning MLJ.

MLJ collects together the functionality provided by multiple packages. To learn how to install components separately, run using MLJ; @doc MLJ.

This section introduces only the most basic MLJ operations and concepts. It assumes MLJ has been successfully installed. See Installation if this is not the case.

Choosing and evaluating a model

The following code loads Fisher's famous iris data set as a named tuple of column vectors:

julia> using MLJ
julia> iris = load_iris();
julia> selectrows(iris, 1:3) |> pretty
┌──────────────┬─────────────┬──────────────┬─────────────┬──────────────────────────────────┐
+Getting Started · MLJ

Getting Started

For an outline of MLJ's goals and features, see About MLJ.

This page introduces some MLJ basics, assuming some familiarity with machine learning. For a complete list of other MLJ learning resources, see Learning MLJ.

MLJ collects together the functionality provided by multiple packages. To learn how to install components separately, run using MLJ; @doc MLJ.

This section introduces only the most basic MLJ operations and concepts. It assumes MLJ has been successfully installed. See Installation if this is not the case.

Choosing and evaluating a model

The following code loads Fisher's famous iris data set as a named tuple of column vectors:

julia> using MLJ
julia> iris = load_iris();
julia> selectrows(iris, 1:3) |> pretty
┌──────────────┬─────────────┬──────────────┬─────────────┬──────────────────────────────────┐
│ sepal_length │ sepal_width │ petal_length │ petal_width │ target                           │
│ Float64      │ Float64     │ Float64      │ Float64     │ CategoricalValue{String, UInt32} │
│ Continuous   │ Continuous  │ Continuous   │ Continuous  │ Multiclass{3}                    │
@@ -82,8 +82,8 @@
 OrderedFactor

We use the scitype function to check how MLJ is going to interpret given data. Our choice of encoding for y works for DecisionTreeClassifier, because we have:

julia> scitype(y)AbstractVector{Multiclass{3}} (alias for AbstractArray{Multiclass{3}, 1})

and Multiclass{3} <: Finite. If we encode with integers instead, we obtain:

julia> yint = int.(y);
julia> scitype(yint)AbstractVector{Count} (alias for AbstractArray{Count, 1})

and using yint in place of y in classification problems will fail. See also Working with Categorical Data.

For more on scientific types, see Data containers and scientific types below.

Fit and predict

To illustrate MLJ's fit and predict interface, let's perform our performance evaluations by hand, but using a simple holdout set, instead of cross-validation.

Wrapping the model in data creates a machine which will store training outcomes:

julia> mach = machine(tree, X, y)untrained Machine; caches model-specific representations of data
   model: DecisionTreeClassifier(max_depth = -1, …)
   args:
-    1:	Source @751 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @001 ⏎ AbstractVector{Multiclass{3}}

Training and testing on a hold-out set:

julia> train, test = partition(eachindex(y), 0.7); # 70:30 split
julia> fit!(mach, rows=train);[ Info: Training machine(DecisionTreeClassifier(max_depth = -1, …), …).
julia> yhat = predict(mach, X[test,:]);
julia> yhat[3:5]3-element UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:
+    1:	Source @884 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @342 ⏎ AbstractVector{Multiclass{3}}

Training and testing on a hold-out set:

julia> train, test = partition(eachindex(y), 0.7); # 70:30 split
julia> fit!(mach, rows=train);[ Info: Training machine(DecisionTreeClassifier(max_depth = -1, …), …).
julia> yhat = predict(mach, X[test,:]);
julia> yhat[3:5]3-element UnivariateFiniteVector{Multiclass{3}, String, UInt32, Float64}:
 UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)
 UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
 UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
julia> log_loss(yhat, y[test])2.4029102259411435

Note that log_loss and cross_entropy are aliases for LogLoss() (which can be passed an optional keyword parameter, as in LogLoss(tol=0.001)). For a list of all losses and scores, and their aliases, run measures().

Notice that yhat is a vector of Distribution objects, because DecisionTreeClassifier makes probabilistic predictions. The methods of the Distributions.jl package can be applied to such distributions:

julia> broadcast(pdf, yhat[3:5], "virginica") # predicted probabilities of virginica3-element Vector{Float64}:
@@ -115,11 +115,11 @@
   count = false)
julia> mach2 = machine(stand, v)untrained Machine; caches model-specific representations of data model: Standardizer(features = Symbol[], …) args: - 1: Source @679 ⏎ AbstractVector{Continuous}
julia> fit!(mach2)[ Info: Training machine(Standardizer(features = Symbol[], …), …). + 1: Source @664 ⏎ AbstractVector{Continuous}
julia> fit!(mach2)[ Info: Training machine(Standardizer(features = Symbol[], …), …). trained Machine; caches model-specific representations of data model: Standardizer(features = Symbol[], …) args: - 1: Source @679 ⏎ AbstractVector{Continuous}
julia> w = transform(mach2, v)4-element Vector{Float64}: + 1: Source @664 ⏎ AbstractVector{Continuous}
julia> w = transform(mach2, v)4-element Vector{Float64}: -1.161895003862225 -0.3872983346207417 0.3872983346207417 @@ -238,4 +238,4 @@ input_scitype = Table{<:Union{AbstractVector{<:Continuous}, AbstractVector{<:Count}, AbstractVector{<:OrderedFactor}}}, target_scitype = AbstractVector{<:Finite}, - output_scitype = Unknown)
julia> i.input_scitypeTable{<:Union{AbstractVector{<:Continuous}, AbstractVector{<:Count}, AbstractVector{<:OrderedFactor}}}
julia> i.target_scitypeAbstractVector{<:Finite} (alias for AbstractArray{<:Finite, 1})

This output indicates that any table with Continuous, Count or OrderedFactor columns is acceptable as the input X, and that any vector with element scitype <: Finite is acceptable as the target y.

For more on matching models to data, see Model Search.

Scalar scientific types

Models in MLJ will always apply the MLJ convention described in ScientificTypes.jl to decide how to interpret the elements of your container types. Here are the key features of that convention:

  • Any AbstractFloat is interpreted as Continuous.

  • Any Integer is interpreted as Count.

  • Any CategoricalValue x, is interpreted as Multiclass or OrderedFactor, depending on the value of isordered(x).

  • Strings and Chars are not interpreted as Multiclass or OrderedFactor (they have scitypes Textual and Unknown respectively).

  • In particular, integers (including Bools) cannot be used to represent categorical data. Use the preceding coerce operations to coerce to a Finite scitype.

  • The scientific types of nothing and missing are Nothing and Missing, native types we also regard as scientific.

Use coerce(v, OrderedFactor) or coerce(v, Multiclass) to coerce a vector v of integers, strings or characters to a vector with an appropriate Finite (categorical) scitype. See also Working with Categorical Data, and the ScientificTypes.jl documentation.

+ output_scitype = Unknown)

julia> i.input_scitypeTable{<:Union{AbstractVector{<:Continuous}, AbstractVector{<:Count}, AbstractVector{<:OrderedFactor}}}
julia> i.target_scitypeAbstractVector{<:Finite} (alias for AbstractArray{<:Finite, 1})

This output indicates that any table with Continuous, Count or OrderedFactor columns is acceptable as the input X, and that any vector with element scitype <: Finite is acceptable as the target y.

For more on matching models to data, see Model Search.

Scalar scientific types

Models in MLJ will always apply the MLJ convention described in ScientificTypes.jl to decide how to interpret the elements of your container types. Here are the key features of that convention:

  • Any AbstractFloat is interpreted as Continuous.

  • Any Integer is interpreted as Count.

  • Any CategoricalValue x, is interpreted as Multiclass or OrderedFactor, depending on the value of isordered(x).

  • Strings and Chars are not interpreted as Multiclass or OrderedFactor (they have scitypes Textual and Unknown respectively).

  • In particular, integers (including Bools) cannot be used to represent categorical data. Use the preceding coerce operations to coerce to a Finite scitype.

  • The scientific types of nothing and missing are Nothing and Missing, native types we also regard as scientific.

Use coerce(v, OrderedFactor) or coerce(v, Multiclass) to coerce a vector v of integers, strings or characters to a vector with an appropriate Finite (categorical) scitype. See also Working with Categorical Data, and the ScientificTypes.jl documentation.
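
A small sketch of such coercions on toy vectors (assuming using MLJ):

v = [1, 2, 2, 3, 1]
w = coerce(v, OrderedFactor)   # categorical vector with ordered levels 1 < 2 < 3
scitype(w)                     # AbstractVector{OrderedFactor{3}}
s = coerce(["low", "high", "low"], Multiclass)
scitype(s)                     # AbstractVector{Multiclass{2}}
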

diff --git a/dev/glossary/index.html b/dev/glossary/index.html index b21425eb3..57989ab99 100644 --- a/dev/glossary/index.html +++ b/dev/glossary/index.html @@ -1,2 +1,2 @@ -Glossary · MLJ

Glossary

Note: This glossary includes some detail intended mainly for MLJ developers.

Basics

hyperparameters

Parameters on which some learning algorithm depends, specified before the algorithm is applied, and where learning is interpreted in the broadest sense. For example, PCA feature reduction is a "preprocessing" transformation "learning" a projection from training data, governed by a dimension hyperparameter. Hyperparameters in our sense may specify configuration (eg, number of parallel processes) even when this does not affect the end-product of learning. (But we exclude verbosity level.)

model (object of abstract type Model)

Object collecting together hyperparameters of a single algorithm. Models are classified either as supervised or unsupervised models (eg, "transformers"), with corresponding subtypes Supervised <: Model and Unsupervised <: Model.

fitresult (type generally defined outside of MLJ)

Also known as "learned" or "fitted" parameters, these are "weights", "coefficients", or similar parameters learned by an algorithm, after adopting the prescribed hyper-parameters. For example, decision trees of a random forest, the coefficients and intercept of a linear model, or the projection matrices of a PCA dimension-reduction algorithm.

operation

Data-manipulating operations (methods) using some fitresult. For supervised learners, the predict, predict_mean, predict_median, or predict_mode methods; for transformers, the transform or inverse_transform method. An operation may also refer to an ordinary data-manipulating method that does not depend on a fit-result (e.g., a broadcasted logarithm) which is then called static operation for clarity. An operation that is not static is dynamic.

machine (object of type Machine)

An object consisting of:

  1. A model

  2. A fit-result (undefined until training)

  3. Training arguments (one for each data argument of the model's associated fit method). A training argument is data used for training (subsampled by specifying rows=... in fit!) but also in evaluation (subsampled by specifying rows=... in predict, predict_mean, etc). Generally, there are two training arguments for supervised models, and just one for unsupervised models. Each argument is either a Source node, wrapping concrete data supplied to the machine constructor, or a Node, in the case of a learning network (see below). Both kinds of nodes can be called with an optional rows=... keyword argument to (lazily) return concrete data.

In addition, machines store "report" metadata, for recording algorithm-specific statistics of training (eg, an internal estimate of generalization error, feature importances); and they cache information allowing the fit-result to be updated without repeating unnecessary computation.

Machines are trained by calls to a fit! method which may be passed an optional argument specifying the rows of data to be used in training.

For more, see the Machines section.

Learning Networks and Composite Models

Note: Multiple machines in a learning network may share the same model, and multiple learning nodes may share the same machine.

source node (object of type Source)

A container for training data and point of entry for new data in a learning network (see below).

node (object of type Node)

Essentially a machine (whose arguments are possibly other nodes) wrapped in an associated operation (e.g., predict or inverse_transform). It consists primarily of:

  1. An operation, static or dynamic.
  2. A machine, or nothing if the operation is static.
  3. Upstream connections to other nodes, specified by a list of arguments (one for each argument of the operation). These are the arguments on which the operation "acts" when the node N is called, as in N().

learning network

A directed acyclic graph implicit in the connections of a collection of source(s) and nodes.

wrapper

Any model with one or more other models as hyper-parameters.

composite model

Any wrapper, or any learning network, "exported" as a model (see Composing Models).

+Glossary · MLJ

Glossary

Note: This glossary includes some detail intended mainly for MLJ developers.

Basics

hyperparameters

Parameters on which some learning algorithm depends, specified before the algorithm is applied, and where learning is interpreted in the broadest sense. For example, PCA feature reduction is a "preprocessing" transformation "learning" a projection from training data, governed by a dimension hyperparameter. Hyperparameters in our sense may specify configuration (eg, number of parallel processes) even when this does not affect the end-product of learning. (But we exclude verbosity level.)

model (object of abstract type Model)

Object collecting together hyperparameters of a single algorithm. Models are classified either as supervised or unsupervised models (eg, "transformers"), with corresponding subtypes Supervised <: Model and Unsupervised <: Model.

fitresult (type generally defined outside of MLJ)

Also known as "learned" or "fitted" parameters, these are "weights", "coefficients", or similar parameters learned by an algorithm, after adopting the prescribed hyper-parameters. For example, decision trees of a random forest, the coefficients and intercept of a linear model, or the projection matrices of a PCA dimension-reduction algorithm.

operation

Data-manipulating operations (methods) using some fitresult. For supervised learners, the predict, predict_mean, predict_median, or predict_mode methods; for transformers, the transform or inverse_transform method. An operation may also refer to an ordinary data-manipulating method that does not depend on a fit-result (e.g., a broadcasted logarithm) which is then called static operation for clarity. An operation that is not static is dynamic.

machine (object of type Machine)

An object consisting of:

  1. A model

  2. A fit-result (undefined until training)

  3. Training arguments (one for each data argument of the model's associated fit method). A training argument is data used for training (subsampled by specifying rows=... in fit!) but also in evaluation (subsampled by specifying rows=... in predict, predict_mean, etc). Generally, there are two training arguments for supervised models, and just one for unsupervised models. Each argument is either a Source node, wrapping concrete data supplied to the machine constructor, or a Node, in the case of a learning network (see below). Both kinds of nodes can be called with an optional rows=... keyword argument to (lazily) return concrete data.

In addition, machines store "report" metadata, for recording algorithm-specific statistics of training (eg, an internal estimate of generalization error, feature importances); and they cache information allowing the fit-result to be updated without repeating unnecessary computation.

Machines are trained by calls to a fit! method which may be passed an optional argument specifying the rows of data to be used in training.

For more, see the Machines section.
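
A minimal sketch, using the built-in Standardizer transformer and some hypothetical table X with at least 50 rows:

mach = machine(Standardizer(), X)  # bind the model to its single training argument
fit!(mach, rows=1:50)              # learn fitted parameters on a subset of rows
W = transform(mach, X)             # apply the operation using the fit-result
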

Learning Networks and Composite Models

Note: Multiple machines in a learning network may share the same model, and multiple learning nodes may share the same machine.

source node (object of type Source)

A container for training data and point of entry for new data in a learning network (see below).

node (object of type Node)

Essentially a machine (whose arguments are possibly other nodes) wrapped in an associated operation (e.g., predict or inverse_transform). It consists primarily of:

  1. An operation, static or dynamic.
  2. A machine, or nothing if the operation is static.
  3. Upstream connections to other nodes, specified by a list of arguments (one for each argument of the operation). These are the arguments on which the operation "acts" when the node N is called, as in N().

learning network

A directed acyclic graph implicit in the connections of a collection of source(s) and nodes.

wrapper

Any model with one or more other models as hyper-parameters.

composite model

Any wrapper, or any learning network, "exported" as a model (see Composing Models).

diff --git a/dev/homogeneous_ensembles/index.html b/dev/homogeneous_ensembles/index.html index 71146e8b5..002b5d8dd 100644 --- a/dev/homogeneous_ensembles/index.html +++ b/dev/homogeneous_ensembles/index.html @@ -1,8 +1,8 @@ -Homogeneous Ensembles · MLJ

Homogeneous Ensembles

Although an ensemble of models sharing a common set of hyperparameters can be defined using the learning network API, MLJ's EnsembleModel model wrapper is preferred, for convenience and best performance. Examples of using EnsembleModel are given in this Data Science Tutorial.

When bagging decision trees, further randomness is normally introduced by subsampling features, when training each node of each tree (Ho (1995), Breiman and Cutler (2001)). A bagged ensemble of such trees is known as a Random Forest. You can see an example of using EnsembleModel to build a random forest in this Data Science Tutorial. However, you may also want to use a canned random forest model. Run models("RandomForest") to list such models.

MLJEnsembles.EnsembleModelFunction
EnsembleModel(model,
+Homogeneous Ensembles · MLJ

Homogeneous Ensembles

Although an ensemble of models sharing a common set of hyperparameters can be defined using the learning network API, MLJ's EnsembleModel model wrapper is preferred, for convenience and best performance. Examples of using EnsembleModel are given in this Data Science Tutorial.

When bagging decision trees, further randomness is normally introduced by subsampling features, when training each node of each tree (Ho (1995), Breiman and Cutler (2001)). A bagged ensemble of such trees is known as a Random Forest. You can see an example of using EnsembleModel to build a random forest in this Data Science Tutorial. However, you may also want to use a canned random forest model. Run models("RandomForest") to list such models.

MLJEnsembles.EnsembleModelFunction
EnsembleModel(model,
               atomic_weights=Float64[],
               bagging_fraction=0.8,
               n=100,
               rng=GLOBAL_RNG,
               acceleration=CPU1(),
-              out_of_bag_measure=[])

Create a model for training an ensemble of n clones of model, with optional bagging. Ensembling is useful if fit!(machine(atom, data...)) does not create identical models on repeated calls (ie, is a stochastic model, such as a decision tree with randomized node selection criteria), or if bagging_fraction is set to a value less than 1.0, or both.

Here the atomic model must support targets with scitype AbstractVector{<:Finite} (single-target classifiers) or AbstractVector{<:Continuous} (single-target regressors).

If rng is an integer, then MersenneTwister(rng) is the random number generator used for bagging. Otherwise some AbstractRNG object is expected.

The atomic predictions are optionally weighted according to the vector atomic_weights (to allow for external optimization) except in the case that model is a Deterministic classifier, in which case atomic_weights are ignored.

The ensemble model is Deterministic or Probabilistic, according to the corresponding supertype of atom. In the case of deterministic classifiers (target_scitype(atom) <: AbstractVector{<:Finite}), the predictions are majority votes, and for regressors (target_scitype(atom) <: AbstractVector{<:Continuous}) they are ordinary averages. Probabilistic predictions are obtained by averaging the atomic probability distribution/mass functions; in particular, for regressors, the ensemble prediction on each input pattern has the type MixtureModel{VF,VS,D} from the Distributions.jl package, where D is the type of predicted distribution for atom.

Specify acceleration=CPUProcesses() for distributed computing, or CPUThreads() for multithreading.

If a single measure or non-empty vector of measures is specified by out_of_bag_measure, then out-of-bag estimates of performance are written to the training report (call report on the trained machine wrapping the ensemble model).

Important: If per-observation or class weights w (not to be confused with atomic weights) are specified when constructing a machine for the ensemble model, as in mach = machine(ensemble_model, X, y, w), then w is used by any measures specified in out_of_bag_measure that support them.

source
+ out_of_bag_measure=[])

Create a model for training an ensemble of n clones of model, with optional bagging. Ensembling is useful if fit!(machine(atom, data...)) does not create identical models on repeated calls (ie, is a stochastic model, such as a decision tree with randomized node selection criteria), or if bagging_fraction is set to a value less than 1.0, or both.

Here the atomic model must support targets with scitype AbstractVector{<:Finite} (single-target classifiers) or AbstractVector{<:Continuous} (single-target regressors).

If rng is an integer, then MersenneTwister(rng) is the random number generator used for bagging. Otherwise some AbstractRNG object is expected.

The atomic predictions are optionally weighted according to the vector atomic_weights (to allow for external optimization) except in the case that model is a Deterministic classifier, in which case atomic_weights are ignored.

The ensemble model is Deterministic or Probabilistic, according to the corresponding supertype of atom. In the case of deterministic classifiers (target_scitype(atom) <: AbstractVector{<:Finite}), the predictions are majority votes, and for regressors (target_scitype(atom) <: AbstractVector{<:Continuous}) they are ordinary averages. Probabilistic predictions are obtained by averaging the atomic probability distribution/mass functions; in particular, for regressors, the ensemble prediction on each input pattern has the type MixtureModel{VF,VS,D} from the Distributions.jl package, where D is the type of predicted distribution for atom.

Specify acceleration=CPUProcesses() for distributed computing, or CPUThreads() for multithreading.

If a single measure or non-empty vector of measures is specified by out_of_bag_measure, then out-of-bag estimates of performance are written to the training report (call report on the trained machine wrapping the ensemble model).

Important: If per-observation or class weights w (not to be confused with atomic weights) are specified when constructing a machine for the ensemble model, as in mach = machine(ensemble_model, X, y, w), then w is used by any measures specified in out_of_bag_measure that support them.

source
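
A minimal sketch, following the signature above and assuming DecisionTree.jl is installed to supply the atomic model:

using MLJ
Tree = @load DecisionTreeRegressor pkg=DecisionTree
forest = EnsembleModel(Tree(), n=100, bagging_fraction=0.7, out_of_bag_measure=[rms])
X, y = make_regression(200, 5)
mach = machine(forest, X, y)
fit!(mach)
report(mach)  # includes the out-of-bag estimate of rms
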
diff --git a/dev/index.html b/dev/index.html index 332228c44..e71338917 100644 --- a/dev/index.html +++ b/dev/index.html @@ -1,5 +1,5 @@ -Home · MLJ
+Home · MLJ
+ Star

Model Browser

Reference Manual

Basics

Getting Started | Working with Categorical Data | Common MLJ Workflows | Machines | MLJ Cheatsheet

Data

Working with Categorical Data | Preparing Data | Generating Synthetic Data | OpenML Integration | Correcting Class Imbalance

Models

Model Search | Loading Model Code | Transformers and Other Unsupervised Models | Simple User Defined Models | List of Supported Models | Third Party Packages

Meta-algorithms

Evaluating Model Performance | Tuning Models | Composing Models | Controlling Iterative Models | Learning Curves | Correcting Class Imbalance | Thresholding Probabilistic Predictors

Composition

Composing Models | Linear Pipelines | Target Transformations | Homogeneous Ensembles | Model Stacking | Learning Networks | Correcting Class Imbalance

Integration

Logging Workflows | OpenML Integration

Customization and Extension

Simple User Defined Models | Quick-Start Guide to Adding Models | Adding Models for General Use | Composing Models | Internals | Modifying Behavior

Miscellaneous

Weights | Acceleration and Parallelism | Performance Measures

diff --git a/dev/internals/index.html b/dev/internals/index.html index c1e7a3264..47dc281df 100644 --- a/dev/internals/index.html +++ b/dev/internals/index.html @@ -1,5 +1,5 @@ -Internals · MLJ

Internals

The machine interface, simplified

The following is a simplified description of the Machine interface. It predates the introduction of an optional data front-end for models (see Implementing a data front-end). See also the Glossary

The Machine type

mutable struct Machine{M<:Model}
+Internals · MLJ
+end
diff --git a/dev/learning_curves/index.html b/dev/learning_curves/index.html index 44bb6f1db..630f33d86 100644 --- a/dev/learning_curves/index.html +++ b/dev/learning_curves/index.html @@ -1,5 +1,5 @@ -Learning Curves · MLJ

Learning Curves

A learning curve in MLJ is a plot of some performance estimate, as a function of some model hyperparameter. This can be useful when tuning a single model hyperparameter, or when deciding how many iterations are required for some iterative model. The learning_curve method does not actually generate a plot but generates the data needed to do so.

To generate learning curves you can bind data to a model by instantiating a machine. You can choose to supply all available data, as performance estimates are computed using a resampling strategy, defaulting to Holdout(fraction_train=0.7).

using MLJ
+Learning Curves · MLJ

Learning Curves

A learning curve in MLJ is a plot of some performance estimate, as a function of some model hyperparameter. This can be useful when tuning a single model hyperparameter, or when deciding how many iterations are required for some iterative model. The learning_curve method does not actually generate a plot but generates the data needed to do so.

To generate learning curves you can bind data to a model by instantiating a machine. You can choose to supply all available data, as performance estimates are computed using a resampling strategy, defaulting to Holdout(fraction_train=0.7).

using MLJ
 X, y = @load_boston;
 
 atom = (@load RidgeRegressor pkg=MLJLinearModels)()
@@ -62,4 +62,4 @@
      ylab="Holdout estimate of RMS error")
 
 
learning_curve(model::Supervised, X, y; kwargs...)
-learning_curve(model::Supervised, X, y, w; kwargs...)

Plot a learning curve (or curves) directly, without first constructing a machine.

Summary of key-word options

  • resolution - number of points generated from range (number of model evaluations); default is 30

  • acceleration - parallelization option for passing to evaluate!; an instance of CPU1, CPUProcesses or CPUThreads from ComputationalResources.jl; default is default_resource()

  • acceleration_grid - parallelization option for distributing each performance evaluation

  • rngs - for specifying random number generator(s) to be passed to the model (see above)

  • rng_name - name of the model hyper-parameter representing a random number generator (see above); possibly nested

Other key-word options are documented at TunedModel.

source
+learning_curve(model::Supervised, X, y, w; kwargs...)

Plot a learning curve (or curves) directly, without first constructing a machine.

Summary of key-word options

Other key-word options are documented at TunedModel.

source diff --git a/dev/learning_mlj/index.html b/dev/learning_mlj/index.html index b7f737678..11b1c0dea 100644 --- a/dev/learning_mlj/index.html +++ b/dev/learning_mlj/index.html @@ -1,2 +1,2 @@ -Learning MLJ · MLJ

Learning MLJ

MLJ Cheatsheet

See also Getting help and reporting problems.

The present document, although littered with examples, is primarily intended as a complete reference.

Where to start?

Completely new to Julia?

Julia's learning resources page | Learn X in Y minutes | HelloJulia

New to data science?

Julia Data Science

New to machine learning?

Introduction to Statistical Learning with Julia versions of the R labs here

Know some ML and just want MLJ basics?

Getting Started | Common MLJ Workflows

An ML practitioner transitioning from another platform?

MLJ for Data Scientists in Two Hours | MLJTutorial

Other resources

+Learning MLJ · MLJ

Learning MLJ

MLJ Cheatsheet

See also Getting help and reporting problems.

The present document, although littered with examples, is primarily intended as a complete reference.

Where to start?

Completely new to Julia?

Julia's learning resources page | Learn X in Y minutes | HelloJulia

New to data science?

Julia Data Science

New to machine learning?

Introduction to Statistical Learning with Julia versions of the R labs here

Know some ML and just want MLJ basics?

Getting Started | Common MLJ Workflows

An ML practitioner transitioning from another platform?

MLJ for Data Scientists in Two Hours | MLJTutorial

Other resources

diff --git a/dev/learning_networks/index.html b/dev/learning_networks/index.html index 639e985c4..7e9f19013 100644 --- a/dev/learning_networks/index.html +++ b/dev/learning_networks/index.html @@ -1,5 +1,5 @@ -Learning Networks · MLJ

Learning Networks

Below is a practical guide to the MLJ implementation of learning networks, which have been described more abstractly in the article:

Anthony D. Blaom and Sebastian J. Vollmer (2020): Flexible model composition in machine learning and its implementation in MLJ. Preprint, arXiv:2012.15505.

Learning networks, an advanced but powerful MLJ feature, are "blueprints" for combining models in flexible ways, beyond ordinary linear pipelines and simple model ensembles. They are simple transformations of your existing workflows which can be "exported" to define new, re-usable composite model types (models which typically have other models as hyperparameters).

Pipeline models (see Pipeline) and model stacks (see Stack) are both implemented internally as exported learning networks.

Note

While learning networks can be used for complex machine learning workflows, their main purpose is for defining new stand-alone model types, which behave just like any other model type: Instances can be evaluated, tuned, inserted into pipelines, etc. In serious applications, users are encouraged to export their learning networks, as explained under Exporting a learning network as a new model type below, after testing the network, using a small training dataset.

Learning networks by example

Learning networks are best explained by way of example.

Lazy computation

The core idea of a learning network is delayed or lazy computation. Instead of

X = 4
+Learning Networks · MLJ

Learning Networks

Below is a practical guide to the MLJ implementation of learning networks, which have been described more abstractly in the article:

Anthony D. Blaom and Sebastian J. Voller (2020): Flexible model composition in machine learning and its implementation in MLJ. Preprint, arXiv:2012.15505.

Learning networks, an advanced but powerful MLJ feature, are "blueprints" for combining models in flexible ways, beyond ordinary linear pipelines and simple model ensembles. They are simple transformations of your existing workflows which can be "exported" to define new, re-usable composite model types (models which typically have other models as hyperparameters).

Pipeline models (see Pipeline) and model stacks (see Stack) are both implemented internally as exported learning networks.

Note

While learning networks can be used for complex machine learning workflows, their main purpose is for defining new stand-alone model types, which behave just like any other model type: Instances can be evaluated, tuned, inserted into pipelines, etc. In serious applications, users are encouraged to export their learning networks, as explained under Exporting a learning network as a new model type below, after testing the network, using a small training dataset.

Learning networks by example

Learning networks are best explained by way of example.

Lazy computation

The core idea of a learning network is delayed or lazy computation. Instead of

X = 4
 Y = 3
 Z = 2*X
 W = Y + Z
@@ -10,10 +10,10 @@
 Z = 2*X
 W = Y + Z
 W()
11

In the first computation X, Y, Z and W are all bound to ordinary data. In the second, they are bound to objects called nodes. The special nodes X and Y constitute "entry points" for data, and are called source nodes. As the terminology suggests, we can imagine these objects as part of a "network" (a directed acyclic graph) which can aid conceptualization (but is less useful in more complicated examples):

The origin of a node

The source nodes on which a given node depends are called the origins of the node:

os = origins(W)
2-element Vector{Source}:
- Source @317 ⏎ `Count`
- Source @622 ⏎ `Count`
X in os
true

Re-using a network

The advantage of lazy evaluation is that we can change data at a source node to repeat the calculation with new data. One way to do this (discouraged in practice) is to use rebind!:

Z()
8
rebind!(X, 6) # demonstration only!
+ Source @909 ⏎ `Count`
+ Source @464 ⏎ `Count`
X in os
true

Re-using a network

The advantage of lazy evaluation is that we can change data at a source node to repeat the calculation with new data. One way to do this (discouraged in practice) is to use rebind!:

Z()
8
rebind!(X, 6) # demonstration only!
 Z()
12

However, if a node has a unique origin, then one instead calls the node on the new data one would like to rebind to that origin:

origins(Z)
1-element Vector{Source}:
- Source @622 ⏎ `Count`
Z(6)
12
Z(4)
8

This has the advantage that you don't need to locate the origin and rebind data directly, and the unique-origin restriction turns out to be sufficient for the applications to learning we have in mind.

Overloading functions for use on nodes

Several built-in functions, like * and + above, are overloaded in MLJBase to work on nodes, as illustrated above. Others that work out-of-the-box include: MLJBase.matrix, MLJBase.table, vcat, hcat, mean, median, mode, first, last, as well as broadcasted versions of log, exp, mean, mode and median. A function like sqrt is not overloaded, so that Q = sqrt(Z) will throw an error. Instead, we do

Q = node(sqrt, Z)
+ Source @464 ⏎ `Count`
Z(6)
12
Z(4)
8

This has the advantage that you don't need to locate the origin and rebind data directly, and the unique-origin restriction turns out to be sufficient for the applications to learning we have in mind.

Overloading functions for use on nodes

Several built-in functions, like * and + above, are overloaded in MLJBase to work on nodes, as illustrated above. Others that work out-of-the-box include: MLJBase.matrix, MLJBase.table, vcat, hcat, mean, median, mode, first, last, as well as broadcasted versions of log, exp, mean, mode and median. A function like sqrt is not overloaded, so that Q = sqrt(Z) will throw an error. Instead, we do

Q = node(sqrt, Z)
 Z()
12
Q()
3.4641016151377544

You can learn more about the node function under More on defining new nodes.

A network that learns

To incorporate learning in a network of nodes MLJ:

  • Allows binding of machines to nodes instead of data

  • Generates "operation" nodes when calling an operation like predict or transform on a machine and node input data. Such nodes point to both a machine (storing learned parameters) and the node from which to fetch data for applying the operation (which, unlike the nodes seen so far, depend on learned parameters to generate output).

For an example of a learning network that actually learns, we first synthesize some training data X, y, and production data Xnew:

using MLJ
 X, y = make_blobs(cluster_std=10.0, rng=123)  # `X` is a table, `y` a vector
 Xnew, _ = make_blobs(3) # `Xnew` is a table with the same number of columns

We choose a model to do some dimension reduction, and another to perform classification:

pca = (@load PCA pkg=MultivariateStats verbosity=0)()
@@ -22,40 +22,40 @@
 x = transform(mach1, Xs) # defines a new node because `Xs` is a node
 
 mach2 = machine(tree, x, ys)
-yhat = predict(mach2, x) # defines a new node because `x` is a node
Node @416 → DecisionTreeClassifier(…)
+yhat = predict(mach2, x) # defines a new node because `x` is a node
Node @305 → DecisionTreeClassifier(…)
   args:
-    1:	Node @807 → PCA(…)
+    1:	Node @628 → PCA(…)
   formula:
     predict(
       machine(DecisionTreeClassifier(max_depth = -1, …), …), 
       transform(
         machine(PCA(maxoutdim = 0, …), …), 
-        Source @900))

Note that mach1 and mach2 are not themselves nodes. They point to the nodes they need to call to get training data and they are in turn pointed to by other nodes. In fact, an interesting implementation detail is that an "ordinary" machine is not actually bound directly to data, but bound to data wrapped in source nodes.

machine(pca, Xnew).args[1] # `Xnew` is ordinary data
Source @621 ⏎ `Table{AbstractVector{Continuous}}`

Before calling a node, we need to fit! the node, to trigger training of all the machines on which it depends:

julia> fit!(yhat)   # can include same keyword options for `fit!(::Machine, ...)`
[ Info: Training machine(PCA(maxoutdim = 0, …), …).
+        Source @581))

Note that mach1 and mach2 are not themselves nodes. They point to the nodes they need to call to get training data and they are in turn pointed to by other nodes. In fact, an interesting implementation detail is that an "ordinary" machine is not actually bound directly to data, but bound to data wrapped in source nodes.

machine(pca, Xnew).args[1] # `Xnew` is ordinary data
Source @107 ⏎ `Table{AbstractVector{Continuous}}`

Before calling a node, we need to fit! the node, to trigger training of all the machines on which it depends:

julia> fit!(yhat)   # can include same keyword options for `fit!(::Machine, ...)`
[ Info: Training machine(PCA(maxoutdim = 0, …), …).
 [ Info: Training machine(DecisionTreeClassifier(max_depth = -1, …), …).
-Node @416 → DecisionTreeClassifier(…)
+Node @305 → DecisionTreeClassifier(…)
   args:
-    1:	Node @807 → PCA(…)
+    1:	Node @628 → PCA(…)
   formula:
     predict(
       machine(DecisionTreeClassifier(max_depth = -1, …), …),
       transform(
         machine(PCA(maxoutdim = 0, …), …),
-        Source @900))
+        Source @581))
julia> yhat()[1:2] # or `yhat(rows=2)`
2-element UnivariateFiniteVector{Multiclass{3}, Int64, UInt32, Float64}:
 UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>0.0, 3=>0.0)
 UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>0.0, 3=>0.0)

This last represents the prediction on the training data, because that's what resides at our source nodes. However, yhat has the unique origin X (because "training edges" in the complete associated directed graph are excluded for this purpose). We can therefore call yhat on our production data to get the corresponding predictions:

yhat(Xnew)
3-element UnivariateFiniteVector{Multiclass{3}, Int64, UInt32, Float64}:
  UnivariateFinite{Multiclass{3}}(1=>0.0, 2=>0.0, 3=>1.0)
  UnivariateFinite{Multiclass{3}}(1=>0.0, 2=>0.0, 3=>1.0)
  UnivariateFinite{Multiclass{3}}(1=>1.0, 2=>0.0, 3=>0.0)

Training is smart, in the sense that mutating a hyper-parameter of some component model does not force retraining of upstream machines:

julia> tree.max_depth = 11
julia> fit!(yhat)
[ Info: Not retraining machine(PCA(maxoutdim = 0, …), …). Use `force=true` to force.
[ Info: Updating machine(DecisionTreeClassifier(max_depth = 1, …), …).
-Node @416 → DecisionTreeClassifier(…)
+Node @305 → DecisionTreeClassifier(…)
   args:
-    1:	Node @807 → PCA(…)
+    1:	Node @628 → PCA(…)
   formula:
     predict(
       machine(DecisionTreeClassifier(max_depth = 1, …), …),
       transform(
         machine(PCA(maxoutdim = 0, …), …),
-        Source @900))
+        Source @581))

julia> yhat(Xnew)
3-element UnivariateFiniteVector{Multiclass{3}, Int64, UInt32, Float64}:
 UnivariateFinite{Multiclass{3}}(1=>0.357, 2=>0.4, 3=>0.243)
 UnivariateFinite{Multiclass{3}}(1=>0.357, 2=>0.4, 3=>0.243)
 UnivariateFinite{Multiclass{3}}(1=>0.357, 2=>0.4, 3=>0.243)

Multithreaded training

A more complicated learning network may contain machines that can be trained in parallel. In that case, a call like the following may speed up training:

tree.max_depth = 2
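# A plausible completion of the call elided by this hunk (a sketch only): train the
# machines in the network using multithreading, via the `acceleration` option of `fit!`.
fit!(yhat, acceleration=CPUThreads())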
@@ -67,15 +67,15 @@
 NetworkComposite
NetworkComposite (alias for Union{AnnotatorNetworkComposite, DeterministicNetworkComposite, DeterministicSupervisedDetectorNetworkComposite, DeterministicUnsupervisedDetectorNetworkComposite, IntervalNetworkComposite, JointProbabilisticNetworkComposite, ProbabilisticNetworkComposite, ProbabilisticSetNetworkComposite, ProbabilisticSupervisedDetectorNetworkComposite, ProbabilisticUnsupervisedDetectorNetworkComposite, StaticNetworkComposite, SupervisedAnnotatorNetworkComposite, SupervisedDetectorNetworkComposite, SupervisedNetworkComposite, UnsupervisedAnnotatorNetworkComposite, UnsupervisedDetectorNetworkComposite, UnsupervisedNetworkComposite})
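The definition of the new model struct falls outside this hunk; a minimal sketch, assuming the supertype ProbabilisticNetworkComposite and the field names preprocessor and classifier used in the code that follows:

mutable struct CompositeA <: ProbabilisticNetworkComposite
    preprocessor
    classifier
end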

We next make our learning network model-generic by substituting each model instance with the corresponding symbol representing a property (field) of the new model struct:

mach1 = machine(:preprocessor, Xs)   # <---- `pca` swapped out for `:preprocessor`
 x = transform(mach1, Xs)
 mach2 = machine(:classifier, x, ys)  # <---- `tree` swapped out for `:classifier`
-yhat = predict(mach2, x)
Node @090 → :classifier
+yhat = predict(mach2, x)
Node @128 → :classifier
   args:
-    1:	Node @544 → :preprocessor
+    1:	Node @116 → :preprocessor
   formula:
     predict(
       machine(:classifier, …), 
       transform(
         machine(:preprocessor, …), 
-        Source @900))

Incidentally, this network can be used as before except we must provide an instance of CompositeA in our fit! calls, to indicate what actual models the symbols are being substituted with:

composite_a = CompositeA(pca, ConstantClassifier())
+        Source @581))

Incidentally, this network can be used as before except we must provide an instance of CompositeA in our fit! calls, to indicate what actual models the symbols are being substituted with:

composite_a = CompositeA(pca, ConstantClassifier())
 fit!(yhat, composite=composite_a)
 yhat(Xnew)
3-element UnivariateFiniteVector{Multiclass{3}, Int64, UInt32, Float64}:
  UnivariateFinite{Multiclass{3}}(1=>0.33, 2=>0.33, 3=>0.34)
@@ -417,4 +417,4 @@
 10
 

See also node

source
MLJBase.prefitFunction
MLJBase.prefit(model, verbosity, data...)

Returns a learning network interface (see below) for a learning network with source nodes that wrap data.

A user overloads MLJBase.prefit when exporting a learning network as a new stand-alone model type, of which model above will be an instance. See the MLJ reference manual for details.
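For orientation, a minimal sketch of such an overload, based on the CompositeA example from the manual section above (the field names preprocessor and classifier are assumptions carried over from that example); it returns a learning network interface, described next:

function MLJBase.prefit(composite::CompositeA, verbosity, X, y)
    # wrap the data in source nodes and rebuild the network, with component models
    # replaced by symbols naming fields of `composite`:
    Xs = source(X)
    ys = source(y)
    mach1 = machine(:preprocessor, Xs)
    x = transform(mach1, Xs)
    mach2 = machine(:classifier, x, ys)
    yhat = predict(mach2, x)
    # declare the interface point(s) of the network:
    return (predict=yhat,)
end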

A learning network interface is a named tuple declaring certain interface points in a learning network, to be used when "exporting" the network as a new stand-alone model type. Examples are

 (predict=yhat,)
  (transform=Xsmall, acceleration=CPUThreads())
- (predict=yhat, transform=W, report=(loss=loss_node,))

Here yhat, Xsmall, W and loss_node are nodes in the network.

The keys of the learning network interface are always one of the following:

  • The name of an operation, such as :predict, :predict_mode, :transform, :inverse_transform. See "Operation keys" below.

  • :report, for exposing results of calling a node with no arguments in the composite model report. See "Including report nodes" below.

  • :fitted_params, for exposing results of calling a node with no arguments as fitted parameters of the composite model. See "Including fitted parameter nodes" below.

  • :acceleration, for articulating acceleration mode for training the network, e.g., CPUThreads(). Corresponding value must be an AbstractResource. If not included, CPU1() is used.

Operation keys

If the key is an operation, then the value must be a node n in the network with a unique origin (length(origins(n)) === 1). The intention of a declaration such as predict=yhat is that the exported model type implements predict, which, when applied to new data Xnew, should return yhat(Xnew).

Including report nodes

If the key is :report, then the corresponding value must be a named tuple

 (k1=n1, k2=n2, ...)

whose values are all nodes. For each k=n pair, the key k will appear as a key in the composite model report, with a corresponding value of deepcopy(n()), called immediately after training or updating the network. For examples, refer to the "Learning Networks" section of the MLJ manual.

Including fitted parameter nodes

If the key is :fitted_params, then the behaviour is as for report nodes but results are exposed as fitted parameters of the composite model instead of the report.

source

See more on fitting nodes at fit! and fit_only!.

+ (predict=yhat, transform=W, report=(loss=loss_node,))

Here yhat, Xsmall, W and loss_node are nodes in the network.

The keys of the learning network interface are always one of the following:

Operation keys

If the key is an operation, then the value must be a node n in the network with a unique origin (length(origins(n)) === 1). The intention of a declaration such as predict=yhat is that the exported model type implements predict, which, when applied to new data Xnew, should return yhat(Xnew).

Including report nodes

If the key is :report, then the corresponding value must be a named tuple

 (k1=n1, k2=n2, ...)

whose values are all nodes. For each k=n pair, the key k will appear as a key in the composite model report, with a corresponding value of deepcopy(n()), called immediately after training or updating the network. For examples, refer to the "Learning Networks" section of the MLJ manual.

Including fitted parameter nodes

If the key is :fitted_params, then the behaviour is as for report nodes but results are exposed as fitted parameters of the composite model instead of the report.

source

See more on fitting nodes at fit! and fit_only!.

diff --git a/dev/linear_pipelines/index.html b/dev/linear_pipelines/index.html index b39e23d28..af10e1c96 100644 --- a/dev/linear_pipelines/index.html +++ b/dev/linear_pipelines/index.html @@ -1,5 +1,5 @@ -Linear Pipelines · MLJ

Linear Pipelines

In MLJ a pipeline is a composite model in which models are chained together in a linear (non-branching) chain. For other arrangements, including custom architectures via learning networks, see Composing Models.

For purposes of illustration, consider a supervised learning problem with the following toy data:

using MLJ
+Linear Pipelines · MLJ

Linear Pipelines

In MLJ a pipeline is a composite model in which models are chained together in a linear (non-branching) chain. For other arrangements, including custom architectures via learning networks, see Composing Models.

For purposes of illustration, consider a supervised learning problem with the following toy data:

using MLJ
 X = (age    = [23, 45, 34, 25, 67],
      gender = categorical(['m', 'm', 'f', 'm', 'f']));
 y = [67.0, 81.5, 55.6, 90.0, 61.1]

We would like to train using a K-nearest neighbor model, but the model type KNNRegressor assumes the features are all Continuous. This can be fixed by first:

  • coercing the :age feature to have Continuous type by replacing X with coerce(X, :age=>Continuous)
  • standardizing continuous features and one-hot encoding the Multiclass features using the ContinuousEncoder model

However, we can avoid separately applying these preprocessing steps (two of which require fit! steps) by combining them with the supervised KNNRegressor model in a new pipeline model, using Julia's |> syntax:

KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels
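The rest of the snippet is elided by this hunk; a plausible continuation (a sketch, with the value of the K hyper-parameter chosen arbitrarily):

pipe = (X -> coerce(X, :age=>Continuous)) |> ContinuousEncoder() |> KNNRegressor(K=2)

The resulting pipe behaves like any other supervised model and can be bound to data in a machine, evaluated, or tuned.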
@@ -41,4 +41,4 @@
 
 pipe1 = MLJBase.table |> ContinuousEncoder |> Standardizer
 pipe2 = PCA |> LinearRegressor
-pipe1 |> pipe2

At most one of the components may be a supervised model, but this model can appear in any position. A pipeline with a Supervised component is itself Supervised and implements the predict operation. It is otherwise Unsupervised (possibly Static) and implements transform.

Special operations

If all the components are invertible unsupervised models (ie, implement inverse_transform) then inverse_transform is implemented for the pipeline. If there are no supervised models, then predict is nevertheless implemented, assuming the last component is a model that implements it (some clustering models). Similarly, calling transform on a supervised pipeline calls transform on the supervised component.

Optional key-word arguments

  • prediction_type - prediction type of the pipeline; possible values: :deterministic, :probabilistic, :interval (default=:deterministic if not inferable)

  • operation - operation applied to the supervised component model, when present; possible values: predict, predict_mean, predict_median, predict_mode (default=predict)

  • cache - whether the internal machines created for component models should cache model-specific representations of data (see machine) (default=true)

Warning

Set cache=false to guarantee data anonymization.

To build more complicated non-branching pipelines, refer to the MLJ manual sections on composing models.

source
+pipe1 |> pipe2

At most one of the components may be a supervised model, but this model can appear in any position. A pipeline with a Supervised component is itself Supervised and implements the predict operation. It is otherwise Unsupervised (possibly Static) and implements transform.

Special operations

If all the components are invertible unsupervised models (ie, implement inverse_transform) then inverse_transform is implemented for the pipeline. If there are no supervised models, then predict is nevertheless implemented, assuming the last component is a model that implements it (some clustering models). Similarly, calling transform on a supervised pipeline calls transform on the supervised component.

Optional key-word arguments

Warning

Set cache=false to guarantee data anonymization.

To build more complicated non-branching pipelines, refer to the MLJ manual sections on composing models.

source diff --git a/dev/list_of_supported_models/index.html b/dev/list_of_supported_models/index.html index 8a7ea1650..6c31733c3 100644 --- a/dev/list_of_supported_models/index.html +++ b/dev/list_of_supported_models/index.html @@ -1,2 +1,2 @@ -List of Supported Models · MLJ

List of Supported Models

For a list of models organized around function ("classification", "regression", etc.), see the Model Browser.

MLJ provides access to a wide variety of machine learning models. We are always looking for help adding new models or testing existing ones. Currently available models are listed below; for the most up-to-date list, run using MLJ; models().

Indications of "maturity" in the table below are approximate, surjective, and possibly out-of-date. A decision to use or not use a model in a critical application should be based on a user's independent assessment.

  • experimental: indicates the package is fairly new and/or is under active development; you can help by testing these packages and making them more robust,
  • low: indicates a package that has reached a roughly stable form in terms of interface and which is unlikely to contain serious bugs. It may be missing some functionality found in similar packages. It has not benefited from a high level of use,
  • medium: indicates the package is fairly mature but may benefit from optimizations and/or extra features; you can help by suggesting either,
  • high: indicates the package is very mature and functionalities are expected to have been fairly optimized and tested.
Package | Interface Pkg | Models | Maturity | Note
BetaML.jl-DecisionTreeClassifier, RandomForestClassifier, NeuralNetworkClassifier, PerceptronClassifier, KernelPerceptronClassifier, PegasosClassifier, DecisionTreeRegressor, RandomForestRegressor, NeuralNetworkRegressor, MultitargetNeuralNetworkRegressor, GaussianMixtureRegressor, MultitargetGaussianMixtureRegressor, KMeansClusterer, KMedoidsClusterer, GaussianMixtureClusterer, SimpleImputer, GaussianMixtureImputer, RandomForestImputer, GeneralImputer, AutoEncodermedium
CatBoost.jl-CatBoostRegressor, CatBoostClassifierhigh
Clustering.jlMLJClusteringInterface.jlKMeans, KMedoids, DBSCAN, HierarchicalClusteringhigh²
DecisionTree.jlMLJDecisionTreeInterface.jlDecisionTreeClassifier, DecisionTreeRegressor, AdaBoostStumpClassifier, RandomForestClassifier, RandomForestRegressorhigh
EvoTrees.jl-EvoTreeRegressor, EvoTreeClassifier, EvoTreeCount, EvoTreeGaussian, EvoTreeMLEmediumtree-based gradient boosting models
EvoLinear.jl-EvoLinearRegressormediumlinear boosting models
GLM.jlMLJGLMInterface.jlLinearRegressor, LinearBinaryClassifier, LinearCountRegressormedium²
Imbalance.jl-RandomOversampler, RandomWalkOversampler, ROSE, SMOTE, BorderlineSMOTE1, SMOTEN, SMOTENC, RandomUndersampler, ClusterUndersampler, ENNUndersampler, TomekUndersampler,low
LIBSVM.jlMLJLIBSVMInterface.jlLinearSVC, SVC, NuSVC, NuSVR, EpsilonSVR, OneClassSVMhighalso via ScikitLearn.jl
LightGBM.jl-LGBMClassifier, LGBMRegressorhigh
FeatureSelector.jl-FeatureSelector, RecursiveFeatureEliminationlow
Flux.jlMLJFlux.jlNeuralNetworkRegressor, NeuralNetworkClassifier, MultitargetNeuralNetworkRegressor, ImageClassifierlow
MLJBalancing.jl-BalancedBaggingClassifierlow
MLJLinearModels.jl-LinearRegressor, RidgeRegressor, LassoRegressor, ElasticNetRegressor, QuantileRegressor, HuberRegressor, RobustRegressor, LADRegressor, LogisticClassifier, MultinomialClassifiermedium
MLJModels.jl (built-in)-ConstantClassifier, ConstantRegressor, ContinuousEncoder, DeterministicConstantClassifier, DeterministicConstantRegressor, FillImputer, InteractionTransformer, OneHotEncoder, Standardizer, UnivariateBoxCoxTransformer, UnivariateDiscretizer, UnivariateFillImputer, UnivariateTimeTypeToContinuous, Standardizer, BinaryThreshholdPredictormedium
MLJText.jl-TfidfTransformer, BM25Transformer, CountTransformerlow
MultivariateStats.jlMLJMultivariateStatsInterface.jlLinearRegressor, MultitargetLinearRegressor, RidgeRegressor, MultitargetRidgeRegressor, PCA, KernelPCA, ICA, LDA, BayesianLDA, SubspaceLDA, BayesianSubspaceLDA, FactorAnalysis, PPCAhigh
NaiveBayes.jlMLJNaiveBayesInterface.jlGaussianNBClassifier, MultinomialNBClassifier, HybridNBClassifierlow
NearestNeighborModels.jl-KNNClassifier, KNNRegressor, MultitargetKNNClassifier, MultitargetKNNRegressorhigh
OneRule.jl-OneRuleClassifierexperimental
OutlierDetectionNeighbors.jl-ABODDetector, COFDetector, DNNDetector, KNNDetector, LOFDetectormedium
OutlierDetectionNetworks.jl-AEDetector, DSADDetector, ESADDetectormedium
OutlierDetectionPython.jl-ABODDetector, CBLOFDetector, CDDetector, COFDetector, COPODDetector, ECODDetector, GMMDetector, HBOSDetector, IForestDetector, INNEDetector, KDEDetector, KNNDetector, LMDDDetector, LOCIDetector, LODADetector, LOFDetector, MCDDetector, OCSVMDetector, PCADetector, RODDetector, SODDetector, SOSDetectorhigh
ParallelKMeans.jl-KMeansexperimental
PartialLeastSquaresRegressor.jl-PLSRegressor, KPLSRegressorexperimental
PartitionedLS.jl-PartLSlow
ScikitLearn.jlMLJScikitLearnInterface.jlARDRegressor, AdaBoostClassifier, AdaBoostRegressor, AffinityPropagation, AgglomerativeClustering, BaggingClassifier, BaggingRegressor, BayesianLDA, BayesianQDA, BayesianRidgeRegressor, BernoulliNBClassifier, Birch, ComplementNBClassifier, DBSCAN, DummyClassifier, DummyRegressor, ElasticNetCVRegressor, ElasticNetRegressor, ExtraTreesClassifier, ExtraTreesRegressor, FeatureAgglomeration, GaussianNBClassifier, GaussianProcessClassifier, GaussianProcessRegressor, GradientBoostingClassifier, GradientBoostingRegressor, HuberRegressor, KMeans, KNeighborsClassifier, KNeighborsRegressor, LarsCVRegressor, LarsRegressor, LassoCVRegressor, LassoLarsCVRegressor, LassoLarsICRegressor, LassoLarsRegressor, LassoRegressor, LinearRegressor, LogisticCVClassifier, LogisticClassifier, MeanShift, MiniBatchKMeans, MultiTaskElasticNetCVRegressor, MultiTaskElasticNetRegressor, MultiTaskLassoCVRegressor, MultiTaskLassoRegressor, MultinomialNBClassifier, OPTICS, OrthogonalMatchingPursuitCVRegressor, OrthogonalMatchingPursuitRegressor, PassiveAggressiveClassifier, PassiveAggressiveRegressor, PerceptronClassifier, ProbabilisticSGDClassifier, RANSACRegressor, RandomForestClassifier, RandomForestRegressor, RidgeCVClassifier, RidgeCVRegressor, RidgeClassifier, RidgeRegressor, SGDClassifier, SGDRegressor, SVMClassifier, SVMLClassifier, SVMLRegressor, SVMNuClassifier, SVMNuRegressor, SVMRegressor, SpectralClustering, TheilSenRegressorhigh²
SIRUS.jl-StableForestClassifier, StableForestRegressor, StableRulesClassifier, StableRulesRegressorlow
SymbolicRegression.jl-MultitargetSRRegressor, SRRegressorexperimental
TSVD.jlMLJTSVDInterface.jlTSVDTransformerhigh
XGBoost.jlMLJXGBoostInterface.jlXGBoostRegressor, XGBoostClassifier, XGBoostCounthigh

Notes

¹Models not in the MLJ registry are not included in integration tests. Consult package documentation to see how to load them. There may be issues loading these models simultaneously with other registered models.

²Some models are missing and assistance is welcome to complete the interface. Post a message on the Julia #mlj Slack channel if you would like to help, thanks!

+List of Supported Models · MLJ

List of Supported Models

For a list of models organized around function ("classification", "regression", etc.), see the Model Browser.

MLJ provides access to a wide variety of machine learning models. We are always looking for help adding new models or testing existing ones. Currently available models are listed below; for the most up-to-date list, run using MLJ; models().

Indications of "maturity" in the table below are approximate, surjective, and possibly out-of-date. A decision to use or not use a model in a critical application should be based on a user's independent assessment.

  • experimental: indicates the package is fairly new and/or is under active development; you can help by testing these packages and making them more robust,
  • low: indicates a package that has reached a roughly stable form in terms of interface and which is unlikely to contain serious bugs. It may be missing some functionality found in similar packages. It has not benefited from a high level of use,
  • medium: indicates the package is fairly mature but may benefit from optimizations and/or extra features; you can help by suggesting either,
  • high: indicates the package is very mature and functionalities are expected to have been fairly optimized and tested.
Package | Interface Pkg | Models | Maturity | Note
BetaML.jl-DecisionTreeClassifier, RandomForestClassifier, NeuralNetworkClassifier, PerceptronClassifier, KernelPerceptronClassifier, PegasosClassifier, DecisionTreeRegressor, RandomForestRegressor, NeuralNetworkRegressor, MultitargetNeuralNetworkRegressor, GaussianMixtureRegressor, MultitargetGaussianMixtureRegressor, KMeansClusterer, KMedoidsClusterer, GaussianMixtureClusterer, SimpleImputer, GaussianMixtureImputer, RandomForestImputer, GeneralImputer, AutoEncodermedium
CatBoost.jl-CatBoostRegressor, CatBoostClassifierhigh
Clustering.jlMLJClusteringInterface.jlKMeans, KMedoids, DBSCAN, HierarchicalClusteringhigh²
DecisionTree.jlMLJDecisionTreeInterface.jlDecisionTreeClassifier, DecisionTreeRegressor, AdaBoostStumpClassifier, RandomForestClassifier, RandomForestRegressorhigh
EvoTrees.jl-EvoTreeRegressor, EvoTreeClassifier, EvoTreeCount, EvoTreeGaussian, EvoTreeMLEmediumtree-based gradient boosting models
EvoLinear.jl-EvoLinearRegressormediumlinear boosting models
GLM.jlMLJGLMInterface.jlLinearRegressor, LinearBinaryClassifier, LinearCountRegressormedium²
Imbalance.jl-RandomOversampler, RandomWalkOversampler, ROSE, SMOTE, BorderlineSMOTE1, SMOTEN, SMOTENC, RandomUndersampler, ClusterUndersampler, ENNUndersampler, TomekUndersampler,low
LIBSVM.jlMLJLIBSVMInterface.jlLinearSVC, SVC, NuSVC, NuSVR, EpsilonSVR, OneClassSVMhighalso via ScikitLearn.jl
LightGBM.jl-LGBMClassifier, LGBMRegressorhigh
FeatureSelector.jl-FeatureSelector, RecursiveFeatureEliminationlow
Flux.jlMLJFlux.jlNeuralNetworkRegressor, NeuralNetworkClassifier, MultitargetNeuralNetworkRegressor, ImageClassifierlow
MLJBalancing.jl-BalancedBaggingClassifierlow
MLJLinearModels.jl-LinearRegressor, RidgeRegressor, LassoRegressor, ElasticNetRegressor, QuantileRegressor, HuberRegressor, RobustRegressor, LADRegressor, LogisticClassifier, MultinomialClassifiermedium
MLJModels.jl (built-in)-ConstantClassifier, ConstantRegressor, ContinuousEncoder, DeterministicConstantClassifier, DeterministicConstantRegressor, FillImputer, InteractionTransformer, OneHotEncoder, Standardizer, UnivariateBoxCoxTransformer, UnivariateDiscretizer, UnivariateFillImputer, UnivariateTimeTypeToContinuous, Standardizer, BinaryThreshholdPredictormedium
MLJText.jl-TfidfTransformer, BM25Transformer, CountTransformerlow
MultivariateStats.jlMLJMultivariateStatsInterface.jlLinearRegressor, MultitargetLinearRegressor, RidgeRegressor, MultitargetRidgeRegressor, PCA, KernelPCA, ICA, LDA, BayesianLDA, SubspaceLDA, BayesianSubspaceLDA, FactorAnalysis, PPCAhigh
NaiveBayes.jlMLJNaiveBayesInterface.jlGaussianNBClassifier, MultinomialNBClassifier, HybridNBClassifierlow
NearestNeighborModels.jl-KNNClassifier, KNNRegressor, MultitargetKNNClassifier, MultitargetKNNRegressorhigh
OneRule.jl-OneRuleClassifierexperimental
OutlierDetectionNeighbors.jl-ABODDetector, COFDetector, DNNDetector, KNNDetector, LOFDetectormedium
OutlierDetectionNetworks.jl-AEDetector, DSADDetector, ESADDetectormedium
OutlierDetectionPython.jl-ABODDetector, CBLOFDetector, CDDetector, COFDetector, COPODDetector, ECODDetector, GMMDetector, HBOSDetector, IForestDetector, INNEDetector, KDEDetector, KNNDetector, LMDDDetector, LOCIDetector, LODADetector, LOFDetector, MCDDetector, OCSVMDetector, PCADetector, RODDetector, SODDetector, SOSDetectorhigh
ParallelKMeans.jl-KMeansexperimental
PartialLeastSquaresRegressor.jl-PLSRegressor, KPLSRegressorexperimental
PartitionedLS.jl-PartLSlow
ScikitLearn.jlMLJScikitLearnInterface.jlARDRegressor, AdaBoostClassifier, AdaBoostRegressor, AffinityPropagation, AgglomerativeClustering, BaggingClassifier, BaggingRegressor, BayesianLDA, BayesianQDA, BayesianRidgeRegressor, BernoulliNBClassifier, Birch, ComplementNBClassifier, DBSCAN, DummyClassifier, DummyRegressor, ElasticNetCVRegressor, ElasticNetRegressor, ExtraTreesClassifier, ExtraTreesRegressor, FeatureAgglomeration, GaussianNBClassifier, GaussianProcessClassifier, GaussianProcessRegressor, GradientBoostingClassifier, GradientBoostingRegressor, HuberRegressor, KMeans, KNeighborsClassifier, KNeighborsRegressor, LarsCVRegressor, LarsRegressor, LassoCVRegressor, LassoLarsCVRegressor, LassoLarsICRegressor, LassoLarsRegressor, LassoRegressor, LinearRegressor, LogisticCVClassifier, LogisticClassifier, MeanShift, MiniBatchKMeans, MultiTaskElasticNetCVRegressor, MultiTaskElasticNetRegressor, MultiTaskLassoCVRegressor, MultiTaskLassoRegressor, MultinomialNBClassifier, OPTICS, OrthogonalMatchingPursuitCVRegressor, OrthogonalMatchingPursuitRegressor, PassiveAggressiveClassifier, PassiveAggressiveRegressor, PerceptronClassifier, ProbabilisticSGDClassifier, RANSACRegressor, RandomForestClassifier, RandomForestRegressor, RidgeCVClassifier, RidgeCVRegressor, RidgeClassifier, RidgeRegressor, SGDClassifier, SGDRegressor, SVMClassifier, SVMLClassifier, SVMLRegressor, SVMNuClassifier, SVMNuRegressor, SVMRegressor, SpectralClustering, TheilSenRegressorhigh²
SIRUS.jl-StableForestClassifier, StableForestRegressor, StableRulesClassifier, StableRulesRegressorlow
SymbolicRegression.jl-MultitargetSRRegressor, SRRegressorexperimental
TSVD.jlMLJTSVDInterface.jlTSVDTransformerhigh
XGBoost.jlMLJXGBoostInterface.jlXGBoostRegressor, XGBoostClassifier, XGBoostCounthigh

Notes

¹Models not in the MLJ registry are not included in integration tests. Consult package documentation to see how to load them. There may be issues loading these models simultaneously with other registered models.

²Some models are missing and assistance is welcome to complete the interface. Post a message on the Julia #mlj Slack channel if you would like to help, thanks!

diff --git a/dev/loading_model_code/index.html b/dev/loading_model_code/index.html index cb945c4e7..005f1330b 100644 --- a/dev/loading_model_code/index.html +++ b/dev/loading_model_code/index.html @@ -1,5 +1,5 @@ -Loading Model Code · MLJ

Loading Model Code

Once the name of a model, and the package providing that model, have been identified (see Model Search) one can either import the model type interactively with @iload, as shown under Installation, or use @load as shown below. The @load macro works from within a module, a package or a function, provided the relevant package providing the MLJ interface has been added to your package environment. It will attempt to load the model type into the global namespace of the module in which @load is invoked (Main if invoked at the REPL).

In general, the code providing core functionality for the model (living in a package you should consult for documentation) may be different from the package providing the MLJ interface. Since the core package is a dependency of the interface package, only the interface package needs to be added to your environment.

For instance, suppose you have activated a Julia package environment my_env that you wish to use for your MLJ project; for example, you have run:

using Pkg
+Loading Model Code · MLJ

Loading Model Code

Once the name of a model, and the package providing that model, have been identified (see Model Search) one can either import the model type interactively with @iload, as shown under Installation, or use @load as shown below. The @load macro works from within a module, a package or a function, provided the relevant package providing the MLJ interface has been added to your package environment. It will attempt to load the model type into the global namespace of the module in which @load is invoked (Main if invoked at the REPL).

In general, the code providing core functionality for the model (living in a package you should consult for documentation) may be different from the package providing the MLJ interface. Since the core package is a dependency of the interface package, only the interface package needs to be added to your environment.

For instance, suppose you have activated a Julia package environment my_env that you wish to use for your MLJ project; for example, you have run:

using Pkg
 Pkg.activate("my_env", shared=true)

Furthermore, suppose you want to use DecisionTreeClassifier, provided by the DecisionTree.jl package. Then, to determine which package provides the MLJ interface you call load_path:

julia> load_path("DecisionTreeClassifier", pkg="DecisionTree")
 "MLJDecisionTreeInterface.DecisionTreeClassifier"

In this case, we see that the package required is MLJDecisionTreeInterface.jl. If this package is not in my_env (do Pkg.status() to check) you add it by running

julia> Pkg.add("MLJDecisionTreeInterface")

So long as my_env is the active environment, this action need never be repeated (unless you run Pkg.rm("MLJDecisionTreeInterface")). You are now ready to instantiate a decision tree classifier:

julia> Tree = @load DecisionTreeClassifier pkg=DecisionTree
 julia> tree = Tree()

which is equivalent to

julia> import MLJDecisionTreeInterface.DecisionTreeClassifier
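# A plausible completion of the lines elided by this hunk (a sketch): bind the imported
# type to a name and instantiate it, mirroring the `@load` workflow above.
julia> Tree = MLJDecisionTreeInterface.DecisionTreeClassifier

julia> tree = Tree()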
@@ -9,4 +9,4 @@
 tree2 = Tree(min_samples_split=6)
 
 SVM = @load SVC pkg=LIBSVM
-svm = SVM()

See also @iload

source
MLJModels.@iloadMacro
@iload ModelName

Interactive alternative to @load. Provides the user with an option to install (add) the required interface package to the current environment, and to choose the relevant model-providing package in ambiguous cases. See @load

source
+svm = SVM()

See also @iload

source
MLJModels.@iloadMacro
@iload ModelName

Interactive alternative to @load. Provides the user with an option to install (add) the required interface package to the current environment, and to choose the relevant model-providing package in ambiguous cases. See @load

source
diff --git a/dev/logging_workflows/index.html b/dev/logging_workflows/index.html index 9fe7c2d58..07b1653aa 100644 --- a/dev/logging_workflows/index.html +++ b/dev/logging_workflows/index.html @@ -1,2 +1,2 @@ -Logging Workflows using MLflow · MLJ

Logging Workflows

MLflow integration

MLflow is a popular, language-agnostic, tool for externally logging the outcomes of machine learning experiments, including those carried out using MLJ.

MLJ logging examples are given in the MLJFlow.jl documentation. MLJ includes and re-exports all the methods of MLJFlow.jl, so there is no need to import MLJFlow.jl if using MLJ.

Warning

MLJFlow.jl is a new package still under active development and should be regarded as experimental. At this time, breaking changes to MLJFlow.jl will not necessarily trigger new breaking releases of MLJ.jl.

+Logging Workflows using MLflow · MLJ

Logging Workflows

MLflow integration

MLflow is a popular, language-agnostic, tool for externally logging the outcomes of machine learning experiments, including those carried out using MLJ.

MLJ logging examples are given in the MLJFlow.jl documentation. MLJ includes and re-exports all the methods of MLJFlow.jl, so there is no need to import MLJFlow.jl if using MLJ.
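As a rough sketch only (consult the MLJFlow.jl documentation for the authoritative API), logging a performance evaluation to a locally running MLflow tracking server might look like the following; the server URL and the model, X, y bindings are assumptions:

using MLJ
logger = MLJFlow.Logger("http://127.0.0.1:5000/api")  # assumes an MLflow server is running locally
evaluate(model, X, y; resampling=CV(nfolds=5), measure=rms, logger=logger)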

Warning

MLJFlow.jl is a new package still under active development and should be regarded as experimental. At this time, breaking changes to MLJFlow.jl will not necessarily trigger new breaking releases of MLJ.jl.

diff --git a/dev/machines/index.html b/dev/machines/index.html index 3aa837e8f..d2969607b 100644 --- a/dev/machines/index.html +++ b/dev/machines/index.html @@ -1,13 +1,13 @@ -Machines · MLJ

Machines

Recall from Getting Started that a machine binds a model (i.e., a choice of algorithm + hyperparameters) to data (see more at Constructing machines below). A machine is also the object storing learned parameters. Under the hood, calling fit! on a machine calls either MLJBase.fit or MLJBase.update, depending on the machine's internal state (as recorded in private fields old_model and old_rows). These lower-level fit and update methods, which are not ordinarily called directly by the user, dispatch on the model and a view of the data defined by the optional rows keyword argument of fit! (all rows by default).

Warm restarts

If a model update method has been implemented for the model, calls to fit! will avoid redundant calculations for certain kinds of model mutations. The main use-case is increasing an iteration parameter, such as the number of epochs in a neural network. To test if SomeIterativeModel supports this feature, check iteration_parameter(SomeIterativeModel) is different from nothing.

tree = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()
+Machines · MLJ

Machines

Recall from Getting Started that a machine binds a model (i.e., a choice of algorithm + hyperparameters) to data (see more at Constructing machines below). A machine is also the object storing learned parameters. Under the hood, calling fit! on a machine calls either MLJBase.fit or MLJBase.update, depending on the machine's internal state (as recorded in private fields old_model and old_rows). These lower-level fit and update methods, which are not ordinarily called directly by the user, dispatch on the model and a view of the data defined by the optional rows keyword argument of fit! (all rows by default).

Warm restarts

If a model update method has been implemented for the model, calls to fit! will avoid redundant calculations for certain kinds of model mutations. The main use-case is increasing an iteration parameter, such as the number of epochs in a neural network. To test if SomeIterativeModel supports this feature, check iteration_parameter(SomeIterativeModel) is different from nothing.

tree = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()
 forest = EnsembleModel(model=tree, n=10);
 X, y = @load_iris;
 mach = machine(forest, X, y)
 fit!(mach, verbosity=2);
trained Machine; caches model-specific representations of data
   model: ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …)
   args: 
-    1:	Source @750 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @992 ⏎ AbstractVector{Multiclass{3}}
+    1:	Source @300 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @190 ⏎ AbstractVector{Multiclass{3}}
 

Generally, changing a hyperparameter triggers retraining on calls to subsequent fit!:

julia> forest.bagging_fraction = 0.5;
julia> fit!(mach, verbosity=2);
[ Info: Updating machine(ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), …).
[ Info: Truncating existing ensemble.

However, for this iterative model, increasing the iteration parameter only adds models to the existing ensemble:

julia> forest.n = 15;
julia> fit!(mach, verbosity=2);
[ Info: Updating machine(ProbabilisticEnsembleModel(model = DecisionTreeClassifier(max_depth = -1, …), …), …).
[ Info: Building on existing ensemble of length 10
@@ -18,7 +18,7 @@
 fit!(mach)
trained Machine; caches model-specific representations of data
   model: PCA(maxoutdim = 0, …)
   args: 
-    1:	Source @666 ⏎ Table{AbstractVector{Continuous}}
+    1:	Source @353 ⏎ Table{AbstractVector{Continuous}}
 
julia> fitted_params(mach)
(projection = [-0.36158967738145 0.6565398832858296 0.5809972798276162; 0.08226888989221415 0.7297123713264985 -0.5964180879380994; -0.8565721052905275 -0.175767403428653 -0.07252407548695988; -0.3588439262482158 -0.07470647013503479 -0.5490609107266099],)

julia> report(mach)
(indim = 4, outdim = 3, tprincipalvar = 4.545608248041779,
@@ -113,4 +113,4 @@
 verbosity=1,
 force=false,
 composite=nothing,
-)

Without mutating any other machine on which it may depend, perform one of the following actions to the machine mach, using the data and model bound to it, and restricting the data to rows if specified:

  • Ab initio training. Ignoring any previous learned parameters and cache, compute and store new learned parameters. Increment mach.state.

  • Training update. Making use of previous learned parameters and/or cache, replace or mutate existing learned parameters. The effect is the same (or nearly the same) as in ab initio training, but may be faster or use less memory, assuming the model supports an update option (implements MLJBase.update). Increment mach.state.

  • No-operation. Leave existing learned parameters untouched. Do not increment mach.state.

If the model, model, bound to mach is a symbol, then instead perform the action using the true model given by getproperty(composite, model). See also machine.

Training action logic

For the action to be a no-operation, either mach.frozen == true or none of the following apply:

  • (i) mach has never been trained (mach.state == 0).

  • (ii) force == true.

  • (iii) The state of some other machine on which mach depends has changed since the last time mach was trained (ie, the last time mach.state was last incremented).

  • (iv) The specified rows have changed since the last retraining and mach.model does not have Static type.

  • (v) mach.model is a model and different from the last model used for training, but has the same type.

  • (vi) mach.model is a model but has a type different from the last model used for training.

  • (vii) mach.model is a symbol and (composite, mach.model) is different from the last model used for training, but has the same type.

  • (viii) mach.model is a symbol and (composite, mach.model) has a different type from the last model used for training.

In any of the cases (i) - (iv), (vi), or (viii), mach is trained ab initio. If (v) or (vii) is true, then a training update is applied.

To freeze or unfreeze mach, use freeze!(mach) or thaw!(mach).

Implementation details

The data to which a machine is bound is stored in mach.args. Each element of args is either a Node object, or, in the case that concrete data was bound to the machine, it is concrete data wrapped in a Source node. In all cases, to obtain concrete data for actual training, each argument N is called, as in N() or N(rows=rows), and either MLJBase.fit (ab initio training) or MLJBase.update (training update) is dispatched on mach.model and this data. See the "Adding models for general use" section of the MLJ documentation for more on these lower-level training methods.

source
+)

Without mutating any other machine on which it may depend, perform one of the following actions to the machine mach, using the data and model bound to it, and restricting the data to rows if specified:

If the model, model, bound to mach is a symbol, then instead perform the action using the true model given by getproperty(composite, model). See also machine.

Training action logic

For the action to be a no-operation, either mach.frozen == true or none of the following apply:

In any of the cases (i) - (iv), (vi), or (viii), mach is trained ab initio. If (v) or (vii) is true, then a training update is applied.

To freeze or unfreeze mach, use freeze!(mach) or thaw!(mach).

Implementation details

The data to which a machine is bound is stored in mach.args. Each element of args is either a Node object, or, in the case that concrete data was bound to the machine, it is concrete data wrapped in a Source node. In all cases, to obtain concrete data for actual training, each argument N is called, as in N() or N(rows=rows), and either MLJBase.fit (ab initio training) or MLJBase.update (training update) is dispatched on mach.model and this data. See the "Adding models for general use" section of the MLJ documentation for more on these lower-level training methods.

source diff --git a/dev/mlj_cheatsheet/index.html b/dev/mlj_cheatsheet/index.html index bb00dea4e..07b3c581f 100644 --- a/dev/mlj_cheatsheet/index.html +++ b/dev/mlj_cheatsheet/index.html @@ -1,5 +1,5 @@ -MLJ Cheatsheet · MLJ

MLJ Cheatsheet

Starting an interactive MLJ session

julia> using MLJ
julia> MLJ_VERSION # version of MLJ for this cheatsheet
v"0.20.6"

Model search and code loading

info("PCA") retrieves registry metadata for the model called "PCA"

info("RidgeRegressor", pkg="MultivariateStats") retrieves metadata for "RidgeRegresssor", which is provided by multiple packages

doc("DecisionTreeClassifier", pkg="DecisionTree") retrieves the model document string for the classifier, without loading model code

models() lists metadata of every registered model.

models("Tree") lists models with "Tree" in the model or package name.

models(x -> x.is_supervised && x.is_pure_julia) lists all supervised models written in pure julia.

models(matching(X)) lists all unsupervised models compatible with input X.

models(matching(X, y)) lists all supervised models compatible with input/target X/y.

With additional conditions:

models() do model
+MLJ Cheatsheet · MLJ

MLJ Cheatsheet

Starting an interactive MLJ session

julia> using MLJ
julia> MLJ_VERSION # version of MLJ for this cheatsheet
v"0.20.6"

Model search and code loading

info("PCA") retrieves registry metadata for the model called "PCA"

info("RidgeRegressor", pkg="MultivariateStats") retrieves metadata for "RidgeRegresssor", which is provided by multiple packages

doc("DecisionTreeClassifier", pkg="DecisionTree") retrieves the model document string for the classifier, without loading model code

models() lists metadata of every registered model.

models("Tree") lists models with "Tree" in the model or package name.

models(x -> x.is_supervised && x.is_pure_julia) lists all supervised models written in pure julia.

models(matching(X)) lists all unsupervised models compatible with input X.

models(matching(X, y)) lists all supervised models compatible with input/target X/y.

With additional conditions:

models() do model
     matching(model, X, y) &&
     model.prediction_type == :probabilistic &&
     model.is_pure_julia
@@ -12,4 +12,4 @@
               !=(:Time);
               rng=123)

Here, y is assigned the :Exit column, and X is assigned the rest, except :Time.

Splitting row indices into train/validation/test, with seeded shuffling:

train, valid, test = partition(eachindex(y), 0.7, 0.2, rng=1234) # for 70:20:10 ratio

For a stratified split:

train, test = partition(eachindex(y), 0.8, stratify=y)

Split a table or matrix X, instead of indices:

Xtrain, Xvalid, Xtest = partition(X, 0.5, 0.3, rng=123)

Simultaneous splitting (needs multi=true):

(Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.8, rng=123, multi=true)

Getting data from OpenML:

table = OpenML.load(91)

Creating synthetic classification data:

X, y = make_blobs(100, 2)

(also: make_moons, make_circles, make_regression)

Creating synthetic regression data:

X, y = make_regression(100, 2)

Machine construction

Supervised case:

model = KNNRegressor(K=1)
 mach = machine(model, X, y)

Unsupervised case:

model = OneHotEncoder()
-mach = machine(model, X)

Fitting

The fit! function can be used to fit a machine (defaults shown):

fit!(mach, rows=1:100, verbosity=1, force=false)

Prediction

  • Supervised case: predict(mach, Xnew) or predict(mach, rows=1:100)

    For probabilistic models: predict_mode, predict_mean and predict_median.

  • Unsupervised case: W = transform(mach, Xnew) or inverse_transform(mach, W), etc.

Inspecting objects

info(ConstantRegressor()), info("PCA"), info("RidgeRegressor", pkg="MultivariateStats") gets all properties (aka traits) of registered models

schema(X) gets the column names, types, scitypes, and nrows of a table X

scitype(X) gets the scientific type of X

fitted_params(mach) gets learned parameters of the fitted machine

report(mach) gets other training results (e.g. feature rankings)

Saving and retrieving machines using Julia serializer

MLJ.save("my_machine.jls", mach) to save machine mach (without data)

predict_only_mach = machine("my_machine.jls") to deserialize.

Performance estimation

evaluate(model, X, y, resampling=CV(), measure=rms)
evaluate!(mach, resampling=Holdout(), measure=[rms, mav])
evaluate!(mach, resampling=[(fold1, fold2), (fold2, fold1)], measure=rms)

Resampling strategies (resampling=...)

Holdout(fraction_train=0.7, rng=1234) for simple holdout

CV(nfolds=6, rng=1234) for cross-validation

StratifiedCV(nfolds=6, rng=1234) for stratified cross-validation

TimeSeriesCV(nfolds=4) for time-series cross-validation

InSample(): test set = train set

or a list of pairs of row indices:

[(train1, eval1), (train2, eval2), ... (traink, evalk)]

Tuning model wrapper

tuned_model = TunedModel(model; tuning=RandomSearch(), resampling=Holdout(), measure=…, range=…)

Ranges for tuning (range=...)

If r = range(KNNRegressor(), :K, lower=1, upper = 20, scale=:log)

then Grid() search uses iterator(r, 6) == [1, 2, 3, 6, 11, 20].

lower=-Inf and upper=Inf are allowed.

Non-numeric ranges: r = range(model, :parameter, values=…)

Instead of model, declare type: r = range(Char, :c; values=['a', 'b'])

Nested ranges: Use dot syntax, as in r = range(EnsembleModel(atom=tree), :(atom.max_depth), ...)

Specify multiple ranges, as in range=[r1, r2, r3]. For more range options do ?Grid or ?RandomSearch
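Putting wrapper and range together (a sketch, assuming KNNRegressor has been loaded and data X, y are available):

knn = KNNRegressor()
r = range(knn, :K, lower=1, upper=20)
tuned_knn = TunedModel(knn; tuning=Grid(resolution=10), resampling=CV(nfolds=5), measure=rms, range=r)
mach = machine(tuned_knn, X, y) |> fit!
fitted_params(mach).best_model   # inspect the optimal K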

Tuning strategies

RandomSearch(rng=1234) for basic random search

Grid(resolution=10) or Grid(goal=50) for basic grid search

Also available: LatinHyperCube, Explicit (built-in), MLJTreeParzenTuning, ParticleSwarm, AdaptiveParticleSwarm (3rd-party packages)

Learning curves

For generating a plot of performance against parameter specified by range:

curve = learning_curve(mach, resolution=30, resampling=Holdout(), measure=…, range=…, n=1)
curve = learning_curve(model, X, y, resolution=30, resampling=Holdout(), measure=…, range=…, n=1)

If using Plots.jl:

plot(curve.parameter_values, curve.measurements, xlab=curve.parameter_name, xscale=curve.parameter_scale)

Controlling iterative models

Requires: using MLJIteration

iterated_model = IteratedModel(model=…, resampling=Holdout(), measure=…, controls=…, retrain=false)

Controls

Increment training: Step(n=1)

Stopping: TimeLimit(t=0.5) (in hours), NumberLimit(n=100), NumberSinceBest(n=6), NotANumber(), Threshold(value=0.0), GL(alpha=2.0), PQ(alpha=0.75, k=5), Patience(n=5)

Logging: Info(f=identity), Warn(f=""), Error(predicate, f="")

Callbacks: Callback(f=mach->nothing), WithNumberDo(f=n->@info(n)), WithIterationsDo(f=i->@info("num iterations: $i")), WithLossDo(f=x->@info("loss: $x")), WithTrainingLossesDo(f=v->@info(v))

Snapshots: Save(filename="machine.jlso")

Wraps: MLJIteration.skip(control, predicate=1), IterationControl.with_state_do(control)
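Putting these together (a sketch, assuming some iterative supervised model booster and data X, y; the control values are illustrative only):

using MLJIteration
iterated = IteratedModel(model=booster,
                         resampling=Holdout(fraction_train=0.8),
                         measure=rms,
                         controls=[Step(n=2), Patience(n=3), NumberLimit(n=100)],
                         retrain=false)
mach = machine(iterated, X, y) |> fit!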

Performance measures (metrics)

Do measures() to get full list.

Do measures("log") to list measures with "log" in doc-string.

Transformers

Built-ins include: Standardizer, OneHotEncoder, UnivariateBoxCoxTransformer, FeatureSelector, FillImputer, UnivariateDiscretizer, ContinuousEncoder, UnivariateTimeTypeToContinuous

Externals include: PCA (in MultivariateStats), KMeans, KMedoids (in Clustering).

models(m -> !m.is_supervised) to get full list

Ensemble model wrapper

EnsembleModel(model; weights=Float64[], bagging_fraction=0.8, rng=GLOBAL_RNG, n=100, parallel=true, out_of_bag_measure=[])

Target transformation wrapper

TransformedTargetModel(model; target=Standardizer())

Pipelines

pipe = (X -> coerce(X, :height=>Continuous)) |> OneHotEncoder |> KNNRegressor(K=3)
  • Unsupervised:

    pipe = Standardizer |> OneHotEncoder

  • Concatenation:

    pipe1 |> pipe2 or model |> pipe or pipe |> model, etc.

Advanced model composition techniques

See the Composing Models section of the MLJ manual.

+mach = machine(model, X)

Fitting

The fit! function can be used to fit a machine (defaults shown):

fit!(mach, rows=1:100, verbosity=1, force=false)

Prediction

  • Supervised case: predict(mach, Xnew) or predict(mach, rows=1:100)

    For probabilistic models: predict_mode, predict_mean and predict_median.

  • Unsupervised case: W = transform(mach, Xnew) or inverse_transform(mach, W), etc.

Inspecting objects

info(ConstantRegressor()), info("PCA"), info("RidgeRegressor", pkg="MultivariateStats") gets all properties (aka traits) of registered models

schema(X) gets the column names, types, scitypes, and nrows of a table X

scitype(X) gets the scientific type of X

fitted_params(mach) gets learned parameters of the fitted machine

report(mach) gets other training results (e.g. feature rankings)

Saving and retrieving machines using Julia serializer

MLJ.save("my_machine.jls", mach) to save machine mach (without data)

predict_only_mach = machine("my_machine.jls") to deserialize.

Performance estimation

evaluate(model, X, y, resampling=CV(), measure=rms)
evaluate!(mach, resampling=Holdout(), measure=[rms, mav])
evaluate!(mach, resampling=[(fold1, fold2), (fold2, fold1)], measure=rms)

Resampling strategies (resampling=...)

Holdout(fraction_train=0.7, rng=1234) for simple holdout

CV(nfolds=6, rng=1234) for cross-validation

StratifiedCV(nfolds=6, rng=1234) for stratified cross-validation

TimeSeriesCV(nfolds=4) for time-series cross-validation

InSample(): test set = train set

or a list of pairs of row indices:

[(train1, eval1), (train2, eval2), ... (traink, evalk)]
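A minimal sketch combining the two forms (ConstantRegressor and make_regression are used purely for illustration; any supervised model and compatible data would do):

X, y = make_regression(100, 3)                          ## synthetic table and Continuous target
evaluate(ConstantRegressor(), X, y,
         resampling=CV(nfolds=5), measure=rms)          ## a built-in resampling strategy
folds = [(1:50, 51:100), (51:100, 1:50)]                ## explicit (train, eval) row-index pairs
evaluate(ConstantRegressor(), X, y,
         resampling=folds, measure=rms)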

Tuning model wrapper

tuned_model = TunedModel(model; tuning=RandomSearch(), resampling=Holdout(), measure=…, range=…)

Ranges for tuning (range=...)

If r = range(KNNRegressor(), :K, lower=1, upper = 20, scale=:log)

then Grid() search uses iterator(r, 6) == [1, 2, 3, 6, 11, 20].

lower=-Inf and upper=Inf are allowed.

Non-numeric ranges: r = range(model, :parameter, values=…)

Instead of model, declare type: r = range(Char, :c; values=['a', 'b'])

Nested ranges: Use dot syntax, as in r = range(EnsembleModel(atom=tree), :(atom.max_depth), ...)

Specify multiple ranges, as in range=[r1, r2, r3]. For more range options do ?Grid or ?RandomSearch
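Putting the wrapper and a range together (a sketch, assuming the NearestNeighborModels package is installed, and that X, y are a Continuous-feature table and target):

KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels
knn = KNNRegressor()
r = range(knn, :K, lower=1, upper=20, scale=:log)
tuned_knn = TunedModel(knn; tuning=Grid(resolution=10),
                       resampling=CV(nfolds=3), measure=rms, range=r)
mach = machine(tuned_knn, X, y)
fit!(mach)
fitted_params(mach).best_model     ## the model with the optimal K found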

Tuning strategies

RandomSearch(rng=1234) for basic random search

Grid(resolution=10) or Grid(goal=50) for basic grid search

Also available: LatinHyperCube, Explicit (built-in), MLJTreeParzenTuning, ParticleSwarm, AdaptiveParticleSwarm (3rd-party packages)

Learning curves

For generating a plot of performance against parameter specified by range:

curve = learning_curve(mach, resolution=30, resampling=Holdout(), measure=…, range=…, n=1)
curve = learning_curve(model, X, y, resolution=30, resampling=Holdout(), measure=…, range=…, n=1)

If using Plots.jl:

plot(curve.parameter_values, curve.measurements, xlab=curve.parameter_name, xscale=curve.parameter_scale)

Controlling iterative models

Requires: using MLJIteration

iterated_model = IteratedModel(model=…, resampling=Holdout(), measure=…, controls=…, retrain=false)

Controls

Increment training: Step(n=1)

Stopping: TimeLimit(t=0.5) (in hours), NumberLimit(n=100), NumberSinceBest(n=6), NotANumber(), Threshold(value=0.0), GL(alpha=2.0), PQ(alpha=0.75, k=5), Patience(n=5)

Logging: Info(f=identity), Warn(f=""), Error(predicate, f="")

Callbacks: Callback(f=mach->nothing), WithNumberDo(f=n->@info(n)), WithIterationsDo(f=i->@info("num iterations: $i")), WithLossDo(f=x->@info("loss: $x")), WithTrainingLossesDo(f=v->@info(v))

Snapshots: Save(filename="machine.jlso")

Wraps: MLJIteration.skip(control, predicate=1), IterationControl.with_state_do(control)
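For instance (a sketch, assuming the EvoTrees package providing EvoTreeRegressor is installed, and X, y as above; any model with a known iteration parameter can be substituted):

using MLJIteration
EvoTreeRegressor = @load EvoTreeRegressor
iterated_model = IteratedModel(model=EvoTreeRegressor(),
                               resampling=Holdout(fraction_train=0.7),
                               measure=rms,
                               controls=[Step(n=1), Patience(n=5), NumberLimit(n=200)])
mach = machine(iterated_model, X, y)
fit!(mach)    ## adds iterations until one of the stopping controls triggers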

Performance measures (metrics)

Do measures() to get full list.

Do measures("log") to list measures with "log" in doc-string.

Transformers

Built-ins include: Standardizer, OneHotEncoder, UnivariateBoxCoxTransformer, FeatureSelector, FillImputer, UnivariateDiscretizer, ContinuousEncoder, UnivariateTimeTypeToContinuous

Externals include: PCA (in MultivariateStats), KMeans, KMedoids (in Clustering).

models(m -> !m.is_supervised) to get full list

Ensemble model wrapper

EnsembleModel(model; weights=Float64[], bagging_fraction=0.8, rng=GLOBAL_RNG, n=100, parallel=true, out_of_bag_measure=[])

Target transformation wrapper

TransformedTargetModel(model; target=Standardizer())

Pipelines

pipe = (X -> coerce(X, :height=>Continuous)) |> OneHotEncoder |> KNNRegressor(K=3)
  • Unsupervised:

    pipe = Standardizer |> OneHotEncoder

  • Concatenation:

    pipe1 |> pipe2 or model |> pipe or pipe |> model, etc.
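Putting a pipeline to work (a small sketch using only built-in transformers; X is assumed to be a table mixing finite and continuous columns):

pipe = OneHotEncoder() |> Standardizer()   ## one-hot encode, then standardize
mach = machine(pipe, X)
fit!(mach)
W = transform(mach, X)                     ## the transformed feature table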

Advanced model composition techniques

See the Composing Models section of the MLJ manual.

diff --git a/dev/model_browser/index.html b/dev/model_browser/index.html index ee63bd48a..e9d5c9708 100644 --- a/dev/model_browser/index.html +++ b/dev/model_browser/index.html @@ -1,2 +1,2 @@ -Model Browser · MLJ

Model Browser

Models may appear under multiple categories.

Below, an encoder is any transformer that does not fall under another category, such as "Missing Value Imputation" or "Dimension Reduction".

Categories

Regression | Classification | Outlier Detection | Iterative Models | Ensemble Models | Dimension Reduction | Clustering | Bayesian Models | Class Imbalance | Encoders | Meta Algorithms | Neural networks | Static Models | Missing Value Imputation | Distribution Fitter | Feature Engineering | Text Analysis | Image Processing

Regression

Classification

Outlier Detection

Iterative Models

Ensemble Models

Dimension Reduction

Clustering

Bayesian Models

Class Imbalance

Encoders

Meta Algorithms

Neural networks

Static Models

Missing Value Imputation

Distribution Fitter

Feature Engineering

Text Analysis

Image Processing

diff --git a/dev/model_search/index.html b/dev/model_search/index.html index 9dce0d378..b0ee99adf 100644 --- a/dev/model_search/index.html +++ b/dev/model_search/index.html @@ -1,5 +1,5 @@ -Model Search · MLJ

Model Search

MLJ has a model registry, allowing the user to search models and their properties, without loading all the packages containing model code. In turn, this allows one to efficiently find all models solving a given machine learning task. The task itself is specified with the help of the matching method, and the search executed with the models methods, as detailed below.

For commonly encountered problems with model search, see also Preparing Data.

A table of all models is also given at List of Supported Models.
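For example (a sketch; the iris dataset is used purely for illustration):

X, y = @load_iris
models(matching(X, y))     ## all registered models able to handle this task
models(m -> m.is_supervised && m.prediction_type == :probabilistic)   ## filter on traits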

Model metadata

Terminology. In this section the word "model" refers to a metadata entry in the model registry, as opposed to an actual model struct that such an entry represents. One can obtain such an entry with the info command:

julia> info("PCA")(name = "PCA",
  package_name = "MultivariateStats",
  is_supervised = false,
  abstract_type = Unsupervised,
@@ -128,4 +128,4 @@
 localmodels(filters...; modl=Main, wrappers=false)
 localmodels(needle::Union{AbstractString,Regex}; modl=Main, wrappers=false)

List all models currently available to the user from the module modl without importing a package, and which additionally pass through the specified filters. Here a filter is a Bool-valued function on models.

Use load_path to get the path to some model returned, as in these examples:

ms = localmodels()
 model = ms[1]
-load_path(model)

See also models, load_path.

source diff --git a/dev/model_stacking/index.html b/dev/model_stacking/index.html index 1d49f0ad2..80f27f1ad 100644 --- a/dev/model_stacking/index.html +++ b/dev/model_stacking/index.html @@ -1,5 +1,5 @@ -Model Stacking · MLJ

Model Stacking

In a model stack, as introduced by Wolpert (1992), an adjudicating model learns the best way to combine the predictions of multiple base models. In MLJ, such models are constructed using the Stack constructor. To learn more about stacking and to see how to construct a stack "by hand" using Learning Networks, see this Data Science in Julia tutorial

MLJBase.StackType
Stack(; metalearner=nothing, name1=model1, name2=model2, ..., keyword_options...)

Implements the two-layer generalized stack algorithm introduced by Wolpert (1992) and generalized by Van der Laan et al (2007). Returns an instance of type ProbabilisticStack or DeterministicStack, depending on the prediction type of metalearner.

When training a machine bound to such an instance:

  • The data is split into training/validation sets according to the specified resampling strategy.

  • Each base model model1, model2, ... is trained on each training subset and outputs predictions on the corresponding validation sets. The multi-fold predictions are spliced together into a so-called out-of-sample prediction for each model.

  • The adjudicating model, metalearner, is subsequently trained on the out-of-sample predictions to learn the best combination of base model predictions.

  • Each base model is retrained on all supplied data, so that new production data can be passed on to the adjudicator when making new predictions.

Arguments

  • metalearner::Supervised: The model that will optimize the desired criterion based on its internals. For instance, a LinearRegression model will optimize the squared error.

  • resampling: The resampling strategy used to prepare out-of-sample predictions of the base learners.

  • measures: A measure or iterable over measures, to perform an internal evaluation of the learners in the Stack while training. This is not for the evaluation of the Stack itself.

  • cache: Whether machines created in the learning network will cache data or not.

  • acceleration: A supported AbstractResource to define the training parallelization mode of the stack.

  • name1=model1, name2=model2, ...: the Supervised model instances to be used as base learners. The provided names become properties of the instance created to allow hyper-parameter access

Example

The following code defines a DeterministicStack instance for learning a Continuous target, and demonstrates that:

  • Base models can be Probabilistic models even if the stack itself is Deterministic (predict_mean is applied in such cases).

  • As an alternative to hyperparameter optimization, one can stack multiple copies of a given model, mutating the hyper-parameter used in each copy.

using MLJ
 
 DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree
 EvoTreeRegressor = @load EvoTreeRegressor
@@ -21,4 +21,4 @@
 
 mach = machine(stack, X, y)
 evaluate!(mach; resampling=Holdout(), measure=rmse)

The internal evaluation report can be accessed like this and provides a PerformanceEvaluation object for each model:

report(mach).cv_report
source
diff --git a/dev/models/ABODDetector_OutlierDetectionNeighbors/index.html b/dev/models/ABODDetector_OutlierDetectionNeighbors/index.html index 8717cfe7d..351f3c186 100644 --- a/dev/models/ABODDetector_OutlierDetectionNeighbors/index.html +++ b/dev/models/ABODDetector_OutlierDetectionNeighbors/index.html @@ -1,5 +1,5 @@ -ABODDetector · MLJ

ABODDetector

ABODDetector(k = 5,
              metric = Euclidean(),
              algorithm = :kdtree,
              static = :auto,
@@ -10,4 +10,4 @@
 detector = ABODDetector()
 X = rand(10, 100)
 model, result = fit(detector, X; verbosity=0)
-test_scores = transform(detector, model, X)

References

[1] Kriegel, Hans-Peter; Schubert, Matthias; Zimek, Arthur (2008): Angle-based outlier detection in high-dimensional data.

[2] Li, Xiaojie; Lv, Jian Cheng; Cheng, Dongdong (2015): Angle-Based Outlier Detection Algorithm with More Stable Relationships.

diff --git a/dev/models/ABODDetector_OutlierDetectionPython/index.html b/dev/models/ABODDetector_OutlierDetectionPython/index.html index 8bbf696c9..2135ce70d 100644 --- a/dev/models/ABODDetector_OutlierDetectionPython/index.html +++ b/dev/models/ABODDetector_OutlierDetectionPython/index.html @@ -1,3 +1,3 @@ -ABODDetector · MLJ
+ABODDetector · MLJ
diff --git a/dev/models/ARDRegressor_MLJScikitLearnInterface/index.html b/dev/models/ARDRegressor_MLJScikitLearnInterface/index.html index 5885d4cdc..4c57dc0e6 100644 --- a/dev/models/ARDRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/ARDRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -ARDRegressor · MLJ

ARDRegressor

ARDRegressor

A model type for constructing a Bayesian ARD regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ARDRegressor = @load ARDRegressor pkg=MLJScikitLearnInterface

Do model = ARDRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ARDRegressor(max_iter=...).

Hyper-parameters

  • max_iter = 300
  • tol = 0.001
  • alpha_1 = 1.0e-6
  • alpha_2 = 1.0e-6
  • lambda_1 = 1.0e-6
  • lambda_2 = 1.0e-6
  • compute_score = false
  • threshold_lambda = 10000.0
  • fit_intercept = true
  • copy_X = true
  • verbose = false
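No usage example is given above; the following minimal sketch assumes MLJScikitLearnInterface (and its scikit-learn dependency) is installed, and uses synthetic data purely for illustration:

using MLJ
ARDRegressor = @load ARDRegressor pkg=MLJScikitLearnInterface
model = ARDRegressor(max_iter=500)      ## override one default, as described above
X, y = make_regression(100, 5)
mach = machine(model, X, y)
fit!(mach)
yhat = predict(mach, X)                 ## predictions for the training table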
diff --git a/dev/models/AdaBoostClassifier_MLJScikitLearnInterface/index.html b/dev/models/AdaBoostClassifier_MLJScikitLearnInterface/index.html index eefb0d5a6..adf138fc7 100644 --- a/dev/models/AdaBoostClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/AdaBoostClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -AdaBoostClassifier · MLJ

AdaBoostClassifier

AdaBoostClassifier

A model type for constructing an AdaBoost classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

AdaBoostClassifier = @load AdaBoostClassifier pkg=MLJScikitLearnInterface

Do model = AdaBoostClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AdaBoostClassifier(estimator=...).

An AdaBoost classifier is a meta-estimator that begins by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same dataset but where the weights of incorrectly classified instances are adjusted such that subsequent classifiers focus more on difficult cases.

This class implements the algorithm known as AdaBoost-SAMME.
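A minimal usage sketch (assuming MLJScikitLearnInterface and its scikit-learn dependency are installed; the iris data is used purely for illustration):

using MLJ
AdaBoostClassifier = @load AdaBoostClassifier pkg=MLJScikitLearnInterface
clf = AdaBoostClassifier()
X, y = @load_iris
mach = machine(clf, X, y)
fit!(mach)
yhat = predict(mach, X)     ## predictions for the training table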

diff --git a/dev/models/AdaBoostRegressor_MLJScikitLearnInterface/index.html b/dev/models/AdaBoostRegressor_MLJScikitLearnInterface/index.html index 684228c88..5e46fc4ad 100644 --- a/dev/models/AdaBoostRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/AdaBoostRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -AdaBoostRegressor · MLJ

AdaBoostRegressor

AdaBoostRegressor

A model type for constructing an AdaBoost ensemble regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

AdaBoostRegressor = @load AdaBoostRegressor pkg=MLJScikitLearnInterface

Do model = AdaBoostRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AdaBoostRegressor(estimator=...).

An AdaBoost regressor is a meta-estimator that begins by fitting a regressor on the original dataset and then fits additional copies of the regressor on the same dataset but where the weights of instances are adjusted according to the error of the current prediction. As such, subsequent regressors focus more on difficult cases.

This class implements the algorithm known as AdaBoost.R2.

diff --git a/dev/models/AdaBoostStumpClassifier_DecisionTree/index.html b/dev/models/AdaBoostStumpClassifier_DecisionTree/index.html index 01879b128..d28706db3 100644 --- a/dev/models/AdaBoostStumpClassifier_DecisionTree/index.html +++ b/dev/models/AdaBoostStumpClassifier_DecisionTree/index.html @@ -1,5 +1,5 @@ -AdaBoostStumpClassifier · MLJ

AdaBoostStumpClassifier

AdaBoostStumpClassifier

A model type for constructing an Ada-boosted stump classifier, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

AdaBoostStumpClassifier = @load AdaBoostStumpClassifier pkg=DecisionTree

Do model = AdaBoostStumpClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AdaBoostStumpClassifier(n_iter=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • n_iter=10: number of iterations of AdaBoost
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic, but uncalibrated.
  • predict_mode(mach, Xnew): instead return the mode of each prediction above.

Fitted Parameters

The fields of fitted_params(mach) are:

  • stumps: the Ensemble object returned by the core DecisionTree.jl algorithm.
  • coefficients: the stump coefficients (one per stump)

Report

The fields of report(mach) are:

  • features: the names of the features encountered in training

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
 Booster = @load AdaBoostStumpClassifier pkg=DecisionTree
 booster = Booster(n_iter=15)
 
@@ -16,4 +16,4 @@
 
 fitted_params(mach).stumps ## raw `Ensemble` object from DecisionTree.jl
 fitted_params(mach).coefs  ## coefficient associated with each stump
-feature_importances(mach)

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.AdaBoostStumpClassifier.

diff --git a/dev/models/AffinityPropagation_MLJScikitLearnInterface/index.html b/dev/models/AffinityPropagation_MLJScikitLearnInterface/index.html index 315bc1fc6..08813dd09 100644 --- a/dev/models/AffinityPropagation_MLJScikitLearnInterface/index.html +++ b/dev/models/AffinityPropagation_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -AffinityPropagation · MLJ

AffinityPropagation

AffinityPropagation

A model type for constructing an Affinity Propagation clustering of data, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

AffinityPropagation = @load AffinityPropagation pkg=MLJScikitLearnInterface

Do model = AffinityPropagation() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AffinityPropagation(damping=...).

Hyper-parameters

  • damping = 0.5
  • max_iter = 200
  • convergence_iter = 15
  • copy = true
  • preference = nothing
  • affinity = euclidean
  • verbose = false
diff --git a/dev/models/AgglomerativeClustering_MLJScikitLearnInterface/index.html b/dev/models/AgglomerativeClustering_MLJScikitLearnInterface/index.html index 367adb579..f5d075503 100644 --- a/dev/models/AgglomerativeClustering_MLJScikitLearnInterface/index.html +++ b/dev/models/AgglomerativeClustering_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -AgglomerativeClustering · MLJ

AgglomerativeClustering

AgglomerativeClustering

A model type for constructing an agglomerative clustering model, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

AgglomerativeClustering = @load AgglomerativeClustering pkg=MLJScikitLearnInterface

Do model = AgglomerativeClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in AgglomerativeClustering(n_clusters=...).

Recursively merges the pair of clusters that minimally increases a given linkage distance. Note: there is no predict or transform. Instead, inspect the fitted_params.
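For example (a sketch, assuming MLJScikitLearnInterface and its scikit-learn dependency are installed; the exact fields available depend on the wrapped scikit-learn estimator):

using MLJ
AgglomerativeClustering = @load AgglomerativeClustering pkg=MLJScikitLearnInterface
model = AgglomerativeClustering(n_clusters=3)
X, _ = @load_iris              ## features only; the target is ignored
mach = machine(model, X)
fit!(mach)
fitted_params(mach)            ## inspect the learned cluster assignments here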

diff --git a/dev/models/AutoEncoder_BetaML/index.html b/dev/models/AutoEncoder_BetaML/index.html index 05f1c24eb..485bfbb5a 100644 --- a/dev/models/AutoEncoder_BetaML/index.html +++ b/dev/models/AutoEncoder_BetaML/index.html @@ -1,5 +1,5 @@ -AutoEncoder · MLJ

AutoEncoder

mutable struct AutoEncoder <: MLJModelInterface.Unsupervised

A ready-to-use AutoEncoder from the Beta Machine Learning Toolkit (BetaML), for encoding and decoding data using neural networks

Parameters:

  • encoded_size: The number of neurons (i.e. dimensions) of the encoded data. If the value is a float, it is considered a percentage (to be rounded) of the dimensionality of the data [def: 0.33]

  • layers_size: Inner layer dimension (i.e. number of neurons). If the value is a float, it is considered a percentage (to be rounded) of the dimensionality of the data [def: nothing, which applies a specific heuristic]. Consider that the underlying neural network is trying to predict multiple values at the same time. Normally this requires many more neurons than a scalar prediction. If e_layers or d_layers are specified, this parameter is ignored for the respective part.

  • e_layers: The layers (vector of AbstractLayers) responsible for the encoding of the data [def: nothing, i.e. two dense layers with the inner one of layers_size]. See subtypes(BetaML.AbstractLayer) for supported layers

  • d_layers: The layers (vector of AbstractLayers) responsible for the decoding of the data [def: nothing, i.e. two dense layers with the inner one of layers_size]. See subtypes(BetaML.AbstractLayer) for supported layers

  • loss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as (n x d) matrices.

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.

  • dloss: Derivative of the loss function [def: BetaML.dsquared_cost if loss==squared_cost, nothing otherwise, i.e. use the derivative of the squared cost or autodiff]

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 200]

  • batch_size: Size of each individual batch [def: 8]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()] See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • tunemethod: The method - and its parameters - to employ for hyperparameter autotuning. See SuccessiveHalvingSearch for the default method. To implement automatic hyperparameter tuning during the (first) fit! call, simply set autotune=true and optionally change the default tunemethod options (including the parameter ranges, the resources to employ and the loss function to adopt).

  • descr: An optional title and/or description for this model

  • rng: Random Number Generator (see FIXEDSEED) [default: Random.GLOBAL_RNG]

Notes:

  • data must be numerical
  • use transform to obtain the encoded data, and inverse_transform to decode back to the original data

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -59,4 +59,4 @@
 julia> BetaML.relative_mean_error(MLJ.matrix(X),X_recovered)
 0.03387721261716176
 
diff --git a/dev/models/BM25Transformer_MLJText/index.html b/dev/models/BM25Transformer_MLJText/index.html index a3c40615d..d7075d030 100644 --- a/dev/models/BM25Transformer_MLJText/index.html +++ b/dev/models/BM25Transformer_MLJText/index.html @@ -1,5 +1,5 @@ -BM25Transformer · MLJ

BM25Transformer

BM25Transformer

A model type for constructing a BM25 transformer, based on MLJText.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BM25Transformer = @load BM25Transformer pkg=MLJText

Do model = BM25Transformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BM25Transformer(max_doc_freq=...).

The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of Okapi BM25 document-word statistics. The BM25 scoring function uses both term frequency (TF) and inverse document frequency (IDF, defined below), as in TfidfTransformer, but additionally adjusts for the probability that a user will consider a search result relevant, based on the terms in the search query and those in each document.

In textbooks and implementations there is variation in the definition of IDF. Here two IDF definitions are available. The default, smoothed option provides the IDF for a term t as log((1 + n)/(1 + df(t))) + 1, where n is the total number of documents and df(t) the number of documents in which t appears. Setting smooth_idf = false provides an IDF of log(n/df(t)) + 1.
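For example, with n = 4 documents and a term t appearing in df(t) = 2 of them, the smoothed definition gives log(5/3) + 1 ≈ 1.51, while smooth_idf = false gives log(4/2) + 1 ≈ 1.69 (natural logarithms assumed here).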

References:

  • http://ethen8181.github.io/machine-learning/search/bm25_intro.html
  • https://en.wikipedia.org/wiki/Okapi_BM25
  • https://nlp.stanford.edu/IR-book/html/htmledition/okapi-bm25-a-non-binary-model-1.html

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any vector whose elements are either tokenized documents or bags of words/ngrams. Specifically, each element is one of the following:

    • A vector of abstract strings (tokens), e.g., ["I", "like", "Sam", ".", "Sam", "is", "nice", "."] (scitype AbstractVector{Textual})
    • A dictionary of counts, indexed on abstract strings, e.g., Dict("I"=>1, "Sam"=>2, "Sam is"=>1) (scitype Multiset{Textual}})
    • A dictionary of counts, indexed on plain ngrams, e.g., Dict(("I",)=>1, ("Sam",)=>2, ("I", "Sam")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.
  • min_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. A value of 0.01 means that only terms that are in at least 1% of the documents will be included.
  • κ=2: The term frequency saturation characteristic. Higher values represent slower saturation. What we mean by saturation is the degree to which a term occurring extra times adds to the overall score.
  • β=0.75: Amplifies the particular document length compared to the average length. The bigger β is, the more document length is amplified in terms of the overall score. The default value is 0.75, and the bounds are restricted between 0 and 1.
  • smooth_idf=true: Control which definition of IDF to use (see above).

Operations

  • transform(mach, Xnew): Based on the vocabulary, IDF, and mean word counts learned in training, return the matrix of BM25 scores for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.

Fitted parameters

The fields of fitted_params(mach) are:

  • vocab: A vector containing the strings used in the transformer's vocabulary.
  • idf_vector: The transformer's calculated IDF vector.
  • mean_words_in_docs: The mean number of words in each document.

Examples

BM25Transformer accepts a variety of inputs. The example below transforms tokenized documents:

using MLJ
 import TextAnalysis
 
 BM25Transformer = @load BM25Transformer pkg=MLJText
@@ -43,4 +43,4 @@
 MLJ.fit!(mach)
 fitted_params(mach)
 
-tfidf_mat = transform(mach, ngram_docs)

See also TfidfTransformer, CountTransformer

diff --git a/dev/models/BaggingClassifier_MLJScikitLearnInterface/index.html b/dev/models/BaggingClassifier_MLJScikitLearnInterface/index.html index e9ee6db73..b072ab9de 100644 --- a/dev/models/BaggingClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/BaggingClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -BaggingClassifier · MLJ

BaggingClassifier

BaggingClassifier

A model type for constructing a bagging ensemble classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BaggingClassifier = @load BaggingClassifier pkg=MLJScikitLearnInterface

Do model = BaggingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BaggingClassifier(estimator=...).

A Bagging classifier is an ensemble meta-estimator that fits base classifiers, each on random subsets of the original dataset, and then aggregates their individual predictions (either by voting or by averaging) to form a final prediction. Such a meta-estimator can typically be used as a way to reduce the variance of a black-box estimator (e.g., a decision tree), by introducing randomization into its construction procedure and then making an ensemble out of it.

diff --git a/dev/models/BaggingRegressor_MLJScikitLearnInterface/index.html b/dev/models/BaggingRegressor_MLJScikitLearnInterface/index.html index 5135980c4..bd76d0280 100644 --- a/dev/models/BaggingRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/BaggingRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -BaggingRegressor · MLJ

BaggingRegressor

BaggingRegressor

A model type for constructing a bagging ensemble regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BaggingRegressor = @load BaggingRegressor pkg=MLJScikitLearnInterface

Do model = BaggingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BaggingRegressor(estimator=...).

A Bagging regressor is an ensemble meta-estimator that fits base regressors, each on random subsets of the original dataset, and then aggregates their individual predictions (either by voting or by averaging) to form a final prediction. Such a meta-estimator can typically be used as a way to reduce the variance of a black-box estimator (e.g., a decision tree), by introducing randomization into its construction procedure and then making an ensemble out of it.

diff --git a/dev/models/BalancedBaggingClassifier_MLJBalancing/index.html b/dev/models/BalancedBaggingClassifier_MLJBalancing/index.html index abe05faa8..89527d18e 100644 --- a/dev/models/BalancedBaggingClassifier_MLJBalancing/index.html +++ b/dev/models/BalancedBaggingClassifier_MLJBalancing/index.html @@ -1,5 +1,5 @@ -BalancedBaggingClassifier · MLJ

BalancedBaggingClassifier

BalancedBaggingClassifier

A model type for constructing a balanced bagging classifier, based on MLJBalancing.jl.

From MLJ, the type can be imported using

BalancedBaggingClassifier = @load BalancedBaggingClassifier pkg=MLJBalancing

Construct an instance with default hyper-parameters using the syntax bagging_model = BalancedBaggingClassifier(model=...)

Given a probabilistic classifier, BalancedBaggingClassifier performs bagging by undersampling only the majority data in each bag, so that each bag includes as many samples as the minority data. This approach, with an AdaBoost classifier whose output scores are averaged, was proposed in Xu-Ying Liu, Jianxin Wu, & Zhi-Hua Zhou (2009). Exploratory Undersampling for Class-Imbalance Learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 39(2), 539–550.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: input features of a form supported by the model being wrapped (typically a table, e.g., a DataFrame; at a minimum, Continuous columns will be supported)
  • y: the binary target, which can be any AbstractVector where length(unique(y)) == 2

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • model::Probabilistic: The classifier to use to train on each bag.
  • T::Integer=0: The number of bags to be used in the ensemble. If not given, will be set as the ratio between the frequency of the majority and minority classes. Can be later found in report(mach).
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object, or an Integer seed to be used with Xoshiro if the Julia VERSION>=1.7; otherwise, MersenneTwister is used.

Operations

  • predict(mach, Xnew): return predictions of the target given

features Xnew having the same scitype as X above. Predictions are probabilistic, but uncalibrated.

  • predict_mode(mach, Xnew): return the mode of each prediction above

Example

using MLJ
 using Imbalance
 
 ## Load base classifier and BalancedBaggingClassifier
@@ -24,4 +24,4 @@
 ## Predict using the trained model
 
 yhat = predict(mach, X)     ## probabilistic predictions
-predict_mode(mach, X)       ## point predictions
+predict_mode(mach, X) ## point predictions
diff --git a/dev/models/BalancedModel_MLJBalancing/index.html b/dev/models/BalancedModel_MLJBalancing/index.html index 955418b02..24bbb60e4 100644 --- a/dev/models/BalancedModel_MLJBalancing/index.html +++ b/dev/models/BalancedModel_MLJBalancing/index.html @@ -1,5 +1,5 @@ -BalancedModel · MLJ

BalancedModel

BalancedModel(; model=nothing, balancer1=balancer_model1, balancer2=balancer_model2, ...)
 BalancedModel(model;  balancer1=balancer_model1, balancer2=balancer_model2, ...)

Given a classification model and one or more balancer models that all implement the MLJModelInterface, BalancedModel constructs a sequential pipeline that wraps an arbitrary number of balancing models and a classifier together.

Operation

  • During training, data is first passed to balancer1 and the result is passed to balancer2 and so on, the result from the final balancer is then passed to the classifier for training.
  • During prediction, the balancers have no effect.

Arguments

  • model::Supervised: A classification model that implements the MLJModelInterface.
  • balancer1::Static=...: The first balancer model to pass the data to. This keyword argument can have any name.
  • balancer2::Static=...: The second balancer model to pass the data to. This keyword argument can have any name.
  • and so on for an arbitrary number of balancers.

Returns

  • An instance of type ProbabilisticBalancedModel or DeterministicBalancedModel, depending on the prediction type of model.

Example

using MLJ
 using Imbalance
 
@@ -20,4 +20,4 @@
 
 ## now this behaves as a unified model that can be trained, validated, fine-tuned, etc.
 mach = machine(balanced_model, X, y)
-fit!(mach)
+fit!(mach)
diff --git a/dev/models/BayesianLDA_MLJScikitLearnInterface/index.html b/dev/models/BayesianLDA_MLJScikitLearnInterface/index.html index 6f75c63a8..7d2c34c4b 100644 --- a/dev/models/BayesianLDA_MLJScikitLearnInterface/index.html +++ b/dev/models/BayesianLDA_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -BayesianLDA · MLJ

BayesianLDA

BayesianLDA

A model type for constructing a Bayesian linear discriminant analysis, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianLDA = @load BayesianLDA pkg=MLJScikitLearnInterface

Do model = BayesianLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianLDA(solver=...).

Hyper-parameters

  • solver = svd
  • shrinkage = nothing
  • priors = nothing
  • n_components = nothing
  • store_covariance = false
  • tol = 0.0001
  • covariance_estimator = nothing
+BayesianLDA · MLJ

BayesianLDA

BayesianLDA

A model type for constructing a Bayesian linear discriminant analysis, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianLDA = @load BayesianLDA pkg=MLJScikitLearnInterface

Do model = BayesianLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianLDA(solver=...).

Hyper-parameters

  • solver = svd
  • shrinkage = nothing
  • priors = nothing
  • n_components = nothing
  • store_covariance = false
  • tol = 0.0001
  • covariance_estimator = nothing
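
Since only hyper-parameter defaults are listed for this wrapper, here is a minimal usage sketch. It assumes MLJScikitLearnInterface is installed, uses synthetic make_blobs data, and mirrors the fit/predict pattern used for other classifiers in this manual; none of it is taken from the original docstring.

using MLJ

BayesianLDA = @load BayesianLDA pkg=MLJScikitLearnInterface

## toy three-class data with Continuous features (illustrative only)
X, y = make_blobs(150, 4; centers=3)

model = BayesianLDA()        ## defaults as listed above
mach = machine(model, X, y)
fit!(mach)

yhat = predict(mach, X)      ## predictions (probabilistic, assuming the usual classifier interface)
predict_mode(mach, X)        ## point predictions
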
diff --git a/dev/models/BayesianLDA_MultivariateStats/index.html b/dev/models/BayesianLDA_MultivariateStats/index.html index 694c3a6a8..7a632be1a 100644 --- a/dev/models/BayesianLDA_MultivariateStats/index.html +++ b/dev/models/BayesianLDA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -BayesianLDA · MLJ

BayesianLDA

BayesianLDA

A model type for constructing a Bayesian LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianLDA = @load BayesianLDA pkg=MultivariateStats

Do model = BayesianLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianLDA(method=...).

The Bayesian multiclass LDA algorithm learns a projection matrix as described in ordinary LDA. Predicted class posterior probability distributions are derived by applying Bayes' rule with a multivariate Gaussian class-conditional distribution. A prior class distribution can be specified by the user or inferred from training data class frequency.

See also the package documentation. For more information about the algorithm, see Li, Zhu and Ogihara (2006): Using Discriminant Analysis for Multi-class Classification: An Experimental Investigation.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • method::Symbol=:gevd: choice of solver, one of :gevd or :whiten methods.
  • cov_w::StatsBase.SimpleCovariance(): An estimator for the within-class covariance (used in computing the within-class scatter matrix, Sw). Any robust estimator from CovarianceEstimation.jl can be used.
  • cov_b::StatsBase.SimpleCovariance(): The same as cov_w but for the between-class covariance (used in computing the between-class scatter matrix, Sb).
  • outdim::Int=0: The output dimension, i.e., dimension of the transformed space, automatically set to min(indim, nclasses-1) if equal to 0.
  • regcoef::Float64=1e-6: The regularization coefficient. A positive value regcoef*eigmax(Sw) where Sw is the within-class scatter matrix, is added to the diagonal of Sw to improve numerical stability. This can be useful if using the standard covariance estimator.
  • priors::Union{Nothing, UnivariateFinite{<:Any, <:Any, <:Any, <:Real}, Dict{<:Any, <:Real}} = nothing: For use in prediction with Bayes rule. If priors = nothing then priors are estimated from the class proportions in the training data. Otherwise it requires a Dict or UnivariateFinite object specifying the classes with non-zero probabilities in the training target.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).
  • priors: The class priors for classification. As inferred from training target y, if not user-specified. A UnivariateFinite object with levels consistent with levels(y).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space i.e the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The mean of the untransformed training data. A vector of length indim.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).
  • class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).
  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • Sb: The between class scatter matrix.
  • Sw: The within class scatter matrix.

Examples

using MLJ
+BayesianLDA · MLJ

BayesianLDA

BayesianLDA

A model type for constructing a Bayesian LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianLDA = @load BayesianLDA pkg=MultivariateStats

Do model = BayesianLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianLDA(method=...).

The Bayesian multiclass LDA algorithm learns a projection matrix as described in ordinary LDA. Predicted class posterior probability distributions are derived by applying Bayes' rule with a multivariate Gaussian class-conditional distribution. A prior class distribution can be specified by the user or inferred from training data class frequency.

See also the package documentation. For more information about the algorithm, see Li, Zhu and Ogihara (2006): Using Discriminant Analysis for Multi-class Classification: An Experimental Investigation.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • method::Symbol=:gevd: choice of solver, one of :gevd or :whiten methods.
  • cov_w::StatsBase.SimpleCovariance(): An estimator for the within-class covariance (used in computing the within-class scatter matrix, Sw). Any robust estimator from CovarianceEstimation.jl can be used.
  • cov_b::StatsBase.SimpleCovariance(): The same as cov_w but for the between-class covariance (used in computing the between-class scatter matrix, Sb).
  • outdim::Int=0: The output dimension, i.e., dimension of the transformed space, automatically set to min(indim, nclasses-1) if equal to 0.
  • regcoef::Float64=1e-6: The regularization coefficient. A positive value regcoef*eigmax(Sw) where Sw is the within-class scatter matrix, is added to the diagonal of Sw to improve numerical stability. This can be useful if using the standard covariance estimator.
  • priors::Union{Nothing, UnivariateFinite{<:Any, <:Any, <:Any, <:Real}, Dict{<:Any, <:Real}} = nothing: For use in prediction with Bayes rule. If priors = nothing then priors are estimated from the class proportions in the training data. Otherwise it requires a Dict or UnivariateFinite object specifying the classes with non-zero probabilities in the training target.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).
  • priors: The class priors for classification. As inferred from training target y, if not user-specified. A UnivariateFinite object with levels consistent with levels(y).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space i.e the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The mean of the untransformed training data. A vector of length indim.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).
  • class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).
  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • Sb: The between class scatter matrix.
  • Sw: The within class scatter matrix.

Examples

using MLJ
 
 BayesianLDA = @load BayesianLDA pkg=MultivariateStats
 
@@ -10,4 +10,4 @@
 
 Xproj = transform(mach, X)
 y_hat = predict(mach, X)
-labels = predict_mode(mach, X)

See also LDA, SubspaceLDA, BayesianSubspaceLDA

+labels = predict_mode(mach, X)

See also LDA, SubspaceLDA, BayesianSubspaceLDA

diff --git a/dev/models/BayesianQDA_MLJScikitLearnInterface/index.html b/dev/models/BayesianQDA_MLJScikitLearnInterface/index.html index c9640c9df..743cf9f4b 100644 --- a/dev/models/BayesianQDA_MLJScikitLearnInterface/index.html +++ b/dev/models/BayesianQDA_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -BayesianQDA · MLJ

BayesianQDA

BayesianQDA

A model type for constructing a Bayesian quadratic discriminant analysis, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianQDA = @load BayesianQDA pkg=MLJScikitLearnInterface

Do model = BayesianQDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianQDA(priors=...).

Hyper-parameters

  • priors = nothing
  • reg_param = 0.0
  • store_covariance = false
  • tol = 0.0001
+BayesianQDA · MLJ

BayesianQDA

BayesianQDA

A model type for constructing a Bayesian quadratic discriminant analysis, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianQDA = @load BayesianQDA pkg=MLJScikitLearnInterface

Do model = BayesianQDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianQDA(priors=...).

Hyper-parameters

  • priors = nothing
  • reg_param = 0.0
  • store_covariance = false
  • tol = 0.0001
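
As above, no example accompanies this entry, so the following is a hedged sketch using synthetic data; only the reg_param keyword is taken from the hyper-parameter list above.

using MLJ

BayesianQDA = @load BayesianQDA pkg=MLJScikitLearnInterface

X, y = make_blobs(150, 4; centers=2)   ## illustrative two-class data

model = BayesianQDA(reg_param=0.1)     ## override one default from the list above
mach = machine(model, X, y)
fit!(mach)

yhat = predict(mach, X)
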
diff --git a/dev/models/BayesianRidgeRegressor_MLJScikitLearnInterface/index.html b/dev/models/BayesianRidgeRegressor_MLJScikitLearnInterface/index.html index 9c93d27f1..3e8a8e7b6 100644 --- a/dev/models/BayesianRidgeRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/BayesianRidgeRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -BayesianRidgeRegressor · MLJ

BayesianRidgeRegressor

BayesianRidgeRegressor

A model type for constructing a Bayesian ridge regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianRidgeRegressor = @load BayesianRidgeRegressor pkg=MLJScikitLearnInterface

Do model = BayesianRidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianRidgeRegressor(max_iter=...).

Hyper-parameters

  • max_iter = 300
  • tol = 0.001
  • alpha_1 = 1.0e-6
  • alpha_2 = 1.0e-6
  • lambda_1 = 1.0e-6
  • lambda_2 = 1.0e-6
  • compute_score = false
  • fit_intercept = true
  • copy_X = true
  • verbose = false
+BayesianRidgeRegressor · MLJ

BayesianRidgeRegressor

BayesianRidgeRegressor

A model type for constructing a Bayesian ridge regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianRidgeRegressor = @load BayesianRidgeRegressor pkg=MLJScikitLearnInterface

Do model = BayesianRidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianRidgeRegressor(max_iter=...).

Hyper-parameters

  • max_iter = 300
  • tol = 0.001
  • alpha_1 = 1.0e-6
  • alpha_2 = 1.0e-6
  • lambda_1 = 1.0e-6
  • lambda_2 = 1.0e-6
  • compute_score = false
  • fit_intercept = true
  • copy_X = true
  • verbose = false
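
A minimal sketch of typical use, assuming MLJScikitLearnInterface is installed; the make_regression data and the particular keyword values are illustrative, with max_iter and tol taken from the defaults listed above.

using MLJ

BayesianRidgeRegressor = @load BayesianRidgeRegressor pkg=MLJScikitLearnInterface

X, y = make_regression(100, 5)   ## illustrative synthetic regression data

model = BayesianRidgeRegressor(max_iter=500, tol=1e-4)
mach = machine(model, X, y)
fit!(mach)

yhat = predict(mach, X)
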
diff --git a/dev/models/BayesianSubspaceLDA_MultivariateStats/index.html b/dev/models/BayesianSubspaceLDA_MultivariateStats/index.html index 9e8c57508..e5efcedc2 100644 --- a/dev/models/BayesianSubspaceLDA_MultivariateStats/index.html +++ b/dev/models/BayesianSubspaceLDA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -BayesianSubspaceLDA · MLJ

BayesianSubspaceLDA

BayesianSubspaceLDA

A model type for constructing a Bayesian subspace LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianSubspaceLDA = @load BayesianSubspaceLDA pkg=MultivariateStats

Do model = BayesianSubspaceLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianSubspaceLDA(normalize=...).

The Bayesian multiclass subspace linear discriminant analysis algorithm learns a projection matrix as described in SubspaceLDA. The posterior class probability distribution is derived as in BayesianLDA.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • normalize=true: Option to normalize the between class variance for the number of observations in each class, one of true or false.

  • outdim: the output dimension, automatically set to min(indim, nclasses-1) if equal to 0. If a non-zero outdim is passed, then the actual output dimension used is min(rank, outdim) where rank is the rank of the within-class covariance matrix.

  • priors::Union{Nothing, UnivariateFinite{<:Any, <:Any, <:Any, <:Real}, Dict{<:Any, <:Real}} = nothing: For use in prediction with Bayes rule. If priors = nothing then priors are estimated from the class proportions in the training data. Otherwise it requires a Dict or UnivariateFinite object specifying the classes with non-zero probabilities in the training target.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).
  • priors: The class priors for classification. As inferred from training target y, if not user-specified. A UnivariateFinite object with levels consistent with levels(y).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space i.e the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The overall mean of the training data.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).

  • class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).

  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • explained_variance_ratio: The ratio of explained variance to total variance. Each dimension corresponds to an eigenvalue.

Examples

using MLJ
+BayesianSubspaceLDA · MLJ

BayesianSubspaceLDA

BayesianSubspaceLDA

A model type for constructing a Bayesian subspace LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BayesianSubspaceLDA = @load BayesianSubspaceLDA pkg=MultivariateStats

Do model = BayesianSubspaceLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BayesianSubspaceLDA(normalize=...).

The Bayesian multiclass subspace linear discriminant analysis algorithm learns a projection matrix as described in SubspaceLDA. The posterior class probability distribution is derived as in BayesianLDA.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • normalize=true: Option to normalize the between class variance for the number of observations in each class, one of true or false.

  • outdim: the output dimension, automatically set to min(indim, nclasses-1) if equal to 0. If a non-zero outdim is passed, then the actual output dimension used is min(rank, outdim) where rank is the rank of the within-class covariance matrix.

  • priors::Union{Nothing, UnivariateFinite{<:Any, <:Any, <:Any, <:Real}, Dict{<:Any, <:Real}} = nothing: For use in prediction with Bayes rule. If priors = nothing then priors are estimated from the class proportions in the training data. Otherwise it requires a Dict or UnivariateFinite object specifying the classes with non-zero probabilities in the training target.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).
  • priors: The class priors for classification. As inferred from training target y, if not user-specified. A UnivariateFinite object with levels consistent with levels(y).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space i.e the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The overall mean of the training data.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).

  • class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).

  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • explained_variance_ratio: The ratio of explained variance to total variance. Each dimension corresponds to an eigenvalue.

Examples

using MLJ
 
 BayesianSubspaceLDA = @load BayesianSubspaceLDA pkg=MultivariateStats
 
@@ -10,4 +10,4 @@
 
 Xproj = transform(mach, X)
 y_hat = predict(mach, X)
-labels = predict_mode(mach, X)

See also LDA, BayesianLDA, SubspaceLDA

+labels = predict_mode(mach, X)

See also LDA, BayesianLDA, SubspaceLDA

diff --git a/dev/models/BernoulliNBClassifier_MLJScikitLearnInterface/index.html b/dev/models/BernoulliNBClassifier_MLJScikitLearnInterface/index.html index 27a8e3b1e..309dcafe1 100644 --- a/dev/models/BernoulliNBClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/BernoulliNBClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -BernoulliNBClassifier · MLJ

BernoulliNBClassifier

BernoulliNBClassifier

A model type for constructing a Bernoulli naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BernoulliNBClassifier = @load BernoulliNBClassifier pkg=MLJScikitLearnInterface

Do model = BernoulliNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BernoulliNBClassifier(alpha=...).

Binomial naive Bayes classifier. It is suitable for classification with binary features; features will be binarized based on the binarize keyword (unless it's nothing, in which case the features are assumed to be binary).

+BernoulliNBClassifier · MLJ

BernoulliNBClassifier

BernoulliNBClassifier

A model type for constructing a Bernoulli naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BernoulliNBClassifier = @load BernoulliNBClassifier pkg=MLJScikitLearnInterface

Do model = BernoulliNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BernoulliNBClassifier(alpha=...).

Binomial naive Bayes classifier. It is suitable for classification with binary features; features will be binarized based on the binarize keyword (unless it's nothing, in which case the features are assumed to be binary).

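A minimal sketch of typical use, assuming MLJScikitLearnInterface is installed. The alpha keyword comes from the construction example above; the synthetic make_blobs data is illustrative and relies on the binarize threshold to turn Continuous features into binary ones.

using MLJ

BernoulliNBClassifier = @load BernoulliNBClassifier pkg=MLJScikitLearnInterface

## illustrative data; Continuous features are binarized internally via the binarize keyword
X, y = make_blobs(100, 3; centers=2)

model = BernoulliNBClassifier(alpha=1.0)
mach = machine(model, X, y)
fit!(mach)

yhat = predict(mach, X)
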
diff --git a/dev/models/BinaryThresholdPredictor_MLJModels/index.html b/dev/models/BinaryThresholdPredictor_MLJModels/index.html index 757c5c2fc..31ca91351 100644 --- a/dev/models/BinaryThresholdPredictor_MLJModels/index.html +++ b/dev/models/BinaryThresholdPredictor_MLJModels/index.html @@ -1,5 +1,5 @@ -BinaryThresholdPredictor · MLJ

BinaryThresholdPredictor

BinaryThresholdPredictor(model; threshold=0.5)

Wrap the Probabilistic model, model, assumed to support binary classification, as a Deterministic model, by applying the specified threshold to the positive class probability. In addition to conventional supervised classifiers, it can also be applied to outlier detection models that predict normalized scores - in the form of appropriate UnivariateFinite distributions - that is, models that subtype AbstractProbabilisticUnsupervisedDetector or AbstractProbabilisticSupervisedDetector.

By convention the positive class is the second class returned by levels(y), where y is the target.

If threshold=0.5 then calling predict on the wrapped model is equivalent to calling predict_mode on the atomic model.

Example

Below is an application to the well-known Pima Indian diabetes dataset, including optimization of the threshold parameter, with balanced accuracy as the objective. The target class distribution is 500 positives to 268 negatives.

Loading the data:

using MLJ, Random
+BinaryThresholdPredictor · MLJ

BinaryThresholdPredictor

BinaryThresholdPredictor(model; threshold=0.5)

Wrap the Probabilistic model, model, assumed to support binary classification, as a Deterministic model, by applying the specified threshold to the positive class probability. In addition to conventional supervised classifiers, it can also be applied to outlier detection models that predict normalized scores - in the form of appropriate UnivariateFinite distributions - that is, models that subtype AbstractProbabilisticUnsupervisedDetector or AbstractProbabilisticSupervisedDetector.

By convention the positive class is the second class returned by levels(y), where y is the target.

If threshold=0.5 then calling predict on the wrapped model is equivalent to calling predict_mode on the atomic model.

Example

Below is an application to the well-known Pima Indian diabetes dataset, including optimization of the threshold parameter, with balanced accuracy as the objective. The target class distribution is 500 positives to 268 negatives.

Loading the data:

using MLJ, Random
 rng = Xoshiro(123)
 
 diabetes = OpenML.load(43582)
@@ -23,4 +23,4 @@
 optimized_point_predictor = report(mach2).best_model
 optimized_point_predictor.threshold ## 0.260
 predict(mach2, X)[1:3] ## [1, 1, 0]

Estimating the performance of the auto-thresholding model (nested resampling here):

e = evaluate!(mach2, resampling=CV(nfolds=6), measure=[balanced, accuracy])
-e.measurement[1] ## 0.477 ± 0.110
+e.measurement[1] ## 0.477 ± 0.110
diff --git a/dev/models/Birch_MLJScikitLearnInterface/index.html b/dev/models/Birch_MLJScikitLearnInterface/index.html index 703236b90..7f1e030d4 100644 --- a/dev/models/Birch_MLJScikitLearnInterface/index.html +++ b/dev/models/Birch_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -Birch · MLJ

Birch

Birch

A model type for constructing a birch, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

Birch = @load Birch pkg=MLJScikitLearnInterface

Do model = Birch() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Birch(threshold=...).

Memory-efficient, online-learning algorithm provided as an alternative to MiniBatchKMeans. Note: noisy samples are given the label -1.

+Birch · MLJ

Birch

Birch

A model type for constructing a birch, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

Birch = @load Birch pkg=MLJScikitLearnInterface

Do model = Birch() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Birch(threshold=...).

Memory-efficient, online-learning algorithm provided as an alternative to MiniBatchKMeans. Note: noisy samples are given the label -1.
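
No example is given for this wrapper, so here is a hedged sketch. The threshold keyword comes from the construction example above; the use of predict to obtain cluster assignments is assumed by analogy with the other scikit-learn clusterers in this manual, not taken from the docstring.

using MLJ

Birch = @load Birch pkg=MLJScikitLearnInterface

X, _ = make_blobs(200, 2; centers=3)   ## unlabelled toy data (labels discarded)

model = Birch(threshold=0.5)
mach = machine(model, X)               ## unsupervised: no target
fit!(mach)

labels = predict(mach, X)              ## cluster assignments (assumed interface)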

diff --git a/dev/models/BisectingKMeans_MLJScikitLearnInterface/index.html b/dev/models/BisectingKMeans_MLJScikitLearnInterface/index.html index c6dcda0fc..b88587c52 100644 --- a/dev/models/BisectingKMeans_MLJScikitLearnInterface/index.html +++ b/dev/models/BisectingKMeans_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -BisectingKMeans · MLJ

BisectingKMeans

BisectingKMeans

A model type for constructing a bisecting k means, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BisectingKMeans = @load BisectingKMeans pkg=MLJScikitLearnInterface

Do model = BisectingKMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BisectingKMeans(n_clusters=...).

Bisecting K-Means clustering.

+BisectingKMeans · MLJ

BisectingKMeans

BisectingKMeans

A model type for constructing a bisecting k means, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BisectingKMeans = @load BisectingKMeans pkg=MLJScikitLearnInterface

Do model = BisectingKMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BisectingKMeans(n_clusters=...).

Bisecting K-Means clustering.
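
As with Birch above, a hedged usage sketch; n_clusters comes from the construction example above, and the predict call for cluster assignments is an assumption, not from the docstring.

using MLJ

BisectingKMeans = @load BisectingKMeans pkg=MLJScikitLearnInterface

X, _ = make_blobs(200, 2; centers=4)    ## unlabelled toy data

model = BisectingKMeans(n_clusters=4)
mach = machine(model, X)
fit!(mach)

labels = predict(mach, X)               ## cluster assignments (assumed interface)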

diff --git a/dev/models/BorderlineSMOTE1_Imbalance/index.html b/dev/models/BorderlineSMOTE1_Imbalance/index.html index 7579c6bdd..429d86c9f 100644 --- a/dev/models/BorderlineSMOTE1_Imbalance/index.html +++ b/dev/models/BorderlineSMOTE1_Imbalance/index.html @@ -1,5 +1,5 @@ -BorderlineSMOTE1 · MLJ

BorderlineSMOTE1

Initiate a BorderlineSMOTE1 model with the given hyper-parameters.

BorderlineSMOTE1

A model type for constructing a BorderlineSMOTE1 sampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BorderlineSMOTE1 = @load BorderlineSMOTE1 pkg=Imbalance

Do model = BorderlineSMOTE1() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BorderlineSMOTE1(m=...).

BorderlineSMOTE1 implements the BorderlineSMOTE1 algorithm to correct for class imbalance as in Han, H., Wang, W.-Y., & Mao, B.-H. (2005). Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning. In D.S. Huang, X.-P. Zhang, & G.-B. Huang (Eds.), Advances in Intelligent Computing (pp. 878-887). Springer.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = BorderlineSMOTE1()

Hyperparameters

  • m::Integer=5: The number of neighbors to consider while checking the BorderlineSMOTE1 condition. Should be within the range 0 < m < N where N is the number of observations in the data. It will be automatically set to N-1 if N ≤ m.

  • k::Integer=5: Number of nearest neighbors to consider in the SMOTE part of the algorithm. Should be within the range 0 < k < n where n is the number of observations in the smallest class. It will be automatically set to l-1 for any class with l points where l ≤ k.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

  • verbosity::Integer=1: Whenever higher than 0, information about the points that will participate in oversampling is logged.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or table respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using BorderlineSMOTE1, returning both the new and original observations

Example

using MLJ
+BorderlineSMOTE1 · MLJ

BorderlineSMOTE1

Initiate a BorderlineSMOTE1 model with the given hyper-parameters.

BorderlineSMOTE1

A model type for constructing a BorderlineSMOTE1 sampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

BorderlineSMOTE1 = @load BorderlineSMOTE1 pkg=Imbalance

Do model = BorderlineSMOTE1() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in BorderlineSMOTE1(m=...).

BorderlineSMOTE1 implements the BorderlineSMOTE1 algorithm to correct for class imbalance as in Han, H., Wang, W.-Y., & Mao, B.-H. (2005). Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning. In D.S. Huang, X.-P. Zhang, & G.-B. Huang (Eds.), Advances in Intelligent Computing (pp. 878-887). Springer.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = BorderlineSMOTE1()

Hyperparameters

  • m::Integer=5: The number of neighbors to consider while checking the BorderlineSMOTE1 condition. Should be within the range 0 < m < N where N is the number of observations in the data. It will be automatically set to N-1 if N ≤ m.

  • k::Integer=5: Number of nearest neighbors to consider in the SMOTE part of the algorithm. Should be within the range 0 < k < n where n is the number of observations in the smallest class. It will be automatically set to l-1 for any class with l points where l ≤ k.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

  • verbosity::Integer=1: Whenever higher than 0, information about the points that will participate in oversampling is logged.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or table respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using BorderlineSMOTE1, returning both the new and original observations

Example

using MLJ
 import Imbalance
 
 ## set probability of each class
@@ -28,4 +28,4 @@
 julia> Imbalance.checkbalance(yover)
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 392 (80.0%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 441 (90.0%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 490 (100.0%) 
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 490 (100.0%)
diff --git a/dev/models/CBLOFDetector_OutlierDetectionPython/index.html b/dev/models/CBLOFDetector_OutlierDetectionPython/index.html index 48525f422..3478a82df 100644 --- a/dev/models/CBLOFDetector_OutlierDetectionPython/index.html +++ b/dev/models/CBLOFDetector_OutlierDetectionPython/index.html @@ -1,7 +1,7 @@ -CBLOFDetector · MLJ

CBLOFDetector

CBLOFDetector(n_clusters = 8,
              …
              n_jobs = 1)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.cblof
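
Only the constructor fragment and the PyOD link survive in this entry. The sketch below assumes the detector follows the same raw OutlierDetection fit/transform pattern as the COFDetector example later in this manual; the @load incantation and the calls are assumptions by analogy, not taken from the CBLOF docstring.

using MLJ

CBLOFDetector = @load CBLOFDetector pkg=OutlierDetectionPython   ## assumed package name

detector = CBLOFDetector(n_clusters = 8)     ## n_clusters as in the fragment above
X = rand(10, 100)                            ## 10 features by 100 observations
model, result = fit(detector, X; verbosity=0)
test_scores = transform(detector, model, X)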

diff --git a/dev/models/CDDetector_OutlierDetectionPython/index.html b/dev/models/CDDetector_OutlierDetectionPython/index.html index 7251495cb..7ced74fca 100644 --- a/dev/models/CDDetector_OutlierDetectionPython/index.html +++ b/dev/models/CDDetector_OutlierDetectionPython/index.html @@ -1,3 +1,3 @@ -CDDetector · MLJ
+CDDetector · MLJ
diff --git a/dev/models/COFDetector_OutlierDetectionNeighbors/index.html b/dev/models/COFDetector_OutlierDetectionNeighbors/index.html index 26bbbb037..82b37c734 100644 --- a/dev/models/COFDetector_OutlierDetectionNeighbors/index.html +++ b/dev/models/COFDetector_OutlierDetectionNeighbors/index.html @@ -1,5 +1,5 @@ -COFDetector · MLJ

COFDetector

COFDetector(k = 5,
+COFDetector · MLJ

COFDetector

COFDetector(k = 5,
             metric = Euclidean(),
             algorithm = :kdtree,
             leafsize = 10,
@@ -8,4 +8,4 @@
 detector = COFDetector()
 X = rand(10, 100)
 model, result = fit(detector, X; verbosity=0)
-test_scores = transform(detector, model, X)

References

[1] Tang, Jian; Chen, Zhixiang; Fu, Ada Wai-Chee; Cheung, David Wai-Lok (2002): Enhancing Effectiveness of Outlier Detections for Low Density Patterns.

+test_scores = transform(detector, model, X)

References

[1] Tang, Jian; Chen, Zhixiang; Fu, Ada Wai-Chee; Cheung, David Wai-Lok (2002): Enhancing Effectiveness of Outlier Detections for Low Density Patterns.

diff --git a/dev/models/COFDetector_OutlierDetectionPython/index.html b/dev/models/COFDetector_OutlierDetectionPython/index.html index b64880066..53d59aeea 100644 --- a/dev/models/COFDetector_OutlierDetectionPython/index.html +++ b/dev/models/COFDetector_OutlierDetectionPython/index.html @@ -1,3 +1,3 @@ -COFDetector · MLJ
+COFDetector · MLJ
diff --git a/dev/models/COPODDetector_OutlierDetectionPython/index.html b/dev/models/COPODDetector_OutlierDetectionPython/index.html index 309193805..56c97fe4f 100644 --- a/dev/models/COPODDetector_OutlierDetectionPython/index.html +++ b/dev/models/COPODDetector_OutlierDetectionPython/index.html @@ -1,2 +1,2 @@ -COPODDetector · MLJ
+COPODDetector · MLJ
diff --git a/dev/models/CatBoostClassifier_CatBoost/index.html b/dev/models/CatBoostClassifier_CatBoost/index.html index fc5254229..c26d290d8 100644 --- a/dev/models/CatBoostClassifier_CatBoost/index.html +++ b/dev/models/CatBoostClassifier_CatBoost/index.html @@ -1,5 +1,5 @@ -CatBoostClassifier · MLJ

CatBoostClassifier

CatBoostClassifier

A model type for constructing a CatBoost classifier, based on CatBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

CatBoostClassifier = @load CatBoostClassifier pkg=CatBoost

Do model = CatBoostClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CatBoostClassifier(iterations=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, Finite, Textual; check column scitypes with schema(X). Textual columns will be passed to catboost as text_features, Multiclass columns will be passed to catboost as cat_features, and OrderedFactor columns will be converted to integers.
  • y: the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

For more details on the catboost hyper-parameters, see the Python docs: https://catboost.ai/en/docs/concepts/python-reference_catboostclassifier#parameters

Operations

  • predict(mach, Xnew): probabilistic predictions of the target given new features Xnew having the same scitype as X above.
  • predict_mode(mach, Xnew): returns the mode of each of the predictions above.

Accessor functions

  • feature_importances(mach): return vector of feature importances, in the form of feature::Symbol => importance::Real pairs

Fitted parameters

The fields of fitted_params(mach) are:

  • model: The Python CatBoostClassifier model

Report

The fields of report(mach) are:

  • feature_importances: Vector{Pair{Symbol, Float64}} of feature importances

Examples

using CatBoost.MLJCatBoostInterface
+CatBoostClassifier · MLJ

CatBoostClassifier

CatBoostClassifier

A model type for constructing a CatBoost classifier, based on CatBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

CatBoostClassifier = @load CatBoostClassifier pkg=CatBoost

Do model = CatBoostClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CatBoostClassifier(iterations=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, Finite, Textual; check column scitypes with schema(X). Textual columns will be passed to catboost as text_features, Multiclass columns will be passed to catboost as cat_features, and OrderedFactor columns will be converted to integers.
  • y: the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

For more details on the catboost hyper-parameters, see the Python docs: https://catboost.ai/en/docs/concepts/python-reference_catboostclassifier#parameters

Operations

  • predict(mach, Xnew): probabilistic predictions of the target given new features Xnew having the same scitype as X above.
  • predict_mode(mach, Xnew): returns the mode of each of the predictions above.

Accessor functions

  • feature_importances(mach): return vector of feature importances, in the form of feature::Symbol => importance::Real pairs

Fitted parameters

The fields of fitted_params(mach) are:

  • model: The Python CatBoostClassifier model

Report

The fields of report(mach) are:

  • feature_importances: Vector{Pair{Symbol, Float64}} of feature importances

Examples

using CatBoost.MLJCatBoostInterface
 using MLJ
 
 X = (
@@ -13,4 +13,4 @@
 mach = machine(model, X, y)
 fit!(mach)
 probs = predict(mach, X)
-preds = predict_mode(mach, X)

See also catboost and the unwrapped model type CatBoost.CatBoostClassifier.

+preds = predict_mode(mach, X)

See also catboost and the unwrapped model type CatBoost.CatBoostClassifier.

diff --git a/dev/models/CatBoostRegressor_CatBoost/index.html b/dev/models/CatBoostRegressor_CatBoost/index.html index c53a4a28f..414a079c1 100644 --- a/dev/models/CatBoostRegressor_CatBoost/index.html +++ b/dev/models/CatBoostRegressor_CatBoost/index.html @@ -1,5 +1,5 @@ -CatBoostRegressor · MLJ

CatBoostRegressor

CatBoostRegressor

A model type for constructing a CatBoost regressor, based on CatBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

CatBoostRegressor = @load CatBoostRegressor pkg=CatBoost

Do model = CatBoostRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CatBoostRegressor(iterations=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, Finite, Textual; check column scitypes with schema(X). Textual columns will be passed to catboost as text_features, Multiclass columns will be passed to catboost as cat_features, and OrderedFactor columns will be converted to integers.
  • y: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

For more details on the catboost hyper-parameters, see the Python docs: https://catboost.ai/en/docs/concepts/python-reference_catboostclassifier#parameters

Operations

  • predict(mach, Xnew): probabilistic predictions of the target given new features Xnew having the same scitype as X above.

Accessor functions

  • feature_importances(mach): return vector of feature importances, in the form of feature::Symbol => importance::Real pairs

Fitted parameters

The fields of fitted_params(mach) are:

  • model: The Python CatBoostRegressor model

Report

The fields of report(mach) are:

  • feature_importances: Vector{Pair{Symbol, Float64}} of feature importances

Examples

using CatBoost.MLJCatBoostInterface
+CatBoostRegressor · MLJ

CatBoostRegressor

CatBoostRegressor

A model type for constructing a CatBoost regressor, based on CatBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

CatBoostRegressor = @load CatBoostRegressor pkg=CatBoost

Do model = CatBoostRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CatBoostRegressor(iterations=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, Finite, Textual; check column scitypes with schema(X). Textual columns will be passed to catboost as text_features, Multiclass columns will be passed to catboost as cat_features, and OrderedFactor columns will be converted to integers.
  • y: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

For more details on the catboost hyper-parameters, see the Python docs: https://catboost.ai/en/docs/concepts/python-reference_catboostclassifier#parameters

Operations

  • predict(mach, Xnew): probabilistic predictions of the target given new features Xnew having the same scitype as X above.

Accessor functions

  • feature_importances(mach): return vector of feature importances, in the form of feature::Symbol => importance::Real pairs

Fitted parameters

The fields of fitted_params(mach) are:

  • model: The Python CatBoostRegressor model

Report

The fields of report(mach) are:

  • feature_importances: Vector{Pair{Symbol, Float64}} of feature importances

Examples

using CatBoost.MLJCatBoostInterface
 using MLJ
 
 X = (
@@ -12,4 +12,4 @@
 model = CatBoostRegressor(iterations=5)
 mach = machine(model, X, y)
 fit!(mach)
-preds = predict(mach, X)

See also catboost and the unwrapped model type CatBoost.CatBoostRegressor.

+preds = predict(mach, X)

See also catboost and the unwrapped model type CatBoost.CatBoostRegressor.

diff --git a/dev/models/ClusterUndersampler_Imbalance/index.html b/dev/models/ClusterUndersampler_Imbalance/index.html index 8c4a74df8..a44d3f7de 100644 --- a/dev/models/ClusterUndersampler_Imbalance/index.html +++ b/dev/models/ClusterUndersampler_Imbalance/index.html @@ -1,5 +1,5 @@ -ClusterUndersampler · MLJ

ClusterUndersampler

Initiate a cluster undersampling model with the given hyper-parameters.

ClusterUndersampler

A model type for constructing a cluster undersampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ClusterUndersampler = @load ClusterUndersampler pkg=Imbalance

Do model = ClusterUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ClusterUndersampler(mode=...).

ClusterUndersampler implements clustering undersampling as presented in Wei-Chao, L., Chih-Fong, T., Ya-Han, H., & Jing-Shang, J. (2017). Clustering-based undersampling in class-imbalanced data. Information Sciences, 409–410, 17–26. with K-means as the clustering algorithm.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed with model = ClusterUndersampler().

Hyperparameters

  • mode::AbstractString="nearest": If "center" then the undersampled data will consist of the centroids of each cluster found; if "nearest" then it will consist of the nearest neighbor of each centroid.
  • ratios=1.0: A parameter that controls the amount of undersampling to be done for each class

    • Can be a float and in this case each class will be undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • maxiter::Integer=100: Maximum number of iterations to run K-means

  • rng::Integer=42: Random number generator seed. Must be an integer.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively
  • y_under: An abstract vector of labels corresponding to X_under

Operations

  • transform(mach, X, y): resample the data X and y using ClusterUndersampler, returning the undersampled versions

Example

using MLJ
+ClusterUndersampler · MLJ

ClusterUndersampler

Initiate a cluster undersampling model with the given hyper-parameters.

ClusterUndersampler

A model type for constructing a cluster undersampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ClusterUndersampler = @load ClusterUndersampler pkg=Imbalance

Do model = ClusterUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ClusterUndersampler(mode=...).

ClusterUndersampler implements clustering undersampling as presented in Wei-Chao, L., Chih-Fong, T., Ya-Han, H., & Jing-Shang, J. (2017). Clustering-based undersampling in class-imbalanced data. Information Sciences, 409–410, 17–26. with K-means as the clustering algorithm.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed with model = ClusterUndersampler().

Hyperparameters

  • mode::AbstractString="nearest": If "center" then the undersampled data will consist of the centroids of each cluster found; if "nearest" then it will consist of the nearest neighbor of each centroid.
  • ratios=1.0: A parameter that controls the amount of undersampling to be done for each class

    • Can be a float and in this case each class will be undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • maxiter::Integer=100: Maximum number of iterations to run K-means

  • rng::Integer=42: Random number generator seed. Must be an integer.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively
  • y_under: An abstract vector of labels corresponding to X_under

Operations

  • transform(mach, X, y): resample the data X and y using ClusterUndersampler, returning the undersampled versions

Example

using MLJ
 import Imbalance
 
 ## set probability of each class
@@ -29,4 +29,4 @@
 julia> Imbalance.checkbalance(y_under; ref="minority")
 0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) 
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) 
-1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%)
+1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%)
diff --git a/dev/models/ComplementNBClassifier_MLJScikitLearnInterface/index.html b/dev/models/ComplementNBClassifier_MLJScikitLearnInterface/index.html index 8133be99a..6a688001c 100644 --- a/dev/models/ComplementNBClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/ComplementNBClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -ComplementNBClassifier · MLJ

ComplementNBClassifier

ComplementNBClassifier

A model type for constructing a Complement naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ComplementNBClassifier = @load ComplementNBClassifier pkg=MLJScikitLearnInterface

Do model = ComplementNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ComplementNBClassifier(alpha=...).

Similar to MultinomialNBClassifier but with more robust assumptions. Suited for imbalanced datasets.

+ComplementNBClassifier · MLJ

ComplementNBClassifier

ComplementNBClassifier

A model type for constructing a Complement naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ComplementNBClassifier = @load ComplementNBClassifier pkg=MLJScikitLearnInterface

Do model = ComplementNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ComplementNBClassifier(alpha=...).

Similar to MultinomialNBClassifier but with more robust assumptions. Suited for imbalanced datasets.
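
No example accompanies this entry, so here is a hedged sketch; the alpha keyword comes from the construction example above, and the count-like feature table is illustrative, reflecting the multinomial family's expectation of non-negative features.

using MLJ

ComplementNBClassifier = @load ComplementNBClassifier pkg=MLJScikitLearnInterface

## illustrative non-negative, count-like features
X = (word_a = float.(rand(0:5, 100)),
     word_b = float.(rand(0:5, 100)))
y = coerce(rand(["ham", "spam"], 100), Multiclass)

model = ComplementNBClassifier(alpha=1.0)
mach = machine(model, X, y)
fit!(mach)

yhat = predict(mach, X)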

diff --git a/dev/models/ConstantClassifier_MLJModels/index.html b/dev/models/ConstantClassifier_MLJModels/index.html index e473746ca..0c70fb30b 100644 --- a/dev/models/ConstantClassifier_MLJModels/index.html +++ b/dev/models/ConstantClassifier_MLJModels/index.html @@ -1,5 +1,5 @@ -ConstantClassifier · MLJ

ConstantClassifier

ConstantClassifier

This "dummy" probabilistic predictor always returns the same distribution, irrespective of the provided input pattern. The distribution d returned is the UnivariateFinite distribution based on frequency of classes observed in the training target data. So, pdf(d, level) is the number of times the training target takes on the value level. Use predict_mode instead of predict to obtain the training target mode instead. For more on the UnivariateFinite type, see the CategoricalDistributions.jl package.

Almost any reasonable model is expected to outperform ConstantClassifier, which is used almost exclusively for testing and establishing performance baselines.

In MLJ (or MLJModels) do model = ConstantClassifier() to construct an instance.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame)
  • y is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with schema(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

None.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew (which for this model are ignored). Predictions are probabilistic.
  • predict_mode(mach, Xnew): Return the mode of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • target_distribution: The distribution fit to the supplied target data.

Examples

using MLJ
+ConstantClassifier · MLJ

ConstantClassifier

ConstantClassifier

This "dummy" probabilistic predictor always returns the same distribution, irrespective of the provided input pattern. The distribution d returned is the UnivariateFinite distribution based on frequency of classes observed in the training target data. So, pdf(d, level) is the number of times the training target takes on the value level. Use predict_mode instead of predict to obtain the training target mode instead. For more on the UnivariateFinite type, see the CategoricalDistributions.jl package.

Almost any reasonable model is expected to outperform ConstantClassifier, which is used almost exclusively for testing and establishing performance baselines.

In MLJ (or MLJModels) do model = ConstantClassifier() to construct an instance.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame)
  • y is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with schema(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

None.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew (which for this model are ignored). Predictions are probabilistic.
  • predict_mode(mach, Xnew): Return the mode of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • target_distribution: The distribution fit to the supplied target data.

Examples

using MLJ
 
 clf = ConstantClassifier()
 
@@ -26,4 +26,4 @@
 pdf(yhat, L)
 
 ## point predictions:
-predict_mode(mach, Xnew)

See also ConstantRegressor

+predict_mode(mach, Xnew)

See also ConstantRegressor

diff --git a/dev/models/ConstantRegressor_MLJModels/index.html b/dev/models/ConstantRegressor_MLJModels/index.html index 3e55a162b..bf9579337 100644 --- a/dev/models/ConstantRegressor_MLJModels/index.html +++ b/dev/models/ConstantRegressor_MLJModels/index.html @@ -1,5 +1,5 @@ -ConstantRegressor · MLJ

ConstantRegressor

ConstantRegressor

This "dummy" probabilistic predictor always returns the same distribution, irrespective of the provided input pattern. The distribution returned is the one of the type specified that best fits the training target data. Use predict_mean or predict_median to predict the mean or median values instead. If not specified, a normal distribution is fit.

Almost any reasonable model is expected to outperform ConstantRegressor which is used almost exclusively for testing and establishing performance baselines.

In MLJ (or MLJModels) do model = ConstantRegressor() or model = ConstantRegressor(distribution_type=...) to construct a model instance.

Training data

In MLJ (or MLJBase) bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with schema(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • distribution_type=Distributions.Normal: The distribution to be fit to the target data. Must be a subtype of Distributions.ContinuousUnivariateDistribution.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew (which for this model are ignored). Predictions are probabilistic.
  • predict_mean(mach, Xnew): Return instead the means of the probabilistic predictions returned above.
  • predict_median(mach, Xnew): Return instead the medians of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • target_distribution: The distribution fit to the supplied target data.

Examples

using MLJ
+ConstantRegressor · MLJ

ConstantRegressor

ConstantRegressor

This "dummy" probabilistic predictor always returns the same distribution, irrespective of the provided input pattern. The distribution returned is the one of the type specified that best fits the training target data. Use predict_mean or predict_median to predict the mean or median values instead. If not specified, a normal distribution is fit.

Almost any reasonable model is expected to outperform ConstantRegressor which is used almost exclusively for testing and establishing performance baselines.

In MLJ (or MLJModels) do model = ConstantRegressor() or model = ConstantRegressor(distribution_type=...) to construct a model instance.
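As a quick, hedged sketch of the default (normal-distribution) behaviour on synthetic data, the constant prediction is just a normal distribution fitted to the training target, so its mean coincides with the target mean:

using MLJ, Statistics
X, y = make_regression(20, 2)   ## synthetic data
mach = machine(ConstantRegressor(), X, y) |> fit!
predict_mean(mach, X)[1] ≈ mean(y)   ## true: the constant prediction is the target mean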

Training data

In MLJ (or MLJBase) bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with schema(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • distribution_type=Distributions.Normal: The distribution to be fit to the target data. Must be a subtype of Distributions.ContinuousUnivariateDistribution.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew (which for this model are ignored). Predictions are probabilistic.
  • predict_mean(mach, Xnew): Return instead the means of the probabilistic predictions returned above.
  • predict_median(mach, Xnew): Return instead the medians of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • target_distribution: The distribution fit to the supplied target data.

Examples

using MLJ
 
 X, y = make_regression(10, 2) ## synthetic data: a table and vector
 regressor = ConstantRegressor()
@@ -10,4 +10,4 @@
 Xnew, _ = make_regression(3, 2)
 predict(mach, Xnew)
 predict_mean(mach, Xnew)
-

See also ConstantClassifier

+

See also ConstantClassifier

diff --git a/dev/models/ContinuousEncoder_MLJModels/index.html b/dev/models/ContinuousEncoder_MLJModels/index.html index b37936bec..59890f8a6 100644 --- a/dev/models/ContinuousEncoder_MLJModels/index.html +++ b/dev/models/ContinuousEncoder_MLJModels/index.html @@ -1,5 +1,5 @@ -ContinuousEncoder · MLJ

ContinuousEncoder

ContinuousEncoder

A model type for constructing a continuous encoder, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ContinuousEncoder = @load ContinuousEncoder pkg=MLJModels

Do model = ContinuousEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ContinuousEncoder(drop_last=...).

Use this model to arrange all features (columns) of a table to have Continuous element scitype, by applying the following protocol to each feature ftr:

  • If ftr is already Continuous retain it.
  • If ftr is Multiclass, one-hot encode it.
  • If ftr is OrderedFactor, replace it with coerce(ftr, Continuous) (a vector of floating-point numbers with integer values), unless one_hot_ordered_factors=true is specified, in which case one-hot encode it.
  • If ftr is Count, replace it with coerce(ftr, Continuous).
  • If ftr has some other element scitype, or was not observed in fitting the encoder, drop it from the table.

Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.

To selectively one-hot-encode categorical features (without dropping columns) use OneHotEncoder instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • drop_last=true: whether to drop the column corresponding to the final class of one-hot encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.
  • one_hot_ordered_factors=false: whether to one-hot encode any feature with OrderedFactor element scitype, or to instead coerce it directly to a (single) Continuous feature using its level order.

Fitted parameters

The fields of fitted_params(mach) are:

  • features_to_keep: names of features that will not be dropped from the table
  • one_hot_encoder: the OneHotEncoder model instance for handling the one-hot encoding
  • one_hot_encoder_fitresult: the fitted parameters of the OneHotEncoder model

Report

  • features_to_keep: names of input features that will not be dropped from the table
  • new_features: names of all output features

Example

X = (name=categorical(["Danesh", "Lee", "Mary", "John"]),
+ContinuousEncoder · MLJ

ContinuousEncoder

ContinuousEncoder

A model type for constructing a continuous encoder, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ContinuousEncoder = @load ContinuousEncoder pkg=MLJModels

Do model = ContinuousEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ContinuousEncoder(drop_last=...).

Use this model to arrange all features (columns) of a table to have Continuous element scitype, by applying the following protocol to each feature ftr:

  • If ftr is already Continuous retain it.
  • If ftr is Multiclass, one-hot encode it.
  • If ftr is OrderedFactor, replace it with coerce(ftr, Continuous) (a vector of floating-point numbers with integer values), unless one_hot_ordered_factors=true is specified, in which case one-hot encode it.
  • If ftr is Count, replace it with coerce(ftr, Continuous).
  • If ftr has some other element scitype, or was not observed in fitting the encoder, drop it from the table.

Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.

To selectively one-hot-encode categorical features (without dropping columns) use OneHotEncoder instead.
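Before the detailed walk-through below, here is a minimal, hedged sketch on a made-up table mixing OrderedFactor, Continuous and Count columns:

using MLJ
X = (grade=coerce(["A", "B", "A"], OrderedFactor),
     height=[1.85, 1.67, 1.5],
     n_devices=[3, 2, 4])
mach = machine(ContinuousEncoder(), X) |> fit!
W = transform(mach, X)
schema(W)   ## every column now has Continuous scitype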

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • drop_last=true: whether to drop the column corresponding to the final class of one-hot encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.
  • one_hot_ordered_factors=false: whether to one-hot encode any feature with OrderedFactor element scitype, or to instead coerce it directly to a (single) Continuous feature using its level order.

Fitted parameters

The fields of fitted_params(mach) are:

  • features_to_keep: names of features that will not be dropped from the table
  • one_hot_encoder: the OneHotEncoder model instance for handling the one-hot encoding
  • one_hot_encoder_fitresult: the fitted parameters of the OneHotEncoder model

Report

  • features_to_keep: names of input features that will not be dropped from the table
  • new_features: names of all output features

Example

X = (name=categorical(["Danesh", "Lee", "Mary", "John"]),
      grade=categorical(["A", "B", "A", "C"], ordered=true),
      height=[1.85, 1.67, 1.5, 1.67],
      n_devices=[3, 2, 4, 3],
@@ -35,4 +35,4 @@
 julia> setdiff(schema(X).names, report(mach).features_to_keep) ## dropped features
 1-element Vector{Symbol}:
  :comments
-

See also OneHotEncoder

+

See also OneHotEncoder

diff --git a/dev/models/CountTransformer_MLJText/index.html b/dev/models/CountTransformer_MLJText/index.html index 927fa5ca4..54660de02 100644 --- a/dev/models/CountTransformer_MLJText/index.html +++ b/dev/models/CountTransformer_MLJText/index.html @@ -1,5 +1,5 @@ -CountTransformer · MLJ

CountTransformer

CountTransformer

A model type for constructing a count transformer, based on MLJText.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

CountTransformer = @load CountTransformer pkg=MLJText

Do model = CountTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CountTransformer(max_doc_freq=...).

The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of term counts.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any vector whose elements are either tokenized documents or bags of words/ngrams. Specifically, each element is one of the following:

    • A vector of abstract strings (tokens), e.g., ["I", "like", "Sam", ".", "Sam", "is", "nice", "."] (scitype AbstractVector{Textual})
    • A dictionary of counts, indexed on abstract strings, e.g., Dict("I"=>1, "Sam"=>2, "Sam is"=>1) (scitype Multiset{Textual})
    • A dictionary of counts, indexed on plain ngrams, e.g., Dict(("I",)=>1, ("Sam",)=>2, ("I", "Sam")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.
  • min_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. For example, a value of 0.01 means that only terms appearing in at least 1% of the documents will be included.

Operations

  • transform(mach, Xnew): Based on the vocabulary learned in training, return the matrix of counts for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.

Fitted parameters

The fields of fitted_params(mach) are:

  • vocab: A vector containing the strings used in the transformer's vocabulary.

Examples

CountTransformer accepts a variety of inputs. The example below transforms tokenized documents:

using MLJ
+CountTransformer · MLJ

CountTransformer

CountTransformer

A model type for constructing a count transformer, based on MLJText.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

CountTransformer = @load CountTransformer pkg=MLJText

Do model = CountTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in CountTransformer(max_doc_freq=...).

The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of term counts.
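For orientation, here is a minimal, hedged sketch with two made-up tokenized documents (a fuller example appears below):

using MLJ
CountTransformer = @load CountTransformer pkg=MLJText
docs = [["the", "cat", "sat"], ["the", "dog", "sat", "sat"]]
mach = machine(CountTransformer(), docs) |> fit!
transform(mach, docs)   ## 2 × (vocabulary size) matrix of term counts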

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any vector whose elements are either tokenized documents or bags of words/ngrams. Specifically, each element is one of the following:

    • A vector of abstract strings (tokens), e.g., ["I", "like", "Sam", ".", "Sam", "is", "nice", "."] (scitype AbstractVector{Textual})
    • A dictionary of counts, indexed on abstract strings, e.g., Dict("I"=>1, "Sam"=>2, "Sam is"=>1) (scitype Multiset{Textual})
    • A dictionary of counts, indexed on plain ngrams, e.g., Dict(("I",)=>1, ("Sam",)=>2, ("I", "Sam")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.
  • min_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. For example, a value of 0.01 means that only terms appearing in at least 1% of the documents will be included.

Operations

  • transform(mach, Xnew): Based on the vocabulary learned in training, return the matrix of counts for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.

Fitted parameters

The fields of fitted_params(mach) are:

  • vocab: A vector containing the strings used in the transformer's vocabulary.

Examples

CountTransformer accepts a variety of inputs. The example below transforms tokenized documents:

using MLJ
 import TextAnalysis
 
 CountTransformer = @load CountTransformer pkg=MLJText
@@ -43,4 +43,4 @@
 MLJ.fit!(mach)
 fitted_params(mach)
 
-tfidf_mat = transform(mach, ngram_docs)

See also TfidfTransformer, BM25Transformer

+tfidf_mat = transform(mach, ngram_docs)

See also TfidfTransformer, BM25Transformer

diff --git a/dev/models/DBSCAN_Clustering/index.html b/dev/models/DBSCAN_Clustering/index.html index 38f62eba6..c3041a015 100644 --- a/dev/models/DBSCAN_Clustering/index.html +++ b/dev/models/DBSCAN_Clustering/index.html @@ -1,5 +1,5 @@ -DBSCAN · MLJ

DBSCAN

DBSCAN

A model type for constructing a DBSCAN clusterer (density-based spatial clustering of applications with noise), based on Clustering.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DBSCAN = @load DBSCAN pkg=Clustering

Do model = DBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DBSCAN(radius=...).

DBSCAN is a clustering algorithm that groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away). More information is available at the Clustering.jl documentation. Use predict to get cluster assignments. Point types - core, boundary or noise - are accessed from the machine report (see below).

This is a static implementation, i.e., it does not generalize to new data instances, and there is no training data. For clusterers that do generalize, see KMeans or KMedoids.

In MLJ or MLJBase, create a machine with

mach = machine(model)

Hyper-parameters

  • radius=1.0: query radius.
  • leafsize=20: number of points binned in each leaf node of the nearest neighbor k-d tree.
  • min_neighbors=1: minimum number of neighbors required for a point to be a core point.
  • min_cluster_size=1: minimum number of points in a valid cluster.

Operations

  • predict(mach, X): return cluster label assignments, as an unordered CategoricalVector. Here X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Note that points of type noise will always get a label of 0.

Report

After calling predict(mach), the fields of report(mach) are:

  • point_types: A CategoricalVector with the DBSCAN point type classification, one element per row of X. Elements are either 'C' (core), 'B' (boundary), or 'N' (noise).

  • nclusters: The number of clusters (excluding the noise "cluster")

  • cluster_labels: The unique list of cluster labels

  • clusters: A vector of Clustering.DbscanCluster objects from Clustering.jl, which have these fields:

    • size: number of points in a cluster (core + boundary)
    • core_indices: indices of points in the cluster core
    • boundary_indices: indices of points on the cluster boundary

Examples

using MLJ
+DBSCAN · MLJ

DBSCAN

DBSCAN

A model type for constructing a DBSCAN clusterer (density-based spatial clustering of applications with noise), based on Clustering.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DBSCAN = @load DBSCAN pkg=Clustering

Do model = DBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DBSCAN(radius=...).

DBSCAN is a clustering algorithm that groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away). More information is available at the Clustering.jl documentation. Use predict to get cluster assignments. Point types - core, boundary or noise - are accessed from the machine report (see below).

This is a static implementation, i.e., it does not generalize to new data instances, and there is no training data. For clusterers that do generalize, see KMeans or KMedoids.

In MLJ or MLJBase, create a machine with

mach = machine(model)

Hyper-parameters

  • radius=1.0: query radius.
  • leafsize=20: number of points binned in each leaf node of the nearest neighbor k-d tree.
  • min_neighbors=1: minimum number of neighbors required for a point to be a core point.
  • min_cluster_size=1: minimum number of points in a valid cluster.

Operations

  • predict(mach, X): return cluster label assignments, as an unordered CategoricalVector. Here X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Note that points of type noise will always get a label of 0.

Report

After calling predict(mach), the fields of report(mach) are:

  • point_types: A CategoricalVector with the DBSCAN point type classification, one element per row of X. Elements are either 'C' (core), 'B' (boundary), or 'N' (noise).

  • nclusters: The number of clusters (excluding the noise "cluster")

  • cluster_labels: The unique list of cluster labels

  • clusters: A vector of Clustering.DbscanCluster objects from Clustering.jl, which have these fields:

    • size: number of points in a cluster (core + boundary)
    • core_indices: indices of points in the cluster core
    • boundary_indices: indices of points on the cluster boundary

Examples

using MLJ
 
 X, labels  = make_moons(400, noise=0.09, rng=1) ## synthetic data with 2 clusters; X
 y = map(labels) do label
@@ -32,4 +32,4 @@
    :black
 end
 using Plots
-scatter(points, color=colors)
+scatter(points, color=colors)
diff --git a/dev/models/DBSCAN_MLJScikitLearnInterface/index.html b/dev/models/DBSCAN_MLJScikitLearnInterface/index.html index a67cd071d..4d2b2ba4c 100644 --- a/dev/models/DBSCAN_MLJScikitLearnInterface/index.html +++ b/dev/models/DBSCAN_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -DBSCAN · MLJ

DBSCAN

DBSCAN

A model type for constructing a DBSCAN clustering model, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DBSCAN = @load DBSCAN pkg=MLJScikitLearnInterface

Do model = DBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DBSCAN(eps=...).

Density-Based Spatial Clustering of Applications with Noise. Finds core samples of high density and expands clusters from them. Good for data which contains clusters of similar density.

+DBSCAN · MLJ

DBSCAN

DBSCAN

A model type for constructing a DBSCAN clustering model, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DBSCAN = @load DBSCAN pkg=MLJScikitLearnInterface

Do model = DBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DBSCAN(eps=...).

Density-Based Spatial Clustering of Applications with Noise. Finds core samples of high density and expands clusters from them. Good for data which contains clusters of similar density.

diff --git a/dev/models/DNNDetector_OutlierDetectionNeighbors/index.html b/dev/models/DNNDetector_OutlierDetectionNeighbors/index.html index e2b87fc0a..41eee507a 100644 --- a/dev/models/DNNDetector_OutlierDetectionNeighbors/index.html +++ b/dev/models/DNNDetector_OutlierDetectionNeighbors/index.html @@ -1,5 +1,5 @@ -DNNDetector · MLJ

DNNDetector

DNNDetector(d = 0,
+DNNDetector · MLJ

DNNDetector

DNNDetector(d = 0,
             metric = Euclidean(),
             algorithm = :kdtree,
             leafsize = 10,
@@ -8,4 +8,4 @@
 detector = DNNDetector()
 X = rand(10, 100)
 model, result = fit(detector, X; verbosity=0)
-test_scores = transform(detector, model, X)

References

[1] Knorr, Edwin M.; Ng, Raymond T. (1998): Algorithms for Mining Distance-Based Outliers in Large Datasets.

+test_scores = transform(detector, model, X)

References

[1] Knorr, Edwin M.; Ng, Raymond T. (1998): Algorithms for Mining Distance-Based Outliers in Large Datasets.

diff --git a/dev/models/DecisionTreeClassifier_BetaML/index.html b/dev/models/DecisionTreeClassifier_BetaML/index.html index ab1bf1393..ffa9089d4 100644 --- a/dev/models/DecisionTreeClassifier_BetaML/index.html +++ b/dev/models/DecisionTreeClassifier_BetaML/index.html @@ -1,5 +1,5 @@ -DecisionTreeClassifier · MLJ

DecisionTreeClassifier

mutable struct DecisionTreeClassifier <: MLJModelInterface.Probabilistic

A simple Decision Tree model for classification with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. look at all features]
  • splitting_criterion::Function: The function used to compute the information gain of a specific partition, namely the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: gini]. Either gini, entropy or a custom function. It can also be an anonymous function.
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+DecisionTreeClassifier · MLJ

DecisionTreeClassifier

mutable struct DecisionTreeClassifier <: MLJModelInterface.Probabilistic

A simple Decision Tree model for classification with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. look at all features]
  • splitting_criterion::Function: The function used to compute the information gain of a specific partition, namely the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: gini]. Either gini, entropy or a custom function. It can also be an anonymous function.
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -27,4 +27,4 @@
  ⋮
  UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
  UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
- UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
+ UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
diff --git a/dev/models/DecisionTreeClassifier_DecisionTree/index.html b/dev/models/DecisionTreeClassifier_DecisionTree/index.html index d875bd6a1..5cba398bb 100644 --- a/dev/models/DecisionTreeClassifier_DecisionTree/index.html +++ b/dev/models/DecisionTreeClassifier_DecisionTree/index.html @@ -1,5 +1,5 @@ -DecisionTreeClassifier · MLJ

DecisionTreeClassifier

DecisionTreeClassifier

A model type for constructing a CART decision tree classifier, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree

Do model = DecisionTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DecisionTreeClassifier(max_depth=...).

DecisionTreeClassifier implements the CART algorithm, originally published in Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984): "Classification and regression trees". Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software..

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • max_depth=-1: max depth of the decision tree (-1=any)
  • min_samples_leaf=1: min number of samples each leaf needs to have
  • min_samples_split=2: min number of samples needed for a split
  • min_purity_increase=0: min purity needed for a split
  • n_subfeatures=0: number of features to select at random (0 for all)
  • post_prune=false: set to true for post-fit pruning
  • merge_purity_threshold=1.0: (post-pruning) merge leaves having combined purity >= merge_purity_threshold
  • display_depth=5: max depth to show when displaying the tree
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic, but uncalibrated.
  • predict_mode(mach, Xnew): instead return the mode of each prediction above.

Fitted parameters

The fields of fitted_params(mach) are:

  • raw_tree: the raw Node, Leaf or Root object returned by the core DecisionTree.jl algorithm
  • tree: a visualizable, wrapped version of raw_tree implementing the AbstractTrees.jl interface; see "Examples" below
  • encoding: dictionary of target classes keyed on integers used internally by DecisionTree.jl
  • features: the names of the features encountered in training, in an order consistent with the output of print_tree (see below)

Report

The fields of report(mach) are:

  • classes_seen: list of target classes actually observed in training
  • print_tree: alternative method to print the fitted tree, with single argument the tree depth; interpretation requires internal integer-class encoding (see "Fitted parameters" above).
  • features: the names of the features encountered in training, in an order consistent with the output of print_tree (see below)

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
+DecisionTreeClassifier · MLJ

DecisionTreeClassifier

DecisionTreeClassifier

A model type for constructing a CART decision tree classifier, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree

Do model = DecisionTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DecisionTreeClassifier(max_depth=...).

DecisionTreeClassifier implements the CART algorithm, originally published in Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984): "Classification and regression trees". Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software..

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).
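As a hedged aside (not part of the original docstring), the same model can also be scored without managing machines explicitly, using MLJ's evaluate; the hyper-parameter value, resampling strategy and measures below are illustrative choices only:

using MLJ
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree
X, y = @load_iris
evaluate(DecisionTreeClassifier(max_depth=3), X, y,
         resampling=CV(nfolds=5, shuffle=true, rng=123),
         measure=[log_loss, accuracy])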

Hyperparameters

  • max_depth=-1: max depth of the decision tree (-1=any)
  • min_samples_leaf=1: min number of samples each leaf needs to have
  • min_samples_split=2: min number of samples needed for a split
  • min_purity_increase=0: min purity needed for a split
  • n_subfeatures=0: number of features to select at random (0 for all)
  • post_prune=false: set to true for post-fit pruning
  • merge_purity_threshold=1.0: (post-pruning) merge leaves having combined purity >= merge_purity_threshold
  • display_depth=5: max depth to show when displaying the tree
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic, but uncalibrated.
  • predict_mode(mach, Xnew): instead return the mode of each prediction above.

Fitted parameters

The fields of fitted_params(mach) are:

  • raw_tree: the raw Node, Leaf or Root object returned by the core DecisionTree.jl algorithm
  • tree: a visualizable, wrapped version of raw_tree implementing the AbstractTrees.jl interface; see "Examples" below
  • encoding: dictionary of target classes keyed on integers used internally by DecisionTree.jl
  • features: the names of the features encountered in training, in an order consistent with the output of print_tree (see below)

Report

The fields of report(mach) are:

  • classes_seen: list of target classes actually observed in training
  • print_tree: alternative method to print the fitted tree, with single argument the tree depth; interpretation requires internal integer-class encoding (see "Fitted parameters" above).
  • features: the names of the features encountered in training, in an order consistent with the output of print_tree (see below)

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
 DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree
 model = DecisionTreeClassifier(max_depth=3, min_samples_split=3)
 
@@ -28,4 +28,4 @@
 using Plots, TreeRecipe
 plot(tree) ## for a graphical representation of the tree
 
-feature_importances(mach)

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.DecisionTreeClassifier.

+feature_importances(mach)

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.DecisionTreeClassifier.

diff --git a/dev/models/DecisionTreeRegressor_BetaML/index.html b/dev/models/DecisionTreeRegressor_BetaML/index.html index 3fbd41679..8c8035d5e 100644 --- a/dev/models/DecisionTreeRegressor_BetaML/index.html +++ b/dev/models/DecisionTreeRegressor_BetaML/index.html @@ -1,5 +1,5 @@ -DecisionTreeRegressor · MLJ

DecisionTreeRegressor

mutable struct DecisionTreeRegressor <: MLJModelInterface.Deterministic

A simple Decision Tree model for regression with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. look at all features]
  • splitting_criterion::Function: The function used to compute the information gain of a specific partition, namely the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: variance]. Either variance or a custom function. It can also be an anonymous function.
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+DecisionTreeRegressor · MLJ

DecisionTreeRegressor

mutable struct DecisionTreeRegressor <: MLJModelInterface.Deterministic

A simple Decision Tree model for regression with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. look at all features]
  • splitting_criterion::Function: The function used to compute the information gain of a specific partition, namely the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: variance]. Either variance or a custom function. It can also be an anonymous function.
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
 
 julia> X, y        = @load_boston;
 
@@ -30,4 +30,4 @@
   ⋮    
  23.9  23.75
  22.0  22.2
- 11.9  13.2
+ 11.9 13.2
diff --git a/dev/models/DecisionTreeRegressor_DecisionTree/index.html b/dev/models/DecisionTreeRegressor_DecisionTree/index.html index 737a12b12..ece550194 100644 --- a/dev/models/DecisionTreeRegressor_DecisionTree/index.html +++ b/dev/models/DecisionTreeRegressor_DecisionTree/index.html @@ -1,5 +1,5 @@ -DecisionTreeRegressor · MLJ

DecisionTreeRegressor

DecisionTreeRegressor

A model type for constructing a CART decision tree regressor, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree

Do model = DecisionTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DecisionTreeRegressor(max_depth=...).

DecisionTreeRegressor implements the CART algorithm, originally published in Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984): "Classification and regression trees". Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software..

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • max_depth=-1: max depth of the decision tree (-1=any)
  • min_samples_leaf=1: min number of samples each leaf needs to have
  • min_samples_split=2: min number of samples needed for a split
  • min_purity_increase=0: min purity needed for a split
  • n_subfeatures=0: number of features to select at random (0 for all)
  • post_prune=false: set to true for post-fit pruning
  • merge_purity_threshold=1.0: (post-pruning) merge leaves having combined purity >= merge_purity_threshold
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: the tree or stump object returned by the core DecisionTree.jl algorithm
  • features: the names of the features encountered in training

Report

The fields of report(mach) are:

  • features: the names of the features encountered in training

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
+DecisionTreeRegressor · MLJ

DecisionTreeRegressor

DecisionTreeRegressor

A model type for constructing a CART decision tree regressor, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree

Do model = DecisionTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DecisionTreeRegressor(max_depth=...).

DecisionTreeRegressor implements the CART algorithm, originally published in Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984): "Classification and regression trees". Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software..

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).
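As a hedged aside (not in the original docstring), a quick holdout evaluation sketch on synthetic data, with illustrative hyper-parameter values:

using MLJ
DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree
X, y = make_regression(200, 3)   ## synthetic data
evaluate(DecisionTreeRegressor(max_depth=4), X, y,
         resampling=Holdout(fraction_train=0.7, rng=123),
         measure=rms)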

Hyperparameters

  • max_depth=-1: max depth of the decision tree (-1=any)
  • min_samples_leaf=1: min number of samples each leaf needs to have
  • min_samples_split=2: min number of samples needed for a split
  • min_purity_increase=0: min purity needed for a split
  • n_subfeatures=0: number of features to select at random (0 for all)
  • post_prune=false: set to true for post-fit pruning
  • merge_purity_threshold=1.0: (post-pruning) merge leaves having combined purity >= merge_purity_threshold
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: the tree or stump object returned by the core DecisionTree.jl algorithm
  • features: the names of the features encountered in training

Report

The fields of report(mach) are:

  • features: the names of the features encountered in training

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
 DecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree
 model = DecisionTreeRegressor(max_depth=3, min_samples_split=3)
 
@@ -24,4 +24,4 @@
       ├─ -2.931299926506291 (0/11)
       └─ -4.726518740473489 (0/8)
 
-feature_importances(mach) ## get feature importances

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.DecisionTreeRegressor.

+feature_importances(mach) ## get feature importances

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.DecisionTreeRegressor.

diff --git a/dev/models/DeterministicConstantClassifier_MLJModels/index.html b/dev/models/DeterministicConstantClassifier_MLJModels/index.html index 007fbf65c..0f2514366 100644 --- a/dev/models/DeterministicConstantClassifier_MLJModels/index.html +++ b/dev/models/DeterministicConstantClassifier_MLJModels/index.html @@ -1,2 +1,2 @@ -DeterministicConstantClassifier · MLJ

DeterministicConstantClassifier

DeterministicConstantClassifier

A model type for constructing a deterministic constant classifier, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DeterministicConstantClassifier = @load DeterministicConstantClassifier pkg=MLJModels

Do model = DeterministicConstantClassifier() to construct an instance with default hyper-parameters.

+DeterministicConstantClassifier · MLJ

DeterministicConstantClassifier

DeterministicConstantClassifier

A model type for constructing a deterministic constant classifier, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DeterministicConstantClassifier = @load DeterministicConstantClassifier pkg=MLJModels

Do model = DeterministicConstantClassifier() to construct an instance with default hyper-parameters.
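Since the page gives no example, here is a minimal, hedged sketch with a made-up target; the model simply predicts the training mode for every input:

using MLJ
DeterministicConstantClassifier = @load DeterministicConstantClassifier pkg=MLJModels
X = (x=rand(5),)                     ## features are ignored by this model
y = coerce(["a", "b", "b", "b", "a"], Multiclass)
mach = machine(DeterministicConstantClassifier(), X, y) |> fit!
predict(mach, X)                     ## every entry is the training mode "b"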

diff --git a/dev/models/DeterministicConstantRegressor_MLJModels/index.html b/dev/models/DeterministicConstantRegressor_MLJModels/index.html index 11dd754e3..2c1b1e360 100644 --- a/dev/models/DeterministicConstantRegressor_MLJModels/index.html +++ b/dev/models/DeterministicConstantRegressor_MLJModels/index.html @@ -1,2 +1,2 @@ -DeterministicConstantRegressor · MLJ

DeterministicConstantRegressor

DeterministicConstantRegressor

A model type for constructing a deterministic constant regressor, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DeterministicConstantRegressor = @load DeterministicConstantRegressor pkg=MLJModels

Do model = DeterministicConstantRegressor() to construct an instance with default hyper-parameters.

+DeterministicConstantRegressor · MLJ

DeterministicConstantRegressor

DeterministicConstantRegressor

A model type for constructing a deterministic constant regressor, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DeterministicConstantRegressor = @load DeterministicConstantRegressor pkg=MLJModels

Do model = DeterministicConstantRegressor() to construct an instance with default hyper-parameters.
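Analogously, a minimal, hedged sketch on synthetic data, showing that every prediction equals the training target mean:

using MLJ, Statistics
DeterministicConstantRegressor = @load DeterministicConstantRegressor pkg=MLJModels
X, y = make_regression(10, 2)   ## synthetic data
mach = machine(DeterministicConstantRegressor(), X, y) |> fit!
predict(mach, X)[1] ≈ mean(y)   ## true: the constant prediction is the target mean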

diff --git a/dev/models/DummyClassifier_MLJScikitLearnInterface/index.html b/dev/models/DummyClassifier_MLJScikitLearnInterface/index.html index 77747c26c..44f4663ff 100644 --- a/dev/models/DummyClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/DummyClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -DummyClassifier · MLJ

DummyClassifier

DummyClassifier

A model type for constructing a dummy classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DummyClassifier = @load DummyClassifier pkg=MLJScikitLearnInterface

Do model = DummyClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DummyClassifier(strategy=...).

DummyClassifier is a classifier that makes predictions using simple rules.

+DummyClassifier · MLJ

DummyClassifier

DummyClassifier

A model type for constructing a dummy classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DummyClassifier = @load DummyClassifier pkg=MLJScikitLearnInterface

Do model = DummyClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DummyClassifier(strategy=...).

DummyClassifier is a classifier that makes predictions using simple rules.
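A minimal, hedged usage sketch follows; the strategy value "most_frequent" is an assumption carried over from scikit-learn's conventions and is not documented on this page, and a working scikit-learn installation is required:

using MLJ
DummyClassifier = @load DummyClassifier pkg=MLJScikitLearnInterface
model = DummyClassifier(strategy="most_frequent")   ## assumed strategy value
X, y = @load_iris
mach = machine(model, X, y) |> fit!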

diff --git a/dev/models/DummyRegressor_MLJScikitLearnInterface/index.html b/dev/models/DummyRegressor_MLJScikitLearnInterface/index.html index 285222a57..52651c12e 100644 --- a/dev/models/DummyRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/DummyRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -DummyRegressor · MLJ

DummyRegressor

DummyRegressor

A model type for constructing a dummy regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DummyRegressor = @load DummyRegressor pkg=MLJScikitLearnInterface

Do model = DummyRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DummyRegressor(strategy=...).

DummyRegressor is a regressor that makes predictions using simple rules.

+DummyRegressor · MLJ

DummyRegressor

DummyRegressor

A model type for constructing a dummy regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

DummyRegressor = @load DummyRegressor pkg=MLJScikitLearnInterface

Do model = DummyRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in DummyRegressor(strategy=...).

DummyRegressor is a regressor that makes predictions using simple rules.

diff --git a/dev/models/ECODDetector_OutlierDetectionPython/index.html b/dev/models/ECODDetector_OutlierDetectionPython/index.html index 6901d5717..bc4155b67 100644 --- a/dev/models/ECODDetector_OutlierDetectionPython/index.html +++ b/dev/models/ECODDetector_OutlierDetectionPython/index.html @@ -1,2 +1,2 @@ -ECODDetector · MLJ
+ECODDetector · MLJ
diff --git a/dev/models/ENNUndersampler_Imbalance/index.html b/dev/models/ENNUndersampler_Imbalance/index.html index 475f08050..a61871a7d 100644 --- a/dev/models/ENNUndersampler_Imbalance/index.html +++ b/dev/models/ENNUndersampler_Imbalance/index.html @@ -1,5 +1,5 @@ -ENNUndersampler · MLJ

ENNUndersampler

Initiate an ENN undersampling model with the given hyper-parameters.

ENNUndersampler

A model type for constructing an ENN undersampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ENNUndersampler = @load ENNUndersampler pkg=Imbalance

Do model = ENNUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ENNUndersampler(k=...).

ENNUndersampler undersamples a dataset by removing ("cleaning") points that violate a certain condition such as having a different class compared to the majority of the neighbors as proposed in Dennis L Wilson. Asymptotic properties of nearest neighbor rules using edited data. IEEE Transactions on Systems, Man, and Cybernetics, pages 408–421, 1972.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = ENNUndersampler()

Hyperparameters

  • k::Integer=5: Number of nearest neighbors to consider in the algorithm. Should be within the range 0 < k < n where n is the number of observations in the smallest class.
  • keep_condition::AbstractString="mode": The condition that leads to cleaning a point upon violation. Takes one of "exists", "mode", "only mode" and "all"
- `"exists"`: the point has at least one neighbor from the same class
+ENNUndersampler · MLJ

ENNUndersampler

Initiate an ENN undersampling model with the given hyper-parameters.

ENNUndersampler

A model type for constructing an ENN undersampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ENNUndersampler = @load ENNUndersampler pkg=Imbalance

Do model = ENNUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ENNUndersampler(k=...).

ENNUndersampler undersamples a dataset by removing ("cleaning") points that violate a certain condition such as having a different class compared to the majority of the neighbors as proposed in Dennis L Wilson. Asymptotic properties of nearest neighbor rules using edited data. IEEE Transactions on Systems, Man, and Cybernetics, pages 408–421, 1972.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = ENNUndersampler()
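Ahead of the fuller example below, a minimal, hedged sketch on a made-up imbalanced dataset (as a static transformer, the machine takes no data and needs no fit!; hyper-parameter values are illustrative):

using MLJ
ENNUndersampler = @load ENNUndersampler pkg=Imbalance
X = (x1=rand(30), x2=rand(30))   ## made-up features
y = coerce(vcat(fill("majority", 25), fill("minority", 5)), Multiclass)
mach = machine(ENNUndersampler(k=3, keep_condition="mode"))
X_under, y_under = transform(mach, X, y)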

Hyperparameters

  • k::Integer=5: Number of nearest neighbors to consider in the algorithm. Should be within the range 0 < k < n where n is the number of observations in the smallest class.
  • keep_condition::AbstractString="mode": The condition that leads to cleaning a point upon violation. Takes one of "exists", "mode", "only mode" and "all"
- `"exists"`: the point has at least one neighbor from the same class
 - `"mode"`: the class of the point is one of the most frequent classes of the neighbors (there may be many)
 - `"only mode"`: the class of the point is the single most frequent class of the neighbors
 - `"all"`: the class of the point is the same as all the neighbors
  • min_ratios=1.0: A parameter that controls the maximum amount of undersampling to be done for each class. If this algorithm cleans the data to an extent that this is violated, some of the cleaned points will be revived randomly so that it is satisfied.

    • Can be a float and in this case each class will be at most undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class
    • Can be a dictionary mapping each class label to the float minimum ratio for that class
  • force_min_ratios=false: If true, and this algorithm cleans the data such that the ratios for each class exceed those specified in min_ratios, then further undersampling will be performed so that the final ratios are equal to min_ratios.

  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, MersenneTwister is used.

  • try_preserve_type::Bool=true: When true, the function will try to not change the type of the input table (e.g., DataFrame). However, for some tables, this may not succeed, and in this case, the table returned will be a column table (named-tuple of vectors). This parameter is ignored if the input is a matrix.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively
  • y_under: An abstract vector of labels corresponding to X_under

Operations

  • transform(mach, X, y): resample the data X and y using ENNUndersampler, returning the undersampled versions

Example

using MLJ
@@ -28,4 +28,4 @@
 julia> Imbalance.checkbalance(y_under; ref="minority")
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 10 (100.0%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 10 (100.0%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 24 (240.0%) 
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 24 (240.0%)
diff --git a/dev/models/ElasticNetCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/ElasticNetCVRegressor_MLJScikitLearnInterface/index.html index f1658d72d..f5c9bfc91 100644 --- a/dev/models/ElasticNetCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/ElasticNetCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -ElasticNetCVRegressor · MLJ

ElasticNetCVRegressor

ElasticNetCVRegressor

A model type for constructing an elastic net regression model with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ElasticNetCVRegressor = @load ElasticNetCVRegressor pkg=MLJScikitLearnInterface

Do model = ElasticNetCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ElasticNetCVRegressor(l1_ratio=...).

Hyper-parameters

  • l1_ratio = 0.5
  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • precompute = auto
  • max_iter = 1000
  • tol = 0.0001
  • cv = 5
  • copy_X = true
  • verbose = 0
  • n_jobs = nothing
  • positive = false
  • random_state = nothing
  • selection = cyclic
+ElasticNetCVRegressor · MLJ

ElasticNetCVRegressor

ElasticNetCVRegressor

A model type for constructing an elastic net regression with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ElasticNetCVRegressor = @load ElasticNetCVRegressor pkg=MLJScikitLearnInterface

Do model = ElasticNetCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ElasticNetCVRegressor(l1_ratio=...).

Hyper-parameters

  • l1_ratio = 0.5
  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • precompute = auto
  • max_iter = 1000
  • tol = 0.0001
  • cv = 5
  • copy_X = true
  • verbose = 0
  • n_jobs = nothing
  • positive = false
  • random_state = nothing
  • selection = cyclic
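
A minimal usage sketch (assuming MLJ and MLJScikitLearnInterface are installed; the make_regression data is purely illustrative):

using MLJ

ElasticNetCVRegressor = @load ElasticNetCVRegressor pkg=MLJScikitLearnInterface
model = ElasticNetCVRegressor(l1_ratio=0.7, cv=10)

X, y = make_regression(100, 5)        ## synthetic table of features and continuous target
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)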
diff --git a/dev/models/ElasticNetRegressor_MLJLinearModels/index.html b/dev/models/ElasticNetRegressor_MLJLinearModels/index.html index 26f5d3660..cde70a59c 100644 --- a/dev/models/ElasticNetRegressor_MLJLinearModels/index.html +++ b/dev/models/ElasticNetRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -ElasticNetRegressor · MLJ

ElasticNetRegressor

ElasticNetRegressor

A model type for constructing an elastic net regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ElasticNetRegressor = @load ElasticNetRegressor pkg=MLJLinearModels

Do model = ElasticNetRegressor() to construct an instance with default hyper-parameters.

Elastic net is a linear model with objective function

$|Xθ - y|₂²/2 + n⋅λ|θ|₂²/2 + n⋅γ|θ|₁$

where $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is instead

$|Xθ - y|₂²/2 + λ|θ|₂²/2 + γ|θ|₁$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the L2 regularization. Default: 1.0

  • gamma::Real: strength of the L1 regularization. Default: 0.0

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.ProxGrad.

    If solver=nothing (default) then ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...). Default: nothing

Example

using MLJ
+ElasticNetRegressor · MLJ

ElasticNetRegressor

ElasticNetRegressor

A model type for constructing an elastic net regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ElasticNetRegressor = @load ElasticNetRegressor pkg=MLJLinearModels

Do model = ElasticNetRegressor() to construct an instance with default hyper-parameters.

Elastic net is a linear model with objective function

$|Xθ - y|₂²/2 + n⋅λ|θ|₂²/2 + n⋅γ|θ|₁$

where $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is instead

$|Xθ - y|₂²/2 + λ|θ|₂²/2 + γ|θ|₁$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the L2 regularization. Default: 1.0

  • gamma::Real: strength of the L1 regularization. Default: 0.0

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.ProxGrad.

    If solver=nothing (default) then ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...). Default: nothing
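
A solver can also be selected explicitly at construction time. The following is a hedged sketch, assuming the MLJLinearModels package is installed so that the FISTA alias mentioned above can be qualified:

using MLJ
import MLJLinearModels

ElasticNetRegressor = @load ElasticNetRegressor pkg=MLJLinearModels
model = ElasticNetRegressor(
    lambda = 0.1,
    gamma = 0.01,
    solver = MLJLinearModels.FISTA(),    ## accelerated proximal gradient (see aliases above)
)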

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(ElasticNetRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also LassoRegressor.

+fitted_params(mach)

See also LassoRegressor.

diff --git a/dev/models/ElasticNetRegressor_MLJScikitLearnInterface/index.html b/dev/models/ElasticNetRegressor_MLJScikitLearnInterface/index.html index 96574f424..5df360160 100644 --- a/dev/models/ElasticNetRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/ElasticNetRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -ElasticNetRegressor · MLJ

ElasticNetRegressor

ElasticNetRegressor

A model type for constructing an elastic net regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ElasticNetRegressor = @load ElasticNetRegressor pkg=MLJScikitLearnInterface

Do model = ElasticNetRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ElasticNetRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • l1_ratio = 0.5
  • fit_intercept = true
  • precompute = false
  • max_iter = 1000
  • copy_X = true
  • tol = 0.0001
  • warm_start = false
  • positive = false
  • random_state = nothing
  • selection = cyclic
+ElasticNetRegressor · MLJ

ElasticNetRegressor

ElasticNetRegressor

A model type for constructing an elastic net regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ElasticNetRegressor = @load ElasticNetRegressor pkg=MLJScikitLearnInterface

Do model = ElasticNetRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ElasticNetRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • l1_ratio = 0.5
  • fit_intercept = true
  • precompute = false
  • max_iter = 1000
  • copy_X = true
  • tol = 0.0001
  • warm_start = false
  • positive = false
  • random_state = nothing
  • selection = cyclic
diff --git a/dev/models/EnsembleModel_MLJEnsembles/index.html b/dev/models/EnsembleModel_MLJEnsembles/index.html index e4deea15f..345a59bed 100644 --- a/dev/models/EnsembleModel_MLJEnsembles/index.html +++ b/dev/models/EnsembleModel_MLJEnsembles/index.html @@ -1,8 +1,8 @@ -EnsembleModel · MLJ

EnsembleModel

EnsembleModel(model,
+EnsembleModel · MLJ

EnsembleModel

EnsembleModel(model,
               atomic_weights=Float64[],
               bagging_fraction=0.8,
               n=100,
               rng=GLOBAL_RNG,
               acceleration=CPU1(),
-              out_of_bag_measure=[])

Create a model for training an ensemble of n clones of model, with optional bagging. Ensembling is useful if fit!(machine(atom, data...)) does not create identical models on repeated calls (ie, is a stochastic model, such as a decision tree with randomized node selection criteria), or if bagging_fraction is set to a value less than 1.0, or both.

Here the atomic model must support targets with scitype AbstractVector{<:Finite} (single-target classifiers) or AbstractVector{<:Continuous} (single-target regressors).

If rng is an integer, then MersenneTwister(rng) is the random number generator used for bagging. Otherwise some AbstractRNG object is expected.

The atomic predictions are optionally weighted according to the vector atomic_weights (to allow for external optimization) except in the case that model is a Deterministic classifier, in which case atomic_weights are ignored.

The ensemble model is Deterministic or Probabilistic, according to the corresponding supertype of atom. In the case of deterministic classifiers (target_scitype(atom) <: AbstractVector{<:Finite}), the predictions are majority votes, and for regressors (target_scitype(atom) <: AbstractVector{<:Continuous}) they are ordinary averages. Probabilistic predictions are obtained by averaging the atomic probability distribution/mass functions; in particular, for regressors, the ensemble prediction on each input pattern has the type MixtureModel{VF,VS,D} from the Distributions.jl package, where D is the type of predicted distribution for atom.

Specify acceleration=CPUProcesses() for distributed computing, or CPUThreads() for multithreading.

If a single measure or non-empty vector of measures is specified by out_of_bag_measure, then out-of-bag estimates of performance are written to the training report (call report on the trained machine wrapping the ensemble model).

Important: If per-observation or class weights w (not to be confused with atomic weights) are specified when constructing a machine for the ensemble model, as in mach = machine(ensemble_model, X, y, w), then w is used by any measures specified in out_of_bag_measure that support them.

+ out_of_bag_measure=[])

Create a model for training an ensemble of n clones of model, with optional bagging. Ensembling is useful if fit!(machine(atom, data...)) does not create identical models on repeated calls (ie, is a stochastic model, such as a decision tree with randomized node selection criteria), or if bagging_fraction is set to a value less than 1.0, or both.

Here the atomic model must support targets with scitype AbstractVector{<:Finite} (single-target classifiers) or AbstractVector{<:Continuous} (single-target regressors).

If rng is an integer, then MersenneTwister(rng) is the random number generator used for bagging. Otherwise some AbstractRNG object is expected.

The atomic predictions are optionally weighted according to the vector atomic_weights (to allow for external optimization) except in the case that model is a Deterministic classifier, in which case atomic_weights are ignored.

The ensemble model is Deterministic or Probabilistic, according to the corresponding supertype of atom. In the case of deterministic classifiers (target_scitype(atom) <: AbstractVector{<:Finite}), the predictions are majority votes, and for regressors (target_scitype(atom) <: AbstractVector{<:Continuous}) they are ordinary averages. Probabilistic predictions are obtained by averaging the atomic probability distribution/mass functions; in particular, for regressors, the ensemble prediction on each input pattern has the type MixtureModel{VF,VS,D} from the Distributions.jl package, where D is the type of predicted distribution for atom.

Specify acceleration=CPUProcesses() for distributed computing, or CPUThreads() for multithreading.

If a single measure or non-empty vector of measures is specified by out_of_bag_measure, then out-of-bag estimates of performance are written to the training report (call report on the trained machine wrapping the ensemble model).

Important: If per-observation or class weights w (not to be confused with atomic weights) are specified when constructing a machine for the ensemble model, as in mach = machine(ensemble_model, X, y, w), then w is used by any measures specified in out_of_bag_measure that support them.
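
As a concrete sketch, here is a bagged ensemble of decision trees, following the EnsembleModel signature above, with an out-of-bag estimate of the root-mean-square error. The atomic model and data are illustrative only, and MLJ plus MLJDecisionTreeInterface are assumed to be installed:

using MLJ

Tree = @load DecisionTreeRegressor pkg=DecisionTree
forest = EnsembleModel(Tree(), n=50, bagging_fraction=0.7, out_of_bag_measure=[rms])

X, y = make_regression(200, 3)
mach = machine(forest, X, y) |> fit!
report(mach)        ## contains the out-of-bag rms estimate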

diff --git a/dev/models/EpsilonSVR_LIBSVM/index.html b/dev/models/EpsilonSVR_LIBSVM/index.html index 8bd4cc4a9..ec71722be 100644 --- a/dev/models/EpsilonSVR_LIBSVM/index.html +++ b/dev/models/EpsilonSVR_LIBSVM/index.html @@ -1,5 +1,5 @@ -EpsilonSVR · MLJ

EpsilonSVR

EpsilonSVR

A model type for constructing an ϵ-support vector regressor, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

EpsilonSVR = @load EpsilonSVR pkg=LIBSVM

Do model = EpsilonSVR() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EpsilonSVR(kernel=...).

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

This model is an adaptation of the classifier SVC to regression, but has an additional parameter epsilon (denoted $ϵ$ in the cited reference).

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • cost=1.0 (range (0, Inf)): the parameter denoted $C$ in the cited reference; for greater regularization, decrease cost

  • epsilon=0.1 (range (0, Inf)): the parameter denoted $ϵ$ in the cited reference; epsilon is the thickness of the penalty-free neighborhood of the graph of the prediction function ("slab" or "tube"). Specifically, a data point (x, y) incurs no training loss unless it is outside this neighborhood; the further away it is from this neighborhood, the greater the loss penalty.

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
+EpsilonSVR · MLJ

EpsilonSVR

EpsilonSVR

A model type for constructing an ϵ-support vector regressor, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

EpsilonSVR = @load EpsilonSVR pkg=LIBSVM

Do model = EpsilonSVR() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EpsilonSVR(kernel=...).

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

This model is an adaptation of the classifier SVC to regression, but has an additional parameter epsilon (denoted $ϵ$ in the cited reference).

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91. A sketch using a user-defined kernel appears after this list.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • cost=1.0 (range (0, Inf)): the parameter denoted $C$ in the cited reference; for greater regularization, decrease cost

  • epsilon=0.1 (range (0, Inf)): the parameter denoted $ϵ$ in the cited reference; epsilon is the thickness of the penalty-free neighborhood of the graph of the prediction function ("slab" or "tube"). Specifically, a data point (x, y) incurs no training loss unless it is outside this neighborhood; the further away it is from this neighborhood, the greater the loss penalty.

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics
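
Because kernel only needs to be callable, a user-defined kernel can also be supplied. A minimal sketch (the kernel below is a hand-rolled illustration, not a LIBSVM built-in; MLJ and LIBSVM are assumed to be installed):

using MLJ

EpsilonSVR = @load EpsilonSVR pkg=LIBSVM
my_kernel(x1, x2) = exp(-0.5 * sum(abs2, x1 .- x2))    ## callable on two feature vectors

model = EpsilonSVR(kernel=my_kernel, cost=10.0, epsilon=0.05)
X, y = make_regression(100, 3)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)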

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
 import LIBSVM
 
 EpsilonSVR = @load EpsilonSVR pkg=LIBSVM            ## model type
@@ -22,4 +22,4 @@
 3-element Vector{Float64}:
   1.1121225361666656
   0.04667702229741916
- -0.6958148424680672

See also NuSVR, LIBSVM.jl and the original C implementation documentation.

+ -0.6958148424680672

See also NuSVR, LIBSVM.jl and the original C implementation documentation.

diff --git a/dev/models/EvoLinearRegressor_EvoLinear/index.html b/dev/models/EvoLinearRegressor_EvoLinear/index.html index 34962a6f5..a3ff7798e 100644 --- a/dev/models/EvoLinearRegressor_EvoLinear/index.html +++ b/dev/models/EvoLinearRegressor_EvoLinear/index.html @@ -1,3 +1,3 @@ -EvoLinearRegressor · MLJ

EvoLinearRegressor

EvoLinearRegressor(; kwargs...)

A model type for constructing an EvoLinearRegressor, based on EvoLinear.jl, and implementing both an internal API and the MLJ model interface.

Keyword arguments

  • loss=:mse: loss function to be minimised. Can be one of:

    • :mse
    • :logistic
    • :poisson
    • :gamma
    • :tweedie
  • nrounds=10: maximum number of training rounds.

  • eta=1: Learning rate. Typically in the range [1e-2, 1].

  • L1=0: Regularization penalty applied by shrinking a weight update to 0 if the update is < L1; no penalty if the update is > L1. Results in sparse feature selection. Typically in the [0, 1] range on normalized features.

  • L2=0: Regularization penalty applied to the square of the weight update value. Restricts large parameter values. Typically in the [0, 1] range on normalized features.

  • rng=123: random seed. Not used at the moment.

  • updater=:all: training method. Only :all is supported at the moment. Gradients for each feature are computed simultaneously, then bias is updated based on all features update.

  • device=:cpu: Only :cpu is supported at the moment.

Internal API

Do config = EvoLinearRegressor() to construct a hyper-parameter struct with default hyper-parameters. Provide keyword arguments as listed above to override defaults, for example:

EvoLinearRegressor(loss=:logistic, L1=1e-3, L2=1e-2, nrounds=100)

Training model

A model is built using fit:

config = EvoLinearRegressor()
-m = fit(config; x, y, w)

Inference

The fitted result is an EvoLinearModel, which acts as a prediction function when passed a feature matrix as argument.

preds = m(x)

MLJ Interface

From MLJ, the type can be imported using:

EvoLinearRegressor = @load EvoLinearRegressor pkg=EvoLinear

Do model = EvoLinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoLinearRegressor(loss=...).

Training model

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where:

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given

features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: the EvoLinearModel object returned by the EvoLinear.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :coef: Vector of coefficients (βs) associated with each of the features.
  • :bias: Value of the bias.
  • :names: Names of each of the features.
+EvoLinearRegressor · MLJ

EvoLinearRegressor

EvoLinearRegressor(; kwargs...)

A model type for constructing an EvoLinearRegressor, based on EvoLinear.jl, and implementing both an internal API and the MLJ model interface.

Keyword arguments

  • loss=:mse: loss function to be minimised. Can be one of:

    • :mse
    • :logistic
    • :poisson
    • :gamma
    • :tweedie
  • nrounds=10: maximum number of training rounds.

  • eta=1: Learning rate. Typically in the range [1e-2, 1].

  • L1=0: Regularization penalty applied by shrinking a weight update to 0 if the update is < L1; no penalty if the update is > L1. Results in sparse feature selection. Typically in the [0, 1] range on normalized features.

  • L2=0: Regularization penalty applied to the square of the weight update value. Restricts large parameter values. Typically in the [0, 1] range on normalized features.

  • rng=123: random seed. Not used at the moment.

  • updater=:all: training method. Only :all is supported at the moment. Gradients for each feature are computed simultaneously, then bias is updated based on all features update.

  • device=:cpu: Only :cpu is supported at the moment.

Internal API

Do config = EvoLinearRegressor() to construct a hyper-parameter struct with default hyper-parameters. Provide keyword arguments as listed above to override defaults, for example:

EvoLinearRegressor(loss=:logistic, L1=1e-3, L2=1e-2, nrounds=100)

Training model

A model is built using fit:

config = EvoLinearRegressor()
+m = fit(config; x, y, w)

Inference

The fitted result is an EvoLinearModel, which acts as a prediction function when passed a feature matrix as argument.

preds = m(x)

MLJ Interface

From MLJ, the type can be imported using:

EvoLinearRegressor = @load EvoLinearRegressor pkg=EvoLinear

Do model = EvoLinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoLinearRegressor(loss=...).

Training model

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where:

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given

features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: the EvoLinearModel object returned by the EvoLinear.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :coef: Vector of coefficients (βs) associated with each of the features.
  • :bias: Value of the bias.
  • :names: Names of each of the features.
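
A minimal end-to-end sketch via the MLJ interface (assuming MLJ and EvoLinear are installed; the synthetic data is illustrative only):

using MLJ

EvoLinearRegressor = @load EvoLinearRegressor pkg=EvoLinear
model = EvoLinearRegressor(loss=:mse, nrounds=50, eta=0.5)

X, y = make_regression(200, 4)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
report(mach).coef        ## fitted coefficients (see Report above)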
diff --git a/dev/models/EvoSplineRegressor_EvoLinear/index.html b/dev/models/EvoSplineRegressor_EvoLinear/index.html index a664d4618..8d623d266 100644 --- a/dev/models/EvoSplineRegressor_EvoLinear/index.html +++ b/dev/models/EvoSplineRegressor_EvoLinear/index.html @@ -1,3 +1,3 @@ -EvoSplineRegressor · MLJ

EvoSplineRegressor

EvoSplineRegressor(; kwargs...)

A model type for constructing an EvoSplineRegressor, based on EvoLinear.jl, and implementing both an internal API and the MLJ model interface.

Keyword arguments

  • loss=:mse: loss function to be minimised. Can be one of:

    • :mse
    • :logistic
    • :poisson
    • :gamma
    • :tweedie
  • nrounds=10: maximum number of training rounds.

  • eta=1: Learning rate. Typically in the range [1e-2, 1].

  • L1=0: Regularization penalty applied by shrinking a weight update to 0 if the update is < L1; no penalty if the update is > L1. Results in sparse feature selection. Typically in the [0, 1] range on normalized features.

  • L2=0: Regularization penalty applied to the square of the weight update value. Restricts large parameter values. Typically in the [0, 1] range on normalized features.

  • rng=123: random seed. Not used at the moment.

  • updater=:all: training method. Only :all is supported at the moment. Gradients for each feature are computed simultaneously, then bias is updated based on all features update.

  • device=:cpu: Only :cpu is supported at the moment.

Internal API

Do config = EvoSplineRegressor() to construct a hyper-parameter struct with default hyper-parameters. Provide keyword arguments as listed above to override defaults, for example:

EvoSplineRegressor(loss=:logistic, L1=1e-3, L2=1e-2, nrounds=100)

Training model

A model is built using fit:

config = EvoSplineRegressor()
-m = fit(config; x, y, w)

Inference

The fitted result is an EvoLinearModel, which acts as a prediction function when passed a feature matrix as argument.

preds = m(x)

MLJ Interface

From MLJ, the type can be imported using:

EvoSplineRegressor = @load EvoSplineRegressor pkg=EvoLinear

Do model = EvoSplineRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoSplineRegressor(loss=...).

Training model

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where:

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given

features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: the SplineModel object returned by the EvoSplineRegressor fitting algorithm.

Report

The fields of report(mach) are:

  • :coef: Vector of coefficients (βs) associated with each of the features.
  • :bias: Value of the bias.
  • :names: Names of each of the features.
+EvoSplineRegressor · MLJ

EvoSplineRegressor

EvoSplineRegressor(; kwargs...)

A model type for constructing an EvoSplineRegressor, based on EvoLinear.jl, and implementing both an internal API and the MLJ model interface.

Keyword arguments

  • loss=:mse: loss function to be minimised. Can be one of:

    • :mse
    • :logistic
    • :poisson
    • :gamma
    • :tweedie
  • nrounds=10: maximum number of training rounds.

  • eta=1: Learning rate. Typically in the range [1e-2, 1].

  • L1=0: Regularization penalty applied by shrinking a weight update to 0 if the update is < L1; no penalty if the update is > L1. Results in sparse feature selection. Typically in the [0, 1] range on normalized features.

  • L2=0: Regularization penalty applied to the square of the weight update value. Restricts large parameter values. Typically in the [0, 1] range on normalized features.

  • rng=123: random seed. Not used at the moment.

  • updater=:all: training method. Only :all is supported at the moment. Gradients for each feature are computed simultaneously, then bias is updated based on all features update.

  • device=:cpu: Only :cpu is supported at the moment.

Internal API

Do config = EvoSplineRegressor() to construct a hyper-parameter struct with default hyper-parameters. Provide keyword arguments as listed above to override defaults, for example:

EvoSplineRegressor(loss=:logistic, L1=1e-3, L2=1e-2, nrounds=100)

Training model

A model is built using fit:

config = EvoSplineRegressor()
+m = fit(config; x, y, w)

Inference

The fitted result is an EvoLinearModel, which acts as a prediction function when passed a feature matrix as argument.

preds = m(x)
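
A hedged sketch of this internal workflow on synthetic arrays, following the fit signature shown above with the optional weights w omitted (assumes EvoLinear is installed):

using EvoLinear
using Random

rng = Xoshiro(123)
x = randn(rng, 200, 4)                              ## feature matrix
y = x * [0.5, -1.0, 2.0, 0.0] .+ 0.1 .* randn(rng, 200)

config = EvoSplineRegressor(loss=:mse, nrounds=50)
m = EvoLinear.fit(config; x, y)
preds = m(x)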

MLJ Interface

From MLJ, the type can be imported using:

EvoSplineRegressor = @load EvoSplineRegressor pkg=EvoLinear

Do model = EvoSplineRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoSplineRegressor(loss=...).

Training model

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where:

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given

features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: the SplineModel object returned by the EvoSplineRegressor fitting algorithm.

Report

The fields of report(mach) are:

  • :coef: Vector of coefficients (βs) associated with each of the features.
  • :bias: Value of the bias.
  • :names: Names of each of the features.
diff --git a/dev/models/EvoTreeClassifier_EvoTrees/index.html b/dev/models/EvoTreeClassifier_EvoTrees/index.html index 46906fe6c..d6203b5fa 100644 --- a/dev/models/EvoTreeClassifier_EvoTrees/index.html +++ b/dev/models/EvoTreeClassifier_EvoTrees/index.html @@ -1,5 +1,5 @@ -EvoTreeClassifier · MLJ

EvoTreeClassifier

EvoTreeClassifier(;kwargs...)

A model type for constructing an EvoTreeClassifier, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeClassifier is used to perform multi-class classification, using cross-entropy loss.

Hyper-parameters

  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • tree_type="binary" Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until the maximum depth is reached, or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed to all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeClassifier(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict which returns a Matrix of size [nobs, K] where K is the number of classes:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeClassifier = @load EvoTreeClassifier pkg=EvoTrees

Do model = EvoTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeClassifier(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Multiclass or <:OrderedFactor; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic.
  • predict_mode(mach, Xnew): returns the mode of each of the predictions above.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
+EvoTreeClassifier · MLJ

EvoTreeClassifier

EvoTreeClassifier(;kwargs...)

A model type for constructing an EvoTreeClassifier, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeClassifier is used to perform multi-class classification, using cross-entropy loss.

Hyper-parameters

  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • tree_type="binary" Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until the maximum depth is reached, or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed to all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeClassifier(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict which returns a Matrix of size [nobs, K] where K is the number of classes:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeClassifier = @load EvoTreeClassifier pkg=EvoTrees

Do model = EvoTreeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeClassifier(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Multiclass or <:OrderedFactor; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic.
  • predict_mode(mach, Xnew): returns the mode of each of the predictions above.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
 using EvoTrees
 config = EvoTreeClassifier(max_depth=5, nbins=32, nrounds=100)
 nobs, nfeats = 1_000, 5
@@ -12,4 +12,4 @@
 X, y = @load_iris
 mach = machine(model, X, y) |> fit!
 preds = predict(mach, X)
-preds = predict_mode(mach, X)

See also EvoTrees.jl.

+preds = predict_mode(mach, X)

See also EvoTrees.jl.
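
As a follow-on to the example above, out-of-sample performance of the probabilistic predictions can be estimated with MLJ's evaluate! (a sketch, assuming MLJ and EvoTrees are installed):

using MLJ

EvoTreeClassifier = @load EvoTreeClassifier pkg=EvoTrees
X, y = @load_iris
mach = machine(EvoTreeClassifier(max_depth=4, nrounds=50), X, y)
evaluate!(mach, resampling=CV(nfolds=5, shuffle=true), measure=log_loss)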

diff --git a/dev/models/EvoTreeCount_EvoTrees/index.html b/dev/models/EvoTreeCount_EvoTrees/index.html index 6c1d3125a..80610c556 100644 --- a/dev/models/EvoTreeCount_EvoTrees/index.html +++ b/dev/models/EvoTreeCount_EvoTrees/index.html @@ -1,5 +1,5 @@ -EvoTreeCount · MLJ

EvoTreeCount

EvoTreeCount(;kwargs...)

A model type for constructing an EvoTreeCount, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeCount is used to perform Poisson probabilistic regression on a count target.

Hyper-parameters

  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing).

  • tree_type="binary" Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until the maximum depth is reached, or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed to all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeCount() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeCount(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict which returns a Vector of length nobs:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeCount = @load EvoTreeCount pkg=EvoTrees

Do model = EvoTreeCount() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeCount(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Count; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): returns a vector of Poisson distributions given features Xnew having the same scitype as X above. Predictions are probabilistic.

Specific metrics can also be predicted using:

  • predict_mean(mach, Xnew)
  • predict_mode(mach, Xnew)
  • predict_median(mach, Xnew)

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
+EvoTreeCount · MLJ

EvoTreeCount

EvoTreeCount(;kwargs...)

A model type for constructing an EvoTreeCount, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeCount is used to perform Poisson probabilistic regression on a count target.

Hyper-parameters

  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value is the applicable constraint (-1=decreasing, 0=none, 1=increasing); see the sketch after this list.

  • tree_type="binary" Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until the maximum depth is reached, or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed to all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).
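
For instance, a monotone-increasing effect of the second feature on predictions can be requested as follows (an illustrative sketch, assuming MLJ and EvoTrees are installed):

using MLJ

EvoTreeCount = @load EvoTreeCount pkg=EvoTrees
model = EvoTreeCount(max_depth=4, monotone_constraints=Dict(2 => 1))    ## feature 2: increasing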

Internal API

Do config = EvoTreeCount() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeCount(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict which returns a Vector of length nobs:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeCount = @load EvoTreeCount pkg=EvoTrees

Do model = EvoTreeCount() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeCount(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Count; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): returns a vector of Poisson distributions given features Xnew having the same scitype as X above. Predictions are probabilistic.

Specific metrics can also be predicted using:

  • predict_mean(mach, Xnew)
  • predict_mode(mach, Xnew)
  • predict_median(mach, Xnew)

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
 using EvoTrees
 config = EvoTreeCount(max_depth=5, nbins=32, nrounds=100)
 nobs, nfeats = 1_000, 5
@@ -15,4 +15,4 @@
 preds = predict_mean(mach, X)
 preds = predict_mode(mach, X)
 preds = predict_median(mach, X)
-

See also EvoTrees.jl.

+

See also EvoTrees.jl.

diff --git a/dev/models/EvoTreeGaussian_EvoTrees/index.html b/dev/models/EvoTreeGaussian_EvoTrees/index.html index 53e4dede6..7bfc281ee 100644 --- a/dev/models/EvoTreeGaussian_EvoTrees/index.html +++ b/dev/models/EvoTreeGaussian_EvoTrees/index.html @@ -1,5 +1,5 @@ -EvoTreeGaussian · MLJ

EvoTreeGaussian

EvoTreeGaussian(;kwargs...)

A model type for constructing an EvoTreeGaussian, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeGaussian is used to perform Gaussian probabilistic regression, fitting μ and σ parameters to maximize likelihood.

Hyper-parameters

  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=8.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value is the applicable constraint (-1=decreasing, 0=none, 1=increasing). Experimental feature: note that for Gaussian regression, constraints may not be enforced systematically.

  • tree_type="binary" Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until the maximum depth is reached, or until min weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed to all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeGaussian() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeGaussian(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict, which returns a Matrix of size [nobs, 2] whose two columns correspond to μ and σ respectively:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeGaussian = @load EvoTreeGaussian pkg=EvoTrees

Do model = EvoTreeGaussian() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeGaussian(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): returns a vector of Gaussian distributions given features Xnew having the same scitype as X above.

Predictions are probabilistic.

Specific metrics can also be predicted using:

  • predict_mean(mach, Xnew)
  • predict_mode(mach, Xnew)
  • predict_median(mach, Xnew)

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by the EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
+EvoTreeGaussian · MLJ

EvoTreeGaussian

EvoTreeGaussian(;kwargs...)

A model type for constructing an EvoTreeGaussian, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeGaussian is used to perform Gaussian probabilistic regression, fitting μ and σ parameters to maximize likelihood.

Hyper-parameters

  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta prior to being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=8.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). Experimental feature: note that for Gaussian regression, constraints may not be enforced systematically.

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until the maximum depth is reached, or until the minimum weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeGaussian() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeGaussian(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict, which returns a Matrix of size [nobs, 2] whose columns hold μ and σ respectively:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeGaussian = @load EvoTreeGaussian pkg=EvoTrees

Do model = EvoTreeGaussian() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeGaussian(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): returns a vector of Gaussian distributions given features Xnew having the same scitype as X above.

Predictions are probabilistic.

Specific metrics can also be predicted using:

  • predict_mean(mach, Xnew)
  • predict_mode(mach, Xnew)
  • predict_median(mach, Xnew)

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
 using EvoTrees
 params = EvoTreeGaussian(max_depth=5, nbins=32, nrounds=100)
 nobs, nfeats = 1_000, 5
@@ -14,4 +14,4 @@
 preds = predict(mach, X)
 preds = predict_mean(mach, X)
 preds = predict_mode(mach, X)
-preds = predict_median(mach, X)
+preds = predict_median(mach, X)
diff --git a/dev/models/EvoTreeMLE_EvoTrees/index.html b/dev/models/EvoTreeMLE_EvoTrees/index.html index 728d087d6..a4e608e40 100644 --- a/dev/models/EvoTreeMLE_EvoTrees/index.html +++ b/dev/models/EvoTreeMLE_EvoTrees/index.html @@ -1,5 +1,5 @@ -EvoTreeMLE · MLJ

EvoTreeMLE

EvoTreeMLE(;kwargs...)

A model type for constructing an EvoTreeMLE, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeMLE performs maximum likelihood estimation. The assumed distribution is specified through the loss kwarg. Both Gaussian and Logistic distributions are supported.

Hyper-parameters

loss=:gaussian: Loss to be minimized during training. One of:

  • :gaussian / :gaussian_mle
  • :logistic / :logistic_mle
  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta before being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=8.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). !Experimental feature: note that for MLE regression, constraints may not be enforced systematically.

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until the maximum depth is reached, or until the minimum weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeMLE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeMLE(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict, which returns a Matrix of size [nobs, nparams] whose columns hold μ & σ for the Normal/Gaussian loss, and μ & s for the Logistic loss.

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)
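As a sketch of this workflow with the logistic loss (synthetic data, illustrative sizes):

using EvoTrees
config = EvoTreeMLE(loss=:logistic_mle, max_depth=4, nrounds=50)
x_train, y_train = randn(200, 3), randn(200)
model = fit_evotree(config; x_train, y_train)
preds = EvoTrees.predict(model, x_train)   # 200×2 matrix: column 1 = μ, column 2 = s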

MLJ

From MLJ, the type can be imported using:

EvoTreeMLE = @load EvoTreeMLE pkg=EvoTrees

Do model = EvoTreeMLE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeMLE(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): returns a vector of Gaussian or Logistic distributions (according to provided loss) given features Xnew having the same scitype as X above.

Predictions are probabilistic.

Specific metrics can also be predicted using:

  • predict_mean(mach, Xnew)
  • predict_mode(mach, Xnew)
  • predict_median(mach, Xnew)

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
+EvoTreeMLE · MLJ

EvoTreeMLE

EvoTreeMLE(;kwargs...)

A model type for constructing an EvoTreeMLE, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface. EvoTreeMLE performs maximum likelihood estimation. The assumed distribution is specified through the loss kwarg. Both Gaussian and Logistic distributions are supported.

Hyper-parameters

loss=:gaussian: Loss to be minimized during training. One of:

  • :gaussian / :gaussian_mle
  • :logistic / :logistic_mle
  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta before being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=8.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). !Experimental feature: note that for MLE regression, constraints may not be enforced systematically.

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until the maximum depth is reached, or until the minimum weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeMLE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeMLE(max_depth=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict, which returns a Matrix of size [nobs, nparams] whose columns hold μ & σ for the Normal/Gaussian loss, and μ & s for the Logistic loss.

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ

From MLJ, the type can be imported using:

EvoTreeMLE = @load EvoTreeMLE pkg=EvoTrees

Do model = EvoTreeMLE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeMLE(loss=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): returns a vector of Gaussian or Logistic distributions (according to provided loss) given features Xnew having the same scitype as X above.

Predictions are probabilistic.

Specific metrics can also be predicted using:

  • predict_mean(mach, Xnew)
  • predict_mode(mach, Xnew)
  • predict_median(mach, Xnew)

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
 using EvoTrees
 config = EvoTreeMLE(max_depth=5, nbins=32, nrounds=100)
 nobs, nfeats = 1_000, 5
@@ -14,4 +14,4 @@
 preds = predict(mach, X)
 preds = predict_mean(mach, X)
 preds = predict_mode(mach, X)
-preds = predict_median(mach, X)
+preds = predict_median(mach, X)
diff --git a/dev/models/EvoTreeRegressor_EvoTrees/index.html b/dev/models/EvoTreeRegressor_EvoTrees/index.html index 0b6a3e0ac..50cf5b3b2 100644 --- a/dev/models/EvoTreeRegressor_EvoTrees/index.html +++ b/dev/models/EvoTreeRegressor_EvoTrees/index.html @@ -1,5 +1,5 @@ -EvoTreeRegressor · MLJ

EvoTreeRegressor

EvoTreeRegressor(;kwargs...)

A model type for constructing an EvoTreeRegressor, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface.

Hyper-parameters

  • loss=:mse: Loss to be minimized during training. One of:

    • :mse
    • :logloss
    • :gamma
    • :tweedie
    • :quantile
    • :l1
  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta before being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • alpha::T=0.5: Loss-specific parameter in the [0, 1] range (a configuration sketch follows this list):

    • :quantile: target quantile for the regression.
    • :l1: weighting of positive vs negative residuals; positive residuals are weighted by alpha and negative residuals by (1 - alpha).

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). Only the :linear, :logistic, :gamma and :tweedie losses are supported at the moment.

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until the maximum depth is reached, or until the minimum weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).
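As a sketch of the loss-specific alpha parameter described above, a 90th-percentile quantile regressor could be configured as follows (the values are illustrative):

using EvoTrees
config = EvoTreeRegressor(loss=:quantile, alpha=0.9)  # target the 0.9 quantile
# config = EvoTreeRegressor(loss=:l1, alpha=0.7)      # weight positive residuals by 0.7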

Internal API

Do config = EvoTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeRegressor(loss=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict which returns a Vector of length nobs:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ Interface

From MLJ, the type can be imported using:

EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees

Do model = EvoTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeRegressor(loss=...).

Training model

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
+EvoTreeRegressor · MLJ

EvoTreeRegressor

EvoTreeRegressor(;kwargs...)

A model type for constructing an EvoTreeRegressor, based on EvoTrees.jl, and implementing both an internal API and the MLJ model interface.

Hyper-parameters

  • loss=:mse: Loss to be minimized during training. One of:

    • :mse
    • :logloss
    • :gamma
    • :tweedie
    • :quantile
    • :l1
  • nrounds=100: Number of rounds. It corresponds to the number of trees that will be sequentially stacked. Must be >= 1.

  • eta=0.1: Learning rate. Each tree's raw predictions are scaled by eta before being added to the stack of predictions. Must be > 0. A lower eta results in slower learning, requiring a higher nrounds, but typically improves model performance.

  • L2::T=0.0: L2 regularization factor on aggregate gain. Must be >= 0. Higher L2 can result in a more robust model.

  • lambda::T=0.0: L2 regularization factor on individual gain. Must be >= 0. Higher lambda can result in a more robust model.

  • gamma::T=0.0: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model. Must be >= 0.

  • alpha::T=0.5: Loss-specific parameter in the [0, 1] range:

    • :quantile: target quantile for the regression.
    • :l1: weighting of positive vs negative residuals; positive residuals are weighted by alpha and negative residuals by (1 - alpha).

  • max_depth=6: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf. A complete tree of depth N contains 2^(N - 1) terminal leaves and 2^(N - 1) - 1 split nodes. Compute cost is proportional to 2^max_depth. Typical optimal values are in the 3 to 9 range.

  • min_weight=1.0: Minimum weight needed in a node to perform a split. Matches the number of observations by default or the sum of weights as provided by the weights vector. Must be > 0.

  • rowsample=1.0: Proportion of rows that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • colsample=1.0: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in ]0, 1].

  • nbins=64: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins. Should be between 2 and 255.

  • monotone_constraints=Dict{Int, Int}(): Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing). Only the :linear, :logistic, :gamma and :tweedie losses are supported at the moment.

  • tree_type="binary": Tree structure to be used. One of:

    • binary: Each node of a tree is grown independently. Trees are built depthwise until the maximum depth is reached, or until the minimum weight or gain (see gamma) stops further node splits.
    • oblivious: A common splitting condition is imposed on all nodes of a given depth.
  • rng=123: Either an integer used as a seed to the random number generator or an actual random number generator (::Random.AbstractRNG).

Internal API

Do config = EvoTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeRegressor(loss=...).

Training model

A model is built using fit_evotree:

model = fit_evotree(config; x_train, y_train, kwargs...)

Inference

Predictions are obtained using predict which returns a Vector of length nobs:

EvoTrees.predict(model, X)

Alternatively, models act as a functor, returning predictions when called as a function with features as argument:

model(X)

MLJ Interface

From MLJ, the type can be imported using:

EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees

Do model = EvoTreeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in EvoTreeRegressor(loss=...).

Training model

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X, y) where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • :fitresult: The GBTree object returned by EvoTrees.jl fitting algorithm.

Report

The fields of report(mach) are:

  • :features: The names of the features encountered in training.

Examples

## Internal API
 using EvoTrees
 config = EvoTreeRegressor(max_depth=5, nbins=32, nrounds=100)
 nobs, nfeats = 1_000, 5
@@ -11,4 +11,4 @@
 model = EvoTreeRegressor(max_depth=5, nbins=32, nrounds=100)
 X, y = @load_boston
 mach = machine(model, X, y) |> fit!
-preds = predict(mach, X)
+preds = predict(mach, X)
diff --git a/dev/models/ExtraTreesClassifier_MLJScikitLearnInterface/index.html b/dev/models/ExtraTreesClassifier_MLJScikitLearnInterface/index.html index 73d414b73..f8246e190 100644 --- a/dev/models/ExtraTreesClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/ExtraTreesClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -ExtraTreesClassifier · MLJ

ExtraTreesClassifier

ExtraTreesClassifier

A model type for constructing an extra trees classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ExtraTreesClassifier = @load ExtraTreesClassifier pkg=MLJScikitLearnInterface

Do model = ExtraTreesClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ExtraTreesClassifier(n_estimators=...).

The extra trees classifier fits a number of randomized decision trees on various sub-samples of the dataset and uses averaging to improve predictive accuracy and to control over-fitting.
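A minimal usage sketch follows; the dataset, hyper-parameter value, and the assumption that the wrapped classifier yields probabilistic predictions in the usual MLJ workflow are all illustrative.

using MLJ
ExtraTreesClassifier = @load ExtraTreesClassifier pkg=MLJScikitLearnInterface
X, y = @load_iris
mach = machine(ExtraTreesClassifier(n_estimators=100), X, y) |> fit!
yhat   = predict(mach, X)       # probabilistic predictions (assumed)
labels = predict_mode(mach, X)  # most likely class per observation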

+ExtraTreesClassifier · MLJ

ExtraTreesClassifier

ExtraTreesClassifier

A model type for constructing an extra trees classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ExtraTreesClassifier = @load ExtraTreesClassifier pkg=MLJScikitLearnInterface

Do model = ExtraTreesClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ExtraTreesClassifier(n_estimators=...).

The extra trees classifier fits a number of randomized decision trees on various sub-samples of the dataset and uses averaging to improve predictive accuracy and to control over-fitting.

diff --git a/dev/models/ExtraTreesRegressor_MLJScikitLearnInterface/index.html b/dev/models/ExtraTreesRegressor_MLJScikitLearnInterface/index.html index 5a882ae2a..42d147fb8 100644 --- a/dev/models/ExtraTreesRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/ExtraTreesRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -ExtraTreesRegressor · MLJ

ExtraTreesRegressor

ExtraTreesRegressor

A model type for constructing an extra trees regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ExtraTreesRegressor = @load ExtraTreesRegressor pkg=MLJScikitLearnInterface

Do model = ExtraTreesRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ExtraTreesRegressor(n_estimators=...).

The extra trees regressor fits a number of randomized decision trees on various sub-samples of the dataset and uses averaging to improve predictive accuracy and to control over-fitting.
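Analogously to the classifier, a hedged usage sketch (synthetic data and settings are illustrative only):

using MLJ
ExtraTreesRegressor = @load ExtraTreesRegressor pkg=MLJScikitLearnInterface
X, y = make_regression(100, 4)   # synthetic regression data provided by MLJ
mach = machine(ExtraTreesRegressor(n_estimators=200), X, y) |> fit!
yhat = predict(mach, X)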

+ExtraTreesRegressor · MLJ

ExtraTreesRegressor

ExtraTreesRegressor

A model type for constructing an extra trees regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ExtraTreesRegressor = @load ExtraTreesRegressor pkg=MLJScikitLearnInterface

Do model = ExtraTreesRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ExtraTreesRegressor(n_estimators=...).

The extra trees regressor fits a number of randomized decision trees on various sub-samples of the dataset and uses averaging to improve predictive accuracy and to control over-fitting.

diff --git a/dev/models/FactorAnalysis_MultivariateStats/index.html b/dev/models/FactorAnalysis_MultivariateStats/index.html index 496add57b..471c2c17b 100644 --- a/dev/models/FactorAnalysis_MultivariateStats/index.html +++ b/dev/models/FactorAnalysis_MultivariateStats/index.html @@ -1,5 +1,5 @@ -FactorAnalysis · MLJ

FactorAnalysis

FactorAnalysis

A model type for constructing a factor analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FactorAnalysis = @load FactorAnalysis pkg=MultivariateStats

Do model = FactorAnalysis() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FactorAnalysis(method=...).

Factor analysis is a linear-Gaussian latent variable model that is closely related to probabilistic PCA. In contrast to the probabilistic PCA model, the covariance of the conditional distribution of the observed variables given the latent variables is diagonal rather than isotropic.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • method::Symbol=:cm: Method to use to solve the problem, one of :ml, :em, :bayes.
  • maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.
  • maxiter::Int=1000: Maximum number of iterations.
  • tol::Real=1e-6: Convergence tolerance.
  • eta::Real=tol: Variance lower bound.
  • mean::Union{Nothing, Real, Vector{Float64}}=nothing: If nothing, centering will be computed and applied; if set to 0 no centering is applied (data is assumed pre-centered); if a vector, the centering is done with that vector.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.
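As a sketch of this pair of operations (assuming a machine mach already trained on a table X of Continuous columns, as above):

Xsmall  = transform(mach, X)               # projection onto the factors, outdim columns
Xapprox = inverse_transform(mach, Xsmall)  # approximate reconstruction in the original feature space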

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively. Each column of the projection matrix corresponds to a factor.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data (number of factors).
  • variance: The variance of the factors.
  • covariance_matrix: The estimated covariance matrix.
  • mean: The mean of the untransformed training data, of length indim.
  • loadings: The factor loadings. A matrix of size (indim, outdim) where indim and outdim are as defined above.

Examples

using MLJ
+FactorAnalysis · MLJ

FactorAnalysis

FactorAnalysis

A model type for constructing a factor analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FactorAnalysis = @load FactorAnalysis pkg=MultivariateStats

Do model = FactorAnalysis() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FactorAnalysis(method=...).

Factor analysis is a linear-Gaussian latent variable model that is closely related to probabilistic PCA. In contrast to the probabilistic PCA model, the covariance of the conditional distribution of the observed variables given the latent variables is diagonal rather than isotropic.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • method::Symbol=:cm: Method to use to solve the problem, one of :ml, :em, :bayes.
  • maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.
  • maxiter::Int=1000: Maximum number of iterations.
  • tol::Real=1e-6: Convergence tolerance.
  • eta::Real=tol: Variance lower bound.
  • mean::Union{Nothing, Real, Vector{Float64}}=nothing: If nothing, centering will be computed and applied; if set to 0 no centering is applied (data is assumed pre-centered); if a vector, the centering is done with that vector.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively. Each column of the projection matrix corresponds to a factor.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data (number of factors).
  • variance: The variance of the factors.
  • covariance_matrix: The estimated covariance matrix.
  • mean: The mean of the untransformed training data, of length indim.
  • loadings: The factor loadings. A matrix of size (indim, outdim) where indim and outdim are as defined above.

Examples

using MLJ
 
 FactorAnalysis = @load FactorAnalysis pkg=MultivariateStats
 
@@ -8,4 +8,4 @@
 model = FactorAnalysis(maxoutdim=2)
 mach = machine(model, X) |> fit!
 
-Xproj = transform(mach, X)

See also KernelPCA, ICA, PPCA, PCA

+Xproj = transform(mach, X)

See also KernelPCA, ICA, PPCA, PCA

diff --git a/dev/models/FeatureAgglomeration_MLJScikitLearnInterface/index.html b/dev/models/FeatureAgglomeration_MLJScikitLearnInterface/index.html index b69eca028..03e5dddc4 100644 --- a/dev/models/FeatureAgglomeration_MLJScikitLearnInterface/index.html +++ b/dev/models/FeatureAgglomeration_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -FeatureAgglomeration · MLJ

FeatureAgglomeration

FeatureAgglomeration

A model type for constructing a feature agglomeration, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FeatureAgglomeration = @load FeatureAgglomeration pkg=MLJScikitLearnInterface

Do model = FeatureAgglomeration() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureAgglomeration(n_clusters=...).

Similar to AgglomerativeClustering, but recursively merges features instead of samples.
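Assuming this transformer follows the usual MLJ unsupervised workflow (machine, fit!, transform), a hedged sketch of its use might be:

using MLJ
FeatureAgglomeration = @load FeatureAgglomeration pkg=MLJScikitLearnInterface
model = FeatureAgglomeration(n_clusters=2)   # merge the original features into 2 cluster-features
mach = machine(model, X) |> fit!
Xreduced = transform(mach, X)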

+FeatureAgglomeration · MLJ

FeatureAgglomeration

FeatureAgglomeration

A model type for constructing a feature agglomeration, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FeatureAgglomeration = @load FeatureAgglomeration pkg=MLJScikitLearnInterface

Do model = FeatureAgglomeration() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureAgglomeration(n_clusters=...).

Similar to AgglomerativeClustering, but recursively merges features instead of samples.

diff --git a/dev/models/FeatureSelector_FeatureSelection/index.html b/dev/models/FeatureSelector_FeatureSelection/index.html index 4dd225235..f764449ed 100644 --- a/dev/models/FeatureSelector_FeatureSelection/index.html +++ b/dev/models/FeatureSelector_FeatureSelection/index.html @@ -1,5 +1,5 @@ -FeatureSelector · MLJ

FeatureSelector

FeatureSelector

A model type for constructing a feature selector, based on FeatureSelection.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FeatureSelector = @load FeatureSelector pkg=FeatureSelection

Do model = FeatureSelector() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureSelector(features=...).

Use this model to select features (columns) of a table, usually as part of a model Pipeline.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features, where "table" is in the sense of Tables.jl

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated:

    • [] (empty, the default): filter out all features (columns) which were not encountered in training
    • non-empty vector of feature names (symbols): keep only the specified features (ignore=false) or keep only unspecified features (ignore=true)
    • function or other callable: keep a feature if the callable returns true on its name. For example, specifying FeatureSelector(features = name -> name in [:x1, :x3], ignore = true) has the same effect as FeatureSelector(features = [:x1, :x3], ignore = true), namely to select all features, with the exception of :x1 and :x3.
  • ignore: whether to ignore or keep specified features, as explained above
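For instance, the callable form described above could be used to drop two named columns (the column names are illustrative):

selector = FeatureSelector(features = name -> name in [:x1, :x3], ignore = true)
mach = machine(selector, X) |> fit!
transform(mach, X)   # X with columns :x1 and :x3 removed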

Operations

  • transform(mach, Xnew): select features from the table Xnew as specified by the model, taking features seen during training into account, if relevant

Fitted parameters

The fields of fitted_params(mach) are:

  • features_to_keep: the features that will be selected

Example

using MLJ
+FeatureSelector · MLJ

FeatureSelector

FeatureSelector

A model type for constructing a feature selector, based on FeatureSelection.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FeatureSelector = @load FeatureSelector pkg=FeatureSelection

Do model = FeatureSelector() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureSelector(features=...).

Use this model to select features (columns) of a table, usually as part of a model Pipeline.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features, where "table" is in the sense of Tables.jl

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated:

    • [] (empty, the default): filter out all features (columns) which were not encountered in training
    • non-empty vector of feature names (symbols): keep only the specified features (ignore=false) or keep only unspecified features (ignore=true)
    • function or other callable: keep a feature if the callable returns true on its name. For example, specifying FeatureSelector(features = name -> name in [:x1, :x3], ignore = true) has the same effect as FeatureSelector(features = [:x1, :x3], ignore = true), namely to select all features, with the exception of :x1 and :x3.
  • ignore: whether to ignore or keep specified features, as explained above

Operations

  • transform(mach, Xnew): select features from the table Xnew as specified by the model, taking features seen during training into account, if relevant

Fitted parameters

The fields of fitted_params(mach) are:

  • features_to_keep: the features that will be selected

Example

using MLJ
 
 X = (ordinal1 = [1, 2, 3],
      ordinal2 = coerce(["x", "y", "x"], OrderedFactor),
@@ -14,4 +14,4 @@
  ordinal2 = CategoricalValue{Symbol,UInt32}["x", "y", "x"],
  ordinal4 = [-20.0, -30.0, -40.0],
  nominal = CategoricalValue{String,UInt32}["Your father", "he", "is"],)
-
+
diff --git a/dev/models/FillImputer_MLJModels/index.html b/dev/models/FillImputer_MLJModels/index.html index dc306f6b0..9dc56dafe 100644 --- a/dev/models/FillImputer_MLJModels/index.html +++ b/dev/models/FillImputer_MLJModels/index.html @@ -1,5 +1,5 @@ -FillImputer · MLJ

FillImputer

FillImputer

A model type for constructing a fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FillImputer = @load FillImputer pkg=MLJModels

Do model = FillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FillImputer(features=...).

Use this model to impute missing values in tabular data. A fixed "filler" value is learned from the training data, one for each column of the table.

For imputing missing values in a vector, use UnivariateFillImputer instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have element scitypes Union{Missing, T}, where T is a subtype of Continuous, Multiclass, OrderedFactor or Count. Check scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: a vector of names of features (symbols) for which imputation is to be attempted; default is empty, which is interpreted as "impute all".
  • continuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values
  • count_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values
  • finite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values
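As a hedged sketch, the fill functions above can be customised, for example to impute the mean rather than the median for continuous columns (the anonymous helper is illustrative only):

using Statistics
imputer = FillImputer(continuous_fill = v -> mean(skipmissing(v)))
mach = machine(imputer, X) |> fit!
transform(mach, X)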

Operations

  • transform(mach, Xnew): return Xnew with missing values imputed with the fill values learned when fitting mach

Fitted parameters

The fields of fitted_params(mach) are:

  • features_seen_in_fit: the names of features (columns) encountered during training
  • univariate_transformer: the univariate model applied to determine the fillers (its fields contain the functions defining the filler computations)
  • filler_given_feature: dictionary of filler values, keyed on feature (column) names

Examples

using MLJ
+FillImputer · MLJ

FillImputer

FillImputer

A model type for constructing a fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FillImputer = @load FillImputer pkg=MLJModels

Do model = FillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FillImputer(features=...).

Use this model to impute missing values in tabular data. A fixed "filler" value is learned from the training data, one for each column of the table.

For imputing missing values in a vector, use UnivariateFillImputer instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have element scitypes Union{Missing, T}, where T is a subtype of Continuous, Multiclass, OrderedFactor or Count. Check scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: a vector of names of features (symbols) for which imputation is to be attempted; default is empty, which is interpreted as "impute all".
  • continuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values
  • count_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values
  • finite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values

Operations

  • transform(mach, Xnew): return Xnew with missing values imputed with the fill values learned when fitting mach

Fitted parameters

The fields of fitted_params(mach) are:

  • features_seen_in_fit: the names of features (columns) encountered during training
  • univariate_transformer: the univariate model applied to determine the fillers (its fields contain the functions defining the filler computations)
  • filler_given_feature: dictionary of filler values, keyed on feature (column) names

Examples

using MLJ
 imputer = FillImputer()
 
 X = (a = [1.0, 2.0, missing, 3.0, missing],
@@ -31,4 +31,4 @@
 julia> transform(mach, X)
 (a = [1.0, 2.0, 2.0, 3.0, 2.0],
  b = CategoricalValue{String, UInt32}["y", "n", "y", "y", "y"],
- c = [1, 1, 2, 2, 3],)

See also UnivariateFillImputer.

+ c = [1, 1, 2, 2, 3],)

See also UnivariateFillImputer.

diff --git a/dev/models/GMMDetector_OutlierDetectionPython/index.html b/dev/models/GMMDetector_OutlierDetectionPython/index.html index e72bf8181..9177dc606 100644 --- a/dev/models/GMMDetector_OutlierDetectionPython/index.html +++ b/dev/models/GMMDetector_OutlierDetectionPython/index.html @@ -1,5 +1,5 @@ -GMMDetector · MLJ

GMMDetector

GMMDetector(n_components=1,
+GMMDetector · MLJ
+               warm_start=False)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.gmm

diff --git a/dev/models/GaussianMixtureClusterer_BetaML/index.html b/dev/models/GaussianMixtureClusterer_BetaML/index.html index 700daab8e..b9521af96 100644 --- a/dev/models/GaussianMixtureClusterer_BetaML/index.html +++ b/dev/models/GaussianMixtureClusterer_BetaML/index.html @@ -1,5 +1,5 @@ -GaussianMixtureClusterer · MLJ

GaussianMixtureClusterer

mutable struct GaussianMixtureClusterer <: MLJModelInterface.Unsupervised

An Expectation-Maximisation clustering algorithm with customisable mixtures, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::AbstractVector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the ?GMM module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply in terms of a type, in which case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set different than minimum_variance (see notes).

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • maximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. ∞]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

Example:


+GaussianMixtureClusterer · MLJ

GaussianMixtureClusterer

mutable struct GaussianMixtureClusterer <: MLJModelInterface.Unsupervised

An Expectation-Maximisation clustering algorithm with customisable mixtures, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::AbstractVector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the ?GMM module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply in terms of a type, in which case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set different than minimum_variance (see notes).

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • maximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. ∞]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

Example:


 julia> using MLJ
 
 julia> X, y        = @load_iris;
@@ -34,4 +34,4 @@
  ⋮
  UnivariateFinite{Multiclass{3}}(1=>5.39e-25, 2=>0.0167, 3=>0.983)
  UnivariateFinite{Multiclass{3}}(1=>7.5e-29, 2=>0.000106, 3=>1.0)
- UnivariateFinite{Multiclass{3}}(1=>1.6e-20, 2=>0.594, 3=>0.406)
+ UnivariateFinite{Multiclass{3}}(1=>1.6e-20, 2=>0.594, 3=>0.406)
diff --git a/dev/models/GaussianMixtureImputer_BetaML/index.html b/dev/models/GaussianMixtureImputer_BetaML/index.html index 398f39060..565758fac 100644 --- a/dev/models/GaussianMixtureImputer_BetaML/index.html +++ b/dev/models/GaussianMixtureImputer_BetaML/index.html @@ -1,5 +1,5 @@ -GaussianMixtureImputer · MLJ

GaussianMixtureImputer

mutable struct GaussianMixtureImputer <: MLJModelInterface.Unsupervised

Impute missing values using a probabilistic approach (Gaussian Mixture Models) fitted using the Expectation-Maximisation algorithm, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the ?GMM module in BetaML). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply in terms of a type, in which case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported; the currently implemented mixtures are SphericalGaussian, DiagonalGaussian and FullGaussian. [def: DiagonalGaussian]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set different than minimum_variance.

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example :

julia> using MLJ
+GaussianMixtureImputer · MLJ

GaussianMixtureImputer

mutable struct GaussianMixtureImputer <: MLJModelInterface.Unsupervised

Impute missing values using a probabilistic approach (Gaussian Mixture Models) fitted using the Expectation-Maximisation algorithm, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the ?GMM module in BetaML). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply in terms of a type, in which case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported; the currently implemented mixtures are SphericalGaussian, DiagonalGaussian and FullGaussian. [def: DiagonalGaussian]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set different than minimum_variance.

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example :

julia> using MLJ
 
 julia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;
 
@@ -33,4 +33,4 @@
  2.51842  15.1747
  3.3      38.0
  2.47412  -2.3
- 5.2      -2.4
+ 5.2 -2.4
diff --git a/dev/models/GaussianMixtureRegressor_BetaML/index.html b/dev/models/GaussianMixtureRegressor_BetaML/index.html index bc0c0f4fd..f608de146 100644 --- a/dev/models/GaussianMixtureRegressor_BetaML/index.html +++ b/dev/models/GaussianMixtureRegressor_BetaML/index.html @@ -1,5 +1,5 @@ -GaussianMixtureRegressor · MLJ

GaussianMixtureRegressor

mutable struct GaussianMixtureRegressor <: MLJModelInterface.Deterministic

A non-linear regressor derived from fitting the data on a probabilistic model (Gaussian Mixture Model). Relatively fast but generally not very precise, except for data with a structure matching the chosen underlying mixture.

This is the single-target version of the model. If you want to predict several labels (y) at once, use the MLJ model MultitargetGaussianMixtureRegressor.

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the GMM module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply in terms of a type; in this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set to a value different from minimum_variance (see notes).

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • maximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. ∞]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+GaussianMixtureRegressor · MLJ

GaussianMixtureRegressor

mutable struct GaussianMixtureRegressor <: MLJModelInterface.Deterministic

A non-linear regressor derived from fitting the data on a probabilistic model (Gaussian Mixture Model). Relatively fast but generally not very precise, except for data with a structure matching the chosen underlying mixture.

This is the single-target version of the model. If you want to predict several labels (y) at once, use the MLJ model MultitargetGaussianMixtureRegressor.

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the GMM module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply in terms of a type; in this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set to a value different from minimum_variance (see notes).

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • maximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. ∞]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
 
 julia> X, y      = @load_boston;
 
@@ -30,4 +30,4 @@
  24.70344283512716
   ⋮
  17.172486989759676
- 17.172486989759644
+ 17.172486989759644
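A compact sketch of the supervised workflow for this model, assuming the same @load_boston data used in the example above; n_classes=5 is an arbitrary illustrative choice.

using MLJ
GaussianMixtureRegressor = @load GaussianMixtureRegressor pkg=BetaML
X, y = @load_boston
model = GaussianMixtureRegressor(n_classes=5)
mach = fit!(machine(model, X, y))
yhat = predict(mach, X)    ## point predictions (the model is Deterministic)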
diff --git a/dev/models/GaussianNBClassifier_MLJScikitLearnInterface/index.html b/dev/models/GaussianNBClassifier_MLJScikitLearnInterface/index.html index 245276544..03d27ce7f 100644 --- a/dev/models/GaussianNBClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/GaussianNBClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -GaussianNBClassifier · MLJ

GaussianNBClassifier

GaussianNBClassifier

A model type for constructing a Gaussian naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianNBClassifier = @load GaussianNBClassifier pkg=MLJScikitLearnInterface

Do model = GaussianNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GaussianNBClassifier(priors=...).

Hyper-parameters

  • priors = nothing
  • var_smoothing = 1.0e-9
+GaussianNBClassifier · MLJ

GaussianNBClassifier

GaussianNBClassifier

A model type for constructing a Gaussian naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianNBClassifier = @load GaussianNBClassifier pkg=MLJScikitLearnInterface

Do model = GaussianNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GaussianNBClassifier(priors=...).

Hyper-parameters

  • priors = nothing
  • var_smoothing = 1.0e-9
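A minimal usage sketch, assuming the scikit-learn backend is installed; the iris data is used only for illustration.

using MLJ
GaussianNBClassifier = @load GaussianNBClassifier pkg=MLJScikitLearnInterface
X, y = @load_iris
mach = fit!(machine(GaussianNBClassifier(), X, y))
predict(mach, X)        ## probabilistic predictions
predict_mode(mach, X)   ## point predictions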
diff --git a/dev/models/GaussianNBClassifier_NaiveBayes/index.html b/dev/models/GaussianNBClassifier_NaiveBayes/index.html index ddb403547..41150841e 100644 --- a/dev/models/GaussianNBClassifier_NaiveBayes/index.html +++ b/dev/models/GaussianNBClassifier_NaiveBayes/index.html @@ -1,5 +1,5 @@ -GaussianNBClassifier · MLJ

GaussianNBClassifier

GaussianNBClassifier

A model type for constructing a Gaussian naive Bayes classifier, based on NaiveBayes.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianNBClassifier = @load GaussianNBClassifier pkg=NaiveBayes

Do model = GaussianNBClassifier() to construct an instance with default hyper-parameters.

Given each class taken on by the target variable y, it is supposed that the conditional probability distribution for the input variables X is a multivariate Gaussian. The mean and covariance of these Gaussian distributions are estimated using maximum likelihood, and a probability distribution for y given X is deduced by applying Bayes' rule. The required marginal for y is estimated using class frequency in the training data.

Important. The name "naive Bayes classifier" is perhaps misleading. Since we are learning the full multivariate Gaussian distributions for X given y, we are not applying the usual naive Bayes independence condition, which would amount to forcing the covariance matrix to be diagonal.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with schema(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic.
  • predict_mode(mach, Xnew): Return the mode of above predictions.

Fitted parameters

The fields of fitted_params(mach) are:

  • c_counts: A dictionary containing the observed count of each input class.

  • c_stats: A dictionary containing observed statistics on each input class. Each class is represented by a DataStats object, with the following fields:

    • n_vars: The number of variables used to describe the class's behavior.
    • n_obs: The number of times the class is observed.
    • obs_axis: The axis along which the observations were computed.
  • gaussians: A per class dictionary of Gaussians, each representing the distribution of the class. Represented with type Distributions.MvNormal from the Distributions.jl package.

  • n_obs: The total number of observations in the training data.

Examples

using MLJ
+GaussianNBClassifier · MLJ

GaussianNBClassifier

GaussianNBClassifier

A model type for constructing a Gaussian naive Bayes classifier, based on NaiveBayes.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianNBClassifier = @load GaussianNBClassifier pkg=NaiveBayes

Do model = GaussianNBClassifier() to construct an instance with default hyper-parameters.

Given each class taken on by the target variable y, it is supposed that the conditional probability distribution for the input variables X is a multivariate Gaussian. The mean and covariance of these Gaussian distributions are estimated using maximum likelihood, and a probability distribution for y given X is deduced by applying Bayes' rule. The required marginal for y is estimated using class frequency in the training data.

Important. The name "naive Bayes classifier" is perhaps misleading. Since we are learning the full multivariate Gaussian distributions for X given y, we are not applying the usual naive Bayes independence condition, which would amount to forcing the covariance matrix to be diagonal.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with schema(y)

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic.
  • predict_mode(mach, Xnew): Return the mode of above predictions.

Fitted parameters

The fields of fitted_params(mach) are:

  • c_counts: A dictionary containing the observed count of each input class.

  • c_stats: A dictionary containing observed statistics on each input class. Each class is represented by a DataStats object, with the following fields:

    • n_vars: The number of variables used to describe the class's behavior.
    • n_obs: The number of times the class is observed.
    • obs_axis: The axis along which the observations were computed.
  • gaussians: A per class dictionary of Gaussians, each representing the distribution of the class. Represented with type Distributions.MvNormal from the Distributions.jl package.

  • n_obs: The total number of observations in the training data.

Examples

using MLJ
 GaussianNB = @load GaussianNBClassifier pkg=NaiveBayes
 
 X, y = @load_iris
@@ -10,4 +10,4 @@
 
 preds = predict(mach, X) ## probabilistic predictions
 preds[1]
-predict_mode(mach, X) ## point predictions

See also MultinomialNBClassifier

+predict_mode(mach, X) ## point predictions
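Continuing the example, a hedged sketch of how the fitted parameters listed above might be inspected:

fp = fitted_params(mach)
keys(fp.c_counts)    ## the observed classes
fp.gaussians         ## per-class Distributions.MvNormal fits
fp.n_obs             ## total number of training observations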

See also MultinomialNBClassifier

diff --git a/dev/models/GaussianProcessClassifier_MLJScikitLearnInterface/index.html b/dev/models/GaussianProcessClassifier_MLJScikitLearnInterface/index.html index f11771932..082dc1d65 100644 --- a/dev/models/GaussianProcessClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/GaussianProcessClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -GaussianProcessClassifier · MLJ

GaussianProcessClassifier

GaussianProcessClassifier

A model type for constructing a Gaussian process classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianProcessClassifier = @load GaussianProcessClassifier pkg=MLJScikitLearnInterface

Do model = GaussianProcessClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GaussianProcessClassifier(kernel=...).

Hyper-parameters

  • kernel = nothing
  • optimizer = fmin_l_bfgs_b
  • n_restarts_optimizer = 0
  • copy_X_train = true
  • random_state = nothing
  • max_iter_predict = 100
  • warm_start = false
  • multi_class = one_vs_rest
+GaussianProcessClassifier · MLJ

GaussianProcessClassifier

GaussianProcessClassifier

A model type for constructing a Gaussian process classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianProcessClassifier = @load GaussianProcessClassifier pkg=MLJScikitLearnInterface

Do model = GaussianProcessClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GaussianProcessClassifier(kernel=...).

Hyper-parameters

  • kernel = nothing
  • optimizer = fmin_l_bfgs_b
  • n_restarts_optimizer = 0
  • copy_X_train = true
  • random_state = nothing
  • max_iter_predict = 100
  • warm_start = false
  • multi_class = one_vs_rest
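A minimal construction-and-fit sketch, assuming the scikit-learn backend is available; max_iter_predict=200 is an arbitrary override of the default listed above.

using MLJ
GaussianProcessClassifier = @load GaussianProcessClassifier pkg=MLJScikitLearnInterface
X, y = @load_iris
model = GaussianProcessClassifier(max_iter_predict=200)
mach = fit!(machine(model, X, y))
predict_mode(mach, X)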
diff --git a/dev/models/GaussianProcessRegressor_MLJScikitLearnInterface/index.html b/dev/models/GaussianProcessRegressor_MLJScikitLearnInterface/index.html index 6ee869238..9148d84da 100644 --- a/dev/models/GaussianProcessRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/GaussianProcessRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -GaussianProcessRegressor · MLJ

GaussianProcessRegressor

GaussianProcessRegressor

A model type for constructing a Gaussian process regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianProcessRegressor = @load GaussianProcessRegressor pkg=MLJScikitLearnInterface

Do model = GaussianProcessRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GaussianProcessRegressor(kernel=...).

Hyper-parameters

  • kernel = nothing
  • alpha = 1.0e-10
  • optimizer = fmin_l_bfgs_b
  • n_restarts_optimizer = 0
  • normalize_y = false
  • copy_X_train = true
  • random_state = nothing
+GaussianProcessRegressor · MLJ

GaussianProcessRegressor

GaussianProcessRegressor

A model type for constructing a Gaussian process regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GaussianProcessRegressor = @load GaussianProcessRegressor pkg=MLJScikitLearnInterface

Do model = GaussianProcessRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GaussianProcessRegressor(kernel=...).

Hyper-parameters

  • kernel = nothing
  • alpha = 1.0e-10
  • optimizer = fmin_l_bfgs_b
  • n_restarts_optimizer = 0
  • normalize_y = false
  • copy_X_train = true
  • random_state = nothing
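A minimal sketch on synthetic data from make_regression, assuming the scikit-learn backend is available; the alpha value is an illustrative override of the default above.

using MLJ
GaussianProcessRegressor = @load GaussianProcessRegressor pkg=MLJScikitLearnInterface
X, y = make_regression(100, 3)
model = GaussianProcessRegressor(alpha=1.0e-8)
mach = fit!(machine(model, X, y))
predict(mach, X)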
diff --git a/dev/models/GeneralImputer_BetaML/index.html b/dev/models/GeneralImputer_BetaML/index.html index d1d4c9ae9..3809847a8 100644 --- a/dev/models/GeneralImputer_BetaML/index.html +++ b/dev/models/GeneralImputer_BetaML/index.html @@ -1,5 +1,5 @@ -GeneralImputer · MLJ

GeneralImputer

mutable struct GeneralImputer <: MLJModelInterface.Unsupervised

Impute missing values using arbitrary learning models, from the Beta Machine Learning Toolkit (BetaML).

Impute missing values using a vector (one per column) of arbitrary learning models (classifiers/regressors, not necessarily from BetaML) that implement the interface m = Model([options]), train!(m,X,Y) and predict(m,X).

Hyperparameters:

  • cols_to_impute::Union{String, Vector{Int64}}: Columns in the matrix for which to create an imputation model, i.e. to impute. It can be a vector of column IDs (positions), or the keywords "auto" (default) or "all". With "auto" the model automatically detects the columns with missing data and imputes only them. You may manually specify the columns, or use "all" if you want an imputation model to be created for those columns during training even when all training data are non-missing, so that the trained model can then be applied to further data with possibly missing values.
  • estimator::Any: An estimator model (regressor or classifier), possibly with its options (hyper-parameters), to be used to impute the various columns of the matrix. It can also be a cols_to_impute-length vector of different estimators, to use a different estimator for each column (dimension) to impute, for example when some columns are categorical (and hence require a classifier) and others are numerical (hence requiring a regressor). [default: nothing, i.e. use BetaML random forests, handling classification and regression jobs automatically].
  • missing_supported::Union{Bool, Vector{Bool}}: Whether the estimator(s) used to predict the missing data themselves support missing data in the training features (X). If not, when the model for a certain dimension is fitted, dimensions with missing data in the same rows as those where imputation is needed are dropped, and then only non-missing rows in the other remaining dimensions are considered. It can be a vector of boolean values to specify this property for each individual estimator, or a single boolean value to apply to all the estimators [default: false]
  • fit_function::Union{Function, Vector{Function}}: The function used by the estimator(s) to fit the model. It should take as first argument the model itself, as second argument a matrix representing the features, and as third argument a vector representing the labels. This parameter is mandatory for non-BetaML estimators and can be a single value or a vector (one per estimator) in case different estimator packages are used. [default: BetaML.fit!]
  • predict_function::Union{Function, Vector{Function}}: The function used by the estimator(s) to predict the labels. It should take as first argument the model itself and as second argument a matrix representing the features. This parameter is mandatory for non-BetaML estimators and can be a single value or a vector (one per estimator) in case different estimator packages are used. [default: BetaML.predict]
  • recursive_passages::Int64: Defines the number of times to go through the various columns to impute their data. Useful when there are data to impute in multiple columns. The order of the first passage is given by the decreasing number of missing values per column; the other passages are random [default: 1].
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]. Note that this influences only the GeneralImputer code itself; the individual estimators may have their own rng (or similar) parameter.

Examples:

  • Using BetaML models:
julia> using MLJ;
+GeneralImputer · MLJ

GeneralImputer

mutable struct GeneralImputer <: MLJModelInterface.Unsupervised

Impute missing values using arbitrary learning models, from the Beta Machine Learning Toolkit (BetaML).

Impute missing values using a vector (one per column) of arbitrary learning models (classifiers/regressors, not necessarily from BetaML) that implement the interface m = Model([options]), train!(m,X,Y) and predict(m,X).

Hyperparameters:

  • cols_to_impute::Union{String, Vector{Int64}}: Columns in the matrix for which to create an imputation model, i.e. to impute. It can be a vector of column IDs (positions), or the keywords "auto" (default) or "all". With "auto" the model automatically detects the columns with missing data and imputes only them. You may manually specify the columns, or use "all" if you want an imputation model to be created for those columns during training even when all training data are non-missing, so that the trained model can then be applied to further data with possibly missing values.
  • estimator::Any: An estimator model (regressor or classifier), possibly with its options (hyper-parameters), to be used to impute the various columns of the matrix. It can also be a cols_to_impute-length vector of different estimators, to use a different estimator for each column (dimension) to impute, for example when some columns are categorical (and hence require a classifier) and others are numerical (hence requiring a regressor). [default: nothing, i.e. use BetaML random forests, handling classification and regression jobs automatically].
  • missing_supported::Union{Bool, Vector{Bool}}: Whether the estimator(s) used to predict the missing data themselves support missing data in the training features (X). If not, when the model for a certain dimension is fitted, dimensions with missing data in the same rows as those where imputation is needed are dropped, and then only non-missing rows in the other remaining dimensions are considered. It can be a vector of boolean values to specify this property for each individual estimator, or a single boolean value to apply to all the estimators [default: false]
  • fit_function::Union{Function, Vector{Function}}: The function used by the estimator(s) to fit the model. It should take as first argument the model itself, as second argument a matrix representing the features, and as third argument a vector representing the labels. This parameter is mandatory for non-BetaML estimators and can be a single value or a vector (one per estimator) in case different estimator packages are used. [default: BetaML.fit!]
  • predict_function::Union{Function, Vector{Function}}: The function used by the estimator(s) to predict the labels. It should take as first argument the model itself and as second argument a matrix representing the features. This parameter is mandatory for non-BetaML estimators and can be a single value or a vector (one per estimator) in case different estimator packages are used. [default: BetaML.predict]
  • recursive_passages::Int64: Defines the number of times to go through the various columns to impute their data. Useful when there are data to impute in multiple columns. The order of the first passage is given by the decreasing number of missing values per column; the other passages are random [default: 1].
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]. Note that this influences only the GeneralImputer code itself; the individual estimators may have their own rng (or similar) parameter.

Examples:

  • Using BetaML models:
julia> using MLJ;
 julia> import BetaML ## The library from which to get the individual estimators to be used for each column imputation
 julia> X = ["a"         8.2;
             "a"     missing;
@@ -57,4 +57,4 @@
  "b"  20
  "c"  -1.8
  "c"  -2.3
- "c"  -2.4
+ "c" -2.4
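A minimal sketch using the default BetaML estimators; a purely numerical toy table is assumed here, so the default random-forest regressors handle every column.

using MLJ
GeneralImputer = @load GeneralImputer pkg=BetaML
X = (a = [1.0, missing, 3.0, 4.0, 5.0], b = [10.0, 20.0, missing, 40.0, 50.0])  ## hypothetical toy table
model = GeneralImputer(recursive_passages=2)   ## two passes over the columns with missing data
mach = fit!(machine(model, X))
X_imputed = transform(mach, X)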
diff --git a/dev/models/GradientBoostingClassifier_MLJScikitLearnInterface/index.html b/dev/models/GradientBoostingClassifier_MLJScikitLearnInterface/index.html index 54203d13d..8feaa70e7 100644 --- a/dev/models/GradientBoostingClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/GradientBoostingClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -GradientBoostingClassifier · MLJ

GradientBoostingClassifier

GradientBoostingClassifier

A model type for constructing a gradient boosting classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GradientBoostingClassifier = @load GradientBoostingClassifier pkg=MLJScikitLearnInterface

Do model = GradientBoostingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GradientBoostingClassifier(loss=...).

This algorithm builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage n_classes_ regression trees are fit on the negative gradient of the loss function, e.g. binary or multiclass log loss. Binary classification is a special case where only a single regression tree is induced.

HistGradientBoostingClassifier is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).

+GradientBoostingClassifier · MLJ

GradientBoostingClassifier

GradientBoostingClassifier

A model type for constructing a gradient boosting classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GradientBoostingClassifier = @load GradientBoostingClassifier pkg=MLJScikitLearnInterface

Do model = GradientBoostingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GradientBoostingClassifier(loss=...).

This algorithm builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage n_classes_ regression trees are fit on the negative gradient of the loss function, e.g. binary or multiclass log loss. Binary classification is a special case where only a single regression tree is induced.

HistGradientBoostingClassifier is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).
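A minimal fit-and-predict sketch, assuming the scikit-learn backend is installed; iris is only an example dataset.

using MLJ
GradientBoostingClassifier = @load GradientBoostingClassifier pkg=MLJScikitLearnInterface
X, y = @load_iris
mach = fit!(machine(GradientBoostingClassifier(), X, y))
predict_mode(mach, X)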

diff --git a/dev/models/GradientBoostingRegressor_MLJScikitLearnInterface/index.html b/dev/models/GradientBoostingRegressor_MLJScikitLearnInterface/index.html index aa12148a5..7a2390d77 100644 --- a/dev/models/GradientBoostingRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/GradientBoostingRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -GradientBoostingRegressor · MLJ

GradientBoostingRegressor

GradientBoostingRegressor

A model type for constructing a gradient boosting ensemble regression, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GradientBoostingRegressor = @load GradientBoostingRegressor pkg=MLJScikitLearnInterface

Do model = GradientBoostingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GradientBoostingRegressor(loss=...).

This estimator builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage a regression tree is fit on the negative gradient of the given loss function.

HistGradientBoostingRegressor is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).

+GradientBoostingRegressor · MLJ

GradientBoostingRegressor

GradientBoostingRegressor

A model type for constructing a gradient boosting ensemble regression, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

GradientBoostingRegressor = @load GradientBoostingRegressor pkg=MLJScikitLearnInterface

Do model = GradientBoostingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in GradientBoostingRegressor(loss=...).

This estimator builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage a regression tree is fit on the negative gradient of the given loss function.

HistGradientBoostingRegressor is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).
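A minimal sketch on synthetic regression data, assuming the scikit-learn backend is installed.

using MLJ
GradientBoostingRegressor = @load GradientBoostingRegressor pkg=MLJScikitLearnInterface
X, y = make_regression(200, 5)
mach = fit!(machine(GradientBoostingRegressor(), X, y))
predict(mach, X)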

diff --git a/dev/models/HBOSDetector_OutlierDetectionPython/index.html b/dev/models/HBOSDetector_OutlierDetectionPython/index.html index 13c0d32a5..a154d3d62 100644 --- a/dev/models/HBOSDetector_OutlierDetectionPython/index.html +++ b/dev/models/HBOSDetector_OutlierDetectionPython/index.html @@ -1,4 +1,4 @@ -HBOSDetector · MLJ

HBOSDetector

HBOSDetector(n_bins = 10,
+HBOSDetector · MLJ
+                tol = 0.5)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.hbos

diff --git a/dev/models/HDBSCAN_MLJScikitLearnInterface/index.html b/dev/models/HDBSCAN_MLJScikitLearnInterface/index.html index 9619d70bc..31adcf570 100644 --- a/dev/models/HDBSCAN_MLJScikitLearnInterface/index.html +++ b/dev/models/HDBSCAN_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -HDBSCAN · MLJ

HDBSCAN

HDBSCAN

A model type for constructing an HDBSCAN clusterer, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HDBSCAN = @load HDBSCAN pkg=MLJScikitLearnInterface

Do model = HDBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HDBSCAN(min_cluster_size=...).

Hierarchical Density-Based Spatial Clustering of Applications with Noise. Performs DBSCAN over varying epsilon values and integrates the result to find a clustering that gives the best stability over epsilon. This allows HDBSCAN to find clusters of varying densities (unlike DBSCAN), and be more robust to parameter selection.

+HDBSCAN · MLJ

HDBSCAN

HDBSCAN

A model type for constructing an HDBSCAN clusterer, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HDBSCAN = @load HDBSCAN pkg=MLJScikitLearnInterface

Do model = HDBSCAN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HDBSCAN(min_cluster_size=...).

Hierarchical Density-Based Spatial Clustering of Applications with Noise. Performs DBSCAN over varying epsilon values and integrates the result to find a clustering that gives the best stability over epsilon. This allows HDBSCAN to find clusters of varying densities (unlike DBSCAN), and be more robust to parameter selection.

diff --git a/dev/models/HierarchicalClustering_Clustering/index.html b/dev/models/HierarchicalClustering_Clustering/index.html index 97943c69d..f0d3f4506 100644 --- a/dev/models/HierarchicalClustering_Clustering/index.html +++ b/dev/models/HierarchicalClustering_Clustering/index.html @@ -1,5 +1,5 @@ -HierarchicalClustering · MLJ

HierarchicalClustering

HierarchicalClustering

A model type for constructing a hierarchical clusterer, based on Clustering.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HierarchicalClustering = @load HierarchicalClustering pkg=Clustering

Do model = HierarchicalClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HierarchicalClustering(linkage=...).

Hierarchical Clustering is a clustering algorithm that organizes the data in a dendrogram based on distances between groups of points and computes cluster assignments by cutting the dendrogram at a given height. More information is available at the Clustering.jl documentation. Use predict to get cluster assignments. The dendrogram and the dendrogram cutter are accessed from the machine report (see below).

This is a static implementation, i.e., it does not generalize to new data instances, and there is no training data. For clusterers that do generalize, see KMeans or KMedoids.

In MLJ or MLJBase, create a machine with

mach = machine(model)

Hyper-parameters

  • linkage = :single: linkage method (:single, :average, :complete, :ward, :ward_presquared)
  • metric = SqEuclidean: metric (see Distances.jl for available metrics)
  • branchorder = :r: branchorder (:r, :barjoseph, :optimal)
  • h = nothing: height at which the dendrogram is cut
  • k = 3: number of clusters.

If both k and h are specified, it is guaranteed that the number of clusters is not less than k and their height is not above h.

Operations

  • predict(mach, X): return cluster label assignments, as an unordered CategoricalVector. Here X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Report

After calling predict(mach), the fields of report(mach) are:

  • dendrogram: the dendrogram that was computed when calling predict.
  • cutter: a dendrogram cutter that can be called with a height h or a number of clusters k, to obtain a new assignment of the data points to clusters (see example below).

Examples

using MLJ
+HierarchicalClustering · MLJ

HierarchicalClustering

HierarchicalClustering

A model type for constructing a hierarchical clusterer, based on Clustering.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HierarchicalClustering = @load HierarchicalClustering pkg=Clustering

Do model = HierarchicalClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HierarchicalClustering(linkage=...).

Hierarchical Clustering is a clustering algorithm that organizes the data in a dendrogram based on distances between groups of points and computes cluster assignments by cutting the dendrogram at a given height. More information is available at the Clustering.jl documentation. Use predict to get cluster assignments. The dendrogram and the dendrogram cutter are accessed from the machine report (see below).

This is a static implementation, i.e., it does not generalize to new data instances, and there is no training data. For clusterers that do generalize, see KMeans or KMedoids.

In MLJ or MLJBase, create a machine with

mach = machine(model)

Hyper-parameters

  • linkage = :single: linkage method (:single, :average, :complete, :ward, :ward_presquared)
  • metric = SqEuclidean: metric (see Distances.jl for available metrics)
  • branchorder = :r: branchorder (:r, :barjoseph, :optimal)
  • h = nothing: height at which the dendrogram is cut
  • k = 3: number of clusters.

If both k and h are specified, it is guaranteed that the number of clusters is not less than k and their height is not above h.

Operations

  • predict(mach, X): return cluster label assignments, as an unordered CategoricalVector. Here X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Report

After calling predict(mach), the fields of report(mach) are:

  • dendrogram: the dendrogram that was computed when calling predict.
  • cutter: a dendrogram cutter that can be called with a height h or a number of clusters k, to obtain a new assignment of the data points to clusters (see example below).

Examples

using MLJ
 
 X, labels  = make_moons(400, noise=0.09, rng=1) ## synthetic data with 2 clusters; X
 
@@ -15,4 +15,4 @@
 plot(report(mach).dendrogram)
 
 ## make new predictions by cutting the dendrogram at another height
-report(mach).cutter(h = 2.5)
+report(mach).cutter(h = 2.5)
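As a further illustration of the report fields described above, the same cutter can also be called with a number of clusters instead of a height (k = 4 is an arbitrary choice):

## recut the same dendrogram into a fixed number of clusters
report(mach).cutter(k = 4)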
diff --git a/dev/models/HistGradientBoostingClassifier_MLJScikitLearnInterface/index.html b/dev/models/HistGradientBoostingClassifier_MLJScikitLearnInterface/index.html index 1220c180f..0e784fbb9 100644 --- a/dev/models/HistGradientBoostingClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/HistGradientBoostingClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -HistGradientBoostingClassifier · MLJ

HistGradientBoostingClassifier

HistGradientBoostingClassifier

A model type for constructing a hist gradient boosting classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HistGradientBoostingClassifier = @load HistGradientBoostingClassifier pkg=MLJScikitLearnInterface

Do model = HistGradientBoostingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HistGradientBoostingClassifier(loss=...).

This algorithm builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage n_classes_ regression trees are fit on the negative gradient of the loss function, e.g. binary or multiclass log loss. Binary classification is a special case where only a single regression tree is induced.

HistGradientBoostingClassifier is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).

+HistGradientBoostingClassifier · MLJ

HistGradientBoostingClassifier

HistGradientBoostingClassifier

A model type for constructing a hist gradient boosting classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HistGradientBoostingClassifier = @load HistGradientBoostingClassifier pkg=MLJScikitLearnInterface

Do model = HistGradientBoostingClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HistGradientBoostingClassifier(loss=...).

This algorithm builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage n_classes_ regression trees are fit on the negative gradient of the loss function, e.g. binary or multiclass log loss. Binary classification is a special case where only a single regression tree is induced.

HistGradientBoostingClassifier is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).
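A minimal sketch estimating out-of-sample log loss by cross-validation, assuming the scikit-learn backend is installed; iris is used only for illustration and is far smaller than the intermediate-size datasets this model targets.

using MLJ
HistGradientBoostingClassifier = @load HistGradientBoostingClassifier pkg=MLJScikitLearnInterface
X, y = @load_iris
evaluate(HistGradientBoostingClassifier(), X, y,
         resampling=CV(nfolds=3, shuffle=true, rng=123), measure=log_loss)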

diff --git a/dev/models/HistGradientBoostingRegressor_MLJScikitLearnInterface/index.html b/dev/models/HistGradientBoostingRegressor_MLJScikitLearnInterface/index.html index db47b50ba..8473df396 100644 --- a/dev/models/HistGradientBoostingRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/HistGradientBoostingRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -HistGradientBoostingRegressor · MLJ

HistGradientBoostingRegressor

HistGradientBoostingRegressor

A model type for constructing a gradient boosting ensemble regression, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HistGradientBoostingRegressor = @load HistGradientBoostingRegressor pkg=MLJScikitLearnInterface

Do model = HistGradientBoostingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HistGradientBoostingRegressor(loss=...).

This estimator builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage a regression tree is fit on the negative gradient of the given loss function.

HistGradientBoostingRegressor is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).

+HistGradientBoostingRegressor · MLJ

HistGradientBoostingRegressor

HistGradientBoostingRegressor

A model type for constructing a gradient boosting ensemble regression, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HistGradientBoostingRegressor = @load HistGradientBoostingRegressor pkg=MLJScikitLearnInterface

Do model = HistGradientBoostingRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HistGradientBoostingRegressor(loss=...).

This estimator builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage a regression tree is fit on the negative gradient of the given loss function.

HistGradientBoostingRegressor is a much faster variant of this algorithm for intermediate datasets (n_samples >= 10_000).
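Similarly, a minimal cross-validation sketch on synthetic data, assuming the scikit-learn backend is installed.

using MLJ
HistGradientBoostingRegressor = @load HistGradientBoostingRegressor pkg=MLJScikitLearnInterface
X, y = make_regression(1000, 10)
evaluate(HistGradientBoostingRegressor(), X, y, resampling=CV(nfolds=3), measure=rms)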

diff --git a/dev/models/HuberRegressor_MLJLinearModels/index.html b/dev/models/HuberRegressor_MLJLinearModels/index.html index 41ca98631..d371cad67 100644 --- a/dev/models/HuberRegressor_MLJLinearModels/index.html +++ b/dev/models/HuberRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -HuberRegressor · MLJ

HuberRegressor

HuberRegressor

A model type for constructing a huber regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HuberRegressor = @load HuberRegressor pkg=MLJLinearModels

Do model = HuberRegressor() to construct an instance with default hyper-parameters.

This model coincides with RobustRegressor, with the exception that the robust loss, rho, is fixed to HuberRho(delta), where delta is a new hyperparameter.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • delta::Real: parameterizes the HuberRho function (radius of the ball within which the loss is a quadratic loss) Default: 0.5

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, Newton, NewtonCG, if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
+HuberRegressor · MLJ

HuberRegressor

HuberRegressor

A model type for constructing a huber regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HuberRegressor = @load HuberRegressor pkg=MLJLinearModels

Do model = HuberRegressor() to construct an instance with default hyper-parameters.

This model coincides with RobustRegressor, with the exception that the robust loss, rho, is fixed to HuberRho(delta), where delta is a new hyperparameter.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • delta::Real: parameterizes the HuberRho function (radius of the ball within which the loss is a quadratic loss) Default: 0.5

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, Newton, NewtonCG, if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(HuberRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also RobustRegressor, QuantileRegressor.

+fitted_params(mach)
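As a hedged variation on the example above (reusing the same X and y), an L1-penalised fit with non-default delta and lambda, chosen purely for illustration:

model = HuberRegressor(delta=1.0, penalty=:l1, lambda=0.5)
mach = fit!(machine(model, X, y))
predict(mach, X)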

See also RobustRegressor, QuantileRegressor.

diff --git a/dev/models/HuberRegressor_MLJScikitLearnInterface/index.html b/dev/models/HuberRegressor_MLJScikitLearnInterface/index.html index f945674e5..fa1fa33b9 100644 --- a/dev/models/HuberRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/HuberRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -HuberRegressor · MLJ

HuberRegressor

HuberRegressor

A model type for constructing a Huber regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HuberRegressor = @load HuberRegressor pkg=MLJScikitLearnInterface

Do model = HuberRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HuberRegressor(epsilon=...).

Hyper-parameters

  • epsilon = 1.35
  • max_iter = 100
  • alpha = 0.0001
  • warm_start = false
  • fit_intercept = true
  • tol = 1.0e-5
+HuberRegressor · MLJ

HuberRegressor

HuberRegressor

A model type for constructing a Huber regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

HuberRegressor = @load HuberRegressor pkg=MLJScikitLearnInterface

Do model = HuberRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in HuberRegressor(epsilon=...).

Hyper-parameters

  • epsilon = 1.35
  • max_iter = 100
  • alpha = 0.0001
  • warm_start = false
  • fit_intercept = true
  • tol = 1.0e-5
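A minimal sketch on synthetic data, assuming the scikit-learn backend is installed; epsilon=1.5 is an arbitrary override of the default listed above.

using MLJ
HuberRegressor = @load HuberRegressor pkg=MLJScikitLearnInterface
X, y = make_regression()
mach = fit!(machine(HuberRegressor(epsilon=1.5), X, y))
predict(mach, X)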
diff --git a/dev/models/ICA_MultivariateStats/index.html b/dev/models/ICA_MultivariateStats/index.html index 9d36674d8..5353a5da4 100644 --- a/dev/models/ICA_MultivariateStats/index.html +++ b/dev/models/ICA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -ICA · MLJ

ICA

ICA

A model type for constructing an independent component analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ICA = @load ICA pkg=MultivariateStats

Do model = ICA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ICA(outdim=...).

Independent component analysis is a computational technique for separating a multivariate signal into additive subcomponents, with the assumption that the subcomponents are non-Gaussian and independent from each other.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • outdim::Int=0: The number of independent components to recover, set automatically if 0.
  • alg::Symbol=:fastica: The algorithm to use (only :fastica is supported at the moment).
  • fun::Symbol=:tanh: The approximate neg-entropy function, one of :tanh, :gaus.
  • do_whiten::Bool=true: Whether or not to perform pre-whitening.
  • maxiter::Int=100: The maximum number of iterations.
  • tol::Real=1e-6: The convergence tolerance for change in the unmixing matrix W.
  • mean::Union{Nothing, Real, Vector{Float64}}=nothing: mean to use, if nothing (default) centering is computed and applied, if zero, no centering; otherwise a vector of means can be passed.
  • winit::Union{Nothing,Matrix{<:Real}}=nothing: Initial guess for the unmixing matrix W: either an empty matrix (for random initialization of W), a matrix of size m × k (if do_whiten is true), or a matrix of size m × k. Here m is the number of components (columns) of the input.

Operations

  • transform(mach, Xnew): Return the component-separated version of input Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: The estimated component matrix.
  • mean: The estimated mean vector.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data.
  • mean: The mean of the untransformed training data, of length indim.

Examples

using MLJ
+ICA · MLJ

ICA

ICA

A model type for constructing an independent component analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ICA = @load ICA pkg=MultivariateStats

Do model = ICA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ICA(outdim=...).

Independent component analysis is a computational technique for separating a multivariate signal into additive subcomponents, with the assumption that the subcomponents are non-Gaussian and independent from each other.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • outdim::Int=0: The number of independent components to recover, set automatically if 0.
  • alg::Symbol=:fastica: The algorithm to use (only :fastica is supported at the moment).
  • fun::Symbol=:tanh: The approximate neg-entropy function, one of :tanh, :gaus.
  • do_whiten::Bool=true: Whether or not to perform pre-whitening.
  • maxiter::Int=100: The maximum number of iterations.
  • tol::Real=1e-6: The convergence tolerance for change in the unmixing matrix W.
  • mean::Union{Nothing, Real, Vector{Float64}}=nothing: mean to use, if nothing (default) centering is computed and applied, if zero, no centering; otherwise a vector of means can be passed.
  • winit::Union{Nothing,Matrix{<:Real}}=nothing: Initial guess for the unmixing matrix W: either an empty matrix (for random initialization of W), a matrix of size m × k (if do_whiten is true), or a matrix of size m × k. Here m is the number of components (columns) of the input.

Operations

  • transform(mach, Xnew): Return the component-separated version of input Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: The estimated component matrix.
  • mean: The estimated mean vector.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data.
  • mean: The mean of the untransformed training data, of length indim.

Examples

using MLJ
 
 ICA = @load ICA pkg=MultivariateStats
 
@@ -28,4 +28,4 @@
 plot(X_unmixed.x1)
 plot(X_unmixed.x2)
 plot(X_unmixed.x3)
-

See also PCA, KernelPCA, FactorAnalysis, PPCA

+

See also PCA, KernelPCA, FactorAnalysis, PPCA
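As a hedged follow-up to the example above (assuming the trained machine from that example is bound to mach), the fitted parameters and report fields listed earlier can be inspected as follows:

fp = fitted_params(mach)
fp.projection          ## the estimated component matrix
fp.mean                ## the estimated mean vector
report(mach).outdim    ## number of recovered components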

diff --git a/dev/models/IForestDetector_OutlierDetectionPython/index.html b/dev/models/IForestDetector_OutlierDetectionPython/index.html index 23624855f..148d89a3e 100644 --- a/dev/models/IForestDetector_OutlierDetectionPython/index.html +++ b/dev/models/IForestDetector_OutlierDetectionPython/index.html @@ -1,8 +1,8 @@ -IForestDetector · MLJ

IForestDetector

IForestDetector(n_estimators = 100,
+IForestDetector · MLJ
+                   n_jobs = 1)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.iforest

diff --git a/dev/models/INNEDetector_OutlierDetectionPython/index.html b/dev/models/INNEDetector_OutlierDetectionPython/index.html index b2ff2dc70..2d10cfc19 100644 --- a/dev/models/INNEDetector_OutlierDetectionPython/index.html +++ b/dev/models/INNEDetector_OutlierDetectionPython/index.html @@ -1,4 +1,4 @@ -INNEDetector · MLJ

INNEDetector

INNEDetector(n_estimators=200,
+INNEDetector · MLJ
+                random_state=None)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.inne

diff --git a/dev/models/ImageClassifier_MLJFlux/index.html b/dev/models/ImageClassifier_MLJFlux/index.html index a7736c581..bf763823d 100644 --- a/dev/models/ImageClassifier_MLJFlux/index.html +++ b/dev/models/ImageClassifier_MLJFlux/index.html @@ -1,5 +1,5 @@ -ImageClassifier · MLJ

ImageClassifier

ImageClassifier

A model type for constructing an image classifier, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ImageClassifier = @load ImageClassifier pkg=MLJFlux

Do model = ImageClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ImageClassifier(builder=...).

ImageClassifier classifies images using a neural network adapted to the type of images provided (color or gray scale). Predictions are probabilistic. Users provide a recipe for constructing the network, based on properties of the image encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any AbstractVector of images with ColorImage or GrayImage scitype; check the scitype with scitype(X) and refer to ScientificTypes.jl documentation on coercing typical image formats into an appropriate type.
  • y is the target, which can be any AbstractVector whose element scitype is Multiclass; check the scitype with scitype(y).

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • builder: An MLJFlux builder that constructs the neural network. The fallback builds a depth-16 VGG architecture adapted to the image size and number of target classes, with no batch normalization; see the Metalhead.jl documentation for details. See the example below for a user-specified builder. A convenience macro @builder is also available. See also finaliser below.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:

    • Flux.crossentropy: Standard multiclass classification loss, also known as the log loss.
    • Flux.logitcrossentropy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with softmax and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).
    • Flux.tversky_loss: Used with imbalanced data to give more weight to false negatives.
    • Flux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.

    Currently MLJ measures are not supported values of loss.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete the training dataset.

  • batch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on a GPU, use CUDALibs().

  • finaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.softmax.

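As a rough sketch of overriding these defaults (assuming MLJ, MLJFlux and Flux are installed; the epochs and batch_size values are illustrative only), the snippet below constructs a classifier that optimizes the numerically more stable logit cross-entropy, as described in the loss bullet above:

using MLJ
import Flux

ImageClassifier = @load ImageClassifier pkg=MLJFlux

## finaliser=identity is required with logitcrossentropy; predictions are then unnormalized
clf = ImageClassifier(loss=Flux.logitcrossentropy,
                      finaliser=identity,
                      epochs=5,
                      batch_size=32)
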
Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we use MLJFlux and a custom builder to classify the MNIST image dataset.

using MLJ
 using Flux
 import MLJFlux
 import MLJIteration ## for `skip` control

First we want to download the MNIST dataset, and unpack into images and labels:

import MLDatasets: MNIST
@@ -45,4 +45,4 @@
           resampling=Holdout(fraction_train=0.5),
           measure=cross_entropy,
           rows=1:1000,
          verbosity=0)

See also NeuralNetworkClassifier.

diff --git a/dev/models/InteractionTransformer_MLJModels/index.html b/dev/models/InteractionTransformer_MLJModels/index.html index 37cc8393e..d10f931fa 100644 --- a/dev/models/InteractionTransformer_MLJModels/index.html +++ b/dev/models/InteractionTransformer_MLJModels/index.html @@ -1,5 +1,5 @@ -InteractionTransformer · MLJ

InteractionTransformer

InteractionTransformer

A model type for constructing an interaction transformer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

InteractionTransformer = @load InteractionTransformer pkg=MLJModels

Do model = InteractionTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in InteractionTransformer(order=...).

Generates all polynomial interaction terms up to the given order for the subset of chosen columns. Any column that contains elements with scitype <:Infinite is a valid basis to generate interactions. If features is not specified, all such columns with scitype <:Infinite in the table are used as a basis.

In MLJ or MLJBase, you can transform features X with the single call

transform(machine(model), X)

See also the example below.

Hyper-parameters

  • order: Maximum order of interactions to be generated.
  • features: Restricts interaction generation to the specified columns (see the sketch after this list).

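The following minimal sketch (the table X and its column names are purely illustrative, and it assumes features accepts a vector of column-name symbols) shows how order and features restrict which interaction terms are generated:

using MLJ
InteractionTransformer = @load InteractionTransformer pkg=MLJModels

X = (A = [1.0, 2.0, 3.0], B = [4.0, 5.0, 6.0], C = ["a", "b", "c"])

model = InteractionTransformer(order=2, features=[:A, :B])
Xt = transform(machine(model), X)   ## adds an A_B interaction column; C is left untouched
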
Operations

  • transform(machine(model), X): Generates polynomial interaction terms out of table X using the hyper-parameters specified in model.

Example

using MLJ
 
 X = (
     A = [1, 2, 3],
@@ -29,4 +29,4 @@
  C = [7, 8, 9],
  D = ["x₁", "x₂", "x₃"],
  A_B = [4, 10, 18],)
diff --git a/dev/models/IteratedModel_MLJIteration/index.html b/dev/models/IteratedModel_MLJIteration/index.html index 7b3b0f7f6..73ed4f0c5 100644 --- a/dev/models/IteratedModel_MLJIteration/index.html +++ b/dev/models/IteratedModel_MLJIteration/index.html @@ -1,5 +1,5 @@ -IteratedModel · MLJ

IteratedModel

IteratedModel(model;
     controls=MLJIteration.DEFAULT_CONTROLS,
     resampling=Holdout(),
     measure=nothing,
@@ -12,4 +12,4 @@
 mach = machine(iterated_model, X, y)
 fit!(mach) ## train for 100 iterations
 iterated_model.controls = [Step(1), NumberLimit(50)]
fit!(mach) ## train for an *extra* 50 iterations

More generally, if iterated_model is mutated and fit!(mach) is called again, then a warm restart is attempted if the only parameters to change are model or controls or both.

Specifically, train_mach.model is mutated to match the current value of iterated_model.model and the iteration parameter of the latter is updated to the last value used in the preceding fit!(mach) call. Then repeated application of the (updated) controls begins anew.


diff --git a/dev/models/KDEDetector_OutlierDetectionPython/index.html b/dev/models/KDEDetector_OutlierDetectionPython/index.html index b5c4bccd2..af0854657 100644 --- a/dev/models/KDEDetector_OutlierDetectionPython/index.html +++ b/dev/models/KDEDetector_OutlierDetectionPython/index.html @@ -1,6 +1,6 @@ -KDEDetector · MLJ

KDEDetector

KDEDetector(bandwidth=1.0,
               metric_params=None)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.kde

diff --git a/dev/models/KMeansClusterer_BetaML/index.html b/dev/models/KMeansClusterer_BetaML/index.html index 54fa8bc9a..7204f63ed 100644 --- a/dev/models/KMeansClusterer_BetaML/index.html +++ b/dev/models/KMeansClusterer_BetaML/index.html @@ -1,5 +1,5 @@ -KMeansClusterer · MLJ

KMeansClusterer

mutable struct KMeansClusterer <: MLJModelInterface.Unsupervised

The classical KMeansClusterer clustering algorithm, from the Beta Machine Learning Toolkit (BetaML).

Parameters:

  • n_classes::Int64: Number of classes to discriminate the data [def: 3]

  • dist::Function: Function to employ as distance. Defaults to the Euclidean distance. Can be one of the predefined distances (l1_distance, l2_distance, l2squared_distance, cosine_distance), any user-defined function accepting two vectors and returning a scalar, or an anonymous function with the same characteristics (see the sketch after this list). Note that, contrary to KMedoidsClusterer, the KMeansClusterer algorithm is not guaranteed to converge with distances other than the Euclidean one.

  • initialisation_strategy::String: The computation method of the vector of the initial representatives. One of the following:

    • "random": randomly in the X space
    • "grid": using a grid approach
    • "shuffle": selecting randomly within the available points [default]
    • "given": using a provided set of initial representatives provided in the initial_representatives parameter
  • initial_representatives::Union{Nothing, Matrix{Float64}}: Provided (K x D) matrix of initial representatives (useful only with initialisation_strategy="given") [default: nothing]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

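As noted in the dist bullet above, any function of two vectors returning a scalar can serve as the distance. A minimal sketch with an illustrative Manhattan-style distance (keeping in mind the convergence caveat for non-Euclidean distances):

using MLJ
KMeansClusterer = @load KMeansClusterer pkg=BetaML

manhattan = (x, y) -> sum(abs.(x .- y))   ## illustrative user-defined distance
model = KMeansClusterer(n_classes=3, dist=manhattan)
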
Notes:

  • data must be numerical
  • online fitting (re-fitting with new data) is supported

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -29,4 +29,4 @@
  ⋮            
  "virginica"  3
  "virginica"  3
- "virginica"  1
+ "virginica" 1
diff --git a/dev/models/KMeans_Clustering/index.html b/dev/models/KMeans_Clustering/index.html index fedd099c8..8c79ee959 100644 --- a/dev/models/KMeans_Clustering/index.html +++ b/dev/models/KMeans_Clustering/index.html @@ -1,5 +1,5 @@ -KMeans · MLJ

KMeans

KMeans

A model type for constructing a K-means clusterer, based on Clustering.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KMeans = @load KMeans pkg=Clustering

Do model = KMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KMeans(k=...).

K-means is a classical method for clustering or vector quantization. It produces a fixed number of clusters, each associated with a center (also known as a prototype), and each data point is assigned to a cluster with the nearest center.

From a mathematical standpoint, K-means is a coordinate descent algorithm that solves the following optimization problem:

$$\text{minimize} \ \sum_{i=1}^n \| \mathbf{x}_i - \boldsymbol{\mu}_{z_i} \|^2 \ \text{w.r.t.} \ (\boldsymbol{\mu}, z)$$

Here, $\boldsymbol{\mu}_k$ is the center of the $k$-th cluster, and $z_i$ is an index of the cluster for $i$-th point $\mathbf{x}_i$.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • k=3: The number of centroids to use in clustering.

  • metric::SemiMetric=Distances.SqEuclidean: The metric used to calculate the clustering. Must have type PreMetric from Distances.jl.

  • init = :kmpp: One of the following options to indicate how cluster seeds should be initialized:

    • :kmpp: KMeans++
    • :kmenc: K-medoids initialization based on centrality
    • :rand: random
    • an instance of Clustering.SeedingAlgorithm from Clustering.jl
    • an integer vector of length k that provides the indices of points to use as initial cluster centers.

    See documentation of Clustering.jl.

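A brief sketch of overriding the defaults above, using a non-default metric from Distances.jl and random seeding (the particular choices are illustrative only, and assume Distances.jl is available):

using MLJ
import Distances

KMeans = @load KMeans pkg=Clustering
model = KMeans(k=5, metric=Distances.Cityblock(), init=:rand)
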
Operations

  • predict(mach, Xnew): return cluster label assignments, given new features Xnew having the same Scitype as X above.
  • transform(mach, Xnew): instead return the mean pairwise distances from new samples to the cluster centers.

Fitted parameters

The fields of fitted_params(mach) are:

  • centers: The coordinates of the cluster centers.

Report

The fields of report(mach) are:

  • assignments: The cluster assignments of each point in the training data.
  • cluster_labels: The labels assigned to each cluster.

Examples

using MLJ
 KMeans = @load KMeans pkg=Clustering
 
 table = load_iris()
@@ -17,4 +17,4 @@
 
 @assert center_dists[1][1] == 0.0
 @assert center_dists[2][2] == 0.0
 @assert center_dists[3][3] == 0.0

See also KMedoids

diff --git a/dev/models/KMeans_MLJScikitLearnInterface/index.html b/dev/models/KMeans_MLJScikitLearnInterface/index.html index 58b37f355..dd7bde00c 100644 --- a/dev/models/KMeans_MLJScikitLearnInterface/index.html +++ b/dev/models/KMeans_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -KMeans · MLJ

KMeans

KMeans

A model type for constructing a K-means clusterer, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KMeans = @load KMeans pkg=MLJScikitLearnInterface

Do model = KMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KMeans(n_clusters=...).

K-Means algorithm: find K centroids corresponding to K clusters in the data.

diff --git a/dev/models/KMeans_ParallelKMeans/index.html b/dev/models/KMeans_ParallelKMeans/index.html index 38ca0741a..b113e6d42 100644 --- a/dev/models/KMeans_ParallelKMeans/index.html +++ b/dev/models/KMeans_ParallelKMeans/index.html @@ -1,2 +1,2 @@ -KMeans · MLJ

KMeans

Parallel & lightning fast implementation of all available variants of the KMeans clustering algorithm in native Julia. Compatible with Julia 1.3+


diff --git a/dev/models/KMedoidsClusterer_BetaML/index.html b/dev/models/KMedoidsClusterer_BetaML/index.html index 7c832a6eb..c8e1cee84 100644 --- a/dev/models/KMedoidsClusterer_BetaML/index.html +++ b/dev/models/KMedoidsClusterer_BetaML/index.html @@ -1,5 +1,5 @@ -KMedoidsClusterer · MLJ

KMedoidsClusterer

mutable struct KMedoidsClusterer <: MLJModelInterface.Unsupervised

Parameters:

  • n_classes::Int64: Number of classes to discriminate the data [def: 3]

  • dist::Function: Function to employ as distance. Defaults to the Euclidean distance. Can be one of the predefined distances (l1_distance, l2_distance, l2squared_distance, cosine_distance), any user-defined function accepting two vectors and returning a scalar, or an anonymous function with the same characteristics.

  • initialisation_strategy::String: The computation method of the vector of the initial representatives. One of the following:

    • "random": randomly in the X space
    • "grid": using a grid approach
    • "shuffle": selecting randomly within the available points [default]
    • "given": using a provided set of initial representatives provided in the initial_representatives parameter
  • initial_representatives::Union{Nothing, Matrix{Float64}}: Provided (K x D) matrix of initial representatives (useful only with initialisation_strategy="given") [default: nothing]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

The K-medoids clustering algorithm with customisable distance function, from the Beta Machine Learning Toolkit (BetaML).

Similar to K-Means, but the "representatives" (the medoids) are guaranteed to be one of the training points. The algorithm works with any arbitrary distance measure.

Notes:

  • data must be numerical
  • online fitting (re-fitting with new data) is supported

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -29,4 +29,4 @@
  ⋮            
  "virginica"  1
  "virginica"  1
- "virginica"  2
+ "virginica" 2
diff --git a/dev/models/KMedoids_Clustering/index.html b/dev/models/KMedoids_Clustering/index.html index 69fe853a6..962a14e54 100644 --- a/dev/models/KMedoids_Clustering/index.html +++ b/dev/models/KMedoids_Clustering/index.html @@ -1,5 +1,5 @@ -KMedoids · MLJ

KMedoids

KMedoids

A model type for constructing a K-medoids clusterer, based on Clustering.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KMedoids = @load KMedoids pkg=Clustering

Do model = KMedoids() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KMedoids(k=...).

K-medoids is a clustering algorithm that works by finding $k$ data points (called medoids) such that the total distance between each data point and the closest medoid is minimal.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • k=3: The number of centroids to use in clustering.

  • metric::SemiMetric=Distances.SqEuclidean: The metric used to calculate the clustering. Must have type PreMetric from Distances.jl.

  • init (defaults to :kmpp): how medoids should be initialized, could be one of the following:

    • :kmpp: KMeans++
    • :kmenc: K-medoids initialization based on centrality
    • :rand: random
    • an instance of Clustering.SeedingAlgorithm from Clustering.jl
    • an integer vector of length k that provides the indices of points to use as initial medoids.

    See documentation of Clustering.jl.

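As a small illustration of the init options above, the sketch below seeds the medoids with explicit training-row indices (the particular indices are hypothetical):

using MLJ

KMedoids = @load KMedoids pkg=Clustering
model = KMedoids(k=3, init=[1, 51, 101])   ## rows 1, 51 and 101 as initial medoids
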
Operations

  • predict(mach, Xnew): return cluster label assignments, given new features Xnew having the same Scitype as X above.
  • transform(mach, Xnew): instead return the mean pairwise distances from new samples to the cluster centers.

Fitted parameters

The fields of fitted_params(mach) are:

  • medoids: The coordinates of the cluster medoids.

Report

The fields of report(mach) are:

  • assignments: The cluster assignments of each point in the training data.
  • cluster_labels: The labels assigned to each cluster.

Examples

using MLJ
 KMedoids = @load KMedoids pkg=Clustering
 
 table = load_iris()
@@ -17,4 +17,4 @@
 
 @assert center_dists[1][1] == 0.0
 @assert center_dists[2][2] == 0.0
 @assert center_dists[3][3] == 0.0

See also KMeans

diff --git a/dev/models/KNNClassifier_NearestNeighborModels/index.html b/dev/models/KNNClassifier_NearestNeighborModels/index.html index 4e26cb9ac..8d46d6364 100644 --- a/dev/models/KNNClassifier_NearestNeighborModels/index.html +++ b/dev/models/KNNClassifier_NearestNeighborModels/index.html @@ -1,5 +1,5 @@ -KNNClassifier · MLJ

KNNClassifier

KNNClassifier

A model type for constructing a K-nearest neighbor classifier, based on NearestNeighborModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KNNClassifier = @load KNNClassifier pkg=NearestNeighborModels

Do model = KNNClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNNClassifier(K=...).

KNNClassifier implements the K-nearest neighbors classifier, a non-parametric algorithm that predicts a discrete class distribution for a new point by taking a vote over the classes of its k nearest points. Each neighbor's vote is assigned a weight based on the proximity of that neighbor to the test point, according to a specified distance metric.

For more information about the weighting kernels, see the paper by Geler et al., Comparison of different weighting schemes for the kNN classifier on time-series data.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is <:Finite (<:Multiclass or <:OrderedFactor will do); check the scitype with scitype(y)
  • w is the observation weights which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from weights kernel which is a model hyperparameter, see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • K::Int=5 : number of neighbors
  • algorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)
  • metric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.
  • leafsize::Int = 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.
  • reorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.
  • weights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.

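A minimal sketch of a non-default configuration, using the Inverse kernel from NearestNeighborModels so that votes are weighted by inverse distance (the particular hyper-parameter values are illustrative):

using MLJ
import NearestNeighborModels

KNNClassifier = @load KNNClassifier pkg=NearestNeighborModels
model = KNNClassifier(K=7,
                      algorithm=:balltree,
                      weights=NearestNeighborModels.Inverse())
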
Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: An instance of either KDTree, BruteTree or BallTree, depending on the value of the algorithm hyperparameter (see the hyper-parameters section above). These are data structures that store the training data so as to make nearest neighbor searches on test data points quicker.

Examples

using MLJ
 KNNClassifier = @load KNNClassifier pkg=NearestNeighborModels
 X, y = @load_crabs; ## a table and a vector from the crabs dataset
 ## view possible kernels
@@ -9,4 +9,4 @@
 mach = machine(model, X, y) |> fit! ## wrap model and required data in an MLJ machine and fit
 y_hat = predict(mach, X)
 labels = predict_mode(mach, X)

See also MultitargetKNNClassifier

diff --git a/dev/models/KNNDetector_OutlierDetectionNeighbors/index.html b/dev/models/KNNDetector_OutlierDetectionNeighbors/index.html index 3c1a43955..336d010d3 100644 --- a/dev/models/KNNDetector_OutlierDetectionNeighbors/index.html +++ b/dev/models/KNNDetector_OutlierDetectionNeighbors/index.html @@ -1,5 +1,5 @@ -KNNDetector · MLJ

KNNDetector

KNNDetector(k=5,
             metric=Euclidean,
             algorithm=:kdtree,
             leafsize=10,
@@ -8,4 +8,4 @@
 detector = KNNDetector()
 X = rand(10, 100)
 model, result = fit(detector, X; verbosity=0)
test_scores = transform(detector, model, X)

References

[1] Ramaswamy, Sridhar; Rastogi, Rajeev; Shim, Kyuseok (2000): Efficient Algorithms for Mining Outliers from Large Data Sets.

[2] Angiulli, Fabrizio; Pizzuti, Clara (2002): Fast Outlier Detection in High Dimensional Spaces.

diff --git a/dev/models/KNNDetector_OutlierDetectionPython/index.html b/dev/models/KNNDetector_OutlierDetectionPython/index.html index f13c99cdd..3026714b9 100644 --- a/dev/models/KNNDetector_OutlierDetectionPython/index.html +++ b/dev/models/KNNDetector_OutlierDetectionPython/index.html @@ -1,5 +1,5 @@ -KNNDetector · MLJ

KNNDetector

KNNDetector(n_neighbors = 5,
               n_jobs = 1)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.knn

diff --git a/dev/models/KNNRegressor_NearestNeighborModels/index.html b/dev/models/KNNRegressor_NearestNeighborModels/index.html index f8f6fa731..4ebccf0ea 100644 --- a/dev/models/KNNRegressor_NearestNeighborModels/index.html +++ b/dev/models/KNNRegressor_NearestNeighborModels/index.html @@ -1,5 +1,5 @@ -KNNRegressor · MLJ

KNNRegressor

KNNRegressor

A model type for constructing a K-nearest neighbor regressor, based on NearestNeighborModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels

Do model = KNNRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNNRegressor(K=...).

KNNRegressor implements the K-nearest neighbors regressor, a non-parametric algorithm that predicts the response associated with a new point by taking a weighted average of the responses of the K nearest points.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).
  • w is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a model hyperparameter; see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • K::Int=5 : number of neighbors
  • algorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)
  • metric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.
  • leafsize::Int = 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.
  • reorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.
  • weights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.

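Per-observation weights w, described above, are supplied at machine construction time. A hedged sketch (the weights here are random and purely illustrative):

using MLJ

KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels
X, y = @load_boston
w = rand(length(y)) .+ 0.5   ## illustrative Continuous observation weights
mach = machine(KNNRegressor(K=7), X, y, w) |> fit!
yhat = predict(mach, X)
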
Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: An instance of either KDTree, BruteTree or BallTree, depending on the value of the algorithm hyperparameter (see the hyper-parameters section above). These are data structures that store the training data so as to make nearest neighbor searches on test data points quicker.

Examples

using MLJ
 KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels
 X, y = @load_boston; ## loads the Boston housing dataset from MLJBase
 ## view possible kernels
@@ -7,4 +7,4 @@
 model = KNNRegressor(weights = NearestNeighborModels.Inverse()) #KNNRegressor instantiation
 mach = machine(model, X, y) |> fit! ## wrap model and required data in an MLJ machine and fit
 y_hat = predict(mach, X)

See also MultitargetKNNRegressor

diff --git a/dev/models/KNeighborsClassifier_MLJScikitLearnInterface/index.html b/dev/models/KNeighborsClassifier_MLJScikitLearnInterface/index.html index bc22f5b3b..edbe57747 100644 --- a/dev/models/KNeighborsClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/KNeighborsClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -KNeighborsClassifier · MLJ

KNeighborsClassifier

KNeighborsClassifier

A model type for constructing a K-nearest neighbors classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KNeighborsClassifier = @load KNeighborsClassifier pkg=MLJScikitLearnInterface

Do model = KNeighborsClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNeighborsClassifier(n_neighbors=...).

Hyper-parameters

  • n_neighbors = 5
  • weights = uniform
  • algorithm = auto
  • leaf_size = 30
  • p = 2
  • metric = minkowski
  • metric_params = nothing
  • n_jobs = nothing
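
A short sketch of overriding the defaults above; it assumes the hyper-parameters take the usual scikit-learn string options, so "distance" weights votes by inverse distance (this wrapper also requires a working scikit-learn installation):

using MLJ

KNeighborsClassifier = @load KNeighborsClassifier pkg=MLJScikitLearnInterface
X, y = @load_iris
model = KNeighborsClassifier(n_neighbors=10, weights="distance")
mach = machine(model, X, y) |> fit!
yhat = predict_mode(mach, X)
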
diff --git a/dev/models/KNeighborsRegressor_MLJScikitLearnInterface/index.html b/dev/models/KNeighborsRegressor_MLJScikitLearnInterface/index.html index 5961feb05..d74eed409 100644 --- a/dev/models/KNeighborsRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/KNeighborsRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -KNeighborsRegressor · MLJ

KNeighborsRegressor

KNeighborsRegressor

A model type for constructing a K-nearest neighbors regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KNeighborsRegressor = @load KNeighborsRegressor pkg=MLJScikitLearnInterface

Do model = KNeighborsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KNeighborsRegressor(n_neighbors=...).

Hyper-parameters

  • n_neighbors = 5
  • weights = uniform
  • algorithm = auto
  • leaf_size = 30
  • p = 2
  • metric = minkowski
  • metric_params = nothing
  • n_jobs = nothing
diff --git a/dev/models/KPLSRegressor_PartialLeastSquaresRegressor/index.html b/dev/models/KPLSRegressor_PartialLeastSquaresRegressor/index.html index f52705962..eb4c84bdb 100644 --- a/dev/models/KPLSRegressor_PartialLeastSquaresRegressor/index.html +++ b/dev/models/KPLSRegressor_PartialLeastSquaresRegressor/index.html @@ -1,2 +1,2 @@ -KPLSRegressor · MLJ

KPLSRegressor

A kernel partial least squares regressor, implementing the kernel PLS2 NIPALS algorithm. Used mainly for regression.


diff --git a/dev/models/KernelPCA_MultivariateStats/index.html b/dev/models/KernelPCA_MultivariateStats/index.html index 0774dc1b0..74dd838b0 100644 --- a/dev/models/KernelPCA_MultivariateStats/index.html +++ b/dev/models/KernelPCA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -KernelPCA · MLJ

KernelPCA

KernelPCA

A model type for constructing a kernel principal component analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KernelPCA = @load KernelPCA pkg=MultivariateStats

Do model = KernelPCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KernelPCA(maxoutdim=...).

In kernel PCA the linear operations of ordinary principal component analysis are performed in a reproducing kernel Hilbert space.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.
  • kernel::Function=(x,y)->x'y: The kernel function, takes in 2 vector arguments x and y, returns a scalar value. Defaults to the dot product of x and y.
  • solver::Symbol=:eig: solver to use for the eigenvalues, one of :eig(default, uses LinearAlgebra.eigen), :eigs(uses Arpack.eigs).
  • inverse::Bool=true: perform calculations needed for inverse transform
  • beta::Real=1.0: strength of the ridge regression that learns the inverse transform when inverse is true.
  • tol::Real=0.0: Convergence tolerance for eigenvalue solver.
  • maxiter::Int=300: maximum number of iterations for eigenvalue solver.

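The kernel hyper-parameter above accepts any function of two vectors returning a scalar. A hedged sketch with an illustrative Gaussian (RBF) kernel of unit bandwidth:

using MLJ
using LinearAlgebra

KernelPCA = @load KernelPCA pkg=MultivariateStats
X, _ = @load_iris

rbf(x, y) = exp(-norm(x .- y)^2)   ## illustrative kernel; bandwidth fixed at 1

model = KernelPCA(maxoutdim=2, kernel=rbf)
mach = machine(model, X) |> fit!
Xproj = transform(mach, X)
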
Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data.
  • principalvars: The variance of the principal components.

Examples

using MLJ
+KernelPCA · MLJ

KernelPCA

KernelPCA

A model type for constructing a kernel prinicipal component analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

KernelPCA = @load KernelPCA pkg=MultivariateStats

Do model = KernelPCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in KernelPCA(maxoutdim=...).

In kernel PCA the linear operations of ordinary principal component analysis are performed in a reproducing Hilbert space.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.
  • kernel::Function=(x,y)->x'y: The kernel function; takes two vector arguments x and y and returns a scalar value. Defaults to the dot product of x and y.
  • solver::Symbol=:eig: solver to use for the eigenvalues, one of :eig (default, uses LinearAlgebra.eigen) or :eigs (uses Arpack.eigs).
  • inverse::Bool=true: perform calculations needed for inverse transform.
  • beta::Real=1.0: strength of the ridge regression that learns the inverse transform when inverse is true.
  • tol::Real=0.0: Convergence tolerance for eigenvalue solver.
  • maxiter::Int=300: maximum number of iterations for eigenvalue solver.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(mach, Xsmall) is only an approximation to Xnew, as illustrated in the sketch after this list.
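
A minimal sketch of this round trip, using a small synthetic table; the data and the choice maxoutdim=2 are illustrative assumptions only:

using MLJ

KernelPCA = @load KernelPCA pkg=MultivariateStats

X = (x1 = randn(100), x2 = randn(100), x3 = randn(100))  ## toy table of Continuous features

mach = machine(KernelPCA(maxoutdim=2), X) |> fit!

Xsmall = transform(mach, X)                ## projection with 2 columns
Xrec   = inverse_transform(mach, Xsmall)   ## reconstruction with 3 columns
schema(Xrec)                               ## same number of columns as X; values only approximate X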

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data.
  • principalvars: The variance of the principal components.

Examples

using MLJ
 using LinearAlgebra
 
 KernelPCA = @load KernelPCA pkg=MultivariateStats
@@ -13,4 +13,4 @@
 model = KernelPCA(maxoutdim=2, kernel=rbf_kernel(1))
 mach = machine(model, X) |> fit!
 
-Xproj = transform(mach, X)

See also PCA, ICA, FactorAnalysis, PPCA

+Xproj = transform(mach, X)

See also PCA, ICA, FactorAnalysis, PPCA

diff --git a/dev/models/KernelPerceptronClassifier_BetaML/index.html b/dev/models/KernelPerceptronClassifier_BetaML/index.html index 14874bc39..77a6e22c0 100644 --- a/dev/models/KernelPerceptronClassifier_BetaML/index.html +++ b/dev/models/KernelPerceptronClassifier_BetaML/index.html @@ -1,5 +1,5 @@ -KernelPerceptronClassifier · MLJ

KernelPerceptronClassifier

mutable struct KernelPerceptronClassifier <: MLJModelInterface.Probabilistic

The kernel perceptron algorithm using one-vs-one for multiclass, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • kernel::Function: Kernel function to employ. See ?radial_kernel or ?polynomial_kernel (once the BetaML package is loaded) for details, or check ?BetaML.Utils to see whether other kernels are defined (you can always define your own kernel) [def: radial_kernel]
  • epochs::Int64: Maximum number of epochs, i.e. passes through the whole training sample [def: 100]
  • initial_errors::Union{Nothing, Vector{Vector{Int64}}}: Initial distribution of the number of errors [def: nothing, i.e. zeros]. If provided, this should be a nModels-length vector of nRecords-length integer vectors, where nModels is computed as (n_classes * (n_classes - 1)) / 2
  • shuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+KernelPerceptronClassifier · MLJ

KernelPerceptronClassifier

mutable struct KernelPerceptronClassifier <: MLJModelInterface.Probabilistic

The kernel perceptron algorithm using one-vs-one for multiclass, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • kernel::Function: Kernel function to employ. See ?radial_kernel or ?polynomial_kernel (once the BetaML package is loaded) for details, or check ?BetaML.Utils to see whether other kernels are defined (you can always define your own kernel) [def: radial_kernel]. See the sketch after this list for passing an alternative kernel.
  • epochs::Int64: Maximum number of epochs, i.e. passes through the whole training sample [def: 100]
  • initial_errors::Union{Nothing, Vector{Vector{Int64}}}: Initial distribution of the number of errors [def: nothing, i.e. zeros]. If provided, this should be a nModels-length vector of nRecords-length integer vectors, where nModels is computed as (n_classes * (n_classes - 1)) / 2
  • shuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]
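
As a hedged sketch of a non-default kernel choice, the following uses BetaML's polynomial_kernel on the iris data; the epochs value is an illustrative assumption:

using MLJ
import BetaML   ## provides radial_kernel and polynomial_kernel

KernelPerceptronClassifier = @load KernelPerceptronClassifier pkg=BetaML

X, y = @load_iris

model = KernelPerceptronClassifier(kernel=BetaML.polynomial_kernel, epochs=50)
mach  = machine(model, X, y) |> fit!

yhat = predict(mach, X)   ## probabilistic predictions
predict_mode(mach, X)     ## point predictions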

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -26,4 +26,4 @@
  UnivariateFinite{Multiclass{3}}(setosa=>0.665, versicolor=>0.245, virginica=>0.09)
  ⋮
  UnivariateFinite{Multiclass{3}}(setosa=>0.09, versicolor=>0.245, virginica=>0.665)
- UnivariateFinite{Multiclass{3}}(setosa=>0.09, versicolor=>0.665, virginica=>0.245)
+ UnivariateFinite{Multiclass{3}}(setosa=>0.09, versicolor=>0.665, virginica=>0.245)
diff --git a/dev/models/LADRegressor_MLJLinearModels/index.html b/dev/models/LADRegressor_MLJLinearModels/index.html index 76903426f..9fe2cc487 100644 --- a/dev/models/LADRegressor_MLJLinearModels/index.html +++ b/dev/models/LADRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -LADRegressor · MLJ

LADRegressor

LADRegressor

A model type for constructing a least absolute deviation (LAD) regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LADRegressor = @load LADRegressor pkg=MLJLinearModels

Do model = LADRegressor() to construct an instance with default hyper-parameters.

Least absolute deviation regression is a linear model with objective function

$∑ρ(Xθ - y) + n⋅λ|θ|₂² + n⋅γ|θ|₁$

where $ρ$ is the absolute loss and $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is instead

$∑ρ(Xθ - y) + λ|θ|₂² + γ|θ|₁$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

See also RobustRegressor.

Parameters

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S, where S is one of LBFGS or IWLSCG if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
+LADRegressor · MLJ

LADRegressor

LADRegressor

A model type for constructing a least absolute deviation (LAD) regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LADRegressor = @load LADRegressor pkg=MLJLinearModels

Do model = LADRegressor() to construct an instance with default hyper-parameters.

Least absolute deviation regression is a linear model with objective function

$∑ρ(Xθ - y) + n⋅λ|θ|₂² + n⋅γ|θ|₁$

where $ρ$ is the absolute loss and $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is instead

$∑ρ(Xθ - y) + λ|θ|₂² + γ|θ|₁$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

See also RobustRegressor.

Parameters

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S, where S is one of LBFGS or IWLSCG if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default), then LBFGS() is used when penalty = :l2, and ProxGrad(accel=true) (FISTA) otherwise.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...). Default: nothing. See the sketch after this list for an explicit solver choice.
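
A minimal sketch of fitting with the default solver and with an explicitly chosen one; the lambda value and the use of IWLSCG with the default :l2 penalty are illustrative assumptions:

using MLJ
import MLJLinearModels

LADRegressor = @load LADRegressor pkg=MLJLinearModels

X, y = make_regression(100, 3)

## default: solver=nothing selects LBFGS() for the default :l2 penalty
mach_default = fit!(machine(LADRegressor(), X, y))

## explicit solver: iteratively reweighted least squares with conjugate gradient
model = LADRegressor(lambda=0.5, solver=MLJLinearModels.IWLSCG())
mach  = fit!(machine(model, X, y))

fitted_params(mach)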

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(LADRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)
+fitted_params(mach)
diff --git a/dev/models/LDA_MultivariateStats/index.html b/dev/models/LDA_MultivariateStats/index.html index be5eaaa58..19409c4c6 100644 --- a/dev/models/LDA_MultivariateStats/index.html +++ b/dev/models/LDA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -LDA · MLJ

LDA

LDA

A model type for constructing a linear discriminant analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LDA = @load LDA pkg=MultivariateStats

Do model = LDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LDA(method=...).

Multiclass linear discriminant analysis learns a projection in a space of features to a lower dimensional space, in a way that attempts to preserve as much as possible the degree to which the classes of a discrete target variable can be discriminated. This can be used either for dimension reduction of the features (see transform below) or for probabilistic classification of the target (see predict below).

In the case of prediction, the class probability for a new observation reflects the proximity of that observation to training observations associated with that class, and how far away the observation is from observations associated with other classes. Specifically, the distances, in the transformed (projected) space, of a new observation from the centroid of each target class are computed; the resulting vector of distances, multiplied by minus one, is passed to a softmax function to obtain a class probability prediction. Here "distance" is computed using a user-specified distance function.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • method::Symbol=:gevd: The solver, one of :gevd or :whiten.
  • cov_w::StatsBase.SimpleCovariance(): An estimator for the within-class covariance (used in computing the within-class scatter matrix, Sw). Any robust estimator from CovarianceEstimation.jl can be used.
  • cov_b::StatsBase.SimpleCovariance(): The same as cov_w but for the between-class covariance (used in computing the between-class scatter matrix, Sb).
  • outdim::Int=0: The output dimension, i.e. the dimension of the transformed space, automatically set to min(indim, nclasses-1) if equal to 0.
  • regcoef::Float64=1e-6: The regularization coefficient. A positive value regcoef*eigmax(Sw) where Sw is the within-class scatter matrix, is added to the diagonal of Sw to improve numerical stability. This can be useful if using the standard covariance estimator.
  • dist=Distances.SqEuclidean(): The distance metric to use when performing classification (to compare the distance between a new point and centroids in the transformed space); must be a subtype of Distances.SemiMetric from Distances.jl, e.g., Distances.CosineDist.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space, i.e. the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The mean of the untransformed training data. A vector of length indim.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).
  • class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).
  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • Sb: The between class scatter matrix.
  • Sw: The within class scatter matrix.

Examples

using MLJ
+LDA · MLJ

LDA

LDA

A model type for constructing a linear discriminant analysis model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LDA = @load LDA pkg=MultivariateStats

Do model = LDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LDA(method=...).

Multiclass linear discriminant analysis learns a projection in a space of features to a lower dimensional space, in a way that attempts to preserve as much as possible the degree to which the classes of a discrete target variable can be discriminated. This can be used either for dimension reduction of the features (see transform below) or for probabilistic classification of the target (see predict below).

In the case of prediction, the class probability for a new observation reflects the proximity of that observation to training observations associated with that class, and how far away the observation is from observations associated with other classes. Specifically, the distances, in the transformed (projected) space, of a new observation from the centroid of each target class are computed; the resulting vector of distances, multiplied by minus one, is passed to a softmax function to obtain a class probability prediction. Here "distance" is computed using a user-specified distance function (see the sketch below).
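
To make that rule concrete, here is a small stand-alone sketch of the softmax-of-negated-distances computation; the three distance values are purely illustrative:

## distances of one new observation to three class centroids,
## measured in the transformed space with the chosen dist metric
d = [0.4, 1.3, 2.1]

## softmax of the negated distances gives the class probabilities
softmax(v) = exp.(v .- maximum(v)) ./ sum(exp.(v .- maximum(v)))
probs = softmax(-d)   ## the closest centroid receives the highest probability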

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • method::Symbol=:gevd: The solver, one of :gevd or :whiten.
  • cov_w::StatsBase.SimpleCovariance(): An estimator for the within-class covariance (used in computing the within-class scatter matrix, Sw). Any robust estimator from CovarianceEstimation.jl can be used.
  • cov_b::StatsBase.SimpleCovariance(): The same as cov_w but for the between-class covariance (used in computing the between-class scatter matrix, Sb).
  • outdim::Int=0: The output dimension, i.e. the dimension of the transformed space, automatically set to min(indim, nclasses-1) if equal to 0.
  • regcoef::Float64=1e-6: The regularization coefficient. A positive value regcoef*eigmax(Sw) where Sw is the within-class scatter matrix, is added to the diagonal of Sw to improve numerical stability. This can be useful if using the standard covariance estimator.
  • dist=Distances.SqEuclidean(): The distance metric to use when performing classification (to compare the distance between a new point and centroids in the transformed space); must be a subtype of Distances.SemiMetric from Distances.jl, e.g., Distances.CosineDist.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space, i.e. the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The mean of the untransformed training data. A vector of length indim.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool).
  • class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).
  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • Sb: The between class scatter matrix.
  • Sw: The within class scatter matrix.

Examples

using MLJ
 
 LDA = @load LDA pkg=MultivariateStats
 
@@ -11,4 +11,4 @@
 Xproj = transform(mach, X)
 y_hat = predict(mach, X)
 labels = predict_mode(mach, X)
-

See also BayesianLDA, SubspaceLDA, BayesianSubspaceLDA

+

See also BayesianLDA, SubspaceLDA, BayesianSubspaceLDA

diff --git a/dev/models/LGBMClassifier_LightGBM/index.html b/dev/models/LGBMClassifier_LightGBM/index.html index 5aac49fe2..2d79ffd82 100644 --- a/dev/models/LGBMClassifier_LightGBM/index.html +++ b/dev/models/LGBMClassifier_LightGBM/index.html @@ -1,2 +1,2 @@ -LGBMClassifier · MLJ
+LGBMClassifier · MLJ
diff --git a/dev/models/LGBMRegressor_LightGBM/index.html b/dev/models/LGBMRegressor_LightGBM/index.html index 95f37ff69..274419fad 100644 --- a/dev/models/LGBMRegressor_LightGBM/index.html +++ b/dev/models/LGBMRegressor_LightGBM/index.html @@ -1,2 +1,2 @@ -LGBMRegressor · MLJ
+LGBMRegressor · MLJ
diff --git a/dev/models/LMDDDetector_OutlierDetectionPython/index.html b/dev/models/LMDDDetector_OutlierDetectionPython/index.html index 2b89ba5da..286d97b24 100644 --- a/dev/models/LMDDDetector_OutlierDetectionPython/index.html +++ b/dev/models/LMDDDetector_OutlierDetectionPython/index.html @@ -1,4 +1,4 @@ -LMDDDetector · MLJ

LMDDDetector

LMDDDetector(n_iter = 50,
+LMDDDetector · MLJ
+                random_state = nothing)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.lmdd

diff --git a/dev/models/LOCIDetector_OutlierDetectionPython/index.html b/dev/models/LOCIDetector_OutlierDetectionPython/index.html index eb42d2b30..3810ec4ab 100644 --- a/dev/models/LOCIDetector_OutlierDetectionPython/index.html +++ b/dev/models/LOCIDetector_OutlierDetectionPython/index.html @@ -1,3 +1,3 @@ -LOCIDetector · MLJ
+LOCIDetector · MLJ
diff --git a/dev/models/LODADetector_OutlierDetectionPython/index.html b/dev/models/LODADetector_OutlierDetectionPython/index.html index f061dc7f4..21be7f926 100644 --- a/dev/models/LODADetector_OutlierDetectionPython/index.html +++ b/dev/models/LODADetector_OutlierDetectionPython/index.html @@ -1,3 +1,3 @@ -LODADetector · MLJ
+LODADetector · MLJ
diff --git a/dev/models/LOFDetector_OutlierDetectionNeighbors/index.html b/dev/models/LOFDetector_OutlierDetectionNeighbors/index.html index 793098696..ddb6c7077 100644 --- a/dev/models/LOFDetector_OutlierDetectionNeighbors/index.html +++ b/dev/models/LOFDetector_OutlierDetectionNeighbors/index.html @@ -1,5 +1,5 @@ -LOFDetector · MLJ

LOFDetector

LOFDetector(k = 5,
+LOFDetector · MLJ

LOFDetector

LOFDetector(k = 5,
             metric = Euclidean(),
             algorithm = :kdtree,
             leafsize = 10,
@@ -8,4 +8,4 @@
 detector = LOFDetector()
 X = rand(10, 100)
 model, result = fit(detector, X; verbosity=0)
-test_scores = transform(detector, model, X)

References

[1] Breunig, Markus M.; Kriegel, Hans-Peter; Ng, Raymond T.; Sander, Jörg (2000): LOF: Identifying Density-Based Local Outliers.

+test_scores = transform(detector, model, X)

References

[1] Breunig, Markus M.; Kriegel, Hans-Peter; Ng, Raymond T.; Sander, Jörg (2000): LOF: Identifying Density-Based Local Outliers.

diff --git a/dev/models/LOFDetector_OutlierDetectionPython/index.html b/dev/models/LOFDetector_OutlierDetectionPython/index.html index 1a858b2e0..c0acdc688 100644 --- a/dev/models/LOFDetector_OutlierDetectionPython/index.html +++ b/dev/models/LOFDetector_OutlierDetectionPython/index.html @@ -1,9 +1,9 @@ -LOFDetector · MLJ

LOFDetector

LOFDetector(n_neighbors = 5,
+LOFDetector · MLJ
+               novelty = true)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.lof

diff --git a/dev/models/LarsCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/LarsCVRegressor_MLJScikitLearnInterface/index.html index 5fc3fe41c..128fdb45a 100644 --- a/dev/models/LarsCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LarsCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LarsCVRegressor · MLJ

LarsCVRegressor

LarsCVRegressor

A model type for constructing a least angle regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LarsCVRegressor = @load LarsCVRegressor pkg=MLJScikitLearnInterface

Do model = LarsCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LarsCVRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • verbose = false
  • max_iter = 500
  • precompute = auto
  • cv = 5
  • max_n_alphas = 1000
  • n_jobs = nothing
  • eps = 2.220446049250313e-16
  • copy_X = true
+LarsCVRegressor · MLJ

LarsCVRegressor

LarsCVRegressor

A model type for constructing a least angle regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LarsCVRegressor = @load LarsCVRegressor pkg=MLJScikitLearnInterface

Do model = LarsCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LarsCVRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • verbose = false
  • max_iter = 500
  • precompute = auto
  • cv = 5
  • max_n_alphas = 1000
  • n_jobs = nothing
  • eps = 2.220446049250313e-16
  • copy_X = true
diff --git a/dev/models/LarsRegressor_MLJScikitLearnInterface/index.html b/dev/models/LarsRegressor_MLJScikitLearnInterface/index.html index 32f3b35f0..365ec31b2 100644 --- a/dev/models/LarsRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LarsRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LarsRegressor · MLJ

LarsRegressor

LarsRegressor

A model type for constructing a least angle regressor (LARS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LarsRegressor = @load LarsRegressor pkg=MLJScikitLearnInterface

Do model = LarsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LarsRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • verbose = false
  • precompute = auto
  • n_nonzero_coefs = 500
  • eps = 2.220446049250313e-16
  • copy_X = true
  • fit_path = true
+LarsRegressor · MLJ

LarsRegressor

LarsRegressor

A model type for constructing a least angle regressor (LARS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LarsRegressor = @load LarsRegressor pkg=MLJScikitLearnInterface

Do model = LarsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LarsRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • verbose = false
  • precompute = auto
  • n_nonzero_coefs = 500
  • eps = 2.220446049250313e-16
  • copy_X = true
  • fit_path = true
diff --git a/dev/models/LassoCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/LassoCVRegressor_MLJScikitLearnInterface/index.html index 393aec573..670e795da 100644 --- a/dev/models/LassoCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LassoCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LassoCVRegressor · MLJ

LassoCVRegressor

LassoCVRegressor

A model type for constructing a lasso regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoCVRegressor = @load LassoCVRegressor pkg=MLJScikitLearnInterface

Do model = LassoCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoCVRegressor(eps=...).

Hyper-parameters

  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • precompute = auto
  • max_iter = 1000
  • tol = 0.0001
  • copy_X = true
  • cv = 5
  • verbose = false
  • n_jobs = nothing
  • positive = false
  • random_state = nothing
  • selection = cyclic
+LassoCVRegressor · MLJ

LassoCVRegressor

LassoCVRegressor

A model type for constructing a lasso regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoCVRegressor = @load LassoCVRegressor pkg=MLJScikitLearnInterface

Do model = LassoCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoCVRegressor(eps=...).

Hyper-parameters

  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • precompute = auto
  • max_iter = 1000
  • tol = 0.0001
  • copy_X = true
  • cv = 5
  • verbose = false
  • n_jobs = nothing
  • positive = false
  • random_state = nothing
  • selection = cyclic
diff --git a/dev/models/LassoLarsCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/LassoLarsCVRegressor_MLJScikitLearnInterface/index.html index 347e4dfd7..92b69d38a 100644 --- a/dev/models/LassoLarsCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LassoLarsCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LassoLarsCVRegressor · MLJ

LassoLarsCVRegressor

LassoLarsCVRegressor

A model type for constructing a Lasso model fit with least angle regression (LARS) with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoLarsCVRegressor = @load LassoLarsCVRegressor pkg=MLJScikitLearnInterface

Do model = LassoLarsCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsCVRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • verbose = false
  • max_iter = 500
  • precompute = auto
  • cv = 5
  • max_n_alphas = 1000
  • n_jobs = nothing
  • eps = 2.220446049250313e-16
  • copy_X = true
  • positive = false
+LassoLarsCVRegressor · MLJ

LassoLarsCVRegressor

LassoLarsCVRegressor

A model type for constructing a Lasso model fit with least angle regression (LARS) with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoLarsCVRegressor = @load LassoLarsCVRegressor pkg=MLJScikitLearnInterface

Do model = LassoLarsCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsCVRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • verbose = false
  • max_iter = 500
  • precompute = auto
  • cv = 5
  • max_n_alphas = 1000
  • n_jobs = nothing
  • eps = 2.220446049250313e-16
  • copy_X = true
  • positive = false
diff --git a/dev/models/LassoLarsICRegressor_MLJScikitLearnInterface/index.html b/dev/models/LassoLarsICRegressor_MLJScikitLearnInterface/index.html index 854279565..230109fcc 100644 --- a/dev/models/LassoLarsICRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LassoLarsICRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LassoLarsICRegressor · MLJ

LassoLarsICRegressor

LassoLarsICRegressor

A model type for constructing a Lasso model with LARS using BIC or AIC for model selection, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoLarsICRegressor = @load LassoLarsICRegressor pkg=MLJScikitLearnInterface

Do model = LassoLarsICRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsICRegressor(criterion=...).

Hyper-parameters

  • criterion = aic
  • fit_intercept = true
  • verbose = false
  • precompute = auto
  • max_iter = 500
  • eps = 2.220446049250313e-16
  • copy_X = true
  • positive = false
+LassoLarsICRegressor · MLJ

LassoLarsICRegressor

LassoLarsICRegressor

A model type for constructing a Lasso model with LARS using BIC or AIC for model selection, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoLarsICRegressor = @load LassoLarsICRegressor pkg=MLJScikitLearnInterface

Do model = LassoLarsICRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsICRegressor(criterion=...).

Hyper-parameters

  • criterion = aic
  • fit_intercept = true
  • verbose = false
  • precompute = auto
  • max_iter = 500
  • eps = 2.220446049250313e-16
  • copy_X = true
  • positive = false
diff --git a/dev/models/LassoLarsRegressor_MLJScikitLearnInterface/index.html b/dev/models/LassoLarsRegressor_MLJScikitLearnInterface/index.html index 84ab29902..659e07f0a 100644 --- a/dev/models/LassoLarsRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LassoLarsRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LassoLarsRegressor · MLJ

LassoLarsRegressor

LassoLarsRegressor

A model type for constructing a Lasso model fit with least angle regression (LARS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoLarsRegressor = @load LassoLarsRegressor pkg=MLJScikitLearnInterface

Do model = LassoLarsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • verbose = false
  • precompute = auto
  • max_iter = 500
  • eps = 2.220446049250313e-16
  • copy_X = true
  • fit_path = true
  • positive = false
+LassoLarsRegressor · MLJ

LassoLarsRegressor

LassoLarsRegressor

A model type for constructing a Lasso model fit with least angle regression (LARS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoLarsRegressor = @load LassoLarsRegressor pkg=MLJScikitLearnInterface

Do model = LassoLarsRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoLarsRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • verbose = false
  • precompute = auto
  • max_iter = 500
  • eps = 2.220446049250313e-16
  • copy_X = true
  • fit_path = true
  • positive = false
diff --git a/dev/models/LassoRegressor_MLJLinearModels/index.html b/dev/models/LassoRegressor_MLJLinearModels/index.html index 1fa04006e..1339a3b4a 100644 --- a/dev/models/LassoRegressor_MLJLinearModels/index.html +++ b/dev/models/LassoRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -LassoRegressor · MLJ

LassoRegressor

LassoRegressor

A model type for constructing a lasso regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoRegressor = @load LassoRegressor pkg=MLJLinearModels

Do model = LassoRegressor() to construct an instance with default hyper-parameters.

Lasso regression is a linear model with objective function

$|Xθ - y|₂²/2 + n⋅λ|θ|₁$

where $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is

$|Xθ - y|₂²/2 + λ|θ|₁$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the L1 regularization. Default: 1.0
  • fit_intercept::Bool: whether to fit the intercept or not. Default: true
  • penalize_intercept::Bool: whether to penalize the intercept. Default: false
  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true
  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.ProxGrad. If solver=nothing (default) then ProxGrad(accel=true) (FISTA) is used. Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...). Default: nothing

Example

using MLJ
+LassoRegressor · MLJ

LassoRegressor

LassoRegressor

A model type for constructing a lasso regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoRegressor = @load LassoRegressor pkg=MLJLinearModels

Do model = LassoRegressor() to construct an instance with default hyper-parameters.

Lasso regression is a linear model with objective function

$|Xθ - y|₂²/2 + n⋅λ|θ|₁$

where $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is

$|Xθ - y|₂²/2 + λ|θ|₁$.
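
As a concreteness check, both versions of the objective can be written out directly in Julia; the data, θ and λ below are illustrative assumptions only:

using LinearAlgebra

n, p = 100, 5
X = randn(n, p)
θ = randn(p)
y = X * θ .+ 0.1 .* randn(n)
λ = 0.5

## objective with scale_penalty_with_samples = true (the default)
obj_scaled = sum(abs2, X*θ .- y)/2 + n*λ*norm(θ, 1)

## objective with scale_penalty_with_samples = false
obj_unscaled = sum(abs2, X*θ .- y)/2 + λ*norm(θ, 1)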

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the L1 regularization. Default: 1.0
  • fit_intercept::Bool: whether to fit the intercept or not. Default: true
  • penalize_intercept::Bool: whether to penalize the intercept. Default: false
  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true
  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.ProxGrad. If solver=nothing (default) then ProxGrad(accel=true) (FISTA) is used. Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...). Default: nothing

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(LassoRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also ElasticNetRegressor.

+fitted_params(mach)

See also ElasticNetRegressor.

diff --git a/dev/models/LassoRegressor_MLJScikitLearnInterface/index.html b/dev/models/LassoRegressor_MLJScikitLearnInterface/index.html index 3f8a17b38..1de6724b2 100644 --- a/dev/models/LassoRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LassoRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LassoRegressor · MLJ

LassoRegressor

LassoRegressor

A model type for constructing a lasso regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoRegressor = @load LassoRegressor pkg=MLJScikitLearnInterface

Do model = LassoRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • precompute = false
  • copy_X = true
  • max_iter = 1000
  • tol = 0.0001
  • warm_start = false
  • positive = false
  • random_state = nothing
  • selection = cyclic
+LassoRegressor · MLJ

LassoRegressor

LassoRegressor

A model type for constructing a lasso regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LassoRegressor = @load LassoRegressor pkg=MLJScikitLearnInterface

Do model = LassoRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LassoRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • precompute = false
  • copy_X = true
  • max_iter = 1000
  • tol = 0.0001
  • warm_start = false
  • positive = false
  • random_state = nothing
  • selection = cyclic
diff --git a/dev/models/LinearBinaryClassifier_GLM/index.html b/dev/models/LinearBinaryClassifier_GLM/index.html index 6a802f64e..873049db1 100644 --- a/dev/models/LinearBinaryClassifier_GLM/index.html +++ b/dev/models/LinearBinaryClassifier_GLM/index.html @@ -1,5 +1,5 @@ -LinearBinaryClassifier · MLJ

LinearBinaryClassifier

LinearBinaryClassifier

A model type for constructing a linear binary classifier, based on GLM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearBinaryClassifier = @load LinearBinaryClassifier pkg=GLM

Do model = LinearBinaryClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearBinaryClassifier(fit_intercept=...).

LinearBinaryClassifier is a generalized linear model, specialised to the case of a binary target variable, with a user-specified link function. Options exist to specify an intercept or offset feature.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
+LinearBinaryClassifier · MLJ

LinearBinaryClassifier

LinearBinaryClassifier

A model type for constructing a linear binary classifier, based on GLM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearBinaryClassifier = @load LinearBinaryClassifier pkg=GLM

Do model = LinearBinaryClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearBinaryClassifier(fit_intercept=...).

LinearBinaryClassifier is a generalized linear model, specialised to the case of a binary target variable, with a user-specified link function. Options exist to specify an intercept or offset feature.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
 mach = machine(model, X, y, w)

Here

  • X: is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the scitype with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor(2) or <:Multiclass(2); check the scitype with schema(y)
  • w: is a vector of Real per-observation weights

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • fit_intercept=true: Whether to calculate the intercept for this model. If set to false, no intercept will be calculated (e.g. the data is expected to be centered)
  • link=GLM.LogitLink: The function which links the linear prediction function to the probability of a particular outcome or class. This must have type GLM.Link01. Options include GLM.LogitLink(), GLM.ProbitLink(), GLM.CloglogLink() and GLM.CauchitLink(); see the sketch after this list for a non-default choice.
  • offsetcol=nothing: Name of the column to be used as an offset, if any. An offset is a variable which is known to have a coefficient of 1.
  • maxiter::Integer=30: The maximum number of iterations allowed to achieve convergence.
  • atol::Real=1e-6: Absolute threshold for convergence. Convergence is achieved when the relative change in deviance is less than max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.
  • rtol::Real=1e-6: Relative threshold for convergence. Convergence is achieved when the relative change in deviance is less than max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.
  • minstepfac::Real=0.001: Minimum step fraction. Must be between 0 and 1. Lower bound for the factor used to update the linear fit.
  • report_keys: Vector of keys for the report. Possible keys are: :deviance, :dof_residual, :stderror, :vcov, :coef_table and :glm_model. By default only :glm_model is excluded.
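
As a hedged illustration of a non-default link choice, the sketch below fits a probit model on synthetic two-class data; the dataset is an assumption for the example only:

using MLJ
import GLM   ## namespace must be available for the link types

LinearBinaryClassifier = @load LinearBinaryClassifier pkg=GLM

model = LinearBinaryClassifier(link=GLM.ProbitLink())   ## probit instead of the default logit

X, y = make_moons(100)                 ## toy binary classification data
mach = machine(model, X, y) |> fit!

predict(mach, X)        ## probabilistic predictions
predict_mode(mach, X)   ## class labels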

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • features: The names of the features used during model fitting.
  • coef: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Report

The fields of report(mach) are:

  • deviance: Measure of deviance of fitted model with respect to a perfectly fitted model. For a linear model, this is the weighted residual sum of squares
  • dof_residual: The degrees of freedom for residuals, when meaningful.
  • stderror: The standard errors of the coefficients.
  • vcov: The estimated variance-covariance matrix of the coefficient estimates.
  • coef_table: Table which displays coefficients and summarizes their significance and confidence intervals.
  • glm_model: The raw fitted model returned by GLM.lm. Note this points to training data. Refer to the GLM.jl documentation for usage.

Examples

using MLJ
 import GLM ## namespace must be available
 
@@ -25,4 +25,4 @@
 fitted_params(mach).coef
 fitted_params(mach).intercept
 
-report(mach)

See also LinearRegressor, LinearCountRegressor

+report(mach)

See also LinearRegressor, LinearCountRegressor

diff --git a/dev/models/LinearCountRegressor_GLM/index.html b/dev/models/LinearCountRegressor_GLM/index.html index 43d8dc1d9..2fc3b4366 100644 --- a/dev/models/LinearCountRegressor_GLM/index.html +++ b/dev/models/LinearCountRegressor_GLM/index.html @@ -1,5 +1,5 @@ -LinearCountRegressor · MLJ

LinearCountRegressor

LinearCountRegressor

A model type for constructing a linear count regressor, based on GLM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearCountRegressor = @load LinearCountRegressor pkg=GLM

Do model = LinearCountRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearCountRegressor(fit_intercept=...).

LinearCountRegressor is a generalized linear model, specialised to the case of a Count target variable (non-negative, unbounded integer) with user-specified link function. Options exist to specify an intercept or offset feature.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
+LinearCountRegressor · MLJ

LinearCountRegressor

LinearCountRegressor

A model type for constructing a linear count regressor, based on GLM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearCountRegressor = @load LinearCountRegressor pkg=GLM

Do model = LinearCountRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearCountRegressor(fit_intercept=...).

LinearCountRegressor is a generalized linear model, specialised to the case of a Count target variable (non-negative, unbounded integer) with user-specified link function. Options exist to specify an intercept or offset feature.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
 mach = machine(model, X, y, w)

Here

  • X: is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the scitype with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is Count; check the scitype with schema(y)
  • w: is a vector of Real per-observation weights

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • fit_intercept=true: Whether to calculate the intercept for this model. If set to false, no intercept will be calculated (e.g. the data is expected to be centered)
  • distribution=Distributions.Poisson(): The distribution which the residuals/errors of the model should fit.
  • link=GLM.LogLink(): The function which links the linear prediction function to the probability of a particular outcome or class. This should be one of the following: GLM.IdentityLink(), GLM.InverseLink(), GLM.InverseSquareLink(), GLM.LogLink(), GLM.SqrtLink(). A usage sketch follows this list.
  • offsetcol=nothing: Name of the column to be used as an offset, if any. An offset is a variable which is known to have a coefficient of 1.
  • maxiter::Integer=30: The maximum number of iterations allowed to achieve convergence.
  • atol::Real=1e-6: Absolute threshold for convergence. Convergence is achieved when the relative change in deviance is less than max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.
  • rtol::Real=1e-6: Relative threshold for convergence. Convergence is achieved when the relative change in deviance is less than max(rtol*dev, atol). This term exists to avoid failure when deviance is unchanged except for rounding errors.
  • minstepfac::Real=0.001: Minimum step fraction. Must be between 0 and 1. Lower bound for the factor used to update the linear fit.
  • report_keys: Vector of keys for the report. Possible keys are: :deviance, :dof_residual, :stderror, :vcov, :coef_table and :glm_model. By default only :glm_model is excluded.
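
A minimal sketch of a Poisson count regression with the log link; the synthetic data generation is an assumption for illustration:

using MLJ
import GLM
import Distributions

LinearCountRegressor = @load LinearCountRegressor pkg=GLM

model = LinearCountRegressor(distribution=Distributions.Poisson(), link=GLM.LogLink())

n = 200
X = (x1 = randn(n), x2 = randn(n))
y = rand.(Distributions.Poisson.(exp.(1 .+ 0.5 .* X.x1 .- 0.3 .* X.x2)))   ## integer Count target

mach = machine(model, X, y) |> fit!

predict_mean(mach, X)   ## expected counts, one per observation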

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same Scitype as X above. Predictions are probabilistic.
  • predict_mean(mach, Xnew): instead return the mean of each prediction above
  • predict_median(mach, Xnew): instead return the median of each prediction above.

Fitted parameters

The fields of fitted_params(mach) are:

  • features: The names of the features encountered during model fitting.
  • coef: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Report

The fields of report(mach) are:

  • deviance: Measure of deviance of fitted model with respect to a perfectly fitted model. For a linear model, this is the weighted residual sum of squares
  • dof_residual: The degrees of freedom for residuals, when meaningful.
  • stderror: The standard errors of the coefficients.
  • vcov: The estimated variance-covariance matrix of the coefficient estimates.
  • coef_table: Table which displays coefficients and summarizes their significance and confidence intervals.
  • glm_model: The raw fitted model returned by GLM.lm. Note this points to training data. Refer to the GLM.jl documentation for usage.

Examples

using MLJ
 import MLJ.Distributions.Poisson
 
@@ -31,4 +31,4 @@
  -2.0255901752504775
   3.014407534033522
 
-report(mach)

See also LinearRegressor, LinearBinaryClassifier

+report(mach)

See also LinearRegressor, LinearBinaryClassifier

diff --git a/dev/models/LinearRegressor_GLM/index.html b/dev/models/LinearRegressor_GLM/index.html index 9547f5d7b..b60cfcf48 100644 --- a/dev/models/LinearRegressor_GLM/index.html +++ b/dev/models/LinearRegressor_GLM/index.html @@ -1,5 +1,5 @@ -LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing a linear regressor, based on GLM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=GLM

Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(fit_intercept=...).

LinearRegressor assumes the target is a continuous variable whose conditional distribution is normal with constant variance, and whose expected value is a linear combination of the features (identity link function). Options exist to specify an intercept or offset feature.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
+LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing a linear regressor, based on GLM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=GLM

Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(fit_intercept=...).

LinearRegressor assumes the target is a continuous variable whose conditional distribution is normal with constant variance, and whose expected value is a linear combination of the features (identity link function). Options exist to specify an intercept or offset feature.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
 mach = machine(model, X, y, w)

Here

  • X: is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the scitype with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)
  • w: is a vector of Real per-observation weights

Hyper-parameters

  • fit_intercept=true: Whether to calculate the intercept for this model. If set to false, no intercept will be calculated (e.g. the data is expected to be centered)
  • dropcollinear=false: Whether to drop features in the training data to ensure linear independence. If true, only the first of each set of linearly-dependent features is used. The coefficient for redundant linearly dependent features is 0.0 and all associated statistics are set to NaN.
  • offsetcol=nothing: Name of the column to be used as an offset, if any. An offset is a variable which is known to have a coefficient of 1.
  • report_keys: Vector of keys for the report. Possible keys are: :deviance, :dof_residual, :stderror, :vcov, :coef_table and :glm_model. By default only :glm_model is excluded.

Train the machine using fit!(mach, rows=...).

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same Scitype as X above. Predictions are probabilistic.
  • predict_mean(mach, Xnew): instead return the mean of each prediction above
  • predict_median(mach, Xnew): instead return the median of each prediction above.

Fitted parameters

The fields of fitted_params(mach) are:

  • features: The names of the features encountered during model fitting.
  • coef: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Report

When all keys are enabled in report_keys, the following fields are available in report(mach) (a sketch of restricting report_keys follows this list):

  • deviance: Measure of deviance of fitted model with respect to a perfectly fitted model. For a linear model, this is the weighted residual sum of squares
  • dof_residual: The degrees of freedom for residuals, when meaningful.
  • stderror: The standard errors of the coefficients.
  • vcov: The estimated variance-covariance matrix of the coefficient estimates.
  • coef_table: Table which displays coefficients and summarizes their significance and confidence intervals.
  • glm_model: The raw fitted model returned by GLM.lm. Note this points to training data. Refer to the GLM.jl documentation for usage.
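
The following hedged sketch restricts the report to two of the documented keys; the particular selection is just an example:

using MLJ

LinearRegressor = @load LinearRegressor pkg=GLM

model = LinearRegressor(report_keys=[:coef_table, :stderror])

X, y = make_regression(100, 2)
mach = machine(model, X, y) |> fit!

keys(report(mach))   ## should now contain only the requested keys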

Examples

using MLJ
 LinearRegressor = @load LinearRegressor pkg=GLM
 glm = LinearRegressor()
@@ -15,4 +15,4 @@
 fitted_params(mach).coef ## x1, x2, intercept
 fitted_params(mach).intercept
 
-report(mach)

See also LinearCountRegressor, LinearBinaryClassifier

+report(mach)

See also LinearCountRegressor, LinearBinaryClassifier

diff --git a/dev/models/LinearRegressor_MLJLinearModels/index.html b/dev/models/LinearRegressor_MLJLinearModels/index.html index d1ac50834..1a56fdeb2 100644 --- a/dev/models/LinearRegressor_MLJLinearModels/index.html +++ b/dev/models/LinearRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing a linear regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=MLJLinearModels

Do model = LinearRegressor() to construct an instance with default hyper-parameters.

This model provides standard linear regression with objective function

$|Xθ - y|₂²/2$

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.Analytical. Use Analytical() for Cholesky and CG()=Analytical(iterative=true) for conjugate-gradient.

    If solver = nothing (default) then Analytical() is used. Default: nothing

Example

using MLJ
+LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing a linear regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=MLJLinearModels

Do model = LinearRegressor() to construct an instance with default hyper-parameters.

This model provides standard linear regression with objective function

$|Xθ - y|₂²/2$

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.Analytical. Use Analytical() for Cholesky and CG()=Analytical(iterative=true) for conjugate-gradient.

    If solver = nothing (default) then Analytical() is used. Default: nothing

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(LinearRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)
+fitted_params(mach)
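
A hedged variation on the example above, selecting the iterative (conjugate-gradient) analytical solver mentioned under "Hyperparameters"; it assumes MLJLinearModels is available for qualified access:

import MLJLinearModels
 model = LinearRegressor(solver=MLJLinearModels.Analytical(iterative=true)) ## i.e. the CG() alias
 mach = fit!(machine(model, X, y))
 predict(mach, X)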
diff --git a/dev/models/LinearRegressor_MLJScikitLearnInterface/index.html b/dev/models/LinearRegressor_MLJScikitLearnInterface/index.html index 049064a3c..2fe610fb5 100644 --- a/dev/models/LinearRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/LinearRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing an ordinary least-squares regressor (OLS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=MLJScikitLearnInterface

Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • copy_X = true
  • n_jobs = nothing
+LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing an ordinary least-squares regressor (OLS), based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=MLJScikitLearnInterface

Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • copy_X = true
  • n_jobs = nothing
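
No worked example is given above; the following is a minimal sketch (hypothetical data from make_regression) along the lines of the other LinearRegressor variants:

using MLJ
 LinearRegressor = @load LinearRegressor pkg=MLJScikitLearnInterface
 model = LinearRegressor(fit_intercept=true)
 X, y = make_regression(100, 3)
 mach = machine(model, X, y) |> fit!
 predict(mach, X)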
diff --git a/dev/models/LinearRegressor_MultivariateStats/index.html b/dev/models/LinearRegressor_MultivariateStats/index.html index 930053b10..8b5372207 100644 --- a/dev/models/LinearRegressor_MultivariateStats/index.html +++ b/dev/models/LinearRegressor_MultivariateStats/index.html @@ -1,5 +1,5 @@ -LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing a linear regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=MultivariateStats

Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(bias=...).

LinearRegressor assumes the target is a Continuous variable and trains a linear prediction function using the least squares algorithm. Options exist to specify a bias term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • bias=true: Include the bias term if true, otherwise fit without bias term.

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
+LinearRegressor · MLJ

LinearRegressor

LinearRegressor

A model type for constructing a linear regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearRegressor = @load LinearRegressor pkg=MultivariateStats

Do model = LinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearRegressor(bias=...).

LinearRegressor assumes the target is a Continuous variable and trains a linear prediction function using the least squares algorithm. Options exist to specify a bias term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check the column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • bias=true: Include the bias term if true, otherwise fit without bias term.

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
 
 LinearRegressor = @load LinearRegressor pkg=MultivariateStats
 linear_regressor = LinearRegressor()
@@ -8,4 +8,4 @@
 mach = machine(linear_regressor, X, y) |> fit!
 
 Xnew, _ = make_regression(3, 2)
-yhat = predict(mach, Xnew) ## new predictions

See also MultitargetLinearRegressor, RidgeRegressor, MultitargetRidgeRegressor

+yhat = predict(mach, Xnew) ## new predictions
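
Continuing the example above (a hedged sketch), the learned parameters can then be inspected via the fields listed under "Fitted parameters":

fp = fitted_params(mach)
 fp.coefficients ## linear coefficients
 fp.intercept    ## fitted intercept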

See also MultitargetLinearRegressor, RidgeRegressor, MultitargetRidgeRegressor

diff --git a/dev/models/LinearSVC_LIBSVM/index.html b/dev/models/LinearSVC_LIBSVM/index.html index 9487a5a1c..b40cbadc9 100644 --- a/dev/models/LinearSVC_LIBSVM/index.html +++ b/dev/models/LinearSVC_LIBSVM/index.html @@ -1,5 +1,5 @@ -LinearSVC · MLJ

LinearSVC

LinearSVC

A model type for constructing a linear support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearSVC = @load LinearSVC pkg=LIBSVM

Do model = LinearSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearSVC(solver=...).

Reference for algorithm and core C-library: Rong-En Fan et al (2008): "LIBLINEAR: A Library for Large Linear Classification." Journal of Machine Learning Research 9 1871-1874. Available at https://www.csie.ntu.edu.tw/~cjlin/papers/liblinear.pdf.

This model type is similar to SVC from the same package with the setting kernel=LIBSVM.Kernel.Linear, but is optimized for the linear case.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
+LinearSVC · MLJ

LinearSVC

LinearSVC

A model type for constructing a linear support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LinearSVC = @load LinearSVC pkg=LIBSVM

Do model = LinearSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LinearSVC(solver=...).

Reference for algorithm and core C-library: Rong-En Fan et al (2008): "LIBLINEAR: A Library for Large Linear Classification." Journal of Machine Learning Research 9 1871-1874. Available at https://www.csie.ntu.edu.tw/~cjlin/papers/liblinear.pdf.

This model type is similar to SVC from the same package with the setting kernel=LIBSVM.Kernel.Linear, but is optimized for the linear case.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
 mach = machine(model, X, y, w)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)
  • w: a dictionary of class weights, keyed on levels(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • solver=LIBSVM.Linearsolver.L2R_L2LOSS_SVC_DUAL: linear solver, which must be one of the following from the LIBSVM.jl package:

    • LIBSVM.Linearsolver.L2R_LR: L2-regularized logistic regression (primal)
    • LIBSVM.Linearsolver.L2R_L2LOSS_SVC_DUAL: L2-regularized L2-loss support vector classification (dual)
    • LIBSVM.Linearsolver.L2R_L2LOSS_SVC: L2-regularized L2-loss support vector classification (primal)
    • LIBSVM.Linearsolver.L2R_L1LOSS_SVC_DUAL: L2-regularized L1-loss support vector classification (dual)
    • LIBSVM.Linearsolver.MCSVM_CS: support vector classification by Crammer and Singer
    • LIBSVM.Linearsolver.L1R_L2LOSS_SVC: L1-regularized L2-loss support vector classification
    • LIBSVM.Linearsolver.L1R_LR: L1-regularized logistic regression
    • LIBSVM.Linearsolver.L2R_LR_DUAL: L2-regularized logistic regression (dual)
  • tolerance::Float64=Inf: tolerance for the stopping criterion;

  • cost=1.0 (range (0, Inf)): the parameter denoted $C$ in the cited reference; for greater regularization, decrease cost

  • bias= -1.0: if bias >= 0, instance x becomes [x; bias]; if bias < 0, no bias term added (default -1)

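For instance, a hedged sketch (hypothetical settings) combining a primal solver with stronger regularization, using the hyper-parameters listed above:

using MLJ
 import LIBSVM
 LinearSVC = @load LinearSVC pkg=LIBSVM
 model = LinearSVC(solver=LIBSVM.Linearsolver.L2R_L2LOSS_SVC, ## primal L2-loss solver
                   cost=0.2)                                  ## smaller cost => more regularization
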
Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation

Examples

using MLJ
 import LIBSVM
 
@@ -25,4 +25,4 @@
 3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
  "versicolor"
  "versicolor"
- "versicolor"

See also the SVC and NuSVC classifiers, and LIBSVM.jl and the original C implementation documentation.

+ "versicolor"

See also the SVC and NuSVC classifiers, and LIBSVM.jl and the original C implementation documentation.

diff --git a/dev/models/LogisticCVClassifier_MLJScikitLearnInterface/index.html b/dev/models/LogisticCVClassifier_MLJScikitLearnInterface/index.html index 7a15fbb7b..1cd391ad0 100644 --- a/dev/models/LogisticCVClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/LogisticCVClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LogisticCVClassifier · MLJ

LogisticCVClassifier

LogisticCVClassifier

A model type for constructing a logistic regression classifier with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LogisticCVClassifier = @load LogisticCVClassifier pkg=MLJScikitLearnInterface

Do model = LogisticCVClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LogisticCVClassifier(Cs=...).

Hyper-parameters

  • Cs = 10
  • fit_intercept = true
  • cv = 5
  • dual = false
  • penalty = l2
  • scoring = nothing
  • solver = lbfgs
  • tol = 0.0001
  • max_iter = 100
  • class_weight = nothing
  • n_jobs = nothing
  • verbose = 0
  • refit = true
  • intercept_scaling = 1.0
  • multi_class = auto
  • random_state = nothing
  • l1_ratios = nothing
+LogisticCVClassifier · MLJ

LogisticCVClassifier

LogisticCVClassifier

A model type for constructing a logistic regression classifier with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LogisticCVClassifier = @load LogisticCVClassifier pkg=MLJScikitLearnInterface

Do model = LogisticCVClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LogisticCVClassifier(Cs=...).

Hyper-parameters

  • Cs = 10
  • fit_intercept = true
  • cv = 5
  • dual = false
  • penalty = l2
  • scoring = nothing
  • solver = lbfgs
  • tol = 0.0001
  • max_iter = 100
  • class_weight = nothing
  • n_jobs = nothing
  • verbose = 0
  • refit = true
  • intercept_scaling = 1.0
  • multi_class = auto
  • random_state = nothing
  • l1_ratios = nothing
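
A minimal construction sketch (hypothetical values) using a few of the hyper-parameters above:

using MLJ
 LogisticCVClassifier = @load LogisticCVClassifier pkg=MLJScikitLearnInterface
 model = LogisticCVClassifier(Cs=20, cv=10, max_iter=200)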
diff --git a/dev/models/LogisticClassifier_MLJLinearModels/index.html b/dev/models/LogisticClassifier_MLJLinearModels/index.html index b93a55327..8db53ad52 100644 --- a/dev/models/LogisticClassifier_MLJLinearModels/index.html +++ b/dev/models/LogisticClassifier_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -LogisticClassifier · MLJ

LogisticClassifier

LogisticClassifier

A model type for constructing a logistic classifier, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels

Do model = LogisticClassifier() to construct an instance with default hyper-parameters.

This model is more commonly known as "logistic regression". It is a standard classifier for both binary and multiclass classification. The objective function applies either a logistic loss (binary target) or multinomial (softmax) loss, and has a mixed L1/L2 penalty:

$L(y, Xθ) + n⋅λ|θ|₂²/2 + n⋅γ|θ|₁$.

Here $L$ is either MLJLinearModels.LogisticLoss or MLJLinearModels.MultiClassLoss, $λ$ and $γ$ indicate the strength of the L2 (resp. L1) regularization components and $n$ is the number of training observations.

With scale_penalty_with_samples = false the objective function is instead

$L(y, Xθ) + λ|θ|₂²/2 + γ|θ|₁$.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1 and strength of the L2 regularizer if penalty is :en. Default: eps()

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of samples. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, Newton, NewtonCG, ProxGrad; but subject to the following restrictions:

    • If penalty = :l2, ProxGrad is disallowed. Otherwise, ProxGrad is the only option.
    • Unless scitype(y) <: Finite{2} (binary target) Newton is disallowed.

    If solver = nothing (default) then ProxGrad(accel=true) (FISTA) is used, unless gamma = 0, in which case LBFGS() is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
+LogisticClassifier · MLJ

LogisticClassifier

LogisticClassifier

A model type for constructing a logistic classifier, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels

Do model = LogisticClassifier() to construct an instance with default hyper-parameters.

This model is more commonly known as "logistic regression". It is a standard classifier for both binary and multiclass classification. The objective function applies either a logistic loss (binary target) or multinomial (softmax) loss, and has a mixed L1/L2 penalty:

$L(y, Xθ) + n⋅λ|θ|₂²/2 + n⋅γ|θ|₁$.

Here $L$ is either MLJLinearModels.LogisticLoss or MLJLinearModels.MultiClassLoss, $λ$ and $γ$ indicate the strength of the L2 (resp. L1) regularization components and $n$ is the number of training observations.

With scale_penalty_with_samples = false the objective function is instead

$L(y, Xθ) + λ|θ|₂²/2 + γ|θ|₁$.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1 and strength of the L2 regularizer if penalty is :en. Default: eps()

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of samples. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, Newton, NewtonCG, ProxGrad; but subject to the following restrictions:

    • If penalty = :l2, ProxGrad is disallowed. Otherwise, ProxGrad is the only option.
    • Unless scitype(y) <: Finite{2} (binary target) Newton is disallowed.

    If solver = nothing (default) then ProxGrad(accel=true) (FISTA) is used, unless gamma = 0, in which case LBFGS() is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
 X, y = make_blobs(centers = 2)
 mach = fit!(machine(LogisticClassifier(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also MultinomialClassifier.

+fitted_params(mach)
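
As a variation (a hedged sketch reusing X and y from above), an elastic-net penalty can be configured by combining the lambda and gamma hyperparameters:

model = LogisticClassifier(penalty=:en, lambda=0.1, gamma=0.05)
 mach = fit!(machine(model, X, y))
 predict(mach, X)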

See also MultinomialClassifier.

diff --git a/dev/models/LogisticClassifier_MLJScikitLearnInterface/index.html b/dev/models/LogisticClassifier_MLJScikitLearnInterface/index.html index 646264ccc..842e0d7c4 100644 --- a/dev/models/LogisticClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/LogisticClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -LogisticClassifier · MLJ

LogisticClassifier

LogisticClassifier

A model type for constructing a logistic regression classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LogisticClassifier = @load LogisticClassifier pkg=MLJScikitLearnInterface

Do model = LogisticClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LogisticClassifier(penalty=...).

Hyper-parameters

  • penalty = l2
  • dual = false
  • tol = 0.0001
  • C = 1.0
  • fit_intercept = true
  • intercept_scaling = 1.0
  • class_weight = nothing
  • random_state = nothing
  • solver = lbfgs
  • max_iter = 100
  • multi_class = auto
  • verbose = 0
  • warm_start = false
  • n_jobs = nothing
  • l1_ratio = nothing
+LogisticClassifier · MLJ

LogisticClassifier

LogisticClassifier

A model type for constructing a logistic regression classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

LogisticClassifier = @load LogisticClassifier pkg=MLJScikitLearnInterface

Do model = LogisticClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in LogisticClassifier(penalty=...).

Hyper-parameters

  • penalty = l2
  • dual = false
  • tol = 0.0001
  • C = 1.0
  • fit_intercept = true
  • intercept_scaling = 1.0
  • class_weight = nothing
  • random_state = nothing
  • solver = lbfgs
  • max_iter = 100
  • multi_class = auto
  • verbose = 0
  • warm_start = false
  • n_jobs = nothing
  • l1_ratio = nothing
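
A minimal construction sketch (hypothetical values):

using MLJ
 LogisticClassifier = @load LogisticClassifier pkg=MLJScikitLearnInterface
 model = LogisticClassifier(C=0.5, max_iter=200)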
diff --git a/dev/models/MCDDetector_OutlierDetectionPython/index.html b/dev/models/MCDDetector_OutlierDetectionPython/index.html index 265ca7dd0..8fd0c5431 100644 --- a/dev/models/MCDDetector_OutlierDetectionPython/index.html +++ b/dev/models/MCDDetector_OutlierDetectionPython/index.html @@ -1,5 +1,5 @@ -MCDDetector · MLJ

MCDDetector

MCDDetector(store_precision = true,
+MCDDetector · MLJ
+               random_state = nothing)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.mcd

diff --git a/dev/models/MeanShift_MLJScikitLearnInterface/index.html b/dev/models/MeanShift_MLJScikitLearnInterface/index.html index 6ddfc1370..1a197683e 100644 --- a/dev/models/MeanShift_MLJScikitLearnInterface/index.html +++ b/dev/models/MeanShift_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -MeanShift · MLJ

MeanShift

MeanShift

A model type for constructing a mean shift, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MeanShift = @load MeanShift pkg=MLJScikitLearnInterface

Do model = MeanShift() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MeanShift(bandwidth=...).

Mean shift clustering using a flat kernel. Mean shift clustering aims to discover "blobs" in a smooth density of samples. It is a centroid-based algorithm, which works by updating candidates for centroids to be the mean of the points within a given region. These candidates are then filtered in a post-processing stage to eliminate near-duplicates to form the final set of centroids.

+MeanShift · MLJ

MeanShift

MeanShift

A model type for constructing a mean shift, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MeanShift = @load MeanShift pkg=MLJScikitLearnInterface

Do model = MeanShift() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MeanShift(bandwidth=...).

Mean shift clustering using a flat kernel. Mean shift clustering aims to discover "blobs" in a smooth density of samples. It is a centroid-based algorithm, which works by updating candidates for centroids to be the mean of the points within a given region. These candidates are then filtered in a post-processing stage to eliminate near-duplicates to form the final set of centroids.

diff --git a/dev/models/MiniBatchKMeans_MLJScikitLearnInterface/index.html b/dev/models/MiniBatchKMeans_MLJScikitLearnInterface/index.html index 49f8c7f27..3331f10d8 100644 --- a/dev/models/MiniBatchKMeans_MLJScikitLearnInterface/index.html +++ b/dev/models/MiniBatchKMeans_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -MiniBatchKMeans · MLJ

MiniBatchKMeans

MiniBatchKMeans

A model type for constructing a Mini-Batch K-Means clusterer, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MiniBatchKMeans = @load MiniBatchKMeans pkg=MLJScikitLearnInterface

Do model = MiniBatchKMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MiniBatchKMeans(n_clusters=...).

Hyper-parameters

  • n_clusters = 8
  • max_iter = 100
  • batch_size = 100
  • verbose = 0
  • compute_labels = true
  • random_state = nothing
  • tol = 0.0
  • max_no_improvement = 10
  • init_size = nothing
  • n_init = 3
  • init = k-means++
  • reassignment_ratio = 0.01
+MiniBatchKMeans · MLJ

MiniBatchKMeans

MiniBatchKMeans

A model type for constructing a Mini-Batch K-Means clusterer, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MiniBatchKMeans = @load MiniBatchKMeans pkg=MLJScikitLearnInterface

Do model = MiniBatchKMeans() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MiniBatchKMeans(n_clusters=...).

Hyper-parameters

  • n_clusters = 8
  • max_iter = 100
  • batch_size = 100
  • verbose = 0
  • compute_labels = true
  • random_state = nothing
  • tol = 0.0
  • max_no_improvement = 10
  • init_size = nothing
  • n_init = 3
  • init = k-means++
  • reassignment_ratio = 0.01
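
A minimal sketch (hypothetical values, assuming the usual MLJ workflow for unsupervised models):

using MLJ
 MiniBatchKMeans = @load MiniBatchKMeans pkg=MLJScikitLearnInterface
 model = MiniBatchKMeans(n_clusters=3, batch_size=50)
 X, _ = make_blobs(500, 2; centers=3) ## synthetic data with three clusters
 mach = machine(model, X) |> fit!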
diff --git a/dev/models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/index.html index 274f2aac5..4cdb7b11c 100644 --- a/dev/models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/MultiTaskElasticNetCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -MultiTaskElasticNetCVRegressor · MLJ

MultiTaskElasticNetCVRegressor

MultiTaskElasticNetCVRegressor

A model type for constructing a multi-target elastic net regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskElasticNetCVRegressor = @load MultiTaskElasticNetCVRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskElasticNetCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskElasticNetCVRegressor(l1_ratio=...).

Hyper-parameters

  • l1_ratio = 0.5
  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.0001
  • cv = 5
  • copy_X = true
  • verbose = 0
  • n_jobs = nothing
  • random_state = nothing
  • selection = cyclic
+MultiTaskElasticNetCVRegressor · MLJ

MultiTaskElasticNetCVRegressor

MultiTaskElasticNetCVRegressor

A model type for constructing a multi-target elastic net regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskElasticNetCVRegressor = @load MultiTaskElasticNetCVRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskElasticNetCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskElasticNetCVRegressor(l1_ratio=...).

Hyper-parameters

  • l1_ratio = 0.5
  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.0001
  • cv = 5
  • copy_X = true
  • verbose = 0
  • n_jobs = nothing
  • random_state = nothing
  • selection = cyclic
diff --git a/dev/models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/index.html b/dev/models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/index.html index b749fb85c..354fbd4da 100644 --- a/dev/models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/MultiTaskElasticNetRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -MultiTaskElasticNetRegressor · MLJ

MultiTaskElasticNetRegressor

MultiTaskElasticNetRegressor

A model type for constructing a multi-target elastic net regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskElasticNetRegressor = @load MultiTaskElasticNetRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskElasticNetRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskElasticNetRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • l1_ratio = 0.5
  • fit_intercept = true
  • copy_X = true
  • max_iter = 1000
  • tol = 0.0001
  • warm_start = false
  • random_state = nothing
  • selection = cyclic
+MultiTaskElasticNetRegressor · MLJ

MultiTaskElasticNetRegressor

MultiTaskElasticNetRegressor

A model type for constructing a multi-target elastic net regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskElasticNetRegressor = @load MultiTaskElasticNetRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskElasticNetRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskElasticNetRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • l1_ratio = 0.5
  • fit_intercept = true
  • copy_X = true
  • max_iter = 1000
  • tol = 0.0001
  • warm_start = false
  • random_state = nothing
  • selection = cyclic
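
A hedged sketch for the multi-target case, assuming the targets are presented as a table with one column per target (hypothetical data):

using MLJ, DataFrames
 MultiTaskElasticNetRegressor = @load MultiTaskElasticNetRegressor pkg=MLJScikitLearnInterface
 X, y1 = make_regression(100, 3)
 y = DataFrame(y1=y1, y2=2 .* y1 .+ randn(100)) ## two related targets
 model = MultiTaskElasticNetRegressor(alpha=0.5, l1_ratio=0.3)
 mach = machine(model, X, y) |> fit!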
diff --git a/dev/models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/index.html index fd166ab9f..ab20b116c 100644 --- a/dev/models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/MultiTaskLassoCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -MultiTaskLassoCVRegressor · MLJ

MultiTaskLassoCVRegressor

MultiTaskLassoCVRegressor

A model type for constructing a multi-target lasso regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskLassoCVRegressor = @load MultiTaskLassoCVRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskLassoCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskLassoCVRegressor(eps=...).

Hyper-parameters

  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • max_iter = 300
  • tol = 0.0001
  • copy_X = true
  • cv = 5
  • verbose = false
  • n_jobs = 1
  • random_state = nothing
  • selection = cyclic
+MultiTaskLassoCVRegressor · MLJ

MultiTaskLassoCVRegressor

MultiTaskLassoCVRegressor

A model type for constructing a multi-target lasso regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskLassoCVRegressor = @load MultiTaskLassoCVRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskLassoCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskLassoCVRegressor(eps=...).

Hyper-parameters

  • eps = 0.001
  • n_alphas = 100
  • alphas = nothing
  • fit_intercept = true
  • max_iter = 300
  • tol = 0.0001
  • copy_X = true
  • cv = 5
  • verbose = false
  • n_jobs = 1
  • random_state = nothing
  • selection = cyclic
diff --git a/dev/models/MultiTaskLassoRegressor_MLJScikitLearnInterface/index.html b/dev/models/MultiTaskLassoRegressor_MLJScikitLearnInterface/index.html index 341edacbd..762b615d0 100644 --- a/dev/models/MultiTaskLassoRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/MultiTaskLassoRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -MultiTaskLassoRegressor · MLJ

MultiTaskLassoRegressor

MultiTaskLassoRegressor

A model type for constructing a multi-target lasso regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskLassoRegressor = @load MultiTaskLassoRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskLassoRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskLassoRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.0001
  • copy_X = true
  • random_state = nothing
  • selection = cyclic
+MultiTaskLassoRegressor · MLJ

MultiTaskLassoRegressor

MultiTaskLassoRegressor

A model type for constructing a multi-target lasso regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultiTaskLassoRegressor = @load MultiTaskLassoRegressor pkg=MLJScikitLearnInterface

Do model = MultiTaskLassoRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultiTaskLassoRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.0001
  • copy_X = true
  • random_state = nothing
  • selection = cyclic
diff --git a/dev/models/MultinomialClassifier_MLJLinearModels/index.html b/dev/models/MultinomialClassifier_MLJLinearModels/index.html index d7283cee5..1b97d20ee 100644 --- a/dev/models/MultinomialClassifier_MLJLinearModels/index.html +++ b/dev/models/MultinomialClassifier_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -MultinomialClassifier · MLJ

MultinomialClassifier

MultinomialClassifier

A model type for constructing a multinomial classifier, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultinomialClassifier = @load MultinomialClassifier pkg=MLJLinearModels

Do model = MultinomialClassifier() to construct an instance with default hyper-parameters.

This model coincides with LogisticClassifier, except certain optimizations possible in the special binary case will not be applied. Its hyperparameters are identical.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: eps()

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of samples. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, NewtonCG, ProxGrad; but subject to the following restrictions:

    • If penalty = :l2, ProxGrad is disallowed. Otherwise, ProxGrad is the only option.
    • Unless scitype(y) <: Finite{2} (binary target) Newton is disallowed.

    If solver = nothing (default) then ProxGrad(accel=true) (FISTA) is used, unless gamma = 0, in which case LBFGS() is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
+MultinomialClassifier · MLJ

MultinomialClassifier

MultinomialClassifier

A model type for constructing a multinomial classifier, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultinomialClassifier = @load MultinomialClassifier pkg=MLJLinearModels

Do model = MultinomialClassifier() to construct an instance with default hyper-parameters.

This model coincides with LogisticClassifier, except certain optimizations possible in the special binary case will not be applied. Its hyperparameters are identical.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: eps()

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of samples. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, NewtonCG, ProxGrad; but subject to the following restrictions:

    • If penalty = :l2, ProxGrad is disallowed. Otherwise, ProxGrad is the only option.
    • Unless scitype(y) <: Finite{2} (binary target) Newton is disallowed.

    If solver = nothing (default) then ProxGrad(accel=true) (FISTA) is used, unless gamma = 0, in which case LBFGS() is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
 X, y = make_blobs(centers = 3)
 mach = fit!(machine(MultinomialClassifier(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also LogisticClassifier.

+fitted_params(mach)

See also LogisticClassifier.

diff --git a/dev/models/MultinomialNBClassifier_MLJScikitLearnInterface/index.html b/dev/models/MultinomialNBClassifier_MLJScikitLearnInterface/index.html index 6d718ad0a..b2afc6b3e 100644 --- a/dev/models/MultinomialNBClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/MultinomialNBClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -MultinomialNBClassifier · MLJ

MultinomialNBClassifier

MultinomialNBClassifier

A model type for constructing a multinomial naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultinomialNBClassifier = @load MultinomialNBClassifier pkg=MLJScikitLearnInterface

Do model = MultinomialNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultinomialNBClassifier(alpha=...).

Multinomial naive Bayes classifier. It is suitable for classification with discrete features (e.g. word counts for text classification).

+MultinomialNBClassifier · MLJ

MultinomialNBClassifier

MultinomialNBClassifier

A model type for constructing a multinomial naive Bayes classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultinomialNBClassifier = @load MultinomialNBClassifier pkg=MLJScikitLearnInterface

Do model = MultinomialNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultinomialNBClassifier(alpha=...).

Multinomial naive Bayes classifier. It is suitable for classification with discrete features (e.g. word counts for text classification).

diff --git a/dev/models/MultinomialNBClassifier_NaiveBayes/index.html b/dev/models/MultinomialNBClassifier_NaiveBayes/index.html index 9338e84a5..0c6915873 100644 --- a/dev/models/MultinomialNBClassifier_NaiveBayes/index.html +++ b/dev/models/MultinomialNBClassifier_NaiveBayes/index.html @@ -1,5 +1,5 @@ -MultinomialNBClassifier · MLJ

MultinomialNBClassifier

MultinomialNBClassifier

A model type for constructing a multinomial naive Bayes classifier, based on NaiveBayes.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultinomialNBClassifier = @load MultinomialNBClassifier pkg=NaiveBayes

Do model = MultinomialNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultinomialNBClassifier(alpha=...).

The multinomial naive Bayes classifier is often applied when input features consist of counts (scitype Count) and when observations for a fixed target class are generated from a multinomial distribution with fixed probability vector, but whose sample length varies from observation to observation. For example, features might represent word counts in text documents being classified by sentiment.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Count; check the column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • alpha=1: Lidstone smoothing in estimation of multinomial probability vectors from training histograms (default corresponds to Laplacian smoothing).

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.
  • predict_mode(mach, Xnew): Return the mode of above predictions.

Fitted parameters

The fields of fitted_params(mach) are:

  • c_counts: A dictionary containing the observed count of each input class.
  • x_counts: A dictionary containing the categorical counts of each input class.
  • x_totals: The sum of each count (input feature), ungrouped.
  • n_obs: The total number of observations in the training data.

Examples

using MLJ
+MultinomialNBClassifier · MLJ

MultinomialNBClassifier

MultinomialNBClassifier

A model type for constructing a multinomial naive Bayes classifier, based on NaiveBayes.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultinomialNBClassifier = @load MultinomialNBClassifier pkg=NaiveBayes

Do model = MultinomialNBClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultinomialNBClassifier(alpha=...).

The multinomial naive Bayes classifier is often applied when input features consist of counts (scitype Count) and when observations for a fixed target class are generated from a multinomial distribution with fixed probability vector, but whose sample length varies from observation to observation. For example, features might represent word counts in text documents being classified by sentiment.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Count; check the column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is Finite; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • alpha=1: Lidstone smoothing in estimation of multinomial probability vectors from training histograms (default corresponds to Laplacian smoothing).

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.
  • predict_mode(mach, Xnew): Return the mode of above predictions.

Fitted parameters

The fields of fitted_params(mach) are:

  • c_counts: A dictionary containing the observed count of each input class.
  • x_counts: A dictionary containing the categorical counts of each input class.
  • x_totals: The sum of each count (input feature), ungrouped.
  • n_obs: The total number of observations in the training data.

Examples

using MLJ
 import TextAnalysis
 
 CountTransformer = @load CountTransformer pkg=MLJText
@@ -41,4 +41,4 @@
 log_loss(y_prob, y[5:6])
 
 ## point predictions:
-yhat = mode.(y_prob) ## or `predict_mode(mach2, rows=5:6)`

See also GaussianNBClassifier

+yhat = mode.(y_prob) ## or `predict_mode(mach2, rows=5:6)`

See also GaussianNBClassifier

diff --git a/dev/models/MultitargetGaussianMixtureRegressor_BetaML/index.html b/dev/models/MultitargetGaussianMixtureRegressor_BetaML/index.html index b38cf1bfc..a72e0cecf 100644 --- a/dev/models/MultitargetGaussianMixtureRegressor_BetaML/index.html +++ b/dev/models/MultitargetGaussianMixtureRegressor_BetaML/index.html @@ -1,5 +1,5 @@ -MultitargetGaussianMixtureRegressor · MLJ

MultitargetGaussianMixtureRegressor

mutable struct MultitargetGaussianMixtureRegressor <: MLJModelInterface.Deterministic

A non-linear regressor derived from fitting the data on a probabilistic model (Gaussian Mixture Model). Relatively fast but generally not very precise, except for data with a structure matching the chosen underlying mixture.

This is the multi-target version of the model. If you want to predict a single label (y), use the MLJ model GaussianMixtureRegressor.

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the [?GMM](@ref GMM) module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply in terms of a type; in this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set different than minimum_variance (see notes).

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • maximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. ∞]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+MultitargetGaussianMixtureRegressor · MLJ

MultitargetGaussianMixtureRegressor

mutable struct MultitargetGaussianMixtureRegressor <: MLJModelInterface.Deterministic

A non-linear regressor derived from fitting the data on a probabilistic model (Gaussian Mixture Model). Relatively fast but generally not very precise, except for data with a structure matching the chosen underlying mixture.

This is the multi-target version of the model. If you want to predict a single label (y), use the MLJ model GaussianMixtureRegressor.

Hyperparameters:

  • n_classes::Int64: Number of mixtures (latent classes) to consider [def: 3]

  • initial_probmixtures::Vector{Float64}: Initial probabilities of the categorical distribution (n_classes x 1) [default: []]

  • mixtures::Union{Type, Vector{<:BetaML.GMM.AbstractMixture}}: An array (of length n_classes) of the mixtures to employ (see the [?GMM](@ref GMM) module). Each mixture object can be provided with or without its parameters (e.g. mean and variance for the Gaussian ones). Fully qualified mixtures are useful only if the initialisation_strategy parameter is set to "given". This parameter can also be given simply in terms of a type; in this case it is automatically extended to a vector of n_classes mixtures of the specified type. Note that mixing of different mixture types is not currently supported. [def: [DiagonalGaussian() for i in 1:n_classes]]

  • tol::Float64: Tolerance to stop the algorithm [default: 10^(-6)]

  • minimum_variance::Float64: Minimum variance for the mixtures [default: 0.05]

  • minimum_covariance::Float64: Minimum covariance for the mixtures with full covariance matrix [default: 0]. This should be set different than minimum_variance (see notes).

  • initialisation_strategy::String: The computation method of the vector of the initial mixtures. One of the following:

    • "grid": using a grid approach
    • "given": using the mixture provided in the fully qualified mixtures parameter
    • "kmeans": use first kmeans (itself initialised with a "grid" strategy) to set the initial mixture centers [default]

    Note that currently "random" and "shuffle" initialisations are not supported in gmm-based algorithms.

  • maximum_iterations::Int64: Maximum number of iterations [def: typemax(Int64), i.e. ∞]

  • rng::Random.AbstractRNG: Random Number Generator [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
 
 julia> X, y        = @load_boston;
 
@@ -32,4 +32,4 @@
  23.3358  51.6717
   ⋮       
  16.6843  38.3686
- 16.6843  38.3686
+ 16.6843 38.3686
diff --git a/dev/models/MultitargetKNNClassifier_NearestNeighborModels/index.html b/dev/models/MultitargetKNNClassifier_NearestNeighborModels/index.html index 9afad905d..6e6d5414c 100644 --- a/dev/models/MultitargetKNNClassifier_NearestNeighborModels/index.html +++ b/dev/models/MultitargetKNNClassifier_NearestNeighborModels/index.html @@ -1,5 +1,5 @@ -MultitargetKNNClassifier · MLJ

MultitargetKNNClassifier

MultitargetKNNClassifier

A model type for constructing a multitarget K-nearest neighbor classifier, based on NearestNeighborModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetKNNClassifier = @load MultitargetKNNClassifier pkg=NearestNeighborModels

Do model = MultitargetKNNClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetKNNClassifier(K=...).

Multi-target K-Nearest Neighbors Classifier (MultitargetKNNClassifier) is a variation of KNNClassifier that assumes the target variable is vector-valued with Multiclass or OrderedFactor components. (Target data must be presented as a table, however.)

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is either <:Finite (<:Multiclass or <:OrderedFactor will do); check the column scitypes with schema(y). Each column of y is assumed to belong to a common categorical pool.
  • w is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a model hyperparameter; see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • K::Int=5 : number of neighbors
  • algorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)
  • metric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.
  • leafsize::Int = 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.
  • reorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.
  • weights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.
  • output_type::Type{<:MultiUnivariateFinite}=DictTable : One of (ColumnTable, DictTable). The type of table type to use for predictions. Setting to ColumnTable might improve performance for narrow tables while setting to DictTable improves performance for wide tables.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are either a ColumnTable or DictTable of UnivariateFiniteVector columns depending on the value set for the output_type parameter discussed above. The probabilistic predictions are uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of each column of the table of probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: An instance of either KDTree, BruteTree or BallTree depending on the value of the algorithm hyperparameter (see the hyper-parameters section above). These are data structures that store the training data with a view to making quicker nearest-neighbor searches on test data points.

Examples

using MLJ, StableRNGs
+MultitargetKNNClassifier · MLJ

MultitargetKNNClassifier

MultitargetKNNClassifier

A model type for constructing a multitarget K-nearest neighbor classifier, based on NearestNeighborModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetKNNClassifier = @load MultitargetKNNClassifier pkg=NearestNeighborModels

Do model = MultitargetKNNClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetKNNClassifier(K=...).

Multi-target K-Nearest Neighbors Classifier (MultitargetKNNClassifier) is a variation of KNNClassifier that assumes the target variable is vector-valued with Multiclass or OrderedFactor components. (Target data must be presented as a table, however.)

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is either <:Finite (<:Multiclass or <:OrderedFactor will do); check the column scitypes with schema(y). Each column of y is assumed to belong to a common categorical pool (see the sketch below for one way to assemble such a table).
  • w is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a model hyperparameter; see below.

Train the machine using fit!(mach, rows=...).
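
For example, a hedged sketch of assembling a target table with the required scitypes (the column names and random data below are purely illustrative):

using MLJ, StableRNGs
rng = StableRNG(10)
X = (x1 = rand(rng, 30), x2 = rand(rng, 30))
y = (t1 = coerce(rand(rng, ["a", "b"], 30), Multiclass),
     t2 = coerce(rand(rng, ["p", "q", "r"], 30), Multiclass))
schema(y)   ## both columns should report a Multiclass element scitype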

Hyper-parameters

  • K::Int=5 : number of neighbors
  • algorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)
  • metric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.
  • leafsize::Int = 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.
  • reorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.
  • weights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.
  • output_type::Type{<:MultiUnivariateFinite}=DictTable : One of (ColumnTable, DictTable). The type of table to use for predictions. Setting to ColumnTable might improve performance for narrow tables while setting to DictTable improves performance for wide tables (see the sketch below).
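
A hedged sketch of overriding these defaults (this assumes NearestNeighborModels is available in the active environment, and that ColumnTable refers to the table type defined in that package; the values chosen are arbitrary):

using MLJ
import NearestNeighborModels
MultitargetKNNClassifier = @load MultitargetKNNClassifier pkg=NearestNeighborModels verbosity=0
model = MultitargetKNNClassifier(K=7, algorithm=:balltree,
                                 output_type=NearestNeighborModels.ColumnTable)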

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. Predictions are either a ColumnTable or DictTable of UnivariateFiniteVector columns, depending on the value set for the output_type parameter discussed above. The probabilistic predictions are uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of each column of the table of probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: An instance of either KDTree, BruteTree or BallTree, depending on the value of the algorithm hyperparameter (see the hyper-parameters section above). These are data structures that store the training data, with a view to making nearest neighbor searches on test data points quicker.

Examples

using MLJ, StableRNGs
 
 ## set rng for reproducibility
 rng = StableRNG(10)
@@ -28,4 +28,4 @@
 ## predict
 y_hat = predict(mach, X)
 labels = predict_mode(mach, X)
-

See also KNNClassifier

+

See also KNNClassifier

diff --git a/dev/models/MultitargetKNNRegressor_NearestNeighborModels/index.html b/dev/models/MultitargetKNNRegressor_NearestNeighborModels/index.html index b7615e131..317a415cb 100644 --- a/dev/models/MultitargetKNNRegressor_NearestNeighborModels/index.html +++ b/dev/models/MultitargetKNNRegressor_NearestNeighborModels/index.html @@ -1,5 +1,5 @@ -MultitargetKNNRegressor · MLJ

MultitargetKNNRegressor

MultitargetKNNRegressor

A model type for constructing a multitarget K-nearest neighbor regressor, based on NearestNeighborModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetKNNRegressor = @load MultitargetKNNRegressor pkg=NearestNeighborModels

Do model = MultitargetKNNRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetKNNRegressor(K=...).

Multi-target K-Nearest Neighbors regressor (MultitargetKNNRegressor) is a variation of KNNRegressor that assumes the target variable is vector-valued with Continuous components. (Target data must be presented as a table, however.)

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check column scitypes with schema(y).
  • w is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a model hyperparameter; see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • K::Int=5 : number of neighbors
  • algorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)
  • metric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.
  • leafsize::Int = 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.
  • reorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.
  • weights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: An instance of either KDTree, BruteTree or BallTree, depending on the value of the algorithm hyperparameter (see the hyper-parameters section above). These are data structures that store the training data, with a view to making nearest neighbor searches on test data points quicker.

Examples

using MLJ
+MultitargetKNNRegressor · MLJ

MultitargetKNNRegressor

MultitargetKNNRegressor

A model type for constructing a multitarget K-nearest neighbor regressor, based on NearestNeighborModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetKNNRegressor = @load MultitargetKNNRegressor pkg=NearestNeighborModels

Do model = MultitargetKNNRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetKNNRegressor(K=...).

Multi-target K-Nearest Neighbors regressor (MultitargetKNNRegressor) is a variation of KNNRegressor that assumes the target variable is vector-valued with Continuous components. (Target data must be presented as a table, however.)

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check column scitypes with schema(y).
  • w is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from the weights kernel, which is a model hyperparameter; see below, and the sketch that follows the next paragraph.

Train the machine using fit!(mach, rows=...).
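
A hedged sketch of binding with per-observation weights (the synthetic data and weights are for illustration only):

using MLJ
MultitargetKNNRegressor = @load MultitargetKNNRegressor pkg=NearestNeighborModels verbosity=0
X, y = make_regression(50, 3; n_targets=2)   ## X and y are both tables
w = rand(50)                                 ## observation weights (Continuous scitype)
mach = machine(MultitargetKNNRegressor(K=5), X, y, w)
fit!(mach)
yhat = predict(mach, X)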

Hyper-parameters

  • K::Int=5 : number of neighbors
  • algorithm::Symbol = :kdtree : one of (:kdtree, :brutetree, :balltree)
  • metric::Metric = Euclidean() : any Metric from Distances.jl for the distance between points. For algorithm = :kdtree only metrics which are instances of Union{Distances.Chebyshev, Distances.Cityblock, Distances.Euclidean, Distances.Minkowski, Distances.WeightedCityblock, Distances.WeightedEuclidean, Distances.WeightedMinkowski} are supported.
  • leafsize::Int = 10 : determines the number of points at which to stop splitting the tree. This option is ignored and always taken as 0 for algorithm = :brutetree, since brutetree isn't actually a tree.
  • reorder::Bool = true : if true then points which are close in distance are placed close in memory. In this case, a copy of the original data will be made so that the original data is left unmodified. Setting this to true can significantly improve performance of the specified algorithm (except :brutetree). This option is ignored and always taken as false for algorithm = :brutetree.
  • weights::KNNKernel=Uniform() : kernel used in assigning weights to the k-nearest neighbors for each observation. An instance of one of the types in list_kernels(). User-defined weighting functions can be passed by wrapping the function in a UserDefinedKernel kernel (do ?NearestNeighborModels.UserDefinedKernel for more info). If observation weights w are passed during machine construction then the weight assigned to each neighbor vote is the product of the kernel generated weight for that neighbor and the corresponding observation weight.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • tree: An instance of either KDTree, BruteTree or BallTree, depending on the value of the algorithm hyperparameter (see the hyper-parameters section above). These are data structures that store the training data, with a view to making nearest neighbor searches on test data points quicker.

Examples

using MLJ
 
 ## Create Data
 X, y = make_regression(10, 5, n_targets=2)
@@ -18,4 +18,4 @@
 
 ## Predict
 y_hat = predict(mach, X)
-

See also KNNRegressor

+

See also KNNRegressor

diff --git a/dev/models/MultitargetLinearRegressor_MultivariateStats/index.html b/dev/models/MultitargetLinearRegressor_MultivariateStats/index.html index 5154f93b8..efe2cfea5 100644 --- a/dev/models/MultitargetLinearRegressor_MultivariateStats/index.html +++ b/dev/models/MultitargetLinearRegressor_MultivariateStats/index.html @@ -1,5 +1,5 @@ -MultitargetLinearRegressor · MLJ

MultitargetLinearRegressor

MultitargetLinearRegressor

A model type for constructing a multitarget linear regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetLinearRegressor = @load MultitargetLinearRegressor pkg=MultivariateStats

Do model = MultitargetLinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetLinearRegressor(bias=...).

MultitargetLinearRegressor assumes the target variable is vector-valued with continuous components. It trains a linear prediction function using the least squares algorithm. Options exist to specify a bias term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • bias=true: Include the bias term if true, otherwise fit without bias term.

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
+MultitargetLinearRegressor · MLJ

MultitargetLinearRegressor

MultitargetLinearRegressor

A model type for constructing a multitarget linear regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetLinearRegressor = @load MultitargetLinearRegressor pkg=MultivariateStats

Do model = MultitargetLinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetLinearRegressor(bias=...).

MultitargetLinearRegressor assumes the target variable is vector-valued with continuous components. It trains a linear prediction function using the least squares algorithm. Options exist to specify a bias term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • bias=true: Include the bias term if true, otherwise fit without bias term.

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.
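
A hedged sketch of inspecting these fields, assuming mach is a machine that has already been fitted as in the Examples below:

fp = fitted_params(mach)
fp.coefficients   ## the linear coefficients (a matrix, in the multi-target case)
fp.intercept      ## the intercept term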

Examples

using MLJ
 using DataFrames
 
 LinearRegressor = @load MultitargetLinearRegressor pkg=MultivariateStats
@@ -10,4 +10,4 @@
 mach = machine(linear_regressor, X, y) |> fit!
 
 Xnew, _ = make_regression(3, 9)
-yhat = predict(mach, Xnew) ## new predictions

See also LinearRegressor, RidgeRegressor, MultitargetRidgeRegressor

+yhat = predict(mach, Xnew) ## new predictions

See also LinearRegressor, RidgeRegressor, MultitargetRidgeRegressor

diff --git a/dev/models/MultitargetNeuralNetworkRegressor_BetaML/index.html b/dev/models/MultitargetNeuralNetworkRegressor_BetaML/index.html index d75614bc8..b79b06709 100644 --- a/dev/models/MultitargetNeuralNetworkRegressor_BetaML/index.html +++ b/dev/models/MultitargetNeuralNetworkRegressor_BetaML/index.html @@ -1,5 +1,5 @@ -MultitargetNeuralNetworkRegressor · MLJ

MultitargetNeuralNetworkRegressor

mutable struct MultitargetNeuralNetworkRegressor <: MLJModelInterface.Deterministic

A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for regression of multiple dimensional targets.

Parameters:

  • layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers

  • loss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as matrices.

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.

  • dloss: Derivative of the loss function [def: BetaML.dsquared_cost, i.e. use the derivative of the squared cost]. Use nothing for autodiff.

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 300]

  • batch_size: Size of each individual batch [def: 16]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • descr: An optional title and/or description for this model

  • cb: A callback function to provide information during training [def: BetaML.fitting_info]

  • rng: Random Number Generator (see FIXEDSEED) [default: Random.GLOBAL_RNG]

Notes:

  • data must be numerical
  • the label should be an n-records by n-dimensions matrix

Example:

julia> using MLJ
+MultitargetNeuralNetworkRegressor · MLJ

MultitargetNeuralNetworkRegressor

mutable struct MultitargetNeuralNetworkRegressor <: MLJModelInterface.Deterministic

A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for regression of multiple dimensional targets.

Parameters:

  • layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers

  • loss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as matrices.

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.

  • dloss: Derivative of the loss function [def: BetaML.dsquared_cost, i.e. use the derivative of the squared cost]. Use nothing for autodiff.

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 300] (see the construction sketch after this parameter list)

  • batch_size: Size of each individual batch [def: 16]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • descr: An optional title and/or description for this model

  • cb: A callback function to provide information during training [def: BetaML.fitting_info]

  • rng: Random Number Generator (see FIXEDSEED) [default: Random.GLOBAL_RNG]
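
A minimal, hedged construction sketch using only keyword names listed above (the values are illustrative, not recommendations):

using MLJ
MultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor pkg=BetaML verbosity=0
model = MultitargetNeuralNetworkRegressor(epochs=100, batch_size=32, shuffle=true)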

Notes:

  • data must be numerical
  • the label should be an n-records by n-dimensions matrix

Example:

julia> using MLJ
 
 julia> X, y        = @load_boston;
 
@@ -38,4 +38,4 @@
   ⋮                   
  23.9  52.8  23.3573  50.654
  22.0  49.0  22.1141  48.5926
- 11.9  28.8  19.9639  45.5823
+ 11.9 28.8 19.9639 45.5823
diff --git a/dev/models/MultitargetNeuralNetworkRegressor_MLJFlux/index.html b/dev/models/MultitargetNeuralNetworkRegressor_MLJFlux/index.html index 9c31ac6d9..801fb5e93 100644 --- a/dev/models/MultitargetNeuralNetworkRegressor_MLJFlux/index.html +++ b/dev/models/MultitargetNeuralNetworkRegressor_MLJFlux/index.html @@ -1,5 +1,5 @@ -MultitargetNeuralNetworkRegressor · MLJ

MultitargetNeuralNetworkRegressor

MultitargetNeuralNetworkRegressor

A model type for constructing a multitarget neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor pkg=MLJFlux

Do model = MultitargetNeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetNeuralNetworkRegressor(builder=...).

MultitargetNeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a multi-valued Continuous target, represented as a table, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.
  • y is the target, which can be any table or matrix of output targets whose element scitype is Continuous; check column scitypes with schema(y). If y is a Matrix, it is assumed to have columns corresponding to variables and rows corresponding to observations.

Hyper-parameters

  • builder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: Linear, Short, and MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a regression task, natural loss functions are:

    • Flux.mse
    • Flux.mae
    • Flux.msle
    • Flux.huber_loss

    Currently MLJ measures are not supported as loss functions here.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on a GPU, use CUDALibs().

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we apply a multi-target regression model to synthetic data:

using MLJ
+MultitargetNeuralNetworkRegressor · MLJ

MultitargetNeuralNetworkRegressor

MultitargetNeuralNetworkRegressor

A model type for constructing a multitarget neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor pkg=MLJFlux

Do model = MultitargetNeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetNeuralNetworkRegressor(builder=...).

MultitargetNeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a multi-valued Continuous target, represented as a table, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.
  • y is the target, which can be any table or matrix of output targets whose element scitype is Continuous; check column scitypes with schema(y). If y is a Matrix, it is assumed to have columns corresponding to variables and rows corresponding to observations.

Hyper-parameters

  • builder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: Linear, Short, and MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a regression task, natural loss functions are:

    • Flux.mse
    • Flux.mae
    • Flux.msle
    • Flux.huber_loss

    Currently MLJ measures are not supported as loss functions here.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on a GPU, use CUDALibs().
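
A hedged sketch of supplying a custom network via the @builder convenience macro referred to above (the layer width of 64 and the other values are illustrative; inside @builder the symbols n_in and n_out stand for the number of input features and targets):

using MLJ
import MLJFlux
using Flux
MultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor pkg=MLJFlux verbosity=0
builder = MLJFlux.@builder Chain(Dense(n_in => 64, relu), Dense(64 => n_out))
model = MultitargetNeuralNetworkRegressor(builder=builder, epochs=20, batch_size=8)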

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above. Predictions are deterministic.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we apply a multi-target regression model to synthetic data:

using MLJ
 import MLJFlux
 using Flux

First, we generate some synthetic data (needs MLJBase 0.20.16 or higher):

X, y = make_regression(100, 9; n_targets = 2) ## both tables
 schema(y)
@@ -24,4 +24,4 @@
 ## loss for `(Xtest, test)`:
 fit!(mach) ## trains on all data `(X, y)`
 yhat = predict(mach, Xtest)
-multi_loss(yhat, ytest)

See also NeuralNetworkRegressor

+multi_loss(yhat, ytest)

See also NeuralNetworkRegressor

diff --git a/dev/models/MultitargetRidgeRegressor_MultivariateStats/index.html b/dev/models/MultitargetRidgeRegressor_MultivariateStats/index.html index 0a22c60f1..2da4080a5 100644 --- a/dev/models/MultitargetRidgeRegressor_MultivariateStats/index.html +++ b/dev/models/MultitargetRidgeRegressor_MultivariateStats/index.html @@ -1,5 +1,5 @@ -MultitargetRidgeRegressor · MLJ

MultitargetRidgeRegressor

MultitargetRidgeRegressor

A model type for constructing a multitarget ridge regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetRidgeRegressor = @load MultitargetRidgeRegressor pkg=MultivariateStats

Do model = MultitargetRidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetRidgeRegressor(lambda=...).

Multi-target ridge regression adds a quadratic penalty term to multi-target least squares regression, for regularization. Ridge regression is particularly useful in the case of multicollinearity. In this case, the output represents a response vector. Options exist to specify a bias term, and to adjust the strength of the penalty term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • lambda=1.0: The non-negative parameter for the regularization strength. If lambda is 0, ridge regression is equivalent to linear least squares regression, and as lambda approaches infinity, all the linear coefficients approach 0.
  • bias=true: Include the bias term if true, otherwise fit without bias term.

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
+MultitargetRidgeRegressor · MLJ

MultitargetRidgeRegressor

MultitargetRidgeRegressor

A model type for constructing a multitarget ridge regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetRidgeRegressor = @load MultitargetRidgeRegressor pkg=MultivariateStats

Do model = MultitargetRidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetRidgeRegressor(lambda=...).

Multi-target ridge regression adds a quadratic penalty term to multi-target least squares regression, for regularization. Ridge regression is particularly useful in the case of multicollinearity. In this case, the output represents a response vector. Options exist to specify a bias term, and to adjust the strength of the penalty term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any table of responses whose element scitype is Continuous; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • lambda=1.0: The non-negative parameter for the regularization strength. If lambda is 0, ridge regression is equivalent to linear least squares regression, and as lambda approaches infinity, all the linear coefficients approach 0. (A sketch illustrating this shrinkage follows this list.)
  • bias=true: Include the bias term if true, otherwise fit without bias term.
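
A hedged sketch illustrating the shrinkage effect of lambda on synthetic data (the sizes and lambda values are arbitrary):

using MLJ
MultitargetRidgeRegressor = @load MultitargetRidgeRegressor pkg=MultivariateStats verbosity=0
X, y = make_regression(100, 6; n_targets=2)
coefs(lambda) = fitted_params(fit!(machine(MultitargetRidgeRegressor(lambda=lambda), X, y))).coefficients
sum(abs, coefs(100.0)) < sum(abs, coefs(0.0))   ## heavier regularization shrinks the coefficients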

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
 using DataFrames
 
 RidgeRegressor = @load MultitargetRidgeRegressor pkg=MultivariateStats
@@ -10,4 +10,4 @@
 mach = machine(ridge_regressor, X, y) |> fit!
 
 Xnew, _ = make_regression(3, 6)
-yhat = predict(mach, Xnew) ## new predictions

See also LinearRegressor, MultitargetLinearRegressor, RidgeRegressor

+yhat = predict(mach, Xnew) ## new predictions

See also LinearRegressor, MultitargetLinearRegressor, RidgeRegressor

diff --git a/dev/models/MultitargetSRRegressor_SymbolicRegression/index.html b/dev/models/MultitargetSRRegressor_SymbolicRegression/index.html index 0825af3f1..0ec79e371 100644 --- a/dev/models/MultitargetSRRegressor_SymbolicRegression/index.html +++ b/dev/models/MultitargetSRRegressor_SymbolicRegression/index.html @@ -1,5 +1,5 @@ -MultitargetSRRegressor · MLJ

MultitargetSRRegressor

MultitargetSRRegressor

A model type for constructing a Multi-Target Symbolic Regression via Evolutionary Search, based on SymbolicRegression.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetSRRegressor = @load MultitargetSRRegressor pkg=SymbolicRegression

Do model = MultitargetSRRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetSRRegressor(binary_operators=...).

Multi-target Symbolic Regression regressor (MultitargetSRRegressor) conducts several searches for expressions that predict each target variable from a set of input variables. All data is assumed to be Continuous. The search is performed using an evolutionary algorithm. This algorithm is described in the paper https://arxiv.org/abs/2305.01582.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Variable names in discovered expressions will be taken from the column names of X, if available. Units in columns of X (use DynamicQuantities for units) will trigger dimensional analysis to be used.

  • y is the target, which can be any table of target variables whose element scitype is Continuous; check the scitype with schema(y). Units in columns of y (use DynamicQuantities for units) will trigger dimensional analysis to be used.
  • w is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. The same weights are used for all targets.

Train the machine using fit!(mach), inspect the discovered expressions with report(mach), and predict on new data with predict(mach, Xnew). Note that, unlike other regressors, symbolic regression stores a list of lists of trained models. The model chosen from each of these lists is determined by the selection_method keyword argument (a function), which by default balances accuracy and complexity. You can override this choice at prediction time by passing a named tuple with keys data and idx.
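
A hedged sketch of this workflow (the operators and synthetic data are illustrative; a real search can take some time to run):

using MLJ
MultitargetSRRegressor = @load MultitargetSRRegressor pkg=SymbolicRegression verbosity=0
X, y = make_regression(100, 3; n_targets=2)
model = MultitargetSRRegressor(binary_operators=[+, -, *], unary_operators=[cos])
mach = machine(model, X, y)
fit!(mach)
r = report(mach)            ## discovered expressions, one list per target
yhat = predict(mach, X)     ## predictions using the default selection_method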

Hyper-parameters

  • binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.

  • unary_operators: Same, but for unary operators (one input scalar, gives an output scalar).

  • constraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.

  • batching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.

  • batch_size: What batch size to use if using batching.

  • elementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: - LPDistLoss{P}(), - L1DistLoss(), - L2DistLoss() (mean square), - LogitDistLoss(), - HuberLoss(d), - L1EpsilonInsLoss(ϵ), - L2EpsilonInsLoss(ϵ), - PeriodicLoss(c), - QuantileLoss(τ), Classification: - ZeroOneLoss(), - PerceptronLoss(), - L1HingeLoss(), - SmoothedL1HingeLoss(γ), - ModifiedHuberLoss(), - L2MarginLoss(), - ExpLoss(), - SigmoidLoss(), - DWDMarginLoss(q).

  • loss_function: Alternatively, you may redefine the loss used as any function of tree::Node{T}, dataset::Dataset{T}, and options::Options, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,

      function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}
    +MultitargetSRRegressor · MLJ

    MultitargetSRRegressor

    MultitargetSRRegressor

    A model type for constructing a Multi-Target Symbolic Regression via Evolutionary Search, based on SymbolicRegression.jl, and implementing the MLJ model interface.

    From MLJ, the type can be imported using

    MultitargetSRRegressor = @load MultitargetSRRegressor pkg=SymbolicRegression

    Do model = MultitargetSRRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetSRRegressor(binary_operators=...).

    Multi-target Symbolic Regression regressor (MultitargetSRRegressor) conducts several searches for expressions that predict each target variable from a set of input variables. All data is assumed to be Continuous. The search is performed using an evolutionary algorithm. This algorithm is described in the paper https://arxiv.org/abs/2305.01582.

    Training data

    In MLJ or MLJBase, bind an instance model to data with

    mach = machine(model, X, y)

    OR

    mach = machine(model, X, y, w)

    Here:

    • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Variable names in discovered expressions will be taken from the column names of X, if available. Units in columns of X (use DynamicQuantities for units) will trigger dimensional analysis to be used.

    • y is the target, which can be any table of target variables whose element scitype is Continuous; check the scitype with schema(y). Units in columns of y (use DynamicQuantities for units) will trigger dimensional analysis to be used.
    • w is the observation weights, which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. The same weights are used for all targets.

    Train the machine using fit!(mach), inspect the discovered expressions with report(mach), and predict on new data with predict(mach, Xnew). Note that, unlike other regressors, symbolic regression stores a list of lists of trained models. The model chosen from each of these lists is determined by the selection_method keyword argument (a function), which by default balances accuracy and complexity. You can override this choice at prediction time by passing a named tuple with keys data and idx.

    Hyper-parameters

    • binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.

    • unary_operators: Same, but for unary operators (one input scalar, gives an output scalar).

    • constraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.

    • batching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.

    • batch_size: What batch size to use if using batching.

    • elementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: - LPDistLoss{P}(), - L1DistLoss(), - L2DistLoss() (mean square), - LogitDistLoss(), - HuberLoss(d), - L1EpsilonInsLoss(ϵ), - L2EpsilonInsLoss(ϵ), - PeriodicLoss(c), - QuantileLoss(τ), Classification: - ZeroOneLoss(), - PerceptronLoss(), - L1HingeLoss(), - SmoothedL1HingeLoss(γ), - ModifiedHuberLoss(), - L2MarginLoss(), - ExpLoss(), - SigmoidLoss(), - DWDMarginLoss(q).

    • loss_function: Alternatively, you may redefine the loss used as any function of tree::Node{T}, dataset::Dataset{T}, and options::Options, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,

        function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}
             prediction, flag = eval_tree_array(tree, dataset.X, options)
             if !flag
                 return L(Inf)
      @@ -17,4 +17,4 @@
       r = report(mach)
       for (output_index, (eq, i)) in enumerate(zip(r.equation_strings, r.best_idx))
           println("Equation used for ", output_index, ": ", eq[i])
      -end

      See also SRRegressor.

    +end

    See also SRRegressor.

diff --git a/dev/models/NeuralNetworkClassifier_BetaML/index.html b/dev/models/NeuralNetworkClassifier_BetaML/index.html index 2bdbf1d2e..1835e9cd1 100644 --- a/dev/models/NeuralNetworkClassifier_BetaML/index.html +++ b/dev/models/NeuralNetworkClassifier_BetaML/index.html @@ -1,5 +1,5 @@ -NeuralNetworkClassifier · MLJ

NeuralNetworkClassifier

mutable struct NeuralNetworkClassifier <: MLJModelInterface.Probabilistic

A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for classification problems.

Parameters:

  • layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers. The last "softmax" layer is automatically added.

  • loss: Loss (cost) function [def: BetaML.crossentropy]. Should always assume y and ŷ as matrices.

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.

  • dloss: Derivative of the loss function [def: BetaML.dcrossentropy, i.e. the derivative of the cross-entropy]. Use nothing for autodiff.

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 200]

  • batch_size: Size of each individual batch [def: 16]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • descr: An optional title and/or description for this model

  • cb: A callback function to provide information during training [def: BetaML.fitting_info]

  • categories: The categories to represent as columns. [def: nothing, i.e. unique training values].

  • handle_unknown: How to handle categories not seen in training or not present in the provided categories array? "error" (default) raises an error, "infrequent" adds a specific column for these categories.

  • other_categories_name: Which value to assign during prediction to this "other" category (i.e. categories not seen in training or not present in the provided categories array)? [def: nothing, i.e. typemax(Int64) for integer vectors and "other" for other types]. This setting is active only if handle_unknown="infrequent", in which case it MUST be specified if Y is neither integer nor string.

  • rng: Random Number Generator [default: Random.GLOBAL_RNG]

Notes:

  • data must be numerical
  • the label should be an n-records by n-dimensions matrix (e.g. one-hot-encoded data for classification), where the output columns should be interpreted as the probabilities for each category.

Example:

julia> using MLJ
+NeuralNetworkClassifier · MLJ

NeuralNetworkClassifier

mutable struct NeuralNetworkClassifier <: MLJModelInterface.Probabilistic

A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for classification problems.

Parameters:

  • layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers. The last "softmax" layer is automatically added.

  • loss: Loss (cost) function [def: BetaML.crossentropy]. Should always assume y and ŷ as matrices.

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.

  • dloss: Derivative of the loss function [def: BetaML.dcrossentropy, i.e. the derivative of the cross-entropy]. Use nothing for autodiff.

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 200]

  • batch_size: Size of each individual batch [def: 16]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • descr: An optional title and/or description for this model

  • cb: A callback function to provide information during training [def: BetaML.fitting_info]

  • categories: The categories to represent as columns. [def: nothing, i.e. unique training values].

  • handle_unknown: How to handle categories not seen in training or not present in the provided categories array? "error" (default) raises an error, "infrequent" adds a specific column for these categories.

  • other_categories_name: Which value to assign during prediction to this "other" category (i.e. categories not seen in training or not present in the provided categories array)? [def: nothing, i.e. typemax(Int64) for integer vectors and "other" for other types]. This setting is active only if handle_unknown="infrequent", in which case it MUST be specified if Y is neither integer nor string. (See the construction sketch after this parameter list.)

  • rng: Random Number Generator [default: Random.GLOBAL_RNG]
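
A hedged construction sketch using only keyword names listed above; here unseen categories are routed to a dedicated "other" column (the values are illustrative):

using MLJ
NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=BetaML verbosity=0
model = NeuralNetworkClassifier(epochs=100, batch_size=32,
                                handle_unknown="infrequent", other_categories_name="other")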

Notes:

  • data must be numerical
  • the label should be an n-records by n-dimensions matrix (e.g. one-hot-encoded data for classification), where the output columns should be interpreted as the probabilities for each category.

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -34,4 +34,4 @@
  UnivariateFinite{Multiclass{3}}(setosa=>0.573, versicolor=>0.213, virginica=>0.213)
  ⋮
  UnivariateFinite{Multiclass{3}}(setosa=>0.236, versicolor=>0.236, virginica=>0.529)
- UnivariateFinite{Multiclass{3}}(setosa=>0.254, versicolor=>0.254, virginica=>0.492)
+ UnivariateFinite{Multiclass{3}}(setosa=>0.254, versicolor=>0.254, virginica=>0.492)
diff --git a/dev/models/NeuralNetworkClassifier_MLJFlux/index.html b/dev/models/NeuralNetworkClassifier_MLJFlux/index.html index 8a52ab876..e8aee1f04 100644 --- a/dev/models/NeuralNetworkClassifier_MLJFlux/index.html +++ b/dev/models/NeuralNetworkClassifier_MLJFlux/index.html @@ -1,5 +1,5 @@ -NeuralNetworkClassifier · MLJ

NeuralNetworkClassifier

NeuralNetworkClassifier

A model type for constructing a neural network classifier, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux

Do model = NeuralNetworkClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkClassifier(builder=...).

NeuralNetworkClassifier is for training a data-dependent Flux.jl neural network for making probabilistic predictions of a Multiclass or OrderedFactor target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.
  • y is the target, which can be any AbstractVector whose element scitype is Multiclass or OrderedFactor; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • builder=MLJFlux.Short(): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux.jl documentation for examples of user-defined builders. See also finaliser below.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:

    • Flux.crossentropy: Standard multiclass classification loss, also known as the log loss.
    • Flux.logitcrossentropy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with softmax and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).
    • Flux.tversky_loss: Used with imbalanced data to give more weight to false negatives.
    • Flux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.

    Currently MLJ measures are not supported values of loss.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on a GPU, use CUDALibs().

  • finaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.softmax.

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we build a classification model using the Iris dataset. This is a very basic example, using a default builder and no standardization. For a more advanced illustration, see NeuralNetworkRegressor or ImageClassifier, and examples in the MLJFlux.jl documentation.

using MLJ
+NeuralNetworkClassifier · MLJ

NeuralNetworkClassifier

NeuralNetworkClassifier

A model type for constructing a neural network classifier, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux

Do model = NeuralNetworkClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkClassifier(builder=...).

NeuralNetworkClassifier is for training a data-dependent Flux.jl neural network for making probabilistic predictions of a Multiclass or OrderedFactor target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.
  • y is the target, which can be any AbstractVector whose element scitype is Multiclass or OrderedFactor; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • builder=MLJFlux.Short(): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux.jl documentation for examples of user-defined builders. See also finaliser below.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:

    • Flux.crossentropy: Standard multiclass classification loss, also known as the log loss.
    • Flux.logitcrossentropy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with softmax and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic); see the sketch following this parameter list.
    • Flux.tversky_loss: Used with imbalanced data to give more weight to false negatives.
    • Flux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.

    Currently, MLJ measures are not supported as values of loss.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on GPU, use CUDALibs().

  • finaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.softmax.
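
The following is a minimal sketch of the loss/finaliser interaction described in the list above, pairing logitcrossentropy with finaliser=identity. The Iris data, the epochs value, and the variable names are illustrative assumptions, not part of this docstring.

using MLJ
using Flux

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux
X, y = @load_iris

## numerically stabler alternative to the default crossentropy/softmax pairing:
clf = NeuralNetworkClassifier(loss=Flux.logitcrossentropy,
                              finaliser=identity,    ## drop the default softmax finaliser
                              epochs=20)
mach = machine(clf, X, y) |> fit!

## with finaliser=identity the output of predict is unnormalized (not probabilistic):
scores = predict(mach, X)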

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we build a classification model using the Iris dataset. This is a very basic example, using a default builder and no standardization. For a more advanced illustration, see NeuralNetworkRegressor or ImageClassifier, and examples in the MLJFlux.jl documentation.

using MLJ
 using Flux
 import RDatasets

First, we can load the data:

iris = RDatasets.dataset("datasets", "iris");
 y, X = unpack(iris, ==(:Species), rng=123); ## a vector and a table
@@ -19,4 +19,4 @@
      xlab=curve.parameter_name,
      xscale=curve.parameter_scale,
      ylab = "Cross Entropy")
-

See also ImageClassifier.

+

See also ImageClassifier.

diff --git a/dev/models/NeuralNetworkRegressor_BetaML/index.html b/dev/models/NeuralNetworkRegressor_BetaML/index.html index 91e8e395e..434e2defb 100644 --- a/dev/models/NeuralNetworkRegressor_BetaML/index.html +++ b/dev/models/NeuralNetworkRegressor_BetaML/index.html @@ -1,5 +1,5 @@ -NeuralNetworkRegressor · MLJ

NeuralNetworkRegressor

mutable struct NeuralNetworkRegressor <: MLJModelInterface.Deterministic

A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for regression of a single dimensional target.

Parameters:

  • layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers

  • loss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as matrices, even if the regression task is 1-D

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing.

  • dloss: Derivative of the loss function [def: BetaML.dsquared_cost, i.e. use the derivative of the squared cost]. Use nothing for autodiff.

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 200]

  • batch_size: Size of each individual batch [def: 16]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • descr: An optional title and/or description for this model

  • cb: A call back function to provide information during training [def: fitting_info]

  • rng: Random Number Generator (see FIXEDSEED) [default: Random.GLOBAL_RNG]

Notes:

  • data must be numerical
  • the label should be an n-records vector.

Example:

julia> using MLJ
+NeuralNetworkRegressor · MLJ

NeuralNetworkRegressor

mutable struct NeuralNetworkRegressor <: MLJModelInterface.Deterministic

A simple but flexible Feedforward Neural Network, from the Beta Machine Learning Toolkit (BetaML) for regression of a single dimensional target.

Parameters:

  • layers: Array of layer objects [def: nothing, i.e. basic network]. See subtypes(BetaML.AbstractLayer) for supported layers

  • loss: Loss (cost) function [def: BetaML.squared_cost]. Should always assume y and ŷ as matrices, even if the regression task is 1-D

    Warning

    If you change the parameter loss, you need to either provide its derivative on the parameter dloss or use autodiff with dloss=nothing (a sketch follows this parameter list).

  • dloss: Derivative of the loss function [def: BetaML.dsquared_cost, i.e. use the derivative of the squared cost]. Use nothing for autodiff.

  • epochs: Number of epochs, i.e. passes through the whole training sample [def: 200]

  • batch_size: Size of each individual batch [def: 16]

  • opt_alg: The optimisation algorithm to update the gradient at each batch [def: BetaML.ADAM()]. See subtypes(BetaML.OptimisationAlgorithm) for supported optimizers

  • shuffle: Whether to randomly shuffle the data at each iteration (epoch) [def: true]

  • descr: An optional title and/or description for this model

  • cb: A call back function to provide information during training [def: fitting_info]

  • rng: Random Number Generator (see FIXEDSEED) [default: Random.GLOBAL_RNG]
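
To illustrate the loss/dloss convention described above, here is a minimal sketch with a user-supplied cost and autodiff for its derivative. The abs_cost function, the Boston data, and the epochs/batch_size values are illustrative assumptions, not part of this docstring.

using MLJ
NNR = @load NeuralNetworkRegressor pkg=BetaML

## a hypothetical absolute-error cost; per the docstring, y and ŷ are handled as matrices
abs_cost(y, ŷ) = sum(abs.(ŷ .- y))

model = NNR(loss=abs_cost, dloss=nothing,   ## dloss=nothing requests autodiff
            epochs=100, batch_size=32)
X, y = @load_boston
mach = machine(model, X, y) |> fit!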

Notes:

  • data must be numerical
  • the label should be an n-records vector.

Example:

julia> using MLJ
 
 julia> X, y        = @load_boston;
 
@@ -35,4 +35,4 @@
   ⋮    
  23.9  30.9032
  22.0  29.49
- 11.9  27.2438
+ 11.9 27.2438
diff --git a/dev/models/NeuralNetworkRegressor_MLJFlux/index.html b/dev/models/NeuralNetworkRegressor_MLJFlux/index.html index 6e6be21a5..496d621e6 100644 --- a/dev/models/NeuralNetworkRegressor_MLJFlux/index.html +++ b/dev/models/NeuralNetworkRegressor_MLJFlux/index.html @@ -1,5 +1,5 @@ -NeuralNetworkRegressor · MLJ

NeuralNetworkRegressor

NeuralNetworkRegressor

A model type for constructing a neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NeuralNetworkRegressor = @load NeuralNetworkRegressor pkg=MLJFlux

Do model = NeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkRegressor(builder=...).

NeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a Continuous target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • builder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a regression task, natural loss functions are:

    • Flux.mse
    • Flux.mae
    • Flux.msle
    • Flux.huber_loss

    Currently MLJ measures are not supported as loss functions here.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on GPU, use CUDALibs().

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalized if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we build a regression model for the Boston house price dataset.

using MLJ
+NeuralNetworkRegressor · MLJ

NeuralNetworkRegressor

NeuralNetworkRegressor

A model type for constructing a neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NeuralNetworkRegressor = @load NeuralNetworkRegressor pkg=MLJFlux

Do model = NeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkRegressor(builder=...).

NeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a Continuous target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • builder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a regression task, natural loss functions are:

    • Flux.mse
    • Flux.mae
    • Flux.msle
    • Flux.huber_loss

    Currently MLJ measures are not supported as loss functions here.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on GPU, use CUDALibs().
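
As a hedged sketch of the builder mechanism referenced in the builder entry above, here is one way a custom architecture might be specified with the @builder convenience macro. The layer widths, and the assumption that n_in and n_out are in scope inside @builder, follow MLJFlux conventions and are not part of this docstring.

using MLJ
import MLJFlux
using Flux

NeuralNetworkRegressor = @load NeuralNetworkRegressor pkg=MLJFlux

## n_in and n_out stand for the input and output dimensions inferred from the data
builder = MLJFlux.@builder Chain(Dense(n_in, 64, relu),
                                 Dense(64, 32, relu),
                                 Dense(32, n_out))
model = NeuralNetworkRegressor(builder=builder, epochs=20)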

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalized if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we build a regression model for the Boston house price dataset.

using MLJ
 import MLJFlux
 using Flux

First, we load in the data: The :MEDV column becomes the target vector y, and all remaining columns go into a table X, with the exception of :CHAS:

data = OpenML.load(531); ## Loads from https://www.openml.org/d/531
 y, X = unpack(data, ==(:MEDV), !=(:CHAS); rng=123);
@@ -42,4 +42,4 @@
 ## loss for `(Xtest, ytest)`:
 fit!(mach) ## train on `(X, y)`
 yhat = predict(mach, Xtest)
-l2(yhat, ytest)  |> mean

These losses, for the pipeline model, refer to the target on the original, unstandardized, scale.

For implementing stopping criterion and other iteration controls, refer to examples linked from the MLJFlux documentation.

See also MultitargetNeuralNetworkRegressor

+l2(yhat, ytest) |> mean

These losses, for the pipeline model, refer to the target on the original, unstandardized, scale.

For implementing stopping criterion and other iteration controls, refer to examples linked from the MLJFlux documentation.

See also MultitargetNeuralNetworkRegressor

diff --git a/dev/models/NuSVC_LIBSVM/index.html b/dev/models/NuSVC_LIBSVM/index.html index dc188db60..5e301b30d 100644 --- a/dev/models/NuSVC_LIBSVM/index.html +++ b/dev/models/NuSVC_LIBSVM/index.html @@ -1,5 +1,5 @@ -NuSVC · MLJ

NuSVC

NuSVC

A model type for constructing a ν-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NuSVC = @load NuSVC pkg=LIBSVM

Do model = NuSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NuSVC(kernel=...).

This model is a re-parameterization of the SVC classifier, where nu replaces cost, and is mathematically equivalent to it. The parameter nu allows more direct control over the number of support vectors (see under "Hyper-parameters").

This model always predicts actual class labels. For probabilistic predictions, use instead ProbabilisticNuSVC.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
+NuSVC · MLJ

NuSVC

NuSVC

A model type for constructing a ν-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NuSVC = @load NuSVC pkg=LIBSVM

Do model = NuSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NuSVC(kernel=...).

This model is a re-parameterization of the SVC classifier, where nu replaces cost, and is mathematically equivalent to it. The parameter nu allows more direct control over the number of support vectors (see under "Hyper-parameters").

This model always predicts actual class labels. For probabilistic predictions, use instead ProbabilisticNuSVC.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91. A sketch using a user-defined kernel follows this parameter list.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics
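
A minimal sketch of passing a user-defined callable as kernel, per the description above. The kernel function, the nu value, and the Iris data are illustrative assumptions, not part of this docstring.

using MLJ, LinearAlgebra
import LIBSVM

NuSVC = @load NuSVC pkg=LIBSVM
X, y = @load_iris

## any callable of two feature vectors can serve as the kernel
my_kernel(x1, x2) = exp(-0.5 * norm(x1 .- x2)^2)

model = NuSVC(kernel=my_kernel, nu=0.3)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)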

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
 import LIBSVM
 
 NuSVC = @load NuSVC pkg=LIBSVM                 ## model type
@@ -25,4 +25,4 @@
 3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
  "virginica"
  "virginica"
- "virginica"

See also the classifiers SVC and LinearSVC, LIBSVM.jl and the original C implementation documentation.

+ "virginica"

See also the classifiers SVC and LinearSVC, LIBSVM.jl and the original C implementation documentation.

diff --git a/dev/models/NuSVR_LIBSVM/index.html b/dev/models/NuSVR_LIBSVM/index.html index 2fad5a6f1..e2a0a653a 100644 --- a/dev/models/NuSVR_LIBSVM/index.html +++ b/dev/models/NuSVR_LIBSVM/index.html @@ -1,5 +1,5 @@ -NuSVR · MLJ

NuSVR

NuSVR

A model type for constructing a ν-support vector regressor, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NuSVR = @load NuSVR pkg=LIBSVM

Do model = NuSVR() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NuSVR(kernel=...).

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

This model is a re-parameterization of EpsilonSVR in which the epsilon hyper-parameter is replaced with a new parameter nu (denoted $ν$ in the cited reference) which attempts to control the number of support vectors directly.

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • cost=1.0 (range (0, Inf)): the parameter denoted $C$ in the cited reference; for greater regularization, decrease cost

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of training errors and a lower bound of the fraction of support vectors. Denoted $ν$ in the cited paper. Changing nu changes the thickness of some neighborhood of the graph of the prediction function ("tube" or "slab") and a training error is said to occur when a data point (x, y) lies outside of that neighborhood.

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
+NuSVR · MLJ

NuSVR

NuSVR

A model type for constructing a ν-support vector regressor, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NuSVR = @load NuSVR pkg=LIBSVM

Do model = NuSVR() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NuSVR(kernel=...).

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

This model is a re-parameterization of EpsilonSVR in which the epsilon hyper-parameter is replaced with a new parameter nu (denoted $ν$ in the cited reference) which attempts to control the number of support vectors directly.

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • cost=1.0 (range (0, Inf)): the parameter denoted $C$ in the cited reference; for greater regularization, decrease cost

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of training errors and a lower bound of the fraction of support vectors. Denoted $ν$ in the cited paper. Changing nu changes the thickness of some neighborhood of the graph of the prediction function ("tube" or "slab") and a training error is said to occur when a data point (x, y) lies outside of that neighborhood (a tuning sketch follows this parameter list).

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics
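
Since nu (range (0, 1]) directly trades off support vectors against training errors, it is a natural target for tuning. The following sketch uses MLJ's generic tuning wrapper; the grid, resampling, measure, and data choices are illustrative assumptions, not part of this docstring.

using MLJ
import LIBSVM

NuSVR = @load NuSVR pkg=LIBSVM
X, y = @load_boston

model = NuSVR()
r = range(model, :nu, lower=0.1, upper=0.9)
tuned = TunedModel(model=model, ranges=r, tuning=Grid(resolution=9),
                   resampling=CV(nfolds=3), measure=rms)
mach = machine(tuned, X, y) |> fit!
best = fitted_params(mach).best_model   ## NuSVR with the best-performing nu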

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
 import LIBSVM
 
 NuSVR = @load NuSVR pkg=LIBSVM                 ## model type
@@ -22,4 +22,4 @@
 3-element Vector{Float64}:
   1.1211558175964662
   0.06677125944808422
- -0.6817578942749346

See also EpsilonSVR, LIBSVM.jl and the original C implementation documentation.

+ -0.6817578942749346

See also EpsilonSVR, LIBSVM.jl and the original C implementation documentation.

diff --git a/dev/models/OCSVMDetector_OutlierDetectionPython/index.html b/dev/models/OCSVMDetector_OutlierDetectionPython/index.html index 8c39a9822..54ef972a3 100644 --- a/dev/models/OCSVMDetector_OutlierDetectionPython/index.html +++ b/dev/models/OCSVMDetector_OutlierDetectionPython/index.html @@ -1,5 +1,5 @@ -OCSVMDetector · MLJ

OCSVMDetector

OCSVMDetector(kernel = "rbf",
+OCSVMDetector · MLJ
+                 max_iter = -1)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.ocsvm

diff --git a/dev/models/OPTICS_MLJScikitLearnInterface/index.html b/dev/models/OPTICS_MLJScikitLearnInterface/index.html index ae9ec3002..49517d83a 100644 --- a/dev/models/OPTICS_MLJScikitLearnInterface/index.html +++ b/dev/models/OPTICS_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -OPTICS · MLJ

OPTICS

OPTICS

A model type for constructing an OPTICS clustering model, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OPTICS = @load OPTICS pkg=MLJScikitLearnInterface

Do model = OPTICS() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OPTICS(min_samples=...).

OPTICS (Ordering Points To Identify the Clustering Structure), closely related to DBSCAN, finds core samples of high density and expands clusters from them. Unlike DBSCAN, it keeps the cluster hierarchy for a variable neighborhood radius, and it is better suited to large datasets than the current sklearn implementation of DBSCAN.

+OPTICS · MLJ

OPTICS

OPTICS

A model type for constructing an OPTICS clustering model, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OPTICS = @load OPTICS pkg=MLJScikitLearnInterface

Do model = OPTICS() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OPTICS(min_samples=...).

OPTICS (Ordering Points To Identify the Clustering Structure), closely related to DBSCAN, finds core samples of high density and expands clusters from them. Unlike DBSCAN, it keeps the cluster hierarchy for a variable neighborhood radius, and it is better suited to large datasets than the current sklearn implementation of DBSCAN.

diff --git a/dev/models/OneClassSVM_LIBSVM/index.html b/dev/models/OneClassSVM_LIBSVM/index.html index 7c294218d..a8e200cd4 100644 --- a/dev/models/OneClassSVM_LIBSVM/index.html +++ b/dev/models/OneClassSVM_LIBSVM/index.html @@ -1,5 +1,5 @@ -OneClassSVM · MLJ

OneClassSVM

OneClassSVM

A model type for constructing a one-class support vector machine, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneClassSVM = @load OneClassSVM pkg=LIBSVM

Do model = OneClassSVM() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneClassSVM(kernel=...).

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

This model is an outlier detection model delivering raw scores based on the decision function of a support vector machine. Like the NuSVC classifier, it uses the nu re-parameterization of the cost parameter appearing in standard support vector classification SVC.

To extract normalized scores ("probabilities") wrap the model using ProbabilisticDetector from OutlierDetection.jl. For threshold-based classification, wrap the probabilistic model using MLJ's BinaryThresholdPredictor. Examples of wrapping appear below.

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics

Operations

  • transform(mach, Xnew): return scores for outlierness, given features Xnew having the same scitype as X above. The greater the score, the more likely it is an outlier. This score is based on the SVM decision function. For normalized scores, wrap model using ProbabilisticDetector from OutlierDetection.jl and call predict instead, and for threshold-based classification, wrap again using BinaryThresholdPredictor. See the examples below.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • orientation: this equals 1 if the decision function for libsvm_model is increasing with increasing outlierness, and -1 if it is decreasing instead. Correspondingly, the libsvm_model attaches true to outliers in the first case, and false in the second. (The scores given in the MLJ report and generated by MLJ.transform already correct for this ambiguity, which is therefore only an issue for users directly accessing libsvm_model.)

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Generating raw scores for outlierness

using MLJ
+OneClassSVM · MLJ

OneClassSVM

OneClassSVM

A model type for constructing a one-class support vector machine, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneClassSVM = @load OneClassSVM pkg=LIBSVM

Do model = OneClassSVM() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneClassSVM(kernel=...).

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

This model is an outlier detection model delivering raw scores based on the decision function of a support vector machine. Like the NuSVC classifier, it uses the nu re-parameterization of the cost parameter appearing in standard support vector classification SVC.

To extract normalized scores ("probabilities") wrap the model using ProbabilisticDetector from OutlierDetection.jl. For threshold-based classification, wrap the probabilistic model using MLJ's BinaryThresholdPredictor. Examples of wrapping appear below.
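
As a sketch of the wrapping just described, assuming OutlierDetection.jl has been added to the environment; the nu value and the variable names are illustrative assumptions, not part of this docstring.

using MLJ
import LIBSVM
using OutlierDetection   ## provides ProbabilisticDetector (assumed available)

OneClassSVM = @load OneClassSVM pkg=LIBSVM
X, _ = @load_iris

svm = OneClassSVM(nu=0.1)
pdetector = ProbabilisticDetector(svm)   ## wraps raw scores as normalized scores
mach = machine(pdetector, X) |> fit!
probs = predict(mach, X)                 ## probabilistic "normal"/"outlier" predictions

## for hard classification at a chosen threshold, wrap again, per the docstring:
## thresholded = BinaryThresholdPredictor(pdetector, threshold=0.9)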

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> (exp(-gamma*norm(x1 - x2)^2))
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions. See LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics

Operations

  • transform(mach, Xnew): return scores for outlierness, given features Xnew having the same scitype as X above. The greater the score, the more likely it is an outlier. This score is based on the SVM decision function. For normalized scores, wrap model using ProbabilisticDetector from OutlierDetection.jl and call predict instead, and for threshold-based classification, wrap again using BinaryThresholdPredictor. See the examples below.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • orientation: this equals 1 if the decision function for libsvm_model is increasing with increasing outlierness, and -1 if it is decreasing instead. Correspondingly, the libsvm_model attaches true to outliers in the first case, and false in the second. (The scores given in the MLJ report and generated by MLJ.transform already correct for this ambiguity, which is therefore only an issue for users directly accessing libsvm_model.)

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Generating raw scores for outlierness

using MLJ
 import LIBSVM
 import StableRNGs.StableRNG
 
@@ -64,4 +64,4 @@
 julia> yhat = transform(mach, Xnew)
 2-element Vector{Float64}:
  -0.4825363352732942
- -0.4848772169720227

See also LIBSVM.jl and the original C implementation documentation. For an alternative source of outlier detection models with an MLJ interface, see OutlierDetection.jl.

+ -0.4848772169720227

See also LIBSVM.jl and the original C implementation documentation. For an alternative source of outlier detection models with an MLJ interface, see OutlierDetection.jl.

diff --git a/dev/models/OneHotEncoder_MLJModels/index.html b/dev/models/OneHotEncoder_MLJModels/index.html index c89ba0737..df05bc96d 100644 --- a/dev/models/OneHotEncoder_MLJModels/index.html +++ b/dev/models/OneHotEncoder_MLJModels/index.html @@ -1,5 +1,5 @@ -OneHotEncoder · MLJ

OneHotEncoder

OneHotEncoder

A model type for constructing a one-hot encoder, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneHotEncoder = @load OneHotEncoder pkg=MLJModels

Do model = OneHotEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneHotEncoder(features=...).

Use this model to one-hot encode the Multiclass and OrderedFactor features (columns) of some table, leaving other columns unchanged.

New data to be transformed may lack features present in the fit data, but no new features can be present.

Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.

To ensure all features are transformed into Continuous features, or dropped, use ContinuousEncoder instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: a vector of symbols (column names). If empty (default) then all Multiclass and OrderedFactor features are encoded. Otherwise, encoding is further restricted to the specified features (ignore=false) or the unspecified features (ignore=true). This default behavior can be modified by the ordered_factor flag.
  • ordered_factor=false: when true, OrderedFactor features are universally excluded
  • drop_last=true: whether to drop the column corresponding to the final class of encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.

Fitted parameters

The fields of fitted_params(mach) are:

  • all_features: names of all features encountered in training
  • fitted_levels_given_feature: dictionary of the levels associated with each feature encoded, keyed on the feature name
  • ref_name_pairs_given_feature: dictionary of pairs r => ftr (such as 0x00000001 => :grad__A) where r is a CategoricalArrays.jl reference integer representing a level, and ftr the corresponding new feature name; the dictionary is keyed on the names of features that are encoded

Report

The fields of report(mach) are:

  • features_to_be_encoded: names of input features to be encoded
  • new_features: names of all output features

Example

using MLJ
+OneHotEncoder · MLJ

OneHotEncoder

OneHotEncoder

A model type for constructing a one-hot encoder, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneHotEncoder = @load OneHotEncoder pkg=MLJModels

Do model = OneHotEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneHotEncoder(features=...).

Use this model to one-hot encode the Multiclass and OrderedFactor features (columns) of some table, leaving other columns unchanged.

New data to be transformed may lack features present in the fit data, but no new features can be present.

Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.

To ensure all features are transformed into Continuous features, or dropped, use ContinuousEncoder instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: a vector of symbols (column names). If empty (default) then all Multiclass and OrderedFactor features are encoded. Otherwise, encoding is further restricted to the specified features (ignore=false) or the unspecified features (ignore=true). This default behavior can be modified by the ordered_factor flag.
  • ordered_factor=false: when true, OrderedFactor features are universally excluded
  • drop_last=true: whether to drop the column corresponding to the final class of encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise (see the sketch following this list).
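
A minimal sketch of restricting encoding with features/ignore and of the effect of drop_last; the toy table and variable names are illustrative assumptions, not part of this docstring.

using MLJ

X = (grade=categorical(["A", "B", "A", "C"], ordered=true),
     height=[1.85, 1.67, 1.5, 1.67])

encoder = OneHotEncoder(features=[:grade], ignore=false, drop_last=false)
mach = machine(encoder, X) |> fit!
W = transform(mach, X)
schema(W)   ## grade__A, grade__B, grade__C spawned; height left unchanged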

Fitted parameters

The fields of fitted_params(mach) are:

  • all_features: names of all features encountered in training
  • fitted_levels_given_feature: dictionary of the levels associated with each feature encoded, keyed on the feature name
  • ref_name_pairs_given_feature: dictionary of pairs r => ftr (such as 0x00000001 => :grad__A) where r is a CategoricalArrays.jl reference integer representing a level, and ftr the corresponding new feature name; the dictionary is keyed on the names of features that are encoded

Report

The fields of report(mach) are:

  • features_to_be_encoded: names of input features to be encoded
  • new_features: names of all output features

Example

using MLJ
 
 X = (name=categorical(["Danesh", "Lee", "Mary", "John"]),
      grade=categorical(["A", "B", "A", "C"], ordered=true),
@@ -31,4 +31,4 @@
 │ grade__B     │ Continuous │
 │ height       │ Continuous │
 │ n_devices    │ Count      │
-└──────────────┴────────────┘

See also ContinuousEncoder.

+└──────────────┴────────────┘

See also ContinuousEncoder.

diff --git a/dev/models/OneRuleClassifier_OneRule/index.html b/dev/models/OneRuleClassifier_OneRule/index.html index 172f1be20..c3b0483f7 100644 --- a/dev/models/OneRuleClassifier_OneRule/index.html +++ b/dev/models/OneRuleClassifier_OneRule/index.html @@ -1,5 +1,5 @@ -OneRuleClassifier · MLJ

OneRuleClassifier

OneRuleClassifier

A model type for constructing a one rule classifier, based on OneRule.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneRuleClassifier = @load OneRuleClassifier pkg=OneRule

Do model = OneRuleClassifier() to construct an instance with default hyper-parameters.

OneRuleClassifier implements the OneRule method for classification by Robert Holte ("Very simple classification rules perform well on most commonly used datasets" in: Machine Learning 11.1 (1993), pp. 63-90).

For more information see:
+OneRuleClassifier · MLJ

OneRuleClassifier

OneRuleClassifier

A model type for constructing a one rule classifier, based on OneRule.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneRuleClassifier = @load OneRuleClassifier pkg=OneRule

Do model = OneRuleClassifier() to construct an instance with default hyper-parameters.

OneRuleClassifier implements the OneRule method for classification by Robert Holte ("Very simple classification rules perform well on most commonly used datasets" in: Machine Learning 11.1 (1993), pp. 63-90).

For more information see:
 
 - Witten, Ian H., Eibe Frank, and Mark A. Hall. 
   Data Mining: Practical Machine Learning Tools and Techniques, Third Edition. 
@@ -27,4 +27,4 @@
 
 yhat = MLJ.predict(mach, weather)       ## in a real context 'new' `weather` data would be used
 one_tree = fitted_params(mach).tree
-report(mach).error_rate

See also OneRule.jl.

+report(mach).error_rate

See also OneRule.jl.

diff --git a/dev/models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/index.html index 096405ae8..967a37428 100644 --- a/dev/models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/OrthogonalMatchingPursuitCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -OrthogonalMatchingPursuitCVRegressor · MLJ

OrthogonalMatchingPursuitCVRegressor

OrthogonalMatchingPursuitCVRegressor

A model type for constructing an orthogonal matching pursuit (OMP) model with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OrthogonalMatchingPursuitCVRegressor = @load OrthogonalMatchingPursuitCVRegressor pkg=MLJScikitLearnInterface

Do model = OrthogonalMatchingPursuitCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OrthogonalMatchingPursuitCVRegressor(copy=...).

Hyper-parameters

  • copy = true
  • fit_intercept = true
  • max_iter = nothing
  • cv = 5
  • n_jobs = 1
  • verbose = false
+OrthogonalMatchingPursuitCVRegressor · MLJ

OrthogonalMatchingPursuitCVRegressor

OrthogonalMatchingPursuitCVRegressor

A model type for constructing an orthogonal matching pursuit (OMP) model with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OrthogonalMatchingPursuitCVRegressor = @load OrthogonalMatchingPursuitCVRegressor pkg=MLJScikitLearnInterface

Do model = OrthogonalMatchingPursuitCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OrthogonalMatchingPursuitCVRegressor(copy=...).

Hyper-parameters

  • copy = true
  • fit_intercept = true
  • max_iter = nothing
  • cv = 5
  • n_jobs = 1
  • verbose = false
diff --git a/dev/models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/index.html b/dev/models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/index.html index e848bc881..bba2a2c29 100644 --- a/dev/models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/OrthogonalMatchingPursuitRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -OrthogonalMatchingPursuitRegressor · MLJ

OrthogonalMatchingPursuitRegressor

OrthogonalMatchingPursuitRegressor

A model type for constructing an orthogonal matching pursuit regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OrthogonalMatchingPursuitRegressor = @load OrthogonalMatchingPursuitRegressor pkg=MLJScikitLearnInterface

Do model = OrthogonalMatchingPursuitRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OrthogonalMatchingPursuitRegressor(n_nonzero_coefs=...).

Hyper-parameters

  • n_nonzero_coefs = nothing
  • tol = nothing
  • fit_intercept = true
  • precompute = auto
+OrthogonalMatchingPursuitRegressor · MLJ

OrthogonalMatchingPursuitRegressor

OrthogonalMatchingPursuitRegressor

A model type for constructing an orthogonal matching pursuit regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OrthogonalMatchingPursuitRegressor = @load OrthogonalMatchingPursuitRegressor pkg=MLJScikitLearnInterface

Do model = OrthogonalMatchingPursuitRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OrthogonalMatchingPursuitRegressor(n_nonzero_coefs=...).

Hyper-parameters

  • n_nonzero_coefs = nothing
  • tol = nothing
  • fit_intercept = true
  • precompute = auto
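
A minimal usage sketch, assuming the standard MLJ machine workflow and synthetic data from make_regression (the value of n_nonzero_coefs is illustrative only):

using MLJ
OrthogonalMatchingPursuitRegressor = @load OrthogonalMatchingPursuitRegressor pkg=MLJScikitLearnInterface

X, y = make_regression(100, 5)   ## synthetic Continuous features and target
model = OrthogonalMatchingPursuitRegressor(n_nonzero_coefs=2)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
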
diff --git a/dev/models/PCADetector_OutlierDetectionPython/index.html b/dev/models/PCADetector_OutlierDetectionPython/index.html index 3cdba43e5..917d8cc0f 100644 --- a/dev/models/PCADetector_OutlierDetectionPython/index.html +++ b/dev/models/PCADetector_OutlierDetectionPython/index.html @@ -1,5 +1,5 @@ -PCADetector · MLJ

PCADetector

PCADetector(n_components = nothing,
+PCADetector · MLJ
+               random_state = nothing)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.pca

diff --git a/dev/models/PCA_MultivariateStats/index.html b/dev/models/PCA_MultivariateStats/index.html index 66b43e47c..b98bf3c36 100644 --- a/dev/models/PCA_MultivariateStats/index.html +++ b/dev/models/PCA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -PCA · MLJ

PCA

PCA

A model type for constructing a PCA (principal component analysis) model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PCA = @load PCA pkg=MultivariateStats

Do model = PCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PCA(maxoutdim=...).

Principal component analysis learns a linear projection onto a lower dimensional space while preserving most of the initial variance seen in the training data.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • maxoutdim=0: Together with variance_ratio, controls the output dimension outdim chosen by the model. Specifically, suppose that k is the smallest integer such that retaining the k most significant principal components accounts for variance_ratio of the total variance in the training data. Then outdim = min(k, maxoutdim). If maxoutdim=0 (default) then the effective maxoutdim is min(n, indim - 1), where n is the number of observations and indim the number of features in the training data.

  • variance_ratio::Float64=0.99: The ratio of variance preserved after the transformation

  • method=:auto: The method to use to solve the problem. Choices are

    • :svd: Singular Value Decomposition of the matrix.
    • :cov: Covariance matrix decomposition.
    • :auto: Use :cov if the matrix's first dimension is smaller than its second dimension, and otherwise use :svd.
  • mean=nothing: if nothing, centering will be computed and applied; if set to 0, no centering is applied (the data is assumed pre-centered); if a vector is passed, the centering is done with that vector.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim = min(n, indim, maxoutdim) is the output dimension; here n is the number of observations.
  • tprincipalvar: Total variance of the principal components.
  • tresidualvar: Total residual variance.
  • tvar: Total observation variance (principal + residual variance).
  • mean: The mean of the untransformed training data, of length indim.
  • principalvars: The variance of the principal components. An AbstractVector of length outdim
  • loadings: The model's loadings: the weights for each variable used when calculating principal components. A matrix of size (indim, outdim), where indim and outdim are as defined above.

Examples

using MLJ
+PCA · MLJ

PCA

PCA

A model type for constructing a PCA (principal component analysis) model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PCA = @load PCA pkg=MultivariateStats

Do model = PCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PCA(maxoutdim=...).

Principal component analysis learns a linear projection onto a lower dimensional space while preserving most of the initial variance seen in the training data.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • maxoutdim=0: Together with variance_ratio, controls the output dimension outdim chosen by the model. Specifically, suppose that k is the smallest integer such that retaining the k most significant principal components accounts for variance_ratio of the total variance in the training data. Then outdim = min(k, maxoutdim). If maxoutdim=0 (default) then the effective maxoutdim is min(n, indim - 1), where n is the number of observations and indim the number of features in the training data.

  • variance_ratio::Float64=0.99: The ratio of variance preserved after the transformation

  • method=:auto: The method to use to solve the problem. Choices are

    • :svd: Singular Value Decomposition of the matrix.
    • :cov: Covariance matrix decomposition.
    • :auto: Use :cov if the matrix's first dimension is smaller than its second dimension, and otherwise use :svd.
  • mean=nothing: if nothing, centering will be computed and applied; if set to 0, no centering is applied (the data is assumed pre-centered); if a vector is passed, the centering is done with that vector.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.
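
For illustration, a minimal sketch of the transform/inverse_transform round trip described above, using the iris features as an example of a Continuous table:

using MLJ
PCA = @load PCA pkg=MultivariateStats

X, _ = @load_iris
mach = machine(PCA(maxoutdim=2), X) |> fit!
Xsmall = transform(mach, X)                ## lower-dimensional projection
Xapprox = inverse_transform(mach, Xsmall)  ## approximate reconstruction of X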

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim = min(n, indim, maxoutdim) is the output dimension; here n is the number of observations.
  • tprincipalvar: Total variance of the principal components.
  • tresidualvar: Total residual variance.
  • tvar: Total observation variance (principal + residual variance).
  • mean: The mean of the untransformed training data, of length indim.
  • principalvars: The variance of the principal components. An AbstractVector of length outdim
  • loadings: The model's loadings: the weights for each variable used when calculating principal components. A matrix of size (indim, outdim), where indim and outdim are as defined above.

Examples

using MLJ
 
 PCA = @load PCA pkg=MultivariateStats
 
@@ -8,4 +8,4 @@
 model = PCA(maxoutdim=2)
 mach = machine(model, X) |> fit!
 
-Xproj = transform(mach, X)

See also KernelPCA, ICA, FactorAnalysis, PPCA

+Xproj = transform(mach, X)

See also KernelPCA, ICA, FactorAnalysis, PPCA

diff --git a/dev/models/PLSRegressor_PartialLeastSquaresRegressor/index.html b/dev/models/PLSRegressor_PartialLeastSquaresRegressor/index.html index 4edce036a..8e63742d0 100644 --- a/dev/models/PLSRegressor_PartialLeastSquaresRegressor/index.html +++ b/dev/models/PLSRegressor_PartialLeastSquaresRegressor/index.html @@ -1,2 +1,2 @@ -PLSRegressor · MLJ

PLSRegressor

A Partial Least Squares regressor, implementing the PLS1 and PLS2 (multi-target) algorithms. Used mainly for regression.

+PLSRegressor · MLJ

PLSRegressor

A Partial Least Squares regressor, implementing the PLS1 and PLS2 (multi-target) algorithms. Used mainly for regression.
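
A minimal usage sketch, assuming the model is loaded in the usual MLJ way and fitted on synthetic regression data with its default hyper-parameters:

using MLJ
PLSRegressor = @load PLSRegressor pkg=PartialLeastSquaresRegressor

X, y = make_regression(100, 4)   ## synthetic Continuous features and target
mach = machine(PLSRegressor(), X, y) |> fit!
yhat = predict(mach, X)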

diff --git a/dev/models/PPCA_MultivariateStats/index.html b/dev/models/PPCA_MultivariateStats/index.html index 926b28ae6..0bc8df632 100644 --- a/dev/models/PPCA_MultivariateStats/index.html +++ b/dev/models/PPCA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -PPCA · MLJ

PPCA

PPCA

A model type for constructing a probabilistic PCA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PPCA = @load PPCA pkg=MultivariateStats

Do model = PPCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PPCA(maxoutdim=...).

Probabilistic principal component analysis is a dimension-reduction algorithm which represents a constrained form of the Gaussian distribution in which the number of free parameters can be restricted while still allowing the model to capture the dominant correlations in a data set. It is expressed as the maximum likelihood solution of a probabilistic latent variable model. For details, see C. M. Bishop (2006): Pattern Recognition and Machine Learning.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.
  • method::Symbol=:ml: The method to use to solve the problem, one of :ml, :em, :bayes.
  • maxiter::Int=1000: The maximum number of iterations.
  • tol::Real=1e-6: The convergence tolerance.
  • mean::Union{Nothing, Real, Vector{Float64}}=nothing: If nothing, centering will be computed and applied; if set to 0 no centering is applied (data is assumed pre-centered); if a vector, the centering is done with that vector.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively. Each column of the projection matrix corresponds to a principal component.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data.
  • tvat: The variance of the components.
  • loadings: The model's loadings matrix. A matrix of size (indim, outdim), where indim and outdim are as defined above.

Examples

using MLJ
+PPCA · MLJ

PPCA

PPCA

A model type for constructing a probabilistic PCA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PPCA = @load PPCA pkg=MultivariateStats

Do model = PPCA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PPCA(maxoutdim=...).

Probabilistic principal component analysis is a dimension-reduction algorithm which represents a constrained form of the Gaussian distribution in which the number of free parameters can be restricted while still allowing the model to capture the dominant correlations in a data set. It is expressed as the maximum likelihood solution of a probabilistic latent variable model. For details, see C. M. Bishop (2006): Pattern Recognition and Machine Learning.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • maxoutdim=0: Controls the dimension (number of columns) of the output, outdim. Specifically, outdim = min(n, indim, maxoutdim), where n is the number of observations and indim the input dimension.
  • method::Symbol=:ml: The method to use to solve the problem, one of :ml, :em, :bayes.
  • maxiter::Int=1000: The maximum number of iterations.
  • tol::Real=1e-6: The convergence tolerance.
  • mean::Union{Nothing, Real, Vector{Float64}}=nothing: If nothing, centering will be computed and applied; if set to 0 no centering is applied (data is assumed pre-centered); if a vector, the centering is done with that vector.
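
For example, to use the expectation-maximization solver instead of the default maximum-likelihood method (a sketch only; the data and maxoutdim value are illustrative):

using MLJ
PPCA = @load PPCA pkg=MultivariateStats

X, _ = @load_iris
model = PPCA(maxoutdim=2, method=:em)   ## :em instead of the default :ml
mach = machine(model, X) |> fit!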

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • inverse_transform(mach, Xsmall): For a dimension-reduced table Xsmall, such as returned by transform, reconstruct a table, having the same number of columns as the original training data X, that transforms to Xsmall. Mathematically, inverse_transform is a right-inverse for the PCA projection map, whose image is orthogonal to the kernel of that map. In particular, if Xsmall = transform(mach, Xnew), then inverse_transform(Xsmall) is only an approximation to Xnew.

Fitted parameters

The fields of fitted_params(mach) are:

  • projection: Returns the projection matrix, which has size (indim, outdim), where indim and outdim are the number of features of the input and output respectively. Each column of the projection matrix corresponds to a principal component.

Report

The fields of report(mach) are:

  • indim: Dimension (number of columns) of the training data and new data to be transformed.
  • outdim: Dimension of transformed data.
  • tvat: The variance of the components.
  • loadings: The model's loadings matrix. A matrix of size (indim, outdim), where indim and outdim are as defined above.

Examples

using MLJ
 
 PPCA = @load PPCA pkg=MultivariateStats
 
@@ -8,4 +8,4 @@
 model = PPCA(maxoutdim=2)
 mach = machine(model, X) |> fit!
 
-Xproj = transform(mach, X)

See also KernelPCA, ICA, FactorAnalysis, PCA

+Xproj = transform(mach, X)

See also KernelPCA, ICA, FactorAnalysis, PCA

diff --git a/dev/models/PartLS_PartitionedLS/index.html b/dev/models/PartLS_PartitionedLS/index.html index 5bcc6fb6a..5019ed7d6 100644 --- a/dev/models/PartLS_PartitionedLS/index.html +++ b/dev/models/PartLS_PartitionedLS/index.html @@ -1,5 +1,5 @@ -PartLS · MLJ

PartLS

PartLS

A model type for fitting a partitioned least squares model to data. Both an MLJ and native interface are provided.

MLJ Interface

From MLJ, the type can be imported using

PartLS = @load PartLS pkg=PartitionedLS

Construct an instance with default hyper-parameters using the syntax model = PartLS(). Provide keyword arguments to override hyper-parameter defaults, as in model = PartLS(P=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any matrix or table with Continuous element scitype. Check column scitypes of a table X with schema(X).
  • y: any vector with Continuous element scitype. Check scitype with scitype(y).

Train the machine using fit!(mach).

Hyper-parameters

  • Optimizer: the optimization algorithm to use. It can be Opt, Alt or BnB (names exported by PartitionedLS.jl).

  • P: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column corresponds to a feature. The element P_{k, i} = 1 if feature i belongs to partition k.

  • η: the regularization parameter. It controls the strength of the regularization.

  • ϵ: the tolerance parameter. It is used to determine when the Alt optimization algorithm has converged. Only used by the Alt algorithm.

  • T: the maximum number of iterations, used to determine when to stop the Alt optimization algorithm. Only used by the Alt algorithm.

  • rng: the random number generator to use.

    • If nothing, the global random number generator rand is used.
    • If an integer, the global random number generator rand is used after seeding it with the given integer.
    • If an object of type AbstractRNG, the given random number generator is used.

Operations

  • predict(mach, Xnew): return the predictions of the model on new data Xnew

Fitted parameters

The fields of fitted_params(mach) are:

  • α: the values of the α variables. For each partition k, the α values satisfy $\sum_{i \in P_k} \alpha_{i} = 1$.
  • β: the values of the β variables. For each partition k, β_k is the coefficient that multiplies the features in the k-th partition.
  • t: the intercept term of the model.
  • P: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column corresponds to a feature. The element P_{k, i} = 1 if feature i belongs to partition k.

Examples

PartLS = @load PartLS pkg=PartitionedLS
+PartLS · MLJ

PartLS

PartLS

A model type for fitting a partitioned least squares model to data. Both an MLJ and native interface are provided.

MLJ Interface

From MLJ, the type can be imported using

PartLS = @load PartLS pkg=PartitionedLS

Construct an instance with default hyper-parameters using the syntax model = PartLS(). Provide keyword arguments to override hyper-parameter defaults, as in model = PartLS(P=...).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any matrix or table with Continuous element scitype. Check column scitypes of a table X with schema(X).
  • y: any vector with Continuous element scitype. Check scitype with scitype(y).

Train the machine using fit!(mach).

Hyper-parameters

  • Optimizer: the optimization algorithm to use. It can be Opt, Alt or BnB (names exported by PartitionedLS.jl).

  • P: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column corresponds to a feature. The element P_{k, i} = 1 if feature i belongs to partition k.

  • η: the regularization parameter. It controls the strength of the regularization.

  • ϵ: the tolerance parameter. It is used to determine when the Alt optimization algorithm has converged. Only used by the Alt algorithm.

  • T: the maximum number of iterations, used to determine when to stop the Alt optimization algorithm. Only used by the Alt algorithm.

  • rng: the random number generator to use.

    • If nothing, the global random number generator rand is used.
    • If an integer, the global random number generator rand is used after seeding it with the given integer.
    • If an object of type AbstractRNG, the given random number generator is used.
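
To make the P hyper-parameter concrete, the following sketch builds a partition matrix for three features split into two partitions (features 1 and 2 in partition 1, feature 3 in partition 2), following the row/column convention stated in the P description above (consult the PartitionedLS.jl documentation if in doubt about the expected orientation):

using MLJ
PartLS = @load PartLS pkg=PartitionedLS

## rows are partitions, columns are features: P[k, i] = 1 iff feature i belongs to partition k
P = [1 1 0;
     0 0 1]

model = PartLS(P=P)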

Operations

  • predict(mach, Xnew): return the predictions of the model on new data Xnew

Fitted parameters

The fields of fitted_params(mach) are:

  • α: the values of the α variables. For each partition k, the α values satisfy $\sum_{i \in P_k} \alpha_{i} = 1$.
  • β: the values of the β variables. For each partition k, β_k is the coefficient that multiplies the features in the k-th partition.
  • t: the intercept term of the model.
  • P: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column corresponds to a feature. The element P_{k, i} = 1 if feature i belongs to partition k.

Examples

PartLS = @load PartLS pkg=PartitionedLS
 
 X = [[1. 2. 3.];
      [3. 3. 4.];
@@ -40,4 +40,4 @@
 
 ## fit using the optimal algorithm
 result = fit(Opt, X, y, P, η = 0.0)
-y_hat = predict(result.model, X)

For other fit keyword options, refer to the "Hyper-parameters" section for the MLJ interface.

+y_hat = predict(result.model, X)

For other fit keyword options, refer to the "Hyper-parameters" section for the MLJ interface.

diff --git a/dev/models/PassiveAggressiveClassifier_MLJScikitLearnInterface/index.html b/dev/models/PassiveAggressiveClassifier_MLJScikitLearnInterface/index.html index 55ddd0b5b..167b0ea8b 100644 --- a/dev/models/PassiveAggressiveClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/PassiveAggressiveClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -PassiveAggressiveClassifier · MLJ

PassiveAggressiveClassifier

PassiveAggressiveClassifier

A model type for constructing a passive aggressive classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PassiveAggressiveClassifier = @load PassiveAggressiveClassifier pkg=MLJScikitLearnInterface

Do model = PassiveAggressiveClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PassiveAggressiveClassifier(C=...).

Hyper-parameters

  • C = 1.0
  • fit_intercept = true
  • max_iter = 100
  • tol = 0.001
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • shuffle = true
  • verbose = 0
  • loss = hinge
  • n_jobs = nothing
  • random_state = 0
  • warm_start = false
  • class_weight = nothing
  • average = false
+PassiveAggressiveClassifier · MLJ

PassiveAggressiveClassifier

PassiveAggressiveClassifier

A model type for constructing a passive aggressive classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PassiveAggressiveClassifier = @load PassiveAggressiveClassifier pkg=MLJScikitLearnInterface

Do model = PassiveAggressiveClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PassiveAggressiveClassifier(C=...).

Hyper-parameters

  • C = 1.0
  • fit_intercept = true
  • max_iter = 100
  • tol = 0.001
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • shuffle = true
  • verbose = 0
  • loss = hinge
  • n_jobs = nothing
  • random_state = 0
  • warm_start = false
  • class_weight = nothing
  • average = false
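
A minimal usage sketch, assuming the standard MLJ machine workflow and the iris data (the hyper-parameter values shown are illustrative only):

using MLJ
PassiveAggressiveClassifier = @load PassiveAggressiveClassifier pkg=MLJScikitLearnInterface

X, y = @load_iris
model = PassiveAggressiveClassifier(C=0.5, max_iter=200)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
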
diff --git a/dev/models/PassiveAggressiveRegressor_MLJScikitLearnInterface/index.html b/dev/models/PassiveAggressiveRegressor_MLJScikitLearnInterface/index.html index 02c87cc58..60142a461 100644 --- a/dev/models/PassiveAggressiveRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/PassiveAggressiveRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -PassiveAggressiveRegressor · MLJ

PassiveAggressiveRegressor

PassiveAggressiveRegressor

A model type for constructing a passive aggressive regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PassiveAggressiveRegressor = @load PassiveAggressiveRegressor pkg=MLJScikitLearnInterface

Do model = PassiveAggressiveRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PassiveAggressiveRegressor(C=...).

Hyper-parameters

  • C = 1.0
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.0001
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • shuffle = true
  • verbose = 0
  • loss = epsilon_insensitive
  • epsilon = 0.1
  • random_state = nothing
  • warm_start = false
  • average = false
+PassiveAggressiveRegressor · MLJ

PassiveAggressiveRegressor

PassiveAggressiveRegressor

A model type for constructing a passive aggressive regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PassiveAggressiveRegressor = @load PassiveAggressiveRegressor pkg=MLJScikitLearnInterface

Do model = PassiveAggressiveRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PassiveAggressiveRegressor(C=...).

Hyper-parameters

  • C = 1.0
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.0001
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • shuffle = true
  • verbose = 0
  • loss = epsilon_insensitive
  • epsilon = 0.1
  • random_state = nothing
  • warm_start = false
  • average = false
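
A minimal usage sketch, assuming the standard MLJ machine workflow and synthetic data from make_regression (the hyper-parameter values shown are illustrative only):

using MLJ
PassiveAggressiveRegressor = @load PassiveAggressiveRegressor pkg=MLJScikitLearnInterface

X, y = make_regression(100, 3)   ## synthetic Continuous features and target
model = PassiveAggressiveRegressor(C=1.0, epsilon=0.05)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
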
diff --git a/dev/models/PegasosClassifier_BetaML/index.html b/dev/models/PegasosClassifier_BetaML/index.html index 322cbcf17..a80eccf48 100644 --- a/dev/models/PegasosClassifier_BetaML/index.html +++ b/dev/models/PegasosClassifier_BetaML/index.html @@ -1,5 +1,5 @@ -PegasosClassifier · MLJ

PegasosClassifier

mutable struct PegasosClassifier <: MLJModelInterface.Probabilistic

The gradient-based linear "pegasos" classifier using one-vs-all for multiclass, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • initial_coefficients::Union{Nothing, Matrix{Float64}}: N-classes by D-dimensions matrix of initial linear coefficients [def: nothing, i.e. zeros]
  • initial_constant::Union{Nothing, Vector{Float64}}: N-classes vector of initial constant terms [def: nothing, i.e. zeros]
  • learning_rate::Function: Learning rate [def: (epoch -> 1/sqrt(epoch))]
  • learning_rate_multiplicative::Float64: Multiplicative term of the learning rate [def: 0.5]
  • epochs::Int64: Maximum number of epochs, i.e. passes through the whole training sample [def: 1000]
  • shuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]
  • force_origin::Bool: Whether to force the parameter associated with the constant term to remain zero [def: false]
  • return_mean_hyperplane::Bool: Whether to return the average hyperplane coefficients instead of the final ones [def: false]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+PegasosClassifier · MLJ

PegasosClassifier

mutable struct PegasosClassifier <: MLJModelInterface.Probabilistic

The gradient-based linear "pegasos" classifier using one-vs-all for multiclass, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • initial_coefficients::Union{Nothing, Matrix{Float64}}: N-classes by D-dimensions matrix of initial linear coefficients [def: nothing, i.e. zeros]
  • initial_constant::Union{Nothing, Vector{Float64}}: N-classes vector of initial constant terms [def: nothing, i.e. zeros]
  • learning_rate::Function: Learning rate [def: (epoch -> 1/sqrt(epoch))]
  • learning_rate_multiplicative::Float64: Multiplicative term of the learning rate [def: 0.5]
  • epochs::Int64: Maximum number of epochs, i.e. passes through the whole training sample [def: 1000]
  • shuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]
  • force_origin::Bool: Whether to force the parameter associated with the constant term to remain zero [def: false]
  • return_mean_hyperplane::Bool: Whether to return the average hyperplane coefficients instead of the final ones [def: false]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -28,4 +28,4 @@
  UnivariateFinite{Multiclass{3}}(setosa=>0.791, versicolor=>0.177, virginica=>0.0318)
  ⋮
  UnivariateFinite{Multiclass{3}}(setosa=>0.254, versicolor=>0.5, virginica=>0.246)
- UnivariateFinite{Multiclass{3}}(setosa=>0.283, versicolor=>0.51, virginica=>0.207)
+ UnivariateFinite{Multiclass{3}}(setosa=>0.283, versicolor=>0.51, virginica=>0.207)
diff --git a/dev/models/PerceptronClassifier_BetaML/index.html b/dev/models/PerceptronClassifier_BetaML/index.html index 645b51bb4..eaa852b87 100644 --- a/dev/models/PerceptronClassifier_BetaML/index.html +++ b/dev/models/PerceptronClassifier_BetaML/index.html @@ -1,5 +1,5 @@ -PerceptronClassifier · MLJ

PerceptronClassifier

mutable struct PerceptronClassifier <: MLJModelInterface.Probabilistic

The classical perceptron algorithm using one-vs-all for multiclass, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • initial_coefficients::Union{Nothing, Matrix{Float64}}: N-classes by D-dimensions matrix of initial linear coefficients [def: nothing, i.e. zeros]
  • initial_constant::Union{Nothing, Vector{Float64}}: N-classes vector of initial constant terms [def: nothing, i.e. zeros]
  • epochs::Int64: Maximum number of epochs, i.e. passes through the whole training sample [def: 1000]
  • shuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]
  • force_origin::Bool: Whether to force the parameter associated with the constant term to remain zero [def: false]
  • return_mean_hyperplane::Bool: Whether to return the average hyperplane coefficients instead of the final ones [def: false]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+PerceptronClassifier · MLJ

PerceptronClassifier

mutable struct PerceptronClassifier <: MLJModelInterface.Probabilistic

The classical perceptron algorithm using one-vs-all for multiclass, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • initial_coefficients::Union{Nothing, Matrix{Float64}}: N-classes by D-dimensions matrix of initial linear coefficients [def: nothing, i.e. zeros]
  • initial_constant::Union{Nothing, Vector{Float64}}: N-classes vector of initial constant terms [def: nothing, i.e. zeros]
  • epochs::Int64: Maximum number of epochs, i.e. passes through the whole training sample [def: 1000]
  • shuffle::Bool: Whether to randomly shuffle the data at each iteration (epoch) [def: true]
  • force_origin::Bool: Whether to force the parameter associated with the constant term to remain zero [def: false]
  • return_mean_hyperplane::Bool: Whether to return the average hyperplane coefficients instead of the final ones [def: false]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -29,4 +29,4 @@
  UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>1.27e-18, virginica=>1.86e-310)
  ⋮
  UnivariateFinite{Multiclass{3}}(setosa=>2.77e-57, versicolor=>1.1099999999999999e-82, virginica=>1.0)
- UnivariateFinite{Multiclass{3}}(setosa=>3.09e-22, versicolor=>4.03e-25, virginica=>1.0)
+ UnivariateFinite{Multiclass{3}}(setosa=>3.09e-22, versicolor=>4.03e-25, virginica=>1.0)
diff --git a/dev/models/PerceptronClassifier_MLJScikitLearnInterface/index.html b/dev/models/PerceptronClassifier_MLJScikitLearnInterface/index.html index 39b75e648..4a4fda7d9 100644 --- a/dev/models/PerceptronClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/PerceptronClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -PerceptronClassifier · MLJ

PerceptronClassifier

PerceptronClassifier

A model type for constructing a perceptron classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PerceptronClassifier = @load PerceptronClassifier pkg=MLJScikitLearnInterface

Do model = PerceptronClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PerceptronClassifier(penalty=...).

Hyper-parameters

  • penalty = nothing
  • alpha = 0.0001
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • eta0 = 1.0
  • n_jobs = nothing
  • random_state = 0
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • class_weight = nothing
  • warm_start = false
+PerceptronClassifier · MLJ

PerceptronClassifier

PerceptronClassifier

A model type for constructing a perceptron classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

PerceptronClassifier = @load PerceptronClassifier pkg=MLJScikitLearnInterface

Do model = PerceptronClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in PerceptronClassifier(penalty=...).

Hyper-parameters

  • penalty = nothing
  • alpha = 0.0001
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • eta0 = 1.0
  • n_jobs = nothing
  • random_state = 0
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • class_weight = nothing
  • warm_start = false
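
A minimal usage sketch, assuming the standard MLJ machine workflow and the iris data (the hyper-parameter values shown are illustrative only):

using MLJ
PerceptronClassifier = @load PerceptronClassifier pkg=MLJScikitLearnInterface

X, y = @load_iris
model = PerceptronClassifier(alpha=1e-4, max_iter=500)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)
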
diff --git a/dev/models/Pipeline_MLJBase/index.html b/dev/models/Pipeline_MLJBase/index.html index 231a9e401..fd30ad023 100644 --- a/dev/models/Pipeline_MLJBase/index.html +++ b/dev/models/Pipeline_MLJBase/index.html @@ -1,5 +1,5 @@ -Pipeline · MLJ

Pipeline

Pipeline(component1, component2, ... , componentk; options...)
+Pipeline · MLJ

Pipeline

Pipeline(component1, component2, ... , componentk; options...)
 Pipeline(name1=component1, name2=component2, ..., namek=componentk; options...)
 component1 |> component2 |> ... |> componentk

Create an instance of a composite model type which sequentially composes the specified components in order. This means component1 receives the inputs, its output is passed to component2, and so forth. A "component" is either a Model instance, a model type (converted immediately to its default instance) or any callable object. Here the "output" of a model is what predict returns if it is Supervised, or what transform returns if it is Unsupervised.

Names for the component fields are automatically generated unless explicitly specified, as in

Pipeline(encoder=ContinuousEncoder(drop_last=false),
          stand=Standardizer())

The Pipeline constructor accepts keyword options discussed further below.

Ordinary functions (and other callables) may be inserted in the pipeline as shown in the following example:

Pipeline(X->coerce(X, :age=>Continuous), OneHotEncoder, ConstantClassifier)

Syntactic sugar

The |> operator is overloaded to construct pipelines out of models, callables, and existing pipelines:

LinearRegressor = @load LinearRegressor pkg=MLJLinearModels add=true
@@ -7,4 +7,4 @@
 
 pipe1 = MLJBase.table |> ContinuousEncoder |> Standardizer
 pipe2 = PCA |> LinearRegressor
-pipe1 |> pipe2

At most one of the components may be a supervised model, but this model can appear in any position. A pipeline with a Supervised component is itself Supervised and implements the predict operation. It is otherwise Unsupervised (possibly Static) and implements transform.

Special operations

If all the components are invertible unsupervised models (ie, implement inverse_transform) then inverse_transform is implemented for the pipeline. If there are no supervised models, then predict is nevertheless implemented, assuming the last component is a model that implements it (some clustering models). Similarly, calling transform on a supervised pipeline calls transform on the supervised component.

Optional key-word arguments

  • prediction_type - prediction type of the pipeline; possible values: :deterministic, :probabilistic, :interval (default=:deterministic if not inferable)
  • operation - operation applied to the supervised component model, when present; possible values: predict, predict_mean, predict_median, predict_mode (default=predict)
  • cache - whether the internal machines created for component models should cache model-specific representations of data (see machine) (default=true)
Warning

Set cache=false to guarantee data anonymization.

To build more complicated non-branching pipelines, refer to the MLJ manual sections on composing models.

+pipe1 |> pipe2

At most one of the components may be a supervised model, but this model can appear in any position. A pipeline with a Supervised component is itself Supervised and implements the predict operation. It is otherwise Unsupervised (possibly Static) and implements transform.

Special operations

If all the components are invertible unsupervised models (ie, implement inverse_transform) then inverse_transform is implemented for the pipeline. If there are no supervised models, then predict is nevertheless implemented, assuming the last component is a model that implements it (some clustering models). Similarly, calling transform on a supervised pipeline calls transform on the supervised component.
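
For concreteness, a minimal sketch of a supervised pipeline of the kind described above (the component choices and data are illustrative only):

using MLJ
LinearRegressor = @load LinearRegressor pkg=MLJLinearModels
PCA = @load PCA pkg=MultivariateStats

pipe = Standardizer |> PCA |> LinearRegressor   ## supervised, because of the last component

X, y = make_regression(100, 5)
mach = machine(pipe, X, y) |> fit!
yhat = predict(mach, X)                          ## the pipeline implements predict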

Optional key-word arguments

  • prediction_type - prediction type of the pipeline; possible values: :deterministic, :probabilistic, :interval (default=:deterministic if not inferable)
  • operation - operation applied to the supervised component model, when present; possible values: predict, predict_mean, predict_median, predict_mode (default=predict)
  • cache - whether the internal machines created for component models should cache model-specific representations of data (see machine) (default=true)
Warning

Set cache=false to guarantee data anonymization.

To build more complicated non-branching pipelines, refer to the MLJ manual sections on composing models.

diff --git a/dev/models/ProbabilisticNuSVC_LIBSVM/index.html b/dev/models/ProbabilisticNuSVC_LIBSVM/index.html index 50fc3d7ac..572b49556 100644 --- a/dev/models/ProbabilisticNuSVC_LIBSVM/index.html +++ b/dev/models/ProbabilisticNuSVC_LIBSVM/index.html @@ -1,5 +1,5 @@ -ProbabilisticNuSVC · MLJ

ProbabilisticNuSVC

ProbabilisticNuSVC

A model type for constructing a probabilistic ν-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ProbabilisticNuSVC = @load ProbabilisticNuSVC pkg=LIBSVM

Do model = ProbabilisticNuSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticNuSVC(kernel=...).

This model is identical to NuSVC with the exception that it predicts probabilities, instead of actual class labels. Probabilities are computed using Platt scaling, which will add to total computation time.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Platt, John (1999): "Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods."

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> exp(-gamma*norm(x1 - x2)^2)
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions; see LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.

  • cachesize=200.0 cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
+ProbabilisticNuSVC · MLJ

ProbabilisticNuSVC

ProbabilisticNuSVC

A model type for constructing a probabilistic ν-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ProbabilisticNuSVC = @load ProbabilisticNuSVC pkg=LIBSVM

Do model = ProbabilisticNuSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticNuSVC(kernel=...).

This model is identical to NuSVC with the exception that it predicts probabilities, instead of actual class labels. Probabilities are computed using Platt scaling, which will add to total computation time.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Platt, John (1999): "Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods."

Training data

In MLJ or MLJBase, bind an instance model to data with:

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> exp(-gamma*norm(x1 - x2)^2)
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. A user-defined kernel is sketched in the example following this hyper-parameter list. Serialization of models with user-defined kernels comes with some restrictions; see LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • nu=0.5 (range (0, 1]): An upper bound on the fraction of margin errors and a lower bound of the fraction of support vectors. Denoted ν in the cited paper. Changing nu changes the thickness of the margin (a neighborhood of the decision surface) and a margin error is said to have occurred if a training observation lies on the wrong side of the surface or within the margin.

  • cachesize=200.0 cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics
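
As referenced above, the kernel hyper-parameter also accepts an arbitrary callable. A minimal sketch, using a simple linear kernel and the iris data for illustration:

using MLJ
import LIBSVM

ProbabilisticNuSVC = @load ProbabilisticNuSVC pkg=LIBSVM

k(x1, x2) = x1'*x2               ## a user-defined (linear) kernel

X, y = @load_iris
model = ProbabilisticNuSVC(kernel=k, nu=0.4)
mach = machine(model, X, y) |> fit!
probs = predict(mach, X)         ## UnivariateFinite probability distributions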

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
 import LIBSVM
 
 ProbabilisticNuSVC = @load ProbabilisticNuSVC pkg=LIBSVM    ## model type
@@ -27,4 +27,4 @@
 model = ProbabilisticNuSVC(kernel=k)
 mach = machine(model, X, y) |> fit!
 
-probs = predict(mach, Xnew)

See also the classifiers NuSVC, SVC, ProbabilisticSVC and LinearSVC. And see LIBSVM.jl and the original C implementation documentation.

+probs = predict(mach, Xnew)

See also the classifiers NuSVC, SVC, ProbabilisticSVC and LinearSVC. And see LIBSVM.jl and the original C implementation documentation.

diff --git a/dev/models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/index.html b/dev/models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/index.html index fcecffa3c..4c2d3477b 100644 --- a/dev/models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/ProbabilisticSGDClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -ProbabilisticSGDClassifier · MLJ

ProbabilisticSGDClassifier

ProbabilisticSGDClassifier

A model type for constructing a probabilistic SGD (stochastic gradient descent) classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ProbabilisticSGDClassifier = @load ProbabilisticSGDClassifier pkg=MLJScikitLearnInterface

Do model = ProbabilisticSGDClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticSGDClassifier(loss=...).

Hyper-parameters

  • loss = log_loss
  • penalty = l2
  • alpha = 0.0001
  • l1_ratio = 0.15
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • epsilon = 0.1
  • n_jobs = nothing
  • random_state = nothing
  • learning_rate = optimal
  • eta0 = 0.0
  • power_t = 0.5
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • class_weight = nothing
  • warm_start = false
  • average = false
+ProbabilisticSGDClassifier · MLJ

ProbabilisticSGDClassifier

ProbabilisticSGDClassifier

A model type for constructing a probabilistic SGD (stochastic gradient descent) classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ProbabilisticSGDClassifier = @load ProbabilisticSGDClassifier pkg=MLJScikitLearnInterface

Do model = ProbabilisticSGDClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticSGDClassifier(loss=...).

Hyper-parameters

  • loss = log_loss
  • penalty = l2
  • alpha = 0.0001
  • l1_ratio = 0.15
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • epsilon = 0.1
  • n_jobs = nothing
  • random_state = nothing
  • learning_rate = optimal
  • eta0 = 0.0
  • power_t = 0.5
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • class_weight = nothing
  • warm_start = false
  • average = false
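
A minimal usage sketch, assuming the standard MLJ machine workflow and the iris data (default hyper-parameters except alpha, shown for illustration):

using MLJ
ProbabilisticSGDClassifier = @load ProbabilisticSGDClassifier pkg=MLJScikitLearnInterface

X, y = @load_iris
model = ProbabilisticSGDClassifier(alpha=1e-4)
mach = machine(model, X, y) |> fit!
probs = predict(mach, X)         ## probabilistic predictions
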
diff --git a/dev/models/ProbabilisticSVC_LIBSVM/index.html b/dev/models/ProbabilisticSVC_LIBSVM/index.html index 084988f2f..02e07778e 100644 --- a/dev/models/ProbabilisticSVC_LIBSVM/index.html +++ b/dev/models/ProbabilisticSVC_LIBSVM/index.html @@ -1,5 +1,5 @@ -ProbabilisticSVC · MLJ

ProbabilisticSVC

ProbabilisticSVC

A model type for constructing a probabilistic C-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ProbabilisticSVC = @load ProbabilisticSVC pkg=LIBSVM

Do model = ProbabilisticSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticSVC(kernel=...).

This model is identical to SVC with the exception that it predicts probabilities, instead of actual class labels. Probabilities are computed using Platt scaling, which will add to the total computation time.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Platt, John (1999): "Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods."

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
+ProbabilisticSVC · MLJ

ProbabilisticSVC

ProbabilisticSVC

A model type for constructing a probabilistic C-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ProbabilisticSVC = @load ProbabilisticSVC pkg=LIBSVM

Do model = ProbabilisticSVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ProbabilisticSVC(kernel=...).

This model is identical to SVC with the exception that it predicts probabilities, instead of actual class labels. Probabilities are computed using Platt scaling, which will add to the total computation time.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Platt, John (1999): "Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods."

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
 mach = machine(model, X, y, w)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)
  • w: a dictionary of class weights, keyed on levels(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> exp(-gamma*norm(x1 - x2)^2)
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0, degree are other hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions; see LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • cost=1.0 (range (0, Inf)): the parameter denoted $C$ in the cited reference; for greater regularization, decrease cost

  • cachesize=200.0 cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics

Operations

  • predict(mach, Xnew): return probabilistic predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
 import LIBSVM
 
@@ -32,4 +32,4 @@
 probs = predict(mach, Xnew)

Incorporating class weights

In either scenario above, we can do:

weights = Dict("virginica" => 1, "versicolor" => 20, "setosa" => 1)
 mach = machine(model, X, y, weights) |> fit!
 
-probs = predict(mach, Xnew)

See also the classifiers SVC, NuSVC and LinearSVC, and LIBSVM.jl and the original C implementation documentation.

+probs = predict(mach, Xnew)

See also the classifiers SVC, NuSVC and LinearSVC, and LIBSVM.jl and the original C implementation documentation.

diff --git a/dev/models/QuantileRegressor_MLJLinearModels/index.html b/dev/models/QuantileRegressor_MLJLinearModels/index.html index 6af43fb7e..c74f14497 100644 --- a/dev/models/QuantileRegressor_MLJLinearModels/index.html +++ b/dev/models/QuantileRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -QuantileRegressor · MLJ

QuantileRegressor

QuantileRegressor

A model type for constructing a quantile regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

QuantileRegressor = @load QuantileRegressor pkg=MLJLinearModels

Do model = QuantileRegressor() to construct an instance with default hyper-parameters.

This model coincides with RobustRegressor, with the exception that the robust loss, rho, is fixed to QuantileRho(delta), where delta is a new hyperparameter.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • delta::Real: parameterizes the QuantileRho function (indicating the quantile to use, with default 0.5 for median regression). Default: 0.5

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S, where S is one of LBFGS or IWLSCG if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
+QuantileRegressor · MLJ

QuantileRegressor

QuantileRegressor

A model type for constructing a quantile regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

QuantileRegressor = @load QuantileRegressor pkg=MLJLinearModels

Do model = QuantileRegressor() to construct an instance with default hyper-parameters.

This model coincides with RobustRegressor, with the exception that the robust loss, rho, is fixed to QuantileRho(delta), where delta is a new hyperparameter.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • delta::Real: parameterizes the QuantileRho function (indicating the quantile to use; the default of 0.5 corresponds to median regression; see the sketch following this list). Default: 0.5

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing
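
For instance, a minimal sketch (not part of the original docstring) of a regressor targeting the 90th percentile with an elastic net penalty:

using MLJ
 QuantileRegressor = @load QuantileRegressor pkg=MLJLinearModels
 model = QuantileRegressor(delta=0.9, penalty=:en, lambda=0.1, gamma=0.01)   ## 0.9-quantile regression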

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(QuantileRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also RobustRegressor, HuberRegressor.

+fitted_params(mach)

See also RobustRegressor, HuberRegressor.

diff --git a/dev/models/RANSACRegressor_MLJScikitLearnInterface/index.html b/dev/models/RANSACRegressor_MLJScikitLearnInterface/index.html index 817ae1fba..543cb26eb 100644 --- a/dev/models/RANSACRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/RANSACRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -RANSACRegressor · MLJ

RANSACRegressor

RANSACRegressor

A model type for constructing a ransac regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RANSACRegressor = @load RANSACRegressor pkg=MLJScikitLearnInterface

Do model = RANSACRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RANSACRegressor(estimator=...).

Hyper-parameters

  • estimator = nothing
  • min_samples = 5
  • residual_threshold = nothing
  • is_data_valid = nothing
  • is_model_valid = nothing
  • max_trials = 100
  • max_skips = 9223372036854775807
  • stop_n_inliers = 9223372036854775807
  • stop_score = Inf
  • stop_probability = 0.99
  • loss = absolute_error
  • random_state = nothing
+RANSACRegressor · MLJ

RANSACRegressor

RANSACRegressor

A model type for constructing a ransac regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RANSACRegressor = @load RANSACRegressor pkg=MLJScikitLearnInterface

Do model = RANSACRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RANSACRegressor(estimator=...).

Hyper-parameters

  • estimator = nothing
  • min_samples = 5
  • residual_threshold = nothing
  • is_data_valid = nothing
  • is_model_valid = nothing
  • max_trials = 100
  • max_skips = 9223372036854775807
  • stop_n_inliers = 9223372036854775807
  • stop_score = Inf
  • stop_probability = 0.99
  • loss = absolute_error
  • random_state = nothing
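
This docstring provides no worked example; the following is a minimal usage sketch, assuming only the standard MLJ workflow (not part of the original text):

using MLJ
 RANSACRegressor = @load RANSACRegressor pkg=MLJScikitLearnInterface
 X, y = make_regression()                                  ## synthetic regression data from MLJ
 model = RANSACRegressor(min_samples=10, max_trials=200)   ## override selected hyper-parameters
 mach = machine(model, X, y) |> fit!
 yhat = predict(mach, X)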
diff --git a/dev/models/RODDetector_OutlierDetectionPython/index.html b/dev/models/RODDetector_OutlierDetectionPython/index.html index 46cf1feae..5bb20d9db 100644 --- a/dev/models/RODDetector_OutlierDetectionPython/index.html +++ b/dev/models/RODDetector_OutlierDetectionPython/index.html @@ -1,2 +1,2 @@ -RODDetector · MLJ
+RODDetector · MLJ
diff --git a/dev/models/ROSE_Imbalance/index.html b/dev/models/ROSE_Imbalance/index.html index 2a337ad7a..8b8acefdf 100644 --- a/dev/models/ROSE_Imbalance/index.html +++ b/dev/models/ROSE_Imbalance/index.html @@ -1,5 +1,5 @@ -ROSE · MLJ

ROSE

Initiate a ROSE model with the given hyper-parameters.

ROSE

A model type for constructing a rose, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ROSE = @load ROSE pkg=Imbalance

Do model = ROSE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ROSE(s=...).

ROSE implements the ROSE (Random Oversampling Examples) algorithm to correct for class imbalance as in G Menardi, N. Torelli, “Training and assessing classification rules with imbalanced data,” Data Mining and Knowledge Discovery, 28(1), pp.92-122, 2014.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = ROSE()

Hyperparameters

  • s::float: A parameter that proportionally controls the bandwidth of the Gaussian kernel

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or table respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using ROSE, returning both the new and original observations

Example

using MLJ
+ROSE · MLJ

ROSE

Initiate a ROSE model with the given hyper-parameters.

ROSE

A model type for constructing a rose, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ROSE = @load ROSE pkg=Imbalance

Do model = ROSE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ROSE(s=...).

ROSE implements the ROSE (Random Oversampling Examples) algorithm to correct for class imbalance as in G Menardi, N. Torelli, “Training and assessing classification rules with imbalanced data,” Data Mining and Knowledge Discovery, 28(1), pp.92-122, 2014.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = ROSE()

Hyperparameters

  • s::float: A parameter that proportionally controls the bandwidth of the Gaussian kernel

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.
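
For example (a hypothetical sketch with made-up class labels, not part of the original docstring), per-class ratios can be supplied as a dictionary:

using MLJ
 ROSE = @load ROSE pkg=Imbalance
 ## oversample class "a" to 1.2 times, and class "b" to 0.8 times, the majority class size
 oversampler = ROSE(s=0.2, ratios=Dict("a" => 1.2, "b" => 0.8), rng=42)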

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or table respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using ROSE, returning both the new and original observations

Example

using MLJ
 import Imbalance
 
 ## set probability of each class
@@ -27,4 +27,4 @@
 julia> Imbalance.checkbalance(yover)
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) 
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%)
diff --git a/dev/models/RandomForestClassifier_BetaML/index.html b/dev/models/RandomForestClassifier_BetaML/index.html index 8ab63ab3b..c43ba8e3f 100644 --- a/dev/models/RandomForestClassifier_BetaML/index.html +++ b/dev/models/RandomForestClassifier_BetaML/index.html @@ -1,5 +1,5 @@ -RandomForestClassifier · MLJ

RandomForestClassifier

mutable struct RandomForestClassifier <: MLJModelInterface.Probabilistic

A simple Random Forest model for classification with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_trees::Int64
  • max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. square root of the data dimensions]
  • splitting_criterion::Function: The name of the function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: gini]. Either gini, entropy or a custom function. It can also be an anonymous function.
  • β::Float64: Parameter that regulates the weights of the scoring of each tree, to be (optionally) used in prediction based on the error of the individual trees computed on the records on which the trees have not been trained. Higher values favour "better" trees, but too high values will cause overfitting [def: 0, i.e. uniform weights]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example :

julia> using MLJ
+RandomForestClassifier · MLJ

RandomForestClassifier

mutable struct RandomForestClassifier <: MLJModelInterface.Probabilistic

A simple Random Forest model for classification with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_trees::Int64
  • max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. square root of the data dimensions]
  • splitting_criterion::Function: The name of the function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: gini]. Either gini, entropy or a custom function. It can also be an anonymous function.
  • β::Float64: Parameter that regulates the weights of the scoring of each tree, to be (optionally) used in prediction based on the error of the individual trees computed on the records on which the trees have not been trained. Higher values favour "better" trees, but too high values will cause overfitting [def: 0, i.e. uniform weights]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example :

julia> using MLJ
 
 julia> X, y        = @load_iris;
 
@@ -28,4 +28,4 @@
  UnivariateFinite{Multiclass{3}}(setosa=>1.0, versicolor=>0.0, virginica=>0.0)
  ⋮
  UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0, virginica=>1.0)
- UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0667, virginica=>0.933)
+ UnivariateFinite{Multiclass{3}}(setosa=>0.0, versicolor=>0.0667, virginica=>0.933)
diff --git a/dev/models/RandomForestClassifier_DecisionTree/index.html b/dev/models/RandomForestClassifier_DecisionTree/index.html index 6706682d3..d6c9ca362 100644 --- a/dev/models/RandomForestClassifier_DecisionTree/index.html +++ b/dev/models/RandomForestClassifier_DecisionTree/index.html @@ -1,5 +1,5 @@ -RandomForestClassifier · MLJ

RandomForestClassifier

RandomForestClassifier

A model type for constructing a CART random forest classifier, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestClassifier = @load RandomForestClassifier pkg=DecisionTree

Do model = RandomForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestClassifier(max_depth=...).

RandomForestClassifier implements the standard Random Forest algorithm, originally published in Breiman, L. (2001): "Random Forests.", Machine Learning, vol. 45, pp. 5–32.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • max_depth=-1: max depth of the decision tree (-1=any)
  • min_samples_leaf=1: min number of samples each leaf needs to have
  • min_samples_split=2: min number of samples needed for a split
  • min_purity_increase=0: min purity needed for a split
  • n_subfeatures=-1: number of features to select at random (0 for all, -1 for square root of number of features)
  • n_trees=10: number of trees to train
  • sampling_fraction=0.7: fraction of samples to train each tree on
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic, but uncalibrated.
  • predict_mode(mach, Xnew): instead return the mode of each prediction above.

Fitted parameters

The fields of fitted_params(mach) are:

  • forest: the Ensemble object returned by the core DecisionTree.jl algorithm

Report

The fields of report(mach) are:

  • features: the names of the features encountered in training

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
+RandomForestClassifier · MLJ

RandomForestClassifier

RandomForestClassifier

A model type for constructing a CART random forest classifier, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestClassifier = @load RandomForestClassifier pkg=DecisionTree

Do model = RandomForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestClassifier(max_depth=...).

RandomForestClassifier implements the standard Random Forest algorithm, originally published in Breiman, L. (2001): "Random Forests.", Machine Learning, vol. 45, pp. 5–32.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • max_depth=-1: max depth of the decision tree (-1=any)
  • min_samples_leaf=1: min number of samples each leaf needs to have
  • min_samples_split=2: min number of samples needed for a split
  • min_purity_increase=0: min purity needed for a split
  • n_subfeatures=-1: number of features to select at random (0 for all, -1 for square root of number of features)
  • n_trees=10: number of trees to train
  • sampling_fraction=0.7: fraction of samples to train each tree on
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above. Predictions are probabilistic, but uncalibrated.
  • predict_mode(mach, Xnew): instead return the mode of each prediction above.

Fitted parameters

The fields of fitted_params(mach) are:

  • forest: the Ensemble object returned by the core DecisionTree.jl algorithm

Report

The fields of report(mach) are:

  • features: the names of the features encountered in training

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
 Forest = @load RandomForestClassifier pkg=DecisionTree
 forest = Forest(min_samples_split=6, n_subfeatures=3)
 
@@ -19,4 +19,4 @@
 feature_importances(mach)  ## `:impurity` feature importances
 forest.feature_importance = :split
 feature_importances(mach)  ## `:split` feature importances
-

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.RandomForestClassifier.

+

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.RandomForestClassifier.

diff --git a/dev/models/RandomForestClassifier_MLJScikitLearnInterface/index.html b/dev/models/RandomForestClassifier_MLJScikitLearnInterface/index.html index 93aec4c69..01650dc60 100644 --- a/dev/models/RandomForestClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/RandomForestClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -RandomForestClassifier · MLJ

RandomForestClassifier

RandomForestClassifier

A model type for constructing a random forest classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestClassifier = @load RandomForestClassifier pkg=MLJScikitLearnInterface

Do model = RandomForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestClassifier(n_estimators=...).

A random forest is a meta estimator that fits a number of classifying decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default), otherwise the whole dataset is used to build each tree.

+RandomForestClassifier · MLJ

RandomForestClassifier

RandomForestClassifier

A model type for constructing a random forest classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestClassifier = @load RandomForestClassifier pkg=MLJScikitLearnInterface

Do model = RandomForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestClassifier(n_estimators=...).

A random forest is a meta estimator that fits a number of classifying decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default), otherwise the whole dataset is used to build each tree.
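
No example accompanies this docstring; the following is a minimal usage sketch, assuming only the standard MLJ workflow (not part of the original text):

using MLJ
 RandomForestClassifier = @load RandomForestClassifier pkg=MLJScikitLearnInterface
 X, y = @load_iris
 model = RandomForestClassifier(n_estimators=200)
 mach = machine(model, X, y) |> fit!
 probs = predict(mach, X)   ## predictions (probabilistic for this classifier)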

diff --git a/dev/models/RandomForestImputer_BetaML/index.html b/dev/models/RandomForestImputer_BetaML/index.html index fa589efbc..5acb1d2aa 100644 --- a/dev/models/RandomForestImputer_BetaML/index.html +++ b/dev/models/RandomForestImputer_BetaML/index.html @@ -1,5 +1,5 @@ -RandomForestImputer · MLJ

RandomForestImputer

mutable struct RandomForestImputer <: MLJModelInterface.Unsupervised

Impute missing values using Random Forests, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_trees::Int64: Number of (decision) trees in the forest [def: 30]
  • max_depth::Union{Nothing, Int64}: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: nothing, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Union{Nothing, Int64}: The maximum number of (random) features to consider at each partitioning [def: nothing, i.e. square root of the data dimension]
  • forced_categorical_cols::Vector{Int64}: Specify the positions of the integer columns to treat as categorical instead of cardinal. [Default: empty vector (all numerical cols are treated as cardinal by default and the others as categorical)]
  • splitting_criterion::Union{Nothing, Function}: Either gini, entropy or variance. This is the name of the function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: nothing, i.e. gini for categorical labels (classification task) and variance for numerical labels (regression task)]. It can be an anonymous function.
  • recursive_passages::Int64: The number of times to go through the various columns to impute their data. Useful when there are data to impute in multiple columns. The order of the first passage is given by the decreasing number of missing values per column; the other passages are random [default: 1].
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+RandomForestImputer · MLJ

RandomForestImputer

mutable struct RandomForestImputer <: MLJModelInterface.Unsupervised

Impute missing values using Random Forests, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_trees::Int64: Number of (decision) trees in the forest [def: 30]
  • max_depth::Union{Nothing, Int64}: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: nothing, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Union{Nothing, Int64}: The maximum number of (random) features to consider at each partitioning [def: nothing, i.e. square root of the data dimension]
  • forced_categorical_cols::Vector{Int64}: Specify the positions of the integer columns to treat as categorical instead of cardinal. [Default: empty vector (all numerical cols are treated as cardinal by default and the others as categorical)]
  • splitting_criterion::Union{Nothing, Function}: Either gini, entropy or variance. This is the name of the function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: nothing, i.e. gini for categorical labels (classification task) and variance for numerical labels (regression task)]. It can be an anonymous function.
  • recursive_passages::Int64: The number of times to go through the various columns to impute their data. Useful when there are data to impute in multiple columns. The order of the first passage is given by the decreasing number of missing values per column; the other passages are random [default: 1].
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
 
 julia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;
 
@@ -33,4 +33,4 @@
  2.88375   8.66125
  3.3      38.0
  3.98125  -2.3
- 5.2      -2.4
+ 5.2 -2.4
diff --git a/dev/models/RandomForestRegressor_BetaML/index.html b/dev/models/RandomForestRegressor_BetaML/index.html index 0dcc4c0c1..cc13e6148 100644 --- a/dev/models/RandomForestRegressor_BetaML/index.html +++ b/dev/models/RandomForestRegressor_BetaML/index.html @@ -1,5 +1,5 @@ -RandomForestRegressor · MLJ

RandomForestRegressor

mutable struct RandomForestRegressor <: MLJModelInterface.Deterministic

A simple Random Forest model for regression with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_trees::Int64: Number of (decision) trees in the forest [def: 30]
  • max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. square root of the data dimension]
  • splitting_criterion::Function: The name of the function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: variance]. Either variance or a custom function. It can also be an anonymous function.
  • β::Float64: Parameter that regulates the weights of the scoring of each tree, to be (optionally) used in prediction based on the error of the individual trees computed on the records on which the trees have not been trained. Higher values favour "better" trees, but too high values will cause overfitting [def: 0, i.e. uniform weights]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
+RandomForestRegressor · MLJ

RandomForestRegressor

mutable struct RandomForestRegressor <: MLJModelInterface.Deterministic

A simple Random Forest model for regression with support for Missing data, from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • n_trees::Int64: Number of (decision) trees in the forest [def: 30]
  • max_depth::Int64: The maximum depth the tree is allowed to reach. When this is reached the node is forced to become a leaf [def: 0, i.e. no limits]
  • min_gain::Float64: The minimum information gain to allow for a node's partition [def: 0]
  • min_records::Int64: The minimum number of records a node must hold to be considered for partitioning [def: 2]
  • max_features::Int64: The maximum number of (random) features to consider at each partitioning [def: 0, i.e. square root of the data dimension]
  • splitting_criterion::Function: The name of the function used to compute the information gain of a specific partition. This is done by measuring the difference between the "impurity" of the labels of the parent node and those of the two child nodes, weighted by the respective number of items. [def: variance]. Either variance or a custom function. It can also be an anonymous function.
  • β::Float64: Parameter that regulates the weights of the scoring of each tree, to be (optionally) used in prediction based on the error of the individual trees computed on the records on which the trees have not been trained. Higher values favour "better" trees, but too high values will cause overfitting [def: 0, i.e. uniform weights]
  • rng::Random.AbstractRNG: A Random Number Generator to be used in stochastic parts of the code [default: Random.GLOBAL_RNG]

Example:

julia> using MLJ
 
 julia> X, y        = @load_boston;
 
@@ -33,4 +33,4 @@
   ⋮    
  23.9  24.42
  22.0  22.4433
- 11.9  15.5833
+ 11.9 15.5833
diff --git a/dev/models/RandomForestRegressor_DecisionTree/index.html b/dev/models/RandomForestRegressor_DecisionTree/index.html index 5d7113735..05313af44 100644 --- a/dev/models/RandomForestRegressor_DecisionTree/index.html +++ b/dev/models/RandomForestRegressor_DecisionTree/index.html @@ -1,5 +1,5 @@ -RandomForestRegressor · MLJ

RandomForestRegressor

RandomForestRegressor

A model type for constructing a CART random forest regressor, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree

Do model = RandomForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestRegressor(max_depth=...).

RandomForestRegressor implements the standard Random Forest algorithm, originally published in Breiman, L. (2001): "Random Forests.", Machine Learning, vol. 45, pp. 5–32.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • max_depth=-1: max depth of the decision tree (-1=any)
  • min_samples_leaf=1: min number of samples each leaf needs to have
  • min_samples_split=2: min number of samples needed for a split
  • min_purity_increase=0: min purity needed for a split
  • n_subfeatures=-1: number of features to select at random (0 for all, -1 for square root of number of features)
  • n_trees=10: number of trees to train
  • sampling_fraction=0.7: fraction of samples to train each tree on
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • forest: the Ensemble object returned by the core DecisionTree.jl algorithm

Report

The fields of report(mach) are:

  • features: the names of the features encountered in training

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
+RandomForestRegressor · MLJ

RandomForestRegressor

RandomForestRegressor

A model type for constructing a CART random forest regressor, based on DecisionTree.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree

Do model = RandomForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestRegressor(max_depth=...).

RandomForestRegressor implements the standard Random Forest algorithm, originally published in Breiman, L. (2001): "Random Forests.", Machine Learning, vol. 45, pp. 5–32.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • max_depth=-1: max depth of the decision tree (-1=any)
  • min_samples_leaf=1: min number of samples each leaf needs to have
  • min_samples_split=2: min number of samples needed for a split
  • min_purity_increase=0: min purity needed for a split
  • n_subfeatures=-1: number of features to select at random (0 for all, -1 for square root of number of features)
  • n_trees=10: number of trees to train
  • sampling_fraction=0.7: fraction of samples to train each tree on
  • feature_importance: method to use for computing feature importances. One of (:impurity, :split)
  • rng=Random.GLOBAL_RNG: random number generator or seed

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • forest: the Ensemble object returned by the core DecisionTree.jl algorithm

Report

The fields of report(mach) are:

  • features: the names of the features encountered in training

Accessor functions

  • feature_importances(mach) returns a vector of (feature::Symbol => importance) pairs; the type of importance is determined by the hyperparameter feature_importance (see above)

Examples

using MLJ
 Forest = @load RandomForestRegressor pkg=DecisionTree
 forest = Forest(max_depth=4, min_samples_split=3)
 
@@ -10,4 +10,4 @@
 yhat = predict(mach, Xnew) ## new predictions
 
 fitted_params(mach).forest ## raw `Ensemble` object from DecisionTree.jl
-feature_importances(mach)

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.RandomForestRegressor.

+feature_importances(mach)

See also DecisionTree.jl and the unwrapped model type MLJDecisionTreeInterface.DecisionTree.RandomForestRegressor.

diff --git a/dev/models/RandomForestRegressor_MLJScikitLearnInterface/index.html b/dev/models/RandomForestRegressor_MLJScikitLearnInterface/index.html index 6c2626574..42e0c14d3 100644 --- a/dev/models/RandomForestRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/RandomForestRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -RandomForestRegressor · MLJ

RandomForestRegressor

RandomForestRegressor

A model type for constructing a random forest regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestRegressor = @load RandomForestRegressor pkg=MLJScikitLearnInterface

Do model = RandomForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestRegressor(n_estimators=...).

A random forest is a meta estimator that fits a number of decision tree regressors on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default), otherwise the whole dataset is used to build each tree.

+RandomForestRegressor · MLJ

RandomForestRegressor

RandomForestRegressor

A model type for constructing a random forest regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomForestRegressor = @load RandomForestRegressor pkg=MLJScikitLearnInterface

Do model = RandomForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomForestRegressor(n_estimators=...).

A random forest is a meta estimator that fits a number of decision tree regressors on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default), otherwise the whole dataset is used to build each tree.
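
No example accompanies this docstring; the following is a minimal usage sketch, assuming only the standard MLJ workflow (not part of the original text):

using MLJ
 RandomForestRegressor = @load RandomForestRegressor pkg=MLJScikitLearnInterface
 X, y = make_regression()
 model = RandomForestRegressor(n_estimators=200)
 mach = machine(model, X, y) |> fit!
 yhat = predict(mach, X)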

diff --git a/dev/models/RandomOversampler_Imbalance/index.html b/dev/models/RandomOversampler_Imbalance/index.html index a10949a3a..58d8af495 100644 --- a/dev/models/RandomOversampler_Imbalance/index.html +++ b/dev/models/RandomOversampler_Imbalance/index.html @@ -1,5 +1,5 @@ -RandomOversampler · MLJ

RandomOversampler

Initiate a random oversampling model with the given hyper-parameters.

RandomOversampler

A model type for constructing a random oversampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomOversampler = @load RandomOversampler pkg=Imbalance

Do model = RandomOversampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomOversampler(ratios=...).

RandomOversampler implements naive oversampling by repeating existing observations with replacement.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = RandomOversampler()

Hyperparameters

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

Transform Inputs

  • X: A matrix of real numbers or a table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or table respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using RandomOversampler, returning both the new and original observations

Example

using MLJ
+RandomOversampler · MLJ

RandomOversampler

Initiate a random oversampling model with the given hyper-parameters.

RandomOversampler

A model type for constructing a random oversampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomOversampler = @load RandomOversampler pkg=Imbalance

Do model = RandomOversampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomOversampler(ratios=...).

RandomOversampler implements naive oversampling by repeating existing observations with replacement.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = RandomOversampler()

Hyperparameters

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

Transform Inputs

  • X: A matrix of real numbers or a table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or table respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using RandomOversampler, returning both the new and original observations

Example

using MLJ
 import Imbalance
 
 ## set probability of each class
@@ -27,4 +27,4 @@
 julia> Imbalance.checkbalance(yover)
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) 
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%)
diff --git a/dev/models/RandomUndersampler_Imbalance/index.html b/dev/models/RandomUndersampler_Imbalance/index.html index 0e0a889d1..babc46d4b 100644 --- a/dev/models/RandomUndersampler_Imbalance/index.html +++ b/dev/models/RandomUndersampler_Imbalance/index.html @@ -1,5 +1,5 @@ -RandomUndersampler · MLJ

RandomUndersampler

Initiate a random undersampling model with the given hyper-parameters.

RandomUndersampler

A model type for constructing a random undersampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomUndersampler = @load RandomUndersampler pkg=Imbalance

Do model = RandomUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomUndersampler(ratios=...).

RandomUndersampler implements naive undersampling by randomly removing existing observations.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = RandomUndersampler()

Hyperparameters

  • ratios=1.0: A parameter that controls the amount of undersampling to be done for each class

    • Can be a float and in this case each class will be undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

Transform Inputs

  • X: A matrix of real numbers or a table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • X_under: A matrix or table that includes the data after undersampling, depending on whether the input X is a matrix or table respectively
  • y_under: An abstract vector of labels corresponding to X_under

Operations

  • transform(mach, X, y): resample the data X and y using RandomUndersampler, returning both the new and original observations

Example

using MLJ
+RandomUndersampler · MLJ

RandomUndersampler

Initiate a random undersampling model with the given hyper-parameters.

RandomUndersampler

A model type for constructing a random undersampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomUndersampler = @load RandomUndersampler pkg=Imbalance

Do model = RandomUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomUndersampler(ratios=...).

RandomUndersampler implements naive undersampling by randomly removing existing observations.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = RandomUndersampler()

Hyperparameters

  • ratios=1.0: A parameter that controls the amount of undersampling to be done for each class

    • Can be a float and in this case each class will be undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.
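
For example (a hypothetical sketch with made-up class labels, not part of the original docstring), per-class ratios can be supplied as a dictionary; note that for undersampling the ratios are relative to the minority class:

using MLJ
 RandomUndersampler = @load RandomUndersampler pkg=Imbalance
 ## keep class "a" at the minority class size and undersample class "b" to 1.5 times that size
 undersampler = RandomUndersampler(ratios=Dict("a" => 1.0, "b" => 1.5), rng=42)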

Transform Inputs

  • X: A matrix of real numbers or a table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • X_under: A matrix or table that includes the data after undersampling, depending on whether the input X is a matrix or table respectively
  • y_under: An abstract vector of labels corresponding to X_under

Operations

  • transform(mach, X, y): resample the data X and y using RandomUndersampler, returning both the new and original observations

Example

using MLJ
 import Imbalance
 
 ## set probability of each class
@@ -28,4 +28,4 @@
 julia> Imbalance.checkbalance(y_under; ref="minority")
 0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) 
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) 
-1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) 
+1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%)
diff --git a/dev/models/RandomWalkOversampler_Imbalance/index.html b/dev/models/RandomWalkOversampler_Imbalance/index.html index 4f7380e93..aca4f2cef 100644 --- a/dev/models/RandomWalkOversampler_Imbalance/index.html +++ b/dev/models/RandomWalkOversampler_Imbalance/index.html @@ -1,5 +1,5 @@ -RandomWalkOversampler · MLJ

RandomWalkOversampler

Initiate a RandomWalkOversampler model with the given hyper-parameters.

RandomWalkOversampler

A model type for constructing a random walk oversampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomWalkOversampler = @load RandomWalkOversampler pkg=Imbalance

Do model = RandomWalkOversampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomWalkOversampler(ratios=...).

RandomWalkOversampler implements the random walk oversampling algorithm to correct for class imbalance as in Zhang, H., & Li, M. (2014). RWO-Sampling: A random walk over-sampling approach to imbalanced data classification. Information Fusion, 25, 4-20.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = RandomWalkOversampler()

Hyperparameters

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

Transform Inputs

  • X: A table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and
 elements in continuous columns should subtype `Infinite` (i.e., have 
+RandomWalkOversampler · MLJ

RandomWalkOversampler

Initiate a RandomWalkOversampler model with the given hyper-parameters.

RandomWalkOversampler

A model type for constructing a random walk oversampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RandomWalkOversampler = @load RandomWalkOversampler pkg=Imbalance

Do model = RandomWalkOversampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RandomWalkOversampler(ratios=...).

RandomWalkOversampler implements the random walk oversampling algorithm to correct for class imbalance as in Zhang, H., & Li, M. (2014). RWO-Sampling: A random walk over-sampling approach to imbalanced data classification. Information Fusion, 25, 4-20.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = RandomWalkOversampler()

Hyperparameters

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, uses MersenneTwister.

Transform Inputs

  • X: A table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous; see https://juliaai.github.io/ScientificTypes.jl/).
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or table respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using RandomWalkOversampler, returning both the new and original observations

Example

using MLJ
 using ScientificTypes
 import Imbalance
@@ -36,4 +36,4 @@
 julia> Imbalance.checkbalance(yover)
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%)
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%)
diff --git a/dev/models/RecursiveFeatureElimination_FeatureSelection/index.html b/dev/models/RecursiveFeatureElimination_FeatureSelection/index.html index 04733b448..29129e9c6 100644 --- a/dev/models/RecursiveFeatureElimination_FeatureSelection/index.html +++ b/dev/models/RecursiveFeatureElimination_FeatureSelection/index.html @@ -1,5 +1,5 @@ -RecursiveFeatureElimination · MLJ

RecursiveFeatureElimination

RecursiveFeatureElimination(model, n_features, step)

This model implements a recursive feature elimination algorithm for feature selection. It recursively removes features, training a base model on the remaining features and evaluating their importance until the desired number of features is selected.

Construct an instance with default hyper-parameters using the syntax rfe_model = RecursiveFeatureElimination(model=...). Provide keyword arguments to override hyper-parameter defaults.

Training data

In MLJ or MLJBase, bind an instance rfe_model to data with

mach = machine(rfe_model, X, y)

OR, if the base model supports weights, as

mach = machine(rfe_model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns have the scitype required by the base model; check column scitypes with schema(X) and the column scitypes required by the base model with input_scitype(basemodel).
  • y is the target, which can be any table of responses whose element scitype is Continuous or Finite, depending on the target_scitype required by the base model; check the scitype with scitype(y).
  • w is the observation weights, which can be either nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from weights kernel, which is a hyperparameter to the model; see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • model: A base model with a fit method that provides information on feature importance (i.e. reports_feature_importances(model) == true)
  • n_features::Real = 0: The number of features to select. If 0, half of the features are selected. If a positive integer, the parameter is the absolute number of features to select. If a real number between 0 and 1, it is the fraction of features to select.
  • step::Real=1: If the value of step is at least 1, it signifies the quantity of features to eliminate in each iteration. Conversely, if step falls strictly within the range of 0.0 to 1.0, it denotes the proportion (rounded down) of features to remove during each iteration.
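
For concreteness, here is a minimal sketch of the three n_features conventions just described; the random forest base model is only one example of a model that reports feature importances, and the data binding and fitting steps are as documented elsewhere on this page:

using MLJ, FeatureSelection

## any base model reporting feature importances will do
RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree
base = RandomForestRegressor()

rfe_half     = RecursiveFeatureElimination(model=base)                 ## n_features=0 (default): keep half of the features
rfe_absolute = RecursiveFeatureElimination(model=base, n_features=5)   ## keep exactly 5 features
rfe_fraction = RecursiveFeatureElimination(model=base, n_features=0.2) ## keep 20% of the features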

Operations

  • transform(mach, X): transform the input table X into a new table containing only the columns corresponding to the features selected by the RFE algorithm.
  • predict(mach, X): transform the input table X as in transform(mach, X) above, then predict using the fitted base model on the transformed table.

Fitted parameters

The fields of fitted_params(mach) are:

  • features_left: names of features remaining after recursive feature elimination.
  • model_fitresult: fitted parameters of the base model.

Report

The fields of report(mach) are:

  • ranking: The ranking of each feature in the training dataset.
  • model_report: report for the fitted base model.
  • features: names of features seen during the training process.

Examples

using FeatureSelection, MLJ, StableRNGs
+RecursiveFeatureElimination · MLJ

RecursiveFeatureElimination

RecursiveFeatureElimination(model, n_features, step)

This model implements a recursive feature elimination algorithm for feature selection. It recursively removes features, training a base model on the remaining features and evaluating their importance until the desired number of features is selected.

Construct an instance with default hyper-parameters using the syntax rfe_model = RecursiveFeatureElimination(model=...). Provide keyword arguments to override hyper-parameter defaults.

Training data

In MLJ or MLJBase, bind an instance rfe_model to data with

mach = machine(rfe_model, X, y)

OR, if the base model supports weights, as

mach = machine(rfe_model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns have the scitype required by the base model; check column scitypes with schema(X) and the scitypes required by the base model with input_scitype(basemodel).
  • y is the target, which can be any table of responses whose element scitype is Continuous or Finite, depending on the target_scitype required by the base model; check the scitype with scitype(y).
  • w is the vector of observation weights, which can be either nothing (default) or an AbstractVector whose element scitype is Count or Continuous. This is different from any weights hyper-parameter of the base model itself; see below.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • model: A base model with a fit method that provides information on feature importance (i.e. reports_feature_importances(model) == true)
  • n_features::Real = 0: The number of features to select. If 0, half of the features are selected. If a positive integer, the parameter is the absolute number of features to select. If a real number between 0 and 1, it is the fraction of features to select.
  • step::Real=1: If the value of step is at least 1, it signifies the quantity of features to eliminate in each iteration. Conversely, if step falls strictly within the range of 0.0 to 1.0, it denotes the proportion (rounded down) of features to remove during each iteration.

Operations

  • transform(mach, X): transform the input table X into a new table containing only the columns corresponding to the features selected by the RFE algorithm.
  • predict(mach, X): transform the input table X as in transform(mach, X) above, then predict using the fitted base model on the transformed table.

Fitted parameters

The fields of fitted_params(mach) are:

  • features_left: names of features remaining after recursive feature elimination.
  • model_fitresult: fitted parameters of the base model.

Report

The fields of report(mach) are:

  • ranking: The ranking of each feature in the training dataset.
  • model_report: report for the fitted base model.
  • features: names of features seen during the training process.

Examples

using FeatureSelection, MLJ, StableRNGs
 
 RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree
 
@@ -22,4 +22,4 @@
 ## predict using the base model
 Xnew = MLJ.table(rand(rng, 50, 10));
 predict(mach, Xnew)
-
+
diff --git a/dev/models/Resampler_MLJBase/index.html b/dev/models/Resampler_MLJBase/index.html index 1e3441dc2..ac8bd0a91 100644 --- a/dev/models/Resampler_MLJBase/index.html +++ b/dev/models/Resampler_MLJBase/index.html @@ -1,5 +1,5 @@ -Resampler · MLJ

Resampler

resampler = Resampler(
+Resampler · MLJ

Resampler

resampler = Resampler(
     model=ConstantRegressor(),
     resampling=CV(),
     measure=nothing,
@@ -12,4 +12,4 @@
     per_observation=true,
     logger=nothing,
     compact=false,
-)

Private method. Use at own risk.

Resampling model wrapper, used internally by the fit method of TunedModel instances and IteratedModel instances. See evaluate! for the meaning of the options. Not intended for use by the general user, who will ordinarily use evaluate! directly.

Given a machine mach = machine(resampler, args...) one obtains a performance evaluation of the specified model, performed according to the prescribed resampling strategy and other parameters, using data args..., by calling fit!(mach) followed by evaluate(mach).
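
Although Resampler is a private method, the workflow just described can be sketched as follows; the choice of ConstantRegressor, measure, and synthetic data here are illustrative assumptions only:

using MLJ, MLJBase

ConstantRegressor = @load ConstantRegressor pkg=MLJModels  ## any regressor would do
X, y = make_regression(50, 3)

resampler = MLJBase.Resampler(
    model=ConstantRegressor(),
    resampling=CV(nfolds=3),
    measure=rms,
)
mach = machine(resampler, X, y)
fit!(mach)
evaluate(mach)   ## the performance evaluation described above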

On subsequent calls to fit!(mach) new train/test pairs of row indices are only regenerated if resampling, repeats or cache fields of resampler have changed. The evolution of an RNG field of resampler does not constitute a change (== for MLJType objects is not sensitive to such changes; see is_same_except).

If there is a single train/test pair, then the warm-restart behaviour of the wrapped model resampler.model extends to warm-restart behaviour of the wrapper resampler, with respect to mutations of the wrapped model.

The sample weights are passed to the specified performance measures that support weights for evaluation. These weights are not to be confused with any weights bound to a Resampler instance in a machine, used for training the wrapped model when supported.

The sample class_weights are passed to the specified performance measures that support per-class weights for evaluation. These weights are not to be confused with any weights bound to a Resampler instance in a machine, used for training the wrapped model when supported.

+)

Private method. Use at own risk.

Resampling model wrapper, used internally by the fit method of TunedModel instances and IteratedModel instances. See evaluate! for the meaning of the options. Not intended for use by the general user, who will ordinarily use evaluate! directly.

Given a machine mach = machine(resampler, args...) one obtains a performance evaluation of the specified model, performed according to the prescribed resampling strategy and other parameters, using data args..., by calling fit!(mach) followed by evaluate(mach).

On subsequent calls to fit!(mach) new train/test pairs of row indices are only regenerated if resampling, repeats or cache fields of resampler have changed. The evolution of an RNG field of resampler does not constitute a change (== for MLJType objects is not sensitive to such changes; see is_same_except).

If there is a single train/test pair, then the warm-restart behaviour of the wrapped model resampler.model extends to warm-restart behaviour of the wrapper resampler, with respect to mutations of the wrapped model.

The sample weights are passed to the specified performance measures that support weights for evaluation. These weights are not to be confused with any weights bound to a Resampler instance in a machine, used for training the wrapped model when supported.

The sample class_weights are passed to the specified performance measures that support per-class weights for evaluation. These weights are not to be confused with any weights bound to a Resampler instance in a machine, used for training the wrapped model when supported.

diff --git a/dev/models/RidgeCVClassifier_MLJScikitLearnInterface/index.html b/dev/models/RidgeCVClassifier_MLJScikitLearnInterface/index.html index 047886696..76c0c36cf 100644 --- a/dev/models/RidgeCVClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/RidgeCVClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -RidgeCVClassifier · MLJ

RidgeCVClassifier

RidgeCVClassifier

A model type for constructing a ridge regression classifier with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeCVClassifier = @load RidgeCVClassifier pkg=MLJScikitLearnInterface

Do model = RidgeCVClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeCVClassifier(alphas=...).

Hyper-parameters

  • alphas = [0.1, 1.0, 10.0]
  • fit_intercept = true
  • scoring = nothing
  • cv = 5
  • class_weight = nothing
  • store_cv_values = false
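
A minimal usage sketch, assuming MLJScikitLearnInterface (and the underlying scikit-learn installation) is available; the synthetic data and hyper-parameter values are illustrative only:

using MLJ

RidgeCVClassifier = @load RidgeCVClassifier pkg=MLJScikitLearnInterface
model = RidgeCVClassifier(alphas=[0.01, 0.1, 1.0, 10.0], cv=3)

X, y = make_blobs(100, 3)          ## toy three-class data
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)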
+RidgeCVClassifier · MLJ

RidgeCVClassifier

RidgeCVClassifier

A model type for constructing a ridge regression classifier with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeCVClassifier = @load RidgeCVClassifier pkg=MLJScikitLearnInterface

Do model = RidgeCVClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeCVClassifier(alphas=...).

Hyper-parameters

  • alphas = [0.1, 1.0, 10.0]
  • fit_intercept = true
  • scoring = nothing
  • cv = 5
  • class_weight = nothing
  • store_cv_values = false
diff --git a/dev/models/RidgeCVRegressor_MLJScikitLearnInterface/index.html b/dev/models/RidgeCVRegressor_MLJScikitLearnInterface/index.html index 3bfe43b45..9a60a7d89 100644 --- a/dev/models/RidgeCVRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/RidgeCVRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -RidgeCVRegressor · MLJ

RidgeCVRegressor

RidgeCVRegressor

A model type for constructing a ridge regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeCVRegressor = @load RidgeCVRegressor pkg=MLJScikitLearnInterface

Do model = RidgeCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeCVRegressor(alphas=...).

Hyper-parameters

  • alphas = (0.1, 1.0, 10.0)
  • fit_intercept = true
  • scoring = nothing
  • cv = 5
  • gcv_mode = nothing
  • store_cv_values = false
+RidgeCVRegressor · MLJ

RidgeCVRegressor

RidgeCVRegressor

A model type for constructing a ridge regressor with built-in cross-validation, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeCVRegressor = @load RidgeCVRegressor pkg=MLJScikitLearnInterface

Do model = RidgeCVRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeCVRegressor(alphas=...).

Hyper-parameters

  • alphas = (0.1, 1.0, 10.0)
  • fit_intercept = true
  • scoring = nothing
  • cv = 5
  • gcv_mode = nothing
  • store_cv_values = false
diff --git a/dev/models/RidgeClassifier_MLJScikitLearnInterface/index.html b/dev/models/RidgeClassifier_MLJScikitLearnInterface/index.html index fa734ed04..a8384ae48 100644 --- a/dev/models/RidgeClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/RidgeClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -RidgeClassifier · MLJ

RidgeClassifier

RidgeClassifier

A model type for constructing a ridge regression classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeClassifier = @load RidgeClassifier pkg=MLJScikitLearnInterface

Do model = RidgeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeClassifier(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • copy_X = true
  • max_iter = nothing
  • tol = 0.001
  • class_weight = nothing
  • solver = auto
  • random_state = nothing
+RidgeClassifier · MLJ

RidgeClassifier

RidgeClassifier

A model type for constructing a ridge regression classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeClassifier = @load RidgeClassifier pkg=MLJScikitLearnInterface

Do model = RidgeClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeClassifier(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • copy_X = true
  • max_iter = nothing
  • tol = 0.001
  • class_weight = nothing
  • solver = auto
  • random_state = nothing
diff --git a/dev/models/RidgeRegressor_MLJLinearModels/index.html b/dev/models/RidgeRegressor_MLJLinearModels/index.html index f37f3ec93..884b9d607 100644 --- a/dev/models/RidgeRegressor_MLJLinearModels/index.html +++ b/dev/models/RidgeRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -RidgeRegressor · MLJ

RidgeRegressor

RidgeRegressor

A model type for constructing a ridge regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels

Do model = RidgeRegressor() to construct an instance with default hyper-parameters.

Ridge regression is a linear model with objective function

$|Xθ - y|₂²/2 + n⋅λ|θ|₂²/2$

where $n$ is the number of observations.

If scale_penalty_with_samples = false then the objective function is instead

$|Xθ - y|₂²/2 + λ|θ|₂²/2$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the L2 regularization. Default: 1.0
  • fit_intercept::Bool: whether to fit the intercept or not. Default: true
  • penalize_intercept::Bool: whether to penalize the intercept. Default: false
  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true
  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.Analytical. Use Analytical() for Cholesky and CG()=Analytical(iterative=true) for conjugate-gradient. If solver = nothing (default) then Analytical() is used. Default: nothing

Example

using MLJ
+RidgeRegressor · MLJ

RidgeRegressor

RidgeRegressor

A model type for constructing a ridge regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels

Do model = RidgeRegressor() to construct an instance with default hyper-parameters.

Ridge regression is a linear model with objective function

$|Xθ - y|₂²/2 + n⋅λ|θ|₂²/2$

where $n$ is the number of observations.

If scale_penalty_with_samples = false then the objective function is instead

$|Xθ - y|₂²/2 + λ|θ|₂²/2$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • lambda::Real: strength of the L2 regularization. Default: 1.0
  • fit_intercept::Bool: whether to fit the intercept or not. Default: true
  • penalize_intercept::Bool: whether to penalize the intercept. Default: false
  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true
  • solver::Union{Nothing, MLJLinearModels.Solver}: any instance of MLJLinearModels.Analytical. Use Analytical() for Cholesky and CG()=Analytical(iterative=true) for conjugate-gradient. If solver = nothing (default) then Analytical() is used. Default: nothing

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(RidgeRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also ElasticNetRegressor.

+fitted_params(mach)

See also ElasticNetRegressor.

diff --git a/dev/models/RidgeRegressor_MLJScikitLearnInterface/index.html b/dev/models/RidgeRegressor_MLJScikitLearnInterface/index.html index 80b721820..fc1247cd5 100644 --- a/dev/models/RidgeRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/RidgeRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -RidgeRegressor · MLJ

RidgeRegressor

RidgeRegressor

A model type for constructing a ridge regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeRegressor = @load RidgeRegressor pkg=MLJScikitLearnInterface

Do model = RidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • copy_X = true
  • max_iter = 1000
  • tol = 0.0001
  • solver = auto
  • random_state = nothing
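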
+RidgeRegressor · MLJ

RidgeRegressor

RidgeRegressor

A model type for constructing a ridge regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeRegressor = @load RidgeRegressor pkg=MLJScikitLearnInterface

Do model = RidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeRegressor(alpha=...).

Hyper-parameters

  • alpha = 1.0
  • fit_intercept = true
  • copy_X = true
  • max_iter = 1000
  • tol = 0.0001
  • solver = auto
  • random_state = nothing
diff --git a/dev/models/RidgeRegressor_MultivariateStats/index.html b/dev/models/RidgeRegressor_MultivariateStats/index.html index 29480fdbf..53c1ed182 100644 --- a/dev/models/RidgeRegressor_MultivariateStats/index.html +++ b/dev/models/RidgeRegressor_MultivariateStats/index.html @@ -1,5 +1,5 @@ -RidgeRegressor · MLJ

RidgeRegressor

RidgeRegressor

A model type for constructing a ridge regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeRegressor = @load RidgeRegressor pkg=MultivariateStats

Do model = RidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeRegressor(lambda=...).

RidgeRegressor adds a quadratic penalty term to least squares regression, for regularization. Ridge regression is particularly useful in the case of multicollinearity. Options exist to specify a bias term, and to adjust the strength of the penalty term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • lambda=1.0: The non-negative parameter controlling the regularization strength. If lambda is 0, ridge regression is equivalent to linear least squares regression, and as lambda approaches infinity, all the linear coefficients approach 0.
  • bias=true: Include the bias term if true, otherwise fit without bias term.

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
+RidgeRegressor · MLJ

RidgeRegressor

RidgeRegressor

A model type for constructing a ridge regressor, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RidgeRegressor = @load RidgeRegressor pkg=MultivariateStats

Do model = RidgeRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in RidgeRegressor(lambda=...).

RidgeRegressor adds a quadratic penalty term to least squares regression, for regularization. Ridge regression is particularly useful in the case of multicollinearity. Options exist to specify a bias term, and to adjust the strength of the penalty term.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • lambda=1.0: The non-negative parameter controlling the regularization strength. If lambda is 0, ridge regression is equivalent to linear least squares regression, and as lambda approaches infinity, all the linear coefficients approach 0.
  • bias=true: Include the bias term if true, otherwise fit without bias term.

Operations

  • predict(mach, Xnew): Return predictions of the target given new features Xnew, which should have the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • coefficients: The linear coefficients determined by the model.
  • intercept: The intercept determined by the model.

Examples

using MLJ
 
 RidgeRegressor = @load RidgeRegressor pkg=MultivariateStats
 pipe = Standardizer() |> RidgeRegressor(lambda=10)
@@ -8,4 +8,4 @@
 
 mach = machine(pipe, X, y) |> fit!
 yhat = predict(mach, X)
-training_error = l1(yhat, y) |> mean

See also LinearRegressor, MultitargetLinearRegressor, MultitargetRidgeRegressor

+training_error = l1(yhat, y) |> mean

See also LinearRegressor, MultitargetLinearRegressor, MultitargetRidgeRegressor

diff --git a/dev/models/RobustRegressor_MLJLinearModels/index.html b/dev/models/RobustRegressor_MLJLinearModels/index.html index 34cce57fd..90bd7f013 100644 --- a/dev/models/RobustRegressor_MLJLinearModels/index.html +++ b/dev/models/RobustRegressor_MLJLinearModels/index.html @@ -1,6 +1,6 @@ -RobustRegressor · MLJ

RobustRegressor

RobustRegressor

A model type for constructing a robust regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RobustRegressor = @load RobustRegressor pkg=MLJLinearModels

Do model = RobustRegressor() to construct an instance with default hyper-parameters.

Robust regression is a linear model with objective function

$∑ρ(Xθ - y) + n⋅λ|θ|₂² + n⋅γ|θ|₁$

where $ρ$ is a robust loss function (e.g. the Huber function) and $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is instead

$∑ρ(Xθ - y) + λ|θ|₂² + γ|θ|₁$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • rho::MLJLinearModels.RobustRho: the type of robust loss, which can be any instance of MLJLinearModels.L where L is one of: AndrewsRho, BisquareRho, FairRho, HuberRho, LogisticRho, QuantileRho, TalwarRho. Default: HuberRho(0.1)

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, Newton, NewtonCG, if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
+RobustRegressor · MLJ

RobustRegressor

RobustRegressor

A model type for constructing a robust regressor, based on MLJLinearModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

RobustRegressor = @load RobustRegressor pkg=MLJLinearModels

Do model = RobustRegressor() to construct an instance with default hyper-parameters.

Robust regression is a linear model with objective function

$∑ρ(Xθ - y) + n⋅λ|θ|₂² + n⋅γ|θ|₁$

where $ρ$ is a robust loss function (e.g. the Huber function) and $n$ is the number of observations.

If scale_penalty_with_samples = false the objective function is instead

$∑ρ(Xθ - y) + λ|θ|₂² + γ|θ|₁$.

Different solver options exist, as indicated under "Hyperparameters" below.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where:

  • X is any table of input features (eg, a DataFrame) whose columns have Continuous scitype; check column scitypes with schema(X)
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)

Train the machine using fit!(mach, rows=...).

Hyperparameters

  • rho::MLJLinearModels.RobustRho: the type of robust loss, which can be any instance of MLJLinearModels.L where L is one of: AndrewsRho, BisquareRho, FairRho, HuberRho, LogisticRho, QuantileRho, TalwarRho. Default: HuberRho(0.1)

  • lambda::Real: strength of the regularizer if penalty is :l2 or :l1. Strength of the L2 regularizer if penalty is :en. Default: 1.0

  • gamma::Real: strength of the L1 regularizer if penalty is :en. Default: 0.0

  • penalty::Union{String, Symbol}: the penalty to use, either :l2, :l1, :en (elastic net) or :none. Default: :l2

  • fit_intercept::Bool: whether to fit the intercept or not. Default: true

  • penalize_intercept::Bool: whether to penalize the intercept. Default: false

  • scale_penalty_with_samples::Bool: whether to scale the penalty with the number of observations. Default: true

  • solver::Union{Nothing, MLJLinearModels.Solver}: some instance of MLJLinearModels.S where S is one of: LBFGS, IWLSCG, Newton, NewtonCG, if penalty = :l2, and ProxGrad otherwise.

    If solver = nothing (default) then LBFGS() is used, if penalty = :l2, and otherwise ProxGrad(accel=true) (FISTA) is used.

    Solver aliases: FISTA(; kwargs...) = ProxGrad(accel=true, kwargs...), ISTA(; kwargs...) = ProxGrad(accel=false, kwargs...) Default: nothing

Example

using MLJ
 X, y = make_regression()
 mach = fit!(machine(RobustRegressor(), X, y))
 predict(mach, X)
-fitted_params(mach)

See also HuberRegressor, QuantileRegressor.

+fitted_params(mach)

See also HuberRegressor, QuantileRegressor.

diff --git a/dev/models/SGDClassifier_MLJScikitLearnInterface/index.html b/dev/models/SGDClassifier_MLJScikitLearnInterface/index.html index 97cfbb863..be349b1bb 100644 --- a/dev/models/SGDClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/SGDClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SGDClassifier · MLJ

SGDClassifier

SGDClassifier

A model type for constructing an SGD (stochastic gradient descent) classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SGDClassifier = @load SGDClassifier pkg=MLJScikitLearnInterface

Do model = SGDClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SGDClassifier(loss=...).

Hyper-parameters

  • loss = hinge
  • penalty = l2
  • alpha = 0.0001
  • l1_ratio = 0.15
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • epsilon = 0.1
  • n_jobs = nothing
  • random_state = nothing
  • learning_rate = optimal
  • eta0 = 0.0
  • power_t = 0.5
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • class_weight = nothing
  • warm_start = false
  • average = false
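
A minimal usage sketch; the synthetic data and the early-stopping settings (taken from the hyper-parameter list above) are illustrative only, and MLJScikitLearnInterface with scikit-learn must be installed:

using MLJ

SGDClassifier = @load SGDClassifier pkg=MLJScikitLearnInterface
model = SGDClassifier(max_iter=2000, early_stopping=true, validation_fraction=0.2)

X, y = make_blobs(200, 5)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)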
+SGDClassifier · MLJ

SGDClassifier

SGDClassifier

A model type for constructing an SGD (stochastic gradient descent) classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SGDClassifier = @load SGDClassifier pkg=MLJScikitLearnInterface

Do model = SGDClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SGDClassifier(loss=...).

Hyper-parameters

  • loss = hinge
  • penalty = l2
  • alpha = 0.0001
  • l1_ratio = 0.15
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • epsilon = 0.1
  • n_jobs = nothing
  • random_state = nothing
  • learning_rate = optimal
  • eta0 = 0.0
  • power_t = 0.5
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • class_weight = nothing
  • warm_start = false
  • average = false
diff --git a/dev/models/SGDRegressor_MLJScikitLearnInterface/index.html b/dev/models/SGDRegressor_MLJScikitLearnInterface/index.html index 888fac941..03fff286e 100644 --- a/dev/models/SGDRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/SGDRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SGDRegressor · MLJ

SGDRegressor

SGDRegressor

A model type for constructing a stochastic gradient descent-based regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SGDRegressor = @load SGDRegressor pkg=MLJScikitLearnInterface

Do model = SGDRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SGDRegressor(loss=...).

Hyper-parameters

  • loss = squared_error
  • penalty = l2
  • alpha = 0.0001
  • l1_ratio = 0.15
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • epsilon = 0.1
  • random_state = nothing
  • learning_rate = invscaling
  • eta0 = 0.01
  • power_t = 0.25
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • warm_start = false
  • average = false
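
A minimal usage sketch; the synthetic data and hyper-parameter values are illustrative only, and MLJScikitLearnInterface with scikit-learn must be installed:

using MLJ

SGDRegressor = @load SGDRegressor pkg=MLJScikitLearnInterface
model = SGDRegressor(max_iter=2000, tol=1e-4)

X, y = make_regression(100, 3)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)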
+SGDRegressor · MLJ

SGDRegressor

SGDRegressor

A model type for constructing a stochastic gradient descent-based regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SGDRegressor = @load SGDRegressor pkg=MLJScikitLearnInterface

Do model = SGDRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SGDRegressor(loss=...).

Hyper-parameters

  • loss = squared_error
  • penalty = l2
  • alpha = 0.0001
  • l1_ratio = 0.15
  • fit_intercept = true
  • max_iter = 1000
  • tol = 0.001
  • shuffle = true
  • verbose = 0
  • epsilon = 0.1
  • random_state = nothing
  • learning_rate = invscaling
  • eta0 = 0.01
  • power_t = 0.25
  • early_stopping = false
  • validation_fraction = 0.1
  • n_iter_no_change = 5
  • warm_start = false
  • average = false
diff --git a/dev/models/SMOTENC_Imbalance/index.html b/dev/models/SMOTENC_Imbalance/index.html index 57cdda7b2..efd0dff15 100644 --- a/dev/models/SMOTENC_Imbalance/index.html +++ b/dev/models/SMOTENC_Imbalance/index.html @@ -1,5 +1,5 @@ -SMOTENC · MLJ

SMOTENC

Initiate a SMOTENC model with the given hyper-parameters.

SMOTENC

A model type for constructing a smotenc, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SMOTENC = @load SMOTENC pkg=Imbalance

Do model = SMOTENC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTENC(k=...).

SMOTENC implements the SMOTENC algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O.Hall, W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = SMOTENC()

Hyperparameters

  • k=5: Number of nearest neighbors to consider in the SMOTENC algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • knn_tree: Decides the tree used in KNN computations. Either "Brute" or "Ball". BallTree can be much faster but may lead to inaccurate results.

  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia version supports it; otherwise, MersenneTwister is used.
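
To make the ratios and rng options above concrete, here is a hedged sketch; the DataFrame construction and class labels are illustrative assumptions only, not part of the Imbalance.jl documentation:

using MLJ, DataFrames

SMOTENC = @load SMOTENC pkg=Imbalance
X = coerce(DataFrame(a=rand(100), b=rand(["u", "v"], 100)), :b => Multiclass)  ## one continuous, one nominal column
y = rand([0, 0, 0, 1], 100)                                                    ## imbalanced labels

oversampler = SMOTENC(k=5, ratios=Dict(0 => 1.0, 1 => 0.9), rng=42)  ## per-class ratios; seeded rng
mach = machine(oversampler)
Xover, yover = transform(mach, X, y)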

Transform Inputs

  • X: A table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using SMOTENC, returning both the new and original observations

Example

using MLJ
+SMOTENC · MLJ

SMOTENC

Initiate a SMOTENC model with the given hyper-parameters.

SMOTENC

A model type for constructing a smotenc, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SMOTENC = @load SMOTENC pkg=Imbalance

Do model = SMOTENC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTENC(k=...).

SMOTENC implements the SMOTENC algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O.Hall, W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = SMOTENC()

Hyperparameters

  • k=5: Number of nearest neighbors to consider in the SMOTENC algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • knn_tree: Decides the tree used in KNN computations. Either "Brute" or "Ball". BallTree can be much faster but may lead to inaccurate results.

  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia version supports it; otherwise, MersenneTwister is used.

Transform Inputs

  • X: A table with element scitypes that subtype Union{Finite, Infinite}. Elements in nominal columns should subtype Finite (i.e., have scitype OrderedFactor or Multiclass) and elements in continuous columns should subtype Infinite (i.e., have scitype Count or Continuous).
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using SMOTENC, returning both the new and original observations

Example

using MLJ
 using ScientificTypes
 import Imbalance
 
@@ -36,4 +36,4 @@
 julia> Imbalance.checkbalance(yover)
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) 
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%)
diff --git a/dev/models/SMOTEN_Imbalance/index.html b/dev/models/SMOTEN_Imbalance/index.html index 8ac5b7ea4..6af67a046 100644 --- a/dev/models/SMOTEN_Imbalance/index.html +++ b/dev/models/SMOTEN_Imbalance/index.html @@ -1,5 +1,5 @@ -SMOTEN · MLJ

SMOTEN

Initiate a SMOTEN model with the given hyper-parameters.

SMOTEN

A model type for constructing a smoten, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SMOTEN = @load SMOTEN pkg=Imbalance

Do model = SMOTEN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTEN(k=...).

SMOTEN implements the SMOTEN algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O.Hall, W. P. Kegelmeyer, “SMOTEN: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = SMOTEN()

Hyperparameters

  • k=5: Number of nearest neighbors to consider in the SMOTEN algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia version supports it; otherwise, MersenneTwister is used.

Transform Inputs

  • X: A matrix of integers or a table with element scitypes that subtype Finite. That is, for table inputs each column should have either OrderedFactor or Multiclass as the element scitype.
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using SMOTEN, returning both the new and original observations

Example

using MLJ
+SMOTEN · MLJ

SMOTEN

Initiate a SMOTEN model with the given hyper-parameters.

SMOTEN

A model type for constructing a smoten, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SMOTEN = @load SMOTEN pkg=Imbalance

Do model = SMOTEN() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTEN(k=...).

SMOTEN implements the SMOTEN algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O.Hall, W. P. Kegelmeyer, “SMOTEN: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = SMOTEN()

Hyperparameters

  • k=5: Number of nearest neighbors to consider in the SMOTEN algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia version supports it; otherwise, MersenneTwister is used.

Transform Inputs

  • X: A matrix of integers or a table with element scitypes that subtype Finite. That is, for table inputs each column should have either OrderedFactor or Multiclass as the element scitype.
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using SMOTEN, returning both the new and original observations

Example

using MLJ
 using ScientificTypes
 import Imbalance
 
@@ -37,4 +37,4 @@
 julia> Imbalance.checkbalance(yover)
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) 
+0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%)
diff --git a/dev/models/SMOTE_Imbalance/index.html b/dev/models/SMOTE_Imbalance/index.html index 559c2f9c1..908195154 100644 --- a/dev/models/SMOTE_Imbalance/index.html +++ b/dev/models/SMOTE_Imbalance/index.html @@ -1,5 +1,5 @@ -SMOTE · MLJ

SMOTE

Initiate a SMOTE model with the given hyper-parameters.

SMOTE

A model type for constructing a smote, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SMOTE = @load SMOTE pkg=Imbalance

Do model = SMOTE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTE(k=...).

SMOTE implements the SMOTE algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O.Hall, W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = SMOTE()

Hyperparameters

  • k=5: Number of nearest neighbors to consider in the SMOTE algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia version supports it; otherwise, MersenneTwister is used.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using SMOTE, returning both the new and original observations

Example

using MLJ
+SMOTE · MLJ

SMOTE

Initiate a SMOTE model with the given hyper-parameters.

SMOTE

A model type for constructing a smote, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SMOTE = @load SMOTE pkg=Imbalance

Do model = SMOTE() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SMOTE(k=...).

SMOTE implements the SMOTE algorithm to correct for class imbalance as in N. V. Chawla, K. W. Bowyer, L. O.Hall, W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of artificial intelligence research, 321-357, 2002.

Training data

In MLJ or MLJBase, wrap the model in a machine by

mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by

model = SMOTE()

Hyperparameters

  • k=5: Number of nearest neighbors to consider in the SMOTE algorithm. Should be within the range [1, n - 1], where n is the number of observations; otherwise set to the nearest of these two values.

  • ratios=1.0: A parameter that controls the amount of oversampling to be done for each class

    • Can be a float and in this case each class will be oversampled to the size of the majority class times the float. By default, all classes are oversampled to the size of the majority class
    • Can be a dictionary mapping each class label to the float ratio for that class
  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia version supports it; otherwise, MersenneTwister is used.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • Xover: A matrix or table that includes the original data and the new observations due to oversampling, depending on whether the input X is a matrix or a table, respectively
  • yover: An abstract vector of labels corresponding to Xover

Operations

  • transform(mach, X, y): resample the data X and y using SMOTE, returning both the new and original observations

Example

using MLJ
 import Imbalance
 
 ## set probability of each class
@@ -28,4 +28,4 @@
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38 (79.2%) 
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 43 (89.6%) 
 0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 48 (100.0%) 
-
+
diff --git a/dev/models/SODDetector_OutlierDetectionPython/index.html b/dev/models/SODDetector_OutlierDetectionPython/index.html index f00ae6a03..0a598275f 100644 --- a/dev/models/SODDetector_OutlierDetectionPython/index.html +++ b/dev/models/SODDetector_OutlierDetectionPython/index.html @@ -1,4 +1,4 @@ -SODDetector · MLJ

SODDetector

SODDetector(n_neighbors = 5,
+SODDetector · MLJ
+               alpha = 0.8)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.sod
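
A minimal, hedged sketch of loading and constructing the detector; the package name is inferred from the interface package, PyOD must be available via OutlierDetectionPython, and downstream use follows the usual MLJ detector workflow (not shown here):

using MLJ

SODDetector = @load SODDetector pkg=OutlierDetectionPython
detector = SODDetector(n_neighbors=10, alpha=0.8)   ## defaults are n_neighbors = 5, alpha = 0.8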

diff --git a/dev/models/SOSDetector_OutlierDetectionPython/index.html b/dev/models/SOSDetector_OutlierDetectionPython/index.html index 938cf893b..5f58d3173 100644 --- a/dev/models/SOSDetector_OutlierDetectionPython/index.html +++ b/dev/models/SOSDetector_OutlierDetectionPython/index.html @@ -1,4 +1,4 @@ -SOSDetector · MLJ

SOSDetector

SOSDetector(perplexity = 4.5,
+SOSDetector · MLJ
+               eps = 1e-5)

https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.sos

diff --git a/dev/models/SRRegressor_SymbolicRegression/index.html b/dev/models/SRRegressor_SymbolicRegression/index.html index 02d3ab807..8fbb1dd0e 100644 --- a/dev/models/SRRegressor_SymbolicRegression/index.html +++ b/dev/models/SRRegressor_SymbolicRegression/index.html @@ -1,5 +1,5 @@ -SRRegressor · MLJ

SRRegressor

SRRegressor

A model type for constructing a Symbolic Regression via Evolutionary Search, based on SymbolicRegression.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SRRegressor = @load SRRegressor pkg=SymbolicRegression

Do model = SRRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SRRegressor(binary_operators=...).

Single-target Symbolic Regression regressor (SRRegressor) searches for symbolic expressions that predict a single target variable from a set of input variables. All data is assumed to be Continuous. The search is performed using an evolutionary algorithm. This algorithm is described in the paper https://arxiv.org/abs/2305.01582.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Variable names in discovered expressions will be taken from the column names of X, if available. Units in columns of X (use DynamicQuantities for units) will trigger dimensional analysis to be used.
  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y). Units in y (use DynamicQuantities for units) will trigger dimensional analysis to be used.
  • w is the vector of observation weights, which can be either nothing (default) or an AbstractVector whose element scitype is Count or Continuous.

Train the machine using fit!(mach), inspect the discovered expressions with report(mach), and predict on new data with predict(mach, Xnew). Note that unlike other regressors, symbolic regression stores a list of trained models. The model chosen from this list is defined by the function selection_method keyword argument, which by default balances accuracy and complexity. You can override this at prediction time by passing a named tuple with keys data and idx.
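
A minimal hedged sketch of that workflow, with illustrative operators and synthetic data:

using MLJ

SRRegressor = @load SRRegressor pkg=SymbolicRegression
X, y = make_regression(100, 3)

model = SRRegressor(binary_operators=[+, -, *], unary_operators=[cos])
mach = machine(model, X, y)
fit!(mach)
report(mach)       ## inspect the discovered expressions
predict(mach, X)   ## uses the expression chosen by selection_method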

Hyper-parameters

  • binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.

  • unary_operators: Same, but for unary operators (one input scalar, gives an output scalar).

  • constraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.

  • batching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.

  • batch_size: What batch size to use if using batching.

  • elementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: - LPDistLoss{P}(), - L1DistLoss(), - L2DistLoss() (mean square), - LogitDistLoss(), - HuberLoss(d), - L1EpsilonInsLoss(ϵ), - L2EpsilonInsLoss(ϵ), - PeriodicLoss(c), - QuantileLoss(τ), Classification: - ZeroOneLoss(), - PerceptronLoss(), - L1HingeLoss(), - SmoothedL1HingeLoss(γ), - ModifiedHuberLoss(), - L2MarginLoss(), - ExpLoss(), - SigmoidLoss(), - DWDMarginLoss(q).

  • loss_function: Alternatively, you may redefine the loss used as any function of tree::Node{T}, dataset::Dataset{T}, and options::Options, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,

      function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}
    +SRRegressor · MLJ

    SRRegressor

    SRRegressor

    A model type for constructing a Symbolic Regression via Evolutionary Search, based on SymbolicRegression.jl, and implementing the MLJ model interface.

    From MLJ, the type can be imported using

    SRRegressor = @load SRRegressor pkg=SymbolicRegression

    Do model = SRRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SRRegressor(binary_operators=...).

    Single-target Symbolic Regression regressor (SRRegressor) searches for symbolic expressions that predict a single target variable from a set of input variables. All data is assumed to be Continuous. The search is performed using an evolutionary algorithm. This algorithm is described in the paper https://arxiv.org/abs/2305.01582.

    Training data

    In MLJ or MLJBase, bind an instance model to data with

    mach = machine(model, X, y)

    OR

    mach = machine(model, X, y, w)

    Here:

    • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Variable names in discovered expressions will be taken from the column names of X, if available. Units in columns of X (use DynamicQuantities for units) will trigger dimensional analysis to be used.
    • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y). Units in y (use DynamicQuantities for units) will trigger dimensional analysis to be used.
    • w is the vector of observation weights, which can be either nothing (default) or an AbstractVector whose element scitype is Count or Continuous.

    Train the machine using fit!(mach), inspect the discovered expressions with report(mach), and predict on new data with predict(mach, Xnew). Note that, unlike other regressors, symbolic regression stores a list of trained models. The model chosen from this list is determined by the selection_method keyword argument (a function), which by default balances accuracy and complexity. You can override this choice at prediction time by passing a named tuple with keys data and idx, as in the sketch below.
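
    For concreteness, here is a minimal sketch of that workflow on synthetic data. The feature names, the toy target formula, and the niterations value are illustrative only, and the idx=2 call simply demonstrates the named-tuple override mentioned above:

    using MLJ
    SRRegressor = @load SRRegressor pkg=SymbolicRegression
    model = SRRegressor(binary_operators=[+, -, *], niterations=40)
    X = (x1 = rand(100), x2 = rand(100))          ## any table of Continuous features
    y = @. 2.3 * X.x1 - 0.5 * X.x2                ## toy target
    mach = machine(model, X, y)
    fit!(mach)
    r = report(mach)                              ## discovered expressions, one per complexity
    println(r.equation_strings[r.best_idx])       ## expression chosen by selection_method
    y_hat = predict(mach, X)                      ## predictions with that expression
    y_alt = predict(mach, (data=X, idx=2))        ## override: use the second expression instead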

    Hyper-parameters

    • binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.

    • unary_operators: Same, but for unary operators (one input scalar, gives an output scalar).

    • constraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.

    • batching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.

    • batch_size: What batch size to use if using batching.

    • elementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses:

      Regression: LPDistLoss{P}(), L1DistLoss(), L2DistLoss() (mean square), LogitDistLoss(), HuberLoss(d), L1EpsilonInsLoss(ϵ), L2EpsilonInsLoss(ϵ), PeriodicLoss(c), QuantileLoss(τ).

      Classification: ZeroOneLoss(), PerceptronLoss(), L1HingeLoss(), SmoothedL1HingeLoss(γ), ModifiedHuberLoss(), L2MarginLoss(), ExpLoss(), SigmoidLoss(), DWDMarginLoss(q).

    • loss_function: Alternatively, you may redefine the loss used as any function of tree::Node{T}, dataset::Dataset{T}, and options::Options, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,

        function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}
             prediction, flag = eval_tree_array(tree, dataset.X, options)
             if !flag
                 return L(Inf)
      @@ -26,4 +26,4 @@
       y_hat = predict(mach, X)
       ## View the equation used:
       r = report(mach)
      -println("Equation used:", r.equation_strings[r.best_idx])

      See also MultitargetSRRegressor.

    +println("Equation used:", r.equation_strings[r.best_idx])

    See also MultitargetSRRegressor.

diff --git a/dev/models/SVC_LIBSVM/index.html b/dev/models/SVC_LIBSVM/index.html index 228c7989b..169eb79b5 100644 --- a/dev/models/SVC_LIBSVM/index.html +++ b/dev/models/SVC_LIBSVM/index.html @@ -1,5 +1,5 @@ -SVC · MLJ

SVC

SVC

A model type for constructing a C-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVC = @load SVC pkg=LIBSVM

Do model = SVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVC(kernel=...).

This model predicts actual class labels. To predict probabilities, use instead ProbabilisticSVC.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
+SVC · MLJ

SVC

SVC

A model type for constructing a C-support vector classifier, based on LIBSVM.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVC = @load SVC pkg=LIBSVM

Do model = SVC() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVC(kernel=...).

This model predicts actual class labels. To predict probabilities, use instead ProbabilisticSVC.

Reference for algorithm and core C-library: C.-C. Chang and C.-J. Lin (2011): "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27. Updated at https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf.

Training data

In MLJ or MLJBase, bind an instance model to data with one of:

mach = machine(model, X, y)
 mach = machine(model, X, y, w)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have Continuous element scitype; check column scitypes with schema(X)
  • y: is the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)
  • w: a dictionary of class weights, keyed on levels(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • kernel=LIBSVM.Kernel.RadialBasis: either an object that can be called, as in kernel(x1, x2), or one of the built-in kernels from the LIBSVM.jl package listed below. Here x1 and x2 are vectors whose lengths match the number of columns of the training data X (see "Examples" below, and the user-defined kernel sketch following this hyper-parameter list).

    • LIBSVM.Kernel.Linear: (x1, x2) -> x1'*x2
    • LIBSVM.Kernel.Polynomial: (x1, x2) -> (gamma*x1'*x2 + coef0)^degree
    • LIBSVM.Kernel.RadialBasis: (x1, x2) -> exp(-gamma*norm(x1 - x2)^2)
    • LIBSVM.Kernel.Sigmoid: (x1, x2) -> tanh(gamma*x1'*x2 + coef0)

    Here gamma, coef0 and degree are further hyper-parameters. Serialization of models with user-defined kernels comes with some restrictions; see LIBSVM.jl issue 91.

  • gamma = 0.0: kernel parameter (see above); if gamma==-1.0 then gamma = 1/nfeatures is used in training, where nfeatures is the number of features (columns of X). If gamma==0.0 then gamma = 1/(var(Tables.matrix(X))*nfeatures) is used. Actual value used appears in the report (see below).

  • coef0 = 0.0: kernel parameter (see above)

  • degree::Int32 = Int32(3): degree in polynomial kernel (see above)

  • cost=1.0 (range (0, Inf)): the parameter denoted C in the cited reference; for greater regularization, decrease cost

  • cachesize=200.0: cache memory size in MB

  • tolerance=0.001: tolerance for the stopping criterion

  • shrinking=true: whether to use shrinking heuristics
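
As noted under kernel above, any callable of two feature vectors can be supplied. A brief sketch of a user-defined kernel; the Gaussian-style kernel and the use of the iris data below are illustrative only:

using MLJ
import LinearAlgebra: norm
SVC = @load SVC pkg=LIBSVM
my_kernel(x1, x2) = exp(-norm(x1 - x2)^2 / 2)   ## any callable kernel(x1, x2) will do
model = SVC(kernel=my_kernel, cost=1.0)
X, y = @load_iris
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)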

Operations

  • predict(mach, Xnew): return predictions of the target given features Xnew having the same scitype as X above.

Fitted parameters

The fields of fitted_params(mach) are:

  • libsvm_model: the trained model object created by the LIBSVM.jl package
  • encoding: class encoding used internally by libsvm_model - a dictionary of class labels keyed on the internal integer representation

Report

The fields of report(mach) are:

  • gamma: actual value of the kernel parameter gamma used in training

Examples

Using a built-in kernel

using MLJ
 import LIBSVM
 
@@ -33,4 +33,4 @@
 3-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
  "versicolor"
  "versicolor"
- "versicolor"

See also the classifiers ProbabilisticSVC, NuSVC and LinearSVC, as well as LIBSVM.jl and the original C implementation documentation.

+ "versicolor"

See also the classifiers ProbabilisticSVC, NuSVC and LinearSVC, as well as LIBSVM.jl and the original C implementation documentation.

diff --git a/dev/models/SVMClassifier_MLJScikitLearnInterface/index.html b/dev/models/SVMClassifier_MLJScikitLearnInterface/index.html index 34338165c..7ce7b299e 100644 --- a/dev/models/SVMClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/SVMClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SVMClassifier · MLJ

SVMClassifier

SVMClassifier

A model type for constructing a C-support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMClassifier = @load SVMClassifier pkg=MLJScikitLearnInterface

Do model = SVMClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMClassifier(C=...).

Hyper-parameters

  • C = 1.0
  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • shrinking = true
  • tol = 0.001
  • cache_size = 200
  • max_iter = -1
  • decision_function_shape = ovr
  • random_state = nothing
+SVMClassifier · MLJ

SVMClassifier

SVMClassifier

A model type for constructing a C-support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMClassifier = @load SVMClassifier pkg=MLJScikitLearnInterface

Do model = SVMClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMClassifier(C=...).

Hyper-parameters

  • C = 1.0
  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • shrinking = true
  • tol = 0.001
  • cache_size = 200
  • max_iter = -1
  • decision_function_shape = ovr
  • random_state = nothing
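
The parameters above mirror those of the underlying scikit-learn estimator. As a rough usage sketch, assuming the standard MLJ workflow for this wrapper (the synthetic data and the C value are illustrative only):

using MLJ
SVMClassifier = @load SVMClassifier pkg=MLJScikitLearnInterface
model = SVMClassifier(C=0.5)                ## override the regularization strength
X, y = make_blobs(100, 3)                   ## synthetic table of 3 features, 3 classes
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)                     ## predictions for the training features
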
diff --git a/dev/models/SVMLinearClassifier_MLJScikitLearnInterface/index.html b/dev/models/SVMLinearClassifier_MLJScikitLearnInterface/index.html index 9d93c4a48..1234e7863 100644 --- a/dev/models/SVMLinearClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/SVMLinearClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SVMLinearClassifier · MLJ

SVMLinearClassifier

SVMLinearClassifier

A model type for constructing a linear support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMLinearClassifier = @load SVMLinearClassifier pkg=MLJScikitLearnInterface

Do model = SVMLinearClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMLinearClassifier(penalty=...).

Hyper-parameters

  • penalty = l2
  • loss = squared_hinge
  • dual = true
  • tol = 0.0001
  • C = 1.0
  • multi_class = ovr
  • fit_intercept = true
  • intercept_scaling = 1.0
  • random_state = nothing
  • max_iter = 1000
+SVMLinearClassifier · MLJ

SVMLinearClassifier

SVMLinearClassifier

A model type for constructing a linear support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMLinearClassifier = @load SVMLinearClassifier pkg=MLJScikitLearnInterface

Do model = SVMLinearClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMLinearClassifier(penalty=...).

Hyper-parameters

  • penalty = l2
  • loss = squared_hinge
  • dual = true
  • tol = 0.0001
  • C = 1.0
  • multi_class = ovr
  • fit_intercept = true
  • intercept_scaling = 1.0
  • random_state = nothing
  • max_iter = 1000
diff --git a/dev/models/SVMLinearRegressor_MLJScikitLearnInterface/index.html b/dev/models/SVMLinearRegressor_MLJScikitLearnInterface/index.html index 1f7787400..9184d41eb 100644 --- a/dev/models/SVMLinearRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/SVMLinearRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SVMLinearRegressor · MLJ

SVMLinearRegressor

SVMLinearRegressor

A model type for constructing a linear support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMLinearRegressor = @load SVMLinearRegressor pkg=MLJScikitLearnInterface

Do model = SVMLinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMLinearRegressor(epsilon=...).

Hyper-parameters

  • epsilon = 0.0
  • tol = 0.0001
  • C = 1.0
  • loss = epsilon_insensitive
  • fit_intercept = true
  • intercept_scaling = 1.0
  • dual = true
  • random_state = nothing
  • max_iter = 1000
+SVMLinearRegressor · MLJ

SVMLinearRegressor

SVMLinearRegressor

A model type for constructing a linear support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMLinearRegressor = @load SVMLinearRegressor pkg=MLJScikitLearnInterface

Do model = SVMLinearRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMLinearRegressor(epsilon=...).

Hyper-parameters

  • epsilon = 0.0
  • tol = 0.0001
  • C = 1.0
  • loss = epsilon_insensitive
  • fit_intercept = true
  • intercept_scaling = 1.0
  • dual = true
  • random_state = nothing
  • max_iter = 1000
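
For a quick sense of how epsilon and C affect out-of-sample error, one option is MLJ's evaluate with cross-validation. A sketch on synthetic data (all values illustrative):

using MLJ
SVMLinearRegressor = @load SVMLinearRegressor pkg=MLJScikitLearnInterface
model = SVMLinearRegressor(epsilon=0.1, C=1.0)
X, y = make_regression(200, 5)              ## synthetic regression data, 5 features
evaluate(model, X, y, resampling=CV(nfolds=5), measure=rms)
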
diff --git a/dev/models/SVMNuClassifier_MLJScikitLearnInterface/index.html b/dev/models/SVMNuClassifier_MLJScikitLearnInterface/index.html index 178773256..363fa5d19 100644 --- a/dev/models/SVMNuClassifier_MLJScikitLearnInterface/index.html +++ b/dev/models/SVMNuClassifier_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SVMNuClassifier · MLJ

SVMNuClassifier

SVMNuClassifier

A model type for constructing a nu-support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMNuClassifier = @load SVMNuClassifier pkg=MLJScikitLearnInterface

Do model = SVMNuClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMNuClassifier(nu=...).

Hyper-parameters

  • nu = 0.5
  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • shrinking = true
  • tol = 0.001
  • cache_size = 200
  • max_iter = -1
  • decision_function_shape = ovr
  • random_state = nothing
+SVMNuClassifier · MLJ

SVMNuClassifier

SVMNuClassifier

A model type for constructing a nu-support vector classifier, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMNuClassifier = @load SVMNuClassifier pkg=MLJScikitLearnInterface

Do model = SVMNuClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMNuClassifier(nu=...).

Hyper-parameters

  • nu = 0.5
  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • shrinking = true
  • tol = 0.001
  • cache_size = 200
  • max_iter = -1
  • decision_function_shape = ovr
  • random_state = nothing
diff --git a/dev/models/SVMNuRegressor_MLJScikitLearnInterface/index.html b/dev/models/SVMNuRegressor_MLJScikitLearnInterface/index.html index 5d932197e..65d62de58 100644 --- a/dev/models/SVMNuRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/SVMNuRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SVMNuRegressor · MLJ

SVMNuRegressor

SVMNuRegressor

A model type for constructing a nu-support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMNuRegressor = @load SVMNuRegressor pkg=MLJScikitLearnInterface

Do model = SVMNuRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMNuRegressor(nu=...).

Hyper-parameters

  • nu = 0.5
  • C = 1.0
  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • shrinking = true
  • tol = 0.001
  • cache_size = 200
  • max_iter = -1
+SVMNuRegressor · MLJ

SVMNuRegressor

SVMNuRegressor

A model type for constructing a nu-support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMNuRegressor = @load SVMNuRegressor pkg=MLJScikitLearnInterface

Do model = SVMNuRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMNuRegressor(nu=...).

Hyper-parameters

  • nu = 0.5
  • C = 1.0
  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • shrinking = true
  • tol = 0.001
  • cache_size = 200
  • max_iter = -1
diff --git a/dev/models/SVMRegressor_MLJScikitLearnInterface/index.html b/dev/models/SVMRegressor_MLJScikitLearnInterface/index.html index b62ee3258..e15d4ae5c 100644 --- a/dev/models/SVMRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/SVMRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SVMRegressor · MLJ

SVMRegressor

SVMRegressor

A model type for constructing an epsilon-support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMRegressor = @load SVMRegressor pkg=MLJScikitLearnInterface

Do model = SVMRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMRegressor(kernel=...).

Hyper-parameters

  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • tol = 0.001
  • C = 1.0
  • epsilon = 0.1
  • shrinking = true
  • cache_size = 200
  • max_iter = -1
+SVMRegressor · MLJ

SVMRegressor

SVMRegressor

A model type for constructing an epsilon-support vector regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SVMRegressor = @load SVMRegressor pkg=MLJScikitLearnInterface

Do model = SVMRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SVMRegressor(kernel=...).

Hyper-parameters

  • kernel = rbf
  • degree = 3
  • gamma = scale
  • coef0 = 0.0
  • tol = 0.001
  • C = 1.0
  • epsilon = 0.1
  • shrinking = true
  • cache_size = 200
  • max_iter = -1
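
Support vector regression is typically sensitive to feature scaling, so a common pattern is to compose this model with a Standardizer in an MLJ pipeline. A sketch (synthetic data; the epsilon value is arbitrary):

using MLJ
SVMRegressor = @load SVMRegressor pkg=MLJScikitLearnInterface
pipe = Standardizer() |> SVMRegressor(epsilon=0.1)   ## standardize features, then fit the SVM
X, y = make_regression(200, 5)
mach = machine(pipe, X, y) |> fit!
yhat = predict(mach, X)
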
diff --git a/dev/models/SelfOrganizingMap_SelfOrganizingMaps/index.html b/dev/models/SelfOrganizingMap_SelfOrganizingMaps/index.html index d101000db..689a21ea1 100644 --- a/dev/models/SelfOrganizingMap_SelfOrganizingMaps/index.html +++ b/dev/models/SelfOrganizingMap_SelfOrganizingMaps/index.html @@ -1,5 +1,5 @@ -SelfOrganizingMap · MLJ

SelfOrganizingMap

SelfOrganizingMap

A model type for constructing a self organizing map, based on SelfOrganizingMaps.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SelfOrganizingMap = @load SelfOrganizingMap pkg=SelfOrganizingMaps

Do model = SelfOrganizingMap() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SelfOrganizingMap(k=...).

SelfOrganizingMaps implements Kohonen's Self-Organizing Map: Kohonen, T. (1990), "The self-organizing map", Proceedings of the IEEE.

Training data

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X) where

  • X: an AbstractMatrix or Table of input features whose columns are of scitype Continuous.

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • k=10: Number of nodes along one side of the SOM grid, giving k² nodes in total.
  • η=0.5: Learning rate. Scales the adjustments made to the winning node and its neighbors during each round of training.
  • σ²=0.05: The (squared) neighbor radius. Used to determine the scale of neighbor node adjustments.
  • grid_type=:rectangular: Node grid geometry. One of (:rectangular, :hexagonal, :spherical).
  • η_decay=:exponential: Learning rate schedule function. One of (:exponential, :asymptotic).
  • σ_decay=:exponential: Neighbor radius schedule function. One of (:exponential, :asymptotic, :none).
  • neighbor_function=:gaussian: Kernel function used to adjust neighbor weights. Scale is set by σ². One of (:gaussian, :mexican_hat).
  • matching_distance=euclidean: Distance function from Distances.jl used to determine the winning node.
  • Nepochs=1: Number of times to repeat training on the shuffled dataset.

Operations

  • transform(mach, Xnew): returns the coordinates of the winning SOM node for each instance of Xnew. For grid_type :rectangular or :hexagonal, these are Cartesian coordinates. For grid_type :spherical, these are the latitude and longitude in radians.

Fitted parameters

The fields of fitted_params(mach) are:

  • coords: The coordinates of each of the SOM nodes (points in the domain of the map) with shape (k², 2)
  • weights: Array of weight vectors for the SOM nodes (corresponding points in the map's range) of shape (k², input dimension)

Report

The fields of report(mach) are:

  • classes: the index of the winning node for each instance of the training data X interpreted as a class label

Examples

using MLJ
+SelfOrganizingMap · MLJ

SelfOrganizingMap

SelfOrganizingMap

A model type for constructing a self organizing map, based on SelfOrganizingMaps.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SelfOrganizingMap = @load SelfOrganizingMap pkg=SelfOrganizingMaps

Do model = SelfOrganizingMap() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SelfOrganizingMap(k=...).

SelfOrganizingMaps implements Kohonen's Self-Organizing Map: Kohonen, T. (1990), "The self-organizing map", Proceedings of the IEEE.

Training data

In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X) where

  • X: an AbstractMatrix or Table of input features whose columns are of scitype Continuous.

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • k=10: Number of nodes along one side of the SOM grid, giving k² nodes in total.
  • η=0.5: Learning rate. Scales the adjustments made to the winning node and its neighbors during each round of training.
  • σ²=0.05: The (squared) neighbor radius. Used to determine the scale of neighbor node adjustments.
  • grid_type=:rectangular: Node grid geometry. One of (:rectangular, :hexagonal, :spherical).
  • η_decay=:exponential: Learning rate schedule function. One of (:exponential, :asymptotic).
  • σ_decay=:exponential: Neighbor radius schedule function. One of (:exponential, :asymptotic, :none).
  • neighbor_function=:gaussian: Kernel function used to adjust neighbor weights. Scale is set by σ². One of (:gaussian, :mexican_hat).
  • matching_distance=euclidean: Distance function from Distances.jl used to determine the winning node.
  • Nepochs=1: Number of times to repeat training on the shuffled dataset.

Operations

  • transform(mach, Xnew): returns the coordinates of the winning SOM node for each instance of Xnew. For grid_type :rectangular or :hexagonal, these are Cartesian coordinates. For grid_type :spherical, these are the latitude and longitude in radians.

Fitted parameters

The fields of fitted_params(mach) are:

  • coords: The coordinates of each of the SOM nodes (points in the domain of the map) with shape (k², 2)
  • weights: Array of weight vectors for the SOM nodes (corresponding points in the map's range) of shape (k², input dimension)

Report

The fields of report(mach) are:

  • classes: the index of the winning node for each instance of the training data X interpreted as a class label

Examples

using MLJ
 som = @load SelfOrganizingMap pkg=SelfOrganizingMaps
 model = som()
 X, y = make_regression(50, 3) ## synthetic data
@@ -7,4 +7,4 @@
 X̃ = transform(mach, X)
 
 rpt = report(mach)
-classes = rpt.classes
+classes = rpt.classes
diff --git a/dev/models/SimpleImputer_BetaML/index.html b/dev/models/SimpleImputer_BetaML/index.html index c22328ac0..fa7cbca3a 100644 --- a/dev/models/SimpleImputer_BetaML/index.html +++ b/dev/models/SimpleImputer_BetaML/index.html @@ -1,5 +1,5 @@ -SimpleImputer · MLJ

SimpleImputer

mutable struct SimpleImputer <: MLJModelInterface.Unsupervised

Impute missing values using the feature (column) mean, with optional record normalisation (using l-norms), from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • statistic::Function: The descriptive statistic of the column (feature) to use as imputed value [def: mean]
  • norm::Union{Nothing, Int64}: Normalise the feature mean by the l-norm of the records [default: nothing]. Use it (e.g. norm=1 to use the l-1 norm) if the records are highly heterogeneous (e.g. quantity exports of different countries).

Example:

julia> using MLJ
+SimpleImputer · MLJ

SimpleImputer

mutable struct SimpleImputer <: MLJModelInterface.Unsupervised

Impute missing values using the feature (column) mean, with optional record normalisation (using l-norms), from the Beta Machine Learning Toolkit (BetaML).

Hyperparameters:

  • statistic::Function: The descriptive statistic of the column (feature) to use as imputed value [def: mean]
  • norm::Union{Nothing, Int64}: Normalise the feature mean by the l-norm of the records [default: nothing]. Use it (e.g. norm=1 to use the l-1 norm) if the records are highly heterogeneous (e.g. quantity exports of different countries).
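
For instance, a sketch of a non-default configuration combining the two hyperparameters above (median imputation with l-1 record normalisation); the choice of statistic is illustrative:

julia> using MLJ, Statistics

julia> SimpleImputer = @load SimpleImputer pkg=BetaML

julia> model = SimpleImputer(statistic=median, norm=1)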

Example:

julia> using MLJ
 
 julia> X = [1 10.5;1.5 missing; 1.8 8; 1.7 15; 3.2 40; missing missing; 3.3 38; missing -2.3; 5.2 -2.4] |> table ;
 
@@ -26,4 +26,4 @@
  0.280952    1.69524
  3.3        38.0
  0.0750839  -2.3
- 5.2        -2.4
+ 5.2 -2.4
diff --git a/dev/models/SpectralClustering_MLJScikitLearnInterface/index.html b/dev/models/SpectralClustering_MLJScikitLearnInterface/index.html index 5dd586e86..0c6cd30f8 100644 --- a/dev/models/SpectralClustering_MLJScikitLearnInterface/index.html +++ b/dev/models/SpectralClustering_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -SpectralClustering · MLJ

SpectralClustering

SpectralClustering

A model type for constructing a spectral clustering, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SpectralClustering = @load SpectralClustering pkg=MLJScikitLearnInterface

Do model = SpectralClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SpectralClustering(n_clusters=...).

Apply clustering to a projection of the normalized Laplacian. In practice, spectral clustering is very useful when the structure of the individual clusters is highly non-convex or, more generally, when a measure of the center and spread of the cluster is not a suitable description of the complete cluster; for instance, when clusters are nested circles on the 2D plane.

+SpectralClustering · MLJ

SpectralClustering

SpectralClustering

A model type for constructing a spectral clustering, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SpectralClustering = @load SpectralClustering pkg=MLJScikitLearnInterface

Do model = SpectralClustering() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SpectralClustering(n_clusters=...).

Apply clustering to a projection of the normalized Laplacian. In practice, spectral clustering is very useful when the structure of the individual clusters is highly non-convex or, more generally, when a measure of the center and spread of the cluster is not a suitable description of the complete cluster; for instance, when clusters are nested circles on the 2D plane.
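
A rough sketch of fitting this model on data of exactly that kind (two interleaved half-moons). The accessor for reading off cluster assignments is not shown, since it depends on the wrapper; consult the full docstring, or fitted_params and report on the trained machine:

using MLJ
SpectralClustering = @load SpectralClustering pkg=MLJScikitLearnInterface
model = SpectralClustering(n_clusters=2)
X, _ = make_moons(200)                 ## two interleaving half-circles; not linearly separable
mach = machine(model, X) |> fit!
fitted_params(mach)                    ## learned parameters of the underlying estimator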

diff --git a/dev/models/StableForestClassifier_SIRUS/index.html b/dev/models/StableForestClassifier_SIRUS/index.html index 282d742a3..18688c1a0 100644 --- a/dev/models/StableForestClassifier_SIRUS/index.html +++ b/dev/models/StableForestClassifier_SIRUS/index.html @@ -1,2 +1,2 @@ -StableForestClassifier · MLJ

StableForestClassifier

StableForestClassifier

A model type for constructing a stable forest classifier, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableForestClassifier = @load StableForestClassifier pkg=SIRUS

Do model = StableForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableForestClassifier(rng=...).

StableForestClassifier implements the random forest classifier with a stabilized forest structure (Bénard et al., 2021). This stabilization increases stability when extracting rules. The impact on the predictive accuracy compared to standard random forests should be relatively small.

Note

Just like normal random forests, this model is not easily explainable. If you are interested in an explainable model, use the StableRulesClassifier or StableRulesRegressor.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableForest object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
+StableForestClassifier · MLJ

StableForestClassifier

StableForestClassifier

A model type for constructing a stable forest classifier, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableForestClassifier = @load StableForestClassifier pkg=SIRUS

Do model = StableForestClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableForestClassifier(rng=...).

StableForestClassifier implements the random forest classifier with a stabilized forest structure (Bénard et al., 2021). This stabilization increases stability when extracting rules. The impact on the predictive accuracy compared to standard random forests should be relatively small.

Note

Just like normal random forests, this model is not easily explainable. If you are interested in an explainable model, use the StableRulesClassifier or StableRulesRegressor.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableForest object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
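
A minimal sketch of the workflow above, using a StableRNG as recommended in the Hyperparameters section (the iris dataset and the n_trees value are just convenient stand-ins):

using MLJ, StableRNGs
StableForestClassifier = @load StableForestClassifier pkg=SIRUS
model = StableForestClassifier(rng=StableRNG(1), n_trees=500)
X, y = @load_iris                      ## Continuous features, Multiclass target
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)                ## predictions for each row of X
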
diff --git a/dev/models/StableForestRegressor_SIRUS/index.html b/dev/models/StableForestRegressor_SIRUS/index.html index 2fcef423c..57fa73fb9 100644 --- a/dev/models/StableForestRegressor_SIRUS/index.html +++ b/dev/models/StableForestRegressor_SIRUS/index.html @@ -1,2 +1,2 @@ -StableForestRegressor · MLJ

StableForestRegressor

StableForestRegressor

A model type for constructing a stable forest regressor, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableForestRegressor = @load StableForestRegressor pkg=SIRUS

Do model = StableForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableForestRegressor(rng=...).

StableForestRegressor implements the random forest regressor with a stabilized forest structure (Bénard et al., 2021).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableForest object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
+StableForestRegressor · MLJ

StableForestRegressor

StableForestRegressor

A model type for constructing a stable forest regressor, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableForestRegressor = @load StableForestRegressor pkg=SIRUS

Do model = StableForestRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableForestRegressor(rng=...).

StableForestRegressor implements the random forest regressor with a stabilized forest structure (Bénard et al., 2021).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableForest object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
diff --git a/dev/models/StableRulesClassifier_SIRUS/index.html b/dev/models/StableRulesClassifier_SIRUS/index.html index 43de939c6..fda11ca71 100644 --- a/dev/models/StableRulesClassifier_SIRUS/index.html +++ b/dev/models/StableRulesClassifier_SIRUS/index.html @@ -1,2 +1,2 @@ -StableRulesClassifier · MLJ

StableRulesClassifier

StableRulesClassifier

A model type for constructing a stable rules classifier, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableRulesClassifier = @load StableRulesClassifier pkg=SIRUS

Do model = StableRulesClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableRulesClassifier(rng=...).

StableRulesClassifier implements the explainable rule-based model based on a random forest.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.
  • max_rules::Int=10: This is the most important hyperparameter after lambda. The more rules, the more accurate the model should be. If this is not the case, tune lambda first. However, more rules will also decrease model interpretability. So, it is important to find a good balance here. In most cases, 10 to 40 rules should provide reasonable accuracy while remaining interpretable.
  • lambda::Float64=1.0: The weights of the final rules are determined via a regularized regression over each rule as a binary feature. This hyperparameter specifies the strength of the ridge (L2) regularizer. SIRUS is very sensitive to the choice of this hyperparameter. Ensure that you try the full range from 10^-4 to 10^4 (e.g., 0.001, 0.01, ..., 100). When trying the range, one good check is to verify that an increase in max_rules increases performance. If this is not the case, then try a different value for lambda.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableRules object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
+StableRulesClassifier · MLJ

StableRulesClassifier

StableRulesClassifier

A model type for constructing a stable rules classifier, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableRulesClassifier = @load StableRulesClassifier pkg=SIRUS

Do model = StableRulesClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableRulesClassifier(rng=...).

StableRulesClassifier implements the explainable rule-based model based on a random forest.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.
  • max_rules::Int=10: This is the most important hyperparameter after lambda. The more rules, the more accurate the model should be. If this is not the case, tune lambda first. However, more rules will also decrease model interpretability. So, it is important to find a good balance here. In most cases, 10 to 40 rules should provide reasonable accuracy while remaining interpretable.
  • lambda::Float64=1.0: The weights of the final rules are determined via a regularized regression over each rule as a binary feature. This hyperparameter specifies the strength of the ridge (L2) regularizer. SIRUS is very sensitive to the choice of this hyperparameter. Ensure that you try the full range from 10^-4 to 10^4 (e.g., 0.001, 0.01, ..., 100). When trying the range, one good check is to verify that an increase in max_rules increases performance. If this is not the case, then try a different value for lambda.
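
Given the sensitivity to lambda just described, one way to search the suggested range is with MLJ's built-in tuning on a log scale. A sketch, assuming the classifier's probabilistic predictions (swap the measure if you prefer point predictions); the dataset and grid resolution are illustrative:

using MLJ
StableRulesClassifier = @load StableRulesClassifier pkg=SIRUS
model = StableRulesClassifier(max_rules=15)
r = range(model, :lambda, lower=1e-4, upper=1e4, scale=:log10)   ## the full range suggested above
tuned = TunedModel(model=model, ranges=r, tuning=Grid(resolution=9),
                   resampling=CV(nfolds=5), measure=log_loss)
X, y = @load_iris
mach = machine(tuned, X, y) |> fit!
report(mach).best_model                ## the lambda (and model) that performed best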

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableRules object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
diff --git a/dev/models/StableRulesRegressor_SIRUS/index.html b/dev/models/StableRulesRegressor_SIRUS/index.html index 8c869f0a4..e361e2ec3 100644 --- a/dev/models/StableRulesRegressor_SIRUS/index.html +++ b/dev/models/StableRulesRegressor_SIRUS/index.html @@ -1,2 +1,2 @@ -StableRulesRegressor · MLJ

StableRulesRegressor

StableRulesRegressor

A model type for constructing a stable rules regressor, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableRulesRegressor = @load StableRulesRegressor pkg=SIRUS

Do model = StableRulesRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableRulesRegressor(rng=...).

StableRulesRegressor implements the explainable rule-based regression model based on a random forest.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.
  • max_rules::Int=10: This is the most important hyperparameter after lambda. The more rules, the more accurate the model should be. If this is not the case, tune lambda first. However, more rules will also decrease model interpretability. So, it is important to find a good balance here. In most cases, 10 to 40 rules should provide reasonable accuracy while remaining interpretable.
  • lambda::Float64=1.0: The weights of the final rules are determined via a regularized regression over each rule as a binary feature. This hyperparameter specifies the strength of the ridge (L2) regularizer. SIRUS is very sensitive to the choice of this hyperparameter. Ensure that you try the full range from 10^-4 to 10^4 (e.g., 0.001, 0.01, ..., 100). When trying the range, one good check is to verify that an increase in max_rules increases performance. If this is not the case, then try a different value for lambda.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableRules object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
+StableRulesRegressor · MLJ

StableRulesRegressor

StableRulesRegressor

A model type for constructing a stable rules regressor, based on SIRUS.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

StableRulesRegressor = @load StableRulesRegressor pkg=SIRUS

Do model = StableRulesRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in StableRulesRegressor(rng=...).

StableRulesRegressor implements the explainable rule-based regression model based on a random forest.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have one of the following element scitypes: Continuous, Count, or <:OrderedFactor; check column scitypes with schema(X)
  • y: the target, which can be any AbstractVector whose element scitype is <:OrderedFactor or <:Multiclass; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyperparameters

  • rng::AbstractRNG=default_rng(): Random number generator. Using a StableRNG from StableRNGs.jl is advised.
  • partial_sampling::Float64=0.7: Ratio of samples to use in each subset of the data. The default should be fine for most cases.
  • n_trees::Int=1000: The number of trees to use. It is advisable to use at least a thousand trees for better rule selection and, in turn, better predictive performance.
  • max_depth::Int=2: The depth of the tree. A lower depth decreases model complexity and can therefore improve accuracy when the sample size is small (reduce overfitting).
  • q::Int=10: Number of cutpoints to use per feature. The default value should be fine for most situations.
  • min_data_in_leaf::Int=5: Minimum number of data points per leaf.
  • max_rules::Int=10: This is the most important hyperparameter after lambda. The more rules, the more accurate the model should be. If this is not the case, tune lambda first. However, more rules will also decrease model interpretability. So, it is important to find a good balance here. In most cases, 10 to 40 rules should provide reasonable accuracy while remaining interpretable.
  • lambda::Float64=1.0: The weights of the final rules are determined via a regularized regression over each rule as a binary feature. This hyperparameter specifies the strength of the ridge (L2) regularizer. SIRUS is very sensitive to the choice of this hyperparameter. Ensure that you try the full range from 10^-4 to 10^4 (e.g., 0.001, 0.01, ..., 100). When trying the range, one good check is to verify that an increase in max_rules increases performance. If this is not the case, then try a different value for lambda.

Fitted parameters

The fields of fitted_params(mach) are:

  • fitresult: A StableRules object.

Operations

  • predict(mach, Xnew): Return a vector of predictions for each row of Xnew.
diff --git a/dev/models/Stack_MLJBase/index.html b/dev/models/Stack_MLJBase/index.html index 3779b8e9b..9d32cb36e 100644 --- a/dev/models/Stack_MLJBase/index.html +++ b/dev/models/Stack_MLJBase/index.html @@ -1,5 +1,5 @@ -Stack · MLJ

Stack

Union{Types...}

A type union is an abstract type which includes all instances of any of its argument types. The empty union Union{} is the bottom type of Julia.

Examples

julia> IntOrString = Union{Int,AbstractString}
+Stack · MLJ

Stack

Union{Types...}

A type union is an abstract type which includes all instances of any of its argument types. The empty union Union{} is the bottom type of Julia.

Examples

julia> IntOrString = Union{Int,AbstractString}
 Union{Int64, AbstractString}
 
 julia> 1 isa IntOrString
@@ -9,4 +9,4 @@
 true
 
 julia> 1.0 isa IntOrString
-false
+false
diff --git a/dev/models/Standardizer_MLJModels/index.html b/dev/models/Standardizer_MLJModels/index.html index ca13cacde..0277dbf13 100644 --- a/dev/models/Standardizer_MLJModels/index.html +++ b/dev/models/Standardizer_MLJModels/index.html @@ -1,5 +1,5 @@ -Standardizer · MLJ

Standardizer

Standardizer

A model type for constructing a standardizer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

Standardizer = @load Standardizer pkg=MLJModels

Do model = Standardizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Standardizer(features=...).

Use this model to standardize (whiten) a Continuous vector, or relevant columns of a table. The rescalings applied by this transformer to new data are always those learned during the training phase, which are generally different from what would actually standardize the new data.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table or any abstract vector with Continuous element scitype (any abstract float vector). Only features in a table with Continuous scitype can be standardized; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated below:

    • [] (empty, the default): standardize all features (columns) having Continuous element scitype
    • non-empty vector of feature names (symbols): standardize only the Continuous features in the vector (if ignore=false) or Continuous features not named in the vector (ignore=true).
    • function or other callable: standardize a feature if the callable returns true on its name. For example, Standardizer(features = name -> name in [:x1, :x3], ignore = true, count=true) has the same effect as Standardizer(features = [:x1, :x3], ignore = true, count=true), namely to standardize all Continuous and Count features, with the exception of :x1 and :x3.

    Note this behavior is further modified if the ordered_factor or count flags are set to true; see below

  • ignore=false: whether to ignore or standardize specified features, as explained above

  • ordered_factor=false: if true, standardize any OrderedFactor feature wherever a Continuous feature would be standardized, as described above

  • count=false: if true, standardize any Count feature wherever a Continuous feature would be standardized, as described above

Operations

  • transform(mach, Xnew): return Xnew with relevant features standardized according to the rescalings learned during fitting of mach.
  • inverse_transform(mach, Z): apply the inverse transformation to Z, so that inverse_transform(mach, transform(mach, Xnew)) is approximately the same as Xnew; unavailable if ordered_factor or count flags were set to true.

Fitted parameters

The fields of fitted_params(mach) are:

  • features_fit - the names of features that will be standardized
  • means - the corresponding untransformed mean values
  • stds - the corresponding untransformed standard deviations

Report

The fields of report(mach) are:

  • features_fit: the names of features that will be standardized

Examples

using MLJ
 
 X = (ordinal1 = [1, 2, 3],
      ordinal2 = coerce([:x, :y, :x], OrderedFactor),
@@ -34,4 +34,4 @@
  ordinal2 = CategoricalValue{Symbol,UInt32}[:x, :y, :x],
  ordinal3 = [10.0, 20.0, 30.0],
  ordinal4 = [1.0, 0.0, -1.0],
- nominal = CategoricalValue{String,UInt32}["Your father", "he", "is"],)

See also OneHotEncoder, ContinuousEncoder.

diff --git a/dev/models/SubspaceLDA_MultivariateStats/index.html b/dev/models/SubspaceLDA_MultivariateStats/index.html index 4f5eb262c..5d32530b0 100644 --- a/dev/models/SubspaceLDA_MultivariateStats/index.html +++ b/dev/models/SubspaceLDA_MultivariateStats/index.html @@ -1,5 +1,5 @@ -SubspaceLDA · MLJ

SubspaceLDA

SubspaceLDA

A model type for constructing a subspace LDA model, based on MultivariateStats.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SubspaceLDA = @load SubspaceLDA pkg=MultivariateStats

Do model = SubspaceLDA() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SubspaceLDA(normalize=...).

Multiclass subspace linear discriminant analysis (LDA) is a variation on ordinary LDA suitable for high dimensional data, as it avoids storing scatter matrices. For details, refer to the MultivariateStats.jl documentation.

In addition to dimension reduction (using transform), probabilistic classification is provided (using predict). In the case of classification, the class probability for a new observation reflects the proximity of that observation to training observations associated with that class, and how far away the observation is from observations associated with other classes. Specifically, the distances, in the transformed (projected) space, of a new observation from the centroid of each target class are computed; the resulting vector of distances, multiplied by minus one, is passed to a softmax function to obtain a class probability prediction. Here "distance" is computed using a user-specified distance function.
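For intuition, here is a minimal plain-Julia sketch of the probability rule just described (an illustration of the idea only, not the package's internal code; the distances are made up):

d = [1.2, 0.3, 2.5]                        # distances of one new observation to each class centroid
scores = -d                                # multiply by minus one
probs = exp.(scores) ./ sum(exp.(scores))  # softmax: sums to 1, smallest distance gets highest probability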

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X).
  • y is the target, which can be any AbstractVector whose element scitype is OrderedFactor or Multiclass; check the scitype with scitype(y).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • normalize=true: Option to normalize the between-class variance for the number of observations in each class, one of true or false.
  • outdim: the output dimension, automatically set to min(indim, nclasses-1) if equal to 0. If a non-zero outdim is passed, then the actual output dimension used is min(rank, outdim) where rank is the rank of the within-class covariance matrix.
  • dist=Distances.SqEuclidean(): The distance metric to use when performing classification (to compare the distance between a new point and centroids in the transformed space); must be a subtype of Distances.SemiMetric from Distances.jl, e.g., Distances.CosineDist.

Operations

  • transform(mach, Xnew): Return a lower dimensional projection of the input Xnew, which should have the same scitype as X above.
  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have same scitype as X above. Predictions are probabilistic but uncalibrated.
  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • classes: The classes seen during model fitting.
  • projection_matrix: The learned projection matrix, of size (indim, outdim), where indim and outdim are the input and output dimensions respectively (See Report section below).

Report

The fields of report(mach) are:

  • indim: The dimension of the input space i.e the number of training features.
  • outdim: The dimension of the transformed space the model is projected to.
  • mean: The mean of the untransformed training data. A vector of length indim.
  • nclasses: The number of classes directly observed in the training data (which can be less than the total number of classes in the class pool)

  • class_means: The class-specific means of the training data. A matrix of size (indim, nclasses) with the ith column being the class-mean of the ith class in classes (See fitted params section above).

  • class_weights: The weights (class counts) of each class. A vector of length nclasses with the ith element being the class weight of the ith class in classes. (See fitted params section above.)
  • explained_variance_ratio: The ratio of explained variance to total variance. Each dimension corresponds to an eigenvalue.

Examples

using MLJ
 
 SubspaceLDA = @load SubspaceLDA pkg=MultivariateStats
 
@@ -10,4 +10,4 @@
 
 Xproj = transform(mach, X)
 y_hat = predict(mach, X)
-labels = predict_mode(mach, X)

See also LDA, BayesianLDA, BayesianSubspaceLDA

diff --git a/dev/models/TSVDTransformer_TSVD/index.html b/dev/models/TSVDTransformer_TSVD/index.html index 298e10570..cc58ef487 100644 --- a/dev/models/TSVDTransformer_TSVD/index.html +++ b/dev/models/TSVDTransformer_TSVD/index.html @@ -1,2 +1,2 @@ -TSVDTransformer · MLJ
diff --git a/dev/models/TfidfTransformer_MLJText/index.html b/dev/models/TfidfTransformer_MLJText/index.html index 4a98aae51..db3e2d16e 100644 --- a/dev/models/TfidfTransformer_MLJText/index.html +++ b/dev/models/TfidfTransformer_MLJText/index.html @@ -1,5 +1,5 @@ -TfidfTransformer · MLJ

TfidfTransformer

TfidfTransformer

A model type for constructing a TF-IDF transformer, based on MLJText.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

TfidfTransformer = @load TfidfTransformer pkg=MLJText

Do model = TfidfTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in TfidfTransformer(max_doc_freq=...).

The transformer converts a collection of documents, tokenized or pre-parsed as bags of words/ngrams, to a matrix of TF-IDF scores. Here "TF" means term-frequency while "IDF" means inverse document frequency (defined below). The TF-IDF score is the product of the two. This is a common term weighting scheme in information retrieval, that has also found good use in document classification. The goal of using TF-IDF instead of the raw frequencies of occurrence of a token in a given document is to scale down the impact of tokens that occur very frequently in a given corpus and that are hence empirically less informative than features that occur in a small fraction of the training corpus.

In textbooks and implementations there is variation in the definition of IDF. Here two IDF definitions are available. The default, smoothed option provides the IDF for a term t as log((1 + n)/(1 + df(t))) + 1, where n is the total number of documents and df(t) the number of documents in which t appears. Setting smooth_idf = false provides an IDF of log(n/df(t)) + 1.
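For concreteness, the two IDF variants described above can be written as plain functions (an illustration only, not the package code; here n is the total number of documents and df the document frequency of a term):

idf_smooth(n, df) = log((1 + n) / (1 + df)) + 1   # default (smooth_idf=true)
idf_plain(n, df)  = log(n / df) + 1               # smooth_idf=false
idf_smooth(4, 2)   # ≈ 1.51
idf_plain(4, 2)    # ≈ 1.69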

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

Here:

  • X is any vector whose elements are either tokenized documents or bags of words/ngrams. Specifically, each element is one of the following:

    • A vector of abstract strings (tokens), e.g., ["I", "like", "Sam", ".", "Sam", "is", "nice", "."] (scitype AbstractVector{Textual})
    • A dictionary of counts, indexed on abstract strings, e.g., Dict("I"=>1, "Sam"=>2, "Sam is"=>1) (scitype Multiset{Textual})
    • A dictionary of counts, indexed on plain ngrams, e.g., Dict(("I",)=>1, ("Sam",)=>2, ("I", "Sam")=>1) (scitype Multiset{<:NTuple{N,Textual} where N}); here a plain ngram is a tuple of abstract strings.

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • max_doc_freq=1.0: Restricts the vocabulary that the transformer will consider. Terms that occur in > max_doc_freq documents will not be considered by the transformer. For example, if max_doc_freq is set to 0.9, terms that are in more than 90% of the documents will be removed.
  • min_doc_freq=0.0: Restricts the vocabulary that the transformer will consider. Terms that occur in < min_doc_freq documents will not be considered by the transformer. A value of 0.01 means that only terms that are at least in 1% of the documents will be included.
  • smooth_idf=true: Control which definition of IDF to use (see above).

Operations

  • transform(mach, Xnew): Based on the vocabulary and IDF learned in training, return the matrix of TF-IDF scores for Xnew, a vector of the same form as X above. The matrix has size (n, p), where n = length(Xnew) and p the size of the vocabulary. Tokens/ngrams not appearing in the learned vocabulary are scored zero.

Fitted parameters

The fields of fitted_params(mach) are:

  • vocab: A vector containing the strings used in the transformer's vocabulary.
  • idf_vector: The transformer's calculated IDF vector.

Examples

TfidfTransformer accepts a variety of inputs. The example below transforms tokenized documents:

using MLJ
 import TextAnalysis
 
 TfidfTransformer = @load TfidfTransformer pkg=MLJText
@@ -43,4 +43,4 @@
 MLJ.fit!(mach)
 fitted_params(mach)
 
-tfidf_mat = transform(mach, ngram_docs)

See also CountTransformer, BM25Transformer

diff --git a/dev/models/TheilSenRegressor_MLJScikitLearnInterface/index.html b/dev/models/TheilSenRegressor_MLJScikitLearnInterface/index.html index b86506917..ce4a2c809 100644 --- a/dev/models/TheilSenRegressor_MLJScikitLearnInterface/index.html +++ b/dev/models/TheilSenRegressor_MLJScikitLearnInterface/index.html @@ -1,2 +1,2 @@ -TheilSenRegressor · MLJ

TheilSenRegressor

TheilSenRegressor

A model type for constructing a Theil-Sen regressor, based on MLJScikitLearnInterface.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

TheilSenRegressor = @load TheilSenRegressor pkg=MLJScikitLearnInterface

Do model = TheilSenRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in TheilSenRegressor(fit_intercept=...).

Hyper-parameters

  • fit_intercept = true
  • copy_X = true
  • max_subpopulation = 10000
  • n_subsamples = nothing
  • max_iter = 300
  • tol = 0.001
  • random_state = nothing
  • n_jobs = nothing
  • verbose = false
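The docstring lists no usage example; a minimal hedged sketch (the synthetic data from make_regression and the keyword shown are illustrative assumptions, and MLJScikitLearnInterface must be installed) might look like:

using MLJ
TheilSenRegressor = @load TheilSenRegressor pkg=MLJScikitLearnInterface
X, y = make_regression(100, 3)        # synthetic regression data
model = TheilSenRegressor(fit_intercept=true)
mach = machine(model, X, y) |> fit!
yhat = predict(mach, X)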
diff --git a/dev/models/TomekUndersampler_Imbalance/index.html b/dev/models/TomekUndersampler_Imbalance/index.html index e00f8c042..62adcb0ce 100644 --- a/dev/models/TomekUndersampler_Imbalance/index.html +++ b/dev/models/TomekUndersampler_Imbalance/index.html @@ -1,5 +1,5 @@ -TomekUndersampler · MLJ

TomekUndersampler

Initiate a Tomek undersampling model with the given hyper-parameters.

TomekUndersampler

A model type for constructing a tomek undersampler, based on Imbalance.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

TomekUndersampler = @load TomekUndersampler pkg=Imbalance

Do model = TomekUndersampler() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in TomekUndersampler(min_ratios=...).

TomekUndersampler undersamples by removing any point that is part of a Tomek link in the data, as defined in: Ivan Tomek. Two modifications of CNN. IEEE Trans. Systems, Man and Cybernetics, 6:769–772, 1976.

Training data

In MLJ or MLJBase, wrap the model in a machine by mach = machine(model)

There is no need to provide any data here because the model is a static transformer.

Likewise, there is no need to fit!(mach).

For default values of the hyper-parameters, model can be constructed by model = TomekUndersampler()

Hyperparameters

  • min_ratios=1.0: A parameter that controls the maximum amount of undersampling to be done for each class. If this algorithm cleans the data to an extent that this is violated, some of the cleaned points will be revived randomly so that it is satisfied.

    • Can be a float and in this case each class will be at most undersampled to the size of the minority class times the float. By default, all classes are undersampled to the size of the minority class
    • Can be a dictionary mapping each class label to the float minimum ratio for that class
  • force_min_ratios=false: If true, and this algorithm cleans the data such that the ratios for each class exceed those specified in min_ratios, then further undersampling will be performed so that the final ratios are equal to min_ratios.

  • rng::Union{AbstractRNG, Integer}=default_rng(): Either an AbstractRNG object or an Integer seed to be used with Xoshiro if the Julia VERSION supports it. Otherwise, MersenneTwister is used.

  • try_preserve_type::Bool=true: When true, the function will try to not change the type of the input table (e.g., DataFrame). However, for some tables, this may not succeed, and in this case, the table returned will be a column table (named-tuple of vectors). This parameter is ignored if the input is a matrix.

Transform Inputs

  • X: A matrix or table of floats where each row is an observation from the dataset
  • y: An abstract vector of labels (e.g., strings) that correspond to the observations in X

Transform Outputs

  • X_under: A matrix or table that includes the data after undersampling depending on whether the input X is a matrix or table respectively
  • y_under: An abstract vector of labels corresponding to X_under

Operations

  • transform(mach, X, y): resample the data X and y using TomekUndersampler, returning both the new and original observations

Example

using MLJ
 import Imbalance
 
 ## set probability of each class
@@ -25,4 +25,4 @@
 julia> Imbalance.checkbalance(y_under; ref="minority")
 1: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 19 (100.0%) 
 2: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 22 (115.8%) 
-0: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 36 (189.5%)
diff --git a/dev/models/TransformedTargetModel_MLJBase/index.html b/dev/models/TransformedTargetModel_MLJBase/index.html index 6485b5c49..ea45bf0f6 100644 --- a/dev/models/TransformedTargetModel_MLJBase/index.html +++ b/dev/models/TransformedTargetModel_MLJBase/index.html @@ -1,4 +1,4 @@ -TransformedTargetModel · MLJ

TransformedTargetModel

TransformedTargetModel(model; transformer=nothing, inverse=nothing, cache=true)

Wrap the supervised or semi-supervised model in a transformation of the target variable.

Here transformer is one of the following:

  • The Unsupervised model that is to transform the training target. By default (inverse=nothing) the parameters learned by this transformer are also used to inverse-transform the predictions of model, which means transformer must implement the inverse_transform method. If this is not the case, specify inverse=identity to suppress inversion.
  • A callable object for transforming the target, such as y -> log.(y). In this case a callable inverse, such as z -> exp.(z), should be specified.

Specify cache=false to prioritize memory over speed, or to guarantee data anonymity.

Specify inverse=identity if model is a probabilistic predictor, as inverse-transforming sample spaces is not supported. Alternatively, replace model with a deterministic model, such as Pipeline(model, y -> mode.(y)).

Examples

A model that normalizes the target before applying ridge regression, with predictions returned on the original scale:

@load RidgeRegressor pkg=MLJLinearModels
 model = RidgeRegressor()
-tmodel = TransformedTargetModel(model, transformer=Standardizer())

A model that applies a static log transformation to the data, again returning predictions to the original scale:

tmodel2 = TransformedTargetModel(model, transformer=y->log.(y), inverse=z->exp.(z))
diff --git a/dev/models/TunedModel_MLJTuning/index.html b/dev/models/TunedModel_MLJTuning/index.html index f1b9c51c7..6a4301f59 100644 --- a/dev/models/TunedModel_MLJTuning/index.html +++ b/dev/models/TunedModel_MLJTuning/index.html @@ -1,5 +1,5 @@ -TunedModel · MLJ

TunedModel

tuned_model = TunedModel(; model=<model to be mutated>,
                          tuning=RandomSearch(),
                          resampling=Holdout(),
                          range=nothing,
@@ -11,4 +11,4 @@
                          measure=nothing,
                          n=length(models),
                          operation=nothing,
-                         other_options...)

Construct a wrapper for multiple models, for selection of an optimal one (equivalent to specifying tuning=Explicit() and range=models above). Elements of the iterator models need not have a common type, but they must all be Deterministic or all be Probabilistic; this is not checked, but is inferred from the first element generated.

See below for a complete list of options.
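As a hedged sketch of this multi-model form (the particular models, data, and measure below are illustrative assumptions, not taken from this docstring; both models are Deterministic, as required):

using MLJ
X, y = make_regression(100, 2)
KNN = @load KNNRegressor pkg=NearestNeighborModels
Ridge = @load RidgeRegressor pkg=MLJLinearModels
multi = TunedModel(models=[KNN(), Ridge()], resampling=CV(nfolds=3), measure=rms)
mach = machine(multi, X, y) |> fit!
report(mach).best_model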

Training

Calling fit!(mach) on a machine mach=machine(tuned_model, X, y) or mach=machine(tuned_model, X, y, w) will:

  • Instigate a search, over clones of model, with the hyperparameter mutations specified by range, for a model optimizing the specified measure, using performance evaluations carried out using the specified tuning strategy and resampling strategy. In the case models is explicitly listed, the search is instead over the models generated by the iterator models.
  • Fit an internal machine, based on the optimal model fitted_params(mach).best_model, wrapping the optimal model object in all the provided data X, y(, w). Calling predict(mach, Xnew) then returns predictions on Xnew of this internal machine. The final training can be suppressed by setting train_best=false.

Search space

The range objects supported depend on the tuning strategy specified. Query the strategy docstring for details. To optimize over an explicit list v of models of the same type, use strategy=Explicit() and specify model=v[1] and range=v.

The number of models searched is specified by n. If unspecified, then MLJTuning.default_n(tuning, range) is used. When n is increased and fit!(mach) called again, the old search history is re-instated and the search continues where it left off.
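A minimal range-based tuning sketch along the lines just described (the model, data, measure, and range below are illustrative choices, not part of this docstring):

using MLJ
X, y = @load_iris
Tree = @load DecisionTreeClassifier pkg=DecisionTree
tree = Tree()
r = range(tree, :max_depth, lower=1, upper=6)
tuned_tree = TunedModel(model=tree, tuning=Grid(resolution=6),
                        resampling=CV(nfolds=3), range=r, measure=log_loss)
mach = machine(tuned_tree, X, y) |> fit!
fitted_params(mach).best_model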

Measures (metrics)

If more than one measure is specified, then only the first is optimized (unless strategy is multi-objective) but the performance against every measure specified will be computed and reported in report(mach).best_performance and other relevant attributes of the generated report. Options exist to pass per-observation weights or class weights to measures; see below.

Important. If a custom measure, my_measure is used, and the measure is a score, rather than a loss, be sure to check that MLJ.orientation(my_measure) == :score to ensure maximization of the measure, rather than minimization. Override an incorrect value with MLJ.orientation(::typeof(my_measure)) = :score.

Accessing the fitted parameters and other training (tuning) outcomes

A Plots.jl plot of performance estimates is returned by plot(mach) or heatmap(mach).

Once a tuning machine mach has been trained as above, then fitted_params(mach) has these keys/values:

key                   value
best_model            optimal model instance
best_fitted_params    learned parameters of the optimal model

The named tuple report(mach) includes these keys/values:

key                   value
best_model            optimal model instance
best_history_entry    corresponding entry in the history, including performance estimate
best_report           report generated by fitting the optimal model to all data
history               tuning strategy-specific history of all evaluations

plus other key/value pairs specific to the tuning strategy.

Each element of history is a property-accessible object with these properties:

key            value
measure        vector of measures (metrics)
measurement    vector of measurements, one per measure
per_fold       vector of vectors of unaggregated per-fold measurements
evaluation     full PerformanceEvaluation/CompactPerformanceEvaluation object

Complete list of key-word options

  • model: Supervised model prototype that is cloned and mutated to generate models for evaluation
  • models: Alternatively, an iterator of MLJ models to be explicitly evaluated. These may have varying types.
  • tuning=RandomSearch(): tuning strategy to be applied (eg, Grid()). See the Tuning Models section of the MLJ manual for a complete list of options.
  • resampling=Holdout(): resampling strategy (eg, Holdout(), CV(), StratifiedCV()) to be applied in performance evaluations
  • measure: measure or measures to be applied in performance evaluations; only the first used in optimization (unless the strategy is multi-objective) but all reported to the history
  • weights: per-observation weights to be passed to the measure(s) in performance evaluations, where supported. Check support with supports_weights(measure).
  • class_weights: class weights to be passed to the measure(s) in performance evaluations, where supported. Check support with supports_class_weights(measure).
  • repeats=1: for generating train/test sets multiple times in resampling ("Monte Carlo" resampling); see evaluate! for details
  • operation/operations - One of predict, predict_mean, predict_mode, predict_median, or predict_joint, or a vector of these of the same length as measure/measures. Automatically inferred if left unspecified.
  • range: range object; tuning strategy documentation describes supported types
  • selection_heuristic: the rule determining how the best model is decided. According to the default heuristic, NaiveSelection(), measure (or the first element of measure) is evaluated for each resample and these per-fold measurements are aggregated. The model with the lowest (resp. highest) aggregate is chosen if the measure is a :loss (resp. a :score).
  • n: number of iterations (ie, models to be evaluated); set by tuning strategy if left unspecified
  • train_best=true: whether to train the optimal model
  • acceleration=default_resource(): mode of parallelization for tuning strategies that support this
  • acceleration_resampling=CPU1(): mode of parallelization for resampling
  • check_measure=true: whether to check measure is compatible with the specified model and operation
  • cache=true: whether to cache model-specific representations of user-supplied data; set to false to conserve memory. Speed gains likely limited to the case resampling isa Holdout.
  • compact_history=true: whether to write CompactPerformanceEvaluation or regular PerformanceEvaluation objects to the history (accessed via the :evaluation key); the compact form excludes some fields to conserve memory.
diff --git a/dev/models/UnivariateBoxCoxTransformer_MLJModels/index.html b/dev/models/UnivariateBoxCoxTransformer_MLJModels/index.html index 2115f8e88..9bccd7ef2 100644 --- a/dev/models/UnivariateBoxCoxTransformer_MLJModels/index.html +++ b/dev/models/UnivariateBoxCoxTransformer_MLJModels/index.html @@ -1,5 +1,5 @@ -UnivariateBoxCoxTransformer · MLJ

UnivariateBoxCoxTransformer

UnivariateBoxCoxTransformer

A model type for constructing a single variable Box-Cox transformer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateBoxCoxTransformer = @load UnivariateBoxCoxTransformer pkg=MLJModels

Do model = UnivariateBoxCoxTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateBoxCoxTransformer(n=...).

Box-Cox transformations attempt to make data look more normally distributed. This can improve performance and assist in the interpretation of models which suppose that data is generated by a normal distribution.

A Box-Cox transformation (with shift) is of the form

x -> ((x + c)^λ - 1)/λ

for some constant c and real λ, unless λ = 0, in which case the above is replaced with

x -> log(x + c)

Given user-specified hyper-parameters n::Integer and shift::Bool, the present implementation learns the parameters c and λ from the training data as follows: If shift=true and zeros are encountered in the data, then c is set to 0.2 times the data mean. If there are no zeros, then no shift is applied. Finally, n different values of λ between -0.4 and 3 are considered, with λ fixed to the value maximizing normality of the transformed data.
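As a plain-Julia illustration of the (shifted) Box-Cox map defined above, with hand-picked c and λ (in the model both are learned from the training data, as just described):

boxcox(x; λ=0.5, c=0.0) = λ == 0 ? log(x + c) : ((x + c)^λ - 1)/λ
boxcox(4.0)           # ((4.0)^0.5 - 1)/0.5 = 2.0
boxcox(4.0, λ=0.0)    # log(4.0) ≈ 1.386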

Reference: Wikipedia entry for power transform.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with element scitype Continuous; check the scitype with scitype(x)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • n=171: number of values of the exponent λ to try
  • shift=false: whether to include a preliminary constant translation in transformations, in the presence of zeros

Operations

  • transform(mach, xnew): apply the Box-Cox transformation learned when fitting mach
  • inverse_transform(mach, z): reconstruct the vector x whose transformation learned by mach is z

Fitted parameters

The fields of fitted_params(mach) are:

  • λ: the learned Box-Cox exponent
  • c: the learned shift

Examples

using MLJ
 using UnicodePlots
 using Random
 Random.seed!(123)
@@ -38,4 +38,4 @@
    [ 3.0,  4.0) ┤▎ 1
                 └                                        ┘
                                  Frequency
-
diff --git a/dev/models/UnivariateDiscretizer_MLJModels/index.html b/dev/models/UnivariateDiscretizer_MLJModels/index.html index 7f5b542de..09f5b362e 100644 --- a/dev/models/UnivariateDiscretizer_MLJModels/index.html +++ b/dev/models/UnivariateDiscretizer_MLJModels/index.html @@ -1,5 +1,5 @@ -UnivariateDiscretizer · MLJ

UnivariateDiscretizer

UnivariateDiscretizer

A model type for constructing a single variable discretizer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateDiscretizer = @load UnivariateDiscretizer pkg=MLJModels

Do model = UnivariateDiscretizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateDiscretizer(n_classes=...).

Discretization converts a Continuous vector into an OrderedFactor vector. In particular, the output is a CategoricalVector (whose reference type is optimized).

The transformation is chosen so that the vector on which the transformer is fit has, in transformed form, an approximately uniform distribution of values. Specifically, if n_classes is the level of discretization, then 2*n_classes - 1 ordered quantiles are computed, the odd quantiles being used for transforming (discretization) and the even quantiles for inverse transforming.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with Continuous element scitype; check scitype with scitype(x).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • n_classes: number of discrete classes in the output

Operations

  • transform(mach, xnew): discretize xnew according to the discretization learned when fitting mach
  • inverse_transform(mach, z): attempt to reconstruct from z a vector that transforms to give z

Fitted parameters

The fields of fitted_params(mach).fitresult include:

  • odd_quantiles: quantiles used for transforming (length is n_classes - 1)
  • even_quantiles: quantiles used for inverse transforming (length is n_classes)

Example

using MLJ
 using Random
 Random.seed!(123)
 
@@ -30,4 +30,4 @@
  0.012731354778359405
  0.0056265330571125816
  0.005738175684445124
- 0.006835652575801987
diff --git a/dev/models/UnivariateFillImputer_MLJModels/index.html b/dev/models/UnivariateFillImputer_MLJModels/index.html index 9e0773fbf..cc8218566 100644 --- a/dev/models/UnivariateFillImputer_MLJModels/index.html +++ b/dev/models/UnivariateFillImputer_MLJModels/index.html @@ -1,5 +1,5 @@ -UnivariateFillImputer · MLJ

UnivariateFillImputer

UnivariateFillImputer

A model type for constructing a single variable fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateFillImputer = @load UnivariateFillImputer pkg=MLJModels

Do model = UnivariateFillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateFillImputer(continuous_fill=...).

Use this model to impute missing values in a vector with a fixed value learned from the non-missing values of the training vector.

For imputing missing values in tabular data, use FillImputer instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with element scitype Union{Missing, T} where T is a subtype of Continuous, Multiclass, OrderedFactor or Count; check scitype using scitype(x)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • continuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values
  • count_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values
  • finite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values
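A hedged sketch of overriding one of these fills with a custom callable (the constant-zero fill below is purely illustrative):

using MLJ
imputer = UnivariateFillImputer(continuous_fill = v -> 0.0)   # impute zeros instead of the median
x = [1.0, 2.0, missing, 3.0]
mach = machine(imputer, x) |> fit!
transform(mach, x)    # expected: [1.0, 2.0, 0.0, 3.0]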

Operations

  • transform(mach, xnew): return xnew with missing values imputed with the fill values learned when fitting mach

Fitted parameters

The fields of fitted_params(mach) are:

  • filler: the fill value to be imputed in all new data

Examples

using MLJ
 imputer = UnivariateFillImputer()
 
 x_continuous = [1.0, 2.0, missing, 3.0]
@@ -34,4 +34,4 @@
 3-element Vector{Int64}:
  2
  2
- 5

For imputing tabular data, use FillImputer.

diff --git a/dev/models/UnivariateStandardizer_MLJModels/index.html b/dev/models/UnivariateStandardizer_MLJModels/index.html index 4e80aab7c..0c26b16e6 100644 --- a/dev/models/UnivariateStandardizer_MLJModels/index.html +++ b/dev/models/UnivariateStandardizer_MLJModels/index.html @@ -1,2 +1,2 @@ -UnivariateStandardizer · MLJ

UnivariateStandardizer

UnivariateStandardizer()

Transformer type for standardizing (whitening) single variable data.

This model may be deprecated in the future. Consider using Standardizer, which handles both tabular and univariate data.

diff --git a/dev/models/UnivariateTimeTypeToContinuous_MLJModels/index.html b/dev/models/UnivariateTimeTypeToContinuous_MLJModels/index.html index 459a6df23..d48ed174a 100644 --- a/dev/models/UnivariateTimeTypeToContinuous_MLJModels/index.html +++ b/dev/models/UnivariateTimeTypeToContinuous_MLJModels/index.html @@ -1,5 +1,5 @@ -UnivariateTimeTypeToContinuous · MLJ

UnivariateTimeTypeToContinuous

UnivariateTimeTypeToContinuous

A model type for constructing a single variable transformer that creates continuous representations of temporally typed data, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateTimeTypeToContinuous = @load UnivariateTimeTypeToContinuous pkg=MLJModels

Do model = UnivariateTimeTypeToContinuous() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateTimeTypeToContinuous(zero_time=...).

Use this model to convert vectors with a TimeType element type to vectors of Float64 type (Continuous element scitype).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector whose element type is a subtype of Dates.TimeType

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • zero_time: the time that is to correspond to 0.0 under transformations, with the type coinciding with the training data element type. If unspecified, the earliest time encountered in training is used.
  • step::Period=Hour(24): time interval to correspond to one unit under transformation

Operations

  • transform(mach, xnew): apply the encoding inferred when mach was fit

Fitted parameters

fitted_params(mach).fitresult is the tuple (zero_time, step) actually used in transformations, which may differ from the user-specified hyper-parameters.

Example

using MLJ
+UnivariateTimeTypeToContinuous · MLJ

UnivariateTimeTypeToContinuous

UnivariateTimeTypeToContinuous

A model type for constructing a single variable transformer that creates continuous representations of temporally typed data, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateTimeTypeToContinuous = @load UnivariateTimeTypeToContinuous pkg=MLJModels

Do model = UnivariateTimeTypeToContinuous() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateTimeTypeToContinuous(zero_time=...).

Use this model to convert vectors with a TimeType element type to vectors of Float64 type (Continuous element scitype).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector whose element type is a subtype of Dates.TimeType

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • zero_time: the time that is to correspond to 0.0 under transformations, with the type coinciding with the training data element type. If unspecified, the earliest time encountered in training is used.
  • step::Period=Hour(24): time interval to correspond to one unit under transformation

Operations

  • transform(mach, xnew): apply the encoding inferred when mach was fit

Fitted parameters

fitted_params(mach).fitresult is the tuple (zero_time, step) actually used in transformations, which may differ from the user-specified hyper-parameters.

Example

using MLJ
 using Dates
 
 x = [Date(2001, 1, 1) + Day(i) for i in 0:4]
@@ -15,4 +15,4 @@
  52.42857142857143
  52.57142857142857
  52.714285714285715
- 52.857142
+ 52.857142
diff --git a/dev/models/XGBoostClassifier_XGBoost/index.html b/dev/models/XGBoostClassifier_XGBoost/index.html index eee371ba3..c804bfba8 100644 --- a/dev/models/XGBoostClassifier_XGBoost/index.html +++ b/dev/models/XGBoostClassifier_XGBoost/index.html @@ -1,2 +1,2 @@ -XGBoostClassifier · MLJ

XGBoostClassifier

XGBoostClassifier

A model type for constructing an eXtreme Gradient Boosting Classifier, based on XGBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

XGBoostClassifier = @load XGBoostClassifier pkg=XGBoost

Do model = XGBoostClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in XGBoostClassifier(test=...).

Univariate classification using xgboost.

Training data

In MLJ or MLJBase, bind an instance model to data with

m = machine(model, X, y)

where

  • X: the input features, either an AbstractMatrix or a Tables.jl-compatible table.
  • y: the target, an AbstractVector with Finite element scitype.

Train using fit!(m, rows=...).

Hyper-parameters

See https://xgboost.readthedocs.io/en/stable/parameter.html.

+XGBoostClassifier · MLJ

XGBoostClassifier

XGBoostClassifier

A model type for constructing an eXtreme Gradient Boosting Classifier, based on XGBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

XGBoostClassifier = @load XGBoostClassifier pkg=XGBoost

Do model = XGBoostClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in XGBoostClassifier(test=...).

Univariate classification using xgboost.

Training data

In MLJ or MLJBase, bind an instance model to data with

m = machine(model, X, y)

where

  • X: the input features, either an AbstractMatrix or a Tables.jl-compatible table.
  • y: the target, an AbstractVector with Finite element scitype.

Train using fit!(m, rows=...).

Hyper-parameters

See https://xgboost.readthedocs.io/en/stable/parameter.html.
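
A minimal usage sketch follows (the hyper-parameters num_round, eta and max_depth are chosen for illustration only; consult the page linked above for the full list):

using MLJ
XGBoostClassifier = @load XGBoostClassifier pkg=XGBoost verbosity=0
X, y = @load_iris                                   # Finite (multiclass) target
booster = XGBoostClassifier(num_round=50, eta=0.1, max_depth=3)  # illustrative settings
mach = machine(booster, X, y) |> fit!
yhat = predict(mach, X)                             # probabilistic predictions
predict_mode(mach, X)[1:3]                          # point predictions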

diff --git a/dev/models/XGBoostCount_XGBoost/index.html b/dev/models/XGBoostCount_XGBoost/index.html index 98981c7e5..e7904a735 100644 --- a/dev/models/XGBoostCount_XGBoost/index.html +++ b/dev/models/XGBoostCount_XGBoost/index.html @@ -1,2 +1,2 @@ -XGBoostCount · MLJ

XGBoostCount

XGBoostCount

A model type for constructing an eXtreme Gradient Boosting Count Regressor, based on XGBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

XGBoostCount = @load XGBoostCount pkg=XGBoost

Do model = XGBoostCount() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in XGBoostCount(test=...).

Univariate discrete regression using xgboost.

Training data

In MLJ or MLJBase, bind an instance model to data with

m = machine(model, X, y)

where

  • X: the input features, either an AbstractMatrix or a Tables.jl-compatible table.
  • y: the target, an AbstractVector with Count (integer) element scitype.

Train using fit!(m, rows=...).

Hyper-parameters

See https://xgboost.readthedocs.io/en/stable/parameter.html.

+XGBoostCount · MLJ

XGBoostCount

XGBoostCount

A model type for constructing an eXtreme Gradient Boosting Count Regressor, based on XGBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

XGBoostCount = @load XGBoostCount pkg=XGBoost

Do model = XGBoostCount() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in XGBoostCount(test=...).

Univariate discrete regression using xgboost.

Training data

In MLJ or MLJBase, bind an instance model to data with

m = machine(model, X, y)

where

  • X: the input features, either an AbstractMatrix or a Tables.jl-compatible table.
  • y: the target, an AbstractVector with Count (integer) element scitype.

Train using fit!(m, rows=...).

Hyper-parameters

See https://xgboost.readthedocs.io/en/stable/parameter.html.

diff --git a/dev/models/XGBoostRegressor_XGBoost/index.html b/dev/models/XGBoostRegressor_XGBoost/index.html index 626bf178c..26cc03f26 100644 --- a/dev/models/XGBoostRegressor_XGBoost/index.html +++ b/dev/models/XGBoostRegressor_XGBoost/index.html @@ -1,2 +1,2 @@ -XGBoostRegressor · MLJ

XGBoostRegressor

XGBoostRegressor

A model type for constructing an eXtreme Gradient Boosting Regressor, based on XGBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

XGBoostRegressor = @load XGBoostRegressor pkg=XGBoost

Do model = XGBoostRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in XGBoostRegressor(test=...).

Univariate continuous regression using xgboost.

Training data

In MLJ or MLJBase, bind an instance model to data with

m = machine(model, X, y)

where

  • X: any table of input features whose columns have Continuous element scitype; check column scitypes with schema(X).
  • y: the target, an AbstractVector with Continuous element scitype; check the scitype with scitype(y).

Train using fit!(m, rows=...).

Hyper-parameters

See https://xgboost.readthedocs.io/en/stable/parameter.html.

+XGBoostRegressor · MLJ

XGBoostRegressor

XGBoostRegressor

A model type for constructing an eXtreme Gradient Boosting Regressor, based on XGBoost.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

XGBoostRegressor = @load XGBoostRegressor pkg=XGBoost

Do model = XGBoostRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in XGBoostRegressor(test=...).

Univariate continuous regression using xgboost.

Training data

In MLJ or MLJBase, bind an instance model to data with

m = machine(model, X, y)

where

  • X: any table of input features whose columns have Continuous element scitype; check column scitypes with schema(X).
  • y: the target, an AbstractVector with Continuous element scitype; check the scitype with scitype(y).

Train using fit!(m, rows=...).

Hyper-parameters

See https://xgboost.readthedocs.io/en/stable/parameter.html.
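
By way of illustration, here is a sketch of training and evaluating the regressor on synthetic data (the hyper-parameter names used are assumed from the page linked above):

using MLJ
XGBoostRegressor = @load XGBoostRegressor pkg=XGBoost verbosity=0
X, y = make_regression(200, 5)                      # synthetic Continuous data
booster = XGBoostRegressor(num_round=100, eta=0.1, max_depth=4)  # illustrative settings
evaluate(booster, X, y, resampling=CV(nfolds=5), measure=l2)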

diff --git a/dev/modifying_behavior/index.html b/dev/modifying_behavior/index.html index 22615135c..c02eb22cf 100644 --- a/dev/modifying_behavior/index.html +++ b/dev/modifying_behavior/index.html @@ -1,4 +1,4 @@ -Modifying Behavior · MLJ

Modifying Behavior

To modify the behavior of MLJ you will need to clone the relevant component package (e.g., MLJBase.jl), or a fork thereof, and modify your local Julia environment to use your local clone in place of the official release. For example, you might proceed as follows:

using Pkg
+Modifying Behavior · MLJ

Modifying Behavior

To modify the behavior of MLJ you will need to clone the relevant component package (e.g., MLJBase.jl), or a fork thereof, and modify your local Julia environment to use your local clone in place of the official release. For example, you might proceed as follows:

using Pkg
 Pkg.activate("my_MLJ_enf", shared=true)
-Pkg.develop(path="path/to/my/local/MLJBase")

To test your local clone, do

Pkg.test("MLJBase")

For more on package management, see here.

+Pkg.develop(path="path/to/my/local/MLJBase")

To test your local clone, do

Pkg.test("MLJBase")

For more on package management, see here.

diff --git a/dev/openml_integration/index.html b/dev/openml_integration/index.html index 6e6082e39..8f046c6c9 100644 --- a/dev/openml_integration/index.html +++ b/dev/openml_integration/index.html @@ -1,2 +1,2 @@ -OpenML Integration · MLJ

OpenML Integration

The OpenML platform provides an environment for carrying out and comparing machine learning solutions across a broad collection of public datasets and software platforms.

Integration with the OpenML API is presently limited to querying and downloading datasets.

Documentation is here.

+OpenML Integration · MLJ

OpenML Integration

The OpenML platform provides an environment for carrying out and comparing machine learning solutions across a broad collection of public datasets and software platforms.

Integration with the OpenML API is presently limited to querying and downloading datasets.

Documentation is here.

diff --git a/dev/performance_measures/index.html b/dev/performance_measures/index.html index dda99db19..7f59b7e82 100644 --- a/dev/performance_measures/index.html +++ b/dev/performance_measures/index.html @@ -1,8 +1,8 @@ -Performance Measures · MLJ

Performance Measures

Introduction

In MLJ, loss functions, scoring rules, confusion matrices, sensitivities, etc., are collectively referred to as measures. These measures are provided by the package StatisticalMeasures.jl but are immediately available to the MLJ user. Here's a simple example of directly applying the log_loss measure to compute a training loss:

using MLJ
+Performance Measures · MLJ

Performance Measures

Introduction

In MLJ, loss functions, scoring rules, confusion matrices, sensitivities, etc., are collectively referred to as measures. These measures are provided by the package StatisticalMeasures.jl but are immediately available to the MLJ user. Here's a simple example of directly applying the log_loss measure to compute a training loss:

using MLJ
 X, y = @load_iris
 DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree
 tree = DecisionTreeClassifier(max_depth=2)
 mach = machine(tree, X, y) |> fit!
 yhat = predict(mach, X)
-log_loss(yhat, y)
0.143176310291424

For more examples of direct measure usage, see the StatisticalMeasures.jl tutorial.

A list of all measures, ready to use after running using MLJ or using StatisticalMeasures, is here. Alternatively, call measures() (experimental) to generate a dictionary keyed on available measure constructors, with measure metadata as values.

Custom measures

Any measure-like object with appropriate calling behavior can be used with MLJ. To quickly build custom measures, we recommend using the package StatisticalMeasuresBase.jl, which provides this tutorial. Note, in particular, that an "atomic" measure can be transformed into a multi-target measure using this package.

Uses of measures

In MLJ, measures are specified, for example, when evaluating model performance with evaluate/evaluate!, when wrapping models using TunedModel or IteratedModel, and elsewhere.

Using LossFunctions.jl

In previous versions of MLJ, measures from LossFunctions.jl were also available. Now measures from that package must be explicitly imported and wrapped, as described here.

Receiver operator characteristics

A related performance evaluation tool provided by StatisticalMeasures.jl, and hence by MLJ, is the roc_curve method:

StatisticalMeasures.roc_curveFunction
roc_curve(ŷ, y) -> false_positive_rates, true_positive_rates, thresholds

Return data for plotting the receiver operator characteristic (ROC curve) for a binary classification problem.

Here ŷ is a vector of UnivariateFinite distributions (from CategoricalDistributions.jl) over the two values taken by the ground truth observations y, a CategoricalVector.

If there are k unique probabilities, then there are correspondingly k thresholds and k+1 "bins" over which the false positive and true positive rates are constant:

  • [0.0 - thresholds[1]]
  • [thresholds[1] - thresholds[2]]
  • ...
  • [thresholds[k] - 1]

Consequently, true_positive_rates and false_positive_rates have length k+1 if thresholds has length k.

To plot the curve using your favorite plotting backend, do something like plot(false_positive_rates, true_positive_rates).

Core algorithm: Functions.roc_curve

See also AreaUnderCurve.

source

Migration guide for changes to measures in MLJBase 1.0

Prior to MLJBase.jl 1.0 (respectively, MLJ.jl version 0.19.6) measures were defined in MLJBase.jl (a dependency of MLJ.jl), but they are now provided by the MLJ.jl dependency StatisticalMeasures.jl. Effects on users are detailed below:

Breaking behavior likely relevant to many users

  • If using MLJBase without MLJ, then, in Julia 1.9 or higher, StatisticalMeasures must be explicitly imported to use measures that were previously part of MLJBase. If using MLJ, then all previous measures are still available, with the exception of those corresponding to LossFunctions.jl (see below).

  • All measures return a single aggregated measurement. In other words, measures previously reporting a measurement per-observation (previously subtyping Unaggregated) no longer do so. To get per-observation measurements, use the new method StatisticalMeasures.measurements(measure, ŷ, y[, weights, class_weights]).

  • The default measure for regression models (used in evaluate/evaluate! when measures is unspecified) is changed from rms to l2=LPLoss(2) (mean squared error).

  • MeanAbsoluteError has been removed and instead mae is an alias for LPLoss(p=1).

  • Measures that previously skipped NaN values will now (at least by default) propagate those values. Missing value behavior is unchanged, except some measures that previously did not support missing now do.

  • Aliases for measure types have been removed. For example, RMSE (an alias for RootMeanSquaredError) is gone. Aliases for instances, such as rms and cross_entropy, persist. The exception is precision, for which ppv can be used in its place. (This is to avoid conflict with Base.precision, which was previously pirated.)

  • info(measure) has been decommissioned; query docstrings or access the new measure traits individually instead. These traits are now provided by StatisticalMeasures.jl and are not exported. For example, to access the orientation of the measure rms, do import StatisticalMeasures as SM; SM.orientation(rms).

  • Behavior of the measures() method, to list all measures and associated traits, has changed. It now returns a dictionary instead of a vector of named tuples; measures(predicate) is decommissioned, but measures(needle) is preserved. (This method, owned by StatisticalMeasures.jl, has some other search options, but is experimental.)

  • Measures that were wrappers of losses from LossFunctions.jl are no longer exposed by MLJBase or MLJ. To use such a loss, you must explicitly import LossFunctions and wrap the loss appropriately. See Using losses from LossFunctions.jl for examples.

  • Some user-defined measures working in previous versions of MLJBase.jl may not work without modification, as they must conform to the new StatisticalMeasuresBase.jl API. See this tutorial on how to define new measures.

  • Measures with a "feature argument" X, as in some_measure(ŷ, y, X), are no longer supported. See What is a measure? for allowed signatures in measures.

Packages implementing the MLJ model interface

The migration of measures is not expected to require any changes to the source code in packages providing implementations of the MLJ model interface (MLJModelInterface.jl), such as MLJDecisionTreeInterface.jl and MLJFlux.jl, and this is confirmed by extensive integration tests. However, some current tests will fail if they use MLJBase measures. The following should generally suffice to adapt such tests:

  • Add StatisticalMeasures as a test dependency, and add using StatisticalMeasures to your runtests.jl (and/or included submodules).

  • If measures are qualified, as in MLJBase.rms, then the qualification must be removed or changed to StatisticalMeasures.rms, etc.

  • Be aware that the default measure used in methods such as evaluate!, when measure is not specified, is changed from rms to l2 for regression models.

  • Be aware that all measures now report a single aggregated measurement, and never one per observation. See the second point above.

Breaking behavior possibly relevant to some developers

  • The abstract measure types Aggregated, Unaggregated, Measure have been decommissioned. (A measure is now defined purely by its calling behavior.)

  • What were previously exported as measure types are now only constructors.

  • target_scitype(measure) is decommissioned. Related is StatisticalMeasures.observation_scitype(measure) which declares an upper bound on the allowed scitype of a single observation.

  • prediction_type(measure) is decommissioned. Instead use StatisticalMeasures.kind_of_proxy(measure).

  • The trait reports_each_observation is decommissioned. Related is StatisticalMeasures.can_report_unaggregated; if false the new measurements method simply returns n copies of the aggregated measurement, where n is the number of observations provided, instead of individual observation-dependent measurements.

  • aggregation(measure) has been decommissioned. Instead use StatisticalMeasures.external_mode_of_aggregation(measure).

  • instances(measure) has been decommissioned; query docstrings for measure aliases, or follow this example: aliases = measures()[RootMeanSquaredError].aliases.

  • is_feature_dependent(measure) has been decommissioned. Measures consuming feature data are no longer supported; see above.

  • distribution_type(measure) has been decommissioned.

  • docstring(measure) has been decommissioned.

  • Behavior of aggregate has changed.

  • The following traits, previously exported by MLJBase and MLJ, cannot be applied to measures: supports_weights, supports_class_weights, orientation, human_name. Instead use the traits with these names provided by StatisticalMeasures.jl (they will need to be qualified, as in import StatisticalMeasures; StatisticalMeasures.orientation(measure)).

+log_loss(yhat, y)
0.143176310291424

For more examples of direct measure usage, see the StatisticalMeasures.jl tutorial.

A list of all measures, ready to use after running using MLJ or using StatisticalMeasures, is here. Alternatively, call measures() (experimental) to generate a dictionary keyed on available measure constructors, with measure metadata as values.

Custom measures

Any measure-like object with appropriate calling behavior can be used with MLJ. To quickly build custom measures, we recommend using the package StatisticalMeasuresBase.jl, which provides this tutorial. Note, in particular, that an "atomic" measure can be transformed into a multi-target measure using this package.
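
For example, assuming a plain callable of the form measure(ŷ, y) returning a single number suffices for basic use (as the paragraph above suggests), a custom mean-absolute-error measure might be sketched like this:

using MLJ
using Statistics

my_mae(ŷ, y) = mean(abs.(ŷ .- y))   # a measure here is just a callable ŷ, y -> aggregated value

X, y = make_regression(100, 3)
Tree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0
evaluate(Tree(), X, y, resampling=CV(nfolds=3), measure=my_mae)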

Uses of measures

In MLJ, measures are specified, for example, when evaluating model performance with evaluate/evaluate!, when wrapping models using TunedModel or IteratedModel, and elsewhere.

Using LossFunctions.jl

In previous versions of MLJ, measures from LossFunctions.jl were also available. Now measures from that package must be explicitly imported and wrapped, as described here.

Receiver operator characteristics

A related performance evaluation tool provided by StatisticalMeasures.jl, and hence by MLJ, is the roc_curve method:

StatisticalMeasures.roc_curveFunction
roc_curve(ŷ, y) -> false_positive_rates, true_positive_rates, thresholds

Return data for plotting the receiver operator characteristic (ROC curve) for a binary classification problem.

Here ŷ is a vector of UnivariateFinite distributions (from CategoricalDistributions.jl) over the two values taken by the ground truth observations y, a CategoricalVector.

If there are k unique probabilities, then there are correspondingly k thresholds and k+1 "bins" over which the false positive and true positive rates are constant:

  • [0.0 - thresholds[1]]
  • [thresholds[1] - thresholds[2]]
  • ...
  • [thresholds[k] - 1]

Consequently, true_positive_rates and false_positive_rates have length k+1 if thresholds has length k.

To plot the curve using your favorite plotting backend, do something like plot(false_positive_rates, true_positive_rates).

Core algorithm: Functions.roc_curve

See also AreaUnderCurve.

source
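
A short sketch of typical usage (the logistic classifier and the synthetic make_moons data are chosen purely for illustration):

using MLJ
X, y = make_moons(200)                              # synthetic binary classification data
LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels verbosity=0
mach = machine(LogisticClassifier(), X, y) |> fit!
ŷ = predict(mach, X)                                # vector of UnivariateFinite distributions
fprs, tprs, ts = roc_curve(ŷ, y)
# plot(fprs, tprs)                                  # with your preferred plotting package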

Migration guide for changes to measures in MLJBase 1.0

Prior to MLJBase.jl 1.0 (respectively, MLJ.jl version 0.19.6) measures were defined in MLJBase.jl (a dependency of MLJ.jl), but they are now provided by the MLJ.jl dependency StatisticalMeasures.jl. Effects on users are detailed below:

Breaking behavior likely relevant to many users

  • If using MLJBase without MLJ, then, in Julia 1.9 or higher, StatisticalMeasures must be explicitly imported to use measures that were previously part of MLJBase. If using MLJ, then all previous measures are still available, with the exception of those corresponding to LossFunctions.jl (see below).

  • All measures return a single aggregated measurement. In other words, measures previously reporting a measurement per-observation (previously subtyping Unaggregated) no longer do so. To get per-observation measurements, use the new method StatisticalMeasures.measurements(measure, ŷ, y[, weights, class_weights]).

  • The default measure for regression models (used in evaluate/evaluate! when measures is unspecified) is changed from rms to l2=LPLoss(2) (mean squared error).

  • MeanAbsoluteError has been removed and instead mae is an alias for LPLoss(p=1).

  • Measures that previously skipped NaN values will now (at least by default) propagate those values. Missing value behavior is unchanged, except some measures that previously did not support missing now do.

  • Aliases for measure types have been removed. For example, RMSE (an alias for RootMeanSquaredError) is gone. Aliases for instances, such as rms and cross_entropy, persist. The exception is precision, for which ppv can be used in its place. (This is to avoid conflict with Base.precision, which was previously pirated.)

  • info(measure) has been decommissioned; query docstrings or access the new measure traits individually instead. These traits are now provided by StatisticalMeasures.jl and are not exported. For example, to access the orientation of the measure rms, do import StatisticalMeasures as SM; SM.orientation(rms).

  • Behavior of the measures() method, to list all measures and associated traits, has changed. It now returns a dictionary instead of a vector of named tuples; measures(predicate) is decommissioned, but measures(needle) is preserved. (This method, owned by StatisticalMeasures.jl, has some other search options, but is experimental.)

  • Measures that were wrappers of losses from LossFunctions.jl are no longer exposed by MLJBase or MLJ. To use such a loss, you must explicitly import LossFunctions and wrap the loss appropriately. See Using losses from LossFunctions.jl for examples.

  • Some user-defined measures working in previous versions of MLJBase.jl may not work without modification, as they must conform to the new StatisticalMeasuresBase.jl API. See this tutorial on how to define new measures.

  • Measures with a "feature argument" X, as in some_measure(ŷ, y, X), are no longer supported. See What is a measure? for allowed signatures in measures.

Packages implementing the MLJ model interface

The migration of measures is not expected to require any changes to the source code in packages providing implementations of the MLJ model interface (MLJModelInterface.jl), such as MLJDecisionTreeInterface.jl and MLJFlux.jl, and this is confirmed by extensive integration tests. However, some current tests will fail if they use MLJBase measures. The following should generally suffice to adapt such tests:

  • Add StatisticalMeasures as a test dependency, and add using StatisticalMeasures to your runtests.jl (and/or included submodules).

  • If measures are qualified, as in MLJBase.rms, then the qualification must be removed or changed to StatisticalMeasures.rms, etc.

  • Be aware that the default measure used in methods such as evaluate!, when measure is not specified, is changed from rms to l2 for regression models.

  • Be aware that all measures now report a single aggregated measurement, and never one per observation. See the second point above.

Breaking behavior possibly relevant to some developers

  • The abstract measure types Aggregated, Unaggregated, Measure have been decommissioned. (A measure is now defined purely by its calling behavior.)

  • What were previously exported as measure types are now only constructors.

  • target_scitype(measure) is decommissioned. Related is StatisticalMeasures.observation_scitype(measure) which declares an upper bound on the allowed scitype of a single observation.

  • prediction_type(measure) is decommissioned. Instead use StatisticalMeasures.kind_of_proxy(measure).

  • The trait reports_each_observation is decommissioned. Related is StatisticalMeasures.can_report_unaggregated; if false the new measurements method simply returns n copies of the aggregated measurement, where n is the number of observations provided, instead of individual observation-dependent measurements.

  • aggregation(measure) has been decommissioned. Instead use StatisticalMeasures.external_mode_of_aggregation(measure).

  • instances(measure) has been decommissioned; query docstrings for measure aliases, or follow this example: aliases = measures()[RootMeanSquaredError].aliases.

  • is_feature_dependent(measure) has been decommissioned. Measures consuming feature data are no longer supported; see above.

  • distribution_type(measure) has been decommissioned.

  • docstring(measure) has been decommissioned.

  • Behavior of aggregate has changed.

  • The following traits, previously exported by MLJBase and MLJ, cannot be applied to measures: supports_weights, supports_class_weights, orientation, human_name. Instead use the traits with these names provided by StatisticalMeasures.jl (they will need to be qualified, as in import StatisticalMeasures; StatisticalMeasures.orientation(measure)).

diff --git a/dev/preparing_data/index.html b/dev/preparing_data/index.html index 0b101fbf9..4c8e91633 100644 --- a/dev/preparing_data/index.html +++ b/dev/preparing_data/index.html @@ -1,5 +1,5 @@ -Preparing Data · MLJ

Preparing Data

Splitting data

MLJ has two tools for splitting data. To split data vertically (that is, to split by observations) use partition. This is commonly applied to a vector of observation indices, but can also be applied to datasets themselves, provided they are vectors, matrices or tables.

To split tabular data horizontally (i.e., break up a table based on feature names) use unpack.

MLJBase.partitionFunction
partition(X, fractions...;
+Preparing Data · MLJ

Preparing Data

Splitting data

MLJ has two tools for splitting data. To split data vertically (that is, to split by observations) use partition. This is commonly applied to a vector of observation indices, but can also be applied to datasets themselves, provided they are vectors, matrices or tables.

To split tabular data horizontally (i.e., break up a table based on feature names) use unpack.
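
The following sketch shows both kinds of split on a toy table (the column names are illustrative):

using MLJ

data = (x1 = rand(10), x2 = rand(10), y = rand(10))

# horizontal split: pull out the target column; remaining columns form the feature table
y, X = unpack(data, ==(:y))

# vertical split: 70/30 train/test split of the observation indices
train, test = partition(eachindex(y), 0.7, shuffle=true, rng=123)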

MLJBase.partitionFunction
partition(X, fractions...;
           shuffle=nothing,
           rng=Random.GLOBAL_RNG,
           stratify=nothing,
@@ -106,4 +106,4 @@
 │ admitted__no  │ Continuous │ Float64 │
 │ admitted__yes │ Continuous │ Float64 │
 └───────────────┴────────────┴─────────┘
-

Such transformations can also be combined in a pipeline; see Linear Pipelines.

Scientific type coercion

Scientific type coercion is documented in detail at ScientificTypesBase.jl. See also the tutorial at this MLJ Workshop (specifically, here) and this Data Science in Julia tutorial.

Also relevant is the section, Working with Categorical Data.

Data transformation

MLJ's Built-in transformers are documented at Transformers and Other Unsupervised Models. The most relevant in the present context are: ContinuousEncoder, OneHotEncoder, FeatureSelector and FillImputer. A Gaussian mixture model imputer is provided by BetaML, which can be loaded with

MissingImputator = @load MissingImputator pkg=BetaML

This MLJ Workshop, and the "End-to-end examples" in the Data Science in Julia tutorials, give further illustrations of data preprocessing in MLJ.

+

Such transformations can also be combined in a pipeline; see Linear Pipelines.

Scientific type coercion

Scientific type coercion is documented in detail at ScientificTypesBase.jl. See also the tutorial at this MLJ Workshop (specifically, here) and this Data Science in Julia tutorial.

Also relevant is the section, Working with Categorical Data.

Data transformation

MLJ's Built-in transformers are documented at Transformers and Other Unsupervised Models. The most relevant in the present context are: ContinuousEncoder, OneHotEncoder, FeatureSelector and FillImputer. A Gaussian mixture model imputer is provided by BetaML, which can be loaded with

MissingImputator = @load MissingImputator pkg=BetaML
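
For the common case of simple fill imputation with the built-in FillImputer mentioned above, a minimal sketch (the column names are illustrative):

using MLJ

X = (age = [23.0, missing, 41.0], height = [1.80, 1.65, missing])
imputer = FillImputer()
mach = machine(imputer, X) |> fit!
transform(mach, X)   # missings replaced by the column medians learned during fitting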

This MLJ Workshop, and the "End-to-end examples" in the Data Science in Julia tutorials, give further illustrations of data preprocessing in MLJ.

diff --git a/dev/quick_start_guide_to_adding_models/index.html b/dev/quick_start_guide_to_adding_models/index.html index 9858cbe46..eba5243cd 100644 --- a/dev/quick_start_guide_to_adding_models/index.html +++ b/dev/quick_start_guide_to_adding_models/index.html @@ -1,2 +1,2 @@ -Quick-Start Guide to Adding Models · MLJ
+Quick-Start Guide to Adding Models · MLJ
diff --git a/dev/simple_user_defined_models/index.html b/dev/simple_user_defined_models/index.html index bca77c7d3..6c52f5a90 100644 --- a/dev/simple_user_defined_models/index.html +++ b/dev/simple_user_defined_models/index.html @@ -1,5 +1,5 @@ -Simple User Defined Models · MLJ

Simple User Defined Models

To quickly implement a new supervised model in MLJ, it suffices to:

  • Define a mutable struct to store hyperparameters. This is either a subtype of Probabilistic or Deterministic, depending on whether probabilistic or ordinary point predictions are intended. This struct is the model.

  • Define a fit method, dispatched on the model, returning learned parameters, also known as the fitresult.

  • Define a predict method, dispatched on the model, and the fitresult, to return predictions on new patterns.

In the examples below, the training input X of fit, and the new input Xnew passed to predict, are tables. Each training target y is an AbstractVector.

The predictions returned by predict have the same form as y for deterministic models, but are Vectors of distributions for probabilistic models.

Advanced model functionality not addressed here includes: (i) optional update method to avoid redundant calculations when calling fit! on machines a second time; (ii) reporting extra training-related statistics; (iii) exposing model-specific functionality; (iv) checking the scientific type of data passed to your model in machine construction; and (v) checking the validity of hyperparameter values. All this is described in Adding Models for General Use.

For an unsupervised model, implement transform and, optionally, inverse_transform using the same signature as predict below.

A simple deterministic regressor

Here's a quick-and-dirty implementation of a ridge regressor with no intercept:

import MLJBase
+Simple User Defined Models · MLJ

Simple User Defined Models

To quickly implement a new supervised model in MLJ, it suffices to:

  • Define a mutable struct to store hyperparameters. This is either a subtype of Probabilistic or Deterministic, depending on whether probabilistic or ordinary point predictions are intended. This struct is the model.

  • Define a fit method, dispatched on the model, returning learned parameters, also known as the fitresult.

  • Define a predict method, dispatched on the model, and the fitresult, to return predictions on new patterns.

In the examples below, the training input X of fit, and the new input Xnew passed to predict, are tables. Each training target y is an AbstractVector.

The predictions returned by predict have the same form as y for deterministic models, but are Vectors of distributions for probabilistic models.

Advanced model functionality not addressed here includes: (i) optional update method to avoid redundant calculations when calling fit! on machines a second time; (ii) reporting extra training-related statistics; (iii) exposing model-specific functionality; (iv) checking the scientific type of data passed to your model in machine construction; and (v) checking the validity of hyperparameter values. All this is described in Adding Models for General Use.

For an unsupervised model, implement transform and, optionally, inverse_transform using the same signature as predict below.
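
For instance, a bare-bones unsupervised transformer might look something like the following sketch (MyScaler and its factor hyper-parameter are hypothetical, included only to show the signatures):

import MLJBase

mutable struct MyScaler <: MLJBase.Unsupervised
    factor::Float64
end

# no learned parameters: fit returns (fitresult, cache, report)
MLJBase.fit(::MyScaler, verbosity, X) = (nothing, nothing, nothing)

# transform has the same signature as predict in the supervised examples below
MLJBase.transform(model::MyScaler, fitresult, Xnew) =
    MLJBase.table(model.factor .* MLJBase.matrix(Xnew))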

A simple deterministic regressor

Here's a quick-and-dirty implementation of a ridge regressor with no intercept:

import MLJBase
 using LinearAlgebra
 
 mutable struct MyRegressor <: MLJBase.Deterministic
@@ -21,8 +21,8 @@
   lambda = 1.0)
julia> regressor = machine(model, X, y)untrained Machine; caches model-specific representations of data model: MyRegressor(lambda = 1.0) args: - 1: Source @422 ⏎ Table{AbstractVector{Continuous}} - 2: Source @260 ⏎ AbstractVector{Continuous}
julia> evaluate!(regressor, resampling=CV(), measure=rms, verbosity=0)PerformanceEvaluation object with these fields: + 1: Source @268 ⏎ Table{AbstractVector{Continuous}} + 2: Source @460 ⏎ AbstractVector{Continuous}
julia> evaluate!(regressor, resampling=CV(), measure=rms, verbosity=0)PerformanceEvaluation object with these fields: model, measure, operation, measurement, per_fold, per_observation, fitted_params_per_fold, report_per_fold, @@ -56,4 +56,4 @@ MLJBase.predict(model::MyClassifier, fitresult, Xnew) = [fitresult for r in 1:nrows(Xnew)]
julia> X, y = @load_iris;
julia> mach = machine(MyClassifier(), X, y) |> fit!;[ Info: Training machine(MyClassifier(), …).
julia> predict(mach, selectrows(X, 1:2))2-element Vector{UnivariateFinite{Multiclass{3}, String, UInt32, Float64}}: UnivariateFinite{Multiclass{3}}(setosa=>0.333, versicolor=>0.333, virginica=>0.333) - UnivariateFinite{Multiclass{3}}(setosa=>0.333, versicolor=>0.333, virginica=>0.333)
+ UnivariateFinite{Multiclass{3}}(setosa=>0.333, versicolor=>0.333, virginica=>0.333)
diff --git a/dev/target_transformations/index.html b/dev/target_transformations/index.html index 309400804..7ef5571e2 100644 --- a/dev/target_transformations/index.html +++ b/dev/target_transformations/index.html @@ -1,5 +1,5 @@ -Target Transformations · MLJ

Target Transformations

Some supervised models work best if the target variable has been standardized, i.e., rescaled to have zero mean and unit variance. Such a target transformation is learned from the values of the training target variable. In particular, one generally learns a different transformation when training on a proper subset of the training data. Good data hygiene prescribes that a new transformation should be computed each time the supervised model is trained on new data - for example in cross-validation.

Additionally, one generally wants to inverse transform the predictions of the supervised model for the final target predictions to be on the original scale.

All these concerns are addressed by wrapping the supervised model using TransformedTargetModel:

Ridge = @load RidgeRegressor pkg=MLJLinearModels verbosity=0
+Target Transformations · MLJ

Target Transformations

Some supervised models work best if the target variable has been standardized, i.e., rescaled to have zero mean and unit variance. Such a target transformation is learned from the values of the training target variable. In particular, one generally learns a different transformation when training on a proper subset of the training data. Good data hygiene prescribes that a new transformation should be computed each time the supervised model is trained on new data - for example in cross-validation.

Additionally, one generally wants to inverse transform the predictions of the supervised model for the final target predictions to be on the original scale.

All these concerns are addressed by wrapping the supervised model using TransformedTargetModel:

Ridge = @load RidgeRegressor pkg=MLJLinearModels verbosity=0
 ridge = Ridge(fit_intercept=false)
 ridge2 = TransformedTargetModel(ridge, transformer=Standardizer())
TransformedTargetModelDeterministic(
   model = RidgeRegressor(
@@ -75,4 +75,4 @@
 └──────────────────────────────────────┴─────────┘
 

Without the log transform (i.e., using ridge) we get a poorer mean absolute error, l1, of 3.9.

MLJBase.TransformedTargetModelFunction
TransformedTargetModel(model; transformer=nothing, inverse=nothing, cache=true)

Wrap the supervised or semi-supervised model in a transformation of the target variable.

Here transformer is one of the following:

  • The Unsupervised model that is to transform the training target. By default (inverse=nothing) the parameters learned by this transformer are also used to inverse-transform the predictions of model, which means transformer must implement the inverse_transform method. If this is not the case, specify inverse=identity to suppress inversion.

  • A callable object for transforming the target, such as y -> log.(y). In this case a callable inverse, such as z -> exp.(z), should be specified.

Specify cache=false to prioritize memory over speed, or to guarantee data anonymity.

Specify inverse=identity if model is a probabilistic predictor, as inverse-transforming sample spaces is not supported. Alternatively, replace model with a deterministic model, such as Pipeline(model, y -> mode.(y)).

Examples

A model that normalizes the target before applying ridge regression, with predictions returned on the original scale:

@load RidgeRegressor pkg=MLJLinearModels
 model = RidgeRegressor()
-tmodel = TransformedTargetModel(model, transformer=Standardizer())

A model that applies a static log transformation to the data, again returning predictions to the original scale:

tmodel2 = TransformedTargetModel(model, transformer=y->log.(y), inverse=z->exp.(z))
source
+tmodel = TransformedTargetModel(model, transformer=Standardizer())

A model that applies a static log transformation to the data, again returning predictions to the original scale:

tmodel2 = TransformedTargetModel(model, transformer=y->log.(y), inverse=z->exp.(z))
source
diff --git a/dev/third_party_packages/index.html b/dev/third_party_packages/index.html index 14f9237cd..5726ad9a7 100644 --- a/dev/third_party_packages/index.html +++ b/dev/third_party_packages/index.html @@ -1,2 +1,2 @@ -Third Party Packages · MLJ

Third Party Packages

A list of third-party packages that integrate with MLJ.

Last updated December 2020.

Pull requests to update this list are very welcome. Otherwise, you may post an issue requesting this here.

Packages providing models in the MLJ model registry

See List of Supported Models

Providing unregistered models:

Packages providing other kinds of functionality:

+Third Party Packages · MLJ

Third Party Packages

A list of third-party packages that integrate with MLJ.

Last updated December 2020.

Pull requests to update this list are very welcome. Otherwise, you may post an issue requesting this here.

Packages providing models in the MLJ model registry

See List of Supported Models

Providing unregistered models:

Packages providing other kinds of functionality:

diff --git a/dev/thresholding_probabilistic_predictors/index.html b/dev/thresholding_probabilistic_predictors/index.html index b64ec68bf..f56141baf 100644 --- a/dev/thresholding_probabilistic_predictors/index.html +++ b/dev/thresholding_probabilistic_predictors/index.html @@ -1,5 +1,5 @@ -Thresholding Probabilistic Predictors · MLJ

Thresholding Probabilistic Predictors

Although one can call predict_mode on a probabilistic binary classifier to get deterministic predictions, a more flexible strategy is to wrap the model using BinaryThresholdPredictor, as this allows the user to specify the threshold probability for predicting a positive class. This wrapping converts a probabilistic classifier into a deterministic one.

The positive class is always the second class returned when calling levels on the training target y.

MLJModels.BinaryThresholdPredictorType
BinaryThresholdPredictor(model; threshold=0.5)

Wrap the Probabilistic model, model, assumed to support binary classification, as a Deterministic model, by applying the specified threshold to the positive class probability. In addition to conventional supervised classifiers, it can also be applied to outlier detection models that predict normalized scores - in the form of appropriate UnivariateFinite distributions - that is, models that subtype AbstractProbabilisticUnsupervisedDetector or AbstractProbabilisticSupervisedDetector.

By convention the positive class is the second class returned by levels(y), where y is the target.

If threshold=0.5 then calling predict on the wrapped model is equivalent to calling predict_mode on the atomic model.

Example

Below is an application to the well-known Pima Indian diabetes dataset, including optimization of the threshold parameter, with high balanced accuracy as the objective. The target class distribution is 500 positives to 268 negatives.

Loading the data:

using MLJ, Random
+Thresholding Probabilistic Predictors · MLJ

Thresholding Probabilistic Predictors

Although one can call predict_mode on a probabilistic binary classifier to get deterministic predictions, a more flexible strategy is to wrap the model using BinaryThresholdPredictor, as this allows the user to specify the threshold probability for predicting a positive class. This wrapping converts a probabilistic classifier into a deterministic one.

The positive class is always the second class returned when calling levels on the training target y.

MLJModels.BinaryThresholdPredictorType
BinaryThresholdPredictor(model; threshold=0.5)

Wrap the Probabilistic model, model, assumed to support binary classification, as a Deterministic model, by applying the specified threshold to the positive class probability. In addition to conventional supervised classifiers, it can also be applied to outlier detection models that predict normalized scores - in the form of appropriate UnivariateFinite distributions - that is, models that subtype AbstractProbabilisticUnsupervisedDetector or AbstractProbabilisticSupervisedDetector.

By convention the positive class is the second class returned by levels(y), where y is the target.

If threshold=0.5 then calling predict on the wrapped model is equivalent to calling predict_mode on the atomic model.

Example

Below is an application to the well-known Pima Indian diabetes dataset, including optimization of the threshold parameter, with high balanced accuracy as the objective. The target class distribution is 500 positives to 268 negatives.

Loading the data:

using MLJ, Random
 rng = Xoshiro(123)
 
 diabetes = OpenML.load(43582)
@@ -23,4 +23,4 @@
 optimized_point_predictor = report(mach2).best_model
 optimized_point_predictor.threshold # 0.260
 predict(mach2, X)[1:3] # [1, 1, 0]

Estimating the performance of the auto-thresholding model (nested resampling here):

e = evaluate!(mach2, resampling=CV(nfolds=6), measure=[balanced, accuracy])
-e.measurement[1] # 0.477 ± 0.110
source
+e.measurement[1] # 0.477 ± 0.110
source
diff --git a/dev/transformers/index.html b/dev/transformers/index.html index 7acd0868a..33014f652 100644 --- a/dev/transformers/index.html +++ b/dev/transformers/index.html @@ -1,5 +1,5 @@ -Transformers and Other Unsupervised models · MLJ

Transformers and Other Unsupervised Models

Several unsupervised models used to perform common transformations, such as one-hot encoding, are available in MLJ out-of-the-box. These are detailed in Built-in transformers below.

A transformer is static if it has no learned parameters. While such a transformer is tantamount to an ordinary function, realizing it as an MLJ static transformer (a subtype of Static <: Unsupervised) can be useful, especially if the function depends on parameters the user would like to manipulate (which become hyper-parameters of the model). The necessary syntax for defining your own static transformers is described in Static transformers below.

Some unsupervised models, such as clustering algorithms, have a predict method in addition to a transform method. We give an example of this in Transformers that also predict.

Finally, we note that models that fit a distribution, or more generally a sampler object, to some data, which are sometimes viewed as unsupervised, are treated in MLJ as supervised models. See Models that learn a probability distribution for an example.

Built-in transformers

MLJModels.StandardizerType
Standardizer

A model type for constructing a standardizer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

Standardizer = @load Standardizer pkg=MLJModels

Do model = Standardizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Standardizer(features=...).

Use this model to standardize (whiten) a Continuous vector, or relevant columns of a table. The rescalings applied by this transformer to new data are always those learned during the training phase, which are generally different from what would actually standardize the new data.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table or any abstract vector with Continuous element scitype (any abstract float vector). Only features in a table with Continuous scitype can be standardized; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated below:

    • [] (empty, the default): standardize all features (columns) having Continuous element scitype

    • non-empty vector of feature names (symbols): standardize only the Continuous features in the vector (if ignore=false) or Continuous features not named in the vector (ignore=true).

    • function or other callable: standardize a feature if the callable returns true on its name. For example, Standardizer(features = name -> name in [:x1, :x3], ignore = true, count=true) has the same effect as Standardizer(features = [:x1, :x3], ignore = true, count=true), namely to standardize all Continuous and Count features, with the exception of :x1 and :x3.

    Note this behavior is further modified if the ordered_factor or count flags are set to true; see below

  • ignore=false: whether to ignore or standardize specified features, as explained above

  • ordered_factor=false: if true, standardize any OrderedFactor feature wherever a Continuous feature would be standardized, as described above

  • count=false: if true, standardize any Count feature wherever a Continuous feature would be standardized, as described above

Operations

  • transform(mach, Xnew): return Xnew with relevant features standardized according to the rescalings learned during fitting of mach.

  • inverse_transform(mach, Z): apply the inverse transformation to Z, so that inverse_transform(mach, transform(mach, Xnew)) is approximately the same as Xnew; unavailable if ordered_factor or count flags were set to true.

Fitted parameters

The fields of fitted_params(mach) are:

  • features_fit - the names of features that will be standardized

  • means - the corresponding untransformed mean values

  • stds - the corresponding untransformed standard deviations

Report

The fields of report(mach) are:

  • features_fit: the names of features that will be standardized

Examples

using MLJ
+Transformers and Other Unsupervised models · MLJ

Transformers and Other Unsupervised Models

Several unsupervised models used to perform common transformations, such as one-hot encoding, are available in MLJ out-of-the-box. These are detailed in Built-in transformers below.

A transformer is static if it has no learned parameters. While such a transformer is tantamount to an ordinary function, realizing it as an MLJ static transformer (a subtype of Static <: Unsupervised) can be useful, especially if the function depends on parameters the user would like to manipulate (which become hyper-parameters of the model). The necessary syntax for defining your own static transformers is described in Static transformers below.
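
As a foretaste, a static transformer can be as simple as the following sketch (the Averager type and its mix hyper-parameter are illustrative; the full treatment is in Static transformers below):

import MLJBase

mutable struct Averager <: MLJBase.Static
    mix::Float64
end

# static transformers learn nothing, so the second (fitresult) argument is ignored
MLJBase.transform(a::Averager, _, y1, y2) = (1 - a.mix)*y1 .+ a.mix*y2

One would then bind such a model to a machine with no training data, as in mach = machine(Averager(0.5)), and call transform(mach, y1, y2) to apply the mix to any pair of vectors.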

Some unsupervised models, such as clustering algorithms, have a predict method in addition to a transform method. We give an example of this in Transformers that also predict.

Finally, we note that models that fit a distribution, or more generally a sampler object, to some data, which are sometimes viewed as unsupervised, are treated in MLJ as supervised models. See Models that learn a probability distribution for an example.

Built-in transformers

MLJModels.StandardizerType
Standardizer

A model type for constructing a standardizer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

Standardizer = @load Standardizer pkg=MLJModels

Do model = Standardizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Standardizer(features=...).

Use this model to standardize (whiten) a Continuous vector, or relevant columns of a table. The rescalings applied by this transformer to new data are always those learned during the training phase, which are generally different from what would actually standardize the new data.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table or any abstract vector with Continuous element scitype (any abstract float vector). Only features in a table with Continuous scitype can be standardized; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated below:

    • [] (empty, the default): standardize all features (columns) having Continuous element scitype

    • non-empty vector of feature names (symbols): standardize only the Continuous features in the vector (if ignore=false) or Continuous features not named in the vector (ignore=true).

    • function or other callable: standardize a feature if the callable returns true on its name. For example, Standardizer(features = name -> name in [:x1, :x3], ignore = true, count=true) has the same effect as Standardizer(features = [:x1, :x3], ignore = true, count=true), namely to standardize all Continuous and Count features, with the exception of :x1 and :x3.

    Note this behavior is further modified if the ordered_factor or count flags are set to true; see below

  • ignore=false: whether to ignore or standardize specified features, as explained above

  • ordered_factor=false: if true, standardize any OrderedFactor feature wherever a Continuous feature would be standardized, as described above

  • count=false: if true, standardize any Count feature wherever a Continuous feature would be standardized, as described above

Operations

  • transform(mach, Xnew): return Xnew with relevant features standardized according to the rescalings learned during fitting of mach.

  • inverse_transform(mach, Z): apply the inverse transformation to Z, so that inverse_transform(mach, transform(mach, Xnew)) is approximately the same as Xnew; unavailable if ordered_factor or count flags were set to true.

Fitted parameters

The fields of fitted_params(mach) are:

  • features_fit - the names of features that will be standardized

  • means - the corresponding untransformed mean values

  • stds - the corresponding untransformed standard deviations

Report

The fields of report(mach) are:

  • features_fit: the names of features that will be standardized

Examples

using MLJ
 
 X = (ordinal1 = [1, 2, 3],
      ordinal2 = coerce([:x, :y, :x], OrderedFactor),
@@ -434,4 +434,4 @@
  (3, "virginica")
  (3, "virginica")
  (1, "virginica")
- (3, "virginica")
+ (3, "virginica")
diff --git a/dev/tuning_models/index.html b/dev/tuning_models/index.html index 5a8592899..6b309c525 100644 --- a/dev/tuning_models/index.html +++ b/dev/tuning_models/index.html @@ -1,5 +1,5 @@ -Tuning Models · MLJ

Tuning Models

MLJ provides several built-in and third-party options for optimizing a model's hyper-parameters. The quick-reference table below omits some advanced keyword options.

tuning strategy | notes | package to import | package providing the core algorithm
Grid(goal=nothing, resolution=10) | shuffled by default; goal is upper bound for number of grid points | MLJ.jl or MLJTuning.jl | MLJTuning.jl
RandomSearch(rng=GLOBAL_RNG) | with customizable priors | MLJ.jl or MLJTuning.jl | MLJTuning.jl
LatinHypercube(rng=GLOBAL_RNG) | with discrete parameter support | MLJ.jl or MLJTuning.jl | LatinHypercubeSampling
MLJTreeParzenTuning() | See this example for usage | TreeParzen.jl | TreeParzen.jl (port to Julia of hyperopt)
ParticleSwarm(n_particles=3, rng=GLOBAL_RNG) | Standard Kennedy-Eberhart algorithm, plus discrete parameter support | MLJParticleSwarmOptimization.jl | MLJParticleSwarmOptimization.jl
AdaptiveParticleSwarm(n_particles=3, rng=GLOBAL_RNG) | Zhan et al. variant with automated swarm coefficient updates, plus discrete parameter support | MLJParticleSwarmOptimization.jl | MLJParticleSwarmOptimization.jl
Explicit() | For an explicit list of models of varying type | MLJ.jl or MLJTuning.jl | MLJTuning.jl

Below we illustrate hyperparameter optimization using the Grid, RandomSearch, LatinHypercube and Explicit tuning strategies.

Overview

In MLJ model tuning is implemented as a model wrapper. After wrapping a model in a tuning strategy and binding the wrapped model to data in a machine called mach, calling fit!(mach) instigates a search for optimal model hyperparameters, within a specified range, and then uses all supplied data to train the best model. To predict using that model, one then calls predict(mach, Xnew). In this way, the wrapped model may be viewed as a "self-tuning" version of the unwrapped model. That is, wrapping the model simply transforms certain hyper-parameters into learned parameters.

A corollary of the tuning-as-wrapper approach is that the evaluation of the performance of a TunedModel instance using evaluate! implies nested resampling. This approach is inspired by MLR. See also below.

In MLJ, tuning is an iterative procedure, with an iteration parameter n, the total number of model instances to be evaluated. Accordingly, tuning can be controlled using MLJ's IteratedModel wrapper. After familiarizing oneself with the TunedModel wrapper described below, see Controlling model tuning for more on this advanced feature.

For a more in-depth overview of tuning in MLJ, or for implementation details, see the MLJTuning documentation. For a complete list of options see the TunedModel doc-string below.

Tuning a single hyperparameter using a grid search (regression example)

using MLJ
+Tuning Models · MLJ

Tuning Models

MLJ provides several built-in and third-party options for optimizing a model's hyper-parameters. The quick-reference table below omits some advanced keyword options.

tuning strategy | notes | package to import | package providing the core algorithm
Grid(goal=nothing, resolution=10) | shuffled by default; goal is upper bound for number of grid points | MLJ.jl or MLJTuning.jl | MLJTuning.jl
RandomSearch(rng=GLOBAL_RNG) | with customizable priors | MLJ.jl or MLJTuning.jl | MLJTuning.jl
LatinHypercube(rng=GLOBAL_RNG) | with discrete parameter support | MLJ.jl or MLJTuning.jl | LatinHypercubeSampling
MLJTreeParzenTuning() | See this example for usage | TreeParzen.jl | TreeParzen.jl (port to Julia of hyperopt)
ParticleSwarm(n_particles=3, rng=GLOBAL_RNG) | Standard Kennedy-Eberhart algorithm, plus discrete parameter support | MLJParticleSwarmOptimization.jl | MLJParticleSwarmOptimization.jl
AdaptiveParticleSwarm(n_particles=3, rng=GLOBAL_RNG) | Zhan et al. variant with automated swarm coefficient updates, plus discrete parameter support | MLJParticleSwarmOptimization.jl | MLJParticleSwarmOptimization.jl
Explicit() | For an explicit list of models of varying type | MLJ.jl or MLJTuning.jl | MLJTuning.jl

Below we illustrate hyperparameter optimization using the Grid, RandomSearch, LatinHypercube and Explicit tuning strategies.

Overview

In MLJ, model tuning is implemented as a model wrapper. After wrapping a model in a tuning strategy and binding the wrapped model to data in a machine called mach, calling fit!(mach) instigates a search for optimal model hyperparameters within a specified range, and then uses all supplied data to train the best model. To predict using that model, one then calls predict(mach, Xnew). In this way, the wrapped model may be viewed as a "self-tuning" version of the unwrapped model. That is, wrapping the model simply transforms certain hyper-parameters into learned parameters.

A corollary of the tuning-as-wrapper approach is that the evaluation of the performance of a TunedModel instance using evaluate! implies nested resampling. This approach is inspired by MLR. See also below.

In MLJ, tuning is an iterative procedure, with an iteration parameter n, the total number of model instances to be evaluated. Accordingly, tuning can be controlled using MLJ's IteratedModel wrapper. After familiarizing oneself with the TunedModel wrapper described below, see Controlling model tuning for more on this advanced feature.

For a more in-depth overview of tuning in MLJ, or for implementation details, see the MLJTuning documentation. For a complete list of options see the TunedModel doc-string below.

Tuning a single hyperparameter using a grid search (regression example)

using MLJ
 X = MLJ.table(rand(100, 10));
 y = 2X.x1 - X.x2 + 0.05*rand(100);
 Tree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0;
@@ -66,8 +66,8 @@
 fit!(mach, verbosity=0)
trained Machine; does not cache data
   model: DeterministicTunedModel(model = DecisionTreeRegressor(max_depth = -1, …), …)
   args: 
-    1:	Source @836 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @834 ⏎ AbstractVector{Continuous}
+    1:	Source @944 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @118 ⏎ AbstractVector{Continuous}
 

We can inspect the detailed results of the grid search with report(mach) or just retrieve the optimal model, as here:

fitted_params(mach).best_model
DecisionTreeRegressor(
   max_depth = -1, 
   min_samples_leaf = 5, 
@@ -109,8 +109,8 @@
 fit!(mach, verbosity=0);
trained Machine; does not cache data
   model: ProbabilisticTunedModel(model = KNNClassifier(K = 5, …), …)
   args: 
-    1:	Source @437 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @873 ⏎ AbstractVector{Multiclass{3}}
+    1:	Source @253 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @227 ⏎ AbstractVector{Multiclass{3}}
 

Case (ii) - deterministic measure:

self_tuning_knn = TunedModel(
     model=knn,
     resampling = CV(nfolds=4, rng=1234),
@@ -123,8 +123,8 @@
 fit!(mach, verbosity=0);
trained Machine; does not cache data
   model: ProbabilisticTunedModel(model = KNNClassifier(K = 5, …), …)
   args: 
-    1:	Source @727 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @740 ⏎ AbstractVector{Multiclass{3}}
+    1:	Source @172 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @391 ⏎ AbstractVector{Multiclass{3}}
 

Let's inspect the best model and corresponding evaluation of the metric in case (ii):

entry = report(mach).best_history_entry
(model = KNNClassifier(K = 9, …),
  measure = StatisticalMeasuresBase.RobustMeasure{StatisticalMeasuresBase.FussyMeasure{StatisticalMeasuresBase.RobustMeasure{StatisticalMeasuresBase.Multimeasure{StatisticalMeasuresBase.SupportsMissingsMeasure{StatisticalMeasures.MisclassificationRateOnScalars}, Nothing, StatisticalMeasuresBase.Mean, typeof(identity)}}, Nothing}}[MisclassificationRate()],
  measurement = [0.02666666666666667],
@@ -181,8 +181,8 @@
 fit!(mach, verbosity=0);
trained Machine; does not cache data
   model: DeterministicTunedModel(model = DeterministicEnsembleModel(model = DecisionTreeRegressor(max_depth = -1, …), …), …)
   args: 
-    1:	Source @023 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @976 ⏎ AbstractVector{Continuous}
+    1:	Source @829 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @281 ⏎ AbstractVector{Continuous}
 

We can plot the grid search results:

using Plots
 plot(mach)

Instead of specifying a goal, we can declare a global resolution, which is overridden for a particular parameter by pairing its range with the resolution desired. In the next example, the default resolution=100 is applied to the r2 field, but a resolution of 3 is applied to the r1 field. Additionally, we ask that the grid points be randomly traversed and the total number of evaluations be limited to 25.

tuning = Grid(resolution=100, shuffle=true, rng=1234)
 self_tuning_forest = TunedModel(
@@ -196,8 +196,8 @@
 fit!(machine(self_tuning_forest, X, y), verbosity=0);
trained Machine; does not cache data
   model: DeterministicTunedModel(model = DeterministicEnsembleModel(model = DecisionTreeRegressor(max_depth = -1, …), …), …)
   args: 
-    1:	Source @550 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @017 ⏎ AbstractVector{Continuous}
+    1:	Source @617 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @790 ⏎ AbstractVector{Continuous}
 

For more options for a grid search, see Grid below.

Let's attempt to tune the same hyperparameters using a RandomSearch tuning strategy. By default, bounded numeric ranges like r1 and r2 are sampled uniformly (before rounding, in the case of the integer range r1). Positive unbounded ranges are sampled using a Gamma distribution by default, and all others using a (truncated) normal distribution.

self_tuning_forest = TunedModel(
     model=forest,
     tuning=RandomSearch(),
@@ -212,8 +212,8 @@
 fit!(mach, verbosity=0)
trained Machine; does not cache data
   model: DeterministicTunedModel(model = DeterministicEnsembleModel(model = DecisionTreeRegressor(max_depth = -1, …), …), …)
   args: 
-    1:	Source @784 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @612 ⏎ AbstractVector{Continuous}
+    1:	Source @709 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @334 ⏎ AbstractVector{Continuous}
 
using Plots
 plot(mach)

The prior distributions used for sampling each hyperparameter can be customized, as can the global fallbacks. See the RandomSearch doc-string below for details.
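
For instance, the following sketch reuses forest, r1 and r2 from above and pairs r2 with an explicit sampling distribution; the particular distribution and its parameters are illustrative only, and Distributions.jl must be available:

using MLJ
import Distributions

self_tuning_forest = TunedModel(
    model=forest,
    tuning=RandomSearch(rng=123),
    resampling=CV(nfolds=6),
    range=[r1, (r2, Distributions.Normal(0.7, 0.1))],   # (range, prior) pair
    measure=rms,
    n=25);
mach = machine(self_tuning_forest, X, y);
fit!(mach, verbosity=0);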

Tuning using Latin hypercube sampling

One can also tune the hyperparameters using the LatinHypercube tuning strategy. This method uses a genetic optimization algorithm based on the inverse of the Audze-Eglais function, provided by the library LatinHypercubeSampling.jl.

We'll work with the data X, y and ranges r1 and r2 defined above, and instantiate a Latin hypercube tuning strategy:

latin = LatinHypercube(gens=2, popsize=120)
LatinHypercube(
   gens = 2, 
@@ -236,8 +236,8 @@
 fit!(mach, verbosity=0)
trained Machine; does not cache data
   model: DeterministicTunedModel(model = DeterministicEnsembleModel(model = DecisionTreeRegressor(max_depth = -1, …), …), …)
   args: 
-    1:	Source @005 ⏎ Table{AbstractVector{Continuous}}
-    2:	Source @803 ⏎ AbstractVector{Continuous}
+    1:	Source @248 ⏎ Table{AbstractVector{Continuous}}
+    2:	Source @511 ⏎ AbstractVector{Continuous}
 
using Plots
 plot(mach)

Comparing models of different type and nested cross-validation

Instead of mutating hyperparameters of a fixed model, one can optimize over an explicit list of models, whose types are allowed to vary. As with other tuning strategies, evaluating the resulting TunedModel itself implies nested resampling (e.g., nested cross-validation), which we now examine in a bit more detail.

tree = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()
 knn = (@load KNNClassifier pkg=NearestNeighborModels verbosity=0)()
@@ -351,4 +351,4 @@
                          tuning=LatinHypercube(...),
                          range=...,
                          measures=...,
-                         n=...)

(See TunedModel for complete options.)

To use a periodic version of the Audze-Eglais function (to reduce clustering along the boundaries) specify periodic_ae = true.
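
For example:

LatinHypercube(gens=2, popsize=120, periodic_ae=true)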

Supported ranges:

A single one-dimensional range or vector of one-dimensional ranges can be specified. Specifically, in LatinHypercubeSampling search, the range field of a TunedModel instance can be:

  • A single one-dimensional range - ie, a ParamRange object r - constructed using the range method

  • Any vector of objects of the above form

Both NumericRanges and NominalRanges are supported, and hyper-parameter values are sampled on a scale specified by the range (eg, r.scale = :log).
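
For example, here is a sketch of a log-scaled numeric range; the RidgeRegressor and its lambda field are used for illustration only and require MLJLinearModels.jl:

using MLJ
Ridge = @load RidgeRegressor pkg=MLJLinearModels verbosity=0
r = range(Ridge(), :lambda, lower=1e-3, upper=10.0, scale=:log)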

source
+ n=...)

diff --git a/dev/weights/index.html index 09814ad58..b2194406a 100644 --- a/dev/weights/index.html +++ b/dev/weights/index.html @@ -1,5 +1,5 @@ -Weights · MLJ

+Weights · MLJ

Weights

In machine learning it is possible to assign each observation an independent significance, or weight, either in training or in performance evaluation, or both.

There are two kinds of weights in use in MLJ:

  • per observation weights (also just called weights) refer to weight vectors of the same length as the number of observations

  • class weights refer to dictionaries keyed on the target classes (levels) for use in classification problems

Specifying weights in training

To specify weights in training you bind the weights to the model along with the data when constructing a machine. For supervised models the weights are specified last:

KNNRegressor = @load KNNRegressor
 model = KNNRegressor()
 X, y = make_regression(10, 3)
 w = rand(length(y))
@@ -9,4 +9,4 @@
 end

A model supports class weights if supports_class_weights(model) is true.
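
Here is a hedged sketch of binding class weights at training time; the classifier choice and the weight values are illustrative, and the guarded block does nothing if the model turns out not to support class weights:

using MLJ
X, y = @load_iris
class_weights = Dict("setosa" => 1.0, "versicolor" => 2.0, "virginica" => 3.0)
model = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()
if supports_class_weights(model)
    # class weights, like per-observation weights, are bound as the final training argument
    mach = machine(model, X, y, class_weights) |> fit!
end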

Specifying weights in performance evaluation

When calling a measure (metric) that supports weights, provide the weights as the last argument, as in

_, y = @load_iris
 ŷ = shuffle(y)
 w = Dict("versicolor" => 1, "setosa" => 2, "virginica"=> 3)
-macro_f1score(ŷ, y, w)

+macro_f1score(ŷ, y, w)

Some measures also support specification of a class weight dictionary. For details see the StatisticalMeasures.jl tutorial.

To pass weights to all the measures listed in an evaluate!/evaluate call, use the keyword specifiers weights=... or class_weights=.... For details, see Evaluating Model Performance.
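
For example, here is a minimal sketch; the model and the weight values are illustrative:

using MLJ
X, y = @load_iris
model = (@load DecisionTreeClassifier pkg=DecisionTree verbosity=0)()
w = Dict("versicolor" => 1, "setosa" => 2, "virginica" => 3)
evaluate(model, X, y,
         resampling=CV(nfolds=3),
         measure=macro_f1score,
         class_weights=w)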

diff --git a/dev/working_with_categorical_data/index.html b/dev/working_with_categorical_data/index.html index 3decb130c..9d6e3a651 100644 --- a/dev/working_with_categorical_data/index.html +++ b/dev/working_with_categorical_data/index.html @@ -1,5 +1,5 @@ -Working with Categorical Data · MLJ

+Working with Categorical Data · MLJ

Working with Categorical Data

Scientific types for discrete data

Recall that models articulate their data requirements using scientific types (see Getting Started or the ScientificTypes.jl documentation). There are three scientific types discrete data can have: Count, OrderedFactor and Multiclass.

Count data

In MLJ you cannot use integers to represent (finite) categorical data. Integers are reserved for discrete data you want interpreted as Count <: Infinite:

scitype([1, 4, 5, 6])
AbstractVector{Count} (alias for AbstractArray{Count, 1})

The Count scientific type includes things like the number of phone calls, city populations, and other "frequency" data of a generally unbounded nature.

That said, you may have data that is theoretically Count, but which you coerce to OrderedFactor to enable the use of more models, trusting to your knowledge of how those models work to inform an appropriate interpretation.
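
For example:

using MLJ
v = coerce([1, 4, 5, 6], OrderedFactor)
scitype(v)    # AbstractVector{OrderedFactor{4}}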

OrderedFactor and Multiclass data

Other integer data, such as the number of an animal's legs or the number of rooms in a home, is generally coerced to OrderedFactor <: Finite. The other categorical scientific type is Multiclass <: Finite, which is for unordered categorical data. Coercing data to one of these two forms is discussed under Detecting and coercing improperly represented categorical data below.

Binary data

There is no separate scientific type for binary data. Binary data is OrderedFactor{2} if ordered, and Multiclass{2} otherwise. Data with type OrderedFactor{2} is considered to have an intrinsic "positive" class, e.g., the outcome of a medical test or the "pass/fail" outcome of an exam. MLJ measures, such as true_positive, assume the second class in the ordering is the "positive" class. Inspecting and changing the order are discussed in the next section.

If data has type Bool, it is considered Count data (since Bool <: Integer) and, generally, users will want to coerce such data to Multiclass or OrderedFactor.

Detecting and coercing improperly represented categorical data

One inspects the scientific type of data using scitype as shown above. To inspect all column scientific types in a table simultaneously, use schema. (The scitype(X) of a table X contains a condensed form of this information used in type dispatch; see here.)

import DataFrames: DataFrame
 X = DataFrame(
     name = ["Siri", "Robo", "Alexa", "Cortana"],
     gender = ["male", "male", "Female", "female"],
@@ -108,4 +108,4 @@
  UnivariateFinite{Multiclass{3}}(no=>0.245, yes=>0.755)
  UnivariateFinite{Multiclass{3}}(no=>0.447, yes=>0.553)
  UnivariateFinite{Multiclass{3}}(no=>0.509, yes=>0.491)
- UnivariateFinite{Multiclass{3}}(no=>0.218, yes=>0.782)

+ UnivariateFinite{Multiclass{3}}(no=>0.218, yes=>0.782)

Or, equivalently:

d_vec = UnivariateFinite(["no", "yes"], yes_probs, augment=true, pool=v)

For more options, see UnivariateFinite.