Merge pull request #206 from JuliaAI/dev
Generate new docs.
ablaom authored Jul 3, 2024
2 parents 8107f2c + 7b7128f commit fae094c
Showing 4 changed files with 89 additions and 29 deletions.
38 changes: 21 additions & 17 deletions docs/make.jl
@@ -10,26 +10,30 @@ makedocs(;
"Quick-start guide" => "quick_start_guide.md",
"The model type hierarchy" => "the_model_type_hierarchy.md",
"New model type declarations" => "type_declarations.md",
"Supervised models" => "supervised_models.md",
"Summary of methods" => "summary_of_methods.md",
"The form of data for fitting and predicting" => "form_of_data.md",
"The fit method" => "the_fit_method.md",
"The fitted_params method" => "the_fitted_params_method.md",
"The predict method" => "the_predict_method.md",
"The predict_joint method" => "the_predict_joint_method.md",
"Training losses" => "training_losses.md",
"Feature importances" => "feature_importances.md",
"Trait declarations" => "trait_declarations.md",
"Iterative models and the update! method" => "iterative_models.md",
"Implementing a data front end" => "implementing_a_data_front_end.md",
"Supervised models with a transform method" =>
"supervised_models_with_transform.md",
"Models that learn a probability distribution" => "fitting_distributions.md",
"Serialization" => "serialization.md",
"Document strings" => "document_strings.md",
"Supervised models" => [
"Introduction" => "supervised_models.md",
"Summary of methods" => "summary_of_methods.md",
"The form of data for fitting and predicting" => "form_of_data.md",
"The fit method" => "the_fit_method.md",
"The fitted_params method" => "the_fitted_params_method.md",
"The predict method" => "the_predict_method.md",
"The predict_joint method" => "the_predict_joint_method.md",
"Training losses" => "training_losses.md",
"Feature importances" => "feature_importances.md",
"Trait declarations" => "trait_declarations.md",
"Iterative models and the update! method" => "iterative_models.md",
"Implementing a data front end" => "implementing_a_data_front_end.md",
"Supervised models with a transform method" =>
"supervised_models_with_transform.md",
"Models that learn a probability distribution" =>
"fitting_distributions.md",
],
"Unsupervised models" => "unsupervised_models.md",
"Static models" => "static_models.md",
"Outlier detection models" => "outlier_detection_models.md",
"Model wrappers" => "model_wrappers.md",
"Serialization" => "serialization.md",
"Document strings" => "document_strings.md",
"Convenience methods" => "convenience_methods.md",
"Where to place code implementing new models" => "where_to_put_code.md",
"How to add models to the MLJ Model Registry" => "how_to_register.md",
2 changes: 1 addition & 1 deletion docs/src/index.md
@@ -8,7 +8,7 @@ API defined there, as outlined in this document.

!!! tip

This is a reference document, which has become rather sprawling over the evolution of the MLJ project. We recommend starting with [Quick start guide](@ref), which covers the main points relevant to most new model implementations.
This is a reference document, which has become rather sprawling over the evolution of the MLJ project. We recommend starting with [Quick start guide](@ref), which covers the main points relevant to most new model implementations. Most topics are only detailed for `Supervised` models, so if you are implementing another kind of model, you may still need to refer to the [Supervised models](@ref) section.

Interface code can be hosted by the package providing the core machine learning algorithm,
or by a stand-alone "interface-only" package, using the template
29 changes: 29 additions & 0 deletions docs/src/model_wrappers.md
@@ -0,0 +1,29 @@
# Model wrappers

A model that can have one or more other models as hyper-parameters should overload the trait `is_wrapper`, as in this example:

```julia
MLJModelInterface.is_wrapper(::Type{<:MyWrapper}) = true
```

The constructor for such a model does not need to provide default values for the model-valued
hyper-parameters. If only a single model is wrapped, then the hyper-parameter should have
the name `:model` and this should be an optional positional argument, as well as a keyword
argument.

For example, `EnsembleModel` is a model wrapper, and we can construct an instance like this:

```julia
using MLJ
atom = ConstantClassifier()
EnsembleModel(atom, n=100)
```

but also like this:

```julia
EnsembleModel(model=atom, n=100)
```

This is the only case in MLJ where positional arguments in a model constructor are
allowed.
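
The constructor convention described above can be sketched as follows. This is a minimal illustration, not the implementation of `EnsembleModel`: the types `MyAtom` and `MyWrapper`, the hyper-parameter `n`, and its default value are all hypothetical. The only MLJModelInterface names assumed are the abstract type `Deterministic` and the trait `is_wrapper`.

```julia
import MLJModelInterface
const MMI = MLJModelInterface

# Hypothetical atomic model, used only as something to wrap.
mutable struct MyAtom <: MMI.Deterministic
    lambda::Float64
end
MyAtom(; lambda=1.0) = MyAtom(lambda)

# Hypothetical wrapper. Only one model is wrapped, so the field is named `model`.
mutable struct MyWrapper{M} <: MMI.Deterministic
    model::M
    n::Int
end

# Keyword form: no default is supplied for the model-valued hyper-parameter.
MyWrapper(; model, n=10) = MyWrapper(model, n)

# Positional form: the wrapped model may instead be given as the first argument.
MyWrapper(model; n=10) = MyWrapper(model, n)

# Declare that `MyWrapper` wraps another model.
MMI.is_wrapper(::Type{<:MyWrapper}) = true

# Both calls construct a wrapper around a `MyAtom` instance.
w1 = MyWrapper(MyAtom(), n=3)
w2 = MyWrapper(model=MyAtom(), n=3)
```

Omitting the default for `model` in the keyword form means that `MyWrapper()` throws an error instead of silently choosing an atomic model.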
49 changes: 38 additions & 11 deletions docs/src/unsupervised_models.md
@@ -5,11 +5,10 @@ similar fashion. The main differences are:

- The `fit` method, which still returns `(fitresult, cache, report)` will typically have
only one training argument `X`, as in `MLJModelInterface.fit(model, verbosity, X)`,
although this is not a hard requirement. For example, a feature selection tool (wrapping
some supervised model) might also include a target `y` as input. Furthermore, in the
case of models that subtype `Static <: Unsupervised` (see [Static
models](@ref)) `fit` has no training arguments at all, but does not need to be
implemented as a fallback returns `(nothing, nothing, nothing)`.
although this is not a hard requirement; see [Transformers requiring a target variable
in training](@ref) below. Furthermore, in the case of models that subtype `Static <:
Unsupervised` (see [Static models](@ref)) `fit` has no training arguments at all, but
does not need to be implemented, because a fallback returns `(nothing, nothing, nothing)`.

- A `transform` and/or `predict` method is implemented, and has the same signature as
`predict` does in the supervised case, as in `MLJModelInterface.transform(model,
@@ -27,15 +26,43 @@ similar fashion. The main differences are:
argument, you must overload the trait `fit_data_scitype`, which bounds the allowed
`data` passed to `fit(model, verbosity, data...)` and will always be a `Tuple` type.

- An `inverse_transform` can be optionally implemented. The signature
is the same as `transform`, as in
`MLJModelInterface.inverse_transform(model, fitresult, Xout)`, which:
- An `inverse_transform` can be optionally implemented. The signature is the same as
`transform`, as in `MLJModelInterface.inverse_transform(model::MyUnsupervisedModel,
fitresult, Xout)`, which:
- must make sense for any `Xout` for which `scitype(Xout) <:
output_scitype(SomeSupervisedModel)` (see below); and
output_scitype(MyUnsupervisedModel)`; and
- must return an object `Xin` satisfying `scitype(Xin) <:
input_scitype(SomeSupervisedModel)`.
input_scitype(MyUnsupervisedModel)`.

For sample implementatations, see MLJ's [built-in
For sample implementations, see MLJ's [built-in
transformers](https://github.com/JuliaAI/MLJModels.jl/blob/dev/src/builtins/Transformers.jl)
and the clustering models at
[MLJClusteringInterface.jl](https://github.com/jbrea/MLJClusteringInterface.jl).
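
The `fit`, `transform` and `inverse_transform` signatures listed above can be sketched with a toy transformer. Everything about `MyCenterer` is hypothetical, including the choice to operate on a plain vector of floats instead of a table; it only illustrates the shape of the three methods and the scitype declarations.

```julia
import MLJModelInterface
const MMI = MLJModelInterface

# Hypothetical model: subtracts the training mean from a vector.
mutable struct MyCenterer <: MMI.Unsupervised end

# `fit` has a single training argument `X` and returns `(fitresult, cache, report)`.
function MMI.fit(::MyCenterer, verbosity, X)
    fitresult = sum(X) / length(X)  # the learned mean
    cache = nothing
    report = nothing
    return fitresult, cache, report
end

# Same signature as `predict` in the supervised case.
MMI.transform(::MyCenterer, fitresult, Xnew) = Xnew .- fitresult

# Optional: maps output of `transform` back to the input space.
MMI.inverse_transform(::MyCenterer, fitresult, Xout) = Xout .+ fitresult

# Scitype bounds for the input and output of `transform`.
MMI.input_scitype(::Type{<:MyCenterer}) = AbstractVector{MMI.Continuous}
MMI.output_scitype(::Type{<:MyCenterer}) = AbstractVector{MMI.Continuous}

model = MyCenterer()
X = [1.0, 2.0, 3.0]
fitresult, cache, report = MMI.fit(model, 0, X)
Xout = MMI.transform(model, fitresult, X)
Xin = MMI.inverse_transform(model, fitresult, Xout)
```

Here `Xin` recovers `X`, which is the round-trip behaviour that the two scitype conditions on `inverse_transform` are meant to guarantee.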

## Transformers requiring a target variable in training

An `Unsupervised` model that is not `Static` may include a second argument `y` in its
`fit` signature, as in `fit(::MyTransformer, verbosity, X, y)`. For example, some feature
selection tools require a target variable `y` in training. (Unlike `Supervised` models, an
`Unsupervised` model is not required to implement `predict`, and in pipelines it is the
output of `transform`, and not `predict`, that is always propagated to the next model.) Such a
model should overload the trait `target_in_fit`, as in this example:

```julia
MLJModelInterface.target_in_fit(::Type{<:MyTransformer}) = true
```

This ensures that such models can appear in pipelines, and that a target provided to the
pipeline model is passed on to the model in training.

If the model implements more than one `fit` signature (e.g., one with a target `y` and one
without) then `fit_data_scitype` must also be overloaded, as in this example:

```julia
MLJModelInterface.fit_data_scitype(::Type{<:MyTransformer}) = Union{
    Tuple{Table(Continuous)},
    Tuple{Table(Continuous), AbstractVector{<:Finite}},
}
```
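
As a rough sketch of a transformer that sees a target in training, consider a toy feature selector. `MySelector`, its hyper-parameter `k`, and the covariance-based helper `score` are hypothetical. The sketch also assumes that `X` is a column table in the form of a named tuple of numeric vectors and that `y` is numeric, so it does not cover the `Finite` target shown in the `fit_data_scitype` example.

```julia
import MLJModelInterface
const MMI = MLJModelInterface

# Hypothetical selector: keeps the `k` features that co-vary most with the target.
mutable struct MySelector <: MMI.Unsupervised
    k::Int
end
MySelector(; k=1) = MySelector(k)

# Hypothetical scoring rule: absolute value of the (unnormalized) covariance.
center(v) = v .- sum(v) / length(v)
score(x, y) = abs(sum(center(x) .* center(y)))

# `fit` takes the target `y` as a second training argument.
function MMI.fit(model::MySelector, verbosity, X, y)
    features = collect(keys(X))
    scores = [score(X[name], y) for name in features]
    keep = features[sortperm(scores, rev=true)[1:model.k]]
    fitresult = Tuple(keep)  # names of the selected features
    return fitresult, nothing, nothing
end

# `transform` only needs the features; the target is not used here.
MMI.transform(::MySelector, fitresult, Xnew) = NamedTuple{fitresult}(Xnew)

# Declare that `fit` consumes a target, so that pipelines pass one on.
MMI.target_in_fit(::Type{<:MySelector}) = true

X = (a = [1.0, 2.0, 3.0, 4.0], b = [1.0, -1.0, 1.0, -1.0])
y = [2.0, 4.0, 6.0, 8.0]
model = MySelector(k=1)
fitresult, _, _ = MMI.fit(model, 0, X, y)
Xsmall = MMI.transform(model, fitresult, X)
```

Only one `fit` signature is implemented in this sketch, so `fit_data_scitype` is not overloaded.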

