
Content suggestion for features #14

ablaom opened this issue Aug 2, 2024 · 0 comments
ablaom commented Aug 2, 2024

Since I had some trouble adding markdown content, I'm including this here for now:


Matching models to tasks

A model registry stores detailed metadata for over 200 models, so model documentation can be searched without loading any model code.

julia> X, y = @load_iris
julia> models(matching(X, y))
54-element Vector
 (name = AdaBoostClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = AdaBoostStumpClassifier, package_name = DecisionTree, ... )
 (name = BaggingClassifier, package_name = MLJScikitLearnInterface, ... )
 ⋮

julia> models("pca")
 (name = PCA, package_name = MultivariateStats, ... )
 (name = PCADetector, package_name = OutlierDetectionPython, ... )
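
The registry entry for an individual model can likewise be inspected without loading code. A minimal sketch, assuming MLJ's `info` and `doc` methods, applied to the PCA model found in the search above:

```julia
julia> info("PCA", pkg="MultivariateStats")
# returns the full metadata entry (hyper-parameters, input scitype, etc.)

julia> doc("PCA", pkg="MultivariateStats")
# returns the model's document string, without loading MultivariateStats
```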

Tuning is a wrapper

For improved composability, and to mitigate data hygiene issues, many
meta-algorithms are implemented as model wrappers.

In this way a model wrapped in a tuning strategy, for example, becomes a "self-tuning"
model, with all data resampling (e.g., cross-validation) managed under the hood.

model = XGBoostRegressor()
r1 = range(model, :max_depth, lower=3, upper=10)
r2 = range(model, :gamma, lower=0, upper=10, scale=:log)
tuned_model = TunedModel(model, range=[r1, r2], resampling=CV(), measure=l2)

# optimise and retrain on all data:
mach = machine(tuned_model, X, y) |> fit!

predict(mach, Xnew)      # prediction using optimized params
report(mach).best_model  # inspect optimisation results

Tunable nested parameters

Creating pipelines, or wrapping models in meta-algorithms, such as iteration control,
creates nested hyper-parameters. Such parameters can be optimized like any other.

julia> pipe = ContinuousEncoder() |> RidgeRegressor()
DeterministicPipeline(
  continuous_encoder = ContinuousEncoder(
    drop_last = false,
    one_hot_ordered_factors = false),
  ridge_regressor = RidgeRegressor(
    lambda = 1.0,
    fit_intercept = true,
    penalize_intercept = false,
    scale_penalty_with_samples = true,
    solver = nothing),
  cache = true)

julia> r = range(pipe, :(ridge_regressor.lambda), lower=0.001, upper=10.0)
julia> tuned_model = TunedModel(pipe, range=r, resampling=CV(), measure=l2)
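
Fitting then proceeds exactly as for an atomic model. A sketch, assuming data X, y as in the earlier examples, and that the optimal pipeline is exposed as best_model in the training report:

```julia
mach = machine(tuned_model, X, y) |> fit!

# inspect the optimised value of the nested parameter:
report(mach).best_model.ridge_regressor.lambda
```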

Smart pipelines

Conventional model pipelines are available out of the box. Hyper-parameters of different
model components can be tuned simultaneously, but only the necessary components are
retrained in each pipeline evaluation. Training reports expose the reports of the
individual components, and the same holds for learned parameters.

pipe = OneHotEncoder() |> PCA(maxoutdim=3) |> DecisionTreeClassifier()
mach = machine(pipe, X, y) |> fit!

# get actual PCA reduction dimension:
report(mach).pca.outdim

# get the tree:
fitted_params(mach).decision_tree_classifier.tree

Iteration control

MLJ provides a rich supply of iterative model "controls", such as early stopping criteria,
snapshots, and callbacks for visualization. Any model with an iteration parameter can be
wrapped in such controls, the iteration parameter becoming an additional learned
parameter.

model = EvoTreeRegressor()
controls = [Step(1), Patience(5), TimeLimit(1/60), InvalidValue()]

iterated_model = IteratedModel(
    model;
    controls,
    measure=l2,
    resampling=Holdout(),
    retrain=true,
)

# train on holdout to find `nrounds` and retrain on all data:
mach = machine(iterated_model, X, y) |> fit!
predict(mach, Xnew) # predict on new data

Composition beyond pipelines

In principle, any MLJ workflow is readily transformed into a lazily executed learning
network.

For example, in the code block below, fit! triggers training of both models
in parallel. Mutate a hyper-parameter of model1, call fit! again, and only model1's
learned parameters are updated.

Learning networks can be exported as new stand-alone model types. MLJ's pipelines and
stacks are actually implemented using learning networks.

X, y = source.((X, y)) # wrap data in "source nodes"

# a normal MLJ workflow, with training omitted:
mach1 = machine(model1, X, y)
mach2 = machine(model2, X, y)
y1 = predict(mach1, X) # a callable "node"
y2 = predict(mach2, X)

yhat = 0.5*(y1 + y2)
fit!(yhat, acceleration=CPUThreads())

yhat(Xnew) # blended prediction on new data
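
The export step mentioned above can be sketched as follows. This is a hedged sketch, assuming a recent MLJ version providing the DeterministicNetworkComposite type and the prefit interface; the name Blended is hypothetical:

```julia
import MLJBase

# a new stand-alone model type wrapping the blending network:
mutable struct Blended <: MLJBase.DeterministicNetworkComposite
    model1
    model2
end

function MLJBase.prefit(::Blended, verbosity, X, y)
    Xs = source(X)
    ys = source(y)
    # symbols refer to the corresponding fields of the composite:
    mach1 = machine(:model1, Xs, ys)
    mach2 = machine(:model2, Xs, ys)
    yhat = 0.5*(predict(mach1, Xs) + predict(mach2, Xs))
    return (; predict=yhat)  # declare the network's prediction node
end

# Blended(model1, model2) now behaves like any other MLJ model:
mach = machine(Blended(model1, model2), X, y) |> fit!
```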