library(mlr3verse)
task = tsk("german_credit")
Goal
After this exercise, you should understand why nested resampling is important when tuning ML algorithms to avoid overtuning of hyperparameters (HPs), and know how to apply nested resampling.
Exercises
As a follow-up to the tuning use case, we continue with the German credit task (tsk("german_credit")) and perform nested resampling. The purpose is to obtain a valid estimate of the generalization error, without the optimistic bias that tuning without nested resampling would introduce. For this task, we estimate the generalization error of a k-NN model implemented in kknn.
Recap: Nested Resampling
Nested resampling evaluates a learner combined with a tuning strategy to correctly estimate the generalization error. Linking a tuning strategy with a learner uses the training data to find a good hyperparameter configuration (HPC). Using the same data for performance estimation, and hence reporting the estimated generalization error of the best-performing model, leads to an over-optimistic estimate. This is because information in the resampling splits may favor specific HPCs (overtuning) by chance.
The AutoTuner
For this exercise, we use the same setup as in the tuning use case: a 3-fold CV, msr("classif.ce") as performance measure, a random search, and a termination criterion of 40 evaluations. Define an AutoTuner with auto_tuner() and train it on the german_credit task:
Recap: AutoTuner
The AutoTuner class of mlr3tuning combines the learner and the HPO to encapsulate the learner from its HPs. When training an AutoTuner, two steps are executed: (1) conduct the tuning based on the defined tuning strategy, and (2) take the best HPC and fit a model with these HPs on the full data set.
Hint 1
The AutoTuner is defined with auto_tuner() by specifying the learner, resampling, measure, terminator, search_space, and the tuner. The AutoTuner then behaves like a normal learner. We can use $train() to fit the AutoTuner (tuning + model fit on the best HPC).
Hint 2
library(mlr3)
library(mlr3learners)
library(mlr3tuning)

# Parts from the previous exercise:
task = tsk("german_credit")
lrn_knn = lrn("classif.kknn")

search_space = ps(
  k = p_int(1, 100),
  scale = p_lgl()
)

resampling = rsmp("cv", folds = 3L)
terminator = trm("evals", n_evals = 40L)
tuner = tnr("random_search", batch_size = 4L)

# AutoTuner definition:
at = auto_tuner(
  learner = ...,
  resampling = ...,
  measure = ...,
  terminator = ...,
  search_space = ...,
  tuner = ...
)

at$...(...)
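For reference, one possible completion of this skeleton (a sketch that simply plugs in the objects defined above and the measure from the exercise text):

at = auto_tuner(
  learner = lrn_knn,
  resampling = resampling,
  measure = msr("classif.ce"),
  terminator = terminator,
  search_space = search_space,
  tuner = tuner
)

# Training runs the tuning and then fits the final model
# on the full task using the best HPC found:
at$train(task)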
Perform nested resampling
Setting the resampling strategy in the AutoTuner defines how the HPCs are internally evaluated and is hence called the inner resampling. To get the final estimate of the generalization error, we have to resample the AutoTuner. This resampling is also called the outer resampling. Use resample() to conduct a 3-fold CV as the outer resampling strategy:
Hint 1
As for a normal learner, we first have to define the resampling strategy and then call resample() with the task, learner, and resampling as arguments.
Hint 2
outer = rsmp(...)
res = ...(task = ..., learner = ..., resampling = outer)
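A possible completion of this skeleton (a sketch; it reuses the AutoTuner at from above):

# Outer 3-fold CV around the AutoTuner, which itself tunes via the inner CV:
outer = rsmp("cv", folds = 3L)
res = resample(task = task, learner = at, resampling = outer)

# Estimate of the generalization error without the tuning-induced bias:
res$aggregate(msr("classif.ce"))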
Benchmark comparison
Conduct a benchmark to compare the previous KNN-AutoTuner, which automatically finds the best hyperparameters, with an untuned k-NN and two further untuned learners with their default hyperparameter values (e.g., a decision tree and a random forest without tuning them). Think about suitable learners (which you already know) and run a benchmark with benchmark(). What can you observe (especially when comparing the untuned methods with the tuned k-NN model)?
Hint 1
A list of all available learners can be obtained via as.data.table(mlr_learners). Note that the previously defined KNN-AutoTuner behaves like a normal learner that automatically finds the best hyperparameters (internally, it performs an inner resampling to evaluate the hyperparameter configurations of the random search). As we want to get the final estimate of the generalization error, we have to define another, so-called outer, resampling to compare the different learners within the benchmark() function (you can use, e.g., a 4-fold CV as the outer resampling strategy). This will perform nested resampling for the KNN-AutoTuner.
Hint 2
Conducting the benchmark requires passing a benchmark_grid() to the benchmark() function:
l1 = lrn(...)
l2 = lrn(...)
l3 = lrn(...)

bmr = ...(...(
  tasks = ...,
  learners = list(...),
  resamplings = ...))
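One way to fill in this skeleton (a sketch; it assumes the rpart and ranger packages are installed for the decision tree and random forest, and reuses the AutoTuner at from above):

# Untuned learners with default HPs:
l1 = lrn("classif.kknn")    # untuned k-NN
l2 = lrn("classif.rpart")   # decision tree
l3 = lrn("classif.ranger")  # random forest

# 4-fold CV as outer resampling; including at makes this
# nested resampling for the KNN-AutoTuner:
bmr = benchmark(benchmark_grid(
  tasks = task,
  learners = list(at, l1, l2, l3),
  resamplings = rsmp("cv", folds = 4L)))

bmr$aggregate(msr("classif.ce"))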
Summary
- We learned how to encapsulate a learner from its HPs by wrapping it in an AutoTuner.
- The AutoTuner does so by applying internal HPO using an inner resampling.
- We have to additionally resample the AutoTuner to get valid estimates (outer resampling) and to be able to compare it with other learners. The outer resampling applied to a learner that already performs resampling intrinsically (the inner resampling) to find the best HPC is called nested resampling.