library(mlr3verse)
library(mlr3learners)
Intro
Multilevel stacking is an ensemble technique in which the predictions of several learners are added as new features to extend the original data on different levels. On each level, the extended data is used to train a new level of learners. This can be repeated for several iterations until a final learner is trained. To avoid overfitting, it is advisable to use test set (out-of-bag) predictions in each level.
In this post, a multilevel stacking example will be created using mlr3pipelines and tuned using mlr3tuning. A similar example is available in the mlr3book. However, we additionally show how to jointly tune the hyperparameters of the whole ensemble and of each underlying learner.
In our stacking example, we proceed as follows:
- Level 0: Based on the input data, we train three learners (rpart, glmnet and lda) on a sparser feature space obtained using different feature filter methods from mlr3filters, to obtain slightly decorrelated predictions. The test set predictions of these learners are attached to the original data (used in level 0) and will serve as input for the learners in level 1.
- Level 1: We transform this extended data using PCA, on which we then train three additional learners (rpart, glmnet and lda). The test set predictions of the level 1 learners are attached to the input data used in level 1.
- Finally, we train a final ranger learner on the data extended by level 1.

Note that the number of features selected by the feature filter methods in level 0 and the number of principal components retained in level 1 will be jointly tuned with some other hyperparameters of the learners in each level.
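Before assembling the full pipeline, the core idea can be sketched with a single stacking level: cross-validated predictions of one learner are unioned with the original features and passed to a final learner. This is only an illustrative sketch; the ids and learner choices here are placeholders, not part of the ensemble built below.

```r
library(mlr3verse)

# one level-0 learner whose cross-validated (out-of-bag) predictions
# become new features, next to the untouched original features
stack_sketch = gunion(list(
  po("learner_cv", lrn("classif.rpart", predict_type = "prob"), id = "rpart_oob"),
  po("nop")  # passes the original features through unchanged
)) %>>%
  po("featureunion") %>>%   # original features + out-of-bag predictions
  lrn("classif.ranger")     # final learner trained on the extended data

# the graph can be trained like any other learner
learner_sketch = as_learner(stack_sketch)
learner_sketch$train(tsk("sonar"))
```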
Prerequisites
We load the mlr3verse package, which pulls in the most important packages for this example. The mlr3learners package loads additional learners.
We initialize the random number generator with a fixed seed for reproducibility, and decrease the verbosity of the logger to keep the output concise.
set.seed(7832)
lgr::get_logger("mlr3")$set_threshold("warn")
lgr::get_logger("bbotk")$set_threshold("warn")
For the stacking example, we use the sonar classification task:
task_sonar = tsk("sonar")
task_sonar$col_roles$stratum = task_sonar$target_names # stratification
Pipeline creation
Level 0
As mentioned, the level 0 learners are rpart, glmnet and lda:
learner_rpart = lrn("classif.rpart", predict_type = "prob")
learner_glmnet = lrn("classif.glmnet", predict_type = "prob")
learner_lda = lrn("classif.lda", predict_type = "prob")
To create the learner out-of-bag predictions, we use PipeOpLearnerCV:
cv1_rpart = po("learner_cv", learner_rpart, id = "rprt_1")
cv1_glmnet = po("learner_cv", learner_glmnet, id = "glmnet_1")
cv1_lda = po("learner_cv", learner_lda, id = "lda_1")
A sparser representation of the input data in level 0 is obtained using the following filters:
anova = po("filter", flt("anova"), id = "filt1")
mrmr = po("filter", flt("mrmr"), id = "filt2")
find_cor = po("filter", flt("find_correlation"), id = "filt3")
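To get a feel for what a filter does, its scores can also be computed standalone, outside the pipeline. This is an optional side check using the mlr3filters API:

```r
# compute ANOVA filter scores on the task and show the top-ranked features
filter_anova = flt("anova")
filter_anova$calculate(task_sonar)
head(as.data.table(filter_anova))
```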
To combine these steps into level 0, we use the gunion() function. The out-of-bag predictions of all level 0 learners are attached using PipeOpFeatureUnion, along with the original data passed via PipeOpNOP:
level0 = gunion(list(
  anova %>>% cv1_rpart,
  mrmr %>>% cv1_glmnet,
  find_cor %>>% cv1_lda,
  po("nop", id = "nop1"))) %>>%
  po("featureunion", id = "union1")
We can have a look at the graph from level 0:
level0$plot(html = FALSE)
Level 1
Now, we create the level 1 learners:
cv2_rpart = po("learner_cv", learner_rpart, id = "rprt_2")
cv2_glmnet = po("learner_cv", learner_glmnet, id = "glmnet_2")
cv2_lda = po("learner_cv", learner_lda, id = "lda_2")
All level 1 learners will use PipeOpPCA-transformed data as input:
level1 = level0 %>>%
  po("copy", 4) %>>%
  gunion(list(
    po("pca", id = "pca2_1", param_vals = list(scale. = TRUE)) %>>% cv2_rpart,
    po("pca", id = "pca2_2", param_vals = list(scale. = TRUE)) %>>% cv2_glmnet,
    po("pca", id = "pca2_3", param_vals = list(scale. = TRUE)) %>>% cv2_lda,
    po("nop", id = "nop2"))) %>>%
  po("featureunion", id = "union2")
We can have a look at the graph from level 1:
level1$plot(html = FALSE)
The out-of-bag predictions of the level 1 learners are attached to the input data from level 1, and a final ranger learner will be trained:
ranger_lrn = lrn("classif.ranger", predict_type = "prob")

ensemble = level1 %>>% ranger_lrn
ensemble$plot(html = FALSE)
Defining the tuning space
In order to tune the ensemble’s hyperparameters jointly, we define the search space as a ParamSet from the paradox package:
search_space_ensemble = ps(
  filt1.filter.nfeat = p_int(5, 50),
  filt2.filter.nfeat = p_int(5, 50),
  filt3.filter.nfeat = p_int(5, 50),
  pca2_1.rank. = p_int(3, 50),
  pca2_2.rank. = p_int(3, 50),
  pca2_3.rank. = p_int(3, 20),
  rprt_1.cp = p_dbl(0.001, 0.1),
  rprt_1.minbucket = p_int(1, 10),
  glmnet_1.alpha = p_dbl(0, 1),
  rprt_2.cp = p_dbl(0.001, 0.1),
  rprt_2.minbucket = p_int(1, 10),
  glmnet_2.alpha = p_dbl(0, 1),
  classif.ranger.mtry = p_int(1, 10),
  classif.ranger.sample.fraction = p_dbl(0.5, 1),
  classif.ranger.num.trees = p_int(50, 200))
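Since each parameter id in the search space must match an id exposed by the graph, a quick sanity check helps catch name clashes early. This check is optional and assumes the ensemble graph defined above:

```r
# every id in the search space should exist in the graph's parameter set;
# an empty result means all names match
setdiff(
  search_space_ensemble$ids(),
  as_learner(ensemble)$param_set$ids()
)
```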
Performance comparison
Even with a simple ensemble, there are quite a few things to set up. We compare the performance of the ensemble with a simple tuned ranger learner.
To proceed, we convert the ensemble pipeline to a GraphLearner:
learner_ensemble = as_learner(ensemble)
learner_ensemble$id = "ensemble"
learner_ensemble$predict_type = "prob"
We define the search space for the simple ranger learner:
search_space_ranger = ps(
  mtry = p_int(1, 10),
  sample.fraction = p_dbl(0.5, 1),
  num.trees = p_int(50, 200))
For performance comparison, we use the benchmark() function, which requires a design incorporating a list of learners and a list of tasks. Here, we have two learners (the simple ranger learner and the ensemble) and one task. Since we want to tune the simple ranger learner as well as the whole ensemble learner, we need to create an AutoTuner for each learner to be compared. To do so, we need to define a resampling strategy for the tuning in the inner loop (we use 3-fold cross-validation), and for the final evaluation (outer loop) we use holdout validation:
inner_resampling = rsmp("cv", folds = 3)

# AutoTuner for the ensemble learner
at_1 = auto_tuner(
  tuner = tnr("random_search"),
  learner = learner_ensemble,
  resampling = inner_resampling,
  measure = msr("classif.auc"),
  search_space = search_space_ensemble,
  term_evals = 3) # to limit running time

# AutoTuner for the simple ranger learner
at_2 = auto_tuner(
  tuner = tnr("random_search"),
  learner = ranger_lrn,
  resampling = inner_resampling,
  measure = msr("classif.auc"),
  search_space = search_space_ranger,
  term_evals = 3) # to limit running time

# Define the list of learners
learners = list(at_1, at_2)

# For benchmarking, we use a simple holdout
outer_resampling = rsmp("holdout")
outer_resampling$instantiate(task_sonar)

design = benchmark_grid(
  tasks = task_sonar,
  learners = learners,
  resamplings = outer_resampling)

bmr = benchmark(design, store_models = TRUE)
Warning: Multiple lambdas have been fit. Lambda will be set to 0.01 (see parameter 's').
This happened PipeOp glmnet_1's $train()
Warning: Multiple lambdas have been fit. Lambda will be set to 0.01 (see parameter 's').
This happened PipeOp glmnet_2's $train()
Warning: Multiple lambdas have been fit. Lambda will be set to 0.01 (see parameter 's').
This happened PipeOp glmnet_1's $predict()
Warning: Multiple lambdas have been fit. Lambda will be set to 0.01 (see parameter 's').
This happened PipeOp glmnet_2's $predict()
bmr$aggregate(msr("classif.auc"))[, .(nr, task_id, learner_id, resampling_id, iters, classif.auc)]
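Because the models were stored, the hyperparameter configurations selected during inner tuning can be inspected as well, using extract_inner_tuning_results() from mlr3tuning. The exact columns returned depend on the search spaces:

```r
# best hyperparameter configurations found in the inner resampling loops
inner_results = extract_inner_tuning_results(bmr)
inner_results[, .(task_id, learner_id, classif.auc)]
```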
For a more reliable comparison, the number of evaluations of the random search should be increased.
Conclusion
This example shows the versatility of mlr3pipelines. By using more learners, varied representations of the data set, as well as more levels, a powerful yet computationally expensive pipeline can be created. Note that care should be taken to avoid name clashes of pipeline objects.