Integer Hyperparameters in Tuners for Real-valued Search Spaces

Optimize integer hyperparameters with tuners that can only propose real numbers.

Author

Marc Becker

Published

January 19, 2021

Introduction

Tuners for real-valued search spaces cannot tune integer hyperparameters directly. However, it is possible to round the real values proposed by such a tuner to integers before passing them to the learner for evaluation. We show how to apply a parameter transformation to a ParamSet and use this set in the tuning process.

We load the mlr3verse package which pulls in the most important packages for this example.

library(mlr3verse)
Loading required package: mlr3

We initialize the random number generator with a fixed seed for reproducibility and decrease the verbosity of the loggers to keep the output concise.

set.seed(7832)
lgr::get_logger("mlr3")$set_threshold("warn")
lgr::get_logger("bbotk")$set_threshold("warn")

Task and Learner

In this example, we use the k-Nearest-Neighbor classification learner. We want to tune the integer-valued hyperparameter k, which defines the number of neighbors.

learner = lrn("classif.kknn")
print(learner$param_set$params$k)
   id    class lower upper levels default
1:  k ParamInt     1   Inf              7

Tuning

We choose generalized simulated annealing as the tuning strategy. The param_classes field of TunerGenSA states that the tuner only supports real-valued (ParamDbl) hyperparameters.

print(tnr("gensa"))
<TunerGenSA>: Generalized Simulated Annealing
* Parameters: trace.mat=FALSE, smooth=FALSE
* Parameter classes: ParamDbl
* Properties: single-crit
* Packages: mlr3tuning, bbotk, GenSA
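
To see the restriction in action, we could pass an integer search space directly. A minimal sketch (kept commented out; the exact error message may differ across versions):

# Sketch: an integer search space is rejected by TunerGenSA,
# since ParamInt is not among its supported parameter classes
int_space = ps(k = p_int(lower = 3, upper = 7))
# tune(tuner = tnr("gensa"), task = tsk("iris"), learner = learner,
#   resampling = rsmp("holdout"), measure = msr("classif.ce"),
#   term_evals = 20, search_space = int_space)  # errors on ParamInt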

To get integer values for k, we construct a search space with a transformation function. The as.integer() function converts a real number to an integer by truncating its decimal places, so the range [3, 7.99] maps to the integers 3 to 7.

search_space = ps(
  k = p_dbl(lower = 3, upper = 7.99, trafo = as.integer)
)
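
We can verify the transformation before tuning by sampling a few points from the search space; Design$transpose() applies the trafo to each point. A minimal sketch:

# Sketch: sample three random points and apply the transformation
design = paradox::generate_design_random(search_space, 3)
design$transpose()  # each k is now an integer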

We start the tuning and compare the result in the search space to the result in the learner's hyperparameter space.

instance = tune(
  tuner = tnr("gensa"),
  task = tsk("iris"),
  learner = learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 20,
  search_space = search_space)
Warning in optim(theta.old, fun, gradient, control = control, method = method, : one-dimensional optimization by Nelder-Mead is unreliable:
use "Brent" or optimize() directly

The optimal k is still a real number in the search space.

instance$result_x_search_space
         k
1: 3.82686

However, in the learner's hyperparameter space, k is an integer.

instance$result_x_domain
$k
[1] 3
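
To use the result, we can assign the tuned value to the learner and train it on the full task. A minimal sketch:

# Sketch: set the tuned hyperparameter on the learner and train it
learner$param_set$values = instance$result_learner_param_vals
learner$train(tsk("iris"))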

The archive shows us that for every real-valued k proposed by GenSA, an integer-valued k (x_domain_k) was created in the learner's hyperparameter space.

as.data.table(instance$archive)[, .(k, classif.ce, x_domain_k)]
           k classif.ce x_domain_k
 1: 3.826860       0.06          3
 2: 5.996323       0.06          5
 3: 5.941332       0.06          5
 4: 3.826860       0.06          3
 5: 3.826860       0.06          3
 6: 3.826860       0.06          3
 7: 4.209546       0.06          4
 8: 3.444174       0.06          3
 9: 4.018203       0.06          4
10: 3.635517       0.06          3
11: 3.922532       0.06          3
12: 3.731189       0.06          3
13: 3.874696       0.06          3
14: 3.779024       0.06          3
15: 3.850778       0.06          3
16: 3.802942       0.06          3
17: 3.838819       0.06          3
18: 3.814901       0.06          3
19: 3.832840       0.06          3
20: 3.820881       0.06          3

Internally, TunerGenSA only sees the parameter types of the search space and therefore proposes real numbers for k. Before each k was evaluated, the transformation function of the search_space parameter set was called and k was converted to an integer.

Note that the tuner is not aware of the transformation. This has two problematic consequences: First, the tuner might propose different real-valued configurations that, after rounding, map to configurations that were already evaluated, so the same hyperparameter configuration is evaluated repeatedly. This is only problematic if we exclusively optimize integer parameters. Second, the rounding introduces discontinuities, which can be problematic for some tuners.
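
The first consequence is visible in the archive above, where many proposals round to k = 3. A quick check with data.table, as a sketch:

# Sketch: count how often each integer-valued k was evaluated
as.data.table(instance$archive)[, .N, by = x_domain_k]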

We successfully tuned an integer-valued hyperparameter with TunerGenSA, which is only suitable for real-valued search spaces. This technique is not limited to tuning problems: Optimizers in bbotk can be used in the same way to produce points with integer parameters, as sketched below.
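
As a sketch of the bbotk variant, we minimize a made-up toy function of an integer k with the same trafo approach (the objective, its bounds, and the budget are illustrative assumptions, not part of the original example):

library(bbotk)

# Toy objective of an integer parameter k (hypothetical example)
objective = ObjectiveRFun$new(
  fun = function(xs) list(y = (xs$k - 5)^2),
  domain = ps(k = p_int(lower = 1, upper = 10)),
  codomain = ps(y = p_dbl(tags = "minimize"))
)

# Real-valued search space whose trafo truncates k to an integer
search_space = ps(k = p_dbl(lower = 1, upper = 10.99, trafo = as.integer))

opt_instance = OptimInstanceSingleCrit$new(
  objective = objective,
  search_space = search_space,
  terminator = trm("evals", n_evals = 20)
)

# OptimizerGenSA proposes real values; the trafo yields integer k
opt("gensa")$optimize(opt_instance)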