mlr3fselect - Runtime and Memory Benchmarks

Scope

This report analyzes runtime and memory usage of mlr3fselect across recent versions. It evaluates fselect() and fselect_nested() in sequential and parallel execution. Given the size of the mlr3 ecosystem, performance bottlenecks can arise at multiple stages. This report enables users to assess whether observed runtimes and memory footprints are within expected ranges. Substantial anomalies should be reported via a GitHub issue. Benchmarks are executed on a high‑performance cluster optimized for multi‑core throughput rather than single‑core speed. Runtimes on modern local machines may therefore differ.

Summary of Latest mlr3fselect Version

The benchmarks are comprehensive, so we summarize results for the latest mlr3fselect version. We measure runtime and memory for random search with 1,000 resampling iterations on the spam dataset with 1,000 and 10,000 instances. Nested resampling uses 10 outer iterations and the same random search in the inner loop with a holdout resampling.

Overhead introduced by fselect() and fselect_nested() must be interpreted relative to the model training time. For a training time of 1 s, the overhead is minimal. For 100 ms, the overhead is approximately 20%. For 10 ms, the overhead makes the total runtime roughly two to three times the model training time. For 1 ms, the total runtime is about 16 to 20 times the bare model training time. Memory usage for fselect() and fselect_nested() ranges between 450 MB and 550 MB; for comparison, an empty R session consumes 131 MB.

mlr3fselect parallelizes over resampling iterations using the future package. Parallel execution adds overhead from worker initialization, so we compare parallel and sequential runtimes. Parallel fselect() reduces total runtime for all training times. Memory increases with core count because each worker is a separate R session; using 10 cores requires around 1.8 GB. fselect_nested() parallelizes over the outer resampling loop. Across all training times, the parallel version of fselect_nested() is faster than the sequential version, and its total memory usage is approximately 3.3 GB.
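As a rough illustration of the parallel setup described in this summary, the following sketch registers a future backend before calling fselect(). The multisession backend, the worker count of 2, and the reduced budget of 20 evaluations are assumptions chosen to keep the example small; the report does not state which backend was used for the benchmarks.

```r
library(mlr3fselect)

# Assumption: two local worker sessions (the report's figures refer to 10 cores).
n_workers = 2
future::plan("multisession", workers = n_workers)

# Same call structure as in the benchmarks, with a much smaller budget.
instance = fselect(
  fselector = fs("random_search", batch_size = 20),
  task = tsk("spam"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  terminator = trm("evals", n_evals = 20),
  store_benchmark_result = FALSE,
  store_models = FALSE
)

# Return to sequential execution so later code is unaffected.
future::plan("sequential")

# Feature subset with the best estimated classification error.
print(instance$result_feature_set)
```

Each worker is a separate R session, so memory grows with the value of n_workers, which matches the memory figures quoted in the summary.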

Feature Selection

We measure runtime and memory usage of fselect() across mlr3fselect versions. Random search is used with batch_size = 1000. Models are trained on the spam dataset with 1,000 and 10,000 instances.

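library(mlr3fselect)

# Spam classification task and a decision tree learner
task = tsk("spam")
learner = lrn("classif.rpart")

# Random search that proposes all 1,000 feature subsets in a single batch
fselect(
  fselector = fs("random_search", batch_size = 1000),
  task = task,
  learner = learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  terminator = trm("evals", n_evals = 1000),
  store_benchmark_result = FALSE,
  store_models = FALSE
)
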
Runtime and memory usage of fselect() by mlr3fselect version and task size. The k factors indicate how many times longer total runtime is than the model training time. The subscripts denote reference training times in milliseconds; for example, k100 corresponds to 100 ms. A green background marks cases where total runtime is less than three times the model training time. The pk factors report the speedup of parallel relative to sequential execution. The pk factor is omitted when parallel execution is slower than sequential execution.

mlr3fselect  Task Size  Overhead, s  k1000  k100  k10  k1  Memory, MB  pk1  pk10  pk100  pk1000
1000 Observations
1.4.0.9000        1000           16    1.0   1.2  2.6  17         402  1.8   2.5    5.9     9.3
1.4.0             1000           24    1.0   1.2  3.4  25         462  1.7   2.2    5.1     8.9
1.3.0             1000           23    1.0   1.2  3.3  24         467  1.6   2.1    4.9     8.9
1.2.1             1000           23    1.0   1.2  3.3  24         457  1.6   2.0    4.9     8.9
10000 Observations
1.4.0.9000       10000           18    1.0   1.2  2.8  19         372  1.2   1.7    4.6     8.8
1.4.0            10000           26    1.0   1.3  3.6  27         520  1.6   2.0    4.7     8.8
1.3.0            10000           24    1.0   1.2  3.4  25         532  1.5   2.0    4.7     8.8
1.2.1            10000           24    1.0   1.2  3.4  25         520  1.9   2.4    5.4     9.1
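
The k factors in the table appear consistent with a simple relation: total runtime is the fixed overhead plus 1,000 model fits at the reference training time, divided by the time for those fits alone. The sketch below reproduces the 1.4.0.9000 row for 1,000 observations under that assumption. The helper name k_factor is hypothetical and not part of mlr3fselect.

```r
# Hypothetical helper: ratio of total runtime to pure model training time.
k_factor = function(overhead_s, train_time_s, n_evals = 1000) {
  train_total_s = n_evals * train_time_s
  (overhead_s + train_total_s) / train_total_s
}

# Reference training times in seconds (1000 ms, 100 ms, 10 ms, 1 ms).
train_times_s = c(k1000 = 1, k100 = 0.1, k10 = 0.01, k1 = 0.001)

# Overhead of 16 s taken from the 1.4.0.9000 row with 1,000 observations.
k = sapply(train_times_s, k_factor, overhead_s = 16)
print(signif(k, 2))
```

Rounded to two significant digits, these values match the tabulated 1.0, 1.2, 2.6, and 17. The same relation can be used to estimate a k factor for a training time that is not listed.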

Nested Feature Selection

We measure runtime and memory usage of fselect_nested() across mlr3fselect versions. The outer resampling performs 10 iterations, and the inner random search evaluates 1,000 feature subsets. Models are trained on the spam dataset with 1,000 and 10,000 instances.

library(mlr3fselect)

# Spam classification task and a decision tree learner
task = tsk("spam")
learner = lrn("classif.rpart")

# Inner loop: random search with holdout; outer loop: 10 subsampling repeats
fselect_nested(
  fselector = fs("random_search", batch_size = 1000),
  task = task,
  learner = learner,
  inner_resampling = rsmp("holdout"),
  outer_resampling = rsmp("subsampling", repeats = 10),
  measure = msr("classif.ce"),
  terminator = trm("evals", n_evals = 1000),
  store_fselect_instance = FALSE,
  store_benchmark_result = FALSE,
  store_models = FALSE
)

Runtime and memory usage of fselect_nested() by mlr3fselect version and task size. The k factors indicate how many times longer total runtime is than the model training time. The subscripts denote reference training times in milliseconds; for example, k100 corresponds to 100 ms. A green background marks cases where total runtime is less than three times the model training time. The pk factors report the speedup of parallel relative to sequential execution. The pk factor is omitted when parallel execution is slower than sequential execution.

mlr3fselect  Task Size  Overhead, s  k1000  k100  k10  k1  Memory, MB  pk1  pk10  pk100  pk1000
1000 Observations
1.4.0.9000        1000           19    1.0   1.2  2.9  20         285  6.1   6.9    9.0     9.9
1.4.0             1000           26    1.0   1.3  3.6  27         341  8.8   9.1    9.7      10
1.3.0             1000           24    1.0   1.2  3.4  25         335  9.3   9.5    9.9      10
1.2.1             1000           24    1.0   1.2  3.4  25         332  1.5   1.9    4.6     8.8
10000 Observations
1.4.0.9000       10000           20    1.0   1.2  3.0  21         304  9.4   9.5    9.9      10
1.4.0            10000           27    1.0   1.3  3.7  28         351   10    10     10      10
1.3.0            10000           26    1.0   1.3  3.6  27         349  9.8   9.9     10      10
1.2.1            10000           26    1.0   1.3  3.6  27         352  9.9   9.9     10      10
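
To put the pk factors in perspective, a speedup can be divided by the number of workers to obtain a parallel efficiency. The sketch below does this for the 1.4.0.9000 row with 1,000 observations. It assumes 10 workers, based on the 10 cores mentioned in the summary; the table itself does not state the worker count.

```r
# Assumption: 10 parallel workers, based on the 10 cores named in the summary.
n_workers = 10

# Speedups from the 1.4.0.9000 row with 1,000 observations.
speedup = c(pk1 = 6.1, pk10 = 6.9, pk100 = 9.0, pk1000 = 9.9)

# Parallel efficiency: fraction of the ideal n_workers-fold speedup achieved.
efficiency = speedup / n_workers
print(efficiency)
```

Under this assumption, efficiency rises with the training time, which fits the remark that worker initialization adds a fixed cost that longer model fits amortize.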