library(mlr3)
task = tsk("spam")
learner = lrn("classif.featureless")
learner$train(task)
Scope
This report analyzes the runtime and memory usage of mlr3 across the four most recent package versions. It focuses on the learner methods $train() and $predict() and on the evaluation functions resample() and benchmark(). The benchmarks quantify the runtime overhead introduced by mlr3 and its memory usage. Overhead is reported relative to the training time of the underlying models. The study varies dataset size and the number of resampling iterations. All experiments also assess the effect of parallelization on runtime and memory. The impact of encapsulation is examined by comparing alternative encapsulation methods.
Given the size of the mlr3 ecosystem, performance bottlenecks can arise at multiple stages. This report helps users assess whether observed runtimes fall within expected ranges. Substantial anomalies in runtime or memory should be reported by opening a GitHub issue. Benchmarks are executed on a high-performance cluster optimized for multi-core throughput rather than single-core speed. Consequently, single-core runtimes may be faster on a modern local machine.
Summary of Latest mlr3 Version
Because the benchmarks are extensive, this section summarizes only the results for the latest mlr3 version. The runtime overhead of mlr3 must be interpreted relative to model training and prediction times. For instance, if ranger::ranger() takes 100 ms to train and lrn("classif.ranger")$train() takes 110 ms, the overhead is 10%. If the same model requires 1000 ms to train, the overhead is 1%. The tables express this with the factors k_1, k_10, k_100, and k_1000, where the subscript denotes the model's training time in milliseconds. Each factor is the ratio of total runtime (model time plus mlr3 overhead) to the bare model time, so a factor of 1.1 corresponds to 10% overhead. The factors pk_1, pk_10, pk_100, and pk_1000 report the speedup of parallel over sequential execution.
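The factor arithmetic described above can be sketched in a few lines of base R. The helper name overhead_factor and its arguments are hypothetical and serve only as illustration; the fixed 10 ms overhead is the value from the ranger example.

```r
# Hypothetical helper: ratio of total runtime (model + framework overhead)
# to the bare model runtime. Both arguments are in milliseconds.
overhead_factor = function(model_ms, overhead_ms) {
  (model_ms + overhead_ms) / model_ms
}

# Model runtimes matching the subscripts of k_1, k_10, k_100, and k_1000
model_ms = c(k_1 = 1, k_10 = 10, k_100 = 100, k_1000 = 1000)

# A fixed 10 ms overhead, as in the ranger example
factors = overhead_factor(model_ms, overhead_ms = 10)
print(factors)
```

A constant overhead dominates for fast models and becomes negligible for slow ones: the factor for the 100 ms model is 1.1, which is the 10% overhead from the example.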
We first consider $train(). For models with training times of 1000 ms and 100 ms, the overhead is minimal. When training takes 10 ms, the total runtime is about 1.5 times the bare training time. For 1 ms models, the total runtime is roughly six to seven times the bare training time.
The overhead of $predict() is comparable to that of $train(), and dataset size has only a minor effect. $predict_newdata() converts newdata to a task and then predicts, which raises the overhead to roughly four times that of $predict() (about 19 ms compared with 5 ms on small tasks). The recently introduced $predict_newdata_fast() is substantially faster than $predict_newdata(). For models with 10 ms prediction time, the overhead is about 10%. For models with 1 ms prediction time, the overhead is about 50%.
The overhead of resample() and benchmark() is small for 1000 ms and 100 ms models. For 10 ms models, the total runtime is approximately twice the bare training time. For 1 ms models, the total runtime is approximately ten times the bare training time. An empty R session consumes 131 MB of memory. Resampling with 10 iterations uses approximately 164 MB, increasing to about 225 MB for 1000 iterations. Memory usage for benchmark() is comparable to resample().
mlr3 parallelizes over resampling iterations via the future package. Parallel execution adds overhead due to worker initialization. We therefore compare parallel and sequential runtimes. For 1 s models, parallel resample() and benchmark() reduce total runtime. For 100 ms models, parallelization is advantageous primarily for 100 or 1000 iterations. For 10 ms and 1 ms models, parallel execution overtakes sequential execution mainly at 1000 iterations. Memory grows with the number of cores because each core launches a separate R session. Using 10 cores results in a total memory footprint of approximately 1.2 GB.
Encapsulation captures and logs conditions such as messages, warnings, and errors without interrupting control flow. Encapsulation via callr introduces approximately 1 s of additional runtime per model training. Encapsulation via evaluate adds little runtime overhead (roughly 13 ms per training in the latest version).
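The following sketch illustrates what encapsulation does in practice, assuming mlr3 and the evaluate package are installed and that the installed mlr3 version exposes the $warnings field. It uses the debugging learner classif.debug, which is not part of the benchmarks, purely to provoke a warning.

```r
library(mlr3)

task = tsk("spam")

# classif.debug is mlr3's debugging learner; warning_train = 1 makes it
# signal a warning on every call to $train().
learner = lrn("classif.debug", warning_train = 1)

# Capture conditions with the evaluate package and register a fallback learner
learner$encapsulate("evaluate", fallback = lrn("classif.featureless"))

learner$train(task)

# The warning was recorded in the learner instead of being signaled
print(learner$warnings)
```

Replacing "evaluate" with "callr" runs the training in a separate R process, which is where the additional runtime reported above comes from.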
Train
The runtime and memory usage of $train() are measured across mlr3 versions.
mlr3 | Task Size | Overhead, ms | k1000 | k100 | k10 | k1 | Memory, MB |
---|---|---|---|---|---|---|---|
10 Observations | |||||||
1.1.0.9000 | 10 | 5 | 1.0 | 1.0 | 1.5 | 5.8 | 146 |
1.1.0 | 10 | 5 | 1.0 | 1.0 | 1.5 | 5.9 | 147 |
1.0.1 | 10 | 5 | 1.0 | 1.0 | 1.5 | 5.8 | 145 |
1.0.0 | 10 | 5 | 1.0 | 1.0 | 1.5 | 5.9 | 146 |
100 Observations | |||||||
1.1.0.9000 | 100 | 5 | 1.0 | 1.0 | 1.5 | 5.8 | 146 |
1.1.0 | 100 | 5 | 1.0 | 1.0 | 1.5 | 5.8 | 143 |
1.0.1 | 100 | 5 | 1.0 | 1.0 | 1.5 | 5.9 | 148 |
1.0.0 | 100 | 5 | 1.0 | 1.0 | 1.5 | 5.9 | 147 |
1000 Observations | |||||||
1.1.0.9000 | 1000 | 5 | 1.0 | 1.0 | 1.5 | 5.8 | 147 |
1.1.0 | 1000 | 5 | 1.0 | 1.0 | 1.5 | 5.9 | 148 |
1.0.1 | 1000 | 5 | 1.0 | 1.1 | 1.5 | 6.0 | 148 |
1.0.0 | 1000 | 5 | 1.0 | 1.1 | 1.5 | 6.0 | 146 |
10000 Observations | |||||||
1.1.0.9000 | 10000 | 6 | 1.0 | 1.1 | 1.6 | 6.8 | 178 |
1.1.0 | 10000 | 6 | 1.0 | 1.1 | 1.6 | 6.8 | 177 |
1.0.1 | 10000 | 6 | 1.0 | 1.1 | 1.6 | 6.9 | 169 |
1.0.0 | 10000 | 6 | 1.0 | 1.1 | 1.6 | 7.0 | 167 |
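A rough way to check a single train overhead locally is to time the mlr3 call and the underlying model call side by side. This is only a sketch under stated assumptions: classif.rpart and rpart::rpart() are chosen for illustration and are not the models used in the benchmarks, and a single system.time() measurement is noisy.

```r
library(mlr3)

task = tsk("spam")
learner = lrn("classif.rpart")

# Wall-clock time of training through mlr3, in milliseconds
t_mlr3 = system.time(learner$train(task))[["elapsed"]] * 1000

# Wall-clock time of calling the underlying model directly.
# xval = 0 is passed on the assumption that it mirrors the mlr3 learner's
# default of no internal cross-validation.
data = task$data()
formula = reformulate(".", response = task$target_names)
t_bare = system.time(rpart::rpart(formula, data = data, xval = 0))[["elapsed"]] * 1000

cat(sprintf("mlr3: %.0f ms, bare: %.0f ms\n", t_mlr3, t_bare))
```

Repeating both measurements many times and comparing medians gives a more stable estimate than a single run.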
Predict
The runtime of $predict() is measured across mlr3 versions.
library(mlr3)
task = tsk("spam")
learner = lrn("classif.featureless")
learner$train(task)
learner$predict(task)
mlr3 | Task Size | Overhead, ms | k1000 | k100 | k10 | k1 | Memory, MB |
---|---|---|---|---|---|---|---|
10 Observations | |||||||
1.1.0.9000 | 10 | 5 | 1.0 | 1.0 | 1.5 | 5.8 | 149 |
1.1.0 | 10 | 5 | 1.0 | 1.0 | 1.5 | 5.9 | 147 |
1.0.1 | 10 | 5 | 1.0 | 1.1 | 1.5 | 6.0 | 147 |
1.0.0 | 10 | 5 | 1.0 | 1.1 | 1.5 | 6.1 | 148 |
100 Observations | |||||||
1.1.0.9000 | 100 | 5 | 1.0 | 1.1 | 1.5 | 6.1 | 147 |
1.1.0 | 100 | 5 | 1.0 | 1.1 | 1.5 | 6.0 | 149 |
1.0.1 | 100 | 5 | 1.0 | 1.1 | 1.5 | 6.1 | 148 |
1.0.0 | 100 | 5 | 1.0 | 1.1 | 1.5 | 6.2 | 151 |
1000 Observations | |||||||
1.1.0.9000 | 1000 | 5 | 1.0 | 1.1 | 1.5 | 6.1 | 152 |
1.1.0 | 1000 | 5 | 1.0 | 1.1 | 1.5 | 6.2 | 152 |
1.0.1 | 1000 | 5 | 1.0 | 1.1 | 1.5 | 6.1 | 150 |
1.0.0 | 1000 | 5 | 1.0 | 1.1 | 1.5 | 6.3 | 150 |
10000 Observations | |||||||
1.1.0.9000 | 10000 | 6 | 1.0 | 1.1 | 1.6 | 7.3 | 173 |
1.1.0 | 10000 | 6 | 1.0 | 1.1 | 1.6 | 7.2 | 178 |
1.0.1 | 10000 | 6 | 1.0 | 1.1 | 1.6 | 7.5 | 176 |
1.0.0 | 10000 | 7 | 1.0 | 1.1 | 1.7 | 7.8 | 177 |
Predict Newdata
The runtime of $predict_newdata() is measured across mlr3 versions.
library(mlr3)
task = tsk("spam")
learner = lrn("classif.featureless")
learner$train(task)
# Assumption: newdata is a data.table with the task's columns
newdata = task$data()
learner$predict_newdata(newdata)
mlr3 | Task Size | Overhead, ms | k1000 | k100 | k10 | k1 | Memory, MB |
---|---|---|---|---|---|---|---|
10 Observations | |||||||
1.1.0.9000 | 10 | 19 | 1.0 | 1.2 | 2.9 | 20 | 158 |
1.1.0 | 10 | 20 | 1.0 | 1.2 | 3.0 | 21 | 154 |
1.0.1 | 10 | 20 | 1.0 | 1.2 | 3.0 | 21 | 153 |
1.0.0 | 10 | 21 | 1.0 | 1.2 | 3.1 | 22 | 153 |
100 Observations | |||||||
1.1.0.9000 | 100 | 19 | 1.0 | 1.2 | 2.9 | 20 | 157 |
1.1.0 | 100 | 20 | 1.0 | 1.2 | 3.0 | 21 | 156 |
1.0.1 | 100 | 21 | 1.0 | 1.2 | 3.1 | 22 | 155 |
1.0.0 | 100 | 22 | 1.0 | 1.2 | 3.2 | 23 | 153 |
1000 Observations | |||||||
1.1.0.9000 | 1000 | 19 | 1.0 | 1.2 | 2.9 | 20 | 159 |
1.1.0 | 1000 | 21 | 1.0 | 1.2 | 3.1 | 22 | 161 |
1.0.1 | 1000 | 21 | 1.0 | 1.2 | 3.1 | 22 | 158 |
1.0.0 | 1000 | 22 | 1.0 | 1.2 | 3.2 | 23 | 158 |
10000 Observations | |||||||
1.1.0.9000 | 10000 | 27 | 1.0 | 1.3 | 3.7 | 28 | 181 |
1.1.0 | 10000 | 31 | 1.0 | 1.3 | 4.1 | 32 | 183 |
1.0.1 | 10000 | 29 | 1.0 | 1.3 | 3.9 | 30 | 182 |
1.0.0 | 10000 | 35 | 1.0 | 1.3 | 4.5 | 36 | 184 |
Predict Newdata Fast
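The runtime of $predict_newdata_fast() is measured across mlr3 versions. The method is only available in the two most recent versions.
library(mlr3)
task = tsk("spam")
learner = lrn("classif.featureless")
learner$train(task)
# Assumption: newdata is a data.table with the task's columns
newdata = task$data()
learner$predict_newdata_fast(newdata)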
mlr3 | Task Size | Overhead, ms | k1000 | k100 | k10 | k1 | Memory, MB |
---|---|---|---|---|---|---|---|
10 Observations | |||||||
1.1.0.9000 | 10 | 0 | 1.0 | 1.0 | 1.0 | 1.3 | 150 |
1.1.0 | 10 | 0 | 1.0 | 1.0 | 1.0 | 1.3 | 150 |
100 Observations | |||||||
1.1.0.9000 | 100 | NA | NA | NA | NA | NA | 153 |
1.1.0 | 100 | NA | NA | NA | NA | NA | 150 |
1000 Observations | |||||||
1.1.0.9000 | 1000 | 0 | 1.0 | 1.0 | 1.0 | 1.4 | 157 |
1.1.0 | 1000 | 0 | 1.0 | 1.0 | 1.0 | 1.4 | 152 |
10000 Observations | |||||||
1.1.0.9000 | 10000 | 1 | 1.0 | 1.0 | 1.1 | 2.0 | 162 |
1.1.0 | 10000 | 1 | 1.0 | 1.0 | 1.1 | 2.0 | 159 |
Resampling
The runtime and memory usage of resample() are measured across mlr3 versions. The number of resampling iterations (evals) is set to 1000, 100, and 10. We also measure the runtime of resample() with future::multisession parallelization on 10 cores.
library(mlr3)
evals = 10  # number of resampling iterations; the benchmarks use 10, 100, and 1000
task = tsk("spam")
learner = lrn("classif.featureless")
resampling = rsmp("subsampling", repeats = evals)
resample(task, learner, resampling)
mlr3 | Resampling Iterations | Overhead, ms | k1000 | k100 | k10 | k1 | Memory, MB | pk1000 | pk100 | pk10 | pk1 |
---|---|---|---|---|---|---|---|---|---|---|---|
10 Resampling Iterations | |||||||||||
1.1.0.9000 | 10 | 120 | 1.0 | 1.1 | 2.2 | 13 | 149 | 3.3 | — | — | — |
1.1.0 | 10 | 126 | 1.0 | 1.1 | 2.3 | 14 | 150 | 3.3 | — | — | — |
1.0.1 | 10 | 118 | 1.0 | 1.1 | 2.2 | 13 | 149 | 3.3 | — | — | — |
1.0.0 | 10 | 126 | 1.0 | 1.1 | 2.3 | 14 | 151 | 3.3 | — | — | — |
100 Resampling Iterations | |||||||||||
1.1.0.9000 | 100 | 1057 | 1.0 | 1.1 | 2.1 | 12 | 156 | 27 | 2.9 | — | — |
1.1.0 | 100 | 1077 | 1.0 | 1.1 | 2.1 | 12 | 153 | 27 | 3.0 | — | — |
1.0.1 | 100 | 1112 | 1.0 | 1.1 | 2.1 | 12 | 154 | 27 | 2.9 | — | — |
1.0.0 | 100 | 1181 | 1.0 | 1.1 | 2.2 | 13 | 157 | 27 | 3.0 | — | — |
1000 Resampling Iterations | |||||||||||
1.1.0.9000 | 1000 | 9710 | 1.0 | 1.1 | 2.0 | 11 | 257 | 58 | 6.3 | 2.4 | 1.4 |
1.1.0 | 1000 | 10847 | 1.0 | 1.1 | 2.1 | 12 | 264 | 58 | 6.4 | 2.5 | 1.6 |
1.0.1 | 1000 | 10580 | 1.0 | 1.1 | 2.1 | 12 | 276 | 58 | 6.3 | 2.4 | 1.5 |
1.0.0 | 1000 | 10745 | 1.0 | 1.1 | 2.1 | 12 | 254 | 58 | 6.3 | 2.4 | 1.6 |
Benchmark
The runtime and memory usage of benchmark() are measured across mlr3 versions. The number of resampling iterations (evals) is set to 1000, 100, and 10. We also measure the runtime of benchmark() with future::multisession parallelization on 10 cores.
library(mlr3)
evals = 10  # total number of resampling iterations; the benchmarks use 10, 100, and 1000
task = tsk("spam")
learner = lrn("classif.featureless")
# Five learners share the iterations, so each gets evals / 5 repeats
resampling = rsmp("subsampling", repeats = evals / 5)
design = benchmark_grid(task, replicate(5, learner), resampling)
benchmark(design)
mlr3 | Resampling Iterations | Overhead, ms | k1000 | k100 | k10 | k1 | Memory, MB | pk1000 | pk100 | pk10 | pk1 |
---|---|---|---|---|---|---|---|---|---|---|---|
10 Resampling Iterations | |||||||||||
1.1.0.9000 | 10 | 130 | 1.0 | 1.1 | 2.3 | 14 | 150 | — | — | — | — |
1.1.0 | 10 | 138 | 1.0 | 1.1 | 2.4 | 15 | 149 | — | — | — | — |
1.0.1 | 10 | 131 | 1.0 | 1.1 | 2.3 | 14 | 149 | — | — | — | — |
1.0.0 | 10 | 141 | 1.0 | 1.1 | 2.4 | 15 | 150 | — | — | — | — |
100 Resampling Iterations | |||||||||||
1.1.0.9000 | 100 | 1086 | 1.0 | 1.1 | 2.1 | 12 | 155 | 6.9 | — | — | — |
1.1.0 | 100 | 1085 | 1.0 | 1.1 | 2.1 | 12 | 152 | 6.9 | — | — | — |
1.0.1 | 100 | 1112 | 1.0 | 1.1 | 2.1 | 12 | 154 | 6.9 | — | — | — |
1.0.0 | 100 | 1151 | 1.0 | 1.1 | 2.2 | 13 | 156 | 6.8 | — | — | — |
1000 Resampling Iterations | |||||||||||
1.1.0.9000 | 1000 | 9949 | 1.0 | 1.1 | 2.0 | 11 | 259 | 40 | 4.4 | 1.2 | — |
1.1.0 | 1000 | 10668 | 1.0 | 1.1 | 2.1 | 12 | 257 | 41 | 4.5 | 1.3 | — |
1.0.1 | 1000 | 9571 | 1.0 | 1.1 | 2.0 | 11 | 255 | 40 | 4.4 | 1.2 | — |
1.0.0 | 1000 | 10280 | 1.0 | 1.1 | 2.0 | 11 | 254 | 41 | 4.4 | 1.3 | — |
Encapsulation
The runtime and memory usage of $train() are measured for different encapsulation methods and mlr3 versions.
library(mlr3)
method = "evaluate"  # one of "none", "evaluate", or "callr"
task = tsk("spam")
learner = lrn("classif.featureless")
learner$encapsulate(method, fallback = lrn("classif.featureless"))
learner$train(task)
mlr3 | Method | Overhead, ms | k1000 | k100 | k10 | k1 | Memory, MB |
---|---|---|---|---|---|---|---|
No Encapsulation | |||||||
1.1.0.9000 | none | 7 | 1.0 | 1.1 | 1.7 | 7.5 | 149 |
1.1.0 | none | 8 | 1.0 | 1.1 | 1.8 | 9.3 | 148 |
1.0.1 | none | 8 | 1.0 | 1.1 | 1.8 | 9.3 | 152 |
1.0.0 | none | 8 | 1.0 | 1.1 | 1.8 | 9.3 | 152 |
Evaluate | |||||||
1.1.0.9000 | evaluate | 20 | 1.0 | 1.2 | 3.0 | 21 | 149 |
1.1.0 | evaluate | 22 | 1.0 | 1.2 | 3.2 | 23 | 150 |
1.0.1 | evaluate | 22 | 1.0 | 1.2 | 3.2 | 23 | 149 |
1.0.0 | evaluate | 25 | 1.0 | 1.2 | 3.5 | 26 | 149 |
Callr | |||||||
1.1.0.9000 | callr | 579 | 1.6 | 6.8 | 59 | 580 | 148 |
1.1.0 | callr | 1311 | 2.3 | 14 | 130 | 1,300 | 151 |
1.0.1 | callr | 668 | 1.7 | 7.7 | 68 | 670 | 151 |
1.0.0 | callr | 1401 | 2.4 | 15 | 140 | 1,400 | 151 |