mlr3: Machine Learning in R

The mlr3 ecosystem is the framework for machine learning in R.

An open-source collection of R packages that provides a unified interface for machine learning in R. It is the successor to the mlr package.

A scientifically designed and easy-to-learn interface.
More than 100 connected machine learning algorithms.
Light on dependencies.
Convenient parallelization with the future package.
State-of-the-art optimization algorithms.
Dataflow programming with pipelines.
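
As a rough illustration of the parallelization point above, the sketch below selects a future backend before calling resample(). The worker count, task, and learner are arbitrary choices for illustration, and it assumes the penguins task is available in your mlr3 installation.

```r
library(mlr3)

set.seed(1)

# Assumption: two local background R sessions are enough for a demo.
future::plan("multisession", workers = 2)

# The resampling iterations are dispatched through the active future plan.
rr = resample(tsk("penguins"), lrn("classif.rpart"), rsmp("cv", folds = 3))
acc_parallel = rr$aggregate(msr("classif.acc"))

# Switch back to sequential execution when done.
future::plan("sequential")
```

The same plan applies to benchmark() and to tuning, so the modelling code itself does not need to change.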

Get Started

There are many packages in the mlr3 ecosystem that you may want to use. You can install the full mlr3 universe at once with:

install.packages("mlr3verse")

You can also use our Docker images.

Resources

Our book “Applied Machine Learning Using mlr3 in R” is the central entry point to the mlr3 ecosystem. This essential guide covers key aspects of machine learning, from building and evaluating predictive models to advanced techniques like hyperparameter tuning for peak performance. It delves into constructing comprehensive machine learning pipelines, encompassing data pre-processing, modeling, and prediction aggregation.

The book is primarily aimed at researchers, practitioners, and graduate students who use machine learning or who are interested in using it. It can be used as a textbook for an introductory or advanced machine learning class that uses R, as a reference for people who work with machine learning methods, and in industry for exploratory experiments in machine learning.

In addition to the book, there are many other resources to learn more about mlr3. The gallery contains a collection of case studies that demonstrate the functionality of mlr3. The cheatsheets provide a quick overview of the most important functions. The resources section contains links to talks, courses, and other material.

Examples

Basic Machine Learning

Get to know the basic building blocks of machine learning in mlr3. Train your first learner and estimate its performance with resampling. Compare the performance of learners with benchmarking.
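
A minimal sketch of these three steps, not the full example: it trains a classification tree on a holdout split, estimates accuracy with 3-fold cross-validation, and benchmarks the tree against a featureless baseline. The choice of task, learners, split ratio, and fold count is an assumption made for illustration.

```r
library(mlr3)

set.seed(1)

task = tsk("penguins")
learner = lrn("classif.rpart")

# 1. Train on 80% of the rows and score on the remaining 20%.
split = partition(task, ratio = 0.8)
learner$train(task, row_ids = split$train)
prediction = learner$predict(task, row_ids = split$test)
acc_holdout = prediction$score(msr("classif.acc"))

# 2. Estimate performance with 3-fold cross-validation.
rr = resample(task, learner, rsmp("cv", folds = 3))
acc_cv = rr$aggregate(msr("classif.acc"))

# 3. Compare the tree with a featureless baseline on the same folds.
design = benchmark_grid(
  tasks = task,
  learners = lrns(c("classif.rpart", "classif.featureless")),
  resamplings = rsmp("cv", folds = 3)
)
bmr = benchmark(design)
bmr_scores = bmr$aggregate(msr("classif.acc"))
```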

Optimization

Optimize the hyperparameters of a classification tree on the Palmer Penguins data set. Become familiar with search spaces and transformations. Fit a final model with optimized hyperparameters for predicting new data.
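
The workflow could look roughly like the sketch below, which assumes the mlr3tuning package is installed. The tuned parameters (cp and minsplit), their ranges, the random search tuner, and the budget of 10 evaluations are illustrative choices, not the settings of the full example.

```r
library(mlr3)
library(mlr3tuning)

set.seed(1)

task = tsk("penguins")

# Search space: cp is sampled on the log scale and transformed back
# before it is passed to the learner; minsplit is an integer range.
learner = lrn("classif.rpart",
  cp = to_tune(1e-4, 1e-1, logscale = TRUE),
  minsplit = to_tune(2, 64)
)

instance = ti(
  task = task,
  learner = learner,
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  terminator = bbotk::trm("evals", n_evals = 10)
)

tuner = tnr("random_search")
tuner$optimize(instance)

# Fit a final model on all data with the best configuration found.
final_learner = lrn("classif.rpart")
final_learner$param_set$values = instance$result_learner_param_vals
final_learner$train(task)
```

The fitted final_learner can then be used with $predict_newdata() on unseen observations.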

Pipelines

Build a preprocessing pipeline for missing data in the German Credit data set. Optimize the parameters of the pipeline and stack multiple learners into an ensemble model. Learn about techniques to tackle challenging data sets.
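
The sketch below covers only the imputation part of such a pipeline and assumes the mlr3pipelines package is installed. As a substitution, it uses the penguins task, which contains missing values, in place of German Credit, which as far as we know needs the additional rchallenge package. The choice of imputation operators is also an assumption.

```r
library(mlr3)
library(mlr3pipelines)

set.seed(1)

task = tsk("penguins")

# Impute numeric features by sampling from their histogram,
# then fill the remaining (factor) features with the mode.
imputer = po("imputehist") %>>% po("imputemode")

# Apply the preprocessing on its own to inspect the result.
imputed = imputer$train(task)[[1]]

# Combine preprocessing and a classification tree into one learner,
# so imputation is refitted inside every resampling fold.
graph_learner = as_learner(
  po("imputehist") %>>% po("imputemode") %>>% lrn("classif.rpart")
)
rr = resample(task, graph_learner, rsmp("cv", folds = 3))
ce_pipeline = rr$aggregate(msr("classif.ce"))
```

Wrapping the graph as a learner keeps the imputation inside the resampling loop, which avoids leaking information from test folds into the preprocessing.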

Feature Selection

Run feature selection on the Titanic data set. Learn about different optimization algorithms and fit a final model. Estimate the performance of the optimized feature set with nested resampling.
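
A rough sketch of the selection and final-fit steps, assuming the mlr3fselect package is installed. The Titanic data needs preprocessing before a tree can be trained on it, so this sketch substitutes the penguins task. The random search and the budget of 10 evaluations are illustrative, and nested resampling is not shown.

```r
library(mlr3)
library(mlr3fselect)

set.seed(1)

task = tsk("penguins")
learner = lrn("classif.rpart")

instance = fsi(
  task = task,
  learner = learner,
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  terminator = bbotk::trm("evals", n_evals = 10)
)

# Evaluate randomly drawn feature subsets until the budget is used up.
fselector = fs("random_search")
fselector$optimize(instance)

selected = instance$result_feature_set

# Fit a final model that uses only the selected features.
final_task = task$clone()
final_task$select(selected)
learner$train(final_task)
```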