mlr3-0.1.0

Initial release of mlr3
R
r-bloggers
Author

Patrick Schratz

Published

July 31, 2019

mlr3 - Initial release

The mlr-org team is very proud to present the initial release of the mlr3 machine-learning framework for R.

mlr3 comes with a clean object-oriented-design using the R6 class system. With this, it overcomes the limitations of R’s S3 classes. It is a rewrite of the well-known mlr package which provides a convenient way of accessing many algorithms in R through a consistent interface.

While mlr was one big package that included everything starting from preprocessing, tuning, feature-selection to visualization, mlr3 is a package framework consisting of many packages containing specific parts of the functionality required for a complete machine-learning workflow.

Background - why a rewrite?

The addition of many features to mlr has led to a “feature creep” which makes it hard to maintain and extend.

Due to the many tests in mlr, a full CI run of the package on Travis CI takes more than 30 minutes. This does not include the installation of dependencies, running R CMD check or building the pkgdown site (which includes more than 20 vignettes with executable code). The dependency installation alone takes 1 hour(!). This is due to the huge number of packages that mlr imports to be able to use all the algorithms for which it provides a unified interface. On a vanilla CI build without cache there would be 326(!) packages to be installed.

mlr consists of roughly 40k lines of R code.

github.com/AlDanial/cloc v 1.82  T=0.98 s (1335.1 files/s, 226920.4 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
HTML                           346          16592           6276         125687
R                              878           5909          12566          40696
Rmd                             44           2390           5296           2500
Markdown                        11            196              0           1136
XML                              1              0              0            963
YAML                             4             14              5            244
CSS                              2             69             66            203
C                                3             16             33            106
JSON                             5              0              0             98
JavaScript                       2             21              9             80
Bourne Shell                     4              8              6             69
C/C++ Header                     1              7              0             20
SVG                              1              0              1             11
-------------------------------------------------------------------------------
SUM:                          1302          25222          24258         171813
-------------------------------------------------------------------------------

Adding any feature has become a huge pain since it requires passing down objects through many functions. In addition, a change to one part of the project might trigger unwanted side-effects at other places or even break functions. Even though most of the package is covered by tests, this creates a high wall of reluctance to extend the p