mlr3 - Initial release
The mlr-org team is very proud to present the initial release of the mlr3 machine-learning framework for R.
mlr3 comes with a clean object-oriented-design using the R6 class system. With this, it overcomes the limitations of R’s S3 classes. It is a rewrite of the well-known mlr package which provides a convenient way of accessing many algorithms in R through a consistent interface.
While mlr was one big package that included everything starting from preprocessing, tuning, feature-selection to visualization, mlr3 is a package framework consisting of many packages containing specific parts of the functionality required for a complete machine-learning workflow.
Background - why a rewrite?
The addition of many features to mlr has led to a “feature creep” which makes it hard to maintain and extend.
Due to the many tests in mlr, a full CI run of the package on Travis CI takes more than 30 minutes. This does not include the installation of dependencies, running R CMD check or building the pkgdown site (which includes more than 20 vignettes with executable code). The dependency installation alone takes 1 hour(!). This is due to the huge number of packages that mlr imports to be able to use all the algorithms for which it provides a unified interface. On a vanilla CI build without cache there would be 326(!) packages to be installed.
mlr consists of roughly 40k lines of R code.
github.com/AlDanial/cloc v 1.82 T=0.98 s (1335.1 files/s, 226920.4 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
HTML 346 16592 6276 125687
R 878 5909 12566 40696
Rmd 44 2390 5296 2500
Markdown 11 196 0 1136
XML 1 0 0 963
YAML 4 14 5 244
CSS 2 69 66 203
C 3 16 33 106
JSON 5 0 0 98
JavaScript 2 21 9 80
Bourne Shell 4 8 6 69
C/C++ Header 1 7 0 20
SVG 1 0 1 11
-------------------------------------------------------------------------------
SUM: 1302 25222 24258 171813
-------------------------------------------------------------------------------
Adding any feature has become a huge pain since it requires passing down objects through many functions. In addition, a change to one part of the project might trigger unwanted side-effects at other places or even break functions. Even though most of the package is covered by tests, this creates a high wall of reluctance to extend the p