The website now features runtime and memory benchmarks of the mlr3tuning package.

Feature Selection Wrapper

Feature selection wrappers are provided by the mlr3fselect package. A wrapper searches iteratively for the feature subset that performs best with respect to a performance measure.
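The iterative idea can be illustrated without mlr3fselect. The following is a minimal base R sketch of a greedy forward search, not the package's implementation. The mtcars data, the linear model, the fixed odd/even holdout split and the helper `holdout_rmse()` are all assumptions made for illustration.

```r
# Greedy forward selection: add one feature at a time, keep the addition
# that lowers the holdout error the most, stop when nothing improves.
data(mtcars)
target = "mpg"
candidates = setdiff(names(mtcars), target)

# deterministic holdout split (assumption): odd rows train, even rows test
train = mtcars[seq(1, nrow(mtcars), by = 2), ]
test = mtcars[seq(2, nrow(mtcars), by = 2), ]

# hypothetical helper: holdout RMSE of a linear model on a feature subset
holdout_rmse = function(features) {
  model = lm(reformulate(features, response = target), data = train)
  sqrt(mean((test[[target]] - predict(model, newdata = test))^2))
}

selected = character(0)
best_score = Inf

repeat {
  remaining = setdiff(candidates, selected)
  if (length(remaining) == 0) break

  # score every subset that extends the current one by a single feature
  scores = vapply(remaining, function(f) holdout_rmse(c(selected, f)), numeric(1))

  # stop as soon as the best extension no longer improves the measure
  if (min(scores) >= best_score) break

  best_score = min(scores)
  selected = c(selected, names(which.min(scores)))
}

print(selected)
print(best_score)
```

Each pass refits the model once per remaining feature, so the cost grows with the number of candidates. The wrappers listed below differ mainly in how they propose the next subsets to evaluate.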

Label                              Properties
Asynchronous Design Points         single-crit, multi-crit, async
Asynchronous Exhaustive Search     single-crit, multi-crit, async
Asynchronous Random Search         single-crit, multi-crit
Design Points                      single-crit, multi-crit
Exhaustive Search                  single-crit, multi-crit
Genetic Search                     single-crit
Random Search                      single-crit, multi-crit
Recursive Feature Elimination      single-crit, requires_model
Recursive Feature Elimination      single-crit, requires_model
Sequential Search                  single-crit
Shadow Variable Search             single-crit
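The same overview can be queried from R. This is a minimal sketch, assuming mlr3verse is installed and that the wrappers are registered in the `mlr_fselectors` dictionary of mlr3fselect. The key `"sequential"` is the one used in the example further down this page.

```r
library(mlr3verse)

# list all registered feature selection wrappers with their properties
fselectors = as.data.table(mlr_fselectors)
print(fselectors[, .(key, label, properties)])

# construct a wrapper by its key and inspect what it supports
fselector = fs("sequential")
print(fselector$properties)
```

The properties indicate whether a wrapper can optimize a single measure (single-crit), several measures at once (multi-crit), or needs the fitted model of the learner (requires_model).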

Example Usage

Run a sequential feature selection on the Pima Indians Diabetes data set.

library(mlr3verse)

# retrieve task
task = tsk("pima")

# load learner
learner = lrn("classif.rpart")

# feature selection on the pima indians diabetes data set
instance = fselect(
  fselector = fs("sequential"),
  task = task,
  learner = learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce")
)

# best performing feature subset
instance$result
      age glucose insulin   mass pedigree pregnant pressure triceps
   <lgcl>  <lgcl>  <lgcl> <lgcl>   <lgcl>   <lgcl>   <lgcl>  <lgcl>
1:   TRUE    TRUE   FALSE   TRUE     TRUE     TRUE    FALSE   FALSE
                             features n_features classif.ce
                               <list>      <int>      <num>
1: age,glucose,mass,pedigree,pregnant          5  0.2148438

# subset the task and fit the final model
task$select(instance$result_feature_set)
learner$train(task)

print(learner)

── <LearnerClassifRpart> (classif.rpart): Classification Tree ──────────────────
• Model: rpart
• Parameters: xval=0
• Packages: mlr3 and rpart
• Predict Types: [response] and prob
• Feature Types: logical, integer, numeric, factor, and ordered
• Encapsulation: none (fallback: -)
• Properties: importance, missings, multiclass, selected_features, twoclass,
and weights
• Other settings: use_weights = 'use'
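Besides the best subset, every feature subset evaluated during the search is kept in the archive of the instance. This is a minimal sketch of how the archive might be inspected, assuming the same setup as above. The seed is an arbitrary choice, added only because the holdout split is random.

```r
library(mlr3verse)

set.seed(1)  # arbitrary seed (assumption) to make the holdout split repeatable

instance = fselect(
  fselector = fs("sequential"),
  task = tsk("pima"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce")
)

# one row per evaluated feature subset
archive = as.data.table(instance$archive)

# number of subsets that were evaluated
print(nrow(archive))

# the three subsets with the lowest classification error
print(head(archive[order(classif.ce)], 3))
```

Because a single holdout split is used, the reported errors depend on the split. A different seed can lead to a different selected subset.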