StratifiedRF: Builds Trees by Sampling Variables in Groups

Random Forest-like tree ensemble that works with groups of predictor variables. When building each tree, a number of variables is sampled at random from each group separately, which ensures that every group is represented among the variables considered for splits. This is useful when rows combine information about different things (e.g. user information and product information) and it is not sensible to make a prediction from only one group of variables, or when one group has far more variables than another and the groups should appear evenly across trees. Trees are grown using the C5.0 algorithm rather than the usual CART algorithm. Supports parallelization (multithreaded), missing values in predictors, and categorical variables (without one-hot encoding them). Can also be used to create a regular (non-stratified) Random Forest-like model made up of C5.0 trees, with some additional control options. Because it is built from C5.0 trees, it works only for classification, not for regression.
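
The per-group sampling described above can be illustrated with a minimal sketch. This is not the package's code or API: the function name `sample_variables_by_group`, the column names, and the per-group counts are all hypothetical, and the sketch is written in Python with only the standard library. It shows only how drawing a fixed number of variables from each group guarantees that every tree sees variables from every group.

```python
import random


def sample_variables_by_group(groups, n_per_group, seed=None):
    """Draw variables at random, without replacement, from each group separately.

    groups: dict mapping a group name to a list of variable names.
    n_per_group: dict mapping a group name to how many variables to draw.
    Both argument names are illustrative assumptions, not the package's API.
    """
    rng = random.Random(seed)
    chosen = []
    for name, variables in groups.items():
        # Never ask for more variables than the group contains.
        k = min(n_per_group[name], len(variables))
        chosen.extend(rng.sample(variables, k))
    return chosen


# Hypothetical column groups: few user variables, many product variables.
groups = {
    "user": ["age", "country", "signup_year"],
    "product": ["price", "category", "brand", "weight", "color", "rating"],
}
n_per_group = {"user": 2, "product": 2}

# One candidate variable set per tree; each tree would then be grown
# using only its own candidate set.
candidates_per_tree = [
    sample_variables_by_group(groups, n_per_group, seed=tree_id)
    for tree_id in range(5)
]

for tree_id, candidates in enumerate(candidates_per_tree):
    print(tree_id, candidates)
```

With plain (non-stratified) sampling of four variables out of the nine above, a tree could easily receive only product variables; sampling per group rules that out by construction.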

Version: 0.2.2
Imports: C50, dplyr, parallel, stats
Published: 2017-06-30
DOI: 10.32614/CRAN.package.StratifiedRF
Author: David Cortes
Maintainer: David Cortes <david.cortes.rivera at>
License: GPL-3
NeedsCompilation: no
In views: MissingData
CRAN checks: StratifiedRF results


Reference manual: StratifiedRF.pdf


Package source: StratifiedRF_0.2.2.tar.gz
Windows binaries: r-devel:, r-release:, r-oldrel:
macOS binaries: r-release (arm64): StratifiedRF_0.2.2.tgz, r-oldrel (arm64): StratifiedRF_0.2.2.tgz, r-release (x86_64): StratifiedRF_0.2.2.tgz, r-oldrel (x86_64): StratifiedRF_0.2.2.tgz
Old sources: StratifiedRF archive

