Description

Automatically Builds 12 Classification Models.

Description

Automatically builds 12 classification models from data. The package first builds six individual classification models and records each model's error (RMSE) and predictions. Those results are used to create an ensemble, which is then modeled using six methods. The process is repeated as many times as the user requests, and the mean of the results is presented in a summary table. In total, the package returns 26 plots, 5 tables and a summary report. It also returns the confusion matrices for all 12 models, tables of the correlation of the numeric data, the results of the variance inflation process, the head of the ensemble and the head of the data frame.
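The workflow described above (individual models, an ensemble built from their predictions, repeated resampling, averaged results) can be sketched in a few lines. This is only an illustration under stated assumptions, not the package's implementation: it uses two stand-in models (`rpart` and `MASS::lda`) instead of six, a simple 60/40 split, and the built-in `iris` data. The names `numresamples`, `accuracy`, `ensemble` and `summary_table` are chosen for illustration.

```r
library(rpart)  # recommended package shipped with R
library(MASS)   # recommended package shipped with R

set.seed(42)
numresamples <- 5  # illustrative value

# Share of predictions that match the true class
accuracy <- function(pred, truth) mean(as.character(pred) == as.character(truth))

results <- replicate(numresamples, {
  # Random 60/40 train/test split (an assumption, not the package's split)
  idx   <- sample(nrow(iris), size = round(0.6 * nrow(iris)))
  train <- iris[idx, ]
  test  <- iris[-idx, ]

  # Two individual models standing in for the package's six
  tree_fit  <- rpart(Species ~ ., data = train, method = "class")
  lda_fit   <- lda(Species ~ ., data = train)
  tree_pred <- predict(tree_fit, test, type = "class")
  lda_pred  <- predict(lda_fit, test)$class

  # The ensemble: one column of predictions per model, plus the true class
  ensemble <- data.frame(tree = tree_pred, lda = lda_pred, y = test$Species)

  # Split the ensemble and fit a second-level model on it
  e_idx  <- sample(nrow(ensemble), size = round(0.6 * nrow(ensemble)))
  e_fit  <- rpart(y ~ ., data = ensemble[e_idx, ], method = "class")
  e_pred <- predict(e_fit, ensemble[-e_idx, ], type = "class")

  c(tree          = accuracy(tree_pred, test$Species),
    lda           = accuracy(lda_pred, test$Species),
    ensemble_tree = accuracy(e_pred, ensemble$y[-e_idx]))
})

# Mean accuracy over all resamples, sorted highest to lowest
summary_table <- sort(rowMeans(results), decreasing = TRUE)
print(summary_table)
```

Each column of `results` holds the accuracies from one resample, so `rowMeans` gives the per-model average that a summary table would report.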

README.md

ClassificationEnsembles

The goal of ClassificationEnsembles is to automatically conduct a thorough analysis of classification data. The user only needs to provide the data and answer a few questions (such as which column to analyze). ClassificationEnsembles fits 12 models (6 individual models and 6 ensembles of models). The package also returns 26 plots, 5 tables and a summary report sorted by accuracy (highest to lowest).

Installation

You can install the development version of ClassificationEnsembles like so:

devtools::install_github("InfiniteCuriosity/ClassificationEnsembles")

Example

ClassificationEnsembles will model the quality of the shelving location for car seats (the ShelveLoc column, column 7: Good, Medium or Bad) based on the other features in the Carseats data set:

library(ClassificationEnsembles)
Classification(data = ISLR::Carseats,
               colnum = 7,
               numresamples = 25,
               predict_on_new_data = "N",
               set_seed = "N",
               remove_VIF_above = 5.00,
               scale_all_numeric_predictors_in_data = "N",
               how_to_handle_strings = 1,
               save_all_trained_models = "N",
               save_all_plots = "N",
               use_parallel = "Y",
               train_amount = 0.60,
               test_amount = 0.20,
               validation_amount = 0.20)
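Two of the arguments above are easy to get wrong: `colnum` is a column position rather than a name, and the three `*_amount` arguments should add up to 1. The sketch below shows one way to look up a column position and what a 60/20/20 split looks like in base R. It uses the built-in `iris` data, and the splitting logic is an assumption for illustration; the package's internal split may differ.

```r
# Find the position of the target column by name
colnum <- match("Species", names(iris))

# Proportions as in the example call; they should sum to 1
train_amount      <- 0.60
test_amount       <- 0.20
validation_amount <- 0.20
stopifnot(isTRUE(all.equal(train_amount + test_amount + validation_amount, 1)))

set.seed(1)
n        <- nrow(iris)
shuffled <- sample(n)                 # random permutation of row indices
n_train  <- round(train_amount * n)
n_test   <- round(test_amount * n)

# Three disjoint subsets that together cover every row
train      <- iris[shuffled[1:n_train], ]
test       <- iris[shuffled[(n_train + 1):(n_train + n_test)], ]
validation <- iris[shuffled[(n_train + n_test + 1):n], ]

c(train = nrow(train), test = nrow(test), validation = nrow(validation))
```

For the Carseats example, the same lookup would be `match("ShelveLoc", names(ISLR::Carseats))`, which requires the ISLR package to be installed.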

The 12 classification models which are built automatically are:

C50
Ensemble Bagged Cart
Ensemble Bagged Random Forest
Ensemble C50
Ensemble Naive Bayes
Ensemble Support Vector Machines
Ensemble Trees
Linear
Partial Least Squares
Penalized Discriminant Analysis
RPart
Trees

The 26 plots it returns automatically are:

Holdout accuracy / train accuracy by model, fixed scales
Residuals by model, free scales
Residuals by model, fixed scales
Classification error, free scales
Classification error, fixed scales
Accuracy data, free scales
Accuracy data, fixed scales
Accuracy by model, free scales
Accuracy by model, fixed scales
Histograms of numeric columns
Boxplots of numeric columns
Duration barchart
False negative rate, free scales
False negative rate, fixed scales
False positive rate, free scales
False positive rate, fixed scales
True negative rate, free scales
True negative rate, fixed scales
True positive rate, free scales
True positive rate, fixed scales
Over or underfitting barchart
Model accuracy barchart
Barchart of each feature vs target by percentage
Barchart of each feature vs target by value
Correlation of numeric data as circles and colors
Correlation of numeric data as numbers and colors

The 5 tables the package returns automatically are:

Head of the ensemble
Head of the data frame
Variance Inflation Factor of the numeric columns
Correlation of the data
Summary report, including accuracy, duration, overfitting, sum of diagonals
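The Variance Inflation Factor table and the `remove_VIF_above` argument can be illustrated with base R. For predictor j, VIF is 1 / (1 - R²), where R² comes from regressing predictor j on the other predictors; equivalently, the VIFs are the diagonal of the inverse correlation matrix. The sketch below uses four columns of the built-in `mtcars` data and a one-pass filter at a threshold of 5. How the package itself removes columns is not stated here, so the filter step is an assumption.

```r
# Numeric predictors from a built-in data set (illustrative choice)
X <- mtcars[, c("mpg", "disp", "hp", "wt")]

# VIFs are the diagonal of the inverse of the correlation matrix
vif <- diag(solve(cor(X)))

# Cross-check one value against the regression definition 1 / (1 - R^2)
r2_mpg <- summary(lm(mpg ~ disp + hp + wt, data = X))$r.squared
stopifnot(isTRUE(all.equal(unname(vif["mpg"]), 1 / (1 - r2_mpg))))

# One-pass filter: keep columns whose VIF does not exceed the threshold
remove_VIF_above <- 5.00
keep <- names(vif)[vif <= remove_VIF_above]

print(round(vif, 2))
print(keep)
```

An iterative variant would drop the column with the highest VIF, recompute, and repeat until every remaining VIF is below the threshold.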

The package also returns 12 summary tables (sometimes called confusion matrices), one for each of the models. These are printed to the Console. For example, using the dry_beans_small classification data set:

ensemble_bag_rf_test_pred BARBUNYA BOMBAY CALI DERMASON HOROZ SEKER SIRA
BARBUNYA                        21      0    0        0     0     0    0
BOMBAY                           0     16    0        0     0     0    0
CALI                             0      0   35        0     0     0    0
DERMASON                         0      0    0       76     0     0    0
HOROZ                            0      0    0        0    36     0    0
SEKER                            0      0    0        0     0    48    0
SIRA                             0      0    0        0     0     0   51
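The summary report's accuracy and "sum of diagonals" can be read directly off a confusion matrix like the one above. The sketch below rebuilds that matrix in base R and computes both values, plus per-class true positive rates. It assumes rows are predictions and columns are actual classes, which the row label suggests but the text does not state.

```r
classes <- c("BARBUNYA", "BOMBAY", "CALI", "DERMASON", "HOROZ", "SEKER", "SIRA")
correct <- c(21, 16, 35, 76, 36, 48, 51)

# All off-diagonal cells are zero in the example, so a diagonal matrix suffices
cm <- diag(correct)
dimnames(cm) <- list(predicted = classes, actual = classes)

sum_of_diagonals <- sum(diag(cm))         # correctly classified observations
accuracy <- sum_of_diagonals / sum(cm)    # share of all observations

# Per-class true positive rate: correct predictions / actual class totals
true_positive_rate <- diag(cm) / colSums(cm)

sum_of_diagonals  # 283
accuracy          # 1
```

Because every observation falls on the diagonal, accuracy is exactly 1 and every true positive rate is 1 for this example.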

A data summary is also in the Console. Using dry_beans_small as an example:

$Data_summary
Eccentricity     ConvexArea       Extent           Solidity         roundness        ShapeFactor4
Min.   :0.2190   Min.   : 20825   Min.   :0.5802   Min.   :0.9551   Min.   :0.5718   Min.   :0.9550
1st Qu.:0.7175   1st Qu.: 37052   1st Qu.:0.7240   1st Qu.:0.9859   1st Qu.:0.8320   1st Qu.:0.9941
Median :0.7642   Median : 45261   Median :0.7606   Median :0.9886   Median :0.8833   Median :0.9966
Mean   :0.7517   Mean   : 53997   Mean   :0.7519   Mean   :0.9874   Mean   :0.8750   Mean   :0.9952
3rd Qu.:0.8117   3rd Qu.: 62159   3rd Qu.:0.7887   3rd Qu.:0.9903   3rd Qu.:0.9191   3rd Qu.:0.9980
Max.   :0.9082   Max.   :229994   Max.   :0.8325   Max.   :0.9937   Max.   :0.9879   Max.   :0.9996

BARBUNYA: 79
BOMBAY : 31
CALI : 97
DERMASON:212
HOROZ :115
SEKER :121
SIRA :158
