Description

Baseline Models for Classification and Regression.

Description

Providing equivalent functions for the dummy classifier and regressor used in 'Python' 'scikit-learn' library. Our goal is to allow R users to easily identify baseline performance for their classification and regression problems. Our baseline models use no predictors, and are useful in cases of class imbalance, multiclass classification, and when users want to quickly identify how much improvement their statistical and machine learning models are over several baseline models. We use a "better" default (proportional guessing) for the dummy classifier than the 'Python' implementation ("prior", which is the most frequent class in the training set). The functions in the package can be used on their own, or introduce methods named 'dummy_regressor' or 'dummy_classifier' that can be used within the caret package pipeline.

README.md

cran.r-project.org

basemodels: Baseline Models for Classification and Regression

This R package, basemodels, provides equivalent functions for the dummy classifier and regressor used in 'Python' 'scikit-learn' library with some modifications. Our goal is to allow R users to easily identify baseline performance for their classification and regression problems. Our baseline models use no predictors, and are useful in cases of class imbalance, multi-class classification, and when users want to quickly identify how much improvement their statistical and machine learning models are over several baseline models. We use a "better" default (proportional guessing) for the dummy classifier than the Python implementation ("prior", which is the most frequent class in the training set).

Example

# Split the data into training and testing sets
set.seed(2023)
index <- sample(1:nrow(iris), nrow(iris) * 0.8)
train_data <- iris[index,]
test_data <- iris[-index,]
dummy_model <- dummy_classifier(train_data$Species, strategy = "proportional", random_state = 2024)

# Make predictions using the trained dummy classifier
pred_vec <- predict_dummy_classifier(dummy_model, test_data)

# Evaluate the performance of the dummy classifier
conf_matrix <- caret::confusionMatrix(pred_vec, test_data$Species)
print(conf_matrix)

# Make predictions using the trained dummy regressor
reg_model <- dummy_regressor(train_data$Sepal.Length, strategy = "median")
y_hat <- predict_dummy_regressor(reg_model, test_data)
# Find mean squared error
mean((test_data$Sepal.Length-y_hat)^2)

Install

The package can be installed directly from CRAN:

install.packages("basemodels")

or directly from GitHub:

devtools::install_github("Ying-Ju/basemodels")

r-basemodels

basemodels: Baseline Models for Classification and Regression

Example

Install

Version

License

Status

Source

Homepage

Platforms (77)

basemodels: Baseline Models for Classification and Regression

Example

Install

Version

License

Status

Source

Homepage

Platforms77 (77)

Platforms (77)