Description
Distribution-Based Model Selection.
Perform model selection using distribution and probability-based methods, including standardized AIC, BIC, and AICc. These standardized information criteria support model selection in a manner similar to the prevalent "Rule of 2" method, but ground it formally in probability theory. A novel goodness-of-fit procedure for assessing linear regression models is also available. This test relies on theoretical properties of the estimated error variance for a normal linear regression model and employs a bootstrap procedure to assess the null hypothesis that the fitted model shows no lack of fit. For more information, see Koeneman and Cavanaugh (2023) <arXiv:2309.10614>. Functionality to perform all-subsets linear or generalized linear regression is also available.
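For context, the conventional "Rule of 2" is usually applied by treating any model whose AIC lies within about 2 units of the smallest AIC as having comparable support. The sketch below illustrates that heuristic with base R only (`lm` and `AIC` from the stats package). It does not use DBModelSelect and is not the standardized criterion described above; the simulated data and the names `sim`, `candidates`, `delta`, and `comparable` are made up for illustration.

```r
# Simulate data in which only s is related to the response (illustrative only)
set.seed(1)
n <- 200
sim <- data.frame(s = rnorm(n), t = rnorm(n))
sim$y <- sim$s + rnorm(n)

# A small set of candidate linear models
candidates <- list(
  null = lm(y ~ 1, data = sim),
  s    = lm(y ~ s, data = sim),
  t    = lm(y ~ t, data = sim),
  both = lm(y ~ s + t, data = sim)
)

# AIC of each candidate and its difference from the smallest AIC
aic <- sapply(candidates, AIC)
delta <- aic - min(aic)

# Conventional "Rule of 2": candidates within 2 AIC units of the best
comparable <- names(delta)[delta <= 2]
print(round(delta, 2))
print(comparable)
```

The cutoff of 2 is a rule of thumb rather than a probability statement, which is the gap the standardized criteria are meant to address.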
README.md
DBModelSelect
DBModelSelect provides a collection of distribution-based model selection techniques.
Installation
You can install the development version of DBModelSelect from GitHub with:
# install.packages("devtools")
devtools::install_github("shkoeneman/DBModelSelect")
Example
This is a basic example which shows you how to use various features of the package:
library(DBModelSelect)
# generate some data
set.seed(5122023)
data <- data.frame(s = rnorm(200), t = rnorm(200))
data$y <- data$s + rnorm(200)
# perform all subsets regression
model_list <- FitLMSubsets(response = "y", data = data, intercept = TRUE, force_intercept = FALSE)
# determine whether the largest candidate model shows lack of fit
BootGOFTestLM(model_list[[length(model_list)]], data = data)
# perform model selection
model_select <- StandICModelSelect(model_list, IC = "AIC")
# print and plot results of model selection
print(model_select)
plot(model_select)
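To make the idea of all-subsets regression concrete, here is a rough base-R sketch that enumerates every subset of the predictors and fits one linear model per subset. It is an illustration of the general technique under simple assumptions, not the implementation behind `FitLMSubsets`; the names `dat`, `predictors`, `subsets`, `fits`, and `aic_table` are hypothetical.

```r
# Same style of simulated data as the example above
set.seed(5122023)
dat <- data.frame(s = rnorm(200), t = rnorm(200))
dat$y <- dat$s + rnorm(200)

# Every column other than the response is a candidate predictor
predictors <- setdiff(names(dat), "y")

# Enumerate every non-empty subset of the predictors
subsets <- unlist(
  lapply(seq_along(predictors),
         function(k) combn(predictors, k, simplify = FALSE)),
  recursive = FALSE
)

# Build one formula per subset, plus the intercept-only model
formulas <- c(
  "y ~ 1",
  vapply(subsets,
         function(v) paste("y ~", paste(v, collapse = " + ")),
         character(1))
)

# Fit each candidate and collect its AIC
fits <- lapply(formulas, function(f) lm(as.formula(f), data = dat))
names(fits) <- formulas
aic_table <- data.frame(model = formulas, AIC = sapply(fits, AIC),
                        row.names = NULL)

# Candidates ordered from smallest to largest AIC
print(aic_table[order(aic_table$AIC), ])
```

With p predictors this fits 2^p models, so exhaustive enumeration is only practical for a modest number of predictors.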