Description

Handling Heteroskedasticity in the Linear Regression Model.

Implements numerous methods for testing for, modelling, and correcting for heteroskedasticity in the classical linear regression model. The most novel contribution of the package is found in the functions that implement the as-yet-unpublished auxiliary linear variance models and auxiliary nonlinear variance models that are designed to estimate error variances in a heteroskedastic linear regression model. These models follow principles of statistical learning described in Hastie et al. (2009) <doi:10.1007/978-0-387-21606-5>. The nonlinear version of the model is estimated using quasi-likelihood methods as described in Seber and Wild (2003, ISBN: 0-471-47135-6). Bootstrap methods for approximate confidence intervals for error variances are implemented as described in Efron and Tibshirani (1993, ISBN: 978-1-4899-4541-9), including the expansion technique described in Hesterberg (2014) <doi:10.1080/00031305.2015.1089789>. The wild bootstrap employed here follows the description in Davidson and Flachaire (2008) <doi:10.1016/j.jeconom.2008.08.003>. Tuning of hyper-parameters makes use of a golden section search function that is modelled after the MATLAB function of Zarnowiec (2022) <https://www.mathworks.com/matlabcentral/fileexchange/25919-golden-section-method-algorithm>. A methodological description of the algorithm can be found in Fox (2021, ISBN: 978-1-003-00957-3). There are 25 different functions that implement hypothesis tests for heteroskedasticity.
These include a test based on Anscombe (1961) <https://projecteuclid.org/euclid.bsmsp/1200512155>, Ramsey's (1969) BAMSET Test <doi:10.1111/j.2517-6161.1969.tb00796.x>, the tests of Bickel (1978) <doi:10.1214/aos/1176344124>, Breusch and Pagan (1979) <doi:10.2307/1911963> with and without the modification proposed by Koenker (1981) <doi:10.1016/0304-4076(81)90062-2>, Carapeto and Holt (2003) <doi:10.1080/0266476022000018475>, Cook and Weisberg (1983) <doi:10.1093/biomet/70.1.1> (including their graphical methods), Diblasi and Bowman (1997) <doi:10.1016/S0167-7152(96)00115-0>, Dufour, Khalaf, Bernard, and Genest (2004) <doi:10.1016/j.jeconom.2003.10.024>, Evans and King (1985) <doi:10.1016/0304-4076(85)90085-5> and Evans and King (1988) <doi:10.1016/0304-4076(88)90006-1>, Glejser (1969) <doi:10.1080/01621459.1969.10500976> as formulated by Mittelhammer, Judge and Miller (2000, ISBN: 0-521-62394-4), Godfrey and Orme (1999) <doi:10.1080/07474939908800438>, Goldfeld and Quandt (1965) <doi:10.1080/01621459.1965.10480811>, Harrison and McCabe (1979) <doi:10.1080/01621459.1979.10482544>, Harvey (1976) <doi:10.2307/1913974>, Honda (1989) <doi:10.1111/j.2517-6161.1989.tb01749.x>, Horn (1981) <doi:10.1080/03610928108828074>, Li and Yao (2019) <doi:10.1016/j.ecosta.2018.01.001> with and without the modification of Bai, Pan, and Yin (2016) <doi:10.1007/s11749-017-0575-x>, Rackauskas and Zuokas (2007) <doi:10.1007/s10986-007-0018-6>, Simonoff and Tsai (1994) <doi:10.2307/2986026> with and without the modification of Ferrari, Cysneiros, and Cribari-Neto (2004) <doi:10.1016/S0378-3758(03)00210-6>, Szroeter (1978) <doi:10.2307/1913831>, Verbyla (1993) <doi:10.1111/j.2517-6161.1993.tb01918.x>, White (1980) <doi:10.2307/1912934>, Wilcox and Keselman (2006) <doi:10.1080/10629360500107923>, Yuce (2008) <https://dergipark.org.tr/en/pub/iuekois/issue/8989/112070>, and Zhou, Song, and Thompson (2015) <doi:10.1002/cjs.11252>. 
Besides these heteroskedasticity tests, there are supporting functions that compute the BLUS residuals of Theil (1965) <doi:10.1080/01621459.1965.10480851>, the conditional two-sided p-values of Kulinskaya (2008) <doi:10.48550/arXiv.0810.2124>, and probabilities for the nonparametric trend statistic of Lehmann (1975, ISBN: 0-816-24996-1). For handling heteroskedasticity, in addition to the new auxiliary variance model methods, there is a function to implement various existing Heteroskedasticity-Consistent Covariance Matrix Estimators from the literature, such as those of White (1980) <doi:10.2307/1912934>, MacKinnon and White (1985) <doi:10.1016/0304-4076(85)90158-7>, Cribari-Neto (2004) <doi:10.1016/S0167-9473(02)00366-3>, Cribari-Neto et al. (2007) <doi:10.1080/03610920601126589>, Cribari-Neto and da Silva (2011) <doi:10.1007/s10182-010-0141-2>, Aftab and Chang (2016) <doi:10.18187/pjsor.v12i2.983>, and Li et al. (2017) <doi:10.1080/00949655.2016.1198906>.
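
The golden section search mentioned in the description can be illustrated with a generic textbook version in base R. This is only a sketch under stated assumptions: the function name golden_section, its arguments, and the test objective are hypothetical and are not taken from the package, whose own tuning function may differ in interface and stopping rule.

```r
# Generic golden section search for the minimum of a unimodal function f
# on [lower, upper]. Hypothetical helper, not the package's internal function.
golden_section <- function(f, lower, upper, tol = 1e-8, max_iter = 200L) {
  invphi <- (sqrt(5) - 1) / 2          # reciprocal of the golden ratio
  a <- lower
  b <- upper
  x1 <- b - invphi * (b - a)           # left interior point
  x2 <- a + invphi * (b - a)           # right interior point
  f1 <- f(x1)
  f2 <- f(x2)
  for (i in seq_len(max_iter)) {
    if (abs(b - a) < tol) break
    if (f1 < f2) {
      # Minimum lies in [a, x2]; the old x1 becomes the new x2
      b <- x2
      x2 <- x1
      f2 <- f1
      x1 <- b - invphi * (b - a)
      f1 <- f(x1)
    } else {
      # Minimum lies in [x1, b]; the old x2 becomes the new x1
      a <- x1
      x1 <- x2
      f1 <- f2
      x2 <- a + invphi * (b - a)
      f2 <- f(x2)
    }
  }
  (a + b) / 2
}

# Illustrative objective with a known minimum at x = 2
opt <- golden_section(function(x) (x - 2)^2, lower = 0, upper = 5)
opt
```

Each iteration reuses one of the two previous function evaluations, so only one new evaluation of f is needed per step, which is the appeal of the method when the objective is expensive to compute.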

README.md

skedastic

The purpose of the skedastic package is to make a suite of old and new methods for detecting and correcting for heteroskedasticity in linear regression models accessible to R users.

Installation

# Install from CRAN
install.packages("skedastic", dependencies = c("Depends", "Imports"))

# Or the development version from GitHub:
install.packages("devtools")
devtools::install_github("tjfarrar/skedastic")

Usage

Heteroskedasticity (sometimes spelt 'heteroscedasticity') is a violation of one of the assumptions of the classical linear regression model (the Gauss-Markov Assumptions). This assumption, known as homoskedasticity, holds that the variance of the random error term remains constant across all observations. Under heteroskedasticity, the Ordinary Least Squares estimator remains unbiased but is no longer the Best Linear Unbiased Estimator (BLUE) of the parameter vector, and the classical t-tests for significance of the parameters are invalid because the usual standard errors are biased. Thus, heteroskedasticity-robust methods are required.
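
To make the definition concrete, here is a small base R simulation, offered as an illustration only. The data-generating process (error standard deviation proportional to x) and all variable names are assumptions chosen for this sketch; nothing here uses the package itself.

```r
# Simulate a regression whose error standard deviation grows with x
set.seed(1)
n <- 500
x <- runif(n, min = 1, max = 10)
sigma <- 0.5 * x                       # assumed variance function: sd proportional to x
y <- 2 + 3 * x + rnorm(n, sd = sigma)

fit <- lm(y ~ x)
res <- residuals(fit)

# Compare the residual variance for small and large values of x
low <- x <= median(x)
var_low <- var(res[low])
var_high <- var(res[!low])
var_high / var_low                     # a ratio well above 1 indicates non-constant variance
```

Under homoskedasticity the two variances would be roughly equal; here the residuals are visibly more dispersed in the upper half of the x range.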

The most novel functionality of this package is provided by the alvm.fit and anlvm.fit functions, which fit an Auxiliary Linear Variance Model or an Auxiliary Nonlinear Variance Model, respectively. These are new models for estimating error variances in heteroskedastic linear regression models, developed as part of the author's doctoral research.

The hccme function computes heteroskedasticity-consistent covariance matrix estimates for the Ordinary Least Squares estimator $\hat{\beta}$ using ten different methods found in the literature.
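
As a rough illustration of what a heteroskedasticity-consistent covariance matrix estimate is, the simplest such estimator (usually labelled HC0, due to White (1980)) can be computed by hand in base R. This sketch does not call hccme and makes no claim about that function's arguments or internals; the object names are chosen here for illustration.

```r
# HC0 "sandwich" estimate: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
fit <- lm(dist ~ speed, data = cars)
X <- model.matrix(fit)
e <- residuals(fit)

bread <- solve(crossprod(X))           # (X'X)^{-1}
meat <- crossprod(X * e)               # X' diag(e^2) X; row i of X is scaled by e[i]
vcov_hc0 <- bread %*% meat %*% bread

# Compare classical and heteroskedasticity-robust standard errors
se_ols <- sqrt(diag(vcov(fit)))
se_hc0 <- sqrt(diag(vcov_hc0))
cbind(se_ols, se_hc0)
```

The other estimators in this family mostly differ in how each squared residual is rescaled before entering the middle term.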

Twenty-five distinct functions in the package implement hypothesis tests for heteroskedasticity that have previously been published in the academic literature. Other functions implement graphical methods for detecting heteroskedasticity or perform supporting tasks for the tests, such as computing transformations of the Ordinary Least Squares (OLS) residuals that are useful in heteroskedasticity detection, or computing probabilities from the null distribution of a nonparametric test statistic. Certain functions have applications beyond the problem of heteroskedasticity in linear regression. These include pRQF, which computes cumulative probabilities from the distribution of a ratio of quadratic forms in normal random vectors; twosidedpval, which implements three different approaches for calculating two-sided $p$-values from asymmetric null distributions; and dDtrend and pdDtrend, which compute probabilities for Lehmann's nonparametric trend statistic.

Most of the exported functions in the package take a linear model as their primary argument (which can be passed as an lm object). Thus, to use this package a user must first be familiar with how to fit linear regression models using the lm function from package stats.

Here is an example of implementing the Breusch-Pagan Test for heteroskedasticity on a linear regression model fit to the cars dataset, with distance (cars$dist) as the response (dependent) variable and speed (cars$speed) as the explanatory (independent) variable.

library(skedastic)
mylm <- lm(dist ~ speed, data = cars)
breusch_pagan(mylm)
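
For readers who want to see what such a test computes, the studentized (Koenker) form of the Breusch-Pagan statistic can be reproduced by hand in base R: regress the squared OLS residuals on the explanatory variable and multiply the resulting R-squared by the sample size. This is a sketch of the textbook statistic, not of the package's code; whether it matches the output of breusch_pagan depends on that function's options, which are not verified here.

```r
# Koenker's studentized Breusch-Pagan statistic computed from first principles
fit <- lm(dist ~ speed, data = cars)
n <- nobs(fit)

# Auxiliary regression of squared residuals on the explanatory variable
aux_data <- data.frame(e2 = residuals(fit)^2, speed = cars$speed)
aux <- lm(e2 ~ speed, data = aux_data)

stat <- n * summary(aux)$r.squared     # n * R^2 from the auxiliary regression
df <- length(coef(aux)) - 1            # number of auxiliary regressors, excluding the intercept
pval <- pchisq(stat, df = df, lower.tail = FALSE)
c(statistic = stat, df = df, p.value = pval)
```

A small p-value would be evidence against the null hypothesis of homoskedasticity.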

To compute BLUS residuals for the same model:

myblusres <- blus(mylm, omit = "last")
myblusres

To create customised residual plots for the same model:

hetplot(mylm, horzvar = c("explanatory", "log_explanatory"), vertvar = c("res", "res_stud"), vertfun = "2", filetype = NA)

To fit an auxiliary linear variance model to the same linear regression model, assuming that the error variances are a linear function of the speed predictor, and extract the resulting variance estimates:

myalvm <- alvm.fit(mylm, model = "linear")
myalvm$var.est

To fit an auxiliary nonlinear variance model to the same linear regression model, assuming that the error variances are a quadratic function of the speed predictor, and extract the resulting variance estimates:

myanlvm <- anlvm.fit(mylm, g = function(x) x ^ 2)
myanlvm$var.est
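
One common use of error variance estimates is to refit the regression by weighted least squares, with weights equal to the reciprocals of the estimated variances. The sketch below illustrates that idea in base R only. It does not use alvm.fit or anlvm.fit; as a stand-in it obtains variance estimates from a log-linear regression of the squared residuals on speed, in the spirit of the multiplicative heteroskedasticity model of Harvey (1976). The stand-in model and all object names are assumptions made for illustration.

```r
# Feasible weighted least squares using stand-in variance estimates
fit <- lm(dist ~ speed, data = cars)

# Stand-in variance model (an assumption): log(e^2) is linear in speed
aux_data <- data.frame(log_e2 = log(residuals(fit)^2), speed = cars$speed)
aux <- lm(log_e2 ~ speed, data = aux_data)
var_hat <- exp(fitted(aux))            # strictly positive by construction

# Refit with weights inversely proportional to the estimated variances
wls_data <- transform(cars, w = 1 / var_hat)
fit_wls <- lm(dist ~ speed, data = wls_data, weights = w)

cbind(ols = coef(fit), wls = coef(fit_wls))
```

The exponentiated fitted values estimate the variances only up to a constant factor, which does not matter here because the weighted least squares coefficients are unchanged when all weights are multiplied by the same constant.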

Learn More

No vignettes have been created yet for this package. Watch this space.
