Assessing the Partial Association Between Ordinal Variables.
PAsso: an R Package for Assessing Partial Association between Ordinal Variables
Overview
An implementation of the unified framework for assessing Partial Association between ordinal variables after adjusting for a set of covariates (Dungang Liu, Shaobo Li, Yan Yu and Irini Moustaki (2020), accepted by the Journal of the American Statistical Association). The package provides a set of tools to quantify, visualize, and test partial associations between multiple ordinal variables. It can produce a number of $\phi$ measures, partial regression plots, 3-D plots, and $p$-values for testing $H_0: \phi=0$ or $H_0: \phi \leq \delta$.
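To make the idea of partial association concrete, here is a minimal simulated sketch that is not taken from the package. It assumes a probit link and hand-codes surrogate residuals in the spirit of Liu and Zhang (2018); the helpers `to_ordinal()` and `surrogate_resid()` and all variable names are hypothetical, and PAsso's own implementation may differ in detail. Two ordinal outcomes depend on the same covariate but are otherwise independent, so their marginal Kendall's tau should be clearly positive while the tau between their residuals should be close to zero.

```r
library(MASS)  # polr() for ordinal probit models; MASS ships with R

set.seed(2020)
n <- 1000
x <- rnorm(n)  # shared covariate

# Hypothetical helper: cut a latent score into 3 ordered categories
to_ordinal <- function(z) {
  factor(cut(z, breaks = c(-Inf, -0.5, 0.5, Inf), labels = FALSE), ordered = TRUE)
}

# Both outcomes depend on x, but their error terms are independent
y1 <- to_ordinal(x + rnorm(n))
y2 <- to_ordinal(x + rnorm(n))

fit1 <- polr(y1 ~ x, method = "probit")
fit2 <- polr(y2 ~ x, method = "probit")

# Hypothetical helper (assumption: probit link). Draw the latent error from a
# standard normal truncated to the interval implied by the observed category.
surrogate_resid <- function(fit, y) {
  k <- as.integer(y)               # observed category index
  eta <- fit$lp                    # fitted linear predictor
  cuts <- c(-Inf, fit$zeta, Inf)   # estimated thresholds
  u <- runif(length(k), pnorm(cuts[k] - eta), pnorm(cuts[k + 1] - eta))
  qnorm(u)
}

r1 <- surrogate_resid(fit1, y1)
r2 <- surrogate_resid(fit2, y2)

tau_marginal <- cor(as.integer(y1), as.integer(y2), method = "kendall")
tau_partial <- cor(r1, r2, method = "kendall")
print(round(c(marginal = tau_marginal, partial = tau_partial), 3))
```

The marginal association here is induced entirely by the shared covariate, which is the situation the partial association measures are designed to untangle.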
Installation
The PAsso package is currently available on CRAN.
Install the PAsso development version from GitHub (recommended)
# Install the development version from GitHub
if (!requireNamespace("devtools")) install.packages("devtools")
devtools::install_github("XiaoruiZhu/PAsso")
Install PAsso from CRAN
# Install from CRAN
install.packages("PAsso")
# On macOS, if installing PAsso v0.1.9 fails with "error: 'math.h' file not found",
# one possible workaround is:
install.packages("https://cran.r-project.org/bin/macosx/el-capitan/contrib/3.6/PAsso_0.1.9.tgz",
repos = NULL, type = "source")
# If you then get the error "there is no package called 'gsl'", try:
install.packages("gsl", type = "mac.binary")
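After either installation route, a quick check along these lines can confirm that R finds the package. This is a small sketch using only base R functions; the variable name `passo_installed` is just illustrative.

```r
# TRUE if the PAsso namespace can be loaded, FALSE otherwise
passo_installed <- requireNamespace("PAsso", quietly = TRUE)

if (passo_installed) {
  message("PAsso version: ", as.character(utils::packageVersion("PAsso")))
} else {
  message("PAsso is not installed; see the installation commands above.")
}
```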
Example
The following example shows the R code for evaluating the partial association between a binary variable $\text{PreVote.num}$ and an ordinal variable $\text{PID}$, while adjusting for age, education, and income. Here, $\text{PreVote.num}$ is the respondent's voting preference between Donald Trump and Hillary Clinton, and $\text{PID}$ is the respondent's party identification, with 7 ordinal levels from strong Democrat (=1) to strong Republican (=7). The data set is drawn from the 2016 American National Election Study.
library(PAsso)
PAsso_1 <- PAsso(responses = c("PreVote.num", "PID"),
adjustments = c("income.num", "age", "edu.year"),
data = ANES2016,
uni.model = "probit",
method = c("kendall"))
# Print the partial association matrix only
print(PAsso_1, 5)
# Provide:
# 1. partial association matrix;
# 2. marginal association matrix for comparison purpose;
# 3. summary of models' coefficients for model diagnostics and interpretation
summary(PAsso_1, 4)
# Draw the partial regression plot of the residuals
plot(PAsso_1)
# Retrieve the residuals that are used as ingredients for the partial association analyses
test_resids <- residuals(PAsso_1, draw = 1)
head(test_resids)
# test function: Conduct inference based on object of "PAsso.test" class ----------------------------
library(progress)
system.time(Pcor_SR_test1 <- test(object = PAsso_1, bootstrap_rep = 100, H0 = 0, parallel = FALSE))
print(Pcor_SR_test1, digits=6)
# diagnostic.plot function -----------------------------------------------------
check_qq <- diagnostic.plot(object = PAsso_1, output = "qq")
check_fitted <- diagnostic.plot(object = PAsso_1, output = "fitted")
check_covar <- diagnostic.plot(object = PAsso_1, output = "covariate")
# Or, more specifically, draw the residual-vs-covariate plot for the second model,
# with response "PID" and covariate "income.num"
diagnostic.plot(object = PAsso_1, output = "covariate", x_name = "income.num", model_id = 2)
# General association measure and 3-D plot for PreVote.num and PID ------------------
library("copula"); library("plotly")
# Draw all pairs
testPlots <- plot3D(PAsso_1)
testPlots$`PreVote.num v.s. PID`
# Draw just one pair
testPlots2 <- plot3D(object = PAsso_1, y1 = "PreVote.num", y2 = "PID")
testPlots2
# Advanced use of PAsso(): supply previously fitted models directly ------------------------------
fit_vote <- glm(PreVote.num ~ income.num + age + edu.year, data = ANES2016,
family = binomial(link = "probit"))
summary(fit_vote)
library(MASS)
fit_PID <- polr(as.factor(PID) ~ income.num + age + edu.year, data = ANES2016,
method = "probit", Hess = TRUE)
summary(fit_PID)
system.time(PAsso_adv1 <- PAsso(fitted.models=list(fit_vote, fit_PID),
association = c("partial"),
method = c("kendall"),
resids.type = "surrogate")
)
# Partial association coefficients
print(PAsso_adv1, digits = 3)
summary(PAsso_adv1, digits = 3)
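The `test()` call in the example takes a `bootstrap_rep` argument, which suggests bootstrap-based inference. As a rough, generic illustration of how a bootstrap can quantify the uncertainty of a Kendall's tau, the sketch below resamples pairs of simulated residuals. It is not a description of what `test()` does internally (a full procedure would presumably also refit the models on each resample), and all names in it are hypothetical.

```r
set.seed(1)
n <- 300

# Simulated stand-ins for two residual vectors with positive association
r1 <- rnorm(n)
r2 <- 0.5 * r1 + rnorm(n)

tau_hat <- cor(r1, r2, method = "kendall")

# Resample (r1, r2) pairs with replacement and recompute tau each time
n_boot <- 200
boot_tau <- replicate(n_boot, {
  idx <- sample.int(n, replace = TRUE)
  cor(r1[idx], r2[idx], method = "kendall")
})

se_boot <- sd(boot_tau)                       # bootstrap standard error
ci <- quantile(boot_tau, c(0.025, 0.975))     # percentile interval

print(round(c(tau = tau_hat, se = se_boot, ci), 3))
```

If the percentile interval excludes zero, that is informal evidence against $H_0: \phi=0$ for this simulated pair; for real analyses, the package's own `test()` function should be used.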
References
Liu, D., Li, S., Yu, Y., & Moustaki, I. (2020). Assessing partial association between ordinal variables: quantification, visualization, and hypothesis testing. Journal of the American Statistical Association, 1-14. <10.1080/01621459.2020.1796394>
Liu, D., & Zhang, H. (2018). Residuals and diagnostics for ordinal regression models: A surrogate approach. Journal of the American Statistical Association, 113(522), 845-854. <10.1080/01621459.2017.1292915>
Greenwell, B. M., McCarthy, A. J., Boehmke, B. C., & Liu, D. (2018). Residuals and diagnostics for binary and ordinal regression models: An introduction to the sure package. The R Journal. <10.32614/RJ-2018-004>