Make Quick Descriptive Tables for Categorical Variables.
freqtables
The goal of freqtables is to quickly make tables of descriptive statistics for categorical variables (i.e., counts, percentages, confidence intervals). This package is designed to work in a tidyverse pipeline, with attention given to moving results from R into Microsoft Word® with minimal pain.
Installation
You can install the released version of freqtables from CRAN with:
install.packages("freqtables")
And the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("brad-cannell/freqtables")
Example
Because freqtables is intended to be used in a dplyr pipeline, loading dplyr into your current R session is recommended.
library(dplyr)
library(freqtables)
The examples below will use R’s built-in mtcars data set.
data("mtcars")
freq_table()
The freq_table() function produces one-way and two-way frequency tables for categorical variables. In addition to frequencies, it displays percentages along with their standard errors and confidence intervals. For two-way tables only, freq_table() also displays row (subgroup) percentages, standard errors, and confidence intervals.
For one-way tables, the default 95 percent confidence intervals are logit-transformed confidence intervals equivalent to those used by Stata. Alternatively, freq_table() returns Wald (“linear”) confidence intervals when the ci_type argument is set to "wald".
For two-way tables, freq_table() returns logit-transformed confidence intervals equivalent to those used by Stata.
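To request Wald intervals instead of the default, pass ci_type = "wald" to freq_table(). The following is a minimal sketch, assuming dplyr and freqtables are installed. The object name wald_tbl is only illustrative, and the output is omitted because it is not reproduced in this document.

```r
library(dplyr)
library(freqtables)

# One-way table for transmission type (am), requesting Wald ("linear")
# confidence limits instead of the default logit-transformed limits
wald_tbl <- mtcars %>%
  freq_table(am, ci_type = "wald")

wald_tbl
```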
Here is an example of using freq_table() to create a one-way frequency table with all function arguments left at their default values:
mtcars %>%
  freq_table(am)
#> var cat n n_total percent se t_crit lcl ucl
#> 1 am 0 19 32 59.375 8.820997 2.039513 40.94225 75.49765
#> 2 am 1 13 32 40.625 8.820997 2.039513 24.50235 59.05775
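The se, t_crit, lcl, and ucl values in the first row above can be reproduced by hand in base R. This is a sketch rather than the package's implementation. It assumes the standard error uses n_total - 1 in the denominator and the critical value comes from a t distribution with n_total - 1 degrees of freedom. Both assumptions are inferred from the printed columns, not taken from the package source.

```r
# Counts from the one-way table above: 19 of 32 cars have am == 0
n       <- 19
n_total <- 32

p <- n / n_total

# Assumption: n_total - 1 in the denominator, inferred from the se column
se <- sqrt(p * (1 - p) / (n_total - 1))

# Assumption: t critical value with n_total - 1 degrees of freedom
t_crit <- qt(0.975, df = n_total - 1)

# Logit-transformed limits: build the interval on the log-odds scale,
# then back-transform to the proportion scale with plogis()
logit_p  <- log(p / (1 - p))
se_logit <- se / (p * (1 - p))
logit_ci <- plogis(logit_p + c(-1, 1) * t_crit * se_logit)

# Wald ("linear") limits: symmetric around p on the proportion scale
wald_ci <- p + c(-1, 1) * t_crit * se

round(100 * c(percent = p, se = se), 4)
round(100 * logit_ci, 2)  # compare with lcl and ucl in the first row above
round(100 * wald_ci, 2)
```

Because the logit interval is built on the log-odds scale, its limits always stay between 0 and 100 percent and are not symmetric around the point estimate. The Wald limits are symmetric and can fall outside that range for proportions near 0 or 1.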
Here is an example of using freq_table() to create a two-way frequency table with all function arguments left at their default values:
mtcars %>%
  freq_table(am, cyl)
#> # A tibble: 6 × 17
#> row_var row_cat col_var col_cat n n_row n_total percent_total se_total
#> <chr> <chr> <chr> <chr> <int> <int> <int> <dbl> <dbl>
#> 1 am 0 cyl 4 3 19 32 9.38 5.24
#> 2 am 0 cyl 6 4 19 32 12.5 5.94
#> 3 am 0 cyl 8 12 19 32 37.5 8.70
#> 4 am 1 cyl 4 8 13 32 25 7.78
#> 5 am 1 cyl 6 3 13 32 9.38 5.24
#> 6 am 1 cyl 8 2 13 32 6.25 4.35
#> # … with 8 more variables: t_crit_total <dbl>, lcl_total <dbl>,
#> # ucl_total <dbl>, percent_row <dbl>, se_row <dbl>, t_crit_row <dbl>,
#> # lcl_row <dbl>, ucl_row <dbl>
You can learn more about the freq_table() function and ways to adjust default behaviors in vignette("descriptive_analysis").
freq_test()
The freq_test() function is an S3 generic. It currently has methods for conducting hypothesis tests on one-way and two-way frequency tables. It is also designed to work in a dplyr pipeline with the freq_table() function.
For the freq_table_two_way class, the methods used are Pearson’s chi-square test of independence and Fisher’s exact test. When cell counts are <= 5, Fisher’s exact test is considered more reliable.
Here is an example of using freq_test() to test the equality of proportions on a one-way frequency table with all function arguments left at their default values:
mtcars %>%
  freq_table(am) %>%
  freq_test() %>%
  select(var:percent, p_chi2_pearson)
#> var cat n n_total percent p_chi2_pearson
#> 1 am 0 19 32 59.375 0.2888444
#> 2 am 1 13 32 40.625 0.2888444
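As a cross-check, the same p-value can be obtained with base R. This sketch assumes the one-way method is a Pearson chi-square goodness-of-fit test against equal proportions. That assumption is consistent with the p_chi2_pearson value shown above but is not taken from the package source.

```r
# Observed counts of am == 0 and am == 1 in mtcars
observed <- table(mtcars$am)

# On a one-dimensional table, chisq.test() tests the observed counts
# against equal expected proportions by default
gof <- chisq.test(observed)

gof$p.value  # compare with p_chi2_pearson above
```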
Here is an example of using freq_test() to conduct a chi-square test of independence on a two-way frequency table with all function arguments left at their default values:
mtcars %>%
  freq_table(am, vs) %>%
  freq_test() %>%
  select(row_var:n, percent_row, p_chi2_pearson)
#> # A tibble: 4 × 7
#> row_var row_cat col_var col_cat n percent_row p_chi2_pearson
#> <chr> <chr> <chr> <chr> <int> <dbl> <dbl>
#> 1 am 0 vs 0 12 63.2 0.341
#> 2 am 0 vs 1 7 36.8 0.341
#> 3 am 1 vs 0 6 46.2 0.341
#> 4 am 1 vs 1 7 53.8 0.341
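The two tests named in this section can also be run directly in base R on the same table. This sketch assumes p_chi2_pearson is an uncorrected Pearson chi-square p-value, which matches the value printed above. The names tab, pearson, and fisher are illustrative only.

```r
# 2 x 2 contingency table of transmission (am) by engine shape (vs)
tab <- table(am = mtcars$am, vs = mtcars$vs)

# correct = FALSE turns off the Yates continuity correction that
# chisq.test() applies to 2 x 2 tables by default
pearson <- chisq.test(tab, correct = FALSE)

# Fisher's exact test on the same table
fisher <- fisher.test(tab)

c(p_pearson = pearson$p.value, p_fisher = fisher$p.value)
```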
You can learn more about the freq_test() function and ways to adjust default behaviors in vignette("using_freq_test").
freq_format()
The freq_format() function is intended to make it quick and easy to format the output of freq_table() for tables that may be used in publications. For example, a percentage and its 95% confidence interval could be formatted as “24.00 (21.00 - 27.00).”
mtcars %>%
  freq_table(am) %>%
  freq_format(
    recipe = "percent (lcl - ucl)",
    name = "percent_95",
    digits = 2
  ) %>%
  select(var, cat, percent_95)
#> var cat percent_95
#> 1 am 0 59.38 (40.94 - 75.50)
#> 2 am 1 40.62 (24.50 - 59.06)
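To make clear what the recipe string produces, here is a hand-rolled base R equivalent for this one recipe. It is only a sketch, not how freq_format() is implemented. The data frame tbl is a hypothetical stand-in built from the values printed in the one-way freq_table(am) output earlier, and rounding of the last digit may differ from freq_format() in edge cases.

```r
# Values copied from the freq_table(am) output shown earlier
tbl <- data.frame(
  var     = "am",
  cat     = c(0, 1),
  percent = c(59.375, 40.625),
  lcl     = c(40.94225, 24.50235),
  ucl     = c(75.49765, 59.05775)
)

# Paste the three columns into one string with two decimal places,
# mirroring the recipe "percent (lcl - ucl)" with digits = 2
tbl$percent_95 <- sprintf("%.2f (%.2f - %.2f)", tbl$percent, tbl$lcl, tbl$ucl)

tbl[, c("var", "cat", "percent_95")]
```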
You can learn more about the freq_format() function by reading the function documentation.