Description

Quickly Explore Complex Survey Data.

Visualize and tabulate single-choice, multiple-choice, matrix-style questions from survey data. Includes ability to group cross-tabulations, frequency distributions, and plots by categorical variables and to integrate survey weights. Ideal for quickly uncovering descriptive patterns in survey data.

Surveyexplorer

Installation

install.packages("surveyexplorer")
# or devtools::install_github("liamhaller/surveyexplorer") for the development version

Examples

library(surveyexplorer)

The following examples use the berlinbears dataset, a fictional survey of bears in Berlin that is included in the surveyexplorer package.

Single-choice questions

#Basic table
single_table(berlinbears, 
             question = income)
Question: income
              n    freq
<1000         82   0.164
1000-2000     50   0.100
2000-3000     177  0.354
3000-4000     109  0.218
5000+         57   0.114
No answer     22   0.044
NA            3    0.006
Column Total  500

Use group_by to partition the question into several groups.

single_table(berlinbears,
             question = income,
             group_by = gender)
Question: income
grouped by: gender
                  female            male              NA                Rowwise Total
                  Frequency  Count  Frequency  Count  Frequency  Count  Frequency  Count
<1000             16.74%     39     15.73%     39     21.05%     4      16.40%     82
1000-2000         9.87%      23     9.68%      24     15.79%     3      10.00%     50
2000-3000         35.62%     83     35.89%     89     26.32%     5      35.40%     177
3000-4000         21.89%     51     22.18%     55     15.79%     3      21.80%     109
5000+             11.59%     27     10.89%     27     15.79%     3      11.40%     57
No answer         3.86%      9      4.84%      12     5.26%      1      4.40%      22
NA                0.43%      1      0.81%      2      0.00%      0      0.60%      3
Columnwise Total  46.60%     233    49.60%     248    3.80%      19     100.00%    500

Ignore unwanted subgroups with subgroups_to_exclude.

single_table(berlinbears,
             question = income, 
             group_by = gender, 
             subgroups_to_exclude = NA) 
Question: income
grouped by: gender
                  female            male              Rowwise Total
                  Frequency  Count  Frequency  Count  Frequency  Count
<1000             16.74%     39     15.73%     39     16.22%     78
1000-2000         9.87%      23     9.68%      24     9.77%      47
2000-3000         35.62%     83     35.89%     89     35.76%     172
3000-4000         21.89%     51     22.18%     55     22.04%     106
5000+             11.59%     27     10.89%     27     11.23%     54
No answer         3.86%      9      4.84%      12     4.37%      21
NA                0.43%      1      0.81%      2      0.62%      3
Columnwise Total  48.44%     233    51.56%     248    100.00%    481

Remove NAs from the question variable with na.rm.

single_table(berlinbears,
             question = income, 
             group_by = gender, 
             subgroups_to_exclude = NA,
             na.rm = TRUE)
Question: income
grouped by: gender
                  female            male              Rowwise Total
                  Frequency  Count  Frequency  Count  Frequency  Count
<1000             16.81%     39     15.85%     39     16.32%     78
1000-2000         9.91%      23     9.76%      24     9.83%      47
2000-3000         35.78%     83     36.18%     89     35.98%     172
3000-4000         21.98%     51     22.36%     55     22.18%     106
5000+             11.64%     27     10.98%     27     11.30%     54
No answer         3.88%      9      4.88%      12     4.39%      21
Columnwise Total  48.54%     232    51.46%     246    100.00%    478
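
The two filters shown so far act on different variables: subgroups_to_exclude drops rows according to the grouping variable, while na.rm drops rows according to the question itself. The base R sketch below mimics that sequence on a made-up data frame to show why the totals shrink in two steps. It is only an illustration, not the package's implementation, and `toy`, `kept_groups` and `kept_answers` are hypothetical names.

```r
# Hypothetical toy data with missing values in both variables
toy <- data.frame(
  income = c("low", "high", NA, "low", "high", "low"),
  gender = c("female", "male", "female", NA, "male", NA)
)

# Analogue of subgroups_to_exclude = NA: drop rows whose group is missing
kept_groups <- toy[!is.na(toy$gender), ]

# Analogue of na.rm = TRUE: additionally drop rows whose answer is missing
kept_answers <- kept_groups[!is.na(kept_groups$income), ]

# Cross-tabulate answers by group on the remaining rows
print(table(kept_answers$income, kept_answers$gender))
```

In this toy example the row count goes from 6 to 4 after the group filter and to 3 after the answer filter, mirroring the 500, 481 and 478 totals in the tables above.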

Finally, you can specify survey weights using the weights argument.

single_table(berlinbears,
             question = income, 
             group_by = gender, 
             subgroups_to_exclude = NA,
             na.rm = TRUE,
             weights = weights)
Question: income
grouped by: gender
                  female            male              Rowwise Total
                  Frequency  Count  Frequency  Count  Frequency  Count
<1000             15.96%     59.6   17.21%     75.2   16.63%     134.8
1000-2000         10.46%     39.1   10.19%     44.5   10.31%     83.6
2000-3000         33.79%     126.3  33.88%     148.0  33.84%     274.3
3000-4000         25.08%     93.7   25.34%     110.7  25.22%     204.4
5000+             9.82%      36.7   8.68%      37.9   9.21%      74.6
No answer         4.90%      18.3   4.70%      20.5   4.79%      38.8
Columnwise Total  46.10%     373.6  53.90%     436.9  100.00%    810.5
Frequencies and counts are weighted
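
To make the idea of weighted counts concrete, the base R sketch below shows one common way to compute them: sum the weights within each answer category, then divide by the total weight to get the weighted frequency. This is an illustration on made-up data and is not taken from the package's source; `answers` and `w` are hypothetical names.

```r
# Hypothetical toy data: five responses and their survey weights
answers <- c("low", "high", "low", "medium", "high")
w       <- c(1.5, 0.5, 2.0, 1.0, 1.0)

# Weighted count: sum of the weights within each answer category
weighted_count <- tapply(w, answers, sum)

# Weighted frequency: each category's share of the total weight
weighted_freq <- weighted_count / sum(w)

print(round(weighted_count, 1))
print(round(weighted_freq, 3))
```

Because the counts are sums of weights rather than numbers of rows, they need not be whole numbers, which matches the fractional counts in the weighted table.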

The same syntax can be applied to the single_freq function to plot frequencies of the question optionally partitioned by subgroups.

single_freq(berlinbears,
            question = income,
            group_by = gender,
            subgroups_to_exclude = NA,
            na.rm = TRUE,
            weights = weights)

Multiple-choice questions

Multiple-choice tables (multi_table) and graphs (multi_freq) take the same options and syntax as their single-choice counterparts. The only difference is that the question argument also accepts tidyselect syntax, so that several columns, one per answer option, can be selected at once. For example, the question “will_eat” has five answer options, each stored in a column prefixed with “will_eat”:

berlinbears |> 
  dplyr::select(starts_with('will_eat')) |> 
  head()
#>   will_eat.SQ001 will_eat.SQ002 will_eat.SQ003 will_eat.SQ004 will_eat.SQ005
#> 1              0              1              0              1              1
#> 2              0              1              1              1              1
#> 3              1              1              0              1              1
#> 4              0              0              0              1              0
#> 5              0              0              0              1              1
#> 6              0              0              0              1              0

The same selection syntax can be used to choose the question for the multiple-choice tables and graphs.

multi_table(berlinbears, 
            question = dplyr::starts_with('will_eat'), 
            group_by = genus, 
            subgroups_to_exclude = NA,
            na.rm = TRUE)
Question: dplyr::starts_with("will_eat")
grouped by: genus
                  Ailuropoda        Ursus             Rowwise Total
                  Frequency  Count  Frequency  Count  Frequency  Count
will_eat.SQ004    97.54%     278    91.62%     175    40.12%     453
will_eat.SQ002    59.30%     169    66.49%     127    26.22%     296
will_eat.SQ005    43.86%     125    46.60%     89     18.95%     214
will_eat.SQ001    24.91%     71     27.23%     52     10.89%     123
will_eat.SQ003    8.42%      24     9.95%      19     3.81%      43
Columnwise Total  59.08%     667    40.92%     462    100.00%    1129
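
One way to read the counts in a multiple-choice table is as column sums over the 0/1 indicator columns. The base R sketch below illustrates that reading on a made-up data frame. It is an assumption-laden illustration rather than the package's implementation, and `toy`, its `q.` columns and `group` are hypothetical names.

```r
# Hypothetical toy data: one 0/1 indicator column per answer option
toy <- data.frame(
  q.SQ001 = c(0, 1, 0, 1),
  q.SQ002 = c(1, 1, 1, 0),
  group   = c("a", "a", "b", "b")
)

# Select the indicator columns by their shared prefix
option_cols <- grep("^q\\.", names(toy), value = TRUE)

# Count: number of respondents who selected each option
counts <- colSums(toy[option_cols], na.rm = TRUE)

# Share of respondents selecting each option (shares can sum to more than 1)
shares <- counts / nrow(toy)

# Counts of selections within each subgroup
by_group <- rowsum(toy[option_cols], group = toy$group)

print(counts)
print(shares)
print(by_group)
```

Since a respondent can tick several options, the per-option shares are not expected to add up to 100%.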

For graphing, the multi_freq function creates an UpSet plot to visualize the frequencies of the intersecting sets for each answer combination and also includes the ability to specify weights.

multi_freq(berlinbears,
           question = dplyr::starts_with('will_eat'),
           na.rm = TRUE,
           weights = weights)
#> Estimes are only preciese to one significant digit, weights may have been rounded

The graphs can also be grouped.

multi_freq(berlinbears,
           question = dplyr::starts_with('will_eat'),
           group_by = genus,
           subgroups_to_exclude = NA,
           na.rm = FALSE,
           weights = weights)
#> Estimes are only preciese to one significant digit, weights may have been rounded

Matrix Questions

matrix_table has the same syntax as above and works with array or categorical questions.

matrix_table(berlinbears, 
             dplyr::starts_with('c_'),
             group_by = is_parent)
Question: dplyr::starts_with("c_")
grouped by: is_parent
              high          low           medium        NA
0
  c_diet      6.02% (20)    71.99% (239)  16.57% (55)   5.42% (18)
  c_exercise  25% (83)      27.71% (92)   24.1% (80)    23.19% (77)
1
  c_diet      3.57% (6)     75% (126)     17.26% (29)   4.17% (7)
  c_exercise  19.05% (32)   27.38% (46)   23.81% (40)   29.76% (50)

matrix_freq visualizes the frequencies of responses.

matrix_freq(berlinbears, 
             dplyr::starts_with('p_'), 
             na.rm = TRUE)

For array/matrix-style questions that are numeric, matrix_mean plots the mean values and confidence intervals.

matrix_mean(berlinbears, 
             question = dplyr::starts_with('p_'),
             na.rm = TRUE)
# Grouping can also be applied; survey weights are passed with weights =, as in the earlier examples
matrix_mean(berlinbears, 
            question = dplyr::starts_with('p_'),
            na.rm = TRUE,
            group_by = species, 
            subgroups_to_exclude = NA)

Finally, for Likert questions (scales of 3, 5, 7, 9, …), matrix_likert provides a custom plot.

# You can specify custom labels with the `labels` argument
matrix_likert(berlinbears,
              question = dplyr::starts_with('p_'),
              labels = c('Strongly disagree', 'Disagree','Neutral','Agree','Strongly agree'))

# You can also pass custom colors and specify weights
matrix_likert(berlinbears, 
              question = dplyr::starts_with('p_'),
              labels = c('Strongly disagree', 'Disagree','Neutral','Agree','Strongly agree'), 
              colors = c("#E1AA28", "#1E5F46", "#7E8F75", "#EFCD83", "#E17832"),
              weights = weights) 

Overview

Functions

  • Single-choice
    • single_table
    • single_freq
  • Multiple-choice
    • multi_table
    • multi_freq
  • Matrix
    • matrix_table
    • matrix_freq
    • matrix_mean
    • matrix_likert

*_table functions return a gt table of the cross tabulations and frequencies for each question while *_freq returns the same data but as a plot.

For matrix-style questions with numerical input, matrix_mean plots the mean value ± two standard deviations. matrix_likert visualizes questions that accept Likert responses (strongly agree to strongly disagree) or questions with 3, 5, 7, 9, … categories.
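
As a rough illustration of the summary described above, the base R sketch below computes, for each numeric item, the mean and a band of two standard deviations around it. It uses made-up data and hypothetical names (`items`, `summary_tbl`). How matrix_mean derives its intervals internally is not shown in this document, so this should be read as a sketch of the idea only.

```r
# Hypothetical toy data: two numeric matrix items rated 1 to 5
items <- data.frame(
  p_a = c(1, 2, 3, 4, 5),
  p_b = c(2, 2, 3, 3, NA)
)

# Per-item mean and standard deviation, ignoring missing values
item_mean <- sapply(items, mean, na.rm = TRUE)
item_sd   <- sapply(items, sd, na.rm = TRUE)

# Band of two standard deviations around each mean
summary_tbl <- data.frame(
  item  = names(items),
  mean  = item_mean,
  lower = item_mean - 2 * item_sd,
  upper = item_mean + 2 * item_sd
)

print(summary_tbl)
```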

Syntax

Each function accepts the following options:

  • dataset: The input data frame (or tibble) of survey questions.
  • question: The column(s) that contain the response options for a question. They can be selected using tidyselect semantics or by providing a vector of column names or numbers.
  • group_by: Optional variable to group the analysis. If provided, the frequencies and counts are calculated within each subgroup.
  • subgroups_to_exclude: Optional vector specifying subgroups to exclude from the analysis.
  • weights: Optional variable containing survey weights. If provided, frequencies and counts are weighted accordingly.
  • na.rm: Logical indicating whether to remove NA values from question before analysis.

Metadata

Version

0.2.0

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows