Description

Calculates Whisker Odds.

Descriptive statistics for large data tend to have low resolution in the tails. Whisker Odds generate a table of descriptive statistics for large data. They are the same as letter-values, but with an alternative naming of depths that allows for depths beyond 26. For a reference on letter-values, see Hofmann, Wickham, and Kafadar (2017) <doi:10.1080/10618600.2017.1305277>.

Whisker Odds (wodds)

sensible summary statistics for big data

The goal of wodds is to make calculating whisker odds (wodds) easy. Wodds follow the same rules as letter-values, but use a different naming system.

Installation

You can install the development version of wodds from GitHub with:

# install.packages("devtools")
devtools::install_github("alexhallam/wodds")

Example

A basic example: compute the wodds of 10,000 draws from a standard normal distribution.

options(digits=1)
library(wodds)
library(knitr)
set.seed(42)
a <- rnorm(n = 1e4, 0, 1)
df_wodds <- wodds::wodds(a)
df_wodds
#> # A tibble: 11 × 3
#>    lower_value wodd_name upper_value
#>          <dbl> <chr>           <dbl>
#>  1    -0.00625 M            -0.00625
#>  2    -0.694   F             0.663  
#>  3    -1.17    E             1.16   
#>  4    -1.57    S             1.52   
#>  5    -1.88    SM            1.87   
#>  6    -2.17    SF            2.15   
#>  7    -2.41    SE            2.41   
#>  8    -2.66    S2            2.64   
#>  9    -2.86    S2M           2.88   
#> 10    -3.01    S2F           3.22   
#> 11    -3.13    S2E           3.34
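
The names in the table above appear to follow a repeating M, F, E, S cycle, with a count added after S from the second cycle onward. The sketch below reproduces that pattern in base R. `wodd_name_sketch` is a hypothetical helper, not a function from the wodds package, and the rule is inferred only from the 11 names shown here, so names for deeper depths are an assumption.

```r
# Hypothetical helper (not part of the wodds package): builds a name for each
# depth using the pattern inferred from the 11 rows printed above.
wodd_name_sketch <- function(depth) {
  stopifnot(all(depth >= 1))
  cycles   <- depth %/% 4   # number of completed M-F-E-S cycles
  position <- depth %% 4    # 1 = M, 2 = F, 3 = E, 0 = a cycle just closed on S
  prefix <- ifelse(cycles == 0, "",
                   ifelse(cycles == 1, "S", paste0("S", cycles)))
  suffix <- c("", "M", "F", "E")[position + 1]
  paste0(prefix, suffix)
}

wodd_name_sketch(1:11)
#> [1] "M"   "F"   "E"   "S"   "SM"  "SF"  "SE"  "S2"  "S2M" "S2F" "S2E"
```

Because the count after S is a number rather than a letter, this scheme does not run out of labels at depth 26 the way single-letter names would.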

Outliers beyond the last wodd are labelled O1, O2, and so on, with the index increasing as values move farther from the last wodd. There should rarely be more than 7 outliers when using wodds.

df_wodds_and_outs <- wodds::wodds(a, include_outliers = TRUE)
df_wodds_and_outs
#> # A tibble: 17 × 3
#>    lower_value wodd_name upper_value
#>          <dbl> <chr>           <dbl>
#>  1    -0.00625 M            -0.00625
#>  2    -0.694   F             0.663  
#>  3    -1.17    E             1.16   
#>  4    -1.57    S             1.52   
#>  5    -1.88    SM            1.87   
#>  6    -2.17    SF            2.15   
#>  7    -2.41    SE            2.41   
#>  8    -2.66    S2            2.64   
#>  9    -2.86    S2M           2.88   
#> 10    -3.01    S2F           3.22   
#> 11    -3.13    S2E           3.34   
#> 12    -3.14    O1            3.34   
#> 13    -3.18    O2            3.47   
#> 14    -3.20    O3            3.50   
#> 15    -3.33    O4            3.58   
#> 16    -3.37    O5            4.33   
#> 17    -4.04    O6           NA
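
To make the outlier rows concrete, here is a minimal base R sketch of the idea: collect the values that lie beyond the deepest wodd on each side and label them outwards. It is an illustration under assumptions, not the package's code. The deepest wodd is approximated with `quantile()`, so the cut points and the number of outliers may differ slightly from the `wodds::wodds()` output above.

```r
set.seed(42)
x <- rnorm(1e4)

# Assumed stand-ins for the deepest wodd (depth 11, tail area 2^11).
deepest_lower <- unname(quantile(x, 1 / 2^11))
deepest_upper <- unname(quantile(x, 1 - 1 / 2^11))

# Order both sides from the deepest wodd outwards.
lower_out <- sort(x[x < deepest_lower], decreasing = TRUE)
upper_out <- sort(x[x > deepest_upper])

# Pad the shorter side with NA so the two columns line up, as in the table above.
n_out <- max(length(lower_out), length(upper_out))
outliers <- data.frame(
  lower_value = lower_out[seq_len(n_out)],
  wodd_name   = paste0("O", seq_len(n_out)),
  upper_value = upper_out[seq_len(n_out)]
)
outliers
```

Indexing past the end of the shorter vector yields NA, which is how a row such as O6 can have a lower value but no upper value.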

Though not necessary, it is possible to include the tail area when additional communication or teaching is needed. The wodd name is assumed to be explanatory enough on its own, so readers should not need to rely on tail_area.

df_wodds_and_outs <- wodds::wodds(a, include_tail_area = TRUE)
df_wodds_and_outs
#> # A tibble: 11 × 4
#>    tail_area lower_value wodd_name upper_value
#>        <dbl>       <dbl> <chr>           <dbl>
#>  1         2    -0.00625 M            -0.00625
#>  2         4    -0.694   F             0.663  
#>  3         8    -1.17    E             1.16   
#>  4        16    -1.57    S             1.52   
#>  5        32    -1.88    SM            1.87   
#>  6        64    -2.17    SF            2.15   
#>  7       128    -2.41    SE            2.41   
#>  8       256    -2.66    S2            2.64   
#>  9       512    -2.86    S2M           2.88   
#> 10      1024    -3.01    S2F           3.22   
#> 11      2048    -3.13    S2E           3.34
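
The tail_area column doubles with each depth (2, 4, 8, ...), which suggests reading it as "1 in tail_area": roughly 1 in 2048 observations lies at or beyond each S2E value. The sketch below checks that reading in base R. It approximates each wodd with `quantile()`, which is an assumption: letter-values are usually defined through depth-based order statistics, so the values may not match the package output exactly.

```r
set.seed(42)
x <- rnorm(1e4)

depth     <- 1:11
tail_area <- 2^depth   # 2, 4, 8, ..., 2048

# Approximate each pair of values with plain quantiles (an assumption, see above).
sketch <- data.frame(
  depth       = depth,
  tail_area   = tail_area,
  lower_value = unname(quantile(x, 1 / tail_area)),
  upper_value = unname(quantile(x, 1 - 1 / tail_area))
)

# Observed share of the data in each lower tail, next to the nominal 1 / tail_area.
sketch$observed_lower_share <- vapply(
  sketch$lower_value, function(v) mean(x <= v), numeric(1)
)
sketch$nominal_share <- 1 / tail_area
sketch
```

At depth 1 the lower and upper values coincide because both are the median, which matches the M row in the tables above.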

An example with all options set to TRUE.

df_wodds_and_outs <- wodds::wodds(a, include_depth = TRUE, include_tail_area = TRUE, include_outliers = TRUE)
df_wodds_and_outs
#> # A tibble: 17 × 5
#>    depth tail_area lower_value wodd_name upper_value
#>    <int>     <dbl>       <dbl> <chr>           <dbl>
#>  1     1         2    -0.00625 M            -0.00625
#>  2     2         4    -0.694   F             0.663  
#>  3     3         8    -1.17    E             1.16   
#>  4     4        16    -1.57    S             1.52   
#>  5     5        32    -1.88    SM            1.87   
#>  6     6        64    -2.17    SF            2.15   
#>  7     7       128    -2.41    SE            2.41   
#>  8     8       256    -2.66    S2            2.64   
#>  9     9       512    -2.86    S2M           2.88   
#> 10    10      1024    -3.01    S2F           3.22   
#> 11    11      2048    -3.13    S2E           3.34   
#> 12    NA        NA    -3.14    O1            3.34   
#> 13    NA        NA    -3.18    O2            3.47   
#> 14    NA        NA    -3.20    O3            3.50   
#> 15    NA        NA    -3.33    O4            3.58   
#> 16    NA        NA    -3.37    O5            4.33   
#> 17    NA        NA    -4.04    O6           NA

A knitr::kable example for publication.

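knitr::kable(df_wodds_and_outs, align = 'c', digits = 3)

| depth | tail_area | lower_value | wodd_name | upper_value |
|:-----:|:---------:|:-----------:|:---------:|:-----------:|
|   1   |     2     |   -0.006    |     M     |   -0.006    |
|   2   |     4     |   -0.694    |     F     |    0.663    |
|   3   |     8     |   -1.169    |     E     |    1.155    |
|   4   |    16     |   -1.569    |     S     |    1.524    |
|   5   |    32     |   -1.878    |    SM     |    1.866    |
|   6   |    64     |   -2.173    |    SF     |    2.150    |
|   7   |    128    |   -2.415    |    SE     |    2.409    |
|   8   |    256    |   -2.656    |    S2     |    2.637    |
|   9   |    512    |   -2.857    |    S2M    |    2.883    |
|  10   |   1024    |   -3.013    |    S2F    |    3.220    |
|  11   |   2048    |   -3.130    |    S2E    |    3.338    |
|  NA   |    NA     |   -3.139    |    O1     |    3.339    |
|  NA   |    NA     |   -3.181    |    O2     |    3.471    |
|  NA   |    NA     |   -3.200    |    O3     |    3.495    |
|  NA   |    NA     |   -3.331    |    O4     |    3.585    |
|  NA   |    NA     |   -3.372    |    O5     |    4.328    |
|  NA   |    NA     |   -4.043    |    O6     |     NA      |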

Getting the depth

wodds::get_depth_from_n(n=15734L, alpha = 0.05)
#> [1] 11

Getting the sample size

wodds::get_n_from_depth(d = 11L)
#> [1] 15734
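
The two calls above are inverses of each other for the numbers shown. One form of the letter-value stopping rule that reproduces them is sketched below in base R. Both functions are hypothetical stand-ins, not the package source. The formulas are an assumption chosen because they match the outputs printed here, and the package may implement the rule differently.

```r
# Hypothetical stand-ins, not the wodds implementation.
# Depth: stop once a (1 - alpha) interval around a wodd would be too wide,
# using z = qnorm(1 - alpha / 2).
depth_from_n_sketch <- function(n, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)
  as.integer(floor(log2(n)) - floor(log2(2 * z^2)))
}

# Sample size: the largest n that the same rule maps to depth d.
n_from_depth_sketch <- function(d, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)
  as.integer(floor(2 * z^2 * 2^d))
}

depth_from_n_sketch(15734L)
#> [1] 11
depth_from_n_sketch(1e4)
#> [1] 11
n_from_depth_sketch(11L)
#> [1] 15734
```

The second call agrees with the 11 wodd rows returned for the 10,000 simulated values in the example above.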

Whisker Odds and Letter-Values

Letter-values are a fantastic tool! I think the naming could be improved. For this reason I introduce whisker odds (wodds) as an alternative naming system. My hypothesis is that with an alternative naming system these descriptive statistics will see more use. This is a rebranding of what I think is a powerful modern statistical tool.

Metadata

Version

0.1.0

License

Unknown

Platforms (75)

    Darwin
    FreeBSD 13
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd13
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd13
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows