dotsViolin. Dot Plots Mimicking Violin Plots
dotsViolin modifies dot plots so that dots have different sizes, mimicking violin plots, and identifies modes (discrete data) or peaks (continuous data) for each group (Rosenblatt, 1956; Parzen, 1962).
dotsViolin is an R package (R Core Team, 2023) that uses gridExtra (Auguie, 2017), gtools (Bolker et al., 2022), tidyr (Wickham et al., 2023c), stringr (Wickham, 2022), dplyr (Wickham et al., 2023b), ggplot2 (Wickham et al., 2023a), lazyeval (Wickham, 2019), magrittr (Bache and Wickham, 2022), rlang (Henry and Wickham, 2023), scales (Wickham and Seidel, 2022), and tidyselect (Henry and Wickham, 2022).
Documentation was written with the R packages roxygen2 (Wickham et al., 2022), knitr (Xie, 2023), and rmarkdown (Allaire et al., 2023).
Related academic presentation: Roa-Ovalle (2019).
Installation
devtools::install_gitlab(repo = "ferroao/dotsViolin")
Citation
To cite package ‘dotsViolin’ in publications use:
Roa-Ovalle F, Telles M (2023). dotsViolin: Integrated tables in dot and violin R ggplots. R package version 0.0.1, https://gitlab.com/ferroao/dotsViolin.
To write the citation to a file:
sink("dotsViolin.bib")
toBibtex(citation("dotsViolin"))
sink()
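As an alternative to the sink() calls above, the BibTeX entry can be written in a single step, which avoids leaving a sink open if an error occurs in between. This is a minimal sketch: citation() without arguments (the citation for R itself) stands in for citation("dotsViolin") so that it runs without the package installed, and the file path is hypothetical.

```r
# Assumption: citation() for base R replaces citation("dotsViolin") here
bib_file <- tempfile(fileext = ".bib")  # hypothetical output path

# toBibtex() returns a character vector, so writeLines() can write it directly
writeLines(toBibtex(citation()), bib_file)

# Inspect the first line of the file that was written
readLines(bib_file, n = 1)
```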
Plot window
Define your plotting window size with something like par(pin = c(10, 6)), or open a graphics device of a given size with svg(), png(), etc.
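A minimal sketch of the device-based approach follows. It uses pdf() only because that device is available in every R installation; png() and svg() follow the same open, plot, close pattern. The file path and the placeholder plot are assumptions for illustration.

```r
# Write a plot to a file with an explicit size (pdf() takes inches)
out_file <- tempfile(fileext = ".pdf")  # hypothetical output path
pdf(out_file, width = 10, height = 6)

plot(1:10, main = "placeholder plot")   # replace with your own plotting call

invisible(dev.off())                    # close the device so the file is written
```

Note that png() measures width and height in pixels by default, whereas pdf() and svg() use inches.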
In VSCode, you could use settings like these:
{
  "r.plot.useHttpgd": false,
  "r.plot.devArgs": {
    "width": 800,
    "height": 600
  }
}
Examples
1 Discrete Data:
library(dotsViolin)
fabaceae_mode_counts <- get_modes_counts(fabaceae_clade_n_df, "clade", "parsed_n")
fabaceae_mode_counts
clade | m1 | m2 | m3 | count |
---|---|---|---|---|
Caesalpinieae | 12 | NA | NA | 29 |
Cassieae | 14 | 8 | 12 | 64 |
Cercidoideae | 14 | 7 | NA | 33 |
Detarioideae | 12 | 8,17 | NA | 50 |
Dialioideae | 14 | NA | NA | 6 |
Dimorphandra and rel. | 14 | 13 | NA | 16 |
Mimosoids | 13 | 26 | 14 | 221 |
outgroup | 8 | 12 | 11 | 145 |
Papilionoideae | 8 | 11 | 7 | 1410 |
Umtiza and rel. | 14 | NA | NA | 7 |
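For intuition about what the m1, m2 and m3 columns summarize, the base-R sketch below ranks the most frequent values in each group and joins ties with commas. It is not the implementation of get_modes_counts; the helper name top_values and the toy data are hypothetical.

```r
# Toy data (hypothetical): haploid chromosome numbers for two groups
toy <- data.frame(
  clade = c(rep("A", 7), rep("B", 5)),
  n     = c(12, 12, 12, 14, 14, 8, 12, 7, 7, 11, 11, 9)
)

# Hypothetical helper: the k most frequent values, ties joined by commas
top_values <- function(x, k = 3) {
  freq   <- sort(table(x), decreasing = TRUE)
  counts <- as.vector(freq)
  vals   <- names(freq)
  # group values that share the same frequency, most frequent first
  ranks <- split(vals, factor(counts, levels = unique(counts)))
  out   <- vapply(ranks, paste, character(1), collapse = ",")
  unname(out)[seq_len(k)]  # pads with NA when fewer than k ranks exist
}

by_clade <- split(toy$n, toy$clade)
modes <- t(vapply(by_clade, top_values, character(3)))
colnames(modes) <- c("m1", "m2", "m3")

summary_df <- data.frame(
  clade = rownames(modes), modes,
  count = lengths(by_clade),
  row.names = NULL
)
summary_df
```

In the toy data, group B has 7 and 11 tied as most frequent, so its m1 entry is "7,11" and its m3 entry is NA.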
library(dotsViolin)
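# Build one legend label per clade from the mode table
fabaceae_clade_n_df_count <- make_legend_with_stats(fabaceae_mode_counts, "label_count", 1, TRUE)
# Copy each clade's label onto the matching rows of the data
fabaceae_clade_n_df$label_count <- fabaceae_clade_n_df_count$label_count[match(
  fabaceae_clade_n_df$clade,
  fabaceae_clade_n_df_count$clade
)]
# Clade order passed to dots_and_violin()
desiredorder1 <- unique(fabaceae_clade_n_df$clade)
head(fabaceae_clade_n_df)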
                        tip.label          clade parsed_n
1     KX374504_Abarema_centiflora      Mimosoids       13
2   KX213142_Adenodolichos_bussei Papilionoideae       11
3      KX792912_Almaleea_cambagei Papilionoideae        8
4 KP109982_Amphithalea_cymbifolia Papilionoideae        9
5 KP230727_Argyrolobium_tuberosum Papilionoideae       13
6        GU220019_Ateleia_arsenii Papilionoideae       14
                   label_count
1     Mimosoids 13 26 14 (221)
2 Papilionoideae 8 11 7 (1410)
3 Papilionoideae 8 11 7 (1410)
4 Papilionoideae 8 11 7 (1410)
5 Papilionoideae 8 11 7 (1410)
6 Papilionoideae 8 11 7 (1410)
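The match() indexing used above copies one label per clade onto every row of the larger data frame. A minimal sketch of that lookup pattern with a few rows taken from the output above (the object names labels_df and obs_df are hypothetical):

```r
# One row per clade, carrying the label to attach
labels_df <- data.frame(
  clade       = c("Mimosoids", "Papilionoideae"),
  label_count = c("Mimosoids 13 26 14 (221)", "Papilionoideae 8 11 7 (1410)")
)

# One row per observation; clades repeat
obs_df <- data.frame(
  clade    = c("Papilionoideae", "Mimosoids", "Papilionoideae"),
  parsed_n = c(11, 13, 8)
)

# For each observation, find the position of its clade in labels_df ...
idx <- match(obs_df$clade, labels_df$clade)
# ... and use those positions to pull the corresponding label
obs_df$label_count <- labels_df$label_count[idx]
obs_df
```

A clade missing from labels_df would yield NA in idx and therefore an NA label, which is a quick way to spot mismatched group names.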
# Dots only
par(mar = c(0, 0, 0, 0), omi = rep(0, 4))
dots_and_violin(
  fabaceae_clade_n_df, "clade", "label_count", "parsed_n", 2,
  30, "Chromosome haploid number", desiredorder1, 1, .85, 4,
  "ownwork",
  violin = FALSE
)
# Violins only
par(mar = c(0, 0, 0, 0), omi = rep(0, 4))
dots_and_violin(
  fabaceae_clade_n_df, "clade", "label_count", "parsed_n", 2,
  30, "Chromosome haploid number", desiredorder1, 1, .85, 4,
  dots = FALSE
)
# Dots and violins together
par(mar = c(0, 0, 0, 0), omi = rep(0, 4))
dots_and_violin(
  fabaceae_clade_n_df, "clade", "label_count", "parsed_n", 2,
  30, "Chromosome haploid number", desiredorder1, 1, .85, 4
)
2 Continuous Data:
library(dotsViolin)
fabaceae_Cx_peak_counts_per_clade_df <- get_peaks_counts_continuous(
  fabaceae_clade_1Cx_df,
  "clade", "Cx", 2, 0.25, 1, 2
)
fabaceae_Cx_peak_counts_per_clade_df
clade | m1 | m2 | counts |
---|---|---|---|
Caesalpinieae | 0.85,1.80 | | 2 |
Cassieae | 0.69 | 0.52,0.56 | 6 |
Cercidoideae | 0.60 | | 5 |
COM clade | 0.35,0.50,0.83 | | 3 |
Detarioideae | 2.21 | 0.84,2.01 | 4 |
Dimorphandra & rel. | 0.73,0.79 | | 2 |
Malvids | 0.40 | 0.63 | 8 |
Mimosoids | 0.70 | 0.43 | 42 |
outgroups | 0.48 | 1.38,2.76 | 9 |
Papilionoideae | 0.59 | | 212 |
Polygala amara | 0.42 | | 1 |
Umtiza & rel. | 0.65,1.05 | | 2 |
Vitis vinifera | 0.43 | | 1 |
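The peaks reported for continuous data follow the idea of kernel density estimation (Rosenblatt, 1956; Parzen, 1962). As a rough illustration only, and not the code behind get_peaks_counts_continuous, the base-R sketch below estimates a density with density() and reports its local maxima, highest first. The helper name density_peaks, the simulated data, and the default bandwidth are assumptions.

```r
set.seed(1)
# Simulated bimodal values (hypothetical data, not from the package)
x <- c(rnorm(200, mean = 0.5, sd = 0.05), rnorm(100, mean = 1.5, sd = 0.1))

# Hypothetical helper: local maxima of a kernel density estimate
density_peaks <- function(x, bw = "nrd0") {
  d <- density(x, bw = bw)
  # a local maximum is where the slope changes from rising to falling
  slope_sign <- sign(diff(d$y))
  is_peak <- which(diff(slope_sign) < 0) + 1
  peaks <- data.frame(x = d$x[is_peak], height = d$y[is_peak])
  # highest peak first, analogous to a primary and a secondary mode
  peaks[order(peaks$height, decreasing = TRUE), ]
}

peaks <- density_peaks(x)
round(peaks$x, 2)
```

The number and position of peaks depend strongly on the bandwidth: a smaller bw reveals more, possibly spurious, peaks, while a larger one merges neighbouring peaks.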
library(dotsViolin)
namecol <- "labelcountcustom"
# Build one legend label per clade from the peak table
fabaceae_clade_1Cx_modes_count_df <- make_legend_with_stats(
  fabaceae_Cx_peak_counts_per_clade_df,
  namecol, 1, TRUE
)
# Copy each clade's label onto the matching rows of the data
fabaceae_clade_1Cx_df$labelcountcustom <-
  fabaceae_clade_1Cx_modes_count_df$labelcountcustom[match(
    fabaceae_clade_1Cx_df$clade,
    fabaceae_clade_1Cx_modes_count_df$clade
  )]
# Clade order passed to dots_and_violin()
desiredorder <- unique(fabaceae_clade_1Cx_df$clade)
head(fabaceae_clade_1Cx_df)
                              name     clade     Cx      genus ownwork
6      'Silene_latifolia_JF715055' outgroups 2.7000     Silene      no
7  'Fagopyrum_esculentum_NC010776' outgroups 1.4350  Fagopyrum      no
11    'Helianthus_annuus_NC007977' outgroups 2.4250 Helianthus      no
12        'Daucus_carota_NC008325' outgroups 2.8375     Daucus      no
14        'Olea_europaea_NC013707' outgroups 1.9500       Olea      no
18       'Coffea_arabica_NC008535' outgroups 0.6000     Coffea      no
               labelcountcustom
6  outgroups 0.48 1.38,2.76 (9)
7  outgroups 0.48 1.38,2.76 (9)
11 outgroups 0.48 1.38,2.76 (9)
12 outgroups 0.48 1.38,2.76 (9)
14 outgroups 0.48 1.38,2.76 (9)
18 outgroups 0.48 1.38,2.76 (9)
# Dots and violins together
par(mar = c(0, 0, 0, 0), omi = rep(0, 4))
dots_and_violin(
  fabaceae_clade_1Cx_df, "clade", "labelcountcustom", "Cx", 3,
  3, "Genome Size", desiredorder, 0.03, 0.25, 2,
  "ownwork"
)
# Violins only
par(mar = c(0, 0, 0, 0), omi = rep(0, 4))
dots_and_violin(
  fabaceae_clade_1Cx_df, "clade", "labelcountcustom", "Cx", 3,
  3, "Genome Size", desiredorder, 0.03, 0.25, 2,
  dots = FALSE
)
# Dots only
par(mar = c(0, 0, 0, 0), omi = rep(0, 4))
dots_and_violin(
  fabaceae_clade_1Cx_df, "clade", "labelcountcustom", "Cx", 3,
  3, "Genome Size", desiredorder, 0.03, 0.25, 2,
  "ownwork",
  violin = FALSE
)
References
R-packages
Allaire J, Xie Y, Dervieux C, McPherson J, Luraschi J, Ushey K, Atkins A, Wickham H, Cheng J, Chang W, Iannone R. 2023. rmarkdown: Dynamic documents for R. R package version 2.24. https://CRAN.R-project.org/package=rmarkdown
Auguie B. 2017. gridExtra: Miscellaneous functions for "grid" graphics. R package version 2.3. https://CRAN.R-project.org/package=gridExtra
Bache SM, Wickham H. 2022. magrittr: A forward-pipe operator for R. R package version 2.0.3. https://CRAN.R-project.org/package=magrittr
Bolker B, Warnes GR, Lumley T. 2022. gtools: Various R programming tools. R package version 3.9.4. https://github.com/r-gregmisc/gtools
Henry L, Wickham H. 2022. tidyselect: Select from a set of strings. R package version 1.2.0. https://CRAN.R-project.org/package=tidyselect
Henry L, Wickham H. 2023. rlang: Functions for base types and core R and tidyverse features. R package version 1.1.1. https://CRAN.R-project.org/package=rlang
R Core Team. 2023. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
Wickham H. 2019. lazyeval: Lazy (non-standard) evaluation. R package version 0.2.2. https://CRAN.R-project.org/package=lazyeval
Wickham H. 2022. stringr: Simple, consistent wrappers for common string operations. R package version 1.5.0. https://CRAN.R-project.org/package=stringr
Wickham H, Chang W, Henry L, Pedersen TL, Takahashi K, Wilke C, Woo K, Yutani H, Dunnington D. 2023a. ggplot2: Create elegant data visualisations using the grammar of graphics. R package version 3.4.4. https://CRAN.R-project.org/package=ggplot2
Wickham H, Danenberg P, Csárdi G, Eugster M. 2022. roxygen2: In-line documentation for R. R package version 7.2.3. https://CRAN.R-project.org/package=roxygen2
Wickham H, François R, Henry L, Müller K, Vaughan D. 2023b. dplyr: A grammar of data manipulation. R package version 1.1.3. https://CRAN.R-project.org/package=dplyr
Wickham H, Seidel D. 2022. scales: Scale functions for visualization. R package version 1.2.1. https://CRAN.R-project.org/package=scales
Wickham H, Vaughan D, Girlich M. 2023c. tidyr: Tidy messy data. R package version 1.3.0. https://CRAN.R-project.org/package=tidyr
Xie Y. 2023. knitr: A general-purpose package for dynamic report generation in R. R package version 1.43. https://yihui.org/knitr/
Academia
Parzen E. 1962. On estimation of a probability density function and mode. The Annals of Mathematical Statistics, 33: 1065–1076. https://doi.org/10.1214/aoms/1177704472
Roa-Ovalle F. 2019. Poliploidia e duplicação genômica nas leguminosas brasileiras. In: Rocha LL da (ed) Sociedade Botânica do Brasil. https://70cnbot.botanica.org.br/wp-content/uploads/2019/11/Livro-70%C2%BA-Congresso-Nacional-de-Bot%C3%A2nica..pdf
Rosenblatt M. 1956. Remarks on some nonparametric estimates of a density function. The Annals of Mathematical Statistics, 27: 832–837. https://doi.org/10.1214/aoms/1177728190