Description
Assortative Mating Simulation and Multivariate Bernoulli Variates.
Simulation of phenotype / genotype data under assortative mating. Includes functions for generating Bahadur order-2 multivariate Bernoulli variables with general and diagonal-plus-low-rank correlation structures. Further details are provided in: Border and Malik (2022) <doi:10.1101/2022.10.13.512132>.
README.md
Efficient simulation of genotype / phenotype data under assortative mating by generating Bahadur order-2 multivariate Bernoulli distributed random variates.
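To make the term "Bahadur order-2" concrete, the following base-R sketch evaluates the order-2 Bahadur probability mass function for a small binary vector: the independence baseline multiplied by one plus a sum of pairwise correlation terms. It does not use rBahadur; the helper name `bahadur2_pmf` and the example values of `p` and `R` are assumptions chosen for illustration, and enumeration like this is only feasible in very low dimensions.

```r
# Bahadur order-2 pmf for a binary vector x, given marginal means p
# and a correlation matrix R with unit diagonal (illustrative helper).
bahadur2_pmf <- function(x, p, R) {
  z <- (x - p) / sqrt(p * (1 - p))        # standardized outcomes
  indep <- prod(p^x * (1 - p)^(1 - x))    # independence baseline
  # sum over pairs i < j of R[i, j] * z[i] * z[j]
  pair_sum <- sum((R * outer(z, z))[upper.tri(R)])
  indep * (1 + pair_sum)
}

p <- c(0.2, 0.5, 0.7)                     # hypothetical marginal means
R <- matrix(0.1, 3, 3); diag(R) <- 1      # hypothetical weak correlations

# enumerate all 2^3 binary outcomes and evaluate the pmf on each
grid  <- as.matrix(expand.grid(rep(list(0:1), 3)))
probs <- apply(grid, 1, bahadur2_pmf, p = p, R = R)

sum(probs)               # total probability mass
colSums(grid * probs)    # marginal means implied by the pmf
```

The order-2 form is only a valid distribution when every value of the pmf is non-negative, which constrains how strong the correlations can be for given marginal means.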
Features
- Multivariate Bernoulli (MVB) distribution samplers
  - rb_dplr: generate Bahadur order-2 MVB variates with diagonal-plus-low-rank (DPLR) correlation structures
  - rb_unstr: generate Bahadur order-2 MVB variates with arbitrary correlation structures
- Assortative mating modeling tools
  - Compute equilibrium parameters under univariate AM
    - h2_eq: compute equilibrium heritability
    - rg_eq: compute equilibrium cross-mate genetic correlation
    - vg_eq: compute equilibrium genetic variance
  - Generate genotype / phenotype data given initial conditions
    - am_simulate: complete univariate genotype / phenotype simulation
    - am_covariance_structure: compute outer-product covariance component for AM-induced DPLR covariance structure
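As an illustration of what a diagonal-plus-low-rank (DPLR) correlation structure looks like, the base-R sketch below builds a matrix of the form C = D + U Uᵀ, with the diagonal part D chosen so that C has a unit diagonal. The dimensions and the factor `U` are hypothetical, and the exact parameterization that rb_dplr and am_covariance_structure use is not stated here, so treat this as a picture of the general form rather than of the package's inputs.

```r
set.seed(1)
m <- 6   # number of binary variables (hypothetical)
k <- 2   # rank of the low-rank component (hypothetical)

# hypothetical low-rank factor, scaled small so that C stays a valid
# correlation matrix
U <- matrix(rnorm(m * k, sd = 0.2), m, k)

low_rank <- tcrossprod(U)           # U %*% t(U), rank at most k
D <- diag(1 - diag(low_rank))       # diagonal part giving diag(C) = 1
C <- D + low_rank

diag(C)                                  # unit diagonal
min(eigen(C, symmetric = TRUE)$values)   # smallest eigenvalue of C
```

Storing `U` and the diagonal requires m * k + m numbers instead of the m * (m + 1) / 2 needed for an unstructured symmetric matrix, which is what makes this structure attractive when m is large.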
Installation
rBahadur is now on CRAN:
install.packages("rBahadur")
Alternatively, you can install directly from GitHub using the install_github function provided by the remotes package:
remotes::install_github("rborder/rBahadur")
Usage
Here we demonstrate using rBahadur to simulate genotype / phenotype data at equilibrium under AM, given the following parameters:
- h2_0: panmictic heritability
- r: cross-mate phenotypic correlation
- m: number of diploid, biallelic causal variants
- n: number of individuals to simulate
- min_MAF: minimum minor allele frequency
set.seed(2022)
h2_0 = .5; m = 2000; n = 5000; r =.5; min_MAF=.05
## simulate genotype/phenotype data
sim_dat <- am_simulate(h2_0, r, m, n, min_MAF = min_MAF)
We compare the target and realized allele frequencies:
## plot empirical first moments of genotypes versus expectations
afs_emp <- colMeans(sim_dat$X)/2
plot(sim_dat$AF, afs_emp)
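The division by 2 in the comparison above reflects that each diploid genotype counts 0, 1, or 2 copies of an allele, so the column mean estimates twice the allele frequency. A minimal base-R sketch of the same check follows, using stand-in data rather than am_simulate output: the genotypes here are independent binomial draws (no assortative mating), and `AF`, `X`, and the dimensions are hypothetical.

```r
set.seed(2022)
n <- 5000; m <- 200; min_MAF <- 0.05

# hypothetical target allele frequencies, bounded away from 0 and 1
AF <- runif(m, min_MAF, 1 - min_MAF)

# stand-in genotype matrix (n individuals x m variants):
# independent Binomial(2, AF[j]) allele counts for each variant j
X <- sapply(AF, function(p) rbinom(n, size = 2, prob = p))

# empirical allele frequency = mean allele count / 2
afs_emp <- colMeans(X) / 2

max(abs(afs_emp - AF))   # largest deviation from the target
cor(AF, afs_emp)         # agreement between target and realized values
```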
We compare the expected equilibrium heritability to that realized in simulation:
## empirical h2 vs expected equilibrium h2
(emp_h2 <- var(sim_dat$g)/var(sim_dat$y))
h2_eq(r, h2_0)
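The empirical heritability above is the ratio of the variance of the genetic values to the variance of the phenotype. The base-R sketch below shows the same estimator on toy data in which the genetic and environmental components are drawn independently, so it illustrates the ratio itself and not the equilibrium inflation that AM produces; the variable names and the value of `h2` are assumptions for illustration.

```r
set.seed(2022)
n  <- 5000
h2 <- 0.5    # hypothetical heritability of the toy phenotype

g <- rnorm(n, sd = sqrt(h2))        # stand-in genetic values
e <- rnorm(n, sd = sqrt(1 - h2))    # independent environmental noise
y <- g + e                          # phenotype with unit expected variance

# same estimator as in the comparison above
emp_h2 <- var(g) / var(y)
emp_h2
```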
Citation
Developed by Richard Border and Osman Malik. For further details, or if you find this software useful, please cite:
- Border, R. and Malik, O.A., 2022. rBahadur: efficient simulation of structured high-dimensional genotype data with applications to assortative mating. BMC Bioinformatics. https://doi.org/10.1186/s12859-023-05442-6
Background reading:
- The Multivariate Bernoulli distribution and the Bahadur representation:
  - Teugels, J.L., 1990. Some representations of the multivariate Bernoulli and binomial distributions. Journal of Multivariate Analysis, 32(2), pp.256-268. https://doi.org/10.1016/0047-259X(90)90084-U
  - Bahadur, R.R., 1959. A representation of the joint distribution of responses to n dichotomous items. School of Aviation Medicine, Randolph AFB, Texas. https://apps.dtic.mil/sti/citations/AD0706093
- Cross-generational dynamics of genetic variants under univariate assortative mating:
  - Nagylaki, T., 1982. Assortative mating for a quantitative character. Journal of Mathematical Biology, 16, pp.57–74. https://doi.org/10.1007/BF00275161