Description

Simulate Complex Traits from Genotypes.

Simulate complex traits given a SNP genotype matrix and model parameters (the desired heritability, number of causal loci, and either the true ancestral allele frequencies used to generate the genotypes or the mean kinship for a real dataset). Emphasis on avoiding common biases due to the use of estimated allele frequencies. The code selects random loci to be causal, constructs coefficients for these loci and random independent non-genetic effects, and can optionally generate random group effects. Traits can follow three models: random coefficients, fixed effect sizes, and infinitesimal (multivariate normal). GWAS method benchmarking functions are also provided. Described in Yao and Ochoa (2022) <doi:10.1101/2022.03.25.485885>.

simtrait

The simtrait R package simulates complex traits with a user-set number of causal loci and a desired heritability (the proportion of trait variance due to genetic effects).

The main function, sim_trait, takes a simulated genotype matrix together with the true ancestral allele frequencies used to generate it. The frequencies are needed to correctly specify the desired correlation structure. See the package bnpsd for simulating genotypes for admixed individuals (example below).
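
To illustrate why the true allele frequencies matter, here is a minimal base-R sketch. It is not the package's implementation: it assumes a toy unstructured population (independent Binomial(2, p) genotypes), where a locus with frequency p has genotype variance 2p(1-p), and it rescales random coefficients so that the genetic variance matches the desired heritability. All variable names are illustrative.

```r
set.seed(1)
m_loci <- 500    # number of loci
n_ind <- 5000    # number of individuals
m_causal <- 100  # number of causal loci
herit <- 0.8     # desired heritability

# true allele frequencies (known only because the data are simulated)
p_anc <- runif(m_loci, 0.1, 0.9)
# toy genotypes: loci along rows, individuals along columns, values in 0, 1, 2
X <- matrix(rbinom(m_loci * n_ind, 2, p_anc), nrow = m_loci, ncol = n_ind)

# pick causal loci at random and draw initial coefficients
causal_indexes <- sample(m_loci, m_causal)
beta <- rnorm(m_causal)
p <- p_anc[causal_indexes]
# genetic variance implied by these coefficients, using the true frequencies
var_genetic <- sum(2 * p * (1 - p) * beta^2)
# rescale so the genetic variance equals herit (total variance is 1)
beta <- beta * sqrt(herit / var_genetic)

# genetic effect per individual, centered using the true frequencies
G <- drop(crossprod(X[causal_indexes, , drop = FALSE], beta)) - sum(2 * p * beta)
# independent non-genetic effects with variance 1 - herit
E <- rnorm(n_ind, sd = sqrt(1 - herit))
trait <- G + E

# empirical check: these should be close to the targets herit and 1
herit_emp <- var(G) / var(trait)
var_trait <- var(trait)
```

The variance term uses p_anc directly. Substituting frequencies estimated from structured data would change that term, which is the kind of bias the package aims to avoid.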

Simulating a trait from real genotypes is possible with a good kinship matrix estimate. See the package popkin for accurate kinship estimation.

Installation

You can install the released version of simtrait from CRAN with:

install.packages("simtrait")

Install the latest development version from GitHub:

install.packages("devtools") # if needed
library(devtools)
install_github("OchoaLab/simtrait", build_vignettes = TRUE)

You can see the package vignette, which has more detailed documentation, by typing this into your R session:

vignette('simtrait')

Example

The code below has two parts: (1) simulate genotypes, and (2) simulate the trait.

Simulate an admixed population

The first step is to simulate genotypes from an admixed population, to have an example where there is population structure and known ancestral allele frequencies. We use the external package bnpsd to achieve this.

library(bnpsd) # to simulate an admixed population

# dimensions of data/model
# number of loci
m_loci <- 10000
# number of individuals, smaller than usual for easier visualizations
n_ind <- 30
# number of intermediate subpops
k_subpops <- 3

# define population structure
# relative FST values for the k = 3 subpopulations
# (only ratios matter here; they are rescaled below to match the desired Fst)
inbr_subpops <- 1 : k_subpops
# bias coeff of standard Fst estimator
bias_coeff <- 0.5
# desired final Fst of admixed individuals
Fst <- 0.3
obj <- admix_prop_1d_linear(
    n_ind,
    k_subpops,
    bias_coeff = bias_coeff,
    coanc_subpops = inbr_subpops,
    fst = Fst
)
admix_proportions <- obj$admix_proportions
# rescaled Fst vector for intermediate subpops
inbr_subpops <- obj$coanc_subpops

# get pop structure parameters of the admixed individuals
coancestry <- coanc_admix(admix_proportions, inbr_subpops)
kinship <- coanc_to_kinship(coancestry)

# draw allele freqs and genotypes
out <- draw_all_admix(admix_proportions, inbr_subpops, m_loci)
# genotypes
X <- out$X
# ancestral allele frequencies
p_anc <- out$p_anc
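
The conversion from coancestry to kinship used above only changes the diagonal: the coancestry diagonal holds inbreeding coefficients f, while the kinship diagonal holds self-kinship values (1 + f) / 2. A base-R sketch of that relation on a made-up 3 x 3 coancestry matrix follows. The helper name is hypothetical; in practice use coanc_to_kinship from bnpsd.

```r
# hypothetical helper illustrating the diagonal transformation
coanc_to_kinship_sketch <- function(coancestry) {
    kinship <- coancestry
    # off-diagonal values are shared; only the diagonal is transformed
    diag(kinship) <- (1 + diag(coancestry)) / 2
    kinship
}

# made-up symmetric coancestry matrix for three individuals
coancestry_toy <- matrix(
    c(0.2, 0.1, 0.0,
      0.1, 0.3, 0.1,
      0.0, 0.1, 0.4),
    nrow = 3
)
kinship_toy <- coanc_to_kinship_sketch(coancestry_toy)
```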

Simulate a random trait

Here we apply our package to this simulated genotype data.

library(simtrait) # load this package

# parameters of simulation
m_causal <- 100
herit <- 0.8

# create simulated trait and associated data

# version 1: known p_anc (preferred, only applicable to simulated data)
obj <- sim_trait(X = X, m_causal = m_causal, herit = herit, p_anc = p_anc)
# version 2: known kinship (more broadly applicable but fewer guarantees)
obj <- sim_trait(X = X, m_causal = m_causal, herit = herit, kinship = kinship)

# outputs in both versions:
# trait vector
obj$trait
# indexes of the randomly picked causal loci
obj$causal_indexes
# locus effect size vector
obj$causal_coeffs

# theoretical covariance of the simulated traits
V <- cov_trait(kinship = kinship, herit = herit)
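
Under the standard polygenic model, the theoretical covariance has a simple closed form: 2 * herit * kinship + (1 - herit) * I for a trait with unit total variance. The base-R sketch below computes it by hand for a toy kinship matrix. It assumes a variance scale of 1 and no group effects, and it illustrates the formula rather than reproducing the package's code (the helper name is hypothetical).

```r
# hypothetical helper: trait covariance under the polygenic model
cov_trait_sketch <- function(kinship, herit) {
    n_ind <- nrow(kinship)
    # genetic part scales with kinship, non-genetic part is independent noise
    2 * herit * kinship + (1 - herit) * diag(n_ind)
}

# toy kinship matrix for two outbred individuals with kinship 0.1
kinship_toy <- matrix(c(0.5, 0.1, 0.1, 0.5), nrow = 2)
V_toy <- cov_trait_sketch(kinship_toy, herit = 0.8)
```

For outbred individuals (self-kinship 0.5) the diagonal equals 1, the total trait variance, and the off-diagonal entries grow with kinship and heritability.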

Metadata

Version

1.1.3

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows