Description

Convolute Probabilistic Distributions.

Convolute probabilistic distributions using the random generator function of each distribution. A new random number generator function is created that performs the mathematical operation on the individual random samples drawn from the random generator function of each distribution. See the documentation for examples.
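The idea can be sketched in base R without the package. The helper name convolve_generators is hypothetical and is not part of convdistr; it only illustrates how two sample generators can be combined into a new one that applies an operation draw by draw.

```r
# Hypothetical helper (not a convdistr function): build a new generator
# that applies `op` to paired draws from two existing generators.
convolve_generators <- function(rfunc1, rfunc2, op) {
  force(rfunc1); force(rfunc2); force(op)
  function(n) op(rfunc1(n), rfunc2(n))
}

r_norm <- function(n) rnorm(n, mean = 1, sd = 0.5)
r_pois <- function(n) rpois(n, lambda = 5)

# New generator for the sum of the two (independent) distributions
r_sum <- convolve_generators(r_norm, r_pois, `+`)

set.seed(42)
draws <- r_sum(10000)
mean(draws)  # should be close to 1 + 5 = 6
```

The returned function is itself a generator, so it can be passed back into convolve_generators to build longer expressions.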

The convdistr package

John J. Aponte

The convdistr package provides tools to define distribution objects and perform mathematical operations with them. It keeps track of the results as if they were scalar numbers, while maintaining the ability to obtain random samples of the convoluted distributions.

To install this package from GitHub:

devtools::install_github("johnaponte/convdistr", build_manual = TRUE, build_vignettes = TRUE)

Practical example

What would be the resulting distribution of a + b * c if a is a normal distribution with mean 1 and standard deviation 0.5, b is a Poisson distribution with lambda 5, and c is a beta distribution with shape parameters 10 and 20?

library(convdistr)
library(ggplot2)

a <- new_NORMAL(1,0.5)
b <- new_POISSON(5)
c <- new_BETA(10,20)
res <- a + b * c

metadata(res) 
#>   distribution     rvar
#> 1  CONVOLUTION 2.666667
summary(res)
distribution  varname  oval  nsample  mean_  sd_   lci_  median_  uci_
CONVOLUTION   rvar     2.67  10000    2.66   1.02  0.94  2.56     4.95
ggDISTRIBUTION(res) + ggtitle("a + b * c")

The result is a distribution with expected value 2.67. A sample of 10000 draws from the distribution shows a mean of 2.66, a median of 2.56 and a 95% interval of 0.94 to 4.95.
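As a cross-check, the same quantity can be simulated with base R generators only. This is a minimal sketch, assuming independent draws and assuming the interval limits are the 2.5% and 97.5% quantiles; it does not use convdistr, so the sampled values will not match the table digit for digit.

```r
set.seed(1)
n <- 10000

a <- rnorm(n, mean = 1, sd = 0.5)
b <- rpois(n, lambda = 5)
c_draws <- rbeta(n, shape1 = 10, shape2 = 20)  # named c_draws to avoid masking c()

res <- a + b * c_draws

# Applying the operation to the individual expected values:
# 1 + 5 * 10 / (10 + 20)
expected <- 1 + 5 * (10 / (10 + 20))

c(expected = expected, mean = mean(res), median = median(res))
quantile(res, c(0.025, 0.975))
```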

The following sections describe the DISTRIBUTION object, how to create new DISTRIBUTION objects and how to make operations and mixtures with them.

Please note that when convoluting distributions, this package assumes the distributions are mutually independent, i.e. their correlation is 0. If they are not, you need to implement specific distributions that handle the correlation, like the MULTIVARIATE object.
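The following base R sketch shows why the independence assumption matters. It compares the standard deviation of a sum of two standard normals when they are independent and when they have correlation 0.3; the construction of the correlated variable is a standard textbook device and is not taken from the package.

```r
set.seed(7)
n <- 100000
rho <- 0.3

x <- rnorm(n)
z <- rnorm(n)

y_indep <- z                                # independent of x
y_corr  <- rho * x + sqrt(1 - rho^2) * z    # standard normal, correlation rho with x

sd_indep <- sd(x + y_indep)  # theory: sqrt(2)
sd_corr  <- sd(x + y_corr)   # theory: sqrt(2 + 2 * rho)

c(sd_indep = sd_indep, sd_corr = sd_corr)
```

Treating correlated inputs as independent would understate the spread of the sum in this case.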

Description of the DISTRIBUTION object

The DISTRIBUTION is a kind of abstract class (or interface) that specific constructors should implement.

It contains 4 fields:

distribution : A character with the name of the distribution implemented

seed : A numerical seed that is used to get a repeatable sample in the summary function

oval : The observed value, i.e. the expected value. It is used as a plain number in the mathematical operations on distributions, as if each distribution were a simple scalar

rfunc(n) : A function that generates random numbers from the distribution. Its only parameter, n, is the number of draws from the distribution. It returns a matrix with n rows and as many columns as the distribution has dimensions

The DISTRIBUTION object can support multidimensional distributions, for example a Dirichlet distribution. The names of the dimensions should coincide with the names of the oval vector. If the distribution has only one dimension, the default name is rvar.
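To make the layout concrete, here is a toy stand-in built as a plain R list. It only mirrors the four fields described above; the names toy_normal and toy_pair are hypothetical, and real objects should be created with the package constructors.

```r
# One-dimensional toy object: the single dimension is named "rvar"
toy_normal <- list(
  distribution = "NORMAL",
  seed = 12345,
  oval = c(rvar = 0),
  rfunc = function(n) matrix(rnorm(n), ncol = 1, dimnames = list(NULL, "rvar"))
)

# Two-dimensional toy object: column names match the names of oval
toy_pair <- list(
  distribution = "TOY2D",
  seed = 12345,
  oval = c(A = 0, B = 1),
  rfunc = function(n) cbind(A = rnorm(n, 0), B = rnorm(n, 1))
)

draws <- toy_normal$rfunc(5)
dim(draws)                       # n rows, one column per dimension
colnames(toy_pair$rfunc(3))      # same names as names(toy_pair$oval)
```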

Because an rfunc may be reused when new distributions are created by convolution or mixture, its environment must be carefully controlled to avoid the reference leaks that closures make possible in R. For that reason, the rfunc should be created with the restrict_environment function, which ensures that only the variables the function requires are saved in its environment.
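The concern can be illustrated in base R. The helper restrict_env_sketch below is hypothetical; convdistr's own restrict_environment may differ in signature and details. The sketch replaces a closure's environment with a fresh one that holds only the listed variables.

```r
# Hypothetical illustration, not the package implementation:
# give `f` a new environment containing only the variables passed in `...`.
restrict_env_sketch <- function(f, ...) {
  env <- new.env(parent = globalenv())
  vars <- list(...)
  for (nm in names(vars)) assign(nm, vars[[nm]], envir = env)
  environment(f) <- env
  f
}

make_normal_rfunc <- function(p_mean, p_sd) {
  big_unused_object <- numeric(1e6)  # a plain closure would keep this alive
  restrict_env_sketch(function(n) rnorm(n, m, s), m = p_mean, s = p_sd)
}

rf <- make_normal_rfunc(1, 0.5)
ls(environment(rf))  # only "m" and "s" are stored with the function
rf(3)
```

Without the restriction, the returned function would carry the whole calling frame of make_normal_rfunc, including the large unused vector.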

Once new objects are instantiated, their fields are immutable and should not be changed.

Factory of DISTRIBUTION objects

The following functions create new objects of class DISTRIBUTION

Distribution  factory          parameters                   function
uniform       new_UNIFORM      p_min, p_max                 runif
normal        new_NORMAL       p_mean, p_sd                 rnorm
beta          new_BETA         p_shape1, p_shape2           rbeta
beta          new_BETA_lci     p_mean, p_lci, p_uci         rbeta
triangular    new_TRIANGULAR   p_min, p_max, p_mode         rtriangular
poisson       new_POISSON      p_lambda                     rpois
exponential   new_EXPONENTIAL  p_rate                       rexp
discrete      new_DISCRETE     p_supp, p_prob               sample
dirichlet     new_DIRICHLET    p_alpha, p_dimnames          rdirichlet
truncated     new_TRUNCATED    p_distribution, p_min, p_max
dirac         new_DIRAC        p_value
NA            new_NA           p_dimnames
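For the two simplest rows of the table, a generator can be sketched directly in base R. The functions r_discrete and r_dirac are hypothetical names used for illustration; how the package builds its own generators is not shown here.

```r
# Discrete distribution: draw from a support with given probabilities.
# Indexing with sample.int avoids the special case in sample() when the
# support is a single number.
r_discrete <- function(n, supp, prob) {
  idx <- sample.int(length(supp), n, replace = TRUE, prob = prob)
  matrix(supp[idx], ncol = 1, dimnames = list(NULL, "rvar"))
}

# Dirac distribution: every draw equals the same value.
r_dirac <- function(n, value) {
  matrix(rep(value, n), ncol = 1, dimnames = list(NULL, "rvar"))
}

set.seed(3)
x <- r_discrete(10000, supp = c(1, 2, 5), prob = c(0.5, 0.3, 0.2))
table(x) / length(x)   # observed frequencies, close to the given probabilities
r_dirac(4, 2.5)
```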

Methods

The following are methods for all objects of class DISTRIBUTION

  • metadata(x) Print the metadata for the distribution
  • summary(object, n=10000) Produce a summary of the distribution
  • rfunc(x, n) Generate n random draws from the distribution
  • plot(x, n= 10000) Produce a density plot of the distribution
  • ggDISTRIBUTION(x, n= 10000) Produce a density plot of the distribution using ggplot2

myDistr <- new_NORMAL(0,1)
metadata(myDistr)
#>   distribution rvar
#> 1       NORMAL    0
rfunc(myDistr, 10)
#>            rvar
#> 1  -0.202292246
#> 2   2.359176819
#> 3  -0.378977974
#> 4  -1.108465547
#> 5   0.080081266
#> 6  -0.001522165
#> 7   1.140359435
#> 8   0.220586273
#> 9   0.533860090
#> 10  1.450453816
summary(myDistr)
distribution  varname  oval  nsample  mean_  sd_   lci_   median_  uci_
NORMAL        rvar     0     10000    0.01   1.01  -1.98  0.02     1.97
plot(myDistr)

ggDISTRIBUTION(myDistr)
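The role of the seed in the summary can be sketched as follows. The function summary_sketch is hypothetical and works on a bare generator rather than a DISTRIBUTION object; it assumes that lci_ and uci_ are the 2.5% and 97.5% quantiles, which is an interpretation of the 95% limits reported in the tables.

```r
# Hypothetical summary: fix the seed, draw n samples, report basic statistics.
summary_sketch <- function(rfunc, oval, seed, n = 10000) {
  set.seed(seed)  # note: this resets the global RNG state
  x <- rfunc(n)
  data.frame(
    oval    = oval,
    nsample = n,
    mean_   = mean(x),
    sd_     = sd(x),
    lci_    = unname(quantile(x, 0.025)),
    median_ = median(x),
    uci_    = unname(quantile(x, 0.975))
  )
}

s1 <- summary_sketch(function(n) rnorm(n), oval = 0, seed = 2021)
s2 <- summary_sketch(function(n) rnorm(n), oval = 0, seed = 2021)
identical(s1, s2)  # same seed, same sample, same summary
```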

Convolution for Distribution with the same dimensions

Mathematical operations like +, -, *, / between DISTRIBUTION objects with the same dimensions can be performed with the new_CONVOLUTION(listdistr, op, omit_NA = FALSE) function. The listdistr parameter is a list of DISTRIBUTION objects on which the operation is performed. A shorter version exists for each of the operations, as follows:

  • new_SUM(listdistr, omit_NA = FALSE)
  • new_SUBTRACTION(listdistr, omit_NA = FALSE)
  • new_MULTIPLICATION(listdistr, omit_NA = FALSE)
  • new_DIVISION(listdistr, omit_NA = FALSE)

Mathematical operators can also be used directly.

d1 <- new_NORMAL(1,1)
d2 <- new_UNIFORM(2,8)
d3 <- new_POISSON(5)
dsum <- new_SUM(list(d1,d2,d3))
dsum
#>   distribution rvar
#> 1  CONVOLUTION   11
d1 + d2 + d3
#>   distribution rvar
#> 1  CONVOLUTION   11
summary(dsum)
distribution  varname  oval  nsample  mean_  sd_   lci_  median_  uci_
CONVOLUTION   rvar     11    10000    11     3.01  5.4   10.88    17.2
ggDISTRIBUTION(dsum)
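One way such operator support can be written in R is through the S3 Ops group generic. The sketch below uses a hypothetical toy class, toy_distr, with only an oval and an rfunc; it is not the package's implementation and handles only the four binary arithmetic operators.

```r
# Toy class holding an expected value and a sample generator
new_toy <- function(oval, rfunc) {
  structure(list(oval = oval, rfunc = rfunc), class = "toy_distr")
}

# Group generic: applies the operator to the scalars and, draw by draw,
# to the samples of both operands.
Ops.toy_distr <- function(e1, e2) {
  stopifnot(.Generic %in% c("+", "-", "*", "/"))
  op <- match.fun(.Generic)
  new_toy(
    oval  = op(e1$oval, e2$oval),
    rfunc = function(n) op(e1$rfunc(n), e2$rfunc(n))
  )
}

d1 <- new_toy(1, function(n) rnorm(n, 1, 1))
d2 <- new_toy(5, function(n) runif(n, 2, 8))
d3 <- new_toy(5, function(n) rpois(n, 5))

dsum <- d1 + d2 + d3
dsum$oval  # 11

set.seed(10)
x <- dsum$rfunc(10000)
c(mean = mean(x), sd = sd(x))
```

For independent components the variance of the sum is 1 + 3 + 5 = 9, which agrees with the standard deviation of about 3 in the summary table above.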

Mixture

A DISTRIBUTION consisting of a mixture of several distributions can be obtained with the new_MIXTURE(listdistr, mixture) function, where listdistr is a list of DISTRIBUTION objects and mixture is the vector of probabilities for each distribution. If mixture is missing, every distribution gets the same probability.

d1 <- new_NORMAL(1,0.5)
d2 <- new_NORMAL(5,0.5)
d3 <- new_NORMAL(10,0.5)
dmix <- new_MIXTURE(list(d1,d2,d3))
summary(dmix)
distribution  varname  oval  nsample  mean_  sd_  lci_  median_  uci_
MIXTURE       rvar     5.33  10000    5.32   3.7  0.27  4.99     10.71
ggDISTRIBUTION(dmix)
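Sampling from a mixture can be sketched in base R by first choosing a component for each draw and then sampling from it. The function r_mixture is a hypothetical name, and this is one common way to sample a mixture rather than a description of how new_MIXTURE works internally.

```r
# Hypothetical mixture sampler over a list of generator functions.
# Equal weights are used when `mixture` is not supplied.
r_mixture <- function(n, rfuncs,
                      mixture = rep(1 / length(rfuncs), length(rfuncs))) {
  k <- sample.int(length(rfuncs), n, replace = TRUE, prob = mixture)
  out <- numeric(n)
  for (i in seq_along(rfuncs)) {
    idx <- which(k == i)
    out[idx] <- rfuncs[[i]](length(idx))  # draw only what this component needs
  }
  out
}

comps <- list(
  function(n) rnorm(n, 1, 0.5),
  function(n) rnorm(n, 5, 0.5),
  function(n) rnorm(n, 10, 0.5)
)

set.seed(5)
x <- r_mixture(10000, comps)
c(mean = mean(x), sd = sd(x))  # mean should be close to (1 + 5 + 10) / 3
```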

Convolution of distributions with different dimensions

When convoluting distributions with different dimensions, there are two possibilities. The new_CONVOLUTION_assoc family of functions performs the operation only on the common dimensions and leaves the other dimensions as they are, while the new_CONVOLUTION_comb family of functions performs the operation on the combination of all dimensions.

d1 <- new_MULTINORMAL(c(0,1), matrix(c(1,0.3,0.3,1), ncol = 2), p_dimnames = c("A","B"))
d2 <- new_MULTINORMAL(c(3,4), matrix(c(1,0.3,0.3,1), ncol = 2), p_dimnames = c("B","C"))
summary(d1)
distribution  varname  oval  nsample  mean_  sd_  lci_   median_  uci_
MULTINORMAL   A        0     10000    0      1    -1.96  0.01     1.95
MULTINORMAL   B        1     10000    1      1    -0.95  1.00     2.94
summary(d2)
distribution  varname  oval  nsample  mean_  sd_   lci_  median_  uci_
MULTINORMAL   B        3     10000    3.01   1.00  1.04  3.03     4.97
MULTINORMAL   C        4     10000    4.01   1.01  2.04  4.00     6.00
summary(new_SUM_assoc(d1,d2))
distribution  varname  oval  nsample  mean_  sd_   lci_   median_  uci_
CONVOLUTION   A        0     10000    -0.01  0.99  -1.94  -0.01    1.92
CONVOLUTION   C        4     10000    4.00   1.00  2.05   3.99     5.95
CONVOLUTION   B        4     10000    4.01   1.42  1.21   4.01     6.80
summary(new_SUM_comb(d1,d2))
distribution  varname  oval  nsample  mean_  sd_   lci_  median_  uci_
CONVOLUTION   A_B      3     10000    2.96   1.42  0.17  2.97     5.76
CONVOLUTION   B_B      4     10000    3.98   1.41  1.17  4.00     6.71
CONVOLUTION   A_C      4     10000    3.97   1.41  1.28  3.97     6.75
CONVOLUTION   B_C      5     10000    4.99   1.40  2.20  4.98     7.67
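The difference between the two families can be sketched on plain sample matrices with named columns. The functions sum_assoc and sum_comb are hypothetical stand-ins written for this illustration; for simplicity the columns inside each matrix are drawn independently, unlike the correlated MULTINORMAL objects above.

```r
# Associative sum: add the shared columns, keep the others unchanged.
sum_assoc <- function(x, y) {
  common <- intersect(colnames(x), colnames(y))
  only_x <- setdiff(colnames(x), common)
  only_y <- setdiff(colnames(y), common)
  cbind(
    x[, only_x, drop = FALSE],
    y[, only_y, drop = FALSE],
    x[, common, drop = FALSE] + y[, common, drop = FALSE]
  )
}

# Combinatorial sum: add every column of x to every column of y.
sum_comb <- function(x, y) {
  out <- list()
  for (j in colnames(y)) {
    for (i in colnames(x)) {
      out[[paste(i, j, sep = "_")]] <- x[, i] + y[, j]
    }
  }
  do.call(cbind, out)
}

set.seed(11)
n <- 10000
x <- cbind(A = rnorm(n, 0), B = rnorm(n, 1))
y <- cbind(B = rnorm(n, 3), C = rnorm(n, 4))

assoc <- sum_assoc(x, y)
comb  <- sum_comb(x, y)

colnames(assoc)
colnames(comb)
round(colMeans(comb), 2)
```

The associative version returns one column per distinct dimension name, while the combinatorial version returns one column per pair of dimensions.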
Metadata

Version

1.6.2

License

Unknown

Platforms (77)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows