
## Sampling from a Mixture of Distributions

A distribution $f(x)$ is said to be a mixture of *k* component distributions $f_1(x), \ldots, f_k(x)$ if:

$f(x) = \sum_{i=1}^k \pi_i f_i(x)$

where the $\pi_i$ are the so-called mixing weights, with $0 \le \pi_i \le 1$ and $\pi_1 + \cdots + \pi_k = 1$. Here, new data points from the mixture are generated in the standard way: first pick a component distribution, with probabilities given by the mixing weights, and then generate one observation from that distribution. More information about mixture distributions can be found on Wikipedia.
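The two-step procedure just described can be sketched in base R. This is a minimal illustration under stated assumptions, not the implementation used by any package: the function name `sample_mixture_sketch` and its arguments are hypothetical and introduced here only for the example.

```r
# Hypothetical helper (not part of any package): two-step mixture sampling.
# weights:  mixing weights pi_1, ..., pi_k (non-negative, summing to 1)
# samplers: list of k functions, each taking m and returning m random draws
sample_mixture_sketch <- function(n, weights, samplers) {
    stopifnot(length(weights) == length(samplers),
              all(weights >= 0),
              isTRUE(all.equal(sum(weights), 1)))
    # Step 1: pick a component label for every observation
    label <- sample(seq_along(weights), size = n, replace = TRUE,
                    prob = weights)
    # Step 2: draw each observation from its selected component
    x <- numeric(n)
    for (i in seq_along(samplers)) {
        idx <- which(label == i)
        if (length(idx) > 0) x[idx] <- samplers[[i]](length(idx))
    }
    x
}

set.seed(1)
x_demo <- sample_mixture_sketch(
    n = 1e4,
    weights = c(0.7, 0.3),
    samplers = list(function(m) rnorm(m, mean = 1, sd = 1),
                    function(m) rnorm(m, mean = 5, sd = 1)))
mean(x_demo) # should be close to 0.7 * 1 + 0.3 * 5 = 2.2
```

Drawing all labels first and then filling each component in one vectorized call avoids a loop over the individual observations.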

# 1. Generating random variables from a mixture of normal distributions

To sample from a mixture distribution, the R package *usefr* will be used.

`library(usefr)`

`set.seed(123) # set a seed for random generation`

`# ========= A mixture of two distributions =========`

`phi = c(7/10, 3/10) # Mixture proportions`

`# ---------------------------------------------------------`

`# === Named vector of the corresponding distribution function parameters`

`# must be provided`

`args <- list(norm = c(mean = 1, sd = 1), `

`norm = c(mean = 5, sd = 1))`

`# ------------------------------------------------------------`

`# ===== Sampling from the specified mixture distribution ====`

`x <- rmixtdistr(n = 1e5, phi = phi, arg = args)`

`# ------------------------------------------------------------`

`# === The graphics for the simulated dataset and the corresponding theoretical`

`# mixture distribution`

`par(bg = "gray98", mar = c(3, 4, 2, 1) )`

`hist(x, 90, freq = FALSE, las = 1, family = "serif", col = rgb(0, 0, 1, 0.2), border = "deepskyblue")`

`x1 <- seq(-4, 10, by = 0.001)`

`lines(x1, dmixtdistr(x1, phi = phi, arg = args), col = "red")`
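As a cross-check on the red curve, the density of this two-component normal mixture can be written out directly from the definition $f(x) = \sum_i \pi_i f_i(x)$ using only base R. The sketch below is independent of *usefr*; the names `w` and `f_mix` are introduced here for illustration only.

```r
w <- c(7/10, 3/10) # same mixing weights as in the example above

# Mixture density written term by term from the definition
f_mix <- function(x) {
    w[1] * dnorm(x, mean = 1, sd = 1) +
        w[2] * dnorm(x, mean = 5, sd = 1)
}

# A valid density integrates to 1 (up to numerical error)
integrate(f_mix, lower = -Inf, upper = Inf)$value

# The mixture mean is the weighted mean of the component means
sum(w * c(1, 5)) # 0.7 * 1 + 0.3 * 5 = 2.2
```

The sample mean of the simulated `x` should land close to the same value of 2.2.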

# 2. Mixture of Weibull and Gamma distributions

Mixtures of normal distributions are what we most frequently see online and in papers. Let's look at a mixture of Weibull and Gamma distributions.

`set.seed(123) # set a seed for random generation`

`# ==== A mixture of two distributions =====`

`pi = c(7/10, 3/10) # Mixture proportions (this masks R's built-in constant pi)`

`# ---------------------------------------------------------`

`# === Named vector of the corresponding distribution function parameters`

`# must be provided`

`args <- list(gamma = c(shape = 20, scale = 1/15),`

`weibull = c(shape = 3, scale = 0.5))`

`# ---------------------------------------------------------`

`# === Sampling from the specified mixture distribution ====`

`x <- rmixtdistr(n = 1e5, phi = pi, arg = args)`

`# ---------------------------------------------------------`

`# === The graphics for the simulated dataset and the corresponding theoretical`

`# mixture distribution`

`par(bg = "gray98", mar = c(3, 4, 2, 1) )`

`hist(x, 90, freq = FALSE, las = 1, family = "serif", col = "cyan1", border = "deepskyblue")`

`x1 <- seq(-4, 10, by = 0.001)`

`lines(x1, dmixtdistr(x1, phi = pi, arg = args), col = "red")`

# 3. Mixture of Gamma, Weibull, and Log-Normal distributions

`set.seed(123) # set a seed for random generation`

`# =============== A mixture of three distributions ========================`

`pi = c(5/10, 3/10, 2/10) # Mixture proportions`

`# --------------------------------------------------------------------------`

`# ==== Named vector of the corresponding distribution function parameters`

`# must be provided`

`args <- list(gamma = c(shape = 20, scale = 1/10),`

`weibull = c(shape = 4, scale = 0.8),`

`lnorm = c(meanlog = 1.2, sdlog = 0.08))`

`# --------------------------------------------------------------------------`

`# ======= Sampling from the specified mixture distribution =======`

`x <- rmixtdistr(n = 1e5, phi = pi, arg = args)`

`# --------------------------------------------------------------------------`

`# The graphics for the simulated dataset and the corresponding theoretical`

`# mixture distribution`

`par(bg = "gray98", mar = c(3, 4, 2, 1) )`

`hist(x, 90, freq = FALSE, las = 1, family = "serif", col = "plum1", border = "violet")`

`x1 <- seq(-4, 10, by = 0.001)`

`lines(x1, dmixtdistr(x1, phi = pi, arg = args), col = "red")`
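The `args` lists used in these examples pair a distribution name with its named parameters. One way such a specification could be turned into a density, assuming the names follow base R's `d<name>` convention (`dgamma`, `dweibull`, `dlnorm`), is sketched below. The function `dmix_sketch` and the variables `w` and `spec` are hypothetical, and the sketch is not claimed to reproduce how *usefr* works internally.

```r
# Hypothetical helper: evaluate a mixture density from a named list of
# parameter vectors, looking up base R density functions by name.
dmix_sketch <- function(x, weights, arg) {
    stopifnot(length(weights) == length(arg))
    dens <- numeric(length(x))
    for (i in seq_along(arg)) {
        # "gamma" -> dgamma, "weibull" -> dweibull, "lnorm" -> dlnorm
        dfun <- get(paste0("d", names(arg)[i]), mode = "function")
        # e.g. dgamma(x, shape = 20, scale = 1/10)
        comp <- do.call(dfun, c(list(x), as.list(arg[[i]])))
        dens <- dens + weights[i] * comp
    }
    dens
}

w <- c(5/10, 3/10, 2/10)
spec <- list(gamma = c(shape = 20, scale = 1/10),
             weibull = c(shape = 4, scale = 0.8),
             lnorm = c(meanlog = 1.2, sdlog = 0.08))

# All three components live on x > 0 and have negligible mass above 10,
# so this integral should be close to 1
integrate(dmix_sketch, lower = 0, upper = 10, weights = w, arg = spec)$value
```

Looking functions up by name keeps the specification compact, at the cost of failing at run time if a name has no matching `d<name>` function.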