library(norMmix)
set.seed(2020)

Ultra quick start

This section is for those who want a quick whiff of what the package can do. For the proper start to this vignette, see next section.

A normal mixture model is a multivariate probability distribution constructed of normal distributions and mixture weights. Its (probability) density and (cumulative) distribution functions (aka PDF and CDF), \(f()\) and \(F()\) are \[ f({\bf x}) = \sum_{k=1}^{K} \pi_k \phi({\bf x};\mu_k, \Sigma_k), \text{ and } \\ F({\bf x}) = \sum_{k=1}^{K} \pi_k \Phi({\bf x};\mu_k, \Sigma_k), \]

with \(\pi_k \ge 0\), \(\sum \pi_k = 1\), and \(\phi(\cdot; \mu_k, \Sigma_k)\) and \(\Phi(\cdot; \mu_k, \Sigma_k)\) are the density and distribution function of a multivariate normal (aka Gaussian) distribution with mean \(\mu_k\) and covariance matrix \(\Sigma_k\). For details, see e.g., https://en.wikipedia.org/wiki/Multivariate_normal_distribution .

To begin, pick your favorite dataset and how many components you want to fit. For the most general model, let model="VVV". Use claraInit as the default method.

run norMmixMLE:

faith <- norMmixMLE(faithful, 3, model="VVV", initFUN=claraInit)
#> initial  value 1148.756748 
#> iter  10 value 1130.331858
#> iter  20 value 1120.857337
#> iter  30 value 1120.175360
#> iter  40 value 1119.345117
#> iter  50 value 1119.252780
#> iter  60 value 1119.213982
#> iter  60 value 1119.213972
#> iter  60 value 1119.213972
#> final  value 1119.213972 
#> converged

and inspect the results:

plot(faith)

This is all you need to know for just the bare bones functionality of the package. We can also go about the whole thing the other way around, by starting out by defining a normal mixture from scratch.

Use the norMmix() function; the constructor function for normal mixtures.

w <- c(0.5, 0.3, 0.2)
mu <- matrix(1:6, 2, 3)
sig <- array(c(2,1,1,2,
               3,2,2,3,
               4,3,3,4), c(2,2,3))
nm <- norMmix(mu, Sigma=sig, weight=w)
plot(nm)

higher-dimensional plots are also possible.

plot(MW32)

This model is taken from

Marron, J. S., and Wand, M. P. (1992) “Exact Mean Integrated Squared Error.” doi:10.1214/aos/1176348653.

All the models from this paper are provided in the package as examples.

rnorMmix() generates data (random observations) from such distributions:

x <- rnorMmix(500, nm)
plot(nm, xlim = c(-5,10), ylim = c(-5, 12),
     main = "500 observations from a mixture of 3 bivariate Gaussians")
points(x)