norMmix
library(norMmix)
set.seed(2020)
This section is for those who want a quick whiff of what the package can do. For the proper start to this vignette, see next section.
A normal mixture model is a multivariate probability distribution constructed of normal distributions and mixture weights. Its (probability) density and (cumulative) distribution functions (aka PDF and CDF), \(f()\) and \(F()\) are \[ f({\bf x}) = \sum_{k=1}^{K} \pi_k \phi({\bf x};\mu_k, \Sigma_k), \text{ and } \\ F({\bf x}) = \sum_{k=1}^{K} \pi_k \Phi({\bf x};\mu_k, \Sigma_k), \]
with \(\pi_k \ge 0\), \(\sum \pi_k = 1\), and \(\phi(\cdot; \mu_k, \Sigma_k)\) and \(\Phi(\cdot; \mu_k, \Sigma_k)\) are the density and distribution function of a multivariate normal (aka Gaussian) distribution with mean \(\mu_k\) and covariance matrix \(\Sigma_k\). For details, see e.g., https://en.wikipedia.org/wiki/Multivariate_normal_distribution .
To begin, pick your favorite dataset and how many components you want
to fit. For the most general model, let model="VVV"
. Use
claraInit
as the default method.
run norMmixMLE
:
faith <- norMmixMLE(faithful, 3, model="VVV", initFUN=claraInit)
#> initial value 1148.756748
#> iter 10 value 1130.331858
#> iter 20 value 1120.857337
#> iter 30 value 1120.175360
#> iter 40 value 1119.345117
#> iter 50 value 1119.252780
#> iter 60 value 1119.213982
#> iter 60 value 1119.213972
#> iter 60 value 1119.213972
#> final value 1119.213972
#> converged
and inspect the results:
plot(faith)
This is all you need to know for just the bare bones functionality of the package. We can also go about the whole thing the other way around, by starting out by defining a normal mixture from scratch.
Use the norMmix()
function; the constructor function for
normal mixtures.
w <- c(0.5, 0.3, 0.2)
mu <- matrix(1:6, 2, 3)
sig <- array(c(2,1,1,2,
3,2,2,3,
4,3,3,4), c(2,2,3))
nm <- norMmix(mu, Sigma=sig, weight=w)
plot(nm)
higher-dimensional plots are also possible.
plot(MW32)
This model is taken from
Marron, J. S., and Wand, M. P. (1992) “Exact Mean Integrated Squared Error.” doi:10.1214/aos/1176348653.
All the models from this paper are provided in the package as examples.
rnorMmix()
generates data (random observations) from
such distributions:
x <- rnorMmix(500, nm)
plot(nm, xlim = c(-5,10), ylim = c(-5, 12),
main = "500 observations from a mixture of 3 bivariate Gaussians")
points(x)