What is a Gaussian Mixture Model (GMM) – 3D Point Cloud Classification Primer

Recently we finished a paper about 3d point cloud classification and segmentation using our proposed 3D modified Fisher Vector (3DmFV) representation and convolutional neural networks (CNNs).  It is currently available on ArXiv.  I believe in making research accessible to everyone and I realized that this paper requires some level of familiarity with two main topics:

  1. Gaussian Mixture Models (GMM)
  2. Fisher Vectors

This post is the first in a series of three and aims to be a GMM primer for using 3DmFV for 3D point cloud classification.

So, what is a GMM (sometimes referred to as Mixture of Gaussians)?

The shortest answer is a probability distribution composed of several Gaussians.

I will start with some mathematical definitions but will also give a more intuitive example at the bottom.

The Math

The Gaussian (also known as a multivariate normal distribution) of a random D dimensional point p can be specified using \mu and \Sigma , which are its expected value (mean), and covariance matrix  respectively.  Therefore the likelihood of a single point p with respects to the kth Gaussian is given by

u_k(p) = \frac{1}{(2\pi)^{D/2}|\Sigma_k|^{1/2}}\exp\left\{-\frac{1}{2}(p-\mu_k)'\Sigma_k^{-1}(p-\mu_k)\right\},

and the likelihood of p associated with the GMM

u_\lambda(p) = \sum_{k=1}^{K}w_ku_k(p)

Here,  w_k are the mixture weights ( informally representing the number of points/ the amount of influence of each Gaussian).


Let’s say that you have some D dimensional data points (I will focus on 2D and 3D data because it is easy to visualize but it works for much higher dimensions).  In addition, let us assume that each one of these data points came from one of K different sources. Each source is a Gaussian in a D dimensional space.  The challenge is to find the Gaussians parameters that will maximize their expectation. This is usually done using the Expectation Maximization (EM) algorithm.

Basically what we want to do is find  \lambda=(w_k,\mu_k,\Sigma_k) for the Gaussians that best fits the data.

The Intuition

We can imagine a more intuitive (and adventurous) problem:

Once upon a time, there was a rich man with a lot of gold in his pockets. He also had a hole in one of them and whenever he walked around his garden some coins fell to the ground. What you want to do is draw a treasure map for adventurers and let them know where are they most likely to find gold.

In the image below you can see the coins that fell out as points ( each coin has a different color corresponding to the Gaussian that was used to generate it) and a treasure map overlay that shows four different areas to finding gold (the lines transparency indicates the probability – close to the center of each Gaussian the probability is higher to find gold and as you go farther away it reduces.

2D Gaussian Mixrure Model example

We can do something very similar for 3D points (abandoning the rich man example). In the image below you can see the generated 3D points and a semi-transparent surface representing the Gaussian location and scale ( 3 standard deviations in each axis).

3D GMM example


You can see how we can basically represent all of these 4000 points using only 4 Gaussians.  Each points gets a likelihood value (u_k) for each Gaussian.  In our paper, we show that we can use a GMM with Gaussians on a grid (and not using EM algorithm) to represent 3D point clouds for the classification and part segmentation task. But, more on that in a later post.


The Code

The code used for generating the images above is available on github.

It is composed of three main parts

  • Generating data
  • Fitting the Gaussian Mixture Model
  • Visualization

Generating data

First, we specify the number of points to generate in each Gaussian,  the problems dimensionality (2/3), and the parameters for the Gaussians that we will use to generate the data.

## Generate synthetic data
N,D = 1000, 2 # number of points and dimenstinality

if D == 2:
    #set gaussian ceters and covariances in 2D
    means = np.array([[0.5, 0.0],
                      [0, 0],
                      [-0.5, -0.5],
                      [-0.8, 0.3]])
    covs = np.array([np.diag([0.01, 0.01]),
                     np.diag([0.025, 0.01]),
                     np.diag([0.01, 0.025]),
                     np.diag([0.01, 0.01])])
elif D == 3:
    # set gaussian ceters and covariances in 3D
    means = np.array([[0.5, 0.0, 0.0],
                      [0.0, 0.0, 0.0],
                      [-0.5, -0.5, -0.5],
                      [-0.8, 0.3, 0.4]])
    covs = np.array([np.diag([0.01, 0.01, 0.03]),
                     np.diag([0.08, 0.01, 0.01]),
                     np.diag([0.01, 0.05, 0.01]),
                     np.diag([0.03, 0.07, 0.01])])
n_gaussians = means.shape[0]

Next, we generate points using a multivariate normal distribution for

points = []
for i in range(len(means)):
    x = np.random.multivariate_normal(means[i], covs[i], N )
points = np.concatenate(points)

Fitting the Gaussian Mixture Model

Using the EM algorithm (provided in scikit-learn) we were able to find all of the Gaussians parameters

#fit the gaussian model
gmm = GaussianMixture(n_components=n_gaussians, covariance_type='diag')


Finally, we visualize the data using some custom functions (see the git repository for the functions)

if D == 2:
    visualization.visualize_2D_gmm(points, gmm.weights_, gmm.means_.T, np.sqrt(gmm.covariances_).T)
elif D == 3:
    visualization.visualize_3d_gmm(points, gmm.weights_, gmm.means_.T, np.sqrt(gmm.covariances_).T)

This concludes my very brief primer on GMMs.

Additional resources: