What is a Fisher Vector for 3D point clouds – 3D Point Cloud Classification Primer

Recently we published a paper about 3d point cloud classification (and segmentation) using our proposed 3D modified Fisher Vector (3DmFV) representation and convolutional neural networks (CNNs).  The preprint is available on ArXiv and the final version is available in Robotics and Automation Letters (RA-L) journal.

I believe in making research accessible to everyone and I realized that this paper requires some level of familiarity with two main topics:

  1. Gaussian Mixture Models (GMM). 
  2. Fisher Vectors

This post is the second in a series of three and aims to be a Fisher Vector (FV) primer for using 3DmFV for 3D point cloud classification. The first is about GMMS, if you are unfamiliar with it you should start there.

So, what is a Fisher Vector?

The shortest answer is a feature aggregation method.  For an in-depth answer make sure to read Perronnin’s papers – I personally like this one.  Intuitively, FV describes points by their deviation from a GMM.

I will start with some mathematical definitions but will also give a more intuitive example at the bottom.

The Math

Remember that the likelihood of every point p associated with the GMM is given by

u_\lambda(p) = \sum_{k=1}^{K}w_ku_k(p)

The Fisher Vector may be written as the sum of normalized gradient statistics for each point

FV_\lambda^X = \sum_{t=1}^TL_\lambda\nabla_\lambda\log u_\lambda(p_t)

Here, \lambda is the set of parameters of a K component GMM:  \lambda=\{(w_k,\mu_k,\Sigma_k), k=1,...K\}, where w_k,\mu_k,\Sigma_k are the mixture weight, expected value, and covariance matrix of the k^{th} Gaussian.

We perform a change of variables from \{w_k\} to \{\alpha_k\}, ensures that u_\lambda(x) is a valid distribution and simplifies the gradient calculation:

w_k = \frac{exp(\alpha_k)}{\sum_{j=1}^{K}exp(\alpha_j)}

The soft assignment of a point to a gaussian is given by

\gamma_t(k) = \frac{w_k u_k(p_t)}{\sum_{j=1}^{K} w_j u_j(p_t)}

The normalized gradients are:

FV_{\alpha_k}^X = \frac{1}{\sqrt{w_k}} \sum_{t=1}^T(\gamma_t(k)-w_k) FV_{\mu_k}^X = \frac{1}{\sqrt{w_k}} \sum_{t=1}^T \gamma_t(k) \left( \frac{p_t-\mu_k}{\sigma_k} \right) FV_{\sigma_k}^X = \frac{1}{\sqrt{2w_k}} \sum_{t=1}^T \gamma_t(k) \left[ \frac{(p_t-\mu_k)^2}{\sigma_k^2}-1 \right]

Finally, the Fisher vector is formed by concatenating all of these components and normalizing them by the number of points T.

The Intuition

It is much easier to understand the FV for points with some nice visualizations.

Let’s take a single 2D point in a single Gaussian and show its FV next to it. In the image below, the Gaussian is qualitatively visualized with a dashed circle with a radius of a single standard deviation.

Point fisher vector example

Now let’s see what happens when we move the point (hint FV changes)

Point fisher vector example

Finally, we can see what happens when we move the point all around.

Point fisher vector animation

The Code

A great implementation of fisher vectors is available in this link 

In order to recreate the images above you can use my repository for this FV tutorial on my GitHub.

Make sure to check out the next post in this series about our 3D modified Fisher Vectors (3DmFV) for point cloud classification.