Fitting models with circular data in PyMC

2022-May-04

Data are sometimes on a circular scale, such as the angle of an oriented stimulus, and the analysis of such data often needs to take this circularity into account. Here, we will look at how we can use PyMC to fit a model to circular data.

Data are sometimes on a circular scale (also known as directional), and circular data often need to be analysed differently to non-circular data. In perception research, this often occurs with things like stimulus orientation and camera rotations—where the measurement is in angles. A classic example of the need for specialised analysis methods is where we have a pair of angles, say 300° and 10°, and we would like to calculate the mean. Using the typical calculation for the mean gives 155°, which is not a good measure of the central tendency of this data. Instead, we can use the circular mean (via scipy.stats.circmean, for example), which gives 335°.

Two probability distributions that are suitable for use with circular data are the wrapped normal distribution and the von Mises distribution. The distributions are similar, and we will focus here on the von Mises distribution because it is implemented in PyMC. Here, we will look at how we can fit a von Mises distribution to some circular data in PyMC.

Generating the example data

We generate some example data using the method for generating random values from a specified von Mises distribution in numpy. Here, we will set the mean parameter (μ) to 135° and the standard deviation parameter (σ) to 45° and draw 100 random values.

rand = np.random.default_rng(seed=21056627)

n_obs = 100

mu = np.radians(135)
sigma = np.radians(45)
kappa = 1 / (sigma ** 2)

data = rand.vonmises(mu=mu, kappa=kappa, size=n_obs)

Note that the distribution in numpy is parameterised by κ (kappa), which is the inverse of the variance. It also requires the parameters to be specified in radians and produces values in the range of [−π, +π].

A histogram of these values is shown below. Note how the values ‘wrap around’, and how this can make our distribution appear to be bimodal if we are not aware that it is circular data.

Histogram of the example dataset. — Histogram of 100 draws from a von Mises distribution with a mean of 135° and a standard deviation of 45°.

Should this histogram be plotted on polar rather than Cartesian axes? Polar axes do indeed seem well-suited, given that they are specified by angle and length—which matches the characteristics of this plotted data. However, I find graphs using polar axes to be more difficult to visually understand and so I tend to prefer using Cartesian axes. Sometimes it is useful to add a redundant quarter-cycle or so to the left and right edges to make the circularity more evident and not ‘disadvantage’ the data at the boundaries.

Specifying the model

Our goal now is to fit a model to this observed data and estimate the parameters of the distribution from which it was generated. Within the Bayesian framework, this involves placing priors on the unknown parameters—here, μ and σ. The details of the priors need to be specific to the particular circumstances of a given analysis, and pm.sample_prior_predictive can be very useful in visualising the consequences of the selected prior values to check that they align with the intentions.

Here, we will use a uniform prior for μ, expressing our prior belief that the parameter could be anywhere within the circular range. Importantly, we need to indicate the circularity of this μ parameter to pymc by using the transform argument when specifying the prior distribution. We will use a broad half-normal distribution for the standard deviation.

This model can be implemented in pymc via:

with pm.Model() as model:

    mu = pm.Uniform(
        "mu",
        lower=-np.pi,
        upper=+np.pi,
        transform=pm.transforms.Circular(),
    )

    sigma = pm.HalfNormal("sigma", sigma=np.radians(90))

    kappa = 1 / (sigma ** 2)

    pm.VonMises(
        "obs",
        mu=mu,
        kappa=kappa,
        observed=data,
    )

More complex models are sometimes necessary to incorporate the presence of bimodality within the circular data, such as using a pm.Mixture distribution. This bimodality can sometimes occur at 180° separations, which can provide some additional information to constrain the priors. However, ‘label switching’ (Jasra, Holmes, & Stephens, 2005) can be an issue with circular data in particular (Mulder, Jongsma, & Klugkist, 2020) and can require additional consideration.

Sampling and model evaluation

We can run the MCMC sampling process using pymc via:

with model:
    idata = pm.sample(return_inferencedata=True)

I am using return_inferencedata=True so that the result of the sampling process is an arviz (Kumar, Carroll, Hartikainen, & Martin, 2019) InferenceData object, which is not the default in pymc3 (which I am using here)—but it will become the default in the upcoming release of pymc. I was slow to come around to the benefits of the InferenceData representation, but I have since become a big fan of xarray (Hoyer & Hamman, 2017), upon which InferenceData is based, and now prefer this sort of representation in most of my programming.

It is important to note that the posterior samples for μ are also circular data, so we need to be careful with how they are summarised. Happily, arviz is able to appropriately deal with circular data in many instances. For example, obtaining the mean of the posterior using az.summary can use the circular mean if the argument circ_var_names=["mu"] is provided. Doing so here gives a mean of about 132.75°, which is close to the value of μ in the generating distribution (135°).

One way we can evaluate the descriptive adequacy of the model is to randomly draw samples from the fitted posterior (here, μ and σ) and use those to evaluate the probability density function at a range of angles. As expected, the 50 such samples shown in the below depict distributions that appear to visually approximate the histogram of the observed data.

Histogram of the example dataset with overlaid probability density functions from samples from the fitted posterior distribution. — Evaluation of the probability density function of a von Mises distribution with parameters from each of 50 random draws from the fitted posterior, overlaid on the histogram of the observed data.

Summary

Circular data often need to be analysed using specialised procedures that take the circularity into account. Here, we looked at how we can use PyMC to fit a simple model to some circular data.

Software versions

Here are the version of the key software and packages that were used to generate the material in this post:

Python implementation: CPython
Python version       : 3.9.12
IPython version      : 8.3.0

pymc3: 3.11.5
veusz: 3.4
numpy: 1.22.1
scipy: 1.7.3
arviz: 0.12.0

References

Hoyer, S. & Hamman, J. (2017) xarray: N-D labeled arrays and datasets in Python. Journal of Open Research Software, 5(1), 10–10.
Jasra, A., Holmes, C.C., & Stephens, D.A. (2005) Markov Chain Monte Carlo methods and the label switching problem in Bayesian mixture modeling. Statistical Science, 20(1), 50–67.
Kumar, R., Carroll, C., Hartikainen, A., & Martin, O. (2019) ArviZ: a unified library for exploratory analysis of Bayesian models in Python. Journal of Open Source Software, 4(33), 1143–1143.
Mulder, K., Jongsma, P., & Klugkist, I. (2020) Bayesian inference for mixtures of von Mises distributions using reversible jump MCMC sampler. Journal of Statistical Computation and Simulation, 90(9), 1539–1556.