Fix Python – How to create a density plot in matplotlib?

Question

Asked By – unode

In R I can create the desired output by doing:

data = c(rep(1.5, 7), rep(2.5, 2), rep(3.5, 8),
         rep(4.5, 3), rep(5.5, 1), rep(6.5, 8))
plot(density(data, bw=0.5))

Density plot in R

In python (with matplotlib) the closest I got was with a simple histogram:

import matplotlib.pyplot as plt
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
plt.hist(data, bins=6)
plt.show()

Histogram in matplotlib

I also tried the normed=True parameter but couldn’t get anything other than trying to fit a gaussian to the histogram.

My latest attempts were around scipy.stats and gaussian_kde, following examples on the web, but I’ve been unsuccessful so far.

Now we will see solution for issue: How to create a density plot in matplotlib?


Answer

Sven has shown how to use the class gaussian_kde from Scipy, but you will notice that it doesn’t look quite like what you generated with R. This is because gaussian_kde tries to infer the bandwidth automatically. You can play with the bandwidth in a way by changing the function covariance_factor of the gaussian_kde class. First, here is what you get without changing that function:

alt text

However, if I use the following code:

import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
density = gaussian_kde(data)
xs = np.linspace(0,8,200)
density.covariance_factor = lambda : .25
density._compute_covariance()
plt.plot(xs,density(xs))
plt.show()

I get

alt text

which is pretty close to what you are getting from R. What have I done? gaussian_kde uses a changable function, covariance_factor to calculate its bandwidth. Before changing the function, the value returned by covariance_factor for this data was about .5. Lowering this lowered the bandwidth. I had to call _compute_covariance after changing that function so that all of the factors would be calculated correctly. It isn’t an exact correspondence with the bw parameter from R, but hopefully it helps you get in the right direction.

This question is answered By – Justin Peel

This answer is collected from stackoverflow and reviewed by FixPython community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0