Kernel density estimation

Kernel density estimation (KDE) is a non-parametric method for estimating a probability density function from a sample of data points.

This page describes its unconditional (standard) and conditional forms.

Unconditional case

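The unconditional kernel density estimator with a product kernel is defined as:

\[\hat{f}(x) = \sum_{i=1}^m w_{i} \prod_{j=1}^n \frac{1}{h_j} K \left( \frac{x_{j} - x_{i, j}}{h_j} \right) \text{,} \quad x \in \mathbb{R}^n\]

where \(m\) is the number of data points, \(n\) is the dimension, \(w_i\) is the weight of the \(i\)-th data point, \(h_j\) is the bandwidth in the \(j\)-th dimension, \(x_{i,j}\) is the \(j\)-th coordinate of the \(i\)-th data point, and \(K\) is the kernel function.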
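As a rough illustration of the estimator above, the following minimal sketch evaluates the product-kernel sum directly in plain Python. It reads the product as running over all \(n\) coordinates and assumes a Gaussian kernel and uniform weights \(w_i = 1/m\) when none are given. The function names `gaussian_kernel` and `kde_product`, and the sample data, are hypothetical and not part of any library API.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde_product(x, data, bandwidths, weights=None, kernel=gaussian_kernel):
    """Evaluate a product-kernel density estimate at a single point x.

    x          -- sequence of n coordinates
    data       -- sequence of m data points, each with n coordinates
    bandwidths -- sequence of n positive bandwidths h_j
    weights    -- sequence of m weights w_i (assumed to sum to 1);
                  uniform weights 1/m are used when omitted
    """
    m = len(data)
    if weights is None:
        weights = [1.0 / m] * m

    density = 0.0
    for w_i, x_i in zip(weights, data):
        term = w_i
        # Product over dimensions j = 1..n of K((x_j - x_ij) / h_j) / h_j.
        for x_j, x_ij, h_j in zip(x, x_i, bandwidths):
            term *= kernel((x_j - x_ij) / h_j) / h_j
        density += term
    return density


# Hypothetical two-dimensional sample, for illustration only.
sample = [[0.0, 0.0], [1.0, 0.5], [-0.5, 1.5]]
print(kde_product([0.2, 0.4], sample, bandwidths=[0.8, 0.8]))
```

The nested loops mirror the sum and the product in the formula one to one. A vectorized implementation would be preferable for large datasets.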

The available kernels are listed further down this page.

Example of constructing a kernel density estimate on a small dataset (\(m=9\)) with the Gaussian kernel:

Four kernel functions are available. Their formulas are given below:

Formulas of available kernel functions
Kernel name	Formula
Gaussian	\(\frac{1}{\sqrt{2 \pi}} \exp \left( -\frac{x^2}{2} \right)\)
Uniform	\(0.5 \quad \text{if } |x| \leq 1 \quad \text{otherwise } 0\)
Epanechnikov	\(\frac{3}{4} (1-x^2) \quad \text{if } |x| \leq 1 \quad \text{otherwise } 0\)
Cauchy	\(\frac{2}{\pi (x^2 + 1)^2}\)
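The four kernels can be written down directly as plain Python functions. The sketch below is illustrative only: it takes the Gaussian in its standard form with a negative exponent, \(\exp(-x^2/2)\), and the function names and the helper `integrate` are hypothetical. As a sanity check, it approximates the integral of each kernel with a midpoint rule, since a valid kernel should integrate to approximately 1.

```python
import math


def gaussian(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def uniform(x):
    return 0.5 if abs(x) <= 1.0 else 0.0


def epanechnikov(x):
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0


def cauchy(x):
    return 2.0 / (math.pi * (x * x + 1.0) ** 2)


def integrate(f, lo=-50.0, hi=50.0, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    step = (hi - lo) / steps
    return step * sum(f(lo + (k + 0.5) * step) for k in range(steps))


KERNELS = {
    "gaussian": gaussian,
    "uniform": uniform,
    "epanechnikov": epanechnikov,
    "cauchy": cauchy,
}

for name, kernel in KERNELS.items():
    # Each value should be close to 1 (up to truncation and grid error).
    print(f"{name}: integral ~ {integrate(kernel):.4f}")
```

The uniform and Epanechnikov kernels have compact support on \([-1, 1]\), while the Gaussian and Cauchy kernels are positive everywhere, with the Cauchy kernel having the heavier tails.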

Example of constructing a kernel density estimate with weighted data points:

Note that the rightmost data points, which carry larger weights, have more impact on the estimated density than the others.
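The effect of weights can be reproduced with a small one-dimensional sketch. The data, weights, and bandwidth below are made up for illustration, the function name `weighted_kde_1d` is hypothetical, and the weights are normalized to sum to 1 so that the estimate remains a density (an assumption of this sketch).

```python
import math


def gaussian(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def weighted_kde_1d(x, data, weights, h):
    """One-dimensional weighted KDE; weights are normalized to sum to 1."""
    total = sum(weights)
    return sum(
        (w / total) * gaussian((x - x_i) / h) / h
        for x_i, w in zip(data, weights)
    )


# Hypothetical sample: the rightmost point gets four times the weight.
data = [-2.0, 0.0, 2.0]
weights = [1.0, 1.0, 4.0]
h = 0.5

left = weighted_kde_1d(-2.0, data, weights, h)
right = weighted_kde_1d(2.0, data, weights, h)
print(f"density at -2: {left:.4f}, density at 2: {right:.4f}")
```

With these values the estimated density near the heavily weighted point on the right is noticeably higher than near the point on the left, even though both are single observations.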

There are four available bandwidth selection methods:

Illustration of kernel density estimates obtained with different bandwidth selection methods, computed on data drawn from a Gaussian mixture (blue curve):