Bandwidth selection

kdelearn.bandwidth_selection.normal_reference(x_train: ndarray, weights_train: Optional[ndarray] = None, kernel_name: str = 'gaussian') → ndarray

AMISE-optimal bandwidth under the assumption that the underlying density is Gaussian (normal reference rule).

See paragraph (3.2.1) in [1].

Parameters

x_train (ndarray of shape (m_train, n)) – Data points, given as an array of floats.
weights_train (ndarray of shape (m_train,), optional) – Weights of data points. If None, all data points are equally weighted.
kernel_name ({'gaussian', 'uniform', 'epanechnikov', 'cauchy'}, default='gaussian') – Name of kernel function.

Returns

bandwidth – Smoothing parameter for scaling the estimator.

Return type

ndarray of shape (n,)

Examples

>>> import numpy as np
>>> from kdelearn.bandwidth_selection import normal_reference
>>> x_train = np.random.normal(0, 1, size=(100, 1))
>>> bandwidth = normal_reference(x_train, kernel_name="gaussian")
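
The rule behind this call can be sketched in plain NumPy. The snippet below is an illustrative sketch, not kdelearn's implementation: the name normal_reference_sketch is hypothetical, it covers only the Gaussian kernel, it applies the univariate normal scale rule h = (4 / (3 m))^(1/5) * sigma to each dimension independently, and it normalizes the weights to sum to one. The library may use a different scale estimate or kernel-dependent constants.

```python
import numpy as np


def normal_reference_sketch(x_train, weights_train=None):
    """Per-dimension normal scale rule for a Gaussian kernel (hypothetical helper)."""
    x_train = np.asarray(x_train, dtype=float)
    m_train = x_train.shape[0]

    # Assumption: weights are normalized to sum to one; None means equal weights.
    if weights_train is None:
        weights_train = np.full(m_train, 1.0 / m_train)
    else:
        weights_train = np.asarray(weights_train, dtype=float)
        weights_train = weights_train / weights_train.sum()

    # Weighted mean and (biased) weighted standard deviation of each dimension.
    mean = weights_train @ x_train
    std = np.sqrt(weights_train @ (x_train - mean) ** 2)

    # Gaussian kernel: R(K) = 1 / (2 sqrt(pi)) and mu_2(K) = 1 give (4 / (3 m))^(1/5).
    return (4.0 / (3.0 * m_train)) ** 0.2 * std
```

For an input of shape (m_train, n) the sketch returns an array of shape (n,), matching the documented return type.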

References

[1] Wand, M. P. and Jones, M. C. Kernel Smoothing. Chapman and Hall, 1995.

kdelearn.bandwidth_selection.direct_plugin(x_train: ndarray, weights_train: Optional[ndarray] = None, kernel_name: str = 'gaussian', stage: int = 2)

Direct plug-in bandwidth selector. A Gaussian kernel is used to estimate the integrated squared density derivatives. The stage parameter is limited to a maximum value of 3.

See paragraph (3.6.1) in [1].
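
To make the idea concrete, here is a minimal sketch of a two-stage direct plug-in rule for one-dimensional, equally weighted data with a Gaussian kernel, following the general scheme in [1]. It is not kdelearn's implementation: the name direct_plugin_sketch is hypothetical, the sample standard deviation is assumed as the scale estimate, and whether the library counts stages the same way is also an assumption.

```python
import numpy as np


def direct_plugin_sketch(x):
    """Two-stage direct plug-in bandwidth, 1-D data, Gaussian kernel (hypothetical helper)."""
    x = np.asarray(x, dtype=float).ravel()
    m = x.size
    sigma = x.std()  # assumption: plain standard deviation as the scale estimate
    diff = x[:, None] - x[None, :]  # all pairwise differences, shape (m, m)

    def phi(u):  # standard normal density
        return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

    def phi4(u):  # 4th derivative of the standard normal density
        return (u**4 - 6.0 * u**2 + 3.0) * phi(u)

    def phi6(u):  # 6th derivative of the standard normal density
        return (u**6 - 15.0 * u**4 + 45.0 * u**2 - 15.0) * phi(u)

    # Start: normal scale estimate of psi_8.
    psi8 = 105.0 / (32.0 * np.sqrt(np.pi) * sigma**9)

    # Stage 1: pilot bandwidth g1, then kernel estimate of psi_6.
    g1 = (-2.0 * phi6(0.0) / (psi8 * m)) ** (1.0 / 9.0)
    psi6 = phi6(diff / g1).sum() / (m**2 * g1**7)

    # Stage 2: pilot bandwidth g2, then kernel estimate of psi_4.
    g2 = (-2.0 * phi4(0.0) / (psi6 * m)) ** (1.0 / 7.0)
    psi4 = phi4(diff / g2).sum() / (m**2 * g2**5)

    # AMISE-optimal bandwidth with R(K) = 1 / (2 sqrt(pi)) and mu_2(K) = 1.
    return (1.0 / (2.0 * np.sqrt(np.pi) * psi4 * m)) ** 0.2
```

The sketch builds an m-by-m matrix of pairwise differences, so it is only practical for moderate sample sizes.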