Bandwidth selection
- kdelearn.bandwidth_selection.normal_reference(x_train: ndarray, weights_train: Optional[ndarray] = None, kernel_name: str = 'gaussian') → ndarray
AMISE-optimal bandwidth under the assumption that the underlying density is Gaussian.
See paragraph (3.2.1) in [1].
- Parameters
x_train (ndarray of shape (m_train, n)) – Data points of float type.
weights_train (ndarray of shape (m_train,), optional) – Weights of data points. If None, all data points are equally weighted.
kernel_name ({'gaussian', 'uniform', 'epanechnikov', 'cauchy'}, default='gaussian') – Name of kernel function.
- Returns
bandwidth – Smoothing parameter for scaling the estimator.
- Return type
ndarray of shape (n,)
Examples
>>> import numpy as np
>>> from kdelearn.bandwidth_selection import normal_reference
>>> x_train = np.random.normal(0, 1, size=(100, 1))
>>> bandwidth = normal_reference(x_train, kernel_name="gaussian")
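To make the rule concrete, here is a minimal NumPy sketch of the normal scale bandwidth from Wand and Jones, h = [8 sqrt(pi) R(K) / (3 mu_2(K)^2 m_train)]^(1/5) * sigma, applied to each dimension separately. It is not kdelearn's implementation: the function name normal_reference_sketch, the weighted population standard deviation, the per-dimension treatment, and the kernel constants (uniform and Epanechnikov taken on [-1, 1], Cauchy omitted) are all assumptions made for illustration.

```python
import numpy as np

# (R(K), mu_2(K)) for each kernel: roughness and second moment.
# Assumption: uniform and Epanechnikov kernels are supported on [-1, 1];
# the library may parametrize its kernels differently.
_KERNEL_CONSTANTS = {
    "gaussian": (1.0 / (2.0 * np.sqrt(np.pi)), 1.0),
    "uniform": (0.5, 1.0 / 3.0),
    "epanechnikov": (0.6, 0.2),
}


def normal_reference_sketch(x_train, weights_train=None, kernel_name="gaussian"):
    """Hypothetical re-implementation of the normal scale rule, one bandwidth per column."""
    x_train = np.asarray(x_train, dtype=float)
    m_train = x_train.shape[0]

    # Equal weights by default; otherwise normalize the weights to sum to 1.
    if weights_train is None:
        weights_train = np.full(m_train, 1.0 / m_train)
    else:
        weights_train = np.asarray(weights_train, dtype=float)
        weights_train = weights_train / weights_train.sum()

    # Weighted mean and (population) standard deviation of each dimension.
    mean = weights_train @ x_train
    std = np.sqrt(weights_train @ (x_train - mean) ** 2)

    roughness, second_moment = _KERNEL_CONSTANTS[kernel_name]
    factor = (
        8.0 * np.sqrt(np.pi) * roughness / (3.0 * second_moment**2 * m_train)
    ) ** 0.2
    return factor * std


rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(100, 1))
h = normal_reference_sketch(x)  # ndarray of shape (1,)
```

For the Gaussian kernel the factor reduces to (4 / (3 * m_train))^(1/5), the familiar "1.06 * sigma * m_train^(-1/5)" rule of thumb.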
References
[1] Wand, M. P. and Jones, M. C. Kernel Smoothing. Chapman and Hall, 1995.
- kdelearn.bandwidth_selection.direct_plugin(x_train: ndarray, weights_train: Optional[ndarray] = None, kernel_name: str = 'gaussian', stage: int = 2)
Direct plug-in bandwidth selector. The integrated squared density derivatives are estimated with a Gaussian kernel, and the number of stages is limited to at most 3.
See paragraph (3.6.1) in [1].
- Parameters
x_train (ndarray of shape (m_train, n)) – Data points of float type.
weights_train (ndarray of shape (m_train,), optional) – Weights of data points. If None, all data points are equally weighted.
kernel_name ({'gaussian', 'uniform', 'epanechnikov', 'cauchy'}, default='gaussian') – Name of kernel function.
stage (int, default=2) – Number of plug-in stages (at most 3).
- Returns
bandwidth – Smoothing parameter for scaling the estimator.
- Return type
ndarray of shape (n,)
Examples
>>> import numpy as np
>>> from kdelearn.bandwidth_selection import direct_plugin
>>> x_train = np.random.normal(0, 1, size=(100, 1))
>>> bandwidth = direct_plugin(x_train, kernel_name="gaussian", stage=2)
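As an illustration of what "stage" means, the following is a minimal sketch of the textbook two-stage direct plug-in rule from Wand and Jones for univariate, equally weighted data and a Gaussian kernel. It is not kdelearn's implementation: the names gaussian_derivative, psi_hat and direct_plugin_sketch are hypothetical, and weights, other kernels, other stage values and multiple dimensions are deliberately left out.

```python
import numpy as np

SQRT_2PI = np.sqrt(2.0 * np.pi)


def gaussian_derivative(u, order):
    """4th or 6th derivative of the standard normal density (Hermite polynomial times phi)."""
    phi = np.exp(-0.5 * u**2) / SQRT_2PI
    if order == 4:
        return (u**4 - 6.0 * u**2 + 3.0) * phi
    if order == 6:
        return (u**6 - 15.0 * u**4 + 45.0 * u**2 - 15.0) * phi
    raise ValueError("only orders 4 and 6 are needed for two stages")


def psi_hat(x, order, g):
    """Kernel estimate of psi_r, the integral of f^(r) * f, using pilot bandwidth g."""
    diffs = (x[:, None] - x[None, :]) / g  # all pairwise scaled differences
    return gaussian_derivative(diffs, order).sum() / (x.size**2 * g ** (order + 1))


def direct_plugin_sketch(x):
    """Hypothetical two-stage direct plug-in bandwidth for 1-D data."""
    x = np.asarray(x, dtype=float).ravel()
    n = x.size
    sigma = x.std(ddof=1)

    # Stage 0: normal scale estimate of psi_8.
    psi8 = 105.0 / (32.0 * np.sqrt(np.pi) * sigma**9)

    # Stage 1: pilot bandwidth from psi_8, then estimate psi_6.
    # Uses phi^(6)(0) = -15 / sqrt(2 pi), so -2 * phi^(6)(0) = 30 / sqrt(2 pi).
    g1 = (30.0 / (SQRT_2PI * psi8 * n)) ** (1.0 / 9.0)
    psi6 = psi_hat(x, 6, g1)  # negative for smooth densities

    # Stage 2: pilot bandwidth from psi_6, then estimate psi_4.
    # Uses phi^(4)(0) = 3 / sqrt(2 pi), so -2 * phi^(4)(0) = -6 / sqrt(2 pi).
    g2 = (-6.0 / (SQRT_2PI * psi6 * n)) ** (1.0 / 7.0)
    psi4 = psi_hat(x, 4, g2)

    # AMISE-optimal bandwidth with R(K) = 1 / (2 sqrt(pi)) and mu_2(K) = 1.
    return (1.0 / (2.0 * np.sqrt(np.pi) * psi4 * n)) ** 0.2


rng = np.random.default_rng(0)
sample = rng.normal(0.0, 1.0, size=200)
h_dpi = direct_plugin_sketch(sample)
```

Each additional stage pushes the normal scale assumption one derivative further away from the final bandwidth, at the cost of one more pairwise kernel sum over the data.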
References
[1] Wand, M. P. and Jones, M. C. Kernel Smoothing. Chapman and Hall, 1995.