module feature_selection.fires

FIRES Feature Selection Method.

This module contains the Fast, Interpretable and Robust Evaluation and Selection of features (FIRES) method with a Probit base model and normally distributed parameters, as introduced by: HAUG, Johannes, et al. Leveraging model inherent variable importance for stable online feature selection. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020. pp. 1478-1502. URL: https://dl.acm.org/doi/abs/10.1145/3394486.3403200

Copyright (C) 2022 Johannes Haug.


class FIRES

FIRES feature selector.

method FIRES.__init__

__init__(
    n_total_features: int,
    n_selected_features: int,
    classes: list,
    mu_init: Union[int, numpy.typing.ArrayLike] = 0,
    sigma_init: Union[int, numpy.typing.ArrayLike] = 1,
    penalty_s: float = 0.01,
    penalty_r: float = 0.01,
    epochs: int = 1,
    lr_mu: float = 0.01,
    lr_sigma: float = 0.01,
    scale_weights: bool = True,
    reset_after_drift: bool = False,
    baseline: str = 'constant',
    ref_sample: Union[float, numpy.typing.ArrayLike] = 0
)

Initializes the feature selector.

Args:

  • n_total_features: The total number of features.
  • n_selected_features: The number of selected features.
  • classes: A list of unique target values (class labels).
  • mu_init: Initial importance, i.e. mean of the parameter distribution. One may either set the initial values separately per feature (by providing a vector), or use the same initial value for all features (by providing a scalar).
  • sigma_init: Initial uncertainty, i.e. standard deviation of the parameter distribution. One may either set the initial values separately per feature (by providing a vector), or use the same initial value for all features (by providing a scalar).
  • penalty_s: Penalty factor in the optimization of weights with respect to the uncertainty (corresponds to gamma_s in the paper).
  • penalty_r: Penalty factor in the optimization of weights for the regularization (corresponds to gamma_r in the paper).
  • epochs: Number of epochs in each update iteration.
  • lr_mu: Learning rate for the gradient update of the mean.
  • lr_sigma: Learning rate for the gradient update of the standard deviation.
  • scale_weights: If True, scale feature weights into the range [0,1]. If False, do not scale weights.
  • reset_after_drift: A boolean indicating whether the feature selector is reset after a drift has been detected.
  • baseline: A string identifier of the baseline method. The baseline is the value substituted for non-selected features. This is necessary because most online learning models cannot handle arbitrary patterns of missing data.
  • ref_sample: A sample used to compute the baseline. If the constant baseline is used, one needs to provide a single float value.
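
The following is a minimal, self-contained sketch of three ideas described in the arguments above: expanding a scalar mu_init or sigma_init to one value per feature, scaling weights into [0, 1] (as scale_weights describes), and substituting a constant baseline for non-selected features. The helper names expand_init, min_max_scale and apply_constant_baseline are hypothetical and are not part of this module. The sketch uses plain Python lists rather than NumPy arrays and does not claim to mirror the library's internals.

```python
from typing import List, Sequence, Union

Number = Union[int, float]


def expand_init(value: Union[Number, Sequence[Number]],
                n_total_features: int) -> List[float]:
    """Hypothetical helper: one initial value per feature.

    A scalar is repeated for every feature; a vector must already
    contain exactly n_total_features entries.
    """
    if isinstance(value, (int, float)):
        return [float(value)] * n_total_features
    values = [float(v) for v in value]
    if len(values) != n_total_features:
        raise ValueError("expected one initial value per feature")
    return values


def min_max_scale(weights: Sequence[float]) -> List[float]:
    """Hypothetical helper: scale weights into the range [0, 1].

    Assumption of this sketch: identical weights are mapped to all zeros.
    """
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return [0.0] * len(weights)
    return [(w - lo) / (hi - lo) for w in weights]


def apply_constant_baseline(x: Sequence[float],
                            selected: Sequence[int],
                            ref_sample: float = 0.0) -> List[float]:
    """Hypothetical helper: replace non-selected features of one observation
    with a constant value, so a downstream model sees no missing entries."""
    keep = set(selected)
    return [float(v) if i in keep else float(ref_sample)
            for i, v in enumerate(x)]


mu = expand_init(0, 3)                      # same value for all features
sigma = expand_init([1, 2, 3], 3)           # separate value per feature
scaled = min_max_scale([2.0, 4.0, 6.0])
masked = apply_constant_baseline([5.0, 6.0, 7.0], selected=[0, 2])
```

With the default ref_sample of 0, the second feature in the last call is replaced by 0.0 while the two selected features pass through unchanged.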

method FIRES.reset

reset()

Resets the feature selector.


method FIRES.weight_features

weight_features(
    X: numpy.typing.ArrayLike,
    y: numpy.typing.ArrayLike
)

Updates feature weights.

Args:

  • X: Array/matrix of observations.
  • y: Array of corresponding labels.
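
To make the roles of lr_mu and lr_sigma more concrete, here is a rough sketch of a single gradient-ascent step on the log marginal likelihood of a Probit model whose parameters are independent normal variables with means mu and standard deviations sigma. It is an illustration under stated assumptions, not this method's actual implementation: it processes one observation at a time, assumes labels encoded as -1 or +1, uses plain Python lists, and the function name probit_step is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def std_normal_pdf(z: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_step(x: Sequence[float], y: int,
                mu: Sequence[float], sigma: Sequence[float],
                lr_mu: float = 0.01,
                lr_sigma: float = 0.01) -> Tuple[List[float], List[float]]:
    """Hypothetical single-observation update (y assumed to be -1 or +1).

    Marginal likelihood: Phi(y * <x, mu> / rho),
    with rho = sqrt(1 + sum_j x_j^2 * sigma_j^2).
    """
    m = sum(xj * mj for xj, mj in zip(x, mu))
    rho = math.sqrt(1.0 + sum((xj * sj) ** 2 for xj, sj in zip(x, sigma)))
    z = y * m / rho
    # pdf / cdf is the derivative of log Phi(z); the floor only avoids
    # division by zero and is a simplification made for this sketch.
    ratio = std_normal_pdf(z) / max(std_normal_cdf(z), 1e-12)
    # d z / d mu_j = y * x_j / rho
    new_mu = [mj + lr_mu * ratio * y * xj / rho for xj, mj in zip(x, mu)]
    # d z / d sigma_j = -y * <x, mu> * x_j^2 * sigma_j / rho^3
    new_sigma = [sj - lr_sigma * ratio * y * m * xj * xj * sj / rho ** 3
                 for xj, sj in zip(x, sigma)]
    return new_mu, new_sigma


# One positive example that only activates the first feature.
step_mu, step_sigma = probit_step([1.0, 0.0], 1, [0.0, 0.0], [1.0, 1.0])
```

In this example the mean of the first feature moves slightly in the direction of the label, the inactive second feature is untouched, and sigma does not change because the inner product of x and mu is zero before the step.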
