module feature_selection.efs

Extremal Feature Selection Method.

This module contains the Extremal Feature Selection model introduced by: CARVALHO, Vitor R.; COHEN, William W. Single-pass online learning: Performance, voting schemes and online feature selection. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining. 2006. pp. 548-553.

Copyright (C) 2022 Johannes Haug.


class EFS

Extremal feature selector.

This feature selection algorithm uses the weights of a Modified Balanced Winnow classifier.
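As a rough illustration of how Winnow weights can drive feature selection, the sketch below scores each feature by the absolute difference between its positive and negative weight and keeps the highest-scoring indices. The helper name `select_top_features` and the plain top-k rule are assumptions made for this example; they are not part of the documented API.

```python
from typing import List, Sequence


def select_top_features(u: Sequence[float], v: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k features with the largest |u_j - v_j|.

    Hypothetical helper: u and v stand for the positive and negative
    Winnow weight vectors; the top-k rule is an illustrative assumption.
    """
    scores = [abs(u_j - v_j) for u_j, v_j in zip(u, v)]
    # Rank indices by descending score; sorted() is stable, so ties
    # keep their original order.
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    # Report the selected indices in ascending order.
    return sorted(ranked[:k])


u = [2.0, 2.0, 6.0, 2.0]
v = [1.0, 3.5, 1.0, 1.8]
print(select_top_features(u, v, k=2))  # prints [1, 2]
```

Here `k` plays the role of `n_selected_features`, and the length of `u` and `v` corresponds to `n_total_features`.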

method EFS.__init__

__init__(
    n_total_features: int,
    n_selected_features: int,
    u: Optional[numpy.typing.ArrayLike] = None,
    v: Optional[numpy.typing.ArrayLike] = None,
    theta: float = 1,
    M: float = 1,
    alpha: float = 1.5,
    beta: float = 0.5,
    reset_after_drift: bool = False,
    baseline: str = 'constant',
    ref_sample: Union[float, numpy.typing.ArrayLike] = 0
)

Initializes the feature selector.

Args:

  • n_total_features: The total number of features.
  • n_selected_features: The number of selected features.
  • u: Initial positive model weights of the Winnow algorithm.
  • v: Initial negative model weights of the Winnow algorithm.
  • theta: Threshold parameter.
  • M: Margin parameter.
  • alpha: Promotion parameter.
  • beta: Demotion parameter.
  • reset_after_drift: A boolean indicating whether the change detector is reset after a drift has been detected.
  • baseline: A string identifier of the baseline method. The baseline is the value substituted for non-selected features. This substitution is necessary because most online learning models cannot handle arbitrary patterns of missing data.
  • ref_sample: A sample used to compute the baseline. If the constant baseline is used, one needs to provide a single float value.
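The role of the constant baseline can be sketched as follows: features that were not selected are overwritten with a single reference value, so a downstream model still receives a complete input vector. The function name `apply_constant_baseline` is hypothetical and only mirrors the description of `baseline` and `ref_sample` above; the actual class may perform this step differently.

```python
from typing import List, Sequence


def apply_constant_baseline(
    x: Sequence[float], selected: Sequence[int], ref_value: float = 0.0
) -> List[float]:
    """Replace every non-selected feature of x with ref_value.

    Hypothetical helper illustrating a constant baseline; ref_value
    stands in for a float passed as ref_sample.
    """
    keep = set(selected)
    return [x_j if j in keep else ref_value for j, x_j in enumerate(x)]


x = [0.3, 1.2, 0.7, 2.5]
print(apply_constant_baseline(x, selected=[1, 3]))  # prints [0.0, 1.2, 0.0, 2.5]
```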

method EFS.reset

reset()

Resets the feature selector.


method EFS.weight_features

weight_features(
    X: numpy.typing.ArrayLike,
    y: numpy.typing.ArrayLike
)

Updates feature weights.

Args:

  • X: Array/matrix of observations.
  • y: Array of corresponding labels.
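For intuition, a single weight update in the style of the Modified Balanced Winnow rule from the cited paper might look like the sketch below. It assumes that the observation is already scaled to [0, 1] and that the label is encoded as +1 or -1. The function `winnow_update` is a hypothetical, dependency-free illustration, not the method's actual implementation.

```python
from typing import List, Sequence, Tuple


def winnow_update(
    x: Sequence[float],
    y: int,
    u: Sequence[float],
    v: Sequence[float],
    theta: float = 1.0,
    margin: float = 1.0,
    alpha: float = 1.5,
    beta: float = 0.5,
) -> Tuple[List[float], List[float]]:
    """One Modified-Balanced-Winnow-style step for a single observation.

    Assumes x is scaled to [0, 1] and y is +1 or -1. Returns updated
    copies of the positive (u) and negative (v) weight vectors.
    """
    score = sum(x_j * (u_j - v_j) for x_j, u_j, v_j in zip(x, u, v)) - theta
    new_u, new_v = list(u), list(v)
    # Weights change only when the margin condition is violated.
    if y * score > margin:
        return new_u, new_v
    for j, x_j in enumerate(x):
        if x_j <= 0:
            continue  # inactive features are left untouched
        if y > 0:
            new_u[j] *= alpha * (1 + x_j)  # promote positive model
            new_v[j] *= beta * (1 - x_j)   # demote negative model
        else:
            new_u[j] *= beta * (1 - x_j)   # demote positive model
            new_v[j] *= alpha * (1 + x_j)  # promote negative model
    return new_u, new_v


new_u, new_v = winnow_update([0.0, 0.5, 1.0], 1, [2.0, 2.0, 2.0], [1.0, 1.0, 1.0])
print(new_u)  # prints [2.0, 4.5, 6.0]
print(new_v)  # prints [1.0, 0.25, 0.0]
```

The default values of `theta`, `margin`, `alpha`, and `beta` match the defaults of `theta`, `M`, `alpha`, and `beta` listed for `__init__`. Feature weights can then be derived from the gap between the two vectors, for example as |u - v|.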
