module feature_selection.efs
Extremal Feature Selection Method.
This module contains the Extremal Feature Selection model introduced by: CARVALHO, Vitor R.; COHEN, William W. Single-pass online learning: Performance, voting schemes and online feature selection. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining. 2006. pp. 548-553.
Copyright (C) 2022 Johannes Haug.
class EFS
Extremal feature selector.
This feature selection algorithm uses the weights of a Modified Balanced Winnow classifier.
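As a rough illustration of how classifier weights can be turned into a feature ranking, the sketch below assumes that a feature's score is the absolute difference between its positive and negative Winnow weights, and that the highest-scoring features are kept. The helper name `score_and_select` is hypothetical and not part of this module; the selection rule actually used by `EFS` may differ.

```python
import numpy as np


def score_and_select(u, v, n_selected_features):
    """Hypothetical helper: score features by |u - v| and keep the top scorers."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    # Assumption: importance is the absolute gap between positive and negative weights.
    scores = np.abs(u - v)
    # argsort is ascending, so reverse it and keep the first n entries.
    selected = np.argsort(scores)[::-1][:n_selected_features]
    return scores, np.sort(selected)


u = [2.0, 1.0, 0.5, 4.0]  # positive model weights (made-up values)
v = [2.0, 3.0, 0.4, 1.0]  # negative model weights (made-up values)
scores, selected = score_and_select(u, v, n_selected_features=2)
print(selected)
```

Features whose positive and negative weights have drifted far apart receive high scores; features whose two weights stay close receive low ones.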
method EFS.__init__
__init__(
n_total_features: int,
n_selected_features: int,
u: Optional[ArrayLike] = None,
v: Optional[ArrayLike] = None,
theta: float = 1,
M: float = 1,
alpha: float = 1.5,
beta: float = 0.5,
reset_after_drift: bool = False,
baseline: str = 'constant',
ref_sample: Union[float, ArrayLike] = 0
)
Initializes the feature selector.
Args:
n_total_features: The total number of features.
n_selected_features: The number of selected features.
u: Initial positive model weights of the Winnow algorithm.
v: Initial negative model weights of the Winnow algorithm.
theta: Threshold parameter.
M (float): Margin parameter.
alpha (float): Promotion parameter.
beta (float): Demotion parameter.
reset_after_drift: A boolean indicating whether the change detector is reset after a drift has been detected.
baseline: A string identifier of the baseline method. The baseline is the value substituted for non-selected features. This is necessary because most online learning models cannot handle arbitrary patterns of missing data.
ref_sample: A sample used to compute the baseline. If the constant baseline is used, a single float value must be provided.
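To make the role of `baseline` and `ref_sample` concrete, here is a minimal sketch of a constant baseline, assuming non-selected columns are simply overwritten with one float value. The function `apply_constant_baseline` and its `selected` argument are hypothetical names used for illustration; they are not part of the documented interface.

```python
import numpy as np


def apply_constant_baseline(X, selected, ref_sample=0.0):
    """Hypothetical helper: keep selected columns, fill the rest with ref_sample."""
    X = np.asarray(X, dtype=float)
    # Start from a matrix filled with the constant baseline value.
    X_out = np.full_like(X, ref_sample)
    # Copy over only the columns of the selected features.
    X_out[:, selected] = X[:, selected]
    return X_out


X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
X_masked = apply_constant_baseline(X, selected=[0, 2], ref_sample=0.0)
print(X_masked)
```

The output keeps the input's shape, so a downstream online learner still sees a value for every feature.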
method EFS.reset
reset()
Resets the feature selector.
method EFS.weight_features
weight_features(
X: ArrayLike,
y: ArrayLike
)
Updates feature weights.
Args:
X: Array/matrix of observations.
y: Array of corresponding labels.
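The following sketch shows one way a Winnow-style weight update could use the `theta`, `M`, `alpha` and `beta` parameters listed for the constructor. It assumes labels in {-1, +1}, a margin-triggered update, and multiplicative promotion and demotion. The function `mbw_update` is a hypothetical illustration, not the code behind `weight_features`, and the exact rule in the referenced paper or in this module may differ.

```python
import numpy as np


def mbw_update(u, v, x, y, theta=1.0, M=1.0, alpha=1.5, beta=0.5):
    """Hypothetical single-sample update; y is assumed to be -1 or +1."""
    u, v, x = (np.asarray(a, dtype=float) for a in (u, v, x))
    # Score from the positive and negative weight vectors, shifted by the threshold.
    score = u @ x - v @ x - theta
    # Update only on a mistake or when the margin M is not reached.
    if y * score <= M:
        if y > 0:
            u = u * alpha * (1 + x)  # promote positive weights
            v = v * beta * (1 - x)   # demote negative weights
        else:
            u = u * beta * (1 - x)   # demote positive weights
            v = v * alpha * (1 + x)  # promote negative weights
    return u, v


u_new, v_new = mbw_update(u=[2.0, 2.0], v=[1.0, 1.0], x=[0.5, 0.0], y=1)
print(u_new, v_new)
```

In this sketch the feature with a non-zero value is promoted more strongly than the inactive one, which is what lets the gap between `u` and `v` carry information about feature relevance.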