module pipeline.holdout_pipeline
Periodic Holdout Pipeline.
This module implements a pipeline following the periodic holdout evaluation strategy.
Copyright (C) 2022 Johannes Haug.
class HoldoutPipeline
Pipeline class for periodic holdout evaluation.
Attributes:
test_set
(Tuple[ArrayLike, ArrayLike] | None): A tuple containing the initial test observations and labels used for the holdout evaluation.
test_replace_interval
(int | None): The interval at which the oldest test observation is replaced. For example, if test_replace_interval=10, every 10th observation replaces the currently oldest test observation. Test observations are not used for training, so this interval should not be chosen too small. If None, the complete batch at testing time is used in the evaluation.
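The replacement rule described for test_replace_interval can be illustrated with a minimal sketch. This is not the library's implementation: the helper name split_holdout_stream and the list-of-tuples data layout are assumptions made for illustration, and the sketch counts individual observations, whereas the actual pipeline may count differently (e.g. per batch).

```python
from typing import List, Optional, Tuple

Observation = Tuple[float, int]  # (feature, label); toy stand-in for array data


def split_holdout_stream(
    stream: List[Observation],
    test_set: List[Observation],
    test_replace_interval: Optional[int],
) -> Tuple[List[Observation], List[Observation]]:
    """Hypothetical helper: route every k-th observation into the test set.

    Returns the observations left for training and the final test set.
    """
    test_set = list(test_set)  # copy, so the caller's list is not modified
    train: List[Observation] = []
    for i, obs in enumerate(stream, start=1):
        if test_replace_interval is not None and i % test_replace_interval == 0:
            test_set.pop(0)       # drop the oldest test observation
            test_set.append(obs)  # the new observation becomes the newest one
        else:
            train.append(obs)     # all other observations stay available for training
    return train, test_set


stream = [(float(i), i % 2) for i in range(1, 21)]
initial_test_set = [(-1.0, 0), (-2.0, 1)]
train, final_test_set = split_holdout_stream(stream, initial_test_set, 10)
```

With test_replace_interval=10 and 20 streamed observations, the 10th and 20th observations move into the test set and the remaining 18 stay available for training, which shows why a very small interval would starve the model of training data.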
method HoldoutPipeline.__init__
__init__(
data_loader: float.data.data_loader.DataLoader,
predictor: Union[float.prediction.base_predictor.BasePredictor, List[float.prediction.base_predictor.BasePredictor]],
prediction_evaluator: float.prediction.evaluation.prediction_evaluator.PredictionEvaluator,
change_detector: Optional[float.change_detection.base_change_detector.BaseChangeDetector] = None,
change_detection_evaluator: Optional[float.change_detection.evaluation.change_detection_evaluator.ChangeDetectionEvaluator] = None,
feature_selector: Optional[float.feature_selection.base_feature_selector.BaseFeatureSelector] = None,
feature_selection_evaluator: Optional[float.feature_selection.evaluation.feature_selection_evaluator.FeatureSelectionEvaluator] = None,
batch_size: int = 1,
n_pretrain: int = 100,
n_max: int = inf,
label_delay_range: Optional[tuple] = None,
test_set: Optional[Tuple[ArrayLike, ArrayLike]] = None,
test_interval: int = 10,
test_replace_interval: Optional[int] = None,
estimate_memory_alloc: bool = False,
random_state: int = 0
)
Initializes the pipeline.
Args:
data_loader
: Data loader object.
predictor
: Predictor object or list of predictor objects.
prediction_evaluator
: Evaluator object for the predictive model(s).
change_detector
: Concept drift detection model.
change_detection_evaluator
: Evaluator for active concept drift detection.
feature_selector
: Online feature selection model.
feature_selection_evaluator
: Evaluator for the online feature selection.
batch_size
: Batch size, i.e. the number of observations drawn from the data loader at one time step.
n_pretrain
: Number of observations used for the initial training of the predictive model.
n_max
: Maximum number of observations used in the evaluation.
label_delay_range
: The minimum and maximum delay in the availability of labels, in time steps. The delay is sampled uniformly from this range.
test_set
: A tuple containing the initial test observations and labels used for the holdout evaluation.
test_interval
: The interval (in time steps) at which the online learning models are evaluated.
test_replace_interval
: The interval at which the oldest test observation is replaced. For example, if test_replace_interval=10, every 10th observation replaces the currently oldest test observation. Test observations are not used for training, so this interval should not be chosen too small. If None, the complete batch at testing time is used in the evaluation.
estimate_memory_alloc
: Boolean that indicates whether the method-wise change in allocated memory (GB) is monitored. This gives only an approximate indication of memory consumption and can significantly increase the total run time of the pipeline.
random_state
: An integer seed used to initialize the random number generator.
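As an illustration of how label_delay_range and random_state could interact, the following sketch draws one uniformly distributed delay per observation from a seeded generator. The helper sample_label_delays is hypothetical and not part of the library. Treating both bounds as inclusive, and treating None as "no delay", are assumptions that this documentation does not confirm.

```python
import random
from typing import List, Optional, Tuple


def sample_label_delays(
    n_observations: int,
    label_delay_range: Optional[Tuple[int, int]],
    random_state: int = 0,
) -> List[int]:
    """Hypothetical helper: draw one label delay (in time steps) per observation."""
    if label_delay_range is None:
        # Assumption: without a range, labels are available immediately.
        return [0] * n_observations
    rng = random.Random(random_state)  # seeded generator for reproducibility
    low, high = label_delay_range
    # Assumption: both bounds are inclusive.
    return [rng.randint(low, high) for _ in range(n_observations)]


delays = sample_label_delays(50, (2, 5), random_state=0)
```

Because the generator is seeded, calling the helper again with the same random_state reproduces the same sequence of delays.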
method HoldoutPipeline.run
run()
Runs the pipeline.
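The periodic holdout strategy itself can be sketched in a few lines. The code below is a simplified, self-contained illustration and not the body of HoldoutPipeline.run: the MajorityClassPredictor class and the run_periodic_holdout function are hypothetical stand-ins, and only the parameter names batch_size, n_pretrain, test_interval and test_set are taken from the signature above. It evaluates first and then trains at each evaluated time step, which is one possible ordering.

```python
from collections import Counter
from typing import List, Tuple

Observation = Tuple[float, int]  # (feature, label); toy stand-in for array data


class MajorityClassPredictor:
    """Hypothetical toy model: predicts the most frequent label seen so far."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def partial_fit(self, batch: List[Observation]) -> None:
        self.counts.update(label for _, label in batch)

    def predict(self, x: float) -> int:
        # Falls back to label 0 before any training data has been seen.
        return self.counts.most_common(1)[0][0] if self.counts else 0


def run_periodic_holdout(
    stream: List[Observation],
    test_set: List[Observation],
    batch_size: int = 1,
    n_pretrain: int = 100,
    test_interval: int = 10,
) -> List[float]:
    """Return the holdout accuracy measured at every test_interval-th time step."""
    model = MajorityClassPredictor()
    model.partial_fit(stream[:n_pretrain])  # initial training
    accuracies: List[float] = []
    time_step = 0
    for start in range(n_pretrain, len(stream), batch_size):
        batch = stream[start:start + batch_size]
        time_step += 1
        if time_step % test_interval == 0:
            # Evaluate on the fixed holdout set, never on training data.
            correct = sum(model.predict(x) == y for x, y in test_set)
            accuracies.append(correct / len(test_set))
        model.partial_fit(batch)  # incremental update with the new batch
    return accuracies


stream = [(float(i), 1) for i in range(30)]
test_set = [(0.0, 1), (1.0, 1), (2.0, 0), (3.0, 1)]
accuracies = run_periodic_holdout(stream, test_set, n_pretrain=10, test_interval=5)
```

Here 10 observations are used for pretraining and the remaining 20 are processed one per time step, so the model is evaluated at time steps 5, 10, 15 and 20.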
This file was automatically generated via lazydocs.