module pipeline.base_pipeline

Base Pipeline.

This module provides functionality to construct a pipeline and run experiments in a standardized and modular fashion. The abstract BasePipeline class should be used as the superclass for all specific evaluation pipelines.

Copyright (C) 2022 Johannes Haug.


class BasePipeline

Abstract base class for evaluation pipelines.

Attributes:

  • data_loader (DataLoader): Data loader object.
  • predictors (List[BasePredictor]): Predictive model(s).
  • prediction_evaluators (List[PredictionEvaluator]): Evaluator(s) for the predictive model(s).
  • change_detector (BaseChangeDetector | None): Concept drift detection model.
  • change_detection_evaluator (ChangeDetectionEvaluator | None): Evaluator for active concept drift detection.
  • feature_selector (BaseFeatureSelector | None): Online feature selection model.
  • feature_selection_evaluator (FeatureSelectionEvaluator | None): Evaluator for the online feature selection.
  • batch_size (int | None): Batch size, i.e. the number of observations drawn from the data loader at each time step.
  • n_pretrain (int | None): Number of observations used for the initial training of the predictive model.
  • n_max (int | None): Maximum number of observations used in the evaluation.
  • label_delay_range (tuple | None): The min and max delay in the availability of labels in time steps. The delay is sampled uniformly from this range.
  • estimate_memory_alloc (bool): Whether to monitor the method-wise change in allocated memory (GB). This gives only an approximate indication of memory consumption and can significantly increase the total run time of the pipeline.
  • test_interval (int): The interval/frequency at which the online learning models are evaluated. This parameter is always 1 for a prequential or distributed fold evaluation.
  • rng (Generator): A numpy random number generator object.
  • start_time (float): Physical start time.
  • time_step (int): Current logical time step, i.e. iteration.
  • n_total (int): Total number of observations processed so far.
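
The uniform sampling of the label delay described for label_delay_range can be illustrated with a minimal sketch. The helper name sample_label_delay is hypothetical and not part of the library. The sketch assumes that both bounds are inclusive and that a missing range means no delay; neither is stated above. It also uses the standard library's random.Random as a stand-in for the numpy Generator held in rng.

```python
import random
from typing import Optional, Tuple


def sample_label_delay(
    label_delay_range: Optional[Tuple[int, int]],
    rng: random.Random,
) -> int:
    """Return a label delay in time steps (hypothetical helper)."""
    if label_delay_range is None:
        # Assumption: no range means labels are available immediately.
        return 0
    low, high = label_delay_range
    if low > high:
        raise ValueError("label_delay_range must be (min, max) with min <= max")
    # Assumption: both bounds are inclusive.
    return rng.randint(low, high)


# Seeding the generator makes the sampled delays reproducible,
# analogous to passing random_state to the pipeline.
rng = random.Random(0)
delays = [sample_label_delay((2, 5), rng) for _ in range(1000)]
```

Every value in delays lies between 2 and 5 time steps. Creating a second generator with the same seed reproduces the same sequence.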

method BasePipeline.__init__

__init__(
    data_loader: float.data.data_loader.DataLoader,
    predictor: Union[float.prediction.base_predictor.BasePredictor, List[float.prediction.base_predictor.BasePredictor]],
    prediction_evaluator: float.prediction.evaluation.prediction_evaluator.PredictionEvaluator,
    change_detector: Optional[float.change_detection.base_change_detector.BaseChangeDetector],
    change_detection_evaluator: Optional[float.change_detection.evaluation.change_detection_evaluator.ChangeDetectionEvaluator],
    feature_selector: Optional[float.feature_selection.base_feature_selector.BaseFeatureSelector],
    feature_selection_evaluator: Optional[float.feature_selection.evaluation.feature_selection_evaluator.FeatureSelectionEvaluator],
    batch_size: int,
    n_pretrain: int,
    n_max: int,
    label_delay_range: Optional[tuple],
    test_interval: int,
    estimate_memory_alloc: bool,
    random_state: int
)

Initializes the pipeline.

Args:

  • data_loader: Data loader object.
  • predictor: Predictor object or list of predictor objects.
  • prediction_evaluator: Evaluator object for the predictive model(s).
  • change_detector: Concept drift detection model.
  • change_detection_evaluator: Evaluator for active concept drift detection.
  • feature_selector: Online feature selection model.
  • feature_selection_evaluator: Evaluator for the online feature selection.
  • batch_size: Batch size, i.e. the number of observations drawn from the data loader at each time step.
  • n_pretrain: Number of observations used for the initial training of the predictive model.
  • n_max: Maximum number of observations used in the evaluation.
  • label_delay_range: The min and max delay in the availability of labels in time steps. The delay is sampled uniformly from this range.
  • test_interval: The interval/frequency at which the online learning models are evaluated. This parameter is always 1 for a prequential evaluation.
  • estimate_memory_alloc: Whether to monitor the method-wise change in allocated memory (GB). This gives only an approximate indication of memory consumption and can significantly increase the total run time of the pipeline.
  • random_state: A random integer seed used to specify a random number generator.

Raises:

  • AttributeError: If one of the provided objects is not valid.
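
The predictor argument accepts either a single predictor or a list, while the predictors attribute is always a list. The sketch below shows one way such normalization and validation might look. It is not the library's implementation: the BasePredictor class here is an empty stand-in, and as_predictor_list is a hypothetical helper name.

```python
from typing import List, Union


class BasePredictor:
    """Hypothetical stand-in for float.prediction.base_predictor.BasePredictor."""


def as_predictor_list(
    predictor: Union[BasePredictor, List[BasePredictor]],
) -> List[BasePredictor]:
    """Normalize a single predictor or a list of predictors to a list."""
    predictors = predictor if isinstance(predictor, list) else [predictor]
    for p in predictors:
        if not isinstance(p, BasePredictor):
            # Mirrors the documented AttributeError for invalid objects.
            raise AttributeError(f"{p!r} is not a valid predictor object.")
    return predictors


single = as_predictor_list(BasePredictor())
several = as_predictor_list([BasePredictor(), BasePredictor()])
```

Under these assumptions, single holds one predictor and several holds two, and passing any other object raises AttributeError.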

method BasePipeline.run

run()

Runs the pipeline.

This method is implemented separately for each evaluation strategy, i.e. by each pipeline class that inherits from BasePipeline.
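
The relationship between an abstract base class and a strategy-specific run() can be sketched as follows. PipelineSketch and CountingPipeline are hypothetical, heavily reduced classes that only track time_step and n_total. They do not load data, train models, or reproduce the real constructor, and the loop logic is an assumption made for illustration.

```python
from abc import ABC, abstractmethod


class PipelineSketch(ABC):
    """Hypothetical, heavily reduced stand-in for BasePipeline."""

    def __init__(self, batch_size: int, n_max: int) -> None:
        self.batch_size = batch_size
        self.n_max = n_max
        self.time_step = 0  # Current logical time step.
        self.n_total = 0    # Observations processed so far.

    @abstractmethod
    def run(self) -> None:
        """Run the evaluation; each strategy supplies its own loop."""


class CountingPipeline(PipelineSketch):
    """Toy strategy that only counts batches until n_max is reached."""

    def run(self) -> None:
        while self.n_total < self.n_max:
            # The last batch may be smaller than batch_size.
            n_batch = min(self.batch_size, self.n_max - self.n_total)
            self.n_total += n_batch
            self.time_step += 1


pipeline = CountingPipeline(batch_size=4, n_max=10)
pipeline.run()
```

With batch_size=4 and n_max=10, this toy loop runs for three time steps (batches of 4, 4, and 2). Instantiating PipelineSketch directly raises TypeError because run() is abstract.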


This file was automatically generated via lazydocs.