module data.data_loader
Data Loader.
This module encapsulates functionality to load and preprocess input data. The data loader class uses the scikit-multiflow Stream class to simulate streaming data.
Copyright (C) 2022 Johannes Haug.
class DataLoader
Data Loader Class.
The data loader class is responsible for sampling and pre-processing (i.e. normalizing) input data, thereby simulating a data stream. It uses a skmultiflow Stream object to generate or load streaming data.
Attributes:
path (str | None): The path to a .csv file containing the training data set.
stream (Stream | None): A scikit-multiflow data stream object.
target_col (int): The index of the target column in the training data.
scaler (BaseScaler | None): A scaler object used to normalize/standardize sampled instances.
method DataLoader.__init__
__init__(
path: Optional[str] = None,
stream: Optional[skmultiflow.data.base_stream.Stream] = None,
target_col: int = -1,
scaler: Optional[float.data.preprocessing.base_scaler.BaseScaler] = None
)
Initializes the data loader.
The constructor must receive one of the following inputs: (1) the path to a .csv file (plus a target column index), which is then mapped to a skmultiflow FileStream object, or (2) a valid scikit-multiflow Stream object.
Args:
path: The path to a .csv file containing the training data set.
stream: A scikit-multiflow data stream object.
target_col: The index of the target column in the training data.
scaler: A scaler object used to normalize/standardize sampled instances.
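The "either a path or a stream" rule described for the constructor can be sketched as follows. This is a minimal, standard-library-only illustration and not the actual DataLoader implementation: the class name LoaderArgs, the source attribute, the error messages, and the choice that an explicit stream takes precedence when both inputs are given are all assumptions made for this example.

```python
from typing import Any, Optional


class LoaderArgs:
    """Hypothetical stand-in that mirrors the documented constructor arguments."""

    def __init__(
        self,
        path: Optional[str] = None,
        stream: Optional[Any] = None,
        target_col: int = -1,
        scaler: Optional[Any] = None,
    ) -> None:
        # Assumption: at least one data source must be supplied.
        if path is None and stream is None:
            raise ValueError("Provide either a .csv path or a Stream object.")
        # Assumption: a given path must point to a .csv file, as the text states.
        if path is not None and not path.endswith(".csv"):
            raise ValueError("path must point to a .csv file.")

        self.path = path
        self.stream = stream
        self.target_col = target_col
        self.scaler = scaler
        # Assumption: an explicit stream object wins when both inputs are given.
        self.source = "stream" if stream is not None else "file"
```

In the real class, the "file" case is where the path and target_col would be handed to a skmultiflow FileStream; the sketch only records which branch applies.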
method DataLoader.get_data
get_data(
n_batch: int
) → Tuple[ArrayLike, ArrayLike]
Loads a batch from the stream object.
Args:
n_batch: Number of samples to load from the data stream object.
Returns:
Tuple[ArrayLike, ArrayLike]: The sampled observations and corresponding targets.
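The batch-loading behavior described above (draw n_batch samples from the stream, optionally normalize them, return observations and targets) might look roughly like the sketch below. It is self-contained and uses only plain Python lists. ListStream is a hypothetical in-memory stand-in that imitates only the next_sample(batch_size) call of a scikit-multiflow Stream, and min_max_scale is an illustrative per-batch scaler, not the library's BaseScaler; whether the real scaler is fitted per batch or incrementally is not stated in this reference.

```python
from typing import Callable, List, Optional, Tuple

Row = List[float]


class ListStream:
    """Hypothetical in-memory stream; only mimics a next_sample(batch_size) call."""

    def __init__(self, X: List[Row], y: List[int]) -> None:
        self._X, self._y, self._pos = X, y, 0

    def next_sample(self, batch_size: int = 1) -> Tuple[List[Row], List[int]]:
        # Return the next batch_size rows and advance the read position.
        start = self._pos
        self._pos = min(self._pos + batch_size, len(self._X))
        return self._X[start:self._pos], self._y[start:self._pos]


def min_max_scale(X: List[Row]) -> List[Row]:
    """Scale each column of the batch to [0, 1]; constant columns map to 0.0."""
    if not X:
        return []
    cols = list(zip(*X))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in X
    ]


def get_data(
    stream: ListStream,
    n_batch: int,
    scale: Optional[Callable[[List[Row]], List[Row]]] = None,
) -> Tuple[List[Row], List[int]]:
    """Load n_batch samples from the stream and optionally normalize them."""
    X, y = stream.next_sample(n_batch)
    if scale is not None:
        X = scale(X)  # Assumption: normalization is applied to the sampled batch only.
    return X, y
```

A short usage example under the same assumptions:

```python
stream = ListStream([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]], [0, 1, 0])
X, y = get_data(stream, n_batch=2, scale=min_max_scale)
# X is the first two rows scaled column-wise, y holds their targets.
```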
This file was automatically generated via lazydocs.