module prediction.dynamic_model_tree

Dynamic Model Tree Classifier.

This module contains an implementation of the Dynamic Model Tree classification framework proposed in:

Haug, Johannes; Broelemann, Klaus; Kasneci, Gjergji. Dynamic Model Tree for Interpretable Data Stream Learning. In: 38th IEEE International Conference on Data Engineering, DOI: 10.1109/ICDE53745.2022.00237, 2022.

Copyright (C) 2022 Johannes Haug.


class DynamicModelTreeClassifier

Dynamic Model Tree Classifier.

This implementation of the DMT uses linear (logit) simple models and the negative log-likelihood loss, as described in the corresponding paper.

Attributes:

  • classes (List): List of the target classes.
  • learning_rate (float): Learning rate of the linear models.
  • penalty_term (float): Regularization term for the linear model (0 = no regularization penalty).
  • penalty (str): String identifier of the type of regularization used by the linear model. Either 'l1', 'l2', or 'elasticnet' (see documentation of sklearn SGDClassifier).
  • epsilon (float): Threshold required before attempting to split or prune based on the Akaike Information Criterion. The smaller the epsilon-threshold, the stronger the evidence for splitting/pruning must be. Choose 0 < epsilon <= 1.
  • n_saved_candidates (int): Max. number of saved split candidates per node.
  • p_replaceable_candidates (float): Max. percent of saved split candidates that can be replaced by new/better candidates per training iteration.
  • cat_features (List[int]): List of indices (pos. in the feature vector) corresponding to categorical features.
  • root (Node): Root node of the Dynamic Model Tree.

method DynamicModelTreeClassifier.__init__

__init__(
    classes: List,
    learning_rate: float = 0.05,
    penalty_term: float = 0,
    penalty: str = 'l2',
    epsilon: float = 1e-07,
    n_saved_candidates: int = 100,
    p_replaceable_candidates: float = 0.5,
    cat_features: Optional[List[int]] = None,
    reset_after_drift: Optional[bool] = False
)

Inits the DMT.

Args:

  • classes: List of the target classes.
  • learning_rate: Learning rate of the linear models.
  • penalty_term: Regularization term for the linear model (0 = no regularization penalty).
  • penalty: String identifier of the type of regularization used by the linear model. Either 'l1', 'l2', or 'elasticnet' (see documentation of sklearn SGDClassifier).
  • epsilon: Threshold required before attempting to split or prune based on the Akaike Information Criterion. The smaller the epsilon-threshold, the stronger the evidence for splitting/pruning must be. Choose 0 < epsilon <= 1.
  • n_saved_candidates: Max. number of saved split candidates per node.
  • p_replaceable_candidates: Max. percent of saved split candidates that can be replaced by new/better candidates per training iteration.
  • cat_features: List of indices (pos. in the feature vector) corresponding to categorical features.
  • reset_after_drift: A boolean indicating whether the predictor will be reset after a drift was detected. Note that the DMT automatically adjusts to concept drift and thus generally does not need to be reset.

method DynamicModelTreeClassifier.n_nodes

n_nodes() → Tuple[int, int, int]

Returns the number of nodes, leaves and the depth of the DMT.

Returns:

  • int: Total number of nodes.
  • int: Total number of leaves.
  • int: Depth (where a single root node has depth = 1).

method DynamicModelTreeClassifier.partial_fit

partial_fit(X: ArrayLike, y: ArrayLike)

Updates the predictor.

Args:

  • X: Array/matrix of observations.
  • y: Array of corresponding labels.
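Each update trains the linear (logit) models along the observation's path with the negative log-likelihood loss. As a rough illustration of what a single incremental step involves, here is one SGD update for binary logistic regression in plain Python — a simplified sketch, not the library's implementation (the actual DMT additionally maintains split-candidate statistics per node):

```python
import math
from typing import List, Tuple

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def sgd_step(w: List[float], b: float, x: List[float], y: int,
             learning_rate: float = 0.05,
             penalty_term: float = 0.0) -> Tuple[List[float], float]:
    """One gradient step on the negative log-likelihood of (x, y), y in {0, 1},
    with optional L2 penalty (penalty_term = 0 means no regularization)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    err = p - y  # gradient of the NLL w.r.t. the linear score
    w = [wi - learning_rate * (err * xi + penalty_term * wi)
         for wi, xi in zip(w, x)]
    b = b - learning_rate * err
    return w, b

# Repeated updates on a positive observation drive its probability toward 1.
w, b = [0.0, 0.0], 0.0
for _ in range(200):
    w, b = sgd_step(w, b, [1.0, 2.0], 1)
p = sigmoid(w[0] * 1.0 + w[1] * 2.0 + b)
print(round(p, 3))
```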

method DynamicModelTreeClassifier.predict

predict(X: ArrayLike) → ArrayLike

Predicts the target values.

Args:

  • X: Array/matrix of observations.

Returns:

  • ArrayLike: Predicted labels for all observations.

method DynamicModelTreeClassifier.predict_proba

predict_proba(X: ArrayLike) → ArrayLike

Predicts the probability of target values.

Args:

  • X: Array/matrix of observations.

Returns:

  • ArrayLike: Predicted probability per class label for all observations.
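At a leaf, the linear model turns its per-class scores into a probability distribution. A multinomial-logit sketch of that final step (the helper below is illustrative, not part of the library API):

```python
import math
from typing import List

def softmax(scores: List[float]) -> List[float]:
    """Convert per-class linear scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Three-class example: the class with the highest score gets the highest probability.
probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])
```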

method DynamicModelTreeClassifier.reset

reset()

Resets the predictor.


class Node

Node of the Dynamic Model Tree.

Attributes:

  • classes (List): List of the target classes.
  • n_features (int): Number of input features.
  • learning_rate (float): Learning rate of the linear models.
  • penalty_term (float): Regularization term for the linear model (0 = no regularization penalty).
  • penalty (str): String identifier of the type of regularization used by the linear model. Either 'l1', 'l2', or 'elasticnet' (see documentation of sklearn SGDClassifier).
  • epsilon (float): Threshold required before attempting to split or prune based on the Akaike Information Criterion. The smaller the epsilon-threshold, the stronger the evidence for splitting/pruning must be. Choose 0 < epsilon <= 1.
  • n_saved_candidates (int): Max. number of saved split candidates per node.
  • p_replaceable_candidates (float): Max. percent of saved split candidates that can be replaced by new/better candidates per training iteration.
  • cat_features (List[int]): List of indices (pos. in the feature vector) corresponding to categorical features.
  • linear_model (Any): Linear (logit) model trained at the node.
  • log_likelihood (ArrayLike): Log-likelihood given observations that reached the node.
  • counts_left (dict): Number of observations per split candidate falling to the left child.
  • log_likelihoods_left (dict): Log-likelihoods of the left child per split candidate.
  • gradients_left (dict): Gradients of the left child per split candidate.
  • counts_right (dict): Number of observations per split candidate falling to the right child.
  • log_likelihoods_right (dict): Log-likelihoods of the right child per split candidate.
  • gradients_right (dict): Gradients of the right child per split candidate.
  • children (List[Node]): List of child nodes.
  • split (tuple): Feature/value combination used for splitting.
  • is_leaf (bool): Indicator of whether the node is a leaf.

method Node.__init__

__init__(
    classes: List,
    n_features: int,
    learning_rate: float,
    penalty_term: float,
    penalty: str,
    epsilon: float,
    n_saved_candidates: int,
    p_replaceable_candidates: float,
    cat_features: List[int]
)

Inits Node.

Args:

  • classes: List of the target classes.
  • n_features: Number of input features.
  • learning_rate: Learning rate of the linear models.
  • penalty_term: Regularization term for the linear model (0 = no regularization penalty).
  • penalty: String identifier of the type of regularization used by the linear model. Either 'l1', 'l2', or 'elasticnet' (see documentation of sklearn SGDClassifier).
  • epsilon: Threshold required before attempting to split or prune based on the Akaike Information Criterion. The smaller the epsilon-threshold, the stronger the evidence for splitting/pruning must be. Choose 0 < epsilon <= 1.
  • n_saved_candidates: Max. number of saved split candidates per node.
  • p_replaceable_candidates: Max. percent of saved split candidates that can be replaced by new/better candidates per training iteration.
  • cat_features: List of indices (pos. in the feature vector) corresponding to categorical features.

method Node.predict_observation

predict_observation(x: ArrayLike, get_prob: bool = False) → ArrayLike

Predicts one observation (recursive function).

Passes an observation down the tree until a leaf is reached and makes the prediction at that leaf.

Args:

  • x: Observation.
  • get_prob: Indicator whether to return class probabilities.

Returns:

  • ArrayLike: Predicted class label/probability of the given observation.
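The routing itself is a short recursion: at an inner node, compare the observation's value at the split feature against the split value and descend into the matching child; at a leaf, delegate to the node's model. A schematic version with a numeric threshold split (names and structure are illustrative, not the library's internals):

```python
from typing import Callable, List, Optional, Tuple

class _TreeNode:
    def __init__(self, predict_fn: Optional[Callable] = None):
        self.children: List["_TreeNode"] = []           # [left, right] for inner nodes
        self.split: Optional[Tuple[int, float]] = None  # (feature index, split value)
        self.is_leaf = True
        self.predict_fn = predict_fn                    # leaf model's prediction

def predict_observation(node: _TreeNode, x: List[float]):
    if node.is_leaf:
        return node.predict_fn(x)
    feature, value = node.split
    # Numeric split: go left if x[feature] <= value, else right.
    child = node.children[0] if x[feature] <= value else node.children[1]
    return predict_observation(child, x)

# Tiny tree: split on feature 0 at 0.5; each leaf returns a constant label.
root = _TreeNode()
root.is_leaf = False
root.split = (0, 0.5)
root.children = [_TreeNode(lambda x: 0), _TreeNode(lambda x: 1)]
print(predict_observation(root, [0.2]))  # routes left, prints 0
print(predict_observation(root, [0.9]))  # routes right, prints 1
```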

method Node.update

update(X: ArrayLike, y: ArrayLike)

Updates the node and all descendants.

Updates the parameters of the simple (linear) model at the given node. If the node is an inner node, we attempt to split on a different feature or to replace the inner node by a leaf, thereby pruning all previous children/sub-branches. If the node is a leaf, we attempt to split.

Args:

  • X: Array/matrix of observations.
  • y: Array of corresponding labels.
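The split/prune decision compares models via the Akaike Information Criterion, AIC = 2k − 2·log-likelihood, where k is the number of free parameters. Roughly, a leaf is split only when the evidence in favor of the candidate split is strong enough relative to the epsilon threshold. The sketch below shows the flavor of such a comparison under those assumptions; it is simplified and not the paper's exact statistical test, which also uses the gradient statistics stored per candidate:

```python
import math

def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

def should_split(ll_leaf: float, ll_left: float, ll_right: float,
                 n_params_per_model: int, epsilon: float = 1e-7) -> bool:
    """Split if the current leaf model is sufficiently unlikely to be the better
    description of the data than the two child models combined.
    exp((AIC_split - AIC_leaf) / 2) estimates the leaf's relative likelihood."""
    aic_leaf = aic(ll_leaf, n_params_per_model)
    aic_split = aic(ll_left + ll_right, 2 * n_params_per_model)
    if aic_split >= aic_leaf:
        return False  # the split does not even look better than the leaf
    # The smaller epsilon, the stronger the evidence required before splitting.
    return math.exp((aic_split - aic_leaf) / 2) < epsilon

# The split clearly improves the log-likelihood: split.
print(should_split(-120.0, -40.0, -45.0, n_params_per_model=5))  # True
# The improvement does not offset the extra parameters: keep the leaf.
print(should_split(-100.0, -49.0, -48.0, n_params_per_model=5))  # False
```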

This file was automatically generated via lazydocs.