canari.data_visualization

Visualization for Canari’s results.

This module provides functions to plot:

  • Raw or standardized data

  • Prediction results with uncertainty (mean ± standard deviation)

  • Hidden state estimates

  • Probability of regime changes (anomalies) from the switching Kalman filter (SKF)

canari.data_visualization.plot_data(data_processor: DataProcess, standardization: bool | None = False, plot_train_data: bool | None = True, plot_validation_data: bool | None = True, plot_test_data: bool | None = True, plot_column: list[int] | None = None, sub_plot: Axes | None = None, color: str | None = 'r', linestyle: str | None = '-', train_label: str | None = None, validation_label: str | None = None, test_label: str | None = None, plot_nan: bool | None = True)

Plot train, validation, and test data with optional standardization and NaN filtering.

Parameters:
  • data_processor (DataProcess) – Data processing object.

  • standardization (bool, optional) – If True, plot data in standardized space; otherwise plot data in the original space.

  • plot_train_data (bool, optional) – If True, plot training data.

  • plot_validation_data (bool, optional) – If True, plot validation data.

  • plot_test_data (bool, optional) – If True, plot test data.

  • plot_column (list[int], optional) – List of column indices to plot.

  • sub_plot (plt.Axes, optional) – Matplotlib subplot axis to plot on.

  • color (str, optional) – Line color for plot lines.

  • linestyle (str, optional) – Line style (e.g., '-', '--').

  • train_label (str, optional) – Legend label for training data.

  • validation_label (str, optional) – Legend label for validation data.

  • test_label (str, optional) – Legend label for test data.

  • plot_nan (bool, optional) – Whether to include NaN values in the plot to represent missing data.
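
The effect of a NaN switch such as plot_nan can be illustrated with plain NumPy. This is a minimal sketch, assuming that keeping NaNs leaves gaps in the plotted line (Matplotlib breaks a line at NaN values) while dropping them joins the neighbouring points. Whether plot_data implements plot_nan exactly this way is not stated here, and the helper select_points is hypothetical.

```python
import numpy as np


def select_points(time, values, keep_nan=True):
    """Hypothetical helper: return the (time, values) pair to hand to a line plot."""
    time = np.asarray(time)
    values = np.asarray(values, dtype=float)
    if keep_nan:
        # NaNs stay in place, so a line plot shows a gap where data are missing.
        return time, values
    # Drop missing points, so a line plot connects the remaining neighbours.
    mask = ~np.isnan(values)
    return time[mask], values[mask]


time = np.arange(6)
values = np.array([0.1, 0.2, np.nan, np.nan, 0.5, 0.6])

t_gap, v_gap = select_points(time, values, keep_nan=True)
t_join, v_join = select_points(time, values, keep_nan=False)
```

With keep_nan=True the arrays keep their original length; with keep_nan=False only the four observed points remain.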

Examples

>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> import pandas as pd
>>> from canari import DataProcess, plot_data
>>> # Create data
>>> dt_index = pd.date_range(start="2025-01-01", periods=11, freq="H")
>>> data = pd.DataFrame({'value': np.linspace(0.1, 1.0, 11)},
...                     index=dt_index)
>>> dp = DataProcess(data,
...                  train_split=0.7,
...                  validation_split=0.2,
...                  test_split=0.1,
...                  time_covariates=["hour_of_day"],
...                  standardization=True)
>>> # Draw on an explicit axis so the figure size set here is used
>>> fig, ax = plt.subplots(figsize=(12, 4))
>>> plot_data(
...     data_processor=dp,
...     standardization=True,
...     plot_validation_data=True,
...     plot_test_data=True,
...     sub_plot=ax,
... )
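
The difference between standardized and original space can be sketched with NumPy alone. This is an illustration under stated assumptions, not Canari's implementation: it assumes z-score standardization with the mean and standard deviation taken from the training portion, and the 7-point training split below is hypothetical.

```python
import numpy as np

values = np.linspace(0.1, 1.0, 11)
n_train = 7  # assumption: the first 7 points form the training split

# Statistics come from the training portion only (assumed convention).
mu = values[:n_train].mean()
sigma = values[:n_train].std()

# Original space -> standardized space.
standardized = (values - mu) / sigma

# Standardized space -> original space (inverse transform).
recovered = standardized * sigma + mu
```

Under this assumption, the training portion of the standardized series has zero mean and unit standard deviation, and the inverse transform recovers the original values.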
canari.data_visualization.plot_prediction(data_processor: DataProcess, mean_train_pred: ndarray | None = None, std_train_pred: ndarray | None = None, mean_validation_pred: ndarray | None = None, std_validation_pred: ndarray | None = None, mean_test_pred: ndarray | None = None, std_test_pred: ndarray | None = None, num_std: int | None = 1, sub_plot: Axes | None = None, color: str | None = 'blue', linestyle: str | None = '-', train_label: List[str] | None = ['', ''], validation_label: List[str] | None = ['', ''], test_label: List[str] | None = ['', ''])

Plot predicted mean and uncertainty for each data split. The uncertainty is represented by confidence regions based on a multiple of the associated variable’s standard deviation.

Parameters:
  • data_processor (DataProcess) – Data processing object.

  • mean_train_pred (np.ndarray, optional) – Predicted means for training.

  • std_train_pred (np.ndarray, optional) – Standard deviations for training predictions.

  • mean_validation_pred (np.ndarray, optional) – Predicted means for validation.

  • std_validation_pred (np.ndarray, optional) – Standard deviations for validation predictions.

  • mean_test_pred (np.ndarray, optional) – Predicted means for test predictions.

  • std_test_pred (np.ndarray, optional) – Standard deviations for test predictions.

  • num_std (int, optional) – Number of standard deviations for the confidence region.

  • sub_plot (plt.Axes, optional) – Matplotlib subplot axis to plot on.

  • color (str, optional) – Line color.

  • linestyle (str, optional) – Line style.

  • train_label (List[str], optional) – [mean_label, std_label] for train.

  • validation_label (List[str], optional) – [mean_label, std_label] for validation.

  • test_label (List[str], optional) – [mean_label, std_label] for test.

Examples

>>> from canari import plot_prediction
>>> # `model`, `train_set`, `val_set`, and `dp` are assumed to be defined beforehand
>>> mu_preds_val, std_preds_val, states = model.lstm_train(
...     train_data=train_set, validation_data=val_set
... )
>>> plot_prediction(
...     data_processor=dp,
...     mean_validation_pred=mu_preds_val,
...     std_validation_pred=std_preds_val,
... )
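
To give a feel for what num_std controls, the sketch below computes a band of mean ± num_std × standard deviation and the probability mass such a band covers. The coverage figures assume a Gaussian predictive distribution, which is an assumption of this sketch and not something this page states. Both helper functions are hypothetical and use only the standard library.

```python
import math


def gaussian_coverage(num_std):
    """Probability that a Gaussian variable lies within mean ± num_std * std."""
    return math.erf(num_std / math.sqrt(2.0))


def confidence_band(mean, std, num_std=1):
    """Lower and upper bounds of the region mean ± num_std * std."""
    lower = [m - num_std * s for m, s in zip(mean, std)]
    upper = [m + num_std * s for m, s in zip(mean, std)]
    return lower, upper


lower, upper = confidence_band([1.0, 2.0], [0.5, 0.25], num_std=2)
coverage = {k: round(gaussian_coverage(k), 4) for k in (1, 2, 3)}
# coverage -> {1: 0.6827, 2: 0.9545, 3: 0.9973}
```

Under the Gaussian assumption, the default num_std=1 covers roughly 68% of the probability mass, so a wider band (num_std=2 or 3) may be preferable when the plot is used to judge whether observations are consistent with the predictions.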
canari.data_visualization.plot_skf_states(data_processor: DataProcess, states: StatesHistory, model_prob: ndarray, states_to_plot: list[str] | None = 'all', states_type: str | None = 'posterior', standardization: bool | None = False, num_std: int | None = 1, plot_observation: bool | None = True, color: str | None = 'b', linestyle: str | None = '-', legend_location: str | None = None, plot_nan: bool | None = True)

Plot hidden states along with probabilities of regime changes.

Parameters:
  • data_processor (DataProcess) – Data processing object.

  • states (StatesHistory) – Object containing hidden states history over time.

  • model_prob (np.ndarray) – Probabilities of the abnormal regime.

  • states_to_plot (list[str] or "all", optional) – Names of states to plot.

  • states_type (str, optional) – Type of state estimate to plot: 'prior', 'posterior', or 'smooth'.

  • standardization (bool, optional) – Whether to plot hidden states in standardized or original space.

  • num_std (int, optional) – Number of standard deviations for confidence regions.

  • plot_observation (bool, optional) – Whether to include observed data in the “level” plot.

  • color (str, optional) – Line color for states.

  • linestyle (str, optional) – Line style.

  • legend_location (str, optional) – Location of legend in top subplot.

  • plot_nan (bool, optional) – Whether to include NaN values in the plot to represent missing data.

Examples

>>> from canari import plot_skf_states
>>> # `skf`, `all_data`, and `dp` are assumed to be defined beforehand
>>> filter_marginal_abnorm_prob, states = skf.filter(data=all_data)
>>> fig, ax = plot_skf_states(
...     data_processor=dp,
...     states=states,
...     states_type="posterior",
...     model_prob=filter_marginal_abnorm_prob,
... )
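
Alongside the plot, the abnormal-regime probabilities can be turned into discrete detections. The sketch below is one simple way to do that with NumPy. The probability values are made up for illustration, and the 0.5 threshold is an assumption of this sketch, not a Canari default.

```python
import numpy as np

# Illustrative stand-in for the array passed as `model_prob`.
model_prob = np.array([0.02, 0.05, 0.10, 0.65, 0.90, 0.40, 0.03])
threshold = 0.5  # assumption: illustrative cut-off, not a Canari default

# Indices of time steps whose abnormal-regime probability exceeds the threshold.
flagged = np.flatnonzero(model_prob > threshold)

# First time step at which a regime change would be reported, if any.
first_detection = int(flagged[0]) if flagged.size else None
```

Here the time steps at indices 3 and 4 exceed the threshold, so the first detection is at index 3.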
canari.data_visualization.plot_states(data_processor: DataProcess, states: StatesHistory, states_to_plot: list[str] | None = 'all', states_type: str | None = 'posterior', standardization: bool | None = False, num_std: int | None = 1, sub_plot: Axes | None = None, color: str | None = 'b', linestyle: str | None = '-', legend_location: str | None = None)

Plot hidden states with mean and confidence regions.

Parameters:
  • data_processor (DataProcess) – Data processing object.

  • states (StatesHistory) – Object containing hidden states history over time.

  • states_to_plot (list[str] or "all", optional) – Names of states to plot.

  • states_type (str, optional) – Type of state estimate to plot: 'prior', 'posterior', or 'smooth'.

  • standardization (bool, optional) – Whether to plot hidden states in standardized or original space.

  • num_std (int, optional) – Number of standard deviations for confidence region.

  • sub_plot (plt.Axes, optional) – Matplotlib subplot axis to plot on.

  • color (str, optional) – Color for mean line and confidence region fill.

  • linestyle (str, optional) – Line style.

  • legend_location (str, optional) – Legend placement for first state subplot.

Examples

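>>> from canari import plot_states
>>> # `model`, `train_set`, `val_set`, and `dp` are assumed to be defined beforehand
>>> mu_preds_val, std_preds_val, states = model.lstm_train(
...     train_data=train_set, validation_data=val_set
... )
>>> fig, ax = plot_states(
...     data_processor=dp,
...     states=states,
...     states_type="posterior",
... )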
canari.data_visualization.plot_with_uncertainty(time, mu, std, color: str | None = 'k', linestyle: str | None = '-', label: list[str] | None = ['', ''], index: int | None = None, num_std: int | None = 1, ax: Axes | None = None)

Plot mean and confidence region for a variable.

Parameters:
  • time (np.ndarray) – Time vector.

  • mu (np.ndarray) – Mean values to plot.

  • std (np.ndarray) – Standard deviation values.

  • color (str, optional) – Line/fill color.

  • linestyle (str, optional) – Line style.

  • label (List[str], optional) – Labels for mean and confidence region.

  • index (int, optional) – Index of the variable to plot when the inputs come from a multivariate tensor.

  • num_std (int, optional) – Number of standard deviations for confidence regions.

  • ax (plt.Axes, optional) – Axis to use for plotting.
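
The kind of figure described above, a mean line with a shaded confidence region, can be approximated directly with Matplotlib. The function below is an illustrative stand-in and not Canari's implementation: its name, the 0.2 fill transparency, and the legend labels are all assumptions of this sketch.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def sketch_plot_with_uncertainty(time, mu, std, num_std=1, color="k", ax=None):
    """Illustrative stand-in: draw a mean line and a mean ± num_std * std band."""
    if ax is None:
        ax = plt.gca()
    mu = np.asarray(mu, dtype=float)
    std = np.asarray(std, dtype=float)
    lower = mu - num_std * std
    upper = mu + num_std * std
    # Mean as a solid line, confidence region as a translucent fill.
    ax.plot(time, mu, color=color, linestyle="-", label="mean")
    ax.fill_between(time, lower, upper, color=color, alpha=0.2,
                    label=f"±{num_std} std")
    return lower, upper


time = np.arange(5)
mu = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
std = np.full(5, 0.25)

fig, ax = plt.subplots()
lower, upper = sketch_plot_with_uncertainty(time, mu, std, num_std=2, ax=ax)
ax.legend()
```

Passing an explicit axis, as the ax parameter above allows, makes it straightforward to overlay several variables on the same subplot.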