canari.model#
Hybrid LSTM-SSM model that combines Bayesian Long Short-Term Memory (LSTM) neural networks and State-Space Models (SSM).
This model supports a flexible architecture where multiple components are assembled to define a structured state-space model.
On time series data, this model can:
- Provide forecasts with associated uncertainties.
- Decompose the original time series data into unobserved hidden states, providing mean values and associated uncertainties for these hidden states.
- Train its Bayesian LSTM network component.
- Support forecasting, filtering, and smoothing operations.
- Generate synthetic time series data, including synthetic anomaly injection.
References
Vuong, V.D., Nguyen, L.H. and Goulet, J.-A. (2025). Coupling LSTM neural networks and state-space models through analytically tractable inference. International Journal of Forecasting. Volume 41, Issue 1, Pages 128-140.
- class canari.model.Model(*components: BaseComponent)[source]#
Bases:
object
Model class for the Hybrid LSTM/SSM model.
- Parameters:
*components (BaseComponent) – One or more instances of classes derived from BaseComponent.
Examples
>>> from canari.component import LocalTrend, Periodic, WhiteNoise
>>> from canari import Model
>>> # Components
>>> local_trend = LocalTrend(mu_states=[1, 0.5], var_states=[1, 0.5])
>>> periodic = Periodic(mu_states=[1, 1], var_states=[2, 2], period=52)
>>> residual = WhiteNoise(std_error=0.04168)
>>> # Define model
>>> model = Model(local_trend, periodic, residual)
- components#
Dictionary to save model components’ configurations.
- Type:
Dict[str, BaseComponent]
- num_states#
Number of hidden states.
- Type:
int
- states_names#
Names of hidden states.
- Type:
list[str]
- mu_states#
Mean vector for the hidden states \(X_{t|t}\) at time step t.
- Type:
np.ndarray
- var_states#
Covariance matrix for the hidden states \(X_{t|t}\) at time step t.
- Type:
np.ndarray
- mu_states_prior#
Prior mean vector for the hidden states \(X_{t+1|t}\) at time step t+1.
- Type:
np.ndarray
- var_states_prior#
Prior covariance matrix for the hidden states \(X_{t+1|t}\) at time step t+1.
- Type:
np.ndarray
- mu_states_posterior#
Posterior mean vector for the hidden states \(X_{t+1|t+1}\) at time step t+1. In case of missing data (NaN observation), it will have the same values as mu_states_prior.
- Type:
np.ndarray
- var_states_posterior#
Posterior covariance matrix for the hidden states \(X_{t+1|t+1}\) at time step t+1. In case of missing data (NaN observation), it will have the same values as var_states_prior.
- Type:
np.ndarray
- states#
Container for storing prior, posterior, and smoothed values of hidden states over time.
- Type:
StatesHistory
- mu_obs_predict#
Means for observation predictions at time step t+1.
- Type:
np.ndarray
- var_obs_predict#
Variances for observation predictions at time step t+1.
- Type:
np.ndarray
- observation_matrix#
Global observation matrix constructed from all components.
- Type:
np.ndarray
- transition_matrix#
Global transition matrix constructed from all components.
- Type:
np.ndarray
- process_noise_matrix#
Global process noise matrix constructed from all components.
- Type:
np.ndarray
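These global matrices can be inspected directly; a minimal sketch, assuming the model built in the class example above:
>>> F = model.observation_matrix    # global observation matrix assembled from all components
>>> A = model.transition_matrix     # global transition matrix assembled from all components
>>> Q = model.process_noise_matrix  # global process noise matrix assembled from all components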
- # LSTM-related attributes
Only used when an LstmNetwork component is present.
- lstm_net#
LSTM neural network that is generated from the LstmNetwork component, if present. It is a pytagi.Sequential instance.
- Type:
pytagi.Sequential
- lstm_output_history#
Container for saving a rolling history of LSTM output over a fixed look-back window.
- Type:
- # Early stopping attributes
Only used when training an LstmNetwork component.
- early_stop_metric#
Best value associated with the metric being monitored.
- Type:
float
- early_stop_metric_history#
Logged history of metric values across epochs.
- Type:
List[float]
- early_stop_lstm_param#
LSTM’s weight and bias parameters at the optimal epoch for pytagi.Sequential.
- Type:
Dict
- early_stop_init_mu_states#
Copy of mu_states at time step t=0 of the optimal epoch.
- Type:
np.ndarray
- early_stop_init_var_states#
Copy of var_states at time step t=0 of the optimal epoch.
- Type:
np.ndarray
- optimal_epoch#
Epoch at which the metric being monitored was best.
- Type:
int
- stop_training#
Flag indicating whether training has been stopped due to early stopping or by reaching the maximum number of epochs.
- Type:
bool
- # Optimization attribute
- metric_optim#
Metric used for optimization in model_optimizer.
- Type:
float
- auto_initialize_baseline_states(data: ndarray)[source]#
Automatically assign initial means and variances for baseline hidden states (level, trend, and acceleration) from input data, using the time series decomposition defined in decompose_data().
- Parameters:
data (np.ndarray) – Time series data.
Examples
>>> train_set, val_set, test_set, all_data = dp.get_splits()
>>> model.auto_initialize_baseline_states(train_set["y"][0:52])
- backward(obs: float) Tuple[ndarray, ndarray, ndarray, ndarray] [source]#
Update step in the Kalman filter for one time step.
This function operates at the one-time-step level. It calls backward() from common.
- Parameters:
obs (float) – Observation value.
- Returns:
A tuple containing:
- delta_mu (np.ndarray): The delta for updating mu_states_prior.
- delta_var (np.ndarray): The delta for updating var_states_prior.
- mu_states_posterior (np.ndarray): The posterior mean of the hidden states.
- var_states_posterior (np.ndarray): The posterior variance of the hidden states.
- Return type:
Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]
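Examples
A one-time-step sketch (assuming a model without an LstmNetwork component; the observation value is illustrative):
>>> mu_pred, var_pred, mu_prior, var_prior = model.forward()
>>> delta_mu, delta_var, mu_post, var_post = model.backward(obs=0.12)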
- early_stopping(evaluate_metric: float, current_epoch: int, max_epoch: int, mode: str | None = 'min', patience: int | None = 20, skip_epoch: int | None = 5) Tuple[bool, int, float, list] [source]#
Apply early stopping based on a monitored metric when training an LSTM neural network.
This method records evaluate_metric at each epoch. If the metric improves, it updates early_stop_metric, early_stop_init_mu_states, early_stop_init_var_states, early_stop_lstm_param, and optimal_epoch.
It sets stop_training to True if optimal_epoch = max_epoch, or if (current_epoch - optimal_epoch) >= patience.
When stop_training is True, it sets mu_states = early_stop_init_mu_states, var_states = early_stop_init_var_states, and sets the LSTM parameters to early_stop_lstm_param.
- Parameters:
evaluate_metric (float) – Current metric value for this epoch.
current_epoch (int) – Current epoch.
max_epoch (int) – Maximum number of epochs.
mode (Optional[str]) – Direction for early stopping: ‘min’ (default).
patience (Optional[int]) – Number of epochs without improvement before stopping. Defaults to 20.
skip_epoch (Optional[int]) – Number of initial epochs to ignore when looking for improvements. Defaults to 5.
- Returns:
stop_training: True if training should stop.
optimal_epoch: Epoch index at which the metric was best.
early_stop_metric: Best evaluate_metric value.
early_stop_metric_history: History of evaluate_metric at all epochs.
- Return type:
Tuple[bool, int, float, List[float]]
Examples
>>> model.early_stopping(evaluate_metric=mse, current_epoch=1, max_epoch=50)
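For context, a sketch of a typical epoch loop around this method; the dataset names and the MSE metric are illustrative:
>>> import numpy as np
>>> max_epoch = 50
>>> for epoch in range(max_epoch):
...     _ = model.filter(train_set)
...     mu_val, std_val, _ = model.forecast(val_set)
...     mse = np.nanmean((mu_val.flatten() - val_set["y"].flatten()) ** 2)
...     stop, best_epoch, best_metric, history = model.early_stopping(
...         evaluate_metric=mse, current_epoch=epoch, max_epoch=max_epoch)
...     model.set_memory(states=model.states, time_step=0)
...     if stop:
...         break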
- filter(data: Dict[str, ndarray], train_lstm: bool | None = True) Tuple[ndarray, ndarray, StatesHistory] [source]#
Run the Kalman filter over an entire dataset, i.e., repeatedly apply the Kalman prediction and update steps over multiple time steps.
This function operates at the entire-dataset level. It repeatedly calls forward() and backward() from Model at the one-time-step level.
- Parameters:
data (Dict[str, np.ndarray]) – Dictionary with keys ‘x’ and ‘y’.
train_lstm (bool) – Whether to update LSTM’s parameter weights and biases. Defaults to True.
- Returns:
A tuple containing:
- mu_obs_preds (np.ndarray): The means for forecasts.
- std_obs_preds (np.ndarray): The standard deviations for forecasts.
- states (StatesHistory): The history of hidden states over time.
- Return type:
Tuple[np.ndarray, np.ndarray, StatesHistory]
Examples
>>> mu_preds_train, std_preds_train, states = model.filter(train_set)
- forecast(data: Dict[str, ndarray]) Tuple[ndarray, ndarray, StatesHistory] [source]#
Perform multi-step-ahead forecasts over an entire dataset by recursively making one-step-ahead predictions, i.e., repeatedly apply the Kalman prediction step over multiple time steps.
This function operates at the entire-dataset level. It repeatedly calls forward() from Model at the one-time-step level.
- Parameters:
data (Dict[str, np.ndarray]) – A dictionary containing key ‘x’ as input covariates; if ‘y’ (real observations) exists, it will not be used.
- Returns:
A tuple containing:
- mu_obs_preds (np.ndarray): The means for forecasts.
- std_obs_preds (np.ndarray): The standard deviations for forecasts.
- states (StatesHistory): The history of hidden states over time.
- Return type:
Tuple[np.ndarray, np.ndarray, StatesHistory]
Examples
>>> mu_preds_val, std_preds_val, states = model.forecast(val_set)
- forward(input_covariates: ndarray | None = None, mu_lstm_pred: ndarray | None = None, var_lstm_pred: ndarray | None = None) Tuple[ndarray, ndarray, ndarray, ndarray] [source]#
Make a one-step-ahead prediction using the prediction step of the Kalman filter. If there are no input covariates for the LSTM, use an empty np.ndarray. It calls forward() from common. This function operates at the one-time-step level.
- Parameters:
input_covariates (Optional[np.ndarray]) – Input covariates for LSTM at time t.
mu_lstm_pred (Optional[np.ndarray]) – Predicted mean from LSTM at time t+1; used when we do not want the LSTM to make predictions, but instead reuse LSTM predictions already obtained.
var_lstm_pred (Optional[np.ndarray]) – Predicted variance from LSTM at time t+1; used when we do not want the LSTM to make predictions, but instead reuse LSTM predictions already obtained.
- Returns:
A tuple containing:
- mu_obs_predict (np.ndarray): The predictive mean of the observation at t+1.
- var_obs_predict (np.ndarray): The predictive variance of the observation at t+1.
- mu_states_prior (np.ndarray): The prior mean of the hidden states at t+1.
- var_states_prior (np.ndarray): The prior variance of the hidden states at t+1.
- Return type:
Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]
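Examples
A sketch of a single prediction step; an empty array is passed since this illustrative model has no LSTM input covariates:
>>> import numpy as np
>>> mu_pred, var_pred, mu_prior, var_prior = model.forward(input_covariates=np.empty(0))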
- generate_time_series(num_time_series: int, num_time_steps: int, sample_from_lstm_pred=True, time_covariates=None, time_covariate_info=None, add_anomaly=False, anomaly_mag_range=None, anomaly_begin_range=None, anomaly_type='trend') ndarray [source]#
Generate synthetic time series data based on the model components, with optional synthetic anomaly injection.
- Parameters:
num_time_series (int) – Number of independent series to generate.
num_time_steps (int) – Number of timesteps per generated series.
sample_from_lstm_pred (bool, optional) – If False, zeroes out LSTM-derived variance so that the generation ignores the LSTM uncertainty. Defaults to True.
time_covariates (np.ndarray of shape (num_time_steps, cov_dim), optional) – Time-varying covariates to include in generation. If provided, these will be standardized using time_covariate_info and passed through the model each step. Defaults to None.
time_covariate_info (dict, optional) – Required if time_covariates is not None. Must contain: “initial_time_covariate” (np.ndarray), the starting covariate vector; “mu” (np.ndarray), means for standardization; “std” (np.ndarray), standard deviations for standardization.
add_anomaly (bool, optional) – Whether to inject a synthetic anomaly into each series. Defaults to False.
anomaly_mag_range (tuple of float, optional) – (min, max) range for random anomaly magnitudes. Required if add_anomaly=True. Defaults to None.
anomaly_begin_range (tuple of int, optional) – (min, max) range of timestep indices at which anomaly may start. Required if add_anomaly=True. Defaults to None.
anomaly_type (str, optional) – Type of injected anomaly: “trend”: a growing linear drift after anomaly starts, “level”: a constant shift after anomaly starts. Defaults to “trend”.
- Returns:
A tuple containing:
- generated_series (np.ndarray): Generated series with shape (num_time_series, num_time_steps).
- input_covariates (np.ndarray): The input covariates used.
- anomaly_magnitudes (List[float]): Anomaly magnitudes per series.
- anomaly_start_timesteps (List[int]): Anomaly start timesteps per series.
- Return type:
Tuple[np.ndarray, np.ndarray, List[float], List[int]]
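Examples
A sketch generating synthetic series with injected level anomalies; all argument values are illustrative:
>>> series, covariates, anomaly_mags, anomaly_starts = model.generate_time_series(
...     num_time_series=10,
...     num_time_steps=104,
...     add_anomaly=True,
...     anomaly_mag_range=(0.1, 0.5),
...     anomaly_begin_range=(30, 80),
...     anomaly_type="level",
... )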
- get_dict() dict [source]#
Export model attributes into a serializable dictionary.
- Returns:
Serializable model dictionary containing necessary attributes.
- Return type:
dict
Examples
>>> saved_dict = model.get_dict()
- get_states_index(states_name: str)[source]#
Retrieve index of a state in the state vector.
- Parameters:
states_name (str) – The name of the state.
- Returns:
Index of the state, or None if not found.
- Return type:
int or None
Examples
>>> lstm_index = model.get_states_index("lstm")
>>> level_index = model.get_states_index("level")
- initialize_states_history()[source]#
Reinitialize prior, posterior, and smoothed values for hidden states in states with empty lists.
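Examples
A sketch; for instance, to clear stored state trajectories before a new pass over the data:
>>> model.initialize_states_history()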
- initialize_states_with_smoother_estimates()[source]#
Set hidden states mu_states and var_states using the smoothed estimates for the hidden states at the first time step t=1 stored in states. These new hidden states act as the initial hidden states at t=0 in the next epoch.
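Examples
A sketch of the smooth-then-reinitialize pattern between training epochs:
>>> _ = model.filter(train_set)
>>> _ = model.smoother()
>>> model.initialize_states_with_smoother_estimates()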
- static load_dict(save_dict: dict)[source]#
Reconstruct a model instance from a saved dictionary.
- Parameters:
save_dict (dict) – Dictionary containing saved model structure and parameters.
- Returns:
An instance of Model generated from the input dictionary.
- Return type:
Model
Examples
>>> saved_dict = model.get_dict()
>>> loaded_model = Model.load_dict(saved_dict)
- lstm_train(train_data: Dict[str, ndarray], validation_data: Dict[str, ndarray], white_noise_decay: bool | None = True, white_noise_max_std: float | None = 5, white_noise_decay_factor: float | None = 0.9) Tuple[ndarray, ndarray, StatesHistory] [source]#
Train the LstmNetwork component on the provided training set, then evaluate on the validation set. Optionally apply exponential decay on the white noise standard deviation over epochs.
At the end of this function, use set_memory to set the memory to t=0.
- Parameters:
train_data (Dict[str, np.ndarray]) – Dictionary with keys ‘x’ and ‘y’ for training inputs and targets.
validation_data (Dict[str, np.ndarray]) – Dictionary with keys ‘x’ and ‘y’ for validation inputs and targets.
white_noise_decay (bool, optional) – If True, apply an exponential decay on the white noise standard deviation over epochs, if a white noise component exists. Defaults to True.
white_noise_max_std (float, optional) – Upper bound on the white-noise standard deviation when decaying. Defaults to 5.
white_noise_decay_factor (float, optional) – Multiplicative decay factor applied to the white‐noise standard deviation each epoch. Defaults to 0.9.
- Returns:
A tuple containing:
- mu_obs_preds (np.ndarray): The means for multi-step-ahead predictions on the validation set.
- std_obs_preds (np.ndarray): The standard deviations for multi-step-ahead predictions on the validation set.
- states (StatesHistory): The history of hidden states over time.
- Return type:
Tuple[np.ndarray, np.ndarray, StatesHistory]
Examples
>>> mu_preds_val, std_preds_val, states = model.lstm_train(train_data=train_set, validation_data=val_set)
- rts_smoother(time_step: int, matrix_inversion_tol: float | None = 1e-12, tol_type: str | None = 'relative')[source]#
Apply the RTS smoothing equations for a specific time step. As a result, the smoothed estimates for the hidden states at that time step are updated in states.
This function operates at the one-time-step level. It calls rts_smoother() from common.
- Parameters:
time_step (int) – Target smoothing index.
matrix_inversion_tol (float) – Numerical stability threshold for matrix pseudoinversion (pinv). Defaults to 1E-12.
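Examples
A one-time-step sketch; smoother() normally applies this in reverse time order after a filter pass:
>>> _ = model.filter(train_set)
>>> model.rts_smoother(time_step=10)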
- set_memory(states: StatesHistory, time_step: int)[source]#
Set mu_states, var_states, and lstm_output_history with smoothed estimates from a specific time step stored in states. This prepares for the next analysis by ensuring the continuity of these variables, e.g., if the next analysis starts from time step t, the memory should be set to time step t.
If t=0, also set the means and variances for the cell and hidden states of lstm_net to zeros. If t is not 0, the cell and hidden states need to be set externally using Model.lstm_net.set_lstm_states(lstm_cell_hidden_states).
- Parameters:
states (StatesHistory) – Full history of hidden states over time.
time_step (int) – Index of timestep to restore.
Examples
>>> # If the next analysis starts from the beginning of the time series
>>> model.set_memory(states=model.states, time_step=0)
>>> # If the next analysis starts from t = 200
>>> model.set_memory(states=model.states, time_step=200)
- set_states(new_mu_states: ndarray, new_var_states: ndarray)[source]#
Set new values for states, i.e., mu_states and var_states.
- Parameters:
new_mu_states (np.ndarray) – Mean values to be set.
new_var_states (np.ndarray) – Covariance matrix to be set.
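Examples
A sketch assigning a zero mean vector and an identity covariance; the values, and the assumption that shapes follow num_states, are illustrative:
>>> import numpy as np
>>> model.set_states(new_mu_states=np.zeros(model.num_states),
...                  new_var_states=np.eye(model.num_states))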
- smoother(matrix_inversion_tol: float | None = 1e-12, tol_type: str | None = 'relative') StatesHistory [source]#
Run the Kalman smoother over an entire time series, i.e., repeatedly apply the RTS smoothing equations over multiple time steps.
This function operates at the entire-dataset level. It repeatedly calls rts_smoother() from Model at the one-time-step level.
- Parameters:
matrix_inversion_tol (float) – Numerical stability threshold for matrix pseudoinversion (pinv). Defaults to 1E-12.
- Returns:
states: The history of hidden states over time.
- Return type:
StatesHistory
Examples
>>> mu_preds_train, std_preds_train, states = model.filter(train_set)
>>> states = model.smoother()
- white_noise_decay(epoch: int, white_noise_max_std: float, white_noise_decay_factor: float)[source]#
Apply exponential decay to the white noise standard deviation over epochs, and modify the variance of the white noise component in process_noise_matrix. This decaying noise structure is intended to improve the training performance of TAGI-LSTM.
- Parameters:
epoch (int) – Current training epoch.
white_noise_max_std (float) – Maximum allowed noise std.
white_noise_decay_factor (float) – Factor controlling decay rate.
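Examples
A sketch; lstm_train() applies this decay internally each epoch when white_noise_decay=True (the argument values below mirror its defaults):
>>> model.white_noise_decay(epoch=5, white_noise_max_std=5, white_noise_decay_factor=0.9)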