canari.model#
Hybrid LSTM-SSM model that combines Bayesian Long Short-Term Memory (LSTM) neural networks and State-Space Models (SSM).
This model supports a flexible architecture where multiple components are assembled to define a structured state-space model.
On time series data, this model can:
- Provide forecasts with associated uncertainties.
- Decompose the original time series data into unobserved hidden states, providing mean values and associated uncertainties for these hidden states.
- Train its Bayesian LSTM network component.
- Support forecasting, filtering, and smoothing operations.
- Generate synthetic time series data, including synthetic anomaly injection.
References
Vuong, V.D., Nguyen, L.H. and Goulet, J.-A. (2025). Coupling LSTM neural networks and state-space models through analytically tractable inference. International Journal of Forecasting, 41(1), 128-140.
- class canari.model.Model(*components: BaseComponent)[source]#
Bases: object
Model class for the Hybrid LSTM/SSM model.
- Parameters:
*components (BaseComponent) – One or more instances of classes derived from
BaseComponent.
Examples
>>> from canari.component import LocalTrend, Periodic, WhiteNoise
>>> from canari import Model
>>> # Components
>>> local_trend = LocalTrend(mu_states=[1,0.5], var_states=[1,0.5])
>>> periodic = Periodic(mu_states=[1,1], var_states=[2,2], period=52)
>>> residual = WhiteNoise(std_error=0.04168)
>>> # Define model
>>> model = Model(local_trend, periodic, residual)
- components#
Dictionary to save model components’ configurations.
- Type:
Dict[str, BaseComponent]
- num_states#
Number of hidden states.
- Type:
int
- states_names#
Names of hidden states.
- Type:
list[str]
- mu_states#
Mean vector for the hidden states \(X_{t|t}\) at time step t.
- Type:
np.ndarray
- var_states#
Covariance matrix for the hidden states \(X_{t|t}\) at time step t.
- Type:
np.ndarray
- mu_states_prior#
Prior mean vector for the hidden states \(X_{t+1|t}\) at time step t+1.
- Type:
np.ndarray
- var_states_prior#
Prior covariance matrix for the hidden states \(X_{t+1|t}\) at time step t+1.
- Type:
np.ndarray
- mu_states_posterior#
Posterior mean vector for the hidden states \(X_{t+1|t+1}\) at time step t+1. In case of missing data (NaN observation), it will have the same values as mu_states_prior.
- Type:
np.ndarray
- var_states_posterior#
Posterior covariance matrix for the hidden states \(X_{t+1|t+1}\) at time step t+1. In case of missing data (NaN observation), it will have the same values as var_states_prior.
- Type:
np.ndarray
- states#
Container for storing prior, posterior, and smoothed values of hidden states over time.
- Type:
StatesHistory
- mu_obs_predict#
Means for observation predictions at time step t+1.
- Type:
np.ndarray
- var_obs_predict#
Variances for observation predictions at time step t+1.
- Type:
np.ndarray
- observation_matrix#
Global observation matrix constructed from all components.
- Type:
np.ndarray
- transition_matrix#
Global transition matrix constructed from all components.
- Type:
np.ndarray
- process_noise_matrix#
Global process noise matrix constructed from all components.
- Type:
np.ndarray
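Examples
A minimal sketch for inspecting the assembled global matrices; the shape comments are assumptions based on the attribute descriptions above, for a model observing a single series:
>>> A = model.transition_matrix      # (num_states, num_states)
>>> F = model.observation_matrix     # (1, num_states)
>>> Q = model.process_noise_matrix   # (num_states, num_states)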
# LSTM-related attributes
These attributes are only used when an LstmNetwork component is present.
- lstm_net#
LSTM neural network generated from the LstmNetwork component, if present. It is a pytagi.Sequential instance.
- Type:
pytagi.Sequential
- lstm_output_history#
Container for saving a rolling history of LSTM output over a fixed look-back window.
- Type:
- lstm_states_history#
Container for saving the history of the LSTM’s hidden and cell states for all time steps.
- Type:
list
# Early stopping attributes
These attributes are only used when training an LstmNetwork component.
- early_stop_metric#
Best value associated with the metric being monitored.
- Type:
float
- early_stop_metric_history#
Logged history of metric values across epochs.
- Type:
List[float]
- early_stop_lstm_param#
LSTM’s weight and bias parameters for pytagi.Sequential at the optimal epoch.
- Type:
Dict
- early_stop_lstm_states#
lstm_states_history at the optimal epoch.
- Type:
np.ndarray
- optimal_epoch#
Epoch at which the metric being monitored was best.
- Type:
int
- stop_training#
Flag indicating whether training has been stopped due to early stopping or by reaching the maximum number of epochs.
- Type:
bool
- # Optimization attribute
- metric_optim#
Metric used for optimization in model_optimizer.
- Type:
float
- auto_initialize_baseline_states(data: ndarray)[source]#
Automatically assign initial means and variances for the baseline hidden states (level, trend, and acceleration) from input data, using the time series decomposition defined in decompose_data().
- Parameters:
data (np.ndarray) – Time series data.
Examples
>>> train_set, val_set, test_set, all_data = dp.get_splits()
>>> model.auto_initialize_baseline_states(train_set["y"][0:52])
- backward(obs: float) Tuple[ndarray, ndarray, ndarray, ndarray][source]#
Update step in the Kalman filter for one time step.
This function is used at the one-time-step level. It calls backward() from common.
- Parameters:
obs (float) – Observation value.
- Returns:
A tuple containing:
- delta_mu (np.ndarray):
The delta for updating mu_states_prior.
- delta_var (np.ndarray):
The delta for updating var_states_prior.
- mu_states_posterior (np.ndarray):
The posterior mean of the hidden states.
- var_states_posterior (np.ndarray):
The posterior variance of the hidden states.
- Return type:
Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]
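Examples
A minimal sketch of one predict-update cycle; y_t is an illustrative scalar observation, and the preceding forward() call supplies the prior:
>>> mu_obs, var_obs, mu_prior, var_prior = model.forward()
>>> delta_mu, delta_var, mu_post, var_post = model.backward(y_t)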
- early_stopping(evaluate_metric: float, current_epoch: int, max_epoch: int, mode: str | None = 'min', patience: int | None = 20, skip_epoch: int | None = 5) Tuple[bool, int, float, list][source]#
Apply early stopping based on evaluate_metric when training an LSTM neural network.
This method records evaluate_metric at each epoch. If there is an improvement, it updates early_stop_metric, early_stop_lstm_param, early_stop_states, early_stop_lstm_states, and optimal_epoch.
It sets stop_training to True if optimal_epoch = max_epoch, or (current_epoch - optimal_epoch) >= patience.
When stop_training is True, it sets states = early_stop_states and lstm_states_history = early_stop_lstm_states, sets the LSTM parameters to early_stop_lstm_param, and sets the memory to time_step=0.
- Parameters:
current_epoch (int) – Current epoch
max_epoch (int) – Maximum number of epochs
evaluate_metric (float) – Current metric value for this epoch.
mode (Optional[str]) – Direction for early stopping: ‘min’ (default).
patience (Optional[int]) – Number of epochs without improvement before stopping. Defaults to 20.
skip_epoch (Optional[int]) – Number of initial epochs to ignore when looking for improvements. Defaults to 5.
- Returns:
stop_training: True if training stops.
optimal_epoch: Epoch index at which the best metric occurred.
early_stop_metric: Best evaluate_metric.
early_stop_metric_history: History of evaluate_metric over all epochs.
- Return type:
Tuple[bool, int, float, List[float]]
Examples
>>> model.early_stopping(evaluate_metric=mse, current_epoch=1, max_epoch=50)
- filter(data: Dict[str, ndarray], train_lstm: bool | None = True) Tuple[ndarray, ndarray, StatesHistory][source]#
Run the Kalman filter over an entire dataset, i.e., repeatedly apply the Kalman prediction and update steps over multiple time steps.
This function is used at the entire-dataset level. It repeatedly calls forward() and backward() at the one-time-step level from Model.
- Parameters:
data (Dict[str, np.ndarray]) – Includes ‘x’ and ‘y’.
train_lstm (bool) – Whether to update LSTM’s parameter weights and biases. Defaults to True.
- Returns:
A tuple containing:
- mu_obs_preds (np.ndarray):
The means for forecasts.
- std_obs_preds (np.ndarray):
The standard deviations for forecasts.
- states (StatesHistory):
The history of hidden states over time.
- Return type:
Tuple[np.ndarray, np.ndarray, StatesHistory]
Examples
>>> mu_preds_train, std_preds_train, states = model.filter(train_set)
- forecast(data: Dict[str, ndarray]) Tuple[ndarray, ndarray, StatesHistory][source]#
Perform multi-step-ahead forecasting over an entire dataset by recursively making one-step-ahead predictions, i.e., repeatedly apply the Kalman prediction step over multiple time steps.
This function is used at the entire-dataset level. It repeatedly calls forward() at the one-time-step level from Model.
- Parameters:
data (Dict[str, np.ndarray]) – A dictionary containing the key ‘x’ as input covariates; if ‘y’ (real observations) exists, it will not be used.
- Returns:
A tuple containing:
- mu_obs_preds (np.ndarray):
The means for forecasts.
- std_obs_preds (np.ndarray):
The standard deviations for forecasts.
- states (StatesHistory):
The history of hidden states over time.
- Return type:
Tuple[np.ndarray, np.ndarray, StatesHistory]
Examples
>>> mu_preds_val, std_preds_val, states = model.forecast(val_set)
- forward(input_covariates: ndarray | None = None, var_input_covariates: ndarray | None = None, mu_lstm_pred: ndarray | None = None, var_lstm_pred: ndarray | None = None) Tuple[ndarray, ndarray, ndarray, ndarray][source]#
Make a one-step-ahead prediction using the prediction step of the Kalman filter. If there are no input covariates for the LSTM, use an empty np.ndarray. It calls forward() from common.
This function is used at the one-time-step level.
- Parameters:
- Parameters:
input_covariates (Optional[np.ndarray]) – Input covariates for LSTM at time t.
var_input_covariates (Optional[np.ndarray]) – Variances of the input covariates for LSTM at time t.
mu_lstm_pred (Optional[np.ndarray]) – Predicted mean from LSTM at time t+1; used when we do not want the LSTM to make predictions, but instead use LSTM predictions that are already available.
var_lstm_pred (Optional[np.ndarray]) – Predicted variance from LSTM at time t+1; used when we do not want the LSTM to make predictions, but instead use LSTM predictions that are already available.
- Returns:
A tuple containing:
- mu_obs_predict (np.ndarray):
The predictive mean of the observation at t+1.
- var_obs_predict (np.ndarray):
The predictive variance of the observation at t+1.
- mu_states_prior (np.ndarray):
The prior mean of the hidden states at t+1.
- var_states_prior (np.ndarray):
The prior variance of the hidden states at t+1.
- Return type:
Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]
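Examples
A minimal sketch for a model without LSTM input covariates, following the note above about passing an empty np.ndarray:
>>> import numpy as np
>>> mu_obs, var_obs, mu_prior, var_prior = model.forward(input_covariates=np.empty(0))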
- generate_time_series(num_time_series: int, num_time_steps: int, sample_from_lstm_pred=True, time_covariates=None, time_covariate_info=None, add_anomaly=False, anomaly_mag_range=None, anomaly_begin_range=None, anomaly_type='trend') ndarray[source]#
Generate synthetic time series data based on the model components, with optional synthetic anomaly injection.
- Parameters:
num_time_series (int) – Number of independent series to generate.
num_time_steps (int) – Number of timesteps per generated series.
sample_from_lstm_pred (bool, optional) – If False, zeroes out LSTM-derived variance so that the generation ignores the LSTM uncertainty. Defaults to True.
time_covariates (np.ndarray of shape (num_time_steps, cov_dim), optional) – Time-varying covariates to include in generation. If provided, these will be standardized using time_covariate_info and passed through the model each step. Defaults to None.
time_covariate_info (dict, optional) – Required if time_covariates is not None. Must contain:
- “initial_time_covariate” (np.ndarray): the starting covariate vector.
- “mu” (np.ndarray): means for standardization.
- “std” (np.ndarray): standard deviations for standardization.
add_anomaly (bool, optional) – Whether to inject a synthetic anomaly into each series. Defaults to False.
anomaly_mag_range (tuple of float, optional) – (min, max) range for random anomaly magnitudes. Required if add_anomaly=True. Defaults to None.
anomaly_begin_range (tuple of int, optional) – (min, max) range of timestep indices at which anomaly may start. Required if add_anomaly=True. Defaults to None.
anomaly_type (str, optional) – Type of injected anomaly: “trend” adds a growing linear drift after the anomaly starts; “level” adds a constant shift after the anomaly starts. Defaults to “trend”.
- Returns:
- generated series (np.ndarray):
Generated series with the shape (num_time_series, num_time_steps).
- input_covariates (np.ndarray):
The input covariates used.
- anomaly magnitudes (List[float]):
Anomaly magnitudes per series.
- anomaly start timesteps (List[int]):
Anomaly start timesteps per series.
- Return type:
Tuple[np.ndarray, np.ndarray, List[float], List[int]]
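Examples
A minimal sketch with illustrative anomaly settings:
>>> series, covariates, anomaly_mags, anomaly_starts = model.generate_time_series(
...     num_time_series=10,
...     num_time_steps=100,
...     add_anomaly=True,
...     anomaly_mag_range=(0.1, 0.5),
...     anomaly_begin_range=(30, 70),
... )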
- get_dict(time_step: int | None = None) dict[source]#
Export model attributes into a serializable dictionary.
- Parameters:
time_step (Optional[int]) – The time step at which to get the model and memory. If None, export the model with the current memory. Defaults to None.
- Returns:
Serializable model dictionary containing necessary attributes.
- Return type:
dict
Examples
>>> saved_dict = model.get_dict()
- get_memory(time_step: int | None = None) dict[source]#
Get the memory, which includes mu_states, var_states, lstm_output_history, and the lstm_states of lstm_net. If time_step is provided, obtain the memory at that time step. Otherwise, obtain the memory at the current time step.
- Parameters:
time_step (Optional[int]) – Time step at which to obtain the memory.
- Returns:
Dict
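Examples
A minimal sketch (illustrative):
>>> memory_now = model.get_memory()
>>> memory_t0 = model.get_memory(time_step=0)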
- get_states_index(states_name: str)[source]#
Retrieve index of a state in the state vector.
- Parameters:
states_name (str) – The name of the state.
- Returns:
Index of the state, or None if not found.
- Return type:
int or None
Examples
>>> lstm_index = model.get_states_index("lstm")
>>> level_index = model.get_states_index("level")
- initialize_states_history()[source]#
Reinitialize prior, posterior, and smoothed values for hidden states in states, as well as lstm_states_history, with empty lists.
- initialize_states_with_smoother_estimates()[source]#
Set the hidden states mu_states and var_states using the smoothed estimates for hidden states at the first time step t=1 stored in states. These new hidden states act as the initial hidden states at t=0 in the next epoch.
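Examples
A minimal sketch of warm-starting the next training epoch; the filter/smoother calls follow the workflow described above:
>>> mu_preds, std_preds, states = model.filter(train_set)
>>> states = model.smoother()
>>> model.initialize_states_with_smoother_estimates()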
- static load_dict(save_dict: dict)[source]#
Reconstruct a model instance from a saved dictionary.
- Parameters:
save_dict (dict) – Dictionary containing saved model structure and parameters.
- Returns:
An instance of Model generated from the input dictionary.
- Return type:
Model
Examples
>>> saved_dict = model.get_dict()
>>> loaded_model = Model.load_dict(saved_dict)
- lstm_train(train_data: Dict[str, ndarray], validation_data: Dict[str, ndarray], white_noise_decay: bool | None = True, white_noise_max_std: float | None = 5, white_noise_decay_factor: float | None = 0.9) Tuple[ndarray, ndarray, StatesHistory][source]#
Train the LstmNetwork component on the provided training set, then evaluate on the validation set. Optionally apply an exponential decay to the white noise standard deviation over epochs.
At the end of this function, use set_memory to set the memory to t=0.
- Parameters:
train_data (Dict[str, np.ndarray]) – Dictionary with keys ‘x’ and ‘y’ for training inputs and targets.
validation_data (Dict[str, np.ndarray]) – Dictionary with keys ‘x’ and ‘y’ for validation inputs and targets.
white_noise_decay (bool, optional) – If True, apply an exponential decay on the white noise standard deviation over epochs, if a white noise component exists. Defaults to True.
white_noise_max_std (float, optional) – Upper bound on the white-noise standard deviation when decaying. Defaults to 5.
white_noise_decay_factor (float, optional) – Multiplicative decay factor applied to the white‐noise standard deviation each epoch. Defaults to 0.9.
- Returns:
A tuple containing:
- mu_validation_preds (np.ndarray):
The means for multi-step-ahead predictions for the validation set.
- std_validation_preds (np.ndarray):
The standard deviations for multi-step-ahead predictions for the validation set.
- states (StatesHistory):
The history of hidden states over time.
- Return type:
Tuple[np.ndarray, np.ndarray, StatesHistory]
Examples
>>> mu_preds_val, std_preds_val, states = model.lstm_train(train_data=train_set, validation_data=val_set)
- pretraining_filter(train_data: dict)[source]#
Run exactly lstm_look_back_len dummy steps through the LSTM so that lstm_output_history is filled with lstm_look_back_len predictions. Assumes self.lstm_net and self.lstm_output_history already exist.
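Examples
A minimal sketch, assuming the model contains an LstmNetwork component:
>>> model.pretraining_filter(train_set)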
- rts_smoother(time_step: int, matrix_inversion_tol: float | None = 1e-12, tol_type: str | None = 'relative')[source]#
Apply the RTS smoothing equations for a specific time step. As a result of this function, the smoothed estimates for the hidden states at that time step are updated in states.
This function is used at the one-time-step level. It calls rts_smoother() from common.
- Parameters:
time_step (int) – Target smoothing index.
matrix_inversion_tol (Optional[float]) – Numerical stability threshold for matrix pseudoinversion (pinv). Defaults to 1E-12.
tol_type (Optional[str]) – Tolerance type, “relative” or “absolute”. Defaults to “relative”.
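Examples
A minimal sketch of smoothing a single time step after filtering; the index is illustrative:
>>> mu_preds, std_preds, states = model.filter(train_set)
>>> model.rts_smoother(time_step=100)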
- save_states_history()[source]#
Save the current prior and posterior hidden states, as well as the cross-covariances between hidden states at two consecutive time steps, for later use in the Kalman smoother.
- set_memory(time_step: int | None = None, memory: dict | None = None)[source]#
Set the memory, which includes mu_states, var_states, lstm_output_history, and the lstm_states in lstm_net, with smoothed estimates at a specific time step. This prepares for the next analysis by ensuring the continuity of these variables, e.g., if the next analysis starts from time step t, the memory should be set to time step t-1.
- Parameters:
time_step (Optional[int]) – Time step to set the memory.
memory (Optional[dict]) – Memory to be set.
Examples
>>> # If the next analysis starts from the beginning of the time series
>>> model.set_memory(time_step=0)
>>> # If the next analysis starts from t = 200
>>> model.set_memory(time_step=199)
- set_states(new_mu_states: ndarray, new_var_states: ndarray)[source]#
Set new values for the states, i.e., mu_states and var_states.
- Parameters:
new_mu_states (np.ndarray) – Mean values to be set.
new_var_states (np.ndarray) – Covariance matrix to be set.
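Examples
A minimal sketch with illustrative values:
>>> import numpy as np
>>> new_mu = np.zeros(model.num_states)
>>> new_var = 0.1 * np.eye(model.num_states)
>>> model.set_states(new_mu, new_var)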
- smoother(matrix_inversion_tol: float | None = 1e-12, tol_type: str | None = 'relative') StatesHistory[source]#
Run the Kalman smoother over an entire time series, i.e., repeatedly apply the RTS smoothing equations over multiple time steps.
This function is used at the entire-dataset level. It repeatedly calls rts_smoother() at the one-time-step level from Model.
- Parameters:
matrix_inversion_tol (float) – Numerical stability threshold for matrix pseudoinversion (pinv). Defaults to 1E-12.
tol_type (Optional[str]) – Tolerance type, “relative” or “absolute”. Defaults to “relative”.
- Returns:
states: The history of hidden states over time.
- Return type:
StatesHistory
Examples
>>> mu_preds_train, std_preds_train, states = model.filter(train_set)
>>> states = model.smoother()
- update_lstm_output_history(mu_states: ndarray, var_states: ndarray)[source]#
Update the rolling history of LSTM output means and variances with mu_states and var_states.
- Parameters:
mu_states (np.ndarray) – Mean values to be added to the LSTM output history.
var_states (np.ndarray) – Variance values to be added to the LSTM output history.
- update_lstm_param(delta_mu_states: ndarray, delta_var_states: ndarray)[source]#
Obtain the posteriors for the LSTM neural network’s parameters in lstm_net by adding deltas to their priors.
- Parameters:
delta_mu_states (np.ndarray) – Delta mean for states.
delta_var_states (np.ndarray) – Delta variance for states.
- update_lstm_states_history(index: int, last_step: int)[source]#
Store LSTM states at specific time steps. Currently, states are only saved at the first and last time steps of the train, validation, and test sets; for other time steps, None is saved.
- Parameters:
index (int) – Time step at which to save the LSTM states.
last_step (int) – Index of the last time step.
- white_noise_decay(epoch: int, white_noise_max_std: float, white_noise_decay_factor: float)[source]#
Apply exponential decay to the white noise standard deviation over epochs, and modify the variance for the white noise component in process_noise_matrix. This decaying noise structure is intended to improve the training performance of TAGI-LSTM.
- Parameters:
epoch (int) – Current training epoch.
white_noise_max_std (float) – Maximum allowed noise std.
white_noise_decay_factor (float) – Factor controlling decay rate.
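Examples
A minimal sketch inside a training loop; the argument values are illustrative:
>>> model.white_noise_decay(epoch=10, white_noise_max_std=5, white_noise_decay_factor=0.9)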