From simple baselines to the first autoregressive model

The persistence model

Before building a sophisticated model, it is essential to define a simple baseline. The most natural one is to assume that tomorrow's consumption will be identical to today's. This model, called the persistence model, requires no learning.

Concretely, if we denote the consumption on day t as y(t), the prediction is simply:

y_pred(t) = y(t-1)

This model may seem naive, but it is actually difficult to beat in some contexts, particularly when series are strongly autocorrelated (for instance, the water level of certain large rivers).
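As a minimal sketch of the idea (not the contents of scripts/persistence.py, which are not shown here), persistence amounts to shifting the series by one day. The function name persistence_forecast and the toy values are assumptions made for illustration.

```python
def persistence_forecast(series):
    """Return (predictions, actuals) where each prediction is the previous value."""
    # y_pred(t) = y(t-1), so the first observation has no prediction
    predictions = list(series[:-1])
    actuals = list(series[1:])
    return predictions, actuals


# Toy daily consumption values in MWh (made up for illustration)
consumption = [1200.0, 1180.0, 1250.0, 1300.0]
preds, actuals = persistence_forecast(consumption)
print(preds)    # [1200.0, 1180.0, 1250.0]
print(actuals)  # [1180.0, 1250.0, 1300.0]
```

Note that a series of n values yields only n-1 predictions, since the first day has no predecessor.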

Script implementing the persistence model: scripts/persistence.py

Script output:

=== PERSISTENCE BASELINE ===
Persistence MAE: 56,428.78 MWh
Number of observations: 1,095
Period: 2022-01-01 → 2024-12-31
Persistence MAPE: 4.86%

MAPE (Mean Absolute Percentage Error) measures the average relative error of a forecasting model. For each data point, the absolute error is divided by the actual value; these ratios are then averaged and expressed as a percentage. The result is an intuitive, scale-independent metric (e.g., "on average, our forecasts deviate by 4.1% from the actual values").
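The two metrics reported in the script output can be sketched in a few lines of plain Python. The function names mae and mape and the toy values are hypothetical; this version also assumes that no actual value is zero, since MAPE is undefined in that case.

```python
def mae(actuals, predictions):
    """Mean Absolute Error, in the same unit as the data."""
    errors = [abs(a - p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)


def mape(actuals, predictions):
    """Mean Absolute Percentage Error, in percent (assumes no actual value is zero)."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actuals, predictions)]
    return 100.0 * sum(ratios) / len(ratios)


# Toy values (made up for illustration)
actuals = [100.0, 200.0, 400.0]
predictions = [110.0, 190.0, 400.0]
print(mae(actuals, predictions))   # (10 + 10 + 0) / 3, about 6.67
print(mape(actuals, predictions))  # (10% + 5% + 0%) / 3, about 5.0
```

The same absolute error of 10 counts for 10% on the first point and only 5% on the second, which is why MAPE is easier to compare across series of different magnitudes than MAE.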

Applied to our data, this model yields a Mean Absolute Error (MAE) of about 56,400 MWh over the test period (2022–2024), as reported in the script output. This figure is high, but it provides an essential reference point: any future improvement will have to be measured against this baseline.

Persistence therefore gives us a useful benchmark, but it exploits very little of the available information. Let us now see what happens when we allow the model to look further into the past.

Leveraging history with autoregression

The first natural improvement is to stop restricting ourselves to the previous day (D-1) and to exploit a longer history instead. The idea is that the consumption of a given day depends not only on the previous day but also on the days before that.

We thus build an autoregressive (AR) model, where the consumption of day t+1 is predicted from the last N days:

y(t+1) = f(y(t), y(t-1), ..., y(t-N+1))

In its simplest form, we choose a linear regression to model this relationship.

In practice, we create a training matrix by sliding a window of size N along the series. Each window becomes an input observation, and the following value becomes the target.

Here is a simplified implementation:

import numpy as np

def create_ar_dataset(series, window):
    X, y = [], []
    for i in range(len(series) - window):
        # from i to i+window-1: the "window" past values
        X.append(series[i:i+window])
        # the value just after the window
        y.append(series[i+window])
    return np.array(X), np.array(y)

• len(series) - window: number of windows we can extract (we stop before the end to have a target).
• series[i:i+window]: sublist of window values starting at index i.
• series[i+window]: the element just after this window.

Finally, we convert the lists into NumPy arrays suitable for machine learning models.
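To illustrate how the resulting matrix feeds a linear regression, here is a hedged sketch using an ordinary least-squares fit. It relies on np.linalg.lstsq rather than on whatever scripts/AR.py actually uses, which the text does not specify. The helper fit_linear_ar and the synthetic weekly series are assumptions made for the example.

```python
import numpy as np


def create_ar_dataset(series, window):
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X), np.array(y)


def fit_linear_ar(series, window):
    """Least-squares fit of: next value = intercept + weights . last `window` values."""
    X, y = create_ar_dataset(series, window)
    # Add a column of ones so the model can learn an intercept
    X_design = np.column_stack([np.ones(len(X)), X])
    coeffs, *_ = np.linalg.lstsq(X_design, y, rcond=None)
    return coeffs[0], coeffs[1:]


# Synthetic noisy series with a weekly pattern (made up for illustration)
rng = np.random.default_rng(0)
days = np.arange(200)
series = 1000.0 + 100.0 * np.sin(2 * np.pi * days / 7) + rng.normal(0.0, 5.0, size=200)

X, y = create_ar_dataset(series, window=7)
print(X.shape, y.shape)  # (193, 7) (193,)

intercept, weights = fit_linear_ar(series, window=7)
# Forecast for the day following the end of the series
next_day = intercept + weights @ series[-7:]
print(next_day)
```

With 200 observations and a window of 7, we obtain 193 training rows: one per position where a full window and its target both fit inside the series.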

We then test different window sizes (7, 14, 30, 60, 90 and 120 days).

On the test set, the model predicts one day ahead. Once the actual consumption of that day becomes available, it is incorporated into the input window before forecasting the following day. This rolling procedure avoids using any future information while remaining consistent with real-world forecasting conditions.
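The rolling procedure can be sketched as follows, assuming a linear model described by an intercept and a weight vector ordered from the oldest to the most recent day. The function rolling_one_step_forecast and the hand-picked coefficients are hypothetical and serve only to make the mechanics visible.

```python
import numpy as np


def rolling_one_step_forecast(series, n_test, window, intercept, weights):
    """Predict each of the last n_test values from the `window` actual values before it."""
    series = np.asarray(series, dtype=float)
    start = len(series) - n_test  # must be >= window so the first window is complete
    predictions = []
    for t in range(start, len(series)):
        # Only values strictly before day t are used: no future information
        history = series[t - window:t]
        predictions.append(intercept + weights @ history)
    return np.array(predictions)


# Illustrative check with hand-picked coefficients (not a fitted model):
# putting all the weight on the most recent day reproduces persistence.
series = [10.0, 12.0, 11.0, 13.0, 14.0, 12.0]
weights = np.array([0.0, 0.0, 1.0])
preds = rolling_one_step_forecast(series, n_test=2, window=3, intercept=0.0, weights=weights)
print(preds)  # [13. 14.]
```

This example also shows that persistence is a special case of the linear AR model, which is why a fitted AR model should in principle do at least as well on the training data.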

Performance improves sharply between 7 and 30 days, then gains become marginal beyond that. A 30-day window appears to be a reasonable initial compromise.

With this approach, the MAE drops below 30,000 MWh, cutting the error of the persistence model nearly in half.

Script implementing the autoregressive model: scripts/AR.py

The improvement is spectacular, but it also reveals a structural limitation: the model learns only from past values, without any knowledge of the nature of the day being forecast.