Example for monthly data

This is a basic example of forecasting monthly data with Silverkite. Note that we fit a few simple models here; the goal is to illustrate the workflow rather than to fully optimize the results.

 import warnings
 from collections import defaultdict

 import plotly
 import pandas as pd

 from greykite.framework.benchmark.data_loader_ts import DataLoaderTS
 from greykite.framework.templates.autogen.forecast_config import EvaluationPeriodParam
 from greykite.framework.templates.autogen.forecast_config import ForecastConfig
 from greykite.framework.templates.autogen.forecast_config import MetadataParam
 from greykite.framework.templates.autogen.forecast_config import ModelComponentsParam
 from greykite.framework.templates.forecaster import Forecaster
 from greykite.framework.utils.result_summary import summarize_grid_search_results
 from greykite.framework.input.univariate_time_series import UnivariateTimeSeries

 warnings.filterwarnings("ignore")

Load the dataset into a UnivariateTimeSeries.

 dl = DataLoaderTS()
 agg_func = {"count": "sum"}
 df = dl.load_bikesharing(agg_freq="monthly", agg_func=agg_func)
 # The last month in this monthly data is incomplete, so we drop it
 df.drop(df.tail(1).index, inplace=True)
 df = df.reset_index(drop=True)
 ts = UnivariateTimeSeries()
 ts.load_data(
     df=df,
     time_col="ts",
     value_col="count",
     freq="MS")

Out:

<greykite.framework.input.univariate_time_series.UnivariateTimeSeries object at 0x1b07d4c50>

Exploratory data analysis (EDA)

After reading in a time series, we can first do some exploratory data analysis. The UnivariateTimeSeries class is used to store a time series and perform EDA.

A quick description of the data can be obtained as follows.

 print(ts.describe_time_col())
 print(ts.describe_value_col())
 print(df.head())

Out:

{'data_points': 108, 'mean_increment_secs': 2629143.925233645, 'min_timestamp': Timestamp('2010-09-01 00:00:00'), 'max_timestamp': Timestamp('2019-08-01 00:00:00')}
count       108.000000
mean     231254.101852
std      106017.804606
min        4001.000000
25%      144661.750000
50%      227332.000000
75%      327851.250000
max      404811.000000
Name: y, dtype: float64
          ts  count
0 2010-09-01   4001
1 2010-10-01  35949
2 2010-11-01  47391
3 2010-12-01  28253
4 2011-01-01  37499

Let’s plot the original time series. (The interactive plot is generated by plotly: click to zoom!)

 fig = ts.plot()
 plotly.io.show(fig)

Exploratory plots can reveal the time series’s properties. A monthly overlay plot can be used to inspect annual patterns: it overlays the different years on top of each other.

 fig = ts.plot_quantiles_and_overlays(
      groupby_time_feature="month",
      show_mean=False,
      show_quantiles=False,
      show_overlays=True,
      overlay_label_time_feature="year",
      overlay_style={"line": {"width": 1}, "opacity": 0.5},
      center_values=False,
      xlabel="month of year",
      ylabel=ts.original_value_col,
      title="yearly seasonality for each year (centered)",)
 plotly.io.show(fig)
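
The same month-by-year pattern can also be inspected numerically. The snippet below is a generic pandas sketch (not part of the original example) that pivots the monthly counts into a month-by-year table, mirroring the overlay plot.

 # Generic pandas sketch (illustrative only): month-by-year table of the counts.
 eda_df = df.copy()
 eda_df["ts"] = pd.to_datetime(eda_df["ts"])
 eda_df["year"] = eda_df["ts"].dt.year
 eda_df["month"] = eda_df["ts"].dt.month
 pivot = eda_df.pivot_table(index="month", columns="year", values="count", aggfunc="sum")
 print(pivot)  # rows: month of year, columns: year -- same information as the overlay plot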

Specify common metadata.

 forecast_horizon = 4
 time_col = "ts"
 value_col = "count"
 meta_data_params = MetadataParam(
     time_col=time_col,
     value_col=value_col,
     freq="MS",
 )

Specify common evaluation parameters. Set minimum input data for training.

 cv_min_train_periods = 24
 # Let CV use most recent splits for cross-validation.
 cv_use_most_recent_splits = True
 # Determine the maximum number of validations.
 cv_max_splits = 5
 evaluation_period_param = EvaluationPeriodParam(
     test_horizon=forecast_horizon,
     cv_horizon=forecast_horizon,
     periods_between_train_test=0,
     cv_min_train_periods=cv_min_train_periods,
     cv_expanding_window=True,
     cv_use_most_recent_splits=cv_use_most_recent_splits,
     cv_periods_between_splits=None,
     cv_periods_between_train_test=0,
     cv_max_splits=cv_max_splits,
 )
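
For intuition about what these settings imply, the following standalone sketch (not Greykite's internal implementation) prints the expanding-window splits: each split trains on all data up to a cutoff and tests on the next forecast_horizon points, and only the most recent cv_max_splits splits are kept.

 # Standalone illustration of the expanding-window CV splits (not Greykite's internal code).
 # Assumes 104 training points remain after the 4-point test horizon is held out.
 n_train = 104
 splits = []
 cutoff = n_train - forecast_horizon
 while cutoff >= cv_min_train_periods and len(splits) < cv_max_splits:
     splits.append((cutoff, cutoff + forecast_horizon - 1))
     cutoff -= forecast_horizon
 for i, (start, end) in enumerate(reversed(splits)):
     print(f"split {i}: train indices [0, {start - 1}], test indices [{start}, {end}]")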

Fit a simple model without autoregression. The important modeling parameters for monthly data are as follows; these are plugged into ModelComponentsParam. The extra_pred_cols argument is used to specify growth and annual seasonality. Growth is modelled with both "ct_sqrt" and "ct1" for extra flexibility, since we have long-term data, and ridge regularization avoids over-fitting the trend. The annual seasonality is modelled categorically with "C(month)" instead of a Fourier series: monthly data has only 12 points per year, so a categorical representation is feasible, whereas for daily data, with many points within each year, it would not be. The categorical representation of month is also more explainable/interpretable in the model summary.
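
Since extra_pred_cols takes patsy-style formula terms, the categorical month term expands into dummy variables with month 1 absorbed into the intercept, which is why the model summary below lists coefficients for months 2 through 12. Here is a standalone patsy sketch (illustrative only, not part of the original example).

 # Standalone patsy sketch (illustrative only): how the categorical month term expands.
 import pandas as pd
 from patsy import dmatrix

 months_df = pd.DataFrame({"month": list(range(1, 13))})
 mat = dmatrix("C(month, levels=list(range(1, 13)))", data=months_df)
 print(mat.design_info.column_names)
 # ['Intercept', 'C(month, levels=list(range(1, 13)))[T.2]', ..., '...[T.12]']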

 extra_pred_cols = ["ct_sqrt", "ct1", "C(month, levels=list(range(1, 13)))"]
 autoregression = None

 # Specify the model parameters
 model_components = ModelComponentsParam(
     growth=dict(growth_term=None),
     seasonality=dict(
         yearly_seasonality=[False],
         quarterly_seasonality=[False],
         monthly_seasonality=[False],
         weekly_seasonality=[False],
         daily_seasonality=[False]
     ),
     custom=dict(
         fit_algorithm_dict=dict(fit_algorithm="ridge"),
         extra_pred_cols=extra_pred_cols
     ),
     regressors=dict(regressor_cols=None),
     autoregression=autoregression,
     uncertainty=dict(uncertainty_dict=None),
     events=dict(holiday_lookup_countries=None),
 )

 # Run the forecast model
 forecaster = Forecaster()
 result = forecaster.run_forecast_config(
     df=df,
     config=ForecastConfig(
         model_template="SILVERKITE",
         coverage=0.95,
         forecast_horizon=forecast_horizon,
         metadata_param=meta_data_params,
         evaluation_period_param=evaluation_period_param,
         model_components_param=model_components
     )
 )

 # Get the useful fields from the forecast result
 model = result.model[-1]
 backtest = result.backtest
 forecast = result.forecast
 grid_search = result.grid_search

 # Check model coefficients / variables
 # Get model summary with p-values
 print(model.summary())

 # Get cross-validation results
 cv_results = summarize_grid_search_results(
     grid_search=grid_search,
     decimals=2,
     cv_report_metrics=None,
     column_order=[
         "rank", "mean_test", "split_test", "mean_train", "split_train",
         "mean_fit_time", "mean_score_time", "params"])
 # Transposes to save space in the printed output
 print(cv_results.transpose())

 # Check historical evaluation metrics (on the historical training/test set).
 backtest_eval = defaultdict(list)
 for metric, value in backtest.train_evaluation.items():
     backtest_eval[metric].append(value)
     backtest_eval[metric].append(backtest.test_evaluation[metric])
 metrics = pd.DataFrame(backtest_eval, index=["train", "test"]).T
 print(metrics)

Out:

Fitting 5 folds for each of 1 candidates, totalling 5 fits
================================ Model Summary =================================

Number of observations: 108,   Number of features: 14
Method: Ridge regression
Number of nonzero features: 14
Regularization parameter: 0.1748

Residuals:
         Min           1Q       Median           3Q          Max
  -7.121e+04   -2.265e+04       -322.6    2.499e+04    6.327e+04

            Pred_col    Estimate   Std. Err Pr(>)_boot sig. code                     95%CI
           Intercept  -1.061e+05  2.287e+04     <2e-16       ***  (-1.470e+05, -5.655e+04)
 C(month,... 13)))_2  -1.271e+04  1.856e+04      0.496             (-5.010e+04, 2.011e+04)
 C(month,... 13)))_3   4.549e+04  1.532e+04      0.004        **    (1.089e+04, 7.071e+04)
 C(month,... 13)))_4   1.148e+05  1.545e+04     <2e-16       ***    (8.001e+04, 1.397e+05)
 C(month,... 13)))_5   1.314e+05  1.608e+04     <2e-16       ***    (9.314e+04, 1.549e+05)
 C(month,... 13)))_6   1.448e+05  1.674e+04     <2e-16       ***    (1.060e+05, 1.723e+05)
 C(month,... 13)))_7   1.525e+05  1.722e+04     <2e-16       ***    (1.154e+05, 1.821e+05)
 C(month,... 13)))_8   1.523e+05  1.713e+04     <2e-16       ***    (1.148e+05, 1.823e+05)
 C(month,... 13)))_9   1.318e+05  1.716e+04     <2e-16       ***    (9.244e+04, 1.599e+05)
 C(month,...13)))_10   1.167e+05  1.676e+04     <2e-16       ***    (7.905e+04, 1.440e+05)
 C(month,...13)))_11   4.323e+04  1.538e+04      0.006        **       (8837.0, 6.864e+04)
 C(month,...13)))_12     -3090.0  1.614e+04      0.858             (-4.027e+04, 2.172e+04)
             ct_sqrt   1.748e+05  2.143e+04     <2e-16       ***    (1.268e+05, 2.088e+05)
                 ct1  -2.117e+04     5697.0     <2e-16       ***     (-3.035e+04, -7805.0)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multiple R-squared: 0.9104,   Adjusted R-squared: 0.8986
F-statistic: 72.821 on 12 and 94 DF,   p-value: 1.110e-16
Model AIC: 2770.5,   model BIC: 2806.7

WARNING: the condition number is large, 4.02e+03. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.

                                                   0
rank_test_MAPE                                     1
mean_test_MAPE                                 19.93
split_test_MAPE    (16.83, 23.32, 4.26, 24.84, 30.4)
mean_train_MAPE                                28.41
split_train_MAPE  (29.02, 30.0, 28.61, 28.08, 26.34)
mean_fit_time                                   3.51
mean_score_time                                 0.27
params                                            []
                                                          train        test
CORR                                                   0.952305     0.99193
R2                                                     0.905847    -9.01706
MSE                                                 1.03449e+09  7.3309e+08
RMSE                                                    32163.5     27075.6
MAE                                                     26827.1     27035.7
MedAE                                                   25350.2     26717.7
MAPE                                                    24.9649     7.69857
MedAPE                                                  9.59447     7.62733
sMAPE                                                   10.7097     3.70635
Q80                                                     13413.5     5407.14
Q95                                                     13413.5     1351.78
Q99                                                     13413.5     270.357
OutsideTolerance1p                                     0.980769           1
OutsideTolerance2p                                     0.865385           1
OutsideTolerance3p                                     0.836538           1
OutsideTolerance4p                                     0.798077           1
OutsideTolerance5p                                     0.778846           1
Outside Tolerance (fraction)                               None        None
R2_null_model_score                                        None        None
Prediction Band Width (%)                               107.402     36.1039
Prediction Band Coverage (fraction)                    0.971154           1
Coverage: Lower Band                                   0.442308           1
Coverage: Upper Band                                   0.528846           0
Coverage Diff: Actual_Coverage - Intended_Coverage    0.0211538        0.05

Fit/backtest plot:

 fig = backtest.plot()
 plotly.io.show(fig)

Forecast plot:

 fig = forecast.plot()
 plotly.io.show(fig)

The components plot:

 fig = forecast.plot_components()
 plotly.io.show(fig)

Fit a simple model with autoregression. This is done by specifying the autoregression parameter in ModelComponentsParam. Note that the auto-regressive structure can be customized further depending on your data; a sketch of such a customization follows, while the example below uses a single lag of order 1.
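
The dictionary below is an illustrative sketch only; the specific lags and aggregation windows are assumptions for demonstration, not tuned values. It follows the lag_dict/agg_lag_dict structure of Silverkite's autoreg_dict and adds aggregated-lag features on top of individual lags.

 # Illustrative sketch only -- the actual example below uses just lag order 1.
 autoregression_custom = {
     "autoreg_dict": {
         # individual lags: the previous one and two months (assumed values)
         "lag_dict": {"orders": [1, 2]},
         # aggregated lags: features averaging groups of past values (assumed values)
         "agg_lag_dict": {
             "orders_list": [[1, 2, 3]],   # average of lags 1, 2, 3
             "interval_list": [(1, 6)],    # average of lags 1 through 6
         },
     }
 }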

 extra_pred_cols = ["ct_sqrt", "ct1", "C(month, levels=list(range(1, 13)))"]
 autoregression = {
     "autoreg_dict": {
         "lag_dict": {"orders": [1]},
         "agg_lag_dict": None
     }
 }

 # Specify the model parameters
 model_components = ModelComponentsParam(
     growth=dict(growth_term=None),
     seasonality=dict(
         yearly_seasonality=[False],
         quarterly_seasonality=[False],
         monthly_seasonality=[False],
         weekly_seasonality=[False],
         daily_seasonality=[False]
     ),
     custom=dict(
         fit_algorithm_dict=dict(fit_algorithm="ridge"),
         extra_pred_cols=extra_pred_cols
     ),
     regressors=dict(regressor_cols=None),
     autoregression=autoregression,
     uncertainty=dict(uncertainty_dict=None),
     events=dict(holiday_lookup_countries=None),
 )

 # Run the forecast model
 forecaster = Forecaster()
 result = forecaster.run_forecast_config(
     df=df,
     config=ForecastConfig(
         model_template="SILVERKITE",
         coverage=0.95,
         forecast_horizon=forecast_horizon,
         metadata_param=meta_data_params,
         evaluation_period_param=evaluation_period_param,
         model_components_param=model_components
     )
 )

 # Get the useful fields from the forecast result
 model = result.model[-1]
 backtest = result.backtest
 forecast = result.forecast
 grid_search = result.grid_search

 # Check model coefficients / variables
 # Get model summary with p-values
 print(model.summary())

 # Get cross-validation results
 cv_results = summarize_grid_search_results(
     grid_search=grid_search,
     decimals=2,
     cv_report_metrics=None,
     column_order=[
         "rank", "mean_test", "split_test", "mean_train", "split_train",
         "mean_fit_time", "mean_score_time", "params"])
 # Transposes to save space in the printed output
 print(cv_results.transpose())

 # Check historical evaluation metrics (on the historical training/test set).
 backtest_eval = defaultdict(list)
 for metric, value in backtest.train_evaluation.items():
     backtest_eval[metric].append(value)
     backtest_eval[metric].append(backtest.test_evaluation[metric])
 metrics = pd.DataFrame(backtest_eval, index=["train", "test"]).T
 print(metrics)

Out:

Fitting 5 folds for each of 1 candidates, totalling 5 fits
================================ Model Summary =================================

Number of observations: 108,   Number of features: 15
Method: Ridge regression
Number of nonzero features: 15
Regularization parameter: 1.789

Residuals:
         Min           1Q       Median           3Q          Max
  -7.679e+04   -1.328e+04        484.7    1.489e+04    8.141e+04

            Pred_col    Estimate   Std. Err Pr(>)_boot sig. code                     95%CI
           Intercept   1.126e+04     8251.0      0.164                (-4075.0, 2.896e+04)
 C(month,... 13)))_2     -3806.0     7278.0      0.566                (-1.786e+04, 9454.0)
 C(month,... 13)))_3   4.103e+04     9708.0     <2e-16       ***    (2.173e+04, 5.846e+04)
 C(month,... 13)))_4   6.078e+04  1.298e+04     <2e-16       ***    (3.265e+04, 8.378e+04)
 C(month,... 13)))_5   2.803e+04     7113.0     <2e-16       ***    (1.404e+04, 4.028e+04)
 C(month,... 13)))_6   2.831e+04     9133.0      0.002        **    (1.050e+04, 4.435e+04)
 C(month,... 13)))_7   2.573e+04     6008.0     <2e-16       ***    (1.418e+04, 3.659e+04)
 C(month,... 13)))_8   2.047e+04     5086.0     <2e-16       ***       (9764.0, 2.985e+04)
 C(month,... 13)))_9      2659.0     6845.0      0.680             (-1.294e+04, 1.457e+04)
 C(month,...13)))_10      6175.0     5267.0      0.224                (-5288.0, 1.533e+04)
 C(month,...13)))_11  -4.795e+04  1.115e+04     <2e-16       ***  (-6.957e+04, -2.654e+04)
 C(month,...13)))_12  -3.795e+04     7610.0     <2e-16       ***  (-5.289e+04, -2.326e+04)
             ct_sqrt   1.162e+04     9479.0      0.222                (-8136.0, 2.886e+04)
                 ct1      1226.0     2609.0      0.604                   (-4019.0, 6408.0)
              y_lag1      0.7948    0.04584     <2e-16       ***          (0.7042, 0.8868)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multiple R-squared: 0.9297,   Adjusted R-squared: 0.9219
F-statistic: 114.62 on 10 and 96 DF,   p-value: 1.110e-16
Model AIC: 2740.7,   model BIC: 2772.1

WARNING: the condition number is large, 3.26e+12. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.

                                                   0
rank_test_MAPE                                     1
mean_test_MAPE                                 20.74
split_test_MAPE   (24.08, 11.08, 6.87, 40.51, 21.13)
mean_train_MAPE                                17.03
split_train_MAPE  (17.8, 17.99, 17.19, 16.64, 15.55)
mean_fit_time                                   3.23
mean_score_time                                 1.93
params                                            []
                                                          train        test
CORR                                                    0.96174    0.950079
R2                                                      0.92443     -3.5984
MSE                                                 8.30317e+08  3.3653e+08
RMSE                                                    28815.2     18344.7
MAE                                                     21411.4     18101.4
MedAE                                                   15294.8       19253
MAPE                                                     15.177     5.17313
MedAPE                                                   7.1528     5.44632
sMAPE                                                   6.84121     2.51932
Q80                                                     10705.7     3620.29
Q95                                                     10705.7     905.072
Q99                                                     10705.7     181.014
OutsideTolerance1p                                     0.894231           1
OutsideTolerance2p                                     0.826923           1
OutsideTolerance3p                                     0.778846           1
OutsideTolerance4p                                     0.721154        0.75
OutsideTolerance5p                                        0.625        0.75
Outside Tolerance (fraction)                               None        None
R2_null_model_score                                        None        None
Prediction Band Width (%)                               96.2213      35.563
Prediction Band Coverage (fraction)                    0.932692           1
Coverage: Lower Band                                   0.461538           1
Coverage: Upper Band                                   0.471154           0
Coverage Diff: Actual_Coverage - Intended_Coverage   -0.0173077        0.05

Fit/backtest plot:

 fig = backtest.plot()
 plotly.io.show(fig)

Forecast plot:

 fig = forecast.plot()
 plotly.io.show(fig)

The components plot:

 fig = forecast.plot_components()
 plotly.io.show(fig)

Fit a model with time-varying seasonality (month effect). This is achieved by adding the interaction term "ct1*C(month)" to extra_pred_cols in ModelComponentsParam. Note that this feature may or may not be useful in your use case; we have included it for demonstration purposes only. In this example, while the fit has improved, the backtest is inferior to the previous setting.
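
The interaction gives each month its own slope on the linear time term ct1, which is what makes the month effect time-varying. Below is a standalone patsy sketch (illustrative only) of the columns that "ct1*C(month, ...)" adds on top of the month dummies.

 # Standalone patsy sketch (illustrative only): the "*" operator adds the main effects
 # plus one ct1-by-month interaction column per non-reference month (11 extra slopes).
 import pandas as pd
 from patsy import dmatrix

 toy = pd.DataFrame({"ct1": [0.0, 0.25, 0.5, 0.75], "month": [1, 4, 7, 10]})
 mat = dmatrix("ct1 * C(month, levels=list(range(1, 13)))", data=toy)
 print(len(mat.design_info.column_names))  # 24 = intercept + 11 dummies + ct1 + 11 interactions
 print([c for c in mat.design_info.column_names if "ct1" in c and "month" in c][:2])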

 extra_pred_cols = ["ct_sqrt", "ct1", "C(month, levels=list(range(1, 13)))",
                    "ct1*C(month, levels=list(range(1, 13)))"]
 autoregression = {
     "autoreg_dict": {
         "lag_dict": {"orders": [1]},
         "agg_lag_dict": None
     }
 }

 # Specify the model parameters
 model_components = ModelComponentsParam(
     growth=dict(growth_term=None),
     seasonality=dict(
         yearly_seasonality=[False],
         quarterly_seasonality=[False],
         monthly_seasonality=[False],
         weekly_seasonality=[False],
         daily_seasonality=[False]
     ),
     custom=dict(
         fit_algorithm_dict=dict(fit_algorithm="ridge"),
         extra_pred_cols=extra_pred_cols
     ),
     regressors=dict(regressor_cols=None),
     autoregression=autoregression,
     uncertainty=dict(uncertainty_dict=None),
     events=dict(holiday_lookup_countries=None),
 )

 # Run the forecast model
 forecaster = Forecaster()
 result = forecaster.run_forecast_config(
     df=df,
     config=ForecastConfig(
         model_template="SILVERKITE",
         coverage=0.95,
         forecast_horizon=forecast_horizon,
         metadata_param=meta_data_params,
         evaluation_period_param=evaluation_period_param,
         model_components_param=model_components
     )
 )

 # Get the useful fields from the forecast result
 model = result.model[-1]
 backtest = result.backtest
 forecast = result.forecast
 grid_search = result.grid_search

 # Check model coefficients / variables
 # Get model summary with p-values
 print(model.summary())

 # Get cross-validation results
 cv_results = summarize_grid_search_results(
     grid_search=grid_search,
     decimals=2,
     cv_report_metrics=None,
     column_order=[
         "rank", "mean_test", "split_test", "mean_train", "split_train",
         "mean_fit_time", "mean_score_time", "params"])
 # Transposes to save space in the printed output
 print(cv_results.transpose())

 # Check historical evaluation metrics (on the historical training/test set).
 backtest_eval = defaultdict(list)
 for metric, value in backtest.train_evaluation.items():
     backtest_eval[metric].append(value)
     backtest_eval[metric].append(backtest.test_evaluation[metric])
 metrics = pd.DataFrame(backtest_eval, index=["train", "test"]).T
 print(metrics)

Out:

Fitting 5 folds for each of 1 candidates, totalling 5 fits
================================ Model Summary =================================

Number of observations: 108,   Number of features: 26
Method: Ridge regression
Number of nonzero features: 26
Regularization parameter: 29.15

Residuals:
         Min           1Q       Median           3Q          Max
  -6.180e+04   -1.322e+04      -3030.0    1.151e+04    7.759e+04

            Pred_col    Estimate Std. Err Pr(>)_boot sig. code                   95%CI
           Intercept   2.042e+04   5108.0     <2e-16       ***  (1.089e+04, 3.136e+04)
 C(month,... 13)))_2      -812.5    637.9      0.190                  (-2171.0, 263.9)
 C(month,... 13)))_3      3127.0   1000.0      0.006        **        (1228.0, 4998.0)
 C(month,... 13)))_4      3959.0   1258.0      0.002        **        (1143.0, 6190.0)
 C(month,... 13)))_5      2999.0    945.6      0.002        **        (1115.0, 4774.0)
 C(month,... 13)))_6       832.2    593.5      0.172                  (-340.2, 2055.0)
 C(month,... 13)))_7      1253.0    604.7      0.038         *        (-19.94, 2270.0)
 C(month,... 13)))_8       925.9    576.6      0.114                  (-199.6, 2067.0)
 C(month,... 13)))_9       -19.3    779.5      0.982                 (-1641.0, 1420.0)
 C(month,...13)))_10      -782.2    640.9      0.232                  (-1925.0, 638.7)
 C(month,...13)))_11     -2595.0    601.4     <2e-16       ***      (-3768.0, -1386.0)
 C(month,...13)))_12     -3451.0    968.8      0.002        **      (-5306.0, -1478.0)
             ct_sqrt      1832.0    938.3      0.052         .        (-108.7, 3610.0)
                 ct1      -234.7   1302.0      0.834                 (-2448.0, 2551.0)
 ct1:C(mo... 13)))_2      1611.0   1700.0      0.328                 (-1900.0, 4798.0)
 ct1:C(mo... 13)))_3      9752.0   2252.0     <2e-16       ***     (5250.0, 1.445e+04)
 ct1:C(mo... 13)))_4   1.313e+04   2323.0     <2e-16       ***     (8620.0, 1.754e+04)
 ct1:C(mo... 13)))_5      4360.0   1712.0      0.010         *        (1334.0, 7937.0)
 ct1:C(mo... 13)))_6      6085.0   1999.0      0.006        **        (1896.0, 9525.0)
 ct1:C(mo... 13)))_7      4815.0   1196.0      0.002        **        (2685.0, 7548.0)
 ct1:C(mo... 13)))_8      3663.0    921.1     <2e-16       ***        (1661.0, 5497.0)
 ct1:C(mo... 13)))_9      -289.7   1995.0      0.874                 (-4080.0, 3232.0)
 ct1:C(mo...13)))_10      1967.0   1447.0      0.148                 (-1511.0, 3938.0)
 ct1:C(mo...13)))_11  -1.156e+04   1506.0     <2e-16       ***   (-1.397e+04, -7563.0)
 ct1:C(mo...13)))_12     -6430.0   1650.0     <2e-16       ***      (-9555.0, -3487.0)
              y_lag1      0.8629  0.04046     <2e-16       ***        (0.7897, 0.9453)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multiple R-squared: 0.9441,   Adjusted R-squared: 0.9375
F-statistic: 139.81 on 11 and 95 DF,   p-value: 1.110e-16
Model AIC: 2717.0,   model BIC: 2749.8

WARNING: the condition number is large, 2.34e+11. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.

                                                 0
rank_test_MAPE                                   1
mean_test_MAPE                               11.34
split_test_MAPE    (6.0, 15.75, 6.39, 21.63, 6.94)
mean_train_MAPE                              12.05
split_train_MAPE  (16.46, 15.58, 9.15, 8.5, 10.53)
mean_fit_time                                  3.2
mean_score_time                                1.9
params                                          []
                                                         train         test
CORR                                                  0.982006     0.989264
R2                                                     0.96433     -59.3175
MSE                                                 3.9192e+08  4.41428e+09
RMSE                                                     19797      66440.1
MAE                                                    15499.4      62905.1
MedAE                                                  13399.1        72193
MAPE                                                   9.98505      17.7785
MedAPE                                                  6.6598      20.4163
sMAPE                                                  4.66599      8.09662
Q80                                                    7749.68        12581
Q95                                                    7749.68      3145.26
Q99                                                    7749.68      629.051
OutsideTolerance1p                                    0.923077            1
OutsideTolerance2p                                    0.826923            1
OutsideTolerance3p                                    0.721154            1
OutsideTolerance4p                                    0.634615            1
OutsideTolerance5p                                    0.557692            1
Outside Tolerance (fraction)                              None         None
R2_null_model_score                                       None         None
Prediction Band Width (%)                              66.1071      18.5954
Prediction Band Coverage (fraction)                   0.932692         0.25
Coverage: Lower Band                                  0.471154         0.25
Coverage: Upper Band                                  0.461538            0
Coverage Diff: Actual_Coverage - Intended_Coverage  -0.0173077         -0.7

Fit/backtest plot:

 fig = backtest.plot()
 plotly.io.show(fig)

Forecast plot:

 fig = forecast.plot()
 plotly.io.show(fig)

The components plot:

 fig = forecast.plot_components()
 plotly.io.show(fig)

Total running time of the script: (1 minutes 42.667 seconds)
