Example for monthly data

This is a basic example for monthly data using Silverkite. Note that we only fit a few simple models here; the goal is not to fully optimize the results.

 import warnings
 from collections import defaultdict

 import plotly
 import pandas as pd

 from greykite.framework.benchmark.data_loader_ts import DataLoaderTS
 from greykite.framework.templates.autogen.forecast_config import EvaluationPeriodParam
 from greykite.framework.templates.autogen.forecast_config import ForecastConfig
 from greykite.framework.templates.autogen.forecast_config import MetadataParam
 from greykite.framework.templates.autogen.forecast_config import ModelComponentsParam
 from greykite.framework.templates.forecaster import Forecaster
 from greykite.framework.templates.model_templates import ModelTemplateEnum
 from greykite.framework.utils.result_summary import summarize_grid_search_results
 from greykite.framework.input.univariate_time_series import UnivariateTimeSeries

 warnings.filterwarnings("ignore")

Loads the dataset into a UnivariateTimeSeries.

 dl = DataLoaderTS()
 agg_func = {"count": "sum"}
 df = dl.load_bikesharing(agg_freq="monthly", agg_func=agg_func)
 # The last month of this monthly dataset is incomplete, therefore we drop it
 df.drop(df.tail(1).index, inplace=True)
 df = df.reset_index(drop=True)
 ts = UnivariateTimeSeries()
 ts.load_data(
     df=df,
     time_col="ts",
     value_col="count",
     freq="MS")

Out:

<greykite.framework.input.univariate_time_series.UnivariateTimeSeries object at 0x19ed3f3d0>

Exploratory data analysis (EDA)

After reading in a time series, we can first do some exploratory data analysis. The UnivariateTimeSeries class is used to store a time series and perform EDA.

A quick description of the data can be obtained as follows.

 print(ts.describe_time_col())
 print(ts.describe_value_col())
 print(df.head())

Out:

{'data_points': 108, 'mean_increment_secs': 2629143.925233645, 'min_timestamp': Timestamp('2010-09-01 00:00:00'), 'max_timestamp': Timestamp('2019-08-01 00:00:00')}
count       108.000000
mean     231254.101852
std      106017.804606
min        4001.000000
25%      144661.750000
50%      227332.000000
75%      327851.250000
max      404811.000000
Name: y, dtype: float64
          ts  count
0 2010-09-01   4001
1 2010-10-01  35949
2 2010-11-01  47391
3 2010-12-01  28253
4 2011-01-01  37499

Let’s plot the original time series. (The interactive plot is generated by plotly: click to zoom!)

 fig = ts.plot()
 plotly.io.show(fig)

Exploratory plots can reveal the properties of the time series. A monthly overlay plot can be used to inspect the annual patterns; it overlays the various years on top of each other.

 fig = ts.plot_quantiles_and_overlays(
      groupby_time_feature="month",
      show_mean=False,
      show_quantiles=False,
      show_overlays=True,
      overlay_label_time_feature="year",
      overlay_style={"line": {"width": 1}, "opacity": 0.5},
      center_values=False,
      xlabel="month of year",
      ylabel=ts.original_value_col,
      title="yearly seasonality for each year (centered)",)
 plotly.io.show(fig)
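
For comparison, the same plot_quantiles_and_overlays call can summarize the seasonal shape instead of emphasizing individual years. The following sketch is a variation on the call above, using only parameters already shown there; it draws the across-year mean and centers the values so the seasonal effect is easier to read.

 # Sketch: summarize the seasonal pattern rather than each individual year.
 fig = ts.plot_quantiles_and_overlays(
      groupby_time_feature="month",
      show_mean=True,        # add the mean across years for each month
      show_quantiles=False,
      show_overlays=True,
      overlay_label_time_feature="year",
      overlay_style={"line": {"width": 1}, "opacity": 0.5},
      center_values=True,    # center values to remove the overall level
      xlabel="month of year",
      ylabel=ts.original_value_col,
      title="centered yearly seasonality by month")
 plotly.io.show(fig)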

Specify common metadata.

 forecast_horizon = 4
 time_col = "ts"
 value_col = "count"
 meta_data_params = MetadataParam(
     time_col=time_col,
     value_col=value_col,
     freq="MS",
 )

Specify common evaluation parameters. Set minimum input data for training.

 cv_min_train_periods = 24
 # Let CV use most recent splits for cross-validation.
 cv_use_most_recent_splits = True
 # Determine the maximum number of validations.
 cv_max_splits = 5
 evaluation_period_param = EvaluationPeriodParam(
     test_horizon=forecast_horizon,
     cv_horizon=forecast_horizon,
     periods_between_train_test=0,
     cv_min_train_periods=cv_min_train_periods,
     cv_expanding_window=True,
     cv_use_most_recent_splits=cv_use_most_recent_splits,
     cv_periods_between_splits=None,
     cv_periods_between_train_test=0,
     cv_max_splits=cv_max_splits,
 )

Fit a simple model without autoregression. The important modeling parameters for monthly data are as follows; these are plugged into ModelComponentsParam. The extra_pred_cols argument specifies the growth and annual seasonality terms. Growth is modeled with both "ct_sqrt" and "ct1" for extra flexibility, since we have long-term data, and ridge regularization will avoid over-fitting the trend. The annual seasonality is modeled categorically with "C(month)" instead of a Fourier series: monthly data has only 12 points per year, as opposed to daily data where there are many points within a year, so a categorical month effect is feasible and a Fourier series is unnecessary. The categorical representation of month is also more explainable/interpretable in the model summary.
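
To see what the categorical term contributes to the design matrix, here is a small sketch outside the example; it assumes patsy-style formula semantics for extra_pred_cols, and the toy frame df_demo is made up for illustration. The term expands into one indicator column per month beyond the reference level, which is why the model summary below has coefficients for months 2 through 12.

 import pandas as pd
 from patsy import dmatrix

 # Toy frame with a month-number column, only to illustrate the expansion.
 df_demo = pd.DataFrame({"month": [1, 2, 6, 12]})
 # Expand the categorical term the same way a model formula would.
 X = dmatrix("C(month, levels=list(range(1, 13)))", df_demo, return_type="dataframe")
 print(X.columns.tolist())  # Intercept plus 11 month indicator columns (levels 2-12)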

 extra_pred_cols = ["ct_sqrt", "ct1", "C(month, levels=list(range(1, 13)))"]
 autoregression = None

 # Specify the model parameters
 model_components = ModelComponentsParam(
     growth=dict(growth_term=None),
     seasonality=dict(
         yearly_seasonality=[False],
         quarterly_seasonality=[False],
         monthly_seasonality=[False],
         weekly_seasonality=[False],
         daily_seasonality=[False]
     ),
     custom=dict(
         fit_algorithm_dict=dict(fit_algorithm="ridge"),
         extra_pred_cols=extra_pred_cols
     ),
     regressors=dict(regressor_cols=None),
     autoregression=autoregression,
     uncertainty=dict(uncertainty_dict=None),
     events=dict(holiday_lookup_countries=None),
 )

 # Run the forecast model
 forecaster = Forecaster()
 result = forecaster.run_forecast_config(
     df=df,
     config=ForecastConfig(
         model_template=ModelTemplateEnum.SILVERKITE.name,
         coverage=0.95,
         forecast_horizon=forecast_horizon,
         metadata_param=meta_data_params,
         evaluation_period_param=evaluation_period_param,
         model_components_param=model_components
     )
 )

 # Get the useful fields from the forecast result
 model = result.model[-1]
 backtest = result.backtest
 forecast = result.forecast
 grid_search = result.grid_search

 # Check model coefficients / variables
 # Get model summary with p-values
 print(model.summary())

 # Get cross-validation results
 cv_results = summarize_grid_search_results(
     grid_search=grid_search,
     decimals=2,
     cv_report_metrics=None,
     column_order=[
         "rank", "mean_test", "split_test", "mean_train", "split_train",
         "mean_fit_time", "mean_score_time", "params"])
 # Transposes to save space in the printed output
 print(cv_results.transpose())

 # Check historical evaluation metrics (on the historical training/test set).
 backtest_eval = defaultdict(list)
 for metric, value in backtest.train_evaluation.items():
     backtest_eval[metric].append(value)
     backtest_eval[metric].append(backtest.test_evaluation[metric])
 metrics = pd.DataFrame(backtest_eval, index=["train", "test"]).T
 print(metrics)

Out:

Fitting 5 folds for each of 1 candidates, totalling 5 fits
================================ Model Summary =================================

Number of observations: 108,   Number of features: 21
Method: Ridge regression
Number of nonzero features: 21
Regularization parameter: 0.01269

Residuals:
         Min           1Q       Median           3Q          Max
  -5.631e+04   -2.219e+04       2946.0    2.172e+04    6.649e+04

            Pred_col    Estimate   Std. Err Pr(>)_boot sig. code                     95%CI
           Intercept  -9.460e+04  3.439e+04      0.010         *  (-1.464e+05, -1.203e+04)
 C(month,... 13)))_2      5660.0  1.875e+04      0.740             (-3.299e+04, 4.029e+04)
 C(month,... 13)))_3   6.530e+04  1.754e+04     <2e-16       ***    (3.438e+04, 1.028e+05)
 C(month,... 13)))_4   1.362e+05  1.590e+04     <2e-16       ***    (1.045e+05, 1.677e+05)
 C(month,... 13)))_5   1.534e+05  1.657e+04     <2e-16       ***    (1.215e+05, 1.872e+05)
 C(month,... 13)))_6   1.675e+05  1.782e+04      0.002        **    (1.370e+05, 2.002e+05)
 C(month,... 13)))_7   1.756e+05  1.671e+04     <2e-16       ***    (1.417e+05, 2.069e+05)
 C(month,... 13)))_8   1.758e+05  1.689e+04     <2e-16       ***    (1.427e+05, 2.092e+05)
 C(month,... 13)))_9   1.477e+05  1.749e+04     <2e-16       ***    (1.112e+05, 1.828e+05)
 C(month,...13)))_10   1.345e+05  1.645e+04     <2e-16       ***    (1.019e+05, 1.675e+05)
 C(month,...13)))_11   6.066e+04  1.500e+04     <2e-16       ***    (3.115e+04, 8.971e+04)
 C(month,...13)))_12   1.422e+04  1.748e+04      0.404             (-1.796e+04, 4.928e+04)
             ct_sqrt   3.313e+05  1.175e+05      0.004        **    (3.869e+04, 4.618e+05)
                 ct1   3.895e+04  1.280e+05      0.782             (-1.761e+05, 2.901e+05)
   cp0_2011_12_31_00   2.954e+04  8.431e+04      0.726             (-1.244e+05, 2.166e+05)
   cp1_2012_01_30_00   1.218e+04  8.109e+04      0.880             (-1.390e+05, 1.885e+05)
   cp2_2012_12_31_00  -7.390e+04  1.024e+05      0.472             (-2.949e+05, 1.051e+05)
   cp3_2014_12_30_00  -1.254e+04  6.107e+04      0.822             (-1.265e+05, 1.166e+05)
   cp4_2015_02_01_00   4.932e+04  4.751e+04      0.310             (-4.634e+04, 1.330e+05)
   cp5_2015_04_29_00  -3.631e+04  9.086e+04      0.708             (-2.161e+05, 1.553e+05)
   cp6_2017_08_31_00  -7.053e+04  2.199e+04      0.002        **  (-1.126e+05, -2.873e+04)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multiple R-squared: 0.9248,   Adjusted R-squared: 0.9113
F-statistic: 68.337 on 16 and 90 DF,   p-value: 1.110e-16
Model AIC: 2759.1,   model BIC: 2805.3

WARNING: the condition number is large, 2.44e+04. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.

                                                   0
rank_test_MAPE                                     1
mean_test_MAPE                                 17.95
split_test_MAPE   (16.97, 21.68, 5.09, 23.25, 22.77)
mean_train_MAPE                                30.74
split_train_MAPE  (34.41, 28.6, 31.42, 29.18, 30.07)
mean_fit_time                                    1.5
mean_score_time                                 0.24
params                                            []
                                                          train         test
CORR                                                   0.959601     0.959809
R2                                                     0.920783     -2.06113
MSE                                                 8.70384e+08  2.24026e+08
RMSE                                                    29502.3      14967.5
MAE                                                     25057.5      14721.3
MedAE                                                   23885.3      13428.2
MAPE                                                    31.1795      4.18228
MedAPE                                                  9.40904      3.79933
sMAPE                                                   10.5786      2.13708
Q80                                                     12528.8        11777
Q95                                                     12528.8      13985.2
Q99                                                     12528.8      14574.1
OutsideTolerance1p                                     0.980769            1
OutsideTolerance2p                                     0.894231            1
OutsideTolerance3p                                     0.836538            1
OutsideTolerance4p                                     0.826923         0.25
OutsideTolerance5p                                     0.740385         0.25
Outside Tolerance (fraction)                               None         None
R2_null_model_score                                        None         None
Prediction Band Width (%)                               98.5155      33.1166
Prediction Band Coverage (fraction)                    0.980769            1
Coverage: Lower Band                                        0.5            0
Coverage: Upper Band                                   0.480769            1
Coverage Diff: Actual_Coverage - Intended_Coverage    0.0307692         0.05

Fit/backtest plot:

 fig = backtest.plot()
 plotly.io.show(fig)

Forecast plot:

 fig = forecast.plot()
 plotly.io.show(fig)

The components plot:

 fig = forecast.plot_components()
 plotly.io.show(fig)

Fit a simple model with autoregression. This is done by specifying the autoregression parameter in ModelComponentsParam. Note that the autoregressive structure can be customized further depending on your data; one possible customization is sketched below.
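
For reference, the sketch below shows one possible richer specification for monthly data; it is not used in this example, and it assumes the autoreg_dict structure accepted by Silverkite, with individual lags under lag_dict and aggregated lags under agg_lag_dict.

 # Sketch of a more customized autoregression spec (not used below).
 autoregression_custom = {
     "autoreg_dict": {
         # Individual lags: the previous month and the same month last year.
         "lag_dict": {"orders": [1, 12]},
         "agg_lag_dict": {
             # Average of the last three months.
             "orders_list": [[1, 2, 3]],
             # Average over the interval of lags 1 through 12 (the past year).
             "interval_list": [(1, 12)],
         },
     }
 }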

 extra_pred_cols = ["ct_sqrt", "ct1", "C(month, levels=list(range(1, 13)))"]
 autoregression = {
     "autoreg_dict": {
         "lag_dict": {"orders": [1]},
         "agg_lag_dict": None
     }
 }

 # Specify the model parameters
 model_components = ModelComponentsParam(
     growth=dict(growth_term=None),
     seasonality=dict(
         yearly_seasonality=[False],
         quarterly_seasonality=[False],
         monthly_seasonality=[False],
         weekly_seasonality=[False],
         daily_seasonality=[False]
     ),
     custom=dict(
         fit_algorithm_dict=dict(fit_algorithm="ridge"),
         extra_pred_cols=extra_pred_cols
     ),
     regressors=dict(regressor_cols=None),
     autoregression=autoregression,
     uncertainty=dict(uncertainty_dict=None),
     events=dict(holiday_lookup_countries=None),
 )

 # Run the forecast model
 forecaster = Forecaster()
 result = forecaster.run_forecast_config(
     df=df,
     config=ForecastConfig(
         model_template=ModelTemplateEnum.SILVERKITE.name,
         coverage=0.95,
         forecast_horizon=forecast_horizon,
         metadata_param=meta_data_params,
         evaluation_period_param=evaluation_period_param,
         model_components_param=model_components
     )
 )

 # Get the useful fields from the forecast result
 model = result.model[-1]
 backtest = result.backtest
 forecast = result.forecast
 grid_search = result.grid_search

 # Check model coefficients / variables
 # Get model summary with p-values
 print(model.summary())

 # Get cross-validation results
 cv_results = summarize_grid_search_results(
     grid_search=grid_search,
     decimals=2,
     cv_report_metrics=None,
     column_order=[
         "rank", "mean_test", "split_test", "mean_train", "split_train",
         "mean_fit_time", "mean_score_time", "params"])
 # Transposes to save space in the printed output
 print(cv_results.transpose())

 # Check historical evaluation metrics (on the historical training/test set).
 backtest_eval = defaultdict(list)
 for metric, value in backtest.train_evaluation.items():
     backtest_eval[metric].append(value)
     backtest_eval[metric].append(backtest.test_evaluation[metric])
 metrics = pd.DataFrame(backtest_eval, index=["train", "test"]).T
 print(metrics)

Out:

Fitting 5 folds for each of 1 candidates, totalling 5 fits
================================ Model Summary =================================

Number of observations: 108,   Number of features: 22
Method: Ridge regression
Number of nonzero features: 22
Regularization parameter: 0.0621

Residuals:
         Min           1Q       Median           3Q          Max
  -5.655e+04   -1.618e+04      -1849.0    1.957e+04    6.007e+04

            Pred_col    Estimate   Std. Err Pr(>)_boot sig. code                    95%CI
           Intercept  -2.605e+04  1.765e+04      0.128               (-6.082e+04, 5663.0)
 C(month,... 13)))_2   1.142e+04  1.253e+04      0.336            (-1.417e+04, 3.417e+04)
 C(month,... 13)))_3   6.686e+04  1.407e+04     <2e-16       ***   (4.060e+04, 9.746e+04)
 C(month,... 13)))_4   1.060e+05  1.553e+04     <2e-16       ***   (7.367e+04, 1.327e+05)
 C(month,... 13)))_5   8.563e+04  1.535e+04     <2e-16       ***   (5.626e+04, 1.173e+05)
 C(month,... 13)))_6   9.056e+04  1.626e+04     <2e-16       ***   (5.808e+04, 1.204e+05)
 C(month,... 13)))_7   9.126e+04  1.611e+04     <2e-16       ***   (5.848e+04, 1.234e+05)
 C(month,... 13)))_8   8.720e+04  1.615e+04     <2e-16       ***   (5.631e+04, 1.183e+05)
 C(month,... 13)))_9   6.215e+04  1.638e+04     <2e-16       ***   (3.267e+04, 9.527e+04)
 C(month,...13)))_10   6.108e+04  1.480e+04     <2e-16       ***   (3.271e+04, 9.215e+04)
 C(month,...13)))_11     -6119.0  1.718e+04      0.688            (-4.439e+04, 2.556e+04)
 C(month,...13)))_12  -1.324e+04  1.319e+04      0.286            (-4.037e+04, 1.257e+04)
             ct_sqrt   9.290e+04  3.920e+04      0.012         *      (8616.0, 1.634e+05)
                 ct1   4.863e+04  2.159e+04      0.024         *      (7906.0, 9.485e+04)
   cp0_2011_12_31_00   2.021e+04  2.720e+04      0.430            (-2.641e+04, 7.852e+04)
   cp1_2012_01_30_00   1.920e+04  2.499e+04      0.412            (-2.329e+04, 7.171e+04)
   cp2_2012_12_31_00  -3.002e+04  3.298e+04      0.370            (-9.062e+04, 3.715e+04)
   cp3_2014_12_30_00      -945.8  1.874e+04      0.966            (-3.848e+04, 3.266e+04)
   cp4_2015_02_01_00      1769.0  1.357e+04      0.892            (-2.739e+04, 2.510e+04)
   cp5_2015_04_29_00  -1.569e+04  3.222e+04      0.630            (-7.719e+04, 5.183e+04)
   cp6_2017_08_31_00  -3.195e+04  1.898e+04      0.090         .     (-6.886e+04, 7482.0)
              y_lag1   2.133e+05  2.997e+04     <2e-16       ***   (1.530e+05, 2.681e+05)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multiple R-squared: 0.9451,   Adjusted R-squared: 0.9355
F-statistic: 97.446 on 15 and 91 DF,   p-value: 1.110e-16
Model AIC: 2724.5,   model BIC: 2769.8

WARNING: the condition number is large, 5.60e+03. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.

                                                    0
rank_test_MAPE                                      1
mean_test_MAPE                                  16.81
split_test_MAPE     (14.18, 11.56, 9.9, 29.35, 19.04)
mean_train_MAPE                                 22.43
split_train_MAPE  (23.53, 22.22, 22.94, 22.04, 21.44)
mean_fit_time                                    1.34
mean_score_time                                  1.84
params                                             []
                                                          train         test
CORR                                                   0.970891     0.810576
R2                                                     0.942621     0.171875
MSE                                                 6.30447e+08  6.06056e+07
RMSE                                                    25108.7      7784.96
MAE                                                       20697      6872.87
MedAE                                                   18654.1      6918.05
MAPE                                                    20.9768      1.95055
MedAPE                                                  8.37506      1.99794
sMAPE                                                   8.81036     0.983418
Q80                                                     10348.5      4526.72
Q95                                                     10348.5      5071.85
Q99                                                     10348.5      5217.22
OutsideTolerance1p                                     0.932692         0.75
OutsideTolerance2p                                        0.875          0.5
OutsideTolerance3p                                     0.788462         0.25
OutsideTolerance4p                                         0.75            0
OutsideTolerance5p                                     0.673077            0
Outside Tolerance (fraction)                               None         None
R2_null_model_score                                        None         None
Prediction Band Width (%)                               83.8443      18.4221
Prediction Band Coverage (fraction)                    0.951923            1
Coverage: Lower Band                                   0.490385         0.25
Coverage: Upper Band                                   0.461538         0.75
Coverage Diff: Actual_Coverage - Intended_Coverage   0.00192308         0.05

Fit/backtest plot:

 fig = backtest.plot()
 plotly.io.show(fig)

Forecast plot:

 fig = forecast.plot()
 plotly.io.show(fig)

The components plot:

 fig = forecast.plot_components()
 plotly.io.show(fig)

Fit a model with time-varying seasonality (month effect). This is achieved by adding "ct1*C(month)" to extra_pred_cols in ModelComponentsParam. Note that this feature may or may not be useful in your use case; we have included it for demonstration purposes only. In this example, while the fit has improved, the backtest is inferior to the previous setting.
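
As with the plain categorical term, a small sketch (again assuming patsy-style formula semantics, with a made-up toy frame) shows what the interaction adds to the design matrix: the month indicators are multiplied by the linear growth term ct1, which is where the ct1:C(month, ...) rows in the model summary below come from.

 import pandas as pd
 from patsy import dmatrix

 # Toy frame with a continuous time index ct1 and a month-number column.
 df_demo = pd.DataFrame({"ct1": [0.0, 0.5, 1.0], "month": [1, 6, 12]})
 # "a * b" expands to the main effects plus the interaction a:b.
 X = dmatrix("ct1 * C(month, levels=list(range(1, 13)))", df_demo, return_type="dataframe")
 print(X.columns.tolist())  # month dummies, ct1, and ct1:month interaction columns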

 extra_pred_cols = ["ct_sqrt", "ct1", "C(month, levels=list(range(1, 13)))",
                    "ct1*C(month, levels=list(range(1, 13)))"]
 autoregression = {
     "autoreg_dict": {
         "lag_dict": {"orders": [1]},
         "agg_lag_dict": None
     }
 }

 # Specify the model parameters
 model_components = ModelComponentsParam(
     growth=dict(growth_term=None),
     seasonality=dict(
         yearly_seasonality=[False],
         quarterly_seasonality=[False],
         monthly_seasonality=[False],
         weekly_seasonality=[False],
         daily_seasonality=[False]
     ),
     custom=dict(
         fit_algorithm_dict=dict(fit_algorithm="ridge"),
         extra_pred_cols=extra_pred_cols
     ),
     regressors=dict(regressor_cols=None),
     autoregression=autoregression,
     uncertainty=dict(uncertainty_dict=None),
     events=dict(holiday_lookup_countries=None),
 )

 # Run the forecast model
 forecaster = Forecaster()
 result = forecaster.run_forecast_config(
     df=df,
     config=ForecastConfig(
         model_template=ModelTemplateEnum.SILVERKITE.name,
         coverage=0.95,
         forecast_horizon=forecast_horizon,
         metadata_param=meta_data_params,
         evaluation_period_param=evaluation_period_param,
         model_components_param=model_components
     )
 )

 # Get the useful fields from the forecast result
 model = result.model[-1]
 backtest = result.backtest
 forecast = result.forecast
 grid_search = result.grid_search

 # Check model coefficients / variables
 # Get model summary with p-values
 print(model.summary())

 # Get cross-validation results
 cv_results = summarize_grid_search_results(
     grid_search=grid_search,
     decimals=2,
     cv_report_metrics=None,
     column_order=[
         "rank", "mean_test", "split_test", "mean_train", "split_train",
         "mean_fit_time", "mean_score_time", "params"])
 # Transposes to save space in the printed output
 print(cv_results.transpose())

 # Check historical evaluation metrics (on the historical training/test set).
 backtest_eval = defaultdict(list)
 for metric, value in backtest.train_evaluation.items():
     backtest_eval[metric].append(value)
     backtest_eval[metric].append(backtest.test_evaluation[metric])
 metrics = pd.DataFrame(backtest_eval, index=["train", "test"]).T
 print(metrics)

Out:

Fitting 5 folds for each of 1 candidates, totalling 5 fits
================================ Model Summary =================================

Number of observations: 108,   Number of features: 33
Method: Ridge regression
Number of nonzero features: 33
Regularization parameter: 0.01269

Residuals:
         Min           1Q       Median           3Q          Max
  -5.127e+04   -1.256e+04        752.4    1.392e+04    5.073e+04

            Pred_col    Estimate   Std. Err Pr(>)_boot sig. code                     95%CI
           Intercept  -2.220e+04  1.954e+04      0.250                (-6.576e+04, 9544.0)
 C(month,... 13)))_2     -1857.0  2.435e+04      0.916             (-5.949e+04, 4.069e+04)
 C(month,... 13)))_3   3.125e+04  2.271e+04      0.134                (-6432.0, 8.940e+04)
 C(month,... 13)))_4   5.244e+04  2.191e+04      0.024         *    (2.614e+04, 1.088e+05)
 C(month,... 13)))_5   7.419e+04  2.033e+04      0.010         *    (4.666e+04, 1.221e+05)
 C(month,... 13)))_6   5.570e+04  2.270e+04      0.034         *    (2.342e+04, 1.120e+05)
 C(month,... 13)))_7   5.992e+04  2.234e+04      0.012         *    (2.880e+04, 1.150e+05)
 C(month,... 13)))_8   5.781e+04  2.159e+04      0.016         *    (2.876e+04, 1.103e+05)
 C(month,... 13)))_9   4.858e+04  2.831e+04      0.076         .    (1.729e+04, 1.255e+05)
 C(month,...13)))_10   3.069e+04  1.817e+04      0.090         .       (1147.0, 7.843e+04)
 C(month,...13)))_11   2.508e+04  1.637e+04      0.110                (-1479.0, 6.628e+04)
 C(month,...13)))_12     -1322.0  1.694e+04      0.942             (-3.032e+04, 4.180e+04)
             ct_sqrt   1.757e+05  6.048e+04     <2e-16       ***    (4.610e+04, 2.765e+05)
                 ct1   3.731e+04  4.906e+04      0.494             (-3.618e+04, 1.449e+05)
 ct1:C(mo... 13)))_2   2.775e+04  3.884e+04      0.352             (-4.481e+04, 1.132e+05)
 ct1:C(mo... 13)))_3   7.465e+04  3.912e+04      0.062         .   (-1.808e+04, 1.414e+05)
 ct1:C(mo... 13)))_4   1.332e+05  3.916e+04      0.008        **    (4.876e+04, 1.982e+05)
 ct1:C(mo... 13)))_5   8.293e+04  3.897e+04      0.038         *      (-1101.0, 1.576e+05)
 ct1:C(mo... 13)))_6   1.336e+05  3.438e+04      0.006        **    (6.264e+04, 1.912e+05)
 ct1:C(mo... 13)))_7   1.330e+05  3.510e+04     <2e-16       ***    (6.143e+04, 2.012e+05)
 ct1:C(mo... 13)))_8   1.329e+05  3.458e+04     <2e-16       ***    (5.925e+04, 1.964e+05)
 ct1:C(mo... 13)))_9   9.543e+04  4.620e+04      0.042         *       (3490.0, 1.782e+05)
 ct1:C(mo...13)))_10   1.198e+05  2.909e+04      0.002        **    (5.748e+04, 1.782e+05)
 ct1:C(mo...13)))_11     -6389.0  3.095e+04      0.844             (-6.537e+04, 5.810e+04)
 ct1:C(mo...13)))_12      1972.0  3.133e+04      0.952             (-5.922e+04, 5.842e+04)
   cp0_2011_12_31_00      6598.0  3.495e+04      0.870             (-6.051e+04, 7.260e+04)
   cp1_2012_01_30_00     -9129.0  3.703e+04      0.834             (-7.853e+04, 6.467e+04)
   cp2_2012_12_31_00  -6.343e+04  5.699e+04      0.286             (-1.746e+05, 3.395e+04)
   cp3_2014_12_30_00     -6911.0  5.326e+04      0.900             (-1.081e+05, 9.562e+04)
   cp4_2015_02_01_00   3.416e+04  3.812e+04      0.366             (-5.045e+04, 1.023e+05)
   cp5_2015_04_29_00  -2.803e+04  8.469e+04      0.698             (-1.832e+05, 1.527e+05)
   cp6_2017_08_31_00  -5.819e+04  2.060e+04      0.012         *  (-9.439e+04, -1.231e+04)
              y_lag1   1.281e+05  4.927e+04      0.012         *    (4.335e+04, 2.308e+05)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multiple R-squared: 0.9678,   Adjusted R-squared: 0.9566
F-statistic: 85.908 on 27 and 79 DF,   p-value: 1.110e-16
Model AIC: 2690.2,   model BIC: 2767.1

WARNING: the condition number is large, 2.75e+04. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.

                                                  0
rank_test_MAPE                                    1
mean_test_MAPE                                 8.19
split_test_MAPE    (3.45, 11.02, 6.03, 16.43, 4.02)
mean_train_MAPE                               12.42
split_train_MAPE  (15.4, 11.02, 11.4, 11.17, 13.12)
mean_fit_time                                  1.29
mean_score_time                                1.75
params                                           []
                                                          train         test
CORR                                                   0.983665     0.942998
R2                                                     0.967592     -28.7164
MSE                                                 3.56083e+08  2.17477e+09
RMSE                                                    18870.2      46634.4
MAE                                                     15056.5      42831.5
MedAE                                                   13205.8      44422.5
MAPE                                                    13.8329      12.0911
MedAPE                                                  6.51192      12.4955
sMAPE                                                   5.01372      5.64679
Q80                                                     7528.25      8566.29
Q95                                                     7528.25      2141.57
Q99                                                     7528.25      428.315
OutsideTolerance1p                                     0.913462            1
OutsideTolerance2p                                     0.798077            1
OutsideTolerance3p                                     0.759615            1
OutsideTolerance4p                                     0.701923            1
OutsideTolerance5p                                        0.625         0.75
Outside Tolerance (fraction)                               None         None
R2_null_model_score                                        None         None
Prediction Band Width (%)                               63.0122      19.5714
Prediction Band Coverage (fraction)                    0.923077         0.25
Coverage: Lower Band                                   0.442308         0.25
Coverage: Upper Band                                   0.480769            0
Coverage Diff: Actual_Coverage - Intended_Coverage   -0.0269231         -0.7

Fit/backtest plot:

 fig = backtest.plot()
 plotly.io.show(fig)

Forecast plot:

 fig = forecast.plot()
 plotly.io.show(fig)

The components plot:

 fig = forecast.plot_components()
 plotly.io.show(fig)

Total running time of the script: ( 1 minutes 0.759 seconds)

Gallery generated by Sphinx-Gallery