Example for weekly data
This is a basic example for weekly data using Silverkite. Note that here we are fitting a few simple models and the goal is not to optimize the results as much as possible.
import warnings
from collections import defaultdict

import plotly
import pandas as pd

from greykite.common.constants import TIME_COL
from greykite.common.constants import VALUE_COL
from greykite.framework.benchmark.data_loader_ts import DataLoader
from greykite.framework.input.univariate_time_series import UnivariateTimeSeries
from greykite.framework.templates.autogen.forecast_config import EvaluationPeriodParam
from greykite.framework.templates.autogen.forecast_config import ForecastConfig
from greykite.framework.templates.autogen.forecast_config import MetadataParam
from greykite.framework.templates.autogen.forecast_config import ModelComponentsParam
from greykite.framework.templates.forecaster import Forecaster
from greykite.framework.utils.result_summary import summarize_grid_search_results

warnings.filterwarnings("ignore")
Loads weekly dataset into UnivariateTimeSeries.
dl = DataLoader()
agg_func = {"count": "sum"}
df = dl.load_bikesharing(agg_freq="weekly", agg_func=agg_func)
# The first and last weeks in this dataset are incomplete, so we drop them.
df.drop(df.head(1).index, inplace=True)
df.drop(df.tail(1).index, inplace=True)
# reset_index returns a new DataFrame, so the result is assigned back.
df = df.reset_index(drop=True)
ts = UnivariateTimeSeries()
ts.load_data(
    df=df,
    time_col="ts",
    value_col="count",
    freq="W-MON")
print(ts.df.head())
Out:
ts y
2010-09-27 2010-09-27 2801
2010-10-04 2010-10-04 3238
2010-10-11 2010-10-11 6241
2010-10-18 2010-10-18 7756
2010-10-25 2010-10-25 9556
Exploratory Data Analysis (EDA)
After reading in a time series, we could first do some exploratory data analysis.
The UnivariateTimeSeries class is used to store a timeseries and perform EDA.
A quick description of the data can be obtained as follows.
print(ts.describe_time_col())
print(ts.describe_value_col())
Out:
{'data_points': 466, 'mean_increment_secs': 604800.0, 'min_timestamp': Timestamp('2010-09-27 00:00:00'), 'max_timestamp': Timestamp('2019-08-26 00:00:00')}
count 466.000000
mean 53466.961373
std 24728.824016
min 2801.000000
25% 32819.750000
50% 51921.500000
75% 76160.750000
max 102350.000000
Name: y, dtype: float64
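The values reported by describe_time_col can be cross-checked with a little arithmetic. The sketch below uses only the standard library and the two timestamps printed above; it is a sanity check, not part of the Greykite API.

```python
from datetime import datetime, timedelta

# Timestamps copied from the describe_time_col() output above.
start = datetime(2010, 9, 27)
end = datetime(2019, 8, 26)

# Length of one week in seconds, to compare with mean_increment_secs.
week = timedelta(weeks=1)
print(week.total_seconds())

# Number of weekly points between start and end, both inclusive,
# to compare with data_points.
n_points = (end - start) // week + 1
print(n_points)

# weekday() == 0 means Monday, which is what freq="W-MON" expects.
print(start.weekday(), end.weekday())
```

If the series had gaps or an irregular spacing, the computed point count and the mean increment would no longer agree with the reported values.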
Let’s plot the original timeseries.
(The interactive plot is generated by plotly: click to zoom!)
fig = ts.plot()
plotly.io.show(fig)
Exploratory plots can reveal the properties of the time series. A monthly overlay plot can be used to inspect the annual patterns: it overlays the individual years on top of each other.
fig = ts.plot_quantiles_and_overlays(
    groupby_time_feature="month",
    show_mean=True,
    show_quantiles=False,
    show_overlays=True,
    center_values=True,
    overlay_label_time_feature="year",  # splits overlays by year
    overlay_style={"line": {"width": 1}, "opacity": 0.5},
    xlabel="Month",
    ylabel=ts.original_value_col,
    title="Yearly seasonality by year (centered)",
)
plotly.io.show(fig)
Weekly overlay plot. This is the same plot with values grouped by week of year instead of by month.
fig = ts.plot_quantiles_and_overlays(
    groupby_time_feature="woy",
    show_mean=True,
    show_quantiles=False,
    show_overlays=True,
    center_values=True,
    overlay_label_time_feature="year",  # splits overlays by year
    overlay_style={"line": {"width": 1}, "opacity": 0.5},
    xlabel="Week of year",
    ylabel=ts.original_value_col,
    title="Yearly seasonality by year (centered)",
)
plotly.io.show(fig)
Fit Greykite Models
After some exploratory data analysis, let’s specify the model parameters and fit a Greykite model.
Specify common metadata.
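The code that follows uses the names forecast_horizon and metadata. A minimal sketch of how they might be defined is given here. The 4-week horizon is an assumption (it is consistent with the four one-by-one models fitted at the end of this example), and the column names assume that ts.df uses the standard TIME_COL and VALUE_COL columns shown in the output of ts.df.head().

```python
from greykite.common.constants import TIME_COL
from greykite.common.constants import VALUE_COL
from greykite.framework.templates.autogen.forecast_config import MetadataParam

# Assumed forecast horizon: 4 weeks ahead.
forecast_horizon = 4

# Describes the input data: time column, value column, and weekly frequency.
metadata = MetadataParam(
    time_col=TIME_COL,   # "ts"
    value_col=VALUE_COL,  # "y"
    freq="W-MON",
)
```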
Specify common evaluation parameters.
# Set minimum input data for training.
cv_min_train_periods = 52 * 2
# Let CV use most recent splits for cross-validation.
cv_use_most_recent_splits = True
# Determine the maximum number of validations.
cv_max_splits = 6
evaluation_period = EvaluationPeriodParam(
    test_horizon=forecast_horizon,
    cv_horizon=forecast_horizon,
    periods_between_train_test=0,
    cv_min_train_periods=cv_min_train_periods,
    cv_expanding_window=True,
    cv_use_most_recent_splits=cv_use_most_recent_splits,
    cv_periods_between_splits=None,
    cv_periods_between_train_test=0,
    cv_max_splits=cv_max_splits,
)
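To make the evaluation settings concrete, the following is a minimal sketch of expanding-window splits that keeps only the most recent ones. It is not Greykite's implementation: the helper name expanding_window_splits is hypothetical, the horizon of 4 weeks is an assumption, and so are the choices to space consecutive splits one horizon apart and to hold out the final horizon as the test set.

```python
def expanding_window_splits(n_obs, horizon, min_train, max_splits):
    """Returns (train_end, test_end) index pairs, most recent split first.

    Each split trains on observations [0, train_end) and validates on
    [train_end, test_end). Hypothetical helper for illustration only.
    """
    splits = []
    test_end = n_obs
    while test_end - horizon >= min_train and len(splits) < max_splits:
        splits.append((test_end - horizon, test_end))
        test_end -= horizon  # assumed spacing: one horizon between splits
    return splits


horizon = 4            # assumed forecast horizon (weeks)
n_cv = 466 - horizon   # assume the last horizon is held out as the test set
splits = expanding_window_splits(
    n_obs=n_cv, horizon=horizon, min_train=52 * 2, max_splits=6)
for train_end, test_end in splits:
    print(f"train: [0, {train_end})  validate: [{train_end}, {test_end})")
```

Because every split starts training at index 0, later splits see more history than earlier ones, which is what an expanding window means here.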
Let’s also define a helper function that generates the model results summary and plots.
def get_model_results_summary(result):
    """Generates model results summary.

    Parameters
    ----------
    result : `ForecastResult`
        See :class:`~greykite.framework.pipeline.pipeline.ForecastResult` for documentation.

    Returns
    -------
    Prints out model coefficients, cross-validation results, overall train/test evaluations.
    """
    # Get the useful fields from the forecast result
    model = result.model[-1]
    backtest = result.backtest
    grid_search = result.grid_search

    # Check model coefficients / variables
    # Get model summary with p-values
    print(model.summary())

    # Get cross-validation results
    cv_results = summarize_grid_search_results(
        grid_search=grid_search,
        decimals=2,
        cv_report_metrics=None,
        column_order=[
            "rank", "mean_test", "split_test", "mean_train", "split_train",
            "mean_fit_time", "mean_score_time", "params"])
    # Transposes to save space in the printed output
    print("================================= CV Results ==================================")
    print(cv_results.transpose())

    # Check historical evaluation metrics (on the historical training/test set).
    backtest_eval = defaultdict(list)
    for metric, value in backtest.train_evaluation.items():
        backtest_eval[metric].append(value)
        backtest_eval[metric].append(backtest.test_evaluation[metric])
    metrics = pd.DataFrame(backtest_eval, index=["train", "test"]).T
    print("=========================== Train/Test Evaluation =============================")
    print(metrics)
Fit a simple model without autoregression.
The most important model parameters are specified through ModelComponentsParam.
The extra_pred_cols parameter is used to specify growth and annual seasonality.
Growth is modelled with both “ct_sqrt” and “ct1” for extra flexibility, as we have long-term data and ridge regularization will avoid over-fitting the trend.
The yearly seasonality is modelled using Fourier series. In ModelComponentsParam, we can specify the order of the series: the higher the order, the more flexible the pattern the model can capture. Usually one can try integers between 10 and 50.
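As a rough illustration of what the seasonality order controls, the sketch below builds Fourier terms for a single period. It is a generic construction rather than Greykite's internal code; the function name fourier_terms and the use of a fractional position within the year as input are assumptions.

```python
import math


def fourier_terms(t, order):
    """Returns sin and cos terms for harmonics 1..order.

    `t` is the position within the period as a fraction in [0, 1).
    The output is ordered sin1, cos1, sin2, cos2, ...
    """
    terms = []
    for k in range(1, order + 1):
        angle = 2.0 * math.pi * k * t
        terms.append(math.sin(angle))
        terms.append(math.cos(angle))
    return terms


# A week roughly one quarter of the way through the year.
features = fourier_terms(t=0.25, order=25)
print(len(features))  # two columns (sin and cos) per harmonic
print([round(x, 6) for x in features[:4]])
```

Each additional order adds one sine and one cosine column, so an order of 25 yields 50 columns. This corresponds to the rows sin1_ct1_yearly through cos25_ct1_yearly in the model summary printed later.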
autoregression = None
extra_pred_cols = ["ct1", "ct_sqrt", "ct1:C(month, levels=list(range(1, 13)))"]

# Specify the model parameters
model_components = ModelComponentsParam(
    autoregression=autoregression,
    seasonality={
        "yearly_seasonality": 25,
        "quarterly_seasonality": 0,
        "monthly_seasonality": 0,
        "weekly_seasonality": 0,
        "daily_seasonality": 0
    },
    changepoints={
        "changepoints_dict": {
            "method": "auto",
            "resample_freq": "7D",
            "regularization_strength": 0.5,
            "potential_changepoint_distance": "14D",
            "no_changepoint_distance_from_end": "60D",
            "yearly_seasonality_order": 25,
            "yearly_seasonality_change_freq": None,
        },
        "seasonality_changepoints_dict": None
    },
    events={
        "holiday_lookup_countries": []
    },
    growth={
        "growth_term": None
    },
    custom={
        "feature_sets_enabled": False,
        "fit_algorithm_dict": dict(fit_algorithm="ridge"),
        "extra_pred_cols": extra_pred_cols,
    }
)

forecast_config = ForecastConfig(
    metadata_param=metadata,
    forecast_horizon=forecast_horizon,
    coverage=0.95,
    evaluation_period_param=evaluation_period,
    model_components_param=model_components
)

# Run the forecast model
forecaster = Forecaster()
result = forecaster.run_forecast_config(
    df=ts.df,
    config=forecast_config
)
Out:
Fitting 6 folds for each of 1 candidates, totalling 6 fits
Let’s check the model results summary and plots.
get_model_results_summary(result)
Out:
================================ Model Summary =================================
Number of observations: 466, Number of features: 68
Method: Ridge regression
Number of nonzero features: 68
Regularization parameter: 0.02807
Residuals:
Min 1Q Median 3Q Max
-2.758e+04 -3953.0 169.9 4685.0 2.165e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 1.086e+04 5294.0 0.050 . (1756.0, 2.336e+04)
ct1 1.932e+04 1.164e+04 0.094 . (-3678.0, 4.119e+04)
ct1:C(mo... 13)))_2 917.1 6344.0 0.888 (-1.126e+04, 1.306e+04)
ct1:C(mo... 13)))_3 9628.0 6146.0 0.120 (-2516.0, 2.182e+04)
ct1:C(mo... 13)))_4 3.472e+04 5246.0 <2e-16 *** (2.413e+04, 4.382e+04)
ct1:C(mo... 13)))_5 2.762e+04 5702.0 <2e-16 *** (1.642e+04, 3.809e+04)
ct1:C(mo... 13)))_6 3.755e+04 5463.0 <2e-16 *** (2.713e+04, 4.822e+04)
ct1:C(mo... 13)))_7 4.068e+04 5292.0 <2e-16 *** (3.037e+04, 5.109e+04)
ct1:C(mo... 13)))_8 4.096e+04 5488.0 <2e-16 *** (2.995e+04, 5.137e+04)
ct1:C(mo... 13)))_9 3.261e+04 5927.0 <2e-16 *** (2.136e+04, 4.347e+04)
ct1:C(mo...13)))_10 3.322e+04 5053.0 <2e-16 *** (2.295e+04, 4.309e+04)
ct1:C(mo...13)))_11 1.036e+04 5430.0 0.058 . (-530.8, 2.090e+04)
ct1:C(mo...13)))_12 4878.0 5181.0 0.362 (-5512.0, 1.429e+04)
ct_sqrt 5.656e+04 8836.0 <2e-16 *** (3.758e+04, 7.278e+04)
sin1_ct1_yearly -1.977e+04 1615.0 <2e-16 *** (-2.336e+04, -1.697e+04)
cos1_ct1_yearly 6433.0 1763.0 0.002 ** (3256.0, 9958.0)
sin2_ct1_yearly 5236.0 1645.0 0.002 ** (2144.0, 8625.0)
cos2_ct1_yearly 4841.0 1647.0 0.004 ** (1901.0, 8082.0)
sin3_ct1_yearly 1800.0 1697.0 0.286 (-1498.0, 5189.0)
cos3_ct1_yearly -272.9 1528.0 0.872 (-3353.0, 2526.0)
sin4_ct1_yearly -337.2 1493.0 0.806 (-3413.0, 2542.0)
cos4_ct1_yearly -1376.0 1543.0 0.384 (-4264.0, 1525.0)
sin5_ct1_yearly -2044.0 1388.0 0.138 (-4926.0, 325.2)
cos5_ct1_yearly -786.7 1334.0 0.582 (-3254.0, 1895.0)
sin6_ct1_yearly -126.7 1575.0 0.938 (-3160.0, 2983.0)
cos6_ct1_yearly 1482.0 1161.0 0.170 (-1002.0, 3765.0)
sin7_ct1_yearly -363.0 1311.0 0.754 (-2937.0, 2223.0)
cos7_ct1_yearly -270.2 1185.0 0.824 (-2439.0, 2113.0)
sin8_ct1_yearly -1562.0 1265.0 0.248 (-3896.0, 762.7)
cos8_ct1_yearly 724.6 1097.0 0.504 (-1388.0, 2926.0)
sin9_ct1_yearly -1829.0 1076.0 0.082 . (-3939.0, 127.9)
cos9_ct1_yearly 2049.0 1167.0 0.078 . (-338.1, 4281.0)
sin10_ct1_yearly 630.4 1075.0 0.508 (-1617.0, 2690.0)
cos10_ct1_yearly 1728.0 1020.0 0.088 . (-309.5, 3557.0)
sin11_ct1_yearly 2838.0 1021.0 0.002 ** (801.3, 4863.0)
cos11_ct1_yearly -686.5 1091.0 0.538 (-2805.0, 1401.0)
sin12_ct1_yearly -2057.0 1035.0 0.052 . (-4165.0, -109.7)
cos12_ct1_yearly -2302.0 1061.0 0.028 * (-4344.0, -335.2)
sin13_ct1_yearly -407.8 977.6 0.638 (-2366.0, 1584.0)
cos13_ct1_yearly 316.2 995.1 0.714 (-1530.0, 2362.0)
sin14_ct1_yearly -1372.0 1022.0 0.160 (-3526.0, 539.1)
cos14_ct1_yearly 1162.0 1080.0 0.272 (-1085.0, 3352.0)
sin15_ct1_yearly 112.2 1057.0 0.910 (-1839.0, 2286.0)
cos15_ct1_yearly 1259.0 1025.0 0.222 (-679.0, 3219.0)
sin16_ct1_yearly -2384.0 1080.0 0.024 * (-4457.0, -203.8)
cos16_ct1_yearly -644.5 1041.0 0.542 (-2583.0, 1394.0)
sin17_ct1_yearly 680.7 1032.0 0.472 (-1526.0, 2713.0)
cos17_ct1_yearly 850.7 1054.0 0.430 (-1046.0, 2843.0)
sin18_ct1_yearly -549.1 1042.0 0.632 (-2587.0, 1487.0)
cos18_ct1_yearly 220.5 1124.0 0.844 (-1840.0, 2468.0)
sin19_ct1_yearly -579.2 978.0 0.530 (-2680.0, 1003.0)
cos19_ct1_yearly -773.8 1064.0 0.442 (-2749.0, 1450.0)
sin20_ct1_yearly 1083.0 1055.0 0.296 (-791.7, 3229.0)
cos20_ct1_yearly 311.6 1069.0 0.774 (-1847.0, 2418.0)
sin21_ct1_yearly -1025.0 1093.0 0.358 (-3154.0, 1165.0)
cos21_ct1_yearly -483.3 1057.0 0.650 (-2540.0, 1650.0)
sin22_ct1_yearly 1110.0 974.0 0.250 (-772.2, 3062.0)
cos22_ct1_yearly 340.0 1170.0 0.798 (-2040.0, 2439.0)
sin23_ct1_yearly 506.2 1012.0 0.612 (-1530.0, 2429.0)
cos23_ct1_yearly -3098.0 1063.0 <2e-16 *** (-5230.0, -1109.0)
sin24_ct1_yearly -1075.0 1076.0 0.300 (-3111.0, 852.4)
cos24_ct1_yearly -367.7 992.3 0.718 (-2243.0, 1573.0)
sin25_ct1_yearly -2124.0 1017.0 0.032 * (-4084.0, -242.8)
cos25_ct1_yearly -34.79 1062.0 0.970 (-1968.0, 1997.0)
cp0_2012_01_30_00 2645.0 1.109e+04 0.850 (-1.933e+04, 2.325e+04)
cp1_2013_01_14_00 -2.420e+04 9522.0 0.010 * (-4.130e+04, -5279.0)
cp2_2015_02_23_00 -1574.0 5513.0 0.760 (-1.263e+04, 8744.0)
cp3_2017_10_02_00 -1.979e+04 3053.0 <2e-16 *** (-2.596e+04, -1.483e+04)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9181, Adjusted R-squared: 0.9047
F-statistic: 67.998 on 65 and 399 DF, p-value: 1.110e-16
Model AIC: 11257.0, model BIC: 11533.0
WARNING: the condition number is large, 2.06e+05. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
================================= CV Results ==================================
0
rank_test_MAPE 1
mean_test_MAPE 10.73
split_test_MAPE (12.11, 15.22, 4.69, 10.99, 9.81, 11.59)
mean_train_MAPE 15.96
split_train_MAPE (15.89, 16.1, 16.05, 15.95, 15.91, 15.88)
mean_fit_time 2.02
mean_score_time 0.23
params []
=========================== Train/Test Evaluation =============================
train test
CORR 0.957887 0.886164
R2 0.917539 -2.019265
MSE 50186299.582873 51875105.70039
RMSE 7084.228934 7202.437483
MAE 5394.331719 6425.432865
MedAE 4193.480899 6045.073112
MAPE 15.853829 8.134249
MedAPE 8.593354 7.539209
sMAPE 7.41262 3.864425
Q80 2697.165859 1285.086573
Q95 2697.165859 321.271643
Q99 2697.165859 64.254329
OutsideTolerance1p 0.943723 1.0
OutsideTolerance2p 0.883117 1.0
OutsideTolerance3p 0.80303 1.0
OutsideTolerance4p 0.755411 0.75
OutsideTolerance5p 0.705628 0.5
Outside Tolerance (fraction) None None
R2_null_model_score None None
Prediction Band Width (%) 91.188809 41.58916
Prediction Band Coverage (fraction) 0.965368 1.0
Coverage: Lower Band 0.465368 1.0
Coverage: Upper Band 0.5 0.0
Coverage Diff: Actual_Coverage - Intended_Coverage 0.015368 0.05
MIS 37747.359084 33697.724421
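The test R2 above is negative even though the test correlation is high. Standard textbook definitions make that plausible: R2 compares squared errors against a constant-mean baseline, so a forecast that follows the shape of the actuals but is offset in level can score below zero. The sketch below uses made-up numbers (not the bike-sharing data) and plain definitions, which may differ in detail from Greykite's metric code.

```python
import math


def mape(actual, pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, pred)) / len(actual)


def r2(actual, pred):
    """Coefficient of determination: 1 - SSE / SST."""
    mean = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, pred))
    sst = sum((a - mean) ** 2 for a in actual)
    return 1.0 - sse / sst


def corr(x, y):
    """Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Made-up values: the forecast follows the shape but sits 10 units too high.
actual = [100.0, 104.0, 98.0, 102.0]
pred = [a + 10.0 for a in actual]

print(round(corr(actual, pred), 3))
print(round(r2(actual, pred), 3))
print(round(mape(actual, pred), 2))
```

With only a handful of test points that vary little around their mean, SST is small, so even a moderate level offset drives R2 far below zero while MAPE stays modest.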
Fit/backtest plot:
fig = result.backtest.plot()
plotly.io.show(fig)
Forecast plot:
fig = result.forecast.plot()
plotly.io.show(fig)
The components plot:
fig = result.forecast.plot_components()
plotly.io.show(fig)
Fit a simple model with autoregression.
This is done by specifying the autoregression parameter in ModelComponentsParam.
Note that the auto-regressive structure can be customized further depending on your data.
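A lag-1 regressor simply pairs each observation with the one before it. The sketch below shows that pairing on the first five values of the series printed earlier. The helper add_lag is hypothetical and is not how Greykite builds its y_lag1 column internally.

```python
def add_lag(values, order):
    """Returns (lagged_value, value) pairs.

    The first `order` rows have no earlier observation, so their lag is None.
    """
    rows = []
    for i, value in enumerate(values):
        lagged = values[i - order] if i >= order else None
        rows.append((lagged, value))
    return rows


# First five weekly counts from the ts.df.head() output.
y = [2801, 3238, 6241, 7756, 9556]
rows = add_lag(y, order=1)
for lagged, value in rows:
    print(lagged, value)
```

Rows without a lagged value cannot be used as-is for fitting, which is one reason the amount of usable training data depends on the lag orders chosen.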
autoregression = {
    "autoreg_dict": {
        "lag_dict": {"orders": [1]},  # Only use lag-1
        "agg_lag_dict": None
    }
}
extra_pred_cols = ["ct1", "ct_sqrt", "ct1:C(month, levels=list(range(1, 13)))"]

# Specify the model parameters
model_components = ModelComponentsParam(
    autoregression=autoregression,
    seasonality={
        "yearly_seasonality": 25,
        "quarterly_seasonality": 0,
        "monthly_seasonality": 0,
        "weekly_seasonality": 0,
        "daily_seasonality": 0
    },
    changepoints={
        "changepoints_dict": {
            "method": "auto",
            "resample_freq": "7D",
            "regularization_strength": 0.5,
            "potential_changepoint_distance": "14D",
            "no_changepoint_distance_from_end": "60D",
            "yearly_seasonality_order": 25,
            "yearly_seasonality_change_freq": None,
        },
        "seasonality_changepoints_dict": None
    },
    events={
        "holiday_lookup_countries": []
    },
    growth={
        "growth_term": None
    },
    custom={
        "feature_sets_enabled": False,
        "fit_algorithm_dict": dict(fit_algorithm="ridge"),
        "extra_pred_cols": extra_pred_cols,
    }
)

forecast_config = ForecastConfig(
    metadata_param=metadata,
    forecast_horizon=forecast_horizon,
    coverage=0.95,
    evaluation_period_param=evaluation_period,
    model_components_param=model_components
)

# Run the forecast model
forecaster = Forecaster()
result = forecaster.run_forecast_config(
    df=ts.df,
    config=forecast_config
)
Out:
Fitting 6 folds for each of 1 candidates, totalling 6 fits
Let’s check the model results summary and plots.
get_model_results_summary(result)
Out:
================================ Model Summary =================================
Number of observations: 466, Number of features: 69
Method: Ridge regression
Number of nonzero features: 69
Regularization parameter: 0.0621
Residuals:
Min 1Q Median 3Q Max
-2.954e+04 -3841.0 468.7 4111.0 2.007e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 1.228e+04 5177.0 0.032 * (4335.0, 2.535e+04)
ct1 1.700e+04 5159.0 <2e-16 *** (7118.0, 2.734e+04)
ct1:C(mo... 13)))_2 -1832.0 5632.0 0.736 (-1.330e+04, 9083.0)
ct1:C(mo... 13)))_3 4877.0 5708.0 0.358 (-6388.0, 1.603e+04)
ct1:C(mo... 13)))_4 2.342e+04 5560.0 <2e-16 *** (1.159e+04, 3.335e+04)
ct1:C(mo... 13)))_5 1.770e+04 5896.0 0.002 ** (4848.0, 2.873e+04)
ct1:C(mo... 13)))_6 2.350e+04 5028.0 <2e-16 *** (1.261e+04, 3.282e+04)
ct1:C(mo... 13)))_7 2.675e+04 5046.0 <2e-16 *** (1.632e+04, 3.543e+04)
ct1:C(mo... 13)))_8 2.596e+04 5051.0 <2e-16 *** (1.576e+04, 3.505e+04)
ct1:C(mo... 13)))_9 1.952e+04 5703.0 <2e-16 *** (8022.0, 3.109e+04)
ct1:C(mo...13)))_10 2.056e+04 5295.0 <2e-16 *** (9193.0, 2.981e+04)
ct1:C(mo...13)))_11 2924.0 4964.0 0.542 (-7041.0, 1.253e+04)
ct1:C(mo...13)))_12 1582.0 5054.0 0.744 (-9825.0, 1.027e+04)
ct_sqrt 3.718e+04 6168.0 <2e-16 *** (2.412e+04, 4.862e+04)
sin1_ct1_yearly -1.452e+04 1884.0 <2e-16 *** (-1.872e+04, -1.138e+04)
cos1_ct1_yearly 4158.0 1757.0 0.024 * (680.2, 7525.0)
sin2_ct1_yearly 3378.0 1532.0 0.040 * (132.1, 6629.0)
cos2_ct1_yearly 4393.0 1571.0 0.006 ** (1288.0, 7432.0)
sin3_ct1_yearly 1934.0 1471.0 0.176 (-1251.0, 4608.0)
cos3_ct1_yearly -201.8 1486.0 0.872 (-3176.0, 2570.0)
sin4_ct1_yearly -914.2 1415.0 0.540 (-3749.0, 1559.0)
cos4_ct1_yearly -796.1 1362.0 0.552 (-3284.0, 1881.0)
sin5_ct1_yearly -1596.0 1359.0 0.258 (-4102.0, 926.4)
cos5_ct1_yearly -988.7 1279.0 0.434 (-3298.0, 1556.0)
sin6_ct1_yearly 113.2 1416.0 0.966 (-2437.0, 2970.0)
cos6_ct1_yearly 1641.0 1102.0 0.116 (-680.0, 3672.0)
sin7_ct1_yearly -485.6 1273.0 0.720 (-3066.0, 1949.0)
cos7_ct1_yearly -215.5 1100.0 0.850 (-2265.0, 1649.0)
sin8_ct1_yearly -1111.0 1151.0 0.312 (-3538.0, 1053.0)
cos8_ct1_yearly 519.5 1081.0 0.670 (-1614.0, 2659.0)
sin9_ct1_yearly -2083.0 1068.0 0.048 * (-4068.0, 4.552)
cos9_ct1_yearly 1135.0 1122.0 0.308 (-1099.0, 3358.0)
sin10_ct1_yearly 178.2 1069.0 0.848 (-2086.0, 2119.0)
cos10_ct1_yearly 1482.0 945.5 0.114 (-300.2, 3292.0)
sin11_ct1_yearly 2166.0 943.2 0.018 * (225.3, 4031.0)
cos11_ct1_yearly -251.9 1063.0 0.826 (-2368.0, 1645.0)
sin12_ct1_yearly -1328.0 963.9 0.154 (-3114.0, 607.4)
cos12_ct1_yearly -2660.0 966.8 0.006 ** (-4659.0, -872.6)
sin13_ct1_yearly -1046.0 976.9 0.284 (-3129.0, 645.8)
cos13_ct1_yearly 463.8 1011.0 0.638 (-1556.0, 2372.0)
sin14_ct1_yearly -1463.0 976.3 0.136 (-3406.0, 412.8)
cos14_ct1_yearly 1141.0 1128.0 0.304 (-1049.0, 3347.0)
sin15_ct1_yearly -189.7 1057.0 0.870 (-2378.0, 1823.0)
cos15_ct1_yearly 1158.0 1007.0 0.260 (-784.7, 3191.0)
sin16_ct1_yearly -2142.0 990.7 0.028 * (-4039.0, -276.4)
cos16_ct1_yearly -1075.0 988.8 0.286 (-2873.0, 918.0)
sin17_ct1_yearly 423.9 1024.0 0.654 (-1570.0, 2324.0)
cos17_ct1_yearly 983.1 1044.0 0.338 (-1141.0, 3111.0)
sin18_ct1_yearly -245.8 999.0 0.806 (-2171.0, 1902.0)
cos18_ct1_yearly -192.3 1079.0 0.868 (-2220.0, 1732.0)
sin19_ct1_yearly -698.6 997.3 0.450 (-2600.0, 1133.0)
cos19_ct1_yearly -929.2 976.0 0.338 (-2820.0, 1172.0)
sin20_ct1_yearly 1321.0 994.2 0.196 (-540.0, 3342.0)
cos20_ct1_yearly 428.0 1020.0 0.668 (-1383.0, 2632.0)
sin21_ct1_yearly -1275.0 973.2 0.184 (-3257.0, 496.6)
cos21_ct1_yearly -740.6 1069.0 0.518 (-2912.0, 1185.0)
sin22_ct1_yearly 1233.0 984.4 0.210 (-778.9, 3253.0)
cos22_ct1_yearly 603.5 1030.0 0.562 (-1423.0, 2671.0)
sin23_ct1_yearly 570.9 985.5 0.538 (-1561.0, 2407.0)
cos23_ct1_yearly -3478.0 1004.0 <2e-16 *** (-5498.0, -1624.0)
sin24_ct1_yearly -1124.0 1019.0 0.276 (-3128.0, 694.5)
cos24_ct1_yearly -484.6 1004.0 0.656 (-2373.0, 1597.0)
sin25_ct1_yearly -2646.0 971.2 0.010 * (-4556.0, -762.4)
cos25_ct1_yearly 307.7 960.2 0.726 (-1343.0, 2309.0)
cp0_2012_01_30_00 2383.0 7732.0 0.764 (-1.238e+04, 1.782e+04)
cp1_2013_01_14_00 -1.616e+04 7495.0 0.026 * (-2.973e+04, -1020.0)
cp2_2015_02_23_00 -2032.0 4820.0 0.650 (-1.222e+04, 6862.0)
cp3_2017_10_02_00 -1.392e+04 2978.0 <2e-16 *** (-1.960e+04, -8143.0)
y_lag1 2.981e+04 5195.0 <2e-16 *** (2.024e+04, 3.995e+04)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9245, Adjusted R-squared: 0.912
F-statistic: 73.697 on 66 and 398 DF, p-value: 1.110e-16
Model AIC: 11220.0, model BIC: 11498.0
WARNING: the condition number is large, 1.04e+05. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
================================= CV Results ==================================
0
rank_test_MAPE 1
mean_test_MAPE 11.13
split_test_MAPE (13.86, 14.93, 6.43, 8.6, 11.99, 10.95)
mean_train_MAPE 14.46
split_train_MAPE (14.61, 14.54, 14.48, 14.43, 14.38, 14.31)
mean_fit_time 2.31
mean_score_time 4.52
params []
=========================== Train/Test Evaluation =============================
train test
CORR 0.961255 -0.401978
R2 0.924008 -2.511734
MSE 46248880.414021 60336402.966613
RMSE 6800.65294 7767.651058
MAE 5131.614192 5780.149922
MedAE 4077.576962 4754.447672
MAPE 14.266985 7.445479
MedAPE 7.631892 5.969659
sMAPE 6.689048 3.485365
Q80 2565.807096 1156.029984
Q95 2565.807096 289.007496
Q99 2565.807096 57.801499
OutsideTolerance1p 0.919913 0.75
OutsideTolerance2p 0.863636 0.75
OutsideTolerance3p 0.798701 0.5
OutsideTolerance4p 0.757576 0.5
OutsideTolerance5p 0.675325 0.5
Outside Tolerance (fraction) None None
R2_null_model_score None None
Prediction Band Width (%) 87.710451 30.41495
Prediction Band Coverage (fraction) 0.965368 0.75
Coverage: Lower Band 0.4329 0.75
Coverage: Upper Band 0.532468 0.0
Coverage Diff: Actual_Coverage - Intended_Coverage 0.015368 -0.2
MIS 36540.712701 82074.787535
Fit/backtest plot:
fig = result.backtest.plot()
plotly.io.show(fig)
Forecast plot:
fig = result.forecast.plot()
plotly.io.show(fig)
The components plot:
fig = result.forecast.plot_components()
plotly.io.show(fig)
Fit a greykite model with autoregression and forecast one-by-one. Forecast one-by-one is only used when autoregression is set to “auto”, and it can be enabled by setting forecast_one_by_one=True in ForecastConfig.
Without forecast one-by-one, the lag orders in autoregression have to be at least as large as the forecast horizon in order to avoid simulation (which leads to less accuracy).
The advantage of turning on forecast_one_by_one is improved forecast accuracy: the forecast horizon is broken into smaller steps, and multiple models are fitted using immediate lags.
Note that the forecast one-by-one option may slow down the training.
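The relationship between lag orders and the forecast horizon can be sketched with some simple bookkeeping. When forecasting h steps past the last observation, a lag of order k refers to the value h - k steps past the last observation, so it is already observed only when k >= h. The helper below is hypothetical and only illustrates this; it does not reproduce how Greykite chooses lags when autoreg_dict is "auto", and the horizon of 4 is an assumption.

```python
def unobserved_lags(lag_orders, step):
    """Returns the lag orders that point at values not yet observed.

    For a forecast `step` periods past the last observation, lag k refers to
    the value at offset (step - k); a positive offset lies in the future.
    """
    return [k for k in lag_orders if step - k > 0]


horizon = 4  # assumed forecast horizon (weeks)
for step in range(1, horizon + 1):
    missing = unobserved_lags([1, 2, 3], step)
    print(f"step {step}: lags needing simulated values -> {missing}")
```

With lags 1 to 3, only the first step can be predicted from observed values alone. Fitting a separate model per step, each with lags suited to that step, is one way to avoid feeding simulated values back into the model.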
autoregression = {
    "autoreg_dict": "auto"
}
extra_pred_cols = ["ct1", "ct_sqrt", "ct1:C(month, levels=list(range(1, 13)))"]
forecast_one_by_one = True

# Specify the model parameters
model_components = ModelComponentsParam(
    autoregression=autoregression,
    seasonality={
        "yearly_seasonality": 25,
        "quarterly_seasonality": 0,
        "monthly_seasonality": 0,
        "weekly_seasonality": 0,
        "daily_seasonality": 0
    },
    changepoints={
        "changepoints_dict": {
            "method": "auto",
            "resample_freq": "7D",
            "regularization_strength": 0.5,
            "potential_changepoint_distance": "14D",
            "no_changepoint_distance_from_end": "60D",
            "yearly_seasonality_order": 25,
            "yearly_seasonality_change_freq": None,
        },
        "seasonality_changepoints_dict": None
    },
    events={
        "holiday_lookup_countries": []
    },
    growth={
        "growth_term": None
    },
    custom={
        "feature_sets_enabled": False,
        "fit_algorithm_dict": dict(fit_algorithm="ridge"),
        "extra_pred_cols": extra_pred_cols,
    }
)

forecast_config = ForecastConfig(
    metadata_param=metadata,
    forecast_horizon=forecast_horizon,
    coverage=0.95,
    evaluation_period_param=evaluation_period,
    model_components_param=model_components,
    forecast_one_by_one=forecast_one_by_one
)

# Run the forecast model
forecaster = Forecaster()
result = forecaster.run_forecast_config(
    df=ts.df,
    config=forecast_config
)
Out:
Fitting 6 folds for each of 1 candidates, totalling 6 fits
Let’s check the model results summary and plots. Here the forecast_one_by_one option fits one model for each of the 4 forecast steps, hence 4 model summaries are printed, and 4 components plots are generated.
get_model_results_summary(result)
Out:
[================================ Model Summary =================================
Number of observations: 466, Number of features: 71
Method: Ridge regression
Number of nonzero features: 71
Regularization parameter: 0.0621
Residuals:
Min 1Q Median 3Q Max
-2.890e+04 -3782.0 623.6 3971.0 2.048e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 1.157e+04 4549.0 0.014 * (4240.0, 2.099e+04)
ct1 1.383e+04 4904.0 0.002 ** (4850.0, 2.345e+04)
ct1:C(mo... 13)))_2 -1369.0 5322.0 0.808 (-1.218e+04, 8889.0)
ct1:C(mo... 13)))_3 3498.0 5490.0 0.532 (-7190.0, 1.402e+04)
ct1:C(mo... 13)))_4 2.005e+04 5341.0 <2e-16 *** (9653.0, 3.060e+04)
ct1:C(mo... 13)))_5 1.243e+04 5908.0 0.030 * (1121.0, 2.304e+04)
ct1:C(mo... 13)))_6 1.846e+04 5142.0 <2e-16 *** (8311.0, 2.905e+04)
ct1:C(mo... 13)))_7 2.014e+04 5056.0 <2e-16 *** (9793.0, 2.962e+04)
ct1:C(mo... 13)))_8 1.952e+04 5335.0 <2e-16 *** (8284.0, 3.016e+04)
ct1:C(mo... 13)))_9 1.316e+04 5832.0 0.020 * (1382.0, 2.513e+04)
ct1:C(mo...13)))_10 1.442e+04 5720.0 0.006 ** (3802.0, 2.553e+04)
ct1:C(mo...13)))_11 -2251.0 4988.0 0.670 (-1.243e+04, 7466.0)
ct1:C(mo...13)))_12 88.81 5346.0 0.988 (-1.095e+04, 1.092e+04)
ct_sqrt 2.724e+04 6156.0 <2e-16 *** (1.443e+04, 3.870e+04)
sin1_ct1_yearly -1.184e+04 2064.0 <2e-16 *** (-1.592e+04, -8085.0)
cos1_ct1_yearly 2201.0 1618.0 0.162 (-730.6, 5501.0)
sin2_ct1_yearly 2414.0 1345.0 0.080 . (-195.6, 4978.0)
cos2_ct1_yearly 4682.0 1474.0 <2e-16 *** (1876.0, 7418.0)
sin3_ct1_yearly 2098.0 1444.0 0.144 (-751.9, 4583.0)
cos3_ct1_yearly -160.8 1314.0 0.916 (-3023.0, 2044.0)
sin4_ct1_yearly -903.4 1370.0 0.500 (-3611.0, 1747.0)
cos4_ct1_yearly -403.2 1348.0 0.750 (-3173.0, 2116.0)
sin5_ct1_yearly -1422.0 1353.0 0.284 (-3968.0, 1043.0)
cos5_ct1_yearly -1450.0 1237.0 0.250 (-3799.0, 868.2)
sin6_ct1_yearly 18.43 1308.0 0.988 (-2688.0, 2362.0)
cos6_ct1_yearly 2262.0 1111.0 0.038 * (-74.55, 4261.0)
sin7_ct1_yearly -570.3 1254.0 0.656 (-3136.0, 1877.0)
cos7_ct1_yearly -392.5 1104.0 0.730 (-2452.0, 1776.0)
sin8_ct1_yearly -1046.0 1164.0 0.372 (-3329.0, 1069.0)
cos8_ct1_yearly 314.1 1060.0 0.788 (-1830.0, 2252.0)
sin9_ct1_yearly -2537.0 1042.0 0.010 * (-4420.0, -667.2)
cos9_ct1_yearly 1553.0 1050.0 0.146 (-631.8, 3483.0)
sin10_ct1_yearly 384.5 1065.0 0.694 (-1846.0, 2431.0)
cos10_ct1_yearly 1482.0 1010.0 0.152 (-556.2, 3530.0)
sin11_ct1_yearly 2064.0 984.8 0.038 * (71.7, 3976.0)
cos11_ct1_yearly -377.9 959.2 0.700 (-2120.0, 1395.0)
sin12_ct1_yearly -1793.0 1001.0 0.080 . (-3824.0, 131.9)
cos12_ct1_yearly -2747.0 1010.0 0.004 ** (-4844.0, -940.7)
sin13_ct1_yearly -926.8 937.8 0.326 (-2735.0, 793.2)
cos13_ct1_yearly 1072.0 1030.0 0.280 (-910.4, 3143.0)
sin14_ct1_yearly -1079.0 966.7 0.264 (-2864.0, 801.6)
cos14_ct1_yearly 1292.0 1006.0 0.202 (-778.8, 3093.0)
sin15_ct1_yearly -10.76 1060.0 0.992 (-2143.0, 2058.0)
cos15_ct1_yearly 1135.0 939.2 0.242 (-630.5, 2936.0)
sin16_ct1_yearly -2247.0 974.2 0.022 * (-4145.0, -291.2)
cos16_ct1_yearly -882.8 980.4 0.402 (-2605.0, 1114.0)
sin17_ct1_yearly 484.5 1007.0 0.656 (-1435.0, 2424.0)
cos17_ct1_yearly 891.7 1033.0 0.376 (-1202.0, 2905.0)
sin18_ct1_yearly -413.2 1051.0 0.712 (-2677.0, 1463.0)
cos18_ct1_yearly -31.73 977.3 0.972 (-1894.0, 1918.0)
sin19_ct1_yearly -696.1 1022.0 0.508 (-2669.0, 1292.0)
cos19_ct1_yearly -785.3 1012.0 0.452 (-2883.0, 1058.0)
sin20_ct1_yearly 1262.0 966.0 0.212 (-572.4, 3108.0)
cos20_ct1_yearly 359.5 1014.0 0.730 (-1379.0, 2526.0)
sin21_ct1_yearly -1148.0 969.1 0.250 (-3058.0, 624.8)
cos21_ct1_yearly -670.7 1066.0 0.542 (-2719.0, 1437.0)
sin22_ct1_yearly 1086.0 898.6 0.218 (-940.0, 2588.0)
cos22_ct1_yearly 421.2 1047.0 0.744 (-1528.0, 2358.0)
sin23_ct1_yearly 409.4 924.9 0.664 (-1302.0, 2347.0)
cos23_ct1_yearly -3170.0 1075.0 <2e-16 *** (-5045.0, -1119.0)
sin24_ct1_yearly -1020.0 968.9 0.306 (-2907.0, 791.3)
cos24_ct1_yearly -452.9 1011.0 0.624 (-2356.0, 1673.0)
sin25_ct1_yearly -2412.0 976.0 0.012 * (-4289.0, -534.3)
cos25_ct1_yearly 349.5 998.5 0.716 (-1460.0, 2468.0)
cp0_2012_01_30_00 2834.0 7475.0 0.680 (-1.076e+04, 1.732e+04)
cp1_2013_01_14_00 -1.253e+04 7827.0 0.102 (-2.791e+04, 2154.0)
cp2_2015_02_23_00 -1701.0 4845.0 0.700 (-1.136e+04, 8025.0)
cp3_2017_10_02_00 -1.108e+04 2789.0 0.002 ** (-1.614e+04, -5200.0)
y_lag1 2.380e+04 5587.0 <2e-16 *** (1.294e+04, 3.469e+04)
y_lag2 1.217e+04 5644.0 0.034 * (992.4, 2.396e+04)
y_lag3 1.001e+04 5309.0 0.048 * (-178.5, 1.963e+04)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9265, Adjusted R-squared: 0.9139
F-statistic: 73.444 on 67 and 397 DF, p-value: 1.110e-16
Model AIC: 11212.0, model BIC: 11498.0
WARNING: the condition number is large, 1.08e+05. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
, ================================ Model Summary =================================
Number of observations: 466, Number of features: 71
Method: Ridge regression
Number of nonzero features: 71
Regularization parameter: 0.0621
Residuals:
Min 1Q Median 3Q Max
-2.845e+04 -3677.0 385.2 4356.0 1.959e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 1.081e+04 4680.0 0.022 * (2906.0, 2.078e+04)
ct1 1.641e+04 6144.0 0.004 ** (4315.0, 2.821e+04)
ct1:C(mo... 13)))_2 -370.3 6424.0 0.958 (-1.453e+04, 1.194e+04)
ct1:C(mo... 13)))_3 4962.0 6173.0 0.412 (-6341.0, 1.745e+04)
ct1:C(mo... 13)))_4 2.512e+04 5612.0 <2e-16 *** (1.270e+04, 3.542e+04)
ct1:C(mo... 13)))_5 1.566e+04 6003.0 0.010 * (3079.0, 2.623e+04)
ct1:C(mo... 13)))_6 2.458e+04 5345.0 <2e-16 *** (1.399e+04, 3.563e+04)
ct1:C(mo... 13)))_7 2.551e+04 5407.0 <2e-16 *** (1.516e+04, 3.588e+04)
ct1:C(mo... 13)))_8 2.564e+04 5745.0 <2e-16 *** (1.504e+04, 3.643e+04)
ct1:C(mo... 13)))_9 1.826e+04 6006.0 0.004 ** (5956.0, 2.878e+04)
ct1:C(mo...13)))_10 1.944e+04 5742.0 <2e-16 *** (6433.0, 3.031e+04)
ct1:C(mo...13)))_11 -402.0 5404.0 0.946 (-1.088e+04, 1.075e+04)
ct1:C(mo...13)))_12 649.1 5321.0 0.894 (-1.102e+04, 1.016e+04)
ct_sqrt 3.508e+04 7159.0 <2e-16 *** (2.060e+04, 4.795e+04)
sin1_ct1_yearly -1.474e+04 2119.0 <2e-16 *** (-1.904e+04, -1.062e+04)
cos1_ct1_yearly 3027.0 1778.0 0.076 . (-272.8, 6283.0)
sin2_ct1_yearly 3383.0 1507.0 0.022 * (541.1, 6379.0)
cos2_ct1_yearly 5267.0 1678.0 0.006 ** (2296.0, 8752.0)
sin3_ct1_yearly 2135.0 1491.0 0.162 (-809.4, 4834.0)
cos3_ct1_yearly -350.5 1458.0 0.806 (-3231.0, 2487.0)
sin4_ct1_yearly -665.1 1437.0 0.658 (-3171.0, 2440.0)
cos4_ct1_yearly -669.7 1439.0 0.614 (-3469.0, 2128.0)
sin5_ct1_yearly -1646.0 1423.0 0.244 (-4728.0, 892.0)
cos5_ct1_yearly -1388.0 1261.0 0.258 (-3593.0, 1551.0)
sin6_ct1_yearly -82.52 1480.0 0.958 (-2977.0, 2832.0)
cos6_ct1_yearly 2357.0 1164.0 0.040 * (385.4, 4681.0)
sin7_ct1_yearly -591.4 1312.0 0.666 (-3276.0, 1871.0)
cos7_ct1_yearly -571.0 1063.0 0.566 (-2716.0, 1659.0)
sin8_ct1_yearly -1370.0 1224.0 0.252 (-3918.0, 1046.0)
cos8_ct1_yearly 438.5 1059.0 0.688 (-1599.0, 2571.0)
sin9_ct1_yearly -2511.0 1100.0 0.024 * (-4663.0, -425.2)
cos9_ct1_yearly 2370.0 1145.0 0.040 * (344.0, 4837.0)
sin10_ct1_yearly 767.9 1105.0 0.502 (-1347.0, 2889.0)
cos10_ct1_yearly 1622.0 1054.0 0.126 (-533.2, 3516.0)
sin11_ct1_yearly 2459.0 1011.0 0.012 * (494.4, 4342.0)
cos11_ct1_yearly -712.9 936.8 0.488 (-2323.0, 1077.0)
sin12_ct1_yearly -2496.0 1020.0 0.014 * (-4516.0, -581.1)
cos12_ct1_yearly -2554.0 1044.0 0.018 * (-4623.0, -408.7)
sin13_ct1_yearly -464.0 1029.0 0.636 (-2606.0, 1509.0)
cos13_ct1_yearly 1258.0 1058.0 0.254 (-769.1, 3366.0)
sin14_ct1_yearly -833.4 968.1 0.396 (-2769.0, 881.9)
cos14_ct1_yearly 1381.0 1020.0 0.180 (-544.8, 3522.0)
sin15_ct1_yearly 256.7 1069.0 0.814 (-1900.0, 2145.0)
cos15_ct1_yearly 1171.0 1020.0 0.228 (-902.5, 3145.0)
sin16_ct1_yearly -2512.0 1046.0 0.020 * (-4526.0, -642.8)
cos16_ct1_yearly -483.9 1079.0 0.668 (-2410.0, 1751.0)
sin17_ct1_yearly 730.2 1043.0 0.470 (-1225.0, 2725.0)
cos17_ct1_yearly 778.5 1032.0 0.444 (-1192.0, 2839.0)
sin18_ct1_yearly -699.5 1096.0 0.524 (-2842.0, 1459.0)
cos18_ct1_yearly 272.6 1077.0 0.804 (-1704.0, 2557.0)
sin19_ct1_yearly -664.3 1003.0 0.514 (-2704.0, 1011.0)
cos19_ct1_yearly -601.2 1035.0 0.548 (-2704.0, 1487.0)
sin20_ct1_yearly 1087.0 1062.0 0.328 (-1057.0, 3059.0)
cos20_ct1_yearly 252.7 993.1 0.806 (-1731.0, 2157.0)
sin21_ct1_yearly -908.7 1023.0 0.408 (-2729.0, 1139.0)
cos21_ct1_yearly -453.1 1008.0 0.628 (-2704.0, 1402.0)
sin22_ct1_yearly 922.5 974.1 0.332 (-1026.0, 2765.0)
cos22_ct1_yearly 129.3 998.9 0.870 (-1703.0, 2154.0)
sin23_ct1_yearly 247.3 924.1 0.772 (-1823.0, 1857.0)
cos23_ct1_yearly -2725.0 1058.0 0.014 * (-4657.0, -594.9)
sin24_ct1_yearly -933.8 1039.0 0.376 (-2886.0, 1080.0)
cos24_ct1_yearly -348.9 1029.0 0.744 (-2289.0, 1628.0)
sin25_ct1_yearly -1887.0 1077.0 0.076 . (-3900.0, 387.2)
cos25_ct1_yearly 142.6 1025.0 0.878 (-1796.0, 2026.0)
cp0_2012_01_30_00 2893.0 8145.0 0.708 (-1.267e+04, 1.846e+04)
cp1_2013_01_14_00 -1.565e+04 8435.0 0.060 . (-3.155e+04, 808.2)
cp2_2015_02_23_00 -1969.0 5143.0 0.724 (-1.203e+04, 8177.0)
cp3_2017_10_02_00 -1.377e+04 3015.0 <2e-16 *** (-1.960e+04, -7951.0)
y_lag2 1.841e+04 5419.0 <2e-16 *** (7827.0, 2.887e+04)
y_lag3 1.335e+04 5487.0 0.012 * (1829.0, 2.490e+04)
y_lag4 384.1 5590.0 0.968 (-1.032e+04, 1.106e+04)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9225, Adjusted R-squared: 0.9092
F-statistic: 69.268 on 67 and 397 DF, p-value: 1.110e-16
Model AIC: 11236.0, model BIC: 11522.0
WARNING: the condition number is large, 1.08e+05. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
, ================================ Model Summary =================================
Number of observations: 466, Number of features: 71
Method: Ridge regression
Number of nonzero features: 71
Regularization parameter: 0.0621
Residuals:
Min 1Q Median 3Q Max
-2.803e+04 -3780.0 295.0 4331.0 2.130e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 1.174e+04 5012.0 0.026 * (3186.0, 2.304e+04)
ct1 1.729e+04 5686.0 0.002 ** (7067.0, 2.854e+04)
ct1:C(mo... 13)))_2 1405.0 5878.0 0.802 (-9391.0, 1.233e+04)
ct1:C(mo... 13)))_3 7198.0 5927.0 0.234 (-4840.0, 1.896e+04)
ct1:C(mo... 13)))_4 2.871e+04 5132.0 <2e-16 *** (1.762e+04, 3.748e+04)
ct1:C(mo... 13)))_5 1.782e+04 5800.0 0.002 ** (6229.0, 2.760e+04)
ct1:C(mo... 13)))_6 2.810e+04 4979.0 <2e-16 *** (1.820e+04, 3.796e+04)
ct1:C(mo... 13)))_7 2.890e+04 5206.0 <2e-16 *** (1.797e+04, 3.894e+04)
ct1:C(mo... 13)))_8 2.861e+04 5080.0 <2e-16 *** (1.825e+04, 3.777e+04)
ct1:C(mo... 13)))_9 2.073e+04 5944.0 <2e-16 *** (7752.0, 3.137e+04)
ct1:C(mo...13)))_10 2.231e+04 5351.0 <2e-16 *** (1.075e+04, 3.192e+04)
ct1:C(mo...13)))_11 132.7 5400.0 0.982 (-1.037e+04, 1.042e+04)
ct1:C(mo...13)))_12 1278.0 5236.0 0.838 (-9561.0, 1.046e+04)
ct_sqrt 3.670e+04 7179.0 <2e-16 *** (2.109e+04, 4.969e+04)
sin1_ct1_yearly -1.623e+04 1835.0 <2e-16 *** (-2.043e+04, -1.305e+04)
cos1_ct1_yearly 2976.0 1920.0 0.124 (-397.2, 6991.0)
sin2_ct1_yearly 3903.0 1591.0 0.022 * (938.6, 6940.0)
cos2_ct1_yearly 5873.0 1618.0 0.002 ** (2817.0, 9328.0)
sin3_ct1_yearly 1995.0 1563.0 0.212 (-1328.0, 4870.0)
cos3_ct1_yearly -385.0 1549.0 0.798 (-3270.0, 2540.0)
sin4_ct1_yearly -256.7 1410.0 0.862 (-2995.0, 2592.0)
cos4_ct1_yearly -1006.0 1455.0 0.462 (-3930.0, 1923.0)
sin5_ct1_yearly -2232.0 1406.0 0.102 (-5056.0, 437.2)
cos5_ct1_yearly -1400.0 1429.0 0.326 (-4434.0, 1419.0)
sin6_ct1_yearly -94.22 1488.0 0.954 (-2623.0, 3189.0)
cos6_ct1_yearly 1961.0 1147.0 0.090 . (-196.0, 4200.0)
sin7_ct1_yearly -576.8 1349.0 0.668 (-3266.0, 1948.0)
cos7_ct1_yearly -366.1 1129.0 0.782 (-2539.0, 1947.0)
sin8_ct1_yearly -1574.0 1191.0 0.182 (-3927.0, 709.9)
cos8_ct1_yearly 846.0 1149.0 0.446 (-1438.0, 3068.0)
sin9_ct1_yearly -1734.0 1102.0 0.130 (-3948.0, 452.8)
cos9_ct1_yearly 2571.0 1182.0 0.032 * (137.3, 4867.0)
sin10_ct1_yearly 936.5 1084.0 0.388 (-1156.0, 3000.0)
cos10_ct1_yearly 1333.0 1067.0 0.212 (-888.5, 3160.0)
sin11_ct1_yearly 2276.0 1077.0 0.042 * (-27.64, 4196.0)
cos11_ct1_yearly -887.5 1045.0 0.412 (-2856.0, 1040.0)
sin12_ct1_yearly -2133.0 1034.0 0.038 * (-3941.0, -80.9)
cos12_ct1_yearly -2129.0 1100.0 0.060 . (-4380.0, 76.26)
sin13_ct1_yearly -311.7 1051.0 0.800 (-2353.0, 1680.0)
cos13_ct1_yearly 806.0 1142.0 0.516 (-1217.0, 3044.0)
sin14_ct1_yearly -983.8 1017.0 0.336 (-2945.0, 975.4)
cos14_ct1_yearly 1091.0 1015.0 0.288 (-752.3, 3267.0)
sin15_ct1_yearly 138.3 1054.0 0.878 (-1804.0, 2307.0)
cos15_ct1_yearly 1151.0 1006.0 0.264 (-947.6, 3019.0)
sin16_ct1_yearly -2434.0 1010.0 0.016 * (-4516.0, -536.3)
cos16_ct1_yearly -554.2 1091.0 0.624 (-2561.0, 1654.0)
sin17_ct1_yearly 657.1 1062.0 0.548 (-1332.0, 2567.0)
cos17_ct1_yearly 765.6 1126.0 0.488 (-1579.0, 2890.0)
sin18_ct1_yearly -705.3 1068.0 0.504 (-2858.0, 1275.0)
cos18_ct1_yearly 471.5 1086.0 0.636 (-1578.0, 2623.0)
sin19_ct1_yearly -460.8 970.3 0.652 (-2698.0, 1385.0)
cos19_ct1_yearly -671.8 1056.0 0.524 (-2791.0, 1243.0)
sin20_ct1_yearly 1066.0 954.6 0.264 (-665.5, 2981.0)
cos20_ct1_yearly 441.2 990.3 0.648 (-1285.0, 2452.0)
sin21_ct1_yearly -826.5 980.4 0.420 (-2684.0, 1053.0)
cos21_ct1_yearly -657.0 1031.0 0.540 (-2715.0, 1376.0)
sin22_ct1_yearly 867.5 937.5 0.350 (-922.9, 2686.0)
cos22_ct1_yearly 350.7 1101.0 0.752 (-1898.0, 2614.0)
sin23_ct1_yearly 802.0 1005.0 0.396 (-1331.0, 2647.0)
cos23_ct1_yearly -3043.0 1065.0 0.002 ** (-5273.0, -955.8)
sin24_ct1_yearly -1033.0 1056.0 0.336 (-2995.0, 1105.0)
cos24_ct1_yearly -523.5 1027.0 0.608 (-2607.0, 1435.0)
sin25_ct1_yearly -2544.0 1075.0 0.028 * (-4805.0, -496.3)
cos25_ct1_yearly 49.26 1043.0 0.942 (-1903.0, 2076.0)
cp0_2012_01_30_00 2950.0 8048.0 0.698 (-1.203e+04, 1.962e+04)
cp1_2013_01_14_00 -1.687e+04 7853.0 0.034 * (-3.232e+04, -1054.0)
cp2_2015_02_23_00 -2182.0 4954.0 0.660 (-1.260e+04, 7104.0)
cp3_2017_10_02_00 -1.481e+04 2823.0 <2e-16 *** (-2.024e+04, -9301.0)
y_lag3 1.671e+04 5385.0 <2e-16 *** (6367.0, 2.738e+04)
y_lag4 553.5 5618.0 0.918 (-9751.0, 1.175e+04)
y_lag5 1.004e+04 4856.0 0.032 * (865.7, 1.918e+04)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9208, Adjusted R-squared: 0.9072
F-statistic: 67.654 on 67 and 397 DF, p-value: 1.110e-16
Model AIC: 11246.0, model BIC: 11532.0
WARNING: the condition number is large, 1.08e+05. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
, ================================ Model Summary =================================
Number of observations: 466, Number of features: 71
Method: Ridge regression
Number of nonzero features: 71
Regularization parameter: 0.02807
Residuals:
Min 1Q Median 3Q Max
-2.844e+04 -3736.0 448.2 4182.0 2.170e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 1.167e+04 5390.0 0.038 * (2676.0, 2.389e+04)
ct1 1.785e+04 1.142e+04 0.124 (-3816.0, 4.054e+04)
ct1:C(mo... 13)))_2 2556.0 6449.0 0.676 (-9303.0, 1.564e+04)
ct1:C(mo... 13)))_3 9731.0 6008.0 0.108 (-1794.0, 2.193e+04)
ct1:C(mo... 13)))_4 3.373e+04 5771.0 <2e-16 *** (2.263e+04, 4.428e+04)
ct1:C(mo... 13)))_5 2.337e+04 6529.0 <2e-16 *** (9587.0, 3.531e+04)
ct1:C(mo... 13)))_6 3.428e+04 5592.0 <2e-16 *** (2.298e+04, 4.453e+04)
ct1:C(mo... 13)))_7 3.618e+04 5792.0 <2e-16 *** (2.511e+04, 4.728e+04)
ct1:C(mo... 13)))_8 3.622e+04 6144.0 <2e-16 *** (2.426e+04, 4.741e+04)
ct1:C(mo... 13)))_9 2.771e+04 6639.0 <2e-16 *** (1.526e+04, 3.937e+04)
ct1:C(mo...13)))_10 2.884e+04 6074.0 <2e-16 *** (1.756e+04, 3.976e+04)
ct1:C(mo...13)))_11 5801.0 6162.0 0.344 (-6076.0, 1.769e+04)
ct1:C(mo...13)))_12 3647.0 5310.0 0.500 (-7709.0, 1.347e+04)
ct_sqrt 4.639e+04 9225.0 <2e-16 *** (2.872e+04, 6.283e+04)
sin1_ct1_yearly -1.796e+04 1908.0 <2e-16 *** (-2.184e+04, -1.444e+04)
cos1_ct1_yearly 4496.0 1811.0 0.012 * (1219.0, 7998.0)
sin2_ct1_yearly 4660.0 1587.0 0.002 ** (1648.0, 7849.0)
cos2_ct1_yearly 5335.0 1671.0 <2e-16 *** (2027.0, 8476.0)
sin3_ct1_yearly 1819.0 1478.0 0.236 (-1118.0, 4674.0)
cos3_ct1_yearly -139.4 1632.0 0.926 (-3377.0, 2814.0)
sin4_ct1_yearly -144.7 1371.0 0.918 (-2857.0, 2421.0)
cos4_ct1_yearly -1255.0 1429.0 0.398 (-3824.0, 1366.0)
sin5_ct1_yearly -2204.0 1422.0 0.122 (-5037.0, 522.2)
cos5_ct1_yearly -1147.0 1253.0 0.364 (-3465.0, 1157.0)
sin6_ct1_yearly -262.0 1458.0 0.856 (-3198.0, 2459.0)
cos6_ct1_yearly 1650.0 1165.0 0.162 (-597.4, 3825.0)
sin7_ct1_yearly -376.6 1380.0 0.762 (-2940.0, 2454.0)
cos7_ct1_yearly -280.6 1217.0 0.772 (-2795.0, 2029.0)
sin8_ct1_yearly -1639.0 1149.0 0.158 (-4056.0, 555.1)
cos8_ct1_yearly 933.8 1108.0 0.380 (-1204.0, 3078.0)
sin9_ct1_yearly -1453.0 1089.0 0.188 (-3659.0, 691.5)
cos9_ct1_yearly 2266.0 1122.0 0.052 . (37.51, 4430.0)
sin10_ct1_yearly 826.0 1046.0 0.406 (-1240.0, 2867.0)
cos10_ct1_yearly 1374.0 1100.0 0.208 (-726.8, 3570.0)
sin11_ct1_yearly 2407.0 1049.0 0.022 * (453.3, 4511.0)
cos11_ct1_yearly -786.0 1046.0 0.480 (-2790.0, 1156.0)
sin12_ct1_yearly -1690.0 1035.0 0.102 (-3566.0, 413.5)
cos12_ct1_yearly -2128.0 1052.0 0.042 * (-4082.0, -52.35)
sin13_ct1_yearly -512.2 1044.0 0.614 (-2763.0, 1419.0)
cos13_ct1_yearly 277.1 1110.0 0.796 (-1765.0, 2507.0)
sin14_ct1_yearly -1453.0 1005.0 0.150 (-3493.0, 370.6)
cos14_ct1_yearly 1027.0 1058.0 0.328 (-1029.0, 2992.0)
sin15_ct1_yearly -11.37 1118.0 0.980 (-2316.0, 1975.0)
cos15_ct1_yearly 1352.0 1022.0 0.200 (-652.4, 3396.0)
sin16_ct1_yearly -2544.0 1013.0 0.008 ** (-4468.0, -463.1)
cos16_ct1_yearly -857.2 1043.0 0.428 (-2895.0, 1234.0)
sin17_ct1_yearly 743.2 1047.0 0.494 (-1199.0, 3050.0)
cos17_ct1_yearly 887.1 1044.0 0.404 (-1189.0, 2783.0)
sin18_ct1_yearly -719.1 1081.0 0.504 (-2948.0, 1464.0)
cos18_ct1_yearly 346.5 1063.0 0.726 (-1741.0, 2631.0)
sin19_ct1_yearly -632.6 1072.0 0.578 (-2772.0, 1430.0)
cos19_ct1_yearly -719.2 1007.0 0.482 (-2696.0, 1198.0)
sin20_ct1_yearly 1077.0 1015.0 0.308 (-720.1, 3212.0)
cos20_ct1_yearly 254.7 990.3 0.798 (-1588.0, 2158.0)
sin21_ct1_yearly -907.9 1048.0 0.366 (-2915.0, 1193.0)
cos21_ct1_yearly -420.9 1054.0 0.664 (-2406.0, 1679.0)
sin22_ct1_yearly 968.5 978.8 0.316 (-883.1, 2981.0)
cos22_ct1_yearly 213.5 1080.0 0.850 (-2036.0, 2275.0)
sin23_ct1_yearly 593.9 956.6 0.544 (-1311.0, 2346.0)
cos23_ct1_yearly -2913.0 1150.0 0.010 * (-5055.0, -413.9)
sin24_ct1_yearly -1020.0 1027.0 0.314 (-3067.0, 941.6)
cos24_ct1_yearly -433.4 1029.0 0.654 (-2499.0, 1519.0)
sin25_ct1_yearly -2270.0 1065.0 0.028 * (-4272.0, -139.1)
cos25_ct1_yearly -60.41 1049.0 0.952 (-2075.0, 2097.0)
cp0_2012_01_30_00 2899.0 1.148e+04 0.794 (-1.913e+04, 2.698e+04)
cp1_2013_01_14_00 -2.130e+04 1.033e+04 0.026 * (-4.180e+04, -2529.0)
cp2_2015_02_23_00 -1575.0 5561.0 0.740 (-1.265e+04, 9196.0)
cp3_2017_10_02_00 -1.748e+04 2885.0 <2e-16 *** (-2.260e+04, -1.102e+04)
y_lag4 4134.0 5738.0 0.478 (-7745.0, 1.520e+04)
y_lag5 1.226e+04 5190.0 0.014 * (2156.0, 2.170e+04)
y_lag6 -2743.0 6123.0 0.656 (-1.352e+04, 1.064e+04)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9193, Adjusted R-squared: 0.9054
F-statistic: 65.708 on 68 and 396 DF, p-value: 1.110e-16
Model AIC: 11256.0, model BIC: 11545.0
WARNING: the condition number is large, 2.17e+05. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
]
================================= CV Results ==================================
0
rank_test_MAPE 1
mean_test_MAPE 10.34
split_test_MAPE (10.34, 15.4, 6.67, 11.39, 9.49, 8.78)
mean_train_MAPE 15.57
split_train_MAPE (15.68, 15.65, 15.63, 15.54, 15.5, 15.44)
mean_fit_time 10.02
mean_score_time 0.85
params []
=========================== Train/Test Evaluation =============================
train test
CORR 0.958442 0.927707
R2 0.918606 -1.017877
MSE 49536598.782839 34669900.692205
RMSE 7038.224121 5888.115207
MAE 5322.253903 4775.374732
MedAE 4027.174276 4324.400695
MAPE 15.377647 6.111202
MedAPE 8.106217 5.450027
sMAPE 7.197916 2.917224
Q80 2661.126952 955.074946
Q95 2661.126952 238.768737
Q99 2661.126952 47.753747
OutsideTolerance1p 0.950216 1.0
OutsideTolerance2p 0.880952 0.75
OutsideTolerance3p 0.829004 0.5
OutsideTolerance4p 0.774892 0.5
OutsideTolerance5p 0.679654 0.5
Outside Tolerance (fraction) None None
R2_null_model_score None None
Prediction Band Width (%) 91.11172 40.784758
Prediction Band Coverage (fraction) 0.963203 1.0
Coverage: Lower Band 0.448052 1.0
Coverage: Upper Band 0.515152 0.0
Coverage Diff: Actual_Coverage - Intended_Coverage 0.013203 0.05
MIS 38199.341036 33020.871899
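In the Train/Test Evaluation table above, the test column shows a high correlation (CORR about 0.93) together with a negative R2 (about -1.02). The two are compatible: correlation ignores a constant offset between forecast and actuals, while R2 penalizes it, and on a short test window with little variance a modest bias can push R2 below zero. The test values for "Coverage: Lower Band" (1.0) and "Coverage: Upper Band" (0.0) fit this reading if those rows are taken to mean that every test actual fell between the lower band and the forecast; that interpretation is an assumption, not something the output states. The sketch below shows the effect with made-up numbers and plain Python helpers (`r2_score` and `pearson_corr` are hypothetical names defined here, not Greykite functions).

```python
from statistics import mean


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def pearson_corr(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Illustrative values only; these are NOT taken from the bikesharing data.
actual = [100.0, 102.0, 104.0, 106.0]
# Every forecast is 5 units too high, but the shape matches the actuals.
predicted = [a + 5.0 for a in actual]

corr = pearson_corr(actual, predicted)
r2 = r2_score(actual, predicted)
print(f"CORR = {corr:.3f}, R2 = {r2:.3f}")
```

Here the forecast tracks the actuals perfectly in shape, so the correlation is 1, yet the constant bias is large relative to the spread of the actuals, so R2 is well below zero. For a short test period, error metrics such as MAPE or MAE are therefore often easier to interpret than R2.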
Fit/backtest plot:
fig = result.backtest.plot()
plotly.io.show(fig)
Forecast plot:
fig = result.forecast.plot()
plotly.io.show(fig)
The components plot:
figs = result.forecast.plot_components()
for fig in figs:
    plotly.io.show(fig)
Total running time of the script: (2 minutes 50.387 seconds)