Example for weekly data
This is a basic example of forecasting weekly data with Silverkite. We fit a few simple models here; the goal is not to optimize the results as much as possible.

import warnings
from collections import defaultdict
import plotly
import pandas as pd
from greykite.common.constants import TIME_COL
from greykite.common.constants import VALUE_COL
from greykite.framework.benchmark.data_loader_ts import DataLoader
from greykite.framework.input.univariate_time_series import UnivariateTimeSeries
from greykite.framework.templates.autogen.forecast_config import EvaluationPeriodParam
from greykite.framework.templates.autogen.forecast_config import ForecastConfig
from greykite.framework.templates.autogen.forecast_config import MetadataParam
from greykite.framework.templates.autogen.forecast_config import ModelComponentsParam
from greykite.framework.templates.forecaster import Forecaster
from greykite.framework.utils.result_summary import summarize_grid_search_results
warnings.filterwarnings("ignore")

Load the weekly dataset into a UnivariateTimeSeries.

dl = DataLoader()
agg_func = {"count": "sum"}
df = dl.load_bikesharing(agg_freq="weekly", agg_func=agg_func)
# The first and last weeks in this dataset are incomplete, so we drop them.
df.drop(df.head(1).index, inplace=True)
df.drop(df.tail(1).index, inplace=True)
# reset_index returns a new DataFrame, so the result is assigned back.
df = df.reset_index(drop=True)
ts = UnivariateTimeSeries()
ts.load_data(
    df=df,
    time_col="ts",
    value_col="count",
    freq="W-MON")
print(ts.df.head())

Out:
ts y
2010-09-27 2010-09-27 2801
2010-10-04 2010-10-04 3238
2010-10-11 2010-10-11 6241
2010-10-18 2010-10-18 7756
2010-10-25 2010-10-25 9556
Exploratory Data Analysis (EDA)
After reading in a time series, we can first do some exploratory data analysis. The UnivariateTimeSeries class is used to store a time series and perform EDA.
A quick description of the data can be obtained as follows.

print(ts.describe_time_col())
print(ts.describe_value_col())

Out:
{'data_points': 466, 'mean_increment_secs': 604800.0, 'min_timestamp': Timestamp('2010-09-27 00:00:00'), 'max_timestamp': Timestamp('2019-08-26 00:00:00')}
count 466.000000
mean 53466.961373
std 24728.824016
min 2801.000000
25% 32819.750000
50% 51921.500000
75% 76160.750000
max 102350.000000
Name: y, dtype: float64
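As a quick sanity check on the numbers above, the following sketch (standard library only, not part of Greykite) confirms that a mean increment of 604800 seconds is exactly one week, and that stepping weekly from the minimum to the maximum timestamp gives the 466 data points reported. The timestamps are copied from the output above.

```python
from datetime import datetime, timedelta

# Values copied from the describe_time_col() output above.
min_timestamp = datetime(2010, 9, 27)
max_timestamp = datetime(2019, 8, 26)
mean_increment_secs = 604800.0

one_week = timedelta(weeks=1)

# 604800 seconds = 7 days * 24 hours * 3600 seconds.
is_weekly = mean_increment_secs == one_week.total_seconds()

# Number of weekly timestamps from min to max, both ends included.
n_points = (max_timestamp - min_timestamp) // one_week + 1

# Both boundary dates fall on a Monday (weekday() == 0), matching freq="W-MON".
starts_on_monday = min_timestamp.weekday() == 0
ends_on_monday = max_timestamp.weekday() == 0

print(is_weekly, n_points, starts_on_monday, ends_on_monday)
```

A gap in the series would show up here as a mismatch between `n_points` and the reported `data_points`.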
Let’s plot the original time series. (The interactive plot is generated by plotly: click to zoom!)

fig = ts.plot()
plotly.io.show(fig)

Exploratory plots can reveal the properties of the time series. A monthly overlay plot can be used to inspect annual patterns: it overlays the individual years on top of each other.

fig = ts.plot_quantiles_and_overlays(
    groupby_time_feature="month",
    show_mean=True,
    show_quantiles=False,
    show_overlays=True,
    center_values=True,
    overlay_label_time_feature="year",  # splits overlays by year
    overlay_style={"line": {"width": 1}, "opacity": 0.5},
    xlabel="Month",
    ylabel=ts.original_value_col,
    title="Yearly seasonality by year (centered)",
)
plotly.io.show(fig)

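The effect of centering can be pictured with a toy example. The sketch below is plain Python with made-up numbers, not Greykite's implementation, and it assumes that centering means shifting each yearly overlay to have mean zero. Under that assumption, two years with the same seasonal shape but very different levels collapse onto the same curve, which is what makes the overlays comparable despite the growth in the series.

```python
from statistics import mean

# Hypothetical monthly means for two years (made-up numbers for illustration).
overlays = {
    2011: [10, 20, 30],
    2012: [110, 120, 130],
}

# Shift each year's curve so that it has mean zero.
centered = {}
for year, values in overlays.items():
    year_mean = mean(values)
    centered[year] = [v - year_mean for v in values]

print(centered)
```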
A week-of-year overlay plot shows the same annual pattern at weekly resolution.

fig = ts.plot_quantiles_and_overlays(
    groupby_time_feature="woy",
    show_mean=True,
    show_quantiles=False,
    show_overlays=True,
    center_values=True,
    overlay_label_time_feature="year",  # splits overlays by year
    overlay_style={"line": {"width": 1}, "opacity": 0.5},
    xlabel="Week of year",
    ylabel=ts.original_value_col,
    title="Yearly seasonality by year (centered)",
)
plotly.io.show(fig)

Fit Greykite Models
After some exploratory data analysis, let’s specify the model parameters and fit a Greykite model.
Specify common metadata.
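The later blocks rely on two names defined at this step, `metadata` and `forecast_horizon`. A minimal sketch of such definitions follows, under stated assumptions: the 4-week horizon is inferred from the four per-step models reported in the forecast one-by-one section, and the column names use the imported `TIME_COL` and `VALUE_COL` constants, which correspond to the `ts` and `y` columns printed for `ts.df`. Treat the exact values as illustrative.

```python
from greykite.common.constants import TIME_COL, VALUE_COL
from greykite.framework.templates.autogen.forecast_config import MetadataParam

# Assumption: forecast 4 weeks ahead (consistent with the 4 per-step models
# reported in the forecast one-by-one section).
forecast_horizon = 4

metadata = MetadataParam(
    time_col=TIME_COL,    # "ts", the time column of ts.df
    value_col=VALUE_COL,  # "y", the value column of ts.df
    freq="W-MON",         # weekly data anchored on Mondays, as in ts.load_data
)
```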
Specify common evaluation parameters. Set minimum input data for training.

cv_min_train_periods = 52 * 2
# Let CV use most recent splits for cross-validation.
cv_use_most_recent_splits = True
# Determine the maximum number of validations.
cv_max_splits = 6
evaluation_period = EvaluationPeriodParam(
    test_horizon=forecast_horizon,
    cv_horizon=forecast_horizon,
    periods_between_train_test=0,
    cv_min_train_periods=cv_min_train_periods,
    cv_expanding_window=True,
    cv_use_most_recent_splits=cv_use_most_recent_splits,
    cv_periods_between_splits=None,
    cv_periods_between_train_test=0,
    cv_max_splits=cv_max_splits,
)

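To make these settings concrete, here is a minimal sketch of how an expanding-window scheme with these parameters could lay out its folds. It is an illustration in plain Python, not Greykite's implementation. The helper `expanding_window_splits` is hypothetical, and the sketch assumes a 4-week horizon, that the gap between splits defaults to the CV horizon when `cv_periods_between_splits` is None, and that the last `test_horizon` points are held out before cross-validation.

```python
def expanding_window_splits(n_points, horizon, min_train_periods,
                            max_splits, periods_between_splits=None):
    """Returns (train_end, test_end) index pairs, most recent split last.

    Each fold trains on points [0, train_end) and validates on
    [train_end, test_end). Hypothetical helper for illustration only.
    """
    step = periods_between_splits or horizon  # assumed default: one horizon
    splits = []
    test_end = n_points
    # Walk backwards from the end so that the most recent splits are kept.
    while len(splits) < max_splits and test_end - horizon >= min_train_periods:
        splits.append((test_end - horizon, test_end))
        test_end -= step
    return splits[::-1]


# 466 weekly points, of which the last 4 are assumed held out as the test set.
n_cv_points = 466 - 4
splits = expanding_window_splits(
    n_points=n_cv_points, horizon=4, min_train_periods=52 * 2, max_splits=6)
for train_end, test_end in splits:
    print(f"train on [0, {train_end}), validate on [{train_end}, {test_end})")
```

Because the window is expanding, every fold starts training at the first observation; only the end of the training range moves.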
Let’s also define a helper function that generates the model results summary and plots.

def get_model_results_summary(result):
    """Generates model results summary.

    Parameters
    ----------
    result : `ForecastResult`
        See :class:`~greykite.framework.pipeline.pipeline.ForecastResult` for documentation.

    Returns
    -------
    Prints out model coefficients, cross-validation results, overall train/test evaluations.
    """
    # Get the useful fields from the forecast result
    model = result.model[-1]
    backtest = result.backtest
    grid_search = result.grid_search
    # Check model coefficients / variables
    # Get model summary with p-values
    print(model.summary())
    # Get cross-validation results
    cv_results = summarize_grid_search_results(
        grid_search=grid_search,
        decimals=2,
        cv_report_metrics=None,
        column_order=[
            "rank", "mean_test", "split_test", "mean_train", "split_train",
            "mean_fit_time", "mean_score_time", "params"])
    # Transposes to save space in the printed output
    print("================================= CV Results ==================================")
    print(cv_results.transpose())
    # Check historical evaluation metrics (on the historical training/test set).
    backtest_eval = defaultdict(list)
    for metric, value in backtest.train_evaluation.items():
        backtest_eval[metric].append(value)
        backtest_eval[metric].append(backtest.test_evaluation[metric])
    metrics = pd.DataFrame(backtest_eval, index=["train", "test"]).T
    print("=========================== Train/Test Evaluation =============================")
    print(metrics)

Fit a simple model without autoregression.
The most important model parameters are specified through ModelComponentsParam.
The extra_pred_cols parameter is used to specify growth and annual seasonality.
Growth is modelled with both “ct_sqrt” and “ct1” for extra flexibility, as we have long-term data and ridge regularization will avoid over-fitting the trend.
The yearly seasonality is modelled using a Fourier series. In ModelComponentsParam, we can specify its order: the higher the order, the more flexible the pattern the model can capture. Usually one can try integers between 10 and 50.

autoregression = None
extra_pred_cols = ["ct1", "ct_sqrt", "ct1:C(month, levels=list(range(1, 13)))"]
# Specify the model parameters
model_components = ModelComponentsParam(
    autoregression=autoregression,
    seasonality={
        "yearly_seasonality": 25,
        "quarterly_seasonality": 0,
        "monthly_seasonality": 0,
        "weekly_seasonality": 0,
        "daily_seasonality": 0
    },
    changepoints={
        "changepoints_dict": {
            "method": "auto",
            "resample_freq": "7D",
            "regularization_strength": 0.5,
            "potential_changepoint_distance": "14D",
            "no_changepoint_distance_from_end": "60D",
            "yearly_seasonality_order": 25,
            "yearly_seasonality_change_freq": None,
        },
        "seasonality_changepoints_dict": None
    },
    events={
        "holiday_lookup_countries": []
    },
    growth={
        "growth_term": None
    },
    custom={
        "feature_sets_enabled": False,
        "fit_algorithm_dict": dict(fit_algorithm="ridge"),
        "extra_pred_cols": extra_pred_cols,
    }
)
forecast_config = ForecastConfig(
    metadata_param=metadata,
    forecast_horizon=forecast_horizon,
    coverage=0.95,
    evaluation_period_param=evaluation_period,
    model_components_param=model_components
)
# Run the forecast model
forecaster = Forecaster()
result = forecaster.run_forecast_config(
    df=ts.df,
    config=forecast_config
)

Out:
Fitting 6 folds for each of 1 candidates, totalling 6 fits
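To illustrate what the seasonality order controls, the sketch below builds yearly Fourier terms for a single timestamp. It is a plain-Python illustration rather than Greykite's code: the helper `yearly_fourier_terms` and its column names are hypothetical, and it assumes time is measured in years so that harmonic k completes k cycles per year. An order of 25 yields 50 columns, one sine and one cosine per harmonic. The same bookkeeping reproduces the 68 features reported in the model summary for this fit, taking the four detected changepoints as given.

```python
import math


def yearly_fourier_terms(t_years, order):
    """Returns {name: value} with one sin and one cos term per harmonic.

    t_years is continuous time measured in years (an assumption of this sketch).
    """
    terms = {}
    for k in range(1, order + 1):
        angle = 2 * math.pi * k * t_years
        terms[f"sin{k}_yearly"] = math.sin(angle)
        terms[f"cos{k}_yearly"] = math.cos(angle)
    return terms


terms = yearly_fourier_terms(t_years=0.25, order=25)
print(len(terms))  # 2 columns per harmonic

# Rough feature count for this model: intercept + ct1 + ct_sqrt,
# 11 non-baseline month interactions, Fourier terms, detected changepoints.
n_changepoints = 4  # assumption: the number found by method="auto" on this run
n_features = 3 + 11 + len(terms) + n_changepoints
print(n_features)
```

Raising the order adds two columns per harmonic, which is why ridge regularization is useful for keeping the higher-order terms small.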
Let’s check the model results summary and plots.

get_model_results_summary(result)

Out:
================================ Model Summary =================================
Number of observations: 466, Number of features: 68
Method: Ridge regression
Number of nonzero features: 68
Regularization parameter: 1.123
Residuals:
Min 1Q Median 3Q Max
-2.753e+04 -3838.0 186.0 4594.0 2.175e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 7670.0 1855.0 <2e-16 *** (4548.0, 1.182e+04)
ct1 7608.0 1291.0 <2e-16 *** (5362.0, 1.026e+04)
ct1:C(mo... 13)))_2 200.0 780.9 0.770 (-1371.0, 1723.0)
ct1:C(mo... 13)))_3 1229.0 772.3 0.110 (-285.8, 2588.0)
ct1:C(mo... 13)))_4 4159.0 670.4 <2e-16 *** (2749.0, 5488.0)
ct1:C(mo... 13)))_5 3320.0 717.2 <2e-16 *** (1799.0, 4695.0)
ct1:C(mo... 13)))_6 4426.0 624.6 <2e-16 *** (3056.0, 5623.0)
ct1:C(mo... 13)))_7 4729.0 616.4 <2e-16 *** (3424.0, 5955.0)
ct1:C(mo... 13)))_8 4732.0 625.9 <2e-16 *** (3306.0, 5881.0)
ct1:C(mo... 13)))_9 4289.0 735.6 <2e-16 *** (2940.0, 5775.0)
ct1:C(mo...13)))_10 4342.0 665.3 <2e-16 *** (3021.0, 5600.0)
ct1:C(mo...13)))_11 1440.0 680.0 0.044 * (-14.45, 2796.0)
ct1:C(mo...13)))_12 697.2 649.7 0.270 (-673.8, 1872.0)
ct_sqrt 9738.0 2264.0 <2e-16 *** (4792.0, 1.363e+04)
sin1_ct1_yearly -9796.0 810.8 <2e-16 *** (-1.140e+04, -8451.0)
cos1_ct1_yearly 2936.0 848.2 <2e-16 *** (1350.0, 4677.0)
sin2_ct1_yearly 2458.0 794.6 0.002 ** (997.3, 4103.0)
cos2_ct1_yearly 2233.0 778.5 0.006 ** (805.9, 3716.0)
sin3_ct1_yearly 810.2 757.2 0.262 (-822.4, 2248.0)
cos3_ct1_yearly -158.2 837.2 0.824 (-1948.0, 1386.0)
sin4_ct1_yearly -149.3 763.0 0.836 (-1571.0, 1152.0)
cos4_ct1_yearly -695.1 726.4 0.326 (-2217.0, 822.1)
sin5_ct1_yearly -1038.0 768.6 0.166 (-2607.0, 412.5)
cos5_ct1_yearly -456.8 664.6 0.496 (-1697.0, 989.2)
sin6_ct1_yearly -106.5 750.2 0.874 (-1568.0, 1419.0)
cos6_ct1_yearly 743.4 561.6 0.172 (-307.2, 1856.0)
sin7_ct1_yearly -160.7 651.4 0.786 (-1371.0, 1133.0)
cos7_ct1_yearly -119.4 563.0 0.824 (-1205.0, 975.4)
sin8_ct1_yearly -777.8 597.6 0.186 (-1962.0, 370.2)
cos8_ct1_yearly 331.5 543.2 0.532 (-723.7, 1429.0)
sin9_ct1_yearly -931.8 565.8 0.106 (-1991.0, 188.8)
cos9_ct1_yearly 1028.0 553.6 0.084 . (-41.61, 2234.0)
sin10_ct1_yearly 313.1 521.1 0.538 (-666.6, 1257.0)
cos10_ct1_yearly 885.4 536.1 0.082 . (-152.8, 1819.0)
sin11_ct1_yearly 1426.0 500.1 <2e-16 *** (565.8, 2406.0)
cos11_ct1_yearly -345.3 513.1 0.534 (-1279.0, 654.4)
sin12_ct1_yearly -1031.0 506.5 0.042 * (-1931.0, 11.26)
cos12_ct1_yearly -1156.0 513.3 0.020 * (-2030.0, -112.2)
sin13_ct1_yearly -215.5 514.0 0.674 (-1300.0, 770.7)
cos13_ct1_yearly 131.3 510.6 0.786 (-865.3, 1113.0)
sin14_ct1_yearly -718.6 515.4 0.174 (-1735.0, 236.8)
cos14_ct1_yearly 567.2 519.8 0.270 (-416.0, 1606.0)
sin15_ct1_yearly 42.66 564.1 0.942 (-1088.0, 1027.0)
cos15_ct1_yearly 628.4 498.2 0.206 (-324.3, 1597.0)
sin16_ct1_yearly -1186.0 537.7 0.024 * (-2185.0, -151.8)
cos16_ct1_yearly -335.2 503.3 0.506 (-1262.0, 686.7)
sin17_ct1_yearly 323.1 552.1 0.544 (-691.7, 1419.0)
cos17_ct1_yearly 408.7 517.9 0.468 (-643.9, 1334.0)
sin18_ct1_yearly -276.4 496.9 0.612 (-1286.0, 661.8)
cos18_ct1_yearly 113.0 541.3 0.814 (-1075.0, 1166.0)
sin19_ct1_yearly -275.3 531.5 0.594 (-1369.0, 761.5)
cos19_ct1_yearly -397.8 539.3 0.482 (-1462.0, 649.0)
sin20_ct1_yearly 532.5 512.4 0.294 (-434.9, 1597.0)
cos20_ct1_yearly 138.4 516.0 0.808 (-920.0, 998.8)
sin21_ct1_yearly -508.6 508.8 0.306 (-1524.0, 453.5)
cos21_ct1_yearly -239.0 567.1 0.670 (-1406.0, 805.6)
sin22_ct1_yearly 564.0 476.6 0.246 (-364.8, 1470.0)
cos22_ct1_yearly 167.3 544.5 0.776 (-825.8, 1248.0)
sin23_ct1_yearly 258.2 501.4 0.616 (-773.8, 1204.0)
cos23_ct1_yearly -1555.0 516.8 0.004 ** (-2613.0, -521.9)
sin24_ct1_yearly -536.9 499.6 0.298 (-1478.0, 482.8)
cos24_ct1_yearly -190.4 511.1 0.724 (-1093.0, 784.3)
sin25_ct1_yearly -1067.0 516.7 0.048 * (-2052.0, -14.09)
cos25_ct1_yearly -28.7 488.7 0.952 (-988.1, 970.9)
cp0_2012_01_30_00 -1760.0 1866.0 0.334 (-5147.0, 2091.0)
cp1_2013_01_14_00 -4526.0 1682.0 0.008 ** (-8127.0, -1487.0)
cp2_2015_02_23_00 -1144.0 1240.0 0.358 (-3723.0, 1020.0)
cp3_2017_10_02_00 -1.035e+04 1495.0 <2e-16 *** (-1.305e+04, -7150.0)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.918, Adjusted R-squared: 0.9046
F-statistic: 68.221 on 65 and 399 DF, p-value: 1.110e-16
Model AIC: 11257.0, model BIC: 11532.0
WARNING: the condition number is large, 2.28e+04. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
================================= CV Results ==================================
0
rank_test_MAPE 1
mean_test_MAPE 10.86
split_test_MAPE (11.49, 14.36, 5.11, 11.09, 10.62, 12.52)
mean_train_MAPE 16.07
split_train_MAPE (16.15, 16.19, 16.13, 16.03, 15.98, 15.97)
mean_fit_time 6.17
mean_score_time 0.36
params []
=========================== Train/Test Evaluation =============================
train test
CORR 0.957869 0.885644
R2 0.917504 -2.57631
MSE 5.02071e+07 6.14459e+07
RMSE 7085.7 7838.74
MAE 5381.07 7127.6
MedAE 4246.77 6758.66
MAPE 15.9252 9.00144
MedAPE 8.4034 8.41672
sMAPE 7.28901 4.26262
Q80 2690.54 1425.52
Q95 2690.54 356.38
Q99 2690.54 71.276
OutsideTolerance1p 0.941558 1
OutsideTolerance2p 0.883117 1
OutsideTolerance3p 0.805195 1
OutsideTolerance4p 0.755411 1
OutsideTolerance5p 0.692641 0.75
Outside Tolerance (fraction) None None
R2_null_model_score None None
Prediction Band Width (%) 78.8295 34.3187
Prediction Band Coverage (fraction) 0.935065 1
Coverage: Lower Band 0.445887 1
Coverage: Upper Band 0.489177 0
Coverage Diff: Actual_Coverage - Intended_Coverage -0.0149351 0.05
Fit/backtest plot:

fig = result.backtest.plot()
plotly.io.show(fig)

Forecast plot:

fig = result.forecast.plot()
plotly.io.show(fig)

The components plot:

fig = result.forecast.plot_components()
plotly.io.show(fig)

Fit a simple model with autoregression.
This is done by specifying the autoregression parameter in ModelComponentsParam.
Note that the autoregressive structure can be customized further depending on your data.

autoregression = {
    "autoreg_dict": {
        "lag_dict": {"orders": [1]},  # Only use lag-1
        "agg_lag_dict": None
    }
}
extra_pred_cols = ["ct1", "ct_sqrt", "ct1:C(month, levels=list(range(1, 13)))"]
# Specify the model parameters
model_components = ModelComponentsParam(
    autoregression=autoregression,
    seasonality={
        "yearly_seasonality": 25,
        "quarterly_seasonality": 0,
        "monthly_seasonality": 0,
        "weekly_seasonality": 0,
        "daily_seasonality": 0
    },
    changepoints={
        "changepoints_dict": {
            "method": "auto",
            "resample_freq": "7D",
            "regularization_strength": 0.5,
            "potential_changepoint_distance": "14D",
            "no_changepoint_distance_from_end": "60D",
            "yearly_seasonality_order": 25,
            "yearly_seasonality_change_freq": None,
        },
        "seasonality_changepoints_dict": None
    },
    events={
        "holiday_lookup_countries": []
    },
    growth={
        "growth_term": None
    },
    custom={
        "feature_sets_enabled": False,
        "fit_algorithm_dict": dict(fit_algorithm="ridge"),
        "extra_pred_cols": extra_pred_cols,
    }
)
forecast_config = ForecastConfig(
    metadata_param=metadata,
    forecast_horizon=forecast_horizon,
    coverage=0.95,
    evaluation_period_param=evaluation_period,
    model_components_param=model_components
)
# Run the forecast model
forecaster = Forecaster()
result = forecaster.run_forecast_config(
    df=ts.df,
    config=forecast_config
)

Out:
Fitting 6 folds for each of 1 candidates, totalling 6 fits
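As a rough illustration of what a lag-1 regressor is, the sketch below pairs each observation with the value seen one step earlier. It is plain Python, not Greykite's implementation, and `add_lag` is a hypothetical helper. The values are the first five weeks printed for `ts.df` earlier; the first row has no predecessor, so its lag is missing.

```python
def add_lag(values, order):
    """Returns the series shifted by `order` steps, padded with None at the start."""
    if order < 1:
        raise ValueError("order must be a positive integer")
    return [None] * min(order, len(values)) + list(values[:-order])


# First five weekly counts from the ts.df.head() output.
y = [2801, 3238, 6241, 7756, 9556]
y_lag1 = add_lag(y, order=1)

for current, previous in zip(y, y_lag1):
    print(current, previous)
```

When forecasting more than one step ahead, the lag-1 value of a future week is itself unknown, so a model that only has lag 1 must feed its own predictions back in.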
Let’s check the model results summary and plots.

get_model_results_summary(result)

Out:
================================ Model Summary =================================
Number of observations: 466, Number of features: 69
Method: Ridge regression
Number of nonzero features: 69
Regularization parameter: 756.5
Residuals:
Min 1Q Median 3Q Max
-3.148e+04 -4497.0 -102.1 5158.0 2.958e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 5476.0 739.0 <2e-16 *** (4030.0, 6887.0)
ct1 450.3 69.56 <2e-16 *** (306.4, 586.7)
ct1:C(mo... 13)))_2 -178.2 142.2 0.234 (-427.3, 94.64)
ct1:C(mo... 13)))_3 56.88 179.9 0.768 (-269.1, 422.4)
ct1:C(mo... 13)))_4 489.6 188.6 0.012 * (111.3, 849.3)
ct1:C(mo... 13)))_5 328.1 201.8 0.096 . (-96.1, 736.7)
ct1:C(mo... 13)))_6 321.5 127.9 0.012 * (73.04, 567.6)
ct1:C(mo... 13)))_7 355.6 113.8 0.002 ** (125.8, 561.6)
ct1:C(mo... 13)))_8 300.2 106.3 0.006 ** (66.58, 482.2)
ct1:C(mo... 13)))_9 217.9 138.7 0.106 (-36.1, 506.3)
ct1:C(mo...13)))_10 55.09 184.6 0.764 (-316.1, 400.5)
ct1:C(mo...13)))_11 -527.8 147.5 <2e-16 *** (-789.3, -208.5)
ct1:C(mo...13)))_12 -481.3 179.6 0.004 ** (-809.0, -141.5)
ct_sqrt 212.9 34.27 <2e-16 *** (144.8, 273.7)
sin1_ct1_yearly -616.0 69.28 <2e-16 *** (-728.1, -458.3)
cos1_ct1_yearly -47.88 79.43 0.530 (-206.7, 105.1)
sin2_ct1_yearly 17.12 80.44 0.846 (-139.2, 173.1)
cos2_ct1_yearly 303.6 88.2 <2e-16 *** (110.4, 463.1)
sin3_ct1_yearly 121.0 90.09 0.198 (-47.6, 288.3)
cos3_ct1_yearly -50.69 97.16 0.604 (-235.6, 130.3)
sin4_ct1_yearly -48.36 102.1 0.596 (-248.1, 155.9)
cos4_ct1_yearly 120.2 88.76 0.178 (-59.86, 292.6)
sin5_ct1_yearly -22.47 98.63 0.810 (-198.4, 183.2)
cos5_ct1_yearly -148.3 104.9 0.150 (-328.0, 70.98)
sin6_ct1_yearly 101.0 95.62 0.296 (-90.27, 272.6)
cos6_ct1_yearly 210.2 129.6 0.106 (-49.7, 444.3)
sin7_ct1_yearly -46.32 121.1 0.668 (-270.9, 198.5)
cos7_ct1_yearly -77.62 112.9 0.488 (-282.0, 158.8)
sin8_ct1_yearly -6.747 118.4 0.952 (-233.8, 232.0)
cos8_ct1_yearly -73.72 125.4 0.580 (-315.9, 152.0)
sin9_ct1_yearly -327.5 118.8 0.004 ** (-532.8, -79.69)
cos9_ct1_yearly -14.67 123.9 0.912 (-256.9, 217.2)
sin10_ct1_yearly -46.6 136.1 0.738 (-295.8, 208.1)
cos10_ct1_yearly 136.8 126.0 0.296 (-92.04, 372.3)
sin11_ct1_yearly 127.5 127.1 0.346 (-110.0, 387.2)
cos11_ct1_yearly 64.29 130.5 0.598 (-181.4, 335.5)
sin12_ct1_yearly 12.39 130.0 0.922 (-232.8, 277.8)
cos12_ct1_yearly -379.4 128.6 <2e-16 *** (-615.1, -121.8)
sin13_ct1_yearly -252.7 133.5 0.064 . (-517.1, 1.536)
cos13_ct1_yearly 80.36 122.4 0.494 (-145.5, 311.1)
sin14_ct1_yearly -198.5 117.0 0.092 . (-404.1, 54.21)
cos14_ct1_yearly 120.7 132.5 0.340 (-147.2, 382.3)
sin15_ct1_yearly -100.8 123.7 0.386 (-351.2, 135.2)
cos15_ct1_yearly 127.1 127.4 0.324 (-121.2, 388.6)
sin16_ct1_yearly -144.3 122.4 0.210 (-391.3, 85.06)
cos16_ct1_yearly -227.4 115.2 0.034 * (-435.1, -5.372)
sin17_ct1_yearly 14.31 112.5 0.910 (-195.2, 243.8)
cos17_ct1_yearly 137.4 136.2 0.288 (-132.2, 410.6)
sin18_ct1_yearly 45.3 125.3 0.742 (-191.0, 302.8)
cos18_ct1_yearly -116.2 116.2 0.328 (-349.5, 90.52)
sin19_ct1_yearly -111.2 130.5 0.386 (-374.2, 135.6)
cos19_ct1_yearly -169.5 112.9 0.122 (-397.2, 45.1)
sin20_ct1_yearly 193.9 126.9 0.130 (-59.18, 473.0)
cos20_ct1_yearly 52.4 131.5 0.716 (-206.7, 298.6)
sin21_ct1_yearly -199.0 113.4 0.086 . (-431.0, 19.14)
cos21_ct1_yearly -109.8 130.7 0.362 (-360.0, 162.6)
sin22_ct1_yearly 188.8 116.1 0.108 (-32.08, 410.4)
cos22_ct1_yearly 128.1 141.6 0.348 (-152.4, 389.5)
sin23_ct1_yearly 90.29 116.8 0.456 (-148.9, 299.3)
cos23_ct1_yearly -486.9 134.9 <2e-16 *** (-725.8, -195.2)
sin24_ct1_yearly -122.9 114.8 0.282 (-335.8, 102.7)
cos24_ct1_yearly -73.89 135.4 0.574 (-342.0, 176.9)
sin25_ct1_yearly -424.9 123.2 <2e-16 *** (-652.1, -179.8)
cos25_ct1_yearly 111.8 131.4 0.406 (-142.3, 366.1)
cp0_2012_01_30_00 294.0 55.78 <2e-16 *** (181.1, 396.7)
cp1_2013_01_14_00 116.1 59.85 0.050 . (0.3185, 230.9)
cp2_2015_02_23_00 -158.8 102.8 0.122 (-345.5, 63.05)
cp3_2017_10_02_00 -142.3 66.52 0.032 * (-262.8, -9.171)
y_lag1 0.8278 0.02072 <2e-16 *** (0.7863, 0.8687)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.8819, Adjusted R-squared: 0.8773
F-statistic: 187.63 on 17 and 447 DF, p-value: 1.110e-16
Model AIC: 11331.0, model BIC: 11408.0
WARNING: the condition number is large, 2.13e+09. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
================================= CV Results ==================================
0
rank_test_MAPE 1
mean_test_MAPE 13.05
split_test_MAPE (17.77, 14.85, 20.94, 7.33, 8.88, 8.54)
mean_train_MAPE 16.39
split_train_MAPE (16.4, 16.35, 16.39, 16.36, 16.5, 16.38)
mean_fit_time 6.55
mean_score_time 5.11
params []
=========================== Train/Test Evaluation =============================
train test
CORR 0.938646 -0.419785
R2 0.880955 -3.97518
MSE 7.24512e+07 8.54804e+07
RMSE 8511.83 9245.56
MAE 6369.21 7892.53
MedAE 4744.33 6720.96
MAPE 16.3261 10.0381
MedAPE 10.1886 8.40033
sMAPE 7.36061 4.68871
Q80 3184.61 1578.51
Q95 3184.61 394.626
Q99 3184.61 78.9253
OutsideTolerance1p 0.939394 1
OutsideTolerance2p 0.887446 1
OutsideTolerance3p 0.824675 1
OutsideTolerance4p 0.774892 0.75
OutsideTolerance5p 0.725108 0.5
Outside Tolerance (fraction) None None
R2_null_model_score None None
Prediction Band Width (%) 94.6955 48.7696
Prediction Band Coverage (fraction) 0.937229 1
Coverage: Lower Band 0.474026 1
Coverage: Upper Band 0.463203 0
Coverage Diff: Actual_Coverage - Intended_Coverage -0.0127706 0.05
Fit/backtest plot:

fig = result.backtest.plot()
plotly.io.show(fig)

Forecast plot:

fig = result.forecast.plot()
plotly.io.show(fig)

The components plot:

fig = result.forecast.plot_components()
plotly.io.show(fig)

Fit a Greykite model with autoregression and forecast one-by-one. Forecast one-by-one is only used when autoregression is set to “auto”, and it can be enabled by setting forecast_one_by_one=True in ForecastConfig.
Without forecast one-by-one, the smallest lag order in the autoregression has to be at least as large as the forecast horizon in order to avoid simulation (which leads to lower accuracy).
Turning on forecast_one_by_one can improve forecast accuracy by breaking the forecast horizon into smaller steps and fitting multiple models that use the most recent available lags.
Note that the forecast one-by-one option may slow down training.

autoregression = {
    "autoreg_dict": "auto"
}
extra_pred_cols = ["ct1", "ct_sqrt", "ct1:C(month, levels=list(range(1, 13)))"]
forecast_one_by_one = True
# Specify the model parameters
model_components = ModelComponentsParam(
    autoregression=autoregression,
    seasonality={
        "yearly_seasonality": 25,
        "quarterly_seasonality": 0,
        "monthly_seasonality": 0,
        "weekly_seasonality": 0,
        "daily_seasonality": 0
    },
    changepoints={
        "changepoints_dict": {
            "method": "auto",
            "resample_freq": "7D",
            "regularization_strength": 0.5,
            "potential_changepoint_distance": "14D",
            "no_changepoint_distance_from_end": "60D",
            "yearly_seasonality_order": 25,
            "yearly_seasonality_change_freq": None,
        },
        "seasonality_changepoints_dict": None
    },
    events={
        "holiday_lookup_countries": []
    },
    growth={
        "growth_term": None
    },
    custom={
        "feature_sets_enabled": False,
        "fit_algorithm_dict": dict(fit_algorithm="ridge"),
        "extra_pred_cols": extra_pred_cols,
    }
)
forecast_config = ForecastConfig(
    metadata_param=metadata,
    forecast_horizon=forecast_horizon,
    coverage=0.95,
    evaluation_period_param=evaluation_period,
    model_components_param=model_components,
    forecast_one_by_one=forecast_one_by_one
)
# Run the forecast model
forecaster = Forecaster()
result = forecaster.run_forecast_config(
    df=ts.df,
    config=forecast_config
)

Out:
Fitting 6 folds for each of 1 candidates, totalling 6 fits
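The idea behind forecast one-by-one can be sketched in a few lines of plain Python. The helpers below are hypothetical and are not Greykite's implementation. The choice of three consecutive lags per step mirrors the lag terms in the first two model summaries of this run (`y_lag1` to `y_lag3`, then `y_lag2` to `y_lag4`); the selection made by "auto" may differ for other data, and the 4-week horizon is an assumption. The point is that the model for step k only uses lags of order k or higher, so every lag it needs is already observed at forecast time.

```python
def lags_for_step(step, n_lags=3):
    """Lag orders usable when predicting `step` periods ahead without simulation."""
    return list(range(step, step + n_lags))


def needs_simulation(lag_orders, step):
    """A lag shorter than the step refers to a value that is not yet observed."""
    return min(lag_orders) < step


forecast_horizon = 4  # assumption: a 4-week horizon
for step in range(1, forecast_horizon + 1):
    lags = lags_for_step(step)
    print(step, lags, needs_simulation(lags, step))

# A single model with only lag 1 would need simulated values beyond step 1.
print(needs_simulation([1], step=4))
```

The cost of this design is one fitted model per step, which is why training takes longer with the option turned on.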
Let’s check the model results summary and plots. Here the forecast_one_by_one option fits 4 models, one for each forecast step, so 4 model summaries are printed and 4 components plots are generated.

get_model_results_summary(result)

Out:
[================================ Model Summary =================================
Number of observations: 466, Number of features: 71
Method: Ridge regression
Number of nonzero features: 71
Regularization parameter: 236.4
Residuals:
Min 1Q Median 3Q Max
-2.965e+04 -3672.0 200.7 4309.0 2.490e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 3955.0 684.2 <2e-16 *** (2630.0, 5232.0)
ct1 497.2 113.6 <2e-16 *** (222.8, 691.4)
ct1:C(mo... 13)))_2 -60.86 203.3 0.788 (-422.5, 384.9)
ct1:C(mo... 13)))_3 318.8 248.7 0.194 (-128.6, 843.4)
ct1:C(mo... 13)))_4 1036.0 220.4 <2e-16 *** (575.7, 1461.0)
ct1:C(mo... 13)))_5 311.0 268.7 0.240 (-261.4, 804.1)
ct1:C(mo... 13)))_6 547.6 179.0 0.002 ** (184.5, 883.3)
ct1:C(mo... 13)))_7 372.0 150.6 0.012 * (76.11, 634.3)
ct1:C(mo... 13)))_8 355.8 128.7 0.002 ** (94.27, 612.9)
ct1:C(mo... 13)))_9 173.5 202.2 0.406 (-218.7, 553.2)
ct1:C(mo...13)))_10 -25.25 222.5 0.896 (-506.2, 420.2)
ct1:C(mo...13)))_11 -1104.0 185.4 <2e-16 *** (-1458.0, -708.4)
ct1:C(mo...13)))_12 -708.2 229.0 <2e-16 *** (-1164.0, -292.6)
ct_sqrt 286.6 69.74 <2e-16 *** (127.6, 411.1)
sin1_ct1_yearly -1045.0 135.1 <2e-16 *** (-1288.0, -747.6)
cos1_ct1_yearly -387.8 132.1 <2e-16 *** (-646.7, -154.0)
sin2_ct1_yearly -40.62 140.2 0.786 (-319.7, 207.4)
cos2_ct1_yearly 753.5 163.7 <2e-16 *** (402.3, 1037.0)
sin3_ct1_yearly 402.0 160.0 0.014 * (36.18, 699.2)
cos3_ct1_yearly -105.0 166.4 0.520 (-415.1, 214.7)
sin4_ct1_yearly -89.45 172.1 0.612 (-424.8, 265.2)
cos4_ct1_yearly 246.7 173.9 0.146 (-88.56, 653.8)
sin5_ct1_yearly -111.6 195.5 0.584 (-495.2, 299.7)
cos5_ct1_yearly -446.5 183.2 0.022 * (-787.5, -98.22)
sin6_ct1_yearly 267.4 162.3 0.100 (-64.97, 570.6)
cos6_ct1_yearly 649.0 237.2 0.006 ** (133.5, 1063.0)
sin7_ct1_yearly -163.2 238.3 0.522 (-650.5, 278.2)
cos7_ct1_yearly -240.6 188.9 0.202 (-623.0, 127.7)
sin8_ct1_yearly -101.7 230.2 0.662 (-531.9, 335.6)
cos8_ct1_yearly -135.8 226.4 0.538 (-522.6, 301.1)
sin9_ct1_yearly -818.9 231.7 0.002 ** (-1246.0, -348.4)
cos9_ct1_yearly 354.2 215.7 0.098 . (-78.44, 757.6)
sin10_ct1_yearly 104.9 226.2 0.620 (-298.2, 549.4)
cos10_ct1_yearly 379.6 220.2 0.086 . (-85.95, 775.6)
sin11_ct1_yearly 374.6 231.8 0.092 . (-84.43, 792.8)
cos11_ct1_yearly -39.44 258.6 0.844 (-574.7, 430.9)
sin12_ct1_yearly -372.7 234.0 0.116 (-813.7, 91.86)
cos12_ct1_yearly -765.1 242.8 0.002 ** (-1205.0, -231.5)
sin13_ct1_yearly -318.4 248.2 0.192 (-787.8, 174.1)
cos13_ct1_yearly 428.6 246.9 0.080 . (-21.9, 938.8)
sin14_ct1_yearly -233.6 225.0 0.296 (-679.0, 235.4)
cos14_ct1_yearly 321.0 248.3 0.186 (-167.7, 792.7)
sin15_ct1_yearly -50.55 224.2 0.820 (-464.1, 402.5)
cos15_ct1_yearly 271.9 229.8 0.232 (-159.1, 695.8)
sin16_ct1_yearly -449.9 246.5 0.058 . (-916.3, 12.71)
cos16_ct1_yearly -277.0 232.5 0.238 (-732.1, 173.4)
sin17_ct1_yearly 117.8 224.6 0.594 (-311.0, 582.4)
cos17_ct1_yearly 226.1 230.9 0.324 (-227.5, 698.7)
sin18_ct1_yearly -30.72 241.5 0.920 (-515.4, 399.0)
cos18_ct1_yearly -124.7 225.2 0.588 (-539.3, 300.4)
sin19_ct1_yearly -220.0 217.6 0.308 (-651.5, 183.9)
cos19_ct1_yearly -228.7 230.5 0.316 (-623.5, 264.0)
sin20_ct1_yearly 340.7 228.7 0.146 (-137.1, 779.8)
cos20_ct1_yearly 64.52 242.5 0.830 (-389.3, 617.6)
sin21_ct1_yearly -297.2 232.1 0.200 (-695.4, 165.3)
cos21_ct1_yearly -160.8 243.9 0.524 (-661.1, 328.5)
sin22_ct1_yearly 308.2 219.3 0.164 (-141.1, 698.2)
cos22_ct1_yearly 132.4 247.3 0.588 (-328.1, 609.7)
sin23_ct1_yearly 87.1 228.6 0.688 (-399.3, 487.9)
cos23_ct1_yearly -794.7 243.2 <2e-16 *** (-1293.0, -303.8)
sin24_ct1_yearly -205.3 228.3 0.358 (-663.0, 253.1)
cos24_ct1_yearly -116.6 253.0 0.632 (-569.5, 406.5)
sin25_ct1_yearly -648.7 238.4 0.002 ** (-1108.0, -188.4)
cos25_ct1_yearly 167.3 246.2 0.456 (-310.6, 624.5)
cp0_2012_01_30_00 254.3 74.47 <2e-16 *** (88.28, 387.0)
cp1_2013_01_14_00 -15.59 93.71 0.866 (-188.0, 179.6)
cp2_2015_02_23_00 -400.2 197.0 0.038 * (-755.7, -1.963)
cp3_2017_10_02_00 -337.0 147.7 0.024 * (-601.6, -12.42)
y_lag1 0.4335 0.05004 <2e-16 *** (0.3435, 0.5318)
y_lag2 0.241 0.0534 <2e-16 *** (0.1389, 0.3467)
y_lag3 0.1888 0.05114 <2e-16 *** (0.08849, 0.2879)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9112, Adjusted R-squared: 0.9043
F-statistic: 128.73 on 33 and 431 DF, p-value: 1.110e-16
Model AIC: 11231.0, model BIC: 11375.0
WARNING: the condition number is large, 2.01e+10. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
, ================================ Model Summary =================================
Number of observations: 466, Number of features: 71
Method: Ridge regression
Number of nonzero features: 71
Regularization parameter: 187.4
Residuals:
Min 1Q Median 3Q Max
-2.939e+04 -4218.0 77.39 4457.0 2.993e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 5444.0 779.9 <2e-16 *** (3961.0, 6888.0)
ct1 745.1 154.9 <2e-16 *** (398.7, 990.1)
ct1:C(mo... 13)))_2 -41.98 275.7 0.854 (-633.2, 498.4)
ct1:C(mo... 13)))_3 502.7 298.1 0.090 . (-74.86, 1043.0)
ct1:C(mo... 13)))_4 1673.0 242.6 <2e-16 *** (1120.0, 2110.0)
ct1:C(mo... 13)))_5 360.9 280.0 0.200 (-191.7, 932.5)
ct1:C(mo... 13)))_6 981.7 201.6 <2e-16 *** (549.3, 1356.0)
ct1:C(mo... 13)))_7 571.7 167.2 0.002 ** (232.1, 874.5)
ct1:C(mo... 13)))_8 607.6 158.3 <2e-16 *** (287.2, 901.3)
ct1:C(mo... 13)))_9 301.8 223.9 0.186 (-121.0, 722.2)
ct1:C(mo...13)))_10 117.7 240.1 0.622 (-358.6, 572.1)
ct1:C(mo...13)))_11 -1478.0 202.4 <2e-16 *** (-1856.0, -1086.0)
ct1:C(mo...13)))_12 -1021.0 255.1 <2e-16 *** (-1547.0, -523.2)
ct_sqrt 448.5 90.46 <2e-16 *** (240.6, 594.6)
sin1_ct1_yearly -1755.0 187.2 <2e-16 *** (-2066.0, -1303.0)
cos1_ct1_yearly -606.2 164.7 0.002 ** (-927.8, -280.1)
sin2_ct1_yearly 35.54 148.4 0.812 (-260.7, 311.0)
cos2_ct1_yearly 1221.0 203.9 <2e-16 *** (793.7, 1598.0)
sin3_ct1_yearly 574.9 185.3 0.004 ** (165.4, 894.0)
cos3_ct1_yearly -169.3 203.7 0.390 (-593.4, 221.0)
sin4_ct1_yearly -37.46 217.2 0.854 (-456.7, 394.7)
cos4_ct1_yearly 245.1 199.5 0.248 (-154.5, 644.4)
sin5_ct1_yearly -282.7 234.3 0.216 (-717.4, 225.1)
cos5_ct1_yearly -542.9 229.6 0.024 * (-970.6, -78.8)
sin6_ct1_yearly 446.5 213.9 0.036 * (60.16, 870.2)
cos6_ct1_yearly 810.5 299.5 0.006 ** (185.4, 1353.0)
sin7_ct1_yearly -291.4 268.1 0.246 (-825.9, 211.7)
cos7_ct1_yearly -390.6 232.2 0.100 (-844.1, 62.99)
sin8_ct1_yearly -308.8 253.3 0.228 (-773.5, 188.2)
cos8_ct1_yearly -64.01 255.9 0.776 (-584.7, 422.1)
sin9_ct1_yearly -878.1 251.6 0.002 ** (-1372.0, -369.7)
cos9_ct1_yearly 894.3 268.9 <2e-16 *** (346.7, 1407.0)
sin10_ct1_yearly 358.8 279.1 0.202 (-187.4, 909.1)
cos10_ct1_yearly 453.1 276.8 0.088 . (-124.6, 908.7)
sin11_ct1_yearly 505.5 260.4 0.054 . (-69.61, 966.4)
cos11_ct1_yearly -267.8 285.7 0.346 (-879.4, 301.5)
sin12_ct1_yearly -765.3 276.5 <2e-16 *** (-1236.0, -213.5)
cos12_ct1_yearly -688.1 288.2 0.018 * (-1206.0, -148.1)
sin13_ct1_yearly -85.15 265.0 0.750 (-630.1, 433.0)
cos13_ct1_yearly 619.9 294.5 0.038 * (21.82, 1211.0)
sin14_ct1_yearly -103.8 278.4 0.746 (-641.9, 440.4)
cos14_ct1_yearly 353.8 301.3 0.254 (-228.9, 938.0)
sin15_ct1_yearly 63.93 277.6 0.814 (-505.7, 556.5)
cos15_ct1_yearly 290.3 277.5 0.290 (-265.1, 828.5)
sin16_ct1_yearly -628.6 286.3 0.022 * (-1157.0, -64.3)
cos16_ct1_yearly -141.1 264.5 0.570 (-658.9, 374.6)
sin17_ct1_yearly 244.8 278.7 0.364 (-313.9, 791.2)
cos17_ct1_yearly 222.3 276.1 0.434 (-337.5, 717.3)
sin18_ct1_yearly -160.9 285.8 0.582 (-698.2, 401.7)
cos18_ct1_yearly -37.86 266.2 0.886 (-537.6, 497.7)
sin19_ct1_yearly -285.2 286.7 0.310 (-823.5, 289.6)
cos19_ct1_yearly -166.2 281.1 0.578 (-644.1, 430.4)
sin20_ct1_yearly 343.6 271.6 0.194 (-217.7, 828.5)
cos20_ct1_yearly 11.24 282.8 0.958 (-531.4, 577.3)
sin21_ct1_yearly -244.6 266.6 0.344 (-736.1, 249.7)
cos21_ct1_yearly -53.1 310.4 0.824 (-700.7, 526.6)
sin22_ct1_yearly 280.2 259.4 0.284 (-256.9, 773.1)
cos22_ct1_yearly -45.65 287.3 0.856 (-592.1, 525.9)
sin23_ct1_yearly -123.9 259.9 0.614 (-588.9, 391.6)
cos23_ct1_yearly -617.7 285.7 0.020 * (-1110.0, -30.61)
sin24_ct1_yearly -166.4 293.9 0.568 (-700.6, 453.2)
cos24_ct1_yearly -53.36 268.1 0.812 (-588.0, 452.8)
sin25_ct1_yearly -335.8 262.9 0.196 (-802.6, 200.4)
cos25_ct1_yearly 118.8 288.4 0.680 (-460.1, 683.0)
cp0_2012_01_30_00 354.5 106.4 <2e-16 *** (126.4, 537.7)
cp1_2013_01_14_00 -69.86 113.0 0.556 (-281.9, 192.5)
cp2_2015_02_23_00 -652.9 254.2 0.004 ** (-1135.0, -121.3)
cp3_2017_10_02_00 -561.0 196.4 0.002 ** (-890.3, -111.1)
y_lag2 0.4025 0.05301 <2e-16 *** (0.3052, 0.5147)
y_lag3 0.2626 0.05417 <2e-16 *** (0.1597, 0.3743)
y_lag4 0.1381 0.05428 0.014 * (0.03179, 0.2368)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9025, Adjusted R-squared: 0.8941
F-statistic: 104.21 on 36 and 428 DF, p-value: 1.110e-16
Model AIC: 11281.0, model BIC: 11437.0
WARNING: the condition number is large, 2.53e+10. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
, ================================ Model Summary =================================
Number of observations: 466, Number of features: 71
Method: Ridge regression
Number of nonzero features: 71
Regularization parameter: 148.5
Residuals:
Min 1Q Median 3Q Max
-2.818e+04 -4351.0 264.3 4838.0 2.436e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 6405.0 791.3 <2e-16 *** (4777.0, 7923.0)
ct1 892.0 166.8 <2e-16 *** (515.5, 1158.0)
ct1:C(mo... 13)))_2 134.2 269.4 0.630 (-430.8, 637.4)
ct1:C(mo... 13)))_3 773.1 321.3 0.018 * (84.01, 1431.0)
ct1:C(mo... 13)))_4 2369.0 248.1 <2e-16 *** (1858.0, 2837.0)
ct1:C(mo... 13)))_5 525.1 292.3 0.060 . (-77.25, 1047.0)
ct1:C(mo... 13)))_6 1411.0 225.1 <2e-16 *** (892.6, 1767.0)
ct1:C(mo... 13)))_7 882.1 196.8 <2e-16 *** (431.3, 1244.0)
ct1:C(mo... 13)))_8 832.7 170.7 <2e-16 *** (473.6, 1132.0)
ct1:C(mo... 13)))_9 398.2 273.1 0.146 (-151.6, 943.4)
ct1:C(mo...13)))_10 307.6 265.3 0.242 (-308.4, 728.7)
ct1:C(mo...13)))_11 -1750.0 277.1 <2e-16 *** (-2233.0, -1139.0)
ct1:C(mo...13)))_12 -1295.0 283.7 <2e-16 *** (-1880.0, -757.4)
ct_sqrt 560.2 106.4 <2e-16 *** (324.9, 733.1)
sin1_ct1_yearly -2577.0 215.6 <2e-16 *** (-2911.0, -2088.0)
cos1_ct1_yearly -829.9 187.8 <2e-16 *** (-1177.0, -466.8)
sin2_ct1_yearly 224.5 187.1 0.236 (-182.4, 541.6)
cos2_ct1_yearly 1818.0 227.9 <2e-16 *** (1280.0, 2188.0)
sin3_ct1_yearly 673.0 239.4 0.006 ** (146.9, 1074.0)
cos3_ct1_yearly -244.2 226.5 0.278 (-675.8, 190.5)
sin4_ct1_yearly 86.37 249.1 0.710 (-456.6, 550.6)
cos4_ct1_yearly 137.5 248.6 0.590 (-350.4, 640.4)
sin5_ct1_yearly -615.9 276.9 0.022 * (-1107.0, -81.8)
cos5_ct1_yearly -516.7 246.1 0.032 * (-957.6, -27.32)
sin6_ct1_yearly 582.5 252.0 0.026 * (115.0, 1110.0)
cos6_ct1_yearly 626.3 317.9 0.044 * (-2.668, 1217.0)
sin7_ct1_yearly -360.3 312.2 0.234 (-975.2, 204.6)
cos7_ct1_yearly -404.7 269.8 0.134 (-899.0, 118.4)
sin8_ct1_yearly -547.7 294.8 0.064 . (-1146.0, -7.613)
cos8_ct1_yearly 230.9 299.6 0.442 (-323.7, 813.5)
sin9_ct1_yearly -419.8 302.8 0.156 (-993.0, 179.6)
cos9_ct1_yearly 1155.0 325.5 <2e-16 *** (407.6, 1739.0)
sin10_ct1_yearly 515.3 307.4 0.104 (-62.03, 1087.0)
cos10_ct1_yearly 293.2 317.6 0.378 (-310.0, 863.5)
sin11_ct1_yearly 400.6 296.0 0.168 (-245.1, 935.7)
cos11_ct1_yearly -434.0 345.5 0.206 (-1154.0, 291.7)
sin12_ct1_yearly -622.3 332.2 0.072 . (-1302.0, 32.51)
cos12_ct1_yearly -448.2 337.8 0.190 (-1070.0, 261.5)
sin13_ct1_yearly 42.25 302.8 0.870 (-552.1, 679.0)
cos13_ct1_yearly 426.2 317.0 0.178 (-208.4, 1039.0)
sin14_ct1_yearly -184.3 299.3 0.554 (-753.0, 411.3)
cos14_ct1_yearly 181.5 310.7 0.542 (-429.4, 776.7)
sin15_ct1_yearly 4.377 313.1 0.996 (-610.3, 638.7)
cos15_ct1_yearly 270.9 307.6 0.378 (-338.2, 863.6)
sin16_ct1_yearly -647.6 297.5 0.034 * (-1232.0, -61.12)
cos16_ct1_yearly -161.2 299.2 0.580 (-674.9, 496.1)
sin17_ct1_yearly 262.3 316.6 0.404 (-336.8, 890.9)
cos17_ct1_yearly 238.9 302.1 0.450 (-388.1, 780.0)
sin18_ct1_yearly -184.3 325.2 0.544 (-864.6, 432.0)
cos18_ct1_yearly 68.1 298.3 0.820 (-474.6, 651.9)
sin19_ct1_yearly -204.6 306.7 0.526 (-749.2, 412.1)
cos19_ct1_yearly -245.3 308.0 0.422 (-846.5, 375.1)
sin20_ct1_yearly 359.6 297.5 0.226 (-216.9, 962.1)
cos20_ct1_yearly 171.0 313.0 0.584 (-418.2, 738.8)
sin21_ct1_yearly -207.8 314.8 0.532 (-806.3, 361.0)
cos21_ct1_yearly -223.9 315.3 0.488 (-820.9, 376.9)
sin22_ct1_yearly 257.6 301.0 0.378 (-330.5, 846.8)
cos22_ct1_yearly 108.0 315.7 0.756 (-507.2, 700.9)
sin23_ct1_yearly 243.9 281.7 0.402 (-345.6, 791.0)
cos23_ct1_yearly -896.1 329.3 0.008 ** (-1493.0, -207.5)
sin24_ct1_yearly -245.9 316.9 0.458 (-872.4, 349.6)
cos24_ct1_yearly -185.4 314.5 0.558 (-760.4, 430.2)
sin25_ct1_yearly -806.8 298.6 0.008 ** (-1316.0, -191.6)
cos25_ct1_yearly 81.94 326.0 0.782 (-584.0, 682.1)
cp0_2012_01_30_00 395.0 111.9 <2e-16 *** (161.9, 593.0)
cp1_2013_01_14_00 -142.7 132.2 0.306 (-376.4, 130.3)
cp2_2015_02_23_00 -872.0 304.0 0.008 ** (-1417.0, -219.0)
cp3_2017_10_02_00 -779.2 254.9 <2e-16 *** (-1195.0, -228.6)
y_lag3 0.3799 0.0526 <2e-16 *** (0.2882, 0.4984)
y_lag4 0.1591 0.05678 0.006 ** (0.05182, 0.2672)
y_lag5 0.2239 0.04996 <2e-16 *** (0.1155, 0.3122)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9, Adjusted R-squared: 0.8906
F-statistic: 91.799 on 39 and 425 DF, p-value: 1.110e-16
Model AIC: 11299.0, model BIC: 11468.0
WARNING: the condition number is large, 3.18e+10. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
, ================================ Model Summary =================================
Number of observations: 466, Number of features: 71
Method: Ridge regression
Number of nonzero features: 71
Regularization parameter: 148.5
Residuals:
Min 1Q Median 3Q Max
-3.003e+04 -4391.0 444.1 4266.0 2.785e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 7692.0 853.6 <2e-16 *** (5958.0, 9273.0)
ct1 1069.0 176.2 <2e-16 *** (629.2, 1315.0)
ct1:C(mo... 13)))_2 185.2 290.8 0.524 (-403.1, 732.1)
ct1:C(mo... 13)))_3 910.5 300.6 0.002 ** (258.7, 1495.0)
ct1:C(mo... 13)))_4 2822.0 274.3 <2e-16 *** (2275.0, 3346.0)
ct1:C(mo... 13)))_5 783.3 290.9 0.006 ** (181.5, 1320.0)
ct1:C(mo... 13)))_6 1669.0 237.8 <2e-16 *** (1194.0, 2109.0)
ct1:C(mo... 13)))_7 1213.0 212.3 <2e-16 *** (749.4, 1575.0)
ct1:C(mo... 13)))_8 1054.0 183.9 <2e-16 *** (634.8, 1340.0)
ct1:C(mo... 13)))_9 602.4 269.1 0.020 * (41.37, 1093.0)
ct1:C(mo...13)))_10 496.1 248.4 0.046 * (-47.75, 916.2)
ct1:C(mo...13)))_11 -1731.0 275.1 <2e-16 *** (-2280.0, -1163.0)
ct1:C(mo...13)))_12 -1624.0 256.6 <2e-16 *** (-2173.0, -1160.0)
ct_sqrt 679.1 109.4 <2e-16 *** (414.2, 838.1)
sin1_ct1_yearly -3186.0 221.9 <2e-16 *** (-3510.0, -2658.0)
cos1_ct1_yearly -860.1 183.1 <2e-16 *** (-1242.0, -535.1)
sin2_ct1_yearly 468.4 201.6 0.016 * (63.41, 827.2)
cos2_ct1_yearly 2102.0 245.9 <2e-16 *** (1534.0, 2506.0)
sin3_ct1_yearly 672.3 235.5 0.006 ** (183.0, 1100.0)
cos3_ct1_yearly -329.2 238.0 0.180 (-783.8, 151.4)
sin4_ct1_yearly 131.3 260.9 0.618 (-402.5, 598.1)
cos4_ct1_yearly 48.73 254.3 0.858 (-429.2, 565.4)
sin5_ct1_yearly -780.8 279.4 0.004 ** (-1299.0, -220.9)
cos5_ct1_yearly -305.8 271.6 0.254 (-828.8, 222.7)
sin6_ct1_yearly 660.8 241.6 0.006 ** (203.0, 1107.0)
cos6_ct1_yearly 244.7 319.1 0.472 (-371.4, 807.8)
sin7_ct1_yearly -323.8 309.3 0.284 (-870.8, 403.5)
cos7_ct1_yearly -364.4 274.6 0.204 (-858.5, 161.0)
sin8_ct1_yearly -648.6 309.1 0.034 * (-1176.0, 5.761)
cos8_ct1_yearly 478.3 293.4 0.104 (-91.89, 1050.0)
sin9_ct1_yearly -26.28 309.3 0.932 (-665.5, 576.3)
cos9_ct1_yearly 914.3 302.3 <2e-16 *** (286.6, 1509.0)
sin10_ct1_yearly 427.8 327.7 0.180 (-216.9, 1081.0)
cos10_ct1_yearly 166.4 310.4 0.618 (-406.6, 739.6)
sin11_ct1_yearly 291.4 306.4 0.356 (-311.9, 852.9)
cos11_ct1_yearly -374.9 337.7 0.262 (-1050.0, 264.9)
sin12_ct1_yearly -258.2 322.4 0.442 (-821.8, 392.4)
cos12_ct1_yearly -516.1 341.7 0.140 (-1152.0, 154.0)
sin13_ct1_yearly -126.6 331.9 0.714 (-776.6, 485.6)
cos13_ct1_yearly 225.5 336.1 0.486 (-445.0, 872.7)
sin14_ct1_yearly -414.5 337.2 0.224 (-1082.0, 201.6)
cos14_ct1_yearly 135.2 331.7 0.682 (-464.0, 790.6)
sin15_ct1_yearly -110.4 343.6 0.760 (-778.8, 517.1)
cos15_ct1_yearly 355.7 287.4 0.216 (-204.0, 891.8)
sin16_ct1_yearly -749.2 318.3 0.026 * (-1350.0, -77.53)
cos16_ct1_yearly -298.9 327.7 0.350 (-930.4, 355.9)
sin17_ct1_yearly 327.3 316.4 0.296 (-274.6, 956.6)
cos17_ct1_yearly 311.1 340.8 0.376 (-383.0, 907.3)
sin18_ct1_yearly -183.0 311.0 0.596 (-794.5, 381.7)
cos18_ct1_yearly -25.22 315.6 0.930 (-624.1, 625.3)
sin19_ct1_yearly -328.7 303.8 0.274 (-903.4, 270.5)
cos19_ct1_yearly -303.8 329.3 0.336 (-902.1, 396.4)
sin20_ct1_yearly 436.8 317.9 0.174 (-216.9, 993.3)
cos20_ct1_yearly 119.6 332.5 0.688 (-535.7, 734.9)
sin21_ct1_yearly -291.4 311.7 0.352 (-947.1, 281.0)
cos21_ct1_yearly -89.56 341.3 0.798 (-828.8, 568.7)
sin22_ct1_yearly 356.9 320.4 0.260 (-278.9, 913.1)
cos22_ct1_yearly -29.35 355.7 0.954 (-719.6, 689.3)
sin23_ct1_yearly -100.4 303.0 0.738 (-642.5, 520.5)
cos23_ct1_yearly -823.2 349.3 0.018 * (-1445.0, -96.43)
sin24_ct1_yearly -230.4 314.1 0.466 (-810.1, 373.7)
cos24_ct1_yearly -90.24 313.9 0.748 (-684.3, 541.0)
sin25_ct1_yearly -453.9 324.0 0.170 (-1075.0, 281.4)
cos25_ct1_yearly 97.38 332.5 0.784 (-565.5, 749.1)
cp0_2012_01_30_00 460.5 112.6 <2e-16 *** (224.6, 654.0)
cp1_2013_01_14_00 -190.9 121.3 0.106 (-400.1, 73.14)
cp2_2015_02_23_00 -1059.0 312.9 <2e-16 *** (-1534.0, -358.6)
cp3_2017_10_02_00 -928.5 251.1 0.002 ** (-1328.0, -347.8)
y_lag4 0.3126 0.0569 <2e-16 *** (0.2104, 0.4374)
y_lag5 0.2996 0.05828 <2e-16 *** (0.1721, 0.4178)
y_lag6 0.09878 0.06081 0.102 (-0.01423, 0.2153)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.8924, Adjusted R-squared: 0.8823
F-statistic: 83.401 on 39 and 425 DF, p-value: 1.110e-16
Model AIC: 11333.0, model BIC: 11503.0
WARNING: the condition number is large, 3.16e+10. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
]
================================= CV Results ==================================
0
rank_test_MAPE 1
mean_test_MAPE 11.44
split_test_MAPE (13.59, 15.64, 13.37, 11.07, 8.67, 6.3)
mean_train_MAPE 16.42
split_train_MAPE (16.53, 16.51, 16.46, 16.4, 16.35, 16.25)
mean_fit_time 29.22
mean_score_time 1.26
params []
=========================== Train/Test Evaluation =============================
train test
CORR 0.944564 0.971997
R2 0.891452 -0.485392
MSE 6.60628e+07 2.55211e+07
RMSE 8127.9 5051.84
MAE 6052.21 3666.02
MedAE 4392.44 3368.36
MAPE 16.1636 4.74337
MedAPE 9.03518 4.29144
sMAPE 7.01244 2.26905
Q80 3026.1 790.696
Q95 3026.1 269.539
Q99 3026.1 130.564
OutsideTolerance1p 0.941558 0.5
OutsideTolerance2p 0.893939 0.5
OutsideTolerance3p 0.82684 0.5
OutsideTolerance4p 0.766234 0.5
OutsideTolerance5p 0.722944 0.5
Outside Tolerance (fraction) None None
R2_null_model_score None None
Prediction Band Width (%) 90.4243 37.663
Prediction Band Coverage (fraction) 0.935065 1
Coverage: Lower Band 0.4329 0.75
Coverage: Upper Band 0.502165 0.25
Coverage Diff: Actual_Coverage - Intended_Coverage -0.0149351 0.05
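Several of the reported numbers can be cross-checked by hand. The sketch below is not part of the original example. It copies a few values from the output above and checks three things: the mean CV MAPE is the average of the per-split MAPEs, RMSE is the square root of MSE, and the lower and upper band coverages add up to the overall prediction band coverage. The variable names are illustrative, and the intended coverage is not printed in the output, so it is inferred here from the reported coverage difference.

```python
import math

# Values copied from the "CV Results" block above.
split_test_mape = (13.59, 15.64, 13.37, 11.07, 8.67, 6.3)
split_train_mape = (16.53, 16.51, 16.46, 16.4, 16.35, 16.25)

# The mean CV score is the plain average over the CV splits.
mean_test_mape = sum(split_test_mape) / len(split_test_mape)
mean_train_mape = sum(split_train_mape) / len(split_train_mape)
print(round(mean_test_mape, 2))   # reported mean_test_MAPE: 11.44
print(round(mean_train_mape, 2))  # reported mean_train_MAPE: 16.42

# Values copied from the "Train/Test Evaluation" block above.
train_mse, test_mse = 6.60628e07, 2.55211e07
train_rmse = math.sqrt(train_mse)  # reported RMSE (train): 8127.9
test_rmse = math.sqrt(test_mse)    # reported RMSE (test): 5051.84
print(train_rmse, test_rmse)

# Lower and upper band coverage sum to the overall band coverage.
train_coverage = 0.4329 + 0.502165  # reported coverage (train): 0.935065
test_coverage = 0.75 + 0.25         # reported coverage (test): 1

# Assumption: the intended coverage is inferred from the reported
# "Actual_Coverage - Intended_Coverage" difference on the training set.
implied_intended = train_coverage - (-0.0149351)
print(train_coverage, test_coverage, implied_intended)
```

Both rows of the coverage difference point to an intended coverage of about 0.95: the training set falls slightly short of it, while the short test period is fully covered.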
Fit/backtest plot:
fig = result.backtest.plot()
plotly.io.show(fig)
Forecast plot:
fig = result.forecast.plot()
plotly.io.show(fig)
The components plot:
figs = result.forecast.plot_components()
for fig in figs:
    plotly.io.show(fig)
Total running time of the script: ( 6 minutes 43.325 seconds)