Example for weekly data
This is a basic example of forecasting weekly data with Silverkite. The models fitted here are deliberately simple; the goal is not to optimize the results as much as possible.
import warnings
from collections import defaultdict

import plotly
import pandas as pd

from greykite.common.constants import TIME_COL
from greykite.common.constants import VALUE_COL
from greykite.framework.benchmark.data_loader_ts import DataLoader
from greykite.framework.input.univariate_time_series import UnivariateTimeSeries
from greykite.framework.templates.autogen.forecast_config import EvaluationPeriodParam
from greykite.framework.templates.autogen.forecast_config import ForecastConfig
from greykite.framework.templates.autogen.forecast_config import MetadataParam
from greykite.framework.templates.autogen.forecast_config import ModelComponentsParam
from greykite.framework.templates.forecaster import Forecaster
from greykite.framework.utils.result_summary import summarize_grid_search_results

warnings.filterwarnings("ignore")
Load the weekly dataset into UnivariateTimeSeries.
dl = DataLoader()
agg_func = {"count": "sum"}
df = dl.load_bikesharing(agg_freq="weekly", agg_func=agg_func)
# The first and last weeks in this dataset are incomplete, so we drop them.
df.drop(df.head(1).index, inplace=True)
df.drop(df.tail(1).index, inplace=True)
# reset_index returns a new DataFrame, so assign the result back.
df = df.reset_index(drop=True)
ts = UnivariateTimeSeries()
ts.load_data(
    df=df,
    time_col="ts",
    value_col="count",
    freq="W-MON")
print(ts.df.head())
Out:
ts y
2010-09-27 2010-09-27 2801
2010-10-04 2010-10-04 3238
2010-10-11 2010-10-11 6241
2010-10-18 2010-10-18 7756
2010-10-25 2010-10-25 9556
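To see why the partial first and last weeks are dropped, here is a minimal standard-library sketch. It does not use the Greykite loader: the daily counts are made up, and keying each week by the Monday that starts it is an assumption made for illustration (the loader's own labeling convention may differ).

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_totals(daily_counts):
    """Sums daily counts per week and records how many days each week contains.

    Weeks are keyed by the Monday that starts them (illustrative convention).
    """
    totals = defaultdict(int)
    days_seen = defaultdict(int)
    for day, count in daily_counts.items():
        week_start = day - timedelta(days=day.weekday())  # Monday of that week
        totals[week_start] += count
        days_seen[week_start] += 1
    return dict(totals), dict(days_seen)


# Hypothetical daily data: 10 rides per day from Wed 2010-09-22 to Tue 2010-10-12.
start = date(2010, 9, 22)
daily = {start + timedelta(days=i): 10 for i in range(21)}

totals, days_seen = weekly_totals(daily)
# Keep only weeks that contain all seven days.
complete = {week: total for week, total in totals.items() if days_seen[week] == 7}

for week in sorted(totals):
    print(week, totals[week], "complete" if week in complete else "partial")
```

The partial weeks at both ends have artificially low totals, which a model would otherwise read as a sharp drop in demand.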
Exploratory Data Analysis (EDA)
After reading in a time series, we can first do some exploratory data analysis.
The UnivariateTimeSeries class is used to store a time series and perform EDA.
A quick description of the data can be obtained as follows.
print(ts.describe_time_col())
print(ts.describe_value_col())
Out:
{'data_points': 466, 'mean_increment_secs': 604800.0, 'min_timestamp': Timestamp('2010-09-27 00:00:00'), 'max_timestamp': Timestamp('2019-08-26 00:00:00')}
count 466.000000
mean 53466.961373
std 24728.824016
min 2801.000000
25% 32819.750000
50% 51921.500000
75% 76160.750000
max 102350.000000
Name: y, dtype: float64
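The time column summary can be sanity-checked by hand. The sketch below uses only the three values printed above (first timestamp, last timestamp, and number of rows) and recomputes the mean increment. It is plain arithmetic, not a Greykite call.

```python
from datetime import datetime

# Values copied from the describe_time_col() output above.
min_timestamp = datetime(2010, 9, 27)
max_timestamp = datetime(2019, 8, 26)
data_points = 466

# n rows span n - 1 gaps between consecutive timestamps.
span_secs = (max_timestamp - min_timestamp).total_seconds()
mean_increment_secs = span_secs / (data_points - 1)

print(mean_increment_secs)          # seconds between consecutive rows
print(mean_increment_secs / 86400)  # the same gap expressed in days
```

The result is 604800 seconds, or exactly 7 days, so the series has one row per week with no gaps.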
Let’s plot the original timeseries. (The interactive plot is generated by plotly: click to zoom!)
fig = ts.plot()
plotly.io.show(fig)
Exploratory plots can reveal the properties of the time series. A monthly overlay plot can be used to inspect the annual patterns: it overlays the individual years on top of each other.
fig = ts.plot_quantiles_and_overlays(
    groupby_time_feature="month",
    show_mean=True,
    show_quantiles=False,
    show_overlays=True,
    center_values=True,
    overlay_label_time_feature="year",  # splits overlays by year
    overlay_style={"line": {"width": 1}, "opacity": 0.5},
    xlabel="Month",
    ylabel=ts.original_value_col,
    title="Yearly seasonality by year (centered)",
)
plotly.io.show(fig)
Weekly overlay plot.
fig = ts.plot_quantiles_and_overlays(
    groupby_time_feature="woy",
    show_mean=True,
    show_quantiles=False,
    show_overlays=True,
    center_values=True,
    overlay_label_time_feature="year",  # splits overlays by year
    overlay_style={"line": {"width": 1}, "opacity": 0.5},
    xlabel="Week of year",
    ylabel=ts.original_value_col,
    title="Yearly seasonality by year (centered)",
)
plotly.io.show(fig)
Fit Greykite Models
After some exploratory data analysis, let’s specify the model parameters and fit a Greykite model.
Specify common metadata.
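# Assumed values: a 4-week horizon, consistent with the four one-by-one models fitted below.
forecast_horizon = 4
time_col = TIME_COL
value_col = VALUE_COL
metadata = MetadataParam(
    time_col=time_col,
    value_col=value_col,
    freq="W-MON",
)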
Specify common evaluation parameters. Set minimum input data for training.
cv_min_train_periods = 52 * 2
# Let CV use most recent splits for cross-validation.
cv_use_most_recent_splits = True
# Determine the maximum number of validations.
cv_max_splits = 6
evaluation_period = EvaluationPeriodParam(
    test_horizon=forecast_horizon,
    cv_horizon=forecast_horizon,
    periods_between_train_test=0,
    cv_min_train_periods=cv_min_train_periods,
    cv_expanding_window=True,
    cv_use_most_recent_splits=cv_use_most_recent_splits,
    cv_periods_between_splits=None,
    cv_periods_between_train_test=0,
    cv_max_splits=cv_max_splits,
)
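The evaluation settings above describe an expanding-window cross-validation that keeps only the most recent splits. The following sketch shows how such splits could be laid out. It is not Greykite's splitter: the function name is hypothetical, the 4-week horizon is an assumption, and so is spacing the splits one horizon apart when no spacing is given. The exact boundaries chosen by the library may differ.

```python
def expanding_window_splits(n_rows, horizon, min_train, max_splits, step=None):
    """Returns (train_end, test_end) index pairs, oldest split first.

    Each split trains on rows [0, train_end) and validates on
    rows [train_end, test_end). Splits are built backwards from the end of
    the data so that the most recent ones are kept.
    """
    step = horizon if step is None else step  # assumed default spacing
    splits = []
    test_end = n_rows
    while len(splits) < max_splits and test_end - horizon >= min_train:
        splits.append((test_end - horizon, test_end))
        test_end -= step
    return splits[::-1]


# 466 weekly rows minus an assumed 4-week holdout leaves 462 rows for CV.
splits = expanding_window_splits(
    n_rows=462, horizon=4, min_train=52 * 2, max_splits=6)
for train_end, test_end in splits:
    print(f"train rows [0, {train_end}), validation rows [{train_end}, {test_end})")
```

Every split starts training at row 0, so later splits see strictly more history. The minimum of 104 training rows (two years) only matters when the series is short enough for older splits to run out of data.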
Let’s also define a helper function that generates the model results summary and plots.
def get_model_results_summary(result):
    """Generates model results summary.

    Parameters
    ----------
    result : `ForecastResult`
        See :class:`~greykite.framework.pipeline.pipeline.ForecastResult` for documentation.

    Returns
    -------
    Prints out model coefficients, cross-validation results, overall train/test evaluations.
    """
    # Get the useful fields from the forecast result
    model = result.model[-1]
    backtest = result.backtest
    grid_search = result.grid_search
    # Check model coefficients / variables
    # Get model summary with p-values
    print(model.summary())
    # Get cross-validation results
    cv_results = summarize_grid_search_results(
        grid_search=grid_search,
        decimals=2,
        cv_report_metrics=None,
        column_order=[
            "rank", "mean_test", "split_test", "mean_train", "split_train",
            "mean_fit_time", "mean_score_time", "params"])
    # Transposes to save space in the printed output
    print("================================= CV Results ==================================")
    print(cv_results.transpose())
    # Check historical evaluation metrics (on the historical training/test set).
    backtest_eval = defaultdict(list)
    for metric, value in backtest.train_evaluation.items():
        backtest_eval[metric].append(value)
        backtest_eval[metric].append(backtest.test_evaluation[metric])
    metrics = pd.DataFrame(backtest_eval, index=["train", "test"]).T
    print("=========================== Train/Test Evaluation =============================")
    print(metrics)
Fit a simple model without autoregression.
The most important model parameters are specified through ModelComponentsParam.
The extra_pred_cols parameter is used to specify growth and annual seasonality.
Growth is modelled with both “ct_sqrt” and “ct1” for extra flexibility, as we have long-term data and ridge regularization will avoid over-fitting the trend.
The yearly seasonality is modelled using a Fourier series. In ModelComponentsParam, we can specify its order: the higher the order, the more flexible the pattern the model can capture. Usually one can try integers between 10 and 50.
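To make the role of the Fourier order concrete, here is a small sketch of how sine and cosine regressors of a given order could be generated. It is an illustration rather than Greykite's implementation: the function and feature names are hypothetical, and measuring time in years since the first observation is an assumption.

```python
import math


def yearly_fourier_terms(t_years, order):
    """Returns {name: value} Fourier features for a time measured in years."""
    terms = {}
    for k in range(1, order + 1):
        # Harmonic k completes k full cycles per year.
        angle = 2.0 * math.pi * k * t_years
        terms[f"sin{k}_yearly"] = math.sin(angle)
        terms[f"cos{k}_yearly"] = math.cos(angle)
    return terms


# One row per week over two years; time is measured from the first row.
rows = [yearly_fourier_terms(week * 7 / 365.25, order=25) for week in range(104)]

print(len(rows[0]))             # number of seasonality columns per row
print(rows[13]["sin1_yearly"])  # first harmonic, about a quarter of a year in
```

Each unit of order adds one sine and one cosine column, so an order of 25 contributes 50 regressors. Higher harmonics oscillate faster and let the fitted curve follow sharper within-year changes, at the cost of more coefficients to estimate.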
autoregression = None
extra_pred_cols = ["ct1", "ct_sqrt", "ct1:C(month, levels=list(range(1, 13)))"]
# Specify the model parameters
model_components = ModelComponentsParam(
    autoregression=autoregression,
    seasonality={
        "yearly_seasonality": 25,
        "quarterly_seasonality": 0,
        "monthly_seasonality": 0,
        "weekly_seasonality": 0,
        "daily_seasonality": 0
    },
    changepoints={
        'changepoints_dict': {
            "method": "auto",
            "resample_freq": "7D",
            "regularization_strength": 0.5,
            "potential_changepoint_distance": "14D",
            "no_changepoint_distance_from_end": "60D",
            "yearly_seasonality_order": 25,
            "yearly_seasonality_change_freq": None,
        },
        "seasonality_changepoints_dict": None
    },
    events={
        "holiday_lookup_countries": []
    },
    growth={
        "growth_term": None
    },
    custom={
        'feature_sets_enabled': False,
        'fit_algorithm_dict': dict(fit_algorithm='ridge'),
        'extra_pred_cols': extra_pred_cols,
    }
)
forecast_config = ForecastConfig(
    metadata_param=metadata,
    forecast_horizon=forecast_horizon,
    coverage=0.95,
    evaluation_period_param=evaluation_period,
    model_components_param=model_components
)
# Run the forecast model
forecaster = Forecaster()
result = forecaster.run_forecast_config(
    df=ts.df,
    config=forecast_config
)
Out:
Fitting 6 folds for each of 1 candidates, totalling 6 fits
Let’s check the model results summary and plots.
get_model_results_summary(result)
Out:
================================ Model Summary =================================
Number of observations: 466, Number of features: 68
Method: Ridge regression
Number of nonzero features: 68
Regularization parameter: 0.02807
Residuals:
Min 1Q Median 3Q Max
-2.758e+04 -3953.0 169.9 4685.0 2.165e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 1.086e+04 5469.0 0.038 * (1769.0, 2.295e+04)
ct1 1.932e+04 1.151e+04 0.108 (-1087.0, 4.416e+04)
ct1:C(mo... 13)))_2 917.1 6453.0 0.884 (-1.211e+04, 1.282e+04)
ct1:C(mo... 13)))_3 9628.0 6039.0 0.100 (-2648.0, 2.137e+04)
ct1:C(mo... 13)))_4 3.472e+04 5383.0 <2e-16 *** (2.381e+04, 4.503e+04)
ct1:C(mo... 13)))_5 2.762e+04 5947.0 <2e-16 *** (1.639e+04, 3.922e+04)
ct1:C(mo... 13)))_6 3.755e+04 5040.0 <2e-16 *** (2.671e+04, 4.820e+04)
ct1:C(mo... 13)))_7 4.068e+04 5000.0 <2e-16 *** (3.019e+04, 5.016e+04)
ct1:C(mo... 13)))_8 4.096e+04 5258.0 <2e-16 *** (3.010e+04, 5.026e+04)
ct1:C(mo... 13)))_9 3.261e+04 5876.0 <2e-16 *** (2.138e+04, 4.394e+04)
ct1:C(mo...13)))_10 3.322e+04 5077.0 <2e-16 *** (2.377e+04, 4.332e+04)
ct1:C(mo...13)))_11 1.036e+04 5632.0 0.078 . (-1244.0, 2.231e+04)
ct1:C(mo...13)))_12 4878.0 5030.0 0.310 (-5562.0, 1.429e+04)
ct_sqrt 5.656e+04 8890.0 <2e-16 *** (3.618e+04, 7.047e+04)
sin1_ct1_yearly -1.977e+04 1644.0 <2e-16 *** (-2.310e+04, -1.685e+04)
cos1_ct1_yearly 6433.0 1687.0 <2e-16 *** (3365.0, 9984.0)
sin2_ct1_yearly 5236.0 1595.0 <2e-16 *** (2154.0, 8272.0)
cos2_ct1_yearly 4841.0 1677.0 0.008 ** (1616.0, 8080.0)
sin3_ct1_yearly 1800.0 1585.0 0.266 (-1306.0, 4613.0)
cos3_ct1_yearly -272.9 1471.0 0.866 (-3326.0, 2817.0)
sin4_ct1_yearly -337.2 1461.0 0.810 (-3332.0, 2288.0)
cos4_ct1_yearly -1376.0 1494.0 0.372 (-4439.0, 1421.0)
sin5_ct1_yearly -2044.0 1368.0 0.140 (-4914.0, 420.5)
cos5_ct1_yearly -786.7 1293.0 0.534 (-3193.0, 2053.0)
sin6_ct1_yearly -126.7 1485.0 0.922 (-3026.0, 2559.0)
cos6_ct1_yearly 1482.0 1119.0 0.170 (-729.7, 3647.0)
sin7_ct1_yearly -363.0 1356.0 0.788 (-3091.0, 2180.0)
cos7_ct1_yearly -270.2 1134.0 0.800 (-2733.0, 1745.0)
sin8_ct1_yearly -1562.0 1154.0 0.172 (-3922.0, 396.8)
cos8_ct1_yearly 724.6 1050.0 0.492 (-1249.0, 2879.0)
sin9_ct1_yearly -1829.0 1098.0 0.106 (-3809.0, 422.6)
cos9_ct1_yearly 2049.0 1222.0 0.100 (-367.7, 4437.0)
sin10_ct1_yearly 630.4 1070.0 0.536 (-1709.0, 2705.0)
cos10_ct1_yearly 1728.0 1095.0 0.096 . (-225.6, 3943.0)
sin11_ct1_yearly 2838.0 1046.0 0.008 ** (932.4, 5057.0)
cos11_ct1_yearly -686.5 1079.0 0.532 (-2723.0, 1574.0)
sin12_ct1_yearly -2057.0 1014.0 0.042 * (-4114.0, -176.5)
cos12_ct1_yearly -2302.0 1016.0 0.018 * (-4311.0, -500.6)
sin13_ct1_yearly -407.8 1018.0 0.700 (-2588.0, 1390.0)
cos13_ct1_yearly 316.2 1072.0 0.748 (-1732.0, 2287.0)
sin14_ct1_yearly -1372.0 989.5 0.164 (-3154.0, 563.2)
cos14_ct1_yearly 1162.0 1113.0 0.284 (-1016.0, 3456.0)
sin15_ct1_yearly 112.2 1070.0 0.922 (-1879.0, 2353.0)
cos15_ct1_yearly 1259.0 1057.0 0.226 (-853.2, 3398.0)
sin16_ct1_yearly -2384.0 1037.0 0.026 * (-4327.0, -271.9)
cos16_ct1_yearly -644.5 1036.0 0.548 (-2571.0, 1488.0)
sin17_ct1_yearly 680.7 1068.0 0.512 (-1327.0, 2880.0)
cos17_ct1_yearly 850.7 1014.0 0.414 (-1130.0, 2837.0)
sin18_ct1_yearly -549.1 1087.0 0.616 (-2587.0, 1601.0)
cos18_ct1_yearly 220.5 1019.0 0.818 (-1672.0, 2215.0)
sin19_ct1_yearly -579.2 1053.0 0.562 (-2710.0, 1264.0)
cos19_ct1_yearly -773.8 998.4 0.452 (-2592.0, 1178.0)
sin20_ct1_yearly 1083.0 1120.0 0.332 (-959.2, 3325.0)
cos20_ct1_yearly 311.6 1070.0 0.766 (-1788.0, 2494.0)
sin21_ct1_yearly -1025.0 1051.0 0.302 (-3047.0, 1139.0)
cos21_ct1_yearly -483.3 1036.0 0.638 (-2502.0, 1510.0)
sin22_ct1_yearly 1110.0 981.9 0.246 (-859.5, 3137.0)
cos22_ct1_yearly 340.0 1140.0 0.776 (-1772.0, 2693.0)
sin23_ct1_yearly 506.2 1029.0 0.650 (-1586.0, 2581.0)
cos23_ct1_yearly -3098.0 1117.0 0.006 ** (-5189.0, -810.4)
sin24_ct1_yearly -1075.0 1097.0 0.338 (-3033.0, 1045.0)
cos24_ct1_yearly -367.7 1026.0 0.684 (-2447.0, 1774.0)
sin25_ct1_yearly -2124.0 1042.0 0.038 * (-4065.0, -171.1)
cos25_ct1_yearly -34.79 1048.0 0.972 (-1953.0, 2185.0)
cp0_2012_01_30_00 2645.0 1.201e+04 0.862 (-2.242e+04, 2.561e+04)
cp1_2013_01_14_00 -2.420e+04 1.042e+04 0.020 * (-4.451e+04, -4331.0)
cp2_2015_02_23_00 -1574.0 5339.0 0.754 (-1.191e+04, 9621.0)
cp3_2017_10_02_00 -1.979e+04 2755.0 <2e-16 *** (-2.492e+04, -1.434e+04)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9181, Adjusted R-squared: 0.9047
F-statistic: 67.998 on 65 and 399 DF, p-value: 1.110e-16
Model AIC: 11257.0, model BIC: 11533.0
WARNING: the condition number is large, 2.06e+05. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
================================= CV Results ==================================
0
rank_test_MAPE 1
mean_test_MAPE 10.73
split_test_MAPE (12.11, 15.22, 4.69, 10.99, 9.81, 11.59)
mean_train_MAPE 15.96
split_train_MAPE (15.89, 16.1, 16.05, 15.95, 15.91, 15.88)
mean_fit_time 2.36
mean_score_time 0.29
params []
=========================== Train/Test Evaluation =============================
train test
CORR 0.957887 0.886164
R2 0.917539 -2.01926
MSE 5.01863e+07 5.18751e+07
RMSE 7084.23 7202.44
MAE 5394.33 6425.43
MedAE 4193.48 6045.07
MAPE 15.8538 8.13425
MedAPE 8.59335 7.53921
sMAPE 7.41262 3.86442
Q80 2697.17 1285.09
Q95 2697.17 321.272
Q99 2697.17 64.2543
OutsideTolerance1p 0.943723 1
OutsideTolerance2p 0.883117 1
OutsideTolerance3p 0.80303 1
OutsideTolerance4p 0.755411 0.75
OutsideTolerance5p 0.705628 0.5
Outside Tolerance (fraction) None None
R2_null_model_score None None
Prediction Band Width (%) 78.8132 34.3116
Prediction Band Coverage (fraction) 0.941558 1
Coverage: Lower Band 0.450216 1
Coverage: Upper Band 0.491342 0
Coverage Diff: Actual_Coverage - Intended_Coverage -0.00844156 0.05
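The test column above shows a negative R2 alongside a single-digit MAPE, which can look contradictory. The sketch below reproduces the effect on made-up numbers (the values are hypothetical, not taken from the bike-sharing data). When a short holdout varies little around its own mean, a modest bias in the forecast is enough to push R2 below zero.

```python
def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - sse / sst


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical 4-week holdout: actuals vary little, forecasts run slightly high.
actual = [80000.0, 82000.0, 81000.0, 83000.0]
predicted = [86000.0, 88000.0, 87000.0, 88000.0]

print(round(mape(actual, predicted), 2))
print(round(r2_score(actual, predicted), 2))
```

R2 compares the model against a constant forecast equal to the holdout mean, so on a few nearly flat points it is a harsh and noisy yardstick. Percentage errors and the cross-validation results are more informative here.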
Fit/backtest plot:
fig = result.backtest.plot()
plotly.io.show(fig)
Forecast plot:
fig = result.forecast.plot()
plotly.io.show(fig)
The components plot:
fig = result.forecast.plot_components()
plotly.io.show(fig)
Fit a simple model with autoregression.
This is done by specifying the autoregression parameter in ModelComponentsParam.
Note that the auto-regressive structure can be customized further depending on your data.
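As a rough picture of what a lag-1 regressor is, the sketch below builds lagged copies of a series by hand. The helper is hypothetical and only illustrates the idea. How Greykite constructs and names its lag columns internally is not shown here.

```python
def add_lag_features(values, orders):
    """Builds rows of {"y": value, "y_lag<k>": value k steps earlier}.

    Lags that reach before the start of the series are set to None.
    """
    rows = []
    for i, value in enumerate(values):
        row = {"y": value}
        for k in orders:
            row[f"y_lag{k}"] = values[i - k] if i - k >= 0 else None
        rows.append(row)
    return rows


# First five weekly counts from the dataset printed earlier.
weekly_counts = [2801, 3238, 6241, 7756, 9556]
lagged_rows = add_lag_features(weekly_counts, orders=[1])
for row in lagged_rows:
    print(row)
```

With lag-1 alone, each week's prediction can lean on the previous week's observed value, which helps when deviations from trend and seasonality persist from one week to the next.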
autoregression = {
    "autoreg_dict": {
        "lag_dict": {"orders": [1]},  # Only use lag-1
        "agg_lag_dict": None
    }
}
extra_pred_cols = ["ct1", "ct_sqrt", "ct1:C(month, levels=list(range(1, 13)))"]
# Specify the model parameters
model_components = ModelComponentsParam(
    autoregression=autoregression,
    seasonality={
        "yearly_seasonality": 25,
        "quarterly_seasonality": 0,
        "monthly_seasonality": 0,
        "weekly_seasonality": 0,
        "daily_seasonality": 0
    },
    changepoints={
        'changepoints_dict': {
            "method": "auto",
            "resample_freq": "7D",
            "regularization_strength": 0.5,
            "potential_changepoint_distance": "14D",
            "no_changepoint_distance_from_end": "60D",
            "yearly_seasonality_order": 25,
            "yearly_seasonality_change_freq": None,
        },
        "seasonality_changepoints_dict": None
    },
    events={
        "holiday_lookup_countries": []
    },
    growth={
        "growth_term": None
    },
    custom={
        'feature_sets_enabled': False,
        'fit_algorithm_dict': dict(fit_algorithm='ridge'),
        'extra_pred_cols': extra_pred_cols,
    }
)
forecast_config = ForecastConfig(
    metadata_param=metadata,
    forecast_horizon=forecast_horizon,
    coverage=0.95,
    evaluation_period_param=evaluation_period,
    model_components_param=model_components
)
# Run the forecast model
forecaster = Forecaster()
result = forecaster.run_forecast_config(
    df=ts.df,
    config=forecast_config
)
Out:
Fitting 6 folds for each of 1 candidates, totalling 6 fits
Let’s check the model results summary and plots.
get_model_results_summary(result)
Out:
================================ Model Summary =================================
Number of observations: 466, Number of features: 69
Method: Ridge regression
Number of nonzero features: 69
Regularization parameter: 0.0621
Residuals:
Min 1Q Median 3Q Max
-2.954e+04 -3841.0 468.7 4111.0 2.007e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 1.228e+04 4624.0 0.010 * (4911.0, 2.222e+04)
ct1 1.700e+04 4986.0 <2e-16 *** (7294.0, 2.681e+04)
ct1:C(mo... 13)))_2 -1832.0 5245.0 0.730 (-1.195e+04, 7913.0)
ct1:C(mo... 13)))_3 4877.0 4973.0 0.320 (-5110.0, 1.460e+04)
ct1:C(mo... 13)))_4 2.342e+04 5067.0 <2e-16 *** (1.316e+04, 3.291e+04)
ct1:C(mo... 13)))_5 1.770e+04 5403.0 <2e-16 *** (7552.0, 2.810e+04)
ct1:C(mo... 13)))_6 2.350e+04 4775.0 <2e-16 *** (1.410e+04, 3.258e+04)
ct1:C(mo... 13)))_7 2.675e+04 4633.0 <2e-16 *** (1.677e+04, 3.624e+04)
ct1:C(mo... 13)))_8 2.596e+04 4902.0 <2e-16 *** (1.578e+04, 3.431e+04)
ct1:C(mo... 13)))_9 1.952e+04 5501.0 <2e-16 *** (8632.0, 2.968e+04)
ct1:C(mo...13)))_10 2.056e+04 5301.0 <2e-16 *** (1.010e+04, 3.106e+04)
ct1:C(mo...13)))_11 2924.0 4647.0 0.500 (-5897.0, 1.253e+04)
ct1:C(mo...13)))_12 1582.0 4918.0 0.772 (-9213.0, 1.072e+04)
ct_sqrt 3.718e+04 6095.0 <2e-16 *** (2.312e+04, 4.826e+04)
sin1_ct1_yearly -1.452e+04 1833.0 <2e-16 *** (-1.840e+04, -1.099e+04)
cos1_ct1_yearly 4158.0 1641.0 0.010 * (752.2, 7209.0)
sin2_ct1_yearly 3378.0 1399.0 0.012 * (770.4, 6172.0)
cos2_ct1_yearly 4393.0 1519.0 0.006 ** (1554.0, 7692.0)
sin3_ct1_yearly 1934.0 1514.0 0.182 (-1031.0, 4829.0)
cos3_ct1_yearly -201.8 1402.0 0.874 (-3147.0, 2387.0)
sin4_ct1_yearly -914.2 1355.0 0.478 (-3639.0, 1694.0)
cos4_ct1_yearly -796.1 1294.0 0.522 (-3318.0, 1864.0)
sin5_ct1_yearly -1596.0 1393.0 0.256 (-4178.0, 1300.0)
cos5_ct1_yearly -988.7 1206.0 0.414 (-3298.0, 1494.0)
sin6_ct1_yearly 113.2 1320.0 0.938 (-2515.0, 2869.0)
cos6_ct1_yearly 1641.0 1145.0 0.146 (-708.2, 3783.0)
sin7_ct1_yearly -485.6 1243.0 0.720 (-2761.0, 1987.0)
cos7_ct1_yearly -215.5 1067.0 0.858 (-2272.0, 1859.0)
sin8_ct1_yearly -1111.0 1123.0 0.330 (-3399.0, 956.7)
cos8_ct1_yearly 519.5 1036.0 0.644 (-1365.0, 2546.0)
sin9_ct1_yearly -2083.0 1039.0 0.040 * (-4081.0, -178.0)
cos9_ct1_yearly 1135.0 1065.0 0.304 (-846.4, 3279.0)
sin10_ct1_yearly 178.2 974.2 0.888 (-1784.0, 1997.0)
cos10_ct1_yearly 1482.0 1031.0 0.132 (-633.7, 3610.0)
sin11_ct1_yearly 2166.0 966.7 0.028 * (364.3, 4042.0)
cos11_ct1_yearly -251.9 991.1 0.798 (-2130.0, 1828.0)
sin12_ct1_yearly -1328.0 1011.0 0.186 (-3174.0, 645.0)
cos12_ct1_yearly -2660.0 978.4 0.010 * (-4574.0, -717.1)
sin13_ct1_yearly -1046.0 994.8 0.294 (-3068.0, 762.6)
cos13_ct1_yearly 463.8 968.4 0.638 (-1204.0, 2458.0)
sin14_ct1_yearly -1463.0 928.3 0.114 (-3200.0, 291.8)
cos14_ct1_yearly 1141.0 1023.0 0.270 (-797.0, 3098.0)
sin15_ct1_yearly -189.7 1063.0 0.850 (-2163.0, 1862.0)
cos15_ct1_yearly 1158.0 938.1 0.228 (-673.9, 2933.0)
sin16_ct1_yearly -2142.0 989.6 0.028 * (-4138.0, -346.0)
cos16_ct1_yearly -1075.0 945.0 0.232 (-2879.0, 882.2)
sin17_ct1_yearly 423.9 974.8 0.672 (-1420.0, 2302.0)
cos17_ct1_yearly 983.1 988.7 0.322 (-904.6, 2809.0)
sin18_ct1_yearly -245.8 1019.0 0.798 (-2264.0, 1659.0)
cos18_ct1_yearly -192.3 1006.0 0.838 (-2181.0, 1799.0)
sin19_ct1_yearly -698.6 995.0 0.476 (-2603.0, 1130.0)
cos19_ct1_yearly -929.2 949.4 0.332 (-2926.0, 747.6)
sin20_ct1_yearly 1321.0 1008.0 0.186 (-638.8, 3487.0)
cos20_ct1_yearly 428.0 983.0 0.676 (-1296.0, 2488.0)
sin21_ct1_yearly -1275.0 991.6 0.202 (-3181.0, 640.1)
cos21_ct1_yearly -740.6 1038.0 0.486 (-2970.0, 1093.0)
sin22_ct1_yearly 1233.0 876.9 0.156 (-532.8, 2887.0)
cos22_ct1_yearly 603.5 1092.0 0.572 (-1374.0, 2775.0)
sin23_ct1_yearly 570.9 919.8 0.512 (-1397.0, 2303.0)
cos23_ct1_yearly -3478.0 1060.0 <2e-16 *** (-5601.0, -1546.0)
sin24_ct1_yearly -1124.0 940.4 0.238 (-2907.0, 772.6)
cos24_ct1_yearly -484.6 1026.0 0.632 (-2422.0, 1341.0)
sin25_ct1_yearly -2646.0 967.8 0.006 ** (-4619.0, -795.1)
cos25_ct1_yearly 307.7 991.4 0.734 (-1690.0, 2021.0)
cp0_2012_01_30_00 2383.0 8194.0 0.772 (-1.359e+04, 1.806e+04)
cp1_2013_01_14_00 -1.616e+04 7896.0 0.046 * (-3.203e+04, -63.67)
cp2_2015_02_23_00 -2032.0 4985.0 0.688 (-1.170e+04, 7310.0)
cp3_2017_10_02_00 -1.392e+04 2900.0 <2e-16 *** (-1.906e+04, -7394.0)
y_lag1 2.981e+04 5173.0 <2e-16 *** (1.986e+04, 4.014e+04)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9245, Adjusted R-squared: 0.912
F-statistic: 73.697 on 66 and 398 DF, p-value: 1.110e-16
Model AIC: 11220.0, model BIC: 11498.0
WARNING: the condition number is large, 1.04e+05. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
================================= CV Results ==================================
0
rank_test_MAPE 1
mean_test_MAPE 10.36
split_test_MAPE (12.99, 16.75, 4.98, 8.52, 8.02, 10.94)
mean_train_MAPE 14.46
split_train_MAPE (14.61, 14.54, 14.48, 14.43, 14.38, 14.31)
mean_fit_time 2.34
mean_score_time 4.11
params []
=========================== Train/Test Evaluation =============================
train test
CORR 0.961255 0.539296
R2 0.924008 -0.939596
MSE 4.62489e+07 3.33249e+07
RMSE 6800.65 5772.77
MAE 5131.61 4452.25
MedAE 4077.58 3622.29
MAPE 14.267 5.72096
MedAPE 7.63189 4.56046
sMAPE 6.68905 2.72693
Q80 2565.81 890.45
Q95 2565.81 222.612
Q99 2565.81 44.5225
OutsideTolerance1p 0.919913 0.75
OutsideTolerance2p 0.863636 0.5
OutsideTolerance3p 0.798701 0.5
OutsideTolerance4p 0.757576 0.5
OutsideTolerance5p 0.675325 0.5
Outside Tolerance (fraction) None None
R2_null_model_score None None
Prediction Band Width (%) 75.6584 30.1178
Prediction Band Coverage (fraction) 0.941558 1
Coverage: Lower Band 0.424242 1
Coverage: Upper Band 0.517316 0
Coverage Diff: Actual_Coverage - Intended_Coverage -0.00844156 0.05
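The last rows of the evaluation table relate the prediction band to the requested coverage of 0.95. A minimal sketch of that calculation follows. The helper name and the data are hypothetical, and the code only mirrors the definition implied by the row labels above.

```python
def band_coverage(actual, lower, upper):
    """Fraction of actual values that fall inside [lower, upper]."""
    inside = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return inside / len(actual)


# Hypothetical values: 20 observations, two of which fall above the band.
actual = [100.0] * 18 + [140.0, 150.0]
lower = [80.0] * 20
upper = [120.0] * 20

intended_coverage = 0.95
actual_coverage = band_coverage(actual, lower, upper)
coverage_diff = actual_coverage - intended_coverage

print(actual_coverage, round(coverage_diff, 4))
```

In the table above, the training coverage of 0.941558 against an intended 0.95 gives the reported difference of about -0.0084, so the band is very slightly too narrow on the training data. The test coverage of 1 comes from only a handful of points and says little on its own.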
Fit/backtest plot:
fig = result.backtest.plot()
plotly.io.show(fig)
Forecast plot:
fig = result.forecast.plot()
plotly.io.show(fig)
The components plot:
fig = result.forecast.plot_components()
plotly.io.show(fig)
Fit a Greykite model with autoregression and forecast one-by-one. Forecast one-by-one is only used when autoregression is set to “auto”, and it can be enabled by setting forecast_one_by_one=True in ForecastConfig.
Without forecast one-by-one, the lag order in autoregression has to be at least as large as the forecast horizon in order to avoid simulation (which leads to less accuracy).
The advantage of turning on forecast_one_by_one is improved forecast accuracy: the forecast horizon is broken into smaller steps, and multiple models are fitted using immediate lags.
Note that the forecast one-by-one option may slow down the training.
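The reasoning behind one-by-one forecasting can be sketched in a few lines. When forecasting `step` weeks ahead, a lag of order k refers to a value that has already been observed only if k is at least `step`. The helper, the candidate lag orders, and the 4-week horizon below are all assumptions for illustration. This is not how Greykite selects its lags.

```python
def usable_lags(lag_orders, step):
    """Lags whose values are already observed when forecasting `step` ahead."""
    return [k for k in lag_orders if k >= step]


horizon = 4                   # assumed number of weeks to forecast
lag_orders = [1, 2, 3, 4, 5]  # hypothetical candidate lags

for step in range(1, horizon + 1):
    print(f"step {step}: lags {usable_lags(lag_orders, step)}")
```

A single model covering the whole horizon would be limited to the lags usable at the final step, or would have to feed its own predictions back in as lag values. Fitting one model per step lets the early steps use the most recent observations.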
autoregression = {
    "autoreg_dict": "auto"
}
extra_pred_cols = ["ct1", "ct_sqrt", "ct1:C(month, levels=list(range(1, 13)))"]
forecast_one_by_one = True
# Specify the model parameters
model_components = ModelComponentsParam(
    autoregression=autoregression,
    seasonality={
        "yearly_seasonality": 25,
        "quarterly_seasonality": 0,
        "monthly_seasonality": 0,
        "weekly_seasonality": 0,
        "daily_seasonality": 0
    },
    changepoints={
        'changepoints_dict': {
            "method": "auto",
            "resample_freq": "7D",
            "regularization_strength": 0.5,
            "potential_changepoint_distance": "14D",
            "no_changepoint_distance_from_end": "60D",
            "yearly_seasonality_order": 25,
            "yearly_seasonality_change_freq": None,
        },
        "seasonality_changepoints_dict": None
    },
    events={
        "holiday_lookup_countries": []
    },
    growth={
        "growth_term": None
    },
    custom={
        'feature_sets_enabled': False,
        'fit_algorithm_dict': dict(fit_algorithm='ridge'),
        'extra_pred_cols': extra_pred_cols,
    }
)
forecast_config = ForecastConfig(
    metadata_param=metadata,
    forecast_horizon=forecast_horizon,
    coverage=0.95,
    evaluation_period_param=evaluation_period,
    model_components_param=model_components,
    forecast_one_by_one=forecast_one_by_one
)
# Run the forecast model
forecaster = Forecaster()
result = forecaster.run_forecast_config(
    df=ts.df,
    config=forecast_config
)
Out:
Fitting 6 folds for each of 1 candidates, totalling 6 fits
Let’s check the model results summary and plots. Here the forecast_one_by_one option fits 4 models, one for each step, hence 4 model summaries are printed and 4 components plots are generated.
get_model_results_summary(result)
Out:
[================================ Model Summary =================================
Number of observations: 466, Number of features: 71
Method: Ridge regression
Number of nonzero features: 71
Regularization parameter: 0.0621
Residuals:
Min 1Q Median 3Q Max
-2.890e+04 -3782.0 623.6 3971.0 2.048e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 1.157e+04 4586.0 0.018 * (4769.0, 2.257e+04)
ct1 1.383e+04 5010.0 0.002 ** (4594.0, 2.369e+04)
ct1:C(mo... 13)))_2 -1369.0 5398.0 0.784 (-1.181e+04, 9789.0)
ct1:C(mo... 13)))_3 3498.0 5370.0 0.494 (-7555.0, 1.317e+04)
ct1:C(mo... 13)))_4 2.005e+04 5527.0 <2e-16 *** (8450.0, 3.053e+04)
ct1:C(mo... 13)))_5 1.243e+04 5803.0 0.034 * (1335.0, 2.400e+04)
ct1:C(mo... 13)))_6 1.846e+04 5112.0 <2e-16 *** (7555.0, 2.847e+04)
ct1:C(mo... 13)))_7 2.014e+04 5008.0 <2e-16 *** (9397.0, 2.921e+04)
ct1:C(mo... 13)))_8 1.952e+04 5021.0 <2e-16 *** (9338.0, 2.886e+04)
ct1:C(mo... 13)))_9 1.316e+04 5525.0 0.018 * (2031.0, 2.422e+04)
ct1:C(mo...13)))_10 1.442e+04 5214.0 0.006 ** (3647.0, 2.401e+04)
ct1:C(mo...13)))_11 -2251.0 4928.0 0.634 (-1.245e+04, 6658.0)
ct1:C(mo...13)))_12 88.81 5116.0 0.986 (-1.047e+04, 9451.0)
ct_sqrt 2.724e+04 6529.0 <2e-16 *** (1.386e+04, 3.938e+04)
sin1_ct1_yearly -1.184e+04 1833.0 <2e-16 *** (-1.556e+04, -8239.0)
cos1_ct1_yearly 2201.0 1590.0 0.142 (-956.3, 5161.0)
sin2_ct1_yearly 2414.0 1358.0 0.086 . (-280.8, 5459.0)
cos2_ct1_yearly 4682.0 1522.0 <2e-16 *** (1692.0, 7621.0)
sin3_ct1_yearly 2098.0 1412.0 0.130 (-802.5, 4758.0)
cos3_ct1_yearly -160.8 1384.0 0.898 (-2866.0, 2338.0)
sin4_ct1_yearly -903.4 1308.0 0.492 (-3564.0, 1746.0)
cos4_ct1_yearly -403.2 1465.0 0.798 (-3430.0, 2376.0)
sin5_ct1_yearly -1422.0 1362.0 0.268 (-4181.0, 1263.0)
cos5_ct1_yearly -1450.0 1297.0 0.260 (-4127.0, 1017.0)
sin6_ct1_yearly 18.43 1417.0 0.990 (-2728.0, 2642.0)
cos6_ct1_yearly 2262.0 1086.0 0.038 * (77.73, 4250.0)
sin7_ct1_yearly -570.3 1213.0 0.646 (-2998.0, 1827.0)
cos7_ct1_yearly -392.5 1084.0 0.708 (-2452.0, 1561.0)
sin8_ct1_yearly -1046.0 1203.0 0.374 (-3542.0, 1225.0)
cos8_ct1_yearly 314.1 1097.0 0.796 (-1610.0, 2464.0)
sin9_ct1_yearly -2537.0 1060.0 0.020 * (-4764.0, -638.1)
cos9_ct1_yearly 1553.0 1075.0 0.148 (-673.6, 3556.0)
sin10_ct1_yearly 384.5 1046.0 0.750 (-1558.0, 2470.0)
cos10_ct1_yearly 1482.0 951.4 0.116 (-464.8, 3387.0)
sin11_ct1_yearly 2064.0 968.8 0.030 * (361.9, 4075.0)
cos11_ct1_yearly -377.9 938.0 0.690 (-2015.0, 1494.0)
sin12_ct1_yearly -1793.0 976.2 0.066 . (-3806.0, 4.433)
cos12_ct1_yearly -2747.0 954.6 0.008 ** (-4664.0, -925.8)
sin13_ct1_yearly -926.8 1021.0 0.354 (-2824.0, 1097.0)
cos13_ct1_yearly 1072.0 1022.0 0.290 (-947.3, 3037.0)
sin14_ct1_yearly -1079.0 942.9 0.250 (-2850.0, 884.2)
cos14_ct1_yearly 1292.0 994.7 0.190 (-617.6, 3207.0)
sin15_ct1_yearly -10.76 1098.0 0.992 (-2083.0, 2248.0)
cos15_ct1_yearly 1135.0 1005.0 0.244 (-779.1, 3014.0)
sin16_ct1_yearly -2247.0 1028.0 0.036 * (-4127.0, -139.1)
cos16_ct1_yearly -882.8 1014.0 0.384 (-2829.0, 1110.0)
sin17_ct1_yearly 484.5 962.1 0.610 (-1419.0, 2186.0)
cos17_ct1_yearly 891.7 990.1 0.374 (-964.5, 2753.0)
sin18_ct1_yearly -413.2 979.4 0.686 (-2313.0, 1456.0)
cos18_ct1_yearly -31.73 973.4 0.970 (-1909.0, 1995.0)
sin19_ct1_yearly -696.1 976.2 0.468 (-2691.0, 988.8)
cos19_ct1_yearly -785.3 1008.0 0.456 (-2826.0, 1244.0)
sin20_ct1_yearly 1262.0 916.5 0.178 (-480.9, 3036.0)
cos20_ct1_yearly 359.5 967.6 0.728 (-1348.0, 2459.0)
sin21_ct1_yearly -1148.0 1028.0 0.258 (-3156.0, 1021.0)
cos21_ct1_yearly -670.7 1065.0 0.548 (-2718.0, 1320.0)
sin22_ct1_yearly 1086.0 963.6 0.236 (-946.5, 2847.0)
cos22_ct1_yearly 421.2 999.2 0.656 (-1494.0, 2301.0)
sin23_ct1_yearly 409.4 909.6 0.654 (-1473.0, 2226.0)
cos23_ct1_yearly -3170.0 1003.0 <2e-16 *** (-5088.0, -1206.0)
sin24_ct1_yearly -1020.0 977.8 0.302 (-3093.0, 811.4)
cos24_ct1_yearly -452.9 1005.0 0.644 (-2587.0, 1482.0)
sin25_ct1_yearly -2412.0 1017.0 0.016 * (-4200.0, -398.4)
cos25_ct1_yearly 349.5 987.2 0.722 (-1481.0, 2374.0)
cp0_2012_01_30_00 2834.0 7737.0 0.714 (-1.205e+04, 1.711e+04)
cp1_2013_01_14_00 -1.253e+04 7983.0 0.106 (-2.928e+04, 2947.0)
cp2_2015_02_23_00 -1701.0 4985.0 0.764 (-1.120e+04, 7301.0)
cp3_2017_10_02_00 -1.108e+04 2989.0 <2e-16 *** (-1.678e+04, -5366.0)
y_lag1 2.380e+04 5687.0 <2e-16 *** (1.208e+04, 3.429e+04)
y_lag2 1.217e+04 5467.0 0.022 * (1269.0, 2.231e+04)
y_lag3 1.001e+04 5276.0 0.050 . (459.7, 2.026e+04)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9265, Adjusted R-squared: 0.9139
F-statistic: 73.444 on 67 and 397 DF, p-value: 1.110e-16
Model AIC: 11212.0, model BIC: 11498.0
WARNING: the condition number is large, 1.08e+05. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
, ================================ Model Summary =================================
Number of observations: 466, Number of features: 71
Method: Ridge regression
Number of nonzero features: 71
Regularization parameter: 0.0621
Residuals:
Min 1Q Median 3Q Max
-2.845e+04 -3677.0 385.2 4356.0 1.959e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 1.081e+04 4749.0 0.026 * (3416.0, 2.202e+04)
ct1 1.641e+04 5640.0 0.002 ** (4967.0, 2.802e+04)
ct1:C(mo... 13)))_2 -370.3 6152.0 0.962 (-1.342e+04, 1.036e+04)
ct1:C(mo... 13)))_3 4962.0 5765.0 0.384 (-6816.0, 1.480e+04)
ct1:C(mo... 13)))_4 2.512e+04 5309.0 <2e-16 *** (1.402e+04, 3.427e+04)
ct1:C(mo... 13)))_5 1.566e+04 6101.0 0.014 * (1580.0, 2.572e+04)
ct1:C(mo... 13)))_6 2.458e+04 5197.0 <2e-16 *** (1.341e+04, 3.380e+04)
ct1:C(mo... 13)))_7 2.551e+04 5503.0 0.002 ** (1.341e+04, 3.485e+04)
ct1:C(mo... 13)))_8 2.564e+04 5394.0 <2e-16 *** (1.365e+04, 3.406e+04)
ct1:C(mo... 13)))_9 1.826e+04 5980.0 0.004 ** (6089.0, 2.985e+04)
ct1:C(mo...13)))_10 1.944e+04 5696.0 0.002 ** (6677.0, 2.958e+04)
ct1:C(mo...13)))_11 -402.0 5436.0 0.938 (-1.278e+04, 8974.0)
ct1:C(mo...13)))_12 649.1 5304.0 0.898 (-1.008e+04, 8938.0)
ct_sqrt 3.508e+04 7247.0 <2e-16 *** (2.004e+04, 4.708e+04)
sin1_ct1_yearly -1.474e+04 1905.0 <2e-16 *** (-1.849e+04, -1.144e+04)
cos1_ct1_yearly 3027.0 1722.0 0.084 . (-421.9, 6460.0)
sin2_ct1_yearly 3383.0 1477.0 0.020 * (852.5, 6560.0)
cos2_ct1_yearly 5267.0 1495.0 <2e-16 *** (2328.0, 8145.0)
sin3_ct1_yearly 2135.0 1393.0 0.120 (-650.9, 4728.0)
cos3_ct1_yearly -350.5 1472.0 0.776 (-3166.0, 2616.0)
sin4_ct1_yearly -665.1 1403.0 0.626 (-3430.0, 2153.0)
cos4_ct1_yearly -669.7 1372.0 0.594 (-3314.0, 2228.0)
sin5_ct1_yearly -1646.0 1303.0 0.204 (-4071.0, 884.9)
cos5_ct1_yearly -1388.0 1263.0 0.298 (-3748.0, 1111.0)
sin6_ct1_yearly -82.52 1497.0 0.944 (-2934.0, 2708.0)
cos6_ct1_yearly 2357.0 1111.0 0.034 * (122.7, 4513.0)
sin7_ct1_yearly -591.4 1277.0 0.640 (-3352.0, 1782.0)
cos7_ct1_yearly -571.0 1055.0 0.576 (-2598.0, 1521.0)
sin8_ct1_yearly -1370.0 1162.0 0.246 (-3532.0, 956.1)
cos8_ct1_yearly 438.5 1095.0 0.748 (-1439.0, 2516.0)
sin9_ct1_yearly -2511.0 1073.0 0.024 * (-4532.0, -274.2)
cos9_ct1_yearly 2370.0 1147.0 0.034 * (341.5, 4653.0)
sin10_ct1_yearly 767.9 1034.0 0.450 (-1254.0, 2833.0)
cos10_ct1_yearly 1622.0 1011.0 0.132 (-449.4, 3517.0)
sin11_ct1_yearly 2459.0 1042.0 0.018 * (486.5, 4427.0)
cos11_ct1_yearly -712.9 997.1 0.490 (-2555.0, 1347.0)
sin12_ct1_yearly -2496.0 1017.0 0.012 * (-4679.0, -533.2)
cos12_ct1_yearly -2554.0 987.7 0.012 * (-4506.0, -746.5)
sin13_ct1_yearly -464.0 1078.0 0.666 (-2709.0, 1531.0)
cos13_ct1_yearly 1258.0 990.3 0.196 (-532.2, 3306.0)
sin14_ct1_yearly -833.4 970.5 0.404 (-2637.0, 1095.0)
cos14_ct1_yearly 1381.0 1023.0 0.168 (-612.4, 3349.0)
sin15_ct1_yearly 256.7 1110.0 0.816 (-1749.0, 2487.0)
cos15_ct1_yearly 1171.0 1019.0 0.264 (-850.0, 3023.0)
sin16_ct1_yearly -2512.0 1013.0 0.008 ** (-4600.0, -765.7)
cos16_ct1_yearly -483.9 987.6 0.614 (-2417.0, 1509.0)
sin17_ct1_yearly 730.2 1001.0 0.460 (-1127.0, 2585.0)
cos17_ct1_yearly 778.5 983.8 0.450 (-1174.0, 2607.0)
sin18_ct1_yearly -699.5 1064.0 0.514 (-2863.0, 1189.0)
cos18_ct1_yearly 272.6 1043.0 0.794 (-1729.0, 2595.0)
sin19_ct1_yearly -664.3 1021.0 0.500 (-2657.0, 1456.0)
cos19_ct1_yearly -601.2 1038.0 0.574 (-2650.0, 1383.0)
sin20_ct1_yearly 1087.0 1003.0 0.276 (-846.7, 3016.0)
cos20_ct1_yearly 252.7 958.0 0.758 (-1848.0, 2010.0)
sin21_ct1_yearly -908.7 1030.0 0.390 (-2909.0, 1046.0)
cos21_ct1_yearly -453.1 1013.0 0.624 (-2491.0, 1594.0)
sin22_ct1_yearly 922.5 978.9 0.342 (-985.6, 2707.0)
cos22_ct1_yearly 129.3 1035.0 0.898 (-2045.0, 2102.0)
sin23_ct1_yearly 247.3 983.1 0.792 (-1728.0, 2082.0)
cos23_ct1_yearly -2725.0 983.3 0.004 ** (-4520.0, -755.7)
sin24_ct1_yearly -933.8 1012.0 0.338 (-2866.0, 1092.0)
cos24_ct1_yearly -348.9 1028.0 0.700 (-2413.0, 1738.0)
sin25_ct1_yearly -1887.0 1015.0 0.070 . (-4037.0, 162.1)
cos25_ct1_yearly 142.6 1034.0 0.884 (-1690.0, 2224.0)
cp0_2012_01_30_00 2893.0 8225.0 0.728 (-1.325e+04, 1.886e+04)
cp1_2013_01_14_00 -1.565e+04 8463.0 0.068 . (-3.175e+04, 2032.0)
cp2_2015_02_23_00 -1969.0 5154.0 0.698 (-1.276e+04, 7712.0)
cp3_2017_10_02_00 -1.377e+04 2932.0 <2e-16 *** (-1.904e+04, -7614.0)
y_lag2 1.841e+04 5583.0 <2e-16 *** (6970.0, 2.948e+04)
y_lag3 1.335e+04 5698.0 0.020 * (2461.0, 2.513e+04)
y_lag4 384.1 5474.0 0.942 (-1.052e+04, 1.007e+04)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9225, Adjusted R-squared: 0.9092
F-statistic: 69.268 on 67 and 397 DF, p-value: 1.110e-16
Model AIC: 11236.0, model BIC: 11522.0
WARNING: the condition number is large, 1.08e+05. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
, ================================ Model Summary =================================
Number of observations: 466, Number of features: 71
Method: Ridge regression
Number of nonzero features: 71
Regularization parameter: 0.0621
Residuals:
Min 1Q Median 3Q Max
-2.803e+04 -3780.0 295.0 4331.0 2.130e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 1.174e+04 5057.0 0.020 * (3318.0, 2.308e+04)
ct1 1.729e+04 6260.0 0.002 ** (5731.0, 3.019e+04)
ct1:C(mo... 13)))_2 1405.0 6111.0 0.806 (-1.104e+04, 1.238e+04)
ct1:C(mo... 13)))_3 7198.0 5924.0 0.214 (-5566.0, 1.801e+04)
ct1:C(mo... 13)))_4 2.871e+04 5167.0 <2e-16 *** (1.719e+04, 3.708e+04)
ct1:C(mo... 13)))_5 1.782e+04 6066.0 0.004 ** (4248.0, 2.779e+04)
ct1:C(mo... 13)))_6 2.810e+04 4969.0 <2e-16 *** (1.741e+04, 3.742e+04)
ct1:C(mo... 13)))_7 2.890e+04 5096.0 <2e-16 *** (1.844e+04, 3.826e+04)
ct1:C(mo... 13)))_8 2.861e+04 5544.0 <2e-16 *** (1.753e+04, 3.900e+04)
ct1:C(mo... 13)))_9 2.073e+04 6162.0 <2e-16 *** (7510.0, 3.264e+04)
ct1:C(mo...13)))_10 2.231e+04 5238.0 <2e-16 *** (1.084e+04, 3.180e+04)
ct1:C(mo...13)))_11 132.7 5491.0 0.982 (-1.074e+04, 9768.0)
ct1:C(mo...13)))_12 1278.0 5227.0 0.778 (-1.068e+04, 9512.0)
ct_sqrt 3.670e+04 7045.0 <2e-16 *** (2.302e+04, 4.937e+04)
sin1_ct1_yearly -1.623e+04 1902.0 <2e-16 *** (-1.988e+04, -1.281e+04)
cos1_ct1_yearly 2976.0 1797.0 0.092 . (-309.5, 6888.0)
sin2_ct1_yearly 3903.0 1539.0 0.010 * (934.1, 7072.0)
cos2_ct1_yearly 5873.0 1649.0 <2e-16 *** (2542.0, 9174.0)
sin3_ct1_yearly 1995.0 1402.0 0.146 (-884.8, 4630.0)
cos3_ct1_yearly -385.0 1504.0 0.806 (-3422.0, 2331.0)
sin4_ct1_yearly -256.7 1545.0 0.868 (-3497.0, 2821.0)
cos4_ct1_yearly -1006.0 1454.0 0.490 (-3592.0, 1874.0)
sin5_ct1_yearly -2232.0 1490.0 0.138 (-5158.0, 600.9)
cos5_ct1_yearly -1400.0 1455.0 0.334 (-4127.0, 1459.0)
sin6_ct1_yearly -94.22 1436.0 0.942 (-2768.0, 2756.0)
cos6_ct1_yearly 1961.0 1132.0 0.074 . (-175.1, 3988.0)
sin7_ct1_yearly -576.8 1322.0 0.688 (-3137.0, 1782.0)
cos7_ct1_yearly -366.1 1147.0 0.756 (-2804.0, 1751.0)
sin8_ct1_yearly -1574.0 1157.0 0.174 (-3976.0, 625.3)
cos8_ct1_yearly 846.0 1126.0 0.462 (-1152.0, 3196.0)
sin9_ct1_yearly -1734.0 1070.0 0.098 . (-3730.0, 307.1)
cos9_ct1_yearly 2571.0 1166.0 0.028 * (315.2, 4878.0)
sin10_ct1_yearly 936.5 1095.0 0.366 (-1341.0, 3037.0)
cos10_ct1_yearly 1333.0 1073.0 0.232 (-776.3, 3317.0)
sin11_ct1_yearly 2276.0 1035.0 0.024 * (339.8, 4246.0)
cos11_ct1_yearly -887.5 1085.0 0.386 (-3027.0, 1216.0)
sin12_ct1_yearly -2133.0 1038.0 0.042 * (-4235.0, -207.3)
cos12_ct1_yearly -2129.0 1023.0 0.036 * (-4076.0, -191.2)
sin13_ct1_yearly -311.7 1084.0 0.768 (-2419.0, 1732.0)
cos13_ct1_yearly 806.0 1070.0 0.446 (-1186.0, 2888.0)
sin14_ct1_yearly -983.8 995.7 0.314 (-3029.0, 970.9)
cos14_ct1_yearly 1091.0 1050.0 0.296 (-912.9, 3128.0)
sin15_ct1_yearly 138.3 1076.0 0.880 (-1714.0, 2287.0)
cos15_ct1_yearly 1151.0 1055.0 0.276 (-783.2, 3275.0)
sin16_ct1_yearly -2434.0 1032.0 0.018 * (-4425.0, -525.7)
cos16_ct1_yearly -554.2 991.0 0.552 (-2395.0, 1516.0)
sin17_ct1_yearly 657.1 1055.0 0.586 (-1238.0, 2634.0)
cos17_ct1_yearly 765.6 1057.0 0.492 (-1162.0, 2773.0)
sin18_ct1_yearly -705.3 1014.0 0.524 (-2570.0, 1276.0)
cos18_ct1_yearly 471.5 1088.0 0.672 (-1466.0, 2638.0)
sin19_ct1_yearly -460.8 1024.0 0.630 (-2542.0, 1503.0)
cos19_ct1_yearly -671.8 1055.0 0.540 (-2656.0, 1353.0)
sin20_ct1_yearly 1066.0 1029.0 0.310 (-878.0, 3067.0)
cos20_ct1_yearly 441.2 1016.0 0.656 (-1383.0, 2658.0)
sin21_ct1_yearly -826.5 1031.0 0.434 (-2936.0, 1136.0)
cos21_ct1_yearly -657.0 1021.0 0.470 (-2765.0, 1279.0)
sin22_ct1_yearly 867.5 925.1 0.368 (-979.9, 2491.0)
cos22_ct1_yearly 350.7 1095.0 0.772 (-1592.0, 2497.0)
sin23_ct1_yearly 802.0 978.8 0.428 (-999.9, 2713.0)
cos23_ct1_yearly -3043.0 1084.0 0.002 ** (-5304.0, -1003.0)
sin24_ct1_yearly -1033.0 1010.0 0.330 (-3013.0, 904.9)
cos24_ct1_yearly -523.5 1008.0 0.628 (-2509.0, 1478.0)
sin25_ct1_yearly -2544.0 1008.0 0.014 * (-4599.0, -678.5)
cos25_ct1_yearly 49.26 1046.0 0.954 (-1994.0, 2195.0)
cp0_2012_01_30_00 2950.0 8545.0 0.742 (-1.358e+04, 1.957e+04)
cp1_2013_01_14_00 -1.687e+04 8154.0 0.030 * (-3.097e+04, -531.2)
cp2_2015_02_23_00 -2182.0 5035.0 0.700 (-1.115e+04, 8328.0)
cp3_2017_10_02_00 -1.481e+04 2865.0 <2e-16 *** (-2.022e+04, -9512.0)
y_lag3 1.671e+04 5838.0 0.006 ** (5379.0, 2.786e+04)
y_lag4 553.5 5565.0 0.932 (-9860.0, 1.160e+04)
y_lag5 1.004e+04 4902.0 0.042 * (281.9, 1.952e+04)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9208, Adjusted R-squared: 0.9072
F-statistic: 67.654 on 67 and 397 DF, p-value: 1.110e-16
Model AIC: 11246.0, model BIC: 11532.0
WARNING: the condition number is large, 1.08e+05. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
, ================================ Model Summary =================================
Number of observations: 466, Number of features: 71
Method: Ridge regression
Number of nonzero features: 71
Regularization parameter: 0.02807
Residuals:
Min 1Q Median 3Q Max
-2.844e+04 -3736.0 448.2 4182.0 2.170e+04
Pred_col Estimate Std. Err Pr(>)_boot sig. code 95%CI
Intercept 1.167e+04 5565.0 0.046 * (3538.0, 2.571e+04)
ct1 1.785e+04 1.093e+04 0.086 . (-293.3, 3.981e+04)
ct1:C(mo... 13)))_2 2556.0 6402.0 0.670 (-1.079e+04, 1.435e+04)
ct1:C(mo... 13)))_3 9731.0 6422.0 0.136 (-2886.0, 2.125e+04)
ct1:C(mo... 13)))_4 3.373e+04 5573.0 <2e-16 *** (2.243e+04, 4.414e+04)
ct1:C(mo... 13)))_5 2.337e+04 6455.0 <2e-16 *** (1.064e+04, 3.548e+04)
ct1:C(mo... 13)))_6 3.428e+04 5506.0 <2e-16 *** (2.431e+04, 4.466e+04)
ct1:C(mo... 13)))_7 3.618e+04 5631.0 <2e-16 *** (2.486e+04, 4.738e+04)
ct1:C(mo... 13)))_8 3.622e+04 5782.0 <2e-16 *** (2.487e+04, 4.731e+04)
ct1:C(mo... 13)))_9 2.771e+04 6384.0 <2e-16 *** (1.562e+04, 3.990e+04)
ct1:C(mo...13)))_10 2.884e+04 5796.0 <2e-16 *** (1.760e+04, 4.018e+04)
ct1:C(mo...13)))_11 5801.0 6172.0 0.350 (-5390.0, 1.764e+04)
ct1:C(mo...13)))_12 3647.0 5259.0 0.474 (-7792.0, 1.305e+04)
ct_sqrt 4.639e+04 9771.0 <2e-16 *** (2.642e+04, 6.428e+04)
sin1_ct1_yearly -1.796e+04 1754.0 <2e-16 *** (-2.182e+04, -1.480e+04)
cos1_ct1_yearly 4496.0 1834.0 0.010 * (953.9, 7875.0)
sin2_ct1_yearly 4660.0 1643.0 0.006 ** (1569.0, 8150.0)
cos2_ct1_yearly 5335.0 1671.0 <2e-16 *** (2300.0, 9232.0)
sin3_ct1_yearly 1819.0 1656.0 0.244 (-1657.0, 4822.0)
cos3_ct1_yearly -139.4 1522.0 0.938 (-3023.0, 2592.0)
sin4_ct1_yearly -144.7 1516.0 0.920 (-3151.0, 2737.0)
cos4_ct1_yearly -1255.0 1444.0 0.372 (-4072.0, 1590.0)
sin5_ct1_yearly -2204.0 1417.0 0.114 (-5037.0, 462.1)
cos5_ct1_yearly -1147.0 1316.0 0.372 (-3630.0, 1484.0)
sin6_ct1_yearly -262.0 1446.0 0.844 (-2896.0, 2531.0)
cos6_ct1_yearly 1650.0 1109.0 0.136 (-391.1, 4078.0)
sin7_ct1_yearly -376.6 1320.0 0.776 (-2868.0, 2158.0)
cos7_ct1_yearly -280.6 1155.0 0.804 (-2623.0, 1925.0)
sin8_ct1_yearly -1639.0 1212.0 0.160 (-4277.0, 471.6)
cos8_ct1_yearly 933.8 1143.0 0.446 (-1060.0, 3215.0)
sin9_ct1_yearly -1453.0 1056.0 0.164 (-3578.0, 433.8)
cos9_ct1_yearly 2266.0 1182.0 0.052 . (288.6, 4839.0)
sin10_ct1_yearly 826.0 1061.0 0.440 (-1102.0, 2963.0)
cos10_ct1_yearly 1374.0 1058.0 0.174 (-638.7, 3493.0)
sin11_ct1_yearly 2407.0 1042.0 0.026 * (209.6, 4313.0)
cos11_ct1_yearly -786.0 1120.0 0.480 (-2894.0, 1425.0)
sin12_ct1_yearly -1690.0 1012.0 0.088 . (-3668.0, 216.3)
cos12_ct1_yearly -2128.0 1141.0 0.060 . (-4460.0, 3.572)
sin13_ct1_yearly -512.2 1064.0 0.644 (-2740.0, 1581.0)
cos13_ct1_yearly 277.1 1063.0 0.820 (-1783.0, 2479.0)
sin14_ct1_yearly -1453.0 998.0 0.142 (-3484.0, 434.0)
cos14_ct1_yearly 1027.0 997.3 0.290 (-753.2, 3133.0)
sin15_ct1_yearly -11.37 1081.0 0.986 (-1956.0, 2040.0)
cos15_ct1_yearly 1352.0 1028.0 0.204 (-667.1, 3351.0)
sin16_ct1_yearly -2544.0 1076.0 0.022 * (-4699.0, -458.5)
cos16_ct1_yearly -857.2 1090.0 0.430 (-2690.0, 1483.0)
sin17_ct1_yearly 743.2 1041.0 0.470 (-1043.0, 2937.0)
cos17_ct1_yearly 887.1 1044.0 0.380 (-1287.0, 2666.0)
sin18_ct1_yearly -719.1 1048.0 0.518 (-2876.0, 1116.0)
cos18_ct1_yearly 346.5 1032.0 0.718 (-1688.0, 2316.0)
sin19_ct1_yearly -632.6 1135.0 0.598 (-3032.0, 1270.0)
cos19_ct1_yearly -719.2 1025.0 0.500 (-2714.0, 1111.0)
sin20_ct1_yearly 1077.0 1038.0 0.290 (-701.9, 3323.0)
cos20_ct1_yearly 254.7 1021.0 0.782 (-1748.0, 2371.0)
sin21_ct1_yearly -907.9 951.5 0.336 (-2830.0, 998.6)
cos21_ct1_yearly -420.9 1099.0 0.724 (-2484.0, 1868.0)
sin22_ct1_yearly 968.5 997.7 0.316 (-1010.0, 3041.0)
cos22_ct1_yearly 213.5 1103.0 0.844 (-2082.0, 2229.0)
sin23_ct1_yearly 593.9 996.1 0.554 (-1327.0, 2475.0)
cos23_ct1_yearly -2913.0 1113.0 0.010 * (-5124.0, -811.3)
sin24_ct1_yearly -1020.0 1068.0 0.334 (-2952.0, 1099.0)
cos24_ct1_yearly -433.4 988.8 0.672 (-2255.0, 1475.0)
sin25_ct1_yearly -2270.0 1075.0 0.032 * (-4314.0, -58.01)
cos25_ct1_yearly -60.41 1019.0 0.954 (-2025.0, 1895.0)
cp0_2012_01_30_00 2899.0 1.177e+04 0.790 (-2.031e+04, 2.598e+04)
cp1_2013_01_14_00 -2.130e+04 1.030e+04 0.034 * (-4.148e+04, -2872.0)
cp2_2015_02_23_00 -1575.0 5410.0 0.778 (-1.189e+04, 8716.0)
cp3_2017_10_02_00 -1.748e+04 3119.0 <2e-16 *** (-2.333e+04, -1.179e+04)
y_lag4 4134.0 5344.0 0.450 (-6608.0, 1.446e+04)
y_lag5 1.226e+04 5326.0 0.024 * (615.8, 2.224e+04)
y_lag6 -2743.0 6133.0 0.642 (-1.373e+04, 8868.0)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9193, Adjusted R-squared: 0.9054
F-statistic: 65.708 on 68 and 396 DF, p-value: 1.110e-16
Model AIC: 11256.0, model BIC: 11545.0
WARNING: the condition number is large, 2.17e+05. This might indicate that there are strong multicollinearity or other numerical problems.
WARNING: the F-ratio and its p-value on regularized methods might be misleading, they are provided only for reference purposes.
]
================================= CV Results ==================================
0
rank_test_MAPE 1
mean_test_MAPE 10.34
split_test_MAPE (10.34, 15.4, 6.67, 11.39, 9.49, 8.78)
mean_train_MAPE 15.57
split_train_MAPE (15.68, 15.65, 15.63, 15.54, 15.5, 15.44)
mean_fit_time 8.93
mean_score_time 0.83
params []
=========================== Train/Test Evaluation =============================
train test
CORR 0.958442 0.927707
R2 0.918606 -1.01788
MSE 4.95366e+07 3.46699e+07
RMSE 7038.22 5888.12
MAE 5322.25 4775.37
MedAE 4027.17 4324.4
MAPE 15.3776 6.1112
MedAPE 8.10622 5.45003
sMAPE 7.19792 2.91722
Q80 2661.13 955.075
Q95 2661.13 238.769
Q99 2661.13 47.7537
OutsideTolerance1p 0.950216 1
OutsideTolerance2p 0.880952 0.75
OutsideTolerance3p 0.829004 0.5
OutsideTolerance4p 0.774892 0.5
OutsideTolerance5p 0.679654 0.5
Outside Tolerance (fraction) None None
R2_null_model_score None None
Prediction Band Width (%) 78.3014 33.4519
Prediction Band Coverage (fraction) 0.937229 1
Coverage: Lower Band 0.430736 1
Coverage: Upper Band 0.506494 0
Coverage Diff: Actual_Coverage - Intended_Coverage -0.0127706 0.05
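The CV results above report both a per-split MAPE and a mean MAPE. As a quick sanity check, the sketch below recomputes the means from the split values printed in the output. The variable names are illustrative and not part of the Greykite API, and it assumes `mean_test_MAPE` is the unweighted average over the six splits, which agrees with the printed values up to rounding.

```python
from statistics import mean

# Per-split MAPE values copied from the "CV Results" output above.
split_test_mape = (10.34, 15.4, 6.67, 11.39, 9.49, 8.78)
split_train_mape = (15.68, 15.65, 15.63, 15.54, 15.5, 15.44)

# Assumption: the reported means are plain (unweighted) averages over splits.
mean_test = mean(split_test_mape)
mean_train = mean(split_train_mape)

# Range of the test MAPE across splits, a rough view of CV variability.
test_spread = max(split_test_mape) - min(split_test_mape)

print(f"mean test MAPE:  {mean_test:.3f}")
print(f"mean train MAPE: {mean_train:.3f}")
print(f"test MAPE spread across splits: {test_spread:.2f}")
```

The test MAPE varies much more across splits than the train MAPE, so the mean alone hides a fair amount of split-to-split variation.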
Fit/backtest plot:
fig = result.backtest.plot()
plotly.io.show(fig)
Forecast plot:
fig = result.forecast.plot()
plotly.io.show(fig)
The components plot:
figs = result.forecast.plot_components()
for fig in figs:
    plotly.io.show(fig)
Total running time of the script: (2 minutes 42.065 seconds)