Tune your first anomaly detection model

This is a basic tutorial for creating and tuning a Greykite AD (Anomaly Detection) model. It is intended for users who are new to Greykite AD and want to get started quickly.

Greykite AD is a forecast-based AD method: the forecast serves as the baseline, and a data point is predicted as anomalous if it falls outside the forecasted confidence intervals. The Greykite AD algorithm gives you the flexibility to better model and control these confidence intervals. Because the method is forecast-based, its AD performance inherently depends on an accurate forecasting model.
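As a rough illustration of this idea (not Greykite's implementation), a point can be flagged whenever its observed value lies outside the forecast interval. The function name and the toy numbers below are invented for this sketch.

```python
def flag_outside_interval(actuals, lowers, uppers):
    """Returns True for every point that lies outside [lower, upper]."""
    return [
        not (lower <= actual <= upper)
        for actual, lower, upper in zip(actuals, lowers, uppers)
    ]


# Toy values, invented for illustration only.
actuals = [50.0, 52.0, 71.0, 49.0, 30.0]
lowers = [45.0, 45.0, 45.0, 45.0, 45.0]
uppers = [55.0, 55.0, 55.0, 55.0, 55.0]

is_anomaly = flag_outside_interval(actuals, lowers, uppers)
print(is_anomaly)
```

Widening or narrowing the interval directly changes which points are flagged, which is why the rest of this tutorial focuses on how the intervals are tuned.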

Throughout this tutorial, we assume that you are familiar with tuning a Greykite forecast model. If you are not, please refer to the tutorial "Tune your first forecast model".

The anomaly detection config (ADConfig) allows users to divide the time series into segments and learn a different volatility model for each segment. Users specify the volatility features that define these segments. It also allows users to specify the objective function, constraints and parameter space used to optimize the confidence intervals.

These features include:

Volatility Features:

This allows users to specify the features used to segment the time series, so that a different volatility model is learned for each segment. For example, for a daily time series, the volatility features ["dow"] learn a different volatility model for each day of the week. Several candidate feature sets can be passed as a list of lists. For example, [["dow"], ["is_weekend"]], as used later in this tutorial, lets the detector compare a model with a different volatility for each day of the week against a model with a different volatility for weekends.
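To make the notion of a per-segment volatility model concrete, here is a minimal sketch that groups forecast residuals by a segment feature and computes one standard deviation per segment. It only illustrates the concept; the helper names and residual values are made up, and Greykite's actual volatility model may be computed differently.

```python
import datetime
import statistics
from collections import defaultdict


def volatility_by_segment(dates, residuals, segment_func):
    """Groups residuals by segment and returns the standard deviation per segment."""
    groups = defaultdict(list)
    for date, residual in zip(dates, residuals):
        groups[segment_func(date)].append(residual)
    return {segment: statistics.pstdev(values) for segment, values in groups.items()}


def is_weekend(date):
    # `date.weekday()` returns 0 for Monday and 6 for Sunday.
    return date.weekday() >= 5


# Two weeks of made-up forecast residuals with larger swings on weekends.
start = datetime.date(2020, 1, 6)  # a Monday
dates = [start + datetime.timedelta(days=i) for i in range(14)]
residuals = [1, -1, 1, -1, 1, 4, -4, -1, 1, -1, 1, -1, -4, 4]

volatility = volatility_by_segment(dates, residuals, is_weekend)
print(volatility)
```

With these toy residuals the weekend segment gets a larger volatility than the weekday segment, so an interval built from it would be wider on weekends.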

Coverage Grid:

This allows users to specify a grid of confidence interval coverages to search over. The coverage_grid is specified as a list of floats between 0 and 1. For example, if the coverage_grid is specified as [0.5, 0.95], the algorithm optimizes over confidence intervals with coverage 0.5 and 0.95.
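The effect of the coverage value on the interval width can be sketched as follows. This assumes Gaussian forecast errors with a known standard deviation, which is an assumption of this example rather than a statement about how Greykite builds its intervals.

```python
from statistics import NormalDist


def interval_half_width(coverage, sigma):
    """Half-width of a symmetric interval with the given coverage,
    assuming Gaussian errors with standard deviation `sigma`."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    return z * sigma


coverage_grid = [0.5, 0.95]
half_widths = {c: interval_half_width(c, sigma=2.0) for c in coverage_grid}
for coverage, width in half_widths.items():
    print(f"coverage={coverage}: forecast +/- {width:.2f}")
```

A higher coverage gives a wider interval and therefore fewer flagged points, so searching over the grid trades off alerts against missed anomalies.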

Target Anomaly Percentage:

This allows users to specify the target_anomaly_percent, the percentage of data points that should be predicted as anomalous. For example, the label-free example later in this tutorial sets target_anomaly_percent=10.0, so the anomaly score threshold is optimized such that about 10% of the data points are predicted as anomalous.

Target Precision:

This allows users to specify the target_precision, which is specified as a float between 0 and 1. For example, if the target_precision is specified as 0.9, the anomaly score threshold is optimized such that at least 90% of the predicted anomalies are true anomalies. This is useful when the user has a limited budget to investigate the anomalies.

Target Recall:

This allows users to specify the target_recall, which is specified as a float between 0 and 1. For example, if the target_recall is specified as 0.9, the anomaly score threshold is optimized such that at least 90% of the true anomalies are predicted as anomalies. This is useful when the user wants to detect most of the anomalies.
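For reference, precision, recall and F1 for the anomaly (True) label can be computed by hand as in the sketch below. The zero-division convention (returning 0.0) and the toy labels are choices made for this example only.

```python
def precision_recall_f1(y_true, y_pred):
    """Computes precision, recall and F1 for the True (anomaly) label."""
    true_pos = sum(t and p for t, p in zip(y_true, y_pred))
    pred_pos = sum(y_pred)
    actual_pos = sum(y_true)
    precision = true_pos / pred_pos if pred_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels: 3 true anomalies, 3 predicted anomalies, 2 of them correct.
y_true = [True, True, False, False, True, False]
y_pred = [True, False, False, True, True, False]
toy_precision, toy_recall, toy_f1 = precision_recall_f1(y_true, y_pred)
print(toy_precision, toy_recall, toy_f1)
```

A target_precision of 0.9 would reject this toy detector (precision is 2/3), while a target_recall of 0.6 would accept it.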

import datetime

import numpy as np
import pandas as pd
import plotly
import plotly.express as px
from greykite.common.constants import ANOMALY_COL
from greykite.common.constants import TIME_COL
from greykite.common.constants import VALUE_COL
from greykite.common.testing_utils import generate_df_for_tests
from greykite.common.testing_utils_anomalies import contaminate_df_with_anomalies
from greykite.common.viz.timeseries_annotate import plot_lines_markers
from greykite.detection.common.ad_evaluation import f1_score
from greykite.detection.common.ad_evaluation import precision_score
from greykite.detection.common.ad_evaluation import recall_score
from greykite.detection.detector.ad_utils import partial_return
from greykite.detection.detector.config import ADConfig
from greykite.detection.detector.data import DetectorData
from greykite.detection.detector.greykite import GreykiteDetector
from greykite.detection.detector.reward import Reward
from greykite.framework.templates.autogen.forecast_config import EvaluationPeriodParam
from greykite.framework.templates.autogen.forecast_config import ForecastConfig
from greykite.framework.templates.autogen.forecast_config import MetadataParam
from greykite.framework.templates.autogen.forecast_config import ModelComponentsParam

# Evaluation metrics used in the tests.
# F1 score for the True label:
f1 = partial_return(f1_score, True)
# Precision score, for the True label:
precision = partial_return(precision_score, True)
# Recall score for the True label:
recall = partial_return(recall_score, True)

Generate a dataset with anomalies

Let us first generate a dataset with ground truth anomaly labels.

df = generate_df_for_tests(
    freq="D",
    train_start_date=datetime.datetime(2020, 1, 1),
    intercept=50,
    train_frac=0.99,
    periods=200)["df"]

# Specifies anomaly locations.
anomaly_block_list = [
    np.arange(10, 15),
    np.arange(33, 35),
    np.arange(60, 65),
    np.arange(82, 85),
    np.arange(94, 98),
    np.arange(100, 105),
    np.arange(111, 113),
    np.arange(125, 130),
    np.arange(160, 163),
    np.arange(185, 190),
    np.arange(198, 200)]

# Contaminates `df` with anomalies at the specified locations,
# via `anomaly_block_list`.
# If original value is y, the anomalous value is: (1 +/- delta)*y.
df = contaminate_df_with_anomalies(
    df=df,
    anomaly_block_list=anomaly_block_list,
    delta_range_lower=0.25,
    delta_range_upper=0.5,
    value_col=VALUE_COL,
    min_admissible_value=None,
    max_admissible_value=None)

fig = plot_lines_markers(
    df=df,
    x_col=TIME_COL,
    line_cols=["contaminated_y", "y"],
    line_colors=["red", "blue"],
    title="Generation of daily anomalous data")
fig.update_yaxes()
plotly.io.show(fig)

The anomalies are generated by scaling the original value y to (1 +/- delta)*y, where delta is a random value between 0.25 and 0.5 in this example. The plot above shows the original data (y) in blue and the contaminated data (contaminated_y) in red. We will drop the original data (y) and use the contaminated data (contaminated_y) as the input to the anomaly detector.
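The contamination rule can be pictured with the following stand-alone sketch. It is not the implementation of `contaminate_df_with_anomalies`: the uniform draw of delta, the random sign and the function name are assumptions made for illustration.

```python
import random


def contaminate(values, anomaly_indices, delta_lower, delta_upper, seed=0):
    """Returns a copy of `values` where each listed index is scaled by
    (1 + delta) or (1 - delta), with delta drawn uniformly from the range."""
    rng = random.Random(seed)
    contaminated = list(values)
    for index in anomaly_indices:
        delta = rng.uniform(delta_lower, delta_upper)
        sign = rng.choice([-1, 1])
        contaminated[index] = (1 + sign * delta) * values[index]
    return contaminated


values = [50.0] * 10
anomaly_indices = [3, 4, 5]
contaminated = contaminate(
    values, anomaly_indices, delta_lower=0.25, delta_upper=0.5)
print(contaminated)
```

Every contaminated point moves away from its original value by 25% to 50%, while all other points are left untouched.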

df = df.drop(columns=[VALUE_COL]).rename(
    columns={"contaminated_y": VALUE_COL})
df[ANOMALY_COL] = (df[ANOMALY_COL] == 1)

train_size = 100
df_train = df[:train_size].reset_index(drop=True)
df_test = df[train_size:].reset_index(drop=True)

Structure of a Greykite AD model

Greykite AD takes a forecast_config and an ADConfig and builds a detector that uses the forecast as the baseline. The fit consists of the following stages:

  • Fit a forecast model using the given forecast_config.

  • Fit a volatility model using the given ADConfig. This builds a conf_interval model that optimizes over the parameters specified in the ADConfig.
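The two stages can be sketched with a toy detector that uses only the standard library. Everything here is a simplification chosen for illustration: the "forecast" is a constant median baseline, the interval assumes Gaussian residuals, and the data and function names are invented. It is not how GreykiteDetector is implemented.

```python
from statistics import NormalDist, median, pstdev


def f1_for_true_label(y_true, y_pred):
    """F1 score for the True label, returning 0.0 when nothing is matched."""
    true_pos = sum(t and p for t, p in zip(y_true, y_pred))
    if true_pos == 0:
        return 0.0
    precision = true_pos / sum(y_pred)
    recall = true_pos / sum(y_true)
    return 2 * precision * recall / (precision + recall)


def fit_toy_detector(values, labels, coverage_grid):
    # Stage 1: fit a (deliberately trivial) forecast: a constant baseline.
    baseline = median(values)
    residuals = [value - baseline for value in values]
    sigma = pstdev(residuals)

    # Stage 2: search the coverage grid and keep the interval with the best reward.
    results = []
    for coverage in coverage_grid:
        half_width = NormalDist().inv_cdf(0.5 + coverage / 2.0) * sigma
        y_pred = [abs(residual) > half_width for residual in residuals]
        results.append({
            "coverage": coverage,
            "obj_value": f1_for_true_label(labels, y_pred)})
    best = max(results, key=lambda row: row["obj_value"])
    return baseline, best, results


# Made-up series: two labeled anomalies (70 and 30) and one unlabeled bump (57).
values = [50, 51, 49, 57, 70, 50, 52, 48, 30, 50, 51, 49]
labels = [v in (70, 30) for v in values]
baseline, best, results = fit_toy_detector(values, labels, [0.5, 0.9, 0.99])
print(best)
```

In this toy run the narrowest interval also flags the harmless bump, the widest one misses both anomalies, and the middle coverage wins.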

Any of the available forecast model templates (see Choose a Model) work in conjunction with the Greykite AD. In this example, we choose the “SILVERKITE_EMPTY” template.

metadata = MetadataParam(
    time_col=TIME_COL,
    value_col=VALUE_COL,
    train_end_date=None,
    anomaly_info=None)

evaluation_period = EvaluationPeriodParam(
    test_horizon=0,
    cv_max_splits=0)

model_components = ModelComponentsParam(
    autoregression={
        "autoreg_dict": {
            "lag_dict": {"orders": [7]},
            "agg_lag_dict": None}},
    events={
        "auto_holiday": False,
        "holiday_lookup_countries": ["US"],
        "holiday_pre_num_days": 2,
        "holiday_post_num_days": 2,
        "daily_event_df_dict": None},
    custom={
        "extra_pred_cols": ["dow"],
        "min_admissible_value": 0,
        "normalize_method": "zero_to_one"})

forecast_config = ForecastConfig(
    model_template="SILVERKITE_EMPTY",
    metadata_param=metadata,
    coverage=None,
    evaluation_period_param=evaluation_period,
    forecast_horizon=1,
    model_components_param=model_components)
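In the configuration above, "orders": [7] requests the value observed seven days earlier as a regressor; it shows up as y_lag7 in the forecast model summary later in this tutorial. The sketch below shows what such a lag feature looks like on a toy weekly pattern. The helper name and the data are invented for this example.

```python
def add_lag_feature(values, order):
    """Returns (lagged_value, value) pairs; the first `order` points have no lag."""
    return [
        (values[i - order] if i >= order else None, values[i])
        for i in range(len(values))
    ]


# Two identical made-up weeks, so the lag-7 value equals the current value.
daily_values = [10, 11, 12, 13, 14, 20, 21, 10, 11, 12, 13, 14, 20, 21]
pairs = add_lag_feature(daily_values, order=7)
for lagged, current in pairs[6:9]:
    print(lagged, current)
```

With forecast_horizon=1 and daily data, a lag of 7 is always available at prediction time, which makes it a natural choice for series with a weekly pattern.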

The Greykite AD algorithm works with or without anomaly labels for training. The reward function for the AD algorithm is updated accordingly. When no anomaly labels are provided, the AD algorithm uses target_anomaly_percent to determine the anomaly score threshold. If anomaly labels are provided, the AD algorithm uses precision, recall or f1 to determine the anomaly score threshold.

Anomaly labels are available

Let us first consider the case where anomaly labels are available for training. You can pass the anomaly labels in a few different ways:

  • As the ANOMALY_COL column in the training dataframe (train_data.df).

  • As a vector of anomaly labels in the training data (train_data.y_true).

  • As a separate dataframe in the training data (train_data.anomaly_df).

  • As a separate dataframe in the metadata_param in the forecast_config.

The detector combines the anomaly labels from all these sources and stores them under the anomaly_df attribute in the detector.
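The text does not spell out how the sources are merged. One plausible reading, shown here purely as an assumption, is a union: a point counts as anomalous if any source marks it. The sketch also assumes that a separate anomaly table lists (start, end) intervals, which is a hypothetical layout chosen for this example.

```python
import datetime


def labels_from_intervals(timestamps, intervals):
    """Returns True for each timestamp that falls inside any (start, end) interval."""
    return [
        any(start <= ts <= end for start, end in intervals)
        for ts in timestamps
    ]


def combine_labels(*label_vectors):
    """Element-wise OR: a point is anomalous if any source marks it."""
    return [any(flags) for flags in zip(*label_vectors)]


timestamps = [
    datetime.date(2020, 1, 1) + datetime.timedelta(days=i) for i in range(6)]
# Labels given as a boolean column (toy values).
column_labels = [False, True, False, False, False, False]
# Labels given as intervals (toy values).
interval_labels = labels_from_intervals(
    timestamps,
    intervals=[(datetime.date(2020, 1, 4), datetime.date(2020, 1, 5))])
combined = combine_labels(column_labels, interval_labels)
print(combined)
```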

In this example, the anomaly labels are passed as ANOMALY_COL column in the training dataframe. When anomalies are available for training, you can use precision, recall, f1 or a combination of these metrics to determine the anomaly score threshold. In this example, we will use f1.

ad_config = ADConfig(
    volatility_features_list=[["dow"], ["is_weekend"]],
    coverage_grid=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.9, 0.95, 0.99, 0.999],
    variance_scaling=True)

def f1_reward(data):
    return f1(
        y_true=data.y_true,
        y_pred=data.y_pred)
reward = Reward(f1_reward)
train_data = DetectorData(df=df_train)

# Initializes the detector.
detector = GreykiteDetector(
    forecast_config=forecast_config,
    ad_config=ad_config,
    reward=reward)
# Fits the model
detector.fit(data=train_data)

# Checks parameter grid.
param_obj_list = detector.fit_info["param_obj_list"]
param_eval_df = pd.DataFrame.from_records(param_obj_list)
param_eval_df["volatility_features"] = param_eval_df["volatility_features"].map(str)
fig = px.line(
    param_eval_df,
    x="coverage",
    y="obj_value",
    color="volatility_features",
    title="'GreykiteDetector' result of parameter search: reward=f1")
plotly.io.show(fig)
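The parameter search plotted above boils down to picking the record with the highest objective value. The sketch below does this on hand-written records that mimic the three fields used in the plot (coverage, volatility_features, obj_value). The numbers are made up and are not results from the fitted detector.

```python
# Made-up records in the same shape as the fields plotted above.
toy_param_obj_list = [
    {"coverage": 0.9, "volatility_features": ["dow"], "obj_value": 0.62},
    {"coverage": 0.99, "volatility_features": ["dow"], "obj_value": 0.85},
    {"coverage": 0.9, "volatility_features": ["is_weekend"], "obj_value": 0.71},
    {"coverage": 0.99, "volatility_features": ["is_weekend"], "obj_value": 0.93},
]

# Overall best parameter combination.
best = max(toy_param_obj_list, key=lambda row: row["obj_value"])

# Best coverage for each candidate set of volatility features.
best_by_features = {}
for row in toy_param_obj_list:
    key = str(row["volatility_features"])
    if key not in best_by_features or row["obj_value"] > best_by_features[key]["obj_value"]:
        best_by_features[key] = row

print(best)
print({key: row["coverage"] for key, row in best_by_features.items()})
```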

Plots the training results.

fig = detector.plot(title="'GreykiteDetector' prediction: reward=f1", phase="train")
plotly.io.show(fig)

Let us run the model on the test data and plot the results. The plot shows the actual data in orange, the forecast in blue, and the confidence intervals in grey. The predicted anomalies are marked in red.

test_data = DetectorData(
    df=df_test,
    y_true=df_test[ANOMALY_COL])
test_data = detector.predict(test_data)
fig = detector.plot(title="'GreykiteDetector' prediction: reward=f1")
plotly.io.show(fig)

We can see from the plot that our model is able to detect all the anomalies. Finally, let’s check the evaluation metrics via the summary method. You can see that the model achieved a high precision and recall value.

summary = detector.summary()
print(summary)

Out:

======================= Anomaly Detection Model Summary ========================

Number of observations: 100
Model: GreykiteDetector
Number of detected anomalies: 19

Average Anomaly Duration: 3 days 19:12:00
Minimum Anomaly Duration: 2 days 00:00:00
Maximum Anomaly Duration: 5 days 00:00:00

Alert Rate(%): 19.0,   Anomaly Rate(%): 19.0
Precision: 1.0,   Recall: 1.0,   F1 Score: 1.0

Optimal Objective Value: 1.0
Optimal Parameters: {'coverage': 0.99, 'volatility_features': ['is_weekend']}

============================ Forecast Model Summary ============================

Number of observations: 100,   Number of features: 8
Method: Ordinary least squares
Number of nonzero features: 8

Residuals:
         Min           1Q       Median           3Q          Max
      -6.806        -1.93     -0.03291        1.802        6.053

      Pred_col Estimate Std. Err t value Pr(>|t|) sig. code             95%CI
     Intercept    50.18    1.268   39.58   <2e-16       ***     (47.66, 52.7)
  events_Other   -0.449    1.778 -0.2525    0.801              (-3.98, 3.082)
events_Other-1     2.95    2.118   1.393    0.167             (-1.256, 7.156)
events_Other-2    1.435    2.114  0.6788    0.499             (-2.764, 5.635)
events_Other+1    2.881    1.731   1.664    0.099         .   (-0.5574, 6.32)
events_Other+2    3.187    1.698   1.877    0.064         .  (-0.1854, 6.559)
           dow   -2.466    1.021  -2.414    0.018         * (-4.495, -0.4371)
        y_lag7    7.152    1.744   4.102 8.84e-05       ***    (3.689, 10.62)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multiple R-squared: 0.3195,   Adjusted R-squared: 0.2678
F-statistic: 6.1719 on 7 and 92 DF,   p-value: 6.244e-06
Model AIC: 680.21,   model BIC: 701.06

Examples of other reward functions

In this section we provide examples of other reward functions that can be used. The Reward class gives users the flexibility to specify their own reward functions. This class enables two powerful mechanisms:

  • taking a simple reward_func and constructing a penalized version of it

  • starting from existing objectives and building more complex ones by adding, multiplying or dividing them, or by applying the same operations with numbers.

These two mechanisms together support robust multi-objective problems. Some examples are provided below. All these reward functions can be passed to the detector in the same way as before.

# Builds precision as objective function.
def precision_func(data):
    return precision(
        y_true=data.y_true,
        y_pred=data.y_pred)
precision_obj = Reward(precision_func)

# Builds recall as objective function.
def recall_func(data):
    return recall(
        y_true=data.y_true,
        y_pred=data.y_pred)
recall_obj = Reward(recall_func)

# Builds sum of precision and recall objective function.
additive_obj = precision_obj + recall_obj

The class also allows for constrained optimization. For example, in the context of anomaly detection, users can optimize recall subject to precision being at least 80 percent. Let's see how this can be done.

# First, let's build a penalized precision objective function that
# penalizes precision values under 0.8 by `penalty == -inf`.
penalized_precision_obj = Reward(
    precision_func,
    min_unpenalized=0.8,
    penalty=-np.inf)

# The constraint can also be passed via the ADConfig.
ad_config = ADConfig(
    target_precision=0.8)

# Builds a combined objective function that optimizes recall
# subject to precision being at least 80 percent.
combined_obj = recall_obj + penalized_precision_obj

Users can also combine existing objectives to build more complex ones. For example, F1 can be easily expressed in terms of the precision and recall objectives.

f1_obj = (2 * recall_obj * precision_obj) / (recall_obj + precision_obj)
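To see how this kind of objective algebra can work, here is a toy stand-in built with operator overloading. ToyReward is a hypothetical class written for this sketch and is not Greykite's Reward. In particular, replacing a value below min_unpenalized with the penalty is an assumption; the real class may apply the penalty differently.

```python
import math
import operator


class ToyReward:
    """Toy composable objective; NOT Greykite's `Reward` class."""

    def __init__(self, func, min_unpenalized=-math.inf, penalty=None):
        def penalized(data):
            value = func(data)
            # Assumption of this sketch: the penalty replaces the value.
            if penalty is not None and value < min_unpenalized:
                return penalty
            return value
        self.func = penalized

    def __call__(self, data):
        return self.func(data)

    def _combine(self, other, op):
        # Numbers are treated as constant objectives.
        if isinstance(other, ToyReward):
            other_func = other.func
        else:
            other_func = lambda data: other
        return ToyReward(lambda data: op(self.func(data), other_func(data)))

    def __add__(self, other):
        return self._combine(other, operator.add)

    def __mul__(self, other):
        return self._combine(other, operator.mul)

    def __truediv__(self, other):
        return self._combine(other, operator.truediv)

    # Addition and multiplication commute, so `2 * obj` can reuse `obj * 2`.
    __radd__ = __add__
    __rmul__ = __mul__


# For brevity, `data` is a dict of precomputed metrics (made-up values).
toy_precision_obj = ToyReward(lambda data: data["precision"])
toy_recall_obj = ToyReward(lambda data: data["recall"])
toy_penalized_precision = ToyReward(
    lambda data: data["precision"], min_unpenalized=0.8, penalty=-math.inf)

toy_combined_obj = toy_recall_obj + toy_penalized_precision
toy_f1_obj = (2 * toy_recall_obj * toy_precision_obj) / (
    toy_recall_obj + toy_precision_obj)

good = {"precision": 0.9, "recall": 0.6}
bad = {"precision": 0.5, "recall": 1.0}
print(toy_combined_obj(good), toy_combined_obj(bad), toy_f1_obj(good))
```

A candidate that violates the precision constraint receives negative infinity and can never win the search, however high its recall. Note that the toy F1 objective divides by zero when precision and recall are both zero; a real implementation would need to guard against that.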

Anomaly labels are NOT available

In this example, we will use an AD config which uses target_anomaly_percent to determine the anomaly score threshold. If not specified, the AD algorithm uses a default target_anomaly_percent of 10%.

ad_config = ADConfig(
    volatility_features_list=[["dow"], ["is_weekend"]],
    coverage_grid=[0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.9, 0.95, 0.99, 0.999],
    target_anomaly_percent=10.0,
    variance_scaling=True)

detector = GreykiteDetector(
    forecast_config=forecast_config,
    ad_config=ad_config,
    reward=None)
detector.fit(data=train_data)

# Checks parameter grid.
param_obj_list = detector.fit_info["param_obj_list"]
param_eval_df = pd.DataFrame.from_records(param_obj_list)
param_eval_df["volatility_features"] = param_eval_df["volatility_features"].map(str)
fig = px.line(
    param_eval_df,
    x="coverage",
    y="obj_value",
    color="volatility_features",
    title="'GreykiteDetector' result of param search: reward=anomaly_percent")
plotly.io.show(fig)
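Without labels, the search needs an objective that does not depend on y_true. One simple way to picture such an objective is the negative distance between the percentage of flagged points and the target percentage. This is an assumption made for this sketch, not a description of Greykite's internals, and the predictions below are made up.

```python
def anomaly_percent_reward(y_pred, target_percent):
    """Higher is better: 0 when the flagged percentage hits the target exactly."""
    flagged_percent = 100.0 * sum(y_pred) / len(y_pred)
    return -abs(flagged_percent - target_percent)


# Made-up predictions on 20 points for three candidate coverages.
candidates = {
    0.5: [True] * 8 + [False] * 12,    # 40% flagged
    0.95: [True] * 2 + [False] * 18,   # 10% flagged
    0.999: [False] * 20,               # 0% flagged
}
scores = {
    coverage: anomaly_percent_reward(y_pred, target_percent=10.0)
    for coverage, y_pred in candidates.items()}
best_coverage = max(scores, key=scores.get)
print(scores, best_coverage)
```

Under this kind of objective the best achievable value is zero, and every candidate that flags too many or too few points scores below it.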

Plots the training results.

fig = detector.plot(title="'GreykiteDetector' prediction: reward=anomaly_percent", phase="train")
plotly.io.show(fig)

Let us run the model on the test data and plot the results. The plot shows the actual data in orange, the forecast in blue, and the confidence intervals in grey. The predicted anomalies are marked in red.

test_data = DetectorData(
    df=df_test,
    y_true=df_test[ANOMALY_COL])
test_data = detector.predict(test_data)
fig = detector.plot(title="'GreykiteDetector' prediction: reward=anomaly_percent")
plotly.io.show(fig)

We can see from the plot that our model is able to detect all the anomalies. Finally, let’s check the evaluation metrics via the summary method. You can see that the model achieved a high precision and recall value.

summary = detector.summary()
print(summary)

Out:

======================= Anomaly Detection Model Summary ========================

Number of observations: 100
Model: GreykiteDetector
Number of detected anomalies: 19

Average Anomaly Duration: 3 days 19:12:00
Minimum Anomaly Duration: 2 days 00:00:00
Maximum Anomaly Duration: 5 days 00:00:00

Alert Rate(%): 19.0,   Anomaly Rate(%): 19.0
Precision: 1.0,   Recall: 1.0,   F1 Score: 1.0

Optimal Objective Value: -1.09
Optimal Parameters: {'coverage': 0.99, 'volatility_features': ['is_weekend']}

============================ Forecast Model Summary ============================

Number of observations: 100,   Number of features: 8
Method: Ordinary least squares
Number of nonzero features: 8

Residuals:
         Min           1Q       Median           3Q          Max
      -6.806        -1.93     -0.03291        1.802        6.053

      Pred_col Estimate Std. Err t value Pr(>|t|) sig. code             95%CI
     Intercept    50.18    1.268   39.58   <2e-16       ***     (47.66, 52.7)
  events_Other   -0.449    1.778 -0.2525    0.801              (-3.98, 3.082)
events_Other-1     2.95    2.118   1.393    0.167             (-1.256, 7.156)
events_Other-2    1.435    2.114  0.6788    0.499             (-2.764, 5.635)
events_Other+1    2.881    1.731   1.664    0.099         .   (-0.5574, 6.32)
events_Other+2    3.187    1.698   1.877    0.064         .  (-0.1854, 6.559)
           dow   -2.466    1.021  -2.414    0.018         * (-4.495, -0.4371)
        y_lag7    7.152    1.744   4.102 8.84e-05       ***    (3.689, 10.62)
Signif. Code: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multiple R-squared: 0.3195,   Adjusted R-squared: 0.2678
F-statistic: 6.1719 on 7 and 92 DF,   p-value: 6.244e-06
Model AIC: 680.21,   model BIC: 701.06
