Skip to content

Best practices for monitoring

When setting up monitoring for a service, keep the following best practices in mind.

  • Use the right metric type—Most of the libraries available today offer various metric types. Choose the appropriate metric type for monitoring your system. Following are the types of metrics and their purposes.

    • GaugeGauge is a constant type of metric. After the metric is initialized, the metric value does not change unless you intentionally update it.

    • TimerTimer measures the time taken to complete a task.

    • CounterCounter counts the number of occurrences of a particular event.

For more information about these metric types, see Data Types.

  • Avoid over-monitoring—Monitoring can be a significant engineering endeavor. Therefore, be sure not to spend too much time and resources on monitoring services, yet make sure all important metrics are captured.

  • Prevent alert fatigue—Set alerts for metrics that are important and actionable. If you receive too many non-critical alerts, you might start ignoring alert notifications over time. As a result, critical alerts might get overlooked.

  • Have a runbook for alerts—For every alert, make sure you have a document explaining what actions and checks need to be performed when the alert fires. This enables any engineer on the team to handle the alert and take necessary actions, without any help from others.