Metrics

Prometheus

Counters

  • Simple, increment only metrics that keep track of the number of occurrences of a specific event or activity
  • Examples:
    • API requests
    • error occurrences
    • system restarts
  • When to use:
    • record value that only goes up
    • want to calculate the rate of increase
  • Use case:
    • Capacity Planning

Gauges

  • Provide snapshot of the particular value at a specific point in time
  • Examples:
    • CPU Usage
    • Memory Usage
    • Number of active connections
  • When to use:
    • Record value that goes up and down
    • Don’t need to calculate rate
  • Use case:
    • Performance Optimization

Histograms

  • Measure distribution and frequency of time durations for specific events
  • Examples:
    • Request Duration
  • When to use:
    • Want to take many measurements of the values and later calculate average or percentiles
    • Not bothered about the exact values and are happy with an approximation
    • When you know the range of values upfront, so you can use the default bucket definitions or define your own
  • Use case:
    • User experience enhancement

Timers

Summary

Google’s 4 Golden Signals

  • Errors
  • Latency
  • Throughput
  • Saturation