Prometheus Metrics Types: Counter, Gauge, Histogram, and Summary
Introduction
Prometheus is a powerful monitoring system that collects and stores time-series data. It provides different metric types to effectively monitor various aspects of system performance. Understanding these metric types is essential for setting up accurate monitoring, analysis, and alerting.
This article explains the four primary metric types in Prometheus:
Counter – Tracks ever-increasing values.
Gauge – Tracks fluctuating values.
Histogram – Counts observations in buckets to capture a distribution.
Summary – Similar to a histogram, but computes quantiles in the client application.
1. Counter Metrics
What is a Counter?
A Counter is a metric that only increases or resets to zero. It is used to track occurrences of events over time, such as the number of HTTP requests, errors, or completed tasks.
Key Characteristics
✅ Monotonic – The value never decreases.
✅ Resets to zero when the application restarts.
✅ Used to track event occurrences over time.
Example Use Cases
Number of HTTP requests received.
Number of failed database queries.
Total bytes sent over a network.
Example in Prometheus Format
http_requests_total{method="GET", status="200"} 1500
http_requests_total{method="POST", status="500"} 200
PromQL Query Example
Calculate the per-second rate of HTTP requests over the past 5 minutes:
rate(http_requests_total[5m])
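To make the reset behaviour concrete, here is a minimal Python sketch of how a per-second rate can be derived from raw counter samples. The function name `simple_rate` and the sample values are made up for illustration, and this is a simplification: Prometheus's real `rate()` also extrapolates to the edges of the time window, which is left out here.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Approximate per-second rate of a counter across the given samples.

    A drop between two consecutive samples is treated as a counter reset:
    the counter restarted from zero and climbed to the new value.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")

    total_increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            total_increase += current - previous
        else:
            # Counter reset (e.g. application restart).
            total_increase += current

    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time range")
    return total_increase / elapsed


# Hypothetical scrapes every 60 s; the application restarts before t=180.
samples = [(0, 1400.0), (60, 1460.0), (120, 1500.0), (180, 30.0), (240, 90.0)]
print(simple_rate(samples))  # 190 requests over 240 s, roughly 0.79 req/s
```

Because the drop from 1500 to 30 is counted as 30 new requests rather than a negative change, the restart does not distort the result.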
2. Gauge Metrics
What is a Gauge?
A Gauge represents a metric that can increase or decrease over time. It is useful for monitoring values that change continuously, such as temperature, memory usage, or queue length.
Key Characteristics
✅ Fluctuates – Can go up or down.
✅ Represents instantaneous values rather than accumulated counts.
✅ Useful for monitoring real-time resource utilization.
Example Use Cases
Current CPU usage percentage.
Memory usage of a container.
Number of active connections to a server.
Example in Prometheus Format
node_memory_available_bytes 104857600
PromQL Query Example
Find the average available memory over the past 5 minutes:
avg_over_time(node_memory_available_bytes[5m])
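Averaging a gauge over a time window simply means taking the mean of the samples that fall inside that window, which is what PromQL's `avg_over_time()` does (plain `avg()` aggregates across series at a single instant instead). The sketch below illustrates the idea in Python; the function signature and sample values are assumptions for illustration, and the exact window-boundary handling in Prometheus may differ between versions.

```python
from statistics import fmean
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, gauge value)


def avg_over_time(samples: List[Sample], now: float, window_seconds: float) -> Optional[float]:
    """Mean of the gauge samples whose timestamps fall within the window."""
    in_window = [value for ts, value in samples if now - window_seconds < ts <= now]
    if not in_window:
        return None  # no samples in the window, so there is no result
    return fmean(in_window)


# Hypothetical available-memory readings in MiB, scraped every 100 s.
readings = [(0, 100.0), (100, 120.0), (200, 90.0), (300, 110.0), (400, 100.0)]
print(avg_over_time(readings, now=400, window_seconds=300))  # mean of 90, 110, 100
```

Unlike a counter, no reset handling is needed: each gauge sample is already a complete reading of the current value.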
3. Histogram Metrics
What is a Histogram?
A Histogram collects observations and categorizes them into configurable buckets based on value ranges. It is mainly used for measuring distributions such as request durations or response sizes.
Key Characteristics
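✅ Counts observations in pre-defined buckets; the counts are cumulative, so each le bucket includes every observation from the smaller buckets.
✅ Records the total count and the sum of all observed values.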
✅ Useful for latency, response time, and request size monitoring.
Example Use Cases
Request duration in an API.
Distribution of response sizes.
Load times of database queries.
Example in Prometheus Format
http_request_duration_seconds_bucket{le="0.1"} 125
http_request_duration_seconds_bucket{le="0.2"} 300
http_request_duration_seconds_bucket{le="0.5"} 800
http_request_duration_seconds_bucket{le="+Inf"} 1200
http_request_duration_seconds_count 1200
http_request_duration_seconds_sum 450
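The bucket lines above are cumulative: an observation of 0.15 seconds is counted in the `le="0.2"` bucket and in every larger bucket too. Here is a small Python sketch that turns raw observations into that layout. The function `render_histogram`, the metric name `demo_request_duration_seconds`, and the observations are hypothetical; real client libraries update the counters incrementally instead of re-scanning a list.

```python
import math
from typing import List


def render_histogram(name: str, observations: List[float], upper_bounds: List[float]) -> List[str]:
    """Render observations as cumulative histogram lines in the text format."""
    # Every histogram ends with a +Inf bucket that holds all observations.
    bounds = sorted(upper_bounds) + [math.inf]
    lines = []
    for bound in bounds:
        label = "+Inf" if math.isinf(bound) else f"{bound:g}"
        # Cumulative count: everything less than or equal to this bound.
        count = sum(1 for value in observations if value <= bound)
        lines.append(f'{name}_bucket{{le="{label}"}} {count}')
    lines.append(f"{name}_count {len(observations)}")
    lines.append(f"{name}_sum {sum(observations):g}")
    return lines


observations = [0.05, 0.15, 0.15, 0.3, 0.7]  # request durations in seconds
for line in render_histogram("demo_request_duration_seconds", observations, [0.1, 0.2, 0.5]):
    print(line)
```

Note that the `+Inf` bucket always matches the `_count` value, because every observation is less than or equal to infinity.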
PromQL Query Example
Calculate the 95th percentile of HTTP request durations:
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
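It helps to see roughly how a quantile is estimated from buckets: find the bucket that contains the target rank, then interpolate linearly inside it. The Python sketch below follows that idea using the bucket counts from the example above, plus the `+Inf` bucket that every Prometheus histogram exposes (its count equals `_count`, 1200 here). It is a simplified illustration, not Prometheus's actual implementation: the real function works on per-second rates and handles more edge cases.

```python
import math
from typing import List, Tuple

Bucket = Tuple[float, float]  # (upper bound "le", cumulative count)


def histogram_quantile(q: float, buckets: List[Bucket]) -> float:
    """Estimate the q-quantile from cumulative buckets by linear interpolation."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need at least one finite bucket plus the +Inf bucket")

    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total

    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound) or count == lower_count:
                # Rank lies beyond the last finite bucket (or the bucket is
                # empty): the best available answer is the lower boundary.
                return lower_bound
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


buckets = [(0.1, 125), (0.2, 300), (0.5, 800), (math.inf, 1200)]
print(histogram_quantile(0.5, buckets))   # interpolated inside the 0.2 to 0.5 bucket
print(histogram_quantile(0.95, buckets))  # rank lands in +Inf, so capped at 0.5
```

With these buckets, the 95th percentile falls into the `+Inf` bucket and the estimate is capped at 0.5, the largest finite bound. This is why bucket boundaries should cover the latency range you actually care about.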
4. Summary Metrics
What is a Summary?
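A Summary is similar to a Histogram, but instead of buckets, it computes quantiles (percentiles) inside the instrumented application, typically over a sliding time window. This makes querying simpler, but the calculation costs more in the application, and the precomputed quantiles cannot be meaningfully aggregated across multiple instances.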
Key Characteristics
✅ Precomputes quantiles (e.g., 50th, 95th, 99th percentile).
✅ Provides the total count and sum of observations.
✅ More expensive in the client application than histograms, and its quantiles cannot be aggregated across instances.
Example Use Cases
Measuring request latencies.
Calculating percentiles of response times.
Analyzing user interactions.
Example in Prometheus Format
http_request_duration_seconds{quantile="0.5"} 0.15
http_request_duration_seconds{quantile="0.95"} 0.3
http_request_duration_seconds_count 1500
http_request_duration_seconds_sum 500
PromQL Query Example
Retrieve the 95th percentile request duration (only quantiles configured in the client, here 0.5 and 0.95, can be queried):
http_request_duration_seconds{quantile="0.95"}
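The following Python sketch shows what "precomputing quantiles in the client" means, and why those quantiles should not be averaged across instances. The class `SimpleSummary` and all values are hypothetical. Real client libraries use streaming estimators over a sliding time window rather than storing every observation, so treat this as an illustration of the idea only.

```python
import math
from typing import Dict, List


class SimpleSummary:
    """Toy summary: keeps all observations and reports fixed quantiles."""

    def __init__(self, quantiles: List[float]) -> None:
        self.quantiles = list(quantiles)
        self.observations: List[float] = []

    def observe(self, value: float) -> None:
        self.observations.append(value)

    def quantile(self, q: float) -> float:
        """Nearest-rank quantile of everything observed so far."""
        if not self.observations:
            return math.nan
        ordered = sorted(self.observations)
        rank = max(1, math.ceil(q * len(ordered)))
        return ordered[rank - 1]

    def collect(self) -> Dict[str, float]:
        """Return the values a scrape would expose: quantiles, count, sum."""
        result = {f'quantile="{q}"': self.quantile(q) for q in self.quantiles}
        result["count"] = len(self.observations)
        result["sum"] = sum(self.observations)
        return result


# Two hypothetical instances: one fast and busy, one slow and quiet.
fast, slow, merged = (SimpleSummary([0.5, 0.95]) for _ in range(3))
for _ in range(90):
    fast.observe(0.1)
    merged.observe(0.1)
for _ in range(10):
    slow.observe(1.0)
    merged.observe(1.0)

average_of_p95 = (fast.quantile(0.95) + slow.quantile(0.95)) / 2
print(average_of_p95)         # average of the two per-instance p95 values
print(merged.quantile(0.95))  # p95 over all observations together
```

The average of the per-instance values (0.55) does not match the true 95th percentile of the combined traffic (1.0). Histogram buckets avoid this problem because they can be summed across instances before the quantile is calculated.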
Comparison of Metric Types
| Metric Type | Can Increase? | Can Decrease? | Use Case Example |
| --- | --- | --- | --- |
| Counter | ✅ Yes | ❌ No | HTTP requests, error counts |
| Gauge | ✅ Yes | ✅ Yes | CPU usage, memory consumption |
| Histogram | ✅ Yes | ❌ No | API response times, request sizes |
| Summary | ✅ Yes | ❌ No | Request latency percentiles |
When to Use Each Metric Type
| Scenario | Best Metric Type |
| --- | --- |
| Counting events (e.g., HTTP requests, errors) | Counter |
| Monitoring fluctuating values (e.g., CPU, memory) | Gauge |
| Measuring request durations with distribution | Histogram |
| Measuring percentiles of response times on a single instance | Summary |
Conclusion
Understanding Counter, Gauge, Histogram, and Summary metrics is essential for designing effective monitoring strategies with Prometheus. Each metric type serves a unique purpose:
Counter is best for counting occurrences.
Gauge is used for real-time values.
Histogram is ideal for tracking distributions.
Summary is suited to precomputed percentiles on a single instance, when aggregation across instances is not needed.
By selecting the right metric type for the right use case, you can improve observability and gain valuable insights into your system’s health. 🚀