Before Prometheus: The Push Model Era
Before Prometheus took the monitoring world by storm, most systems relied on push-based metrics collection. Applications would send metrics to centralized collectors like StatsD, Graphite, or proprietary solutions.
This approach worked, but it had significant limitations:
- Network reliability - if metrics couldn't be pushed, they were lost
- Discovery complexity - collectors needed to know about every metric source
- Scaling challenges - central collectors became bottlenecks
- Configuration overhead - every new service required configuration changes
The Prometheus Revolution
Prometheus, born at SoundCloud in 2012, introduced a fundamentally different approach: pull-based metrics collection. Instead of applications pushing metrics, Prometheus actively scrapes metrics from configured endpoints.
The Pull Model Advantages
This architectural shift brought several key benefits:
1. Simplified Service Discovery
Prometheus can discover targets dynamically through various mechanisms:
- Kubernetes service discovery
- Consul integration
- DNS-based discovery
- Cloud provider APIs (AWS, GCP, Azure)
2. Improved Reliability
With pull-based collection, Prometheus controls the timing and can detect when services are unreachable. This provides better insight into system health compared to silent failures in push-based systems.
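When a scrape fails, Prometheus records the failure in its built-in up metric (1 for a successful scrape, 0 otherwise), so a dead target shows up directly in queries and alerts, for example:

# Targets that failed their most recent scrape
up == 0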
3. Centralized Configuration
All scraping configuration lives in one place - the Prometheus configuration file. Individual applications don't need any collector configuration; each simply exposes an HTTP endpoint (conventionally /metrics) for Prometheus to scrape.
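As an illustrative sketch (job names and addresses are made up), a minimal prometheus.yml might combine a static target with Kubernetes service discovery:

global:
  scrape_interval: 15s          # how often each target is scraped

scrape_configs:
  # A statically configured application endpoint
  - job_name: my-service
    static_configs:
      - targets: ["my-service:8080"]

  # Pods discovered dynamically from the Kubernetes API
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod

Adding or removing a scrape job is a change to this one file (or to the service discovery source), not to the applications themselves.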
The Metrics Format Innovation
Prometheus also introduced a simple, human-readable metrics format:
# HELP http_requests_total Total number of HTTP requests
# TYPE http_requests_total counter
http_requests_total{method="GET",status="200"} 1024
http_requests_total{method="GET",status="404"} 3
http_requests_total{method="POST",status="200"} 512
# HELP response_time_seconds Response time in seconds
# TYPE response_time_seconds histogram
response_time_seconds_bucket{le="0.1"} 100
response_time_seconds_bucket{le="0.5"} 150
response_time_seconds_bucket{le="1.0"} 200
response_time_seconds_bucket{le="+Inf"} 200
response_time_seconds_sum 45.7
response_time_seconds_count 200
Metric Types
Prometheus defined four fundamental metric types:
Counter
A cumulative value that only increases (or resets to zero). Perfect for tracking requests, errors, or tasks completed.
Gauge
A value that can go up or down. Ideal for current values like memory usage, active connections, or queue size.
Histogram
Observations counted into configurable, cumulative buckets. Essential for measuring latencies or request sizes when percentiles need to be computed at query time.
Summary
Similar to a histogram, but quantiles are calculated on the client. Lower query-time cost, but less flexible: precomputed quantiles cannot be aggregated across instances.
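As a rough sketch of how these four types map onto the official Go client (the metric names below are invented for illustration, and registration/exposition are omitted):

import "github.com/prometheus/client_golang/prometheus"

var (
    // Counter: cumulative, only ever increases (or resets to zero on restart).
    jobsCompleted = prometheus.NewCounter(prometheus.CounterOpts{
        Name: "jobs_completed_total",
        Help: "Total number of completed jobs.",
    })

    // Gauge: a point-in-time value that can go up or down.
    queueSize = prometheus.NewGauge(prometheus.GaugeOpts{
        Name: "job_queue_size",
        Help: "Current number of queued jobs.",
    })

    // Histogram: observations counted into cumulative buckets; quantiles are computed at query time.
    jobDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
        Name:    "job_duration_seconds",
        Help:    "Job duration in seconds.",
        Buckets: []float64{0.1, 0.5, 1, 5},
    })

    // Summary: quantiles precomputed on the client.
    jobSize = prometheus.NewSummary(prometheus.SummaryOpts{
        Name:       "job_size_bytes",
        Help:       "Size of processed jobs in bytes.",
        Objectives: map[float64]float64{0.5: 0.05, 0.95: 0.01, 0.99: 0.001},
    })
)

The type also determines the API surface: counters expose Inc and Add, gauges add Set and Dec, and histograms and summaries record values with Observe.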
PromQL: The Query Revolution
Perhaps Prometheus's most significant contribution was PromQL - a powerful query language for time-series data. PromQL enabled complex analytical queries:
# 95th percentile response time over 5 minutes
histogram_quantile(0.95,
  rate(http_request_duration_seconds_bucket[5m])
)
# Error rate by service
sum(rate(http_requests_total{status=~"5.."}[5m])) by (service)
/
sum(rate(http_requests_total[5m])) by (service)
# Instances with high memory usage
(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes)
/ node_memory_MemTotal_bytes > 0.8
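These queries typically run in the Prometheus web UI or Grafana, but they can also be issued programmatically over Prometheus's HTTP API. A minimal sketch with the Go client's API package, assuming a server at localhost:9090:

package main

import (
    "context"
    "fmt"
    "time"

    "github.com/prometheus/client_golang/api"
    v1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
    // Connect to a Prometheus server (address is illustrative).
    client, err := api.NewClient(api.Config{Address: "http://localhost:9090"})
    if err != nil {
        panic(err)
    }

    // Evaluate a PromQL expression at the current instant.
    query := `sum(rate(http_requests_total{status=~"5.."}[5m])) by (service)`
    result, warnings, err := v1.NewAPI(client).Query(context.Background(), query, time.Now())
    if err != nil {
        panic(err)
    }
    if len(warnings) > 0 {
        fmt.Println("warnings:", warnings)
    }
    fmt.Println(result)
}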
The Ecosystem Effect
Prometheus's success sparked an entire ecosystem:
Exporters
Standalone programs that expose Prometheus-format metrics on behalf of systems that can't be instrumented directly:
- Node Exporter - system and hardware metrics
- MySQL Exporter - database performance metrics
- Blackbox Exporter - network probing and monitoring
- Custom exporters - for legacy applications
Client Libraries
Official and community-maintained libraries for most major programming languages made instrumentation straightforward:
// Go example
package main

import (
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

var requestCount = prometheus.NewCounterVec(
    prometheus.CounterOpts{
        Name: "http_requests_total",
        Help: "Total number of HTTP requests",
    },
    []string{"method", "status"},
)

func handleRequest(w http.ResponseWriter, r *http.Request) {
    // Handle request logic...
    requestCount.WithLabelValues(r.Method, "200").Inc()
}

func main() {
    // Register the counter and expose /metrics for Prometheus to scrape.
    prometheus.MustRegister(requestCount)
    http.HandleFunc("/", handleRequest)
    http.Handle("/metrics", promhttp.Handler())
    http.ListenAndServe(":8080", nil)
}
Integration with Grafana
The combination of Prometheus and Grafana became the de facto standard for metrics visualization. Grafana's rich dashboarding capabilities complemented Prometheus's data collection and querying perfectly.
Challenges and Limitations
Despite its success, Prometheus has notable limitations:
- Single-node architecture - scaling beyond one server requires federation or sharding
- Limited long-term storage - the local TSDB is built for recent data; longer retention relies on remote storage integrations
- Pull-only model - awkward for short-lived batch jobs (the Pushgateway is the usual workaround)
- Label cardinality - high-cardinality labels inflate memory usage and slow down queries
The Lasting Impact
Prometheus fundamentally changed how we think about metrics:
- Made metrics collection accessible to every developer
- Established patterns for modern application instrumentation
- Influenced cloud-native architectures and tools
- Created the foundation for modern observability practices
Today, Prometheus remains the backbone of many observability stacks, and its influence can be seen in newer tools and standards like OpenTelemetry, which we'll explore in our next post.