USE Metrics
USE (Utilization, Saturation, Errors) is Brendan Gregg's methodology for infrastructure and resource monitoring.
What are USE Metrics?
USE provides a systematic approach to analyzing system performance by examining every resource:
- 📊 Utilization: Percentage of the resource's capacity in use (0-100%)
- 🌊 Saturation: Amount of work queued or backlogged because the resource cannot service it yet
- ⚠️ Errors: Count of error events on the resource
SLOs in this Example (11 total)
Utilization SLOs
| SLO | Resource | Description |
|---|---|---|
| ExampleCPUUtilizationSLO | CPU | Average CPU usage |
| ExampleMemoryUtilizationSLO | Memory | Memory usage percentage |
| ExampleDiskUtilizationSLO | Disk | Disk space usage |
Saturation SLOs
| SLO | Resource | Description |
|---|---|---|
| ExampleCPULoadAverageSLO | CPU | Load average / CPU count |
| ExampleSwapUsageSLO | Memory | Swap space usage |
| ExampleDiskIOSaturationSLO | Disk | I/O wait percentage |
| ExampleNetworkBandwidthSLO | Network | Bandwidth utilization |
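The CPU saturation SLO above is described as load average divided by CPU count. A minimal sketch of that arithmetic follows. The helper name `NormalizedLoad` and the 1.0 threshold are assumptions made for illustration and do not come from the example package.

```go
package main

import (
	"errors"
	"fmt"
)

// NormalizedLoad divides a load average by the number of CPUs.
// Hypothetical helper for illustration; not part of the slogo package.
func NormalizedLoad(loadAvg float64, cpuCount int) (float64, error) {
	if cpuCount <= 0 {
		return 0, errors.New("cpuCount must be positive")
	}
	if loadAvg < 0 {
		return 0, errors.New("loadAvg must not be negative")
	}
	return loadAvg / float64(cpuCount), nil
}

func main() {
	// A 1-minute load average of 6.0 on an 8-CPU host.
	perCPU, err := NormalizedLoad(6.0, 8)
	if err != nil {
		panic(err)
	}
	fmt.Printf("load per CPU: %.2f\n", perCPU)

	// Assumed threshold: above 1.0 there is more runnable work than CPUs.
	fmt.Println("saturated:", perCPU > 1.0)
}
```

Dividing by CPU count makes the value comparable across hosts of different sizes, which is why a single SLO threshold can apply to a whole fleet.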
Error SLOs
| SLO | Resource | Description |
|---|---|---|
| ExampleDiskIOErrorsSLO | Disk | I/O error rate |
| ExampleNetworkErrorsSLO | Network | Packet error rate |
| ExampleMemoryECCErrorsSLO | Memory | ECC memory errors |
| ExampleCPUThrottlingSLO | CPU | Throttling events |
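Error rates such as the I/O and packet error rates above are usually derived from counters that only increase. The sketch below shows one way to turn two counter samples into a ratio. The function names `CounterDelta` and `ErrorRatio` and the sample values are hypothetical, and treating a decreasing counter as a reset is an assumption about the data source.

```go
package main

import "fmt"

// CounterDelta returns the increase between two samples of a monotonically
// increasing counter. If the counter went backwards (for example after a
// reboot), the current value is treated as the increase since the reset.
func CounterDelta(prev, cur uint64) uint64 {
	if cur < prev {
		return cur
	}
	return cur - prev
}

// ErrorRatio returns errDelta / totalDelta for one sampling interval,
// or 0 when no events were observed.
func ErrorRatio(errDelta, totalDelta uint64) float64 {
	if totalDelta == 0 {
		return 0
	}
	return float64(errDelta) / float64(totalDelta)
}

func main() {
	// Two hypothetical samples of a NIC's counters, one interval apart.
	errs := CounterDelta(120, 125)             // errored packets in the interval
	pkts := CounterDelta(1_000_000, 1_050_000) // all packets in the interval
	fmt.Printf("packet error ratio: %.4f%%\n", 100*ErrorRatio(errs, pkts))
}
```

Working from deltas rather than absolute counter values keeps the ratio tied to the sampling interval, so an old burst of errors does not count against the SLO forever.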
Usage
```go
import usemetrics "github.com/grokify/slogo/examples/use-metrics"

// Get individual SLOs
cpuSLO := usemetrics.ExampleCPUUtilizationSLO()
memorySLO := usemetrics.ExampleMemoryUtilizationSLO()

// Get all SLOs
slos := usemetrics.SLOs()
```
When to Use USE Metrics
USE metrics are ideal for:
- Physical servers: CPU, memory, disk monitoring
- Virtual machines: VM resource tracking
- Containers: Pod/container resource limits
- Network infrastructure: Bandwidth, packet handling
- Storage systems: IOPS, throughput, capacity
Prometheus Queries
CPU Utilization
Memory Utilization
Disk I/O Saturation
Network Bandwidth
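As a hedged sketch, queries for these four resources might look like the following when hosts are scraped with node_exporter. The metric names assume node_exporter defaults, and the `Query` type and `ExampleQueries` function are hypothetical. These expressions are illustrative and are not claimed to be the queries embedded in the example SLOs.

```go
package main

import "fmt"

// Query pairs a resource heading with an illustrative PromQL expression.
// Metric names assume node_exporter defaults; adjust for other exporters.
type Query struct {
	Name string
	Expr string
}

// ExampleQueries returns one assumed query per resource heading.
func ExampleQueries() []Query {
	return []Query{
		// Share of CPU time not spent idle, averaged per instance.
		{"CPU Utilization", `100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])))`},
		// Share of memory that is not available to new workloads.
		{"Memory Utilization", `100 * (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)`},
		// Share of CPU time spent waiting on I/O.
		{"Disk I/O Saturation", `100 * avg by (instance) (rate(node_cpu_seconds_total{mode="iowait"}[5m]))`},
		// Receive throughput relative to the reported link speed.
		{"Network Bandwidth", `100 * rate(node_network_receive_bytes_total{device!="lo"}[5m]) / node_network_speed_bytes`},
	}
}

func main() {
	for _, q := range ExampleQueries() {
		fmt.Printf("%-20s %s\n", q.Name, q.Expr)
	}
}
```

The disk query uses I/O wait because the Saturation SLOs table describes `ExampleDiskIOSaturationSLO` as an I/O wait percentage. The network query depends on `node_network_speed_bytes`, which may be missing or unreliable on virtual interfaces.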
Ontology Labels
All USE metric SLOs use these labels:
```go
ontology.LabelFramework:    ontology.FrameworkUSE,
ontology.LabelLayer:        ontology.LayerInfrastructure,
ontology.LabelScope:        ontology.ScopeInternal,
ontology.LabelAudience:     ontology.AudienceSRE,
ontology.LabelResourceType: ontology.ResourceTypeCPU, // or Memory, Disk, Network
```
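Labels like these make it possible to select SLOs by framework or resource type. The sketch below filters on plain string maps. The label keys and values are placeholders standing in for the `ontology` constants, whose real string values are not shown here, and `Matches` is a hypothetical helper rather than part of the library.

```go
package main

import "fmt"

// Placeholder label keys and values. They stand in for the ontology
// constants; the real string values may differ.
const (
	labelFramework    = "framework"
	labelResourceType = "resource-type"
	frameworkUSE      = "use"
)

// Matches reports whether labels contains every key/value pair in want.
// An empty want matches any label set.
func Matches(labels, want map[string]string) bool {
	for k, v := range want {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	cpu := map[string]string{labelFramework: frameworkUSE, labelResourceType: "cpu"}
	disk := map[string]string{labelFramework: frameworkUSE, labelResourceType: "disk"}

	// Select only the CPU-related label sets.
	want := map[string]string{labelResourceType: "cpu"}
	fmt.Println(Matches(cpu, want), Matches(disk, want)) // true false
}
```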
Resource Matrix
For each resource, ask: Utilization? Saturation? Errors?
| Resource | Utilization | Saturation | Errors |
|---|---|---|---|
| CPU | Usage % | Load average, run queue | Throttling |
| Memory | Used % | Swap usage, OOM events | ECC errors |
| Disk | Space %, IOPS | I/O wait | I/O errors |
| Network | Bandwidth % | Queue depth | Packet errors |
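The matrix above can also be held as data, so that the "ask three questions per resource" step becomes a loop. This is a minimal sketch: the `ResourceChecks` type and the `Matrix` and `Checklist` functions are hypothetical names, and the cell text is copied from the table.

```go
package main

import "fmt"

// ResourceChecks holds the utilization, saturation and error signals
// for one resource, copied from the matrix above.
type ResourceChecks struct {
	Resource    string
	Utilization string
	Saturation  string
	Errors      string
}

// Matrix returns the rows of the resource matrix in table order.
func Matrix() []ResourceChecks {
	return []ResourceChecks{
		{"CPU", "Usage %", "Load average, run queue", "Throttling"},
		{"Memory", "Used %", "Swap usage, OOM events", "ECC errors"},
		{"Disk", "Space %, IOPS", "I/O wait", "I/O errors"},
		{"Network", "Bandwidth %", "Queue depth", "Packet errors"},
	}
}

// Checklist expands the matrix into one line per resource and dimension.
func Checklist(rows []ResourceChecks) []string {
	var out []string
	for _, r := range rows {
		out = append(out,
			fmt.Sprintf("%s utilization: %s", r.Resource, r.Utilization),
			fmt.Sprintf("%s saturation: %s", r.Resource, r.Saturation),
			fmt.Sprintf("%s errors: %s", r.Resource, r.Errors),
		)
	}
	return out
}

func main() {
	for _, line := range Checklist(Matrix()) {
		fmt.Println(line)
	}
}
```

Four resources and three dimensions give twelve checklist lines, which makes it easy to spot a resource that has no signal defined for one of the dimensions.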
Complementary Frameworks
| Framework | Focus | Use Case |
|---|---|---|
| RED Metrics | Services | Request handling |
| Four Golden Signals | Hybrid | Services + Saturation |
References
- ⚡ The USE Method - Brendan Gregg
- 📖 Systems Performance - Brendan Gregg
- 📜 OpenSLO Specification