SLIs & SLOs

PRISM supports Service Level Indicators (SLIs) and Service Level Objectives (SLOs) following SRE best practices.

SLI (Service Level Indicator)

An SLI defines what is being measured.

Fields

Field	Type	Required	Description
`name`	string	No	SLI name
`description`	string	No	SLI description
`formula`	string	No	Calculation formula

Example

{
  "sli": {
    "name": "Availability",
    "description": "Percentage of successful requests",
    "formula": "successful_requests / total_requests * 100"
  }
}
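
The formula in the example above can be evaluated directly from request counters. The following is a minimal Go sketch of that arithmetic; the function name `availabilityPercent` and its error handling for empty windows are assumptions made for illustration, not part of PRISM.

```go
package main

import (
	"errors"
	"fmt"
)

// availabilityPercent applies the example SLI formula:
//   successful_requests / total_requests * 100
// Hypothetical helper: PRISM stores the formula as a string and this
// sketch only shows what that string describes.
func availabilityPercent(successful, total uint64) (float64, error) {
	if total == 0 {
		// With no traffic the ratio is undefined, so report it explicitly.
		return 0, errors.New("no requests in window: availability is undefined")
	}
	if successful > total {
		return 0, errors.New("successful requests exceed total requests")
	}
	return float64(successful) / float64(total) * 100, nil
}

func main() {
	pct, err := availabilityPercent(999_500, 1_000_000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("availability: %.2f%%\n", pct) // prints "availability: 99.95%"
}
```

Returning an error for a zero-request window avoids silently reporting 0% or 100% when there is nothing to measure.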

SLO (Service Level Objective)

An SLO defines the target for an SLI.

Fields

Field	Type	Required	Description
`target`	string	Yes	Human-readable target
`operator`	string	No	Comparison operator for machine evaluation
`value`	number	No	Numeric target value
`window`	string	No	Measurement window
`thresholds`	object	No	Additional thresholds

SLO Operators

PRISM supports machine-evaluable SLOs with these operators:

Operator	Symbol	Description	Example
`gte`	`>=`	Greater than or equal	Availability ≥99.99%
`lte`	`<=`	Less than or equal	Latency ≤200ms
`gt`	`>`	Greater than	Score >80
`lt`	`<`	Less than	Error rate <0.1%
`eq`	`=`	Equal to	Target exactly 100

Example with Machine-Evaluable SLO

{
  "slo": {
    "target": ">=99.99%",
    "operator": "gte",
    "value": 99.99,
    "window": "30d"
  }
}
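
A document like the one above can be decoded with the Go standard library before evaluation. The sketch below uses a local `slo` struct that mirrors the documented fields; it is a stand-in written for this example, and PRISM's own Go types may be shaped differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// slo mirrors the documented SLO fields (thresholds omitted for brevity).
// Illustrative stand-in only, not PRISM's actual type.
type slo struct {
	Target   string  `json:"target"`
	Operator string  `json:"operator,omitempty"`
	Value    float64 `json:"value,omitempty"`
	Window   string  `json:"window,omitempty"`
}

// document is the top-level wrapper used in the JSON example.
type document struct {
	SLO slo `json:"slo"`
}

// parseSLO decodes the JSON and enforces the one required field, target.
func parseSLO(data []byte) (slo, error) {
	var doc document
	if err := json.Unmarshal(data, &doc); err != nil {
		return slo{}, err
	}
	if doc.SLO.Target == "" {
		return slo{}, fmt.Errorf("slo.target is required")
	}
	return doc.SLO, nil
}

func main() {
	raw := []byte(`{"slo": {"target": ">=99.99%", "operator": "gte", "value": 99.99, "window": "30d"}}`)
	s, err := parseSLO(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s %v over %s\n", s.Operator, s.Value, s.Window) // prints "gte 99.99 over 30d"
}
```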

Measurement Windows

Common window values:

Window	Description
`7d`	7-day rolling window
`30d`	30-day rolling window
`90d`	90-day rolling window
`monthly`	Calendar month
`quarterly`	Calendar quarter

Programmatic SLO Checking

PRISM's Go API provides a MeetsSLO() method on Metric that compares the metric's current value against the SLO's operator and value:

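// Current availability (99.95) is below the SLO value (99.99).
metric := prism.Metric{
    Current: 99.95,
    SLO: &prism.SLO{
        Target:   ">=99.99%",           // human-readable form
        Operator: prism.SLOOperatorGTE, // machine-evaluable comparison
        Value:    99.99,
    },
}

// MeetsSLO() checks Current against Value using Operator.
if metric.MeetsSLO() {
    fmt.Println("SLO met!")
} else {
    fmt.Println("SLO not met") // printed here, since 99.95 < 99.99
}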

Operator Behavior

Operator	Current	Value	MeetsSLO()
`gte`	99.99	99.99	true
`gte`	99.95	99.99	false
`lte`	200	250	true
`lte`	300	250	false
`eq`	100	100	true
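
The rows above follow from a plain comparison of `current <operator> value`. The following self-contained sketch reproduces that behavior for all five operators; the function `meets` is a stand-in written for this example and is not PRISM's MeetsSLO() source.

```go
package main

import "fmt"

// meets reports whether current satisfies "current <operator> value".
// Illustrative stand-in that matches the table, not PRISM's implementation.
func meets(operator string, current, value float64) (bool, error) {
	switch operator {
	case "gte":
		return current >= value, nil
	case "lte":
		return current <= value, nil
	case "gt":
		return current > value, nil
	case "lt":
		return current < value, nil
	case "eq":
		return current == value, nil // exact floating-point equality
	default:
		return false, fmt.Errorf("unknown operator %q", operator)
	}
}

func main() {
	cases := []struct {
		op             string
		current, value float64
	}{
		{"gte", 99.99, 99.99},
		{"gte", 99.95, 99.99},
		{"lte", 200, 250},
		{"lte", 300, 250},
		{"eq", 100, 100},
	}
	for _, c := range cases {
		ok, err := meets(c.op, c.current, c.value)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s current=%v value=%v -> %v\n", c.op, c.current, c.value, ok)
	}
}
```

Note that an exact `eq` comparison on computed floating-point values can fail because of rounding, so `eq` is safest for values that are set rather than calculated.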

Complete Metric Example

{
  "id": "ops-availability",
  "name": "Service Availability",
  "description": "Percentage of time the service is available",
  "domain": "operations",
  "stage": "runtime",
  "category": "reliability",
  "metricType": "rate",
  "trendDirection": "higher_better",
  "unit": "%",
  "baseline": 99.0,
  "current": 99.95,
  "target": 99.99,
  "sli": {
    "name": "Availability",
    "description": "Successful requests / total requests",
    "formula": "(1 - error_count / total_requests) * 100"
  },
  "slo": {
    "target": ">=99.99%",
    "operator": "gte",
    "value": 99.99,
    "window": "30d"
  },
  "thresholds": {
    "green": 99.95,
    "yellow": 99.9,
    "red": 99.0
  },
  "frameworkMappings": [
    {"framework": "SRE", "reference": "availability-slo"},
    {"framework": "DORA", "reference": "availability"}
  ]
}

Best Practices

Set Realistic Targets - SLOs should be achievable but challenging
Include Error Budgets - Use thresholds to define acceptable ranges
Document Formulas - Include SLI formulas for clarity
Use Machine-Evaluable Operators - Enable automated SLO checking
Define Measurement Windows - Clarify the evaluation period
Map to Frameworks - Reference industry standards (SRE, DORA)
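
For percentage-based SLOs, the error budget mentioned above is the gap between 100% and the target. The sketch below computes how much of that budget a given current value has used, following the common SRE definition; the function `budgetConsumed` is hypothetical and PRISM does not document an equivalent.

```go
package main

import (
	"errors"
	"fmt"
)

// budgetConsumed returns the fraction of the error budget already used:
//   (100 - current) / (100 - target)
// Values above 1 mean the budget is exhausted. Hypothetical helper
// based on the standard SRE definition, not a PRISM API.
func budgetConsumed(currentPct, targetPct float64) (float64, error) {
	if targetPct >= 100 {
		return 0, errors.New("a target of 100% leaves no error budget")
	}
	if currentPct > 100 || currentPct < 0 {
		return 0, errors.New("current must be between 0 and 100")
	}
	return (100 - currentPct) / (100 - targetPct), nil
}

func main() {
	// Values from the complete metric example: current 99.95, target 99.99.
	used, err := budgetConsumed(99.95, 99.99)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("error budget consumed: %.0f%%\n", used*100) // prints "error budget consumed: 500%"
}
```

With a 99.99% target the budget is only 0.01% of requests, so a current value of 99.95% has spent roughly five times the allowance.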