Skip to content

AI Agents

Comprehensive monitoring for AI agent platforms, covering both end-user experience metrics and business performance indicators.

What are AI Agent Metrics?

AI agent metrics track the health, quality, and business impact of AI-powered agents. These metrics go beyond traditional service metrics to include AI-specific concerns like:

  • Response quality and accuracy
  • Hallucination detection
  • Task completion success
  • Cost efficiency (token usage)
  • Per-user fairness

SLOs in this Example (20 total)

Availability & Reliability (2 SLOs)

SLO Type Description
ExampleAgentAvailabilitySLO Aggregated Overall service availability
ExamplePerUserAgentAvailabilitySLO Per-user Consistent per-user availability

Response Quality (3 SLOs)

SLO Type Description
ExampleAgentResponseQualitySLO Aggregated User satisfaction ratings
ExamplePerUserResponseQualitySLO Per-user Per-user quality tracking
ExampleAgentAccuracySLO Accuracy Hallucination rate

Performance (3 SLOs)

SLO Type Description
ExampleAgentResponseTimeSLO Aggregated P95 response time
ExamplePerUserResponseTimeSLO Per-user Per-user latency
ExampleAgentFirstTokenLatencySLO Streaming Time to first token

Task Completion (4 SLOs)

SLO Type Description
ExampleTaskCompletionRateSLO Aggregated Task success rate
ExamplePerUserTaskCompletionSLO Per-user Per-user task success
ExampleTaskAbandonmentRateSLO Aggregated User frustration indicator
ExampleMultiStepTaskSuccessSLO Complex Multi-step task completion

User Engagement (4 SLOs)

SLO Type Description
ExampleDailyActiveUsersSLO DAU Daily active users
ExampleUserRetentionSLO Retention 7-day retention rate
ExampleSessionDurationSLO Engagement Average session length
ExampleConversationTurnsSLO Depth Conversation depth

Cost Efficiency (4 SLOs)

SLO Type Description
ExampleTokenUsagePerTaskSLO Efficiency Tokens per task
ExamplePerUserCostSLO Per-user Per-user cost tracking
ExampleCostPerSuccessfulTaskSLO ROI Cost per successful task
ExampleCacheHitRateSLO Caching Response cache efficiency

Usage

import aiagents "github.com/grokify/slogo/examples/ai-agents"

// Get individual SLOs
qualitySLO := aiagents.ExampleAgentResponseQualitySLO()
costSLO := aiagents.ExampleTokenUsagePerTaskSLO()

// Get all SLOs
slos := aiagents.SLOs()

Key Metric Categories

1. Availability & Reliability

Monitor whether agents are accessible and functional when users need them.

Per-User Fairness

Per-user SLOs ensure no single user experiences consistently poor service, even if aggregate metrics look healthy.

2. Quality & Accuracy

Track response quality, user satisfaction, and accuracy (hallucination detection).

# Hallucination rate
1 - (sum(rate(agent_responses_factual_total[24h])) / sum(rate(agent_responses_total[24h])))

3. Performance

Response times for both batch and streaming responses.

# Time to first token (streaming)
histogram_quantile(0.95, rate(agent_first_token_latency_seconds_bucket[5m]))

4. Task Completion

Success rates for user-initiated tasks.

5. Cost Efficiency

Token usage, caching, and cost per outcome.

# Average tokens per task
sum(rate(llm_tokens_used_total[24h])) / sum(rate(agent_tasks_completed_total[24h]))

Ontology Labels

AI Agent SLOs use mixed labels depending on the metric:

// Service-level (availability, performance)
ontology.LabelFramework:  ontology.FrameworkRED,
ontology.LabelLayer:      ontology.LayerService,
ontology.LabelAudience:   ontology.AudienceSRE,

// Business-level (engagement, cost)
ontology.LabelFramework:  ontology.FrameworkCustom,
ontology.LabelLayer:      ontology.LayerBusiness,
ontology.LabelAudience:   ontology.AudienceProduct,
business.LabelDomain:     business.DomainAIML,

Per-User vs Aggregated

Metric Type Aggregated Per-User
Availability 99.9% overall 99% per-user minimum
Response Time P95 < 2s 95% of users < 3s
Quality 4.5/5 average 4.0/5 per-user minimum
Task Success 85% overall 80% per-user minimum

References