ProductGraph Integration Plan
Author: PlexusOne Date: 2026-04-27 Status: Draft
Executive Summary
Integrate systemforge observability with ProductGraph to enable frontend-backend correlation, backend event forwarding, and unified analytics via omnidxi.
Current State
systemforge Observability
- omniobserve integration with multiple providers (OTLP, Datadog, New Relic, Dynatrace)
- Pre-defined metrics for CoreAuth and CoreAPI
- HTTP middleware for request tracing
- OpenTelemetry span creation
Missing
- Frontend session correlation
- Backend event forwarding to ProductGraph
- Journey tracking from backend
- Unified analytics pipeline
Implementation Phases
Phase 1: Correlation Middleware
Goal: Extract frontend correlation IDs and inject into context.
Deliverables:
- observability/correlation package
- Middleware extracting X-Session-ID, X-Request-ID
- Context helper functions
- Unit tests
Files:
observability/
├── correlation/
│ ├── correlation.go # Middleware and context helpers
│ └── correlation_test.go # Unit tests
Implementation:
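// correlation/correlation.go
package correlation
import (
	"context"
	"net/http"
)
// ContextKey is the key type for values this package stores in a context.
type ContextKey string
const (
	SessionIDKey ContextKey = "session_id"
	RequestIDKey ContextKey = "request_id"
)
// Middleware extracts the correlation headers and injects them into the request context.
func Middleware(next http.Handler) http.Handler { ... }
func SessionIDFromContext(ctx context.Context) string { ... }
func RequestIDFromContext(ctx context.Context) string { ... }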
Phase 2: ProductGraph Client
Goal: Go client for sending events to ProductGraph.
Deliverables:
- productgraph package
- Event struct with OTel semantics
- Async batching and flushing
- Graceful shutdown
Files:
productgraph/
├── client.go # Client implementation
├── event.go # Event types
├── config.go # Configuration
└── client_test.go # Unit tests
API:
client := productgraph.New(config)
defer client.Close()
client.Track(ctx, event)
client.TrackAPICall(ctx, method, path, status, duration)
client.TrackError(ctx, errType, message)
client.TrackJourneyStep(ctx, journeyID, stepID, stepName)
Phase 3: Observability Integration
Goal: Integrate ProductGraph into existing observability provider.
Deliverables:
- WithProductGraph option
- RequestTracker middleware
- Environment configuration
- Integration tests
Changes:
// observability/observability.go
func WithProductGraph(cfg productgraph.Config) ProviderOption
// observability/middleware.go
func (p *Provider) RequestTracker(next http.Handler) http.Handler
Phase 4: Documentation and Examples
Goal: Complete documentation and example usage.
Deliverables:
- Package documentation
- Usage examples
- Integration guide
- Migration guide from direct omniobserve usage
Files:
docs/
├── design/productgraph/
│ ├── PRD.md # Product requirements
│ ├── TRD.md # Technical requirements
│ ├── PLAN.md # This document
│ └── TASKS.md # Task breakdown
productgraph/
├── README.md # Package documentation
└── example_test.go # Example usage
Timeline
| Phase | Duration | Target |
|---|---|---|
| Phase 1: Correlation | 2 days | 2026-05-02 |
| Phase 2: Client | 3 days | 2026-05-07 |
| Phase 3: Integration | 2 days | 2026-05-09 |
| Phase 4: Documentation | 2 days | 2026-05-13 |
Dependencies
Internal
| Dependency | Version | Status |
|---|---|---|
| ProductGraph | v0.2.0 | Ready |
| omniobserve | v0.8.0 | In use |
External
| Dependency | Version | Purpose |
|---|---|---|
| google/uuid | v1.6.0 | Event ID generation |
| go-chi/chi | v5.0.0 | HTTP router (optional) |
Risks
| Risk | Impact | Mitigation |
|---|---|---|
| Latency impact | Medium | Async batching |
| Data loss on crash | Low | Graceful shutdown, retry |
| Memory pressure | Low | Bounded buffer |
| Network failures | Medium | Retry with backoff |
Success Criteria
- Correlation: at least 95% of requests have a session ID in context
- Delivery: at least 99.9% event delivery rate
- Performance: less than 1 ms tracking overhead
- Adoption: used in at least two systemforge-based services
Architecture Decision Records
ADR-1: Direct Client vs omniobserve Extension
Context: Should ProductGraph be a new omniobserve provider or a separate client?
Decision: Separate client (productgraph package).
Rationale:
- ProductGraph is event-focused, not trace/metric focused
- Different batching semantics (events vs spans)
- Simpler to maintain independently
- Can still integrate with observability provider via composition
ADR-2: Sync vs Async Tracking
Context: Should Track() be synchronous or asynchronous?
Decision: Asynchronous with batching.
Rationale:
- Minimal latency impact on request path
- Better throughput with batching
- Graceful degradation on network issues
- Trade-off: Potential event loss on crash (acceptable)
Related Documents
- PRD.md - Product requirements
- TRD.md - Technical requirements
- TASKS.md - Task breakdown
- Observability TRD