Skip to content

Case Study: Agent Tooling Ecosystem

This case study examines how DIRECT principles guide the design of a tooling ecosystem built for AI agent-assisted development.

Overview

When building tools that AI agents will use, traditional developer experience (DX) assumptions break down. Agents cannot:

  • Infer meaning from prose documentation
  • Debug through trial and error efficiently
  • Handle ambiguous error messages
  • Tolerate inconsistencies across tools

DIRECT principles provide a framework for designing tools that agents can reliably use.

The Ecosystem

The following tools were designed with DIRECT principles as a guide:

Tool Purpose Primary Principles
agent-team-release Release automation R, C, T
agent-team-stats Statistics with verification D, T
ax-spec OpenAPI linting & enrichment I, E
brandkit SVG icon operations I, R
d2vision Declarative diagram generation D, I
schemalint Schema validation for static typing D, E
structured-changelog Token-efficient changelogs I, E, C
traffic2openapi Generate specs from traffic I, E
w3pilot Browser automation for agents I, R, T

Principle Application

Deterministic

Goal: Same input always produces same output shape.

schemalint validates JSON schemas for Go compatibility:

# Deterministic output - schema either passes or fails with specific errors
schemalint validate schema.json

# Output is structured, not prose
{
  "valid": false,
  "errors": [
    {"path": "$.properties.data", "code": "MISSING_TYPE"}
  ]
}

Why it matters: Agents generate code from schemas. If a schema validates, the generated code must compile. No exceptions.

d2vision produces identical diagrams from identical specs:

# Same input → same output, every time
d2vision render diagram.d2 --format svg

Why it matters: Agents can iterate on diagrams knowing changes are predictable.


Introspectable

Goal: Machine-readable capabilities and schemas.

traffic2openapi creates OpenAPI specs where none exist:

# Convert HTTP traffic to machine-readable spec
traffic2openapi --har session.har --output api.yaml

Why it matters: Without a spec, agents cannot discover API capabilities programmatically.

structured-changelog outputs TOON (Token-Oriented Object Notation):

# Default: token-efficient format for LLMs
schangelog parse-commits --since=v1.0.0

# ~8x fewer tokens than raw git log

Why it matters: Agents pay for context. Efficient formats reduce cost and improve comprehension.

ax-spec makes OpenAPI specs agent-readable:

# Enrich spec with agent metadata
ax-spec enrich api.yaml --infer-capabilities --infer-retryable

Adds extensions like:

x-ax-capabilities: [create_payment, transfer_funds]
x-ax-retryable: false
x-ax-required-fields: [amount, currency]

Why it matters: Agents can search by capability, not just endpoint URL.


Recoverable

Goal: Structured errors enable automated correction.

w3pilot returns actionable errors:

{
  "error_code": "ELEMENT_NOT_FOUND",
  "selector": "#submit-button",
  "suggestion": "Element may not be visible. Try waiting for page load.",
  "retryable": true,
  "screenshot": "error-state.png"
}

Why it matters: Agents can parse error codes, apply fixes, and retry automatically.

brandkit validates SVG operations before execution:

{
  "error_code": "INVALID_COLOR",
  "field": "fill",
  "value": "not-a-color",
  "suggestion": "Use hex (#RRGGBB), RGB, or named color"
}

Why it matters: Pre-validation prevents wasted API calls.

agent-team-release provides rollback guidance on failure:

{
  "error_code": "TAG_EXISTS",
  "tag": "v1.2.0",
  "suggestion": "Delete existing tag or increment version",
  "rollback_commands": [
    "git tag -d v1.2.0",
    "git push origin :refs/tags/v1.2.0"
  ]
}

Why it matters: Agents can recover from failures without human intervention.


Explicit

Goal: All constraints declared in specification.

schemalint enforces explicit constraints:

# Bad - implicit constraints
properties:
  email:
    type: string

# Good - explicit constraints
properties:
  email:
    type: string
    format: email
    maxLength: 255

Why it matters: Agents cannot infer constraints from context.

multi-agent-spec avoids polymorphism:

# Bad - degrades to interface{} in Go
Event:
  oneOf:
    - $ref: '#/components/schemas/CreateEvent'
    - $ref: '#/components/schemas/UpdateEvent'

# Good - explicit discriminator
Event:
  type: object
  required: [type]
  properties:
    type:
      enum: [create, update]

Why it matters: Static type systems cannot represent arbitrary unions cleanly.


Consistent

Goal: Uniform patterns across tools.

structured-changelog enforces consistent format across repositories:

# Same schema, same categories, everywhere
schangelog validate CHANGELOG.json
schangelog generate CHANGELOG.json -o CHANGELOG.md

Why it matters: Agents learn patterns. Inconsistency forces per-repo special cases.

agent-team-release applies identical release process:

# Same workflow for any repository
agent-team-release \
  --version v1.2.0 \
  --changelog CHANGELOG.md \
  --dry-run

Why it matters: Agents can generalize across projects.

design-system-spec provides uniform design tokens:

{
  "colors": {
    "primary": {"value": "#1a237e", "type": "color"}
  },
  "spacing": {
    "sm": {"value": "8px", "type": "dimension"}
  }
}

Why it matters: Same token names work across all components.


Testable

Goal: Safe, low-cost experimentation.

w3pilot supports headless testing:

# No visible browser, fast iteration
w3pilot --headless --screenshot-on-error

Why it matters: Agents iterate hundreds of times. Visual browsers slow this down.

agent-team-release includes dry-run mode:

# See what would happen without doing it
agent-team-release --version v1.2.0 --dry-run

# Output shows planned actions
Would create tag: v1.2.0
Would update CHANGELOG.md
Would create GitHub release

Why it matters: Agents can validate plans before execution.

agent-team-stats verifies against sources:

# Statistics include source URLs for verification
agent-team-stats --topic "AI adoption" --verify

# Output includes verification status
{
  "statistic": "73% of enterprises use AI",
  "source_url": "https://...",
  "verified": true,
  "verification_date": "2024-01-15"
}

Why it matters: Agents (and humans) can validate claims.


Cross-Reference Matrix

Tool D I R E C T
agent-team-release
agent-team-stats
ax-spec
brandkit
d2vision
design-system-spec
multi-agent-spec
schemalint
structured-changelog
traffic2openapi
w3pilot

Lessons Learned

1. Specification-First Development

Every tool benefits from a machine-readable specification:

  • APIs get OpenAPI specs
  • Schemas get JSON Schema
  • Changelogs get structured JSON
  • Design systems get token specs

When specs don't exist, generate them (traffic2openapi).

2. Static Typing as a Constraint

Design for the least flexible consumer (Go, Rust) rather than the most flexible (Python, JavaScript):

  • Avoid oneOf/anyOf/allOf where possible
  • Use explicit discriminator fields
  • Validate schemas for static type compatibility

3. Errors as API

Error responses are part of the interface:

  • Every error needs a machine-readable code
  • Every error needs a suggestion
  • Retryable status must be explicit

4. Consistency Compounds

Patterns that work across tools reduce agent complexity:

  • Same CLI flag conventions
  • Same error response shapes
  • Same output formats

5. Testability Enables Iteration

Agents iterate rapidly. Every tool needs:

  • Dry-run mode for mutations
  • Headless mode for UI operations
  • Verification mode for data

Conclusion

DIRECT principles provide actionable guidance for building agent-friendly tools. This ecosystem demonstrates that the principles apply across different domains:

  • Specification tools (ax-spec, traffic2openapi)
  • Validation tools (schemalint)
  • Automation tools (agent-team-release, w3pilot)
  • Content tools (structured-changelog, d2vision)

The common thread: design for machine consumption first, human convenience second.

Resources

  • AX Spec - OpenAPI linting and enrichment
  • PlexusOne - Agent tools and SDK integrations
  • grokify - Specifications and tooling