Browser & Test Automation¶
Capture HTTP traffic from browsers and test automation frameworks.
Overview¶
| Method | Tool | Streaming | Best For |
|---|---|---|---|
| Playwright Adapter | Playwright (Python/TS) | Yes | Recommended |
| HAR Export | DevTools, Playwright | No | Post-hoc analysis |
| Event-based | Playwright, Puppeteer | Yes | Real-time capture |
| Intercept | Cypress | Yes | E2E test suites |
Playwright Adapter (Recommended)¶
We provide official Playwright adapters for Python and TypeScript. These handle all the complexity of capturing traffic and outputting IR format.
Location: pkg/adapters/playwright/
Why Playwright Adapter?¶
- Streaming - Records are written as they happen, not buffered in memory
- Filtering - Built-in host, method, and path filtering
- Security - Automatically excludes sensitive headers
- Compression - Optional gzip output for large captures
- Type-safe - Full TypeScript types and Python type hints
- Test integration - Works with pytest and @playwright/test
Installation¶
Quick Start¶
from playwright.sync_api import sync_playwright
from traffic2openapi_playwright import PlaywrightCapture
with sync_playwright() as p:
browser = p.chromium.launch()
context = browser.new_context()
# Start capturing
capture = PlaywrightCapture("traffic.ndjson")
capture.attach(context)
page = context.new_page()
page.goto("https://api.example.com/users")
# Stop and save
capture.close()
browser.close()
print(f"Captured {capture.count} requests")
import asyncio
from playwright.async_api import async_playwright
from traffic2openapi_playwright import PlaywrightCapture
async def main():
async with async_playwright() as p:
browser = await p.chromium.launch()
context = await browser.new_context()
capture = PlaywrightCapture("traffic.ndjson")
await capture.attach_async(context)
page = await context.new_page()
await page.goto("https://api.example.com/users")
capture.close()
await browser.close()
asyncio.run(main())
import { chromium } from 'playwright';
import { PlaywrightCapture } from 'traffic2openapi-playwright';
const browser = await chromium.launch();
const context = await browser.newContext();
// Start capturing
const capture = new PlaywrightCapture('traffic.ndjson');
capture.attach(context);
const page = await context.newPage();
await page.goto('https://api.example.com/users');
// Stop and save
await capture.close();
await browser.close();
console.log(`Captured ${capture.count} requests`);
Configuration¶
Both adapters support the same configuration options:
| Option | Default | Description |
|---|---|---|
output |
(required) | Output file path |
filter_hosts / filterHosts |
[] |
Only capture requests to these hosts |
filter_methods / filterMethods |
[] |
Only capture these HTTP methods |
exclude_paths / excludePaths |
[] |
Skip requests matching these paths |
exclude_path_patterns / excludePathPatterns |
[] |
Skip paths matching regex patterns |
exclude_headers / excludeHeaders |
See below | Headers to exclude |
capture_request_body / captureRequestBody |
true |
Capture request bodies |
capture_response_body / captureResponseBody |
true |
Capture response bodies |
max_body_size / maxBodySize |
1048576 |
Max body size in bytes (1MB) |
gzip |
false |
Gzip compress output (auto-detected from .gz) |
compression_level / compressionLevel |
9 |
Gzip level 1-9 |
on_error / onError |
None |
Error callback function |
Default excluded headers:
authorizationcookieset-cookiex-api-keyx-auth-token
Filtering Traffic¶
Filter to only capture your API traffic:
import re
from traffic2openapi_playwright import PlaywrightCapture, CaptureOptions
options = CaptureOptions(
output="traffic.ndjson",
# Only capture your API
filter_hosts=["api.example.com"],
# Only capture these methods
filter_methods=["GET", "POST", "PUT", "DELETE"],
# Exclude health checks and internal routes
exclude_paths=["/health", "/metrics", "/version"],
# Exclude static assets
exclude_path_patterns=[
re.compile(r"\.(js|css|png|jpg|svg|woff2?)$"),
re.compile(r"^/_next/"),
],
)
capture = PlaywrightCapture(options)
import { PlaywrightCapture } from 'traffic2openapi-playwright';
const capture = new PlaywrightCapture({
output: 'traffic.ndjson',
// Only capture your API
filterHosts: ['api.example.com'],
// Only capture these methods
filterMethods: ['GET', 'POST', 'PUT', 'DELETE'],
// Exclude health checks and internal routes
excludePaths: ['/health', '/metrics', '/version'],
// Exclude static assets
excludePathPatterns: [
/\.(js|css|png|jpg|svg|woff2?)$/,
/^\/_next\//,
],
});
Test Framework Integration¶
pytest (Python)¶
# conftest.py
import pytest
from playwright.sync_api import sync_playwright
from traffic2openapi_playwright import PlaywrightCapture
@pytest.fixture(scope="session")
def playwright_instance():
with sync_playwright() as p:
yield p
@pytest.fixture(scope="session")
def browser(playwright_instance):
browser = playwright_instance.chromium.launch()
yield browser
browser.close()
@pytest.fixture(scope="session")
def traffic_capture():
capture = PlaywrightCapture("test-traffic.ndjson")
yield capture
capture.close()
@pytest.fixture
def context(browser, traffic_capture):
context = browser.new_context()
traffic_capture.attach(context)
yield context
context.close()
@pytest.fixture
def page(context):
page = context.new_page()
yield page
page.close()
# test_api.py
def test_get_users(page):
response = page.goto("https://api.example.com/users")
assert response.status == 200
def test_create_user(page):
page.goto("https://api.example.com")
# ... perform actions
@playwright/test (TypeScript)¶
// tests/fixtures.ts
import { test as base } from '@playwright/test';
import { PlaywrightCapture } from 'traffic2openapi-playwright';
export const test = base.extend<{}, { capture: PlaywrightCapture }>({
capture: [async ({}, use) => {
const capture = new PlaywrightCapture({
output: 'test-traffic.ndjson',
filterHosts: ['api.example.com'],
});
await use(capture);
await capture.close();
}, { scope: 'worker' }],
});
export { expect } from '@playwright/test';
// tests/api.spec.ts
import { test, expect } from './fixtures';
test.beforeEach(async ({ context, capture }) => {
capture.attach(context);
});
test('should get users', async ({ page }) => {
const response = await page.goto('/users');
expect(response?.status()).toBe(200);
});
Generating OpenAPI¶
After capturing traffic:
# Generate OpenAPI spec
traffic2openapi generate -i traffic.ndjson -o openapi.yaml
# With gzip input
traffic2openapi generate -i traffic.ndjson.gz -o openapi.yaml
Full Documentation¶
For complete API reference and examples:
Alternative: HAR Recording¶
If you prefer using Playwright's built-in HAR recording:
const context = await browser.newContext({
recordHar: { path: 'traffic.har', content: 'embed' }
});
const page = await context.newPage();
await page.goto('https://api.example.com');
// ... perform actions ...
await context.close(); // HAR written here
// Convert to IR
// $ traffic2openapi convert har -i traffic.har -o traffic.ndjson
Pros:
- Built into Playwright
- No additional dependencies
Cons:
- Not streaming (all in memory until close)
- Requires conversion step
- Less filtering control
Alternative: Manual Event Capture¶
For custom capture logic or frameworks without adapters:
import { chromium } from 'playwright';
import * as fs from 'fs';
const browser = await chromium.launch();
const context = await browser.newContext();
const page = await context.newPage();
const records: any[] = [];
// Capture responses (includes request info)
page.on('response', async response => {
const request = response.request();
const url = new URL(request.url());
// Filter to API calls only
if (!url.hostname.includes('api.example.com')) return;
const record = {
id: crypto.randomUUID(),
timestamp: new Date().toISOString(),
source: 'playwright',
request: {
method: request.method(),
path: url.pathname,
host: url.hostname,
query: Object.fromEntries(url.searchParams),
headers: request.headers(),
},
response: {
status: response.status(),
headers: response.headers(),
body: await response.json().catch(() => null),
},
};
records.push(record);
});
await page.goto('https://api.example.com');
// ... perform actions ...
// Write NDJSON
fs.writeFileSync('traffic.ndjson',
records.map(r => JSON.stringify(r)).join('\n') + '\n');
await browser.close();
Cypress¶
Network Intercept¶
// cypress/e2e/capture.cy.ts
describe('API Capture', () => {
const records: any[] = [];
beforeEach(() => {
cy.intercept('**/api/**', (req) => {
req.continue((res) => {
records.push({
id: crypto.randomUUID(),
timestamp: new Date().toISOString(),
source: 'cypress',
request: {
method: req.method,
path: new URL(req.url).pathname,
headers: req.headers,
body: req.body,
},
response: {
status: res.statusCode,
headers: res.headers,
body: res.body,
},
});
});
});
});
after(() => {
cy.writeFile('traffic.ndjson',
records.map(r => JSON.stringify(r)).join('\n') + '\n');
});
it('captures API traffic', () => {
cy.visit('/');
// ... test actions ...
});
});
Puppeteer¶
CDP-based Capture¶
import puppeteer from 'puppeteer';
import * as fs from 'fs';
const browser = await puppeteer.launch();
const page = await browser.newPage();
const records: any[] = [];
await page.setRequestInterception(true);
page.on('request', request => {
request.continue();
});
page.on('response', async response => {
const request = response.request();
const url = new URL(request.url());
records.push({
id: crypto.randomUUID(),
timestamp: new Date().toISOString(),
source: 'puppeteer',
request: {
method: request.method(),
path: url.pathname,
host: url.hostname,
headers: request.headers(),
},
response: {
status: response.status(),
headers: response.headers(),
body: await response.text().catch(() => null),
},
});
});
await page.goto('https://api.example.com');
// ... perform actions ...
fs.writeFileSync('traffic.ndjson',
records.map(r => JSON.stringify(r)).join('\n') + '\n');
await browser.close();
Chrome DevTools¶
Manual Export¶
- Open DevTools (F12)
- Go to Network tab
- Check "Preserve log" to capture across navigations
- Perform API operations
- Right-click → "Save all as HAR with content"
- Convert:
traffic2openapi convert har -i traffic.har -o traffic.ndjson
DevTools Protocol¶
// Use Chrome DevTools Protocol for programmatic capture
const CDP = require('chrome-remote-interface');
const client = await CDP();
const { Network } = client;
await Network.enable();
Network.requestWillBeSent((params) => {
console.log('Request:', params.request.url);
});
Network.responseReceived((params) => {
console.log('Response:', params.response.status);
});
Best Practices¶
1. Filter API Traffic¶
Exclude static assets, third-party requests, and internal routes:
exclude_path_patterns=[
re.compile(r"\.(js|css|png|jpg|gif|svg|ico|woff2?)$"),
re.compile(r"^/_next/"),
re.compile(r"^/sockjs-node/"),
re.compile(r"^/__webpack_hmr"),
]
2. Capture During Tests¶
Use E2E tests to generate comprehensive traffic that covers:
- All API endpoints
- Different HTTP methods
- Various request payloads
- Success and error responses
3. Include Error Cases¶
Capture 4xx/5xx responses for complete schemas:
4. Use Realistic Data¶
Representative request bodies improve schema inference:
// Good - realistic data
await page.evaluate(async () => {
await fetch('/users', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
name: 'Alice Smith',
email: 'alice@example.com',
age: 30,
roles: ['admin', 'user'],
}),
});
});
5. Compress Large Captures¶
For extensive test suites:
6. Exclude Sensitive Headers¶
The adapters exclude auth headers by default, but verify for your use case: