CLI Reference

Complete command-line interface reference.

Command Structure

vac uses a hierarchical command structure:

vac
├── slides              # Marp slide presentations
│   ├── video          # Full pipeline: parse, TTS, record, combine
│   └── tts            # Generate audio from transcript
├── browser            # Browser automation recordings
│   ├── video          # Record with TTS voiceover
│   └── record         # Silent recording (no audio)
└── subtitle           # Generate subtitles from audio

slides video

Generate a video from a Marp presentation (full pipeline).

vac slides video [flags]

Flags

Flag	Type	Default	Description
`-i, --input`	string	required	Input Marp markdown file
`-o, --output`	string	`output.mp4`	Output video file
`-m, --manifest`	string		Audio manifest file (from `slides tts`)
`-k, --api-key`	string	`$ELEVENLABS_API_KEY`	ElevenLabs API key
`-v, --voice`	string	`pNInz6obpgDQGcFmaJgB`	ElevenLabs voice ID (Adam)
`--width`	int	`1920`	Video width in pixels
`--height`	int	`1080`	Video height in pixels
`--fps`	int	`30`	Video frame rate
`--transition`	float	`0`	Transition duration (seconds)
`--subtitles`	string		Subtitle file to embed (SRT or VTT)
`--subtitles-lang`	string	auto-detect	Subtitle language code
`--output-individual`	string		Directory for individual slide videos
`--screen-device`	string	auto-detect	macOS screen capture device
`--workdir`	string	system temp	Working directory for temp files
`--check`	bool	`false`	Verify dependencies and exit

Examples

# Full pipeline with inline voiceovers
vac slides video --input slides.md --output video.mp4

# Use pre-generated audio
vac slides video --input slides.md --manifest audio/manifest.json --output video.mp4

# With transitions and custom resolution
vac slides video --input slides.md --output video.mp4 \
  --transition 0.5 --width 1280 --height 720

# Generate individual slide videos for Udemy
vac slides video --input slides.md --output combined.mp4 \
  --output-individual ./lectures/

# Check dependencies
vac slides video --check

slides tts

Generate audio files from a transcript JSON file.

vac slides tts [flags]

Flags

Flag	Type	Default	Description
`-t, --transcript`	string	required	Transcript JSON file
`-o, --output`	string	`audio`	Output directory for audio files
`-l, --lang`	string	from transcript	Language/locale code (e.g., `en-US`)
`--provider`	string	auto-detect	TTS provider: `elevenlabs` or `deepgram`
`--elevenlabs-api-key`	string	`$ELEVENLABS_API_KEY`	ElevenLabs API key
`--deepgram-api-key`	string	`$DEEPGRAM_API_KEY`	Deepgram API key
`-f, --force`	bool	`false`	Regenerate audio even if files exist

Examples

# Generate English audio
vac slides tts --transcript transcript.json --output audio/en-US/ --lang en-US

# Generate Spanish audio with Deepgram
vac slides tts --transcript transcript.json --output audio/es-ES/ \
  --lang es-ES --provider deepgram

# Force regeneration
vac slides tts --transcript transcript.json --output audio/ --force

browser video

Record browser-driven demos with AI-generated voiceover.

vac browser video [flags]

Flags

Flag	Type	Default	Description
`-c, --config`	string	required	Configuration file (YAML/JSON)
`-o, --output`	string	`output.mp4`	Output video file
`-a, --audio-dir`	string		Directory to save/reuse audio tracks
`-p, --provider`	string	auto-detect	TTS provider: `elevenlabs` or `deepgram`
`-v, --voice`	string	from config	TTS voice ID
`-l, --lang`	string	`en-US`	Languages to generate (comma-separated)
`--elevenlabs-api-key`	string	`$ELEVENLABS_API_KEY`	ElevenLabs API key
`--deepgram-api-key`	string	`$DEEPGRAM_API_KEY`	Deepgram API key
`--width`	int	`1920`	Video width in pixels
`--height`	int	`1080`	Video height in pixels
`--fps`	int	`30`	Video frame rate
`--transition`	float	`0`	Transition duration (seconds)
`--headless`	bool	`false`	Run browser in headless mode
`--subtitles`	bool	`false`	Generate subtitles from voiceover timing
`--subtitles-stt`	bool	`false`	Generate word-level subtitles using STT
`--subtitles-burn`	bool	`false`	Burn subtitles into video (requires FFmpeg with libass)
`--no-audio`	bool	`false`	Generate video without an audio track (TTS is still used for timing and subtitles)
`--fast`	bool	`false`	Use hardware-accelerated encoding (VideoToolbox on macOS)
`--limit`	int	`0`	Limit to first N segments (0 = no limit, for testing)
`--limit-steps`	int	`0`	Limit browser segments to first N steps (0 = no limit, for testing)
`--workdir`	string	system temp	Working directory for temp files

Examples

# Basic browser demo
vac browser video --config demo.yaml --output demo.mp4

# Multi-language with audio caching
vac browser video --config demo.yaml --output demo.mp4 \
  --audio-dir ./audio --lang en-US,fr-FR,zh-Hans

# With subtitles burned in (requires FFmpeg with libass)
vac browser video --config demo.yaml --output demo.mp4 \
  --subtitles --subtitles-burn

# Silent video with burned subtitles (no audio track)
vac browser video --config demo.yaml --output demo.mp4 \
  --subtitles --subtitles-burn --no-audio

# Headless mode for CI/CD
vac browser video --config demo.yaml --output demo.mp4 --headless

# Using Deepgram TTS
vac browser video --config demo.yaml --output demo.mp4 --provider deepgram

# Fast encoding with hardware acceleration (macOS VideoToolbox)
vac browser video --config demo.yaml --output demo.mp4 --fast

# Test with limited segments (faster iteration)
vac browser video --config demo.yaml --output demo.mp4 --limit 2

# Test with limited browser steps (faster iteration)
vac browser video --config demo.yaml --output demo.mp4 --limit-steps 3

Audio Caching

When `--audio-dir` is set, vac caches generated TTS audio:

- Audio files are stored as `{audio-dir}/{language}/segment_XXX.mp3`
- Metadata JSON files store per-voiceover timing information
- Subsequent runs skip TTS generation if cached audio exists
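
To see which languages a later run can reuse, the cache directory can be inspected before calling vac. The helper below is a minimal sketch that relies only on the directory layout described above. The function name `cached_languages` is hypothetical, and the check vac itself performs to decide whether a cache entry is valid is not documented here, so treat the result as a hint rather than a guarantee.

```shell
# List the languages that already have cached TTS audio under an --audio-dir.
# Assumes only the documented layout: {audio-dir}/{language}/segment_XXX.mp3
cached_languages() {
  audio_dir=$1
  for lang_dir in "$audio_dir"/*/; do
    [ -d "$lang_dir" ] || continue
    # A language counts as cached if at least one segment file exists.
    for segment in "$lang_dir"segment_*.mp3; do
      if [ -f "$segment" ]; then
        basename "$lang_dir"
        break
      fi
    done
  done
}
```

For example, `cached_languages ./audio` prints one language directory name per line for every subdirectory that holds at least one segment file.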

Multi-Language Timing

When generating multiple languages, the video is paced to the longest audio:

1. TTS audio is generated for all requested languages
2. Per-voiceover durations are compared across languages
3. Each browser step uses the maximum duration
4. All language versions sync with the same video
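
The pacing rule above amounts to taking, for each step, the maximum voiceover duration across languages. The sketch below illustrates that calculation only. The input format (one `<language> <step> <duration-seconds>` line per voiceover) and the function name `pace_steps` are assumptions made for the example, not the format of vac's metadata files.

```shell
# Input (stdin): one line per voiceover, "<language> <step> <duration-seconds>".
# Output: one line per step, "<step> <max-duration>", in first-seen step order.
# The input format is hypothetical; it is not vac's metadata format.
pace_steps() {
  awk '
    # First time a step is seen: remember its order and seed the maximum.
    !($2 in max) { order[++n] = $2; max[$2] = $3 + 0 }
    # Keep the longest duration seen so far for this step.
    ($3 + 0) > max[$2] { max[$2] = $3 + 0 }
    END { for (i = 1; i <= n; i++) print order[i], max[order[i]] }
  '
}
```

With durations of 3.2 s (en-US), 4.7 s (fr-FR), and 2.9 s (zh-Hans) for step 1, the step would be held for 4.7 s, so the shorter languages finish before the video moves on.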

browser record

Record a browser session without audio (silent recording).

vac browser record [flags]

Flags

Flag	Type	Default	Description
`-c, --config`	string		Configuration file (YAML/JSON)
`-s, --steps`	string		Steps file defining browser actions
`-u, --url`	string		Starting URL for the browser
`-o, --output`	string	`recording.mp4`	Output video file
`--width`	int	`1920`	Browser viewport width
`--height`	int	`1080`	Browser viewport height
`--fps`	int	`30`	Video frame rate
`--headless`	bool	`false`	Run browser in headless mode
`-t, --timing`	string		Output timing JSON file
`--timeout`	int	`30000`	Default step timeout (ms)
`--workdir`	string	system temp	Working directory
`--cleanup`	bool	`true`	Clean up temp files after recording

Examples

# Record from steps file
vac browser record --url https://example.com --steps demo.json --output demo.mp4

# Record from config file
vac browser record --config demo.yaml --output demo.mp4

# Export timing data for later audio sync
vac browser record --url https://example.com --steps demo.json \
  --output demo.mp4 --timing timing.json

# Headless mode
vac browser record --url https://example.com --steps demo.json \
  --output demo.mp4 --headless

subtitle

Generate subtitles from audio files using speech-to-text.

vac subtitle [flags]

Flags

Flag	Type	Default	Description
`-a, --audio`	string	required	Audio directory containing manifest.json
`-o, --output`	string	`subtitles`	Output directory for subtitle files
`-l, --lang`	string	from manifest	Language code
`--provider`	string	`deepgram`	STT provider: `deepgram` or `elevenlabs`
`--individual`	bool	`false`	Also generate per-slide subtitle files

Examples

# Generate subtitles (language read from the manifest)
vac subtitle --audio audio/en-US/

# Custom output directory
vac subtitle --audio audio/fr-FR/ --output subs/

# Keep individual slide subtitles
vac subtitle --audio audio/en-US/ --individual
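
The `slides tts`, `subtitle`, and `slides video` commands can be chained for one language. The helper below is a sketch that only prints the three commands so the plan can be reviewed before anything runs. It uses only flags listed in this reference. The function name `plan_slides_pipeline`, the `subs/` and `video-<lang>.mp4` output names, and the assumption that `slides tts` writes `manifest.json` into its output directory are illustrative choices, not documented behavior.

```shell
# Print (do not run) the commands for a single-language slides pipeline.
# Assumption: `slides tts` writes manifest.json into its --output directory.
plan_slides_pipeline() {
  slides=$1 transcript=$2 lang=$3
  audio_dir="audio/$lang"
  # 1. Generate audio from the transcript.
  echo "vac slides tts --transcript $transcript --output $audio_dir/ --lang $lang"
  # 2. Generate subtitles from the audio directory.
  echo "vac subtitle --audio $audio_dir/ --output subs/$lang/"
  # 3. Build the video from the pre-generated audio manifest.
  echo "vac slides video --input $slides --manifest $audio_dir/manifest.json --output video-$lang.mp4"
}
```

Calling `plan_slides_pipeline slides.md transcript.json en-US` prints the three commands in order. For paths without spaces, the output can be piped to `sh` once it looks right.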

Environment Variables

Variable	Description
`ELEVENLABS_API_KEY`	ElevenLabs API key for TTS/STT
`DEEPGRAM_API_KEY`	Deepgram API key for TTS/STT
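
A script that calls vac can check for the relevant key up front rather than failing partway through a run. The following is a minimal sketch. The function name `require_api_key` and its return codes are hypothetical, and it does not reproduce vac's own provider auto-detection, which is not described in this reference.

```shell
# Return 0 if the API key for the given provider is set and non-empty,
# 1 if it is missing, 2 if the provider name is not recognized.
require_api_key() {
  case $1 in
    elevenlabs) key_name=ELEVENLABS_API_KEY; key_value=${ELEVENLABS_API_KEY:-} ;;
    deepgram)   key_name=DEEPGRAM_API_KEY;   key_value=${DEEPGRAM_API_KEY:-} ;;
    *) echo "unknown provider: $1" >&2; return 2 ;;
  esac
  if [ -z "$key_value" ]; then
    echo "$key_name is not set" >&2
    return 1
  fi
}
```

For example, `require_api_key deepgram && vac subtitle --audio audio/en-US/` runs the subtitle step only when `DEEPGRAM_API_KEY` is present.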

Exit Codes

Code	Description
0	Success
1	General error
2	Missing dependencies
3	Invalid input file
4	TTS generation failed
5	Recording failed
6	Video combination failed
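
In CI scripts it can help to translate the exit code into a readable message. The helper below is a sketch that mirrors the table above. The function name `describe_exit` is hypothetical, and any code outside the table is reported as unknown.

```shell
# Map a vac exit code to the description from the Exit Codes table.
describe_exit() {
  case $1 in
    0) echo "Success" ;;
    1) echo "General error" ;;
    2) echo "Missing dependencies" ;;
    3) echo "Invalid input file" ;;
    4) echo "TTS generation failed" ;;
    5) echo "Recording failed" ;;
    6) echo "Video combination failed" ;;
    *) echo "Unknown exit code $1" ;;
  esac
}

# Usage (after any vac command):
#   vac browser video --config demo.yaml --output demo.mp4 --headless
#   echo "vac: $(describe_exit $?)"
```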