Release Notes: v0.3.0¶
Release Date: 2026-02-21
Overview¶
vac v0.3.0 introduces browser video recording, enabling creation of product demos and tutorials by automating browser interactions with AI-generated voiceover. This release also adds hardware-accelerated encoding, subtitle burning, and a unified segment architecture for mixed content types.
Highlights¶
- Browser Video Recording - Automate browser interactions with synchronized voiceover narration
- Hardware-Accelerated Encoding - VideoToolbox support on macOS via
--fastflag - Subtitle Burning - Permanently embed subtitles into video via
--subtitles-burn - Multi-Language Pace-to-Longest - Automatically pace videos to the longest language audio
New Features¶
Browser Video Recording¶
Record browser-driven demos with AI-generated voiceover using declarative YAML configuration:
# demo.yaml
metadata:
title: "Product Demo"
defaultLanguage: "en-US"
defaultVoice:
provider: "elevenlabs"
voiceId: "pNInz6obpgDQGcFmaJgB"
segments:
- id: "segment_000"
type: "browser"
browser:
url: "https://example.com"
steps:
- action: "wait"
duration: 1000
voiceover:
en-US: "Welcome to our product demo."
- action: "click"
selector: "#login-button"
voiceover:
en-US: "Click the login button to get started."
Supported Browser Actions¶
| Action | Parameters | Description |
|---|---|---|
navigate | url | Navigate to a URL |
click | selector or text | Click an element |
input | selector, value | Type text into an element |
scroll | scrollX, scrollY | Scroll the page |
wait | duration (ms) | Wait for specified duration |
waitFor | selector | Wait for element to appear |
hover | selector | Hover over an element |
keypress | key | Send keyboard input |
evaluate | script | Execute JavaScript |
screenshot | - | Capture current state |
Advanced Scroll Options¶
- Scroll modes:
relative(delta) orabsolute(position) - Scroll behavior:
auto(instant) orsmooth(animated) - Automatic wait: Smooth scrolls wait for animation to complete
Click by Text Content¶
Click elements by visible text instead of CSS selectors:
- action: "click"
text: "Submit"
textScope: ".sidebar" # Optional: restrict search area
textMatch: "contains" # contains, exact, or regex
Hardware-Accelerated Encoding¶
Use --fast for VideoToolbox hardware acceleration on macOS:
Subtitle Burning¶
Permanently embed subtitles into the video:
# Burn subtitles into video
vac browser video --config demo.yaml --output demo.mp4 \
--subtitles --subtitles-burn
# Silent video with burned subtitles
vac browser video --config demo.yaml --output demo.mp4 \
--subtitles --subtitles-burn --no-audio
Testing and Debugging Flags¶
Iterate quickly with partial content:
# Test first 2 segments only
vac browser video --config demo.yaml --output demo.mp4 --limit 2
# Test first 3 browser steps only
vac browser video --config demo.yaml --output demo.mp4 --limit-steps 3
Multi-Language Support¶
Generate videos in multiple languages with automatic pace-to-longest:
Output:
New CLI Command¶
vac browser video¶
vac browser video [flags]
Flags:
-c, --config string Configuration file (YAML)
-o, --output string Output video file
-l, --lang string Languages (comma-separated, default: en-US)
--audio-dir string Directory for cached TTS audio
--provider string TTS provider: elevenlabs or deepgram
--voice string Voice ID for TTS
--width int Browser width (default: 1920)
--height int Browser height (default: 1080)
--fps int Frame rate (default: 30)
--headless Run browser in headless mode
--fast Use hardware-accelerated encoding
--subtitles Generate subtitle files
--subtitles-stt Use STT for word-level subtitles
--subtitles-burn Burn subtitles into video
--no-audio Generate silent video (no audio track)
--limit int Limit number of segments to process
--limit-steps int Limit browser steps per segment
--transition float Transition duration between segments (seconds)
New Packages¶
| Package | Description |
|---|---|
pkg/browser | Rod-based browser automation with step execution |
pkg/segment | Segment abstraction (slides, browser) |
pkg/config | Unified YAML configuration loading |
pkg/source | Content source loaders |
pkg/media | Media duration detection utilities |
Dependencies¶
- Added
google/uuidas direct dependency - Added
gopkg.in/yaml.v3for YAML config parsing
Installation¶
Prerequisites¶
- Chrome/Chromium - Required for browser automation (uses Rod)
- FFmpeg with libass - Required for
--subtitles-burn(see troubleshooting)
Documentation¶
- Added Browser Video Guide - Complete browser recording documentation
- Updated Multi-Language Guide - Pace-to-longest explanation
- Updated Subtitles Guide - Subtitle burning options
- Updated CLI Reference - New commands and flags
- Added example configs in
examples/browser-demo/
Breaking Changes¶
None. Existing slide-based workflows continue to work unchanged.
Known Limitations¶
- Scroll animations not smooth: Browser recordings capture 1 frame per step, not continuous frames during animations. Smooth scrolls will appear as jump cuts rather than fluid motion. Workaround: Use shorter scroll distances with multiple steps.
- 30 FPS default: Frame rate is fixed at 30 FPS. Higher frame rates (60 FPS) planned for future release.
What's Next¶
- FFmpeg wrapper package for type-safe video operations
- Continuous frame capture during animations (60fps smooth scroll)
- Element highlighting/annotations during demos
- Mixed slide + browser segments in single video