Release Notes: v0.3.0

Release Date: 2026-02-21

Overview

vac v0.3.0 introduces browser video recording, enabling creation of product demos and tutorials by automating browser interactions with AI-generated voiceover. This release also adds hardware-accelerated encoding, subtitle burning, and a unified segment architecture for mixed content types.

Highlights

  • Browser Video Recording - Automate browser interactions with synchronized voiceover narration
  • Hardware-Accelerated Encoding - VideoToolbox support on macOS via the --fast flag
  • Subtitle Burning - Permanently embed subtitles into video via --subtitles-burn
  • Multi-Language Pace-to-Longest - Automatically pace each video to the longest language's audio

New Features

Browser Video Recording

Record browser-driven demos with AI-generated voiceover using declarative YAML configuration:

# demo.yaml
metadata:
  title: "Product Demo"
  defaultLanguage: "en-US"

defaultVoice:
  provider: "elevenlabs"
  voiceId: "pNInz6obpgDQGcFmaJgB"

segments:
  - id: "segment_000"
    type: "browser"
    browser:
      url: "https://example.com"
      steps:
        - action: "wait"
          duration: 1000
          voiceover:
            en-US: "Welcome to our product demo."
        - action: "click"
          selector: "#login-button"
          voiceover:
            en-US: "Click the login button to get started."

vac browser video --config demo.yaml --output demo.mp4

Supported Browser Actions

| Action     | Parameters       | Description                 |
|------------|------------------|-----------------------------|
| navigate   | url              | Navigate to a URL           |
| click      | selector or text | Click an element            |
| input      | selector, value  | Type text into an element   |
| scroll     | scrollX, scrollY | Scroll the page             |
| wait       | duration (ms)    | Wait for specified duration |
| waitFor    | selector         | Wait for element to appear  |
| hover      | selector         | Hover over an element       |
| keypress   | key              | Send keyboard input         |
| evaluate   | script           | Execute JavaScript          |
| screenshot | -                | Capture current state       |
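
The action table above can be turned into a small pre-flight check for step definitions. The following is a minimal sketch, not part of vac: the `REQUIRED` mapping and the `validate_step` helper are hypothetical names, and treating `scrollX`/`scrollY` as "at least one of" is an assumption based on the scroll example in these notes.

```python
# Hypothetical helper: required parameters per action, taken from the table.
# Each tuple lists alternatives; at least one name in the tuple must be present.
REQUIRED = {
    "navigate": [("url",)],
    "click": [("selector", "text")],
    "input": [("selector",), ("value",)],
    "scroll": [("scrollX", "scrollY")],  # assumption: one axis is enough
    "wait": [("duration",)],
    "waitFor": [("selector",)],
    "hover": [("selector",)],
    "keypress": [("key",)],
    "evaluate": [("script",)],
    "screenshot": [],
}


def validate_step(step):
    """Return a list of problems found in one step mapping (empty if none)."""
    action = step.get("action")
    if action not in REQUIRED:
        return [f"unknown action: {action!r}"]
    errors = []
    for alternatives in REQUIRED[action]:
        if not any(name in step for name in alternatives):
            errors.append(f"{action}: missing {' or '.join(alternatives)}")
    return errors
```

Running a check like this over every step before recording can catch a misspelled action or a missing parameter without launching a browser.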

Advanced Scroll Options

  • Scroll modes: relative (delta) or absolute (position)
  • Scroll behavior: auto (instant) or smooth (animated)
  • Automatic wait: Smooth scrolls wait for animation to complete

- action: "scroll"
  scrollY: 400
  scrollMode: "absolute"
  scrollBehavior: "smooth"
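
The difference between the two scroll modes can be illustrated with a few lines of arithmetic. This is a sketch of the semantics as described above, not vac's implementation; `resolve_scroll` is a hypothetical name, and the mode is passed explicitly because these notes do not state which mode is the default.

```python
def resolve_scroll(current, value, mode):
    """Return the target offset on one axis for a scroll step."""
    if mode == "relative":
        # The value is a delta added to the current offset.
        return current + value
    if mode == "absolute":
        # The value is the final offset, regardless of the current one.
        return value
    raise ValueError(f"unknown scrollMode: {mode!r}")
```

With the page already scrolled to 250, a `scrollY` of 400 lands at 650 in relative mode and at 400 in absolute mode.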

Click by Text Content

Click elements by visible text instead of CSS selectors:

- action: "click"
  text: "Submit"
  textScope: ".sidebar"    # Optional: restrict search area
  textMatch: "contains"    # contains, exact, or regex
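
The three `textMatch` modes could behave roughly as in the sketch below. This is an illustration only: `text_matches` is a hypothetical helper, and case sensitivity and whitespace trimming for `exact` are assumptions that these notes do not confirm.

```python
import re


def text_matches(element_text, text, mode):
    """Decide whether an element's visible text satisfies a click-by-text step."""
    if mode == "contains":
        return text in element_text
    if mode == "exact":
        # Assumption: surrounding whitespace is ignored for exact matches.
        return element_text.strip() == text
    if mode == "regex":
        return re.search(text, element_text) is not None
    raise ValueError(f"unknown textMatch: {mode!r}")
```

Under these assumptions, a button labelled "Submit order" matches `text: "Submit"` with `contains` but not with `exact`.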

Hardware-Accelerated Encoding

Use --fast for VideoToolbox hardware acceleration on macOS:

vac browser video --config demo.yaml --output demo.mp4 --fast

Subtitle Burning

Permanently embed subtitles into the video:

# Burn subtitles into video
vac browser video --config demo.yaml --output demo.mp4 \
  --subtitles --subtitles-burn

# Silent video with burned subtitles
vac browser video --config demo.yaml --output demo.mp4 \
  --subtitles --subtitles-burn --no-audio
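
For readers who want a sense of what burning subtitles involves, the sketch below builds a plain FFmpeg command that uses the `subtitles` filter (which needs an FFmpeg build with libass) and optionally the `h264_videotoolbox` encoder on macOS. It is an approximation written for illustration, not the command vac actually runs; `ffmpeg_burn_args` and the file names are hypothetical.

```python
def ffmpeg_burn_args(video_in, subs, video_out, fast=False, no_audio=False):
    """Build an FFmpeg argv that burns a subtitle file into a video."""
    # Hardware encoder on macOS when fast=True, software x264 otherwise.
    codec = "h264_videotoolbox" if fast else "libx264"
    args = ["ffmpeg", "-y", "-i", video_in, "-vf", f"subtitles={subs}", "-c:v", codec]
    # Drop the audio stream for a silent video, else copy it unchanged.
    args += ["-an"] if no_audio else ["-c:a", "copy"]
    args.append(video_out)
    return args
```

Subtitle paths that contain colons, quotes, or backslashes need extra escaping inside the filter argument, which this sketch does not attempt.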

Testing and Debugging Flags

Iterate quickly with partial content:

# Test first 2 segments only
vac browser video --config demo.yaml --output demo.mp4 --limit 2

# Test first 3 browser steps only
vac browser video --config demo.yaml --output demo.mp4 --limit-steps 3
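
Conceptually, the two limit flags trim the loaded configuration before recording. The sketch below shows that idea on plain dictionaries shaped like the YAML example; `apply_limits` is a hypothetical helper, and the assumption that `--limit-steps` applies independently to each segment follows the flag description rather than the source code.

```python
def apply_limits(segments, limit=None, limit_steps=None):
    """Return a trimmed copy of the segment list without mutating the input."""
    selected = segments if limit is None else segments[:limit]
    result = []
    for segment in selected:
        segment = dict(segment)  # shallow copy of the segment mapping
        browser = segment.get("browser")
        if limit_steps is not None and browser is not None:
            steps = browser.get("steps", [])
            segment["browser"] = {**browser, "steps": steps[:limit_steps]}
        result.append(segment)
    return result
```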

Multi-Language Support

Generate videos in multiple languages; each video is automatically paced to the longest language audio:

vac browser video --config demo.yaml --output demo.mp4 \
  --lang en-US,fr-FR,zh-Hans

Output:

demo.mp4          # English (primary)
demo_fr-FR.mp4    # French version
demo_zh-Hans.mp4  # Chinese version
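
The naming pattern and the pace-to-longest idea can be sketched as follows. Both helpers are hypothetical and rest on assumptions: that the first language listed is the primary one and keeps the plain output name, and that pacing means each step lasts as long as its longest voiceover across languages.

```python
from pathlib import PurePosixPath


def output_paths(output, langs):
    """Map each language to an output file, following the listing above."""
    base = PurePosixPath(output)
    paths = {}
    for index, lang in enumerate(langs):
        if index == 0:
            paths[lang] = str(base)  # assumption: first language is primary
        else:
            paths[lang] = str(base.with_name(f"{base.stem}_{lang}{base.suffix}"))
    return paths


def pace_to_longest(durations_by_lang):
    """Given per-step audio durations per language, return the paced durations."""
    lengths = {len(d) for d in durations_by_lang.values()}
    if len(lengths) != 1:
        raise ValueError("every language needs one duration per step")
    # Each step is held for its longest narration across all languages.
    return [max(step) for step in zip(*durations_by_lang.values())]
```

For instance, per-step durations of `[2.0, 3.5]` for en-US and `[2.4, 3.1]` for fr-FR would give a shared timeline of `[2.4, 3.5]` seconds.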

New CLI Command

vac browser video

vac browser video [flags]

Flags:
  -c, --config string      Configuration file (YAML)
  -o, --output string      Output video file
  -l, --lang string        Languages (comma-separated, default: en-US)
      --audio-dir string   Directory for cached TTS audio
      --provider string    TTS provider: elevenlabs or deepgram
      --voice string       Voice ID for TTS
      --width int          Browser width (default: 1920)
      --height int         Browser height (default: 1080)
      --fps int            Frame rate (default: 30)
      --headless           Run browser in headless mode
      --fast               Use hardware-accelerated encoding
      --subtitles          Generate subtitle files
      --subtitles-stt      Use STT for word-level subtitles
      --subtitles-burn     Burn subtitles into video
      --no-audio           Generate silent video (no audio track)
      --limit int          Limit number of segments to process
      --limit-steps int    Limit browser steps per segment
      --transition float   Transition duration between segments (seconds)
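
When the command is driven from a script, the argument list can be assembled from the flags listed above. This is a sketch of one possible wrapper; `build_vac_command` is a hypothetical name, and it passes `--subtitles` together with `--subtitles-burn` only because the examples in these notes always pair them.

```python
def build_vac_command(config, output, langs=None, fast=False, burn=False,
                      no_audio=False, limit=None, limit_steps=None):
    """Assemble an argv list for `vac browser video` from keyword options."""
    cmd = ["vac", "browser", "video", "--config", config, "--output", output]
    if langs:
        cmd += ["--lang", ",".join(langs)]
    if fast:
        cmd.append("--fast")
    if burn:
        cmd += ["--subtitles", "--subtitles-burn"]
    if no_audio:
        cmd.append("--no-audio")
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if limit_steps is not None:
        cmd += ["--limit-steps", str(limit_steps)]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues in paths.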

New Packages

| Package     | Description                                      |
|-------------|--------------------------------------------------|
| pkg/browser | Rod-based browser automation with step execution |
| pkg/segment | Segment abstraction (slides, browser)            |
| pkg/config  | Unified YAML configuration loading               |
| pkg/source  | Content source loaders                           |
| pkg/media   | Media duration detection utilities               |

Dependencies

  • Added google/uuid as direct dependency
  • Added gopkg.in/yaml.v3 for YAML config parsing

Installation

go install github.com/grokify/videoascode/cmd/vac@v0.3.0

Prerequisites

  • Chrome/Chromium - Required for browser automation (uses Rod)
  • FFmpeg with libass - Required for --subtitles-burn (see troubleshooting)

Documentation

Breaking Changes

None. Existing slide-based workflows continue to work unchanged.

Known Limitations

  • Scroll animations not smooth: Browser recordings capture 1 frame per step, not continuous frames during animations. Smooth scrolls will appear as jump cuts rather than fluid motion. Workaround: Use shorter scroll distances with multiple steps.
  • 30 FPS default: The frame rate defaults to 30 FPS. Higher frame rates (60 FPS) are planned for a future release.
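
The scroll workaround above (shorter distances spread over multiple steps) can be generated rather than written by hand. The sketch below is a hypothetical helper, `split_scroll`, that breaks one long downward scroll into relative scroll steps; since one frame is captured per step, more steps yield more intermediate frames.

```python
def split_scroll(total_y, max_step):
    """Split a downward scroll of total_y pixels into steps of at most max_step."""
    if total_y < 0 or max_step <= 0:
        raise ValueError("total_y must be >= 0 and max_step must be > 0")
    steps = []
    remaining = total_y
    while remaining > 0:
        delta = min(max_step, remaining)
        steps.append({"action": "scroll", "scrollY": delta, "scrollMode": "relative"})
        remaining -= delta
    return steps
```

The returned dictionaries use the same keys as the scroll example in these notes and can be serialized into the `steps` list of a segment.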

What's Next

  • FFmpeg wrapper package for type-safe video operations
  • Continuous frame capture during animations (60 FPS smooth scroll)
  • Element highlighting/annotations during demos
  • Mixed slide + browser segments in single video