Release Notes: v0.2.0

Release Date: 2026-02-14

Overview

vac v0.2.0 introduces automatic subtitle generation via speech-to-text (STT), OmniVoice integration as a unified abstraction over text-to-speech (TTS) and STT providers, and dictionary-based case correction so that technical terms and proper nouns are capitalized correctly in subtitles.

Highlights

Automatic Subtitle Generation - Generate SRT/VTT subtitle files from audio using Deepgram STT
OmniVoice Integration - Unified interface for TTS and STT providers
Dictionary-based Case Correction - Proper capitalization of tech terms and proper nouns in subtitles

New Features

Subtitle Generation

New vac subtitle command for generating subtitles from audio
Supports SRT and WebVTT output formats
Word-level timing accuracy via Deepgram speech-to-text
Auto-detects language from audio manifest

# Generate subtitles from audio
vac subtitle --audio audio/en-US/

# Output: subtitles/en-US.srt, subtitles/en-US.vtt
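
To make the word-level timing concrete, here is a minimal Go sketch of how per-word timestamps could be grouped into SRT cues. The `Word` type, the `buildSRT` and `formatSRTTimestamp` helpers, and the words-per-cue limit are illustrative assumptions, not vac's actual implementation; only the SRT cue layout (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) is standard.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// Word is a hypothetical word-level STT result: text plus start/end in seconds.
type Word struct {
	Text  string
	Start float64
	End   float64
}

// formatSRTTimestamp renders seconds as the SRT timestamp HH:MM:SS,mmm.
func formatSRTTimestamp(seconds float64) string {
	ms := int64(math.Round(seconds * 1000))
	h := ms / 3600000
	m := ms % 3600000 / 60000
	s := ms % 60000 / 1000
	return fmt.Sprintf("%02d:%02d:%02d,%03d", h, m, s, ms%1000)
}

// buildSRT groups words into cues of at most maxWords words each.
// Each cue spans from its first word's start to its last word's end.
func buildSRT(words []Word, maxWords int) string {
	if maxWords < 1 {
		maxWords = 1
	}
	var b strings.Builder
	cue := 0
	for i := 0; i < len(words); i += maxWords {
		end := i + maxWords
		if end > len(words) {
			end = len(words)
		}
		group := words[i:end]
		texts := make([]string, len(group))
		for j, w := range group {
			texts[j] = w.Text
		}
		cue++
		fmt.Fprintf(&b, "%d\n%s --> %s\n%s\n\n", cue,
			formatSRTTimestamp(group[0].Start),
			formatSRTTimestamp(group[len(group)-1].End),
			strings.Join(texts, " "))
	}
	return b.String()
}

func main() {
	words := []Word{
		{"Welcome", 0.0, 0.42},
		{"to", 0.42, 0.55},
		{"the", 0.55, 0.68},
		{"demo", 0.68, 1.2},
	}
	fmt.Print(buildSRT(words, 3))
}
```

WebVTT output would differ mainly in its leading `WEBVTT` header line and in using a period instead of a comma before the milliseconds.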

OmniVoice Provider Abstraction

Unified TTS provider interface via OmniVoice
Unified STT provider interface for subtitle generation
Tested with ElevenLabs (TTS) and Deepgram (STT)
Additional providers can be added behind the same interface in future releases
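
The idea of a provider abstraction can be sketched in Go as a small interface plus a registry keyed by provider name. Every name below (`STTProvider`, `Transcript`, `register`, `lookup`, `fakeProvider`) is hypothetical and chosen for illustration; this is not OmniVoice's actual API.

```go
package main

import (
	"fmt"
	"sort"
)

// Transcript is a hypothetical provider-neutral STT result.
type Transcript struct {
	Text string
}

// STTProvider is a hypothetical interface each speech-to-text backend implements.
type STTProvider interface {
	Name() string
	Transcribe(audioPath string) (Transcript, error)
}

// registry maps provider names (e.g. the value of a --provider flag) to implementations.
var registry = map[string]STTProvider{}

// register adds a provider under its own name.
func register(p STTProvider) { registry[p.Name()] = p }

// lookup returns the named provider or an error listing the known names.
func lookup(name string) (STTProvider, error) {
	if p, ok := registry[name]; ok {
		return p, nil
	}
	known := make([]string, 0, len(registry))
	for n := range registry {
		known = append(known, n)
	}
	sort.Strings(known)
	return nil, fmt.Errorf("unknown STT provider %q (known: %v)", name, known)
}

// fakeProvider stands in for a real backend so the sketch runs without API keys.
type fakeProvider struct{ name string }

func (f fakeProvider) Name() string { return f.name }

func (f fakeProvider) Transcribe(audioPath string) (Transcript, error) {
	return Transcript{Text: "transcript of " + audioPath + " via " + f.name}, nil
}

func main() {
	register(fakeProvider{name: "deepgram"})
	p, err := lookup("deepgram")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	result, _ := p.Transcribe("slide_001.mp3")
	fmt.Println(result.Text)
}
```

With this shape, calling code depends only on the interface, so supporting another backend means registering one more implementation rather than changing the callers.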

Dictionary-based Case Correction

Built-in dictionary with 200+ tech terms (AI, API, GitHub, Claude, etc.)
Custom dictionary support via JSON files
Ensures proper capitalization in auto-generated subtitles

# Custom dictionary locations
~/.config/vac/dictionaries/*.json
./dictionaries/*.json
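
As a rough illustration of dictionary-based case correction, the Go sketch below lowercases each word to look it up in a dictionary and swaps in the canonical spelling when there is a match. The JSON shape (a flat object mapping terms to their canonical form) and the `loadDictionary` and `correctCase` helpers are assumptions made for this example; the schema vac actually expects may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// wordPattern matches runs of letters and digits; punctuation and spaces are left untouched.
var wordPattern = regexp.MustCompile(`[A-Za-z0-9]+`)

// loadDictionary parses a hypothetical JSON dictionary such as {"github": "GitHub"}.
// Keys are normalized to lowercase so lookups are case-insensitive.
func loadDictionary(data []byte) (map[string]string, error) {
	raw := map[string]string{}
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, err
	}
	dict := make(map[string]string, len(raw))
	for k, v := range raw {
		dict[strings.ToLower(k)] = v
	}
	return dict, nil
}

// correctCase replaces each whole word found in dict with its canonical capitalization.
func correctCase(text string, dict map[string]string) string {
	return wordPattern.ReplaceAllStringFunc(text, func(w string) string {
		if canonical, ok := dict[strings.ToLower(w)]; ok {
			return canonical
		}
		return w
	})
}

func main() {
	dict, err := loadDictionary([]byte(`{"github": "GitHub", "api": "API", "ai": "AI"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(correctCase("the github api uses ai.", dict))
}
```

Because matching is done on whole words, a term such as "ai" is not rewritten inside a longer word like "said". Multi-word terms would need phrase-level matching, which this sketch does not attempt.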

Subtitle Embedding

Embed soft subtitles (toggleable by viewer)
Burn hard subtitles (permanent in video)
Path sanitization for safe ffmpeg execution
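
The difference between the two embedding modes, and why paths need care, can be sketched in Go. The helper names (`validateMediaPath`, `softSubtitleArgs`, `hardSubtitleArgs`) and the conservative character allowlist are assumptions for illustration, not the `SanitizePath` function vac relies on. The ffmpeg options shown (`-c:s mov_text` for a soft track in MP4, the `subtitles` video filter for burning in) are standard ffmpeg usage.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

// safePath allows only letters, digits, and . _ - / so the path needs no
// escaping when it appears inside an ffmpeg filter expression.
var safePath = regexp.MustCompile(`^[A-Za-z0-9._/-]+$`)

// validateMediaPath cleans the path and rejects anything outside the allowlist
// or anything that could be mistaken for a command-line option.
func validateMediaPath(p string) (string, error) {
	cleaned := filepath.ToSlash(filepath.Clean(p))
	if !safePath.MatchString(cleaned) || strings.HasPrefix(cleaned, "-") {
		return "", fmt.Errorf("unsafe path %q", p)
	}
	return cleaned, nil
}

// softSubtitleArgs muxes the subtitles as a separate, toggleable track.
// Video and audio streams are copied without re-encoding.
func softSubtitleArgs(video, subs, out string) []string {
	return []string{"-i", video, "-i", subs,
		"-c:v", "copy", "-c:a", "copy", "-c:s", "mov_text", out}
}

// hardSubtitleArgs burns the subtitles into the picture, which requires
// re-encoding the video; only the audio stream is copied.
func hardSubtitleArgs(video, subs, out string) []string {
	return []string{"-i", video, "-vf", "subtitles=" + subs, "-c:a", "copy", out}
}

func main() {
	subs, err := validateMediaPath("./subtitles/en-US.srt")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// These slices would be passed to exec.Command("ffmpeg", args...), which
	// runs ffmpeg directly without a shell.
	fmt.Println(softSubtitleArgs("video/presentation.mp4", subs, "video/soft.mp4"))
	fmt.Println(hardSubtitleArgs("video/presentation.mp4", subs, "video/hard.mp4"))
}
```

An allowlist is deliberately stricter than necessary: it rejects some legitimate paths (for example, ones containing spaces) in exchange for never having to reason about ffmpeg's filter escaping rules.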

New CLI Commands

`vac subtitle`

vac subtitle [flags]

Flags:
  -a, --audio string        Audio directory containing manifest.json (required)
  -o, --output string       Output directory for subtitle files (default "subtitles")
  -l, --lang string         Language code (auto-detected from manifest if not specified)
      --provider string     STT provider: deepgram or elevenlabs (default: deepgram)
      --individual          Also generate individual subtitle files per slide

`vac stt`

vac stt [flags]

Speech-to-text transcription for individual audio files.

Complete Workflow

With v0.2.0, the complete workflow from Marp presentation to video with subtitles is:

# Set API keys
export ELEVENLABS_API_KEY="your-key"
export DEEPGRAM_API_KEY="your-key"

# 1. Generate audio (TTS)
vac tts --input slides.md --output audio/en-US/

# 2. Generate video
vac video --input slides.md --manifest audio/en-US/manifest.json --output video/presentation.mp4

# 3. Generate subtitles (STT)
vac subtitle --audio audio/en-US/

# 4. Embed subtitles as a soft (toggleable) track
ffmpeg -i video/presentation.mp4 -i subtitles/en-US.srt \
  -c:v copy -c:a copy -c:s mov_text \
  -metadata:s:s:0 language=eng \
  video/presentation_with_subs.mp4

See the Complete Workflow Guide for detailed instructions.

Dependencies

Added github.com/plexusone/omnivoice v0.4.1 for unified TTS/STT interface
Added github.com/plexusone/omnivoice-deepgram v0.3.0 for Deepgram STT support
Bumped github.com/grokify/mogo to v0.73.2 for SanitizePath support

Installation

go install github.com/grokify/videoascode/cmd/vac@v0.2.0

New Prerequisites

Deepgram API key (for subtitle generation), exported as DEEPGRAM_API_KEY: sign up at Deepgram

Documentation

Added Complete Workflow Guide - End-to-end tutorial
Added Subtitle Generation Guide - Detailed subtitle documentation
Updated README with OmniVoice and Deepgram information

Breaking Changes

None. This release is fully backwards-compatible with v0.1.0.

What's Next

Burned-in subtitle styling options
Karaoke-style word highlighting
Word-by-word reveal captions for social media
Additional TTS/STT provider support via OmniVoice