Automated Video Subtitles: A Developer's Guide to the VideoFlow Captions Layer
May 22, 2026 · By VideoFlow
Learn how to automate video subtitles using the VideoFlow Captions Layer. Build high-quality, time-coded captions with TypeScript for any video pipeline.
Generating subtitles for video used to mean wrestling with complex FFmpeg filters or proprietary APIs that charged by the minute. If you're building a content automation engine, a social media factory, or a SaaS dashboard that provides personalized video recaps, you need a way to render high-quality, time-coded captions directly from your data pipeline.
In this guide, we'll dive into the CaptionsLayer in VideoFlow—the open-source TypeScript toolkit that makes automated video subtitles as simple as passing an array of JSON objects.
Why Automated Subtitles Matter for Modern Video Pipelines
Subtitles aren't just an accessibility requirement; they are a retention engine. Because a large share of social media video is watched with the sound off, baked-in captions (also known as "open captions") are essential for engagement.
For developers, the challenge is maintaining visual consistency across different rendering environments. Whether you're generating a preview in a React app or rendering a 4K MP4 on a headless server, those captions need to land on the exact same frame every time. VideoFlow solves this by treating captions as a first-class primitive within its portable VideoJSON schema.

The Anatomy of the CaptionsLayer
The @videoflow/core builder provides a dedicated $.addCaptions() method. Unlike a standard TextLayer, which handles a single string, the CaptionsLayer accepts a collection of timed entries. It inherits the typography properties of VideoFlow's text system, including custom fonts, strokes, and shadows, and adds logic that shows or hides each entry according to its position on the project's timeline.
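The timeline logic can be pictured with a small sketch. This is not VideoFlow's implementation: `CaptionEntry`, `toFrame`, and `captionAtFrame` are hypothetical names, the entry shape (caption, startTime, endTime) is an assumption, and treating each interval as half-open is one possible design choice. It only shows how timed entries can be resolved to a specific frame.

```typescript
// Hypothetical types and helpers for illustration; not part of @videoflow/core.
interface CaptionEntry {
  caption: string;
  startTime: number; // seconds
  endTime: number;   // seconds
}

// Convert a time in seconds to the nearest frame index at a given frame rate.
function toFrame(seconds: number, fps: number): number {
  return Math.round(seconds * fps);
}

// Return the caption visible on a given frame, or null if none is.
// Intervals are half-open [start, end), so back-to-back entries never overlap.
function captionAtFrame(
  entries: CaptionEntry[],
  frame: number,
  fps: number
): string | null {
  for (const entry of entries) {
    const first = toFrame(entry.startTime, fps);
    const end = toFrame(entry.endTime, fps);
    if (frame >= first && frame < end) {
      return entry.caption;
    }
  }
  return null;
}

const sampleEntries: CaptionEntry[] = [
  { caption: 'Welcome to the future of video.', startTime: 0, endTime: 2.5 },
  { caption: 'Render subtitles from code.', startTime: 2.5, endTime: 5.0 },
];

console.log(captionAtFrame(sampleEntries, 74, 30));  // 'Welcome to the future of video.'
console.log(captionAtFrame(sampleEntries, 75, 30));  // 'Render subtitles from code.'
console.log(captionAtFrame(sampleEntries, 150, 30)); // null
```

Quantizing times to frame indices before comparing is what keeps the cut between two captions on the same frame regardless of where the lookup runs.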
Basic Implementation
To add subtitles to your video, you define your typography in the first argument (properties) and your timing data in the second (settings).
import VideoFlow from '@videoflow/core';

const $ = new VideoFlow({ width: 1920, height: 1080, fps: 30 });

$.addCaptions(
  // 1. Typography & Positioning (Properties)
  {
    fontSize: 4, // 4% of project width
    fontWeight: 700,
    color: '#ffffff',
    position: [0.5, 0.85], // Centered horizontally, 85% from top
    textAlign: 'center',
    textShadow: true,
    textShadowBlur: 0.5,
    textShadowColor: 'rgba(0,0,0,0.5)',
  },
  // 2. Timing & Flow (Settings)
  {
    captions: [
      { caption: 'Welcome to the future of video.', startTime: 0, endTime: 2.5 },
      { caption: 'Render subtitles from code.', startTime: 2.5, endTime: 5.0 },
      { caption: 'Open-source and Apache-2.0.', startTime: 5.0, endTime: 7.5 },
    ],
    maxCharsPerLine: 30,
    maxLines: 2,
  }
);

$.wait('7.5s');

const json = await $.compile();
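Since the timing data usually comes from elsewhere in your pipeline, it can help to check it before it reaches $.addCaptions(). The following is a minimal sketch, not a VideoFlow feature: `CaptionEntry` and `validateCaptions` are hypothetical names, and the rules it enforces (non-empty text, endTime after startTime, no overlap between consecutive entries) are assumptions about what clean caption data looks like rather than documented requirements of the library.

```typescript
// Hypothetical shape, mirroring the entries passed to `captions` above.
interface CaptionEntry {
  caption: string;
  startTime: number; // seconds
  endTime: number;   // seconds
}

// Return a list of human-readable problems; an empty list means the data passed.
function validateCaptions(entries: CaptionEntry[]): string[] {
  const problems: string[] = [];
  let previousEnd = 0;

  entries.forEach((entry, i) => {
    if (entry.caption.trim().length === 0) {
      problems.push(`entry ${i}: caption is empty`);
    }
    if (!Number.isFinite(entry.startTime) || entry.startTime < 0) {
      problems.push(`entry ${i}: startTime must be a non-negative number`);
    }
    if (!Number.isFinite(entry.endTime) || entry.endTime <= entry.startTime) {
      problems.push(`entry ${i}: endTime must be greater than startTime`);
    }
    // Assumes entries are meant to be sequential, as in the example above.
    if (i > 0 && entry.startTime < previousEnd) {
      problems.push(`entry ${i}: starts before entry ${i - 1} ends`);
    }
    previousEnd = entry.endTime;
  });

  return problems;
}

const goodCaptions: CaptionEntry[] = [
  { caption: 'Welcome to the future of video.', startTime: 0, endTime: 2.5 },
  { caption: 'Render subtitles from code.', startTime: 2.5, endTime: 5.0 },
];

console.log(validateCaptions(goodCaptions)); // []
```

Returning a list of problems instead of throwing on the first one lets a batch job log every bad entry in a transcript in a single pass.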
Advanced Styling: Cinematic Captions
Because the CaptionsLayer is built on top of VideoFlow's visual engine, you aren't limited to plain text. You can stack GLSL effects to create "karaoke-style" glows or use blend modes to make text interact with the background footage. This is a significant advantage over alternatives like Remotion, where you'd often have to build these visual components from scratch using CSS.
For instance, adding a glow effect to your captions can make them pop against dark backgrounds:
$.addCaptions(
  {
    fontSize: 5,
    color: '#FF5A1F', // VideoFlow Orange
    effects: [
      { effect: 'glow', params: { strength: 0.6, radius: 1.2 } }
    ],
    position: [0.5, 0.8],
  },
  {
    // An array of { caption, startTime, endTime } entries, defined elsewhere
    captions: myTranscriptionData,
  }
);

The Three-Renderer Rule
One of the most powerful features of VideoFlow is that your captions will render identically across all three official renderers:
- @videoflow/renderer-dom: Use this for a frame-accurate 60fps live preview in your web app. Users can scrub the timeline and see exactly how the subtitles align with the audio.
- @videoflow/renderer-browser: Export the final MP4 directly in the user's browser tab using WebCodecs. This is perfect for client-side tools that need to avoid server costs.
- @videoflow/renderer-server: Scale your production with headless Chromium in Node.js. It's ideal for batch processing thousands of personalized videos.
By using the same VideoJSON across the entire stack, you eliminate the "it works on my machine" bugs that plague complex video pipelines.
Building an Automated Pipeline
If you're already building something like an automated podcast audiogram generator, the CaptionsLayer is your best friend. You can take the output from a transcription service (like OpenAI Whisper or Deepgram), map it to the CaptionEntry shape, and have a fully rendered video in seconds.
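As a rough sketch of that mapping step, the function below converts transcript segments into caption entries. `TranscriptSegment` is modeled on the `start`, `end`, and `text` fields of the segments in Whisper's verbose JSON output; other services use different field names and would need their own adapter. `CaptionEntry` and `segmentsToCaptions` are hypothetical names, and the entry shape is inferred from the examples earlier in this guide.

```typescript
// Modeled on Whisper's verbose JSON segments (only the fields used here).
interface TranscriptSegment {
  start: number; // seconds
  end: number;   // seconds
  text: string;
}

// Hypothetical shape, inferred from the caption examples in this guide.
interface CaptionEntry {
  caption: string;
  startTime: number;
  endTime: number;
}

// Trim whitespace, rename fields, and drop segments that are empty or zero-length.
function segmentsToCaptions(segments: TranscriptSegment[]): CaptionEntry[] {
  return segments
    .map((segment) => ({
      caption: segment.text.trim(),
      startTime: segment.start,
      endTime: segment.end,
    }))
    .filter((entry) => entry.caption.length > 0 && entry.endTime > entry.startTime);
}

const segments: TranscriptSegment[] = [
  { start: 0, end: 2.4, text: ' Hello there.' },
  { start: 2.4, end: 2.4, text: ' ' },
  { start: 2.4, end: 5, text: ' Welcome back.' },
];

const captions = segmentsToCaptions(segments);
console.log(captions.length);     // 2
console.log(captions[0].caption); // 'Hello there.'
```

The resulting array can then be passed as the `captions` setting, in the same position as `myTranscriptionData` in the styling example.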
Combined with other VideoFlow primitives like $.addAudio() and cinematic transitions, you can build a complete video production house entirely in TypeScript.
Get Started with VideoFlow
VideoFlow is fully open-source (Apache-2.0), meaning you can embed it in your commercial products today without license fees. Whether you're replacing fragile FFmpeg scripts or looking for a more portable alternative to Remotion, VideoFlow provides the primitives you need to treat video as data.
- Try it now: Head over to the VideoFlow Playground to see the Captions Layer in action.
- Read the Docs: Check out our Layer Guides for a deep dive into every available layer type.
- Star on GitHub: Join the community and contribute at github.com/ybouane/VideoFlow.
Start building your automated video pipeline today.