Auto-generated captions¶

This guide shows how to enable auto-captioning for on-demand HeapStream videos to improve accessibility and generate transcripts.

Overview¶

HeapStream uses OpenAI's Whisper model to caption on-demand videos automatically.

The feature works best with clear audio and may struggle when a recording contains a lot of non-speech sound or silence.

We recommend testing it on your typical content to assess its performance for your use case.

Enabling Auto-Captioning¶

GUI

* Enable auto-captioning for all videos on the Project Settings page.
* Enable auto-captioning when uploading a video on the Video Upload page.
* Generate captions for an existing video on the Video Edit -> Text Tracks page.

API

* Enable auto-captioning for all videos of a project with the "Update project settings" endpoint.
* Set the auto_tt array when uploading or fetching a video.
* Generate captions for an existing video with the "Generate auto captions" endpoint.
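As a rough illustration of the API route, the sketch below builds a request body that carries the auto_tt array. Only the field name auto_tt comes from the text above. The rest is assumed for illustration: that auto_tt holds language codes, that the body is JSON, the title field, and the helper name build_upload_payload. Check the API reference for the actual schema and endpoint.

```python
import json


def build_upload_payload(title, caption_languages):
    """Build a JSON body for a video upload request (hypothetical schema)."""
    if not caption_languages:
        raise ValueError("at least one language code is required")
    # Assumption: auto_tt is a list of language codes such as "en".
    payload = {"title": title, "auto_tt": list(caption_languages)}
    return json.dumps(payload)


# Request auto-generated English captions for a new upload.
body = build_upload_payload("Product demo", ["en"])
print(body)
```

The resulting string would be sent as the request body with whichever HTTP client you already use.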

FAQ¶

List of supported languages¶

Below are supported VOD auto-captioning languages and codes (note that beta languages may have lower accuracy):

Language	Language Code	Status
English	en	Stable
Spanish	es	Stable
Italian	it	Stable
Portuguese	pt	Stable
German	de	Stable
French	fr	Stable
Korean	ko	Stable
Dutch	nl	Stable
Thai	th	Stable
Russian	ru	Stable
Polish	pl	Stable
Japanese	ja	Stable
Swedish	sv	Stable
Turkish	tr	Stable
Catalan	ca	Stable
Indonesian	id	Stable
Ukrainian	uk	Stable
Malay	ms	Stable
Mandarin	zh	Stable
Finnish	fi	Stable
Norwegian	no	Stable
Romanian	ro	Stable

What is the cost of auto-captioning for VOD?¶

It's free.

How long does it take to generate captions?¶

For English, captioning takes about 0.2x the duration of the speech. For other languages, it takes 0.5x the duration.
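To make these multipliers concrete, here is a minimal sketch that turns a speech duration into a rough processing-time estimate. The function name estimate_caption_seconds and the constant names are hypothetical. The factors are the ones quoted above, and actual times may vary.

```python
# Multipliers from the text above: processing time as a fraction of speech duration.
ENGLISH_FACTOR = 0.2
OTHER_FACTOR = 0.5


def estimate_caption_seconds(speech_seconds, language_code):
    """Return a rough captioning time in seconds for the given speech duration."""
    if speech_seconds < 0:
        raise ValueError("speech_seconds must be non-negative")
    factor = ENGLISH_FACTOR if language_code == "en" else OTHER_FACTOR
    return speech_seconds * factor


# One hour of English speech versus one hour of Spanish speech, in minutes.
print(estimate_caption_seconds(3600, "en") / 60)
print(estimate_caption_seconds(3600, "es") / 60)
```

By these figures, an hour of English speech would take roughly 12 minutes to caption and an hour of Spanish speech roughly 30 minutes.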

My audio contains speech in multiple languages¶

We do not recommend using auto-captioning on content that contains multiple languages.

The speech is in a language you don't support¶

Please contact us regarding any unsupported languages you need for auto-captioning.

Auto-generated captions may have mistakes¶

While automatic speech recognition has significantly improved, occasional errors may still occur.

One option is to edit the auto-generated captions:

1. Download the VTT file through the API or GUI.
2. Correct mistakes in a text editor.
3. Delete the auto-generated track.
4. Create a new track with the edited VTT file.
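When the same word is misrecognized repeatedly, the correction step can be scripted instead of done by hand. The sketch below is one simplified way to do that, assuming a standard WebVTT file: it applies replacements to cue text and leaves the WEBVTT header and timing lines untouched. The helper name fix_caption_text and the sample cue are hypothetical. Downloading the file and creating the new track still happen through the API or GUI.

```python
def fix_caption_text(vtt_text, corrections):
    """Apply text corrections to cue lines, skipping the header and timing lines."""
    fixed_lines = []
    for line in vtt_text.splitlines():
        # Timing lines contain "-->"; the file header starts with "WEBVTT".
        is_structural = line.startswith("WEBVTT") or "-->" in line
        if not is_structural:
            for wrong, right in corrections.items():
                line = line.replace(wrong, right)
        fixed_lines.append(line)
    return "\n".join(fixed_lines) + "\n"


# Hypothetical caption file with a misrecognized product name.
sample = (
    "WEBVTT\n"
    "\n"
    "00:00:01.000 --> 00:00:04.000\n"
    "Welcome to Heap Stream.\n"
)
print(fix_caption_text(sample, {"Heap Stream": "HeapStream"}))
```

This simplified filter also rewrites cue identifiers and NOTE blocks if they happen to contain a matching string, so review the result before uploading it.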