Auto-generated captions

This guide shows how to enable auto-captioning for on-demand HeapStream videos to improve accessibility and generate transcripts.

Overview

HeapStream uses OpenAI's Whisper model to caption on-demand videos automatically.

This feature works best with clear audio, but may struggle with excessive non-speech sounds or silence.

We recommend testing it on your typical content to assess its performance for your use case.

Enabling Auto-Captioning

  • Enable auto-captioning for all videos on the Project Settings page.
  • Enable auto-captioning when uploading a video on the Video Upload page.
  • Generate captions for an existing video on the Video Edit -> Text Tracks page.
  • Enable auto-captioning for all videos of a project through the "Update project settings" endpoint of the API.
  • Set the auto_tt array when uploading or fetching a video through the API.
  • Generate captions for an existing video through the "Generate auto captions" endpoint of the API.
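As a rough illustration of the auto_tt option listed above, the sketch below builds a request body in Python. Only the auto_tt name comes from this guide; the assumption that it holds a list of language codes, the title field, and the build_video_payload helper are hypothetical, so check the API reference for the actual request shape before relying on it.

```python
import json


def build_video_payload(title, caption_languages):
    """Build a request body that asks for auto-captions via auto_tt.

    Assumption: auto_tt is a list of language codes such as "en".
    The "title" field is hypothetical and only here for illustration.
    """
    return {
        "title": title,
        "auto_tt": list(caption_languages),
    }


# Example: request English captions for a new upload.
payload = build_video_payload("Product demo", ["en"])
print(json.dumps(payload, indent=2))
```

The resulting dictionary would be sent as the JSON body of the upload or fetch request; the endpoint URL and authentication are deliberately left out because this guide does not specify them.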

FAQ

List of supported languages

Below are supported VOD auto-captioning languages and codes (note that beta languages may have lower accuracy):

Language      Code   Status
English       en     Stable
Spanish       es     Stable
Italian       it     Stable
Portuguese    pt     Stable
German        de     Stable
French        fr     Stable
Korean        ko     Stable
Dutch         nl     Stable
Thai          th     Stable
Russian       ru     Stable
Polish        pl     Stable
Japanese      ja     Stable
Swedish       sv     Stable
Turkish       tr     Stable
Catalan       ca     Stable
Indonesian    id     Stable
Ukrainian     uk     Stable
Malay         ms     Stable
Mandarin      zh     Stable
Finnish       fi     Stable
Norwegian     no     Stable
Romanian      ro     Stable
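If you set caption languages programmatically, it can help to check the codes against the list above before sending a request. The following is a minimal sketch; the SUPPORTED_LANGUAGES mapping simply copies the table, and the unsupported_codes helper is a hypothetical name, not part of any HeapStream SDK.

```python
# Language codes copied from the table above (all listed as Stable).
SUPPORTED_LANGUAGES = {
    "en": "English", "es": "Spanish", "it": "Italian", "pt": "Portuguese",
    "de": "German", "fr": "French", "ko": "Korean", "nl": "Dutch",
    "th": "Thai", "ru": "Russian", "pl": "Polish", "ja": "Japanese",
    "sv": "Swedish", "tr": "Turkish", "ca": "Catalan", "id": "Indonesian",
    "uk": "Ukrainian", "ms": "Malay", "zh": "Mandarin", "fi": "Finnish",
    "no": "Norwegian", "ro": "Romanian",
}


def unsupported_codes(codes):
    """Return the codes that do not appear in the supported list."""
    return [code for code in codes if code.lower() not in SUPPORTED_LANGUAGES]


# Example: "xx" is not in the table, so it is reported back.
print(unsupported_codes(["en", "fi", "xx"]))
```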

What is the cost of auto-captioning for VOD?

It's free.

How long does it take to generate captions?

For English, it takes about 0.2x the duration of the speech. For other languages, it takes about 0.5x the duration.
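To turn these ratios into a rough waiting time, multiply the speech duration by the factor for the language. The sketch below does that arithmetic; estimate_caption_seconds is a hypothetical helper, and the result is only an approximation based on the figures quoted above.

```python
def estimate_caption_seconds(speech_seconds, language_code):
    """Estimate captioning time from the ratios quoted in this FAQ.

    English is roughly 0.2x the speech duration; other languages
    are roughly 0.5x. The result is rounded to whole seconds.
    """
    factor = 0.2 if language_code == "en" else 0.5
    return round(speech_seconds * factor)


# Ten minutes of speech: about 2 minutes for English, 5 for Spanish.
print(estimate_caption_seconds(600, "en"))
print(estimate_caption_seconds(600, "es"))
```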

My audio contains speech in multiple languages

Currently, auto-captioning is not recommended for content that contains multiple languages.

The speech is in a language you don't support

Please contact us regarding any unsupported languages you need for auto-captioning.

Auto-generated captions may have mistakes!

While automatic speech recognition has significantly improved, occasional errors may still occur.

One option is to edit the auto-generated captions:

  1. Download the VTT file through the API or GUI
  2. Correct mistakes in a text editor
  3. Delete the auto-generated track
  4. Create a new track with the edited VTT file
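If the same mistake recurs throughout a file, step 2 can be scripted instead of done by hand. The sketch below is one possible approach, not an official tool: correct_vtt is a hypothetical helper that applies simple text replacements to a downloaded VTT file while leaving the WEBVTT header, timing lines, and blank lines untouched. Downloading, deleting, and re-creating the track are not covered here.

```python
def correct_vtt(vtt_text, replacements):
    """Apply text replacements to caption lines in a WebVTT document.

    The header, timing lines (those containing "-->"), and blank lines
    are passed through unchanged. Cue identifiers and NOTE blocks, if
    present, are treated as caption text by this simple sketch.
    """
    corrected = []
    for line in vtt_text.splitlines():
        is_structural = (
            line.startswith("WEBVTT")
            or "-->" in line
            or not line.strip()
        )
        if not is_structural:
            for wrong, right in replacements.items():
                line = line.replace(wrong, right)
        corrected.append(line)
    return "\n".join(corrected) + "\n"


sample = (
    "WEBVTT\n"
    "\n"
    "00:00:01.000 --> 00:00:04.000\n"
    "Welcome to heap stream.\n"
)
fixed = correct_vtt(sample, {"heap stream": "HeapStream"})
print(fixed)
```

Save the returned text to a .vtt file and use it in step 4 when creating the new track.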