Multimedia Accessibility

HUIT offers guidance to schools and departments looking to add captioning to video resources.

Who benefits from closed captioning?

  • Individuals with hearing loss or hearing impairments
  • Those for whom English is a second language
  • Emerging readers
  • Anyone in a noisy environment
  • People with learning disabilities
  • All of us

What factors should you consider when choosing a captioning strategy?

  • Transcription accuracy
  • Cost
  • Time required to caption
  • Ease of integration into your video production workflow
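Transcription accuracy, the first factor above, is often quantified with word error rate (WER): the number of word-level substitutions, insertions, and deletions needed to turn the produced transcript into a trusted reference, divided by the number of words in the reference. The sketch below is one minimal way to compute it, assuming that accuracy is taken as 1 minus WER; the function names are illustrative, and vendors may measure their published accuracy figures differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: accuracy = 1 - WER, floored at zero."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))


if __name__ == "__main__":
    reference = "the quick brown fox jumps over the lazy dog today"
    hypothesis = "the quick brown fox jumps over the hazy dog today"
    print(f"{word_accuracy(reference, hypothesis):.0%}")  # prints 90%
```

Scoring a short, manually corrected sample of a video against a tool's raw output in this way gives a rough sense of how much editing a full transcript would need.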

Three approaches for captioning videos



  • Import the video into a caption editing tool and manually transcribe or compose the captions. This is a good choice for a very short video.
  • Use a speech-to-text tool, or import text from a Word document, to create a ‘first pass’, then make any needed corrections and synchronize the text with your video. Automated (speech-to-text) solutions vary in accuracy. This approach works best when the video shows a single person speaking clearly with little background noise. Accuracy drops when multiple people speak with overlapping dialog, when there is background noise, or when specialized terminology is used.
  • Use a closed captioning service that provides both the captions or transcription and the synchronization to your video. This fee-for-service approach provides the greatest accuracy with minimal use of internal resources.

* Note: For videos that will be public-facing, the text output must be edited in order to conform to WCAG 2.0 guidelines. For that reason, using a service is highly recommended.
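The second approach above ends with corrected text synchronized to the video, which is usually saved as a caption file. As a minimal sketch, assuming the corrected transcript is already available as timed segments, the following writes them out in WebVTT, a common caption format. The `Cue` class and function names are hypothetical and are not part of any tool listed on this page.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Cue:
    start: float  # seconds from the beginning of the video
    end: float    # seconds from the beginning of the video
    text: str     # corrected caption text for this time span


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues: Iterable[Cue]) -> str:
    """Render timed cues as the text of a WebVTT caption file."""
    lines = ["WEBVTT", ""]
    for cue in cues:
        if cue.end <= cue.start:
            raise ValueError(f"cue ends before it starts: {cue!r}")
        lines.append(f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}")
        lines.append(cue.text.strip())
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)


if __name__ == "__main__":
    cues = [
        Cue(0.0, 2.5, "Welcome to the lecture."),
        Cue(2.5, 6.0, "Today we will look at captioning workflows."),
    ]
    print(to_webvtt(cues))
```

Most caption editors and video platforms can import a file like this, so the timing can still be adjusted by hand afterwards.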

Three categories of tools

  • Closed caption services
  • Caption editor
  • Speech-to-text


1) Closed captioning services




3Play Media

  • Cost: Harvard negotiated pricing
  • Accuracy: Very high (99%+)
  • Turn-around time: 48 hours (Standard) or 24 hours (Premium)
  • Suitable use cases: Preferred vendor. Use for videos that need a high degree of accuracy, such as a course lecture for a student who needs captioning, or videos that will be made public. Offers a wide range of services and formats.
  • How to: Instructions: 3Play

Rev

  • Accuracy: Very high (99%+)
  • Turn-around time: 24 hours if the video is 20 minutes or less
  • Suitable use cases: Use for videos that need high accuracy, such as a course lecture for a student who needs captioning, or videos that will be made public.
  • How to: Instructions: Rev

Cielo24

  • Cost: Per-minute pricing
  • Accuracy: Very high (99%+)
  • Turn-around time: 24 or 48 hours
  • Suitable use cases: Combination. Can be integrated with video platforms.
  • How to: Instructions: Cielo24

2) Caption editors



CADET

  • Suitable use cases: DIY caption creation
  • How to: Instructions: CADET

Amara

  • Suitable use cases: DIY caption creation. Allows online collaboration.
  • How to: Instructions: Amara


3) Speech-to-text (STT)




IBM Watson

  • Accuracy: Medium-High (80-95%)
  • Suitable use cases: Automated. Can be integrated with video platforms. Has been used by DCE.
  • How to: Instructions: IBM Watson

Microsoft Azure Indexer2

  • Cost: Currently free (while in preview). Future costs are still to be determined, but the current version is $0.02/min or less.
  • Accuracy: Medium-High (80-95%)
  • Suitable use cases: Emphasizes searchability.
  • How to: Instructions: Microsoft Azure Indexer2

YouTube

  • Accuracy: Low-Medium, but provides tools to edit and correct
  • Suitable use cases: When you don’t need to collaborate to add the captioning.
  • How to: Instructions: YouTube
