Provide captions and descriptions of video

Video typically carries spoken information in the audio track and visual information in the video track. Any information that is provided only audibly or only visually will not be accessible to people who can’t hear or can’t see, respectively. Captions provide a text version of spoken words, along with any sounds that are important to understanding the content, so people who can’t hear can comprehend the information. People who can’t see get information from the video through the audio track, so video that conveys important information only in the video track requires additional content to convey that information to them.

Transcripts, captions, and descriptions make media accessible to all users:

  • Provide captions. Captions make audible content accessible to people who can’t hear, and more comprehensible to everyone. Do not rely on the auto-captioning feature provided by platforms such as YouTube. In most cases, the accuracy is not sufficient to be useful.
  • Put the content in the words. Consider how to convey concepts in a way that will be understood by people who can’t see. For example, a video of a presentation will be more accessible if the speaker describes the content of the slides.
  • Use audio description. When essential visual information is not described in the video, one approach is to provide narrative describing visual information. The audio narrative plays during the natural pauses in the video: (example of a described video).
  • Provide a media transcript. A text-based alternative includes a running description of all visual information, including descriptions of scene changes and the actions and expressions of actors, as well as a transcript of all non-speech sound and spoken dialogue.

Be aware that adding accessibility features to media, especially audio description, requires planning and time. The most efficient approach is to plan for captioning and audio description before video is produced and to account for them during production. Harvard provides accessible media support to help with the process.


  • For video with a soundtrack, are synchronized captions provided that give a text equivalent for all spoken and key non-spoken audio? If not, is a text transcript of the spoken and key non-spoken audio available on the same page as the video?
  • For video with substantial portions of visual events that are not described in the soundtrack, is audio description available? If not, is a transcript or other alternative text version of what happens in the video available on the same page as the video?
  • For audio-only recordings, is a text transcript of the spoken audio available on the same page as the audio?


✎ Technique: Writing captions

Captions allow people who can't hear a video's soundtrack to have access to a text version of the information provided in the audio.

If you caption your own video content rather than outsourcing the job to a captioning service, make sure the captions provide an accurate and meaningful alternative to the audio. In particular, make sure that the captions include all spoken content, indicate each change of speaker (using ">> SPEAKER NAME:"), and describe significant non-speech sounds, set off in square brackets (for example, "[BUTTON CLICK]").
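As an illustration only, the conventions above can be sketched as a small Python script that writes caption cues in WebVTT, a common caption file format for web video. The `Cue` class, its field names, and the sample dialogue are hypothetical and not part of any captioning tool; the script shows one way to apply the ">> SPEAKER NAME:" and "[SOUND]" conventions, not a required workflow.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cue:
    """One caption cue (hypothetical structure used for this sketch)."""
    start: float                    # seconds from the start of the video
    end: float                      # seconds from the start of the video
    text: str                       # spoken words, or a sound description
    speaker: Optional[str] = None   # set only when the speaker changes
    is_sound: bool = False          # True for significant non-speech sounds


def timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def caption_text(cue: Cue) -> str:
    """Apply the speaker-switch and sound-description conventions."""
    if cue.is_sound:
        raw = f"[{cue.text.upper()}]"
    elif cue.speaker:
        raw = f">> {cue.speaker.upper()}: {cue.text}"
    else:
        raw = cue.text
    # "&" and "<" must be escaped inside WebVTT cue text.
    return raw.replace("&", "&amp;").replace("<", "&lt;")


def to_webvtt(cues: List[Cue]) -> str:
    """Build the contents of a WebVTT file from a list of cues."""
    blocks = ["WEBVTT"]
    for cue in cues:
        timing = f"{timestamp(cue.start)} --> {timestamp(cue.end)}"
        blocks.append(f"{timing}\n{caption_text(cue)}")
    return "\n\n".join(blocks) + "\n"


# Sample cues, invented for illustration.
EXAMPLE_CUES = [
    Cue(0.0, 2.5, "Welcome to the course.", speaker="Instructor"),
    Cue(2.5, 3.5, "button click", is_sound=True),
    Cue(3.5, 6.0, "Thanks, glad to be here.", speaker="Student"),
]

if __name__ == "__main__":
    print(to_webvtt(EXAMPLE_CUES))
```

Saving the printed text as a `.vtt` file yields captions that many web video players can load. The timings and wording should still be checked against the actual audio for accuracy.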

Video: Writing closed captions