Compliance reference

WCAG captions vs transcripts: when you need which

The first time a training-ops lead reads through WCAG, the two terms often blur together. They shouldn't. Captions and transcripts serve different users and satisfy different success criteria, and a transcript is sufficient on its own only for audio-only or video-only content, never for video with audio. Here is the operator's mapping.

TL;DR

Captions are the time-synchronized text you turn on inside a video player. They serve Deaf and hard-of-hearing viewers and satisfy SC 1.2.2.
Transcripts are the full text on the page (or linked PDF) alongside the media. They serve readers who prefer text, low-bandwidth users, and search engines.
For video with audio, WCAG 2.1 AA requires captions; a transcript alone is not enough.
For audio-only content (podcasts, voice recordings), a transcript alone is enough, under SC 1.2.1.

Definitions that actually matter in an audit

The spec language is worth pinning down:

Captions: synchronized text equivalent of the audio track, designed to be displayed during playback. They include dialogue, speaker identification, and meaningful non-speech sounds. See the WCAG formal definition for the normative wording.
Transcript: a static text version of the audio (and optionally the visual) content, presented as text alongside or instead of the media. Not required to be time-coded.
Descriptive transcript: a transcript that also describes significant visual content (on-screen text, demos, silent visual steps). This is the form that can satisfy SC 1.2.3 at Level A as a media alternative.
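
To make the captions-versus-transcript distinction concrete, here is a minimal Python sketch that flattens a WebVTT-style caption track into a plain transcript. The sample cues, the `vtt_to_transcript` name, and the simplified parsing are all assumptions for illustration: the sketch expects `hh:mm:ss.mmm` timestamps and no cue identifiers, so it is not a full WebVTT parser. The point is that the timing lines are the only thing the transcript drops; speaker labels and non-speech sounds carry over.

```python
import re

# Hypothetical caption track: every cue carries a time range (that is what
# makes it a caption track rather than a transcript).
SAMPLE_VTT = """WEBVTT

00:00:01.000 --> 00:00:04.000
<v Trainer>Open the admin console.

00:00:04.500 --> 00:00:06.000
[keyboard clicks]

00:00:06.500 --> 00:00:09.000
<v Trainer>Now select the Users tab.
"""

# Simplified patterns: timing lines with hours present, and voice tags that
# open a cue line.
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{3} --> \d{2}:\d{2}:\d{2}\.\d{3}")
VOICE = re.compile(r"^<v ([^>]+)>(.*)$")


def vtt_to_transcript(vtt: str) -> str:
    """Drop timing lines; keep speaker labels and sound cues as plain text."""
    lines = []
    for raw in vtt.splitlines():
        line = raw.strip()
        if not line or line == "WEBVTT" or TIMING.match(line):
            continue
        voice = VOICE.match(line)
        if voice:
            # Turn "<v Trainer>text" into "Trainer: text".
            line = f"{voice.group(1)}: {voice.group(2)}"
        lines.append(line)
    return "\n".join(lines)


print(vtt_to_transcript(SAMPLE_VTT))
```

The reverse direction does not work: a transcript has no timing information, so it cannot be turned back into captions without re-aligning it to the audio.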

The SC mapping

Content type	Captions required?	Transcript required?	Governing SC
Prerecorded video with audio (typical training)	Yes: SC 1.2.2 (A)	No (but recommended)	1.2.2 (A), plus 1.2.5 (AA) for audio description
Audio-only (podcast, voice memo)	N/A	Yes: text alternative	1.2.1 (A)
Video-only (silent demo, animated GIF)	N/A	Yes: text alternative or audio track	1.2.1 (A)
Live-streamed video with audio	Yes: live captions	No	1.2.4 (AA)
Media alternative for text (narrated read-along)	Exempt if clearly labeled as such	Already text, nothing additional	Exception to 1.2.1/1.2.2
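
The table above can be read as a small decision procedure. The following Python sketch encodes it under stated assumptions: the `Media` class, its field names, and the returned strings are hypothetical, and the function covers only the rows in the table (it raises for cases the table does not list, such as live audio-only streams).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Media:
    """Hypothetical description of one media asset."""
    has_audio: bool
    has_video: bool
    live: bool = False
    # True for a narrated read-along of text already on the page,
    # clearly labeled as a media alternative for that text.
    labeled_media_alternative: bool = False


def governing_requirement(m: Media) -> str:
    """Return the table row that applies to this asset."""
    if m.labeled_media_alternative:
        return "exempt: labeled media alternative for text"
    if m.has_audio and m.has_video:
        if m.live:
            return "live captions: SC 1.2.4 (AA)"
        return "captions: SC 1.2.2 (A)"
    if m.live:
        # Live audio-only or video-only is not a row in the table.
        raise ValueError("case not covered by the table")
    if m.has_audio:
        return "transcript: SC 1.2.1 (A)"
    if m.has_video:
        return "text alternative or audio track: SC 1.2.1 (A)"
    raise ValueError("media has neither audio nor video")


# Typical training video: captions are the requirement.
print(governing_requirement(Media(has_audio=True, has_video=True)))
# Podcast: a transcript is the requirement.
print(governing_requirement(Media(has_audio=True, has_video=False)))
```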

When a transcript can substitute

Two specific cases:

Audio-only content. A podcast with a full text transcript on the same page satisfies SC 1.2.1 at Level A. No captions are required because there is no video track to caption.
Video-only content. A silent screen recording with no audio track, accompanied by a text alternative or an audio track that presents equivalent information, satisfies SC 1.2.1.

For everything else — i.e., most training video — captions are the load-bearing requirement. A transcript is valuable (SEO, readability, LLM-crawler legibility) but it does not remove the caption obligation.

The common misread: "we have a transcript, so we don't need captions"

This mistake usually comes from reading SC 1.2.1 in isolation. 1.2.1 governs prerecorded audio-only and video-only content; synchronized media is outside its scope. Synchronized media (video with audio) falls under 1.2.2, which requires captions specifically. A page-level transcript next to your training video is great UX, but it doesn't satisfy 1.2.2.

Some accessibility teams adopt "captions plus on-page transcript" as house style. That pattern is defensible: captions meet 1.2.2, and the transcript improves the experience for people who don't use captions and helps SEO. Shipping both is more work; shipping only the transcript fails the audit.
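
A quick way to catch the misread before an auditor does is to scan the media inventory for video-with-audio assets that have no caption track, regardless of whether a transcript exists. The sketch below assumes a hypothetical `Asset` record, prerecorded content only, and made-up file names; adapt the fields to whatever your LMS or CMS actually exports.

```python
from typing import NamedTuple


class Asset(NamedTuple):
    """Hypothetical inventory row for one prerecorded media file."""
    name: str
    has_audio: bool
    has_video: bool
    has_captions: bool
    has_transcript: bool


def caption_gaps(assets):
    """Names of video-with-audio assets that have no caption track.

    has_transcript is deliberately ignored: a transcript does not
    satisfy SC 1.2.2 for synchronized media.
    """
    return [
        a.name
        for a in assets
        if a.has_audio and a.has_video and not a.has_captions
    ]


# Hypothetical inventory for illustration.
inventory = [
    Asset("onboarding.mp4", True, True, has_captions=False, has_transcript=True),
    Asset("safety-brief.mp4", True, True, has_captions=True, has_transcript=True),
    Asset("ceo-podcast.mp3", True, False, has_captions=False, has_transcript=True),
]

print(caption_gaps(inventory))
```

In this sample, only the transcript-only video is flagged; the podcast passes because audio-only content falls under SC 1.2.1, where the transcript is the requirement.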

How GlossCap helps

GlossCap exports caption tracks as SRT and WebVTT with the content WCAG requires (time-synchronized text, speaker labels, and marked non-speech sounds), and it can export a plain-text transcript alongside for the SEO/UX pattern. The transcript reuses the same glossary-corrected text as the caption track, so your engineering, medical, or product terms are consistent across both. See our SC 1.2.2 walkthrough for the caption-track content requirements, or WCAG 2.1 AA captions for the full compliance baseline.
