Compliance reference · Harmonised standard · CEN / CENELEC / ETSI · Mandate M/376 + M/554

EN 301 549 captions: clause 7.1, the WCAG cross-walk, and the EU procurement bar

Where the European Accessibility Act and the Web Accessibility Directive are the law, EN 301 549 is the standard the law reaches for when an EU procurement officer or a member-state market-surveillance authority needs an actually-enforceable technical bar. The current version, V3.2.1 (published 2021-03), is the harmonised European standard for accessibility of ICT products and services — developed jointly by CEN, CENELEC, and ETSI under European Commission Mandate M/376 and updated under Mandate M/554. For captions on training video specifically, clauses 7.1.1 through 7.1.5 are the operative text. They cross-walk substantively to WCAG 2.1 SC 1.2.2, SC 1.2.4, and SC 1.2.5 — and they're what an inspector or auditor opens when a tenant's accessibility statement claims conformance and the inspector wants to verify it.

TL;DR

EN 301 549 V3.2.1 is the EU harmonised accessibility standard for ICT products and services. Clause 5 covers generic ICT requirements; clause 6 covers two-way voice; clause 7 covers ICT with video capabilities; clause 8 covers hardware; clause 9 covers web; clause 10 covers non-web documents; clause 11 covers software (incl. mobile apps); clause 12 covers documentation and support services; clause 13 covers ICT providing relay or emergency services. Clauses 7.1.1–7.1.5 are the captioning text. 7.1.1 (Captioning playback) requires that ICT supports caption display when captions exist. 7.1.2 (Captioning synchronisation) requires synchronisation between captions and the video. 7.1.3 (Preservation of captioning) requires captions to be preserved through transmission. 7.1.4 (Captions characteristics) requires captions to be visually distinct from the video. 7.1.5 (Spoken subtitles) covers programmatic exposure of caption text. EN 301 549 is referenced by Directive 2016/2102 (the Web Accessibility Directive — public-sector bodies in EU member states), the European Accessibility Act (Directive 2019/882), and a substantial fraction of EU public-procurement specifications under Directive 2014/24/EU. The substantive captioning bar cross-walks WCAG 2.1 SC 1.2.2 (Captions, Prerecorded), SC 1.2.4 (Captions, Live), and SC 1.2.5 (Audio Description, Prerecorded).

What EN 301 549 is and how it relates to the directives

EN 301 549 is a harmonised European standard: a standard produced by the European Standardisation Organisations (ESOs) under a Commission mandate, listed in the Official Journal of the European Union, that confers a presumption of conformity with the corresponding EU directive when an organisation follows it. The relevant directives are:

Directive 2016/2102 (Web Accessibility Directive). Binds public-sector bodies in EU member states (national government, regional, local, and bodies governed by public law) to accessibility on websites and mobile applications. EN 301 549 V2.1.2 was the first harmonised reference; V3.2.1 supersedes. Substantive bar: WCAG 2.1 AA equivalents. Member-state monitoring under Implementing Decision (EU) 2018/1524.
Directive 2019/882 (European Accessibility Act). Binds private-sector providers of products and services within scope (consumer banking, e-books, ATMs, ticketing terminals, e-commerce, electronic communications, audiovisual media access services) since 2025-06-28. The harmonised standard for the substantive accessibility bar is EN 301 549. See the EAA captions reference.
Directive 2014/24/EU (Public Procurement). Article 42 requires that technical specifications for procurements take accessibility into account. In practice, EU public-sector procurement frequently references EN 301 549 directly in the technical-requirement annex; the harmonised standard makes the requirement enforceable through the contracting framework.

The structural advantage of EN 301 549 over a bare WCAG reference is that it covers ICT broader than just the web — it includes hardware (clause 8), software including mobile apps (clause 11), documentation (clause 12), and ICT video capabilities (clause 7), all in one normative document. For a training-video catalogue specifically, this means the LMS, the video player, the mobile app, the caption file, and the help documentation all have a single reference standard.

Clause 7 — ICT with video capabilities

Clause 7 of EN 301 549 V3.2.1 covers ICT that includes video. The clause has three top-level subdivisions:

7.1 Caption processing technology: the operative text for synchronised captions.
7.2 Audio description technology: programmatic infrastructure for audio descriptions of visual content (a separate accessibility track from captions).
7.3 User controls for captions and audio description: the user-facing controls (toggles, volume, presentation customisation) needed to access caption and audio-description tracks.

Clause 7 applies to "ICT with video capability": the video player on a website, the video tile in a mobile app, the video that ships with a software application, the video on a hardware kiosk, and the broadcast or streaming infrastructure carrying video to a consumer. For the operator captioning a training-video catalogue on Moodle, Canvas, Brightspace, or any other LMS, clause 7 is what the catalogue has to satisfy in aggregate, covering both the captions on each video and the player infrastructure that delivers them.

Clause 7.1.1 — Captioning playback (the player's job)

Verbatim: "Where ICT displays video with synchronized audio, ICT shall provide a mode of operation to display available captions."

What this means in practice: if the video has captions available, the player has to expose them. The classic failure mode is a video player that supports a caption track but hides the toggle behind a confusing menu, or that disables the toggle on mobile, or that doesn't surface the caption track when the video is loaded inside an iframe. Clause 7.1.1 doesn't require the captions to exist (that's the job of clause 9.1.2.2 and WCAG SC 1.2.2); it requires the player to support displaying them when they do.

For a training-video catalogue, 7.1.1 is satisfied by every modern HTML5 video player (the Canvas player, Brightspace's VideoJS, Moodle's VideoJS, Panopto's player, Kaltura's player, Vimeo's player, Wistia's player, Microsoft Stream's player). The places it fails are custom-built video players (rare in LMS contexts), older Flash-era custom embeds (mostly retired), and hardware kiosks running legacy media players.

Clause 7.1.2 — Captioning synchronisation

Verbatim: "Where ICT displays captions, ICT shall preserve synchronisation between captions and the soundtrack."

What this means in practice: the caption text shown on screen has to track the audio it transcribes. The classic failure mode is a caption track that drifts: the captions run several seconds ahead of or behind the audio, often because the source caption file was generated against a different cut of the video. The drift accumulates over the length of the video; a 60-minute training video that has drifted by one second at the ten-minute mark will, if the drift is linear, be about five seconds out at minute 50, which is unreadably out of sync.

For glossary-biased captioning specifically, synchronisation is typically excellent because the caption file is generated from the actual audio with per-word timestamps. The drift failure mode arises when caption files are imported from a previous edit of the video, or when the post-production team trims the video without re-aligning the caption track. The retrofit pattern is to re-align rather than to manually patch.
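
The drift arithmetic and the re-align pattern can be sketched in a few lines of Python. This is a minimal illustration, not a production aligner: it assumes the drift is linear (as with a frame-rate or cut-length mismatch), and the function names and the two-anchor approach are illustrative choices rather than anything the standard prescribes.

```python
def drift_at(minute, ref_minute=10.0, ref_drift_s=1.0):
    """Linear drift model: drift grows in proportion to elapsed time."""
    return ref_drift_s * (minute / ref_minute)


def fit_linear(anchor_a, anchor_b):
    """Derive a linear correction from two (caption_time_s, audio_time_s) anchors.

    Each anchor pairs a timestamp in the stale caption file with the moment
    the same words are actually spoken in the current cut.
    """
    (cap_a, aud_a), (cap_b, aud_b) = anchor_a, anchor_b
    scale = (aud_b - aud_a) / (cap_b - cap_a)
    offset = aud_a - scale * cap_a
    return scale, offset


def realign(cue_time_s, scale, offset):
    """Map one caption timestamp onto the current cut."""
    return cue_time_s * scale + offset


# One second of drift at minute 10 becomes five seconds at minute 50.
print(drift_at(50))

# Captions run 1 s late at the ten-minute mark; correct every cue the same way.
scale, offset = fit_linear((0.0, 0.0), (601.0, 600.0))
print(realign(3005.0, scale, offset))  # close to 3000 s
```

A non-linear mismatch (for example, a scene removed mid-video) will not fit a single scale and offset; in that case the captions need to be re-aligned against the audio itself.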

Clause 7.1.3 — Preservation of captioning

Verbatim: "ICT that relays video content with synchronized audio shall preserve any captions that are available in the source video."

What this means in practice: if captions exist in the source, they have to survive the path through whatever ICT is in between. A streaming server has to forward the captions; an LMS player has to render them; a mobile app has to expose them; a hardware set-top box has to pass them through.

The most common operator failure mode under 7.1.3 is the LMS-content-pipeline that strips captions during course copy, course backup-and-restore, or content migration. Canvas course-copy often detaches caption tracks; Brightspace course-copy often does too; Moodle course backup generally preserves them. The audit-evidence question is whether the surviving captions are the same captions for which conformance was attested in the previous accessibility statement.
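
One hedged way to answer that question is to fingerprint the caption file before and after the copy or migration. The sketch below assumes both files can be exported as WebVTT text; the function names are hypothetical, and the normalisation (line endings and trailing whitespace only) is an assumption about what counts as "the same captions".

```python
import hashlib


def caption_fingerprint(vtt_text):
    """Hash caption text after normalising line endings and trailing whitespace."""
    lines = [line.rstrip() for line in vtt_text.replace("\r\n", "\n").split("\n")]
    normalised = "\n".join(lines).strip()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def captions_preserved(source_vtt, copied_vtt):
    """True when the copied course still carries the attested caption text."""
    if not copied_vtt:
        # Track was detached or emptied somewhere in the pipeline.
        return False
    return caption_fingerprint(source_vtt) == caption_fingerprint(copied_vtt)


source = "WEBVTT\n\n00:00:01.000 --> 00:00:03.000\nWelcome to the induction module.\n"
print(captions_preserved(source, source.replace("\n", "\r\n")))  # line endings only
print(captions_preserved(source, None))                          # track lost in copy
```

Storing the fingerprint alongside the accessibility statement gives the next audit a fixed reference to compare against.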

Clause 7.1.4 — Captions characteristics

Verbatim: "Where ICT displays captions, captions shall be displayed visually distinct from the video, with sufficient contrast and a font size adequate to be read at normal viewing distance, and the user shall have a means to adjust caption presentation."

What this means in practice: captions have to be readable. The clause splits into the visual properties of the caption text itself (contrast, font size, distinctness from the video imagery) and the user's ability to customise the presentation (toggle, font size up/down, position).

This is where most institutional procurement specifications get into the weeds. A training-video procurement that references EN 301 549 V3.2.1 will typically specify:

Minimum 4.5:1 contrast between caption text and the area of video it overlays (matching WCAG 2.1 SC 1.4.3).
User-controllable font size (matching WCAG 2.1 SC 1.4.4).
Caption text positioned to avoid obscuring on-screen content (when the source allows; in practice this is the player's positioning logic).
Caption text rendered with a translucent or solid background where the video imagery beneath would otherwise reduce contrast.

The captioning vendor's WebVTT or SRT output normally satisfies 7.1.4 by handing over the caption text correctly; the player vendor's implementation handles the visual rendering. The audit-evidence chain for a training-video tenant in EU public-sector or EAA scope has to cover both — caption-vendor responsibility ends at the file; LMS-vendor responsibility begins at the player.
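
The 4.5:1 threshold can be checked with the relative-luminance formula from WCAG 2.1. The sketch below assumes the caption text sits on a solid caption-box background with known sRGB colours; over raw video the background changes frame by frame, so a single ratio is only meaningful for the boxed case. Function names are illustrative.

```python
def _linear(channel_8bit):
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.1 definition)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) colour, 0.0 for black to 1.0 for white."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg, bg):
    """WCAG contrast ratio between two colours, from 1.0 up to 21.0."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_caption_contrast(fg, bg, minimum=4.5):
    """Check caption text colour against its caption-box background."""
    return contrast_ratio(fg, bg) >= minimum


# White caption text on a black caption box, then on a light grey one.
print(meets_caption_contrast((255, 255, 255), (0, 0, 0)))
print(meets_caption_contrast((255, 255, 255), (200, 200, 200)))
```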

Clause 7.1.5 — Spoken subtitles (programmatic exposure)

Verbatim: "Where ICT displays captions, ICT shall provide a means by which the user can determine the text of the captions and other essential information programmatically."

What this means in practice: assistive technology (screen readers, refreshable braille displays, accessibility agents) has to be able to access the caption text programmatically. The classic implementation is the HTML5 track element — a sidecar caption file referenced from the video element, with the caption text exposed in the DOM and accessible to assistive technology. The corresponding accessibility-tree exposure is what 7.1.5 enforces.

The common failure mode is a video player that "burns" captions into the video pixels (open captions) without also providing a programmatic representation. Burned-in captions satisfy 7.1.1 (they're displayed) and 7.1.4 (they're visible) but not 7.1.5 (they're not programmatically exposed). For training-video tenants in EU public-sector scope, sidecar caption tracks (closed captioning) are the audit-defensible posture.
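
A quick static check for the sidecar posture is to scan page markup for video elements that carry no caption track. This is a rough sketch using Python's standard html.parser module; it only sees tracks present in the served HTML (players that attach tracks via JavaScript will be reported as missing), and a present track element is evidence of the pattern, not proof of conformance. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class CaptionTrackAudit(HTMLParser):
    """Record, for each <video>, whether it contains a caption-bearing <track>."""

    def __init__(self):
        super().__init__()
        self.results = []      # one bool per <video> element, in document order
        self._in_video = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "video":
            self.results.append(False)
            self._in_video = True
        elif tag == "track" and self._in_video:
            # In HTML, a track with no kind attribute defaults to "subtitles".
            kind = (attr.get("kind") or "subtitles").lower()
            if kind in ("captions", "subtitles") and attr.get("src"):
                self.results[-1] = True

    def handle_endtag(self, tag):
        if tag == "video":
            self._in_video = False


def videos_missing_tracks(html):
    """Count <video> elements with no captions or subtitles track in the markup."""
    audit = CaptionTrackAudit()
    audit.feed(html)
    audit.close()
    return sum(1 for has_track in audit.results if not has_track)


page = (
    '<video src="a.mp4"><track kind="captions" src="a.vtt" srclang="en"></video>'
    '<video src="b.mp4"></video>'
)
print(videos_missing_tracks(page))
```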

The WCAG cross-walk

EN 301 549 V3.2.1 explicitly cross-walks to WCAG 2.1 in clause 9 (web content) and references back into clause 7 for video. The relevant cross-walks for captions:

EN 301 549 V3.2.1	WCAG 2.1 (Level AA unless noted)	What it requires
9.1.2.2	SC 1.2.2 (Level A; required for AA conformance)	Captions on prerecorded audio/video
9.1.2.4	SC 1.2.4	Captions on live audio/video
9.1.2.5	SC 1.2.5	Audio description of prerecorded video
7.1.1	(no direct WCAG analogue)	Player exposes available captions
7.1.2	Implicit in SC 1.2.2's "synchronised"	Captions are synchronised to audio
7.1.3	(no direct WCAG analogue)	ICT relays captions through the pipeline
7.1.4	SC 1.4.3 (contrast); SC 1.4.4 (text resize)	Captions are readable; user can adjust
7.1.5	SC 1.3.1 (info and relationships)	Captions are programmatically exposed

For a training-video procurement specifying EN 301 549 V3.2.1, the captioning vendor's deliverable is the caption file (WebVTT preferred, SRT acceptable as a universal default) plus optional documentation of the production process. The LMS or player vendor is responsible for the playback and exposure clauses (7.1.1, 7.1.4, 7.1.5). The integration-tested audit-evidence pack covers both layers.

Clause 7.3 — User controls for captions

Where clause 7.1 is about the captions themselves, clause 7.3 is about the controls the user uses to enable, customise, and disable them. The substantive requirements:

Captions can be turned on and off independently of audio description.
The user can adjust caption presentation (where the platform supports it).
The control mechanism is operable by keyboard (cross-walks to WCAG SC 2.1.1).
The control mechanism is reachable by assistive technology (cross-walks to WCAG SC 4.1.2).

The HTML5 track element pattern that every modern LMS player uses — Canvas, Brightspace, Moodle, Panopto, Kaltura — satisfies clause 7.3 in default configurations. The places it fails are custom video controls overlaid on top of the native player (where the toggle isn't keyboard-reachable) and mobile-app embeds where the platform's caption toggle is buried in a settings sheet that screen readers struggle with.

How EN 301 549 enters EU public procurement

Most EU operators first encounter EN 301 549 through procurement. A SaaS or services company of 50 to 500 employees bidding on EU public-sector or EAA-bound work will see EN 301 549 V3.2.1 in the technical-specification annex of the call for tenders. The bid response is typically structured around a Conformance Statement against the standard, often using the EN 301 549 conformance reporting framework (similar in shape to a US VPAT but using the EN clauses directly).

For a training-video service provider, including any L&D platform that surfaces video to learners, clause 7 conformance is one of the items the procurement officer scores. A bid survives when the vendor's "fully conformant against clause 7" statement holds up under the officer's verification (open the platform, watch a video, check the caption track, check synchronisation, check programmatic exposure). A bid whose "fully conformant" statement fails that verification step gets disqualified, and the vendor is frequently barred from re-bidding for the contract cycle.

This is why glossary-biased captioning matters in EU public procurement: a procurement officer's verification step, which lasts a couple of minutes per video, will catch the proper-noun mangling that auto-captioning produces, and the conformance attestation will not survive the verification.
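
A vendor can run a rough version of that verification before attesting. The sketch below is not the captioning pipeline itself; it is an illustrative pre-check, using Python's standard difflib, that flags caption tokens that look like misspellings of single-word glossary terms. The function name, the 0.8 similarity cutoff, and the sample glossary are assumptions.

```python
import difflib
import re


def flag_glossary_near_misses(caption_text, glossary, cutoff=0.8):
    """Return (token, suggested_term) pairs for tokens that nearly match a term.

    Only single-word glossary terms are handled; multi-word names and
    regulatory citations would need phrase-level matching.
    """
    terms = {term.lower(): term for term in glossary}
    flagged = []
    for token in re.findall(r"[A-Za-z0-9][A-Za-z0-9'-]*", caption_text):
        low = token.lower()
        if low in terms:
            continue  # exact hit, nothing to review
        match = difflib.get_close_matches(low, list(terms), n=1, cutoff=cutoff)
        if match:
            flagged.append((token, terms[match[0]]))
    return flagged


draft = "Upload the file to Brightspase and Kaltura."
print(flag_glossary_near_misses(draft, ["Brightspace", "Kaltura", "Panopto"]))
```

The output is a review queue, not a correction: a human still decides whether each flagged token is a mangled proper noun or a legitimate different word.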

The accessibility statement requirement

Both Directive 2016/2102 and the European Accessibility Act require an accessibility statement on the website or service. The statement must reference the conformance standard (typically EN 301 549 V3.2.1), describe non-conformant content, provide a feedback mechanism, and be reviewed periodically. Implementing Decision (EU) 2018/1523 specifies the model accessibility statement under Directive 2016/2102; the EAA reuses the same 12-clause structure.

For a training-video tenant, the statement is what an inspector reads first. If the statement says "fully conformant against EN 301 549 V3.2.1 clause 7" and the inspector watches a slice of video and finds mangled captions, the statement is non-conformant — and that's a separate finding from the substantive caption non-conformance. See the accessibility statement template for the per-clause structure and member-state variations.

Where glossary-biased captioning maps to clause 7.1

Clause 7.1.4 ("captions shall be displayed visually distinct from the video, with sufficient contrast and a font size adequate to be read at normal viewing distance") is the visual-presentation half. Clause 7.1.2 ("synchronisation") is the timing half. The substantive accuracy of the caption text — whether the captions actually convey what the speaker said — is folded into the cross-walked SC 1.2.2 from WCAG and into the CEN/CENELEC/ETSI working group's interpretation of "captions" as substantively-accurate captions, not auto-caption-shaped tokens.

Glossary-biased captioning maps to:

SC 1.2.2 (substantive accuracy): the captions accurately convey the audio, including the named technical terms, regulatory citations, and proper nouns. This is where auto-captions at 80-90% accuracy fail and glossary-biased captioning succeeds.
Clause 7.1.5 (programmatic exposure): sidecar WebVTT (preferred for EU procurement responses; see WebVTT for training videos) is what makes captions programmatically exposed.
Clause 7.1.2 (synchronisation): per-word timestamps from the captioning pipeline carry into WebVTT cue timing.
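
The third mapping can be illustrated with a small sketch that groups per-word timestamps into WebVTT cues. The input shape (text, start, end tuples in seconds) and the fixed eight-words-per-cue rule are assumptions for illustration; a real pipeline would break cues on pauses, punctuation, and line length.

```python
def vtt_timestamp(seconds):
    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def words_to_vtt(words, max_words=8):
    """Build WebVTT text from (text, start_s, end_s) tuples in spoken order."""
    lines = ["WEBVTT", ""]
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        # Cue timing comes straight from the first and last word in the chunk.
        start, end = chunk[0][1], chunk[-1][2]
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
        lines.append(" ".join(word[0] for word in chunk))
        lines.append("")
    return "\n".join(lines)


words = [("Clause", 0.0, 0.4), ("seven", 0.4, 0.9), ("applies", 0.9, 1.5)]
print(words_to_vtt(words, max_words=2))
```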

FAQ — EN 301 549

Is EN 301 549 the same as WCAG?

No. EN 301 549 is a broader standard that covers ICT in general, with web content as one chapter (clause 9) that cross-walks WCAG 2.1. WCAG 2.1 is a W3C-published web-specific guideline. EN 301 549 references WCAG for web content and adds clauses for hardware, software, mobile apps, documentation, and ICT video. For a training-video tenant in EU public procurement, EN 301 549 is what the contract specifies; WCAG is referenced inside it.

Which version is current?

V3.2.1, published 2021-03 by ETSI on behalf of CEN, CENELEC, and ETSI. Earlier versions (V1.1.2, V2.1.2, V3.1.1) are retired. A V4 update has been in working-group development under Mandate M/587, the standardisation request issued in support of the EAA; until that update is published and listed in the OJEU, V3.2.1 remains the operative reference.

If our captions are WCAG 2.1 AA-conformant, are they EN 301 549-conformant?

The substantive captioning text is covered. EN 301 549's caption requirements that don't have direct WCAG analogues (player playback exposure, transmission preservation, user controls under clause 7.3) need separate verification at the player and platform layers. For a training-video tenant on a modern LMS (Canvas, Brightspace, Moodle, Panopto, Kaltura) those player-layer requirements are typically satisfied by the platform; the operator's captioning responsibility focuses on the substantive accuracy of the caption text.

Does the EAA reference EN 301 549 directly?

The EAA's substantive accessibility requirements are listed in Annex I of the Directive. The harmonised standard providing the presumption of conformity is EN 301 549 V3.2.1 (most recently updated in the harmonised-standards listing under the Implementing Decisions cycle). The captioning text in clause 7.1 is what EAA-bound services with audiovisual content reference for the substantive bar.

What does an EN 301 549 Conformance Statement look like?

A clause-by-clause conformance assertion (typically structured as a table — clause number, conformance level, supporting notes, exceptions). The format mirrors the US VPAT but uses EN clauses. Many EU procurement officers accept either a vendor-prepared statement or a third-party-verified statement; the verification step at procurement-decision time is functionally the same regardless of which.
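
As an illustration of that table shape, the sketch below models one row per clause and renders a tab-separated statement. The conformance-level vocabulary is borrowed from the VPAT convention as an assumption; a given tender may prescribe different terms, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

# Assumed vocabulary, following the VPAT convention; adjust to the tender's terms.
LEVELS = ("Supports", "Partially supports", "Does not support", "Not applicable")


@dataclass
class ClauseRow:
    clause: str
    level: str
    notes: str = ""
    exceptions: str = ""

    def __post_init__(self):
        # Reject free-text levels so every row is comparable across bids.
        if self.level not in LEVELS:
            raise ValueError(f"unknown conformance level: {self.level!r}")


def render_statement(rows):
    """Render rows as a tab-separated table with a header line."""
    header = "Clause\tConformance level\tSupporting notes\tExceptions"
    body = [f"{r.clause}\t{r.level}\t{r.notes}\t{r.exceptions}" for r in rows]
    return "\n".join([header, *body])


rows = [
    ClauseRow("7.1.1", "Supports", "Caption toggle in native player controls"),
    ClauseRow("7.1.3", "Partially supports", "", "Course copy can detach tracks"),
]
print(render_statement(rows))
```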

Are auto-captions ever sufficient under clause 7?

For low-stakes, no-proper-noun content the answer can be yes — the captions exist, are synchronised, and are programmatically exposed. For training video with regulated content, technical terms, or proper nouns, auto-captions fall below the substantive-accuracy bar that the cross-walked SC 1.2.2 enforces. The audit-defensible posture is to treat auto-captions as a draft and run a glossary-biased correction before exposure.

How does EN 301 549 relate to the US Section 508 update?

The 2018 Revised Section 508 Standards (36 CFR Part 1194) explicitly harmonised on WCAG 2.0 AA. EN 301 549 V3.2.1 cross-walks to WCAG 2.1 AA, which adds the 2.1-specific Success Criteria (notably mobile-and-cognitive criteria) on top of 2.0 AA. For an organisation in scope of both regimes — common for EU-and-US-operating SaaS — captioning produced to WCAG 2.1 AA clears both, and a single audit-evidence pack supports both regimes' procurement officers.