Compliance Operations · Published 2026-06-03

How to audit your LMS caption library for WCAG compliance: methodology, tooling, and a 5-day sprint plan

Six weeks after ADA Title II became enforceable for qualifying public entities, the question has shifted. In April 2026, it was "do we need to comply?" Now it is "do our captions actually comply?" The second question is harder to answer than most L&D teams expect.

Having a caption file attached to every video in the LMS is not the same as having a WCAG 2.1 AA–compliant caption file attached to every video. A YouTube-auto-captioned safety training video has a caption file, but that file is almost certainly non-compliant: word error rate above 10%, mangled OSHA citation codes, missing synchronisation on safety-critical procedure narration. The difference between coverage (do we have a file?) and compliance (is the file WCAG 2.1 AA grade?) is exactly the gap an audit is designed to measure.

This post is about how to run that audit. It is not about the compliance build (that is the 90-day program post), vendor selection (the RFP playbook), or the legal landscape (the compliance matrix post). It is the operational playbook for an L&D operator, training manager, or accessibility lead who needs to know: for the caption files we already have, which ones pass and which ones fail, and how do we know the difference? The audit framework covers seven dimensions, runs across the specific LMS platforms your catalogue actually lives in, and fits inside a five-day sprint with a team of one or two people.

TL;DR

A WCAG caption compliance audit has seven dimensions. Most organisations check only the first (coverage). The other six are where the compliance risk actually lives:

  1. Coverage: Does every video in the active training catalogue have a caption file? (The baseline — necessary but not sufficient.)
  2. Format: Is the caption file in a format the LMS can serve as a sidecar track (SRT, VTT, TTML) rather than burned in? Is the file structurally valid?
  3. Accuracy: Is spot-check word error rate below 10% (the audit's screening threshold; the compliance target for WCAG 2.1 AA is 99%+ accuracy)? Is proper-noun error rate below 10%? Can you document the accuracy figure?
  4. Synchronisation: Do captions appear within two seconds of the corresponding audio? Are there long unsynchronised stretches where no caption appears while narration plays?
  5. Metadata and labelling: Is the caption track tagged with the correct language code? Is the track labelled "captions" (not "subtitles")? Is the track set as default-on for users who need it?
  6. LMS delivery: Is the caption track actually served to end users in every LMS delivery context — embedded player, mobile app, offline download, SCORM package?
  7. Documentation: Is there an accuracy record for each file? Is the caption source (vendor, tool, date) logged? Can you produce this log for a compliance audit?

The five-day sprint plan lets a single L&D operator work through all seven dimensions across a catalogue of up to 500 assets, producing a remediation-prioritised export that maps directly to the triage tiers in the 90-day compliance program.

What a caption library audit is — and what it is not

A caption library audit is a structured examination of the caption files attached to your existing training video catalogue. It produces a row-level verdict on each asset — pass, fail, or unreviewed — across the seven compliance dimensions above, plus a prioritised remediation list. It is not a legal opinion on whether your organisation is compliant with ADA Title II, Section 508, or any other framework — that determination is made by a qualified attorney. It is the factual input that informs that determination.

The audit differs from the compliance build in temporal direction. The compliance build is prospective: you are designing a system that will produce compliant captions going forward. The audit is retrospective: you are evaluating whether the captions already in your library meet the standard, and where the gaps are. Both are necessary, but they are different operations. An organisation that has run the compliance build but not the audit is producing compliant captions from today forward while leaving a back catalogue of unknown compliance status — which is exactly the state most L&D teams are in six weeks post–ADA Title II enforcement.

It also differs from a QC review of a single video. A QC review asks: does this specific caption file meet the standard? An audit asks: across the entire catalogue, how many files meet the standard, which fail, and on which dimensions? Scale and systematisation are what distinguish an audit from a review.

Why "we have captions" is not the same as "we comply"

The conflation of coverage with compliance is the single most common failure mode in L&D captioning programmes. It produces a false sense of safety at exactly the moment when the organisation needs accurate information. Four specific patterns drive this conflation:

Pattern 1: YouTube auto-captions treated as compliant. YouTube's automatic captions have word-level accuracy in the range of 80–90% on average-quality audio. On content with above-average proper-noun density — which describes virtually all professional training video — accuracy drops below 80%. WCAG 2.1 AA SC 1.2.2 requires accuracy high enough that a person with a hearing disability can understand the content, a standard that courts and the Department of Justice have consistently interpreted as requiring 99%+ accuracy for formal training content. An LMS with YouTube-captioned course videos has caption coverage but not caption compliance.

Pattern 2: Burned-in captions from video editing tools. Many training videos produced in Camtasia, Articulate Storyline, or Adobe Premiere have burned-in (open) captions — caption text rendered directly into the video frame. Burned-in captions do not satisfy WCAG 2.1 AA for three reasons: they cannot be restyled for users with combined hearing and visual disabilities who need both captions and high-contrast or large-text presentation; they cannot be turned off for users who do not need them; and they are not provided as a synchronised alternative text track as specified in WCAG SC 1.2.2. An LMS with burned-in captions has the appearance of coverage but the compliance of no captions.

Pattern 3: Caption files that are not delivered. A caption file can exist in a vendor system or cloud storage and fail to be served to end users because it was never uploaded to the LMS, or was uploaded in a format the LMS player does not support, or is present but not default-on. The audit's LMS delivery dimension (Dimension 6 below) exists specifically to catch this pattern.

Pattern 4: Accurate captions with broken synchronisation. A caption file can have correct text but misaligned timing — captions appearing one to three seconds after the corresponding audio, or long gaps where narration plays without caption text. Synchronisation failures produce a comprehension experience for a hearing-disabled learner that is materially worse than no captions, because the effort required to reconcile audio timing with caption timing is higher than the effort of reading a static transcript. WCAG SC 1.2.2 requires synchronised captions, not merely accurate ones.

The audit as input to the compliance record

A WCAG compliance audit serves two functions simultaneously: it tells the compliance team what needs to be fixed, and it begins building the documentation record that answers an investigator's or auditor's questions. The Department of Justice's ADA Title II rule and the Office for Civil Rights' Section 504/508 guidance both treat the existence of a documented audit — showing that the organisation assessed its compliance position and produced a remediation plan — as relevant mitigating evidence. The absence of an audit, by contrast, means that the organisation cannot demonstrate that it knew about its compliance gaps or had a plan to address them. Running the audit and producing the output documented below is not just an operational step — it is a compliance-record artefact.

Pre-audit setup: what you need before you start

Before you can run the seven-dimension audit, you need to establish five inputs. Missing any of them will block you partway through the sprint and force a restart. Spend Day 0 (or the afternoon before Day 1) confirming all five are in place.

Input 1: The complete asset inventory

You need a spreadsheet or database export of every video asset in the active training catalogue — not just the ones you know about, but all of them. The active catalogue is the set of videos that are assigned, accessible, or linked from an active course module in your LMS. Inactive, archived, or unpublished assets can be deferred; focus the audit on active content first. For each asset you need: asset ID or URL, course module name, content category (mandatory compliance training, product onboarding, safety training, etc.), runtime in minutes, and the compliance tier assigned during the pre-audit inventory (from the compliance programme's triage framework). If you have not run the compliance programme inventory, spend Day 0 producing the asset list before running the audit. The audit cannot proceed without it.

Input 2: LMS admin access

You need admin-level access to every LMS platform in the catalogue. In most mid-market organisations, this means the primary LMS (TalentLMS, Docebo, Absorb, Cornerstone OnDemand, Workday Learning, or another platform) plus the video hosting layer if it is separate (Kaltura, Panopto, Vimeo, Wistia). If you are auditing a sales enablement platform alongside the core LMS — WorkRamp, Allego, Bigtincan, or a microlearning platform like TalentCards — you need admin access to those platforms as well. Coordinate with IT to confirm access before the sprint starts; waiting for access provisioning during Day 2 of a five-day sprint is a common blocker.

Input 3: A sample caption file standard to test against

You need to define what "pass" means before you start scoring files. For WCAG 2.1 AA compliance, the minimum passing standard for the audit is: word error rate below 10% (confirmed by spot-check against the video audio), synchronisation offset below two seconds on 95%+ of caption blocks, file format is SRT, VTT, or TTML (not burned-in, not SBV), and the track is served to end users as a sidecar track. Write down this standard before you start. If the standard changes mid-audit ("actually, let's check for 5% WER"), you cannot make consistent comparisons across assets. Document the standard in the audit log — it becomes part of the compliance record.
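
One way to keep the standard from drifting mid-audit is to write it down as a frozen configuration and score every asset against the same object. The sketch below is illustrative only: the class, field, and function names are hypothetical, and the inputs (a spot-check WER and a list of per-block offsets in seconds) are assumed to come from the manual checks described in this post.

```python
from dataclasses import dataclass

SIDECAR_FORMATS = {"SRT", "VTT", "TTML"}  # formats the written standard accepts


@dataclass(frozen=True)
class PassStandard:
    """Audit thresholds, frozen so they cannot be changed mid-audit."""
    max_wer: float = 0.10            # spot-check WER must be below this
    max_sync_offset_s: float = 2.0   # per-block offset must be below this
    min_sync_share: float = 0.95     # share of blocks that must meet the offset limit


def meets_standard(std, wer, offsets_s, caption_format, served_as_sidecar):
    """Return True only if all four criteria of the written standard hold."""
    if not offsets_s:
        return False  # no timing evidence, so no pass
    in_sync = sum(1 for o in offsets_s if abs(o) < std.max_sync_offset_s)
    sync_share = in_sync / len(offsets_s)
    return bool(
        wer < std.max_wer
        and sync_share >= std.min_sync_share
        and caption_format.upper() in SIDECAR_FORMATS
        and served_as_sidecar
    )
```

Because the dataclass is frozen, a mid-audit attempt to change a threshold raises an error rather than silently changing the scoring for later assets.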

Input 4: The accuracy spot-check protocol

You cannot manually verify accuracy on every caption file for a catalogue of hundreds of videos during a five-day sprint. You need a spot-check protocol: a defined random sample of two to three one-minute segments per video, against which you compare the caption text to the audio. The DCMP (Described and Captioned Media Program) spot-check protocol — count the number of substitution errors (wrong word), deletion errors (missed word), and insertion errors (extra word) in the sample, divide by the total words in the sample — is the standard. A WER above 10% in the spot check is a fail. A WER of 5–10% is a marginal pass requiring a full review. Below 5% is a clear pass. Document the sampled segments (timestamp range, word count, error count) for each video reviewed — this is the accuracy record that the documentation audit (Dimension 7) will reference.
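
The arithmetic behind the spot check is small enough to script, which keeps the bands consistent across auditors. This is a minimal sketch with hypothetical function names; the error counts are assumed to come from a human listener, and the band boundaries are the ones stated above (a WER of exactly 10% is treated as a marginal pass).

```python
def spot_check_wer(substitutions, deletions, insertions, total_words):
    """WER for one sample: all counted errors divided by words in the sample."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return (substitutions + deletions + insertions) / total_words


def wer_band(wer):
    """Map a spot-check WER onto the three bands used in this audit."""
    if wer > 0.10:
        return "fail"
    if wer >= 0.05:
        return "marginal pass"  # requires a full review
    return "clear pass"
```

For example, 3 substitutions, 1 deletion, and 1 insertion in a 250-word sample give a WER of 2%, which lands in the clear-pass band.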

Input 5: The audit log template

Prepare a spreadsheet with the following columns before Day 1:

  1. Asset ID
  2. Asset Title
  3. Course Module
  4. Compliance Tier (1–4)
  5. Runtime (minutes)
  6. Caption File Present (Y/N)
  7. Caption Format (SRT/VTT/TTML/Burned-in/None)
  8. Format Valid (Y/N)
  9. Accuracy Spot-Check WER (%)
  10. Sync Pass (Y/N)
  11. Language Tag Correct (Y/N)
  12. LMS Delivery Confirmed (Y/N)
  13. Documentation Record Exists (Y/N)
  14. Overall Verdict (Pass/Fail/Needs Review)
  15. Remediation Priority (Critical/High/Medium/Low)
  16. Remediation Action (specific action required)

This template is the working output of the audit. By Day 5, every active asset in the catalogue should have a row in this log with a verdict in every column.
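
If the asset inventory already exists as an export, the template can be generated rather than typed. The sketch below uses only the Python standard library; the shortened column names and the `new_audit_log` helper are assumptions for illustration, and the inventory rows are expected as dictionaries keyed by those column names.

```python
import csv
import io

AUDIT_LOG_COLUMNS = [
    "Asset ID", "Asset Title", "Course Module", "Compliance Tier",
    "Runtime (minutes)", "Caption File Present", "Caption Format",
    "Format Valid", "Accuracy Spot-Check WER (%)", "Sync Pass",
    "Language Tag Correct", "LMS Delivery Confirmed",
    "Documentation Record Exists", "Overall Verdict",
    "Remediation Priority", "Remediation Action",
]


def new_audit_log(inventory_rows):
    """Return CSV text with one row per asset; unscored columns are left blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=AUDIT_LOG_COLUMNS,
        restval="",             # blank for columns the inventory does not fill
        extrasaction="ignore",  # drop inventory fields that are not log columns
    )
    writer.writeheader()
    writer.writerows(inventory_rows)
    return buffer.getvalue()
```

Writing the returned text to a `.csv` file gives a log that opens in any spreadsheet tool, with the inventory fields pre-filled and the verdict columns empty.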

The seven-dimension audit framework

Each dimension of the audit is a separate assessment pass over the asset inventory. Some can be done programmatically (checking whether a caption file exists, checking whether the file is valid SRT/VTT). Others require manual spot-checking (accuracy, synchronisation). Organise the five-day sprint so that the programmatic passes happen first, narrowing the population for the manual review passes. This is how a single operator can audit a catalogue of 200–500 assets in five days without skipping dimensions.

Dimension 1: Coverage audit

The coverage audit answers one question: does every active video in the catalogue have at least one caption file attached? This is the necessary-but-not-sufficient pass. Running it first gives you the baseline gap count — the number of assets that need new captions regardless of what the other six dimensions find — and lets you set that population aside for the remediation queue rather than running it through the subsequent dimensions.

How to run it: in your LMS admin dashboard, export a full content inventory that includes a field for "caption file present" or equivalent. Most enterprise LMS platforms provide this in the course or content management export. If the export does not include a caption field, you will need to spot-check a sample of course modules manually. The LMS-specific playbooks below give the exact location of the caption-file field in each platform's admin interface. Add the coverage result (Y/N) to the audit log for every asset.

Common findings at the coverage pass: back-catalogue assets from before the compliance initiative are uncaptioned (expected); newly uploaded assets from the last six months are captioned (if a new-production gate is in place); SCORM packages and xAPI activities are harder to assess because caption files may be embedded inside the package rather than attached as LMS sidecar tracks (see the LMS delivery dimension below for the SCORM-specific check). The coverage pass typically takes one to two hours for a catalogue of 200 assets if the LMS export is available, half a day if it requires manual verification.

Remediation action for coverage fails: assets with no caption file are the highest-priority remediation target, regardless of compliance tier, because no subsequent dimension check is possible without a file. Flag every coverage-fail asset with remediation priority "Critical" if it is in Tier 1 (live compliance obligation) or "High" if it is in Tiers 2–3. Tier 4 assets (no compliance obligation) can be flagged "Low" and deferred to the back-catalogue plan.
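
The tier-to-priority rule above can be applied to the audit log in bulk. This is a sketch under stated assumptions: rows are dictionaries using the audit log column names, the function names are hypothetical, and the remediation action text is placeholder wording.

```python
def coverage_fail_priority(tier):
    """Priority for an asset with no caption file, by compliance tier."""
    if tier == 1:
        return "Critical"
    if tier in (2, 3):
        return "High"
    if tier == 4:
        return "Low"
    raise ValueError(f"unknown compliance tier: {tier!r}")


def flag_coverage_fails(rows):
    """Fill priority and action on every row whose coverage result is 'N'."""
    for row in rows:
        if row["Caption File Present"] == "N":
            row["Remediation Priority"] = coverage_fail_priority(int(row["Compliance Tier"]))
            row["Remediation Action"] = "Produce new caption file"  # placeholder wording
    return rows
```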

Dimension 2: Format compliance audit

A caption file's format determines whether it can be served as a synchronised accessible track in the LMS player. The format audit checks two things: is the file format one that the LMS can serve as a sidecar track, and is the file structurally valid?

Acceptable sidecar formats for WCAG compliance are SRT (SubRip), VTT (WebVTT), and TTML (Timed Text Markup Language, including DFXP). STL (EBU STL) is used in broadcast workflows and some enterprise video platforms. SBV (SubViewer) is a YouTube-specific format that most enterprise LMS platforms do not support natively. Burned-in text is not a caption format — it is text rendered into the video frame and is not accessible. The Section 508 caption format guidance requires that captions be provided as a separate, synchronised alternative — a standard that burned-in captions do not satisfy.

Format validity is a separate check from format type. A file can be in SRT format but structurally invalid: malformed timestamps (00:01:2A,000 rather than 00:01:25,000; SRT separates milliseconds with a comma, WebVTT with a period), missing blank lines between caption blocks, overlapping timestamp ranges, or encoding errors (byte sequences that are not valid UTF-8). Structurally invalid caption files may display inconsistently across LMS players, fail to load in some delivery contexts, or generate accessibility-flag warnings in automated compliance scanners. The simplest way to check structural validity is to run the file through a free caption validator (SubRip validator, W3C TTML validator). Flag any file that fails validation as a format-fail even if the file type is correct.
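
For a bulk first pass over SRT files, the structural checks named above can be approximated in a few lines. The sketch below is not a full SubRip validator and does not replace one; `validate_srt` is a hypothetical helper that checks UTF-8 decoding, timing-line syntax, missing blank lines, and overlapping ranges, and returns a list of problems (empty means no problem found).

```python
import re

TIMING = re.compile(
    r"^(\d{2}):([0-5]\d):([0-5]\d),(\d{3}) --> (\d{2}):([0-5]\d):([0-5]\d),(\d{3})$"
)


def _to_ms(h, m, s, ms):
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def validate_srt(raw_bytes):
    """Return a list of structural problems found in an SRT file's bytes."""
    try:
        text = raw_bytes.decode("utf-8-sig")  # tolerate a UTF-8 byte-order mark
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]
    text = text.replace("\r\n", "\n").strip()
    blocks = [b for b in re.split(r"\n\s*\n", text) if b]
    if not blocks:
        return ["file contains no caption blocks"]
    problems, prev_end = [], 0
    for n, block in enumerate(blocks, start=1):
        lines = [line.strip() for line in block.split("\n")]
        if len(lines) < 3 or not lines[0].isdigit():
            problems.append(f"block {n}: expected index, timing line, and text")
            continue
        match = TIMING.match(lines[1])
        if not match:
            problems.append(f"block {n}: malformed timing line {lines[1]!r}")
            continue
        if any(TIMING.match(line) for line in lines[2:]):
            problems.append(f"block {n}: missing blank line before next block")
        groups = match.groups()
        start, end = _to_ms(*groups[:4]), _to_ms(*groups[4:])
        if end <= start:
            problems.append(f"block {n}: end time is not after start time")
        if start < prev_end:
            problems.append(f"block {n}: overlaps the previous block")
        prev_end = max(prev_end, end)
    return problems
```

Any file that returns a non-empty list is a candidate format-fail; confirm with a dedicated validator before logging the verdict.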

Burned-in caption identification: identifying burned-in captions requires watching the video. The practical indicator is that the caption text appears inside the video frame and moves with the video resolution scaling — it is part of the image, not overlaid as a separate layer. Most LMS admin interfaces do not have a field that distinguishes "sidecar SRT track" from "burned-in text" — this requires a manual check of a sample of videos. In the spot-check pass (Dimension 3), note whether captions appear to be burned-in for any asset in the sample, and flag the rest of the catalogue from that production era for a targeted check if burned-in captions are found in the sample.

Remediation action for format fails: format-fail assets where a source transcript or accurate caption text exists (but in the wrong format or structurally invalid) can often be remediated quickly — conversion from SBV to VTT is a minute of work with a conversion tool; validation repair on a malformed SRT is typically 15–30 minutes. Flag these as "High" priority with remediation action "format conversion/validation repair" rather than "recaption from scratch."
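
To illustrate why SBV-to-VTT conversion is quick: the two formats differ mainly in the timing line. The sketch below assumes the common SBV layout, where each cue starts with a line of the form `H:MM:SS.mmm,H:MM:SS.mmm`; `sbv_to_vtt` is a hypothetical helper, and its output should still be run through a validator before upload.

```python
import re

SBV_TIMING = re.compile(
    r"^(\d+):(\d{2}):(\d{2})\.(\d{3}),(\d+):(\d{2}):(\d{2})\.(\d{3})$"
)


def sbv_to_vtt(sbv_text):
    """Rewrite SBV timing lines as WebVTT cue timings; caption text is unchanged."""
    out = ["WEBVTT", ""]
    for line in sbv_text.replace("\r\n", "\n").split("\n"):
        match = SBV_TIMING.match(line.strip())
        if match:
            h1, m1, s1, f1, h2, m2, s2, f2 = match.groups()
            # WebVTT expects at least two hour digits and ' --> ' between times
            out.append(f"{int(h1):02d}:{m1}:{s1}.{f1} --> {int(h2):02d}:{m2}:{s2}.{f2}")
        else:
            out.append(line)
    return "\n".join(out).rstrip("\n") + "\n"
```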

Dimension 3: Accuracy audit

The accuracy audit is the most technically intensive dimension because it requires human listening. No automated tool can currently verify WCAG 2.1 AA caption accuracy at the reliability level the compliance record requires. Automated speech-to-text comparison tools exist (WhisperX, Caption Inspector), but they measure how closely the caption file agrees with a machine transcription, not how closely it matches the ground-truth audio. The ground-truth accuracy check requires a human listener.

The WCAG 2.1 AA accuracy standard is 99%+ word accuracy, measured against the ground-truth audio. A 1% WER on a 300-word training video segment means three substitution, deletion, or insertion errors. On medical terminology, engineering acronyms, or product names, a single substitution error can change the meaning of a procedure step: "Administer 0.5 mg" captioned as "Administer 0.5 mL" counts as just one word error in a content-agnostic WER, but it is a safety-critical error in a clinical training context.

The spot-check protocol for the accuracy audit: for each asset in the review queue, select two random one-minute segments from the video (excluding the first and last 30 seconds, which tend to be lower-density). Play the segment with captions on. Count the caption errors against the audio: substitutions (wrong word), deletions (word in audio not in captions), and insertions (word in captions not in audio). Divide the error count by the total number of words spoken in the segment, since the audio is the reference the captions are scored against. This gives the segment WER. Average the two segments to get the per-asset WER estimate. If either segment's WER is above 10%, flag the asset as accuracy-fail regardless of the average.
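
The per-asset rule has one detail that is easy to get wrong in a spreadsheet: the fail flag depends on each segment, not on the average. A minimal sketch, with a hypothetical function name and segments supplied as (error count, word count) pairs from the manual count:

```python
def asset_accuracy(segments, fail_above=0.10):
    """Return (per-asset WER estimate, accuracy_fail) from sampled segments.

    segments: list of (error_count, word_count) pairs, one per sampled segment.
    """
    if not segments:
        raise ValueError("at least one sampled segment is required")
    segment_wers = [errors / words for errors, words in segments]
    estimate = sum(segment_wers) / len(segment_wers)
    # Any single segment above the threshold fails the asset,
    # even when the average is below it.
    accuracy_fail = any(wer > fail_above for wer in segment_wers)
    return estimate, accuracy_fail
```

With segments of 3 errors in 150 words and 18 errors in 150 words, the average is 7% but the second segment is at 12%, so the asset is flagged as an accuracy fail.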

Accuracy audit coverage: in a five-day sprint, a single auditor can complete the spot-check on approximately 80–100 assets per day. For a catalogue of 300+ assets, prioritise the accuracy spot-check by compliance tier: Tier 1 assets first, Tier 2 second, and stop at Tier 3 if time runs out. Tier 3 and 4 assets without a documented accuracy verification can be flagged "Needs Review" in the audit log — this is an honest record that does not claim compliance for unverified assets.

The proper-noun problem in accuracy auditing: generic word error rate understates the compliance risk for training content with high proper-noun density. A medical training video that accurately transcribes all the common words but substitutes drug names at a 15% rate has a low overall WER but is non-compliant for the clinical content it is supposed to convey. When auditing any content with above-average proper-noun density — healthcare training, engineering onboarding, sales enablement videos with product SKUs, safety training with chemical names — run a secondary proper-noun error rate check alongside the generic WER. Count only errors on named entities (drug names, product names, company names, acronyms, regulatory standards) and compute the rate against the total named-entity count in the segment. A proper-noun WER above 20% is an accuracy fail even if the generic WER is below 10%. The glossary-biased decoding post covers why proper-noun WER diverges from generic WER on training content and what the accuracy improvement looks like with a customer glossary applied.
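
The secondary check can sit alongside the generic WER in the same scoring script. This sketch uses the 20% named-entity limit and the 10% generic limit stated above; the function name is hypothetical, and the entity counts are assumed to come from the human spot check.

```python
def proper_noun_verdict(entity_errors, entity_total, generic_wer,
                        entity_limit=0.20, generic_limit=0.10):
    """Return (named-entity error rate, 'pass' or 'fail') for one segment."""
    if entity_total <= 0:
        raise ValueError("segment has no named entities to score")
    entity_rate = entity_errors / entity_total
    # Either rate above its limit fails the segment on accuracy.
    failed = entity_rate > entity_limit or generic_wer > generic_limit
    return entity_rate, ("fail" if failed else "pass")
```

A segment with 3 errors across 12 named entities fails at 25% even when its generic WER is 3%.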

Documenting accuracy results: for each asset spot-checked, log the segment timestamps, the word count, the error count, and the WER in the audit log. This is the accuracy documentation the compliance record requires. If you later produce new caption files with a WCAG-compliant tool, document the tool name, the glossary version used, and the output accuracy claim (from the tool's output QC report) for each newly captioned asset. The before-and-after accuracy record is the evidence that the remediation was effective.

Dimension 4: Synchronisation audit

Caption synchronisation — the alignment between when a word is spoken in the audio and when the corresponding caption text appears on screen — is a WCAG 2.1 AA requirement that is distinct from accuracy. A perfectly accurate caption file with systematic two-second synchronisation delays fails WCAG SC 1.2.2 because the synchronised alternative is not meaningfully synchronised. The synchronisation audit checks two failure modes: systematic offset (all or most captions appear N seconds after the audio) and gap failures (stretches of narration with no corresponding caption text).

Systematic offset typically results from a caption file that was produced from a correct transcript but timed incorrectly — for example, a machine transcription that was timed to a different audio track or a different cut of the video. Identifying systematic offset requires watching approximately 90 seconds of video with captions on at a segment in the middle of the video (not the beginning, where some offset is often present due to opening music or intro slides). If captions appear consistently more than two seconds after the corresponding audio throughout the middle segment, flag the asset as a synchronisation fail.

Gap failures — long stretches of narration with no caption text — typically result from transcription tools that skip low-confidence audio segments, from caption files that were truncated before the end of the video, or from videos that were re-edited after the caption file was produced (adding new segments that have no caption coverage). The easiest way to check for gap failures is to run the caption file through a gap-duration analysis: list all the timestamp intervals where no caption block is active; flag any gap longer than five seconds as a potential gap failure; spot-check each flagged gap against the video audio to confirm whether narration is present during the gap. A gap of five seconds or more with active narration is a synchronisation fail.
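
The gap-duration analysis described above can be sketched as follows, assuming the caption file has already been parsed into (start, end) times in seconds and the video runtime is known. The function name is hypothetical. Each returned interval is only a candidate: it still needs the spot check against the audio to confirm that narration is present.

```python
def find_caption_gaps(cues, video_duration_s, min_gap_s=5.0):
    """List (gap_start, gap_end) intervals with no active caption block.

    cues: iterable of (start_s, end_s) pairs; overlapping cues are tolerated.
    """
    gaps = []
    covered_until = 0.0
    for start, end in sorted(cues):
        if start - covered_until >= min_gap_s:
            gaps.append((covered_until, start))
        covered_until = max(covered_until, end)
    # A truncated caption file shows up as a gap at the end of the video.
    if video_duration_s - covered_until >= min_gap_s:
        gaps.append((covered_until, video_duration_s))
    return gaps
```

Opening music or intro slides will often produce a legitimate gap at the start of the video, which is why each flagged interval is checked by ear before it is logged as a fail.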

Synchronisation and SCORM packages: synchronisation failures in SCORM-packaged content are harder to detect because the caption file is embedded in the package and may be served through an Articulate or Lectora player rather than the LMS's native player. The SCORM-specific check is in the LMS delivery dimension below. For the synchronisation audit, treat SCORM packages as a separate category requiring a manual play-through review rather than an automated gap-detection pass.

Dimension 5: Metadata and accessibility labelling audit

A caption file that is accurate and synchronised still fails WCAG accessibility requirements if it is incorrectly labelled in the LMS — because users who need captions cannot find or enable them. The metadata audit checks three labelling requirements.

Language code: every caption track must be tagged with the ISO 639-1 language code of the caption language (en for English, fr for French, es for Spanish, etc.). The language code must match the language of the audio — a Spanish-language training video with English captions is not a compliant synchronised alternative for Spanish-language learners, and an English-language video with an English caption track tagged "fr" will cause the LMS to serve the track in the wrong language context for multi-language learners. Check the language tag in the LMS caption track metadata for every asset in the audit. In most LMS platforms, the language tag is set at upload time and visible in the caption management interface.
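
Where the LMS export includes the caption track's language tag, the comparison can be scripted. This is a sketch under assumptions: `language_tag_ok` is a hypothetical helper, the audio language is taken from your asset inventory, and region-qualified tags such as "en-US" are treated as matching their two-letter primary code, which may or may not suit your LMS.

```python
import re

TWO_LETTER_CODE = re.compile(r"^[a-z]{2}$")  # shape of an ISO 639-1 code


def language_tag_ok(track_tag, audio_language):
    """True if the track's primary language code matches the audio language."""
    primary = track_tag.strip().lower().split("-")[0]  # "en-US" -> "en"
    if not TWO_LETTER_CODE.match(primary):
        return False  # e.g. "english" or an empty tag
    return primary == audio_language.strip().lower()
```

The pattern only checks that the tag has the shape of a two-letter code; it does not confirm that the code is actually assigned in ISO 639-1.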

Track type: captions vs subtitles: the distinction between "captions" and "subtitles" is meaningful in accessibility contexts. Captions are intended for users with hearing disabilities and include not just spoken dialogue but also descriptions of non-speech audio elements that carry information — sound effects, speaker identification, musical description. Subtitles are translations of spoken dialogue for hearing users watching in a non-native language. A track labelled "subtitles" may not trigger the accessibility caption delivery pathway in some LMS players and screen reader combinations. For WCAG compliance, the track type should be explicitly set to "captions" where the LMS supports the distinction. Check that tracks intended for accessibility are labelled "captions" rather than "subtitles" in the LMS interface.

Default-on setting: WCAG SC 1.2.2 requires that captions be provided — it does not require that they be default-on for all users. However, many L&D compliance programmes and HR policies require that captions be default-on for mandatory training content, particularly for organisations with employees who have registered hearing-disability accommodation needs. Check whether mandatory training videos have captions set as the default track in the LMS, and flag any deviation from the organisation's accommodation policy as a metadata fail.

Dimension 6: LMS delivery audit

The LMS delivery audit confirms that caption files reach end users in every delivery context the LMS supports. This dimension catches the "file exists but doesn't deliver" failure pattern. Five delivery contexts require separate checks:

Primary LMS player (web browser): the most common delivery context. Open the course module as a learner in a web browser, play the video, and confirm that the caption track is available and loadable. This sounds obvious, but it catches upload-configuration errors (file uploaded to the admin system but not published to the learner interface) and format incompatibilities (SRT file uploaded but the LMS player only serves VTT natively). Document the confirmation with a screenshot for the compliance record.

LMS mobile app: most enterprise LMS platforms have mobile apps that serve training content on iOS and Android. Caption delivery in the mobile app is not guaranteed by caption delivery in the web player — mobile apps often use a different video rendering layer and may not serve sidecar caption tracks at all, or may serve them differently. Test caption delivery in the LMS mobile app for a sample of Tier 1 assets. If mobile app caption delivery is broken, this is a platform-level finding — the remediation action is a support ticket with the LMS vendor, not a per-asset fix.

SCORM/xAPI packages: SCORM packages are zip archives containing HTML, JavaScript, and media files that the LMS unpacks and serves through a sandboxed player. Caption delivery in SCORM packages bypasses the LMS's native caption infrastructure entirely — the captions must be embedded in the SCORM package's HTML or video file and served through the authoring tool's player. For Articulate Storyline packages, captions are embedded as an HTML text track. For Camtasia packages, captions may be burned-in or served as an embedded SRT. For each SCORM package in the Tier 1 catalogue, open the package as a learner and confirm that caption delivery works in the SCORM player. If captions are burned-in, flag the asset as a format fail in Dimension 2 and a delivery fail in this dimension.

Video hosting layer (Kaltura / Panopto / Vimeo): when the LMS embeds video from an external hosting platform, the caption track is typically served from the hosting platform's player, not the LMS's native player. An SRT file uploaded to the LMS may not be served if the video is embedded from Kaltura or Panopto — the caption file needs to be in the hosting platform's caption management system, not just the LMS's. The Kaltura caption management interface and the Panopto caption management interface are separate from the LMS caption upload interfaces. Check that caption files are present and correctly configured in the hosting platform for any video embedded from an external hosting layer.

Offline/downloaded content: some LMS platforms (particularly those used for field sales training — WorkRamp, Allego, and mobile-first platforms like TalentCards) support offline download of training content for users in low-connectivity environments. Caption file delivery in offline mode depends on whether the app downloads the caption file alongside the video file. Test offline caption delivery for a sample of sales enablement and mobile-first content in the audit. If the platform does not support offline caption delivery, this is a platform-level finding requiring escalation to the vendor.

Dimension 7: Documentation and audit-trail audit

The documentation audit is the dimension that connects the content quality assessment to the compliance record. It checks whether a documentation trail exists for each caption file that allows the organisation to demonstrate, in a compliance investigation, that it knows what standard was applied to each file, who produced it, when, and with what result. The documentation audit asks four questions about each asset in the Tier 1 catalogue:

Does a production record exist? A production record documents when the caption file was created, who created it (vendor name, tool name, or internal team), and what the intended output standard was. For vendor-produced captions, this is typically a delivery confirmation email or a contract-referenced batch delivery report. For tool-produced captions, this is a processing log. For manually produced captions, this is a timestamp of the final export. Assets with no production record get a documentation fail.

Does an accuracy record exist? An accuracy record documents the measured or claimed accuracy of the caption file at the time of production. For vendor-produced captions, this is typically a delivery QC report that includes a WER claim. For AI-tool-produced captions, this is an output accuracy claim from the tool. For spot-checked captions, this is the spot-check log described in the accuracy audit above. Assets where accuracy has never been measured or recorded get a documentation fail on this sub-dimension even if the content is accurate.

Does a remediation record exist for any assets that failed prior QC? If an asset was previously flagged as a caption quality fail and then remediated, is there a record of the remediation — the date, the method, the post-remediation accuracy confirmation? A remediation record converts a documented fail into a documented resolution, which is the compliance-record form a legal or OCR investigation actually needs.

Can the documentation package be assembled in under two hours? This is a practical test for the documentation system's adequacy. If producing the documentation for 20 Tier 1 assets requires searching through email archives, shared drives, and two LMS admin interfaces, the documentation system is inadequate. The documentation should be centralised in one location (a shared folder, a compliance tracking spreadsheet, or a purpose-built compliance management tool) that a compliance officer can query in hours, not days. Assess how retrievable the documentation is as part of this dimension, and flag any documentation that is difficult to retrieve as a systemic risk, not just a per-asset fail.

LMS-specific audit playbooks

Each LMS platform has a different interface for managing caption files, and the fields visible in the admin export differ by platform. These platform-specific playbooks give you the exact location of the caption management interface and the export fields relevant to the seven-dimension audit in each major enterprise LMS and training platform.

TalentLMS

In TalentLMS, caption files are managed at the course unit level. Navigate to the course, enter edit mode, select a video unit, and find the "Captions" tab in the unit editor. The caption management interface shows the list of uploaded caption files, the language tag for each, and whether the track is enabled for learner display. The TalentLMS admin export (Reports → Course content → Export to CSV) does not include a caption-file field by default — you will need to verify caption status per-unit manually for a TalentLMS audit, or use the TalentLMS API (GET /api/v1/courses/{id}/units) to extract unit-level metadata including caption status. For the coverage pass, the most efficient TalentLMS workflow is to export the full course unit list and then spot-verify caption status in the UI for all video units in Tier 1 courses first.

TalentLMS serves SRT and VTT caption files through its native HTML5 video player. It does not natively support TTML/DFXP — convert TTML files to VTT before upload. TalentLMS's mobile app (iOS and Android) serves captions from the same track source as the web player; mobile delivery is generally reliable. SCORM packages served through TalentLMS bypass the platform's native caption infrastructure, so SCORM packages need a separate manual delivery check.

Docebo

In Docebo, caption files are managed in the learning object (LO) editor. Navigate to Course Management → select the course → open the LO → under the Properties tab, find the Subtitles section. Docebo uses the term "subtitles" for what are functionally caption tracks — the compliance metadata audit (Dimension 5) should flag this labelling as a potential issue if WCAG requires track type "captions." Upload SRT or VTT files here. The Docebo admin export (Admin → Reports → Custom Reports) can include LO-level metadata; building a custom report with LO ID, LO type, and subtitle-file-present fields is the most efficient path to a bulk coverage export. Docebo's coaching and informal learning modules (Coach and Share) may serve video content without native caption support — audit these modules separately if the organisation uses them for mandatory training content.

Docebo integrates with Kaltura for enterprise video hosting. If Kaltura is the video source for Docebo-hosted courses, caption management is in Kaltura (see below), not in Docebo's LO editor. Check which platform is hosting the video before determining where to verify caption delivery.

Absorb LMS

In Absorb LMS, caption files are uploaded in the course authoring interface at the chapter or lesson level. Navigate to Admin → Courses → select the course → Chapters → select the chapter → under the Media section, find the Captions upload field. Absorb supports SRT and VTT. The Absorb report builder (Admin → Reports → Custom Reports) allows building a report by lesson type that can filter to "Video" lessons; adding the "Caption File" field to the report gives a bulk coverage export. As with other platforms, SCORM packages in Absorb bypass the native caption infrastructure — audit SCORM packages manually.

Absorb's mobile app caption delivery is supported for native video lessons. Verify with a spot-check on a sample of Tier 1 assets. Absorb does not natively support Kaltura or Panopto embedding in most configurations — check whether any video content is embedded from an external source before completing the LMS delivery dimension.

Cornerstone OnDemand

In Cornerstone OnDemand, captions are managed in the online learning object (OLO) editor. Navigate to Content → Online Learning Objects → select the OLO → under the Media tab, the Caption File upload field accepts SRT, VTT, and DFXP. Cornerstone also supports Kaltura as a native integration — if Kaltura is enabled, caption management moves to the Kaltura media management interface rather than the Cornerstone OLO editor. Cornerstone's reporting engine (Reporting 2.0) can export OLO-level metadata; build a custom report with OLO ID, content type, and caption-file fields for the coverage pass. Cornerstone Extended Enterprise and Cornerstone for Salesforce are separate delivery contexts that may not inherit caption files from the core platform — verify delivery in these contexts separately if they are in scope for the audit.

Cornerstone's mobile learning app (the Cornerstone Learning app) provides native caption support for OLO content but may not serve captions for externally embedded content. Cornerstone SCORM packages require a separate delivery check. Cornerstone has a documented accessibility conformance report (VPAT) that covers caption delivery — reviewing the current VPAT alongside the audit provides context on known platform-level limitations.

Workday Learning

In Workday Learning, caption files are managed at the learning content record level. Navigate to the Learning Content section, find the relevant content record, and locate the Caption Files related action. Workday Learning supports SRT and VTT. The Workday reporting infrastructure (accessible through the Report Writer tool) can generate a learning content inventory with caption-file fields; building a custom report is necessary because the standard Workday Learning reports do not include caption status by default. Workday Learning's video player uses the Workday-native HTML5 player, which is separate from the Workday ERP's document management infrastructure — caption files uploaded to Learning Content are not shared with any other Workday module.

Workday's mobile app (Workday for iOS and Android) supports caption delivery for native Learning Content videos. SCORM packages in Workday Learning require a separate delivery check. Note that Workday Learning has a more constrained content authoring environment than dedicated LMS platforms — if the organisation uses Workday as the formal LMS record system but produces content in Articulate or Camtasia and uploads SCORM packages, the audit for SCORM packages is effectively a package-level audit rather than a platform-level audit.

Kaltura

Kaltura is a video hosting and management platform that frequently serves as the video layer under an LMS. Caption management in Kaltura is in the Kaltura Management Console (KMC) under Content → Entries → select the entry → Captions tab. The Captions tab shows all caption files for the entry, their language tags, and their active/default status. The Kaltura API (caption list API call against each entry ID) provides bulk caption metadata export — for large Kaltura libraries, this is the most efficient coverage audit method. Kaltura supports SRT, VTT, DFXP, and SBV for caption uploads; however, SBV is converted to VTT on ingest, and DFXP is converted to WebVTT for player delivery. Kaltura's native machine captioning (REACH) generates VTT files with documented accuracy claims; if REACH-generated captions are in the library, the accuracy claim from the REACH output log is the accuracy record for Dimension 7.

Kaltura caption delivery to LMS-embedded players depends on the Kaltura player configuration. The Kaltura Universal Player (KalturaPlayer v7+) serves captions from the Kaltura caption track by default. Earlier versions of the Kaltura player (KDP, Kaltura HTML5 v1) may not. Verify the player version in the KMC and confirm caption delivery in the player used by the LMS integration.

Panopto

In Panopto, caption files are managed in the Panopto Web App's session manager. Navigate to the session, select Manage (gear icon) → Import Captions, or review the Captions tab for machine-generated caption tracks. Panopto's machine captioning is on by default for new uploads and produces an ASR (automatic speech recognition) transcript that is converted to a VTT track. The accuracy of Panopto ASR varies by audio quality and content type; for the accuracy audit, treat Panopto ASR-generated captions as unverified and apply the spot-check protocol. Panopto's bulk export API (GET /Panopto/api/v1/sessions) returns session metadata including caption status for each session — use this for the coverage pass on large Panopto libraries.

Panopto caption delivery in LMS embeds depends on the embed configuration. Panopto iFrame embeds in Moodle, Canvas, or Blackboard serve captions from the Panopto player — the LMS caption infrastructure is bypassed. For LMS platforms with native Panopto integrations (Canvas, Brightspace, Moodle via Panopto LTI), caption delivery is through Panopto's player and should be verified directly in the LTI context. Panopto's mobile app caption support is available for sessions with caption tracks.

WorkRamp

In WorkRamp, caption management is at the module content level in the WorkRamp admin interface. Navigate to Library → select the training → open the module → find the video content block → in the block settings, locate the Captions upload field. WorkRamp supports SRT and VTT. WorkRamp's administrative reporting (Analytics → Content) provides a content inventory export, but caption-file status is not included as a standard field — the coverage pass for WorkRamp requires manual verification per-video-block in Tier 1 training paths. WorkRamp's mobile companion app supports caption delivery for native video blocks; verify with a spot-check on a sample of sales onboarding content, which is the primary WorkRamp use case for caption compliance.

Allego

In Allego, caption delivery is managed through the content management interface in the Allego admin console. Allego's native video compression and processing pipeline may alter the timing of an uploaded SRT file if the video is re-encoded during ingest — verify synchronisation (Dimension 4) on a sample of Allego-hosted videos, particularly if caption files were produced from a pre-Allego version of the video. Allego supports offline access for sales reps in low-connectivity field environments; verify offline caption delivery separately from the web player check, as caption file packaging in the Allego offline client may not include sidecar SRT files depending on the client version.

Bigtincan

In Bigtincan, caption management is in the Hub admin interface under Content Management → select the file → Properties → Captions. Bigtincan's content hub serves multiple file types and the caption interface for video files may differ from the interface for presentation files with embedded video. Verify caption delivery specifically in the Bigtincan player context — not all Bigtincan content types serve sidecar caption tracks in every viewer. Bigtincan's offline mode (Smart Client) requires separate caption delivery verification.

TalentCards

In TalentCards, microlearning cards can include video content with caption tracks. The TalentCards mobile-first design means that caption delivery verification must be done in the mobile app context rather than a web browser. Navigate to the card content management interface in the TalentCards admin panel and locate the video card settings for caption file upload. TalentCards' primary use case — frontline worker microlearning delivered on mobile in manufacturing, retail, and logistics environments — often involves users in high-ambient-noise environments who rely on captions even without a hearing disability. Verify caption delivery on iOS and Android in the TalentCards app and confirm that captions are default-on for any mandatory safety or compliance training cards.

The 5-day audit sprint plan

The five-day sprint is designed for a single L&D operator or accessibility lead running the audit as a focused project, with administrative access to all relevant platforms and a complete asset inventory in hand before Day 1. A two-person team can cover a larger catalogue or run the manual spot-check passes in parallel. The sprint produces a fully populated audit log — every active asset scored on all seven dimensions — and a remediation-prioritised export ready for the compliance programme's intake queue.

Day 1: Coverage and format passes (Dimensions 1–2)

Morning (3–4 hours): Run the coverage pass across the full active asset inventory. Use LMS API exports or admin dashboard exports wherever available. Populate the "Caption File Present" and "Caption Format" columns in the audit log for every asset. Flag assets with no caption file as coverage fails; flag assets with burned-in captions or invalid formats as format fails. By end of morning, you should have a complete coverage picture — how many assets have files, how many don't, and the distribution by compliance tier.

Afternoon (3–4 hours): Run the format validation pass on all caption files identified in the morning. Download a sample of files from each LMS platform and run them through a format validator. Focus on any files that were flagged as potentially burned-in or in non-standard formats. Update the "Format Valid" column in the audit log. By end of Day 1, Dimensions 1 and 2 are complete for the full catalogue, and you have a clear count of: (a) assets with no captions (coverage fail), (b) assets with format-fail captions (burned-in or invalid), and (c) assets with format-pass captions that need accuracy and synchronisation review.
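The kind of structural check a format validator performs can be sketched in a few lines. The following is a minimal illustration in Python for WebVTT files only, not a full implementation of the specification. The function name `vtt_problems` is hypothetical, and it checks only the header, the cue timing syntax, that each cue ends after it starts, and that at least one cue exists.

```python
import re
from typing import List

# WebVTT cue timing: optional hours, then MM:SS.mmm on both sides of "-->".
_TS = r"(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})"
_TIMING = re.compile(rf"^{_TS}\s+-->\s+{_TS}(?:\s+.*)?$")


def _seconds(h, m, s, ms) -> float:
    return int(h or 0) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def vtt_problems(text: str) -> List[str]:
    """Return structural problems found in a WebVTT file's text (empty list = pass)."""
    lines = text.lstrip("\ufeff").splitlines()
    problems = []
    if not lines or not lines[0].startswith("WEBVTT"):
        problems.append("missing WEBVTT header")
    cue_count = 0
    for number, line in enumerate(lines, start=1):
        if "-->" not in line:
            continue  # not a cue timing line
        match = _TIMING.match(line.strip())
        if not match:
            problems.append(f"line {number}: malformed cue timing")
            continue
        cue_count += 1
        g = match.groups()
        if _seconds(*g[4:]) <= _seconds(*g[:4]):
            problems.append(f"line {number}: cue ends at or before its start")
    if cue_count == 0:
        problems.append("no valid cues found")
    return problems
```

A file that returns an empty list gets a provisional pass in the "Format Valid" column; a dedicated validator remains the better tool for anything this sketch does not cover, such as cue settings or overlapping cues.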

Day 2: Accuracy spot-check — Tier 1 assets (Dimension 3)

Full day: Run the accuracy spot-check protocol on all Tier 1 assets that passed the format check on Day 1. For each asset, select two one-minute segments, watch with captions, count errors, compute WER. Log segment timestamps, word count, error count, and WER in the audit log. Flag assets with WER above 10% as accuracy fails. For assets with any proper-noun–dense content (healthcare, engineering, sales enablement), run the secondary proper-noun WER check. A single auditor can complete approximately 40–60 Tier 1 spot-checks in a full day, depending on video runtime and content density. If the Tier 1 queue is larger than 60 assets, complete Day 2 on the highest-priority Tier 1 assets (live accommodation requests, imminent Joint Commission survey, active OCR investigation) and flag the remainder for Day 3 morning.
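For auditors who prefer to type the heard words and the displayed caption text for each one-minute segment rather than tally errors by hand, the WER arithmetic can be sketched as follows. This is a minimal Python illustration using the standard word-level edit distance; the tokenisation (lowercased, punctuation ignored) is an assumption and may differ from the counting rules in your spot-check protocol.

```python
import re
from typing import List


def _words(text: str) -> List[str]:
    """Lowercase and split into word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_error_rate(reference: str, caption: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = _words(reference), _words(caption)
    if not ref:
        raise ValueError("reference segment has no words")
    # Word-level edit distance, computed one table row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1] / len(ref)
```

A result above 0.10 for a segment corresponds to the 10% fail threshold used in this pass.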

Day 3: Accuracy spot-check — Tier 2 assets and synchronisation pass (Dimensions 3–4)

Morning (3 hours): Complete any remaining Tier 1 accuracy spot-checks from Day 2, then begin the Tier 2 accuracy spot-check queue using the same protocol. Target 30–40 Tier 2 assets.

Afternoon (3 hours): Run the synchronisation audit (Dimension 4) on all Tier 1 assets that passed the accuracy check. Play the middle segment of each video with captions on and assess synchronisation offset. Run the gap-detection check on the caption file for all Tier 1 assets (list all gaps, flag gaps longer than five seconds, spot-check flagged gaps against audio). Log synchronisation results in the audit log. By end of Day 3, Dimensions 3 and 4 are complete for all Tier 1 assets and partially complete for Tier 2.
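The gap-detection step lends itself to a small script. The sketch below, in Python, reads the cue timing lines of an SRT or WebVTT file and lists every silence between cues longer than five seconds. The function names are hypothetical, and it does not detect a gap before the first cue or after the last one, because the video runtime is not known from the caption file alone.

```python
import re
from typing import List, Tuple

# Matches SRT ("00:00:01,000") and WebVTT ("00:00:01.000") cue timing lines.
_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})[.,](\d{3})\s*-->\s*(?:(\d+):)?(\d{2}):(\d{2})[.,](\d{3})"
)


def _seconds(h, m, s, ms) -> float:
    return int(h or 0) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def cue_times(caption_text: str) -> List[Tuple[float, float]]:
    """Extract (start, end) in seconds for every cue timing line, sorted by start."""
    cues = []
    for match in _TIMING.finditer(caption_text):
        g = match.groups()
        cues.append((_seconds(*g[:4]), _seconds(*g[4:])))
    return sorted(cues)


def long_gaps(caption_text: str, threshold: float = 5.0) -> List[Tuple[float, float]]:
    """Return (gap_start, gap_end) for every gap between cues longer than threshold."""
    gaps = []
    latest_end = None
    for start, end in cue_times(caption_text):
        if latest_end is not None and start - latest_end > threshold:
            gaps.append((latest_end, start))
        latest_end = end if latest_end is None else max(latest_end, end)
    return gaps
```

Each returned pair is a timestamp range to spot-check against the audio: a gap over music or silence is fine, a gap over narration is a synchronisation fail.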

Day 4: Metadata, LMS delivery, and documentation passes (Dimensions 5–7)

Morning (3 hours): Run the metadata audit (Dimension 5) across all assets in the Tier 1 and Tier 2 catalogues. Check language tags, track type labels, and default-on settings in each LMS platform. This is a relatively fast pass — most LMS platforms expose language and track-type metadata in the admin interface without requiring per-video manual review. Log metadata results in the audit log.
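Where the platform export or API returns track metadata, the three metadata checks can be applied mechanically. This is a minimal Python sketch; the dictionary keys `language`, `kind`, and `default` are hypothetical stand-ins for whatever field names your platform actually uses.

```python
from typing import Dict, List


def metadata_fails(track: Dict[str, object], mandatory: bool) -> List[str]:
    """Return metadata problems for one caption track (empty list = pass).

    `track` is a hypothetical dict with keys "language", "kind", and "default";
    real LMS exports will use different field names.
    """
    fails = []
    language = str(track.get("language") or "").strip()
    if not language:
        fails.append("missing language tag")
    if str(track.get("kind") or "").lower() != "captions":
        fails.append("track type is not 'captions'")
    # Default-on is only checked for mandatory training content.
    if mandatory and not track.get("default"):
        fails.append("captions not default-on for mandatory content")
    return fails
```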

Afternoon (3 hours): Run the LMS delivery audit (Dimension 6) for a sample of Tier 1 assets in each delivery context: primary web player, mobile app, SCORM packages (if applicable), external video hosting (Kaltura/Panopto, if applicable), offline/download (if applicable). Take screenshots of confirmed caption delivery for the compliance record. Flag any delivery failures as platform-level findings and draft the escalation action for each. By end of Day 4, Dimensions 5 and 6 are complete for the Tier 1 catalogue and partially complete for Tier 2.

End of Day 4 (1 hour): Begin the documentation audit (Dimension 7) by inventorying what production records and accuracy records exist for Tier 1 assets. Identify the location of each record (vendor delivery emails, tool output logs, spot-check log from Day 2). Flag assets with missing production records or missing accuracy records as documentation fails.

Day 5: Documentation pass, remediation prioritisation, and output assembly

Morning (3 hours): Complete the documentation audit (Dimension 7) for Tier 1 and Tier 2 assets. Identify documentation gaps and log them. Then run the Tier 2 and Tier 3 synchronisation and metadata passes to complete those dimensions for the lower-tier catalogue. Flag Tier 3 and Tier 4 assets that have not received full accuracy or synchronisation spot-checks as "Needs Review" — this is honest record-keeping rather than an assumed pass.

Afternoon (3 hours): Build the remediation-prioritised output. Sort the audit log by compliance tier, then by fail count across the seven dimensions. Assign remediation priorities: "Critical" for any Tier 1 asset with a coverage fail, format fail, accuracy fail, or delivery fail; "High" for any Tier 1 asset with a synchronisation fail, metadata fail, or documentation fail, and any Tier 2 asset with a coverage or accuracy fail; "Medium" for Tier 2 synchronisation, metadata, and documentation fails and for all Tier 3 fails other than documentation; "Low" for all Tier 4 fails and for Tier 3 documentation fails. Produce a remediation export sorted by priority. This export is the input to the LMS caption ingestion workflow and the back-catalogue retrofit phase of the compliance programme.
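The priority rules can be encoded so that the assignment is applied consistently across the log. The Python sketch below is one reading of those rules, with two assumptions labelled in the comments: Tier 2 format and delivery fails, which the rules do not assign explicitly, are treated as "Medium", and a Tier 3 asset is "Low" only when documentation is its sole fail. The function name and dimension labels are illustrative.

```python
from typing import Optional, Set

CRITICAL_T1 = {"coverage", "format", "accuracy", "delivery"}
HIGH_T2 = {"coverage", "accuracy"}


def remediation_priority(tier: int, fails: Set[str]) -> Optional[str]:
    """Map a compliance tier (1-4) and a set of failed dimensions to a priority."""
    if not fails:
        return None  # nothing to remediate
    if tier == 1:
        return "Critical" if fails & CRITICAL_T1 else "High"
    if tier == 2:
        # Assumption: Tier 2 format and delivery fails are not assigned in the
        # rules as written; they fall through to "Medium" here.
        return "High" if fails & HIGH_T2 else "Medium"
    if tier == 3:
        # Assumption: "Low" applies only when documentation is the sole fail.
        return "Low" if fails == {"documentation"} else "Medium"
    return "Low"  # Tier 4
```

An asset takes the highest priority triggered by any of its fails, so a Tier 1 asset with both a metadata fail and a delivery fail is "Critical".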

End of Day 5 (1 hour): Produce the audit summary document — a one-page summary of the audit results: number of assets reviewed, pass/fail counts by dimension, key findings by LMS platform, and the top five remediation actions by priority. This is the executive-ready summary for the compliance officer, the VP of Learning, or the legal team. File it alongside the full audit log in the compliance record location.
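The pass/fail counts for the summary can be tallied directly from the audit log. A minimal Python sketch, assuming the log has been reduced to a mapping from asset ID to the set of dimensions that asset failed (a hypothetical structure, not a prescribed log format):

```python
from collections import Counter
from typing import Dict, Set

DIMENSIONS = ["coverage", "format", "accuracy", "synchronisation",
              "metadata", "delivery", "documentation"]


def summarise(fails_by_asset: Dict[str, Set[str]]) -> Dict[str, Dict[str, int]]:
    """Count passes and fails per dimension across the audit log.

    Assumption: every asset passed in was scored on every dimension. Assets
    marked "Needs Review", or not scored on a dimension because they have no
    caption file, should be filtered out or counted separately first.
    """
    total = len(fails_by_asset)
    fail_counts = Counter(d for fails in fails_by_asset.values() for d in fails)
    return {d: {"fail": fail_counts[d], "pass": total - fail_counts[d]}
            for d in DIMENSIONS}
```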

What the audit output looks like

A completed five-day audit for a 300-asset Tier 1–3 catalogue typically produces the following outputs:

Typical audit output for a 300-asset catalogue (50% Tier 1, 30% Tier 2, 20% Tier 3)

| Dimension | Typical pass rate (post-ADA-deadline org) | Common fail pattern |
|---|---|---|
| 1. Coverage | 70–85% | Pre-2025 back-catalogue assets without caption files; SCORM packages with no caption track |
| 2. Format | 80–90% of covered assets | Burned-in captions from Camtasia/Storyline production; SBV files from YouTube export not converted to VTT |
| 3. Accuracy | 40–65% of covered assets | YouTube auto-captions and LMS-native ASR on proper-noun–dense content; machine captions without glossary on medical/engineering/sales training |
| 4. Synchronisation | 75–85% of covered assets | Caption files produced from a different cut of the video than is currently served; SCORM packages re-exported without re-timing the caption file |
| 5. Metadata | 60–80% of covered assets | Missing language tag; tracks labelled "subtitles" rather than "captions"; default-on not set for mandatory training content |
| 6. LMS delivery | 70–90% of format-pass assets | Caption file uploaded to LMS but not served in mobile app; Kaltura/Panopto caption files not matching LMS-side caption upload; SCORM package caption delivery broken in the player |
| 7. Documentation | 30–50% of covered assets | No accuracy record for any asset; production records scattered across vendor emails, shared drives, and tool output logs; no centralised compliance documentation system |

The audit will rarely produce a clean pass across all seven dimensions for more than 30–40% of a back-catalogue. That is expected — it is the finding, not a failure of the audit. The audit exists to quantify the gap, not to certify compliance. The remediation queue it produces is the input to the compliance programme.

After the audit: remediation prioritisation and decision framework

The audit output is a prioritised remediation queue. The decision framework for working through that queue has two axes: compliance urgency (which tier is the asset in?) and remediation effort (how much work does fixing this specific failure require?). The combination determines the order in which remediation actions should be executed.

The remediation effort taxonomy

Remediation actions for caption audit fails fall into four effort categories:

Effort A — Format conversion or validation repair (minutes to 1 hour): applies to format-fail assets where an accurate caption file exists but is in the wrong format (SBV → VTT conversion) or is structurally invalid (timestamp repair). Free conversion tools handle format conversion; manual validation repair typically takes 15–45 minutes per file. These are the highest-ROI remediation actions — low effort, high compliance impact.
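As an illustration of how small an Effort A fix can be, the SBV to VTT conversion is essentially a rewrite of the timing lines. The Python sketch below assumes the common SBV layout (a timing line of the form H:MM:SS.mmm,H:MM:SS.mmm followed by cue text, with blank lines between cues); the function name is hypothetical, and the converted output should still go through a format validator before upload.

```python
import re

# SBV timing line: "H:MM:SS.mmm,H:MM:SS.mmm"
_SBV_TIMING = re.compile(
    r"^(\d+):(\d{2}):(\d{2})\.(\d{3}),(\d+):(\d{2}):(\d{2})\.(\d{3})$"
)


def sbv_to_vtt(sbv_text: str) -> str:
    """Convert SBV caption text to WebVTT, leaving cue text untouched."""
    out = ["WEBVTT", ""]
    for line in sbv_text.lstrip("\ufeff").splitlines():
        match = _SBV_TIMING.match(line.strip())
        if match:
            h1, m1, s1, ms1, h2, m2, s2, ms2 = match.groups()
            # WebVTT uses at least two hour digits and " --> " between timestamps.
            line = f"{int(h1):02d}:{m1}:{s1}.{ms1} --> {int(h2):02d}:{m2}:{s2}.{ms2}"
        out.append(line)
    return "\n".join(out).rstrip("\n") + "\n"
```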

Effort B — Metadata correction (minutes per asset): applies to metadata-fail assets where the caption file is correct but the language tag, track type, or default-on setting is wrong. Metadata corrections are made in the LMS admin interface and typically take less than five minutes per asset. Run all metadata corrections in a single batch after the audit rather than spreading them across the remediation sprint.

Effort C — Recaptioning from existing transcript (1–4 hours per hour of video): applies to accuracy-fail or synchronisation-fail assets where an existing transcript or a previous draft caption file can be used as the starting point. If a vendor-produced caption file failed accuracy because of proper-noun errors but the base transcription is structurally correct, applying a customer glossary to the existing file produces a corrected output without a full recaption. This is the primary use case for a glossary-biased captioning tool — the customer glossary architecture post covers how to build the glossary that feeds this workflow. Recaptioning from transcript with glossary is faster and produces better accuracy than re-running machine captioning from scratch, because the glossary correction applies targeted fixes to the specific proper-noun errors identified in the accuracy audit.

Effort D — Full recaption from audio (4–8 hours per hour of video): applies to coverage-fail assets (no caption file exists) and to accuracy-fail assets where no usable transcript or draft caption file exists. Full recaption is the most resource-intensive remediation action. For Tier 1 assets, this is unavoidable — the cost of non-compliance exceeds the cost of recaption for any asset in the active mandatory training catalogue. For Tier 3 and 4 assets, full recaption is lower priority and should be queued for the back-catalogue retrofit phase rather than the emergency sprint. If the half-FTE cost analysis has been run for the organisation, the internal recaption rate (practitioner time per hour of captioned video) is the correct labour cost to use when prioritising the Effort D queue against the vendor recaption option.
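The per-hour rates for Effort C and Effort D make it straightforward to turn a remediation queue into a labour estimate. A minimal Python sketch, using the 1–4 and 4–8 hours-per-video-hour ranges from the taxonomy; the queue structure (effort category plus runtime in minutes) is a hypothetical simplification of the audit log.

```python
from typing import Dict, Iterable, Tuple

# Hours of remediation work per hour of video, from the effort taxonomy.
RATES: Dict[str, Tuple[float, float]] = {"C": (1.0, 4.0), "D": (4.0, 8.0)}


def effort_range(queue: Iterable[Tuple[str, float]]) -> Tuple[float, float]:
    """Return (low, high) labour hours for a queue of (effort_category, runtime_minutes)."""
    low = high = 0.0
    for category, runtime_minutes in queue:
        rate_low, rate_high = RATES[category]
        video_hours = runtime_minutes / 60
        low += rate_low * video_hours
        high += rate_high * video_hours
    return low, high
```

For example, two Effort D videos of 30 and 90 minutes plus one 60-minute Effort C video come to between 9 and 20 hours of work, which is the figure to compare against a vendor quote for the same runtime.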

The prioritisation matrix

Combine compliance urgency and remediation effort to produce the execution sequence:

Caption remediation prioritisation matrix

| Compliance tier | Effort A (format conversion or repair) | Effort B (metadata correction) | Effort C (recaption from transcript) | Effort D (full recaption) |
|---|---|---|---|---|
| Tier 1 (live obligation) | Execute Day 1 of remediation sprint | Execute Day 1 of remediation sprint | Execute Days 2–5 of remediation sprint (vendor or internal) | Prioritise for immediate vendor engagement; set a completion deadline within 30 days |
| Tier 2 (12-month obligation) | Execute in Weeks 2–3 | Execute in Weeks 2–3 | Schedule within 60 days | Schedule within 90 days; include in vendor pipeline |
| Tier 3 (best practice) | Queue for quarterly batch | Queue for quarterly batch | Back-catalogue plan | Back-catalogue plan; defer until Tier 1 and 2 complete |
| Tier 4 (no obligation) | Opportunistic (combine with other work) | Opportunistic | Defer indefinitely | Defer indefinitely |

The documentation remediation — building the compliance documentation system and back-filling accuracy records for Tier 1 assets — is Effort B from a time perspective but is often treated as lower priority than content remediation. This is a strategic error. The documentation system is what allows the organisation to demonstrate compliance in a legal or investigatory context. Completing the content remediation without completing the documentation remediation leaves the organisation in a position where it is actually more compliant than it can demonstrate. Run the documentation remediation in parallel with the content remediation sprint, not sequentially after it.

Using the audit output to size the vendor relationship

The audit output — specifically the Effort D (full recaption from audio) queue — is the input to the vendor engagement for the back-catalogue retrofit. The total runtime in hours of Effort D Tier 1 assets is the minimum scope of the initial vendor engagement. If the organisation is selecting a captioning vendor as part of the compliance programme build, the audit output is the RFP input — the RFP playbook post walks the full scoring framework for vendor selection, and Section 1 of that framework (accuracy on proper-noun–dense content samples) is directly informed by the proper-noun WER findings from the audit's accuracy dimension. Specifically: use the most accuracy-challenging content identified in the audit (highest proper-noun density, most obscure domain vocabulary) as the test sample in the vendor evaluation. The vendor that performs best on your hardest content is the vendor you need for the compliance retrofit.

For organisations already using a captioning vendor, the audit output is the next contract renewal input. If the vendor's measured accuracy on Tier 1 content in the audit's accuracy findings is below the 99% accuracy threshold (that is, WER above 1%), the audit record is the basis for a contract performance discussion, and the comparison against alternative vendor accuracy on the same sample content (Rev, 3Play Media, Verbit) is the negotiation anchor.

Tooling for the caption audit

The seven-dimension audit does not require specialised compliance software — it can be run with a spreadsheet, a caption validator, and admin access to your LMS platforms. However, several tools accelerate specific dimensions of the audit, particularly for large catalogues.

Caption file validation

Format validation (Dimension 2) is the most automatable dimension of the audit. Several free tools validate caption files against their format specifications: the W3C TTML validator (validator.w3.org/nu) for TTML/DFXP files; the SubRip validator included in subtitle editor tools (Aegisub, Subtitle Edit) for SRT files; the Web Captions Checker (a browser extension) for VTT files loaded in a page. For bulk validation of large caption libraries, the Caption Inspector tool (open source, CaptionInspector on GitHub) runs automated checks on SRT and SCC files against DCMP quality standards: it checks for timing gaps, overlapping blocks, and line-length violations. Caption Inspector does not run accuracy checks, but it covers the format validity and gap-detection components of the synchronisation audit.

LMS API access for bulk inventory

For catalogues of 200+ assets, manually verifying caption status in each LMS admin interface is the primary time bottleneck of the coverage pass (Dimension 1). Every major LMS has a REST API that returns asset-level metadata. The TalentLMS API, the Docebo REST API, the Absorb Reporting API, the Cornerstone Edge API, and the Workday Learning API all support course content and learning object inventory calls. Building a simple script to pull content inventory — course ID, content type, caption file present (Y/N), language tag — from each LMS API reduces the Day 1 coverage pass from a half-day manual process to a one-hour automated export. If the L&D team has an engineer who can write a simple REST API call (no authentication complexity beyond an API token), this optimisation is worth one hour of engineering time for a catalogue of 200+ assets.
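Because each platform's API returns a different shape, the reusable part of such a script is the normalisation step rather than the HTTP call. The Python sketch below takes records that have already been fetched (by whatever client and endpoint the platform documents) and maps them to the coverage columns. The field names in `field_map` and the shape of a caption track entry are hypothetical and need to be looked up per platform.

```python
import csv
import io
from typing import Dict, Iterable, List

AUDIT_COLUMNS = ["platform", "asset_id", "content_type",
                 "caption_file_present", "language_tag"]


def coverage_rows(platform: str, records: Iterable[dict],
                  field_map: Dict[str, str]) -> List[dict]:
    """Normalise one platform's export records into audit-log coverage rows.

    `field_map` maps audit-log names to that platform's field names. The names
    are hypothetical; check each platform's API documentation.
    """
    rows = []
    for record in records:
        captions = record.get(field_map["captions"]) or []
        rows.append({
            "platform": platform,
            "asset_id": record.get(field_map["asset_id"]),
            "content_type": record.get(field_map["content_type"]),
            "caption_file_present": "Y" if captions else "N",
            "language_tag": ";".join(c.get("language", "") for c in captions),
        })
    return rows


def to_csv(rows: List[dict]) -> str:
    """Serialise coverage rows as CSV text for the audit log."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=AUDIT_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Keeping one `field_map` per platform means the same two functions serve every LMS in the catalogue, and the combined CSV drops straight into the "Caption File Present" column.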

Similarly, the Kaltura Media Entry API and the Panopto Sessions API return caption metadata for each media entry. For large Kaltura or Panopto libraries (1000+ entries), the API export is the only practical coverage audit method.

Accuracy measurement at scale

The manual spot-check protocol (Dimension 3) is accurate but slow. For large Tier 1 catalogues, WhisperX (an open-source speech recognition tool with forced alignment) can be used to generate a time-aligned reference transcript from the video audio and compare it against the existing caption file programmatically. The comparison produces a WER estimate at the segment level. This is not a substitute for human verification of the proper-noun WER check, because machine-to-machine WER comparison can miss proper-noun errors when both the reference transcription and the caption file have the same substitution pattern. It does accelerate the first-pass WER estimate for each asset, allowing the human spot-check to focus on assets flagged as high-WER rather than every asset in the queue. For a Tier 1 queue that would otherwise fill Day 2, running an automated WhisperX pass ahead of the manual spot-check can cut the accuracy audit to roughly a half-day, with the manual spot-check reserved for the targeted 20% or so of assets flagged as high-risk.
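Once a first-pass WER estimate exists for each asset, choosing which assets go to the human spot-check is a simple ranking. The Python sketch below does not call WhisperX; it takes the estimates as input, however they were produced. The function name is hypothetical, the 10% threshold mirrors the accuracy fail threshold used in the spot-check, and the 20% floor is an assumption that keeps some human review in place even when the estimates look clean.

```python
import math
from typing import Dict, List


def manual_review_queue(first_pass_wer: Dict[str, float],
                        threshold: float = 0.10,
                        min_fraction: float = 0.20) -> List[str]:
    """Pick assets for the human spot-check from first-pass WER estimates.

    Takes every asset whose estimated WER exceeds `threshold`, and tops the
    queue up to `min_fraction` of the catalogue with the next-worst assets.
    """
    ranked = sorted(first_pass_wer, key=first_pass_wer.get, reverse=True)
    flagged = [a for a in ranked if first_pass_wer[a] > threshold]
    minimum = math.ceil(len(ranked) * min_fraction)
    # `flagged` is a prefix of `ranked`, so a single slice covers both rules.
    return ranked[:max(len(flagged), minimum)]
```

Proper-noun–dense assets should stay in the manual queue regardless of their estimate, for the shared-substitution reason described above.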

Compliance documentation management

The documentation audit (Dimension 7) is most painful when compliance documentation is scattered across email archives, vendor delivery portals, and personal shared drives. A minimal compliance documentation system requires only a shared folder with a consistent naming convention and an index spreadsheet. A more structured option is a dedicated compliance management tool (several accessibility compliance platforms support caption audit documentation natively). At minimum, the audit output — the populated audit log from the five-day sprint — should be stored in a location that any member of the L&D and legal teams can access in under five minutes. The 90-day compliance programme post covers the documentation system architecture in Phase 2 (policy and governance).

Glossary-biased captioning for remediation

The most common finding in the accuracy audit is accuracy failure on proper-noun–dense content — exactly the content most likely to trigger a compliance investigation (mandatory training, safety training, clinical training). The structural fix is to re-run the captions using a tool that applies your organisation's domain vocabulary during transcription, rather than using a generic speech-to-text model. The customer glossary architecture post covers how to build the term list for your training content. The workflow for applying the glossary to the remediation queue is: upload the Effort C (recaption from transcript) assets to GlossCap with the customer glossary connected → receive corrected caption files with proper-noun accuracy above 99% → verify with a spot-check on the first batch → upload corrected files to the LMS. The accuracy improvement on proper-noun–dense medical, engineering, or sales training content is typically from 80–88% (generic ASR) to 99%+ (glossary-biased decoding) — the same improvement pattern documented in the glossary-biased Whisper post and the vertical accuracy benchmarks post.

For the documentation record (Dimension 7), GlossCap's output includes a per-file accuracy claim and a log of the glossary version applied — the documentation artefact that the documentation audit requires for Effort C remediations. The embed widget provides a real-time preview of glossary-corrected captions for the review step, allowing the L&D operator to verify the output before uploading to the LMS.

Common audit failure patterns and their root causes

Across the seven dimensions, three root causes account for the majority of audit failures in L&D organisations at the typical post-ADA-deadline compliance maturity level:

Root cause 1: Toolchain mismatch between caption production and LMS delivery

Caption files are produced in one environment (a vendor's delivery portal, a machine captioning tool, a video editing tool) and served from another (the LMS, the video hosting layer, the SCORM player). Every handoff between environments is a failure point — the wrong format is exported, the file is uploaded to the wrong system, the language tag is lost during the transfer, the player in the delivery environment doesn't support the format. The toolchain mismatch root cause produces a cluster of Dimension 2 (format) and Dimension 6 (delivery) failures that are individually small but collectively account for 15–25% of audit fails in most organisations. The structural fix is to establish a documented caption upload workflow that specifies the format, the target system, and the verification step for every production environment in the catalogue. The LMS caption ingestion workflow post covers this workflow in detail for the four most common LMS platforms.
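One way to make the verification step concrete for VTT uploads is a structural sanity check that runs before the file leaves the production environment. The sketch below is a minimal example under stated assumptions: it checks only the WEBVTT header and cue timing order, it does not validate the full WebVTT grammar, and the function name `check_vtt` is hypothetical. The language tag is usually set on the LMS or player track rather than inside the file, so it still needs a separate check.

```python
import re

# Cue timing line, e.g. "00:00:01.000 --> 00:00:04.000" (hours are optional).
TIMING = re.compile(
    r"^(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _seconds(hours, minutes, seconds, millis) -> float:
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def check_vtt(text: str) -> list[str]:
    """Return a list of structural problems; an empty list means none were found."""
    problems = []
    lines = text.lstrip("\ufeff").splitlines()
    if not lines or not lines[0].startswith("WEBVTT"):
        problems.append("missing WEBVTT header")
    previous_start = -1.0
    cues = 0
    for number, line in enumerate(lines, start=1):
        match = TIMING.match(line.strip())
        if not match:
            continue
        cues += 1
        groups = match.groups()
        start, end = _seconds(*groups[:4]), _seconds(*groups[4:])
        if end <= start:
            problems.append(f"line {number}: cue ends before it starts")
        if start < previous_start:
            problems.append(f"line {number}: cue starts earlier than the previous cue")
        previous_start = start
    if cues == 0:
        problems.append("no cue timings found")
    return problems
```

A check like this catches the wrong-export case (an SRT file renamed to .vtt, for example) but says nothing about whether the player actually renders the track, so the learner-context delivery check is still required.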

Root cause 2: Accuracy assumption from coverage

The coverage audit finding ("85% of Tier 1 assets have caption files") is treated as a compliance finding ("we are 85% compliant on Tier 1"). This conflation is what the audit is designed to correct. The operational pattern that produces this root cause is the use of automatic captioning tools (YouTube, LMS-native ASR, Camtasia auto-caption) for content production without a QC pass. The caption files exist — they were generated as part of the production process — but they were never verified for accuracy. The structural fix is to insert an accuracy QC gate into the production workflow before upload, not a post-hoc audit after the fact. The compliance programme build (Phase 4 in the 90-day plan) covers the QC gate design. The audit surfaces this gap in the existing catalogue so it can be remediated.
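The gap between the two numbers is easy to show from the audit log itself. The sketch below assumes a hypothetical log structure (`Asset`, with a `spot_check_accuracy` field left empty when no QC pass was ever run) and uses the 99% accuracy benchmark applied elsewhere in this post as the default threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Asset:
    asset_id: str
    has_caption_file: bool
    spot_check_accuracy: Optional[float]  # None means the file was never verified


def coverage_and_verified_rate(assets: list[Asset], threshold: float = 0.99) -> tuple[float, float]:
    """Return (coverage rate, verified-accuracy rate) for a list of audited assets."""
    total = len(assets)
    if total == 0:
        return 0.0, 0.0
    covered = sum(a.has_caption_file for a in assets)
    verified = sum(
        a.has_caption_file
        and a.spot_check_accuracy is not None
        and a.spot_check_accuracy >= threshold
        for a in assets
    )
    return covered / total, verified / total


# Hypothetical four-asset sample: three files exist, only one has a passing spot-check.
sample = [
    Asset("T1-001", True, 0.995),
    Asset("T1-002", True, 0.86),
    Asset("T1-003", True, None),
    Asset("T1-004", False, None),
]
print(coverage_and_verified_rate(sample))
```

Reporting both figures side by side keeps the coverage number from being read as a compliance number.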

Root cause 3: Documentation never built

Caption compliance documentation is the last thing L&D teams think about and the first thing a compliance investigation asks for. In organisations where caption compliance has been treated as a content problem rather than a governance problem, the documentation has never been built — there is no central record of which assets have compliant caption files, when they were produced, by whom, and to what accuracy standard. The audit's documentation dimension (Dimension 7) surfaces this gap, but fixing it requires building the documentation system prospectively, not just locating scattered existing documentation. The documentation system build — establishing the central compliance record, backfilling accuracy records for Tier 1 assets, and creating the intake process for future assets — should be treated as a parallel sprint to the content remediation, not a deferred action.

Frequently asked questions

How is the caption library audit different from the compliance programme build in the 90-day plan?

The 90-day compliance programme build is a forward-looking operation: you are designing a policy, a production workflow, a vendor relationship, and a documentation system that will produce compliant captions for new content going forward. The caption library audit is a backward-looking operation: you are assessing whether the captions already in your library meet the WCAG 2.1 AA standard. Both are necessary, and they are most effectively run in parallel rather than in sequence — the audit runs over the existing catalogue while the compliance programme build establishes the system for new production. The audit's remediation queue feeds directly into the compliance programme's back-catalogue retrofit phase (Phase 5 in the 90-day plan), so running the audit early gives the compliance programme accurate data on the scope of the back-catalogue problem. Running the compliance programme build first without the audit is the more common sequence, and it produces the situation most L&D teams are in six weeks post–ADA Title II: compliant new production, unknown back-catalogue status.

We have 800 videos in our LMS. How do we scope the audit to what we can actually complete in five days?

Scope by compliance tier, not by catalogue size. For an 800-video catalogue, the five-day sprint should cover all Tier 1 assets fully across all seven dimensions, Tier 2 assets on Dimensions 1–3 (coverage, format, accuracy), and Tier 3 and 4 assets only on Dimension 1 (coverage, using an automated LMS API export). The goal of the five-day sprint for a large catalogue is not to audit every video; it is to audit every video with a live compliance obligation. Tier 3 and 4 assets get logged with "Needs Review" status in the audit log and are deferred to the quarterly back-catalogue plan. For an 800-video catalogue with a typical 30–40% Tier 1 distribution, the Tier 1 queue is 240–320 assets, which is a manageable five-day sprint for a two-person team running the manual spot-check on the accuracy dimension in parallel. If the team is one person, extend the sprint to eight days rather than skipping dimensions on Tier 1 assets.
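The scoping arithmetic can be sketched in a few lines. The tier-to-dimension mapping below follows the rule described above; the function names and the per-tier asset counts in the example are hypothetical, chosen only so they add up to 800.

```python
# Dimensions audited per tier in the five-day sprint, as described above.
SCOPE = {
    1: [1, 2, 3, 4, 5, 6, 7],  # Tier 1: all seven dimensions
    2: [1, 2, 3],              # Tier 2: coverage, format, accuracy
    3: [1],                    # Tier 3: coverage only
    4: [1],                    # Tier 4: coverage only
}


def tier1_range(catalogue_size: int, low_pct: int = 30, high_pct: int = 40) -> tuple[int, int]:
    """Estimated Tier 1 queue size for a typical 30-40% Tier 1 share."""
    return catalogue_size * low_pct // 100, catalogue_size * high_pct // 100


def sprint_scope(tier_counts: dict[int, int]) -> dict[int, dict]:
    """Per tier: asset count, dimensions in scope, and asset-by-dimension checks."""
    return {
        tier: {
            "assets": count,
            "dimensions": SCOPE[tier],
            "checks": count * len(SCOPE[tier]),
        }
        for tier, count in tier_counts.items()
    }


print(tier1_range(800))  # (240, 320)

# Hypothetical tier split for an 800-video catalogue.
plan = sprint_scope({1: 280, 2: 200, 3: 200, 4: 120})
print(sum(row["checks"] for row in plan.values()))  # 2880
```

The check count is a planning figure only: a Dimension 1 check from an API export takes seconds, while a Dimension 3 spot-check is manual work, so the Tier 1 queue dominates the sprint regardless of the totals.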

Our LMS auto-generates captions using built-in ASR. Do those count as WCAG-compliant captions?

LMS-native ASR-generated captions do not automatically meet WCAG 2.1 AA SC 1.2.2. The standard requires captions accurate enough for a person with a hearing disability to understand the content. WCAG itself sets no numeric threshold; the benchmark this audit applies to formal training content is 99%+ word accuracy, and no LMS-native ASR system achieves 99% accuracy reliably on domain-specific training content with above-average proper-noun density. Docebo's auto-captioning, TalentLMS's auto-captioning, and Panopto's ASR all perform in the 80–90% accuracy range on average audio quality and drop further on technical, medical, or sales training content. LMS-native ASR is appropriate as a starting point for caption production: it provides the synchronised text structure, which then requires a QC pass and, where accuracy is below 99%, a correction pass using a glossary-biased captioning tool. The audit's accuracy dimension (Dimension 3) should treat all LMS-native ASR–generated captions as unverified until a spot-check confirms their WER.
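For the spot-check itself, word error rate is the word-level edit distance between a human-verified reference transcript and the caption text, divided by the number of reference words. The sketch below is a minimal implementation under stated assumptions: the tokenisation (lowercase, punctuation stripped) is one reasonable choice rather than a standard, and the function name is hypothetical.

```python
import re


def _words(text: str) -> list[str]:
    # Assumed normalisation: lowercase, keep letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def word_error_rate(reference: str, hypothesis: str) -> float:
    """(substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = _words(reference), _words(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    # Word-level Levenshtein distance, computed one row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # match or substitution
                )
            )
        previous = current
    return previous[-1] / len(ref)


wer = word_error_rate("isolate the pump before servicing", "isolate the pomp before servicing")
print(f"WER {wer:.0%}, word accuracy {1 - wer:.0%}")
```

Word accuracy is 1 minus WER, so a 99% accuracy bar corresponds to at most one word error per hundred reference words in the sampled segment.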

We use Articulate Storyline for most of our content. How do we audit captions in Storyline-packaged SCORM files?

Articulate Storyline SCORM packages embed captions as an HTML text track in the package's internal HTML structure. To audit captions in a Storyline package: (1) download the published SCORM package from your LMS, (2) unzip it, (3) navigate to the html5 or mobile subfolder and find the story.html or index.html file, (4) open it in a browser and play the video while checking for caption delivery. Alternatively, open the source Storyline file and check for captions in the Notes/Captions panel for each slide. The most reliable delivery check is to play the SCORM content as a learner in the LMS, confirm that captions are present and synchronised, and screenshot the delivery confirmation for the compliance record. For the accuracy audit, extract the caption text from the Storyline package (the captions are stored in a JSON or SRT file inside the story_content folder) and run the spot-check protocol against the extracted text. Note that Storyline's built-in auto-captioning feature (introduced in Storyline 360) generates captions from the text-to-speech audio it produces for accessible review slides — these are not transcriptions of the recorded narration and should not be treated as caption files for recorded video narration. Only captions that were explicitly added to the video layer in Storyline and exported in the SCORM package as synchronised text tracks count for the compliance audit.
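The extraction step can be partly automated without unzipping the package by hand. The sketch below lists candidate caption files inside a published package using Python's standard `zipfile` module. It assumes the story_content folder layout described above; the extension list and the function name are assumptions, and a JSON file in that folder is only a candidate that still needs to be opened and confirmed as caption data.

```python
import zipfile

# Assumed caption-like extensions; adjust to what your published packages contain.
CAPTION_EXTENSIONS = (".vtt", ".srt", ".json")


def find_caption_candidates(package_path: str) -> list[str]:
    """List files under story_content/ whose extension suggests caption data."""
    with zipfile.ZipFile(package_path) as package:
        return sorted(
            name
            for name in package.namelist()
            if "story_content/" in name.lower()
            and name.lower().endswith(CAPTION_EXTENSIONS)
        )
```

Calling `find_caption_candidates("course.zip")` on a downloaded package returns the paths to inspect. An empty list does not prove the course has no captions, so the learner-context playback check remains the authoritative delivery test.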

What does "documentation record" mean in practice, and how detailed does it need to be for a legal or OCR audit?

A documentation record for a caption file answers three questions: who produced the caption file, when, and to what standard. For a WCAG 2.1 AA compliance record, the minimum documentation is: the caption source (vendor name or tool name), the production date, and either a vendor-provided accuracy claim or a spot-check result. In a legal or OCR investigation, the documentation record is used to show that the organisation actively managed caption quality rather than passively accepting whatever a machine generated. The investigation is not looking for perfection — it is looking for a good-faith effort to understand and maintain compliance. A spreadsheet with one row per Tier 1 video, showing the vendor, the date, and the accuracy claim, stored in a shared folder accessible to the compliance officer, meets the minimum standard. More detailed documentation — the specific glossary version applied, the QC reviewer's name and spot-check timestamp, the pre- and post-remediation WER — is useful for demonstrating continuous improvement and for defending against claims of wilful non-compliance, but is not required in the initial compliance record build. The documentation audit (Dimension 7) is designed to identify whether the minimum standard is met and flag any documentation gaps for the remediation queue.
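The minimum standard described above can be checked mechanically against the index spreadsheet. The sketch below assumes each row has been loaded as a dictionary; the column names are hypothetical and should be replaced with whatever your audit log actually uses.

```python
# Hypothetical column names for the compliance index spreadsheet.
REQUIRED_FIELDS = ("caption_source", "production_date")
EVIDENCE_FIELDS = ("vendor_accuracy_claim", "spot_check_result")


def _filled(record: dict, field: str) -> bool:
    return bool(str(record.get(field) or "").strip())


def documentation_gaps(record: dict) -> list[str]:
    """Return the minimum-record items missing from one asset's documentation row."""
    gaps = [field for field in REQUIRED_FIELDS if not _filled(record, field)]
    # Either a vendor accuracy claim or a spot-check result satisfies the evidence item.
    if not any(_filled(record, field) for field in EVIDENCE_FIELDS):
        gaps.append("accuracy evidence (vendor claim or spot-check result)")
    return gaps


row = {"asset_id": "T1-014", "caption_source": "Vendor A", "production_date": ""}
print(documentation_gaps(row))
```

Rows that return a non-empty list go to the remediation queue as Dimension 7 gaps; rows that return an empty list meet the minimum record, which is a lower bar than the detailed documentation described above.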

How do we handle third-party content in our LMS — vendor-provided courses, off-the-shelf compliance training, licensed content from LinkedIn Learning or Coursera for Business?

Third-party content is an audit scope question, not an exemption from compliance. Your organisation's obligation to provide accessible training applies to all mandatory training content assigned to employees, regardless of whether that content was produced internally or licensed externally. If your organisation has 15 or more employees (the ADA Title I coverage threshold) and assigns a LinkedIn Learning course as mandatory compliance training, ADA Title I requires that accessible alternatives be available for employees with hearing disabilities, including caption files that meet WCAG 2.1 AA SC 1.2.2. For third-party content, the audit approach is: include it in the Tier 1 inventory if it is assigned as mandatory training; verify caption availability and quality from the content provider; if the provider's captions fail the accuracy check, issue a formal request to the provider for WCAG-compliant captions and document the request as part of the compliance record. Most enterprise content providers (LinkedIn Learning, Coursera for Business, Skillsoft, OpenSesame) provide caption files for their catalogue. The open question is whether those caption files meet the accuracy standard on content with above-average proper-noun density. Request the provider's WCAG conformance documentation for any third-party content included in the Tier 1 audit scope.

After the audit, how do we prevent the next audit from finding the same problems?

The audit surfaces systemic gaps that a programme can address structurally. The three most common audit findings — toolchain mismatch, accuracy assumption from coverage, and missing documentation — each have a structural fix that prevents recurrence. For toolchain mismatch: establish a documented caption upload checklist for every production environment, with a format specification and a verification step that confirms delivery in the primary learner context before the course module goes live. For accuracy assumption from coverage: insert a mandatory accuracy QC gate into the production workflow — no video goes live without a documented accuracy claim, whether a vendor QC report or an internal spot-check result. For missing documentation: build the compliance documentation system before the remediation sprint ends, not after, and establish the intake workflow for new production (who logs the accuracy record, where, and by when). These three fixes are the product of the compliance programme build, not the audit — the audit identifies what needs fixing; the programme build defines how it stays fixed. The next audit, typically run annually, should find that Dimension 3 (accuracy) and Dimension 7 (documentation) pass rates are above 85% for all Tier 1 assets, because the production workflow QC gate and the documentation intake process are operating reliably. If the second audit finds the same gaps as the first, the programme build's QC gate or documentation intake process is not functioning and needs a root-cause investigation before the programme can be considered operating.

Fix the accuracy gaps the audit finds — starting with the glossary

The audit will find accuracy fails on exactly the content where accuracy matters most: the mandatory training videos with the highest proper-noun density, the medical and safety training where a substitution error on a drug name or hazard code is not just an accessibility failure but a safety failure. The structural fix is a captioning tool that knows your domain vocabulary — your product names, your drug names, your SDK symbols, your OSHA citation codes — and applies that vocabulary during transcription rather than after the fact. GlossCap connects to your existing terminology source (Notion, Confluence, Google Docs, or a pasted term list) and applies your glossary to every caption job, producing WCAG 2.1 AA–grade caption files that pass the spot-check protocol on the first review.

For the Effort C remediation queue — assets where an accurate base transcript exists but proper-noun accuracy fails — GlossCap applies the glossary correction to the existing file without a full recaption. The output includes a per-file accuracy documentation log that satisfies Dimension 7's documentation record requirement. The Solo plan at $29/month covers the sprint-sized remediation queue for a small L&D team; the Team plan at $99/month includes the Notion/Confluence/Docs glossary sync and the LMS integration for the ongoing production workflow. Start with the highest-priority Tier 1 accuracy fails from the audit output, verify the first batch, and build from there.

Start remediating the audit fails with the right glossary
