Compliance Operations · Published 2026-06-18
Caption backlog remediation at scale: prioritisation frameworks for 5,000–50,000 hour libraries, batch economics, vendor throughput math, and the phased approach that keeps compliance evidence current
Every organisation with a multi-year training video library has a caption backlog. For most small and mid-size L&D teams, the backlog is measured in dozens or hundreds of hours — a problem that can be resolved over one to two budget cycles with a modest captioning line item. For public universities, healthcare networks, large enterprise, and government agencies, the backlog is measured differently: 5,000 hours, 20,000 hours, 50,000 hours. At that scale, the operational problem is no longer "how do we caption our videos." It becomes: how do we sequence 50,000 hours of remediation work so that the highest-risk content is addressed first, the compliance evidence trail demonstrates good-faith progress to regulators, the vendor throughput capacity is matched to internal review capacity, and the budget is not exhausted on archive content while currently-enrolled learners are still encountering uncaptioned training? The answer requires a fundamentally different operational approach than the sequential, video-by-video captioning workflow that handles small backlogs.
The scale threshold at which the problem changes is roughly 2,000 hours. Below that threshold, an organisation with a $50,000 annual captioning budget and a single vendor can reach full compliance within 18 months without prioritisation. Above 2,000 hours, sequential completion within a reasonable compliance timeline requires either a budget that most L&D teams do not have or a prioritisation framework that concentrates limited remediation capacity on the content that matters most for compliance purposes. The organisations most commonly facing 5,000-to-50,000-hour backlogs are public universities (10,000–100,000+ lecture hours in archive from a decade or more of course recordings), healthcare networks (3,000–15,000 hours of clinical training, compliance, and continuing education content), large enterprise (2,000–20,000 hours across 10+ years of internally-produced video), and government agencies subject to Section 508 (often 5,000–30,000 hours accumulated before 2018 Section 508 refresh enforcement intensified).
The compliance framing for large-scale remediation is different from the compliance framing for ongoing captioning programmes. An organisation with a 30,000-hour backlog cannot complete remediation before the first compliance deadline — the deadline has already passed for every video that has ever been assigned to a learner with a hearing disability and lacked a caption track. The compliance question is not "can we achieve compliance before the deadline" but rather "can we demonstrate a good-faith remediation programme that shows systematic, risk-prioritised progress toward full compliance." Regulators reviewing an OCR complaint or DOJ investigation do not expect overnight completion of a decade-long backlog. They do expect to see: a documented inventory of uncaptioned content, a risk-based prioritisation plan with milestone dates, evidence that the highest-risk content (active-enrollment, mandatory compliance, onboarding) was addressed first, and consistent monthly or quarterly progress against the plan. An organisation that can show a well-documented remediation programme with completed Tier 1 and active Tier 2 work is in a defensible position even with thousands of uncaptioned archive hours remaining.
This guide covers the complete operational playbook for large-scale caption backlog remediation, from the inventory and prioritisation decisions that precede the first batch submission to the per-LMS platform workflows that govern how captioned files get from the vendor back into the systems where learners access them. The LMS caption audit methodology post covers how to discover and inventory uncaptioned content; this post assumes the inventory is complete and focuses on execution. The caption compliance programme build covers the 90-day programme launch for new programmes; this post covers the operational execution for organisations that already have a programme and are managing a large existing backlog. These three posts together form the complete remediation reference for L&D and compliance teams.
TL;DR — six things every L&D and compliance team needs to know about large-scale remediation
- The prioritisation decision matters as much as the captioning itself. At large scale, you will not complete the full backlog within the first year regardless of budget. Organisations that caption archive content before active-enrollment content are investing remediation capacity in precisely the wrong order: the compliance risk is in the content currently assigned to learners with hearing disabilities, not in a 2016 leadership development video that has had zero enrollments in three years. The compliance reporting framework for large programmes is built around the active-enrollment ratio — the percentage of currently-enrolled content that has compliant captions — as the primary risk metric.
- Vendor throughput is rarely the constraint. Internal review capacity and LMS ingestion time usually are. Most enterprise captioning vendors can process 500–2,000 hours of audio per month. What limits a remediation programme is the internal review step (someone must validate quality before content is published to learners) and the LMS administration step (uploading and associating SRT files in most LMS platforms is a manual, per-video process that takes 2–3 minutes per video even with batch tools). At 2 minutes per video, 600 hours of content (approximately 600 separate video assets at an hour each) requires 20 hours of LMS administration time, half a standard working week, before a single learner can see the new captions.
- AI-only captioning for the backlog without a domain glossary will fall short of the 99% accuracy benchmark commonly applied to WCAG 2.1 AA captions. The apparent cost saving of AI-only ASR at $0.10–$0.30 per minute (vs AI + human review at $0.50–$1.00 per minute) evaporates when internal teams must review and correct the output before publishing. AI-only ASR on technical training content without a domain-specific glossary achieves 75–85% accuracy on specialised vocabulary, far below that 99% benchmark (WCAG itself does not state a numeric accuracy figure; 99% is the widely used working threshold). The correction labour cost at the internal team rate frequently exceeds the cost of ordering the review step from the captioning vendor. For large-scale remediation, the financially optimal approach is AI + glossary-assisted review at the vendor stage, which dramatically reduces the correction burden before internal QA.
- SCORM-packaged eLearning content in your backlog cannot be captioned by uploading a sidecar file. For a video asset in an LMS course unit, you upload an SRT file alongside the video. For an eLearning course published in Storyline, Rise, or Captivate as a SCORM package, captions are embedded in the package at publish time — there is no post-hoc sidecar capability. Adding captions to historical SCORM content requires access to the original authoring tool project file (.story, .cptx, etc.), which may not exist for courses produced several years ago. The eLearning authoring tool caption workflow covers the per-tool approaches in detail; the critical point for remediation planning is that SCORM content requires a separate pathway and must be inventoried separately from raw video content.
- Compliance evidence must be documented throughout the remediation, not assembled at the end. If a regulatory complaint arrives while you are mid-remediation, you need to be able to show the compliance team or regulator a current-state inventory, the prioritisation rationale, the milestone completion dates for each tier, and evidence that Tier 1 (highest-risk active-enrollment content) was completed. Documentation assembled after the fact is far less credible than contemporaneous progress reports. Build the evidence documentation into the programme management workflow from month one, not as a retrospective project.
- The remediation programme is not complete when the last video is captioned — it is complete when the new-content gate prevents the backlog from re-accumulating. The single most common outcome of a large-scale remediation programme: the organisation completes the backlog over 18 months, then spends the following two years recreating a new backlog at the same rate as before because the programme lacked a new-content gate. A new-content gate — a requirement that no new video is published to learners without a caption track — is the structural fix that prevents the remediation investment from being eroded. The caption governance policy template covers the new-content gate requirements for programmes at different levels of operational maturity.
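The active-enrollment ratio named in the first point above can be computed from a simple course export. The following is a minimal sketch, assuming the ratio is counted per course rather than per hour of video; the `Course` fields and the sample catalogue are hypothetical and not taken from any particular LMS.

```python
from dataclasses import dataclass


@dataclass
class Course:
    course_id: str
    active_enrollments: int       # learners currently assigned or enrolled
    has_compliant_captions: bool  # reviewed captions, not unreviewed auto-captions


def active_enrollment_ratio(courses: list[Course]) -> float:
    """Share of currently-enrolled courses that have compliant captions."""
    active = [c for c in courses if c.active_enrollments > 0]
    if not active:
        return 1.0  # nothing is assigned, so there is no active exposure
    captioned = sum(1 for c in active if c.has_compliant_captions)
    return captioned / len(active)


# Hypothetical sample data for illustration only.
catalogue = [
    Course("harassment-2025", 1200, True),
    Course("hipaa-basics", 300, False),
    Course("leadership-2016", 0, False),  # archive item, excluded from the ratio
    Course("onboarding-day1", 45, True),
]
print(f"{active_enrollment_ratio(catalogue):.0%}")  # prints 67%
```

Weighting by video hours or by enrolled learners instead of by course count is an equally defensible choice; what matters for the evidence trail is that the same definition is used in every report.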
Why large-scale backlog is a different operational problem
The caption workflow that works for a 200-hour backlog is a sequential queue: videos are identified, submitted to a captioning vendor, reviewed by an L&D team member, and uploaded to the LMS one by one. Decisions are made video-by-video. The programme manager can hold the entire backlog in their head. Review is done by the person who knows the content. Exceptions (a video with no audio, a SCORM package that needs the .story file) are handled ad hoc.
At 5,000 hours, this approach breaks in multiple ways simultaneously. The queue is too large for any individual to review video-by-video without full-time dedication. Ad hoc exception handling becomes a major source of programme delay — a batch of 400 videos that includes 30 SCORM packages, 15 videos with no audio, and 20 with non-English content will stall unless there is a documented protocol for each exception type. The vendor submission and LMS ingestion workflow must be systematised or it consumes the available programme management capacity entirely. The compliance evidence trail must be active and contemporaneous rather than retroactive. And critically, the sequencing decision — which content gets captioned first — becomes a formal risk management decision with financial and legal consequences rather than a convenience question about which course the team is currently working on.
The other dimension that changes at large scale is the relationship between the remediation programme and the ongoing new-content programme. For an organisation captioning a 200-hour backlog, the ongoing new-content captioning workflow can continue largely independently. For an organisation managing a 20,000-hour backlog, the remediation programme and the new-content programme must be managed together: they compete for the same captioning vendor capacity, the same internal reviewer bandwidth, and the same LMS administration time. A poorly coordinated large-scale remediation will find that the remediation programme starves the new-content workflow of reviewer capacity, leading to a growing new-content queue even as the legacy backlog shrinks. The programme management structure must explicitly allocate capacity between the two workstreams.
The budget dimension is equally constraining. The three-year caption programme budget model shows how per-minute costs accumulate for mid-scale libraries. At large scale, the numbers are qualitatively different: a 10,000-hour library at $0.75/minute (the midpoint of the AI + human review range) is a $450,000 project. No organisation can approve a $450,000 captioning project in a single fiscal year budget cycle; it must be phased across multiple years with milestone-based budget releases. Phasing requires prioritisation — the budget for year one must be concentrated on the highest-risk content. The prioritisation framework is therefore not just an operational tool; it is the budget justification document that determines what leadership approves in year one vs year two vs year three.
What counts as large-scale: thresholds and who has them
Four tiers, separated by three hour thresholds, define the operational scale of a caption backlog programme:
Small (under 500 hours): Single-year remediation is financially feasible for most L&D budgets. A 500-hour backlog at $0.75/minute is a $22,500 project. Sequential processing works. A single L&D team member can manage the programme alongside other responsibilities. No formal programme management infrastructure is required.
Medium (500–2,000 hours): Two-to-three-year remediation at typical budget allocations. Prioritisation is valuable but not mandatory: the compliance exposure difference between captioning Tier 1 and Tier 3 content in this range is smaller because the total timeline to full completion is manageable. A basic tracking spreadsheet and a quarterly review cadence are sufficient.
Large (2,000–20,000 hours): Multi-year programme requiring formal project management, risk-based prioritisation, documented compliance evidence, and dedicated resourcing. A single vendor is often sufficient for throughput but internal review and LMS ingestion capacity will be constrained. Budget approval typically requires a multi-year plan with annual milestone gates.
Enterprise (20,000+ hours): Multi-year, multi-vendor programme requiring project management office-level governance. Vendor throughput must be managed across concurrent suppliers to hit reasonable calendar timelines. Internal review capacity is a structural constraint that may require dedicated FTE allocation or partial outsourcing of the review function. LMS ingestion automation (via API where available) is worth the engineering investment to avoid the per-video manual upload bottleneck.
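As a rough illustration, the scale bands above can be expressed as a lookup on total backlog hours. This sketch assumes that a value sitting exactly on a boundary (500, 2,000, or 20,000 hours) belongs to the higher band, which the text does not specify; the function name is hypothetical.

```python
def operational_tier(backlog_hours: float) -> str:
    """Map total backlog hours to the scale band described above."""
    if backlog_hours < 500:
        return "Small"
    if backlog_hours < 2_000:
        return "Medium"
    if backlog_hours < 20_000:
        return "Large"
    return "Enterprise"


for hours in (300, 1_500, 8_000, 50_000):
    print(f"{hours:>6,} hours -> {operational_tier(hours)}")
```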
Who has large-scale backlogs
Public universities: The highest-density large-scale backlog environment. A mid-size university with 20,000 enrolled students and a 10-year course recording archive may have 30,000–80,000 lecture hours in Panopto, Echo360, or Kaltura. Lecture capture volumes accumulate at roughly 5,000–15,000 hours per year at a moderately active institution. ADA Title II and Section 508 obligations apply to the full archive, not just recent content. The university lecture capture captioning post covers the per-platform workflows; this post covers the programme-level remediation approach that governs which lecture content gets prioritised for backlog processing.
Healthcare networks: Hospital systems and integrated delivery networks accumulate clinical training, compliance, and continuing education video at scale. A health system with 50,000 employees may have 5,000–20,000 hours of mandatory training content across the centralised LMS, department-level SharePoint libraries, and clinical education platforms (HealthStream, Relias, Cornerstone). HIPAA training, Joint Commission compliance modules, and clinical simulation recordings are all in scope for ADA Title I and Section 1557 (the ACA non-discrimination provision that healthcare entities must comply with). Healthcare backlogs are typically lower in total hours than university archives but higher in per-hour compliance risk because the content is mandatory and the audience includes employees with hearing disabilities who have the most immediate ADA exposure.
Large enterprise (1,000+ employees): Organisations with substantial L&D programmes that have been producing training video for a decade or more can easily accumulate 5,000–20,000 hours across their LMS, video host, and intranet. The typical composition: a few hundred hours of high-production-value courses in the LMS, a larger body of department-level instructional video in SharePoint or Confluence, and several thousand hours of Zoom and Teams meeting recordings that were repurposed as training content. The repurposed recording category is often the discovery surprise: when an LMS caption audit is conducted, teams frequently find that a significant volume of content is not in the formal LMS catalogue at all — it is in SharePoint document libraries, accessible via links in course resources but not tracked as LMS media assets.
Government agencies (Section 508): Federal agencies subject to the 2018 Section 508 refresh and state agencies subject to analogous state requirements often have large backlogs of pre-refresh training content. The refresh, which took effect in January 2018, incorporated WCAG 2.0 Level AA as the operative standard for federal ICT (the prerecorded caption requirement is the same in WCAG 2.1); for agencies that previously interpreted the original 1998 Section 508 as permitting lower-quality captions, it created a backlog of content that technically had captions but did not meet the refreshed standard. For these agencies, the remediation problem is compound: identifying which content has no captions, identifying which content has non-compliant captions (wrong format, below 99% accuracy, poor synchronisation), and prioritising across both categories.
The inventory problem: what you do not know yet
Before a prioritisation framework can be applied, the inventory must be complete. The LMS caption audit methodology covers the five-day discovery sprint in detail; this section summarises the inventory gaps that consistently surprise L&D teams undertaking large-scale remediation.
You probably have more content than your LMS catalogue shows
The LMS catalogue is a record of formally published courses. It is not a record of all video content in the organisation's training ecosystem. The typical discovery gaps:
- Zoom and Teams recordings flagged as training resources and linked from LMS course resources (the video file lives outside the LMS but is accessible to enrolled learners)
- SharePoint video libraries managed by individual departments (product teams, HR, IT, Compliance) that contain instructional video without LMS course wrappers
- Loom recordings shared via direct link in onboarding documents, Confluence pages, and intranet wikis
- YouTube or Vimeo video hosted on those platforms and embedded in LMS course content (the LMS has no record of the video, only the embed code)
When these non-LMS sources are included in the inventory, the total video volume is typically 20–40% higher than what the LMS catalogue alone would suggest.
You probably have more captioned content than you realise — and some of it is non-compliant
Inventory discovery also reveals captioned content that teams did not know existed. Auto-generated captions enabled on a YouTube channel, Panopto ASR turned on as a default setting, Zoom auto-transcript records linked as resources — some of this content appears captioned but at 70–80% ASR accuracy on technical content, which is below the 99% WCAG 2.1 AA threshold. Auto-captions that have never been reviewed for accuracy count as a compliance gap just as much as missing captions — the distinction is that the fix is correction rather than production from scratch, and correction is typically 30–50% less expensive than production.
SCORM content requires a separate inventory track
eLearning courses published as SCORM packages require a separate inventory and remediation pathway. Unlike video files where captions are delivered as sidecar SRT files, SCORM packages embed caption data at authoring tool publish time — adding captions after the package is deployed requires the original authoring tool project file (.story for Storyline, .cptx for Captivate) and republishing. The inventory must track: which SCORM courses contain video narration, whether those courses have the original project files accessible, and whether the original authoring tool version is still licensed and functional. For courses more than five years old, original .story or .cptx files may be lost, the Articulate version used may be out of support, or the voice-over narration may have been recorded by a vendor who did not provide the script, making manual caption production from audio the only option. The eLearning authoring tool caption workflow covers the per-tool remediation options in detail.
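The three inventory questions for SCORM content can be captured as a small record with a pathway decision attached. This is a sketch under stated assumptions: the field names and pathway labels are hypothetical, and a real inventory would likely carry more columns (authoring tool version, owner, last publish date).

```python
from dataclasses import dataclass


@dataclass
class ScormCourse:
    course_id: str
    has_video_narration: bool
    project_file_available: bool  # original .story or .cptx file located
    authoring_tool_usable: bool   # tool version still licensed and functional


def scorm_pathway(course: ScormCourse) -> str:
    """Pick a remediation pathway label (labels are illustrative)."""
    if not course.has_video_narration:
        return "no-narration"
    if course.project_file_available and course.authoring_tool_usable:
        return "republish-with-captions"
    # Source file or tool is missing: captions must be produced from audio,
    # or the course replaced, so it needs an explicit assessment.
    return "assess-transcribe-or-replace"


print(scorm_pathway(ScormCourse("safety-101", True, True, True)))
print(scorm_pathway(ScormCourse("legacy-2015", True, False, False)))
```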
Risk-based prioritisation: the four-tier framework
The four-tier prioritisation framework sequences content by compliance risk, not by production date, subject matter, or course catalogue organisation. The fundamental principle: any content currently being served to learners who may have hearing disabilities creates an active, ongoing compliance exposure. Archive content that no learner has accessed in two years creates a theoretical compliance exposure. Remediation resources should always flow toward active exposure first.
Tier 1 — Caption within 60 days
Tier 1 content carries the highest and most immediate compliance risk. Any delay in captioning it prolongs an active compliance exposure, and the content should be treated as out of compliance from the date the ADA obligation first applied.
Content that triggered or is related to an active complaint or OCR/DOJ investigation. If a learner has filed a complaint or an investigation is underway, all content identified in the complaint must move to the top of the Tier 1 queue regardless of other prioritisation criteria. The compliance exposure here is not merely ongoing — it is acutely elevated because regulators are watching.
Mandatory compliance training assigned to all employees. Anti-harassment, workplace safety, privacy and data security, and similar mandatory modules that all employees are required to complete are Tier 1 for two reasons: they are assigned to the full workforce (maximising the probability that a learner with a hearing disability has encountered them) and they are legally mandated by employment law, making a caption failure on these modules a compounded compliance issue — both ADA non-compliance and potentially a failure to provide required employment training.
New-hire onboarding content. As covered in the employee onboarding captioning playbook, ADA Title I obligations begin on day one of employment with no grace period. Onboarding content in the HRIS portal, LMS onboarding track, and company intranet must all be captioned before a new hire with a hearing disability joins the organisation. If any of these systems contain uncaptioned onboarding content, that content is Tier 1.
Content currently in active enrollment. Any course currently assigned to learners — meaning it appears on a learner's required or enrolled course list — is Tier 1. A video that is technically in the catalogue but has zero active enrollments in the past 12 months is not actively exposing learners to uncaptioned content; a video with 50 currently-enrolled learners is.
Content with the highest historical completion volume. The top 20% of courses by completion count in the past 12 months represent the highest-density audience exposure. These courses have served more learners than any other content; if any of them have inadequate captions, the exposure has already been widely realised. Captioning these courses cannot undo prior exposure, but it prevents continuation.
Tier 2 — Caption within 6 months
Role-specific compliance content for regulated functions. Healthcare HIPAA training, financial services regulatory training, safety-sensitive job function training (OSHA-mandated programmes, DOT certification training), and professional licensing continuing education content are all Tier 2. These are not assigned to all employees but are mandatory for the employees to whom they are assigned, and those employee groups may include individuals with hearing disabilities.
Management and leadership development content. Content assigned to supervisory and managerial levels has a secondary compliance dimension: managers who receive uncaptioned training on how to conduct performance reviews, deliver feedback, or manage accommodation requests may do so with an incomplete understanding of the subject matter — including accommodation management itself. Leadership development content is not as time-critical as active mandatory compliance training but is materially higher priority than general catalogue archive content.
Content associated with documented accommodation requests. If any employee has formally requested a captioning accommodation and specified content categories or courses, those content items become Tier 1 for that employee and Tier 2 programme-wide. The accommodation request elevates the priority for the specific content identified, and the team should flag similar content in the same category for Tier 2 processing.
Content used in performance evaluation or certification processes. Training programmes where completion and demonstrated competency are tied to performance evaluations, role certifications, or regulatory licence maintenance have an elevated priority because failure to access the content has downstream consequences beyond a single viewing experience. If an employee cannot access the content accurately, their performance evaluation or certification timeline is affected — which creates an ADA accommodation failure that compounds beyond the immediate caption issue.
Tier 3 — Caption within 12 months
General skills training and catalogue content in active use. Soft skills training, productivity tool tutorials, communication and writing modules, and similar content that is available in the catalogue and regularly accessed but not formally assigned or tied to compliance obligations falls into Tier 3. The compliance exposure is lower than Tier 1–2 but the content is still being served to learners and should be captioned within the programme year.
Archived content still accessible to employees. Historical training content that is no longer actively promoted but remains in the catalogue and accessible via search or direct URL is Tier 3. A learner who finds and accesses this content has the same access entitlement as a learner who accesses active content. The difference in risk from Tier 1 is the lower probability of discovery and the lower frequency of access.
Recently-produced content not yet enrolled. Content that has been produced but not yet deployed to learner-accessible courses is a lower-priority remediation target than content already in active enrollment, but should be captioned before deployment rather than after to prevent it from entering the Tier 1 category uncaptioned.
Tier 4 — Assess and decide before captioning
Tier 4 content requires an assessment step before any captioning work is ordered. The assessment determines whether captioning is the right action or whether retirement, replacement, or flagging is more appropriate.
Content older than 7–10 years with minimal recent access. A 2014 compliance module that has had fewer than 10 access events in the past three years may be superseded by updated content, cover regulatory frameworks that have since changed, or feature production quality that the organisation would not approve for current learners. The decision is: retire it, replace it, or caption it. Captioning it without a review of currency and quality wastes remediation budget on content that may be removed from the catalogue in the next content audit.
Content in technically obsolete formats. Flash-based eLearning that no longer renders in modern browsers, SCORM 1.2 packages with codec issues, or video in formats not supported by the current LMS player represents a category where captioning would require full technical remediation of the format issue first — and at that point, reconstruction from source or replacement with modern equivalents may be more cost-effective than captioning the original.
Content with unresolvable source file issues. SCORM courses for which the original .story or .cptx project files cannot be located and which contain narration that would require expensive voice reconstruction to caption accurately. The cost of manual transcription from audio, combined with any re-engineering of the SCORM package to accept the captions, may exceed the cost of replacing the course with a re-authored version using modern captioning-native workflows.
The Tier 4 assessment saves remediation budget by preventing the programme from investing $10,000 in captioning content that will be retired in the next annual content review. Build the Tier 4 assessment into the programme management calendar — a content review pass that runs in parallel with Tier 3 captioning so the Tier 4 decision is made before the budget line runs out.
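The four tiers can be applied to an inventory as a rule cascade. The sketch below is one possible encoding, not a definitive rule set: the field names are hypothetical, the age cut-off of 7 years and the 10-access floor are taken from the low end of the ranges in the Tier 4 description, and checking Tier 4 flags before Tier 2 criteria is an assumption about precedence that the framework does not state.

```python
from dataclasses import dataclass


@dataclass
class ContentItem:
    title: str
    under_complaint: bool = False
    mandatory_all_staff: bool = False
    onboarding: bool = False
    active_enrollments: int = 0
    top_20pct_completions: bool = False
    role_specific_compliance: bool = False
    leadership_development: bool = False
    accommodation_requested: bool = False
    tied_to_certification: bool = False
    age_years: float = 0.0
    accesses_last_3_years: int = 0
    obsolete_format: bool = False
    source_unrecoverable: bool = False


def assign_tier(item: ContentItem) -> int:
    # Tier 1: active exposure always wins, whatever else is true of the item.
    if (item.under_complaint or item.mandatory_all_staff or item.onboarding
            or item.active_enrollments > 0 or item.top_20pct_completions):
        return 1
    # Tier 4: assess before spending captioning budget.
    stale = item.age_years >= 7 and item.accesses_last_3_years < 10
    if item.obsolete_format or item.source_unrecoverable or stale:
        return 4
    # Tier 2: accommodation requests are Tier 2 programme-wide; the
    # requesting employee's own items should be escalated manually.
    if (item.role_specific_compliance or item.leadership_development
            or item.accommodation_requested or item.tied_to_certification):
        return 2
    return 3  # general catalogue, accessible archive, not-yet-enrolled content


print(assign_tier(ContentItem("Fire safety", active_enrollments=50)))
print(assign_tier(ContentItem("2014 module", age_years=11, accesses_last_3_years=4)))
```

Keeping the rules in one function makes the prioritisation rationale itself part of the evidence trail: the version of the rules in force on a given date can be stored alongside the tier assignments it produced.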
Batch economics: the full cost model
Caption remediation has three cost components that must all be included in the budget model: vendor production costs, internal review labour costs, and LMS administration costs. Most budget conversations focus exclusively on the first; the second and third are often the actual constraints.
Vendor production costs
Per-minute pricing for batch captioning varies significantly by service tier:
AI-only ASR (no human review): $0.10–$0.30/minute. The apparent cost leader, but only for content where AI accuracy is already high. On general narrative content with clear audio and no technical vocabulary, AI-only ASR can approach 92–95% accuracy, still below the 99% WCAG threshold but requiring only light correction. On technical training content (medical, legal, engineering, compliance), AI-only ASR achieves 75–85% accuracy on specialised vocabulary and requires substantial correction. Without a domain glossary, the "$0.15/minute" AI-only option requires 0.5–1.0× real-time human correction, which at an internal team rate of $35–$60/hour adds roughly $0.29–$1.00 per minute of video. The resulting all-in cost of about $0.44–$1.15 per minute can exceed the human-review option from the vendor.
AI + human review (standard turnaround): $0.50–$1.00/minute. This is the standard enterprise captioning service level. A captioning vendor receives the audio, runs it through ASR, and a human reviewer corrects errors before delivery. Turnaround time is typically 3–5 business days. For large-scale remediation with a domain-specific glossary loaded at the vendor (or handled pre-submission by the organisation), AI accuracy improves and human review time decreases — pushing the effective rate toward the lower end of this range. The hidden FTE cost analysis shows why glossary-assisted ASR before the review step materially reduces total programme cost at scale.
Human transcription from scratch: $1.50–$4.00/minute. Appropriate for audio quality so poor that ASR produces unusable output: significant background noise, heavy accents not well-represented in training data, technical terminology density so high that ASR accuracy drops below 60%. At large scale, this tier is used for narrow exception categories (clinical simulation recordings with multiple overlapping speakers, for example), not as the default approach.
Rush (24-hour turnaround): 2–3× the standard rate at any service tier. Rush should not be part of the large-scale remediation batch pricing; it is reserved for specific Tier 1 items where a learner accommodation request is pending and the standard turnaround timeline is unacceptable. Plan around this: if a learner with a hearing disability joins the organisation and needs onboarding content captioned, the budget for a limited volume of rush turnaround is worth having pre-approved.
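To compare service tiers on a like-for-like basis, the internal correction labour can be folded into a per-minute figure. A minimal sketch, assuming correction effort is expressed as a multiple of real time and using the $0.15/minute AI-only rate and the $35–$60/hour internal rates quoted above; the function name is hypothetical.

```python
def effective_cost_per_minute(vendor_rate: float,
                              correction_ratio: float,
                              reviewer_hourly_rate: float) -> float:
    """Vendor rate plus internal correction labour, per minute of video.

    correction_ratio is correction time as a multiple of real time,
    so 0.5 means 30 seconds of correction per minute of video.
    """
    correction_cost = correction_ratio * reviewer_hourly_rate / 60
    return vendor_rate + correction_cost


low = effective_cost_per_minute(0.15, 0.5, 35)
high = effective_cost_per_minute(0.15, 1.0, 60)
print(f"AI-only all-in: ${low:.2f} to ${high:.2f} per minute")
print("AI + human review from vendor: $0.50 to $1.00 per minute")
```

At the upper end of the correction range, the AI-only route costs more per minute than the top of the vendor's AI + human review range, before any internal QA on the vendor-reviewed files is counted.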
Cost model for library sizes
At the midpoint AI + human review rate of $0.75/minute:
- 2,000-hour library: $90,000
- 5,000-hour library: $225,000
- 10,000-hour library: $450,000
- 20,000-hour library: $900,000
- 50,000-hour library: $2,250,000
These are total project costs across the full backlog. For the prioritisation framework, what matters is the cost by tier. If Tier 1 content represents 10% of the library (200–2,000 hours in a 2,000–20,000 hour backlog), the year-one budget required for Tier 1 compliance is $9,000–$90,000. The multi-year programme budget covers Tier 2–3 sequentially. For the budget justification conversation with Finance or senior leadership, the ROI framing for finance executives provides the expected-value model for compliance cost vs remediation investment.
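The library-size figures and the Tier 1 share follow from one formula: hours × 60 × rate per minute. The sketch below reproduces them, assuming the $0.75/minute midpoint rate and the illustrative 10% Tier 1 share used above; both are parameters rather than fixed values.

```python
def production_cost(hours: float, rate_per_minute: float = 0.75) -> float:
    """Vendor production cost in dollars for a given number of video hours."""
    return hours * 60 * rate_per_minute


TIER1_SHARE = 0.10  # illustrative share from the text, not a measured value

print(f"{'Library (h)':>12} {'Full backlog':>14} {'Tier 1 (10%)':>14}")
for hours in (2_000, 5_000, 10_000, 20_000, 50_000):
    full = production_cost(hours)
    tier1 = production_cost(hours * TIER1_SHARE)
    print(f"{hours:>12,} {full:>14,.0f} {tier1:>14,.0f}")
```

Replacing the 10% share with the actual Tier 1 hours from the completed inventory turns this into the year-one budget request.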
Internal review labour costs
The internal review step is the component most commonly omitted from budget models. After a vendor delivers a caption file, an internal L&D team member or subject-matter expert must review the output for accuracy before it is published to learners. The review time depends on ASR input quality:
- With glossary-assisted ASR (pre-loaded domain vocabulary): 0.1–0.2× real time per minute of video. A 10-minute video takes 1–2 minutes to review.
- Standard ASR on general training content (no technical vocabulary): 0.2–0.4× real time. A 10-minute video takes 2–4 minutes to review.
- Standard ASR on technical training content (significant terminology): 0.5–1.5× real time. A 10-minute video takes 5–15 minutes to review.
- AI-only ASR without domain glossary on high-density technical content: 1.5–3.0× real time. A 10-minute video takes 15–30 minutes to review — comparable to captioning the video from scratch internally.
For a 10,000-hour backlog with mixed content (50% technical, 50% general), at a blended 0.5× review rate and an internal reviewer rate of $45/hour, the internal review step takes 5,000 reviewer hours and costs approximately $225,000, or 50% of the vendor production cost at $0.75/minute. Even at a glossary-assisted blended rate of 0.3×, the cost is approximately $135,000, or 30% of the vendor production cost. This is real budget that must be allocated, not merely a time estimate. The accessibility coordinator playbook covers how to structure the review function and whether to use internal subject-matter experts, a dedicated accessibility coordinator, or partial outsourcing to the captioning vendor for the review pass.
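The review labour estimate is sensitive to the blended review rate, so it is worth computing for more than one value. A minimal sketch, assuming the $45/hour reviewer rate and $0.75/minute vendor rate used in this section (the function name is illustrative):

```python
def review_labour(video_hours: float, review_rate: float,
                  reviewer_hourly_rate: float = 45.0) -> tuple[float, float]:
    """Return (reviewer_hours, cost_in_dollars).

    review_rate is reviewer time as a multiple of video run time
    (0.5 means 30 minutes of review per hour of video).
    """
    reviewer_hours = video_hours * review_rate
    return reviewer_hours, reviewer_hours * reviewer_hourly_rate

VENDOR_COST = 10_000 * 60 * 0.75  # 10,000 hours at $0.75/minute

# Compare review labour against vendor cost at several blended rates.
for rate in (0.3, 0.5, 1.0):
    hours, cost = review_labour(10_000, rate)
    print(f"{rate:.1f}x review rate: {hours:,.0f} reviewer hours, "
          f"${cost:,.0f} ({cost / VENDOR_COST:.0%} of vendor cost)")
```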
LMS administration costs
Caption files delivered by a vendor are not ready for learners until they are uploaded to the LMS and associated with the correct video asset. In most LMS platforms, this is a manual, per-video operation: navigate to the video in the course editor, open the caption upload interface, upload the SRT file, save. At 2–3 minutes per video, the LMS administration time for a batch of 200 videos is 400–600 minutes — 7–10 hours of LMS administration work per batch. For a 10,000-hour backlog with an average video length of 10 minutes (roughly 60,000 individual video assets), the LMS administration time at 2.5 minutes per video is 2,500 hours — more than a full year of full-time work.
This calculation argues strongly for LMS API-based batch ingestion where available (covered in the per-platform workflows section below) and for pre-batch organisation of the delivery files to match the LMS asset structure. Vendors that deliver captions with filenames that match the LMS video asset IDs remove the manual matching step from the ingestion workflow. Build the naming convention requirement into the vendor contract from the start of the programme.
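A naming convention only pays off if each delivery is checked against it before ingestion starts. The sketch below assumes the simplest possible convention, `<asset_id>.srt`, and a known list of asset IDs for the batch; both are assumptions for illustration, and the function name is hypothetical.

```python
from pathlib import PurePath

def match_delivery(asset_ids, delivered_files):
    """Pair delivered SRT files with LMS asset IDs by filename stem.

    Returns (matched, missing, unmatched):
      matched   - {asset_id: filename} ready for ingestion
      missing   - asset IDs with no delivered caption file
      unmatched - SRT files whose name matches no known asset ID
    """
    wanted = set(asset_ids)
    # Ignore anything that is not an SRT file (manifests, notes, etc.).
    by_stem = {PurePath(name).stem: name
               for name in delivered_files
               if name.lower().endswith(".srt")}
    matched = {stem: name for stem, name in by_stem.items() if stem in wanted}
    missing = sorted(wanted - set(matched))
    unmatched = sorted(name for stem, name in by_stem.items() if stem not in wanted)
    return matched, missing, unmatched
```

Files in `unmatched` are the ones that would otherwise need manual matching; a non-empty `missing` list is a delivery gap to raise with the vendor.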
Vendor throughput math: projecting calendar time
Projecting the calendar timeline for large-scale remediation requires modelling three constraints simultaneously: vendor throughput, internal review capacity, and LMS ingestion capacity. The constraint that binds first determines the programme timeline.
Vendor throughput
Enterprise captioning vendors (3Play Media, Verbit, Rev Enterprise, Cielo24) have maximum throughput rates of approximately 500–2,000 hours of captioned content per month. The effective throughput is lower for two reasons. First, vendor capacity is not dedicated: large backlogs must be queued and delivered in batches. Second, peak demand periods (semester starts for university clients, annual compliance renewal cycles for enterprise clients) can compress available capacity. For planning purposes, a conservative estimate is 500–700 hours/month effective throughput from a single vendor.
At 500 hours/month from a single vendor, the timeline to caption a 10,000-hour backlog (assuming all content can be delivered to the vendor, which requires pre-processing for audio extraction) is 20 months. To compress this to 10 months requires either using two concurrent vendors (common for university-scale backlogs) or upgrading to a vendor's highest-throughput tier, which typically requires a minimum volume commitment.
Internal review capacity as the binding constraint
For most organisations, internal review capacity, not vendor throughput, is the binding constraint. One dedicated reviewer has approximately 168 working hours per month (8 hours/day × 21 working days). At a 0.3× average review rate, each hour of video takes 0.3 hours to review (a 10-minute video takes 3 minutes), so the ceiling for one full-time reviewer is 168 ÷ 0.3 ≈ 560 hours of captioned content per month. In practice, depending on content mix, one dedicated reviewer at full-time allocation can support approximately 400–600 hours/month of content at the described review rates. For comparison, 170 hours/month of video (1,020 ten-minute videos × 3 minutes = 3,060 minutes, or 51 hours of review time) requires only about 0.3 FTE. Two reviewers double the capacity. Vendor throughput above the internal review capacity creates a backlog of delivered captions awaiting review: captions arrive from the vendor faster than they can be reviewed and published.
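The reviewer-capacity arithmetic can be run in both directions: how much video a given review team can clear, and how much reviewer time a given vendor delivery rate demands. A minimal sketch, assuming 8-hour days and 21 working days per month (function names are illustrative):

```python
REVIEWER_HOURS_PER_MONTH = 8 * 21  # 168 working hours for one full-time reviewer

def video_hours_supported(reviewer_fte: float, review_rate: float) -> float:
    """Hours of captioned video a review team can clear per month."""
    return reviewer_fte * REVIEWER_HOURS_PER_MONTH / review_rate

def reviewer_fte_required(video_hours_per_month: float, review_rate: float) -> float:
    """Reviewer FTE needed to keep pace with a given monthly vendor delivery."""
    return video_hours_per_month * review_rate / REVIEWER_HOURS_PER_MONTH

print(round(video_hours_supported(1.0, 0.3)))     # ceiling for one full-time reviewer
print(round(reviewer_fte_required(170, 0.3), 2))  # FTE needed for 170 hours/month
print(round(reviewer_fte_required(500, 0.3), 2))  # FTE needed to match 500 hours/month
```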
The solution for organisations with limited review capacity is to reduce the per-video review burden through glossary pre-loading and batch QA sampling rather than per-video review. The caption QA methodology covers the DCMP spot-check protocol that allows statistical quality validation across batches (reviewing 10% of videos to the DCMP standard and accepting or rejecting the batch based on the sample) rather than reviewing every video individually. Batch QA sampling reduces review time per video from 3–15 minutes to approximately 0.3–0.5 minutes per video for the 90% not in the sample, dramatically expanding the volume that a single reviewer can support.
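Batch sampling has two mechanical parts: drawing the sample and applying an accept/reject rule to the reviewed sample. The sketch below assumes a 10% random sample and a configurable failure threshold; the threshold is an assumption for illustration, not part of the DCMP protocol as described here, and the function names are hypothetical.

```python
import random

def draw_qa_sample(video_ids, sample_percent=10, seed=None):
    """Pick a random sample of a batch for full review (at least one video)."""
    ids = list(video_ids)
    if not ids:
        return []
    # Integer ceiling avoids float rounding surprises on small batches.
    size = max(1, (len(ids) * sample_percent + 99) // 100)
    return random.Random(seed).sample(ids, size)

def batch_accepted(sample_results, max_failures=0):
    """Decide a batch from its reviewed sample.

    sample_results maps video_id -> True if the video met the accuracy
    standard. Returns (accepted, failed_ids).
    """
    failures = [vid for vid, passed in sample_results.items() if not passed]
    return len(failures) <= max_failures, failures
```

Recording the `seed` used for each batch makes the sample reproducible, which helps if the QA decision is later questioned.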
LMS ingestion as the latent bottleneck
The LMS ingestion bottleneck is the constraint that surprises most remediation programmes. A batch of 200 videos delivered by a vendor with 100% on-time delivery and 100% passing QA still requires 7–10 hours of LMS administration work before any learner sees the new captions. When this bottleneck is not planned for, captions accumulate in a delivery queue at the LMS administration step while the compliance evidence trail shows them as "vendor-complete" but not yet deployed. A complaint received during this window will find content that is vendor-captioned but not yet learner-accessible, and for compliance purposes that content is still uncaptioned.
Planning for the LMS ingestion bottleneck: schedule the LMS administration step as a defined programme workstream with allocated time, not as a background task to be completed when convenient. For large-scale programmes, evaluate LMS API availability for batch caption upload (covered in the platform workflows section). For platforms without API support, evaluate whether the LMS vendor's professional services team can assist with bulk ingestion as a one-time engagement.
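The three constraints can be compared by converting each one to hours of video processed per month; the smallest figure sets the programme timeline. This is a simplified steady-state sketch with illustrative parameter names, assuming a 10-minute average video and 2.5 minutes of manual upload time per video.

```python
def programme_timeline(backlog_hours, vendor_hours_per_month, reviewer_fte,
                       review_rate, lms_admin_hours_per_month,
                       avg_video_minutes=10, upload_minutes_per_video=2.5,
                       reviewer_hours_per_month=168):
    """Return (binding_constraint, months_to_complete, capacities).

    Every capacity is expressed in hours of video processed per month.
    """
    videos_uploaded = lms_admin_hours_per_month * 60 / upload_minutes_per_video
    capacities = {
        "vendor": vendor_hours_per_month,
        "review": reviewer_fte * reviewer_hours_per_month / review_rate,
        "lms_ingestion": videos_uploaded * avg_video_minutes / 60,
    }
    # The lowest monthly capacity is the binding constraint.
    binding = min(capacities, key=capacities.get)
    return binding, backlog_hours / capacities[binding], capacities

binding, months, capacities = programme_timeline(
    backlog_hours=10_000, vendor_hours_per_month=500,
    reviewer_fte=1.0, review_rate=0.3, lms_admin_hours_per_month=40)
print(binding, round(months, 1), capacities)
```

In this example, 40 hours/month of manual upload time supports only 160 hours of video per month, so ingestion rather than the vendor sets the pace.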
The 12-month phased remediation approach
The following 12-month framework is calibrated for a large-scale remediation programme (5,000–20,000 hour backlog) with a single captioning vendor, one dedicated reviewer (or equivalent review capacity), and standard LMS administration support. Larger programmes (20,000+ hours) should hold to the same per-phase timelines by adding vendor and reviewer capacity rather than extending the calendar timeline.
Month 1–2: Inventory completion, prioritisation, and vendor activation
Before the first batch is submitted, three things must be in place: a complete inventory, a documented prioritisation, and an active vendor relationship. If the LMS caption audit is not yet complete, month 1 is dedicated to completing it. The inventory output must include: total hours by system (LMS, video host, SharePoint, etc.), content type (video, SCORM, embedded), caption status (none, auto-generated unreviewed, reviewed compliant), and tier assignment per the four-tier framework above.
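One way to make the inventory output concrete is a per-asset record carrying the four required attributes plus a duration, from which hours per tier can be totalled. The class and field names below are hypothetical; the caption status values are the three listed above.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum

class CaptionStatus(Enum):
    NONE = "none"
    AUTO_UNREVIEWED = "auto-generated unreviewed"
    REVIEWED_COMPLIANT = "reviewed compliant"

@dataclass(frozen=True)
class InventoryItem:
    asset_id: str
    system: str            # e.g. "LMS", "video host", "SharePoint"
    content_type: str      # "video", "SCORM", or "embedded"
    caption_status: CaptionStatus
    tier: int              # 1-4 per the four-tier framework
    duration_minutes: float

def backlog_hours_by_tier(items):
    """Total hours still needing remediation, grouped by tier."""
    totals = defaultdict(float)
    for item in items:
        # Only reviewed, compliant captions count as done.
        if item.caption_status is not CaptionStatus.REVIEWED_COMPLIANT:
            totals[item.tier] += item.duration_minutes / 60
    return dict(totals)
```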
The vendor activation step includes issuing or activating a batch captioning contract, loading the organisational glossary at the vendor, establishing the file naming convention for delivered SRT files, and agreeing on the batch submission and delivery cadence (weekly or biweekly batches are the standard for large-scale programmes). Loading the glossary is the single highest-ROI action in the programme setup: glossary-biased ASR at the vendor stage dramatically reduces the internal review burden and compresses the timeline more than any other single factor.
Month 2 deliverables: prioritised tier list with estimated hours per tier, vendor contract active with glossary loaded, first Tier 1 batch submitted by end of month 2.
Month 3–4: Tier 1 sprint
The Tier 1 sprint is the highest-intensity phase of the programme. All Tier 1 content must be captioned and deployed by the end of month 4 at the latest. Given that Tier 1 content is often 10–20% of the library (500–2,000 hours for a 5,000–10,000 hour programme), the vendor and review capacity must be fully committed to this sprint.
Key Tier 1 sprint requirements: highest QA intensity (per-video review rather than batch sampling for the content most likely to be scrutinised in a compliance audit), immediate LMS deployment of reviewed files (no accumulation in an ingestion queue — files reviewed today are uploaded today), and compliance evidence documentation for each completed item (course ID, caption file version, review sign-off date, deployment date).
Month 4 deliverables: Tier 1 content 100% captioned and deployed, compliance evidence package for Tier 1 completed, Tier 2 batch submission cadence established, LMS ingestion workflow systematised.
Month 5–8: Tier 2 sustained delivery
Tier 2 delivery uses a steady-state batch cadence: biweekly vendor submissions of 50–100 hours each, batch QA sampling protocol (10% DCMP spot-check), weekly LMS ingestion of reviewed batches. The programme manager tracks the batch completion rate against the Tier 2 milestone (100% of Tier 2 content deployed by end of month 8).
The budget monitoring function becomes important in this phase. Track spend-to-date against the remaining Tier 2–3 budget to identify whether the programme is pacing correctly for year-one budget consumption. If Tier 2 hours were underestimated in the prioritisation phase (a common discovery when the inventory uncovers content that was not in the initial estimate), the programme manager must either request additional year-one budget or carry Tier 2 content into year two.
Month 8 deliverables: Tier 2 content 100% captioned and deployed (or a documented revised timeline if scope changed), monthly progress reports for months 5–8 in the compliance evidence package, Tier 3 batch submission beginning.
Month 9–12: Tier 3 sustained delivery and Tier 4 assessment
Tier 3 delivery continues the batch cadence from month 5–8. Tier 4 assessment runs in parallel: a content review pass that applies the retirement/replacement/caption decision to all Tier 4 content. Tier 4 content that is retired is documented as retired (not captioned) in the compliance evidence trail with a rationale (superseded by current content, technically obsolete, below minimum access threshold). Tier 4 content that is captioned moves into the Tier 3 batch queue. Tier 4 content flagged for replacement is logged as a future programme item.
Month 12 deliverables: Tier 3 batch delivery in progress (may extend into year two for large libraries), Tier 4 assessment complete, new-content gate implemented (all new video produced after programme close date is captioned before deployment), year-end compliance evidence package, transition plan for ongoing new-content captioning programme.
Platform-specific batch caption workflows
Each LMS platform has a different architecture for video hosting and caption management, which determines the batch ingestion workflow. The following covers the nine most common platforms in L&D environments. For platforms not listed, the general principle applies: if the platform has a public API, investigate the caption upload endpoint; if not, the manual per-video upload interface is the ingestion method.
Panopto
Panopto is the most common video platform for university lecture capture and is increasingly used in corporate L&D for recorded training sessions. See the Panopto captioning guide for the complete workflow; the batch remediation specifics:
Panopto does not have a native bulk video export interface, but the Panopto REST API (v1) supports programmatic enumeration and download of all recordings in a folder hierarchy. For batch remediation: use GET /api/v1/videos to enumerate all recordings, filter by caption availability using the sessions.hasCaptions field, download the audio track for captioning (the .m4a or extracted audio is sufficient for batch ASR), and after caption production, upload SRT files via POST to the caption endpoint (/api/v1/sessions/{sessionId}/captions) in the Panopto API.
A critical Panopto constraint for remediation programmes: Panopto's built-in automatic speech recognition generates captions automatically for new recordings when the feature is enabled at the folder level, but it does not retroactively apply to historical recordings in existing folders. Enabling auto-captions in a Panopto folder does not resolve the backlog — it only affects new recordings added to that folder after the setting is enabled. The remediation programme must explicitly process historical recordings through the batch API workflow.
Panopto caption status in bulk: the Panopto API does not currently provide a single endpoint to list all sessions with caption status across the entire deployment. Enumerating caption status requires iterating through all folder hierarchies, which at university scale (thousands of courses with hundreds of recordings each) requires a scripted enumeration process or a Panopto professional services engagement. Plan for this inventory step to take 1–2 days of engineering time at university scale.
Kaltura
Kaltura has the most comprehensive batch caption management API of any LMS-integrated video platform. The Kaltura captioning workflow covers the full integration options; the batch remediation specifics:
Kaltura MediaSpace and KMC (Kaltura Management Console) both support bulk operations. The content audit view in KMC (Content menu → Entries) can be filtered by caption availability, and the filtered list can be exported as a CSV. For batch caption upload: the Kaltura API (captionAsset.add to create a caption asset, followed by captionAsset.setContent to upload the SRT data) supports programmatic upload for any number of entries. The entry ID in Kaltura (format: 1_xxxxxxxx) serves as the natural file naming convention for delivered SRT files, which means vendor delivery against entry-ID-named files maps directly to the batch upload script.
Kaltura also has native integrations with enterprise captioning vendors (3Play Media, Verbit, Cielo24). These integrations, when configured in KMC, allow content to be submitted directly from the KMC interface to the vendor's captioning queue without exporting and re-importing. For large-scale programmes, this reduces the submission overhead significantly. The integration delivers completed captions back to Kaltura automatically, eliminating the manual SRT upload step. For Kaltura-hosted environments undertaking large-scale remediation, activating one of these vendor integrations at the start of the programme is worth the setup cost.
The LMS migration caption checklist includes Kaltura-specific notes on caption data portability — relevant if the remediation programme coincides with a Kaltura version migration or a platform consolidation from multiple Kaltura instances.
Cornerstone OnDemand
Cornerstone OnDemand is common in large enterprise and healthcare L&D environments. The Cornerstone captions guide covers the standard workflow; the batch remediation specifics:
Cornerstone does not have a public API endpoint for bulk caption operations in the standard OnDemand product. Caption upload in Cornerstone is done through the Content Management interface, per video asset: navigate to the course, open the asset editor, upload the SRT file for the video segment. For large-scale remediation, this is a significant bottleneck — there is no scriptable shortcut in the standard product.
Cornerstone Content Connector: Cornerstone's enterprise integration layer includes a Content Connector API that supports certain content management operations. If the organisation has a Cornerstone Content Connector configuration (typically part of an enterprise contract), it may support programmatic caption upload. Verify availability with Cornerstone's technical account management before concluding that per-video manual upload is the only option.
SCORM content in Cornerstone: as noted in the inventory section, SCORM packages hosted in Cornerstone embed caption data at authoring tool publish time. The Cornerstone LMS cannot add or modify captions inside a SCORM package post-deployment. The remediation pathway for SCORM content in Cornerstone is: obtain the original authoring tool project file, add captions in the authoring tool, republish the SCORM package, and re-upload to Cornerstone as a replacement course. For organisations without the original project files, the alternative is to convert the course to a raw video format (screen record the SCORM presentation), apply standard video captioning, and deploy the video as a replacement course unit. This is expensive and time-consuming; flag all SCORM content in Cornerstone for Tier 4 assessment before budgeting for its remediation.
Docebo
Docebo is common in mid-market enterprise and customer education environments. The Docebo captioning guide covers the standard workflow; the batch specifics:
Docebo has a REST API (v2) that supports course content enumeration. Enumerating all course assets for caption status assessment: GET /api/v2/learn/course-content returns course-level content metadata. Docebo does not expose a caption upload endpoint in the standard v2 API — caption files are uploaded through the course editor interface, per video asset. For large-scale remediation in Docebo, the per-video manual upload pathway is the standard route.
Docebo's native video hosting integrates with Vimeo at the enterprise tier. For organisations using Docebo + Vimeo as the video hosting layer, caption management at the Vimeo level automatically propagates to Docebo players. This is the preferred architecture for large-scale remediation: batch-process captions in Vimeo (where the API supports programmatic upload), and Docebo inherits the captions without a separate ingestion step. The Docebo + Vimeo architecture reduces the LMS ingestion burden from the Docebo side to zero for Vimeo-hosted content.
TalentLMS
TalentLMS is common in SMB and mid-market environments. See the TalentLMS captioning guide for details. The batch specifics:
TalentLMS has a REST API but limited bulk content management capability for caption files. Per-video SRT upload through the course editor is the standard method. TalentLMS's video hosting architecture matters here: organisations that host video natively in TalentLMS (direct MP4 upload to course units) must use the per-video interface for each SRT upload. Organisations that embed video from Vimeo, YouTube, or another platform with its own API can use the video host's bulk upload capability instead, with TalentLMS inheriting the captions through the embed. For large-scale remediation in TalentLMS, the architectural approach — whether to manage captions at the TalentLMS layer or at the video host layer — is the most consequential efficiency decision in the programme design.
Workday Learning
Workday Learning is deployed in large enterprise environments, often as part of a broader Workday HCM suite. The Workday Learning captioning guide covers the full workflow. The batch specifics:
Workday Learning has strict video hosting requirements and limited public API exposure for content management operations. Caption management in Workday Learning is done through the Workday media library interface — per video, through the standard Workday administrative interface. Workday does not provide a public bulk caption upload API in the standard product. For organisations undertaking large-scale remediation in Workday Learning, the per-video manual upload pathway is the realistic option. The implication for throughput planning: allocate more LMS administration time per video for Workday than for API-capable platforms.
Workday Learning + external video hosting: some Workday Learning deployments embed video from Kaltura or Vimeo rather than using Workday's native video hosting. In those configurations, caption management at the video host level (Kaltura API or Vimeo API) is the efficient path, and Workday inherits the captions through the embed. Verify the video hosting architecture before allocating LMS administration time for the remediation programme.
Moodle
Moodle is the dominant open-source LMS in higher education and is used in some enterprise environments. The batch specifics are architecture-dependent:
Moodle with native file activities: Video uploaded directly to Moodle course units as File activities has no native caption management interface — the video file is served directly without a caption layer. For these assets, captioning requires either converting the course unit to a different activity type (Video Resource or HTML page with embedded player) or adding the video to an external platform and embedding. For large-scale remediation of native-file-activity video in Moodle, the most efficient approach is migrating the content to a video platform (Kaltura, Panopto, or Vimeo) where caption management is available, and updating the Moodle course units to embed from the platform. This is a migration project as well as a captioning project — factor in the engineering time.
Moodle with Panopto integration: The most common configuration in higher education. Panopto recordings are embedded in Moodle course pages via the Panopto block or activity plugin. Caption management happens in Panopto; Moodle inherits the captions through the embed. The large-scale remediation programme is a Panopto batch programme, not a Moodle programme. The Panopto API workflow described above is the correct approach for this configuration.
Moodle with Kaltura integration (Kaltura Video Package for Moodle): Similar to the Panopto integration — caption management happens in Kaltura, Moodle inherits. Use the Kaltura API batch upload approach.
H5P video in Moodle: H5P video activities support SRT caption upload through the H5P editor interface — one SRT file per H5P video unit, uploaded through the edit form. No bulk API. For large-scale programmes with significant H5P content, the per-unit upload time must be factored into the LMS administration estimate.
Absorb LMS
Absorb LMS is common in mid-market enterprise and association management environments. Absorb has a REST API for content management, but caption upload for video assets is through the per-asset editor in the Absorb admin interface. The batch specifics mirror those of TalentLMS: the architecture of video hosting (native Absorb vs embedded from a video platform) determines whether the efficient path is through the video host API or through Absorb's per-video interface.
Absorb's SCORM handling is similar to Cornerstone — captions in SCORM packages are embedded at authoring tool publish time and cannot be modified by the Absorb LMS. The same SCORM remediation pathway (obtain source file, add captions in authoring tool, republish, re-upload) applies.
Brightspace (D2L)
Brightspace by D2L is common in higher education (particularly in Canada), in the United States K–12 system, and in some enterprise environments. Brightspace has a Valence API that supports content management operations. Caption upload via the Valence API: the /d2l/api/lp/{ver}/courses/{orgUnitId}/content/topics endpoint supports content topic management, and for hosted video topics, the caption association can be managed programmatically for supported media formats.
Brightspace Capture (D2L's video hosting layer): for video hosted in Brightspace Capture, captions are managed through the Capture interface with per-video SRT upload capability. Brightspace also has integrations with Kaltura and Panopto; organisations using those integrations manage captions at the video platform level. For large-scale remediation in a Brightspace environment with external video platform integration, follow the Kaltura or Panopto batch workflows described above.
Compliance evidence documentation during active remediation
A large-scale remediation programme that does not maintain contemporaneous compliance evidence is at significant risk if a regulatory complaint or OCR investigation arrives mid-programme. The evidence package must demonstrate three things: that the organisation knew the extent of the problem (inventory), that it had a systematic plan to address it (prioritisation and timeline), and that it was executing against the plan with measurable progress (batch completion records). Evidence assembled after a complaint is received is far less credible than a programme document trail that predates the complaint by months or years.
Required documentation components
Remediation plan document (version-controlled). A dated document (not a spreadsheet — a formal document) that records the inventory total, the four-tier prioritisation rationale, the milestone dates for each tier, the vendor selection and contract terms, and the budget allocation by year. Version-control this document so that any updates (scope changes, timeline revisions) are tracked with dates and rationale. The original version of this document, produced at programme inception, is the primary evidence of good-faith planning.
Monthly batch completion reports. A monthly report showing: number of videos submitted in the period, number of videos completed and reviewed, number deployed to the LMS, cumulative Tier 1/Tier 2/Tier 3 completion percentages, and any exceptions or timeline deviations from the plan with reasons. These reports are the evidence of systematic progress. Send them to the compliance and legal stakeholders on a monthly cadence — this creates a contemporaneous record with a distribution trail that demonstrates the programme was actively managed and reported to appropriate leadership.
Tier 1 completion certificate. At the completion of Tier 1 remediation, produce a formal completion document listing every course and content asset in Tier 1, the date it was captioned and deployed, the QA review sign-off, and the deployment confirmation. This document is the single most important evidence item in the package — it demonstrates that the highest-risk content was addressed first and that each item has a documented completion date.
Active-enrollment coverage metric. The most meaningful ongoing compliance metric is the active-enrollment coverage ratio: the percentage of content currently in active enrollment (assigned to learners in the current enrollment window) that has compliant captions. This metric should reach 100% at the end of Tier 1 remediation and stay at 100% through the new-content gate. Track and report this metric monthly. If it ever drops below 100% during the Tier 2–3 phase (because new content was added to active enrollment without captions), that content immediately becomes Tier 1 priority.
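The coverage ratio is straightforward to compute from the inventory once each asset carries an enrollment flag and a caption flag. The sketch below counts assets rather than hours, which is one reasonable reading of "percentage of content"; the dictionary keys are hypothetical field names.

```python
def active_enrollment_coverage(assets):
    """Return (coverage_percent, uncaptioned_ids) for assets in active enrollment.

    Each asset is a dict with 'asset_id', 'in_active_enrollment', and
    'has_compliant_captions' keys (hypothetical field names).
    """
    active = [a for a in assets if a["in_active_enrollment"]]
    if not active:
        return 100.0, []
    uncaptioned = [a["asset_id"] for a in active if not a["has_compliant_captions"]]
    coverage = 100.0 * (len(active) - len(uncaptioned)) / len(active)
    return coverage, uncaptioned
```

The returned `uncaptioned` list is the set of assets that should move to Tier 1 priority whenever coverage is below 100%.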
The compliance reporting framework covers the full dashboard structure for reporting programme progress to executive leadership. The key point for regulatory compliance evidence: the monthly reports and dashboard records must be retained as part of the programme documentation, not just as internal management tools. A compliance programme that has strong internal dashboard visibility but no retained historical records cannot demonstrate to a regulator that the programme was consistently managed.
Programme management: tracking, resourcing, and governance
A large-scale remediation programme requires programme management infrastructure that is qualitatively different from a small captioning programme managed through a shared spreadsheet. The following covers the minimum viable programme management structure for a 5,000–20,000 hour remediation.
Resourcing model
The three dedicated roles required for a large-scale programme:
Programme manager (accessibility coordinator or L&D operations lead). 0.5–1.0 FTE depending on programme scale. Responsible for batch submission scheduling, vendor relationship management, monthly progress reporting, and compliance evidence documentation. The accessibility coordinator playbook covers the role design in detail, including the distinction between the compliance management function (owns the programme) and the caption production function (executes the work).
Content reviewer(s). 0.25–1.0 FTE depending on review volume and content technical complexity. Responsible for QA review of vendor deliverables using the DCMP spot-check protocol. For programmes using batch sampling (10% of each batch reviewed to full DCMP standard), a single part-time reviewer can support a 400–600 hour/month programme. For programmes reviewing every video (appropriate during the Tier 1 sprint), 0.5–0.75 FTE is typically required for the sprint duration.
LMS administrator(s). The LMS ingestion bottleneck requires allocated administration time. Based on the throughput math above (2–3 minutes per video for standard manual upload), a programme processing 400 hours/month (approximately 2,400 videos at 10-minute average) requires 4,800–7,200 minutes of LMS administration time per month, or approximately 80–120 hours. That is roughly half to three-quarters of a full-time role, so it can sit within an existing LMS administrator role only if the time allocation is planned; it must be explicitly scheduled and not treated as a background task that will be done when convenient.
Batch tracking infrastructure
A programme management spreadsheet or project management tool should track the following fields for every batch submitted to the captioning vendor:
- Batch ID (assign a sequential batch number for internal reference)
- Submission date
- Vendor batch reference number
- Number of videos in batch
- Total minutes in batch
- Tier assignment (all Tier 1, mixed Tier 1–2, etc.)
- Expected delivery date (per vendor contract SLA)
- Actual delivery date
- QA sample size and pass/fail result
- Rework requested (count of videos returned for correction)
- LMS ingestion start date
- LMS ingestion completion date
- Compliance evidence documentation status (Tier 1 only: yes/no)
The rework field is particularly important: when a vendor batch fails QA sampling, the rework process adds 3–5 business days to the deployment timeline and affects the active-enrollment coverage metric if the content was in active enrollment. Track rework frequency by vendor to inform contract renewal decisions.
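The tracking fields map naturally onto a record type, and the rework rate by vendor falls out of a simple aggregation. This is a sketch with hypothetical names; the `vendor` field is an addition to the list above, needed only so that rework can be grouped by vendor.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class BatchRecord:
    batch_id: int
    vendor: str                      # assumed extra field for per-vendor reporting
    submission_date: date
    vendor_batch_ref: str
    video_count: int
    total_minutes: float
    tier_assignment: str
    expected_delivery: date
    actual_delivery: Optional[date] = None
    qa_sample_size: int = 0
    qa_passed: Optional[bool] = None
    rework_count: int = 0
    lms_ingestion_start: Optional[date] = None
    lms_ingestion_complete: Optional[date] = None
    evidence_documented: Optional[bool] = None  # Tier 1 only

def rework_rate_by_vendor(batches):
    """Share of submitted videos returned for correction, per vendor."""
    totals = {}
    for b in batches:
        reworked, submitted = totals.get(b.vendor, (0, 0))
        totals[b.vendor] = (reworked + b.rework_count, submitted + b.video_count)
    return {v: r / n for v, (r, n) in totals.items() if n}
```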
Vendor management at scale
A large-scale programme is a significant revenue relationship for captioning vendors, which gives the organisation more leverage than a small account would have. Use this leverage to negotiate: a dedicated account manager (not a shared support queue), a defined escalation path for quality issues, a volume commitment that activates the vendor's priority queue, and contractual terms that address the programme-specific requirements (SRT file naming conventions, glossary loading, DCMP accuracy standard, rework turnaround SLA). The caption vendor SLA and contract review checklist covers the 42-point contract review process relevant for this level of engagement.
For programmes using multiple concurrent vendors (enterprise scale), establish a lead vendor designation: one vendor receives the priority Tier 1 content and holds the account relationship; secondary vendors receive Tier 2–3 overflow. This prevents a situation where the same content appears in multiple vendor queues (a coordination failure that creates duplicate spend) and ensures that the highest-risk content always goes through the most vetted vendor relationship. The vendor transition playbook covers the glossary portability and accuracy-reset considerations that apply when working across multiple vendors simultaneously.
New-content gate implementation
The new-content gate — the requirement that no new video is published to learners without a compliant caption track — must be implemented as part of the remediation programme, not after it is complete. Implementing the gate mid-programme prevents the backlog from growing while it is being remediated. The gate takes different forms by system: an LMS course publication checklist that includes caption status verification, a video upload policy in the intranet that requires a caption file to be uploaded simultaneously with the video, and an authoring tool production workflow that includes caption production as a post-production step before SCORM publish. The change management rollout guide covers the content creator communication and training required to make a new-content gate function in practice across multiple teams.
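In systems where the publication step can be scripted, the gate reduces to a check that every video in a course has a compliant caption track. A minimal sketch, assuming caption status is recorded with the same three values used in the inventory (function names are hypothetical):

```python
def publication_blockers(course_videos):
    """Return the IDs of videos that should block course publication.

    course_videos maps video_id -> caption status string; only
    'reviewed compliant' passes the gate in this sketch.
    """
    return sorted(vid for vid, status in course_videos.items()
                  if status != "reviewed compliant")

def may_publish(course_videos):
    """True only when no video in the course is blocking publication."""
    return not publication_blockers(course_videos)
```

Returning the blocking video IDs, rather than a bare yes/no, gives content creators the specific list they need to fix before resubmitting.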
Eight failure modes in large-scale caption backlog remediation
1. Beginning with archive content instead of active-enrollment content
The most consequential sequencing failure: remediating the oldest, lowest-enrollment content first because it is easy to batch (similar format, similar content type) while active-enrollment content in multiple formats and systems waits in the Tier 2–3 queue. This failure mode typically occurs when programme ownership sits in an operations function focused on efficiency (batching similar content is operationally simpler) rather than in a compliance function focused on risk (active-enrollment content is the highest-risk regardless of format). The remediation programme's sequencing must be governed by tier assignment, not by operational convenience.
2. Not accounting for internal review capacity in throughput projections
The project plan shows vendor delivery completing in 12 months at a given throughput rate. The plan does not account for the fact that internal review capacity supports only half the vendor throughput rate. At month 6, a large queue of vendor-delivered but unreviewed content has accumulated, the LMS still shows the pre-remediation caption status, and the programme is behind on active-enrollment coverage even though the vendor has completed the work. Model all three constraints — vendor throughput, internal review capacity, and LMS ingestion time — before finalising the programme timeline.
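The three-constraint model can be sketched as a bottleneck calculation: the slowest stage sets the pace of the whole pipeline. The figures below (a 6,000-hour backlog, 500 vendor hours per month, 250 review hours per month, 400 ingestion hours per month) are hypothetical, chosen only to mirror the scenario above in which review capacity is half the vendor rate; the function names are likewise illustrative.

```python
import math

def effective_monthly_rate(vendor_rate, review_rate, ingest_rate):
    """The slowest stage limits how many hours reach learners each month."""
    return min(vendor_rate, review_rate, ingest_rate)

def months_to_complete(backlog_hours, vendor_rate, review_rate, ingest_rate):
    """Whole months needed when all three constraints are modelled."""
    rate = effective_monthly_rate(vendor_rate, review_rate, ingest_rate)
    if rate <= 0:
        raise ValueError("all stage rates must be positive")
    return math.ceil(backlog_hours / rate)

def unreviewed_queue_hours(vendor_rate, review_rate, months_elapsed):
    """Hours delivered by the vendor but still waiting for internal review."""
    return max(0, vendor_rate - review_rate) * months_elapsed

# Hypothetical figures, not taken from any real programme.
backlog = 6000
vendor, review, ingest = 500, 250, 400

vendor_only_months = math.ceil(backlog / vendor)
realistic_months = months_to_complete(backlog, vendor, review, ingest)
queue_at_month_6 = unreviewed_queue_hours(vendor, review, 6)

print(vendor_only_months, realistic_months, queue_at_month_6)
```

Under these assumptions the vendor-only plan shows 12 months, the modelled timeline is 24 months, and 1,500 hours of delivered content are waiting for review at month 6.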
3. Missing the SCORM source file problem until the programme is mid-execution
The inventory did not distinguish between video assets (where SRT sidecar delivery is straightforward) and SCORM packages (where adding captions requires the original authoring tool project file). The programme manager discovers mid-remediation that 30% of the "video" backlog is actually SCORM content, for which the original .story and .cptx files cannot be located. This discovery at month 6 requires a programme revision (Tier 1 SCORM content must be either reconstructed from audio or replaced), additional budget (SCORM reconstruction is significantly more expensive than SRT delivery), and a revised timeline. Conduct the SCORM inventory and source-file audit in month 1, not during batch processing.
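A month-1 source-file audit can be as simple as a pass over the inventory export. The sketch below assumes the inventory records an asset type and, for SCORM packages, the path of the authoring project file if one has been located; the `InventoryItem` fields and the sample records are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Authoring project file extensions named in the scenario above.
SOURCE_EXTENSIONS = (".story", ".cptx")

@dataclass
class InventoryItem:
    asset_id: str
    kind: str                          # "video" or "scorm"
    source_file: Optional[str] = None  # project file path, if located

def has_source_file(item: InventoryItem) -> bool:
    """True when a recognised authoring project file has been located."""
    return bool(item.source_file) and item.source_file.lower().endswith(SOURCE_EXTENSIONS)

def audit_scorm(items: List[InventoryItem]) -> Tuple[float, List[str]]:
    """Return (SCORM share of the inventory, SCORM asset IDs lacking a source file)."""
    scorm = [i for i in items if i.kind == "scorm"]
    missing = [i.asset_id for i in scorm if not has_source_file(i)]
    share = len(scorm) / len(items) if items else 0.0
    return share, missing

# Hypothetical inventory export.
inventory = [
    InventoryItem("v1", "video"),
    InventoryItem("v2", "video"),
    InventoryItem("s1", "scorm", "archive/hr_onboarding.story"),
    InventoryItem("s2", "scorm"),
]
scorm_share, missing_sources = audit_scorm(inventory)
print(scorm_share, missing_sources)
```

The list of missing sources is the input to the reconstruct-or-replace decision, and having it in month 1 lets the budget and timeline reflect it from the start.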
4. Treating AI-only output as QA-ready without a glossary
To reduce vendor costs, the programme uses the AI-only service tier for Tier 3 archive content and applies QA sampling. The sample reveals that technical training content (compliance modules, product training, engineering onboarding) is at 78–82% ASR accuracy — below the WCAG threshold. The entire Tier 3 batch must be resubmitted for human review, adding cost and delay. The 99% WCAG threshold applies to all deployed content regardless of tier assignment. Using the AI-only tier without a loaded domain glossary on technical training content guarantees below-threshold accuracy.
5. Letting compliance evidence documentation fall behind the work
Monthly reports were not produced during months 3–8 because the programme team was focused on delivery. When a regulatory complaint is received at month 9, the compliance team cannot produce contemporaneous evidence of the programme's progress — they can only show the current status, not the historical sequence. The retrospective evidence reconstruction is not credible to regulators. Build the monthly reporting cadence into the programme governance from month 1 and treat a missed report as a programme escalation, not a minor administrative gap.
6. Single-vendor dependency at enterprise scale
A 20,000-hour programme is committed to a single captioning vendor who has a 700-hour/month effective throughput. The programme timeline is 28 months. At month 10, the vendor experiences a capacity constraint (peak demand season, internal quality issue) and throughput drops to 400 hours/month for eight weeks. The Tier 2 milestone slips by 6 weeks. For enterprise-scale programmes, a secondary vendor on retainer provides a throughput buffer that prevents single points of failure from driving compliance milestone delays.
7. Not implementing the new-content gate during the remediation period
The remediation programme is designed to clear the existing backlog. During the 18-month programme, the organisation's L&D and communications teams continue producing new training video at the pre-programme rate, adding 2,000 hours of new uncaptioned content while the programme remediates the existing backlog. At programme completion, the "backlog" is 80% of its starting size. Implementing the new-content gate in month 1 of the programme prevents this erosion. The governance policy template provides the language for the new-content gate policy requirement.
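The erosion effect is easy to reproduce with simple arithmetic. In the sketch below the 2,000 hours of new content comes from the scenario above; the 2,500-hour starting backlog is an assumption, chosen because it is the figure implied by "80% of its starting size" (2,000 ÷ 0.8). The function name is illustrative.

```python
def backlog_at_completion(start_hours, remediated_hours, new_hours, gate_active):
    """Uncaptioned hours left when the programme ends."""
    remaining_legacy = max(0, start_hours - remediated_hours)
    # With the gate active, new content is captioned before publication,
    # so it never enters the backlog.
    new_uncaptioned = 0 if gate_active else new_hours
    return remaining_legacy + new_uncaptioned

start = 2500        # assumed starting backlog (implied by the 80% figure)
new_content = 2000  # new hours produced during the programme (from the scenario)

without_gate = backlog_at_completion(start, start, new_content, gate_active=False)
with_gate = backlog_at_completion(start, start, new_content, gate_active=True)

print(without_gate, without_gate / start, with_gate)
```

The gate does not make the new content free to caption; it moves that cost into the production workflow, where it is paid per video at publication rather than accumulating as a second backlog.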
8. Treating programme completion as "when the last video is captioned" rather than "when the programme is sustainable"
The final batch of Tier 3 content is delivered and ingested. The programme is declared complete. Six months later, the programme manager has left the organisation, the vendor contract has lapsed, the glossary has not been updated with new product launches, and new content is being published without captions because the new-content gate was never formally institutionalised in the L&D workflow. The large-scale remediation programme is an investment that must transition into an ongoing programme — not a project that ends with the last batch delivery. The programme close-out activities include: transitioning the vendor relationship to the ongoing new-content programme, training L&D content creators on the new-content gate workflow, documenting the glossary maintenance process, and embedding the caption QA step into the L&D production workflow as a permanent requirement.
FAQ
How do we justify the budget to leadership for a multi-year remediation programme?
The most effective budget justification for large-scale remediation uses the expected-value framework from the ROI framing for finance executives post: the expected cost of a regulatory investigation, settlement, or consent decree is compared against the remediation cost. For a mid-size organisation with a documented ADA compliance gap (the existence of a large uncaptioned backlog is discoverable in an audit), the expected cost of an OCR complaint investigation ranges from $50,000 to $300,000 in legal fees, staff time, and potential settlement, even for cases that do not proceed to litigation. A multi-year remediation programme budgeted at $100,000–$200,000 is a positive expected-value investment even before considering the reputational and employee relations benefits of accessible training. Present the budget request as a compliance risk management investment with a documented expected ROI, not as an accessibility programme cost.
What if we discover that some of our highest-priority Tier 1 content is in SCORM format without accessible source files?
This is a common and difficult situation. The options in priority order: (1) search comprehensively for the original project files — check archived project drives, the email history of the team that built the course, shared storage from prior years, and the original authoring tool vendor if the course was produced by an external agency (agencies often retain project files for several years under their own retention policies); (2) if the project files are genuinely unrecoverable, commission a manual transcript from the audio track and submit it to a captioning vendor who can time-align the transcript to the audio as a human-produced SRT file — this bypasses the SCORM packaging issue by providing the transcript directly and having the vendor match it to the audio timing; (3) if the audio quality is too poor for transcript alignment or the course is severely outdated, the fastest route to compliance is replacing the SCORM course with a newly-produced version that uses a captioning-native workflow from the start. For Tier 1 content, option 2 or 3 is usually necessary within the 60-day timeline — the SCORM reconstruction path takes longer than the timeframe allows.
Should we use AI-only captioning for Tier 4 archive content to minimise cost?
Yes, with conditions. Tier 4 content that passes the Tier 4 assessment (not retired, not replaced) can use the AI-only tier if: (a) a domain glossary is loaded at the vendor to improve ASR accuracy on technical vocabulary, (b) a batch QA sample (DCMP spot-check on 10% of the batch) is run to validate that the delivered output meets the 99% threshold, and (c) the internal review capacity is allocated to handle the correction pass if the QA sample fails. AI-only without a glossary on technical training content will consistently fail the 99% threshold for material vocabulary (product names, regulatory frameworks, technical terminology). The apparent cost saving disappears when the internal correction burden is added. AI with glossary pre-loading achieves 92–96% on most technical content before human review, reducing the correction load to a level where batch QA sampling is a valid quality control approach.
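Condition (b) can be sketched as a two-step routine: draw a reproducible 10% sample from the batch, then compare the measured accuracy of each sampled file against the 99% threshold. The accuracy scores themselves would come from the manual spot-check; here they are hypothetical inputs. Requiring every sampled file to pass is an assumption of this sketch, as a programme might instead set a batch-level tolerance.

```python
import math
import random

def pick_qa_sample(file_ids, percent=10, seed=0):
    """Draw a reproducible random sample of roughly `percent`% of the batch."""
    if not file_ids:
        return []
    k = max(1, math.ceil(len(file_ids) * percent / 100))
    return random.Random(seed).sample(sorted(file_ids), k)

def sample_passes(accuracies, threshold=0.99):
    """Assumed pass rule: every sampled file must meet the threshold."""
    return all(score >= threshold for score in accuracies)

# Hypothetical batch of 50 caption files.
batch = [f"tier4-{n:03d}" for n in range(50)]
sample = pick_qa_sample(batch)
print(len(sample), sample)

# Hypothetical spot-check results for two different batches.
print(sample_passes([0.995, 0.992, 0.990, 0.997, 0.991]))
print(sample_passes([0.995, 0.940, 0.990, 0.997, 0.991]))
```

Fixing the seed makes the sample reproducible, which helps if the selection later needs to be shown as part of the compliance evidence trail.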
What constitutes a defensible remediation timeline in the event of an OCR complaint?
There is no universal standard: the defensibility of a timeline depends on the documentation, the prioritisation rationale, and the good-faith evidence. The enforcing agency (the Department of Education's Office for Civil Rights for universities; the EEOC for ADA Title I employment) does not specify a maximum acceptable remediation timeline. What OCR has required in resolved complaints and consent agreements typically includes: (1) all current-enrollment content captioned within 60–120 days of the complaint, (2) a documented plan with milestones for remaining content, (3) quarterly progress reports to OCR during the remediation period, (4) a new-content gate implemented immediately to prevent future accumulation, and (5) designation of a responsible compliance officer. An organisation that can present a pre-complaint remediation programme with Tier 1 completed, Tier 2 in active progress, and monthly reporting documentation is in a far stronger position than one that begins remediation only after the complaint is filed. The timeliness of Tier 1 completion is the most heavily weighted factor: completing Tier 1 before any complaint is filed provides the strongest evidence of proactive good-faith effort.
We have training content in three different LMS platforms from acquisitions. How do we prioritise across systems?
The four-tier prioritisation framework applies across all systems simultaneously using the same tier criteria — it does not vary by which LMS hosts the content. Active-enrollment content in an acquired company's LMS is as much Tier 1 as active-enrollment content in the primary LMS. The practical challenge is that the three systems have three different caption upload workflows, three different inventory approaches, and potentially three different vendor integrations. For the programme management structure, assign a system lead for each LMS (someone who knows that platform's admin interface) and have them own the LMS ingestion workflow for their system's content. The prioritisation, vendor management, and compliance evidence functions remain centralised under the programme manager. The LMS migration caption checklist is relevant here even though you are not doing a full migration — the section on cross-LMS caption data inventory applies to your multi-system audit situation.
When is the remediation programme "done" — when all content is captioned, or when all content is WCAG-compliant?
The compliance standard is WCAG 2.1 AA conformance, not merely the presence of a caption file. A video that has auto-generated captions at 78% accuracy is not compliant even though it has a caption track. "Done" from a compliance perspective means: every video accessible to learners has a caption track that meets the 99% accuracy WCAG threshold, is properly synchronised (within ±2 seconds), and has been reviewed for accuracy on domain-specific vocabulary. A caption file that was produced and reviewed to that standard is compliant. A caption file that was auto-generated and never reviewed is not compliant regardless of how long it has been attached to the video. The programme is done when the compliant-captions metric (not the captioned-at-all metric) reaches 100% for both active-enrollment content and the historical backlog, and when the new-content gate prevents new non-compliant content from entering the system.
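The gap between the two metrics can be made concrete with a small calculation over the caption inventory. This is a sketch under stated assumptions: the `CaptionRecord` fields are hypothetical, and the thresholds (99% accuracy, ±2 seconds, reviewed) are the ones given in the answer above.

```python
from dataclasses import dataclass

@dataclass
class CaptionRecord:
    video_id: str
    has_caption: bool
    accuracy: float           # measured accuracy, 0.0 to 1.0
    max_sync_offset_s: float  # worst caption timing offset, in seconds
    reviewed: bool            # human review of domain vocabulary completed

def is_compliant(r, min_accuracy=0.99, max_offset_s=2.0):
    """A record counts only if it meets all of the criteria listed above."""
    return (
        r.has_caption
        and r.reviewed
        and r.accuracy >= min_accuracy
        and abs(r.max_sync_offset_s) <= max_offset_s
    )

def coverage(records):
    """Return (captioned-at-all share, compliant-captions share)."""
    if not records:
        raise ValueError("inventory is empty")
    captioned = sum(r.has_caption for r in records)
    compliant = sum(is_compliant(r) for r in records)
    return captioned / len(records), compliant / len(records)

# Hypothetical inventory of four videos.
records = [
    CaptionRecord("a", True, 0.995, 0.5, True),   # compliant
    CaptionRecord("b", True, 0.78, 0.5, False),   # auto-generated, never reviewed
    CaptionRecord("c", False, 0.0, 0.0, False),   # no caption track
    CaptionRecord("d", True, 0.995, 3.5, True),   # accurate but out of sync
]
captioned_share, compliant_share = coverage(records)
print(captioned_share, compliant_share)
```

In this example the captioned-at-all metric reads 75% while the compliant-captions metric reads 25%, which is the figure the monthly compliance report should track.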
We have purchased compliance training modules from a vendor (ComplianceWire, Navex, Skillsoft, OpenSesame). Are those in scope for our remediation programme?
Yes. As covered in the third-party compliance training captioning guide, the caption obligation under ADA Title I rests with the employer who assigns the training — not solely with the vendor who produced it. If you assign purchased compliance modules to employees and those modules do not have WCAG-compliant captions, you are responsible for the compliance gap. The practical steps: (1) audit the caption quality of all purchased modules currently deployed (many enterprise compliance training vendors now provide WCAG-compliant SRT files; the gap is often in older legacy content in the vendor's library); (2) contact the vendor's accessibility team to request compliant caption files for any module that does not meet the standard; (3) if the vendor cannot or will not provide compliant captions, factor the module into your remediation programme as a content asset requiring external captioning (you may need contractual clarity on whether captioning a licensed module for accessibility purposes is permitted under your licence terms). Most enterprise compliance training vendors are highly responsive to accessibility requests from large accounts — the request triggers their accessibility compliance workflow and typically results in an updated caption file within 2–4 weeks.
Remediation at scale — without rebuilding your caption infrastructure from scratch
GlossCap applies your organisational glossary at the ASR decoding stage, reducing the human review burden on every batch by 40–60% compared to generic ASR output. For a 10,000-hour remediation programme, that difference is several hundred hours of internal reviewer time and weeks of programme schedule. Start with a glossary-assisted accuracy evaluation before committing to a full-scale batch.