Compliance Operations · Published 2026-06-05
Multi-language caption workflow for global L&D teams: translation pipeline, LMS delivery, EAA and AODA compliance
The moment a training video is delivered to an employee in a language other than the source language, every assumption in your caption workflow changes. The caption file format stays the same — SRT or VTT — but the timing, the vocabulary, the QA method, the LMS delivery configuration, and the compliance obligation all have different requirements in the target language than they had in the source. Teams that treat multi-language captioning as "send the SRT to a translation vendor and re-upload" discover these differences the hard way: French captions that scroll faster than the WCAG guideline allows because the French translation of English text runs 25–30% longer, compliance training videos where "directive de confidentialité" was rendered as "privacy directive" because the translator received no instruction on how to handle French-language regulatory terms, or LMS deployments where the French caption track plays by default for English-speaking learners because the locale-routing configuration was not updated alongside the file upload.
Two regulations are forcing this from optional to mandatory for a growing share of global L&D operations. The European Accessibility Act has been enforceable since June 2025 and requires that digital products and services — including training platforms — meet the EAA's accessibility requirements for caption accuracy and synchronization in the language in which content is served. An EU employer that provides French-language compliance training to employees in France must provide French-language captions at WCAG 2.1 AA accuracy — not English captions, and not machine-translated captions that were never QA'd for French accuracy. The AODA's Integrated Accessibility Standards Regulation has required captions for synchronous video since 2017 and, for Ontario employers serving French-speaking employees under the French Language Services Act, creates a practical bilingual obligation for any training content produced in both official languages.
This post is the operational guide to building a multi-language caption workflow that actually meets these obligations. It covers the four-stage translation pipeline in enough detail to apply it on a specific content library next week, how to choose between machine translation and human translation by content type and regulatory exposure, the per-language glossary architecture that prevents the terminology inconsistencies that defeat MT quality, LMS delivery workflows for eight platforms we see most frequently in global enterprise L&D environments, the EAA and AODA compliance obligations as they apply specifically to caption language, the adapted QA protocol for translated captions including the reading-speed adjustment requirement, and the eight failure modes that cause teams to ship non-compliant or operationally broken multi-language caption tracks. The preceding posts in this series — caption QA methodology, the feedback loop that compounds accuracy, and how to audit an LMS caption library — cover the source-language production system. This post assumes that system is in place and focuses on what changes when you add a second language.
TL;DR — three things that matter about multi-language caption operations
- Source caption quality is the hard constraint. You cannot produce a compliant French caption from a non-compliant English caption. If the source caption has not cleared 99% accuracy against the WCAG 2.1 AA standard, the translation stage will propagate every error into the target language and introduce additional errors through the translation process. Source caption QA lock is not optional — it is the first gate in the pipeline.
- Machine translation works for soft-skills content; it fails for regulated content. DeepL and Google Translate produce usable output on communication training, leadership content, and process documentation. They produce unacceptable output on compliance training, healthcare training, and legal training — where a term like "reasonable accommodation" in ADA compliance content must be rendered as "aménagement raisonnable" (France) or "mesure d'adaptation raisonnable" (Quebec) depending on jurisdiction, not as whatever the MT model's most common output is. The method selection decision has a direct impact on whether the translated captions are legally defensible.
- Timing must be re-adapted for text-expansion languages. English source timing files cannot be reused as-is for French, Spanish, German, or Portuguese caption tracks. French runs 25–30% longer than English in text at equivalent meaning; Spanish runs 20–25% longer; German runs 30–35% longer. A caption frame timed for 40 characters of English text will display 50–52 characters of French text in the same interval, exceeding the reading-speed guidance used to operationalize WCAG (approximately 17 characters per second, with at most two lines of 37 characters per frame). Every multi-language caption workflow must include a timing adaptation step before the translated file is ingested into the LMS.
What multi-language captioning means — and what it does not mean
Multi-language captioning is a workflow problem, not a translation problem. The translation is one step, and not the hardest one. The harder steps are: locking source caption quality before translation begins, choosing the right translation method for the content type, adapting timing after translation for text-expansion languages, routing the translated file to the correct LMS caption track, verifying that the LMS locale configuration delivers the correct language to the correct learner, and running a QA check on the translated caption against a native speaker reference. Teams that execute only the translation step and skip the other six will produce caption files that are technically translated but practically non-compliant.
What it is
A multi-language caption workflow produces, for each piece of captioned video content:
- One caption file per delivery language, each with its own timing optimized for that language's text expansion properties
- One QA record per language, documenting accuracy against the WCAG 99% threshold in the target language (not the source language)
- One LMS configuration per language, ensuring that each learner's locale setting routes them to the correct caption track
- One glossary entry set per language, covering all proper nouns, regulatory terms, and product names that appear in the content, with their correct target-language form
This is the minimum viable multi-language caption operation. It is more work than producing captions in a single language, but it is the only configuration that is compliant under EAA and defensible under AODA for organizations serving employees in multiple languages.
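The four per-language deliverables listed above can be tracked as a simple checklist. The following is a minimal sketch, not a prescribed schema: the `LanguageArtifacts` class, its field names, and the file names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LanguageArtifacts:
    """Per-language deliverables for one captioned video (field names are illustrative)."""
    caption_file: Optional[str] = None    # path to the re-timed SRT/VTT
    qa_record: Optional[str] = None       # ID of the target-language QA record
    lms_locale_configured: bool = False   # locale routing verified in the LMS
    glossary_entries: bool = False        # term-map rows exist for this language

def missing_artifacts(artifacts: LanguageArtifacts) -> list[str]:
    """Return the names of the deliverables that are still missing."""
    missing = []
    if not artifacts.caption_file:
        missing.append("caption_file")
    if not artifacts.qa_record:
        missing.append("qa_record")
    if not artifacts.lms_locale_configured:
        missing.append("lms_locale_configured")
    if not artifacts.glossary_entries:
        missing.append("glossary_entries")
    return missing

# Hypothetical video with a complete English track and an incomplete French one.
video = {
    "en": LanguageArtifacts("intro.en.vtt", "qa-en-001", True, True),
    "fr": LanguageArtifacts("intro.fr.vtt", None, False, True),
}
for lang, artifacts in video.items():
    print(lang, missing_artifacts(artifacts))
```

A language counts as ready for publication only when its list comes back empty; anything else is a translated file that has not yet become a compliant caption track.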
What it is not
Multi-language captioning is not:
- Auto-translating the source SRT file and uploading it without re-timing. Auto-translated SRT files carry the source timing, which is calibrated to the source language text length. French, Spanish, German, and Portuguese text is longer than English text at equivalent meaning, so auto-translated files will generate reading-speed failures in these languages. The timing must be re-adapted for each target language.
- Adding a subtitle file to the LMS without updating locale routing. Uploading a French SRT file does not automatically make it the default caption track for French-locale learners. LMS locale configuration must be explicitly set, and the configuration depends on how the specific platform handles multi-language course variants versus single-course multi-language subtitle selectors.
- Translating captions without QA in the target language. A translated caption that has not been QA'd against a native-speaker reference in the target language cannot be cited as compliant. The QA methodology described for source captions applies to translated captions, with adaptations for translation-layer error types. You cannot sign an accessibility statement claiming French caption compliance if no one who reads French has verified the captions.
- Assuming the source glossary works for the target language. The source glossary covers English-language terms and their canonical English forms. Product names, regulatory terms, and technical terms all need target-language entries. "Workday" stays "Workday" in French; "aménagement raisonnable" is the correct French-language term for "reasonable accommodation" in a French ADA-equivalent training context. These decisions must be made explicitly and entered in a per-language term list before translation begins.
Which organizations need this
Organizations that need a multi-language caption workflow fall into three categories.
EU-based or EU-serving organizations under EAA. Any organization providing digital services to EU consumers or employees is subject to EAA's accessibility requirements from June 2025. For L&D teams, this means training content served in French, German, Dutch, Spanish, Italian, Polish, or any other EU language must meet EN 301 549's WCAG 2.1 AA caption requirements in the delivery language. A company headquartered in an English-speaking country that runs a Docebo instance for its French subsidiary cannot serve English-only captions to French-speaking learners and claim EAA compliance. The EU accessibility statement requirement reinforces this: the statement must declare the languages for which captions are provided and the accuracy level at which they are maintained.
Ontario employers under AODA. The AODA's Integrated Accessibility Standards Regulation §14 requires captions for all synchronous video communications and, by extension under the employment standard, for recorded training content distributed to employees. Ontario employers with French-speaking employees under the French Language Services Act (FLSA) face a practical bilingual obligation: any training that must be provided in French under FLSA must be captioned in French to meet AODA's accessibility requirement in that language. This is not explicitly stated in a single provision, but it is the correct interpretation of the intersection of the two statutes. The AODA accessibility plan template and the AODA captions guidance describe the base obligation; bilingual content adds the French-language caption requirement.
Global enterprises with multi-country L&D operations. Even outside EAA and AODA jurisdictions, organizations with manufacturing, sales, or support operations in non-English-speaking markets increasingly include training content accessibility as part of their global L&D quality standard. The driver is often not regulation but inclusion: a production floor worker in a Mexican facility whose primary language is Spanish derives significantly less learning value from an English caption than from a Spanish one. The compliance argument and the learning-effectiveness argument both point to the same operational investment.
The four-stage translation pipeline
A multi-language caption workflow has four sequential stages. They must be completed in order — skipping or reordering stages produces failures that are difficult to detect and expensive to remediate.
Stage 1: Source caption lock
Source caption lock is the discipline of not beginning translation until the source-language caption file has passed QA at the compliance threshold. This sounds obvious, but it is violated consistently in practice — usually because of timeline pressure: the video needs to go live, translation is starting "while QA wraps up," and the plan is to update the translated file if the source QA finds failures. This plan fails for two reasons.
First, a source caption error propagates into the translation. If the English caption says "cube nettis control plane" instead of "Kubernetes control plane," the translator — whether human or MT — translates the wrong text. The French caption may say "plan de contrôle du cube nettis" or may produce a translation error trying to process nonsense phonemes. Either way, the French caption is wrong and must be re-translated after the source is corrected. The re-translation cost is higher than the delay cost of waiting for source QA completion.
Second, timing corrections in the source cascade into the target. If source QA discovers that a caption frame was mistimed — covering 8 seconds of audio when the actual speech covers 12 seconds — the frame must be re-split in the source file. When the French translator has already worked from the incorrect timing, the translated file's timing is based on wrong frames. Correcting the source and re-timing the translation requires re-doing a significant portion of the translation work.
Source caption lock means: the source file has passed the QA protocol at 99% accuracy, all timing corrections have been applied, and the file is marked as the canonical source version before translation begins. Most teams formalize this with a version-control discipline: the source SRT/VTT file gets a version tag (v1.0-source-locked) before it is sent to translation, and any correction after that point requires a re-translation of the affected segments.
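One lightweight way to enforce the lock is to store a content hash alongside the version tag and compare it before each translation hand-off. This is a sketch under assumptions: the function names and the lock record layout are hypothetical, and a team already using Git tags or a caption vendor's versioning would rely on those instead.

```python
import hashlib

def lock_source(caption_text: str, version: str = "v1.0-source-locked") -> dict:
    """Record a version tag and content hash for the QA-approved source file."""
    digest = hashlib.sha256(caption_text.encode("utf-8")).hexdigest()
    return {"version": version, "sha256": digest}

def is_unchanged(caption_text: str, lock: dict) -> bool:
    """True if the file still matches the locked source.

    False means the source was edited after lock, so the affected
    segments need re-translation.
    """
    digest = hashlib.sha256(caption_text.encode("utf-8")).hexdigest()
    return digest == lock["sha256"]

# Illustrative one-cue SRT file.
source = "1\n00:00:01,000 --> 00:00:04,000\nKubernetes control plane\n"
lock = lock_source(source)

edited = source.replace("control plane", "control-plane")
print(is_unchanged(source, lock))   # the locked file itself
print(is_unchanged(edited, lock))   # a post-lock correction
```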
Stage 2: Translation method selection
Translation method selection is the decision of whether to use machine translation, machine translation with human post-editing, or full human translation for a specific piece of content. The decision is driven by content type, regulatory exposure, and target-language complexity.
The translation method selection framework is described in detail in the next section. The key output of this stage is a method decision for each content type in the library, documented before translation begins. Making this decision per-piece, mid-workflow, is a common source of inconsistency — teams end up with some content translated by MT and some by human translators, with no systematic record of which method was used for what, making audit and QA harder.
Stage 3: Timing adaptation
After translation, the caption file's timing must be adapted for the target language before it is ingested into any LMS. Timing adaptation is the process of re-segmenting the caption frames to account for text length differences between the source and target languages.
The core issue is that WCAG 1.2.2 Captions (Prerecorded) requires captions to be synchronized with audio and displayed at a readable pace. The operationalized reading-speed guideline (used by DCMP, BBC, and Netflix subtitle standards) is approximately 160–180 words per minute, or about 17 characters per second, with a maximum of two lines of approximately 37 characters per frame. When English text is translated to French, German, or Spanish, the translated text is longer at equivalent meaning. A frame that contains 35 characters of English text may contain 44–46 characters of French text, which no longer fits on a single 37-character line and, at the original frame duration, can push the reading rate above 17 characters per second.
| Target language | Typical expansion vs. English | Implication for caption timing |
|---|---|---|
| French | +25–30% | Re-segment frames; split 40-char English frames into two 26-char French frames where expansion occurs |
| Spanish | +20–25% | Re-segment frames; modest expansion — check frames with dense technical terminology first |
| German | +30–35% | Most aggressive expansion of common European languages; compound nouns ("Barrierefreiheitsgesetz") will break simple frame splits — may need grammar-aware re-segmentation |
| Portuguese (BR) | +15–20% | Smaller expansion; most frames survive reuse with minor adjustment at high-density segments |
| Italian | +10–15% | Smallest expansion of the Romance group; timing reuse is often workable with spot corrections |
| Dutch | +15–20% | Compound nouns create localized frame overflow; check noun phrases specifically |
| Polish | −5 to +10% | Synthetic morphology often produces shorter output than English; timing may under-fill — re-segment to avoid excessive white space on screen |
| Japanese | −20 to −40% | Character-dense writing produces significantly shorter files; source timing leaves excessive dead space — re-segment to fill frames more fully and reduce cognitive load |
| Arabic | −10 to +15% | Right-to-left display requires LMS RTL caption support; expansion varies by content type; verify LMS RTL rendering before deployment |
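The arithmetic behind the table can be checked with a few lines of code. The sketch below copies the French, Spanish, and German ranges from the table and the 17 characters-per-second and 2 × 37 character limits from the guideline cited above; the function names and language keys are illustrative.

```python
MAX_CPS = 17          # characters per second
MAX_LINE_CHARS = 37   # characters per line
MAX_LINES = 2

# Expansion ranges from the table, as fractions of the English length.
EXPANSION = {"fr": (0.25, 0.30), "es": (0.20, 0.25), "de": (0.30, 0.35)}

def expanded_range(english_chars: int, lang: str) -> tuple[int, int]:
    """Estimated character count of the translated text (low, high)."""
    low, high = EXPANSION[lang]
    return round(english_chars * (1 + low)), round(english_chars * (1 + high))

def min_duration(chars: int) -> float:
    """Shortest display time in seconds that keeps a cue at or under MAX_CPS."""
    return chars / MAX_CPS

def fits_one_frame(chars: int) -> bool:
    """True if the text fits within two lines of 37 characters."""
    return chars <= MAX_LINE_CHARS * MAX_LINES

low, high = expanded_range(40, "fr")
print(low, high)                      # estimated French length for a 40-char English cue
print(round(min_duration(40), 2))     # seconds the English cue needs at 17 cps
print(round(min_duration(high), 2))   # seconds the French cue needs at 17 cps
```

The gap between the two durations is the time that has to be found, either by extending the frame into adjacent silence or by splitting the frame.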
Timing adaptation can be performed manually (in a subtitle editor such as Aegisub, Subtitle Edit, or the timing editor built into your caption vendor's platform) or automatically (using a re-timing script that expands frame boundaries proportionally to text length increase). Manual re-segmentation produces more grammatically correct frame splits — splitting at clause boundaries rather than at arbitrary character limits — but is significantly more time-consuming. For high-volume libraries, a two-pass approach works well: automatic proportional re-timing as the first pass, followed by manual review of any frame that now exceeds two lines or that splits across a syntactic boundary.
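The two-pass approach can be sketched as follows. This is an illustrative first pass only, assuming cues have already been parsed into start time, end time, and text: it splits over-long cues at the space nearest the midpoint (a character-based split, not a clause-aware one) and flags whatever still needs a human. The `Cue` class and function names are hypothetical.

```python
from dataclasses import dataclass

MAX_CPS = 17
MAX_FRAME_CHARS = 74  # two lines of 37 characters

@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str

def split_cue(cue: Cue) -> list[Cue]:
    """Split an over-long cue at the space nearest its midpoint,
    dividing the display time in proportion to each half's length."""
    text = cue.text.strip()
    if len(text) <= MAX_FRAME_CHARS or " " not in text:
        return [Cue(cue.start, cue.end, text)]
    mid = len(text) // 2
    spaces = [i for i, ch in enumerate(text) if ch == " "]
    cut = min(spaces, key=lambda i: abs(i - mid))
    left, right = text[:cut], text[cut + 1:]
    share = len(left) / (len(left) + len(right))
    boundary = cue.start + (cue.end - cue.start) * share
    return (split_cue(Cue(cue.start, boundary, left))
            + split_cue(Cue(boundary, cue.end, right)))

def needs_manual_review(cue: Cue) -> bool:
    """Flag cues that are still too long or too fast after the automatic pass."""
    duration = cue.end - cue.start
    too_fast = duration <= 0 or len(cue.text) / duration > MAX_CPS
    return too_fast or len(cue.text) > MAX_FRAME_CHARS

# Placeholder text standing in for a 99-character translated cue.
long_cue = Cue(0.0, 6.0, "mot " * 25)
parts = split_cue(long_cue)
review_queue = [p for p in parts if needs_manual_review(p)]
print(len(parts), len(review_queue))
```

Because the split point is chosen by character position, the second pass still matters: a reviewer should move any boundary that lands in the middle of a clause.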
Stage 4: Multi-language QA
Translated captions require QA in the target language before they are published. The adapted QA protocol is described in detail later in this post. The key point here is that target-language QA is not optional even if you used professional human translators with domain expertise. Translation errors are a distinct error category from transcription errors — a human translator can produce a grammatically correct sentence that uses the wrong regulatory term, renders a product name incorrectly, or fails to adapt an idiomatic phrase that does not transfer across languages. QA is how you catch these.
For each target language, target-language QA requires a native speaker (or near-native L2 speaker with professional domain knowledge). You cannot QA French captions by running them through a back-translation to English and checking for meaning equivalence. The DCMP accuracy check requires reading the target-language caption alongside the target-language audio or a reference document and identifying discrepancies at the word level in the target language.
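Once a native-speaker reviewer has produced or approved a target-language reference, the word-level comparison itself can be scored mechanically. The sketch below uses a standard word-level edit distance; it is one possible scoring method, not the DCMP procedure itself, and its tokenization (lowercase, whitespace split) is a simplifying assumption that ignores punctuation handling.

```python
def word_edit_distance(reference: list[str], hypothesis: list[str]) -> int:
    """Levenshtein distance over words (substitutions, insertions, deletions)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # word missing from the caption
                current[j - 1] + 1,      # extra word in the caption
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]

def word_accuracy(reference_text: str, caption_text: str) -> float:
    """Share of reference words the caption gets right, from 0.0 to 1.0."""
    reference = reference_text.lower().split()
    caption = caption_text.lower().split()
    if not reference:
        raise ValueError("reference text is empty")
    errors = word_edit_distance(reference, caption)
    return max(0.0, 1 - errors / len(reference))

reference = "le plan de contrôle Kubernetes gère le cluster"
caption = "le plan de contrôle Kubernetes gère un cluster"
print(word_accuracy(reference, caption))
```

The score only measures agreement with the reference. Whether a regulatory term is the right one for the jurisdiction remains a judgment for the reviewer.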
Translation method selection framework
The choice of translation method is the highest-impact decision in the multi-language caption pipeline. It affects cost, speed, accuracy, and regulatory defensibility. Three methods are available, and the right choice depends on content type and the consequences of error.
Machine translation (MT-only)
MT-only uses a neural machine translation engine — DeepL, Google Cloud Translation, Microsoft Azure Translator, or Amazon Translate — to translate the source caption file directly. Modern NMT produces usable output for a wide range of content types: DeepL in particular handles European language pairs well, with grammatical accuracy that exceeds what most reviewers expect from a machine output.
MT-only is appropriate when:
- The content is soft-skills training — communication, leadership, time management, interviewing — where there are no domain-specific regulatory terms and the vocabulary is common language
- The target language pair is well-served by the chosen MT engine (English → French, English → Spanish, English → German are all strong pairs for current NMT)
- The translation is not going to be cited as the basis of a compliance claim in the target language — it is for employee learning, not for regulatory audit evidence
- The volume is high and the per-unit budget is low — MT-only costs a fraction of human translation for large libraries
MT-only is not appropriate for:
- Compliance training, where the translation of regulatory terms must be legally precise (EAA, AODA, GDPR, OSHA equivalent in the target-language jurisdiction)
- Healthcare training, where drug names, clinical procedure names, and diagnostic terms must match the target-language clinical vocabulary, not the MT model's probabilistic output
- Financial services training, where specific regulatory terminology in the target-language jurisdiction may differ substantially from its English-language surface equivalent
- Any content where the organization needs to produce a multi-language accessibility statement certifying caption accuracy at WCAG compliance levels
Machine translation with human post-editing (MTPE)
MTPE produces a machine translation first, then routes the output to a human translator for review and correction. The human corrects errors, adapts terminology, and applies the per-language glossary decisions. MTPE is faster than full human translation (by approximately 30–50%) and more accurate than MT-only for domain-specific content.
MTPE is the method most training teams land on for the majority of their content once they have been through the decision framework. It combines the speed and cost advantage of MT for the majority of sentences (which are common language and translate correctly) with the judgment of a domain-aware human translator for the minority of sentences that require it (regulatory terms, product names, idiomatic phrases).
For MTPE to work well, two inputs must be provided to the human post-editor before they begin: the per-language glossary (covering product names, regulatory terms, and technical vocabulary with their correct target-language forms) and a style guide specifying the register and formality level appropriate for the organization's training content in that language. Without these, the post-editor has to make these decisions individually for each segment, producing inconsistent terminology across the file.
Full human translation (HT)
Full human translation engages a domain-expert translator to produce the target-language text from the source, without an MT intermediate. HT is slower and more expensive than MTPE, but it is the only method that is appropriate for content where translation errors would constitute a compliance risk or a patient safety risk.
Full HT is required for:
- Healthcare compliance training in regulated markets — drug name accuracy and clinical procedure terminology must be verified by a clinically trained translator in the target language
- Regulatory compliance training that will be submitted as audit evidence to a regulatory body in the target-language jurisdiction
- Legal training where the specific terminology has legal meaning in the target-language jurisdiction (the French-language legal equivalent of a common law concept is often different from its English surface translation)
- Content where MT post-editing would cost approximately the same as full HT due to high domain-vocabulary density — if the MT output requires correction on more than 60% of segments, full HT is more efficient
| Content type | Recommended method | EAA-defensible at 99%? | AODA-defensible? | Notes |
|---|---|---|---|---|
| Soft skills (communication, leadership) | MT-only | Yes, with timing adaptation + QA | Yes, with QA | Common vocabulary; MT models are strong |
| Product and process training | MTPE | Yes, with glossary + timing + QA | Yes | Product names need glossary anchoring |
| Onboarding + HR compliance | MTPE | Yes, with glossary + timing + QA | Yes | Policy terms need human review |
| ADA/EAA compliance training | HT | Yes (required) | Yes (required) | Regulatory terms must be jurisdiction-correct |
| GDPR / data privacy training | HT | Yes (required) | Yes | GDPR terminology is defined by EU regulation in each official language |
| Healthcare / clinical training | HT by clinical translator | Yes (required) | Yes (required) | Drug names and clinical procedure terms are patient-safety critical |
| Financial services compliance | HT by finance translator | Yes (required) | Yes (required) | Target-language regulatory terminology differs from English surface equivalent |
| Safety / OSHA-equivalent training | MTPE or HT | MTPE sufficient for common hazard language; HT for HazCom-equivalent regulatory terms | MTPE with QA | See HazCom captioning for chemical name handling |
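The table and the 60% correction-rate rule can be encoded as a small lookup so that the method decision is recorded per content type rather than made mid-workflow. This is a sketch: the content-type keys and the function signature are hypothetical labels for the table rows, not an established taxonomy.

```python
from typing import Optional

# One entry per row of the table above (keys are illustrative labels).
RECOMMENDED_METHOD = {
    "soft_skills": "MT-only",
    "product_process": "MTPE",
    "onboarding_hr": "MTPE",
    "ada_eaa_compliance": "HT",
    "gdpr_privacy": "HT",
    "healthcare_clinical": "HT",
    "financial_compliance": "HT",
    "safety": "MTPE",
}

def select_method(content_type: str,
                  mt_correction_rate: Optional[float] = None,
                  regulatory_terms: bool = False) -> str:
    """Return the translation method for a content type.

    mt_correction_rate: share of MT segments a post-editor had to correct
    in a pilot (0.0 to 1.0), if one was measured.
    regulatory_terms: for safety content, whether HazCom-equivalent
    regulatory terminology is present.
    """
    method = RECOMMENDED_METHOD[content_type]
    if content_type == "safety" and regulatory_terms:
        method = "HT"
    # Above 60% corrected segments, post-editing no longer saves effort.
    if method == "MTPE" and mt_correction_rate is not None and mt_correction_rate > 0.60:
        method = "HT"
    return method

print(select_method("soft_skills"))
print(select_method("product_process", mt_correction_rate=0.7))
print(select_method("safety", regulatory_terms=True))
```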
Glossary architecture for multi-language production
The source-language glossary covers English proper nouns and their correct English canonical forms, fed into the AI captioning system's glossary-biased decoding to improve transcription accuracy. A multi-language operation requires an extension of this architecture: a per-language term map that specifies how each source-language term should appear in each target language.
The per-language term map
The per-language term map is a document (usually a CSV or spreadsheet) with one row per term and one column per language. Each cell specifies the correct target-language form of that term. The term map is the input to the translation workflow: it is given to every translator and every MT post-editor before they work on any segment, and it is checked as part of QA to verify that no term was rendered in a non-approved form.
Three categories of terms require explicit per-language decisions:
Product names and brand names. Most product names should be left untranslated: "GlossCap" stays "GlossCap" in French, German, and Japanese. Platform names like "Workday" and "Docebo" stay as-is. The decision is simple — do not translate proper brand names. The mistake is when a translator or MT model attempts to translate a brand name by meaning: "Cornerstone OnDemand" translated as "Pierre d'angle à la demande" (French literal) is wrong. Brand names go into the term map as "do not translate" entries to prevent this.
Regulatory terms. Regulatory terms must match the official vocabulary used in the target-language jurisdiction. This is where MT-only fails most consequentially. Consider three English terms that require different handling by language:
- "Reasonable accommodation" in ADA/EAA training: French (France) → "aménagement raisonnable"; French (Canada/Quebec) → "mesure d'adaptation raisonnable"; German → "angemessene Vorkehrung" (EN 301 549 Article 4 terminology)
- "Accessibility statement" in EAA context: French → "déclaration d'accessibilité" (specific term from Directive 2016/2102 implementing regulations); German → "Barrierefreiheitserklärung"
- "Synchronised captions" in WCAG context: French → "sous-titres synchronisés"; German → "synchronisierte Untertitel"; Dutch → "gesynchroniseerde ondertitels"
An MT model may produce correct output for any of these, or it may produce a plausible but incorrect equivalent. The term map removes the uncertainty by specifying the approved form in advance, before any translation work begins.
Technical vocabulary. Domain-specific technical terms (LMS platform names, API names, file format names, diagnostic terms, product SKU names) need explicit per-language entries. Caption file format names keep the same abbreviation across languages: SRT stays "SRT" and VTT stays "VTT". "WCAG" stays "WCAG" in French rather than being replaced by an acronym built from a French rendering of the title. The term map prevents MT models from attempting to translate acronyms that should not be translated.
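A term map in CSV form can double as an automated QA check. The sketch below uses two rows taken from the examples above; the column layout (`term`, `en`, then one column per locale) and the function names are assumptions, and the substring match is deliberately naive, so inflected or pluralized forms would need a more careful matcher.

```python
import csv
import io

# Illustrative term map: one row per term, one column per language.
TERM_MAP_CSV = """term,en,fr-FR,fr-CA,de
brand_workday,Workday,Workday,Workday,Workday
reasonable_accommodation,reasonable accommodation,aménagement raisonnable,mesure d'adaptation raisonnable,angemessene Vorkehrung
"""

def load_term_map(csv_text: str) -> list[dict]:
    """Parse the term map into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def term_violations(source_text: str, target_text: str,
                    lang: str, term_map: list[dict]) -> list[str]:
    """Return approved target forms that are missing from the translation
    even though the English term appears in the source."""
    src, tgt = source_text.lower(), target_text.lower()
    missing = []
    for row in term_map:
        if row["en"].lower() in src and row[lang].lower() not in tgt:
            missing.append(row[lang])
    return missing

term_map = load_term_map(TERM_MAP_CSV)
source = "Request a reasonable accommodation in Workday."
target = "Demandez un aménagement raisonnable dans Workday."

print(term_violations(source, target, "fr-FR", term_map))
print(term_violations(source, target, "fr-CA", term_map))
```

The same French sentence passes for France and fails for Quebec, which is the jurisdiction distinction the term map exists to enforce.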
Term freeze before translation begins
The term map must be finalized and approved before any translation work begins on a content set. This is called term freeze, and it is the discipline that prevents the most expensive form of multi-language caption correction: discovering mid-QA that a regulatory term was rendered inconsistently across the first 40 videos in the library, requiring re-translation of every segment containing that term across every translated language.
Term freeze operates as follows: the term map is assembled (by the L&D team, with input from legal, HR, and any domain subject matter experts), reviewed and approved by a native speaker with domain knowledge in each target language, and then distributed as the locked reference for all translators before production begins. Any change to a term after freeze requires a formal amendment, a note of which content files are affected, and a queue for re-translation of affected segments.
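The amendment step is where tooling helps most: given the old form of a term, the affected files and segments can be listed automatically. A minimal sketch, assuming caption files have already been parsed into lists of cue texts; the `Amendment` record, the file names, and the replacement form "Zephyr" are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Amendment:
    """A formal post-freeze change to one term in one language."""
    term: str
    language: str
    old_form: str
    new_form: str

def retranslation_queue(amendment: Amendment,
                        library: dict[str, list[str]]) -> dict[str, list[int]]:
    """Map each affected caption file to the 0-based indices of the cues
    that contain the old form and therefore need re-translation.

    library: caption file name -> list of cue texts in amendment.language.
    """
    needle = amendment.old_form.lower()
    queue = {}
    for file_name, cues in library.items():
        hits = [i for i, text in enumerate(cues) if needle in text.lower()]
        if hits:
            queue[file_name] = hits
    return queue

# Hypothetical change of the German market name for a product.
amendment = Amendment("product_name", "de", "Apex Pro", "Zephyr")
library = {
    "onboarding-01.de.srt": ["Willkommen.", "Öffnen Sie Apex Pro.", "Apex Pro speichert automatisch."],
    "onboarding-02.de.srt": ["Vielen Dank."],
}
print(retranslation_queue(amendment, library))
```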
This sounds bureaucratic for a small content library. It scales well when the library reaches 50–200 videos, because the investment in term freeze up front prevents the compounding remediation cost of inconsistent terminology across a large library. For the compound accuracy effect to work in multi-language production, the vocabulary model must be consistent — which requires term freeze discipline from the beginning.
Product names that vary by market
Some organizations localize product names by market — a product sold as "Apex" in the US may be sold as "Apex Pro" in Germany or under a different trade name in Japan due to trademark considerations. These market-specific product name variants must be in the per-language term map explicitly, not left to the translator's judgment. A translator who does not know that the German market name for the product is different from the US market name will use the US name in the German caption, which is wrong both as a captioning error and as a brand consistency error.
This is a small category — most organizations do not have market-specific product names — but when it applies, it is a source of errors that are extremely difficult to catch in QA unless the QA reviewer is specifically looking for it.
LMS delivery: per-platform multi-language caption workflows
Multi-language caption delivery on an LMS requires two things that are separate from the caption file itself: a correctly configured per-language caption track in the LMS, and a locale routing configuration that delivers the correct language to each learner. The way these are configured differs significantly across platforms. Eight platforms account for the majority of the global enterprise L&D market, and each has a different approach to multi-language caption delivery.
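The routing behavior to verify on any platform is the same: exact locale first, then base language, then a known default. The sketch below models that logic only; it is not the API of any LMS named here, and the function name and locale tags are illustrative.

```python
def pick_caption_track(learner_locale: str, available: list[str],
                       default: str = "en") -> str:
    """Choose a caption track: exact locale, then base language, then default."""
    normalized = {tag.lower(): tag for tag in available}
    locale = learner_locale.replace("_", "-").lower()
    if locale in normalized:
        return normalized[locale]
    base = locale.split("-")[0]
    if base in normalized:
        return normalized[base]
    return default

tracks = ["en", "fr", "de"]
print(pick_caption_track("fr-CA", tracks))  # regional locale, base-language track
print(pick_caption_track("en-US", tracks))
print(pick_caption_track("ja-JP", tracks))  # no Japanese track available
```

The function deliberately does not fall back from one regional variant to another (for example, from fr-CA to an fr-FR track), since regulatory vocabulary can differ by jurisdiction; whether that fallback is acceptable is a per-organization decision.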
Docebo
Docebo supports multi-language caption tracks through its subtitling functionality within the course module editor. For each video asset in a course, Docebo allows multiple subtitle file uploads, with a language tag applied to each. Learners can select caption language from the video player's CC menu. Docebo's locale-aware delivery — where the platform's interface language is set per learner based on their profile locale — does not automatically pre-select the caption language. The learner must select the caption language manually from the CC dropdown.
For organizations that want French-locale learners to receive French captions by default, Docebo's workaround is to create separate course versions per locale: a French-locale course with only the French caption track visible, and an English-locale course with only the English caption track visible. This approach requires more content management overhead but produces the correct default experience. The alternative — a single course with all caption tracks available and no default — technically delivers the accessibility requirement but reduces the probability that learners will select the correct track.
File format: SRT or VTT. VTT is preferred for Docebo as it supports the extended styling attributes some organizations use for speaker identification. File upload is through the Media Manager or the course content editor's subtitle tab.
Kaltura
Kaltura MediaSpace and Kaltura's LMS integrations (including the common Kaltura-Moodle and Kaltura-Canvas integrations) support multi-language caption tracks natively. Kaltura's player allows up to 99 caption tracks per video entry, with language metadata stored in the caption asset. The player's CC menu displays the available languages, and the default caption language can be configured at the player profile level to match the user's browser locale setting.
Kaltura's native caption management API supports bulk upload of multi-language caption files, making it practical for large libraries. The Kaltura Caption Bulk Upload tool accepts a CSV manifest with video IDs, language codes, and file paths, enabling teams to upload 100+ translated caption files in a single batch rather than through the UI. This is the most operationally efficient approach for an organization migrating an existing English-only caption library to a multi-language configuration.
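A manifest of this kind is easy to generate from a folder of translated files. The sketch below is an assumption-heavy illustration: the column names `video_id`, `language_code`, and `file_path` are placeholders rather than Kaltura's documented schema, and the `<video_id>.<lang>.srt` naming convention is hypothetical. Check the required columns against Kaltura's own documentation before use.

```python
import csv
import io
from pathlib import PurePosixPath

def rows_from_filenames(paths: list[str]) -> list[tuple[str, str, str]]:
    """Derive (video_id, language_code, file_path) from names like
    'captions/vid001.fr.srt' (hypothetical naming convention)."""
    rows = []
    for path in paths:
        parts = PurePosixPath(path).name.split(".")
        if len(parts) != 3:
            raise ValueError(f"unexpected file name: {path}")
        video_id, language_code, _extension = parts
        rows.append((video_id, language_code, path))
    return rows

def build_manifest(rows: list[tuple[str, str, str]]) -> str:
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Placeholder column names; not taken from Kaltura's documentation.
    writer.writerow(["video_id", "language_code", "file_path"])
    writer.writerows(rows)
    return buffer.getvalue()

paths = ["captions/vid001.fr.srt", "captions/vid001.de.srt"]
print(build_manifest(rows_from_filenames(paths)))
```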
Automatic captioning: Kaltura's automatic caption service (powered by VoiceBase or Verbit, depending on the Kaltura deployment) can produce AI captions in English. For non-English languages, organizations should upload manually produced translated caption files rather than rely on Kaltura's automatic captioning, which is optimized for English-language content. Automatic captioning accuracy in non-English languages varies significantly by language and content type.
TalentLMS
TalentLMS supports subtitle file upload at the course content level through its video component editor. Multiple subtitle files can be uploaded per video, with a language selection dropdown displayed in the video player. TalentLMS's multi-language interface, which allows the platform UI to be displayed in the learner's profile language, does not extend to pre-selecting the subtitle language; learners must select subtitles manually.
TalentLMS's branching and branch-specific content management is the mechanism some organizations use to deliver language-specific courses: a French branch receives a French course variant with only the French subtitle file configured, reducing the cognitive overhead for learners who do not need to select from a multi-language menu. For organizations already using TalentLMS's branch architecture for other organizational reasons (department-level course assignment, regional content distribution), this approach adds minimal overhead.
File format: SRT recommended. VTT is supported on most TalentLMS deployments, but SRT has higher compatibility across TalentLMS versions. Maximum file size per subtitle file: 10 MB (adequate for any standard training video caption file).
Workday Learning
Workday Learning handles multi-language caption delivery through its learning content localization framework. Workday supports locale-aware content delivery at the learning item level: an organization can create locale-specific variants of a learning item, each with the appropriate language caption track. Workday's locale routing logic then delivers the correct variant to each learner based on their Workday profile locale setting — making Workday one of the platforms that gets closest to automatic per-learner language routing without requiring learner-side caption track selection.
Caption file upload for Workday Learning is through the content management area of the learning item editor. Workday accepts SRT and VTT files. The locale variant workflow requires configuring a separate content version per language, which is additional content management overhead but produces the cleanest learner experience.
Workday's accessibility compliance posture is strong relative to other enterprise HCM platforms — it publishes a VPAT for Workday Learning that covers caption support. For organizations subject to EAA requirements, Workday's locale-routing capability is a meaningful differentiator from platforms that require learner-side language selection.
Cornerstone OnDemand
Cornerstone OnDemand supports multi-language caption delivery through its online course and vILT (virtual instructor-led training) modules. For recorded video content, Cornerstone's subtitle configuration is at the content object level. Multiple subtitle tracks can be associated with a single video, with learners selecting from the caption language menu in the player.
Cornerstone's Intelligent Automation capabilities, available in higher-tier deployments, include rules-based content routing that can be configured to deliver language-specific course variants based on learner attributes including locale, business unit, and country. Organizations using Cornerstone's content page and playlist architecture can build language-specific learning paths that route learners to the appropriate locale variant without learner-side selection. This is the pattern most appropriate for large Cornerstone deployments serving employees across multiple countries.
Panopto
Panopto's multi-language caption support operates through its caption management system, which allows multiple caption tracks per recording. Panopto's primary captioning workflow is automatic captioning (built-in) and third-party integration (for human captions via vendors like 3Play or Rev). For translated captions, organizations upload target-language SRT or VTT files through the Panopto caption management interface.
Panopto's caption track selector is available in the viewer's settings panel. Caption language defaults are configurable at the session and folder level. For organizations deploying Panopto within an LMS — the common Panopto-Canvas, Panopto-Blackboard, or Panopto-Moodle integration — caption language selection is handled within the Panopto viewer embedded in the LMS, not by the LMS caption management.
One Panopto-specific consideration for multi-language workflows: Panopto's automatic caption generation produces English output and does not natively support multi-language automatic captioning. Translated captions for Panopto content must be produced outside Panopto (through the four-stage translation pipeline) and uploaded as external caption files. This is consistent with how all translated captions should be produced regardless of platform, but it is worth raising explicitly if a Panopto sales representative describes automatic translation as a native Panopto capability.
SAP Litmos
SAP Litmos supports subtitle delivery at the module level. For video content, subtitle files are uploaded through the content editor's subtitle management section. Multiple languages can be uploaded per video, with the subtitle language selector available in the video player. Litmos's learner experience interface can be configured in multiple languages, but caption language pre-selection based on learner locale is not a standard feature — learners select caption language from the player controls.
SAP Litmos's deep integration with SAP SuccessFactors — the common enterprise HCM for large SAP-ecosystem organizations — means that learner locale and language preferences may be available from SuccessFactors as attributes that can inform content routing decisions. Organizations using the SAP SuccessFactors Learning integration may be able to build locale-aware course assignment rules that route learners to language-specific course variants, achieving automatic caption language delivery without learner-side selection. This requires configuration in SuccessFactors Learning rather than in Litmos directly.
360Learning
360Learning's collaborative learning model includes a multi-language content workflow as part of its platform design. 360Learning supports course cloning for localization: an English-language course is cloned into a French-language variant, the French variant receives the French caption track, and the two course versions are published to their respective learner populations based on group or path assignment. This approach leverages 360Learning's group-based content access control to achieve language routing.
360Learning also supports multi-language subtitle upload within a single course, with the learner able to select from available subtitle languages. The collaborative authoring tools in 360Learning can be used to coordinate the translation workflow within the platform: an L&D team member can annotate the French course clone with translation notes, assign segments to regional contributors for review, and track completion of the localization workflow in the same platform as the content is being built. This is more operationally integrated than most platforms, which treat caption localization as an out-of-band workflow.
For Absorb LMS, Moodle, and Microsoft Stream (which feeds Teams-based video delivery), multi-language caption delivery follows the same general pattern: upload translated caption files as additional tracks on each video, configure locale routing either at the content level or through course assignment rules, and verify that the correct track is delivered to the correct learner group before publishing. The specific UI paths differ; the workflow logic is consistent.
EAA and EN 301 549 compliance for multi-language content
The European Accessibility Act (Directive 2019/882) and its technical standard EN 301 549 create caption obligations that apply to the language in which content is delivered, not just to the existence of captions in any language. This is the compliance driver that makes multi-language caption operations mandatory — not optional — for organizations serving EU employees or consumers.
The EAA language obligation
The EAA requires that digital products and services meet the accessibility requirements set out in Annex I, Section I, which references EN 301 549 as the technical standard. EN 301 549 §9 (Web content) and §10 (Non-web documents) both apply WCAG 2.1 AA requirements. WCAG 1.2.2 (Captions, Prerecorded) requires synchronized captions. The "synchronized captions" requirement is language-neutral in its text — it does not specify that captions must be in any particular language. However, the broader WCAG 1.3.1 (Info and Relationships) and 3.1.2 (Language of Parts) requirements establish that the language in which content is presented must be programmatically determinable and appropriate for the content.
The practical reading of this, adopted by EU national competent authorities (NCAs) enforcing the EAA, is that captions must be in the same language as the audio content being captioned. If the audio is French, the captions must be French. English captions on a French-language training video do not satisfy the EAA's accessibility requirement for French-speaking learners, because they require the learner to be bilingual to benefit from the accommodation — which defeats the purpose of the accessibility requirement. This interpretation is consistent with the EAA's broader objective of ensuring non-discrimination on grounds of disability for persons using assistive technology or requiring accessibility features.
For L&D teams, the operational implication is:
- Any training video served to EU learners in a language other than English requires captions in the delivery language
- If the same video is served in multiple EU languages (e.g., an English original distributed to English, French, German, and Dutch learners), each language delivery requires captions in that language
- The 99% accuracy standard applies to each language's caption track independently — a French caption that was produced by MT without QA and achieves 94% accuracy does not meet the EAA standard even if the English original is at 99.4%
The accessibility statement requirement under EAA
The EU accessibility statement required under the EAA must declare the accessibility status of the digital service, including caption compliance. For organizations that provide training content in multiple languages, the accessibility statement should specify:
- The languages for which synchronized captions are provided
- The accuracy standard to which those captions have been produced (DCMP 99% protocol is appropriate)
- The date of the most recent QA review for caption accuracy in each language
- Any languages for which captions are not yet provided, with a remediation timeline
A statement that says "captions are provided for training videos" without specifying languages is inadequate for organizations serving employees in multiple languages. The NCA in each EU member state where the organization has employees has authority to investigate and require remediation of accessibility statements that do not accurately reflect the accessibility status of the service.
EN 301 549 §5: Generic requirements
Beyond the WCAG content requirements, EN 301 549 §5 includes generic performance requirements for caption timing and reading speed that apply independently of language. The relevant provisions for multi-language caption timing are:
- §5.1.3.4 (Captions speed): Captions shall be displayed at a speed that permits them to be read. The standard references a maximum reading speed of approximately 180 words per minute (approximately 17 characters per second) as the baseline for caption display rate assessment.
- §5.1.3.5 (Captions synchronisation): Captions shall be synchronized with the audio within 2 seconds. This applies per-frame in each language — a translated caption that starts 3 seconds after the corresponding audio in French does not meet this requirement even if the English caption was synchronized correctly.
Both of these requirements reinforce why timing adaptation is mandatory for text-expansion languages. A French caption frame that displays 50 characters in a frame timed for 35 characters of English text will either require the viewer to read at faster than the permissible rate, or the frame will be cut off before the next caption begins. Either outcome is an EN 301 549 failure.
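The arithmetic behind that example can be worked through in a few lines. This is a minimal sketch that assumes the English frame was timed exactly at the 17-characters-per-second threshold; the function names are illustrative only.

```python
MAX_CPS = 17.0  # reading-speed threshold used in this article


def min_duration_seconds(char_count, max_cps=MAX_CPS):
    """Shortest display time that keeps a frame at or under max_cps."""
    return char_count / max_cps


def chars_per_second(char_count, duration_seconds):
    """Reading speed a viewer needs to finish the frame in time."""
    return char_count / duration_seconds


english_chars, french_chars = 35, 50

# Frame duration if the English text was timed right at the threshold.
frame_duration = min_duration_seconds(english_chars)  # about 2.06 s

# Reading speed required if the French text reuses that timing.
french_cps = chars_per_second(french_chars, frame_duration)  # about 24.3 cps

# Extra display time the French frame needs to get back to 17 cps.
extra_time_needed = min_duration_seconds(french_chars) - frame_duration  # about 0.88 s
```

In other words, reusing the English timing pushes the French frame roughly 40% over the threshold, and the frame needs a little under one additional second of display time (or a split into two frames) to comply.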
AODA and bilingual content obligations for Ontario employers
The Accessibility for Ontarians with Disabilities Act (AODA) and its Integrated Accessibility Standards Regulation (IASR) create caption requirements for Ontario employers that interact with the French Language Services Act to produce a bilingual caption obligation for specific organizations and content types.
IASR §14: Caption requirement for synchronous video
IASR §14 (Training to Employees) and the broader IASR multimedia requirements require that recorded training content provided to employees includes captions when the content uses audio. This baseline requirement has applied to large designated public sector organizations since January 1, 2014, and to all other organizations with 50+ employees since January 1, 2020 under the amended regulation.
The AODA captions requirement in §14 does not specify a language. A strict reading of the provision would suggest that captions in any language satisfy the requirement. However, the AODA's broader objective — removing barriers to participation for persons with disabilities — requires that the caption accommodation actually be usable by the person with the disability. A Deaf employee whose primary language is French cannot use an English-only caption to access French-language training content. The accommodation fails its purpose.
French Language Services Act interaction
The French Language Services Act (FLSA) requires designated provincial government organizations and publicly funded bodies in Ontario to provide services in French. For organizations subject to FLSA, any training content produced in French to meet FLSA service delivery obligations must also be accessible — which, under AODA, means it must have captions. Since the training is in French (required by FLSA) and must be accessible (required by AODA), the captions must be in French. This is not explicitly stated as a single provision in either statute, but it is the correct interpretation of the combined obligations.
For private-sector employers in Ontario who are not subject to FLSA but who employ French-speaking workers under collective agreements or employment equity commitments, the analysis is different. There is no statutory requirement to provide training in French, but where training is voluntarily provided in French (as part of an inclusive workplace program), AODA requires that it be accessible — and the same reasoning applies: French-language training must have French captions to be accessible to Deaf or hard-of-hearing employees whose primary language is French.
Practical guidance for Ontario L&D teams
The AODA accessibility plan for an Ontario organization with bilingual training operations should specify:
- All training content produced in French that is required by FLSA or provided voluntarily must include French captions
- The QA standard for French captions is the same as for English captions: DCMP 99% protocol
- The LMS configuration for French-language courses must deliver French captions as the default for learners whose profile language is French
- The AODA multi-year accessibility plan must include a timeline for adding French captions to any French-language training content that currently lacks them
Ontario's Accessibility Directorate enforces AODA compliance through audits and public reporting requirements. Organizations with 20+ employees in Ontario must file annual or multi-year accessibility reports, and caption compliance for training content is within scope of these reports. The most common finding in Accessibility Directorate reviews for training-intensive organizations is that recorded training content lacks captions — adding French-language captions to the remediation plan shows awareness of the full obligation, not just the English-language baseline.
QA for translated captions: the adapted protocol
The DCMP spot-check protocol used for source-language captions applies to translated captions with adaptations for the error types specific to translated content. The core structure — sample selection, word count, error classification, score calculation — is the same. The error taxonomy expands to cover translation-layer errors that do not exist in source-language captions.
Who can run QA on translated captions
Target-language QA requires a native speaker of the target language with domain knowledge in the content area. This is not negotiable. A person who speaks intermediate French cannot reliably QA a French caption against the 99% accuracy threshold — they will miss errors that are natural-sounding in French but technically incorrect, particularly for regulatory terms and domain vocabulary. A native French speaker without domain knowledge in healthcare will miss clinical terminology errors. The QA reviewer must have both.
For organizations that do not have native speakers of the target language on their L&D team (the common case), target-language QA must be sourced externally: from the translation vendor, from a freelance native-speaker reviewer with domain expertise, or from an internal employee with native proficiency in the target language and familiarity with the content domain. The QA reviewer should not be the same person as the translator: the QA check is more reliable when the reviewer did not produce the translation being reviewed.
Extended error taxonomy for translated captions
The four source-language error types — substitution, insertion, deletion, formatting — remain relevant and should be counted. Add these translation-specific error types:
Term substitution (regulatory/glossary): The correct target-language term for a specific concept was not used. This is distinct from a word-level substitution in the source: the caption text is grammatically correct French, but "accommodation raisonnable" was used instead of "aménagement raisonnable," which is the approved term in the per-language glossary. Count each instance as one error. Root cause is almost always a missing or incorrect glossary entry — the translator used a different but plausible equivalent.
Literal translation failure: An idiomatic phrase in the source language was translated word-for-word, producing a grammatically correct sentence that does not mean the same thing in the target language. "Hit the ground running" translated as "frapper le sol en courant" is a literal translation failure in French — the idiomatic equivalent would be "démarrer sur les chapeaux de roues" or simply "commencer immédiatement." Count as one error per failed idiom. Root cause is MT-only or inadequately briefed human translation without style guide instructions on idiomatic adaptation.
Brand name translation: A brand name that should have been left untranslated was translated. "Cornerstone OnDemand" → "Pierre d'angle à la demande" is a brand name translation error. Count as one error per instance. Root cause is missing "do not translate" entry in the term map.
Reading-speed violation: A caption frame contains more text than can be read at the WCAG reading speed for the frame's display duration. This is a timing adaptation failure, not a translation accuracy failure per se — but it is detectable and recordable in QA. Count as one error per frame that exceeds the 17-characters-per-second threshold. Root cause is timing adaptation not performed after translation.
Score calculation for translated captions
The score is calculated the same way as for source captions: (total words − total errors) / total words × 100. All error types, source-language errors plus translation-specific errors, are included in the total error count. The 99% threshold applies.
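A minimal sketch of that calculation follows. The error-type keys and the sample counts are hypothetical values chosen for illustration, not results from a real QA review.

```python
PASS_THRESHOLD = 99.0  # percent, per the threshold used in this article


def accuracy_score(total_words, error_counts):
    """Return (total words - total errors) / total words * 100.

    error_counts maps each error type to its count; every type,
    source-language or translation-specific, counts toward the total.
    """
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    total_errors = sum(error_counts.values())
    return (total_words - total_errors) / total_words * 100


# Hypothetical first-pass review of a 600-word French sample.
sample_errors = {
    "substitution": 3,
    "deletion": 1,
    "term_substitution": 4,
    "literal_translation": 1,
    "brand_name_translation": 1,
    "reading_speed_violation": 2,
}

score = accuracy_score(600, sample_errors)  # 12 errors: about 98%
passes = score >= PASS_THRESHOLD            # False until remediation
```

Keeping the counts separated by error type, rather than recording only the total, makes it easier to see which remediation step (glossary fixes, idiom adaptation, re-timing) will recover the most points.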
A translated caption file should be expected to have a lower raw accuracy score on its first QA review than the source-language caption file — translation introduces additional error opportunities that the source caption QA does not generate. A first-pass translated file at 97–98% is typical; remediation (correcting term substitutions and glossary errors, fixing literal translation failures, re-timing affected frames) should bring it to 99%+ before publication.
Reading-speed verification protocol
For text-expansion languages (French, Spanish, German, Portuguese), include a reading-speed check as a separate verification step after translation and timing adaptation. The check proceeds as follows:
- Extract the translated caption file's frame list (sequence number, start time, end time, text)
- For each frame, calculate characters per second: (character count of caption text) / (end time − start time in seconds)
- Flag any frame where characters per second exceeds 17 (approximately 160–180 words per minute for average word length in European languages)
- Re-time flagged frames by splitting them or extending the display duration (if the audio permits) to bring the reading speed below threshold
- Document the number of frames corrected in the QA record
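The detection steps above can be sketched as a small script. This is an illustration under stated assumptions rather than a production tool: it assumes a well-formed SRT file, counts spaces and punctuation as characters, and uses a hypothetical two-frame sample.

```python
import re

TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp):
    """Convert an SRT timestamp such as '00:00:03,500' to seconds."""
    h, m, s, ms = map(int, TIME_RE.fullmatch(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(text):
    """Extract the frame list: sequence number, start, end, text."""
    frames = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip malformed blocks in this sketch
        start, end = lines[1].split("-->")
        frames.append({
            "seq": int(lines[0]),
            "start": to_seconds(start),
            "end": to_seconds(end),
            "text": " ".join(line.strip() for line in lines[2:]),
        })
    return frames


def flag_fast_frames(frames, max_cps=17.0):
    """Return (seq, cps) for every frame that exceeds max_cps."""
    flagged = []
    for frame in frames:
        duration = frame["end"] - frame["start"]
        rate = len(frame["text"]) / duration if duration > 0 else float("inf")
        if rate > max_cps:
            flagged.append((frame["seq"], round(rate, 1)))
    return flagged


SAMPLE_SRT = """1
00:00:01,000 --> 00:00:03,000
Bienvenue à la formation.

2
00:00:03,500 --> 00:00:05,000
Veuillez lire la directive de confidentialité.
"""

flagged = flag_fast_frames(parse_srt(SAMPLE_SRT))
```

In this sample only the second frame is flagged: it carries a full sentence in a 1.5-second window. The length of the flagged list before and after re-timing is the figure to record in the QA record.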
For German specifically, compound nouns create a systematic reading-speed risk. A German compound like "Barrierefreiheitserklärung" (accessibility statement) counts as one word but contains 26 characters, so the character-per-second check will flag frames containing long compound nouns even when the word count per frame is within the guideline. Re-timing for German content typically requires more manual intervention than for French or Spanish because line breaks must fall at grammatical boundaries, and any break inside a compound noun must fall at a morpheme boundary, not at an arbitrary character position.
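The gap between the two measures is easy to demonstrate. The sketch below compares words per minute with characters per second for one German frame; the sentence and the 2.5-second duration are made up for illustration.

```python
import unicodedata


def reading_rates(text, duration_seconds):
    """Return (words per minute, characters per second) for one frame."""
    # Normalize so that 'ä' counts as one character however it was encoded.
    text = unicodedata.normalize("NFC", text)
    words = len(text.split())
    wpm = words / duration_seconds * 60
    cps = len(text) / duration_seconds
    return wpm, cps


wpm, cps = reading_rates(
    "Die Barrierefreiheitserklärung wird jährlich aktualisiert.", 2.5
)

within_word_guideline = wpm <= 180  # five words in 2.5 s looks comfortable
within_char_guideline = cps <= 17   # the character count says otherwise
```

A frame like this passes a words-per-minute check and fails the characters-per-second check, which is why the character-based measure is the one to use for German tracks.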
Eight failure modes in multi-language caption operations
These are the eight failures we see most consistently in organizations that have English-language caption operations running correctly but have not yet built the multi-language extension. Each one is fixable, but each one produces a distinct category of non-compliance or operational damage.
Failure mode 1: Translating before source QA is complete
The most common and most expensive failure mode. Timeline pressure leads to starting translation on an unreviewed source file. When source QA finds failures — which it reliably does for any content with significant proper-noun density — the translated file must be re-worked. The cost of re-translation is higher than the cost of delaying translation by the time required for source QA to complete. The fix: establish source caption lock as a formal workflow gate. No translation PO is issued until the source version has a v1.0-source-locked version tag.
Failure mode 2: Using MT-only for compliance or healthcare content
The second most costly failure mode. MT produces plausible regulatory terminology most of the time — plausible enough to pass a casual review. It fails specifically on the terms that matter most for compliance: jurisdiction-specific regulatory vocabulary, terms with different meanings in common language versus regulatory language, and terms that have multiple plausible translations depending on regulatory context. "Reasonable accommodation" in an EAA training video must be "aménagement raisonnable" in French (France) — not "logement raisonnable" (which means "reasonable housing"), not "hébergement raisonnable," not "adaptation raisonnable" (which is the Quebec term). An MT model may produce any of these. A human translator with EAA domain knowledge will produce the correct term. The fix: apply the method selection framework rigorously and use HT for any content that will be cited as compliance evidence in the target-language jurisdiction.
Failure mode 3: Skipping timing adaptation
The translated SRT file goes from translator to LMS with no timing review. For French and German content, the result is frames that exceed the WCAG reading-speed guideline, creating an accessibility failure in the very accommodation intended to serve accessibility. This is a testable failure — any reading-speed verification tool will flag it — but it is often not tested because the team assumes the translated file can be treated identically to the source file. The fix: timing adaptation is a mandatory step in the pipeline for any text-expansion language, completed before LMS upload.
Failure mode 4: Inconsistent brand name handling
Some brand names in the translated captions are left in English (correct); others are translated by meaning (incorrect); still others are transliterated phonetically in languages with non-Latin scripts (sometimes correct, sometimes not). The inconsistency comes from having no term map with explicit "do not translate" entries for brand names. When a single long-form training module contains 15 references to "Cornerstone OnDemand" and 8 are translated and 7 are not, the resulting caption file looks like a quality failure to any reviewer — even though the individual errors are small. The fix: brand names go into the term map as "do not translate" entries before production begins.
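One inexpensive guard, sketched below, is to count how often each "do not translate" name appears in the source captions and in the translated captions and to flag any difference. It assumes plain-text caption content and exact, case-sensitive matching; the function name and sample sentences are hypothetical.

```python
def brand_name_mismatches(source_text, translated_text, do_not_translate):
    """Return {name: (source_count, target_count)} where the counts differ.

    A lower target count usually means the name was translated or
    altered somewhere in the target-language file.
    """
    mismatches = {}
    for name in do_not_translate:
        source_count = source_text.count(name)
        target_count = translated_text.count(name)
        if source_count != target_count:
            mismatches[name] = (source_count, target_count)
    return mismatches


source = "Open Cornerstone OnDemand. Cornerstone OnDemand lists your courses."
target = "Ouvrez Cornerstone OnDemand. Pierre d'angle à la demande affiche vos cours."

issues = brand_name_mismatches(source, target, ["Cornerstone OnDemand", "Workday"])
```

The check does not say where the name was lost, only that it was. That is enough to send the file back for review before a native-speaker QA pass is spent on it.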
Failure mode 5: No per-language glossary for regulatory terms
The source English glossary is built and maintained carefully — it drives the glossary-biased decoding that produces accurate English captions. But no equivalent per-language term map exists for the translated content. The translator or MT model makes individual term decisions for each segment, producing inconsistent terminology across the library. The inconsistency is not caught in QA because the QA reviewer is checking accuracy (word-for-word against audio) rather than term consistency (canonical form against term map). The fix: build the per-language term map before any translation begins, and include a term consistency check as a separate QA step.
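A term consistency check can be sketched as a scan for known non-approved variants. The term map structure below is an assumption for illustration; the French terms are the ones discussed in this article. A real term map would carry many more entries and would be maintained by the translation lead.

```python
# Hypothetical per-language term map: approved form plus known bad variants.
TERM_MAP_FR = {
    "reasonable accommodation": {
        "approved": "aménagement raisonnable",
        "rejected": [
            "logement raisonnable",
            "hébergement raisonnable",
            "accommodation raisonnable",
        ],
    },
}


def term_consistency_issues(frames, term_map):
    """Return (seq, found_variant, approved_term) for each violation.

    frames is a list of (sequence_number, caption_text) pairs.
    """
    issues = []
    for seq, text in frames:
        lowered = text.lower()
        for entry in term_map.values():
            for variant in entry["rejected"]:
                if variant in lowered:
                    issues.append((seq, variant, entry["approved"]))
    return issues


frames_fr = [
    (1, "Vous pouvez demander un aménagement raisonnable."),
    (2, "Le logement raisonnable est un droit."),
]

found = term_consistency_issues(frames_fr, TERM_MAP_FR)
```

This only catches variants someone has already listed as rejected, so it complements native-speaker QA and does not replace it. New variants found in review should be added to the rejected list.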
Failure mode 6: Uploading translated captions to the wrong LMS track
The French SRT file is uploaded, but its language metadata tag is set to "English" — either because the LMS upload interface defaulted to English and the person uploading did not change it, or because the file was uploaded to the existing English caption track rather than a new French track. The result: French learners see French captions in the player's "English" CC option, which is confusing, and the English-locale default shows the French caption to English learners. This is a pure operational error, not a translation or accuracy error, but it is extremely common. The fix: caption track language metadata must be part of the QA checklist. Before publishing, verify in the LMS player that each language's caption track is labeled correctly and routes to the correct learner group.
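Part of this check can be automated if caption files follow a naming convention. The sketch below assumes hypothetical file names of the form <title>.<lang>.<ext> and a list of track records exported from the LMS; the field names file_name and language_tag are illustrative and do not correspond to any specific LMS export.

```python
def mislabeled_tracks(tracks):
    """Return (file_name, file_lang, tag) where tag and file name disagree.

    Files without a language code in the name are reported with
    file_lang set to None so they can be checked by hand.
    """
    problems = []
    for track in tracks:
        parts = track["file_name"].split(".")
        file_lang = parts[-2].lower() if len(parts) >= 3 else None
        tag = track["language_tag"].lower()
        if file_lang is None or file_lang != tag:
            problems.append((track["file_name"], file_lang, tag))
    return problems


tracks = [
    {"file_name": "privacy-101.en.srt", "language_tag": "en"},
    {"file_name": "privacy-101.fr.srt", "language_tag": "en"},  # wrong tag
]

problems = mislabeled_tracks(tracks)
```

A metadata comparison like this confirms the label only. The player-level verification described above is still needed to confirm that the track routes to the correct learner group.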
Failure mode 7: No native-speaker QA for translated captions
The translated file is reviewed by the English-speaking L&D team member who commissioned the translation. They verify that the translated file "looks right" — consistent length, proper format, no obviously broken segments. They cannot verify whether the French regulatory terms are correct, whether the idiomatic phrases were adapted appropriately, or whether any term substitution errors are present. The file is published with uncorrected translation-layer errors. The fix: target-language QA must be performed by a native speaker with domain knowledge. This is a workflow and resourcing problem, not a technical problem.
Failure mode 8: Treating multi-language caption delivery as a one-time project rather than an ongoing workflow
The initial multi-language caption project is completed: 200 videos are translated into French and German, uploaded to the LMS, and QA'd. The project is closed. Six months later, 40 new English-language videos have been produced and uploaded to the LMS. None of them have French or German captions, because the multi-language caption workflow was not integrated into the standard content production pipeline. The organization is now non-compliant for the new content, and the gap will grow every time new content is produced without triggering the translation pipeline.
The fix: multi-language captioning must be part of the standard caption ingestion workflow, not a separate project. Every video that enters the English caption pipeline should automatically trigger a translation ticket for each target language. The term map must be maintained as a living document, updated whenever new vocabulary enters the English glossary. The LMS caption audit must cover all languages, not just English. This is the operational maturity level that separates organizations that maintain compliance from organizations that achieved compliance once and then drifted.
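The recurring audit can be as simple as comparing each video's available caption languages against the list of target languages. The sketch below assumes a hypothetical catalog export that maps video IDs to the set of languages with a published caption track; the IDs and language codes are invented.

```python
def missing_translations(catalog, target_languages):
    """Return {video_id: [missing language codes]} for incomplete videos.

    catalog maps each video ID to the set of languages that already
    have a published caption track.
    """
    gaps = {}
    for video_id, languages in catalog.items():
        missing = sorted(set(target_languages) - set(languages))
        if missing:
            gaps[video_id] = missing
    return gaps


catalog = {
    "vid-001": {"en", "fr", "de"},  # part of the original project
    "vid-201": {"en"},              # new video, English only
    "vid-202": {"en", "fr"},        # German still outstanding
}

gaps = missing_translations(catalog, ["fr", "de"])
```

Each entry in the result corresponds to one translation ticket that should already exist. Running the comparison on a schedule turns the drift described above into a visible backlog instead of a surprise during an audit.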
FAQ
Does EAA require captions in every language my EU employees speak?
The EAA requires that training content served in a particular language is accessible to learners who need accessibility features for that content. In practice, this means captions are required in every language in which the audio content is produced and distributed. If you distribute a French-language training video to employees in France, Belgium, or Luxembourg, that video requires French-language captions meeting WCAG 2.1 AA accuracy. You do not need captions in languages you do not use for training content. The obligation tracks the delivery language, not the employee's language preferences — if all your EU employees are served English-language training (because your organization's working language is English), you need English captions, not French or German captions. The obligation expands when you serve content in additional languages.
Can I use DeepL to translate my English captions to French for AODA compliance?
For soft-skills and process training content, DeepL-translated captions that have been timing-adapted and QA'd by a native French speaker can meet the AODA caption requirement. DeepL's French output quality is high for common-language content. For compliance training, legal training, or any content that uses regulatory terminology specific to Ontario or Quebec law, DeepL alone is not sufficient — you need human post-editing by a translator with knowledge of the specific regulatory vocabulary in the target jurisdiction. The critical distinction is whether the translation will be cited as compliance evidence. If yes, the translation method must be defensible under a legal standard — which means human involvement at the regulatory terminology level, not just MT output.
How do I handle a product name that looks meaningless or incorrect when literally translated?
Brand names and product names that appear in your training content should be left untranslated as a default rule, with "do not translate" entries in the per-language term map. "GlossCap" is "GlossCap" in French, German, Spanish, and Japanese — the brand name is not translated. Platform names like "TalentLMS" or "Workday" are left as-is. The exception is a product name that consists of common vocabulary words and is registered in the target language under a different trade name — in this case, the target-language trade name is the correct form, and it goes into the term map as the approved translation. When in doubt: leave brand names untranslated and note "do not translate" explicitly in the term map so that no translator makes a different decision on the next piece of content.
What reading-speed adjustment do I need for French, Spanish, and German caption tracks?
The adjustment is not a fixed multiplier applied to all frames. It is a per-frame check that identifies frames where the translated text exceeds your reading-speed threshold (approximately 17 characters per second is a common subtitling limit) and re-times those frames specifically. The general expectation is that approximately 20–35% of frames in a French translation will require re-timing (the higher percentage appears in segments with dense technical vocabulary, where French text expansion is greatest). For German, expect 30–45% of frames to require re-timing due to higher text expansion and compound noun effects. For Spanish and Portuguese, 15–25% of frames typically require adjustment. The practical approach is to use a subtitle editing tool that can calculate characters-per-second for each frame and flag overruns automatically, then review and re-time flagged frames rather than re-timing the entire file manually.
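The per-frame check described above can be sketched in a few lines. This is a minimal example that assumes well-formed SRT input and the 17 characters-per-second threshold from the answer; the `Cue`, `parse_srt`, and `flag_overruns` names are hypothetical, and a subtitle editor's own counting rules (for example, how it treats line breaks or punctuation) may differ.

```python
import re
from dataclasses import dataclass

MAX_CPS = 17.0  # assumed threshold, taken from the text above

_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


@dataclass
class Cue:
    index: int
    start: float  # seconds
    end: float    # seconds
    text: str


def _seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as 00:01:02,500 to seconds."""
    h, m, s, ms = map(int, _TIME.fullmatch(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(srt: str) -> list[Cue]:
    """Parse SRT text into cues; blocks with fewer than three lines are skipped."""
    cues = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue
        start, end = lines[1].split("-->")
        cues.append(Cue(int(lines[0]), _seconds(start), _seconds(end), " ".join(lines[2:])))
    return cues


def chars_per_second(cue: Cue) -> float:
    duration = cue.end - cue.start
    return len(cue.text) / duration if duration > 0 else float("inf")


def flag_overruns(cues: list[Cue], max_cps: float = MAX_CPS) -> list[Cue]:
    """Return only the cues whose reading speed exceeds the threshold."""
    return [c for c in cues if chars_per_second(c) > max_cps]
```

The flagged list is the review queue: only those frames need re-timing or condensing, and the rest of the file is left untouched.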
Does my Docebo multi-language caption setup need to be separately audited per language?
Yes. The LMS caption audit methodology applies to each language independently. The audit checks that captions exist, that they are correctly configured in the LMS (right language tag, correct track assignment, functional locale routing), that a QA record exists documenting accuracy at the 99% threshold, and that the audit trail is complete for compliance documentation. A caption library that has been fully audited for English but has never been audited for French is half-audited — the French portion of the compliance obligation has not been verified. For organizations under EAA, the accessibility statement must reflect the audit status for each language for which the statement claims caption compliance.
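One way to keep the per-language audit honest is to record each check separately for each language. The sketch below assumes a simple record with one field per check named in the answer; the `LanguageAudit` structure and its field names are hypothetical and not tied to Docebo or any other LMS API.

```python
from dataclasses import dataclass
from typing import Optional

ACCURACY_THRESHOLD = 0.99  # the 99% QA threshold mentioned above


@dataclass
class LanguageAudit:
    language: str
    captions_exist: bool
    language_tag_correct: bool
    track_assigned: bool
    locale_routing_ok: bool
    qa_accuracy: Optional[float]  # None when no QA record exists
    audit_trail_complete: bool


def failed_checks(audit: LanguageAudit) -> list[str]:
    """List the names of the checks this language has not passed."""
    failures = []
    for name in ("captions_exist", "language_tag_correct", "track_assigned",
                 "locale_routing_ok", "audit_trail_complete"):
        if not getattr(audit, name):
            failures.append(name)
    if audit.qa_accuracy is None:
        failures.append("qa_record_missing")
    elif audit.qa_accuracy < ACCURACY_THRESHOLD:
        failures.append("qa_accuracy_below_threshold")
    return failures


def audit_status(audits: list[LanguageAudit]) -> dict[str, list[str]]:
    """Map each language to its failed checks; an empty list means it passed."""
    return {a.language: failed_checks(a) for a in audits}
```

A library that passes for English and returns failures for French shows up directly as a non-empty list under the French key, which is the status an accessibility statement would need to reflect.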
My LMS doesn't support multiple caption tracks per video. What do I do?
If your LMS supports only a single caption track per video, the standard workaround is to create language-specific course variants: a French-locale course that contains the same video with the French caption track, and an English-locale course with the English caption track. The course versions are then assigned to learners based on their locale setting or group membership. This requires more content management overhead, because you are maintaining one course record per language for every video, but it is the only approach that delivers the correct caption track to each learner without requiring the learner to select manually. Before building this architecture, verify with your LMS vendor whether a multi-track caption capability is on the roadmap and whether a platform upgrade or configuration change might enable it without the course-duplication workaround.
We have 200 English-language training videos. What is the right order to prioritize for multilingual captioning?
Prioritize by compliance exposure first, then by audience size, then by content change frequency. Compliance-first means: identify which videos are required to be accessible under EAA or AODA, and which of those currently lack captions in the required target language. These are your first wave. Any video used in a regulatory compliance training program — EAA training, GDPR training, OSHA-equivalent training, Section 508 training — that is delivered to EU or Ontario employees in a non-English language is a priority. Audience size second: for the remaining library, prioritize videos with the highest viewership in the target-language locale. A French-language onboarding video watched by 200 new employees per year has higher remediation value than a French-language archive lecture watched by 12 people. Content change frequency last: if a video is likely to be re-produced or updated in the next 12 months, you may choose to hold translation until the English version stabilizes — translating content that will be replaced shortly is low-ROI. The procurement and vendor selection framework applies to sourcing translation vendors for a large library at scale: the same scoring dimensions (per-hour cost, domain expertise, QA methodology, turnaround time, glossary integration) are relevant for translation vendor selection as for caption vendor selection.
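The ordering above can be expressed as a sort key. This is a sketch of one literal reading of "compliance first, then audience size, then change frequency"; the `Video` fields are hypothetical inventory columns, and a real backlog would likely need judgment calls that a sort cannot capture.

```python
from dataclasses import dataclass


@dataclass
class Video:
    title: str
    compliance_required: bool        # must be accessible under EAA or AODA
    missing_target_captions: bool    # no captions yet in the required language
    annual_viewers: int              # viewership in the target-language locale
    update_expected_12_months: bool  # likely to be re-produced or updated soon


def priority_key(video: Video) -> tuple[int, int, bool]:
    first_wave = video.compliance_required and video.missing_target_captions
    return (
        0 if first_wave else 1,           # compliance exposure first
        -video.annual_viewers,            # then larger audiences
        video.update_expected_12_months,  # stable content ahead of content about to change
    )


def prioritize(videos: list[Video]) -> list[Video]:
    return sorted(videos, key=priority_key)
```

With this key, a compliance video lacking target-language captions always sorts ahead of a higher-viewership onboarding video, and content due for replacement only drops behind otherwise equal entries.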
Build a multi-language caption operation that stays compliant
GlossCap's glossary-biased captioning engine produces accurate source-language captions that serve as the locked foundation for your multi-language translation pipeline. The glossary your team builds for English proper nouns — product names, regulatory terms, technical vocabulary — can be extended into a per-language term map for translation, giving every translated caption track the same vocabulary consistency as the source. View pricing or try the embed widget to see how glossary-anchored captioning works before you commit to a vendor change.
For organizations already managing a multi-language caption operation, GlossCap's caption format exports (SRT, VTT, TTML) and LMS integration hooks are compatible with all eight platforms described in this post. The Rev vs GlossCap, 3Play vs GlossCap, and Verbit vs GlossCap comparison pages cover how GlossCap's per-customer glossary model differs from these vendors' approaches — the glossary architecture is the operational difference that matters most for multi-language compliance accuracy. The pricing breakdown shows how GlossCap's cost compares at the volume tiers typical for mid-market L&D operations running multi-language content programs.