EHS Operations · Published 2026-06-27
Captioning manufacturing safety and equipment training video: LOTO procedures, PPE training, confined space entry, and why OSHA vocabulary fails general-purpose ASR
There is a category of training video where caption inaccuracy is not a compliance problem — it is a safety problem. Lockout/tagout procedure demonstrations, confined space entry training, and PPE inspection videos are watched by workers who sometimes consume training via captions because the shop floor is too loud to hear audio, because they are working in a second language, or because they have a hearing disability that triggers an ADA accommodation obligation. When a caption reads “lotto” instead of “LOTO,” the worker may not connect the caption to the lockout/tagout standard they are trained on. When a caption reads “the energize” instead of “de-energize,” the safety instruction is inverted. When a caption reads “idyll” instead of “IDLH,” the atmospheric condition that triggers confined space evacuation disappears from the training record. These are not hypothetical failures. They are the exact substitutions general-purpose ASR makes when it encounters OSHA safety vocabulary — vocabulary that appears in manufacturing and EHS training video all day, every day, but rarely in the podcast episodes, YouTube videos, and news broadcasts that ASR models are trained on. This post covers why manufacturing safety vocabulary fails ASR at this rate, the specific failure profiles for lockout/tagout, confined space entry, PPE, and construction safety, the OSHA and ADA compliance obligations that apply to captioned safety training, and the glossary construction method that closes vocabulary gaps before a single training module is delivered.
TL;DR
- Manufacturing safety training video has a vocabulary failure rate 2–3× higher than general L&D content, and the consequences of failure are higher. LOTO procedure video, confined space entry training, and PPE inspection modules contain OSHA-defined terms, equipment abbreviations, and regulatory citation formats that do not appear in ASR training corpora at meaningful frequency. The LMS native auto-caption accuracy comparison shows general content at 89–93%; OSHA safety procedure content runs 73–82% on the same systems without domain glossary configuration.
- The vocabulary failures cluster by content type. LOTO failures centre on acronym-as-word substitution (LOTO→lotto) and prefix stripping (de-energize→the energize, de-energisation→the energisation). Confined space failures centre on acronym recognition (IDLH→idyll, PRCS→various) and atmospheric condition vocabulary. PPE failures centre on equipment designation acronyms (SRL, PFAS, PAPR, SCBA, APF) and protective standard citations (ANSI Z87.1, NRR). Fall protection failures centre on PFAS acronym collision with chemical terminology and multi-word compound equipment names.
- OSHA and ADA both independently require accurate captioned access to safety training. ADA Title I (organisations with 15 or more employees) requires accessible training materials for employees with disabilities. The OSH Act General Duty Clause (Section 5(a)(1)) creates an additional obligation: if video is the sole delivery mechanism for a safety procedure, inaccurate captions on that video are a General Duty Clause exposure. OSHA 29 CFR 1910.132(f)(1) requires training for each employee who is required to use PPE; employees who can only access training via captions must receive accurate captions or the training obligation is not met.
- Domain glossary coverage must be built before delivery, not after errors are reported. The correction labour cost for safety training video is higher per minute than for general L&D content because every term requires a subject matter expert to verify, not just a language-competent reviewer. Building glossary coverage from OSHA CFR text and equipment manufacturer vocabulary before the first module is captioned prevents the post-delivery correction cycle. For the HazCom vocabulary failure profile (chemical names, SDS section vocabulary), see the HazCom captioning post; this post covers the equipment-procedure vocabulary that is distinct from chemical naming.
- Third-party safety training libraries (SafeStart, Convergence Training, J. J. Keller, Vivid Learning) carry the same vocabulary accuracy obligation as internally produced content, but the correction workflow is structurally harder. When an employee accesses a purchased LOTO module with 78% accuracy captions, the vocabulary failures are present regardless of who produced the content. ADA Title I and the General Duty Clause apply to the training experience, not the production source. See the third-party compliance training captioning post for the procurement approach.
Why OSHA safety vocabulary fails general-purpose ASR
General-purpose ASR models — the engine behind every LMS native auto-caption feature, most caption vendor platforms at the lower price tier, and the default Whisper configuration without domain adaptation — are trained on speech corpora that reflect how English is spoken in contexts where audio is available: podcasts, YouTube, news, lecture captures, court transcripts, film dialogue. These corpora are large (hundreds of thousands to millions of hours) but they reflect a specific distribution of vocabulary.
OSHA safety procedure vocabulary does not appear in that distribution at meaningful frequency. The acronyms are defined by OSHA rulemaking and equipment standards bodies. The compound terms follow a regulatory drafting convention that prioritises precision over natural speech cadence. The equipment designations are manufacturer-assigned and standard-body-assigned rather than organic language. When an ASR model encounters a token sequence it has never reliably trained on, it does what any sequence model does: it substitutes the statistically most likely completion given the surrounding context. In safety training video, the surrounding context is often other specialised vocabulary — which means the model has no high-confidence anchor to pull from. The result is a cascade of substitutions that individually look like typos but collectively misrepresent the safety procedure.
The three vocabulary failure mechanisms
Acronym-as-word substitution. OSHA safety and equipment vocabulary is dense with acronyms. When the training corpus does not contain enough examples of an acronym to establish reliable token-to-spelling mapping, the model attempts to render the spoken acronym as a pronounceable word. LOTO (spoken as a word, “LOH-toe”) becomes “lotto.” IDLH (spoken as letters, “eye-dee-el-aitch”) becomes “idlh” (all lowercase, unformatted) or occasionally “idyll” when the model finds a plausible word match. PAPR (spoken as “PAY-per”) becomes “paper.” SCBA (spoken as “ess-see-bee-ay”) becomes the model’s best approximation, which sometimes lands on “scuba” — a plausible English word with similar phonetics that is categorically wrong in the industrial respiratory protection context.
Prefix stripping. The prefix “de-” in safety procedure language signals a reversal of a state: de-energise, de-energisation, de-pressurisation, de-isolate. In natural English speech, the prefix is not always phonetically distinct — it blends with the following vowel. When the model’s training data contains more occurrences of the unprefixed word than the prefixed version, the model produces the base word. “De-energize the equipment before beginning maintenance” becomes “the energize the equipment before beginning maintenance” — a sentence that is grammatically wrong and that inverts the safety instruction from its intended meaning.
Defined-term reformulation. OSHA rulemaking uses precise multi-word defined terms: “permit-required confined space,” “authorized employee,” “affected employee,” “energy isolating device,” “zero-energy state.” These terms have specific legal definitions in the CFR. When ASR reformulates them — “effective employee” for “affected employee,” “energy isolation device” for “energy isolating device,” “permit required” (without hyphen) for “permit-required” — the result may still be legible to a lay reader but is incorrect relative to the regulatory standard. Employees trained on these definitions, or EHS managers reviewing compliance documentation, will notice the reformulation as an accuracy gap even if the general meaning is preserved.
Taken together, these three failure mechanisms leave caption accuracy on a typical LOTO procedure video at the 73–82% level cited above, which is 17–26 percentage points below the 99% accuracy floor commonly applied to WCAG 2.1 AA captions. Without domain glossary configuration, a LOTO module captioned on any standard ASR system will fail that accuracy standard before a single correction is applied.
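To make the accuracy gap concrete, here is a minimal sketch of how word-level accuracy can be scored against a 99% floor. It assumes accuracy is defined as 1 minus word error rate (word-level edit distance divided by reference length), which is one common convention rather than a definition taken from WCAG or OSHA. The function names, the lowercase whitespace tokenisation, and the `ACCURACY_FLOOR` constant are illustrative.

```python
def word_edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Minimum substitutions, insertions and deletions that turn ref into hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """1 - WER, using lowercase whitespace tokens (an assumed tokenisation)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    return 1.0 - word_edit_distance(ref, hyp) / len(ref)


ACCURACY_FLOOR = 0.99  # assumed pass mark for this sketch

reference = "De-energize the equipment before beginning maintenance"
asr_output = "the energize the equipment before beginning maintenance"

accuracy = word_accuracy(reference, asr_output)
passes = accuracy >= ACCURACY_FLOOR
print(f"accuracy={accuracy:.3f} passes={passes}")  # accuracy=0.667 passes=False
```

A single stripped prefix costs two word edits here (one substitution, one insertion), which is why a one-sentence sample scores far lower than a full module would. A module-level figure would come from running the same calculation over the complete reference transcript.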
LOTO (lockout/tagout) — OSHA 29 CFR 1910.147
OSHA 29 CFR 1910.147, “The Control of Hazardous Energy (Lockout/Tagout),” requires employers to establish energy control procedures for machinery and equipment. Initial training is required for both authorized employees (those who perform lockout/tagout) and affected employees (those whose work is affected by the energy control procedure). Refresher training is required when procedures change or an employee demonstrates inadequate knowledge. All of this training is commonly delivered by video.
LOTO procedure video is among the most vocabulary-dense training content types in manufacturing EHS because the procedure itself is sequential and precise: each step has a defined name, each equipment item has a defined term, and the distinction between “tagout” (a warning device only) and “lockout” (a physical lock) carries regulatory weight. Captions on this content are not background text; they are the record of the training.
LOTO vocabulary failure table
| OSHA-defined term or acronym | Common ASR substitution | Accuracy impact |
|---|---|---|
| LOTO (lockout/tagout) | “lotto” | Critical: acronym-as-word substitution; worker may not recognise connection to the 1910.147 standard |
| de-energize | “the energize” | Critical: prefix stripped; safety instruction inverted |
| de-energization | “the energization” or “de energization” | High: prefix stripped or hyphen dropped; same inversion risk |
| zero-energy state | “zeroing state” or “zero energy state” | High: “zeroing” implies a process in progress, not a confirmed completed condition |
| authorized employee | “authorised employee” (UK spelling) | Medium: spelling substitution; not wrong in meaning but inconsistent with CFR text |
| affected employee | “effective employee” | High: word substitution that produces a different meaning; “effective employee” is not an OSHA defined term |
| energy isolating device | “energy isolation device” | Medium: word form change; EHS professionals will notice the deviation from 1910.147 language |
| lockout device | “lock out device” | Medium: space insertion changes the compound noun to a verb phrase |
| tagout device | “tag out device” | Medium: same space insertion failure as lockout |
| specific energy control procedure | “specific energy controlled procedure” | Medium: tense substitution changes the noun-modifier relationship |
| energy control program | usually correct | Low: common enough in training contexts for ASR to handle; verify anyway |
| re-energize | “re energize” or “reenergize” | Low: hyphen formatting inconsistency; meaning preserved |
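The substitutions in the table are regular enough to screen for mechanically before human review. The following is a minimal sketch, assuming a hand-maintained map from suspect ASR output to the term it most likely replaced. The `SUSPECT_PHRASES` map, the `flag_suspects` helper, and the sample captions are illustrative. It flags lines for a reviewer instead of auto-correcting them, because a subject matter expert still has to confirm each hit.

```python
import re

# Suspect ASR output -> OSHA term it most likely replaced (taken from the table above).
SUSPECT_PHRASES = {
    "lotto": "LOTO",
    "the energize": "de-energize",
    "the energization": "de-energization",
    "zeroing state": "zero-energy state",
    "effective employee": "affected employee",
    "energy isolation device": "energy isolating device",
    "lock out device": "lockout device",
    "tag out device": "tagout device",
}


def flag_suspects(caption_lines):
    """Return (line_number, suspect_phrase, likely_term) for every hit."""
    hits = []
    for number, line in enumerate(caption_lines, start=1):
        for phrase, likely in SUSPECT_PHRASES.items():
            # Whole-phrase, case-insensitive match so "Lotto" is caught as well.
            if re.search(rf"\b{re.escape(phrase)}\b", line, flags=re.IGNORECASE):
                hits.append((number, phrase, likely))
    return hits


captions = [
    "Welcome to the lotto refresher for maintenance staff.",
    "First, the energize the equipment before applying the lockout device.",
    "Notify every effective employee in the area.",
]

hits = flag_suspects(captions)
for hit in hits:
    print(hit)
```

A screen like this only catches substitutions that are already known. It complements glossary configuration and does not replace it.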
LOTO video types and their accuracy risk
Not all LOTO training video carries equal accuracy risk. Procedure demonstration video — a video showing an authorized employee performing the LOTO procedure step-by-step — carries the highest risk because the caption is the instruction. A worker following captions through a LOTO procedure demonstration who encounters “the energize the equipment before applying the lockout device” has received the opposite safety instruction from what was intended.
Regulatory overview video (a lecture-format explanation of why LOTO matters, the regulatory history, or the distinction between lockout and tagout) carries lower risk because the vocabulary density is lower and a missed term has less severe consequences: the viewer has more context to detect an error. That lower rating applies to error impact assessment only; the QA methodology should not treat regulatory overview video as lower priority during production.
Competency verification video — a quiz-format or scenario-based module where the learner must identify the correct LOTO step — carries the same high risk as demonstration video, because the answer choices or scenario descriptions may include the critical vocabulary. A wrong caption on an answer choice changes the correct answer.
Glossary minimum for LOTO content
A minimum viable LOTO glossary for a single manufacturing facility should contain: (1) all OSHA 1910.147 defined terms verbatim from the CFR text (12–15 terms), (2) all equipment manufacturer names for energy isolating devices in use at that facility (3–20 terms depending on equipment variety), (3) all internal energy control procedure names if they have facility-specific titles, and (4) the acronym LOTO and its expansion. This is a 25–50 term base glossary. Facilities with multiple equipment types (hydraulic, pneumatic, chemical, thermal, gravitational energy sources) should add 5–10 terms per energy source type.
Confined space entry — OSHA 29 CFR 1910.146
OSHA 29 CFR 1910.146 governs permit-required confined spaces in general industry. Training is required for all workers who enter permit spaces, attendants who monitor entries, and entry supervisors who authorise and coordinate them. The training must address the hazards specific to each permit space, the conditions requiring evacuation, and the roles and responsibilities under the permit system. Atmospheric testing training — how to use a multi-gas monitor, how to interpret readings, what the IDLH thresholds are for each gas in the space — is both required and safety-critical.
Confined space entry training video has a particularly challenging vocabulary profile because it combines OSHA regulatory defined terms, atmospheric chemistry vocabulary, equipment model names, and emergency procedure terminology. The atmospheric chemistry vocabulary (oxygen-deficient atmosphere, IDLH, LEL percentage, H₂S concentration) is almost entirely absent from general ASR training corpora.
Confined space vocabulary failure table
| Term or acronym | Common ASR substitution | Accuracy impact |
|---|---|---|
| IDLH (immediately dangerous to life or health) | “idlh” (unformatted) or “idyll” or letter-by-letter | Critical: the IDLH threshold is the evacuation trigger; a worker who cannot recognise the term from captions may not understand when the atmospheric condition requires evacuation |
| permit-required confined space | “permit required confined space” (hyphen dropped) | Medium: hyphen removal changes the compound adjective to a noun phrase; EHS managers reviewing documentation will flag it |
| PRCS (abbreviation of above) | varies widely; sometimes “pricks” | High: acronym-as-word substitution produces unacceptable output in formal training content |
| non-permit-required confined space | “non permit required” (both hyphens dropped) | Medium: double prefix compound; the regulatory distinction between permit-required and non-permit-required is material |
| atmospheric testing | usually correct | Low: common enough in EHS contexts; verify LEL/IDLH combinations |
| LEL (lower explosive limit) | “L-E-L” or “lel” | High: percentage readings before the acronym (“10% LEL”) are critical; unformatted or dropped acronym removes the safety threshold |
| oxygen-deficient atmosphere | “oxygen deficient atmosphere” (hyphen dropped) | Low: meaning preserved; formatting deviation only |
| engulfment hazard | “engulfment hazard” (usually OK) or “in golf ment hazard” | Medium: occasional phonetic breakdown; verify in training QA |
| tripod rescue system | “try pod rescue system” or “tripod rescue system” (usually OK) | Low: equipment name generally handled; verify in context of rescue procedure video |
| retrieval system | usually correct | Low: common word combination in rescue contexts |
| entry supervisor | usually correct | Low: both words common; defined term is legible from ASR output |
| bump test | “bum test” | High: the bump test procedure (momentary exposure to calibration gas to verify sensor response) is misrepresented; produces unacceptable output in formal content |
| calibration gas | usually correct | Low: common enough in lab and industrial contexts |
| H2S (hydrogen sulphide) | “H-2-S” or “H2S” (no spaces) | Medium: chemical formula rendering inconsistent; expand to “hydrogen sulphide (H2S)” in glossary to ensure legibility |
Atmospheric testing video: the highest-risk sub-category
Within confined space entry training, atmospheric testing procedure video is the sub-category with the highest accuracy requirement. These videos show how to use a multi-gas monitor, interpret readings against IDLH and LEL thresholds, and determine whether conditions are safe for entry or require evacuation. The vocabulary in this content (IDLH, LEL, oxygen percentage, H₂S ppm, CO ppm, the specific sensor model name on the monitor being demonstrated) is exactly the vocabulary that ASR handles worst.
A worker who accesses atmospheric testing training via captions and encounters “idyll” where “IDLH” should appear, or who sees a gas reading percentage paired with an unrecognised acronym, has received training that is both non-compliant and potentially dangerous. The caption vendor pilot design post describes the test corpus construction method; for confined space entry training, the pilot corpus should include at minimum one atmospheric testing demonstration video and one permit procedure walkthrough alongside general safety content.
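One way to catch a gas reading paired with an unrecognised acronym is to check what follows each numeric reading in the caption text. The sketch below is a rough illustration under stated assumptions: the `KNOWN_QUALIFIERS` set, the `READING` pattern, and the sample sentences are hypothetical, and a real qualifier list would be built from the monitor documentation and the facility's permit procedures.

```python
import re

# Words expected to follow a reading in atmospheric testing narration.
# Assumed list for illustration only.
KNOWN_QUALIFIERS = {"LEL", "oxygen", "H2S", "CO"}

# A number, then a unit (%, percent, ppm), then the next whitespace-delimited word.
READING = re.compile(r"(\d+(?:\.\d+)?)\s*(%|percent|ppm)\s+(\S+)")


def unqualified_readings(caption: str):
    """Return (value, unit, next_word) where next_word is not a known qualifier.

    The comparison is case-sensitive on purpose: lowercase "lel" is treated
    as a formatting failure, matching the failure table above.
    """
    flagged = []
    for value, unit, following in READING.findall(caption):
        word = following.strip(".,;:")
        if word not in KNOWN_QUALIFIERS:
            flagged.append((value, unit, word))
    return flagged


good = "Evacuate if the monitor reads above 10% LEL or below 19.5% oxygen."
bad = "Evacuate if the monitor reads above 10% lel or below 19.5% oxygen."

print(unqualified_readings(good))  # []
print(unqualified_readings(bad))   # [('10', '%', 'lel')]
```

The check is deliberately narrow. It only looks at the word immediately after the unit, so narration that places the gas name before the number would need a second pattern.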
PPE training — OSHA 29 CFR 1910.132–138
OSHA 29 CFR 1910.132 through 1910.138 covers personal protective equipment for general industry. Training under 1910.132(f) must address when PPE is necessary, what type is necessary, how to properly don, doff, adjust, and wear PPE, and the limitations of PPE. The standards reference both the specific equipment type and the protective standard it must meet (ANSI Z87.1 for eye protection, ANSI/ISEA Z89.1 for head protection, NIOSH approval for respiratory protection, ANSI/ASSP Z359, formerly ANSI/ASSE, for fall protection).
PPE training video contains two overlapping vocabulary challenges: the equipment designation acronyms (SRL, PFAS, PAPR, SCBA, APF) and the protective standard citations (ANSI Z87.1, NRR value in dB, the APF table in OSHA 1910.134(d)(3)(i)(A), Table 1). Both are absent from general ASR training corpora.
PPE vocabulary failure table
| Term or acronym | Common ASR substitution | Accuracy impact |
|---|---|---|
| SRL (self-retracting lifeline) | “S-R-L” (letters) or “serial” or “es-ar-el” | High: fall protection equipment designation; worker selecting fall protection based on training video must recognise the equipment label |
| PFAS (personal fall arrest system) | “P-F-A-S” or “pfas” (chemical term collision) | Critical: acronym collision with PFAS (per- and polyfluoroalkyl substances) — a chemical category with entirely different safety implications; a worker reading “pfas” in a fall protection module may be confused by the collision |
| PAPR (powered air-purifying respirator) | “paper” or “papper” | Critical: respiratory protection equipment designation; the distinction between a PAPR and an N95 or half-face respirator is significant both clinically and for regulatory purposes |
| SCBA (self-contained breathing apparatus) | “scuba” (plausible word match) | Critical: scuba equipment is recreational; SCBA is industrial respiratory protection for IDLH atmospheres; the substitution produces absurd output in a serious safety procedure context |
| APF (assigned protection factor) | “A-P-F” or “apf” | High: the APF is the level of respiratory protection each type provides; without the acronym rendered correctly, APF table references in training are uninterpretable |
| N95 | usually correct | Low: high frequency post-2020 in ASR training data; verify in combination with NIOSH references |
| half-face respirator | usually correct | Low: compound noun generally handled; verify “half-face piece” variations |
| NRR (noise reduction rating) | “N-R-R” or “near” | High: the NRR value (in decibels) determines whether hearing protection is adequate for a given noise exposure; NRR dropped or rendered as “near” removes the protection rating from the training |
| ANSI Z87.1 | “A-N-S-I Z87” or “ANSI Z87.1” (usually partial) | Medium: eye protection standard citation; partial rendering still communicates the ANSI reference; the decimal and full number are what fail |
| D-ring | “D ring” (usually OK) or “dee ring” | Low: equipment component; meaning preserved in most substitutions |
| anchorage point | usually correct | Low: common enough; verify in fall protection procedure context |
| fall factor | usually correct | Low: two common words; verify in combination with fall distance calculations |
| permeation rate | usually correct | Low: verify in chemical protective clothing contexts |
Respiratory protection: the highest-vocabulary sub-category within PPE
Respiratory protection training carries a vocabulary burden beyond other PPE categories because it involves three overlapping vocabularies: the equipment designation (N95, half-face, full-face, PAPR, SCBA), the protection factor system (APF, fit factor, fit test), and the atmospheric threshold it is protecting against (IDLH, STEL, TLV, PEL). All three vocabularies fail ASR at high rates when not glossary-configured.
A respiratory protection training module that contains the sequence “when the atmospheric concentration exceeds the IDLH value, you must use an SCBA or other NIOSH-approved supplied-air respirator with an APF of at least 10,000” will produce, without glossary configuration, something close to “when the atmospheric concentration exceeds the idyll value, you must use a scuba or other NIOSH approved supplied air respirator with an A-P-F of at least ten thousand.” The safety information is present but systematically degraded: the threshold acronym is unrecognisable, the equipment type is misidentified, and the protection factor designation is spelled out as letters rather than rendered as an acronym the reader can look up in the OSHA APF table.
The customer glossary architecture post covers how to structure a domain glossary for maximum coverage; for respiratory protection training, the glossary should include all equipment designations (N95, KN95, half-face, full-face, PAPR, SCBA, airline respirator, supplied-air), all protection factor abbreviations (APF, fit factor, PF), and all atmospheric standard abbreviations (IDLH, STEL, TLV-C, TLV-TWA, PEL, REL).
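The degradation in the example sentence above can be measured at term level rather than word level. This is a minimal sketch, assuming a case-sensitive, whole-term match against a short glossary; `GLOSSARY_TERMS`, `term_present`, and `term_recall` are illustrative names, and the two strings are the reference and ASR renderings quoted earlier in this section.

```python
import re

# A small subset of the respiratory protection glossary, for illustration.
GLOSSARY_TERMS = ["IDLH", "SCBA", "APF", "supplied-air"]


def term_present(term: str, text: str) -> bool:
    """Case-sensitive whole-term match; hyphens count as part of a term."""
    pattern = rf"(?<![\w-]){re.escape(term)}(?![\w-])"
    return re.search(pattern, text) is not None


def term_recall(reference: str, asr_output: str, terms):
    """Share of glossary terms in the reference that survive in the ASR output."""
    expected = [t for t in terms if term_present(t, reference)]
    missing = [t for t in expected if not term_present(t, asr_output)]
    recall = 1.0 - len(missing) / len(expected) if expected else 1.0
    return recall, missing


reference = (
    "when the atmospheric concentration exceeds the IDLH value, you must use "
    "an SCBA or other NIOSH-approved supplied-air respirator with an APF of "
    "at least 10,000"
)
asr_output = (
    "when the atmospheric concentration exceeds the idyll value, you must use "
    "a scuba or other NIOSH approved supplied air respirator with an A-P-F of "
    "at least ten thousand"
)

recall, missing = term_recall(reference, asr_output, GLOSSARY_TERMS)
print(recall, missing)  # 0.0 ['IDLH', 'SCBA', 'APF', 'supplied-air']
```

Most of the ordinary words in the sentence survive, so a word-level score would look tolerable, yet none of the four glossary terms does. Reporting both numbers keeps a passable overall score from hiding the loss of the safety-critical terms.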
Fall protection and construction safety — OSHA 29 CFR 1926
OSHA 29 CFR 1926 covers the construction industry specifically, with fall protection requirements in 1926 Subpart M. Construction safety training video has a vocabulary profile that partially overlaps with general industry PPE (the fall protection equipment vocabulary is the same) but adds construction-specific defined terms and the 1926 Subpart M compliance structure.
Construction-specific vocabulary and fall protection
The PFAS acronym collision is particularly acute in construction training, where the equipment term “personal fall arrest system” (PFAS in 1926.502) appears frequently alongside chemical awareness training (PFAS as per- and polyfluoroalkyl substances). Workers who consume construction training via captions may encounter “pfas” rendered as lowercase in both contexts — an ambiguity that a well-configured glossary should resolve by expanding PFAS to “personal fall arrest system (PFAS)” on first occurrence in the module.
Additional construction-specific vocabulary that fails ASR at elevated rates includes:
- Competent person — a defined term in 1926 with specific qualifications (ability to identify existing and predictable hazards and the authority to take corrective measures). Usually rendered correctly by ASR, but “competent” is sometimes substituted with “competence” or “compete” in fast-speech contexts.
- Leading edge — an unprotected side or edge where a worker could fall. “Leading edge” is generally handled; “leading edge fall protection” as a compound occasionally fails.
- PFAS deployment — the phrase pairing the equipment acronym with a deployment action; ASR struggles with acronym-plus-verb combinations where the acronym is rare.
- Controlled access zone (CAZ) — an area where guardrails are not required but access is controlled. “CAZ” rendered as “cause” in fast speech is a documented failure.
- Slide guard — usually correct; verify in low-slope roofing context.
- PFAS anchorage — anchorage specified for the personal fall arrest system; verify in fall protection planning video.
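The first-occurrence expansion suggested above for PFAS can be sketched as a post-processing pass over caption lines. This is an illustration under stated assumptions, not a description of how any particular caption platform applies a glossary: the `EXPANSIONS` map and the `expand_first_occurrence` helper are hypothetical, and the pass assumes the acronym has already been rendered in its correct uppercase form.

```python
import re

# Acronym -> long form for this module. In a chemical awareness module the
# same acronym would map to "per- and polyfluoroalkyl substances" instead.
EXPANSIONS = {"PFAS": "personal fall arrest system"}


def expand_first_occurrence(captions, expansions):
    """Expand each acronym the first time it appears across the caption lines."""
    seen = set()
    result = []
    for line in captions:
        for acronym, long_form in expansions.items():
            if acronym in seen:
                continue
            pattern = rf"\b{re.escape(acronym)}\b"
            line, count = re.subn(pattern, f"{long_form} ({acronym})", line, count=1)
            if count:
                seen.add(acronym)
        result.append(line)
    return result


captions = [
    "Inspect your PFAS before every shift.",
    "A damaged PFAS must be removed from service.",
]

expanded = expand_first_occurrence(captions, EXPANSIONS)
for line in expanded:
    print(line)
```

Keeping the expansion map per module is the design choice that resolves the collision: the same four letters expand differently depending on which training module the captions belong to. The expanded line is longer than the original, so caption length limits would need to be rechecked after the pass.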
Scaffold and elevated work platform training
Scaffold training (1926 Subpart L) and aerial lift training (1926.453 and ANSI/SAIA A92 series) add equipment-specific vocabulary: supported scaffold, suspended scaffold, MEWP (mobile elevating work platform), MEWP designation categories (Type 1–3, Group A/B), scissor lift, boom lift. ASR handles “scaffold” and “scissor lift” reliably; “MEWP” is rendered as “mew-p” or letters in almost every general ASR system tested.
EHS training video types and vocabulary risk profiles
Manufacturing EHS training video spans several production formats, each with a different vocabulary density and accuracy risk profile. Understanding the distribution helps allocate glossary development effort and QA review priority.
Procedure demonstration video (highest risk)
A subject matter expert or actor performs a safety procedure step-by-step while a narrator describes each step. LOTO demonstrations, confined space entry permit walkthroughs, PPE donning and doffing procedures, and emergency evacuation drills are all commonly produced in this format. Vocabulary density is high (every step includes equipment names and procedure-specific terminology) and accuracy risk is highest because the caption is the instruction. A worker following a LOTO demonstration via captions and encountering “the energize the equipment” where the video says “de-energize the equipment” has received an incorrect safety instruction.
Regulatory overview video (medium risk)
A lecture-format or presentation-format video explaining why the regulation exists, what it requires, and what the penalties for non-compliance are. Typical examples are an OSHA 1910.147 overview, OSHA 1910.146 permit-required confined space awareness training, and ADA accessibility training. Vocabulary density is lower than in procedure demonstration video because the narrative is explanatory rather than procedural. The OSHA defined terms still appear, but in a context that gives the reader more semantic scaffolding. Risk is medium: a worker who encounters “lotto” in a regulatory overview will likely understand from context that “lockout/tagout” is meant; a worker who encounters “the energize” in a procedure demonstration may not.
Competency verification video (high risk)
A scenario-based video where the learner must identify the correct safety action, choose the correct PPE, or sequence the correct procedure steps. Answer choices and scenario descriptions include the safety vocabulary. A wrong caption on an answer choice can change which answer is correct. An OSHA compliance officer reviewing training documentation will expect that the competency verification record reflects the correct answer — not the answer a worker selected because the caption misrepresented the option.
Job hazard analysis (JHA) walkthrough video (high risk)
A video documenting the hazard analysis for a specific job or task, often produced by an EHS professional walking through a facility with a camera. JHA videos contain facility-specific terminology (equipment model names, process names, hazard zone identifiers) that does not appear in any general ASR training corpus. The glossary for a JHA video must be built from the facility’s own equipment inventory — public domain OSHA text covers the regulatory vocabulary, but model numbers and process names are facility-specific.
Emergency response and evacuation video (medium risk)
This category covers emergency action plan video, evacuation route walkthroughs, and spill response procedures. Vocabulary includes emergency designation names (code red, code yellow, shelter-in-place, facility-specific alarm types), facility zone identifiers, and emergency response equipment names. General emergency vocabulary is handled adequately by ASR; facility-specific content requires a glossary seeded from the facility’s own emergency action plan documentation.
For frontline worker training delivered on mobile platforms — a growing modality in manufacturing EHS — the frontline microlearning captioning post covers the platform-specific delivery challenges (offline bundling, mobile-first push, OSHA vocabulary benchmarks by module type) that compound the vocabulary accuracy problem described here.
OSHA and ADA compliance obligations for safety training captioning
Captioning manufacturing safety training video is not optional for organisations with 15 or more employees. Two separate statutory frameworks independently require it, and the interaction between them means that a single failure in caption accuracy can create exposure under both.
ADA Title I: accessible training for employees with disabilities
The Americans with Disabilities Act Title I (employers with 15+ employees) requires that training materials be accessible to employees with disabilities, including employees who are deaf or hard of hearing. Safety training video that is uncaptioned or captioned at below-standard accuracy is not an accessible training material. An employer cannot satisfy an ADA Title I accommodation request by providing access to training video with captions that misrepresent safety procedure vocabulary.
The reasonable accommodation analysis applies: if an employee with a hearing disability requests accessible safety training and the employer provides a LOTO module with captions that read “the energize the equipment” at the critical step, the accommodation has not been made. The compliance matrix post provides the full regulatory structure; for manufacturing EHS purposes, the relevant standard is ADA Title I plus WCAG 2.1 AA success criterion 1.2.2 (Captions (Prerecorded)) as the technical specification for caption quality.
OSH Act Section 5(a)(1): the General Duty Clause
The General Duty Clause of the Occupational Safety and Health Act requires employers to provide a workplace “free from recognized hazards that are causing or are likely to cause death or serious physical harm” to employees. OSHA has applied the General Duty Clause to training deficiencies: if an employer has identified a safety hazard requiring training and the training provided is inadequate to communicate the required information to affected employees, the General Duty Clause may apply.
When video is the sole delivery mechanism for safety training — as is increasingly common in manufacturing with LMS-based training programs — inaccurate captions on that video create a training adequacy question under the General Duty Clause. An OSHA inspector who reviews training documentation and finds that an employee completed a LOTO module with captions that rendered the de-energisation step as “the energize the equipment” has grounds to question whether the training obligation under 1910.147(c)(7) was met for that employee.
OSHA 29 CFR 1910.132(f)(1): the training requirement
OSHA 29 CFR 1910.132(f)(1) requires that each employee who must use PPE “shall be trained to know at least the following: (i) When PPE is necessary; (ii) What PPE is necessary; (iii) How to properly don, doff, adjust, and wear PPE; (iv) The limitations of the PPE; and, (v) The proper care, maintenance, useful life and disposal of the PPE.” Each training item requires that the specific PPE type be identifiable from the training content. If a worker who accesses PPE training via captions cannot identify that the video is specifying a PAPR (because the caption reads “paper”) or a PFAS (because the acronym is rendered as lowercase “pfas” without disambiguation), the training has not communicated the required information. The 1910.132(f) obligation is not met by delivering a video; it is met by ensuring each employee required to use PPE has received the specified information. Caption accuracy is a component of whether that standard is met for employees who access training via captions.
The General Duty and ADA interaction
The practical significance of both obligations applying simultaneously is that caption accuracy in safety training is not a nice-to-have accessibility feature — it is a component of both safety compliance and employment law compliance. An organisation that is cited for a General Duty Clause violation on a LOTO training adequacy issue and simultaneously receives an ADA accommodation complaint about the same training video has two separate regulatory exposures from a single caption accuracy failure. The caption compliance programme post covers how to structure a compliance programme that addresses both frameworks; for manufacturing specifically, the EHS and HR functions both need visibility into caption quality standards.
Glossary construction for manufacturing safety content
The vocabulary failure profile described in the sections above is predictable from the source material: OSHA CFR text, ANSI and NIOSH standards documents, and equipment manufacturer documentation. This predictability means the glossary can be built before captioning begins — not reactively after errors are reported.
Tier 1: OSHA CFR text (public domain)
The definitions sections of each relevant OSHA standard (1910.147, 1910.146, 1910.132–138, 1926 Subpart M) contain all the defined terms that will appear in training content for that standard. The CFR is public domain. Extracting all defined terms from the applicable standards and adding them to the glossary as correctly-spelled, canonical forms takes approximately 2–4 hours per standard. For an organisation running LOTO, confined space, and PPE training, this tier covers 40–60 terms and addresses the majority of the regulatory defined-term failures described above.
Tier 2: Equipment manufacturer documentation
Equipment model names, manufacturer names, and trade designations are not covered by OSHA CFR text. A LOTO programme at a facility with a specific PLC brand, specific conveyor system names, and specific lockout hardware brand (e.g., Master Lock, Brady) requires those terms in the glossary. Equipment manufacturer product catalogues and installation manuals list the canonical model names and abbreviations. This tier requires facility-level research and will vary by site; it typically adds 20–100 terms per facility depending on equipment variety.
Tier 3: Acronym expansions for known failure patterns
The high-risk acronyms identified in the tables above (LOTO, IDLH, PAPR, SCBA, APF, SRL, PFAS, NRR, LEL, PRCS, MEWP) should be added as explicit glossary entries with their expanded forms. For acronyms with collision risk (PFAS), the glossary entry should specify the expansion in context: “PFAS (personal fall arrest system)” disambiguates from the chemical acronym.
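A Tier 3 entry can be sketched as a small record that carries the acronym, its expansion, and a collision flag. This is a minimal illustration, not a format any captioning vendor is known to require: the `AcronymEntry` class, the `collision` field, and the `glossary_form` helper are hypothetical names introduced here, and only a few of the acronyms listed above are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcronymEntry:
    """Hypothetical glossary record for one high-risk acronym."""
    acronym: str             # canonical uppercase form expected in captions
    expansion: str           # expanded form for the glossary entry
    collision: bool = False  # True when the acronym has another common meaning


ENTRIES = [
    AcronymEntry("LOTO", "lockout/tagout"),
    AcronymEntry("IDLH", "immediately dangerous to life or health"),
    AcronymEntry("PAPR", "powered air-purifying respirator"),
    AcronymEntry("SCBA", "self-contained breathing apparatus"),
    # PFAS collides with the chemical acronym, so it is flagged.
    AcronymEntry("PFAS", "personal fall arrest system", collision=True),
]


def glossary_form(entry):
    """Return the string submitted to the glossary for this entry.

    Colliding acronyms carry their expansion in context; the rest
    are submitted as the bare canonical acronym.
    """
    if entry.collision:
        return f"{entry.acronym} ({entry.expansion})"
    return entry.acronym


forms = [glossary_form(e) for e in ENTRIES]
```

Keeping the collision flag on the record, rather than hard-coding the expanded string, lets the same entry list feed both the glossary and any later QA check that needs the bare acronym.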
Tier 4: Facility-specific procedure vocabulary
Internal procedure names (if the facility has named energy control procedures beyond the OSHA generic terms), internal zone identifiers (Building 3, Line 4, Unit 7), internal alarm designations, and internal job titles (EHS Coordinator, Safety Lead) complete the glossary. This tier is unique to each facility and must be sourced from the facility’s own EHS documentation.
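One way to assemble the four tiers into a single term list is sketched below. The `merge_tiers` function, the tier labels, and the sample terms are illustrative assumptions; a real glossary would be populated from the CFR text and the facility's own documentation as described above.

```python
def merge_tiers(tiers):
    """Merge ordered (tier_name, terms) pairs into one glossary.

    Terms are de-duplicated case-insensitively. The first tier that
    supplies a term is recorded as its source, so earlier tiers win.
    """
    glossary = {}
    for tier_name, terms in tiers:
        for term in terms:
            key = term.casefold()
            if key not in glossary:
                glossary[key] = {"term": term, "source": tier_name}
    return list(glossary.values())


# Sample terms only; the tier names are hypothetical labels.
tiers = [
    ("cfr", ["energy isolating device", "affected employee", "lockout device"]),
    ("equipment", ["Master Lock", "Brady"]),
    ("acronyms", ["LOTO", "IDLH", "PAPR"]),
    ("facility", ["Line 4", "EHS Coordinator", "loto"]),  # "loto" duplicates "LOTO"
]

glossary = merge_tiers(tiers)
```

Recording the source tier on each term makes the later maintenance review simpler, because CFR-derived terms and equipment terms are reviewed on different triggers.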
Glossary maintenance cycle for safety content
OSHA standards are updated periodically: enforcement interpretations change, new standards are issued, existing standards are amended. A glossary built from CFR text needs an annual review against the current version of each applicable standard. Equipment vocabulary needs review when new equipment is installed or existing equipment is decommissioned. The glossary maintenance workflow post covers the mechanics of term submission, review, and version control that apply to any domain glossary; for EHS content, the term-owner is typically the EHS function rather than the L&D function, and the review cadence should align with the facility’s existing EHS programme review cycle.
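The annual review against CFR text can be tracked with something as simple as a last-reviewed date per source. The sketch below assumes a 365-day interval and invented review dates; `sources_due_for_review` and the dictionary layout are hypothetical, not part of any glossary tool referenced in this post.

```python
from datetime import date, timedelta

REVIEW_INTERVAL = timedelta(days=365)  # assumed annual cadence


def sources_due_for_review(last_reviewed, today):
    """Return the sources whose last review is at least one interval old."""
    return sorted(
        source
        for source, reviewed in last_reviewed.items()
        if today - reviewed >= REVIEW_INTERVAL
    )


# Invented review dates, for illustration only.
last_reviewed = {
    "29 CFR 1910.147": date(2025, 1, 15),
    "29 CFR 1910.146": date(2025, 11, 3),
    "29 CFR 1910.132": date(2024, 6, 30),
}

due = sources_due_for_review(last_reviewed, today=date(2026, 6, 27))
```

Equipment vocabulary does not fit a calendar trigger; it would need a separate event-driven check tied to installation and decommissioning records.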
After the glossary is built and deployed, the caption feedback loop post describes how to capture the errors that do appear in production and convert them into glossary additions — because no pre-built glossary will catch every facility-specific term on the first pass.
Third-party safety training content
A significant proportion of manufacturing EHS training is delivered via purchased or licensed content from third-party safety training publishers: Convergence Training (now part of Vector Solutions), J. J. Keller & Associates, SafeStart, Vivid Learning Systems, Safetycare, and National Safety Council online courses. These publishers produce LOTO, confined space, PPE, and construction safety modules that organisations licence and deliver via their LMS without producing the content themselves.
The ADA Title I and General Duty Clause obligations described above apply to the training experience, not to the production source. An employee who accesses a purchased LOTO module via captions and encounters the vocabulary failures described in this post has received inadequate training regardless of whether the module was produced internally or licensed from a publisher. The organisation delivering the training bears the compliance obligation.
Why third-party content is harder to correct
When an internal EHS team produces a LOTO video and the captions contain vocabulary errors, the correction workflow is direct: the EHS team reviews the caption file, the corrections are applied, and the updated file is delivered to the LMS. When a purchased content module contains caption errors, the correction path is more complex. The third-party publisher owns the source content and the delivered caption file may be embedded in the module rather than delivered as a separate sidecar file. The licence agreement may not grant the licensee the right to modify captions. The publisher may produce corrected captions only on their standard revision cycle, not on demand.
The third-party compliance training captioning post covers the procurement approach: what to require in the content licence agreement (sidecar SRT/VTT file delivery, modification rights for accessibility corrections, WCAG 2.1 AA accuracy warranty), how to run a pre-delivery accuracy assessment on purchased content, and how to structure the caption correction workflow when the publisher cannot or will not provide an accurate file on the required timeline.
Requesting a vendor pilot on safety content
Before deploying a purchased safety training library, an accuracy assessment on the safety-specific content should be performed. The pilot design framework from the caption vendor pilot post applies directly: build a test corpus that includes your highest-vocabulary-density content (LOTO procedure demonstration, confined space atmospheric testing, respiratory protection selection training), produce reference transcripts for DCMP scoring, and evaluate the purchased module captions against your pre-committed accuracy floor before deploying the library to employees. If the purchased content does not meet the floor, exercise the contractual correction workflow before delivery.
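The floor check in a pilot can be sketched with a generic word-level accuracy measure. The code below uses plain word edit distance against the reference transcript; it is a stand-in for illustration and is not the DCMP formula, which weights error types in ways this sketch does not attempt to reproduce. The function names and the 0.99 default floor are assumptions.

```python
def word_edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the caption
                curr[j - 1] + 1,     # extra word in the caption
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_accuracy(reference, caption):
    """Return 1 minus (edit distance / reference length), floored at 0."""
    ref = reference.lower().split()
    hyp = caption.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    return max(0.0, 1 - word_edit_distance(ref, hyp) / len(ref))


def meets_floor(reference, caption, floor=0.99):
    """Compare a module's caption accuracy with a pre-committed floor."""
    return word_accuracy(reference, caption) >= floor


reference = "de-energize the equipment before applying the lockout device"
caption = "the energize the equipment before applying the lockout device"
accuracy = word_accuracy(reference, caption)  # two word errors over eight words
```

On this single sentence the inverted instruction costs two word edits out of eight, which shows why a short safety-critical segment should be scored on its own rather than averaged into a full module.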
Eight failure modes
- 1. Captioning safety training video without EHS domain glossary configuration
-
The most common failure: an L&D team captions safety training video using the same caption vendor workflow used for general onboarding and soft-skills content, without flagging the content as EHS-domain or providing a safety-specific glossary. The vendor processes the content at their standard quality tier, which produces 89–93% accuracy on general content and 73–82% accuracy on OSHA safety vocabulary. The L&D team skims the output and the overall quality looks adequate: the narrative sentences are mostly correct and the visible errors appear scattered. What they miss is that those errors are concentrated on the safety-critical terms: de-energize, IDLH, PAPR, SCBA. The QA methodology post covers how to design a review process that specifically flags safety-vocabulary errors rather than averaging them into an overall accuracy score.
- 2. Treating LOTO and HazCom safety video as the same vocabulary category
-
EHS training programmes often span multiple regulatory domains (LOTO, HazCom, confined space, PPE, emergency response). An organisation that builds a glossary for its HazCom training — chemical names, SDS section vocabulary, GHS pictogram descriptions — and then uses the same glossary for LOTO training will find that the HazCom glossary provides almost no coverage for LOTO vocabulary. The two domains have nearly non-overlapping vocabulary sets. The HazCom captioning post covers the chemical-name vocabulary failure profile; this post covers the equipment-procedure failure profile. Both glossaries need to be built, and the EHS training programme needs a glossary management approach that maintains domain separation while allowing cross-domain access when a module spans multiple standards.
- 3. Accepting third-party safety content captions without pre-deployment accuracy assessment
-
A publisher delivers a LOTO training library with captions included. The L&D team adds the modules to the LMS and marks them as deployed without testing the caption accuracy. Six months later, a hearing-impaired employee files an ADA accommodation complaint. The L&D team reviews the captions for the first time and finds LOTO rendered as “lotto” throughout. The correction requires working with the publisher, who has a 45-day revision cycle. The accommodation is not fulfilled for 45+ days after the complaint. A pre-deployment assessment would have identified this before employees accessed the content.
- 4. Limiting the glossary to regulatory vocabulary and omitting equipment model names
-
An EHS team builds a LOTO glossary from OSHA 1910.147 CFR text. The glossary covers all defined terms correctly. The training video, however, includes references to specific equipment installed at the facility: Allen-Bradley PLCs, Pilz safety relays, Rexnord conveyor components. These names are not in any OSHA text and are not covered by a CFR-derived glossary. A worker following the LOTO procedure for a specific piece of equipment via captions needs those equipment names to be accurate to match the physical equipment in front of them. Facility-specific equipment terminology must be sourced from the facility’s own equipment inventory, not from regulatory text.
- 5. Using LMS native auto-captions for safety training modules
-
LMS native auto-caption accuracy on OSHA safety vocabulary runs 15–22 percentage points below the same system’s accuracy on general training content. Systems that perform at 91% on soft-skills content perform at 69–76% on LOTO and confined space procedure video. None of the major LMS platforms currently support domain glossary configuration in their native caption generation. For safety training content, native auto-captions should be disabled and a glossary-configured external captioning workflow should be used instead. The LMS native auto-caption comparison post covers accuracy benchmark data by platform for both general and specialist content.
- 6. Conducting OSHA training compliance audits that exclude caption accuracy
-
An EHS compliance audit that verifies that LOTO training has been completed (LMS completion record exists for each affected employee) but does not verify that the training delivered was accurate may miss a compliance gap. If 20% of employees completed a LOTO module via captions that rendered the de-energisation procedure incorrectly, the LMS records show 100% training completion but the training did not deliver the required information to those employees. A caption accuracy audit as part of the annual EHS training audit is the mechanism that closes this gap. The QA methodology post includes an audit template applicable to safety training content.
- 7. Building a safety glossary once without a maintenance cycle
-
An EHS team builds a comprehensive LOTO and PPE glossary during a captioning programme launch. Three years later, the facility has installed new equipment, two OSHA standards have been updated, and the organisation has added confined space entry training for a new facility. The original glossary covers the original equipment and the original regulatory text. New content produced from the expanded programme is captioned against the original glossary and produces accuracy gaps on the new vocabulary. The glossary maintenance cycle described above — annual review against current CFR text, equipment inventory review on new installations, and a structured submission workflow for subject matter experts — prevents this accumulation of ungoverned vocabulary.
- 8. Reviewing captions with the audio on and missing the substitutions it masks
-
A QA reviewer watches a LOTO procedure video and reviews captions by reading along while listening to the audio. When the audio says “de-energize” and the caption reads “the energize,” a reviewer who hears the correct word may mentally correct the caption and never mark it as an error. The reliable QA methodology for safety training video captions is to review the captions without audio, reading them as the target user (a worker relying solely on captions) would read them. The DCMP Captioning Key methodology scores errors against the reference transcript, not against the reviewer’s audio comprehension; applying DCMP scoring rather than impressionistic read-along review is the mechanism that catches the safety-critical substitutions that audio-assisted review misses. The error rate calculator post explains how to apply the DCMP formula to a sample of safety training content.
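Several of the failure modes above come down to the same gap: safety-critical terms are not checked separately from overall accuracy. A caption-only check can be sketched as follows, assuming a reference transcript is available. The term list and the substitution table are drawn from the examples in this post; the function names are hypothetical, and a hit in the substitution list is a prompt for human review, not proof of an error.

```python
import re

CRITICAL_TERMS = ["de-energize", "LOTO", "IDLH", "PAPR", "SCBA"]

# Mis-recognitions named in this post, mapped to the intended term.
KNOWN_SUBSTITUTIONS = {
    "lotto": "LOTO",
    "the energize": "de-energize",
    "idyll": "IDLH",
}


def count_term(term, text):
    """Count whole-term, case-insensitive occurrences of term in text."""
    pattern = r"(?<![\w-])" + re.escape(term) + r"(?![\w-])"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def critical_term_report(reference, caption):
    """Compare critical-term counts between reference and caption text.

    Returns (missing, suspects): missing maps each critical term to the
    number of reference occurrences absent from the caption; suspects
    lists known mis-recognitions found in the caption.
    """
    missing = {}
    for term in CRITICAL_TERMS:
        shortfall = count_term(term, reference) - count_term(term, caption)
        if shortfall > 0:
            missing[term] = shortfall
    suspects = [bad for bad in KNOWN_SUBSTITUTIONS if count_term(bad, caption) > 0]
    return missing, suspects


reference = "Apply LOTO then de-energize the press and check for IDLH conditions"
caption = "Apply lotto then the energize the press and check for idyll conditions"
missing, suspects = critical_term_report(reference, caption)
```

Because the check reads only text, it behaves like the audio-off reviewer: it cannot be talked into accepting “the energize” by hearing the correct word.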
FAQ
- Does OSHA independently require captions on safety training video, separate from ADA?
-
OSHA does not have a regulation that expressly requires captions on safety training video. However, two OSHA frameworks create indirect obligations that effectively require accurate captioned access for employees who cannot access audio. First, OSHA 29 CFR 1910.132(f)(1) requires that each employee who is required to use PPE be trained in the PPE requirements. An employee with a hearing disability who can only access training via captions is covered by that requirement, and the training obligation under 1910.132(f)(1) is not met by delivering a video with inaccurate captions. Second, the OSH Act General Duty Clause has been interpreted by OSHA to require training that actually communicates the required safety information; inaccurate captions on procedure demonstration video are a training adequacy gap under this interpretation. ADA Title I provides the clearest and most direct requirement; OSHA obligations reinforce it.
- What is the target accuracy floor for manufacturing safety training content, vs. general L&D content?
-
WCAG 2.1 success criterion 1.2.2 (Captions (Prerecorded)), a Level A criterion that AA conformance includes, requires captions for all prerecorded audio content in synchronized media but does not specify a numeric accuracy floor. DCMP Captioning Key, the most widely cited technical standard for caption quality in training contexts, targets 99% or higher accuracy for educational and training content. For safety training video, the practical accuracy target is not lower than 99%, and it may need to be higher because the consequences of a single critical error (on a LOTO step, on an IDLH threshold, on a respirator designation) are more severe than in general training content. The 99% accuracy post explains why the DCMP floor is not arbitrarily high; for safety content, the argument for 99% is even stronger than for general L&D content.
- Can we use general-purpose ASR for safety training if we then do a human review pass?
-
Yes, but only if the human review pass is designed to catch the safety-critical substitutions. Most post-ASR human review workflows are designed for fluency: the reviewer reads the caption alongside the audio and fixes the errors that interrupt reading comprehension. The de-energize/the-energize failure often does not interrupt fluency because the sentence is grammatically coherent with either reading; a reviewer listening to audio while reviewing captions may correct it mentally without marking it. The review protocol for safety training content must require caption-only review (audio off) for at least the procedure demonstration video segments, and must specifically check the vocabulary failure categories identified in this post. A glossary-configured first pass before human review reduces the errors the reviewer must catch, making the human review more reliable and less expensive. The correction cost post quantifies the labour cost reduction from glossary-first vs. human-review-first workflows.
- What is the right procurement approach for safety training content where we need both EHS-vocabulary accuracy and a vendor who can handle facility-specific terminology?
-
The vendor evaluation criteria for a caption vendor handling safety training content should include: (1) support for glossary configuration with customer-supplied term lists, including multi-word compound terms; (2) accuracy benchmark on safety procedure content, not general training content — require a sample of the vendor’s output on LOTO or confined space content specifically; (3) reference transcript capability — the vendor should be able to score against a customer-supplied reference transcript; (4) per-video scoring in the QA report, not only aggregate; (5) a correction SLA for safety-critical errors that is shorter than the general SLA. The vendor SLA checklist provides the full contract review structure; for safety content, add a contractual definition of “safety-critical term error” and a remediation clause specific to it.
- We have a hearing-impaired employee who has requested accessible LOTO training. The LMS module was deployed six months ago with native auto-captions. What should we do now?
-
Treat this as an active ADA accommodation request under Title I. Step one: pull the LOTO module’s caption output and review it for the vocabulary failures described in this post, specifically checking the de-energisation steps, the energy isolating device identification, and the zero-energy state verification steps. If any of these steps contain the failures described above, the accommodation is not yet made — do not mark the module as accessible until the errors are corrected. Step two: provide the employee with an alternative accessible training pathway while corrections are being made (a live instructor, a corrected transcript, or a postponement of the training completion deadline with documented reason). Step three: correct the captions using a glossary-configured captioning workflow and re-deliver the module. Document all steps. Step four: audit all other safety training modules in the LMS for the same vocabulary failure pattern before the next accommodation request arrives. The compliance programme post covers how to build a proactive programme rather than reacting to individual accommodation requests.
- How many glossary terms are needed to close the primary vocabulary gaps in a LOTO training programme?
-
For a single-facility LOTO programme with standard equipment variety, a well-scoped glossary contains 60–120 terms: approximately 15 OSHA 1910.147 defined terms from CFR text, 10–15 acronym expansions (LOTO, IDLH, PPE, EHS, etc.), and 30–90 facility-specific equipment model names and manufacturer terms. This is a small glossary by domain standards; the financial services glossary for a comparable training programme runs 150–300 terms. The LOTO glossary achieves high impact per term because the CFR defined terms cover the highest-risk vocabulary failures, and the facility-specific terms are identifiable from existing documentation. The glossary can be built in one 4–8 hour session by an EHS professional who has access to the facility’s equipment inventory and the applicable CFR standards text.
- Our manufacturing organisation operates in both the US and EU. Does the EAA (European Accessibility Act) create additional obligations for safety training video?
-
The European Accessibility Act (EAA), enforceable since June 2025, covers products and services within scope under Directive 2019/882 but does not directly apply to internal employee training video in most jurisdictions. The relevant EU obligation for internal training captioning is the employment equality framework (Directive 2000/78/EC), which requires reasonable accommodation for disabled employees in access to training. Additionally, national transpositions of the EAA and national disability non-discrimination laws in Germany, France, the Netherlands, and other EU member states create similar employment-context obligations to ADA Title I. The vocabulary challenge for EU manufacturing sites is compounded by language: OSHA-specific vocabulary in English appears in US-produced training that is used at EU sites; local safety vocabulary (EN ISO standards references, CE marking, ATEX directive vocabulary) appears in EU-produced content. If your manufacturing programme spans both geographies, the glossary must cover both regulatory vocabularies, and the captioning workflow must handle both English and local-language content. The compliance matrix post covers the US regulatory framework; EU captioning obligations will be covered in a future post.