Manufacturing training captions: lean/Six Sigma Japanese vocabulary, OSHA 1910, ISO 9001 § 7.2, and HACCP
Manufacturing and factory employers — discrete manufacturers, process manufacturers, food and beverage producers, chemical processors, automotive suppliers, aerospace fabricators, and industrial equipment makers — deliver mandatory training video across five distinct regulatory compliance domains: OSHA 1910 workplace safety (lockout/tagout, confined space, machine guarding, forklift, arc flash, process safety management); ISO 9001:2015 quality system competence under clause 7.2; lean manufacturing and Six Sigma operational excellence; food safety under HACCP principles and the FDA Food Safety Modernization Act (FSMA); and environmental management under ISO 14001:2015. This training content is not optional: the OSHA 1910 sub-parts carry specific documented-training requirements (1910.147(c)(7) for LOTO, 1910.146(g) for confined space, 1910.178(l) for forklift operation, 1910.119(g) for process safety management); ISO 9001:2015 clause 7.2 requires evidence of competence; and the FSMA § 103 mandate requires that each facility's Preventive Controls Qualified Individual (PCQI) complete a specific FDA-recognized training curriculum. Every one of these compliance training domains narrates vocabulary that is severely out-of-distribution for general speech-to-text systems. The lean manufacturing domain carries Japanese-origin terms — heijunka, poka-yoke, kaizen, muda/muri/mura, kanban, jidoka — that appear in virtually no generic STT training data. The Six Sigma domain carries statistical abbreviations — Cpk/Ppk, DPMO, Gauge R&R, DMAIC, DFSS, control chart type designations — whose specific pronunciation-to-text mappings are ambiguous in general corpora. The OSHA domain carries compound CFR citation formats ("twenty-nine CFR nineteen-ten dot one-four-seven") that STT renders in half a dozen inconsistent formats. The food safety domain carries the acronym HACCP, whose standard pronunciation "HACK-up" creates a serious disambiguation problem with ordinary English usage. 
For hearing-impaired workers on manufacturing production floors, where safety training carries the most direct life-safety consequence of any industry training content type, caption accuracy is not an accessibility convenience: it is the difference between a worker who understood the confined space atmospheric monitoring procedure and a worker who did not. The compliance frame: ADA Title I employer accommodation for virtually all manufacturing employers with 15+ employees; Section 503 and VEVRAA for federal contractor manufacturers (defence, aerospace, government supply chain); and the documented-training requirements in the OSHA 1910 standards cited above, each of which creates an independent regulatory obligation to demonstrate that training was effective.
TL;DR
Manufacturing training video has three vocabulary layers that compound speech-to-text failures in ways that no other industry training content type replicates. First, the lean/Six Sigma domain carries Japanese-origin terms (heijunka, poka-yoke, jidoka, muda, muri, mura) that are essentially absent from STT training data, producing phonetic guesses ("hay-junk-ah," "poke-a-yoke," "jee-DOH-ka") that do not correspond to any consistent written form. Second, the statistical quality vocabulary (Cpk/Ppk, DPMO, Gauge R&R, DMAIC, and control chart type designations such as X-bar/R, X-bar/S, IMR, P-chart, and NP-chart) uses abbreviations whose pronunciation-to-text mapping is ambiguous in general corpora; "Cpk" narrated as "see-pee-kay" is transcribed as "CPK," "C-P-K," "CPC," or "cpk" without consistent capitalisation or formatting. Third, OSHA 1910 regulatory citations are narrated in the same compound CFR shorthand ("nineteen ten dot one forty seven for lockout/tagout") that fails across all regulated-industry training, producing "1910.147," "1910 dot 147," "19 ten point 147," or "section 1910.147" (with or without section symbol) in the same training video. Beyond these three core layers, HACCP introduces a disambiguation problem unique in any training domain: the standard pronunciation "HACK-up" maps directly onto common English usage ("hack up" as in a coughing episode), so generic STT may transcribe HACCP correctly in letter-spelling contexts and as literal "hack up" in phonetic-pronunciation contexts within the same video. The compliance obligations converge: ISO 9001:2015 clause 7.2 Competence, the FSMA PCQI training mandate under § 103, the OSHA 1910 documented-training requirements, and ADA Title I employer accommodation. Manufacturing is the industry where caption accuracy most directly intersects with worker safety.
Manufacturing training content types
Operator safety training — OSHA 1910 compliance
OSHA 29 CFR Part 1910 (General Industry Standards) mandates documented training for every major hazard category in manufacturing. Each sub-part carries its own training requirement with specific elements that must be covered and records that must be retained. The vocabulary narrated in OSHA 1910 compliance training video is among the most safety-critical content in any training domain: a worker who misunderstood the confined space atmospheric monitoring procedure because the caption was inaccurate does not have an academic gap — they have a life-safety gap.
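As an illustration of how the inconsistent CFR citation renderings quoted in the TL;DR ("1910.147," "1910 dot 147," "19 ten point 147," "section 1910.147") could be collapsed into one house style, here is a minimal post-processing sketch. The function name normalize_cfr, the regular expression, and the choice of "29 CFR 1910.NNN" as the canonical form are all assumptions made for this example; they are not part of any STT product or of the regulation.

```python
import re

# Hypothetical pattern covering the citation variants quoted in this article.
_CITATION = re.compile(
    r"(?:\bsection\s+|§\s*)?"        # optional "section" or section symbol
    r"(?:29\s*CFR\s+)?"              # optional title prefix
    r"\b(?:1910|19\s*ten)"           # part number, digits or mixed digit/word form
    r"\s*(?:\.|dot|point)\s*"        # separator as punctuation or spoken word
    r"(\d{2,4})\b",                  # section number, e.g. 147
    re.IGNORECASE,
)

def normalize_cfr(text: str) -> str:
    """Rewrite every matched citation as '29 CFR 1910.<section>'."""
    return _CITATION.sub(lambda m: f"29 CFR 1910.{m.group(1)}", text)

print(normalize_cfr("Under 19 ten point 147, apply your personal lock."))
```

Paragraph references such as "(c)(7)" are left untouched after the section number. A pattern this broad can misfire on unrelated numbers, so a reviewer pass over the changed captions is still advisable.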
Lockout/Tagout (LOTO) — 29 CFR 1910.147
Lockout/tagout training under 29 CFR 1910.147 (Control of Hazardous Energy) is required for three distinct employee categories with different training content requirements: authorized employees (who perform the lockout/tagout procedure), affected employees (who operate machinery in the area and must recognize when LOTO is in effect), and other employees (who work in areas where LOTO procedures are used and must understand that they cannot remove a lock or tag). Training video must cover the energy control procedure for each machine or equipment type, the types of hazardous energy (electrical, pneumatic, hydraulic, mechanical — spring tension and gravity, thermal, chemical, and radiation where applicable), the specific lockout/tagout hardware used at the facility (multiple lockout hasps for group lockout/tagout procedures where more than one authorized employee is working on the same machine), stored energy release procedures, and verification steps before re-energizing.
LOTO training vocabulary creates specific STT failure modes. "Lockout/tagout" itself is narrated as a compound word with the slash — STT produces "lockout tagout" (space, no slash), "lock out tag out" (two separate phrases, no hyphen), "lock-out/tag-out" (hyphenated), or "LOTO" (when the acronym is used, sometimes rendered phonetically as "low-toe" or misheard as "lock-oh"). The authorized/affected/other employee triad is safety-critical: a training video that explains "authorized employees must affix their personal lock" and "affected employees must not operate equipment" requires that the caption preserve the employee category distinction. STT confuses "authorized" and "affected" in rapid narration when the narrator moves quickly through the triad. "Multiple lockout hasps" — the hasp device used for group lockout — is narrated at the same speed and STT renders "hasps" as "clasps," "hasp" as "half" in some accents, and "multiple lockout hasps" as "multiple lockout pass" in audio contexts with production-floor background noise.
Energy types: STT handles "hydraulic" and "pneumatic" reasonably well, but errors compound in "stored energy" phrases. A short phrase such as "residual hydraulic pressure after isolation" is usually transcribed correctly on its own. In a denser sentence such as "stored spring energy in the press ram after hydraulic isolation," the vocabulary density causes STT to mis-transcribe "press ram" as "pressure ram" or "press ramp."
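The "lockout/tagout" spacing and hyphenation variants listed above are regular enough to normalise mechanically. The sketch below assumes "lockout/tagout" is the chosen house style; the name normalize_loto is hypothetical. It does not attempt to repair word substitutions such as "affected" for "authorized" or "clasps" for "hasps," which need human review.

```python
import re

# Matches "lockout tagout", "lock out tag out", "lock-out/tag-out", "lockout/tagout".
_LOTO = re.compile(r"\block[\s-]?out\s*[/\s]\s*tag[\s-]?out\b", re.IGNORECASE)

def _replace(match: re.Match) -> str:
    canonical = "lockout/tagout"
    # Keep a leading capital when the phrase starts a sentence.
    return canonical.capitalize() if match.group(0)[0].isupper() else canonical

def normalize_loto(text: str) -> str:
    return _LOTO.sub(_replace, text)

print(normalize_loto("Lock out tag out applies before the lock-out/tag-out audit."))
```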
Confined Space Entry — 29 CFR 1910.146
Confined space entry training under 29 CFR 1910.146 distinguishes permit-required confined spaces from non-permit confined spaces — a distinction that carries the highest safety consequence in manufacturing training, because entering a permit-required confined space without following permit procedures has caused numerous worker fatalities. Training video must cover: the permit-required vs non-permit classification criteria (contains or has potential to contain a serious atmospheric hazard; contains material that has the potential for engulfment; has an internal configuration that could trap or asphyxiate; contains any other recognized serious safety or health hazard); the entry supervisor/attendant/entrant triad (parallel to but distinct from the LOTO authorized/affected/other triad); atmospheric monitoring for LEL (Lower Explosive Limit), oxygen percentage (acceptable range 19.5–23.5%), and IDLH (Immediately Dangerous to Life or Health) levels; and emergency rescue procedures.
STT failure modes in confined space training: "permit-required" is the regulatory term of art; STT produces "permit required" (no hyphen), "permitrequired," or simply "permitted" in compressed speech. "IDLH" (narrated as "I-D-L-H" or "eye-dee-el-aitch") → STT: "IDLH," "I-D-L-H," "idle H," or "idlh" — without consistent capitalisation. "LEL" (narrated as "L-E-L" or "el-ee-el") → "LEL," "L.E.L.," "Lel" — STT inconsistently formats three-letter acronyms when spelled out at speech speed. "Entrant/attendant/entry supervisor" — the three-role triad for permit-required confined space — is narrated rapidly, and STT conflates "attendant" with "assistant" and "entry supervisor" with "entry support" in background-noise conditions.
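The acronym formatting drift described above ("I-D-L-H," "idlh," "L.E.L.," "Lel") can be reduced with a small canonicaliser. This is a sketch under stated assumptions: the ACRONYMS tuple and the function names are illustrative, and only hyphenated, dotted, and mixed-case spellings are handled. Word-level mishearings such as "idle H" are outside its scope.

```python
import re

ACRONYMS = ("IDLH", "LEL")  # illustrative list; extend with the facility's own glossary

def _build(acronym: str) -> re.Pattern:
    letters = [re.escape(ch) for ch in acronym]
    dotted = r"\.".join(letters) + r"\."   # L.E.L. (consumes the final period)
    plain = "-?".join(letters) + r"\b"     # LEL, Lel, L-E-L
    return re.compile(rf"\b(?:{dotted}|{plain})", re.IGNORECASE)

_PATTERNS = {acronym: _build(acronym) for acronym in ACRONYMS}

def normalize_acronyms(text: str) -> str:
    for canonical, pattern in _PATTERNS.items():
        text = pattern.sub(canonical, text)
    return text

print(normalize_acronyms("Monitor for idlh and L.E.L. levels; I-D-L-H applies."))
```

One known limitation: a dotted acronym at the very end of a sentence loses its final period, because the dotted form and the sentence stop are indistinguishable.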
Machine Guarding — 29 CFR 1910.212
Machine guarding training under 29 CFR 1910.212 covers point of operation guarding, transmission guarding, and other machine-part guarding requirements. Training vocabulary includes: point of operation (the area on a machine where work is performed on the material — where the hazard of cutting, shaping, boring, or forming exists); pinch point (any point where it is possible for a person to be caught between a moving part and a stationary object, or between two moving parts); shear point (where two edges move across each other — common in cutting machines and conveyors); nip point (where two rotating objects move together — rollers, conveyors, gears); kick-back (the hazard of wood or material being thrown back from cutting operations); and guard types (fixed guard, interlocked barrier guard, adjustable guard, self-adjusting guard, presence-sensing device, two-hand control, restraint and pullback device).
STT renders "pinch point" reliably but "nip point" is more variable — "nip" is a common English word in non-manufacturing contexts and STT may produce "nip point" correctly but with less confidence, resulting in substitution errors in continuous speech. "Kick-back" in sawmill/woodworking training is correctly transcribed but STT does not consistently hyphenate it. "Interlocked barrier guard" — a compound safety device name — is rendered "interlocked barrier guard," "inter-locked barrier guard," or "interlock barrier guard" (dropping the -ed suffix).
Forklift/Powered Industrial Truck (PIT) Training — 29 CFR 1910.178
Forklift/PIT training under 29 CFR 1910.178 is one of the most commonly deployed safety training topics in manufacturing, distribution, and warehousing. It must be completed before a worker operates a powered industrial truck; each operator's performance must be evaluated at least once every three years, and refresher training is required upon evidence of unsafe operation. Training video covers: pre-operation inspection checklist items; load capacity and the stability triangle (the three-point support formed by the two front wheels and the pivot point of the rear axle, which defines forklift stability); dock leveler operation (bridging the gap between dock floor and trailer, called a "dock leveler" or "dock plate" depending on the specific equipment); pedestrian interface (spotter protocols, dedicated pedestrian lanes, horn use at intersections); and truck type-specific hazards (counterbalanced sit-down rider, reach truck, order picker, pallet jack, rough terrain forklift).
"Powered industrial truck" — the regulatory term of art from 1910.178 — is frequently shortened to "PIT" in training narration. STT: "P-I-T" (spelled out), "pit" (common English word — the disambiguation problem is real), "powered industrial truck" (when narrated in full — usually correct). "Stability triangle" → STT: "stability triangle" (usually correct). "Dock leveler" → STT: "dock leveler," "dock leveller" (British spelling variant — not incorrect but inconsistent for US manufacturing documentation), "dock level-er." "Counterbalanced" → STT: "counterbalanced" (generally correct but sometimes "counter balanced" or "counter-balanced").
Arc Flash and Electrical Safety — NFPA 70E
Arc flash and electrical safety training is governed by NFPA 70E (Standard for Electrical Safety in the Workplace), not directly by an OSHA 1910 sub-part, though OSHA cites NFPA 70E compliance as evidence of meeting OSHA's general duty clause and 1910 Subpart S (Electrical). Arc flash training vocabulary includes: incident energy (the amount of thermal energy impressed on a surface during an arc flash event, measured in cal/cm², calories per centimetre squared); arc flash boundary (the distance at which incident energy equals 1.2 cal/cm², the level at which a worker could receive a second-degree burn in the event of an arc flash); PPE Category 1 through Category 4 (each category defines a minimum arc rating for PPE; in recent NFPA 70E editions the PPE category table is Table 130.7(C)(15)(c), while Table 130.5(G) covers PPE selection under the incident energy analysis method); arc-rated clothing (AR clothing, distinct from "flame resistant" or "FR" clothing, though both terms appear in training); and the Hierarchy of Controls applied to electrical hazards (elimination, substitution, engineering controls including guarding and interlocks, administrative controls including energised electrical work permits, and PPE).
STT failure modes: "cal/cm²" narrated as "calories per centimetre squared" → STT produces "calories per centimetre squared" (often correctly), "cal per cm squared," or "cal/cm squared" — the slash and superscript ² are formatting choices that STT does not handle. "Arc flash boundary" → correctly transcribed in most cases, but "arc-rated" (as an adjective — "arc-rated clothing") → STT: "arc-rated," "arc rated" (no hyphen), "arc-rating" (wrong part of speech). "PPE Category 2" narrated as "PPE category two" → STT: "PPE Category 2," "PPE category 2," "P-P-E Category 2" — the capitalisation of "Category" and the digit vs word form of the number vary by STT run.
Process Safety Management (PSM) — 29 CFR 1910.119
Process Safety Management training under 29 CFR 1910.119 applies to facilities that use, store, manufacture, handle, or move highly hazardous chemicals (HHCs) above threshold quantities (most commonly the 10,000-pound threshold for flammables). PSM training video covers the 14 elements of PSM: Process Safety Information (PSI), Process Hazard Analysis (PHA), Operating Procedures, Training, Contractors, Pre-Startup Safety Review (PSSR), Mechanical Integrity, Hot Work Permit, Management of Change (MOC), Incident Investigation, Emergency Planning and Response, Compliance Audits, Trade Secrets, and Employee Participation. Training narration references what-if analysis, checklist analysis, HAZOP (Hazard and Operability Study), failure mode and effects analysis (FMEA), and fault tree analysis (FTA), which are the PHA methodologies listed in 1910.119(e)(2), together with LOPA (Layer of Protection Analysis), a technique the standard itself does not name.
STT failure modes in PSM training: "HAZOP" (narrated as "HAY-zop" or "HAZOP") → STT: "hazop," "HAZOP," "hay-zop" — capitalisation inconsistent. "LOPA" (narrated as "LO-pa" or "L-O-P-A") → STT: "LOPA," "low-pa," "L.O.P.A." "Management of Change" — the specific PSM element — is abbreviated "MOC" in rapid narration; STT: "MOC," "M-O-C," "mock" (phonetic substitution). "Pre-Startup Safety Review" → STT: "pre-startup safety review," "pre startup safety review," "prestartup safety review" — hyphenation inconsistent. "Highly hazardous chemical" with the specific regulatory threshold ("10,000 pounds") → the number is generally correct but "threshold quantity" (TQ) → "TQ," "T-Q," "threshold quality" (word substitution).
Quality system training — ISO 9001:2015
ISO 9001:2015 clause structure and the training obligation
ISO 9001:2015 organises its requirements into ten numbered clauses. The first three clauses (1 Scope, 2 Normative References, 3 Terms and Definitions) are introductory; the operative requirements are in clauses 4 through 10: Clause 4 (Context of the Organization), Clause 5 (Leadership), Clause 6 (Planning), Clause 7 (Support), Clause 8 (Operation), Clause 9 (Performance Evaluation), and Clause 10 (Improvement). Internal auditor training, quality awareness training, and management system overview training all narrate these clause numbers and titles together — "clause seven-point-two Competence requires the organisation to determine the necessary competence" — at a pace that creates STT rendering challenges for the clause number plus title combinations.
Clause 7.2 (Competence) is the direct ISO 9001 mandate for training: the organization must determine the necessary competence of persons doing work under its control that affects the quality management system performance; ensure those persons are competent (on the basis of appropriate education, training, or experience); take actions to acquire the necessary competence and evaluate the effectiveness of those actions; and retain appropriate documented information as evidence of competence. The "documented information as evidence of competence" requirement creates the ISO 9001 training record obligation — and accurate caption tracks are part of the evidence that the training content was accessible and effective.
PDCA cycle and clause narration in quality training
ISO 9001 quality awareness training universally presents the Plan-Do-Check-Act (PDCA) cycle as the organisational model. PDCA is narrated both as the acronym ("P-D-C-A" or "PDCA") and as the spelled-out cycle ("Plan — design the QMS processes; Do — implement and operate; Check — monitor, measure, analyse, and evaluate; Act — take actions to improve"). STT renders "PDCA" as "P.D.C.A.," "PDCA," or occasionally "petticoat" (a phonetic guess at "P-D-C-A" in rapid speech). The spelled-out PDCA descriptions are generally handled well by STT because they use common vocabulary.
Internal auditor training is the ISO 9001 training type with the densest regulatory vocabulary. Auditor training narrates the audit programme, audit criteria, audit scope, audit evidence, audit findings (with the specific non-conformity/observation/opportunity for improvement distinction), corrective action request (CAR), and audit follow-up. The findings taxonomy is safety-critical for auditors: a "nonconformity" is a failure to fulfil a requirement; an "observation" is a finding that may develop into a nonconformity if not addressed; an "opportunity for improvement" is a suggestion but not a finding of failure. STT renders all three finding types inconsistently: "nonconformity" → "non-conformity," "nonconformance" (a different but related term), "non conformity" (no hyphen); "corrective action request" → "CAR," "C-A-R," "corrective action report" (common abbreviation confusion).
FMEA, PPAP, and APQP in automotive quality training
Automotive manufacturing quality training adds a layer of IATF 16949-specific vocabulary above the ISO 9001 base: Production Part Approval Process (PPAP), Advanced Product Quality Planning (APQP), Measurement System Analysis (MSA), Statistical Process Control (SPC), and the Automotive Core Tools suite. PPAP training narrates the 18 required PPAP elements (Design Record, Engineering Change Documents, Customer Engineering Approval, Design FMEA, Process Flow Diagram, Process FMEA, Control Plan, Measurement System Analysis Studies, Dimensional Results, Records of Material / Performance Test Results, Initial Process Study — Cpk, Qualified Laboratory Documentation, Appearance Approval Report, Sample Production Parts, Master Sample, Checking Aids, Customer-Specific Requirements, Part Submission Warrant). The "Part Submission Warrant" (PSW) — the PPAP cover document — is narrated as "PSW" or "Part Submission Warrant"; STT: "PSW," "P-S-W," "P.S.W.," "pissed-double-u" (phonetic guess in rapid speech).
8D problem solving (Eight Disciplines) is narrated in automotive and ISO quality training as a numbered sequence: D0 (Prepare for the 8D, including emergency containment action), D1 (Establish the Team), D2 (Describe the Problem), D3 (Develop Interim Containment Action), D4 (Define and Verify Root Cause), D5 (Choose and Verify Permanent Corrective Actions), D6 (Implement and Validate Permanent Corrective Actions), D7 (Prevent Recurrence), D8 (Congratulate the Team). The "D-zero through D-eight" narration sequence creates STT rendering challenges where the alpha-numeric "D-one" is rendered "D1," "D-1," "D one," "Dee one," or "Dee 1" inconsistently across the nine step descriptions (D0 through D8) in the same video.
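The 8D step labels are a closed set (D0 through D8), so the rendering variants listed above can be mapped to one form with a single pattern. The following is a minimal sketch; normalize_8d is a hypothetical name and "D1" style labels are an assumed house style. The leading "D" or "Dee" is matched case-sensitively to avoid touching contractions such as "he'd one."

```python
import re

_NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8,
}

# "D" or "Dee", an optional space or hyphen, then a digit 0-8 or a number word.
_STEP = re.compile(
    r"\b(?:D|Dee)[\s-]?((?i:zero|one|two|three|four|five|six|seven|eight)|[0-8])\b"
)

def normalize_8d(text: str) -> str:
    def replace(match: re.Match) -> str:
        token = match.group(1).lower()
        return f"D{_NUMBER_WORDS.get(token, token)}"
    return _STEP.sub(replace, text)

print(normalize_8d("After Dee one comes D two, then D-3."))
```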
Control plan vocabulary
Control plan training narrates the column headers of the control plan document: Process Number, Process Name/Operation Description, Machine/Device/Jig/Fixture (for each operation), Characteristics (Product and Process), Special Characteristics classification, Product/Process Specification/Tolerance, Evaluation/Measurement Technique, Sample Size and Frequency, Control Method, Reaction Plan. The "special characteristics" classification (high-impact characteristics identified with symbols — safety characteristic, key characteristic, significant characteristic — with customer-specific symbols varying by OEM: Ford's inverted triangle, GM's shield, Chrysler/FCA's diamond, Toyota/Honda customer-specific symbols) is narrated at speed in automotive quality training with customer-specific designation names that STT has no training data for.
Lean manufacturing and Six Sigma training vocabulary
Lean Japanese terms: the highest-density STT failure zone in manufacturing training
Lean manufacturing training carries a vocabulary layer that is unique among all training domains in the depth of its STT failure: the Japanese-origin terms that are the canonical vocabulary of the Toyota Production System and its derivatives have essentially no representation in generic STT training data. They are pronounced with approximated English phonetics by most American and European trainers, and that phonetic approximation maps onto no stable written form in STT output. The result is that a 30-minute lean manufacturing training video may have a different transcription for "heijunka" in every instance it appears.
Heijunka (production leveling)
Heijunka (平準化), production leveling or production smoothing, is the practice of levelling the type and quantity of production over a fixed period to reduce mura (unevenness). It is among the most STT-hostile terms in any training domain. The standard English-speaker pronunciation ranges from "hay-JUN-ka" to "hay-junk-ah" to "HEY-jun-ka" to "hey-JOON-ka." STT transcription outputs observed across multiple STT systems for this single term include: "HEY-junka," "HEIJUNKA" (correctly spelled but in all caps), "hay junk ah," "HEJUNKA" (no hyphen, all caps), "hayĵunka," "Hajunka," and "hey junk a." These forms are not consistent with one another, and none matches the standard lowercase romanisation "heijunka." For a training video that narrates "the heijunka box is used to visualize the heijunka schedule," the two occurrences may appear with different spellings in the same auto-generated caption track, a failure that is visible to any quality-trained learner and that undermines confidence in the entire caption track.
Poka-yoke (mistake-proofing)
Poka-yoke (ポカヨケ) — mistake-proofing or error-proofing, the design of processes and tools to prevent errors or to detect and correct errors before they cause defects — is narrated with pronunciations ranging from "POH-ka-YO-kay" to "poka-YOKE" (rhyming with "smoke") to "POH-ka-yo-kay" to "POKE-a-yoke." STT outputs include: "POKA-yoke," "poke-a-yoke" (splitting into two words with a different root meaning), "POKEY-oak" (losing the second syllable of "yoke"), "poka yoke" (correct spacing but no hyphen), "poke a yolk" (yolk substitution — common in US-accented narration of "yoke"), and "POKAyoke" (run together). The poke-a-yolk error is particularly problematic in food manufacturing training contexts where "yolk" is already a domain-relevant word (egg-processing lines use both poka-yoke quality controls and literal egg yolks).
Jidoka (autonomation)
Jidoka (自働化) — autonomation, the practice of designing machines and processes to stop automatically when a problem is detected, then requiring human judgment to diagnose and correct the problem before restarting — is the second pillar of the Toyota Production System alongside Just-in-Time. Pronounced "jih-DOH-ka" or "jee-DOH-ka" by most English-speaking trainers. STT outputs: "jidoka" (correctly spelled — the most common correct output), "jih-DOH-ka" (phonetic rendering preserved from narration — happens in some STT systems that render low-confidence transcriptions as phonetic sequences), "G-I-DOH-ka" (splitting the initial consonant from the vowel), "Gidoka" (treating as a G-initial word), "she-DOH-ka" (a Japanese pronunciation approximation that some trainers use).
Muda, muri, and mura — the waste triplet
Muda (無駄, waste), muri (無理, overburden), and mura (斑, unevenness) — the three types of operational losses in lean manufacturing — are typically narrated as a triplet: "muda, muri, and mura." Each individual word is short enough that STT generally transcribes it correctly in isolation ("muda" → "muda" or "mooda," "muri" → "muri" or "Murray," "mura" → "mura" or "Mura"). The failure mode for the triplet is sequencing: when narrated rapidly as "muda-muri-mura" or "muda, muri, and mura" in a single breath, STT may produce "muda Murray mura," "mooda muri mura," "muda, Murray, and moora," or other partial-substitution sequences that break the visual triplet pattern that lean learners expect. In training videos that present the three types with Japanese characters alongside the romanisation, the caption track must use the romanisation forms consistently for the on-screen text to match.
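One way to restore consistent romanisation for the lean terms discussed above is a variant table keyed to the standard spelling, plus a context-bound rule for the muda/muri/mura triplet. The sketch below is illustrative only: the VARIANTS table contains just the STT outputs quoted in this section, the function names are hypothetical, and "Murray" is rewritten only inside the triplet so that a real surname elsewhere in the track is left alone.

```python
import re

# STT variants quoted in this section, mapped to the standard romanisation.
VARIANTS = {
    "heijunka": ["hay junk ah", "hey junk a", "hey-junka", "hejunka", "hajunka"],
    "poka-yoke": ["poke-a-yoke", "poke a yolk", "poka yoke", "pokayoke", "pokey-oak"],
}

def _compile(variants: dict) -> tuple:
    table = {v.lower(): canon for canon, forms in variants.items() for v in forms}
    # Longest variants first so a short form never pre-empts a longer one.
    ordered = sorted(table, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(v) for v in ordered) + r")\b", re.IGNORECASE
    )
    return table, pattern

_TABLE, _LEAN = _compile(VARIANTS)

# The waste triplet, tolerating "mooda", "Murray", and "moora" in sequence.
_TRIPLET = re.compile(
    r"\b(?:muda|mooda),?\s+(?:muri|murray),?\s+(?:and\s+)?(?:mura|moora)\b",
    re.IGNORECASE,
)

def normalize_lean(text: str) -> str:
    text = _LEAN.sub(lambda m: _TABLE[m.group(0).lower()], text)
    return _TRIPLET.sub("muda, muri, and mura", text)

print(normalize_lean("The hay junk ah box and poke a yolk device cut muda Murray mura."))
```

A fixed table only catches spellings that have already been seen. New variants still need to be added by whoever reviews the caption track.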
Kanban (pull system)
Kanban (看板) — the pull-system signal used to authorize production or material replenishment — is one of the better-known lean terms in general business English, and STT generally handles "kanban" correctly. However, training vocabulary expands beyond the single word: "kanban card" (the physical or electronic signal), "WIP limit" (Work In Progress limit — "W-I-P limit" or "whip limit"), "replenishment kanban" (triggers material replenishment from a supplier or upstream process), "production kanban" (triggers production of a specific quantity), and "kanban board" (the visual management tool). "WIP" → STT: "WIP" (correctly as an acronym), "whip" (phonetic — common in rapid speech), "W.I.P." (with periods). The compound "replenishment kanban" is handled correctly in most STT outputs.
SMED (Single Minute Exchange of Die)
SMED — Single Minute Exchange of Die, the lean methodology for reducing setup and changeover time to below ten minutes (single-digit minutes) — is narrated as "S-M-E-D" (spelled out) or "smed" (as a word). STT outputs: "SMED" (correct), "S-M-E-D" (correct when spelled), "smeed" (phonetic — treating as a word with the long-e vowel), "S.M.E.D." (with periods), "speed" (substitution when narrated quickly in the context of "faster changeover").
OEE (Overall Equipment Effectiveness)
OEE — Overall Equipment Effectiveness, calculated as Availability × Performance × Quality, where each component is a ratio between 0 and 1 — is narrated as "O-E-E" (spelled out) or "oee" (as a word, rhyming with "key"). STT outputs: "OEE" (correct), "O-E-E" (correct when spelled), "oh-ee-ee" (phonetic), "OE" (dropping the third letter), "overall equipment effectiveness" (when narrated in full — usually correct).
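The OEE calculation stated above is a simple product of three ratios. A minimal sketch follows; the function name and the sample figures are hypothetical and chosen only to show the arithmetic.

```python
def oee(availability: float, performance: float, quality: float) -> float:
    """Overall Equipment Effectiveness as Availability x Performance x Quality."""
    for name, value in (
        ("availability", availability),
        ("performance", performance),
        ("quality", quality),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a ratio between 0 and 1, got {value}")
    return availability * performance * quality

# Illustrative figures only: 90% availability, 95% performance, 99% quality.
print(f"OEE = {oee(0.90, 0.95, 0.99):.1%}")
```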
TPM (Total Productive Maintenance)
TPM — Total Productive Maintenance, the manufacturing program for maximising equipment effectiveness through autonomous maintenance, planned maintenance, quality maintenance, focused improvement, early equipment management, education and training, safety/health/environment, and administrative/office TPM pillars — is narrated as "T-P-M" or "tee-pee-em." STT: "TPM" (correct), "T.P.M." (with periods), "T-P-M" (hyphenated), "tee-pee-em" (phonetic).
Andon cord and Andon system
The Andon cord (or Andon system) — the mechanism by which any worker on the production line can signal a quality or safety problem and stop the line — is narrated as "AN-don" (the standard English approximation). STT: "Andon" (usually correct when the word is prominent), "andon" (lowercase — capitalisation inconsistency), "and on" (splitting the word at the English word boundary — a frequent error in continuous speech where "andon" sounds like "and on"), "Anton" (common proper name that STT substitutes).
Six Sigma statistical vocabulary
DMAIC (Define/Measure/Analyze/Improve/Control)
DMAIC — the Six Sigma improvement methodology cycle — is narrated as "deh-MAY-ik" or "D-M-A-I-C" (spelled out). STT outputs: "DMAIC" (correct), "D-M-A-I-C" (correct when spelled), "deh-MAY-ik" (phonetic rendering in low-confidence outputs), "dee-mayic" (phonetic), "D-M-A-IC" (splitting the last two letters). The context-expanded form "Define, Measure, Analyze, Improve, and Control" is generally correct in STT. The variant DFSS (Design for Six Sigma) → "D-F-S-S," "DFSS," "design for Six Sigma" (when spelled out — correct).
DPMO (Defects Per Million Opportunities)
DPMO — Defects Per Million Opportunities, the Six Sigma metric for process quality level — is narrated as "D-P-M-O" (spelled out) or "dee-pee-em-oh" (phonetic). STT: "DPMO" (correct), "D-P-M-O" (correct), "dee-pee-em-oh" (phonetic rendering in low-confidence outputs), "D.P.M.O." (with periods). The relationship between DPMO and sigma level (3.4 DPMO = Six Sigma) is narrated with the sigma level as a number: "three-point-four DPMO at six sigma" → STT generally handles this correctly but may produce "3.4 DPMO at 6 sigma" or "three point four DPMO at six sigma" with inconsistent digit vs word form.
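The DPMO figure and its link to sigma level can be made concrete with a short calculation. This sketch assumes the conventional 1.5-sigma long-term shift, which is the convention under which 3.4 DPMO corresponds to Six Sigma; the function names and sample counts are hypothetical.

```python
from statistics import NormalDist

def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000

def sigma_level(dpmo_value: float, shift: float = 1.5) -> float:
    """Convert DPMO to a sigma level, assuming the conventional 1.5-sigma shift."""
    if not 0 < dpmo_value < 1_000_000:
        raise ValueError("DPMO must be strictly between 0 and 1,000,000")
    yield_fraction = 1 - dpmo_value / 1_000_000
    return NormalDist().inv_cdf(yield_fraction) + shift

# Illustrative counts: 17 defects across 1,000 units with 5 opportunities each.
print(round(dpmo(17, 1000, 5)))
print(round(sigma_level(3.4), 2))
```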
Capability indices: Cpk, Ppk, Cp, Pp
Process capability indices are among the most format-sensitive statistics narrated in manufacturing training video. Cpk (narrated "C-P-K" or "see-pee-kay") is the short-term process capability index, which accounts for how well the process is centered within the specification limits. Ppk (narrated "P-P-K" or "pee-pee-kay") is the long-term process performance index. Cp and Pp are the corresponding indices that ignore centering and compare only the process spread with the specification width. In training narration, the four indices often appear in close proximity: "Cp and Cpk measure short-term capability while Pp and Ppk measure long-term performance." STT renders this string with multiple substitution patterns:
- Cpk → "C-P-K": correct letter-for-letter. But STT also produces "CPK" (no hyphen, medical/cardiology context where CPK means Creatine Phosphokinase — a high-frequency medical acronym), "CPC" (K-to-C substitution, which is a common STT error when the final consonant is phonetically ambiguous), and "cpk" (all lowercase).
- Ppk → "P-P-K": STT: "PPK," "P-P-K," "peak" (phonetic substitution — "pee-pee-kay" mis-heard as "peak"), "pee-pee-k" (phonetic rendering).
- Cp → "C-P": STT: "CP," "C-P," "see-pee," "cp" — the two-letter capability index abbreviation lacks the discriminating final letter that separates Cpk from CPC.
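For readers who want the arithmetic behind the four indices, the sketch below uses the standard textbook formulas: spread index = (USL - LSL) / 6σ and centered index = min(USL - μ, μ - LSL) / 3σ. The function name and the sample values are hypothetical. The same function yields Cp/Cpk when given a short-term (within-subgroup) sigma and Pp/Ppk when given a long-term (overall) sigma.

```python
def capability(mean: float, sigma: float, lsl: float, usl: float) -> tuple:
    """Return (spread index, centered index) for the given specification limits.

    With a short-term sigma the pair is (Cp, Cpk);
    with a long-term sigma the pair is (Pp, Ppk).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    spread_index = (usl - lsl) / (6 * sigma)
    centered_index = min(usl - mean, mean - lsl) / (3 * sigma)
    return spread_index, centered_index

# Illustrative values: a centered process, then the same process drifted upward.
print(capability(mean=10.0, sigma=0.1, lsl=9.4, usl=10.6))
print(capability(mean=10.3, sigma=0.1, lsl=9.4, usl=10.6))
```

The second call shows why the two indices are narrated together: the spread index is unchanged by the drift, while the centered index drops.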
Gauge R&R (repeatability and reproducibility)
Gauge R&R is the measurement system analysis study that quantifies the variation in measurement caused by the measurement system itself: repeatability (variation when the same operator measures the same part multiple times with the same gauge) and reproducibility (variation between different operators measuring the same part). It is narrated as "Gauge R-and-R" or "Gage R-R" (the American "gage" spelling variant is accepted by AIAG and appears in automotive training). STT outputs include: "Gauge R and R" (correct spacing, common; STT writes the word "and" and does not generate the ampersand of the written form "Gauge R&R"), "Gage RR" (American spelling, conjunction dropped), "gage R-R" (American spelling, hyphen form), and "Gauge RR" (no conjunction). The gage/gauge spelling variant is a consistent issue across automotive and manufacturing quality training because the two spellings appear in the same training corpus (AIAG MSA Reference Manual uses "gage," ISO 5725 uses "gauge"), and caption formatting must be consistent with the organisation's chosen standard.
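Because the organisation has to pick one spelling and one conjunction form, the Gauge R&R variants can be normalised with the preferred spelling passed in as a parameter. This is a sketch only; normalize_grr and its spelling argument are hypothetical names.

```python
import re

# "gauge" or "gage", then R, an optional "&", "and", "-and-" or "-", then R.
_GRR = re.compile(r"\bgau?ge\s+R(?:\s*(?:&|-?and-?|-)\s*)?R\b", re.IGNORECASE)

def normalize_grr(text: str, spelling: str = "Gauge") -> str:
    """Rewrite every variant as '<spelling> R&R' using the chosen house spelling."""
    return _GRR.sub(f"{spelling} R&R", text)

print(normalize_grr("Run a gage R-R study, then a Gauge R and R review."))
print(normalize_grr("Gauge RR results", spelling="Gage"))
```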
Control charts
Control chart types are narrated in Six Sigma and SPC (Statistical Process Control) training with single-letter or abbreviated designations that are among the most STT-ambiguous terms in manufacturing training. The primary control charts and their narration patterns:
- X-bar and R chart: the variable data control chart for subgroup averages (X-bar) and subgroup ranges (R). Narrated "X-bar and R chart." STT: "X-bar and R chart" (correct), "x-bar and R chart" (lowercase x), "X bar and R chart" (no hyphen), "X-bar in our chart" (catastrophic misparse — "and R" heard as "in our").
- X-bar and S chart: the alternative to the X-bar/R chart for larger subgroup sizes, using standard deviation (S) instead of range. Narrated "X-bar and S chart." STT: "X-bar and S chart" (correct), "X-bar and ass chart" (S phonetically approximates to a common English word in some audio conditions — a documented STT error in SPC training contexts).
- IMR chart (Individual and Moving Range): the control chart for individual measurements (no subgrouping). Narrated "I-M-R chart" or "Individual and Moving Range chart." STT: "IMR chart," "I-M-R chart," "I.M.R. chart," "immer chart" (phonetic — "I-M-R" pronounced as a word).
- P-chart, NP-chart, C-chart, U-chart: attribute data control charts. Single-letter designations are maximally ambiguous in STT. "P-chart" → "P chart," "pie chart" (phonetic — extremely common substitution, and a "pie chart" is a real data visualisation term that makes contextual sense to STT), "P-chart" (correct). "NP-chart" → "N-P chart," "and P chart," "MP chart." "C-chart" → "C chart," "see chart," "C-Chart." "U-chart" → "U chart," "you chart," "U-Chart."
- CUSUM and EWMA: advanced control charts (Cumulative Sum and Exponentially Weighted Moving Average). "CUSUM" → "Q-sum," "cue-sum," "CUSUM" (correct). "EWMA" → "E-W-M-A," "EWMA" (correct), "Emma" (phonetic — "EWMA" sounds like "Emma" when narrated quickly).
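Because substitutions such as "pie chart" are valid English, a blanket find-and-replace would damage captions that really do mention a pie chart. A context gate is one possible mitigation: only rewrite when the same caption cue also contains SPC vocabulary. The sketch below assumes a hypothetical context-term list and function name (`fix_chart_names`); the terms and substitutions would need tuning against real transcripts.

```python
import re

# Hypothetical context terms suggesting the cue is about statistical process control.
SPC_CONTEXT = ("control chart", "control limit", "defective", "subgroup", "attribute data")

# Hypothetical substitutions drawn from the error patterns listed above.
SUBSTITUTIONS = [
    (re.compile(r"\bx[- ]bar in our chart\b", re.IGNORECASE), "X-bar and R chart"),
    (re.compile(r"\bpie chart\b", re.IGNORECASE), "P-chart"),
    (re.compile(r"\byou chart\b", re.IGNORECASE), "U-chart"),
]


def fix_chart_names(cue: str) -> str:
    """Correct chart-name substitutions only when the cue has SPC context."""
    lowered = cue.lower()
    if not any(term in lowered for term in SPC_CONTEXT):
        return cue  # no SPC context: leave a genuine "pie chart" untouched
    for pattern, replacement in SUBSTITUTIONS:
        cue = pattern.sub(replacement, cue)
    return cue


print(fix_chart_names("Use a pie chart to track the proportion defective in each subgroup."))
print(fix_chart_names("The pie chart shows market share by region."))
```

The gate trades recall for safety: a cue that says only "next, the pie chart" in an SPC lesson is left uncorrected and would need a wider context window or human review.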
5S and value stream mapping vocabulary
5S — the dual Japanese/English vocabulary layer
5S (Sort, Set in Order, Shine, Standardize, Sustain) is the lean workplace organization methodology. The distinctive feature of 5S training vocabulary is its dual-layer structure: each S is simultaneously presented with its Japanese source term (Seiri, Seiton, Seiso, Seiketsu, Shitsuke) and its English translation. Training video narrators move back and forth between the two forms within the same sentence: "The first S — Seiri, which we translate as Sort — means eliminating unnecessary items from the workplace." STT must therefore transcribe both the Japanese romanisation and the English translation correctly in close proximity. The five Japanese terms:
- Seiri (Sort): STT: "Seiri" (correctly romanised — reasonably handled), "Siri" (the Apple voice assistant — a very high-frequency word in modern STT training data that creates substitution risk), "Sierra" (phonetic), "C.R.I." (mis-spelling).
- Seiton (Set in Order): STT: "Seiton" (correctly romanised), "sayton," "Say-ton," "Seaton" (a surname that STT may substitute).
- Seiso (Shine): STT: "Seiso" (correctly romanised), "say-so" (phonetic — and "say-so" is a common English phrase meaning "authority to decide," creating disambiguation risk in quality management training context), "seashore" (phonetic approximation).
- Seiketsu (Standardize): STT: "Seiketsu" (correctly romanised), "say-ket-su," "seiketsoo," "Seikets." Of the five 5S Japanese terms, Seiketsu is the one most frequently garbled by STT because the "-ketsu" ending has no common English equivalent.
- Shitsuke (Sustain): STT: "Shitsuke" (correctly romanised when the narrator speaks clearly), "shitsuke" (lowercase — capitalisation inconsistency), "shit-soo-kay" (phonetic rendering that creates inappropriate content in the caption — documented in at least one publicly reported manufacturing training captioning incident), "shitzuke," "sheats-kay."
Kaizen vocabulary
Kaizen (改善) — continuous improvement — is the most widely known Japanese lean term in Western business vocabulary and is generally transcribed correctly by STT as "kaizen." However, training-specific kaizen vocabulary expands the failure surface: "kaizen blitz" (a rapid improvement event of typically 3-5 days) → STT: "kaizen blitz" (correct), "kaizen bits" (dropping the -l-). "Kaizen newspaper" (the documentation form used to track kaizen event actions and status) → STT: "kaizen newspaper" (correct), "kaizen news paper" (split). "Kaizen event" and "rapid improvement event" (RIE) are interchangeable terms in different lean traditions; RIE → "R-I-E," "RIE" (correct), "rye" (phonetic — "R-I-E" narrated as a word).
Value stream mapping (VSM)
Value stream mapping training narrates the VSM methodology: current-state map vs future-state map, value-added (VA) time vs non-value-added (NVA) time, process boxes, push arrows vs pull triangles, inventory triangles, kaizen lightning bursts, and takt time calculation. "Takt time" — the available production time divided by customer demand rate — is narrated as "takt time" (the German word "Takt" means beat or pulse). STT: "takt time" (correct — reasonably well-handled), "tact time" (common spelling error that propagates into STT output — "tact" is a common English word), "tacked time," "TAC time." The formula narration "takt time equals available production time divided by customer demand rate" is handled correctly by STT in most contexts. "Non-value-added" → "non-value added" (hyphen inconsistency), "non value-added," "NVA" (when abbreviated).
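The takt time formula narrated above can be written as a short worked example. The shift length, break time, and demand figures below are illustrative assumptions, not values from any specific plant.

```python
def takt_time_seconds(available_seconds: float, units_demanded: float) -> float:
    """Takt time = available production time / customer demand, in seconds per unit."""
    if units_demanded <= 0:
        raise ValueError("customer demand must be positive")
    return available_seconds / units_demanded


# Illustrative shift: 8 hours minus 30 minutes of breaks, with demand of 450 units.
available = (8 * 60 - 30) * 60  # 27,000 seconds of available production time
print(takt_time_seconds(available, 450))  # 60.0 seconds per unit
```

A caption that renders the term as "tact time" leaves this calculation unchanged but makes the term harder to search for in training materials, which is why the spelling is worth normalising.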
Food safety training — HACCP and FSMA
HACCP: the pronunciation disambiguation problem
HACCP — Hazard Analysis Critical Control Points, the systematic preventive approach to food safety — is the most pronounced disambiguation failure in food manufacturing training video. The standard industry pronunciation is "HACK-up" — which is phonetically identical to the common English phrase "hack up," meaning to cough or cut something into pieces. Generic STT, trained on general English text corpora where "hack up" as an ordinary phrase vastly outnumbers HACCP as a food safety acronym, frequently transcribes "HACCP" as "hack up" in phonetic-pronunciation contexts.
The failure mode is situational. When the narrator spells HACCP out ("H-A-C-C-P"), STT usually transcribes it correctly as "HACCP" or "H-A-C-C-P." When the narrator uses the pronunciation "HACK-up" in continuous speech (for example, "the HACCP plan requires identifying all critical control points"), STT may produce "the hack-up plan requires identifying all critical control points." That output is not merely inaccurate; it reads as humorous or incoherent to anyone encountering it without audio context. In a food manufacturing plant where workers receive HACCP training as a condition of employment under the facility's FSMA preventive controls programme, a caption track that renders "hack-up plan" is both a compliance failure and a credibility failure for the training programme.
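Because "hack up" is also ordinary English, a correction pass has to look at what follows the phrase. The sketch below restores "HACCP" only when the next word is one that typically follows the acronym in training narration. The follower-word list and the function name `restore_haccp` are hypothetical assumptions for illustration.

```python
import re

# Hypothetical list of nouns that commonly follow "HACCP" in training narration.
_FOLLOWERS = "plan|plans|team|training|principle|principles|program|programme|system|audit"

# Match "hack up", "hack-up", or "hackup" only when one of those nouns comes next.
_HACCP_PATTERN = re.compile(
    rf"\bhack[- ]?up(?=\s+(?:{_FOLLOWERS})\b)", re.IGNORECASE
)


def restore_haccp(cue: str) -> str:
    """Replace a phonetic 'hack up' with 'HACCP' when the following word signals the acronym."""
    return _HACCP_PATTERN.sub("HACCP", cue)


print(restore_haccp("the hack-up plan requires identifying all critical control points"))
print(restore_haccp("the cat began to hack up a hairball"))
```

The lookahead keeps the followed noun untouched and leaves the ordinary verb phrase alone; the cost is that "hack up" at the end of a cue, with no following noun, is not corrected.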
The HACCP plan structure — the seven HACCP principles — is narrated in training with numbered sequences similar to the 8D problem-solving steps: Principle 1 (Conduct a hazard analysis), Principle 2 (Identify Critical Control Points), Principle 3 (Establish critical limits), Principle 4 (Establish monitoring procedures), Principle 5 (Establish corrective action procedures), Principle 6 (Establish verification procedures), Principle 7 (Establish record-keeping and documentation procedures). The numbered-principle narration creates the same formatting inconsistencies as other numbered-sequence vocabulary: "Principle one" → "Principle 1," "principle one," "Principal 1" (principal/principle confusion — common STT error).
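The numbered-principle variants can be folded into one canonical form with a small pattern. This is a sketch under the assumption that it runs only on HACCP training tracks, since "principal one" can be legitimate English elsewhere; the function name `normalize_principle` is hypothetical.

```python
import re

_NUMBER_WORDS = {
    "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7",
}

# Matches "principle"/"principal" followed by a number word or digit from 1 to 7.
_PRINCIPLE_PATTERN = re.compile(
    r"\bprincip(?:le|al)\s+(one|two|three|four|five|six|seven|[1-7])\b",
    re.IGNORECASE,
)


def normalize_principle(cue: str) -> str:
    """Render HACCP principle references as 'Principle N'."""
    def _replace(match: re.Match) -> str:
        token = match.group(1).lower()
        return "Principle " + _NUMBER_WORDS.get(token, token)

    return _PRINCIPLE_PATTERN.sub(_replace, cue)


print(normalize_principle("principal one is to conduct a hazard analysis"))
```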
CCP (Critical Control Point)
CCP — Critical Control Point, the point in the food production process at which a control measure can be applied to prevent, eliminate, or reduce a food safety hazard to an acceptable level — is narrated as "C-C-P" (spelled out) or "see-see-pee" (phonetic). STT: "CCP" (correct), "C.C.P." (with periods), "see-see-pee" (phonetic rendering in low-confidence outputs), "CPC" (letter reversal). The phrase "Critical Control Point" is generally transcribed correctly in full. The distinction between a CCP and a Control Measure (a broader term in HACCP) and between a CCP and a Critical Quality Point (CQP — used in some food quality programs) requires consistent formatting in the caption track.
FSMA and the PCQI training mandate
The FDA Food Safety Modernization Act (FSMA), signed into law in 2011 and implemented through a series of FDA final rules beginning in 2015, fundamentally shifted US food safety regulation from a reactive (respond to contamination events) to a preventive (prevent contamination before it occurs) model. The FSMA Preventive Controls for Human Food rule (21 CFR Part 117, implementing FSMA § 103) requires that each food facility's preventive controls be overseen or verified by a Preventive Controls Qualified Individual (PCQI).
The PCQI requirement has a specific training dimension: the PCQI must have "successfully completed training in the development and application of risk-based preventive controls" consistent with the standardized curriculum recognized as adequate by FDA. The Food Safety Preventive Controls Alliance (FSPCA) delivers the recognized PCQI training curriculum, and PCQI training is delivered through a combination of in-person instruction, eLearning modules, and video content. This video content must be captioned for hearing-impaired PCQI candidates — and the captioning of PCQI training video must be accurate because PCQI training content is a regulatory credential: the FSMA rule requires the PCQI to have "successfully completed" the training, and a certificate of completion is the evidence of compliance.
"PCQI" is narrated as "P-C-Q-I" (spelled out) or "PICK-ee" (phonetic — the most common industry pronunciation). STT: "PCQI" (correct when spelled), "P.C.Q.I." (with periods), "PICK-ee" (phonetic rendering when the narrator uses the pronunciation), "picky" (a common English word phonetically identical to the PCQI pronunciation — the disambiguation problem parallels the HACCP/"hack-up" problem; "picky" is even more frequent in English general usage than "hack up").
HARPC (Hazard Analysis and Risk-Based Preventive Controls)
HARPC — Hazard Analysis and Risk-Based Preventive Controls, the FSMA equivalent of the HACCP hazard analysis under 21 CFR Part 117 — is narrated as "HAR-pee-see" or "H-A-R-P-C." STT: "HARPC" (correct), "H-A-R-P-C" (correct when spelled), "harp-c" (phonetic), "harpy-c" (phonetic substitution). "Preventive controls" (the FSMA term) is sometimes confused with "preventative controls" (a common variant pronunciation) in STT — "preventative" is a non-standard form that appears in some training audio and should be normalised to "preventive" in captions consistent with FSMA regulatory text.
SQF, BRC, GFSI, and food safety scheme vocabulary
Food manufacturing quality training for facilities certified to retail or foodservice customer requirements covers the GFSI-benchmarked food safety schemes: SQF (Safe Quality Food — the SQFI scheme operated by FMI, the Food Industry Association), BRC (now BRCGS — British Retail Consortium Global Standards), FSSC 22000 (Food Safety System Certification, based on ISO 22000), IFS Food (International Featured Standards), and GlobalG.A.P. (primarily for primary production). Training video for SQF, BRCGS, and FSSC 22000 certification preparation narrates scheme-specific vocabulary:
- SQF: narrated as "S-Q-F" or "squf." The SQF Code has three certification levels: SQF Fundamentals (Level 1), SQF Food Safety (Level 2), and SQF Food Quality (Level 3). Training narrates "SQF level two" or "SQF food safety certification." STT: "SQF" (correct), "S.Q.F." (with periods), "squf" (phonetic).
- BRCGS (formerly BRC): narrated as "B-R-C-G-S" or "BRC-GS" or simply "BRC" (using the legacy abbreviation). STT: "BRCGS," "BRC GS," "B-R-C-G-S," "BRC" (when the legacy abbreviation is used).
- PRPs (Prerequisite Programs): the foundational food safety programs (GMP, pest control, sanitation, allergen management, etc.) that underpin HACCP and HARPC. Narrated "P-R-Ps" (letter-spelled plural) or "PREPS" (phonetic). STT: "PRPs" (correct), "P-R-Ps" (correct when spelled), "PREPS" (phonetic — and "preps" is a common English word for preparations), "P.R.P.s."
Food Defense and the ALERT framework
Food Defense training, required under the FSMA Intentional Adulteration rule (21 CFR Part 121) for large food facilities, introduces the FDA ALERT acronym: Assure (that the supplies and ingredients you use are from safe and secure sources), Look (after the security of the products and ingredients in your facility), Employees (know your employees and the people coming in and out of your facility), Reports (be able to provide reports about the security of your products while they are under your control), and Threat (know what to do and whom to notify if there is a threat or issue at your facility, including suspicious behavior). ALERT is narrated as an acronym with letter-by-letter expansion: "A stands for Assure, L stands for Look," etc. STT generally handles this correctly because the letter-by-letter structure provides sufficient pausing. The compound term "intentional adulteration" (as distinct from the unintentional contamination covered by HACCP) → STT: "intentional adulteration" (correct), "intentional adultery" (a word substitution that creates an absurd meaning in food safety training context, documented in training caption error reports).
OSHA 1910 citation format failures — the compound CFR narration problem
How manufacturing trainers narrate OSHA citations
OSHA 1910 citations are narrated in manufacturing training video in several different formats, depending on the trainer's style and the specific sub-part being cited. The principal narration patterns and their STT rendering outcomes:
- "Twenty-nine CFR nineteen-ten dot one-four-seven": the full citation for LOTO (29 CFR 1910.147). STT outputs: "29 CFR 1910.147" (correct), "1910.147" (dropping the "29 CFR" prefix), "29 CFR 1910 dot 147" (rendering the decimal point as the word "dot"), "29 CFR 19 10.147" (splitting "1910" into "19 10"), "§ 1910.147" (inserting the section symbol — which the narrator did not say), "1910.147(c)(7)" (adding the paragraph designation — which the narrator may or may not have cited).
- "Nineteen-ten point one-forty-seven": informal citation (omitting the "29 CFR" prefix). STT: "1910.147," "1910 point 147," "19.10.147" (treating the dots as decimal separators rather than CFR formatting).
- "OSHA nineteen-ten": informal reference to the general industry standard without a specific sub-part. STT: "OSHA 1910" (correct), "OSHA nineteen ten" (word form), "OSHA 19-10" (treating as a two-part number).
- Sub-part narration: "nineteen-ten subpart D" (Subpart D — Walking-Working Surfaces) → "1910 Subpart D," "1910 sub-part D," "1910 Sub-Part D" — capitalisation of "Subpart" varies.
- Paragraph designation: "nineteen-ten dot one-forty-seven sub-paragraph c-seven" (1910.147(c)(7) — the specific paragraph requiring LOTO training). STT: "1910.147(c)(7)," "1910.147 (c)(7)," "1910.147 c 7," "1910.147(c)7" — parenthesis formatting is inconsistent across STT outputs.
The consequence for manufacturing employers is significant: OSHA 1910 training records — including transcripts or captioned versions of training video — may be reviewed during an OSHA inspection. An OSHA compliance officer reviewing a training record that contains "19 ten point 147" or "1910 dot 147" in a lockout/tagout training caption will recognize the training content but may note the caption quality inconsistency as evidence of training-system limitations. More critically, a production-floor worker reading an OSHA citation in a caption to locate the regulatory requirement in the CFR (as some training programs direct workers to do) cannot locate "19 ten point 147" in the Code of Federal Regulations — the citation format must be exactly "29 CFR 1910.147" or "1910.147" to be searchable.
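Several of the variants listed above are mechanical enough to repair with pattern rules. The following is a minimal sketch of such a normaliser; the function name `normalize_osha_citation` and the specific rules are assumptions for illustration, cover only Part 1910, and do not add a "29 CFR" prefix the narrator did not say.

```python
import re


def normalize_osha_citation(text: str) -> str:
    """Repair common STT renderings of 29 CFR 1910 section citations."""
    # Rejoin a split part number ("19 10", "19.10", "19 ten") when a section number follows.
    text = re.sub(
        r"\b19[ .-](?:10|ten)\b(?=\s*(?:dot|point|\.)\s*\d)",
        "1910",
        text,
        flags=re.IGNORECASE,
    )
    # Spoken separator: "1910 dot 147" or "1910 point 147" becomes "1910.147".
    text = re.sub(
        r"\b1910\s*(?:dot|point)\s*(\d{2,4})\b", r"1910.\1", text, flags=re.IGNORECASE
    )
    # Paragraph designators: "1910.147 c 7", "1910.147(c)7", "1910.147 (c)(7)".
    # Only lowercase letters are handled; a bare letter-plus-number after a citation is assumed
    # to be a paragraph reference, which may not hold in every transcript.
    text = re.sub(
        r"\b(1910\.\d{2,4})\s*\(?([a-z])\)?\s*\(?(\d{1,2})\)?", r"\1(\2)(\3)", text
    )
    return text


for raw in ["29 CFR 1910 dot 147", "29 CFR 19 10.147", "19 ten point 147", "1910.147 c 7"]:
    print(raw, "->", normalize_osha_citation(raw))
```

Running the rules in this order matters: the part number is repaired first so that the separator and paragraph rules can anchor on a clean "1910".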
State-plan state enhanced requirements
OSHA-approved State Plans operate in 29 states and territories (several of them cover only state and local government workers), with requirements that must be "at least as effective" as federal OSHA standards and may be more stringent. Manufacturing employers in state-plan states must comply with state-specific requirements narrated in training alongside the federal 1910 citations:
- California — Cal/OSHA: California operates one of the most comprehensive state OSHA programs. Cal/OSHA requires an Injury and Illness Prevention Program (IIPP) under 8 CCR § 3203 (Title 8, California Code of Regulations, Section 3203). The citation format "8 CCR section 3203" or "Title 8, Section 3203" appears in California manufacturing training alongside the federal 1910 citations, creating a dual-citation vocabulary layer. STT: "8 CCR 3203," "8 CCR section 3203," "Title 8 Section 3203," "8 C.C.R. 3203." Cal/OSHA's specific standards (such as the Cal/OSHA LOTO standard at 8 CCR § 3314, which is more stringent than 29 CFR 1910.147 in several respects) are cited in California manufacturing training in addition to the federal standard.
- Michigan — MIOSHA: Michigan Occupational Safety and Health Administration (MIOSHA) standards are cited in Michigan manufacturing training (Michigan is a major manufacturing state: automotive, tool and die, plastics). MIOSHA citation formats follow a "Part XX" structure (for example, MIOSHA Part 85 for Hazardous Energy Control, the LOTO standard) that differs from the federal CFR sub-part structure. STT: "MIOSHA Part 85," "MY-osha Part 85," "MIOSHA part 85."
- Washington — WISHA: Washington Industrial Safety and Health Act (WISHA) standards for Washington State. "WISHA" → STT: "WISHA" (correct), "wish-a," "Wisha" (capitalisation inconsistency). Washington State manufacturing training (Boeing supply chain, wood products, agriculture processing) cites WAC (Washington Administrative Code) standards in addition to federal OSHA.
Environmental training — ISO 14001:2015
Environmental management system training vocabulary
ISO 14001:2015 environmental management system (EMS) training covers the EMS structure (context of the organization, environmental aspects and impacts, legal and other requirements, objectives and targets, operational controls, emergency preparedness and response, monitoring and measurement, internal audit, management review). The ISO 14001 training vocabulary has significant overlap with ISO 9001 clause vocabulary (both standards share the Annex SL High Level Structure), which means that manufacturers pursuing an integrated ISO 9001/14001 management system (increasingly common) deliver integrated training that narrates both standard clause structures in the same video.
Environmental aspects and impacts training is the core conceptual content of ISO 14001: an environmental aspect is an element of an organisation's activities, products, or services that can interact with the environment; an environmental impact is a change to the environment, whether adverse or beneficial, wholly or partially resulting from those aspects. The "aspects and impacts" relationship — and the determination of significant environmental aspects — is narrated in training with the phrase "significant environmental aspect" appearing dozens of times. STT: "significant environmental aspect" (correct — generally well-handled), "significant environmental aspects" (plural form — correctly transcribed), "EMS" → "E-M-S," "EMS," "em-es."
Emergency preparedness and spill response vocabulary
Environmental emergency preparedness training in manufacturing covers spill response scenarios: secondary containment (the berms, dikes, and containment structures designed to hold spilled material), spill response kit components, SPCC (Spill Prevention, Control, and Countermeasure) plan requirements under 40 CFR Part 112 for facilities storing oil in quantities above threshold levels, and emergency coordinator roles. "SPCC" → "S-P-C-C," "SPCC" (correct), "speck-see" (phonetic). "Secondary containment" → "secondary containment" (correct — generally well-handled), "second-ary containment" (unusual pause). "40 CFR Part 112" → "40 CFR Part 112," "40 CFR 112," "forty C.F.R. Part 112" — the CFR citation format failures observed in OSHA 1910 training apply identically to EPA 40 CFR training.
Compliance obligations for manufacturing training video
OSHA 1910 documented-training requirements
OSHA 1910 imposes specific documented-training requirements across multiple sub-parts. Each requirement has its own training content specification and record-retention expectation:
- 1910.147(c)(7) — LOTO training: requires training for authorized employees (who perform LOTO), affected employees (who operate in LOTO areas), and other employees (who work in areas where LOTO is used). Employees must be retrained whenever there is reason to believe they do not understand the energy control procedure. Records must document employee name, training date, and training certification.
- 1910.146(g) — Confined space entry training: requires training for entrants, attendants, and entry supervisors before they first perform confined space entry duties, whenever there is a change in assigned duties, whenever there is a change in permit space operations, and whenever the employer has reason to believe deficiencies exist. Training records must include employee name, trainer name, training date.
- 1910.132(f) — PPE training: requires training for each employee required to use PPE before they perform work requiring PPE use. Training must cover when PPE is necessary, what PPE is necessary, how to properly don/doff/adjust/wear PPE, limitations of PPE, and care/maintenance/life/disposal of PPE.
- 1910.178(l) — Forklift/PIT training: requires training before initial use and reevaluation at least every 3 years (or sooner if unsafe operation is observed). Training must cover truck-related topics (load capacity, stability triangle, pre-operation inspection, hazards) and workplace-related topics (surface conditions, pedestrian traffic, narrow aisles).
- 1910.119(g) — PSM training: requires initial training for all employees involved in operating a highly hazardous chemical process (overview of the process and operating procedures); refresher training at least every 3 years; and training documentation (employee identity, date of training, and the means used to verify that the employee understood the training).
- 1910.1200(h) — HazCom/GHS training: requires training on the hazard communication program elements (SDS, labels, chemical-specific hazard information) when employees are initially assigned to work with hazardous chemicals and whenever a new physical or health hazard is introduced into the work area. See the HazCom chemical name captions reference for the specific STT failure modes in GHS/HazCom training.
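Two of the requirements in the list above carry a fixed three-year interval, which a training-record system can turn into a due date. The sketch below is illustrative only: the table and the function name `next_refresher_due` are hypothetical, and the event-triggered standards (LOTO, confined space, PPE, HazCom) are intentionally left out because their retraining has no fixed calendar interval.

```python
from datetime import date
from typing import Optional

# Fixed intervals named in the list above, in years.
REFRESH_YEARS = {
    "1910.178(l)": 3,  # forklift operator re-evaluation
    "1910.119(g)": 3,  # PSM refresher training
}


def next_refresher_due(standard: str, last_completed: date) -> Optional[date]:
    """Return the latest date the next refresher is due, or None if event-triggered."""
    years = REFRESH_YEARS.get(standard)
    if years is None:
        return None
    target_year = last_completed.year + years
    try:
        return last_completed.replace(year=target_year)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year; fall back to Feb 28.
        return last_completed.replace(year=target_year, day=28)


print(next_refresher_due("1910.178(l)", date(2023, 6, 15)))  # 2026-06-15
print(next_refresher_due("1910.147(c)(7)", date(2023, 6, 15)))  # None
```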
ISO 9001:2015 clause 7.2 Competence
ISO 9001:2015 clause 7.2 (Competence) requires the organization to: (a) determine the necessary competence of person(s) doing work under its control that affects the performance and effectiveness of the QMS; (b) ensure those persons are competent on the basis of appropriate education, training, or experience; (c) where applicable, take actions to acquire the necessary competence and evaluate the effectiveness of those actions; and (d) retain appropriate documented information as evidence of competence. The documented information requirement creates a direct ISO 9001 training record obligation. For manufacturers with third-party ISO 9001 certifications (or IATF 16949, AS9100, ISO 13485, ISO/IEC 17025 accreditations that include clause 7.2 by reference), the training record is a certification audit artefact — auditors will examine training records to verify competence evidence during certification and surveillance audits.
The clause 7.2(c) requirement to "evaluate the effectiveness" of training creates an additional caption accuracy imperative: training effectiveness evaluation typically uses a post-training assessment (quiz, practical demonstration, or supervisor observation). For hearing-impaired workers who relied on captions to receive the training content, an inaccurate caption track creates a systematic disadvantage in the training effectiveness assessment — the worker was not exposed to the same information content as hearing workers. A caption track that mis-transcribed "Cpk" as "CPC" or "heijunka" as "hay-junk-ah" means the worker was exposed to a different text-form of the terminology than the assessment will test, creating an accessibility-based assessment gap that clause 7.2(c) does not contemplate.
FSMA PCQI training mandate
The FDA FSMA Preventive Controls for Human Food rule (21 CFR Part 117) requires that the preventive controls in a facility's food safety plan be prepared by — or overseen by — a Preventive Controls Qualified Individual (PCQI). A PCQI must be a qualified individual who has successfully completed training in the development and application of risk-based preventive controls at least equivalent to the standardized curriculum recognized as adequate by FDA. The Food Safety Preventive Controls Alliance (FSPCA) PCQI training course is the FDA-recognized curriculum. The FSPCA PCQI training course is a blended-learning program (in-person instruction plus online modules); the online modules are delivered as video-based eLearning content. For hearing-impaired food safety professionals pursuing PCQI qualification, accurate captions on the online FSPCA training modules are required for equal access to the credential. The PCQI credential itself is a regulatory requirement — not merely professional development — so the accessibility obligation is direct: an inaccurate caption track on PCQI training video denies a hearing-impaired food safety professional equal access to a regulatory credential required for their job function.
ADA Title I — manufacturing employer obligations
ADA Title I applies to manufacturing employers with 15 or more employees — which covers virtually every manufacturing facility that delivers the training content types described in this reference. Manufacturing facilities are large physical workplaces with production, quality, maintenance, engineering, warehouse, and office functions; the workforce is diverse by tenure, education level, and, in many facilities, primary language. Manufacturing employers have ADA Title I accommodation obligations for hearing-impaired production workers, quality technicians, maintenance technicians, safety coordinators, and supervisors who receive mandatory OSHA 1910 and ISO 9001 training.
The life-safety dimension distinguishes manufacturing training from other training domains: an inaccurate LOTO caption that mis-transcribes the authorized/affected/other employee distinction, or a confined space caption that renders "permit-required" as "permit required" without the regulatory hyphenation, or a machine guarding caption that loses the nip-point/pinch-point distinction — these are not merely compliance failures. They are failures of the training system to communicate safety-critical information to workers whose safety depends on understanding it. The ADA Title I effective-communication standard for employer-provided training is not satisfied by a caption track that conveys some of the information some of the time.
Section 503 and VEVRAA for federal contractor manufacturers
Manufacturing companies with federal contracts (defence contractors, aerospace suppliers, government equipment manufacturers, VA medical supply chain) are subject to Section 503 of the Rehabilitation Act (requiring affirmative action and nondiscrimination for qualified individuals with disabilities) and VEVRAA (the Vietnam Era Veterans' Readjustment Assistance Act). Section 503 and VEVRAA requirements under OFCCP (Office of Federal Contract Compliance Programs) oversight include accessible training and onboarding for employees with disabilities. For Section 508-covered federal agencies that use manufacturing contractor training materials, the Section 508 caption requirement applies directly to any training content produced or procured with federal funds; the Revised 508 Standards incorporate WCAG 2.0 Level A and AA, including SC 1.2.2 Captions (Prerecorded).
Manufacturing LMS and training delivery context
Manufacturing training video is delivered through both industry-specific manufacturing LMS platforms and general enterprise LMS systems. Caption support and workflow vary significantly across these platforms:
- Alchemy Systems (now Intertek Alchemy): the dominant LMS specifically built for food, beverage, and consumer goods manufacturing. Alchemy Systems training content focuses on food safety (HACCP, FSMA, GMP), allergen management, sanitation, and operational compliance for production-floor workers, including multi-language training delivery. Caption file upload is supported for video content; auto-generation through Alchemy's platform is limited. For the food manufacturing sector, where HACCP and PCQI training video must be captioned, the Alchemy platform is the primary delivery vehicle for the training content with the densest HACCP vocabulary.
- ComplianceQuest: a Salesforce-native QMS/EHS/LMS platform designed for manufacturing and life sciences. ComplianceQuest integrates training management with quality events (CAPAs, nonconformances, audits) and EHS incidents, so that training completion records are linked to quality system records. Caption support mirrors Salesforce platform capabilities.
- EtQ (now ETQ, part of Hexagon): a QMS-integrated training management system for quality system training in manufacturing. EtQ training is particularly prevalent in the medical device, pharmaceutical, and chemical manufacturing sectors for ISO 9001, ISO 13485, and GMP training management. Caption file upload is supported.
- Intelex (EHSQ platform): an EHS/Quality/Sustainability platform with integrated training management for manufacturing and industrial facilities. Intelex training management is used extensively for OSHA 1910 compliance training tracking and ISO 14001 EMS training. Caption file management follows the platform's document management architecture.
- Enablon (EHS/sustainability platform): an EHS and sustainability management platform with training management capabilities for large manufacturing enterprises. Enablon deployments at global manufacturers (chemical, energy, industrial) manage OSHA and environmental training record compliance.
- IQMS/Plex (ERP-integrated training): manufacturing ERP systems with integrated training management (Plex Manufacturing Cloud, now part of Rockwell Automation; IQMS, now DELMIAworks from Dassault Systèmes) link training records directly to production job assignments. In these systems, a worker's training completion record for a specific LOTO procedure or ISO 9001 operation is a prerequisite for that worker to be assigned to the corresponding production job. The training video content and its caption track are embedded in the ERP system's operational workflow rather than in an isolated LMS, which makes caption accuracy a production-system data integrity issue.
- SAP SuccessFactors: large manufacturing enterprises (Toyota, Caterpillar, GE, 3M, Honeywell, Emerson) use SAP SuccessFactors or comparable enterprise HCM suites for manufacturing workforce training at scale. SuccessFactors Learning supports caption file upload (SRT/VTT) for hosted video content; caption generation is not built in.
- Cornerstone OnDemand: widely used in large manufacturing enterprises for corporate learning programmes alongside manufacturing-specific compliance training. Cornerstone OnDemand caption workflow uses SRT/VTT sidecar upload for hosted video and SCORM-wrapped caption embedding for SCORM content.
- Docebo: used in mid-market manufacturing companies for blended learning. Docebo's caption ingestion workflow requires VTT format with BCP-47 language tags and a two-step subtitle track API.
The GlossCap approach for manufacturing training video
Manufacturing training vocabulary has a large shared base layer that spans all manufacturing facilities plus a facility-specific overlay that is unique to each plant. GlossCap's approach applies both layers simultaneously:
The shared manufacturing base layer covers all lean/Six Sigma Japanese terms and their English translations (heijunka, poka-yoke, jidoka, muda/muri/mura, kanban, kaizen, 5S Japanese/English pairs, SMED, OEE, TPM, Andon), all Six Sigma statistical vocabulary (DMAIC, DFSS, DPMO, Cpk/Ppk/Cp/Pp, Gauge R&R/gage R&R, all control chart type designations including X-bar/R, X-bar/S, IMR, P-chart, NP-chart, C-chart, U-chart, CUSUM, EWMA), all OSHA 1910 citation formats (29 CFR 1910.147, .146, .132, .178, .119, .212, .1200 with common paragraph designations), all ISO 9001 clause numbers and titles (clauses 4 through 10 with all sub-clauses), all HACCP structural vocabulary (7 HACCP principles, CCP, critical limit, HARPC, PRPs), FSMA regulatory vocabulary (PCQI, FSPCA curriculum, 21 CFR Part 117, § 103), and environmental training vocabulary (aspects and impacts, SPCC, 40 CFR Part 112, ISO 14001 EMS).
The facility-specific overlay covers the equipment model names and designations in use at the specific manufacturing plant (brand names of presses, lathes, CNC machines, filling lines, packaging equipment), production area names (Line 1 through Line N, specific cell names, cleanroom designations for food or pharma manufacturing facilities), quality system document identifiers and numbering conventions (SOP number formats, work instruction codes, control plan document numbers), and product names being manufactured at the facility (product line names, SKU designations, proprietary blend or alloy names). For automotive tier-one suppliers, the overlay also includes customer-specific quality vocabulary (Ford Q1, GM Supplier Quality, Stellantis (formerly Chrysler/FCA) supplier requirements, Toyota North America TMSV requirements), which varies by plant because it reflects the specific OEM customers that plant serves.
The result is that a GlossCap-processed manufacturing training video caption track renders "heijunka" as "heijunka" (not "hay-junk-ah"), "poka-yoke" as "poka-yoke" (not "poke-a-yolk"), "Cpk" as "Cpk" (not "CPC"), "HACCP" as "HACCP" (not "hack up"), "PCQI" as "PCQI" (not "picky"), and "29 CFR 1910.147" as "29 CFR 1910.147" (not "19 ten point 147") — consistently, across every occurrence in the video, without requiring manual caption review and correction of each instance.
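The two-layer idea can be illustrated with a small post-processing sketch. This is not GlossCap's implementation: the dictionaries, the function names, and the "Line 3 Filler" overlay entry are hypothetical, and the sketch assumes the mis-renderings are already known as literal strings. Ambiguous renderings that are also ordinary words (for example "picky") are deliberately left out, because a blanket replacement would corrupt legitimate uses.

```python
import re

# Shared base layer: known mis-renderings mapped to canonical written forms.
BASE_LAYER = {
    "hay-junk-ah": "heijunka",
    "poke-a-yolk": "poka-yoke",
    "hack up": "HACCP",
    "19 ten point 147": "29 CFR 1910.147",
}

# Facility-specific overlay (hypothetical entry for one plant).
FACILITY_OVERLAY = {
    "line three filler": "Line 3 Filler",
}


def build_glossary(base: dict[str, str], overlay: dict[str, str]) -> dict[str, str]:
    """Merge both layers; overlay entries win when the same variant appears twice."""
    merged = dict(base)
    merged.update(overlay)
    return merged


def apply_glossary(text: str, glossary: dict[str, str]) -> str:
    """Replace every known variant with its canonical form, case-insensitively."""
    if not glossary:
        return text
    # Longest variants first so a longer phrase is not pre-empted by a shorter one.
    variants = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(v) for v in variants) + r")(?!\w)",
        re.IGNORECASE,
    )
    lookup = {variant.lower(): canonical for variant, canonical in glossary.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)
```

Because the merged glossary is applied in a single pass, every occurrence of a variant in a caption track receives the same canonical spelling.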
FAQ — manufacturing training captions
Does ISO 9001 require captions on training video?
ISO 9001:2015 clause 7.2 (Competence) does not mention captions, audio-visual accessibility, or any specific delivery format for training content. What clause 7.2 requires is that the organization ensure relevant persons are competent, take actions to acquire competence where needed, evaluate the effectiveness of those actions, and retain documented information as evidence of competence. The captioning obligation under ISO 9001 is therefore indirect: if a hearing-impaired quality technician cannot access ISO 9001 training video through accurate captions, the training is not effective for that individual, and the evidence of competence for that individual is compromised. Clause 7.2(c) requires evaluating the effectiveness of the actions taken to acquire competence, and a training programme that is inaccessible to a segment of the workforce cannot demonstrate effective training for that segment. The direct captioning obligation comes from ADA Title I (for US manufacturing employers with 15+ employees) and, in some states, from state fair-employment statutes such as California's FEHA. ISO 9001 certification auditors may note inaccessible training materials as a clause 7.2 observation (not necessarily a nonconformity, but a potential gap in competence evidence) during a surveillance or recertification audit. IATF 16949 and AS9100 auditors, who tend to examine training effectiveness evidence more stringently than general ISO 9001 auditors, are more likely to probe this gap.
How do heijunka and poka-yoke create specific STT failure modes that a glossary can fix?
Heijunka and poka-yoke fail in generic STT for the same structural reason: they are Japanese-origin words romanised into English orthography, pronounced by English-speaking manufacturing trainers with English phonetic approximations, and represented in essentially no English-language STT training data in either their written romanised form or their spoken English phonetic form. Generic STT, encountering "heijunka" in audio, has no prior reference for what it sounds like or how it should be written. The STT model therefore makes its best guess from acoustic similarity to words it has seen — producing "HEY-junka," "hay-junk-ah," "HEJUNKA," "hey junk a," or other phonetic approximations. Each occurrence of "heijunka" in the training video may produce a different guess, because the acoustic input to the STT model varies slightly with the narrator's speed, emphasis, and audio environment. A glossary applied during STT inference or post-processing resolves this by providing the canonical written form "heijunka" as a high-probability match for the observed acoustic input, regardless of the specific phonetic approximation used. The same mechanism applies to poka-yoke, jidoka, and the 5S Japanese terms: the glossary entry provides a stable written target that supersedes the STT model's uninformed phonetic guessing. Without a glossary, a caption reviewer must manually identify and correct every non-standard rendering — across potentially hundreds of occurrences in a lean manufacturing training catalogue.
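One way to picture the post-processing variant of this mechanism is a fuzzy match between short word windows in the STT output and the glossary's canonical forms. The sketch below uses the standard library's `difflib.SequenceMatcher` as a crude stand-in for acoustic similarity; a production system would more plausibly score against phonetic or acoustic features. The function names, the term list, and the 0.8 threshold are assumptions chosen for illustration, and the simple whitespace tokenization drops punctuation attached to a replaced window.

```python
import re
from difflib import SequenceMatcher

CANONICAL_TERMS = ["heijunka", "poka-yoke", "jidoka"]


def squash(s: str) -> str:
    """Lowercase and drop everything except letters, so 'hey junk a' -> 'heyjunka'."""
    return re.sub(r"[^a-z]", "", s.lower())


def best_match(candidate: str, terms: list[str], threshold: float = 0.8):
    """Return (term, score) for the closest canonical term, or None below threshold."""
    if not terms:
        return None
    key = squash(candidate)
    score, term = max(
        (SequenceMatcher(None, key, squash(t)).ratio(), t) for t in terms
    )
    return (term, score) if score >= threshold else None


def normalize(text: str, terms: list[str], max_words: int = 3,
              threshold: float = 0.8) -> str:
    """Replace the best-scoring 1..max_words window at each position, if any."""
    words = text.split()
    out, i = [], 0
    while i < len(words):
        best = None  # (score, term, window size)
        for n in range(1, max_words + 1):
            if i + n > len(words):
                break
            hit = best_match(" ".join(words[i:i + n]), terms, threshold)
            if hit and (best is None or hit[1] > best[0]):
                best = (hit[1], hit[0], n)
        if best:
            out.append(best[1])
            i += best[2]
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)
```

The threshold trades recall against false positives: lowering it catches more distant approximations but raises the chance that an ordinary word is rewritten as a glossary term.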
What is the FSMA PCQI training requirement, and does it apply to captions?
The FSMA Preventive Controls for Human Food rule (21 CFR Part 117) requires that a food facility's food safety plan be prepared, or its preparation overseen, by one or more Preventive Controls Qualified Individuals (PCQIs) (21 CFR 117.126(a)(2) and 117.180(a)). Under 21 CFR 117.180(c)(1), a PCQI must have successfully completed training in the development and application of risk-based preventive controls at least equivalent to that received under a standardized curriculum recognized as adequate by FDA, or be otherwise qualified through job experience. The Food Safety Preventive Controls Alliance (FSPCA) delivers the FDA-recognized PCQI training curriculum, which is available as a blended-learning programme combining instructor-led instruction with online video-based eLearning modules. The PCQI qualification is a regulatory requirement, not optional professional development: a facility's food safety plan must be overseen by a PCQI, and for most candidates the practical route to qualification is the FSPCA curriculum or equivalent training. For hearing-impaired food safety professionals pursuing PCQI qualification through the online FSPCA eLearning modules, the ADA Title III obligation (for the FSPCA training provider) and the ADA Title I obligation (for the employer who directed the employee to complete PCQI training) both call for accurately captioned online training modules. The PCQI pronunciation "PICK-ee" creates the same disambiguation problem as HACCP's "HACK-up": generic STT trained on general English text data encounters the acoustic signal for "PICK-ee" and maps it to "picky", a vastly more common English word. A glossary entry for PCQI that maps the acoustic "PICK-ee" pronunciation to the written form "PCQI" resolves this disambiguation without requiring manual caption correction in every FSMA training video.
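Because "picky" and "hack up" are also ordinary English, a text-level correction has to be gated on context rather than applied everywhere. The sketch below is one hypothetical way to do that: it rewrites the ambiguous form only when the same sentence contains at least one food-safety cue word. The cue list, the mapping, and the function name are illustrative assumptions, and a sentence such as "picky about food" would still be rewritten incorrectly, so this is a starting point rather than a complete disambiguator.

```python
import re

# Hypothetical cue words that suggest a food-safety training context.
CONTEXT_CUES = {"fsma", "fspca", "preventive", "controls", "food", "safety", "hazard"}

# Spoken-form renderings that collide with ordinary English.
AMBIGUOUS = {"picky": "PCQI", "hack up": "HACCP"}


def resolve_ambiguous(sentence: str, min_cues: int = 1) -> str:
    """Rewrite ambiguous renderings only when enough context cues are present."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    if len(tokens & CONTEXT_CUES) < min_cues:
        return sentence  # no domain evidence: leave ordinary English alone
    for spoken, written in AMBIGUOUS.items():
        sentence = re.sub(
            rf"\b{re.escape(spoken)}\b", written, sentence, flags=re.IGNORECASE
        )
    return sentence
```

Raising `min_cues` makes the gate stricter, which suits mixed-content catalogues where food-safety vocabulary is only occasionally present.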
Do food manufacturing plants have different captioning obligations than other manufacturing employers?
Food manufacturing plants are subject to the same ADA Title I employer accommodation obligations as all manufacturing employers with 15+ employees. Where food manufacturing plants differ from other manufacturing is in the density and specificity of the regulatory vocabulary in their training content, and in the existence of training-specific regulatory mandates (FSMA PCQI) that create independent accessibility obligations. The FSMA PCQI requirement is unique: it is a credential that requires a specific training curriculum, that curriculum is delivered through video-based eLearning, and that credential is a regulatory prerequisite for a specific job function. No equivalent credential-driven training requirement exists in other manufacturing sectors with the same specificity of curriculum and delivery format. Food manufacturing plants are also the principal users of Alchemy Systems (Intertek Alchemy), which is purpose-built for food and beverage manufacturing training and hosts the highest concentration of HACCP, GMP, and food safety training video in any industry-specific LMS. Caption accuracy for food safety training content on Alchemy Systems directly affects the FSMA and HACCP training programme compliance of a large share of the US food manufacturing sector. GFSI-benchmarked food safety scheme certifications (SQF, BRCGS, FSSC 22000) also include training requirements in their audit standards; GFSI scheme auditors review training records and may examine training content accessibility as part of the food safety culture assessment that most GFSI schemes now include.
How does lockout/tagout training vocabulary — specifically the LOTO authorized/affected/other employee distinction — create specific caption accuracy requirements?
29 CFR 1910.147 defines three categories of employees with distinctly different LOTO training requirements: authorized employees (who perform lockout or tagout of machinery and must be trained on the energy control procedures, hazardous energy types, and methods for isolating and controlling energy); affected employees (who operate or use the machines or equipment on which service or maintenance is performed, or who work in areas where LOTO is used; they must be trained on the purpose and use of the LOTO procedure but do not perform lockout themselves); and other employees (whose work operations are or may be in an area where LOTO procedures are used; they must be trained to understand that they must not start, energize, or use equipment that has been locked out). The three-category distinction is the core safety architecture of LOTO: an authorized employee may place and remove a lock; an affected employee may not remove a lock (they would be removing safety protection on equipment that an authorized employee is working on); and other employees may not interact with locked-out equipment at all. A caption track that conflates "authorized employee" with "affected employee", whether through substitution of one for the other in rapid narration or through dropping the category qualifier entirely, creates a direct safety hazard by obscuring the employee's rights and responsibilities under the LOTO procedure. The 1910.147(c)(7)(i)(A) specification for authorized employee training content and the 1910.147(c)(7)(i)(B) content for affected employees are legally distinct requirements, and the training content for each must clearly identify which category is being addressed. STT conflation of "authorized" and "affected", two words of similar length and somewhat similar phonetic profile in rapid speech, is a documented source of LOTO training caption errors. A LOTO-specific manufacturing glossary that includes the three employee categories as defined terms with phonetic profiles resolves this conflation.
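Where a narration script is available, a simple automated check can flag caption tracks in which the employee categories may have been swapped or dropped. The sketch below counts category mentions in the script and in the captions and reports any category whose counts differ. It assumes a reference script exists, the names are hypothetical, and a count comparison will miss two swaps that cancel each other out, so it supplements human review instead of replacing it.

```python
import re
from collections import Counter

CATEGORIES = ("authorized", "affected", "other")
CATEGORY_PATTERN = re.compile(
    r"\b(authorized|affected|other)\s+employees?\b", re.IGNORECASE
)


def category_counts(text: str) -> Counter:
    """Count mentions of each LOTO employee category in a block of text."""
    return Counter(m.group(1).lower() for m in CATEGORY_PATTERN.finditer(text))


def category_mismatches(script: str, captions: str) -> dict[str, tuple[int, int]]:
    """Return {category: (expected, actual)} for categories whose counts differ."""
    expected = category_counts(script)
    actual = category_counts(captions)
    return {
        cat: (expected[cat], actual[cat])
        for cat in CATEGORIES
        if expected[cat] != actual[cat]
    }
```

An empty result means the counts agree; any non-empty result identifies the video as a candidate for manual review of the LOTO category wording.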
Further reading
- Safety training video captions: OSHA, MSHA, and HazCom vocabulary
- Compliance training video captions: SOX, HIPAA, GDPR acronym vocabulary
- FDA-regulated training captions: GxP, 21 CFR Part 11, and WCAG 2.1 AA
- Engineering onboarding captions: technical vocabulary in new-hire training
- ADA Title I captions: employer accommodation for regulated-industry training
- WCAG 2.1 AA captions: what SC 1.2.2 "accurately convey the audio" requires
- Section 508 captions: federal contractor and government supply chain manufacturers
- Cornerstone OnDemand captions: enterprise LMS for manufacturing training
- Workday Learning captions: enterprise HCM training for large manufacturers
- HazCom training captions: chemical names, IUPAC vocabulary, and OSHA 1910.1200
- Why 99% caption accuracy matters: the WCAG 2.1 AA threshold for manufacturing compliance training
- Glossary-biased captioning with Whisper: engineering term accuracy in manufacturing training