
Whatfix captions: digital adoption platform training video exports, vocabulary failure modes, and WCAG 2.1 AA compliance

Whatfix is a Digital Adoption Platform used by enterprise and mid-market companies to build interactive walkthroughs for complex software applications: SAP, Salesforce, ServiceNow, Workday, Oracle, Microsoft 365, and proprietary internal tools. When Whatfix walkthroughs are exported as training videos (MP4) and uploaded to an LMS, those MP4 files contain no caption track. The captioning step must happen between the Whatfix export and the LMS upload, and it faces a double vocabulary challenge: the Whatfix platform's own terminology combined with the dense, highly proprietary vocabulary of whichever enterprise application is being demonstrated. Generic speech-to-text (STT) fails on both layers at once, because SAP T-codes, Salesforce Cloud names, ServiceNow CMDB vocabulary, and Workday business process terminology are all out-of-vocabulary for standard STT models. ADA Title I (42 U.S.C. § 12112) applies to mandatory employee software onboarding and training delivered through an LMS at employers with 15 or more employees. California FEHA (Gov. Code § 12940(m)) applies from five employees, so it covers virtually all California tech companies in Whatfix's target market. Section 508 applies to Whatfix customers in federal agencies and to federal contractors using Whatfix to create training for government systems.

TL;DR

Whatfix exports interactive software walkthroughs as MP4 video files, which is useful for asynchronous LMS training, but the export has no caption track. Captions must be added before the video is uploaded to Cornerstone, Workday Learning, SAP SuccessFactors, TalentLMS, or any other LMS. The vocabulary failure is double-layered: Whatfix product names (SmartTips, Flows, Beacons, ContextBot) fail in generic STT, and so does the vocabulary of the enterprise application being demonstrated (SAP T-codes like "FB01" and "ME21N," Salesforce Cloud names like "Agentforce" and "Revenue Cloud," ServiceNow terms like "CMDB" and "GlideRecord"). ADA Title I (15+ employees), California FEHA (5+ employees), and Section 508 (federal agencies/contractors) apply to Whatfix-generated training video. The fix is a corrected VTT or SRT file, generated with a two-layer glossary covering both the target application vocabulary and Whatfix's own product terminology, and uploaded to the LMS alongside the Whatfix MP4 export.

How Whatfix generates training video and why captions require a separate step

The Whatfix content creation workflow

Whatfix operates as a browser-based or desktop overlay on top of enterprise applications. An L&D professional or application trainer uses the Whatfix Studio to record a guided walkthrough of a software workflow — clicking through screens, completing form fields, triggering system actions — while adding annotations, tooltips (SmartTips), step narration, and guidance overlays. This walkthrough, when published, runs as an in-application interactive guide that employees can follow inside the live application.

Whatfix provides three primary output formats for training content:

The MP4 video export does not include a caption track. Whatfix Studio does not generate subtitles, SRT files, or VTT files as part of the export workflow. The exported MP4 contains the screen recording with synchronized narration audio, but there is no accompanying text representation of that audio for hearing-impaired learners.

Where captions must be added in the Whatfix-to-LMS pipeline

The captioning workflow for Whatfix-generated training video is a post-export, pre-upload step. After the L&D team exports the Whatfix flow as MP4, and before uploading to the LMS, a caption file must be prepared and attached. The specific mechanism for attaching the caption file depends on the target LMS:

In all cases, the caption file must exist as a separate VTT or SRT file before the LMS upload. There is no post-upload captioning path within Whatfix itself.

Why the Whatfix video export specifically needs high-accuracy captioning

Whatfix MP4 exports are not general-purpose instructional videos. They are screen-capture walkthroughs of enterprise software workflows, narrated to accompany specific screen states. The audio track of a Whatfix export has characteristics that substantially increase the vocabulary error rate for generic STT:

Enterprise application vocabulary failure modes in Whatfix training exports

SAP implementations: T-codes, module abbreviations, and Fiori app names

SAP is one of Whatfix's largest deployment contexts. Enterprise companies running SAP S/4HANA or SAP ECC use Whatfix to create guided walkthroughs for finance, procurement, supply chain, HR, and other SAP processes. The vocabulary failure surface in SAP-based Whatfix exports is among the most severe of any enterprise application:

For companies using both Whatfix and SAP, the captioning challenge compounds with the SAP Enable Now authoring tool, which is discussed in detail on the SAP Enable Now captions page. Whatfix and SAP Enable Now serve similar DAP/authoring purposes and are often deployed together or considered as alternatives; the vocabulary requirements are parallel.

Salesforce implementations: Cloud names, Agentforce, and org-specific vocabulary

Salesforce is the other dominant Whatfix deployment context. Companies use Whatfix to guide sales teams, service teams, and administrators through Salesforce workflows — CRM data entry, approval processes, CPQ configuration, Salesforce Flow automation. The Salesforce vocabulary failure modes in Whatfix exports include:

For Salesforce Trailhead training specifically, the Salesforce Trailhead captions page covers the Trailhead-specific surface in more detail. Whatfix generates the internal employee walkthrough layer; Trailhead provides the official certification training layer.

ServiceNow implementations: Now Platform vocabulary and CMDB terminology

ServiceNow is deployed at 85% of the Fortune 500 and across 70+ US federal agencies. Whatfix is commonly used to guide employees and administrators through ServiceNow Now Platform workflows — incident management, change management, service catalog requests, CMDB updates, and HR case management. The ServiceNow vocabulary failure surface includes:

The ServiceNow Learning captions page covers the Now Learning training platform specifically. Whatfix walkthroughs generate the custom workflow training layer that sits alongside Now Learning's official product education.

Workday implementations: business process and HCM terminology

Workday is widely deployed for HR, finance, and planning. Whatfix walkthroughs for Workday guide employees through self-service HR transactions, manager workflows, and finance processes. Workday vocabulary failures come from two sources. The first is Workday-specific terminology: business process, worklet, integration, report writer, calculated field, supervisory organization, position management, headcount plan. The second is the transition from legacy systems (SAP HCM, Oracle HRMS), which creates dual-naming contexts in training narration.

ADA Title I, Section 508, and EAA for Whatfix customers

ADA Title I: mandatory software onboarding and training

ADA Title I (42 U.S.C. § 12112) requires employers with 15 or more employees to provide reasonable accommodations for employees with disabilities, including hearing impairments. Software onboarding training delivered through Whatfix-generated videos is mandatory training — employees are required to learn the systems they use to do their jobs. A hearing-impaired employee assigned to complete a Whatfix-generated SAP onboarding training module has the same right to an accurate, accessible training experience as any hearing employee.

The ADA Title I accommodation standard requires functional equivalence: the hearing-impaired employee should be able to obtain the same information and comprehension from the training as a hearing employee watching without captions. A Whatfix video export captioned with generic STT at 80-90% accuracy — with SAP T-codes, Salesforce API names, and ServiceNow CMDB vocabulary systematically misrecognized — does not provide functional equivalence. The employee cannot follow the precise procedural steps described in the narration if the technical terms are garbled.

The ADA obligation attaches to the mandatory nature of the training assignment, not to the software platform. Using Whatfix as the content creation tool and Cornerstone or Workday Learning as the LMS delivery platform does not modify the employer's accommodation obligation. The obligation to provide accessible training exists because the training is mandatory, regardless of which tools create and deliver it.

Section 508 for federal agencies and contractors using Whatfix

Whatfix is deployed at federal agencies and at federal contractors who build and maintain government IT systems. Section 508 (29 U.S.C. § 794d) requires that federal agencies' electronic and information technology, including training content, be accessible to employees with disabilities. The technical standard referenced by Section 508 is WCAG 2.0 Level AA (for web content) and parallel standards for multimedia. For training video derived from Whatfix exports, this means captions that satisfy WCAG Success Criterion 1.2.2 (Captions, Prerecorded), which in practice requires that the captions accurately convey the spoken content.

Federal contractors providing Whatfix-generated training to agency employees as part of a contract deliverable are bound by the same Section 508 standards through FAR clause 52.239-2 (or equivalent agency-specific clauses). A system integrator using Whatfix to create SAP S/4HANA onboarding training for a federal agency must deliver captioned training video that meets Section 508 standards — which means corrected caption files, not uncaptioned or auto-captioned exports.

For a comprehensive overview of the federal training video captioning obligation, see Section 508 captions for federal training video.

California FEHA for California technology companies

California FEHA (Gov. Code § 12940(m)) requires employers with five or more employees to provide reasonable accommodations for employees with disabilities. Whatfix's primary market is technology and enterprise software companies — a market heavily concentrated in California. The five-employee threshold makes virtually every California technology company that deploys Whatfix a FEHA-covered employer from its earliest growth stage. A 10-person startup using Whatfix to build product onboarding walkthroughs for its own internal team is a FEHA-covered employer for any hearing-impaired employee it hires.

California also requires specific mandatory training that creates captioning obligations when delivered as video: harassment prevention training under AB 1825/SB 1343 (two hours for supervisors, one hour for non-supervisors), pay equity and FEHA training for HR staff, and workplace safety training under Cal/OSHA standards. If any of these mandatory trainings are delivered as Whatfix-generated video, the FEHA accommodation obligation requires accurate captions.

European Accessibility Act for EU-deployed Whatfix implementations

The European Accessibility Act (EAA), enforceable from June 28, 2025, requires that digital products and services placed on the EU market meet WCAG 2.1 AA accessibility standards. For companies using Whatfix to create training video for employees in EU member states, the EAA applies to training content accessed by those EU-based employees. The practical consequence: Whatfix-generated MP4 exports uploaded to an LMS for EU-employee training must include WCAG 2.1 AA-compliant captions. YouTube auto-captions or uncaptioned exports do not satisfy the EAA standard.

The EAA applies in parallel with EU employment law: Article 5 of the EU Employment Framework Directive (2000/78/EC) requires employers to provide reasonable accommodations for employees with disabilities, including accessible training materials. A hearing-impaired employee in Germany, France, or the Netherlands assigned mandatory SAP onboarding training via Whatfix-generated video has the right to accessible captions under both the EAA framework and national employment accommodation law.

Caption file formats for Whatfix LMS integrations

VTT and SRT: the universal sidecar formats

The two caption formats that cover essentially all LMS upload scenarios for Whatfix-generated MP4 exports are WebVTT (.vtt) and SubRip (.srt). Both formats represent the same information — timestamped text segments synchronized to the video timeline — with minor syntactic differences.
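The syntactic differences are small enough that one format can be derived from the other mechanically. The following is a minimal sketch, assuming a well-formed SRT input: WebVTT requires a "WEBVTT" header line and uses a period rather than a comma before the milliseconds in timestamps, while SRT cue numbers can be carried over as WebVTT cue identifiers. The function name srt_to_vtt is illustrative and not part of any Whatfix or LMS tooling.

```python
import re

# SRT timestamps look like 00:01:02,500 (comma before the milliseconds).
_SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert SubRip text to WebVTT text (sketch; assumes well-formed SRT)."""
    # Drop a leading byte-order mark and normalize Windows line endings.
    cleaned = srt_text.lstrip("\ufeff").replace("\r\n", "\n").strip()
    out = ["WEBVTT", ""]
    for line in cleaned.split("\n"):
        if "-->" in line:
            # Only timing lines are rewritten, so commas in cue text survive.
            line = _SRT_TIME.sub(r"\1.\2", line)
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = (
        "1\n00:00:01,000 --> 00:00:04,200\nOpen transaction FB01.\n\n"
        "2\n00:00:04,500 --> 00:00:07,000\nHover over the SmartTip.\n"
    )
    print(srt_to_vtt(example))
```

Keeping one corrected master file and deriving the other format from it avoids maintaining two sets of vocabulary corrections.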

For Whatfix exports used in SCORM packages, the VTT file is included in the SCORM package directory and referenced in the video element's track tag within the HTML content page. The SCORM player in the LMS then serves the caption track alongside the video. For direct video uploads to the LMS, the VTT or SRT file is uploaded as a sidecar alongside the MP4 in the LMS content management interface.
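As an illustration of the SCORM case, the sketch below builds the HTML video markup with a track element pointing at the sidecar VTT file. The file names are hypothetical, and the surrounding content page and SCORM manifest are assumed to exist already; how a particular LMS's SCORM player renders the track is not covered here.

```python
from html import escape


def video_with_captions(mp4_name: str, vtt_name: str,
                        lang: str = "en", label: str = "English") -> str:
    """Return HTML5 video markup that references a WebVTT caption track.

    Both files are assumed to sit in the same package directory as the
    HTML page that embeds this markup.
    """
    return (
        "<video controls>\n"
        f'  <source src="{escape(mp4_name)}" type="video/mp4">\n'
        f'  <track kind="captions" src="{escape(vtt_name)}" '
        f'srclang="{escape(lang)}" label="{escape(label)}" default>\n'
        "</video>"
    )


if __name__ == "__main__":
    # Hypothetical file names for a Whatfix MP4 export and its caption file.
    print(video_with_captions("fb01_walkthrough.mp4", "fb01_walkthrough.vtt"))
```

The track element expects WebVTT, so an SRT master file would need converting before it is packaged this way.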

Platform-specific caption upload paths for Whatfix LMS targets

The following LMS platforms are common targets for Whatfix MP4 export uploads, with different caption upload mechanisms:

Building the glossary for Whatfix training video captioning

Two-layer glossary for DAP-generated content

Captioning Whatfix-generated training video requires a two-layer vocabulary approach because the audio contains two distinct vocabulary domains simultaneously:

Layer 1: the target application vocabulary — the enterprise software being demonstrated. This includes product names, module names, T-codes or equivalent identifiers, UI element labels, field names, dropdown values, and workflow step names. For SAP implementations, this means the SAP standard vocabulary plus the organization's custom Z-objects and custom T-codes. For Salesforce orgs, this means Salesforce product names plus org-specific custom object names and picklist values. This layer is the larger and more variable of the two.

Layer 2: the Whatfix platform vocabulary — terms specific to Whatfix itself that appear when the walkthrough narrates Whatfix UI elements or when the training content explains how to use the Whatfix Self Help widget. Whatfix-specific terms include: SmartTip, Flow, Task list, Beacon, Pop-up, Slideout, Guided Tour, Survey, NPS, Self Help widget, ContextBot, Whatfix Guidance, Whatfix Analytics, Task Manager, Flows Editor, Whatfix Studio. For internal training teams explaining how to use Whatfix itself, or for organizations training Whatfix administrators, the Layer 2 vocabulary is directly present in the audio.

GlossCap's glossary input accepts both layers as a combined glossary. The captioning engine applies glossary terms preferentially during the STT decoding process, recovering correct terminology for both the application layer and the DAP layer simultaneously. For organizations using Whatfix across multiple enterprise applications, a consolidated master glossary covering all deployed applications plus the Whatfix platform terms is more efficient than per-video glossaries.
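A consolidated master glossary can be assembled with very little tooling. The sketch below merges the two layers, removes case-insensitive duplicates, and records which layer each term came from. The data structure and field names are illustrative assumptions, not GlossCap's documented input format.

```python
def build_glossary(application_terms, whatfix_terms):
    """Merge both vocabulary layers into one de-duplicated, sorted list.

    Duplicates are detected case-insensitively; the first spelling seen wins,
    and application-layer terms are processed before Whatfix-layer terms.
    """
    merged = {}
    layers = (("application", application_terms), ("whatfix", whatfix_terms))
    for layer, terms in layers:
        for term in terms:
            cleaned = term.strip()
            key = cleaned.casefold()
            if cleaned and key not in merged:
                merged[key] = {"term": cleaned, "layer": layer}
    return sorted(merged.values(), key=lambda entry: entry["term"].casefold())


if __name__ == "__main__":
    layer_1 = ["FB01", "ME21N", "CMDB", "GlideRecord", "Agentforce"]
    layer_2 = ["SmartTip", "Beacon", "ContextBot", "Self Help widget"]
    for entry in build_glossary(layer_1, layer_2):
        print(f'{entry["term"]}\t{entry["layer"]}')
```

Tracking the source layer makes it easier to update one application's terms at a release boundary without touching the rest of the glossary.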

Iterative glossary refinement for multi-application DAP deployments

Large enterprises that deploy Whatfix across five or more enterprise applications — SAP, Salesforce, Workday, ServiceNow, and one or more proprietary internal applications — accumulate vocabulary that spans different domains and update cycles. Application upgrades introduce new vocabulary (Salesforce Spring/Summer/Winter releases, SAP enhancement packages, ServiceNow Now Platform releases) that requires glossary updates. The recommended approach is a living glossary that is updated with each application release cycle and validated against new Whatfix content before LMS upload.
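One way to run that validation step is to check each caption file for the glossary terms the narration is known to contain and to flag any that are absent, since an absent term usually means it was misrecognized. This is a minimal sketch under that assumption: the list of expected terms per video is supplied by the reviewer, and the function names are hypothetical.

```python
import re


def caption_text(vtt: str) -> str:
    """Return cue text only, dropping the header, timing lines, and numeric cue IDs."""
    kept = []
    for line in vtt.splitlines():
        stripped = line.strip()
        if (not stripped or stripped.startswith("WEBVTT")
                or "-->" in stripped or stripped.isdigit()):
            continue
        kept.append(stripped)
    return " ".join(kept)


def missing_terms(vtt: str, expected_terms):
    """List expected glossary terms that never appear as whole words in the captions."""
    text = caption_text(vtt).casefold()
    missing = []
    for term in expected_terms:
        # Lookarounds stop "FB01" from matching inside a longer token.
        pattern = r"(?<!\w)" + re.escape(term.casefold()) + r"(?!\w)"
        if not re.search(pattern, text):
            missing.append(term)
    return missing


if __name__ == "__main__":
    sample = (
        "WEBVTT\n\n1\n00:00:01.000 --> 00:00:04.000\n"
        "Open transaction FB01 and post the document.\n\n"
        "2\n00:00:04.500 --> 00:00:08.000\n"
        "Then create the purchase order in me 21 n.\n"
    )
    print(missing_terms(sample, ["FB01", "ME21N"]))
```

A saved report of the missing terms for each video can also serve as the quality control record mentioned for regulated industries.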

For regulated industries (healthcare, financial services, government), the glossary validation step also serves a compliance documentation function: demonstrating that the captioning workflow included a vocabulary-specific quality control step is evidence of due diligence in meeting the applicable accessibility standard.


FAQ — Whatfix captions

Does Whatfix generate captions automatically when exporting video?

No. Whatfix video exports (MP4) do not include a caption track. Whatfix Studio exports the screen recording with synchronized narration audio as a standard MP4 file. No VTT, SRT, or other caption format is generated as part of the Whatfix export workflow. Captioning is a post-export step that must happen before the MP4 is uploaded to the LMS. This is the same pattern as most authoring tool exports — Camtasia, Articulate Storyline, Adobe Captivate, and Lectora also export MP4 (or SCORM packages containing MP4) without embedded caption tracks unless the author explicitly adds captions in the authoring tool before export. See Camtasia captions and Articulate Storyline captions for the authoring-tool-side captioning comparison.

We use Whatfix for SAP onboarding training — what specific SAP vocabulary needs to be in the glossary?

For SAP-based Whatfix training, the vocabulary glossary should include: all standard SAP T-codes used in the narration (FB01, ME21N, MIGO, FBL3N, VA01, MM01, and every other T-code referenced in the walkthroughs), all SAP module abbreviations (FI, CO, SD, MM, PP, HCM, PM, QM, PS, etc.), SAP product names (S/4HANA, Fiori, BTP, SuccessFactors, Ariba, Concur), SAP-specific terms (BAPI, IDOC, ABAP, RFC, ALE, EDI, SLD, SolMan, transport request, change request, NWBC, Launchpad), and all organization-specific custom T-codes and Z-table names used in your implementation. Your SAP basis team or implementation partner typically has a glossary of custom objects that can serve as the starting point. For a detailed SAP vocabulary analysis in the context of authoring tools, see SAP Enable Now captions, which covers the same vocabulary domain from the SAP-native authoring perspective.
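If written narration scripts exist for the walkthroughs, a first pass over them can seed the T-code portion of the glossary. The sketch below is a heuristic only: it assumes T-codes appear in the script as two to four capital letters followed by digits and an optional trailing letter, which catches codes such as FB01, ME21N, and custom Z-codes with numeric suffixes but misses all-letter codes such as MIGO. Those still need to be added by hand or taken from the basis team's list.

```python
import re

# Heuristic pattern: 2-4 capital letters, 1-3 digits, optional trailing letter.
_TCODE_LIKE = re.compile(r"\b[A-Z]{2,4}\d{1,3}[A-Z]?\b")


def candidate_tcodes(script_text: str):
    """Return T-code-like tokens from a narration script, in first-seen order."""
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(_TCODE_LIKE.findall(script_text)))


if __name__ == "__main__":
    script = (
        "Post the invoice in FB01, then check the line items in FBL3N. "
        "Create the purchase order with ME21N and post the receipt in MIGO. "
        "Run the custom report ZFI001, then return to FB01."
    )
    print(candidate_tcodes(script))
```

The output is a list of candidates for human review, not a finished glossary.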

We have Whatfix deployed on Salesforce — does the Salesforce vocabulary analysis on the Salesforce Trailhead captions page apply here too?

Yes. The Salesforce product vocabulary (Salesforce Cloud names, Agentforce, Einstein AI sub-products, Apex, SOQL, LWC, Flow Builder), Salesforce-specific terms, and org-specific custom object/field names that produce vocabulary failures in Salesforce training video are exactly the same vocabulary present in Whatfix walkthrough narration for Salesforce implementations. The Salesforce Trailhead captions page covers this vocabulary in detail — use that vocabulary analysis as the basis for your Whatfix-on-Salesforce glossary. The difference is the delivery mechanism: Trailhead is Salesforce's official certification training; Whatfix generates the custom internal workflow training layer. Both have the same Salesforce vocabulary requirement.

We're a federal contractor using Whatfix to build training for a federal agency — what are the Section 508 requirements for our Whatfix exports?

Federal contractors delivering training to federal agencies as part of a contract are subject to Section 508 requirements. Section 508 (29 U.S.C. § 794d) and its implementing regulations at 36 CFR Part 1194 require that electronic and information technology — including training video — be accessible to people with disabilities. The applicable technical standard is the Revised 508 Standards (2017), which references WCAG 2.0 Level AA for web content. For prerecorded video, WCAG SC 1.2.2 (Captions — Prerecorded) requires captions that accurately convey all spoken content. Whatfix MP4 exports delivered to a federal agency without a corrected VTT or SRT caption file do not meet Section 508 standards and create contract compliance exposure. The FAR clause 52.239-2 (where incorporated) requires that supplies and services meet Section 508 standards. Your contracting officer or agency technical representative may specify the exact format and accuracy requirement in the statement of work or quality acceptance criteria. See Section 508 captions for the full federal compliance analysis.

How does captioning Whatfix video differ from captioning Articulate Storyline or Camtasia exports?

The LMS upload workflow and caption file format requirements are essentially identical: all three produce MP4 (or SCORM packages containing MP4) that require a separate VTT or SRT sidecar file. The key difference is the vocabulary profile. Articulate Storyline and Camtasia exports typically contain narrated training content produced by an instructional designer (ID), with general instructional vocabulary. Whatfix exports are screen captures of live enterprise software with narrated UI interactions, which means the vocabulary is driven by the target application being walked through rather than by an ID's narration script. A Camtasia export on a cybersecurity awareness course has cybersecurity vocabulary. A Whatfix export on a ServiceNow incident management workflow has ITSM vocabulary plus ServiceNow Now Platform vocabulary plus the organization's CMDB configuration vocabulary. The vocabulary failure rate for Whatfix exports is typically higher than for ID-produced narrated training because the vocabulary is more proprietary and more variable. See Camtasia captions and Articulate Storyline captions for the authoring-tool comparison.

Further reading