eLearning Authoring Operations · Published 2026-06-18

Captioning eLearning modules built in Articulate Storyline, Rise 360, and Adobe Captivate: SCORM caption workflow, authoring tool failure modes, and WCAG compliance for eLearning content

When L&D teams talk about captioning training content, the conversation almost always centres on video assets: upload the video, generate a caption file, attach the SRT to the LMS record. This workflow is well-understood, and the GlossCap blog corpus covers it in depth across every major LMS, content type, and compliance framework. But a significant share of corporate training content — in many organisations, the majority of it — is not raw video. It is interactive eLearning built in Articulate Storyline 360, Articulate Rise 360, Adobe Captivate, Lectora Online, or iSpring Suite. These courses have narration on every slide, are packaged as SCORM or HTML5 output, and are delivered through the LMS exactly like a video course — except that the caption workflow is completely different, less understood, and more likely to produce a WCAG compliance gap that nobody on the L&D team realises is there.

The scale of the authoring-tool caption problem is large. Articulate 360 alone has approximately twelve million users. A typical mid-market L&D team maintains fifty to three hundred Storyline modules alongside their video library. Each module has narration on most of its slides. Narration audio on a slide is audio that requires captions under WCAG 2.1 SC 1.2.2 for prerecorded synchronized media — the same standard that governs the training videos the team is already captioning. The authoring tool library is not exempt from SC 1.2.2 because the content is delivered via SCORM rather than MP4. WCAG applies to the content, not the container format.

The tool-specific caption architectures vary in ways that create predictable failure modes. Articulate Storyline has a built-in closed captions feature that most L&D teams believe solves the compliance problem — it often does not. Storyline CC is driven by the Notes panel text, not by ASR on the recorded audio. If a live narrator was used and the Notes text was not maintained as a perfect script transcript, Storyline CC shows the script rather than the spoken words, which is a WCAG accuracy failure at SC 1.2.2. Articulate Rise 360 has no native caption editor at all — captions for video blocks in Rise must be set at the video host level (Vimeo, Wistia, YouTube, Loom) before the video is embedded in Rise. Direct MP4 uploads to Rise video blocks bypass the video host entirely, which means there is no mechanism to attach a caption track. Many Rise courses in production have uncaptioned video blocks for exactly this reason. Adobe Captivate, Lectora, and iSpring all require a separate SRT import step that many L&D teams skip because it falls outside the standard publish workflow.

This guide covers the complete caption workflow for the five most widely used eLearning authoring tools in corporate L&D: Articulate Storyline 360, Articulate Rise 360, Adobe Captivate, Lectora Online, and iSpring Suite. For each tool, the coverage includes the native caption feature and where it falls short, the correct captioning workflow for both live-narrated and TTS-narrated content, the SCORM packaging implications for caption delivery, and LMS-specific delivery considerations. The guide also covers the Section 508 applicability to SCORM content, vocabulary failure modes in authoring tool narration, a production workflow at scale, eight failure modes specific to the authoring tool context, and a seven-question FAQ addressing the most common L&D team questions about eLearning caption compliance.

TL;DR — six things every L&D team producing eLearning needs to know

Storyline's built-in CC uses Notes panel text, not ASR. It captures what was typed in the Notes panel, synchronised to the narration timeline. If a live narrator was used and the Notes text has any deviation from the spoken audio — ad-hoc phrasing, corrections, additional context — the CC shows the script, not the spoken words. This is a WCAG SC 1.2.2 accuracy failure. For TTS narration (Storyline AI Voice), Notes text and audio match by definition, and text-sync CC works correctly.
Rise 360 has no native caption editor. Captions for Rise video blocks must be set at the video host level — Vimeo, Wistia, YouTube, or Loom — before the video is embedded in Rise. Direct MP4 uploads to Rise video blocks have no caption capability. Audio blocks in Rise (audio lesson format) have no caption UI; a text transcript in a separate block is required to satisfy WCAG SC 1.2.1.
SCORM packages carry captions in the HTML5 output — the LMS cannot add them after upload. Captions in Storyline and Captivate are embedded in the published HTML5 output. When the SCORM package is uploaded to the LMS, the captions are already there (or not). No LMS can retroactively add captions to SCORM content after the package is uploaded.
Captivate, Lectora, and iSpring require a separate SRT import before publish. These tools support SRT/VTT caption import, but only if the step is taken before the course is published. The import is a separate step in the authoring tool's caption or media panel — it is not automatic, and courses published without the import step have no captions in the output.
WCAG SC 1.2.2 applies to narrated eLearning slides. A narrated Storyline slide is prerecorded synchronized media: it has audio (the narration) paired with time-synchronized visual content (the slide). This is exactly the content type SC 1.2.2 governs. Section 508 holds SCORM content to the same standard via WCAG2ICT guidance and 36 CFR Part 1194 Appendix A.
The authoring-tool TTS narration paradox applies to Storyline AI Voice and Captivate TTS. If someone runs ASR on the TTS audio output to generate captions (instead of deriving captions from the narration script), accuracy degrades 5–10 percentage points compared to human-narrated audio. The correct approach is to derive captions from the narration script text, not from running a captioning service on TTS audio. The same paradox documented for Synthesia and HeyGen AI-generated video applies here.

Why eLearning authoring tools are different from video assets

The three caption delivery architectures in eLearning

Raw video training content has one caption delivery architecture: the video file is paired with a caption sidecar (SRT, VTT, or TTML), and the LMS presents both to the learner. The video player reads the sidecar and overlays caption text at the correct timestamps. This is the architecture that the LMS caption documentation describes, that the caption format guide addresses, and that most L&D caption workflows are built around.

eLearning authoring tools use three different architectures, none of which match the sidecar model exactly:

Architecture 1: Text-synchronised captions (Articulate Storyline built-in CC). The narration script is stored in the authoring tool's notes or script field. At publish time, the tool generates a caption track by synchronising the notes text to the narration audio timeline. The published HTML5 output contains the caption text as inline content, not as a separate sidecar file. The LMS receives a SCORM package that already contains the caption data embedded in the HTML5 output. This architecture captures what was typed — not what was spoken — which makes it accurate for TTS narration and potentially inaccurate for live-narrated content.

Architecture 2: SRT/VTT import (Adobe Captivate, Lectora, iSpring). These tools require the L&D team to produce a caption file externally — by running the narration audio through a captioning service, generating a transcript, timing it, and exporting as SRT or VTT — then import that file into the authoring tool before the course is published. The caption file is attached to the narration asset within the tool. At publish time, the tool embeds the caption data in the HTML5 output. The accuracy of the captions depends on the quality of the external captioning step. If the import step is skipped, the published SCORM package has no captions.

Architecture 3: Video-host-level captions (Articulate Rise 360 video blocks). Rise 360 embeds videos from external video hosts — Vimeo, Wistia, YouTube, Loom, or Brightcove — using an iframe embed. The captions are not part of the Rise course; they are part of the video on the host platform. When a learner accesses the Rise course and reaches a video block, they see the host platform's native video player (with the CC button, if captions have been configured at the host). Rise inherits the host's caption state but cannot modify it. If the video was uploaded to Vimeo without a caption file, the Rise video block shows no captions. If the video was uploaded directly to Rise (direct MP4 upload), there is no video host layer — and no caption capability.

Why SCORM packaging changes the caption model

SCORM 1.2 and SCORM 2004 packages are ZIP files containing HTML5, JavaScript, media assets, and a manifest (imsmanifest.xml). When an L&D team publishes a Storyline course to SCORM, the published output, including all caption data, is bundled into the ZIP. The team uploads the ZIP and the LMS extracts it. The HTML5 content then runs inside the LMS in an iframe. The LMS does not process the caption content; it just runs the HTML5 output as delivered.

This means three things for caption management:

First, the LMS cannot add captions to a SCORM package after upload. The major LMS and video platforms (TalentLMS, Docebo, Cornerstone OnDemand, Workday Learning, Kaltura, SAP Litmos) can add caption tracks to video assets stored in their media libraries. None of them can modify the caption content of a SCORM package that has already been published and uploaded. The LMS migration caption checklist notes this asymmetry: video caption sidecars can be managed post-upload, but SCORM caption data is frozen in the package.

Second, when a SCORM package moves between LMS platforms (as part of an LMS migration or a content vendor switch), the captions in the package move with it — or don't, depending on how they were implemented. Storyline CC captions (embedded as HTML5 inline text) survive the move. Captivate SRT-import captions (embedded in the published JS) survive the move. Video-host captions (Rise with Vimeo) survive the move only if the Vimeo configuration is maintained — the Rise package still contains the Vimeo iframe, which still points to the Vimeo video with its caption configuration.

Third, some LMS platforms reprocess SCORM packages on ingest — particularly older Moodle installations and some Cornerstone configurations — in ways that can break CSS or JavaScript references, including the code that controls caption button visibility. A SCORM package that shows captions correctly in Articulate Review 360 or in a local SCORM test player may fail to show captions in the target LMS. Testing in the actual LMS before release is not optional.
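
Because a SCORM package is an ordinary ZIP with a manifest at its root, a first-pass inspection can be scripted. The following is a minimal sketch, not a compliance check: the function name and the list of caption extensions are assumptions made for illustration, and a package with no sidecar-style caption files may still carry captions embedded inline in its HTML5 or JavaScript output.

```python
import zipfile
import xml.etree.ElementTree as ET

# Extensions that usually indicate a sidecar-style caption file. This list is an
# assumption for illustration; authoring tools may embed captions inline instead.
CAPTION_EXTENSIONS = (".vtt", ".srt", ".ttml", ".dfxp")


def inspect_scorm_package(package):
    """Summarise a SCORM ZIP: manifest presence, launch file, caption-like files.

    `package` is a path or a binary file-like object.
    """
    with zipfile.ZipFile(package) as zf:
        names = zf.namelist()
        has_manifest = "imsmanifest.xml" in names
        launch_href = None
        if has_manifest:
            root = ET.fromstring(zf.read("imsmanifest.xml"))
            # Ignore XML namespaces: match on the local tag name only.
            for element in root.iter():
                if element.tag.split("}")[-1] == "resource" and element.get("href"):
                    launch_href = element.get("href")
                    break
        caption_files = [n for n in names if n.lower().endswith(CAPTION_EXTENSIONS)]
    return {
        "has_manifest": has_manifest,
        "launch_href": launch_href,
        "caption_files": caption_files,
    }
```

An empty `caption_files` list is a prompt to open the course and press the CC button, not proof that captions are missing. The test in the target LMS described above remains the deciding check.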

WCAG applicability to narrated eLearning slides

WCAG 2.1 SC 1.2.2, Captions (Prerecorded), requires captions for "all prerecorded audio content in synchronized media." The question of whether a narrated eLearning slide is "synchronized media" is settled: it is. WCAG defines synchronized media as "audio or video synchronized with another format for presenting information and/or with time-based interactive components, unless the media is a media alternative for text that is clearly labeled as such." A narrated Storyline slide synchronises narration audio with slide animations, layer reveals, and timeline-triggered interactions. It is exactly the content type SC 1.2.2 governs.

WCAG defines "prerecorded" as "information that is not live", a qualifier that applies the moment the course is published. Even if a course was produced yesterday, the narration is prerecorded at the point of distribution. SC 1.2.2 is a Level A success criterion, so it is included in any WCAG 2.1 AA conformance claim (AA being the conformance level cited in ADA litigation, Section 508, and the EAA).

The practical implication: an L&D team that has captioned its video library and believes its WCAG compliance work is complete may have missed its entire Storyline and Rise library — which is not a video library and therefore escaped the video captioning workflow. The enterprise LMS caption audit methodology notes that eLearning modules are commonly excluded from caption audits because the audit tools are optimised for finding videos without sidecar files, not SCORM packages without embedded caption data.

Articulate Storyline 360: built-in CC, SRT import, and the Notes panel dependency

How Storyline's built-in closed captions feature works

Articulate Storyline 360 includes a closed captions feature (Insert > Captions in the Storyline ribbon) that generates caption text from the Notes panel content and synchronises it to the narration audio timeline. When the course is published, a CC button appears in the player controls. Learners can toggle captions on and off. The caption text is displayed as an overlay or as a separate caption area depending on the player skin configuration.

The key technical detail: Storyline generates the caption track by aligning the Notes panel text to the narration audio. It does this by either (a) using the narration recording's timing markers if the narration was recorded inside Storyline, or (b) performing a basic text-to-audio alignment algorithm if the audio was imported from an external file. In both cases, the source of the caption text is the Notes panel — not a speech-to-text process on the audio itself.

This architecture has a significant implication. If the Notes panel text accurately reflects what the narrator said, the captions are accurate. If the narrator deviated from the script — adding context, rephrasing, correcting an error, skipping a sentence — the captions show the scripted version while the audio contains the spoken version. From a WCAG SC 1.2.2 perspective, the caption must "provide equivalent access to the audio content." A caption that shows the scripted version of a sentence while the narrator speaks a different phrasing is not providing equivalent access to the audio — it is providing equivalent access to the script. These are not the same thing when the narration deviates, which it does regularly in live-recorded eLearning.

When Storyline CC satisfies WCAG and when it doesn't

Satisfies WCAG SC 1.2.2: Storyline AI Voice (TTS narration from the Notes text). When Storyline generates the narration audio from the Notes panel text using its built-in text-to-speech engine, the audio is a direct synthesis of the Notes text. The caption (also derived from the Notes text) is by definition an accurate representation of the audio. There is no deviation between script and speech because there is no live narrator. This is the correct workflow for TTS-narrated Storyline content.

Satisfies WCAG SC 1.2.2 (conditional): Live narration with perfect Notes discipline. If the L&D team records live narration and maintains the Notes panel as a word-for-word transcript of what was spoken — including corrections, ad-hoc additions, and natural speech patterns like sentence rephrasing — the text-sync approach can produce accurate captions. This requires production discipline that most teams do not maintain consistently. A quality gate that spot-checks caption accuracy against audio for at least 10% of slides is necessary to validate this approach.

Does not satisfy WCAG SC 1.2.2: Live narration with script-as-Notes. When the Notes panel contains the pre-production script and the live narrator speaks from that script with any deviation (the most common case), Storyline CC produces captions that approximate the audio rather than faithfully represent it. WCAG itself does not state a numeric accuracy threshold; the 99% figure used throughout this guide is the benchmark commonly applied when assessing caption accuracy, and it measures how closely the caption text matches the spoken words. A 5% deviation rate between scripted Notes text and live narration produces captions at approximately 95% accuracy, below that benchmark.

Does not satisfy WCAG SC 1.2.2: Courses published with Storyline CC enabled but empty Notes panels. A common failure mode: the L&D team enables Storyline CC in the player settings (believing that turning on the feature is the compliance step), but the Notes panel for each slide is empty. The published output has a CC button that produces no caption text. Learners who click the CC button see nothing.
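
The deviation between Notes text and spoken audio can be measured rather than estimated. The sketch below is one way to do it, under stated assumptions: a verified transcript of the spoken audio is available for the sampled slides, accuracy is taken as one minus the word error rate, and the function names are hypothetical. It also picks the 10% spot-check sample mentioned above.

```python
import random
import re


def tokenize(text):
    """Lowercase and split into word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by reference length.

    `reference` is what was actually spoken (a verified transcript);
    `hypothesis` is the caption text, e.g. the Notes panel script.
    """
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    if not ref:
        return 0.0 if not hyp else 1.0
    # Classic dynamic-programming Levenshtein distance over words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1] / len(ref)


def pick_spot_check_slides(slide_ids, fraction=0.10, seed=None):
    """Randomly choose at least one slide, covering `fraction` of the course."""
    slide_ids = list(slide_ids)
    if not slide_ids:
        return []
    count = max(1, round(len(slide_ids) * fraction))
    return sorted(random.Random(seed).sample(slide_ids, count))
```

For a forty-slide module, `pick_spot_check_slides(range(1, 41))` returns four slide numbers; a slide passes the 99% benchmark when `word_error_rate(spoken, notes)` is 0.01 or lower.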

The correct captioning workflow for live-narrated Storyline courses

For Storyline courses narrated by a live voice actor or SME:

Step 1: After narration recording is complete, export the narration audio for each slide as a WAV or MP3 file. Storyline allows audio export per slide from the media library.

Step 2: Run the exported audio through a glossary-corrected captioning workflow. This means applying the organisational glossary to the ASR output before timing is finalised. The glossary-biased decoding approach applies here — the proper nouns and technical terms that are most likely to be mispronounced by the narrator or misrecognised by ASR are exactly the terms where caption accuracy matters most to the learner.

Step 3: Export a properly timed SRT file for each slide's narration audio.

Step 4: Import the SRT files into Storyline via Insert > Captions > Import. Storyline accepts per-slide SRT import. Map each SRT file to its corresponding slide.

Step 5: Verify caption display in the published HTML5 output, not just in the Storyline preview or Review 360. Confirm that the CC button is visible in the player, that captions appear on the correct slides, and that the timing is correct for the first and last caption cue on each slide.

Step 6: Publish to SCORM 1.2, SCORM 2004, or HTML5 output. The imported SRT captions are embedded in the published output and travel with the SCORM package.
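
Steps 2 and 3 can be sketched in a few lines. This is an illustration only: it assumes the ASR output is already available as timed cues of the form (start seconds, end seconds, text), and it uses a simple post-hoc find-and-replace glossary, which is a lighter technique than the glossary-biased decoding mentioned in Step 2. The function names and the glossary entries are hypothetical.

```python
import re


def apply_glossary(text, glossary):
    """Replace known ASR misrecognitions with the approved term (whole words only)."""
    for wrong, right in glossary.items():
        pattern = r"\b" + re.escape(wrong) + r"\b"
        # A function replacement avoids backslash interpretation in `right`.
        text = re.sub(pattern, lambda _match: right, text, flags=re.IGNORECASE)
    return text


def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues, glossary=None):
    """Serialise (start_seconds, end_seconds, text) cues as SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        if glossary:
            text = apply_glossary(text, glossary)
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)
```

Writing one SRT file per slide, named after the slide, keeps the mapping in Step 4 mechanical.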

Screen recording slides in Storyline

Storyline's screen recording feature produces two slide types: Step-by-step mode (one slide per action, each with narration) and Demo mode (a single video of the entire screen recording). Caption handling differs between them.

Step-by-step mode: each slide has narration from the step recording. The Notes panel approach (text-sync or SRT import) applies per slide, same as any other narrated slide. The narration was typically recorded spontaneously while performing the screen actions — the Notes panel is likely to be empty or contain rough notes rather than a transcript. SRT import from ASR on the screen-recording narration audio is required for accuracy.

Demo mode: a single video asset covering the full screen recording sequence. This is treated as a video asset, not as a slide-by-slide Notes panel asset. Caption it with an SRT file at the source (the screen recording video) and import the SRT using Storyline's video caption panel (the same insert captions workflow, applied to the video clip rather than to the slide narration track).

Quiz narration and branching scenarios in Storyline

Quiz slides in Storyline often have narration for the question stem, answer choices, and feedback layers (correct/incorrect). Each narration instance requires a caption. The Notes panel approach applies to question stem narration if the Notes text was maintained. Feedback layer narration (the "Correct!" or "Let's try that again" audio) is frequently not in the Notes panel and requires separate handling — either explicit SRT import for each feedback trigger, or a policy decision to use TTS for feedback audio (which aligns with the text-sync approach cleanly).

Branching scenario slides with branching-point narration require caption handling on every branch path, not just the linear path. A scenario with five branching points and three possible outcomes per point has fifteen narration assets that each require caption text.
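
The branch arithmetic is easy to lose track of in a large scenario, so it can help to enumerate the narration assets explicitly and check each one for caption text. A minimal sketch, assuming one narration asset per (branching point, outcome) pair and a hypothetical dictionary that maps each pair to its caption text:

```python
from itertools import product


def narration_assets(branch_points, outcomes_per_point):
    """List every (branch point, outcome) pair that carries its own narration."""
    return list(
        product(range(1, branch_points + 1), range(1, outcomes_per_point + 1))
    )


def missing_captions(assets, captions):
    """Return assets with no caption text, or only whitespace, in `captions`."""
    return [asset for asset in assets if not captions.get(asset, "").strip()]


assets = narration_assets(5, 3)
print(len(assets))  # 15
```

Any pair returned by `missing_captions` is a branch path that a learner can reach with narration but without captions.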

Articulate Rise 360: video-host-level captions and the direct-upload trap

Rise 360's content architecture and caption model

Articulate Rise 360 is a block-based eLearning authoring tool. Content is composed from blocks: text blocks, image blocks, video blocks, audio blocks, process blocks, quiz blocks, and embedded Storyline blocks. Rise has no native caption editor for any block type — it does not generate, host, or modify caption data for any content it publishes.

For content that requires captions under WCAG SC 1.2.2 (video with narration) or SC 1.2.1 (audio-only content), the captioning must be handled upstream of Rise, at the source of the audio or video content, before that content is embedded in the Rise course.

Video block captions: the video-host-first requirement

Rise video blocks embed video from external video hosting platforms using an iframe. When a learner accesses the Rise course and reaches a video block, they see the video in its native player from the host platform. The caption configuration of that video — whether captions exist, what caption tracks are available, and what the default caption state is — is determined by the host platform configuration, not by Rise.

The correct caption workflow for each supported video host:

YouTube: Upload the video to YouTube. Go to YouTube Studio > Subtitles > Add subtitle/CC. Either accept YouTube's auto-generated captions (which require editing for accuracy on technical vocabulary — the auto-captions compliance analysis documents why auto-generated captions alone typically do not meet WCAG 99% accuracy on technical content) or upload a corrected SRT file. After captions are confirmed in YouTube Studio, embed the YouTube video in the Rise video block. Rise inherits the YouTube player with the CC button available.

Vimeo: Upload the video to Vimeo. Go to the video's Settings > Distribution > Subtitles. Upload a VTT or SRT caption file. Select the language and set as default if appropriate. After the caption track is confirmed in Vimeo, embed the Vimeo video in the Rise video block. The Vimeo player appears in Rise with the CC control available.

Wistia: Upload the video to Wistia. Go to Customize > Captions and Transcripts. Upload an SRT file. After the caption track is confirmed, embed the Wistia video in the Rise video block. The Wistia player renders in Rise with the CC control.

Loom: Loom generates auto-captions for all videos. These auto-captions require editing for technical vocabulary — Loom's auto-caption accuracy on technical L&D content has the same limitations as other platform ASR systems. Edit captions in Loom's caption editor before embedding the Loom video in Rise. Loom's CC button is available in the embedded player.

Brightcove: Enterprise video platform. Captions are managed in Brightcove's Video Cloud studio under the video's text tracks settings. VTT file upload is the standard method. After tracks are configured in Brightcove, embed via the Brightcove embed code in Rise's embed block (Rise does not have a native Brightcove block — use the custom content block with the Brightcove embed code).
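
Some of the hosts above take SRT and others expect VTT, so a team that standardises on SRT may need a conversion step before upload. The two formats differ mainly in the `WEBVTT` header and the decimal separator in timestamps. The following is a minimal sketch of that conversion (the function name is hypothetical, and styling or positioning cues are not handled):

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,250".
SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text):
    """Convert SRT text to WebVTT: add the header and switch ',' to '.' in timings."""
    # Normalise line endings and drop a UTF-8 byte-order mark if present.
    body = srt_text.replace("\r\n", "\n").lstrip("\ufeff").strip()
    body = SRT_TIMING.sub(r"\1.\2 --> \3.\4", body)
    return "WEBVTT\n\n" + body + "\n"
```

The numeric cue indices from the SRT are left in place; WebVTT treats them as optional cue identifiers.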

The direct-upload trap

Rise 360 allows direct MP4 upload to video blocks without using an external video host. L&D teams use direct upload when they want to keep the video content within the Rise/Articulate ecosystem (hosted on Articulate's CDN with the Rise-managed SCORM output) rather than managing a separate Vimeo or Wistia account.

Direct MP4 upload to Rise video blocks has no caption capability. There is no UI in Rise for uploading a caption file to accompany a directly uploaded video. The video appears in the Rise video block player, but there is no CC button, no caption track, and no mechanism for the learner to access captions. This is not a configuration issue or a limitation that can be worked around — it is an architectural gap in the Rise platform as of the date of this writing.

L&D teams who use direct MP4 upload for Rise video content are publishing uncaptioned narrated video in SCORM packages and distributing it to learners through the LMS. If any learner who is deaf, hard of hearing, or who needs captions for any reason encounters one of these courses, the course fails WCAG SC 1.2.2, Section 508, and ADA accommodation requirements from the first day it is deployed.

The operational fix: move all Rise video content to a video host (Vimeo, Wistia, or YouTube) before embedding. For legacy courses that used direct upload, replace each video block with a video host embed. This requires re-publishing the Rise course and re-uploading the SCORM package to the LMS.

Audio block captions in Rise

Rise's audio lesson block (which presents audio content without a video track) does not have a caption UI. Audio-only content presented without a text alternative fails WCAG 2.1 SC 1.2.1, Audio-only and Video-only (Prerecorded), which requires an alternative for time-based media (in practice, a text transcript) for prerecorded audio-only media.

The correct approach for Rise audio blocks: provide a text transcript as a text block immediately below the audio block. This satisfies SC 1.2.1's "text alternative" requirement. The text transcript should be a faithful representation of the audio content — not a summary or an approximation, but a word-for-word transcript with speaker identification if multiple voices are present.

SC 1.2.2, which requires synchronised captions for audio-video content, does not come into play here, because an audio-only block is not synchronized media. SC 1.2.1 governs audio-only content, and a text alternative satisfies it. SC 1.2.2 would apply only if the audio were synchronised with visual content, and a text block placed below the audio block is not synchronised with it, by design.
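
If the audio has already been captioned as SRT, the transcript for the text block can be derived from that file rather than typed again. A minimal sketch, assuming a well-formed SRT file; the function name is hypothetical, and speaker identification for multi-voice audio would still need to be added by hand:

```python
import re


def srt_to_transcript(srt_text):
    """Strip cue numbers and timings from SRT text and join the spoken lines."""
    lines = []
    for block in re.split(r"\n\s*\n", srt_text.replace("\r\n", "\n").strip()):
        text_rows = []
        for position, row in enumerate(block.split("\n")):
            row = row.strip()
            if not row or "-->" in row:
                continue  # blank line or timing line
            if position == 0 and row.isdigit():
                continue  # cue index
            text_rows.append(row)
        lines.append(" ".join(text_rows))
    return " ".join(line for line in lines if line)
```

The result is a single paragraph of running text, which can then be broken into paragraphs for readability in the Rise text block.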

Storyline blocks embedded in Rise

A Storyline block in Rise exports a Storyline course as a web object packaged inside the Rise output. When a learner accesses the Storyline block in Rise, the Storyline course runs inside an iframe within the Rise lesson. Storyline's built-in CC (or SRT-imported captions, if the correct workflow was followed) carries through into the Rise Storyline block — if the Storyline CC is correctly configured, the CC button appears in the Storyline player inside the Rise lesson.

One important nuance: the CC button position within the Storyline player must be set to a location that is visible inside the Rise block frame. If the Storyline player controls are configured to display at a size or position that is clipped by the Rise block container, the CC button may be rendered outside the visible area. Test Storyline blocks in Rise to confirm CC button visibility before publishing.

Adobe Captivate: SRT import workflow and TTS narration

Captivate's closed caption import panel

Adobe Captivate (2019 and later versions) supports SRT caption import through the Project > Closed Captions panel (earlier versions: Publish > Closed Captions). The panel shows a slide-by-slide view of the course with a field for importing or manually entering caption text for each slide's narration.

The import workflow: for each slide with narration, select the slide in the Closed Captions panel, click Import, and browse to the SRT file for that slide's narration audio. Captivate reads the SRT and maps the caption cues to the slide's narration timeline. If the SRT timing is relative to the slide (beginning at 00:00:00 for the slide's narration start), the import is straightforward. If the SRT timing is absolute (derived from the full course audio exported as a single file), Captivate requires offset adjustment per slide.
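
The per-slide offset adjustment can be done before import rather than by hand in the tool. The sketch below assumes the slide start and end times on the full-course audio timeline are known (for example, from the audio edit points) and that cues are held as (start seconds, end seconds, text) tuples. The function name is hypothetical, and this is not a description of how Captivate itself handles offsets.

```python
def rebase_cues(cues, slide_start, slide_end):
    """Convert course-absolute cues to slide-relative cues for one slide.

    `cues` are (start_seconds, end_seconds, text) on the full-course timeline.
    Cues are assigned to the slide in which they start; an end time that runs
    past the slide boundary is clipped to the slide's duration.
    """
    rebased = []
    for start, end, text in cues:
        if slide_start <= start < slide_end:
            rebased.append(
                (start - slide_start, min(end, slide_end) - slide_start, text)
            )
    return rebased
```

Running this once per slide yields cue lists that each begin near zero, which is the slide-relative form described above as straightforward to import.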

After import, the Closed Captions panel shows a preview of each slide's caption text aligned to its timeline. Review the alignment for at least a sample of slides before publishing. Pay particular attention to slides where the narration is longer than two minutes — timing drift can accumulate on long narration tracks.

Captivate generates a CC button in the published SWF or HTML5 output. For HTML5 output (the current standard for SCORM delivery), the CC button is rendered in the course player controls. Learners click the CC button to toggle captions on and off.

Captivate TTS narration and the SRT source problem

Captivate supports text-to-speech narration via services including NeoSpeech and third-party TTS APIs. When TTS narration is used, the narration audio is generated from the text entered in the slide script field. The same principle applies as for Storyline AI Voice: the correct caption source is the TTS script text, not the result of running ASR on the TTS audio output.

Some L&D teams commission an external captioning service to caption the published TTS audio — running Whisper or a similar ASR system on the TTS voice output. This introduces the TTS→ASR accuracy paradox documented in the AI-generated video captioning guide: synthetic voice models produce acoustic patterns that differ from human speech in ways that degrade ASR accuracy by 5–10 percentage points on technical vocabulary. For a Captivate course on cybersecurity or pharmaceutical compliance, this means captions generated by running ASR on TTS audio are at approximately 83–91% accuracy before any glossary correction — below the 99% WCAG threshold before the first quality gate is applied.

The correct workflow for Captivate TTS narration: derive the SRT caption file from the TTS script text, with timing generated by syncing the script text to the TTS audio duration (a simple linear alignment is adequate for most slide narrations; more precise timing from the TTS system's phoneme output is available for high-stakes content). Import this script-derived SRT into the Captivate Closed Captions panel. The accuracy is then limited only by the fidelity of the script text to what is actually synthesised — which is typically 99%+ for non-homophones and standard vocabulary.
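
The simple linear alignment mentioned above can be sketched as follows. It assumes a roughly constant speaking rate, ignores pauses, and splits sentences naively on terminal punctuation. The function names are hypothetical, and the audio duration is assumed to be read from the TTS audio file beforehand.

```python
import re


def split_sentences(script):
    """Naive sentence split on ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", script.strip())
    return [part.strip() for part in parts if part.strip()]


def linear_align(script, audio_duration):
    """Spread sentences across the audio in proportion to their character count.

    Returns (start_seconds, end_seconds, text) cues.
    """
    sentences = split_sentences(script)
    total_chars = sum(len(sentence) for sentence in sentences)
    cues, elapsed_chars = [], 0
    for sentence in sentences:
        start = audio_duration * elapsed_chars / total_chars
        elapsed_chars += len(sentence)
        end = audio_duration * elapsed_chars / total_chars
        cues.append((round(start, 3), round(end, 3), sentence))
    return cues
```

A ten-character sentence followed by a thirty-character sentence over eight seconds of audio gets the first two seconds and the remaining six respectively. Abbreviations and decimal numbers will confuse the naive splitter, so long or dense scripts are better split by hand.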

Software simulation and demo mode in Captivate

Captivate's software simulation and demo recording modes produce similar output to Storyline's screen recording: either step-by-step slides with narration per step, or a single video recording of the full demo. The caption approach follows the same logic: step slides use the slide-by-slide SRT import in the Closed Captions panel; demo video uses the video caption import. Captivate's demo mode output is typically a full-motion video file embedded in the course — treat it as a video asset for caption purposes.

Non-responsive vs. Fluid Boxes Captivate output

Captivate publishes to two primary HTML5 output modes: non-responsive (fixed-dimension HTML5) and Fluid Boxes (responsive design). The CC button appears in both output modes, but its position and styling differ. For Fluid Boxes output, the CC button is part of the responsive player skin — confirm it renders correctly on mobile screen sizes, as some Captivate player configurations clip the CC button at narrow widths.

Lectora Online: per-asset caption upload

Lectora's caption model

Lectora Online (the cloud-based edition of Lectora; the desktop edition was formerly sold as Lectora Inspire) handles captions at the individual media asset level, not at the slide or page level. For each audio or video asset used in the course, Lectora provides a Closed Captions option in the asset's Media Properties panel. An SRT or VTT file can be uploaded per asset.

The per-asset model means a course with forty narrated pages using forty separate audio clips requires forty SRT files — one per audio clip. This is the most granular caption management model of any major authoring tool, and it creates significant production overhead for courses with many short audio segments. For courses where the narration is structured as longer segments (one audio file per topic section rather than one per page), the per-asset model is more manageable.
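
With one caption file per audio clip, the most likely production error is a clip that simply has no matching file. A short sketch of a pre-publish check follows, assuming a naming convention in which each caption file shares its base name with the audio clip (for example `page01.mp3` and `page01.srt`). Both the convention and the function name are assumptions, not Lectora requirements.

```python
from pathlib import PurePath

AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a"}
CAPTION_EXTENSIONS = {".srt", ".vtt"}


def uncaptioned_assets(filenames):
    """Return audio files that have no caption file sharing the same base name."""
    paths = [PurePath(name) for name in filenames]
    captioned = {
        path.stem.lower() for path in paths if path.suffix.lower() in CAPTION_EXTENSIONS
    }
    return sorted(
        str(path)
        for path in paths
        if path.suffix.lower() in AUDIO_EXTENSIONS and path.stem.lower() not in captioned
    )
```

An empty result means every clip in the folder listing has a caption file to upload; it says nothing about whether the upload in the Media Properties panel was actually done.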

Lectora generates a CC button in the published HTML5 output. Learners can activate captions per asset. The CC button is associated with the specific media player for each audio or video element — it is not a global course-level toggle.

Lectora SCORM output

Lectora publishes to SCORM 1.2, SCORM 2004, and xAPI (Tin Can). Caption data is embedded in the published HTML5 output and travels with the SCORM package. When the package is uploaded to a supported LMS (TalentLMS, Cornerstone OnDemand, Docebo, or others), the captions are part of the package and do not require separate LMS configuration.

One LMS-specific note: some LMS platforms render Lectora SCORM output in a restricted iframe environment that can block JavaScript used by Lectora's CC rendering engine. Test Lectora SCORM packages in the target LMS before mass distribution. The test should specifically verify CC button functionality with captions enabled, not just confirm that the SCORM package launches.

iSpring Suite: TTS narration and SCORM output

iSpring's caption approach

iSpring Suite is a PowerPoint add-in that converts PowerPoint presentations to interactive eLearning content, published as SCORM or HTML5. iSpring Suite 11 (and later) added closed caption support for video and narration content via the Narration Editor's caption panel.

For recorded narration in iSpring Suite: in the Presentation Properties panel, open the Narration tab, select a slide with narration, and access the Caption Settings. iSpring supports SRT file import for narration tracks. The imported SRT is embedded in the iSpring-published output and appears as a CC toggle in the iSpring player.

For iSpring's TTS narration (available via integration with text-to-speech services), the same TTS→ASR paradox applies: the caption source should be the narration script text, not ASR-on-TTS output. iSpring's narration script text field is the correct source for caption derivation.

iSpring SCORM output

iSpring publishes to SCORM 1.2, SCORM 2004, xAPI, and cmi5. Caption data is embedded in the published iSpring HTML5 output. LMS delivery follows the same rules as other authoring tools: captions must be imported before publish, cannot be added after LMS upload, and must be tested in the target LMS rather than only in the iSpring preview.

iSpring Learn (the paired LMS) has deep integration with iSpring Suite output — captions configured in iSpring Suite carry through cleanly into iSpring Learn. For organisations using iSpring Suite with a third-party LMS, test specifically that the iSpring CC button renders correctly in the target LMS's SCORM player.

The SCORM packaging problem: what travels with the package and what doesn't

Caption data inside a SCORM package

A SCORM 1.2 or SCORM 2004 package is a ZIP file. Inside the ZIP: the course manifest (imsmanifest.xml), HTML pages for each slide or screen, JavaScript files for the player logic, media assets (audio, video, images), and any caption data. Caption data in Storyline, Captivate, Lectora, and iSpring SCORM output is stored as text content within the JavaScript files — not as separate SRT or VTT sidecar files. The caption text is inlined into the published JavaScript, where it is read by the player logic and rendered as caption overlays at the appropriate timestamps.

Because the caption data is inlined in the JavaScript, it is not a separate file that can be replaced or updated without republishing the course. Unlike a video with a sidecar SRT (where the SRT can be updated on the LMS independently of the video), a SCORM package with updated caption text requires a full republish of the course from the authoring tool and a re-upload of the SCORM package to the LMS.
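A package can be inspected without unpacking it by hand. The following sketch uses Python's standard `zipfile` module to report whether a package has a root manifest, which JavaScript files it carries, and whether any sidecar caption files are present. The file names in the demo archive are hypothetical and do not reflect the exact layout any particular authoring tool produces.

```python
import io
import zipfile

CAPTION_EXTS = (".srt", ".vtt")


def summarize_scorm_package(zip_source):
    """Summarise the manifest, JavaScript, and sidecar caption files in a SCORM ZIP."""
    with zipfile.ZipFile(zip_source) as package:
        names = package.namelist()
    return {
        "has_manifest": "imsmanifest.xml" in names,
        "javascript_files": [n for n in names if n.lower().endswith(".js")],
        "sidecar_captions": [n for n in names if n.lower().endswith(CAPTION_EXTS)],
    }


# Build a tiny in-memory package so the example runs without a real course.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("imsmanifest.xml", "<manifest/>")
    demo.writestr("content/player_data.js", "/* inlined caption text would live here */")
    demo.writestr("content/audio01.mp3", b"")
buffer.seek(0)

report = summarize_scorm_package(buffer)
print(report)
```

An empty `sidecar_captions` list is the expected result for packages that inline caption text in JavaScript, so it does not by itself indicate missing captions. Confirming caption presence still requires launching the course.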

The exception: Rise 360 with video-host captions. In this architecture, the Rise SCORM package contains only the iframe embed for the video host. The caption data lives at the video host (Vimeo, Wistia, YouTube). If a caption error is found after the course is deployed, it can be corrected at the video host level without republishing the Rise course or re-uploading the SCORM package. This is actually an architectural advantage of the video-host model — caption corrections are faster and don't require LMS re-deployment.

LMS migration and SCORM caption survival

The LMS migration caption checklist identifies SCORM caption data as one of the five caption data types that must be audited before an LMS migration. When SCORM packages are migrated from one LMS to another (for example, from TalentLMS to Cornerstone OnDemand, or from an on-premise LMS to a cloud LMS), the packages are downloaded from the source LMS and re-uploaded to the destination LMS. The caption data inside the packages is preserved in this process — because it is part of the package, not a separate LMS record. The migration risk is not that caption data is lost from the package; it is that the LMS player environment at the destination renders the caption differently than at the source.

Post-migration caption verification should include launching at least a 10% sample of captioned SCORM packages in the destination LMS, confirming that the CC button is visible and functional, and toggling captions on to verify that caption text appears and is correctly timed for the first and last cue of the narration.
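Drawing the 10% sample can be scripted so the selection is random and repeatable. This is a minimal sketch: the package identifiers are placeholders, and the fixed seed is only there so that the same sample can be recorded as audit evidence.

```python
import math
import random


def migration_sample(package_ids, percent=10, seed=None):
    """Pick a random sample of at least `percent`% of packages, rounding up."""
    package_ids = list(package_ids)
    if not package_ids:
        return []
    size = max(1, math.ceil(len(package_ids) * percent / 100))
    return random.Random(seed).sample(package_ids, size)


# Hypothetical identifiers for 25 migrated packages.
packages = [f"course-{n:03}" for n in range(1, 26)]
sample = migration_sample(packages, seed=2026)
print(len(sample), sample)
```

Rounding up means a 25-package library yields three packages to test rather than two, which keeps the sample at or above the 10% floor.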

The three-way LMS CSS conflict failure mode

The most common SCORM caption delivery failure in production is a CSS conflict between the LMS's course player styling and the authoring tool's caption button rendering. The scenario: a Storyline course with correctly configured CC (verified in Review 360 and in Articulate's HTML5 test export) is uploaded to the LMS. Learners report that there is no CC button or that the CC button is visible but produces no caption display. Investigation reveals that an LMS global stylesheet applies CSS rules that override the z-index, display property, or visibility property of the Storyline player's CC container element.

This is not a Storyline bug; it is a CSS specificity conflict between two independently authored stylesheets. The fix requires one of the following: (1) a CSS override in the Storyline player skin or a wrapper div that isolates the Storyline output from the LMS's global CSS, (2) an LMS administrator adjustment to exclude the SCORM player iframe from LMS global stylesheet scope, or (3) republishing the Storyline course with a different player skin configuration that avoids the conflicting CSS selectors. All three approaches require coordination between the L&D team and the LMS administrator.

This is why testing SCORM caption delivery in the target LMS — not just in the authoring tool preview or a generic SCORM test player — is not optional. The LMS production environment is the only place where the CSS conflict will manifest.

xAPI and cmi5: caption tracking is not standardised

Neither SCORM 1.2 nor SCORM 2004 tracks caption engagement in its CMI data model. xAPI (Tin Can) allows custom statements, and cmi5 defines a standard set of statements, but neither standard includes a caption-engagement statement in the base recipe. Whether a learner enabled captions, how many caption cues they saw, or whether captions were available for a course cannot be reported from the LRS (Learning Record Store) using standard xAPI or cmi5 statements unless the authoring tool has been specifically configured to emit custom caption-engagement statements, which none of the major authoring tools do by default.
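To make the gap concrete, here is what a custom caption-engagement statement could look like if a team chose to emit one. This is a sketch only. The actor, verb, object, and result structure follows the general xAPI statement shape, but the verb IRI and the extension IRI are hypothetical placeholders on `example.com`, not identifiers from any published xAPI or cmi5 profile.

```python
import json


def caption_toggle_statement(learner_email, course_iri, enabled):
    """Build an xAPI-style statement recording that a learner toggled captions."""
    return {
        "actor": {"objectType": "Agent", "mbox": f"mailto:{learner_email}"},
        "verb": {
            # Hypothetical verb IRI; no standard caption verb exists in the base recipe.
            "id": "https://example.com/xapi/verbs/toggled-captions",
            "display": {"en-US": "toggled captions"},
        },
        "object": {"objectType": "Activity", "id": course_iri},
        "result": {
            "extensions": {
                # Hypothetical extension IRI carrying the new caption state.
                "https://example.com/xapi/extensions/captions-enabled": enabled
            }
        },
    }


statement = caption_toggle_statement(
    "learner@example.com", "https://example.com/courses/fcpa-basics", True
)
print(json.dumps(statement, indent=2))
```

Emitting a statement like this would require custom JavaScript in the published course, and any reporting built on it would be specific to the organisation that defined the identifiers.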

For compliance audit purposes, this means caption availability evidence must come from the content level (the authoring tool's caption configuration and the published output), not from the LRS. The compliance KPI reporting framework addresses this: caption availability is a property of the content asset (is a CC button present in the published output?), not of the learner completion record (did the learner use captions?). Audit evidence is the SCORM package inspection or the published course preview — not the LRS data.

Section 508 and WCAG for SCORM-packaged content

Section 508 applicability to SCORM courses

Section 508 of the Rehabilitation Act (as amended by the Workforce Investment Act of 1998, with its technical standards updated by the Access Board's 2017 ICT Refresh) applies to electronic and information technology produced, procured, or used by US federal agencies and their contractors. Federal agencies that use LMS platforms to deliver SCORM-packaged eLearning are distributing electronic content within the scope of Section 508. The SCORM packages themselves must meet accessibility standards, not just the LMS platform that hosts them.

The 2017 ICT Refresh incorporated WCAG 2.0 Level AA by reference into Section 508, via 36 CFR Part 1194, Appendix A (the "Revised 508 Standards"). WCAG 2.0 AA requires captions for prerecorded synchronized media at SC 1.2.2, the same requirement as WCAG 2.1 AA. For organizations subject to Section 508, every narrated SCORM course distributed via a federal LMS must have captions that satisfy SC 1.2.2. WCAG itself states no numeric accuracy figure; the 99% threshold used in this guide is the benchmark commonly applied to that criterion.

The VPAT/ACR (Voluntary Product Accessibility Template / Accessibility Conformance Report) is the standard documentation format for Section 508 conformance claims. SCORM courses procured by federal agencies are increasingly required to include a VPAT. The VPAT's "Media Players" and "Synchronized Media" criteria rows must report on whether captions are present and whether they meet SC 1.2.2, which is a Level A criterion and therefore part of any Level AA conformance claim. An L&D team that delivers SCORM content to a federal client without captions will fail the VPAT requirement and may face procurement disqualification or remediation requirements.

For state and local government entities covered by ADA Title II (universities, school districts, state agencies), the same logic applies: the SCORM courses distributed via their LMS are part of the entity's electronic and information technology, and the entity's WCAG 2.1 AA compliance obligation extends to the content inside the LMS, not just to the LMS platform itself. The Section 508 captioning reference page covers the federal framework in detail.

The "software" classification question for eLearning

Some Section 508 compliance frameworks classify interactive eLearning content as "software" under 36 CFR Part 1194, Chapter 5 (Software), rather than as "Web Content" under Chapter 2 (which directly incorporates WCAG 2.0 AA). The software classification is based on the argument that SCORM-packaged eLearning is a software application running inside a browser, not a web document in the traditional sense.

Under the software classification, WCAG2ICT (the W3C's Guidance on Applying WCAG 2.0 to Non-Web Information and Communications Technologies) applies WCAG 2.0 AA to the software interface. WCAG2ICT concludes that SC 1.2.2 applies to non-web software that includes synchronized media, using the same conformance criteria as the web standard. Whether SCORM courses are classified as "web content" or "software" under Section 508, the SC 1.2.2 caption requirement applies under either classification.

For practical compliance purposes: do not rely on the software-vs.-web-content classification ambiguity as a reason to defer captioning authoring tool content. Both interpretations lead to the same captioning requirement.

State-specific eLearning accessibility requirements

Several US states have enacted WCAG-based accessibility requirements for state agency and higher-education web content that extend to eLearning content delivered via state-operated systems:

California AB 434 (2017): State agency websites and web applications must conform to WCAG 2.0 Level AA. California State University and University of California campuses, as state-operated entities, are covered. eLearning courses delivered via campus LMS platforms are within scope of this obligation. The WCAG 2.1 AA captioning standard is the current applicable benchmark (California's informal practice has updated to reference WCAG 2.1 AA rather than the literal 2.0 AA cited in the statute).

Texas Government Code § 2054.460 (2021): State agencies must ensure their websites and web applications conform to WCAG 2.1 Level AA. State agency training content delivered via LMS is within scope. Texas state universities are covered under the broader TexasOnline framework.

New York State IT Accessibility Policy (ITS-P24-001): Applies to New York state agency information technology, including eLearning platforms. References WCAG 2.1 AA. SUNY and CUNY campuses operate under parallel obligations under their respective system-level accessibility policies.

ADA Title III and eLearning for external learners

When eLearning content is delivered to audiences outside the employment relationship — customer training, partner academies, public-facing certification courses — ADA Title III (public accommodations) becomes the applicable framework rather than ADA Title I (employment). The customer education captioning guide covers this framework in detail. For authoring tool-produced eLearning distributed externally via Skilljar, Thought Industries, Gainsight PX, LearnUpon, or similar customer education platforms, the same WCAG SC 1.2.2 requirement applies — under a Title III rationale rather than a Title I rationale.

Vocabulary failure modes in authoring tool narration

Live-narrated Storyline: the Notes panel deviation risk

When a live narrator records Storyline content from a pre-production script, the recorded audio typically deviates from the script in minor ways: natural speech patterns differ from written prose, the narrator self-corrects errors, sentence structures are simplified in delivery, and ad-hoc context is added at the moment of recording. These deviations — individually minor — accumulate to produce a caption text (derived from the Notes panel script) that differs from the spoken audio in ways that exceed the WCAG accuracy threshold.

The vocabulary terms most affected by Notes-deviation are ironically the most critical ones: technical proper nouns that the narrator pronounces correctly because they know the content, but which the Notes text captures slightly differently — abbreviated form in Notes vs. full form in narration, or vice versa; brand name capitalisation in Notes vs. spoken pronunciation without emphasis; acronym expansion in Notes vs. spoken as initials in narration. A compliance training course that refers to "FCPA" in the Notes text while the narrator says "the Foreign Corrupt Practices Act" throughout has every FCPA instance in the caption showing "FCPA" while the learner hears "the Foreign Corrupt Practices Act" — a synchronisation mismatch.

TTS narration vocabulary risks

TTS narration — whether from Storyline AI Voice, Captivate's TTS engine, ElevenLabs API integration, or Microsoft Azure Cognitive Services — introduces a different vocabulary risk: mispronunciation without a corresponding caption error. Because TTS-narrated content uses the Notes text as both the TTS script and the caption source, the caption always matches the intended text. But if the TTS engine mispronounces a technical term, the learner hears an incorrect pronunciation while reading the correct text in the caption — which is cognitively confusing for vocabulary-building content, even if the caption is technically accurate.

Common TTS mispronunciations in L&D content:

Technical term	TTS rendering	Caption text (correct)	Impact
OAuth	"Oh-auth" or "O-ath"	OAuth	Audio/text mismatch for a security term learners need to pronounce correctly
SCIM	"Skim" or "S-C-I-M"	SCIM	Learner unsure of pronunciation after the course
Kubernetes	"Ku-bur-net-eez" or "Cube-uh-ne-tees"	Kubernetes	Widely mispronounced by TTS; learner learns the wrong pronunciation
SQL	"S-Q-L" (letters) vs "sequel" (word)	SQL	Depends on industry convention; TTS inconsistency causes confusion
GIF	"Jif" or "Gif" depending on TTS model	GIF	Low impact; pronunciation is contested anyway
HIPAA	"Hip-pah" (correct) or "H-I-P-A-A"	HIPAA	High impact for compliance training — learners need to say this in conversation
OSHA	"Oh-sha" (usually rendered correctly)	OSHA	Low risk; OSHA is common enough that TTS models handle it reliably
25 mcg	"Twenty-five mig" or "twenty-five micro-grams"	25 mcg	High impact for pharmaceutical training — dosing units must be precise

The organisational glossary serves a dual function in TTS-narrated eLearning: it guides the captioning workflow to use the correct text representation of terms, and it should also include pronunciation guides for the TTS engine (phonetic spellings, SSML markup in tools that support it) to ensure the audio and caption are aligned in both text and pronunciation.
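A glossary that carries pronunciation hints can generate the TTS input while leaving the caption text untouched. The sketch below wraps glossary terms in the SSML `<sub alias="...">` element, which substitutes spoken text for written text. The alias spellings are illustrative guesses, and whether a given TTS engine or authoring tool honours `<sub>` is an assumption to verify per tool.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical glossary: written form -> spelling that nudges the TTS pronunciation.
PRONUNCIATIONS = {"SCIM": "skim", "HIPAA": "hippa", "SQL": "sequel"}


def to_ssml(script_text, pronunciations):
    """Wrap glossary terms in <sub alias="..."> so audio and caption text can differ."""
    body = escape(script_text)  # escape &, <, > before adding markup
    if pronunciations:
        # Longest terms first so a short term never pre-empts a longer one.
        terms = sorted(pronunciations, key=len, reverse=True)
        pattern = re.compile(r"\b(" + "|".join(map(re.escape, terms)) + r")\b")
        body = pattern.sub(
            lambda m: f"<sub alias={quoteattr(pronunciations[m.group(0)])}>{m.group(0)}</sub>",
            body,
        )
    return f"<speak>{body}</speak>"


script = "Provision users with SCIM before the HIPAA audit."
print(to_ssml(script, PRONUNCIATIONS))
```

The original `script` string remains the caption source, so the caption shows "SCIM" and "HIPAA" while the TTS engine receives the pronunciation hints.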

Screen recording narration: spontaneous vocabulary

Screen recording narration in Storyline and Captivate is typically recorded spontaneously — the L&D professional performs the software workflow while narrating aloud, without a pre-written script. The Notes panel is empty or contains rough notes. The spoken narration includes all the technical vocabulary of the software being demonstrated: menu names, button labels, field names, API calls, keyboard shortcuts.

This is the highest-risk narration type for caption accuracy. The vocabulary is technical and specific (menu item labels, modal dialog names, specific field values), the narration is unscripted, and the speaker is often not a professional voice actor — they are a subject-matter expert whose speaking cadence, recording environment, and microphone setup are optimised for documentation accuracy rather than audio quality.

The correct approach: run the screen recording narration audio through a glossary-corrected ASR workflow (not a plain-text Notes sync approach), verify the vocabulary against the on-screen interface elements visible in the recording, and import the verified SRT before publishing. For organisations with many software simulation courses, a systematic screen recording captioning workflow — with UI-element vocabulary lists maintained per application — significantly reduces the correction burden compared to per-recording manual review.

Branching scenario dialogue: multi-speaker and character names

Branching scenario eLearning (common in sales training, compliance training, and soft-skills content) involves multiple character voices — typically played by voice actors or team members voicing characters. The caption challenges include: speaker identification in the caption (who is speaking?), character name vocabulary (character names may be unusual or may match actual colleague names that require correct spelling), and the branching path architecture (every branch must be captioned, not just the primary path).

Speaker identification in captions is not required by WCAG SC 1.2.2, but it is listed as a best practice in DCMP's Captioning Key when the speaker is not visually identified by the content. In a branching scenario where two characters alternate dialogue without on-screen identification, adding speaker names in captions (e.g., "[Taylor]:" before each Taylor line) is the recommended practice. Storyline's Notes panel approach requires manually including speaker identification text in the Notes for each speaker's narration — it is not inferred from which audio track is active.

Production workflow at scale

Phase 1: Caption source decision (before production begins)

The caption source decision — whether captions will be derived from the Notes panel text (text-sync) or from an SRT import of ASR on the narration audio — should be made before production begins on each course, not as an afterthought at publish time. The decision determines the production workflow and the quality gate requirements.

Decision framework:

TTS narration (Storyline AI Voice, Captivate TTS, ElevenLabs): Use text-sync (Notes panel → CC). The TTS audio matches the Notes text by construction. Verify timing for slides with narration longer than two minutes.
Live narration with professional voice actor and disciplined Notes maintenance: Text-sync is acceptable if the production workflow includes a mandatory Notes-vs-audio QA step before publish. Without the QA step, use SRT import.
Live narration by SME or internal team member, unscripted or loosely scripted: SRT import from ASR on the narration audio is required. Text-sync will not produce accurate captions.
Screen recording with spontaneous narration: SRT import from ASR on the screen recording audio is required. Notes panel is empty or inadequate.
Rise 360 video blocks: Caption at video host level before embedding. No exceptions for direct MP4 uploads — move video to a host platform first.

Document the caption source decision in the course development template or project brief. The decision determines which quality gate is applied at the pre-publish review step.
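The decision framework above is small enough to encode directly, which makes it easy to embed in a project intake form or build checklist. The enum members and return labels below are hypothetical names chosen for this sketch. The rules themselves mirror the five cases listed above.

```python
from enum import Enum


class Narration(Enum):
    TTS = "TTS narration"
    PRO_VOICE = "professional voice actor, scripted"
    SME = "SME or internal narrator, unscripted or loosely scripted"
    SCREEN_RECORDING = "screen recording with spontaneous narration"
    RISE_VIDEO = "Rise 360 video block"


def caption_source(narration, notes_qa_step=False):
    """Map a narration type to the caption source recommended by the framework."""
    if narration is Narration.TTS:
        return "text-sync"
    if narration is Narration.PRO_VOICE:
        # Text-sync is acceptable only with a mandatory Notes-vs-audio QA step.
        return "text-sync" if notes_qa_step else "srt-import"
    if narration in (Narration.SME, Narration.SCREEN_RECORDING):
        return "srt-import"
    if narration is Narration.RISE_VIDEO:
        return "video-host"
    raise ValueError(f"Unknown narration type: {narration!r}")


for kind in Narration:
    print(f"{kind.value}: {caption_source(kind)}")
```

Recording the returned label in the project brief gives the pre-publish reviewer an unambiguous statement of which quality gate applies.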

Phase 2: Captioning during production

For text-sync courses: maintain the Notes panel as a verbatim transcript of what will be spoken (for professional voice-actor recordings) or what was spoken (for post-recording Notes updates). If Notes text is the caption source, it must be treated with the same precision as the published caption track — not as production metadata.

For SRT-import courses: the captioning step must be built into the production timeline before the publish step. The sequence: narration recorded → audio exported from authoring tool → glossary-corrected ASR captioning → SRT file verified → SRT imported into authoring tool → course published. For a course with twenty narrated slides, budget one to two hours for the ASR, correction, and import steps in addition to recording time. The hidden FTE cost of caption correction in video workflows also applies here — at 4× real-time correction rate for technical content, a ten-minute slide deck with dense technical narration requires approximately forty minutes of caption correction work.

For Rise 360 courses with video blocks: the video captioning step must be completed at the video host before the Rise course is assembled. If captions are added at Vimeo after the Rise course is built and the embed code is already in a Rise video block, the video will appear captioned at the host, but the Rise embed should be refreshed and rechecked (and the course re-published with the updated embed code if the captioning changed the Vimeo player URL parameters). Build the video-host caption step into the video production workflow, not the Rise course assembly workflow.

Phase 3: Pre-publish caption verification

The pre-publish verification step checks caption availability and correctness before the SCORM package is generated. For authoring tools, this means testing in the authoring tool's preview (Storyline Review, Captivate Preview, Rise Preview) and in the published HTML5 output in a browser — two separate tests, because the authoring tool preview may render differently from the published output.

Verification checklist:

CC button is visible in the published course player
Clicking CC button produces caption display (not silence or an error)
Caption text appears correctly for the first and last caption cue of at least three slides
Caption timing is correct (captions appear when narration begins, clear when narration ends) for at least three slides
Screen recording sections show caption text appropriate to the audio (not empty)
All branching paths include caption content (not just the primary path)
For Rise: video blocks show CC controls in the embedded player; audio blocks have text transcript blocks present

This verification step catches the most common failure modes (empty Notes panel, skipped SRT import, CC-less direct MP4 in Rise) before the SCORM package is uploaded to the LMS.

Phase 4: LMS delivery and post-upload verification

Upload the SCORM package to the LMS. After upload, launch the course as a learner (not as an administrator with elevated privileges, as administrator views sometimes bypass iframe restrictions that affect CC rendering). Repeat the verification checklist in the LMS environment. The LMS production environment is the only place where CSS conflicts, iframe restrictions, and LMS-specific player rendering will manifest.

LMS-specific notes:

TalentLMS: Storyline SCORM output generally renders CC correctly. Test with the responsive player skin enabled (TalentLMS has both classic and responsive player options — CC button position differs between them).
Cornerstone OnDemand: Cornerstone's SCORM player wraps content in a custom iframe with CSS scope restrictions. CSS conflicts with Storyline CC button are documented — test in Cornerstone specifically. Some clients require a custom player skin in Storyline that uses a different CSS class for the CC button to avoid the Cornerstone conflict.
Docebo: Docebo's SCORM player renders most authoring tool output cleanly. Rise 360 video blocks with Vimeo embeds require that the Docebo security policy allows Vimeo iframes — check the Docebo domain allowlist if Vimeo players don't load inside Rise courses.
Kaltura: Kaltura's integrated LMS capability (via Kaltura Lecture Capture or Kaltura MediaSpace) manages video assets natively. Storyline courses uploaded as SCORM packages are served from Kaltura's SCORM player. Test CC in the Kaltura SCORM player specifically — it differs from Kaltura's native video player (which has its own CC management).
Workday Learning: Workday's SCORM player runs content in a strict iframe. Storyline CC button rendering in Workday has been reported to fail with some Storyline player skins. Test with the current Workday production environment before certifying a SCORM caption workflow.
Moodle: Moodle's SCORM activity module has had documented repackaging behavior in older versions that can strip CSS references from SCORM packages. Confirm the Moodle version (4.0+ has more reliable SCORM handling) and test CC in the SCORM player.

Phase 5: Backlog assessment and remediation planning

For organisations with existing Storyline, Rise, or Captivate libraries that were built without captions, a backlog assessment is the first step. The assessment quantifies: how many courses, how many slides, which tools, which narration types (TTS vs. live), and which LMS platforms. This assessment parallels the enterprise LMS caption audit methodology applied specifically to SCORM packages rather than video assets.

Remediation priority follows the same triage logic as the broader compliance programme build: (1) courses with active accommodation requests first, (2) mandatory compliance and onboarding training next (highest audience density and highest compliance obligation), (3) performance support and optional content on a rolling schedule.

For Storyline backlog remediation at scale: export narration audio from all courses using Articulate's batch media export utility, run through a glossary-corrected batch captioning workflow, import SRT files back into the source .story files, and republish. A 100-course backlog with an average of fifteen slides per course (1,500 slides) requires approximately 300 hours of SRT import and verification work in addition to the ASR captioning time — at a 4× real-time correction rate for technical content, budget one hour of correction time per fifteen minutes of narration. Plan a republish-and-re-upload cycle for the entire batch rather than individual course-by-course remediation.
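The backlog arithmetic can be parameterised for other library sizes. In this sketch, the 12 minutes per slide for import and verification is derived from the figures above (300 hours across 1,500 slides), and the 4× multiplier is the correction rate quoted above. The one minute of narration per slide is purely an illustrative assumption, since the average narration length is not stated.

```python
def backlog_estimate(
    courses,
    slides_per_course,
    narration_min_per_slide,
    import_verify_min_per_slide=12,  # 300 h / 1,500 slides, from the figures above
    correction_multiplier=4,  # 4x real-time correction for technical content
):
    """Estimate import/verification and correction hours for a caption backlog."""
    slides = courses * slides_per_course
    import_verify_hours = slides * import_verify_min_per_slide / 60
    correction_hours = slides * narration_min_per_slide * correction_multiplier / 60
    return {
        "slides": slides,
        "import_verify_hours": import_verify_hours,
        "correction_hours": correction_hours,
    }


# narration_min_per_slide=1 is an assumed value for illustration only.
estimate = backlog_estimate(courses=100, slides_per_course=15, narration_min_per_slide=1)
print(estimate)
```

Under that assumed narration length, correction adds about 100 hours on top of the 300 hours of import and verification. Measuring the real average narration length from a handful of courses will tighten the estimate considerably.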

Eight failure modes

Failure mode 1: Empty Notes panel with Storyline CC enabled

The Storyline CC feature is enabled in the player skin settings (turned on in the global publish settings), but the Notes panel for most or all slides is empty. The published SCORM output has a CC button that produces no caption text when clicked. Learners who enable captions see a blank caption area. This is the single most common Storyline caption failure in corporate L&D — it occurs when the L&D team believes that enabling the CC feature in player settings is the complete compliance step, without understanding that the CC text source is the Notes panel.

Detection: Launch the published course and click the CC button. If no caption text appears on a narrated slide, the Notes panel is likely empty. Confirm in the source .story file by checking the Notes panel for a narrated slide.

Fix: Populate the Notes panel with the narration transcript for each narrated slide, or import SRT files via the Insert > Captions > Import workflow. Republish and re-upload the SCORM package.

Failure mode 2: Direct MP4 upload in Rise 360

An L&D team building a Rise course uploads MP4 video files directly to the Rise video block (via the "Upload from computer" option in the video block settings) instead of embedding from a video host. The Rise video block displays the video in its native Rise player. There is no CC button, no caption track, and no mechanism for the learner to access captions. The course is published and deployed to the LMS with uncaptioned video.

Detection: Preview the Rise course and check each video block. If the video plays in the Rise-styled player (with Rise's play/pause controls and no separate CC button), it was likely directly uploaded. Video-host-embedded videos show the host platform's player controls (Vimeo's, Wistia's, or YouTube's player) and include the host's CC button if captions were configured.

Fix: Upload the videos to a video host (Vimeo, Wistia, or YouTube), configure captions at the host, replace the Rise video block with a video-host embed. Republish the Rise course and re-upload the SCORM package.

Failure mode 3: Captivate SRT import skipped

A Captivate course is developed, narrated, and published without the SRT import step in the Closed Captions panel. The team intended to caption the course but left the import for a final step that never happened before the publish deadline. The published SCORM output has no caption data. Learners see no CC button in the Captivate player.

Detection: Launch the published course. If there is no CC button in the Captivate player controls for a narrated course, the SRT import was not completed. Confirm in the source .cptx file by opening Project > Closed Captions and checking whether caption text is present for any slide.

Fix: Return to the source .cptx file, import SRT files for each narrated slide via Project > Closed Captions > Import, republish, and re-upload to the LMS.

Failure mode 4: Storyline CC shows script deviations from live narration

A Storyline course was professionally narrated by a voice actor. The Notes panel contains the pre-production script. The voice actor made minor modifications during recording: more natural sentence flow, a correction mid-sentence, an added example. The published Storyline CC shows the scripted Notes text while the audio plays the modified narration. The discrepancy rate is 3–8% of caption cues, putting the course below the 99% accuracy benchmark applied to SC 1.2.2.

Detection: This failure mode cannot be detected by a CC button click test — the captions appear to be present and functional. Detection requires a DCMP spot-check of at least 10% of slides, comparing the CC text to the narration audio. A spot-check of three slides from a twenty-slide course will typically reveal any systematic script deviation.
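Part of the spot-check can be automated once a verified transcript of the narration exists for the sampled slides. The sketch below compares the caption text with the spoken transcript at word level using Python's `difflib`. The scoring rule is an assumption made for this example: it is neither the cue-level discrepancy rate quoted above nor an official DCMP measure, so treat the number as a triage signal only.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lowercase a string and split it into word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def caption_accuracy(caption_text, spoken_text):
    """Share of words that align between the caption and the spoken transcript."""
    caption, spoken = words(caption_text), words(spoken_text)
    longest = max(len(caption), len(spoken))
    if longest == 0:
        return 1.0  # nothing spoken and nothing captioned
    matcher = SequenceMatcher(None, caption, spoken, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / longest


notes_text = "Report any FCPA concern to Legal."
spoken_text = "Report any Foreign Corrupt Practices Act concern to Legal."
print(f"{caption_accuracy(notes_text, spoken_text):.2%}")
```

Slides that score well below the others in the sample are the ones to review by ear first.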

Fix: Update the Notes panel for deviated slides to match the spoken narration, or replace the text-sync approach with SRT import from ASR on the narration audio. Republish the corrected Storyline course.

Failure mode 5: LMS CSS conflict removes CC button from published output

A Storyline course with correctly configured CC (verified in Review 360 and local HTML5 preview) is uploaded to the LMS. Learners report no CC button in the course player. Investigation shows that the LMS's global player CSS applies a display: none or visibility: hidden rule to an element class that Storyline uses for the CC button container.

Detection: The CC button is present in the authoring tool preview but absent in the LMS player. Use browser developer tools to inspect the LMS player HTML and confirm the CC container element's CSS properties. Look for an LMS global stylesheet applying display or visibility rules to the relevant CSS class.
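When the LMS stylesheet is large, a short script can narrow the search before the developer-tools inspection. This sketch scans CSS text for rules that hide an element carrying a given class. The class name `cc-button` is a placeholder, because the actual class used by the Storyline player has to be read from the published output, and the regular expressions handle only plain rule blocks, not every CSS construct.

```python
import re

RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")
HIDING = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.IGNORECASE)


def hiding_rules(css_text, class_name):
    """Return selectors that target `class_name` and hide it via display or visibility."""
    # Match ".class_name" only as a whole class, not as a prefix of a longer name.
    class_pattern = re.compile(r"\." + re.escape(class_name) + r"(?![\w-])")
    return [
        selector.strip()
        for selector, body in RULE.findall(css_text)
        if class_pattern.search(selector) and HIDING.search(body)
    ]


# Hypothetical excerpt of an LMS global stylesheet.
lms_css = """
.toolbar .cc-button { display: none; }
.cc-button:hover { color: #fff; }
.cc-button-large { visibility: hidden; }
.player-footer { visibility: hidden; }
"""
print(hiding_rules(lms_css, "cc-button"))
```

A hit identifies the selector to raise with the LMS administrator. An empty result does not rule out a conflict, since z-index, inherited rules, or inline styles can also hide the button.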

Fix: Options (in ascending order of invasiveness): (a) request the LMS administrator to exclude the SCORM player iframe from the conflicting CSS scope; (b) republish the Storyline course with a modified player skin that uses a different CSS class for the CC button; (c) add an inline CSS override to the Storyline player skin that applies higher specificity than the conflicting LMS rule. Options (a) and (b) are preferred; option (c) requires CSS editing in the Storyline player template files.

Failure mode 6: Rise audio blocks without transcripts

An L&D team uses Rise audio blocks (audio lesson format) to deliver narrated content without video. The team captions all Rise video blocks correctly but overlooks the audio blocks because audio blocks have no CC UI: nothing in the Rise audio block suggests that accessibility action is required. The published Rise course has narrated audio content that a deaf or hard-of-hearing learner cannot access. This is a WCAG SC 1.2.1 failure.

Detection: Review all Rise courses for audio blocks. In Rise's course view, audio blocks are identifiable by their waveform icon and audio-only player. Check each audio block for an accompanying text block below it. If no text transcript block is present, the audio content is inaccessible.
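
For teams that keep a written inventory of each lesson's blocks, the rule above reduces to a simple adjacency check. The sketch below assumes a hand-maintained, ordered list of block types per lesson; the `"audio"` and `"text"` labels are hypothetical inventory values, not a Rise export format.

```python
def audio_blocks_missing_transcript(blocks):
    """Return the positions of audio blocks not directly followed by a text block.

    blocks: ordered list of block-type strings for one lesson.
    """
    missing = []
    for position, kind in enumerate(blocks):
        if kind != "audio":
            continue
        following = blocks[position + 1] if position + 1 < len(blocks) else None
        if following != "text":
            missing.append(position)
    return missing
```

A text block after an audio block is not proof that it holds a transcript, so positions that pass this check still need a read-through against the audio.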

Fix: Add a text transcript block (standard text block in Rise) immediately below each audio block. The transcript text should be a word-for-word transcription of the audio content. Republish the Rise course.

Failure mode 7: TTS narration captioned by ASR instead of script text

A Storyline or Captivate course uses TTS narration. Instead of deriving captions from the TTS script text (the correct approach), the L&D team sends the published TTS audio to an external captioning service for ASR processing. The captioning service runs Whisper or a similar ASR system on the synthetic voice audio. TTS→ASR accuracy on technical content is approximately 83–91% before glossary correction, the same paradox documented for Synthesia and HeyGen AI-generated video. The resulting captions start below the 99% accuracy benchmark, so every cue needs checking in the correction step even though an exact script already exists.

Detection: Review the captioning vendor invoice or workflow log. If the workflow included audio export from the authoring tool followed by ASR processing, this failure mode may be present. Spot-check 10% of caption cues against the TTS script text to identify systematic mispronunciation-induced errors in the ASR output.

Fix: Derive captions from the TTS script text directly (Notes panel text for Storyline AI Voice, script field text for Captivate TTS). Generate timing from the TTS audio duration per slide. Import the script-derived SRT into the authoring tool instead of the ASR-on-TTS SRT.
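
One way to generate timing from the per-slide audio duration is to split the script into sentences and allot each sentence a share of the duration proportional to its length. This is a minimal sketch under that assumption; the proportional split is an approximation chosen for illustration, not how Storyline or Captivate synchronise text, and the function names are hypothetical.

```python
import re

def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def script_to_srt(script_text, audio_duration_s):
    """Build one SRT cue per sentence, timed in proportion to character count."""
    parts = re.split(r"(?<=[.!?])\s+", script_text.strip())
    sentences = [part.strip() for part in parts if part.strip()]
    total_chars = sum(len(sentence) for sentence in sentences)
    if total_chars == 0:
        return ""
    cues = []
    elapsed_chars = 0
    for index, sentence in enumerate(sentences, start=1):
        start = audio_duration_s * elapsed_chars / total_chars
        elapsed_chars += len(sentence)
        end = audio_duration_s * elapsed_chars / total_chars
        cues.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{sentence}\n"
        )
    # Cues are separated by a blank line, as the SRT format requires.
    return "\n".join(cues)
```

For a ten-second slide with the script "Open the app. Click Save.", the first cue runs from 00:00:00,000 to 00:00:05,417 and the second to 00:00:10,000. Because TTS pauses are not proportional to text length, confirm the timing in the published preview, and split very long sentences into shorter cues before import.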

Failure mode 8: Branching paths partially captioned

A Storyline branching scenario is captioned on the primary (most common) path through the scenario but not on secondary paths (failure branches, alternative responses, feedback layers). The L&D team walked the primary path during the pre-publish verification and confirmed captions were present. A learner who takes a secondary path — by giving an incorrect answer or choosing the alternative response — encounters slides with no captions.

Detection: Walk all branching paths in the published course (or in Storyline's Preview mode) with CC enabled. Confirm captions are present on every slide in every path, including feedback layers, branching-point results, and alternative outcome slides.
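
Walking every path by hand is easy to get wrong once a scenario has more than a few branches. If the team keeps a map of which slide or layer leads to which, a breadth-first traversal lists every reachable slide that has narration but no captions. The sketch below assumes such a hand-maintained map; the slide identifiers and the data structures are illustrative, not something Storyline exports.

```python
from collections import deque

def uncaptioned_reachable_slides(start, links, narrated, captioned):
    """Return narrated slides reachable from start that have no captions.

    links: dict mapping a slide id to the ids it can branch to
           (feedback layers are listed as their own ids).
    narrated, captioned: sets of slide ids.
    """
    seen = {start}
    queue = deque([start])
    missing = []
    while queue:
        slide = queue.popleft()
        if slide in narrated and slide not in captioned:
            missing.append(slide)
        for next_slide in links.get(slide, []):
            if next_slide not in seen:
                seen.add(next_slide)
                queue.append(next_slide)
    return missing
```

Because the traversal visits each slide once regardless of how many paths lead to it, loops such as a retry branch returning to the question slide do not cause repeated work. The output is a checklist for the manual walk-through with CC enabled, not a replacement for it.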

Fix: Identify the uncaptioned slides (empty Notes panel or missing SRT import for each branching path slide), populate Notes or import SRT for each, and republish.

Seven-question FAQ

Does Storyline's built-in CC feature satisfy WCAG 2.1 AA?

It depends on whether the Notes panel text accurately represents what the narrator said. Storyline CC is driven by the Notes panel — it performs text-to-audio synchronisation using the Notes text as the caption source. If the Notes text is a word-for-word transcript of the spoken audio (which requires disciplined Notes maintenance or the use of TTS narration where Notes = audio by construction), Storyline CC satisfies WCAG SC 1.2.2. If the Notes text is a pre-production script that the live narrator deviated from during recording, Storyline CC shows the script, not the audio — and the deviation rate can put the course below WCAG's implicit 99% accuracy standard. The safest approach for live-narrated courses is SRT import from ASR on the narration audio, verified against the DCMP spot-check protocol, rather than relying on Notes text-sync.

Can I use Storyline's AI Voice and rely on the built-in CC?

Yes — this is the correct workflow for TTS-narrated Storyline content. Storyline AI Voice generates narration audio from the Notes panel text. The Notes text and the synthesised audio are by definition aligned: the audio says exactly what the Notes text contains (minus TTS pronunciation variations, which are audio-only and do not create caption text inaccuracies). Enable Storyline's built-in CC, verify that the Notes panel is complete and accurate for all narrated slides, and confirm timing in the published preview for slides with narration longer than two minutes (timing can drift slightly on long narration tracks with many caption cues). This approach produces reliable WCAG SC 1.2.2 compliance for TTS-narrated courses without the SRT import workflow overhead.

Rise 360 has no caption editor — does that mean Rise courses are inherently non-WCAG-compliant?

No — Rise courses can be WCAG-compliant if the caption work is done upstream of Rise. For video blocks: caption every video at the video host level (Vimeo, Wistia, YouTube) before embedding in Rise. Never use direct MP4 upload to Rise video blocks for content that requires captions. For audio blocks: provide a word-for-word text transcript as a text block below each audio block — this satisfies WCAG SC 1.2.1 for prerecorded audio-only content. For embedded Storyline blocks: ensure the Storyline block's CC is correctly configured before embedding in Rise. The Rise course itself is compliant when all its content components are compliant at their source. The gap is the direct MP4 upload path, which must be avoided entirely.

Does a WCAG-compliant LMS make SCORM content inside it compliant?

No. WCAG applies to the content, not just the container. An LMS that holds a VPAT claiming WCAG 2.1 AA conformance is making claims about the LMS interface: the navigation, the search, the learner profile, the course catalogue. It is not making claims about the content of the SCORM packages uploaded to it. Each SCORM package is a distinct piece of electronic content that must independently conform to SC 1.2.2 if it contains narrated video or synchronized media. An LMS administrator who tells an L&D team "our LMS is WCAG-compliant" is not telling them that the SCORM courses inside it are captioned; those are separate questions. The compliance matrix and the auto-captions compliance analysis both address this platform-vs-content distinction in detail.

We have 300 Storyline courses built before anyone thought about captions. How do we prioritise the backlog?

Use a three-tier triage model.

Tier 1 (immediate, within 30 days): any courses for which a specific accommodation request has been received from a deaf or hard-of-hearing learner. An active accommodation request creates an individual compliance obligation that must be addressed before the broader programme rollout.

Tier 2 (priority, within 90 days): mandatory compliance training (annual compliance certifications, HIPAA, safety, ADA training), onboarding courses (highest audience throughput and ADA Title I exposure from day one of employment), and courses taken by the largest audience segments.

Tier 3 (rolling programme): performance support, optional modules, archived content that is rarely accessed.

For the backlog workflow at scale: export narration audio from all source .story files (Articulate provides a media export utility), run through a batch glossary-corrected captioning pipeline, generate per-slide SRTs, import back into each .story file, republish to SCORM, and re-upload to LMS. For a 300-course library averaging fifteen narrated slides each, budget approximately 300 hours of SRT import and verification work beyond the ASR captioning time. The budget planning guide covers how to build a multi-year backlog remediation budget.
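
The triage rules and the budget arithmetic can be written down as a small sketch. The course field names below are hypothetical labels for a team's own course inventory. The four minutes per slide is derived from the figures in this answer (300 hours across 300 courses of fifteen slides is 4,500 slides, or four minutes each) rather than a measured benchmark.

```python
def assign_tier(course):
    """Map a course record (a dict of boolean flags) to a triage tier."""
    if course.get("accommodation_request"):
        return 1  # immediate, within 30 days
    if course.get("mandatory") or course.get("onboarding") or course.get("large_audience"):
        return 2  # priority, within 90 days
    return 3      # rolling programme

def import_hours(course_count, slides_per_course, minutes_per_slide):
    """Estimate SRT import and verification effort in hours."""
    return course_count * slides_per_course * minutes_per_slide / 60
```

`import_hours(300, 15, 4)` returns 300.0, matching the estimate above. Rerunning it with a team's own slide counts and a timed sample of import work gives a more defensible budget figure than the default.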

Can the LMS automatically add captions to SCORM packages after upload?

No LMS platform currently adds captions to SCORM content after upload. LMS platforms can add caption tracks to video assets stored in their media library (Kaltura, Panopto, Vimeo-integrated platforms), but SCORM packages are not video assets — they are HTML5 software packages that happen to contain audio. The LMS serves the SCORM package as uploaded; it does not process, modify, or augment the package content. The only intervention point is the authoring tool, before publish. If a SCORM package is deployed without captions, remediation requires returning to the source authoring tool file, adding captions, republishing, and re-uploading the SCORM package to the LMS. This is why the pre-publish verification step in the production workflow is the highest-leverage quality gate — it catches caption failures before they are embedded in a deployed package.

We use both Storyline and Rise. Do we need two different captioning workflows?

Yes, they require different approaches at the production step, but a shared production checklist handles this cleanly. The Storyline workflow: decide between text-sync (TTS or disciplined Notes) and SRT import (live narration) before production begins; run the appropriate captioning step before publish; verify CC in the published HTML5 preview and in the target LMS. The Rise workflow: caption all video at the host level before embedding (never direct-upload); add text transcript blocks for all audio blocks; verify CC button in embedded video players in the Rise preview and in the target LMS. The post-publish verification step is common to both: launch the course in the LMS as a learner, enable CC, confirm caption text appears and is correctly timed. A shared pre-publish checklist with Storyline-specific and Rise-specific rows, plus shared LMS verification rows, covers both workflows without requiring two separate documentation systems. The QA methodology guide covers the DCMP spot-check protocol applicable to both eLearning and video content.

GlossCap applies your organisational glossary to eLearning narration audio

Whether you're captioning a backlog of Storyline courses, setting up a Rise 360 video-host caption workflow, or maintaining caption accuracy as your Captivate library grows, GlossCap's glossary-biased captioning applies your organisational vocabulary to the ASR step — so product names, regulatory acronyms, and SDK identifiers come out correctly the first time, before the SRT import. The result is captions ready for Storyline import, Captivate upload, or Vimeo track attachment without the round of manual corrections that technical-vocabulary narration typically requires.
