LMS integration
Kaltura captions: REST upload, multi-language tracks, and the audit-ready workflow
Kaltura sits underneath a huge fraction of higher-ed lecture capture and large-enterprise video — Canvas/Blackboard/Moodle integrations, MediaSpace portals, embedded enterprise players. Caption attachment in Kaltura is more flexible than in any other major LMS-adjacent platform: there's a UI path through KMC, a REST Caption API for automation, and a per-entry track model that natively supports multiple languages. Here is the upload flow that actually works, the moves that break audit posture, and the retrofit playbook for a Kaltura library hit by the ADA Title II 2026-04-24 deadline.
TL;DR
In Kaltura, captions are caption assets attached to a video entry. Upload via the Kaltura Management Console (KMC) — Entry → Captions tab — or via the REST Caption API (caption_captionasset.add + caption_captionasset.setContent). Kaltura accepts SRT, VTT, DFXP/TTML, and SCC. Multi-language tracks attach to the same entry. For WCAG 2.1 AA the content of the caption file is the load-bearing piece — the player passes; your terminology accuracy is what an auditor samples.
Where caption upload lives in Kaltura
Two production paths, picked by scale:
- KMC UI (one-off, small libraries). Sign in to KMC → Content → Entries → click the entry → Captions tab → "Upload Captions". Pick the language, select Yes for "Default" if this is the primary track, choose the caption format (SRT / VTT / DFXP / SCC), upload the file. Kaltura immediately makes the track visible on the player.
- Kaltura REST Caption API (scale, automation). The two-call dance is caption_captionasset.add (declares the asset metadata: entryId, language, label, format, isDefault) followed by caption_captionasset.setContent (uploads the actual file content as a token-uploaded resource). Both calls are documented in the Kaltura API Console and ship in the official Node, Python, and PHP client libraries.
For libraries above ~50 entries, the REST path wins by an order of magnitude on time-to-finish — the UI path costs a full click-through per entry, while the API path is one batch run.
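The two-call dance can be sketched against Kaltura's service/action REST endpoint directly, with no client library. This is a minimal sketch, not a drop-in script: it assumes you already hold an admin session string (ks), the helper names are our own, and the KalturaCaptionType enum strings (SRT = "1", DFXP = "2", WebVTT = "3") should be checked against the Caption plugin docs before use.

```python
# Minimal sketch of caption_captionasset.add + setContent over plain HTTP.
# Assumptions: an existing admin session (ks); enum strings and the
# colon-nested parameter names as documented in the Kaltura API Console.
import json
import urllib.parse
import urllib.request

SERVICE_URL = "https://www.kaltura.com/api_v3/service/{service}/action/{action}"

def caption_type_for(path: str) -> str:
    """Map a caption file extension to a KalturaCaptionType enum string."""
    ext = path.rsplit(".", 1)[-1].lower()
    return {"srt": "1", "dfxp": "2", "ttml": "2", "vtt": "3"}[ext]

def kaltura_call(service: str, action: str, params: dict) -> dict:
    """POST one Kaltura API call and return the parsed JSON response."""
    url = SERVICE_URL.format(service=service, action=action)
    body = urllib.parse.urlencode({**params, "format": 1}).encode()
    with urllib.request.urlopen(url, data=body) as resp:
        return json.load(resp)

def attach_caption(ks: str, entry_id: str, path: str,
                   language: str, label: str, is_default: bool) -> dict:
    """Call 1 declares the asset metadata; call 2 pushes the file body."""
    asset = kaltura_call("caption_captionasset", "add", {
        "ks": ks,
        "entryId": entry_id,
        "captionAsset:objectType": "KalturaCaptionAsset",
        "captionAsset:language": language,       # e.g. "English"
        "captionAsset:label": label,
        "captionAsset:format": caption_type_for(path),
        "captionAsset:isDefault": int(is_default),
    })
    with open(path, encoding="utf-8") as f:
        content = f.read()
    return kaltura_call("caption_captionasset", "setContent", {
        "ks": ks,
        "id": asset["id"],
        "contentResource:objectType": "KalturaStringResource",
        "contentResource:content": content,
    })
```

The official Python client wraps the same two calls behind typed objects; the raw form above just makes the add-then-setContent order explicit.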
The multi-language track model
Kaltura's caption-asset model is one of the cleaner ones in the LMS-adjacent space: each video entry holds a list of caption assets, each with its own ISO language code, label, and isDefault flag. The Kaltura player auto-renders the CC menu from this list. Adding French + Spanish + English to the same training module is three caption-asset uploads against the same entry, each with the right language code (en, fr, es) and at most one with isDefault = true.
A common mistake: setting isDefault = true on multiple caption assets for the same entry. The player picks the last one written, which makes the active default non-deterministic across re-imports. Set exactly one default track per entry.
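A cheap defense is to normalize the default flags client-side before writing, so exactly one track per entry ever carries isDefault. A minimal sketch, where the CaptionTrack shape is our own illustration rather than a Kaltura client type:

```python
# Normalize a track list so exactly one caption asset is the default.
# CaptionTrack is an illustrative local type, not a Kaltura class.
from dataclasses import dataclass, replace
from typing import List

@dataclass(frozen=True)
class CaptionTrack:
    language: str      # ISO code, e.g. "en"
    label: str
    is_default: bool

def normalize_defaults(tracks: List[CaptionTrack],
                       preferred_language: str = "en") -> List[CaptionTrack]:
    """Return the list with exactly one is_default=True entry.

    Prefers the first track in preferred_language and falls back to the
    first track overall, so a default always survives a re-import."""
    if not tracks:
        return tracks
    winner = next((i for i, t in enumerate(tracks)
                   if t.language == preferred_language), 0)
    return [replace(t, is_default=(i == winner))
            for i, t in enumerate(tracks)]
```

Run this over the desired track set before the add calls and the re-import order stops mattering.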
The retrofit workflow for a Kaltura library
- List the entries. Use media.list filtered by category, owner, or createdAt window. Pull entryId, name, duration, status, and the current caption-asset count into a working spreadsheet so the post-retrofit verification step has a baseline.
- Classify by audit risk. Public-facing higher-ed lecture capture (covered by ADA Title II for state/local public entities since 2026-04-24), customer-facing product training, and HR-mandated compliance training all need captions first; archived or admin-only content can wait.
- Caption with one glossary in GlossCap. Pull the source video files from Kaltura (or your upstream archive), drop them into a GlossCap batch, and sync the company glossary once. SDK names, drug names, regulatory acronyms, internal product names — all logit-boosted into the Whisper-large decoder before output is generated.
- Export with entry-id-tagged filenames. Configure the GlossCap batch export so each SRT or VTT file is named with the Kaltura entryId, e.g. 1_abc123def.srt. The next step becomes a one-line script.
- Bulk-attach via REST. A short script iterates the SRT files and calls caption_captionasset.add + caption_captionasset.setContent per entry. With the entryId baked into the filename and a stable language label, the script is well under 100 lines in any of the official client libraries.
- Verify on a sample player URL. Open 5-10 sampled entries in the public player or your MediaSpace portal, and confirm the CC button shows the right languages and the timings look right. This catches the mis-mapping failure where a wrong filename attaches captions to the wrong entry.
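The filename-to-entry mapping in the bulk-attach step can be sketched as a pure helper. The optional language suffix convention (1_abc123def.fr.srt) is our assumption for multi-language batches, not a GlossCap or Kaltura convention; a bare entryId filename defaults to English here:

```python
# Recover (entryId, language) pairs from entry-id-tagged export filenames.
# The ".<lang>." suffix convention is an assumption for this sketch.
from pathlib import Path
from typing import Iterator, Tuple

def parse_export_name(path: Path) -> Tuple[str, str]:
    """Split '1_abc123def.fr.srt' into ('1_abc123def', 'fr').

    A bare '1_abc123def.srt' defaults to 'en'."""
    parts = path.name.split(".")
    if len(parts) == 3:
        return parts[0], parts[1]
    return path.stem, "en"

def plan_batch(export_dir: str) -> Iterator[Tuple[str, str, Path]]:
    """Yield (entryId, language, file) for every exported caption file.

    Each tuple then feeds one caption_captionasset.add +
    caption_captionasset.setContent pair."""
    for f in sorted(Path(export_dir).glob("*.srt")):
        entry_id, lang = parse_export_name(f)
        yield entry_id, lang, f
```

With the mapping isolated like this, the API half of the script is a plain loop over plan_batch, which is what keeps the whole thing under 100 lines.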
Higher-ed and the ADA Title II reality
Public universities are disproportionately hit by the 2026-04-24 ADA Title II deadline — it covers state and local government entities, and most flagship-university Kaltura libraries fall squarely in scope. Years of accumulated lecture capture, in many cases with no caption track or with auto-captions of ~85% accuracy, do not pass an audit. The university lecture capture page walks through what an auditor specifically samples: degree-program technical terms, faculty-coined terminology, and proper-name drug/procedure/method words. These are exactly the categories where general speech models fail.
The right architectural move is to retrofit the back-catalog with glossary-aware captions tied to the department's own term list (chemistry, biology, CS, law all maintain their own jargon stockpiles), then move new captures to a captioning-on-ingest pipeline. GlossCap is built for this exact pattern.
Why glossary-aware captions matter more on Kaltura
Kaltura is the LMS-adjacent platform with the heaviest concentration of higher-ed and life-sciences content — both verticals where domain terminology is the surface form most exposed to mis-captioning. A general-purpose Whisper output will write "tier zip a tide" where the lecturer said tirzepatide; "ku ber net es" where the engineer said Kubernetes; "see ess ess four" where the reading list said CSS-4. Each of those is a sampled-segment failure on an audit, and worse, a comprehension failure for the deaf-or-hard-of-hearing learner the captions exist for.
GlossCap's approach: your company or department glossary feeds a logit bias into Whisper-large's decoder before output, so the surface form lands right the first time. The output VTT or SRT is WCAG 2.1 AA-compliant on first export — ready for the caption_captionasset.add call.
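GlossCap's decoder hook is proprietary, but the general mechanism is easy to illustrate: at each decoding step, add a positive bias to the logits of candidate tokens that match glossary terms, so the domain surface form wins the argmax. A toy sketch over a plain score dict, not Whisper's actual API:

```python
# Toy illustration of glossary logit biasing, not GlossCap's implementation.
# A flat boost is added to any candidate token found in the glossary set.
from typing import Dict, Set

def bias_logits(logits: Dict[str, float],
                glossary: Set[str],
                boost: float = 4.0) -> Dict[str, float]:
    """Add `boost` to every candidate token in the glossary,
    tilting the argmax toward the domain surface form."""
    return {tok: score + (boost if tok in glossary else 0.0)
            for tok, score in logits.items()}

# Without the boost, the generic fragment "tier" outscores the drug name.
logits = {"tier": 1.2, "tirzepatide": 0.9, "zip": 0.4}
biased = bias_logits(logits, {"tirzepatide"})
best = max(biased, key=biased.get)
```

In a real decoder the bias applies per subword token across beam candidates, but the effect is the same: the glossary term's path through the search stops losing to phonetically similar generic words.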
Related questions
Can I use Kaltura's built-in REACH service instead?
REACH is Kaltura's own captioning marketplace: orders flow to Kaltura's vendor network for human or machine captioning. It works for general content. For terminology-heavy training (engineering SDKs, drug names, regulatory acronyms), the upstream model isn't seeded with your glossary, so the same mis-caption pattern shows up. Pulling source video out, captioning in GlossCap with the glossary, and reattaching via the Caption API is the high-accuracy path.
Does Canvas/Blackboard/Moodle Kaltura integration affect the upload flow?
Caption assets attach at the Kaltura entry level. The Canvas/Blackboard/Moodle integration shows the underlying entry and its caption tracks — it does not change where captions are stored. Upload via KMC or REST against the entry; the integration surfaces the captions in the embedded player automatically.
What about the Kaltura player on a custom MediaSpace site?
The standard Kaltura player renders the CC button from the entry's caption assets. Custom MediaSpace skins occasionally hide the CC control under a "more" menu — verify the player config before assuming captions are missing. The asset itself is still attached.
Does Kaltura accept TTML/DFXP?
Yes — DFXP is a Kaltura-supported caption format (the format dropdown lists it explicitly). For higher-ed and broadcast workflows that prefer TTML over WebVTT, this is a clean fit. See our TTML for LMS page for the format specifics.