LMS integration

Docebo captions integration: VTT, language tracks, and the Central Repository workflow

Docebo is one of the most common enterprise LMSs in 500+-employee learning orgs — a segment GlossCap supports via the Org tier. Docebo's Central Repository stores video assets with caption tracks attached to them; once you grok that structure, caption retrofit is straightforward. This page covers the upload flow, the language-track handling, and why VTT, not SRT, is usually the format to hand Docebo.

TL;DR

Docebo's video lives in the Central Repository (sometimes called Central Content Repository, CCR). A video asset can carry multiple caption tracks, one per language, and those tracks are bound to the asset — not to the course that uses the asset. Reuse an asset across courses and the captions follow. VTT is the well-tested format for Docebo; SRT usually works too, but the player's track-switching behaviour is cleaner on VTT. The compliance story is the same as on every other LMS: the caption content — verbatim dialogue, speaker labels, non-speech cues, ≈99% accuracy — is what carries the audit, not the platform.

Where captions live in Docebo's data model

Most LMS-retrofit work goes wrong because admins look for captions where courses live (instructional design view) instead of where assets live (content management view). Docebo is explicit about this: the Central Repository is the asset store. Each video asset in the repository has metadata (title, language, description) and can have caption tracks attached as part of that metadata.

Practically, this means:

  - Captions are attached to the asset in the Central Repository, not to each course that embeds it.
  - A video reused across several courses (or branded domains) only needs captioning once; every course inherits the tracks.
  - Retrofit scope is the count of video assets, not the count of courses, and is usually smaller than it looks.

Why VTT over SRT for Docebo

Both formats work in modern Docebo video players. The reasons VTT is the safer default:

  - WebVTT is the native caption format of the HTML5 track element, so browser-side rendering and language-track switching behave predictably; SRT generally gets converted under the hood first.
  - Docebo's player track-switching behaviour is cleaner on VTT tracks than on SRT ones.
  - VTT supports <v Name> voice tags for speaker attribution; SRT has no standard equivalent.

SRT is still acceptable if your team has an SRT-based authoring workflow. But if you are asking "which should I export from GlossCap for Docebo", the answer is VTT.
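If you do start from an SRT-based workflow, the format gap is small. A minimal SRT-to-VTT conversion (a sketch, assuming simple well-formed SRT input, not a full parser) comes down to adding the WEBVTT header, switching timestamp milliseconds from comma to dot, and dropping the numeric cue indices that VTT doesn't require:

```python
import re

def srt_to_vtt(srt_text: str) -> str:
    """Convert simple SRT caption text to WebVTT.

    Assumes well-formed SRT: numeric index line, timestamp line
    with comma milliseconds, then cue text.
    """
    lines = []
    for line in srt_text.strip().splitlines():
        # SRT timestamps use commas for milliseconds; VTT uses dots.
        if "-->" in line:
            lines.append(line.replace(",", "."))
        # Drop bare numeric cue indices (optional in VTT).
        elif re.fullmatch(r"\d+", line.strip()):
            continue
        else:
            lines.append(line)
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"

srt = """1
00:00:01,000 --> 00:00:03,500
Welcome to manager onboarding.
"""
print(srt_to_vtt(srt))
```

Going the other direction (VTT to SRT) loses the voice tags, which is another reason to keep VTT as the canonical format.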

The retrofit workflow for a Docebo library

Retrofitting an enterprise Docebo library for ADA Title II compliance (deadline live as of 2026-04-24) or European Accessibility Act (EAA) scope typically looks like:

  1. Identify every video asset in the Central Repository. Admins can filter assets by type = video. Export the list with asset IDs, titles, and durations.
  2. Decide caption scope. Not every asset needs captions; onboarding and compliance modules do, but an ops-only demo asset used in a single internal course may not. The scope call is organizational, not technical.
  3. Upload source videos to a GlossCap batch. Attach your company glossary once (Notion / Confluence / Google Docs sync, or paste list). For enterprise libraries, glossary depth matters — every product name, every internal acronym, every SDK symbol should be in it.
  4. Export VTTs with filenames tied to asset IDs. GlossCap supports arbitrary filename conventions on bulk export so the subsequent upload is a drag-and-drop match.
  5. Attach per asset in Docebo. Central Repository → asset → captions → upload. With 100+ assets, this is best done as a focused admin sprint.
  6. Verify on one test course per domain. Docebo's multi-domain (branding / audience isolation) feature means a test-learner account in one domain may not see what a learner in another will; sample at least one learner account per customer-facing domain.
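Steps 4 and 5 hinge on filename discipline. A small coverage check run before the admin sprint, sketched below under the assumed convention of naming each export <asset_id>.vtt (the asset IDs shown are illustrative), tells you which assets still lack a caption file and which files match no known asset:

```python
from pathlib import Path

def check_vtt_coverage(asset_ids: list[str], vtt_dir: Path) -> dict:
    """Compare Central Repository asset IDs against exported VTT files.

    Assumes the '<asset_id>.vtt' naming convention chosen at export time.
    """
    have = {p.stem for p in vtt_dir.glob("*.vtt")}  # filenames minus .vtt
    want = set(asset_ids)
    return {
        "missing": sorted(want - have),  # assets with no VTT yet
        "orphans": sorted(have - want),  # VTTs matching no asset
    }
```

Running this first turns the upload sprint into a pure drag-and-drop exercise instead of a hunt for mismatched filenames.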

The two things teams underestimate

Asset reuse is a win, not a footgun. If a "manager onboarding" video asset is reused across five courses in five branded domains, captioning once covers all five. Count your assets, not your courses — the scope is smaller than it looks. GlossCap's per-asset run means you pay once for the compute, not five times.
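The scope arithmetic is easy to make concrete. With a hypothetical course-to-asset mapping (the names below are made up), the captioning workload is the size of the deduplicated asset set, not the sum of per-course video counts:

```python
# Hypothetical mapping of courses to the video asset IDs they embed.
courses = {
    "onboarding-us":   ["mgr-onboarding", "benefits-intro"],
    "onboarding-emea": ["mgr-onboarding", "gdpr-basics"],
    "compliance-2026": ["mgr-onboarding", "gdpr-basics", "benefits-intro"],
}

placements = sum(len(v) for v in courses.values())  # what the library looks like
unique_assets = set().union(*courses.values())      # what you actually caption

print(f"{placements} video placements, {len(unique_assets)} assets to caption")
# → 7 video placements, 3 assets to caption
```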

Speaker attribution matters more on enterprise training. Docebo's large-course-library use case tends to include multi-speaker panels, instructor-plus-guest formats, and cross-functional walkthroughs. The auditor sampling question shifts from "is it accurate" to "can I tell who is speaking at every change". GlossCap emits speaker labels using <v Name> voice tags in VTT output, which Docebo's player renders as a prefix on each cue.
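For a sense of what speaker-labelled output looks like on the wire, here is a sketch that renders cues with the WebVTT voice tag; the speaker names, timings, and cue text are invented for illustration:

```python
def vtt_cue(start: str, end: str, speaker: str, text: str) -> str:
    """Render one WebVTT cue with a <v> voice tag for speaker attribution."""
    return f"{start} --> {end}\n<v {speaker}>{text}</v>\n"

cues = [
    vtt_cue("00:00:04.000", "00:00:07.000", "Instructor",
            "Today we cover rollout gates."),
    vtt_cue("00:00:07.200", "00:00:09.500", "Guest",
            "And the kubectl side of it."),
]
print("WEBVTT\n\n" + "\n".join(cues))
```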

How GlossCap fits Docebo specifically

The core loop is the one on the homepage — captions that know your jargon — but the Docebo-specific details matter:

  - Bulk VTT export with filenames keyed to Central Repository asset IDs, so the upload is a drag-and-drop match.
  - Speaker labels emitted as <v Name> voice tags, which Docebo's player renders as a prefix on each cue.
  - One language per run, giving one clean track per language to attach to the same asset.
  - Glossary sync (Notion / Confluence / Google Docs, or a pasted list) attached once per batch, not per video.

All of which amounts to: the caption file you download from GlossCap is the caption file you upload to Docebo, with no manual fixup in between. The terminology preservation — kubectl, Docebo (yes, even its own product name), tirzepatide — comes from the glossary-biased decode, not from a post-hoc find-and-replace.

See pricing

Related questions

Does Docebo accept TTML?

Some Docebo deployments accept TTML if the upstream video infrastructure is configured for it, but it is not the default path. VTT is the safe choice for most instances. If your admin team has a specific TTML requirement, our TTML page covers what GlossCap exports.

What about Docebo's AI Captions feature?

Docebo has shipped AI caption auto-generation in recent releases; it uses a general-purpose speech model with the same terminology-accuracy limitations as YouTube auto-captions on training content. Glossary-biased decoding fixes the specific mangles — product names, SDK symbols, drug names — that a generic model can't avoid. GlossCap complements, rather than competes with, the AI-caption feature for teams where terminology preservation matters.

Can I attach captions via the Docebo API?

Docebo's REST API covers most Central Repository operations. Caption-track attachment is supported on the asset endpoints; for a large one-time retrofit it's often faster to work in the admin UI, but for ongoing ingestion (new video every week), API-driven attachment is reasonable. Check the Docebo Developer Portal docs for the current endpoint shape.
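For the ongoing-ingestion case, an attachment loop might look like the sketch below. The URL path, form-field names, and payload shape are placeholders, not the real Docebo API; check the Developer Portal docs against your instance before wiring this up. The HTTP function is injected so the loop can be exercised without network access:

```python
def attach_caption(post, base_url, token, asset_id, lang, vtt_path):
    """Attach one VTT to a video asset via a caption endpoint.

    The path '/learn/v1/lo/.../captions' and the 'lang_code' field are
    PLACEHOLDERS, not the documented Docebo API shape. `post` is any
    requests.post-compatible callable.
    """
    url = f"{base_url}/learn/v1/lo/{asset_id}/captions"  # placeholder path
    with open(vtt_path, "rb") as fh:
        return post(
            url,
            headers={"Authorization": f"Bearer {token}"},
            data={"lang_code": lang},  # placeholder field name
            files={"file": fh},
        )

# Ongoing ingestion: one call per newly exported VTT, e.g.
# for asset_id, lang, path in new_exports:
#     attach_caption(requests.post, BASE_URL, TOKEN, asset_id, lang, path)
```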

How does multi-language caption track switching work for learners?

Docebo's player renders a CC menu listing every attached caption track by its language label. Learners pick. GlossCap exports one language per run; for multi-language coverage you run the source through GlossCap per language and attach each output to the same Docebo asset.
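That per-language fan-out can be sketched as a run plan: one GlossCap run per (asset, language) pair, with output filenames that keep the attachment step unambiguous. The <asset>.<lang>.vtt naming convention here is an assumption for illustration, not a Docebo requirement:

```python
def run_plan(asset_ids, languages):
    """One caption run per (asset, language); the filename encodes both."""
    return [
        {"asset": a, "lang": lg, "outfile": f"{a}.{lg}.vtt"}
        for a in asset_ids
        for lg in languages
    ]

plan = run_plan(["mgr-onboarding"], ["en", "de", "fr"])
# Three runs, three tracks, all attached to the same Docebo asset.
```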

Further reading