Tool reference · Loom (async video)

Loom captions: glossary-biased SRT/VTT for async-video training in modern SaaS

Loom is the async-video default for the modern SaaS workplace — the tool the engineering manager uses for the "here's the architecture review I would have done in your meeting" video, the tool the customer-success rep uses for the "here's how to configure your account" walkthrough, the tool the product manager uses for the "here's how the new feature works" launch demo, and increasingly the tool the L&D team uses for the lightweight customer-academy module. Loom's defining bet is async-first, browser-first, no-edit-friction video; everything else (auto-transcript, SRT export, AI summaries, embeddable links) hangs off the recording. The captioning question on Loom is the same question we ask on every authoring surface: not whether captions are technically supported (they are, via auto-transcript and SRT export) but whether the captions preserve the vocabulary the recording was made to teach. The answer, on the kind of content Loom is used for, is consistently no — until you bring a glossary-biased upstream pass to it.

TL;DR

Loom auto-transcripts every recording and exposes the transcript as searchable text plus a downloadable SRT (Business and Enterprise tiers, with limits on Starter). The auto-transcript is generic ASR and mangles product names, SDK terms, customer identifiers, and internal acronyms — exactly the vocabulary modern SaaS recordings are dense with. For training video that ships into a customer academy, an L&D module, or a public knowledge base, the auto-transcript is unfit for purpose. Glossary-biased captioning with the customer's product catalogue, SDK reference, internal acronym register, and customer-name register as the project glossary produces a clean SRT. The SRT can replace Loom's auto-caption (uploaded via the Loom UI on Business/Enterprise) or accompany the Loom MP4 download on its way to a hosted destination (Vimeo, Wistia, the LMS).

What Loom is, and where in the workflow captioning lands

Loom (acquired by Atlassian in 2023, now an Atlassian Cloud product) is an async-video recorder-and-host with a browser extension, a desktop app, and mobile clients.

Captioning lands at one of three points in the workflow: (1) the Loom auto-transcript, served alongside the recording natively; (2) a manually uploaded SRT replacing the auto-transcript (Business / Enterprise tier feature); (3) the SRT exported alongside the MP4 download for use at a downstream destination (LMS, video host, knowledge base).

The Loom caption-export and -upload mechanics

The vocabulary surface on Loom recordings in modern SaaS

Loom recordings in a modern SaaS workspace concentrate the highest proper-noun density of any video surface we measure. Why: Loom is the async tool of choice for context-rich, narrow-topic, single-author content — exactly the content profile that pulls heavily on internal vocabulary.

Why Loom's auto-transcript fails on this content

Loom's auto-transcript engine is generic ASR — well-tuned, but generic. It has no access to your product catalogue, your SDK reference, your customer-name register, your internal acronym register, your sales-framework register. On representative SaaS Loom recordings we measure 11-18 proper-noun mangles per minute of speech for engineering-team content, 6-10 mangles per minute for customer-success enablement, and 8-14 mangles per minute for product-marketing enablement. The mangle pattern is deterministic per term — the same product name mangles the same way across recordings — so a hand-correction workflow at the team level is, predictably, a half-FTE problem. Our long-form post on the hidden half-FTE in L&D caption correction walks the full math; on Loom-heavy SaaS workspaces, the Loom slice is the largest chunk of that half-FTE.
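The half-FTE claim is easy to sanity-check from the per-minute rates above. A minimal sketch, using assumed team-level inputs (300 minutes of training-relevant Loom per week, 10 mangles per minute, 25 seconds per hand correction; all three numbers are illustrative, not measured):

```python
def weekly_correction_hours(minutes_of_video: float,
                            mangles_per_minute: float,
                            seconds_per_fix: float) -> float:
    """Hours per week spent hand-correcting proper-noun mangles."""
    total_mangles = minutes_of_video * mangles_per_minute
    return total_mangles * seconds_per_fix / 3600

# Assumed inputs, for illustration only.
hours = weekly_correction_hours(minutes_of_video=300,
                                mangles_per_minute=10,
                                seconds_per_fix=25)
print(round(hours, 1))  # prints 20.8 -- roughly half an FTE's week
```

At mid-range engineering-content rates (14+ mangles per minute) the same volume of video pushes well past a half FTE.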

Loom AI's downstream features — auto-titles, chapters, summaries — inherit the mangle. The "AI summary" of a sales-enablement Loom that calls every customer name and competitor name by a hallucinated phonetic neighbour is an obvious problem; the customer-success enablement summary that mangles every product feature name is the same problem with subtler downstream consequences (the rep watches the summary, not the recording, and learns the wrong terms).

The glossary-biased workflow upstream of Loom

  1. Pull the customer's controlled vocabulary. SaaS-specific surface: the product-feature catalogue (a CSV from the product team, or the feature-flag table, or the public marketing-site feature index), the SDK reference (TypeDoc / JSDoc / pdoc / swagger output), the integration partner register, the persona / customer / account-name register (with caution — see privacy section), the internal acronym register from the company wiki.
  2. Download the Loom MP4. Business / Enterprise tier supports MP4 download from the recording's options menu. For workspace-scale retrofit, the Loom Workspace API supports programmatic MP4 download for the Workspace owner.
  3. Caption the MP4 with glossary-biased decoding. Run the audio through the captioning workflow with the workspace glossary biasing the decoder. Output: a clean SRT.
  4. Reviewer pass with amber-highlight UI. Every glossary-applied term highlighted with source-line provenance (feature catalogue entry, SDK reference URL, persona-register entry). The author or a peer reviewer scrubs the SRT; corrections feed the workspace glossary.
  5. Replace Loom's auto-caption. Business / Enterprise: upload the SRT to the recording, replacing the auto-caption. The Loom player now serves the clean caption track. The Loom auto-transcript is replaced for searchability, AI features, and CC display.
  6. Document. For training Looms that ship to a customer academy or external knowledge base, log the captioning provenance (vendor + glossary version, reviewer, date) for audit-evidence purposes.
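Because the mangle pattern is deterministic per term, the same mangle-to-canonical map that step 3's decoder biasing is built from can also patch a legacy auto-generated SRT in place. A minimal sketch; the product name "LaunchFlow" and its mangled form are hypothetical examples, not real glossary entries:

```python
import re

# Matches SRT timestamp lines like "00:00:01,000 --> 00:00:03,000".
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> ")

def apply_glossary(srt_text: str, corrections: dict) -> str:
    """Apply a deterministic mangle -> canonical-term map to SRT cue text,
    leaving cue indices and timestamp lines untouched."""
    out = []
    for line in srt_text.splitlines():
        stripped = line.strip()
        if stripped.isdigit() or TIMESTAMP.match(stripped):
            out.append(line)  # structural line: never rewrite
        else:
            for mangled, canonical in corrections.items():
                line = re.sub(rf"\b{re.escape(mangled)}\b", canonical, line)
            out.append(line)
    return "\n".join(out)

# Hypothetical mangle observed across recordings -> canonical product name.
srt = "1\n00:00:01,000 --> 00:00:03,000\nConfigure launch flow in the dashboard\n"
print(apply_glossary(srt, {"launch flow": "LaunchFlow"}))
```

The word-boundary match keeps the substitution from firing inside longer tokens; in practice the map is generated from the reviewer-pass corrections in step 4, not written by hand.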


The privacy and compliance posture on customer-name handling

Loom recordings in customer-success enablement, sales coaching, and account-review contexts mention customer names, and the captioning workflow has to handle those names carefully.

For most internal-training and customer-education Loom content (no customer names, no PHI, no PII beyond the speaker's own identity), the standard captioning workflow is appropriate. For higher-sensitivity content, the workflow design is the audit-relevant question.
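One workable pattern for the higher-sensitivity case is to split the glossary before anything leaves the workspace: product and SDK terms go to the captioning pass, while customer, account, and persona names are held back for a local-only correction step. A sketch under that assumption; the register labels and the entries "LaunchFlow" and "Acme Corp" are illustrative:

```python
# Register labels treated as sensitive -- an assumed convention, not a
# Loom or vendor-defined taxonomy.
SENSITIVE_REGISTERS = {"customer", "account", "persona"}

def split_glossary(entries: list) -> tuple:
    """Partition glossary entries into terms shareable with the captioning
    pass and sensitive terms kept for a local-only correction step.
    Each entry is a dict like {"term": ..., "register": ...}."""
    shareable, held_back = [], []
    for entry in entries:
        bucket = held_back if entry["register"] in SENSITIVE_REGISTERS else shareable
        bucket.append(entry)
    return shareable, held_back

glossary = [
    {"term": "LaunchFlow", "register": "product"},   # hypothetical product name
    {"term": "Acme Corp", "register": "customer"},   # hypothetical customer name
]
shareable, held_back = split_glossary(glossary)
```

The audit-relevant property is that the sensitive partition never appears in any payload sent outside the workspace; the local correction step applies it after the clean SRT comes back.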

Where Loom-captioned content typically lands

How Loom captions intersect WCAG 2.1 AA, ADA Title II, and EAA

Loom recordings shipped to a public-facing destination (customer academy, public knowledge base, marketing-site explainer) inherit the destination's accessibility regime. Loom recordings consumed internally inherit the employer's accessibility posture, which, under Section 504 for recipients of federal financial assistance and the indirect ADA Title III exposure of private-sector employers, increasingly converges on WCAG 2.1 AA.

Loom recordings consumed inside an EU-operating organisation fall under the European Accessibility Act when the content is B2C-facing; purely B2B content is generally outside the EAA's scope, and Article 4(5) separately exempts microenterprise service providers, but the safer baseline is to caption all training video to WCAG 2.1 AA regardless of regime. See our EAA captions requirements reference and our EAA Q3 2026 inflection-point post.

The technical caption requirement at SC 1.2.2 (Captions, Prerecorded) is the relevant WCAG checkpoint. SC 1.2.4 (Captions, Live) does not apply to Loom — Loom is async only.

Related questions

Does the Loom auto-transcript pass WCAG 2.1 AA on its own?

Technically the auto-transcript provides a caption track and clears the SC 1.2.2 surface check. Substantively, "captions accurately convey what the speaker said" is the audit-relevant standard, and the auto-transcript's per-minute proper-noun mangle rate on dense SaaS content does not meet that standard. For content shipping to a public-facing destination or a high-stakes internal training surface, the auto-transcript is unfit for purpose; for low-stakes ad-hoc team-internal Looms, it's typically adequate.

What about Loom AI's "AI summary" feature — does the glossary-biased workflow improve that?

Yes, indirectly. Loom AI consumes the transcript; if the transcript is clean, the AI features are clean. Replacing the auto-transcript with a glossary-biased SRT improves the auto-titles, chapters, and AI summaries downstream. (At the time of writing Loom regenerates AI features when the transcript changes — verify on your Loom workspace.)

Can I upload an SRT to a Loom recording on the Starter tier?

No — caption upload (replace transcript) is a Business / Enterprise feature. Starter-tier Loom users wanting clean captions need to either upgrade or use the MP4-download-plus-sidecar pattern at the downstream host. Most SaaS organisations using Loom for training are on Business or Enterprise.

How does this differ from captioning Camtasia, Storyline, or Rise content?

The vocabulary surface is similar across all four (modern SaaS proper-noun density), but the captioning insertion point differs. Camtasia is timeline-level inside the Camtasia editor. Storyline is per-slide inside the source file. Rise is per-Video-block inside the Rise course. Loom is per-recording at the Loom hosted service. The upstream glossary-biased workflow is identical; the downstream import target changes per tool.

What's the workspace-scale back-catalogue retrofit pattern on Loom?

Loom's Workspace API permits programmatic MP4 download for Workspace owners. The pattern: enumerate the Workspace's recordings, download MP4 for each, batch-caption with the workspace glossary, replace each recording's caption track via the Loom API (Enterprise feature). For Workspaces with hundreds of training-relevant Looms, this is the practical path; for smaller Workspaces, manual per-recording handling is fine.
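The enumerate/download/recaption loop is easier to audit if the planning step is separated from the API calls themselves. A sketch only: `api.loom.example` and the endpoint paths below are placeholders, not Loom's real Workspace API routes, which must be taken from current Loom documentation.

```python
def plan_retrofit(recordings: list,
                  api_base: str = "https://api.loom.example/v1") -> list:
    """Build a download-and-recaption plan for a workspace back-catalogue.
    The base URL and paths are placeholders for the real Loom Workspace
    API routes. Each recording is a dict like {"id": ..., "title": ...}."""
    plan = []
    for rec in recordings:
        plan.append({
            "title": rec["title"],
            "download_url": f"{api_base}/recordings/{rec['id']}/mp4",
            "srt_path": f"captions/{rec['id']}.srt",
            "caption_upload_url": f"{api_base}/recordings/{rec['id']}/captions",
        })
    return plan

# Review the plan (counts, titles, output paths) before any HTTP call runs;
# the actual download / batch-caption / upload loop then walks this list.
plan = plan_retrofit([{"id": "abc123", "title": "Onboarding demo"}])
```

Keeping the plan inspectable before execution also gives you the provenance log step 6 of the workflow asks for, almost for free.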

Further reading