Sonic Foundry Mediasite captions: Showcase, MyMediasite, LTI integration, and the lecture-capture retrofit pattern

Sonic Foundry Mediasite is one of the three dominant lecture-capture and enterprise-video platforms in the United States, alongside Panopto and Kaltura, and the platform of choice at a meaningful fraction of large public universities, large academic medical centres, and Fortune 1000 corporate-L&D operations. Mediasite's footprint is heaviest at institutions that adopted lecture capture early (mid-2000s through early 2010s) and have built up multi-thousand-hour catalogues since. Captioning a Mediasite catalogue is operationally distinct from captioning a Panopto or Kaltura catalogue because Mediasite's deployment surface area is wider: the on-premises Mediasite server, the Mediasite Cloud SaaS, the MyMediasite faculty-self-service portal, the Mediasite Showcase public-and-restricted publishing destination, the Mediasite Catch (capture-everywhere) appliance, the Mediasite Mosaic desktop recorder, and the legacy Mediasite Recorder hardware appliances installed in classrooms. Each of these touches the captioning workflow differently. With ADA Title II's 2026-04-24 deadline now passed, the back-catalogue retrofit is likely the single largest captioning project a Mediasite-running public university will undertake.

TL;DR

A Mediasite captioning workflow has five surfaces:

(1) Automated transcription. Mediasite's integrated speech-to-text produces a caption track on uploaded recordings, exposed as the WebVTT file delivered to the player.
(2) The Mediasite caption editor. A browser UI for line-by-line correction or wholesale upload-replace.
(3) MyMediasite. The faculty-self-service portal where instructors upload, trim, and publish their own recordings; captions get attached at this stage.
(4) Mediasite Showcase. The public-or-restricted publishing destination where finished recordings live (search-engine-discoverable for public Showcase, restricted for institutional Showcase); captions follow the recording into Showcase.
(5) LTI 1.3 integration. Mediasite courses and folders surface inside Canvas / Blackboard / Brightspace / Moodle via LTI; the captions follow the recording into the LMS.

The ADA Title II compliance date of 2026-04-24 has passed, which makes catalogue-wide retrofit the operational priority. The retrofit pattern is: inventory the catalogue across on-prem and Cloud → identify which surface owns each asset → re-caption the high-stakes content first → publish glossary-biased captions back to each surface → log the asset register for OCR-sampling readiness.

Why Mediasite captioning is now urgent: ADA Title II (28 CFR Part 35, Subpart H)

The Department of Justice's final rule under ADA Title II bound state and local-government public entities to WCAG 2.1 Level AA on web content and mobile apps. The compliance date for large public entities — including all public universities and large community-college systems — was 2026-04-24, and that date has now passed. SC 1.2.2 (Captions, Prerecorded) is the operative success criterion; the substantive bar is captions that accurately convey the audio.

Mediasite tenants concentrate at large public universities, large state university systems, large academic medical centres, and regulated-industry corporate-L&D customers that rely on Sonic Foundry for lecture-capture-style workflows. The 2026-04-24 deadline created an immediate audit-evidence task at every public-university Mediasite tenant: every recording in every active course had to have substantively accurate captions, be removed, or have a documented accommodation pathway.

Mediasite tenants typically face other regulatory regimes as well: Section 504 (any institution receiving federal financial assistance, i.e. virtually every US college accepting Title IV student-aid dollars), Section 508 (federal-contractor universities and federal-grant-funded research programmes), AODA (Canadian Mediasite tenants on the three-year compliance reporting cycle), the EAA (European tenants since 2025-06), Section 1557 (academic medical centres), and HIPAA (the 45 CFR § 164.530(b) workforce training mandate at any covered-entity tenant). The substantive caption-quality bar is similar across all of them; the institutional workflow has to clear that bar regardless of which regime triggers the obligation.

Surface 1 — Mediasite automated transcription

Mediasite's automated transcription runs on uploaded or recorded content, producing a caption track exposed as WebVTT. Behaviour relevant to captioning operators:

Per-account / per-folder configuration. Whether automated transcription runs by default is configured at the folder or account level. Many institutional Mediasite tenants enabled automated transcription as a default at some point between 2018 and 2020; recordings that predate the toggle have no caption track.
Substantive accuracy. Mediasite's STT lands in the same 80–90% accuracy band as other generic STT systems on lecture-style content. Proper-noun mangling on regulatory citations, drug names, technical product terms, faculty names, and institution-specific programme names is the dominant failure mode.
WebVTT output. Captions are exposed as WebVTT to the Mediasite player. The track carries per-cue timing, speaker detection in some configurations, and standard WebVTT cue structure.
Replace-track wholesale. The supported workflow for vendor-supplied captions is to replace the entire caption track wholesale. The clean SRT or VTT replaces the auto-caption.
Multi-language tracks. Mediasite supports multiple caption tracks per recording. The student-facing player exposes a language selector. For institutions serving multilingual student populations — California state systems, Texas systems, City University of New York, federal-grant-funded programmes requiring Spanish-language access — this is the right surface for multi-track delivery.

Automated transcription is the captioning baseline; it does not by itself meet the substantive accuracy bar SC 1.2.2 requires. Glossary-biased re-captioning during the back-catalogue retrofit is what closes that gap.
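As a rough illustration of the proper-noun failure mode, the sketch below applies a glossary to the cue text of a WebVTT track while leaving the header, cue identifiers, and timing lines untouched. It is a deliberately simplified stand-in: real glossary biasing would normally act on the speech recogniser itself, and the glossary entries and the `correct_vtt` helper here are illustrative assumptions, not part of Mediasite.

```python
import re

# Hypothetical glossary: common mis-transcription -> canonical term.
GLOSSARY = {
    "terzepatide": "tirzepatide",
    "pie torch": "PyTorch",
    "cube control": "kubectl",
}

# WebVTT cue timing line, e.g. "00:00:01.000 --> 00:00:04.000" (hours optional).
TIMING = re.compile(r"^(\d{2,}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+")


def correct_vtt(vtt_text, glossary):
    """Apply glossary substitutions to cue payload lines only."""
    patterns = [
        (re.compile(r"\b" + re.escape(wrong) + r"\b", re.IGNORECASE), right)
        for wrong, right in glossary.items()
    ]
    out = []
    in_cue = False
    for line in vtt_text.splitlines():
        if TIMING.match(line):
            in_cue = True          # payload lines follow the timing line
        elif not line.strip():
            in_cue = False         # a blank line ends the cue
        elif in_cue:
            for pattern, right in patterns:
                # A function replacement avoids backslash handling in `right`.
                line = pattern.sub(lambda _m, r=right: r, line)
        out.append(line)
    return "\n".join(out) + "\n"
```

A substitution pass like this only fixes mis-hearings someone has already catalogued, so it complements a human review pass rather than replacing one.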

Surface 2 — The Mediasite caption editor

Mediasite ships an in-browser caption editor for line-by-line correction or wholesale upload-replace:

Per-cue inline editing. Click a cue, edit the text, save. The per-word timing alignment is preserved.
Bulk caption-track upload. Upload a corrected SRT or WebVTT file; the file replaces the caption track wholesale on the recording.
Caption-track download. Download the auto-generated VTT for external editing, then upload back. This is the supported pattern for high-volume back-catalogue work — bulk-export captions, run them through a glossary-biased re-captioning pipeline, bulk-replace.
Caption-track audit. Per-recording metadata: caption file, source (auto vs manual vs vendor), upload date, modifying user. The audit trail is the artefact for OCR / DOJ document responses.

The caption editor is functional for line-by-line corrections but doesn't scale to a back-catalogue retrofit at multi-thousand-hour scale. For volume work, the operational pattern is: bulk-export auto-captions out of Mediasite via API, run them through glossary-biased re-captioning, bulk-replace via API.
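The export, re-caption, replace loop can be sketched as below. This is a minimal sketch under stated assumptions: `export_caption`, `recaption`, and `replace_caption` are hypothetical callables that an integrator would wire to the Mediasite API and to a captioning service; the actual endpoint names are not given here, so none are assumed. The point of the sketch is the bookkeeping: every recording ends up in exactly one outcome bucket.

```python
from dataclasses import dataclass, field


@dataclass
class BatchResult:
    replaced: list = field(default_factory=list)
    unchanged: list = field(default_factory=list)
    needs_transcription: list = field(default_factory=list)
    failed: dict = field(default_factory=dict)


def bulk_recaption(recording_ids, export_caption, recaption, replace_caption):
    """Run export -> re-caption -> replace for each recording.

    The three callables are hypothetical adapters supplied by the caller:
      export_caption(recording_id) -> caption text, or None if no track exists
      recaption(caption_text)      -> corrected caption text
      replace_caption(recording_id, caption_text) -> None
    """
    result = BatchResult()
    for rid in recording_ids:
        try:
            original = export_caption(rid)
            if original is None:
                # No caption track at all: needs transcription from audio.
                result.needs_transcription.append(rid)
                continue
            corrected = recaption(original)
            if corrected == original:
                result.unchanged.append(rid)
                continue
            replace_caption(rid, corrected)
            result.replaced.append(rid)
        except Exception as exc:
            # One bad recording should not halt a multi-thousand-item batch.
            result.failed[rid] = str(exc)
    return result
```

Keeping the failures in a dictionary rather than raising makes the batch resumable: the failed IDs can be retried on their own after the cause is fixed.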

Surface 3 — MyMediasite (faculty self-service)

MyMediasite is the faculty-self-service portal where instructors upload, trim, and publish their own recordings without IT intervention. Captioning workflow at MyMediasite:

Upload-and-process. Instructor uploads MP4 (or other supported container); Mediasite processes, generates the player formats, runs automated transcription, and exposes the caption editor.
Self-service caption upload. The instructor can upload their own SRT or VTT to the recording, replacing the auto-caption.
Auto-publish caption track to Showcase. When the recording is published to a Showcase, the caption track travels with it.
LTI handoff. When the recording is embedded into the LMS via the Mediasite LTI tool, the caption track surfaces in the LMS player.

MyMediasite is the surface where most ongoing forward-captioning work happens at large institutional tenants: newly recorded instructional content gets captioned through this self-service path. The back-catalogue retrofit work typically happens through the admin / API surface rather than MyMediasite.

Surface 4 — Mediasite Showcase

Mediasite Showcase is the publishing destination for finished recordings. There are two flavours:

Public Showcase. Search-engine-discoverable, anonymous-access. Used for public-facing institutional content (commencement addresses, public lectures, marketing video). The caption track is publicly accessible.
Restricted Showcase. Authenticated access. Used for institutional content that should be visible to logged-in students/staff but not the broader internet. The caption track travels with the recording into the restricted view.

Public Showcases are the surface most exposed to OCR / DOJ / EU enforcement sampling because they are publicly accessible. A captioning failure on a Public Showcase recording is more likely to attract a complaint and an investigation than the same failure on a Restricted Showcase recording (which is still in scope but less visible). The triage prioritisation in the back-catalogue retrofit should weight Public Showcase content higher.

The other operational consideration with Showcase: the recordings are search-engine-discoverable, meaning the caption text is part of what Google indexes. Substantively accurate captions on the Public Showcase improve the institution's web search presence on the relevant programme names and topics; substantively inaccurate captions actively hurt that presence (Google indexes the wrong proper-noun spellings, the institution doesn't surface for the right queries). The SEO value of correctly captioning Showcase content is a real second-order argument for the institutional retrofit, beyond the regulatory obligation.

Surface 5 — LTI 1.3 integration with the institutional LMS

Mediasite integrates with the institutional LMS via LTI 1.3. The Mediasite LTI tool exposes folders of recordings inside Canvas, Blackboard, Brightspace, and Moodle. Behaviour relevant to captioning:

Captions follow the recording into the LMS. When an instructor inserts a Mediasite recording into a course page via the LTI tool, the recording's caption track surfaces in the LMS-embedded player. The captioning workflow happens in Mediasite, not in the LMS.
Course-folder mapping. The Mediasite folder for a course can be mapped to the LMS course shell, so all recordings in that folder are accessible to the course's enrolled students.
Single sign-on. LTI 1.3 carries the institutional SSO into Mediasite. The same authenticated user surfaces in Mediasite with the same role.
Cross-course-copy persistence. When the LMS course is copied across terms, the Mediasite LTI link copies as well; the captions follow the underlying Mediasite recording.

The LTI integration is the surface where Mediasite content reaches most students. The captioning obligation surfaces on the LTI-embedded view — a student watching the recording inside Canvas / Blackboard / Brightspace / Moodle sees the same caption track that appears in the Mediasite player.

The OCR sampling pattern, applied to a Mediasite tenant

The Office for Civil Rights' sampling pattern when an investigation lands on a Mediasite-running institution is consistent across the cases that have been published:

Identify a course or a Showcase. Often the complainant names a specific course (LTI-embedded Mediasite recordings) or a specific public Showcase recording.
Open a recent module. The investigator looks for video — instructor-recorded lecture, a guest-speaker recording, a procedural demonstration, a regulated-content module, a public-Showcase commencement or guest-lecture recording.
Watch a slice with captions on. Two to three minutes is enough to assess whether the captions track the speaker, including the named technical terms.
Read the caption track against the audio. Mangled proper nouns (drug names, regulatory citations, technical product terms, institution-specific programme names, faculty names) are the failure pattern that gets flagged in writing.
Sample the back-catalogue. If the named recording fails, the investigator typically samples a half-dozen other recordings — across the same course, related courses, and the public Showcase — to check for a pattern. A pattern triggers a programme-wide finding.

The proper-noun failure mode is what generic auto-captioning is structurally bad at. The words that distinguish a competent caption from a mangled one are the words generic STT has the least training data for. Mediasite's automated transcription, like every other generic STT system, fails on these words with predictable regularity.

The Mediasite back-catalogue retrofit pattern

For an institution sitting on years of un-captioned, auto-captioned, or partially captioned Mediasite recordings, the retrofit runs in five phases:

Inventory. Generate a flat list of every recording across the institutional Mediasite tenant. Mediasite's REST API exposes recording listing, folder structure, and per-recording caption-status metadata. Most institutions discover their catalogue is larger than expected — the typical institutional Mediasite back-catalogue at a large research university spans 5,000 to 50,000 recording hours, accumulated since the original Mediasite deployment in the mid-2000s.
Triage. Rank by exposure: actively-referenced LTI-embedded recordings first (still in active course shells), public-Showcase content high (search-engine-discoverable), regulated-content modules urgent (medical, healthcare, compliance, legal), recordings cited from required-graduation-track courses high. The triage cut typically removes 30–50% of the catalogue from retrofit scope (recordings nobody links to anymore can be archived rather than re-captioned).
Re-caption. Replace mangled or absent captions with glossary-biased output. The institutional glossary is built once — programme names, course names, faculty names, regulatory citations, drug and procedure names if a healthcare programme, SDK symbols if a CS programme, the institution's acronym handbook — and applies to every retrofit asset. Per-customer compounding accuracy is what makes this scale.
Publish. Push captions back to Mediasite via the API caption-replace endpoint. Verify the LTI-embedded surface (LMS player) and the Showcase surface both pick up the new caption track.
Log. Maintain an asset register: recording URL, Mediasite recording ID, caption file, caption source, reviewer, review date, glossary version, downstream syndication targets (LMS course shells, Showcases, external links). The asset register is the artefact that answers OCR / DOJ / EU enforcement document requests, and it's how institutional risk management proves work-in-progress on the long tail.
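The triage phase above can be sketched as a simple weighted ranking. The exposure flags mirror the categories named in the triage step; the numeric weights and the `Recording` fields are illustrative assumptions that an institution would tune to its own risk posture, and the archive list is only a set of candidates for human review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    recording_id: str
    lti_embedded_active: bool = False   # linked from an active LMS course shell
    public_showcase: bool = False       # search-engine-discoverable
    regulated_content: bool = False     # medical, healthcare, compliance, legal
    required_track: bool = False        # cited from a required-graduation course


# Assumed weights, not taken from any standard; higher means retrofit sooner.
WEIGHTS = {
    "lti_embedded_active": 8,
    "regulated_content": 6,
    "public_showcase": 4,
    "required_track": 2,
}


def triage_score(rec):
    """Sum the weights of every exposure flag set on the recording."""
    return sum(w for name, w in WEIGHTS.items() if getattr(rec, name))


def triage(recordings):
    """Return (retrofit queue ranked by exposure, archive candidates)."""
    scored = [(triage_score(r), r) for r in recordings]
    # sorted() is stable, so recordings with equal scores keep inventory order.
    queue = [r for s, r in sorted(scored, key=lambda pair: -pair[0]) if s > 0]
    archive = [r for s, r in scored if s == 0]
    return queue, archive
```

Recordings with no exposure flag at all fall out of the retrofit queue, which is how the triage cut shrinks the scope before any captioning spend.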

Where glossary-biased captioning changes the math

The standard institutional Mediasite retrofit cost calculus pits hand-corrected auto-captioning against vendor-supplied human captioning. Hand-correction at one to two hours per recording, multiplied by a 10,000-recording back-catalogue at a large research university and a $40-per-hour staff or student-worker rate, comes to roughly $400,000 to $800,000 in labour. Human captioning at $1.25-$3.00 per minute of video, multiplied by an average 50-minute lecture across that catalogue, comes to roughly $625,000 to $1.5 million. The vendor path is often the larger of the two because Mediasite catalogues skew toward full-length lectures rather than the shorter content that drives Cloud Recording or Loom catalogue size.
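The arithmetic behind those two paths is easy to reproduce. The snippet below simply multiplies out the figures quoted above (10,000 recordings, one to two correction hours each at $40 per hour, and $1.25 to $3.00 per minute across 50-minute lectures); the function names are illustrative.

```python
def hand_correction_cost(recordings, hours_per_recording, hourly_rate):
    """Labour cost of hand-correcting auto-captions."""
    return recordings * hours_per_recording * hourly_rate


def vendor_cost(recordings, minutes_per_recording, rate_per_minute):
    """Cost of per-minute human captioning."""
    return recordings * minutes_per_recording * rate_per_minute


RECORDINGS = 10_000

hand_low = hand_correction_cost(RECORDINGS, 1, 40)    # 400,000
hand_high = hand_correction_cost(RECORDINGS, 2, 40)   # 800,000
vendor_low = vendor_cost(RECORDINGS, 50, 1.25)        # 625,000
vendor_high = vendor_cost(RECORDINGS, 50, 3.00)       # 1,500,000

print(f"hand correction: ${hand_low:,.0f} to ${hand_high:,.0f}")
print(f"human captioning: ${vendor_low:,.0f} to ${vendor_high:,.0f}")
```

Swapping in an institution's own recording count, average duration, and labour rate gives a first-pass budget range before any vendor quote.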

Glossary-biased captioning is a different cost shape. The institution builds the glossary once. Each minute of video costs a fraction of human-vendor pricing. The accuracy is high enough on the proper-noun surface that the human-review pass collapses from the full-correction hour to a quick scrub of the amber-highlighted glossary surface. For a 10,000-recording, 8,500-hour Mediasite catalogue retrofitted over a six-month window, the GlossCap math (Org plan, 8,500 hours over six months) lands well under the in-house and vendor-only paths. See the vendor pricing breakdown for the per-hour comparison.

The high-leverage steady-state pattern is: Mediasite webhook on recording-completed → external captioning service downloads the recording → produces glossary-biased VTT → uploads back through the caption-replace endpoint → Mediasite Showcase + LTI-embedded LMS player both surface the new captions automatically.
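A minimal sketch of the receiving end of that pattern follows. The JSON payload shape (`event`, `recordingId`) is a hypothetical assumption for illustration, not Mediasite's documented webhook format, and `download`, `caption`, and `upload` are caller-supplied adapters standing in for the recording-download call, the captioning service, and the caption-replace call.

```python
import json


def handle_webhook(body, download, caption, upload):
    """Handle one webhook delivery and return a short status string.

    body     : raw request body (bytes or str) containing JSON
    download : hypothetical adapter, recording_id -> media bytes
    caption  : hypothetical adapter, media bytes -> WebVTT text
    upload   : hypothetical adapter, (recording_id, vtt_text) -> None
    """
    event = json.loads(body)
    # Assumed event name; ignore anything that is not a completed recording.
    if event.get("event") != "recording-completed":
        return "ignored"

    recording_id = event["recordingId"]
    media = download(recording_id)
    vtt = caption(media)

    # Refuse to upload anything that is not recognisably a WebVTT file.
    if not vtt.lstrip("\ufeff").startswith("WEBVTT"):
        raise ValueError(f"caption output for {recording_id} is not WebVTT")

    upload(recording_id, vtt)
    return "captioned"
```

In production this function would sit behind an HTTP endpoint with signature verification and a retry queue; those parts are omitted here because they depend on the hosting stack.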

Mediasite vs Panopto vs Kaltura — does the workflow differ materially?

The three lecture-capture platforms are operationally similar at the captioning surface but differ in tenant-deployment shape:

Mediasite vs Panopto. Both produce a caption track exposed as WebVTT to the player. Both have a caption editor. Both support replace-track wholesale. Panopto's accessibility report (per-folder caption-status spreadsheet) is more mature than Mediasite's; Mediasite's API surface is comparable. The institutional choice between the two is typically driven by classroom-hardware history (Mediasite's hardware-recorder appliance footprint vs Panopto's classroom-recorder solution) rather than by captioning workflow differences.
Mediasite vs Kaltura. Kaltura's positioning as a campus-wide media management platform (covering more than lecture capture — institutional video portal, media asset management, digital signage) gives it broader reach than Mediasite's lecture-capture focus. The captioning workflow is operationally similar; Kaltura's REST API is the most mature of the three.
Mediasite's distinctive deployment surface. Mediasite's mix of on-premises Mediasite Server, Mediasite Cloud SaaS, MyMediasite, Showcase, and LTI integration is broader than the typical Panopto or Kaltura deployment. Institutions that adopted Mediasite early often run a hybrid (some content on the legacy on-prem server, newer content on Cloud); the captioning retrofit has to reach both.

The retrofit pattern is identical across the three — inventory, triage, re-caption, publish, log — and the substantive caption-quality bar SC 1.2.2 enforces is identical regardless of platform.

Higher-ed proper-noun failure modes in Mediasite content

The proper-noun categories that cause the highest substantive-accuracy failures in Mediasite content vary by programme. The institutional glossary should pre-load the institution-specific terms; the per-discipline categories the glossary should cover include:

Healthcare programmes. Drug INNs (tirzepatide, semaglutide, apixaban, rivaroxaban, dexamethasone); CPT and ICD-10 codes; procedure abbreviations (TAVR, CRRT, ECMO, PCI); pathogen names (C. difficile, S. aureus); anatomy terms. See also the medical training captions reference.
Engineering and computer science. SDK and library names (PyTorch, TensorFlow, Helm, kubectl, Terraform); language constructs; cloud-vendor product names.
Law programmes. Case names with non-Anglophone parties; Latin terms; statutory citations; international-tribunal names.
Business and finance. FINRA / SEC / OCC / FDIC / CFPB abbreviations; product names (Bloomberg, Refinitiv, FactSet); accounting standard codes (ASC 606, IFRS 15).
Humanities and social sciences. Non-Anglophone names; period-specific vocabulary; foreign-language terms quoted within an English lecture; archaeological-site names; political-history institutional acronyms.
STEM research lectures. Reagent names; instrument names (LC-MS, NMR, electron microscope manufacturer names); protein and gene names with non-standard capitalisation rules.
Institution-specific. Course numbers; programme names; faculty names; campus-building names; institutional traditions and acronyms.

The compounding-accuracy property of glossary-biased captioning makes the back-catalogue retrofit cheaper than the steady-state forward captioning over time. The first 100 hours captioned with the institutional glossary set the floor; subsequent hours benefit from per-customer term-frequency weighting.

FAQ — Mediasite captions

Does Mediasite's automated transcription clear ADA Title II SC 1.2.2?

Mediasite's automated transcription lands in the same 80–90% substantive-accuracy band as YouTube auto-captions on training-style content with technical proper nouns. The substantive-accuracy bar SC 1.2.2 enforces is "captions that accurately convey the audio," not "captions that exist." For a no-proper-noun, conversational video, automated transcription can be substantively accurate. For lecture, regulated-content, or technical-procedure video, automated transcription virtually always requires correction. The defensible posture is to treat automated transcription as a draft and run a glossary-biased correction pass before the recording is exposed to a student.

What format does Mediasite use for captions — SRT or VTT?

WebVTT for player delivery. Mediasite accepts SRT and VTT for caption upload; both are converted to VTT internally for the player. Most institutional retrofits standardise on uploading VTT to keep the per-cue styling and positioning options open.
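For teams that standardise on VTT before upload, converting an existing SRT file locally is straightforward: add the `WEBVTT` header and switch the millisecond separator in timing lines from a comma to a full stop. The sketch below is a minimal converter under that assumption; it does not handle SRT styling tags or positioning, and Mediasite does not require it since SRT uploads are accepted as-is.

```python
import re

# SRT timing line: "00:00:01,000 --> 00:00:04,500"
SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text):
    """Convert SRT caption text to WebVTT text."""
    # Normalise line endings and drop a leading byte-order mark if present.
    text = srt_text.replace("\r\n", "\n").lstrip("\ufeff")
    body = [
        SRT_TIMING.sub(r"\1.\2 --> \3.\4", line)
        for line in text.split("\n")
    ]
    # SRT sequence numbers are left in place; they are valid cue identifiers.
    return "WEBVTT\n\n" + "\n".join(body).strip("\n") + "\n"
```

Usage is a single call, for example `vtt = srt_to_vtt(open("lecture.srt", encoding="utf-8").read())`, where `lecture.srt` is a placeholder filename.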

Can I run an external captioning service against Mediasite via API?

Yes. Mediasite's REST API exposes recording listing, recording download, and caption-replace endpoints. The standard production-grade pattern is: webhook on recording-completed → external captioning service downloads the recording → produces glossary-biased VTT → uploads back through the caption-replace endpoint. At multi-thousand-hour scale, the webhook + API pattern is the practical way to keep captioning current across a large institutional Mediasite catalogue; the in-browser editor does not scale to that volume.

How does the Mediasite Showcase publishing surface affect SEO?

Public Showcases are search-engine-discoverable. The caption track is part of what Google indexes for the recording. Substantively accurate captions on Public Showcase content surface the institution for the right programme and topic queries; substantively inaccurate captions actively hurt that presence (Google indexes the wrong proper-noun spellings, the institution doesn't surface for the right queries). The SEO benefit of correctly captioning Showcase content is a meaningful second-order argument beyond the regulatory obligation.

If we copy a course across terms in our LMS, do the Mediasite captions follow?

The captions live with the underlying Mediasite recording, not with the LMS course shell. When the LMS course is copied across terms, the LTI link to the Mediasite recording copies; the recording's caption track is unchanged and surfaces in the new term's course shell. This is operationally cleaner than the Files-area sidecar caption pattern in Canvas, where the caption-track association in the rich-text editor sometimes has to be re-attached after course copy.

What about Mediasite Mosaic desktop recordings?

Mediasite Mosaic is the desktop recorder for instructors who want to record at-the-desk content rather than in a classroom. Mosaic recordings upload to the institutional Mediasite tenant and inherit the captioning surfaces — automated transcription on upload, caption editor in the web UI, MyMediasite for self-service caption upload, Showcase for publishing, LTI for LMS embedding. The captioning workflow is identical to the classroom-recorded path.

What does the OCR investigation packet typically request for a Mediasite tenant?

For a video-accessibility complaint, OCR typically requests: the course or Showcase URL, the videos in the course, the caption files attached to each, the institutional accessibility policy, the staff and faculty training records around accessibility, the accommodation-services request log relevant to the complainant. For Mediasite-specific cases, the request often calls out the Showcase content separately because Public Showcase is publicly visible. The asset register described in the retrofit pattern is the artefact that answers the documentation half of the request quickly.
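One way to keep that asset register in a form that can be handed over quickly is a flat CSV with one row per recording. The sketch below uses the fields listed in the retrofit pattern's log step; the column names, the allowed `caption_source` values, and the semicolon-separated target list are illustrative choices rather than a prescribed schema.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields

ALLOWED_SOURCES = {"auto", "manual", "vendor"}


@dataclass
class AssetRecord:
    recording_url: str
    recording_id: str
    caption_file: str
    caption_source: str        # one of ALLOWED_SOURCES
    reviewer: str
    review_date: str           # ISO 8601 date, e.g. "2026-05-01"
    glossary_version: str
    syndication_targets: str   # semicolon-separated: course shells, Showcases, links

    def __post_init__(self):
        if self.caption_source not in ALLOWED_SOURCES:
            raise ValueError(f"unknown caption source: {self.caption_source!r}")


def write_register(records, stream):
    """Write the register as CSV to any text stream (file or StringIO)."""
    writer = csv.DictWriter(stream, fieldnames=[f.name for f in fields(AssetRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
```

When writing to disk, open the file with `newline=""` as the `csv` module documentation recommends, so row terminators are not doubled on Windows.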

How does this relate to AODA captions for Canadian Mediasite tenants?

AODA's Integrated Accessibility Standards Regulation binds large Ontario organisations (50+ employees) — including all Ontario public universities — to WCAG 2.0 AA on web content, with a three-year compliance reporting cycle. The substantive captioning bar is identical to ADA Title II's. See the AODA captions reference for the reporting-cycle detail; institutions in scope of both regimes can ship one caption track to clear both.