Platform reference · Blackboard Learn · Anthology · LTI 1.3
Blackboard Learn captions: Ally, Collaborate, Mediasite/Panopto LTI, and the back-catalogue retrofit pattern
Blackboard Learn — Anthology's flagship LMS, used in both Ultra and Original Course View — is a major higher-education and corporate-L&D learning platform with the densest accessibility-evidence surface of any major LMS thanks to Blackboard Ally. Captioning a Blackboard course is not one workflow but four: SRT/VTT sidecar tracks on Content Items video, Blackboard Ally's per-asset accessibility score and Alternative Formats pipeline, Blackboard Collaborate's live closed-captioning and session-recording captions, and the larger video catalogues delivered through external-tool embeds (Mediasite, Panopto, Kaltura, YouTube) via LTI 1.3. After the 2026-04-24 ADA Title II deadline, the urgent operational task at every public-college and public-university Blackboard tenant is the back-catalogue retrofit — and the failure mode auditors find first is the proper-noun mangling generic auto-captioning produces with predictable regularity.
TL;DR
A Blackboard Learn captioning workflow has four surfaces. (1) Content Items video — instructors attach an MP4 plus a sidecar SRT or VTT file and the Blackboard player reads the sidecar via the track element. (2) Blackboard Ally — Anthology's accessibility-scoring layer scores every video on caption presence and quality, surfaces a per-course report, exposes the institutional report, and powers the Alternative Formats download menu (audio-only, transcript). (3) Blackboard Collaborate — both live sessions and session recordings carry their own caption tracks, with manual live-captioner support and post-session auto-caption editing. (4) External video via LTI 1.3 — most full-length lecture content lives in Mediasite, Panopto, Kaltura, or YouTube, embedded into Blackboard, and captioning runs through the external platform. The ADA Title II deadline of 2026-04-24 (now in force) makes this a catalogue-wide obligation. The retrofit pattern is: inventory the catalogue → identify which surface owns each asset → re-caption the high-stakes content first → publish glossary-biased captions back to each surface → log the asset register for OCR-sampling readiness.
Why Blackboard captioning is now urgent: the ADA Title II web rule (28 CFR Part 35, Subpart H)
The Department of Justice's final rule under ADA Title II (28 CFR Part 35, with the web-content and mobile-app provisions added as the new Subpart H) bound state and local-government public entities to WCAG 2.1 Level AA on web content and mobile apps. The compliance date for large public entities — including all public universities, large community-college systems, and state-run technical institutes — was 2026-04-24, and that date has now passed. SC 1.2.2 (Captions, Prerecorded) is the operative success criterion; the substantive bar is captions that accurately convey the audio.
Blackboard Learn tenants concentrate at large public universities, large community-college systems, and at the kind of regulated-industry corporate-L&D customer that runs an Anthology relationship for the audit-evidence shape. The 2026-04-24 deadline created an immediate audit-evidence task: every video in every active course had to either have substantively accurate captions on it, or had to be removed, or had to have a documented accommodation pathway. Auto-captions in the 80–90% accuracy band do not clear the SC 1.2.2 substantive-accuracy bar when the words being mangled are the words the student is being tested on.
The other regulatory regimes Blackboard tenants typically face: Section 504 (any institution receiving federal financial assistance — i.e. virtually every US college accepting Title IV student-aid dollars), Section 508 (federal-contractor universities and federal-grant-funded research programmes), AODA (Canadian Blackboard tenants on the three-year compliance reporting cycle), the EAA (European tenants since 2025-06), and Section 1557 (any healthcare-college programme that participates in HHS-administered programmes, including university medical centres and academic teaching hospitals).
Surface 1 — Content Items video with sidecar captions
The most basic Blackboard captioning surface is video uploaded as a Content Item attachment. The instructor uploads an MP4, MOV, or other supported container; Blackboard serves it through the built-in HTML5 video player. The supported sidecar workflow:
- Upload the caption file (SRT or WebVTT) alongside the video file.
- In the rich-text editor, insert the video as an HTML5 video element with a track child pointing at the caption file.
- Save. The Blackboard video player exposes the CC button and renders the captions over the video.
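The markup the editor step needs is ordinary HTML5 video-plus-track markup. Below is a minimal sketch of a helper that builds it for pasting into the RTE's HTML source view; the course file paths are illustrative placeholders and the exact path shape depends on where the files were uploaded.

```python
# Sketch: build the HTML5 video + track markup to paste into the
# Blackboard rich-text editor's HTML source view. The file URLs below
# are hypothetical placeholders for files already uploaded to the course.

def build_video_embed(video_url: str, caption_url: str, lang: str = "en") -> str:
    """Return a <video> element with a captions <track> child."""
    return (
        f'<video controls>'
        f'<source src="{video_url}" type="video/mp4">'
        f'<track src="{caption_url}" kind="captions" srclang="{lang}" label="English" default>'
        f'</video>'
    )

print(build_video_embed(
    "/bbcswebdav/courses/BIO-101/lecture-01.mp4",   # hypothetical course file path
    "/bbcswebdav/courses/BIO-101/lecture-01.vtt",   # hypothetical sidecar caption path
))
```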
For most older Blackboard tenants — Original Course View instances dating back to the late 2000s — the back-catalogue lives in this surface. Videos uploaded into Content Items over the last decade-plus that pre-date Blackboard Ally and never had captions attached. Two operational realities apply:
- Caption files are first-class file objects. They have their own permissions, can be moved between courses via course copy, and can be downloaded by anyone with read access to the Content Item. Don't put PII or unredacted student feedback into caption text — it's now a downloadable file.
- Caption files don't auto-attach across course exports. When a Blackboard course is exported and re-imported (the common pattern at term boundary, or when promoting a master-course shell into a delivery shell), the videos and the caption files come over but the track association in the rich-text editor often does not survive. Re-attach captions after every course export, or use Blackboard Collaborate or an LTI-embedded video host for content that travels across terms.
For format choice, see the SRT and VTT reference pages — Blackboard's HTML5 player handles both, with VTT preferred when the captions need positioning cues, styling, or speaker-identification metadata.
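When a retrofit produces SRT but a target surface wants WebVTT (or the reverse), the conversion is mechanical. A minimal sketch follows, assuming well-formed SRT input; cue numbers can stay, since WebVTT permits cue identifiers.

```python
# Sketch: convert a SubRip (.srt) file to WebVTT (.vtt).
# The two formats differ mainly in the header line and the decimal
# separator in timestamps (comma in SRT, full stop in VTT).
import re
import sys

def srt_to_vtt(srt_text: str) -> str:
    # 00:01:02,345 --> 00:01:04,000  becomes  00:01:02.345 --> 00:01:04.000
    vtt_body = re.sub(
        r"(\d{2}:\d{2}:\d{2}),(\d{3})",
        r"\1.\2",
        srt_text,
    )
    return "WEBVTT\n\n" + vtt_body

if __name__ == "__main__":
    with open(sys.argv[1], encoding="utf-8-sig") as f:   # tolerate a BOM
        print(srt_to_vtt(f.read()))
```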
Surface 2 — Blackboard Ally
Blackboard Ally is Anthology's accessibility-scoring layer. It runs in-tenant, scans every uploaded asset (video, audio, image, document), and produces a per-asset accessibility score plus a per-course and an institutional report. Ally is one of the strongest reasons that institutions sit on Blackboard rather than migrate — the accessibility-evidence shape Ally ships is materially more developed than what Canvas, D2L Brightspace, or Moodle expose natively. The relevant Ally behaviour for captioning operators:
- Per-asset score. Every video uploaded to a course gets a score between 0 and 100. Captions are a binary first-order check: a video without a detectable caption track caps the score sharply. The instructor sees a coloured indicator (red / orange / yellow / green) next to the asset.
- Course-level report. Instructors can open the Ally course report and see every accessibility issue across the course, prioritised by severity. Caption-related findings rank near the top of the impact list because video has high reach.
- Institutional report. Administrators see an institution-wide rolling accessibility score with caption-presence as one of the dominant component scores. This is the artefact accessibility committees and senior administrators read; it is also the artefact that often shows up in OCR and DOJ document requests.
- Alternative Formats. Ally generates accessible alternative formats from each asset on demand. For video, the relevant ones are: a transcript in HTML, a transcript in BeeLine Reader format, and (if captions are attached) a SubRip / WebVTT download. Students see these formats in the Alternative Formats menu under the asset.
- Caption-quality scoring. Ally scores caption presence reliably; caption-quality scoring is heuristic — it can detect the absence of timestamps or the presence of placeholder text but cannot detect substantive proper-noun mangling. The substantive-accuracy bar is the institution's responsibility, not Ally's.
The institutional Ally score is the political artefact in any retrofit conversation. After the 2026-04-24 deadline, the institution's score on the captions component became the cleanest leading-indicator of whether the catalogue was substantively compliant. A high Ally score with high auto-caption usage is the failure pattern: the institution looks compliant on the score and is structurally non-compliant on the substantive content. Glossary-biased captioning is what makes the score and the substantive content match.
Surface 3 — Blackboard Collaborate
Blackboard Collaborate is Anthology's web-conferencing platform, deeply integrated with Blackboard Learn and used for synchronous teaching, virtual office hours, guest lectures, and recorded asynchronous viewing. Collaborate has its own caption surface that is operationally distinct from the Content Items surface:
- Live closed-captioning. A moderator can promote a participant to "Captioner," and the captioner's typed text appears as live captions on the session for all participants. This is the supported pathway for institutions providing CART (Communication Access Realtime Translation) services for a registered student. Live captioning is a manual process: under its accommodation obligations, the institution runs it through a contracted CART provider or the campus disability-resource centre.
- Auto-caption on live sessions. Recent Collaborate releases include automated speech-to-text that produces a draft caption track in real time. Like every other auto-caption surface on the market, it lands in the 80–90% band on conversational audio and worse on technical content with named entities. Auto-captions on a live session are not a substitute for a CART captioner when the institution has an accommodation obligation.
- Recording captions. Collaborate session recordings carry whatever caption track was produced live, plus a post-session option to generate auto-captions and an option to upload a corrected SRT/VTT track in their place. The recording becomes a Content-Item-equivalent asset that lives in the Collaborate Recordings list and is exposed through the rich-text editor as an embeddable.
- Caption editing post-session. The post-session caption editor supports replace-track wholesale. The supported workflow for vendor-supplied captions is to delete the auto-caption track and upload a clean SRT or VTT in its place; this is what re-captioning the back-catalogue Collaborate recordings with glossary-biased output looks like.
- Captions and chat are different artefacts. The Collaborate chat log is a separate downloadable; captions are the spoken-content artefact. An OCR investigation packet typically asks for both for a recorded session because they capture different student-facing content.
Collaborate session recordings are often the back-catalogue surface most institutions underestimate. The recordings accumulate over years, are referenced in subsequent terms ("watch the guest-lecture recording from spring 2024"), and are surfaced through the Collaborate Recordings tool to current students. Substantive captioning of the recordings is the same SC 1.2.2 bar as any other prerecorded video; see the SC 1.2.2 reference.
Surface 4 — External video through LTI 1.3
The largest video catalogues at higher-education Blackboard tenants are not in Content Items, not in Ally-scored assets, and not in Collaborate recordings. They are in external lecture-capture and video-hosting platforms — Mediasite, Panopto, Kaltura, YouTube, and Vimeo — embedded into Blackboard through LTI 1.3 tools. The captioning workflow follows the external platform, not Blackboard:
- Lecture-capture content typically lives in Sonic Foundry Mediasite or Panopto, with the Blackboard LTI tool surfacing a course folder of recordings. Captions are uploaded inside the platform's UI or via API and follow the video into Blackboard.
- Some institutions standardise on Kaltura as a campus-wide media management platform, with the Kaltura LTI tool ("Kaltura Mediaspace" or "Kaltura Video Resources") surfacing the institution's video repository inside Blackboard courses.
- Marketing, public-facing welcome content, and short instructional segments are often on YouTube — embedded through the rich-text editor's media insertion. YouTube captions are the responsibility of the channel owner; YouTube auto-captions are explicitly insufficient for Title II.
- SMB and continuing-education content frequently lives on Vimeo, with embeds through the same rich-text editor pathway. Vimeo supports up to five caption formats per video; the SRT or VTT track travels with the embed.
- Modern async-video tools like Loom appear in Blackboard through institutional Loom Education licences and the Loom LTI app; captioning is Loom's responsibility, but the auto-transcript failure mode is the same as YouTube's.
The catalogue-inventory step that opens any retrofit must look across all four surfaces. The 2026-04-24 deadline applies to every video the student encounters through Blackboard — regardless of which platform actually hosts it.
The OCR sampling pattern, applied to a Blackboard tenant
The Office for Civil Rights (US Department of Education) is the primary federal enforcement body for higher-education ADA and Section 504 complaints; for ADA Title II compliance, public-entity actions also flow through the DOJ. The OCR's sampling pattern, when an investigation lands on a Blackboard tenant, is consistent across the cases that have been published:
- Identify a course. Often the complainant names a specific course; the institution provides the course shell URL.
- Open a recent module. The investigator looks for video — instructor-created lecture, a guest-speaker recording, a procedural demonstration, a regulated-content module, a Collaborate session recording.
- Watch a slice with captions on. Two to three minutes is enough to assess whether the captions track the speaker, including the named technical terms.
- Read the caption track against the audio. Mangled proper nouns (drug names, regulatory citations, technical product terms, institution-specific programme names) are the failure pattern that gets flagged in writing.
- Sample the back-catalogue. If the named course fails, the investigator typically samples a half-dozen other active-term courses to check for a pattern. A pattern triggers a programme-wide finding.
- Pull the institutional Ally report. The institutional Ally score and the per-course report are typically in the document request. A high institutional score that conceals substantive auto-caption-only coverage is the worst-of-both-worlds outcome — the institution looks self-aware and the substantive content is non-compliant.
The proper-noun failure mode is what generic auto-captioning is structurally bad at. The words that distinguish a competent caption from a mangled one — the regulatory citations a healthcare student is being tested on, the SDK symbols a software-engineering student must read off the screen, the procedure names in a nursing module, the institution-specific course numbers and faculty names that anchor the conversation — are exactly the words generic STT has the least training data for.
The back-catalogue retrofit pattern
For a Blackboard institution sitting on years of un-captioned, auto-captioned, or partially captioned video, the retrofit runs in five phases:
- Inventory. Generate a flat list of every video asset across Content Items, Ally-scored documents that contain embedded video, Collaborate recordings, and the LTI-embedded platforms. Blackboard's REST APIs expose the Content Items and the course structure; the Collaborate API exposes the recordings list; the Ally institutional report exposes the cross-asset list with scores; Mediasite / Panopto / Kaltura / YouTube / Vimeo each have list APIs (a sketch of the Content Items slice follows this list). Most institutions discover that 50–70% of the catalogue lives outside Content Items.
- Triage. Rank by exposure: required courses first, regulated-content modules first within those (compliance training, healthcare procedure videos, anything that's audit-bait under Section 504 or HIPAA), public-facing welcome video first within marketing, and cross-term-referenced Collaborate recordings high. The Ally course-report severity ranking is a useful starting list for triage but should be re-ranked by audit exposure rather than by Ally severity alone.
- Re-caption. Replace mangled or absent captions with glossary-biased output. The institutional glossary is built once — programme names, course names, faculty names, regulatory citations, drug and procedure names if you have a healthcare programme, SDK symbols if you have a CS programme, the institution's acronym handbook — and applies to every retrofit asset. Per-customer compounding accuracy is what makes this scale.
- Publish. Push captions back to the originating surface. Sidecar SRT/VTT for Content Items; replace caption track in Collaborate recordings; upload through the platform API for Mediasite/Panopto/Kaltura/Vimeo; channel-owner action for YouTube. Re-run the Ally scan after publication to refresh the institutional score.
- Log. Maintain an asset register: video URL, surface, caption file, caption source, reviewer, review date, glossary version. This is the documentation an OCR investigator asks for, and it's how institutional risk management proves work-in-progress on the long tail.
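Below is a sketch of the Content Items slice of the inventory phase, using the Blackboard Learn public REST API. The hostname, application key/secret, and course IDs are placeholders; the endpoint paths and response fields follow the public v1 API but should be verified against your Learn release, and result paging is omitted for brevity.

```python
# Sketch: inventory video attachments in Blackboard Learn Content Items
# via the public REST API, flagging items with no SRT/VTT sibling.
import requests

HOST = "https://learn.example.edu"          # hypothetical tenant hostname
KEY, SECRET = "app-key", "app-secret"       # REST application credentials (placeholders)

def get_token() -> str:
    r = requests.post(
        f"{HOST}/learn/api/public/v1/oauth2/token",
        data={"grant_type": "client_credentials"},
        auth=(KEY, SECRET),
    )
    r.raise_for_status()
    return r.json()["access_token"]

def walk_contents(session, course_id, content_id=None):
    """Yield every content item in a course, depth-first (paging omitted)."""
    if content_id is None:
        url = f"{HOST}/learn/api/public/v1/courses/{course_id}/contents"
    else:
        url = f"{HOST}/learn/api/public/v1/courses/{course_id}/contents/{content_id}/children"
    r = session.get(url)
    if r.status_code != 200:        # leaf items expose no children collection
        return
    for item in r.json().get("results", []):
        yield item
        yield from walk_contents(session, course_id, item["id"])

def video_register_rows(session, course_id):
    """Yield (item title, file name, has_caption) for video attachments."""
    for item in walk_contents(session, course_id):
        url = f"{HOST}/learn/api/public/v1/courses/{course_id}/contents/{item['id']}/attachments"
        r = session.get(url)
        if r.status_code != 200:
            continue
        files = [a.get("fileName", "") for a in r.json().get("results", [])]
        captions = [f for f in files if f.lower().endswith((".srt", ".vtt"))]
        for f in files:
            if f.lower().endswith((".mp4", ".mov", ".m4v")):
                yield item.get("title", ""), f, bool(captions)

if __name__ == "__main__":
    s = requests.Session()
    s.headers["Authorization"] = f"Bearer {get_token()}"
    for course_id in ["_1234_1"]:   # placeholder course IDs
        for title, file_name, has_caption in video_register_rows(s, course_id):
            status = "captioned" if has_caption else "NO CAPTION"
            print(course_id, title, file_name, status, sep="\t")
```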
Where glossary-biased captioning changes the math
The standard institutional retrofit cost calculus pits hand-corrected auto-captioning against vendor-supplied human captioning. Hand-correction at one to two hours per video, multiplied by a five-thousand-asset back-catalogue, multiplied by a $40-per-hour staff or student-worker rate, produces a six-figure project. Human captioning at $1.25-$3.00 per minute of video, multiplied by an average 30-minute lecture across that catalogue, produces a similar six-figure project — sometimes worse, particularly for institutions that have to caption Collaborate session recordings (where the time-per-minute is similar but the volume is higher because every term's recordings accumulate).
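The arithmetic behind those two six-figure paths, using the figures above (a back-of-envelope sketch; every institution's asset count, average length, and rates will differ):

```python
# Back-of-envelope retrofit cost comparison using the figures in the text.
assets = 5_000                  # back-catalogue videos
avg_minutes = 30                # average lecture length

# Path 1: hand-correcting auto-captions in house
hours_per_video = 1.5           # midpoint of the 1-2 hour range
staff_rate = 40                 # USD per hour
in_house = assets * hours_per_video * staff_rate
print(f"In-house correction: ${in_house:,.0f}")      # $300,000

# Path 2: vendor human captioning
per_minute = 2.00               # midpoint of the $1.25-$3.00/min range
vendor = assets * avg_minutes * per_minute
print(f"Vendor captioning:   ${vendor:,.0f}")        # $300,000

# Total catalogue runtime, for per-hour pricing comparisons
print(f"Catalogue hours:     {assets * avg_minutes / 60:,.0f}")   # 2,500 hours
```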
Glossary-biased captioning is a different cost shape. The institution builds the glossary once. Each minute of video costs a fraction of human-vendor pricing. The accuracy is high enough on the proper-noun surface that the human-review pass collapses from the full-correction hour to a quick scrub of the amber-highlighted glossary surface. For a 5,000-asset catalogue at an average 30-minute length — 2,500 hours — the GlossCap math (Org plan, 2,500 hours over a four-month retrofit window) lands well under the in-house and vendor-only paths. See the vendor pricing breakdown for the per-hour comparison.
The other cost most retrofit calculations miss is the cost of getting the proper nouns wrong twice — once in the captions and again in the OCR finding letter. Glossary-biased captioning is what stops that recurrence in the second-cycle audit, when the institution attests to a clean catalogue and the investigator spot-checks at random.
Ultra Course View vs Original Course View — does the workflow differ?
Anthology has been migrating tenants from Original Course View to Ultra Course View since 2018. Both are still supported as of 2026 in the Blackboard Learn SaaS deployment, and many institutions run a mix — typically with new course shells in Ultra and legacy archived shells in Original. The captioning workflow differences:
- Rich-text editor. Ultra ships with a different RTE than Original. Both support the HTML5 video + track caption pattern; Ultra's media-insertion UI is more direct (drag-and-drop a video file, drag-and-drop a caption file, the editor binds them automatically). Original's UI requires the explicit track child markup in many cases.
- Ally surface. Ally runs identically across Ultra and Original. The institutional and course reports merge across both course views.
- Collaborate integration. Both views surface Collaborate the same way. Recording captions are platform-level, not course-view-level.
- LTI 1.3 tools. Both views support LTI 1.3. Mediasite, Panopto, Kaltura, Loom, and YouTube tools work in both, with minor UI differences in launch placement.
- Course copy and export-import. Both views support course copy and Common Cartridge export-import. Caption-track associations sometimes don't survive the round-trip in either view; the practical default is to test the first-week pages of every newly-copied course shell for the CC button.
The retrofit pattern is identical across Ultra and Original; the primary difference is that Ultra tenants tend to have a smaller Content Items back-catalogue (because the migration encouraged moving content into LTI-embedded platforms) and a larger Collaborate recordings catalogue.
Anthology Reach, Anthology Illuminate — do they touch this workflow?
Anthology's broader product portfolio (Reach for student CRM, Illuminate for analytics, Encompass for student information system, Beacon for student success, Engage for student engagement) does not directly touch the captioning workflow. Two indirect touchpoints worth flagging:
- Anthology Illuminate. The institutional accessibility report data exposed by Ally is consumed by Illuminate analytics in some institutions; the captioning component score appears as a leading indicator on student-success dashboards in those tenants.
- Anthology Reach. Recruitment-marketing video produced for Reach landing pages typically lives outside Blackboard Learn (often on Vimeo or the institution's CMS). Captioning that content is the same SC 1.2.2 bar but goes through the marketing video host, not Blackboard.
For institutions running the broader Anthology stack, the practical posture is that Blackboard Learn is the SC-1.2.2 caption surface for instructional content; Reach and Illuminate are downstream consumers of the accessibility-score signal Ally generates.
Proper-noun failure modes in higher-ed Blackboard content
The proper-noun categories that cause the highest substantive-accuracy failures in Blackboard content vary by programme. The institutional glossary should pre-load the institution-specific terms, but the per-discipline categories the glossary should cover include:
- Healthcare programmes. Drug INNs (e.g. tirzepatide, semaglutide, apixaban, rivaroxaban); CPT and ICD-10 codes; procedure abbreviations (TAVR, CRRT, ECMO); pathogen names (C. difficile, S. aureus); anatomy terms.
- Engineering and computer science. SDK and library names (PyTorch, TensorFlow, Helm, kubectl); language constructs (lambdas, generics); cloud-vendor product names (AWS Lambda, Azure Functions, Cloud Run).
- Law programmes. Case names with non-Anglophone parties; Latin terms; statutory citations (e.g. 42 U.S.C. § 12132, 28 CFR Part 35).
- Business and finance. FINRA / SEC / OCC / FDIC abbreviations; product names (Bloomberg, Refinitiv, FactSet); accounting standard codes (ASC 606, IFRS 15).
- Liberal arts and humanities. Non-Anglophone names; period-specific vocabulary; foreign-language terms quoted within an English lecture.
- Institution-specific. Course numbers (the institution's catalogue numbering scheme); programme names; faculty names; campus-building names; institutional traditions and acronyms.
The compounding-accuracy property of glossary-biased captioning means the back-catalogue retrofit gets cheaper as it proceeds: the first 100 hours captioned with the institutional glossary set the floor, and subsequent hours benefit from per-customer term-frequency weighting.
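One workable shape for the institutional glossary is a flat term list keyed by category, with the misrecognitions reviewers should scan for. The sketch below is illustrative only; the column names and every entry are assumptions, not a prescribed format.

```python
# Sketch of an institutional glossary as structured data. Categories
# mirror the per-discipline list above; "sounds_like" records the
# misrecognitions reviewers should scan for. All entries illustrative.
import csv
import io

GLOSSARY_CSV = """term,category,sounds_like
tirzepatide,healthcare,"terze patide;tears a pat tide"
TAVR,healthcare,taver
kubectl,computer-science,"cube control;coob cuttle"
ASC 606,business,ask 606
28 CFR Part 35,law,28 c f are part 35
BIOL 2402,institutional,bio 24 oh 2
"""

def load_glossary(text: str) -> dict:
    """Return the glossary keyed by canonical term."""
    rows = csv.DictReader(io.StringIO(text))
    return {row["term"]: row for row in rows}

glossary = load_glossary(GLOSSARY_CSV)
print(len(glossary), "terms loaded")
print(glossary["kubectl"]["sounds_like"].split(";"))
```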
FAQ — Blackboard captioning
Does Blackboard Ally's auto-generated transcript clear ADA Title II SC 1.2.2?
The Ally Alternative Formats pipeline generates a transcript when one isn't already attached. That transcript is auto-generated from the audio and lands in the same 80–90% substantive-accuracy band as YouTube auto-captions on training-style content with technical proper nouns. The substantive-accuracy bar SC 1.2.2 enforces is "captions that accurately convey the audio," not "captions that exist." For a no-proper-noun, conversational video, the auto-transcript can be substantively accurate. For lecture, regulated-content, or technical-procedure video, the auto-transcript virtually always requires correction. The defensible posture is to treat Ally's auto-transcript as a draft and run a glossary-biased correction pass before the video is exposed to a student.
What format do I upload to Blackboard — SRT or VTT?
Both are accepted by the Blackboard HTML5 video player. SRT is the universal default and works in every consumption surface. WebVTT (VTT) is preferred when you need positioning cues, styling, or speaker-identification metadata; it is also the format the HTML5 track element natively reads. Most institutional retrofits standardise on SRT for Content Items and let Collaborate output VTT internally.
Does the Ally institutional score change if I caption the back-catalogue?
Yes — Ally re-scans assets when their attached caption tracks change, and the institutional score updates on the next refresh cycle. Most institutions see a measurable Ally score lift within the first 30 days of a sustained retrofit. The score lift is the artefact your administration uses to demonstrate progress in OCR document responses; the substantive accuracy of the captions is what stops the OCR finding from landing in the first place. Both matter — the score for the political artefact, the substance for the legal artefact.
How do Collaborate session captions compare to LMS-stored video for accessibility evidence?
Both produce per-asset caption files that are uploadable to an OCR or accommodation-services request. Collaborate captions are tied to the session recording and inherit the recording's permissions; Content Items captions are tied to the video and inherit the Content Item's permissions. The institutional asset register has to track both surfaces because OCR document requests typically span both — the request asks for "all video used in the named course" without distinguishing the technical surface.
If we copy a course across terms, do the captions travel?
Mostly. Content Items video files travel with their caption files; the track-element association in the rich-text editor sometimes has to be re-attached after course copy or Common Cartridge export-import. Collaborate session recordings travel with their caption tracks because they are platform-level assets. LTI-embedded video captions follow the external-platform asset, which travels with the LTI link. Course-copy is the most common point at which Content Items captions detach in practice; checking the new term's first-week course pages for the CC button is the fastest pre-flight test.
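That pre-flight check can be automated by scanning the copied pages' HTML for video elements with no track child. A minimal self-contained sketch follows; the sample markup is illustrative, and in practice the HTML would come from the page body fetched over the REST API or copied out of the RTE's source view.

```python
# Sketch: flag <video> elements that have no <track> child in a page's HTML.
from html.parser import HTMLParser

class TrackChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_video = False
        self.current_has_track = False
        self.videos = 0
        self.missing = 0

    def handle_starttag(self, tag, attrs):
        if tag == "video":
            self.in_video = True
            self.current_has_track = False
            self.videos += 1
        elif tag == "track" and self.in_video:
            self.current_has_track = True

    def handle_endtag(self, tag):
        if tag == "video" and self.in_video:
            if not self.current_has_track:
                self.missing += 1
            self.in_video = False

# Illustrative page body: one video with no caption track attached.
sample = '<p>Week 1</p><video controls><source src="lecture-01.mp4"></video>'
checker = TrackChecker()
checker.feed(sample)
print(f"{checker.videos} video element(s), {checker.missing} without a track child")
```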
What about Blackboard-embedded YouTube video — whose caption is it?
The channel owner's. Blackboard only embeds the YouTube player; the caption track is whatever YouTube serves. For institutional content posted on the institution's official YouTube channel, the captioning workflow is the same as any other YouTube content — upload SRT/VTT through YouTube Studio. For third-party YouTube content embedded into a Blackboard course (which raises a separate copyright question), the institution can't control the captions; it has to either find an alternative source or provide an equivalent captioned alternative.
What does the OCR investigation packet typically request for a Blackboard tenant?
For a video-accessibility complaint, OCR typically requests: the course shell URL, the videos in the course (or the sample courses if the complaint is programme-wide), the caption files attached to each, the institutional Ally accessibility report, the institutional accessibility policy, the staff and faculty training records around accessibility, and the accommodation-services request log relevant to the complainant. Collaborate session recordings are often called out separately — "any synchronous session recordings made available for asynchronous viewing." The asset register described in the retrofit pattern is exactly the artefact that answers the documentation half of that request quickly.
Does this workflow apply to Blackboard's K-12 product (Blackboard for K-12) the same way?
The captioning surfaces are the same — Content Items + Collaborate + Ally + LTI. The regulatory regime is broader at K-12: ADA Title II applies to public school districts, Section 504 applies to any district receiving federal funds, IDEA applies for special-education students, and most states layer on additional accessibility requirements through state digital instructional materials standards. The proper-noun failure modes shift toward grade-level vocabulary and curriculum-specific programme names. The substantive-accuracy bar is identical.
Further reading
- Canvas LMS captions: the upload flow + Studio + retrofit pattern
- D2L Brightspace captions: AODA reporting cadence + Insert Stuff + Capture
- Moodle captions: file resource + H5P + LTI
- ADA Title II captions: the 2026-04-24 deadline reference
- WCAG 2.1 AA captions reference
- SC 1.2.2 Captions (Prerecorded) explained
- Section 504 captions: federal-fund recipients
- Section 508 captions: federal contractors and grant-flow-down
- AODA captions: the Ontario IASR rule
- CVAA captions: FCC rules for IP-distributed video
- Sonic Foundry Mediasite captions for lecture-capture
- Panopto captions for lecture-capture
- Kaltura captions workflow
- University lecture-capture captions
- Captioning RFP template — 14 questions for higher-ed procurement
- How to pick a captioning vendor at a public university after ADA Title II