Plain Captions

Clear, plain-language captions for everyone.

Apache‑2.0 community accessibility tool package

Make any audio or video easier to follow: on-device speech-to-text plus a gentle rewrite into short, plain captions. Nothing is uploaded.

What it runs on

Base model: gemma-class Apache‑2.0
Inference: pipeline
Input: audio, or an existing transcript (type: audio)
Output: short, plain-language caption lines (type: captions)
Author: downlow community
On the map: Offer captioning help to community venues and events listed on the map.

The structure

On-device speech-to-text (Whisper) → raw transcript
Gemma rewrites the transcript into short plain-language caption lines
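
The two stages above can be sketched as a small function. This is only an illustration of the flow, not the tool's actual code: `plain_captions`, `transcribe`, and `rewrite` are hypothetical names, and the two callables stand in for whatever on-device Whisper and Gemma wrappers you have installed.

```python
from typing import Callable, List


def plain_captions(
    audio_path: str,
    transcribe: Callable[[str], str],  # hypothetical: wraps on-device speech-to-text
    rewrite: Callable[[str], str],     # hypothetical: wraps the plain-language rewrite
) -> List[str]:
    """Run both stages in order and return one caption per line."""
    raw_transcript = transcribe(audio_path)   # stage 1: audio -> raw transcript
    rewritten = rewrite(raw_transcript)       # stage 2: transcript -> plain captions
    # Keep only non-empty lines and trim stray whitespace.
    return [line.strip() for line in rewritten.splitlines() if line.strip()]


if __name__ == "__main__":
    # Stub stages so the sketch runs without any model installed.
    fake_transcribe = lambda path: "so um the meeting will commence at nine"
    fake_rewrite = lambda text: "The meeting starts at nine.\n"
    print(plain_captions("talk.wav", fake_transcribe, fake_rewrite))
```

Passing the stages in as functions keeps the sketch independent of any particular speech or language-model library, and lets an existing transcript skip stage 1 by supplying a `transcribe` that simply returns it.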

The proven prompt

This is the exact prompt the tool uses. It's open, so read it, fork it, improve it. Honesty is built in: the prompt tells the model never to add facts or opinions, and to mark unclear passages rather than guess.

You are Plain Captions. You receive a raw speech transcript.
Rewrite it into short caption lines that are easy to read:
- keep the speaker's meaning exactly; never add facts or opinions;
- prefer short, common words and short lines (max ~7 words per line);
- keep names, numbers and quotes verbatim;
- if a passage is unclear in the transcript, keep it as-is and mark it [unclear] rather than guessing.
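
The "max ~7 words per line" rule is the one part of the prompt that is easy to check mechanically after the model replies. A minimal sketch, assuming words are separated by whitespace; the name `overlong_lines` and the default limit of 7 are illustrative and not part of the tool. Because the prompt says "~7", treat a hit as a hint to review the line, not as a hard failure.

```python
from typing import List


def overlong_lines(captions: str, max_words: int = 7) -> List[str]:
    """Return the caption lines that have more than max_words words."""
    return [
        line.strip()
        for line in captions.splitlines()
        if len(line.split()) > max_words  # split() with no args splits on any whitespace
    ]


if __name__ == "__main__":
    sample = (
        "The committee talked for a long time.\n"
        "They talked about the plan to spend 12,000 dollars."
    )
    for line in overlong_lines(sample):
        print("Too long:", line)
```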

Proof it works

A tool earns trust by passing its own golden examples: inputs paired with a checkable expectation. Passing them is how a tool moves from community to verified.

simplifies without changing meaning

Input

Subsequently, the committee deliberated extensively regarding the proposed budgetary allocation of twelve thousand dollars.

Output must include

12,000
budget

marks unclear instead of guessing

Input

And then he said [inaudible] about the contract.

Output must include

[unclear]
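
The two golden examples above can be run as simple substring checks. This is a sketch under stated assumptions, not the project's real verification harness: the first expectation is read as two separate terms ("12,000" and "budget"), matching is case-insensitive, and `GOLDEN`, `missing_terms`, and `run_golden` are hypothetical names. The `rewrite` argument stands in for whatever function calls your model.

```python
from typing import Callable, Dict, List

# The two golden examples from this page (term split is an assumption).
GOLDEN = [
    {
        "name": "simplifies without changing meaning",
        "input": "Subsequently, the committee deliberated extensively regarding "
                 "the proposed budgetary allocation of twelve thousand dollars.",
        "must_include": ["12,000", "budget"],
    },
    {
        "name": "marks unclear instead of guessing",
        "input": "And then he said [inaudible] about the contract.",
        "must_include": ["[unclear]"],
    },
]


def missing_terms(output: str, must_include: List[str]) -> List[str]:
    """Return the required terms that do not appear in the output."""
    lowered = output.lower()
    return [term for term in must_include if term.lower() not in lowered]


def run_golden(rewrite: Callable[[str], str]) -> Dict[str, List[str]]:
    """Map each example name to its missing terms (empty list means pass)."""
    return {
        ex["name"]: missing_terms(rewrite(ex["input"]), ex["must_include"])
        for ex in GOLDEN
    }


if __name__ == "__main__":
    # An echo "model" fails both examples, which shows the checks have teeth.
    for name, missing in run_golden(lambda text: text).items():
        print(name, "->", "pass" if not missing else f"missing {missing}")
```

A substring check only proves the required terms are present. It does not prove the meaning was kept, so a human read of the output is still worthwhile.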
