Plain Captions
Clear, plain-language captions for everyone.
Apache‑2.0 community accessibility tool package
Make any audio or video easier to follow: on-device speech-to-text plus a gentle rewrite into short, plain captions. Nothing is uploaded.
What it runs on
- Base model: gemma-class, Apache‑2.0
- Inference: pipeline
- Input (audio): audio, or an existing transcript.
- Output (captions): short, plain-language caption lines.
- Author: downlow community
- On the map: offer captioning help to community venues and events listed on the map.
The structure
- On-device speech-to-text (Whisper) → raw transcript
- Gemma rewrites the transcript into short plain-language caption lines
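The two stages above can be sketched as a small function that chains a speech-to-text step and a rewrite step. This is a minimal sketch, not the tool's actual code: the names `make_captions`, `Transcriber`, `Rewriter`, `fake_transcribe` and `fake_rewrite` are hypothetical, and the two stand-in backends only return fixed strings so the example runs without Whisper or Gemma installed.

```python
from typing import Callable, List

# Hypothetical type aliases: any on-device speech-to-text backend and any
# rewrite backend can be plugged in as plain functions.
Transcriber = Callable[[str], str]  # audio path -> raw transcript
Rewriter = Callable[[str], str]     # raw transcript -> caption text


def make_captions(audio_path: str, transcribe: Transcriber, rewrite: Rewriter) -> List[str]:
    """Run the two stages in order and return the non-empty caption lines."""
    transcript = transcribe(audio_path)   # stage 1: speech-to-text
    caption_text = rewrite(transcript)    # stage 2: plain-language rewrite
    return [line.strip() for line in caption_text.splitlines() if line.strip()]


# Stand-in backends (assumptions for illustration only, not real model calls).
def fake_transcribe(audio_path: str) -> str:
    return "And then he said [inaudible] about the contract."


def fake_rewrite(transcript: str) -> str:
    return "Then he said [unclear]\nabout the contract."


if __name__ == "__main__":
    for line in make_captions("talk.wav", fake_transcribe, fake_rewrite):
        print(line)
```

Passing the backends in as functions keeps the sketch independent of any particular model library; a real setup would replace the two stand-ins with local Whisper and Gemma calls.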
The proven prompt
This is the exact prompt the tool uses. It is open: read it, fork it, improve it. Honesty is built in: the prompt tells the model never to invent facts.
You are Plain Captions. You receive a raw speech transcript.
Rewrite it into short caption lines that are easy to read:
- keep the speaker's meaning exactly; never add facts or opinions;
- prefer short, common words and short lines (max ~7 words per line);
- keep names, numbers and quotes verbatim;
- if a passage is unclear in the transcript, keep it as-is and mark it [unclear] rather than guessing.
Proof it works
A tool earns trust by passing its own golden examples — inputs with a checkable expectation. These are how we move a tool from community to verified.
simplifies without changing meaning
Input
Subsequently, the committee deliberated extensively regarding the proposed budgetary allocation of twelve thousand dollars.
Output must include
12,000
budget
marks unclear instead of guessing
Input
And then he said [inaudible] about the contract.
Output must include
[unclear]
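The golden examples above can be checked mechanically. The following is a minimal sketch under stated assumptions: "must include" is read as a case-sensitive substring test, and the names `GoldenExample`, `missing_terms` and `run_golden` are hypothetical rather than part of the tool. The `rewrite` argument stands for whatever function turns a transcript into caption text.

```python
from typing import Callable, Dict, List, NamedTuple


class GoldenExample(NamedTuple):
    name: str
    input_text: str
    must_include: List[str]


# The two examples listed above, copied from the page.
GOLDEN = [
    GoldenExample(
        name="simplifies without changing meaning",
        input_text=(
            "Subsequently, the committee deliberated extensively regarding "
            "the proposed budgetary allocation of twelve thousand dollars."
        ),
        must_include=["12,000", "budget"],
    ),
    GoldenExample(
        name="marks unclear instead of guessing",
        input_text="And then he said [inaudible] about the contract.",
        must_include=["[unclear]"],
    ),
]


def missing_terms(output: str, example: GoldenExample) -> List[str]:
    """Return the required substrings that do not appear in the output."""
    # Assumption: plain, case-sensitive substring matching.
    return [term for term in example.must_include if term not in output]


def run_golden(rewrite: Callable[[str], str]) -> Dict[str, List[str]]:
    """Map each example name to its missing terms; an empty list means pass."""
    return {ex.name: missing_terms(rewrite(ex.input_text), ex) for ex in GOLDEN}
```

Calling `run_golden(my_rewrite)` with a real rewrite function returns one entry per example, so a failing example shows exactly which required term was absent.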