Linux-First Dictation

Speak. Your Linux desktop
types it perfectly.

Local Whisper-powered speech-to-text with application-aware formatting. No cloud. No subscription. Complete privacy.

Built for developers, writers, and accessibility users on Linux.

Open source · GPLv3 · Works offline
voiced — dictation
$ voiced --mode dictation
Listening on PipeWire... (F9 to toggle)
 
[raw] "ok so the deploy script needs to uh check the health endpoint first before um routing traffic"
[out] The deploy script needs to check the health endpoint before routing traffic.
✓ Inserted via wtype · 380ms · whisper-large-v3

macOS has dictation. Windows has Dragon.
Linux had nothing comparable. Until now.

Voiced runs entirely on your machine. Your speech never leaves your hardware. GPU-accelerated Whisper delivers accuracy that matches or exceeds cloud services.

Cloud Dictation · ~500ms round-trip latency · speech sent to servers
Nerd Dictation · ~200ms, local but basic · VOSK only, no AI cleanup
Voiced · <400ms, local + AI refinement · Whisper + context-aware formatting
Features

Everything you need. Nothing in the cloud.

GPU-Accelerated Whisper

Multiple ASR engines: faster-whisper, whisper.cpp, OpenVINO. Pick the best for your hardware. All local, all fast.

Application-Aware Context

Detects your active window and project. Dictating in VS Code? It knows your codebase. Writing email? It adjusts formatting.
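The mapping from active window to formatting behavior can be sketched as a simple profile lookup. Note the window-class names and profile keys below are illustrative assumptions, not Voiced's actual configuration:

```python
# Hypothetical sketch of application-aware profile selection.
# Window-class names and profile keys are assumptions, not
# Voiced's real configuration.
PROFILES = {
    "code": "code",            # VS Code: preserve identifiers, light punctuation
    "thunderbird": "email",    # mail client: paragraphs, salutations kept
    "firefox": "prose",        # browser: plain prose formatting
}

def pick_profile(window_class: str, default: str = "dictation") -> str:
    """Map the focused window's class to a formatting profile."""
    return PROFILES.get(window_class.lower(), default)
```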

AI Refinement Modes

Four modes: dictation cleans up raw speech, prompt pulls in your project context, plan structures rough ideas into outlines, and commit writes commit messages.

Wayland + X11 Native

Text insertion via wtype (Wayland), xdotool (X11), or clipboard fallback. Works in any application — terminal, browser, IDE.
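Backend selection can be sketched as a command builder keyed on the session type. wtype and xdotool are the real tools named above; the wl-copy clipboard fallback shown here is an illustrative assumption, not necessarily Voiced's exact fallback logic:

```python
import os

def build_insert_cmd(text: str, session: str = "") -> list[str]:
    """Build the text-insertion command for the current session type.

    Sketch only: wtype/xdotool invocations are standard, but the
    clipboard fallback (wl-copy) is an illustrative assumption.
    """
    session = session or os.environ.get("XDG_SESSION_TYPE", "")
    if session == "wayland":
        return ["wtype", text]
    if session == "x11":
        return ["xdotool", "type", "--clearmodifiers", text]
    # Clipboard fallback when the session type is unknown.
    return ["wl-copy", text]
```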

Complete Privacy

Zero network calls for transcription. Your voice data stays on your machine. No accounts, no telemetry, no cloud processing.

One Hotkey

Press F9 to start dictating, F9 again to stop. Text appears in your focused application. Voice activity detection auto-stops on silence.

How It Works

Speech to perfectly formatted text in under 400ms

1

Capture

PipeWire/PulseAudio audio capture with VAD endpoint detection. Records until you stop or silence.
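Endpoint detection can be approximated with a simple energy-based VAD: stop after a run of consecutive quiet frames. This is a minimal sketch with illustrative thresholds, not Voiced's actual detector or defaults:

```python
def rms(frame: list[float]) -> float:
    """Root-mean-square energy of one audio frame."""
    return (sum(s * s for s in frame) / len(frame)) ** 0.5

def find_endpoint(frames, threshold=0.01, silence_frames=30):
    """Return the frame index where speech ends, or None if it never does.

    Minimal energy-based VAD sketch: after `silence_frames`
    consecutive quiet frames, report the first quiet frame as the
    endpoint. Thresholds here are illustrative assumptions.
    """
    quiet = 0
    for i, frame in enumerate(frames):
        if rms(frame) < threshold:
            quiet += 1
            if quiet >= silence_frames:
                return i - silence_frames + 1
        else:
            quiet = 0
    return None
```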

2

Transcribe

Local Whisper ASR converts speech to raw text. GPU-accelerated. Multiple engine options.

3

Refine

AI cleans filler words, fixes grammar, applies context-aware formatting from your active project.
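The refinement step uses an AI model; a rule-based sketch of the filler-word cleanup alone (which, unlike the model, keeps redundant words such as "first" in the demo sentence) might look like:

```python
import re

def clean(raw: str) -> str:
    """Strip filler words and tidy capitalization/punctuation.

    Rule-based sketch only: the real refine step uses an AI model
    and does more than these regexes (e.g. dropping redundancy).
    """
    text = re.sub(r"\b(?:uh|um|er)\b,?\s*", "", raw)
    text = re.sub(r"^\s*(?:ok so|so)\s+", "", text, flags=re.I)
    text = text.strip()
    if text:
        text = text[0].upper() + text[1:]
        if text[-1] not in ".!?":
            text += "."
    return text
```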

4

Insert

Cleaned text is typed into your focused application via wtype/xdotool. No clipboard hijacking.

Multiple ASR engines. One interface.

Pick the engine that fits your hardware and accuracy needs.

faster-whisper · CTranslate2 backend · Default
whisper.cpp · C++ optimized · CPU-friendly
OpenVINO · Intel optimized · Intel GPUs
OpenAI STT · Cloud fallback · Optional
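Engine selection boils down to a preference-ordered fallback chain. The engine names come from the table above; the preference order and the set-based availability interface are illustrative assumptions:

```python
# Hypothetical engine-selection sketch; preference order is an
# assumption, engine names come from the table above.
PREFERENCE = ["faster-whisper", "whisper.cpp", "openvino"]

def pick_engine(available: set[str]) -> str:
    """Return the first locally available engine in preference order,
    falling back to the optional cloud backend."""
    for engine in PREFERENCE:
        if engine in available:
            return engine
    return "openai-stt"
```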
Get Started

Up and running in 60 seconds

terminal
# Clone and install
$ git clone https://github.com/cancelei-org/voiced
$ cd voiced && pip install -e .
 
# Configure (optional — works out of the box)
$ cp config.example.yaml ~/.config/voiced/config.yaml
 
# Start dictating
$ voiced
Voiced started · Press F9 to dictate

Requires: Python 3.11+, PipeWire or PulseAudio, wtype (Wayland) or xdotool (X11)