Speech-to-text, simplified. Live transcription, read-aloud, voice commands, and a clean transcript workspace.
Simple STT is a speech-to-text browser app and extension with live transcription, read-aloud, configurable voice commands, and optional active-field insertion.
- a full-page transcript workspace
- a compact popup launcher
- a settings surface for language, voice, hotkey, and command phrases
- a hosted web app at simple-stt.github.io
- optional writing of final dictated text into the currently focused editable field in the active tab
- One live transcript surface with interim and final speech merged into the same textarea
- Start and stop transcription from the main app
- Read-aloud with play, pause, resume, and restart controls
- Language selection and read-aloud voice selection
- Copy, cut, and clear transcript actions
- Configurable spoken command phrases for line and paragraph breaks
- Configurable transcription toggle hotkey
- Optional active-field writing for normal web pages
- Guards against restricted browser and internal pages
- Snackbar feedback for clipboard actions, settings changes, and important errors
Visit simple-stt.github.io
- Click the extension toolbar icon, then choose Open App
- Open settings from the popup gear or the app gear
The extension deduplicates its own app and settings tabs, so opening them again focuses the existing tab instead of creating a new copy.
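The focus-or-create behavior described above could look roughly like the sketch below. It is not the extension's actual code: `findExistingTab` and `openOrFocusTab` are hypothetical names, and the tab and window APIs are passed in as parameters (in the extension they would be `chrome.tabs` and `chrome.windows`) so the logic can also run outside the browser.

```javascript
// Pure helper: pick the first tab whose URL starts with the target URL.
// Reading tab.url relies on the "tabs" permission.
function findExistingTab(tabs, targetUrl) {
  const match = tabs.find(
    (tab) => typeof tab.url === "string" && tab.url.startsWith(targetUrl)
  );
  return match ?? null;
}

// Hypothetical wrapper: focus the existing tab if there is one,
// otherwise create a new tab for the target URL.
async function openOrFocusTab(targetUrl, tabsApi, windowsApi) {
  const tabs = await tabsApi.query({});
  const existing = findExistingTab(tabs, targetUrl);
  if (existing) {
    await tabsApi.update(existing.id, { active: true });
    await windowsApi.update(existing.windowId, { focused: true });
    return existing;
  }
  return tabsApi.create({ url: targetUrl });
}
```

In the extension this might be called as `openOrFocusTab(chrome.runtime.getURL("app.html"), chrome.tabs, chrome.windows)`, where `app.html` is a placeholder page name.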
- Open chrome://extensions in Google Chrome
- Enable Developer mode
- Click Load unpacked
- Select this repository's ext/ directory
Chrome only:
- This project is built and tested for Google Chrome
- Brave and other Chromium browsers are not supported targets for this repo
- The main app page is the primary workspace
- Interim speech stays visible in the same textarea while speaking
- Final speech is committed into the transcript
- The transcript grows until a visual cap, then scrolls internally
- The textarea auto-focuses on load and regains focus after the main transcript actions
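One way to keep interim and final speech in a single textarea is to track the committed transcript and the current interim phrase separately, then render them together. The sketch below is an illustration under that assumption, not the app's actual implementation; the function names and state shape are hypothetical.

```javascript
// Join two pieces of text with a single space when both are non-empty
// and the first does not already end in whitespace.
function joinText(committed, addition) {
  const extra = addition.trim();
  if (extra === "") return committed;
  if (committed === "") return extra;
  return committed + (/\s$/.test(committed) ? "" : " ") + extra;
}

function createTranscriptState() {
  return { committed: "", interim: "" };
}

// Interim results replace the pending phrase; final results are
// appended to the committed transcript and clear the pending phrase.
function applyResult(state, text, isFinal) {
  if (isFinal) {
    return { committed: joinText(state.committed, text), interim: "" };
  }
  return { committed: state.committed, interim: text };
}

// The value shown in the textarea: committed text plus any interim phrase.
function renderTranscript(state) {
  return joinText(state.committed, state.interim);
}
```

In a browser, `applyResult` would typically be fed from a `SpeechRecognition` `onresult` handler, using `event.results[i][0].transcript` and `event.results[i].isFinal` for each index starting at `event.resultIndex`, with `interimResults` set to `true` on the recognizer.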
Simple STT can read the transcript back using the browser speech-synthesis engine.
- If text is selected, it reads the selection
- If no text is selected, it can read from the cursor position
- Restart jumps back to the beginning and starts again
- Starting read-aloud stops active transcription so the app does not transcribe its own output
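The selection rules above can be expressed as a small pure function over the textarea's value and selection offsets. This is a sketch only; `textToRead` is a hypothetical name, and it assumes that "read from the cursor" means reading from the caret to the end of the transcript.

```javascript
// Decide what read-aloud should speak, given the textarea value and
// its selectionStart / selectionEnd offsets.
function textToRead(value, selectionStart, selectionEnd) {
  if (selectionEnd > selectionStart) {
    // A non-empty selection: read exactly the selected text.
    return value.slice(selectionStart, selectionEnd);
  }
  // No selection: read from the caret to the end (assumed behavior).
  return value.slice(selectionStart);
}
```

In the page, the result could be handed to the speech-synthesis engine with `speechSynthesis.speak(new SpeechSynthesisUtterance(text))`, after stopping any active recognition so the app does not transcribe its own output.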
When Write to active field is enabled in the main app, final dictated text is also inserted into the currently focused editable field in the active tab.
Notes:
- This only applies to normal editable pages and fields
- Restricted browser and internal pages are skipped quietly
- Turning the toggle off keeps transcription local to Simple STT
- The setting is persisted through the shared settings layer
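A guard for restricted pages might be a simple URL prefix check performed before any insertion is attempted. The prefix list below is an assumption for illustration, not the list Simple STT actually uses, and `canWriteToTab` is a hypothetical name.

```javascript
// Assumed examples of pages where extensions cannot inject scripts.
const RESTRICTED_PREFIXES = [
  "chrome://",
  "chrome-extension://",
  "devtools://",
  "about:",
  "view-source:",
  "https://chromewebstore.google.com",
  "https://chrome.google.com/webstore",
];

// Return true only for URLs that look like normal web pages.
function canWriteToTab(url) {
  if (typeof url !== "string" || url === "") return false;
  return !RESTRICTED_PREFIXES.some((prefix) => url.startsWith(prefix));
}
```

When `canWriteToTab` returns `false`, the caller would skip insertion without showing an error, which matches the "skipped quietly" note above.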
Simple STT replaces configured spoken phrases after recognition finalizes.
Default phrases:
- carriage return => newline
- double carriage return => blank line
You can change these in Settings.
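Because the default paragraph phrase contains the line-break phrase, a replacement pass has to handle the longer phrase first. The sketch below shows one way to do that; it is not the project's actual code, the names are hypothetical, and it assumes "newline" means `\n` and "blank line" means `\n\n`.

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace spoken command phrases in finalized text.
// phrases: { lineBreak: string, paragraphBreak: string }
function applyCommandPhrases(text, phrases) {
  const rules = [
    { phrase: phrases.paragraphBreak, replacement: "\n\n" },
    { phrase: phrases.lineBreak, replacement: "\n" },
  ]
    .filter((rule) => typeof rule.phrase === "string" && rule.phrase.trim() !== "")
    // Longest phrase first, so "double carriage return" wins over "carriage return".
    .sort((a, b) => b.phrase.trim().length - a.phrase.trim().length);

  let output = text;
  for (const { phrase, replacement } of rules) {
    // Swallow surrounding whitespace so the break replaces the spaces too.
    const pattern = new RegExp(
      "\\s*\\b" + escapeRegExp(phrase.trim()) + "\\b\\s*",
      "gi"
    );
    output = output.replace(pattern, replacement);
  }
  return output;
}
```

The word-boundary anchors assume phrases start and end with letters or digits; phrases that begin or end with punctuation would need a different boundary rule.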
Default transcription toggle:
Alt+Shift+R
Transcript actions in the app page:
- Ctrl/Cmd+A selects the transcript
- Ctrl/Cmd+C copies the transcript
- Ctrl/Cmd+X cuts the transcript
The transcription toggle hotkey is configurable in Settings.
Expected hotkey format:
- Alt+Shift+R
- Cmd+Shift+R
- Ctrl+Alt+K
If the saved value is blank or invalid, Simple STT falls back to the default.
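The format and fallback rule above can be sketched as a parser that returns `null` for anything it does not recognize, plus a resolver that substitutes the default. The accepted grammar here (one or more of Ctrl, Alt, Shift, Cmd followed by a single letter or digit) is an assumption inferred from the examples, and the function names are hypothetical.

```javascript
const DEFAULT_HOTKEY = "Alt+Shift+R";
const MODIFIERS = ["ctrl", "alt", "shift", "cmd"];

// Parse "Alt+Shift+R" into flags; return null if the value is invalid.
function parseHotkey(value) {
  if (typeof value !== "string") return null;
  const parts = value.split("+").map((part) => part.trim());
  if (parts.length < 2 || parts.some((part) => part === "")) return null;

  const key = parts.pop();
  if (!/^[A-Za-z0-9]$/.test(key)) return null;

  const hotkey = { ctrl: false, alt: false, shift: false, cmd: false, key: key.toUpperCase() };
  for (const part of parts) {
    const name = part.toLowerCase();
    // Reject unknown or repeated modifiers.
    if (!MODIFIERS.includes(name) || hotkey[name]) return null;
    hotkey[name] = true;
  }
  return hotkey;
}

// Blank or invalid saved values fall back to the default hotkey.
function resolveHotkey(savedValue) {
  return parseHotkey(savedValue) ?? parseHotkey(DEFAULT_HOTKEY);
}

// Compare a parsed hotkey against a KeyboardEvent-like object.
function matchesHotkey(hotkey, event) {
  const expectedCode = /[0-9]/.test(hotkey.key) ? "Digit" + hotkey.key : "Key" + hotkey.key;
  return (
    event.code === expectedCode &&
    event.ctrlKey === hotkey.ctrl &&
    event.altKey === hotkey.alt &&
    event.shiftKey === hotkey.shift &&
    event.metaKey === hotkey.cmd
  );
}
```

Matching on `event.code` rather than `event.key` is a design choice in this sketch: on some platforms holding Alt changes the character reported in `event.key`, while `event.code` stays tied to the physical key.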
Settings currently support:
- line break phrase
- paragraph break phrase
- transcription toggle hotkey
- language
- read-aloud voice
The live Write to active field mode toggle stays on the main app because it is intended as a working-mode control rather than a static preference.
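A shared settings layer for the values listed above could merge stored values over defaults, ignoring anything blank or of the wrong type. The sketch below is illustrative only: the key names, the `en-US` language default, and the empty voice default are assumptions, and the storage area is passed in (in the extension it could be `chrome.storage.local` or `chrome.storage.sync`).

```javascript
// Hypothetical setting keys; phrase and hotkey defaults come from this README.
const DEFAULT_SETTINGS = {
  lineBreakPhrase: "carriage return",
  paragraphBreakPhrase: "double carriage return",
  hotkey: "Alt+Shift+R",
  language: "en-US", // assumed default, not stated in this README
  voice: "", // assumed: empty means the browser's default voice
};

// Overlay stored values on the defaults, keeping only non-blank strings.
function mergeSettings(stored) {
  const merged = { ...DEFAULT_SETTINGS };
  for (const key of Object.keys(DEFAULT_SETTINGS)) {
    const value = stored ? stored[key] : undefined;
    if (typeof value === "string" && value.trim() !== "") {
      merged[key] = value;
    }
  }
  return merged;
}

// Read the known keys from a chrome.storage-like area and merge them.
async function loadSettings(storageArea) {
  const stored = await storageArea.get(Object.keys(DEFAULT_SETTINGS));
  return mergeSettings(stored);
}
```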
- activeTab: lets the extension interact with the current tab when needed
- scripting: used for focused-field insertion into editable pages
- storage: stores settings such as command phrases and the hotkey
- tabs: used for app and settings tab focusing and deduping
- Speech recognition depends on the browser's built-in speech recognition support
- Behavior is intended for Google Chrome only
- Active-field writing will not work on restricted browser and internal pages
- Dictation quality and availability depend on the browser speech engine and microphone permissions
MIT. See LICENSE.