Typing is boring; let's use voice to speed up your workflow. This project combines:
- Real-time speech recognition using Whisper.cpp
- Transcription cleanup using LangChain
- Global hotkey support for voice control
- Automatic clipboard integration for the cleaned transcription
Requirements:
- Python 3.8+
- OpenAI API key
- For macOS:
  - ffmpeg (for audio processing)
  - portaudio (for audio capture)
- Install system dependencies (macOS):
# Install ffmpeg and portaudio using Homebrew
brew install ffmpeg portaudio
- Install the project:
pip install whisperchain
WhisperChain will look for configuration in the following locations:
- Environment variables
- .env file in the current directory
- ~/.whisperchain/.env file
On first run, if no configuration is found, you will be prompted to enter your OpenAI API key. The key will be saved in ~/.whisperchain/.env for future use.
You can also manually set your OpenAI API key in any of these ways:
# Option 1: Environment variable
export OPENAI_API_KEY=your-api-key-here
# Option 2: Create .env file in current directory
echo "OPENAI_API_KEY=your-api-key-here" > .env
# Option 3: Create global config
mkdir -p ~/.whisperchain
echo "OPENAI_API_KEY=your-api-key-here" > ~/.whisperchain/.env
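The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not WhisperChain's actual loader: it assumes the three locations are checked in the order listed and that the first match wins, and the helper names `read_env_file` and `find_api_key` are hypothetical.

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    if not path.is_file():
        return values
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_api_key(environ=None, env_files=None):
    """Return the first OPENAI_API_KEY found, or None.

    Assumed order: environment, ./.env, ~/.whisperchain/.env.
    """
    environ = os.environ if environ is None else environ
    if env_files is None:
        env_files = [
            Path.cwd() / ".env",
            Path.home() / ".whisperchain" / ".env",
        ]
    if environ.get("OPENAI_API_KEY"):
        return environ["OPENAI_API_KEY"]
    for path in env_files:
        key = read_env_file(path).get("OPENAI_API_KEY")
        if key:
            return key
    return None
```

Calling `find_api_key()` with no arguments checks the real environment and the two default file locations; a `None` result corresponds to the case where the first-run prompt would appear.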
- Start the application:
# Run with default settings
whisperchain
# Run with custom configuration
whisperchain --config config.json
# Override specific settings
whisperchain --port 8080 --hotkey "<ctrl>+<alt>+t" --model "large" --debug
- Use the global hotkey (<ctrl>+<alt>+r by default; <ctrl>+<option>+r on macOS):
  - Press and hold to start recording
  - Speak your text
  - Release to stop recording
  - The cleaned transcription is copied to your clipboard automatically
  - Paste the transcription (Ctrl+V, or Cmd+V on macOS)
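The start-up commands above combine a config file with per-flag overrides. One common way to layer those is defaults first, then the JSON file, then any flags given on the command line. The sketch below illustrates that pattern with `argparse`; it is not WhisperChain's own parser. The flag names come from the commands above, while the JSON key names, the `DEFAULTS` values (port 8000 is taken from the architecture diagram, `"base"` is a placeholder), and the precedence order are assumptions.

```python
import argparse
import json

# Assumed defaults, for illustration only.
DEFAULTS = {
    "port": 8000,
    "hotkey": "<ctrl>+<alt>+r",
    "model": "base",  # placeholder; the real default is not stated
    "debug": False,
}


def build_parser():
    parser = argparse.ArgumentParser(prog="whisperchain-sketch")
    parser.add_argument("--config", help="path to a JSON config file")
    parser.add_argument("--port", type=int)
    parser.add_argument("--hotkey")
    parser.add_argument("--model")
    # default=None lets us tell "flag absent" apart from "flag set".
    parser.add_argument("--debug", action="store_true", default=None)
    return parser


def resolve_settings(argv):
    """Merge defaults, then the config file, then explicit CLI flags."""
    args = build_parser().parse_args(argv)
    settings = dict(DEFAULTS)
    if args.config:
        with open(args.config) as fh:
            settings.update(json.load(fh))
    for name in ("port", "hotkey", "model", "debug"):
        value = getattr(args, name)
        if value is not None:
            settings[name] = value
    return settings
```

With this layering, `resolve_settings(["--config", "config.json", "--port", "8080"])` takes every value from `config.json` except the port, which the flag overrides.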
Run the Streamlit UI:
streamlit run src/whisperchain/ui/streamlit_app.py
If the Streamlit UI runs into an error, you can free its port (8501) by killing any process still listening on it:
lsof -ti :8501 | xargs kill -9
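Before reaching for `kill -9`, it can help to confirm that something is actually listening on the port. A small check using only the standard library might look like this; the function name `port_in_use` is hypothetical, and 8501 is the Streamlit port used in the command above.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 8501 in use:", port_in_use(8501))
```

If this reports that the port is free, the error in the UI is probably unrelated to a stale process.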
Install test dependencies:
pip install -e ".[test]"
Run tests:
pytest tests/
Run tests with microphone input:
# Run specific microphone test
TEST_WITH_MIC=1 pytest tests/test_stream_client.py -v -k test_stream_client_with_real_mic
# Run all tests including microphone test
TEST_WITH_MIC=1 pytest tests/
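Gating hardware-dependent tests on an environment variable is usually done with a small predicate that the test suite consults. The sketch below shows the idea; it assumes the suite treats `TEST_WITH_MIC=1` as "enabled" and anything else as "disabled", and the helper name `mic_tests_enabled` is hypothetical rather than taken from the project's tests.

```python
import os


def mic_tests_enabled(environ=None):
    """Return True only when TEST_WITH_MIC is set to "1" (assumed convention)."""
    environ = os.environ if environ is None else environ
    return environ.get("TEST_WITH_MIC") == "1"
```

In a pytest suite, a predicate like this would typically feed `pytest.mark.skipif(not mic_tests_enabled(), reason="requires a microphone")`, so the microphone test is skipped on machines without audio input.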
Build and install locally:
python -m build
pip install .
Publish to PyPI:
python -m build
twine upload --repository pypi dist/*
Architecture (Mermaid diagram):
graph TB
subgraph "Client Options"
K[Key Listener]
A[Audio Stream]
C[Clipboard]
end
subgraph "Streamlit Web UI :8501"
WebP[Prompt]
WebH[History]
end
subgraph "FastAPI Server :8000"
WS[WebSocket /stream]
W[Whisper Model]
LC[LangChain Processor]
H[History]
end
K -->|"Hot Key"| A
A -->|"Audio Stream"| WS
WS --> W
W --> LC
WebP --> LC
LC --> C
LC --> H
H --> WebH