
Speech to text, but with all the bells and whistles and, most importantly, AI! AI will clean up your filler words and edit and refine what you said.


chrischoy/WhisperChain


Whisper Chain


Overview

Typing is boring; let's use voice to speed up your workflow. This project combines:

  • Real-time speech recognition using Whisper.cpp
  • Transcription cleanup using LangChain
  • Global hotkey support for voice control
  • Automatic clipboard integration for the cleaned transcription

Requirements

  • Python 3.8+
  • OpenAI API Key
  • For macOS:
    • ffmpeg (for audio processing)
    • portaudio (for audio capture)

Installation

  1. Install system dependencies (macOS):
# Install ffmpeg and portaudio using Homebrew
brew install ffmpeg portaudio
  2. Install the project:
pip install whisperchain

Configuration

WhisperChain will look for configuration in the following locations:

  1. Environment variables
  2. .env file in the current directory
  3. ~/.whisperchain/.env file

On first run, if no configuration is found, you will be prompted to enter your OpenAI API key. The key will be saved in ~/.whisperchain/.env for future use.

You can also manually set your OpenAI API key in any of these ways:

# Option 1: Environment variable
export OPENAI_API_KEY=your-api-key-here

# Option 2: Create .env file in current directory
echo "OPENAI_API_KEY=your-api-key-here" > .env

# Option 3: Create global config
mkdir -p ~/.whisperchain
echo "OPENAI_API_KEY=your-api-key-here" > ~/.whisperchain/.env
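
The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not WhisperChain's actual loader: it assumes the three locations are checked in the listed order and that the first hit wins, and the helper names `read_env_file` and `resolve_api_key` are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Sequence


def read_env_file(path: Path) -> dict:
    """Parse simple KEY=VALUE lines; skip blanks, comments, and missing files."""
    values = {}
    if not path.is_file():
        return values
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(
    environ: Optional[Mapping[str, str]] = None,
    env_files: Optional[Sequence[Path]] = None,
) -> Optional[str]:
    """Return the first OPENAI_API_KEY found, or None if it is unset everywhere."""
    environ = os.environ if environ is None else environ
    if env_files is None:
        # Assumed precedence: local .env first, then the global config file.
        env_files = [Path.cwd() / ".env", Path.home() / ".whisperchain" / ".env"]
    if environ.get("OPENAI_API_KEY"):
        return environ["OPENAI_API_KEY"]
    for path in env_files:
        key = read_env_file(path).get("OPENAI_API_KEY")
        if key:
            return key
    return None
```

Calling `resolve_api_key()` with no arguments checks the real environment and the two default file paths. A `None` result corresponds to the "no configuration found" case in which the first-run prompt would appear.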

Usage

  1. Start the application:
# Run with default settings
whisperchain

# Run with custom configuration
whisperchain --config config.json

# Override specific settings
whisperchain --port 8080 --hotkey "<ctrl>+<alt>+t" --model "large" --debug
  2. Use the global hotkey (<ctrl>+<alt>+r by default; <ctrl>+<option>+r on macOS):
    • Press and hold to start recording
    • Speak your text
    • Release to stop recording
    • The cleaned transcription is copied to your clipboard automatically
    • Paste the transcription with Ctrl+V (Cmd+V on macOS)
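
The press-and-hold behaviour above amounts to a small state machine: recording starts once every key in the combination is down and stops as soon as any of them is released. The sketch below illustrates that logic only; `parse_hotkey` and `HoldToRecord` are hypothetical names, keys are plain strings here, and a real key listener would supply its own key objects and events.

```python
from typing import FrozenSet, Optional, Set


def parse_hotkey(spec: str) -> FrozenSet[str]:
    """Split a spec such as "<ctrl>+<alt>+r" into a set of lowercase key names."""
    return frozenset(part.strip().lower() for part in spec.split("+") if part.strip())


class HoldToRecord:
    """Tracks pressed keys and reports when recording should start or stop."""

    def __init__(self, spec: str = "<ctrl>+<alt>+r") -> None:
        self.combo = parse_hotkey(spec)
        self.pressed: Set[str] = set()
        self.recording = False

    def on_press(self, key: str) -> Optional[str]:
        """Return "start" when the full combination has just become held."""
        self.pressed.add(key.lower())
        if not self.recording and self.combo <= self.pressed:
            self.recording = True
            return "start"
        return None

    def on_release(self, key: str) -> Optional[str]:
        """Return "stop" when a key of the held combination is released."""
        self.pressed.discard(key.lower())
        if self.recording and not (self.combo <= self.pressed):
            self.recording = False
            return "stop"
        return None
```

Key auto-repeat is harmless in this design: a repeated press while already recording returns `None` rather than a second "start".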

Development

Streamlit UI

streamlit run src/whisperchain/ui/streamlit_app.py

If the Streamlit UI hits an error, you can force-kill any process still listening on its default port (8501):

lsof -ti :8501 | xargs kill -9
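
Before reaching for `kill -9`, it can help to confirm that something is actually holding the port. The following is a small standard-library sketch for that check; `port_in_use` is a hypothetical helper, not part of WhisperChain.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("8501 busy" if port_in_use(8501) else "8501 free")
```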

Running Tests

Install test dependencies:

pip install -e ".[test]"

Run tests:

pytest tests/

Run tests with microphone input:

# Run specific microphone test
TEST_WITH_MIC=1 pytest tests/test_stream_client.py -v -k test_stream_client_with_real_mic

# Run all tests including microphone test
TEST_WITH_MIC=1 pytest tests/
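
The `TEST_WITH_MIC=1` switch implies that hardware-dependent tests are skipped unless the variable is set. How the project's tests implement this is not shown here, so the sketch below is an assumption: it treats only the value "1" as enabling, and it uses the standard library's `unittest.skipUnless` as a stand-in (in a pytest suite, `pytest.mark.skipif` plays the same role). The names `mic_tests_enabled` and `MicrophoneTest` are hypothetical.

```python
import os
import unittest


def mic_tests_enabled(environ=None) -> bool:
    """True only when TEST_WITH_MIC is set to "1" (assumed convention)."""
    environ = os.environ if environ is None else environ
    return environ.get("TEST_WITH_MIC") == "1"


class MicrophoneTest(unittest.TestCase):
    @unittest.skipUnless(mic_tests_enabled(), "set TEST_WITH_MIC=1 to run")
    def test_records_from_real_mic(self):
        # Hypothetical placeholder: a real test would open the microphone here.
        self.assertTrue(mic_tests_enabled())
```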

Building the project

python -m build
pip install .

Publishing to PyPI

python -m build
twine upload --repository pypi dist/*

License

LICENSE

Acknowledgments

Architecture

graph TB
    subgraph "Client Options"
        K[Key Listener]
        A[Audio Stream]
        C[Clipboard]
    end

    subgraph "Streamlit Web UI :8501"
        WebP[Prompt]
        WebH[History]
    end

    subgraph "FastAPI Server :8000"
        WS[WebSocket /stream]
        W[Whisper Model]
        LC[LangChain Processor]
        H[History]
    end

    K -->|"Hot Key"| A
    A -->|"Audio Stream"| WS
    WS --> W
    W --> LC
    WebP --> LC
    LC --> C
    LC --> H
    H --> WebH
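
The data flow in the diagram can be traced with a toy pipeline. This is a sketch under loud assumptions: `fake_transcribe` stands in for the Whisper model, `strip_fillers` is a regex stand-in for the LangChain processor (the real one calls an LLM), and the clipboard and history are plain attributes. None of these names come from the WhisperChain codebase.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List

FILLERS = re.compile(r"\b(?:um+|uh+|you know)\b,?\s*", re.IGNORECASE)


def fake_transcribe(audio: bytes) -> str:
    """Stand-in for the Whisper model: pretend the bytes are UTF-8 text."""
    return audio.decode("utf-8")


def strip_fillers(text: str, prompt: str = "") -> str:
    """Stand-in for the LangChain processor: drop a few filler words."""
    cleaned = FILLERS.sub("", text)
    return re.sub(r"\s+", " ", cleaned).strip()


@dataclass
class Pipeline:
    transcribe: Callable[[bytes], str] = fake_transcribe
    clean: Callable[[str, str], str] = strip_fillers
    prompt: str = ""  # would come from the web UI's Prompt box
    history: List[dict] = field(default_factory=list)
    clipboard: str = ""

    def handle_stream(self, chunks: List[bytes]) -> str:
        """Audio chunks -> transcript -> cleaned text -> clipboard and history."""
        raw = self.transcribe(b"".join(chunks))
        cleaned = self.clean(raw, self.prompt)
        self.history.append({"raw": raw, "cleaned": cleaned})
        self.clipboard = cleaned
        return cleaned
```

Because both stages are injected as plain callables, swapping either stand-in for a real model call leaves `handle_stream` unchanged.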