WhisperPen is a command-line tool that converts speech to enhanced text using AI. It combines OpenAI's Whisper model for speech recognition with the Qwen 2.5 32B model, served through Ollama, for text enhancement and translation.

zuozuo/Whisperpen

WhisperPen

A speech-to-text enhancement tool with offline speech recognition and AI text refinement.

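Features

  • Offline speech recognition (Whisper)
  • AI text enhancement (Qwen)
  • Chinese-English translation in both directions
  • Noise reduction
  • Smart caching
  • Clipboard integration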
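
The text-enhancement step could look roughly like the sketch below, which sends a transcript to a locally running Ollama server over its REST API. It is an illustration only, not the project's implementation: the function names `build_payload` and `enhance`, the prompt wording, and the timeout are hypothetical, and the URL assumes Ollama is listening on its default local port.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "qwen2.5:32b"  # the model tag named in the installation section


def build_payload(transcript: str, translate: bool = False) -> dict:
    """Build the JSON body for a non-streaming Ollama generate request."""
    # Hypothetical prompts; the project's real prompts are not shown in this README.
    if translate:
        instruction = "Translate the following text between Chinese and English:"
    else:
        instruction = "Fix punctuation and grammar in the following transcript:"
    return {
        "model": MODEL,
        "prompt": f"{instruction}\n\n{transcript.strip()}",
        "stream": False,  # ask for one complete JSON reply instead of a stream
    }


def enhance(transcript: str, translate: bool = False, timeout: float = 120.0) -> str:
    """Send the transcript to Ollama and return the generated text."""
    body = json.dumps(build_payload(transcript, translate)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # A non-streaming reply carries the generated text in the "response" field.
        return json.loads(response.read().decode("utf-8"))["response"]
```

Calling `enhance("some raw transcript")` requires a running Ollama server with the model already pulled; `build_payload` can be inspected on its own without any network access.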

Quick Start

Requirements

  • Python 3.8+
  • FFmpeg
  • Ollama

Installation

# Install FFmpeg
brew install ffmpeg  # macOS
apt install ffmpeg   # Ubuntu/Debian
choco install ffmpeg # Windows

# Install Python dependencies
pip install -r requirements.txt

# Pull the Qwen model
ollama pull qwen2.5:32b
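
Before the first run it can help to confirm that the requirements listed above are actually available. The following is a minimal sketch using only the Python standard library; the helper names `check_environment` and `missing_requirements` are hypothetical and not part of WhisperPen.

```python
import shutil
import sys

REQUIRED_TOOLS = ("ffmpeg", "ollama")  # executables named in the requirements list
MIN_PYTHON = (3, 8)


def check_environment(tools=REQUIRED_TOOLS, min_python=MIN_PYTHON) -> dict:
    """Return a mapping of requirement name to True (found) or False (missing)."""
    results = {"python": sys.version_info[:2] >= min_python}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        results[tool] = shutil.which(tool) is not None
    return results


def missing_requirements(results: dict) -> list:
    """List the names of requirements that were not satisfied, sorted by name."""
    return sorted(name for name, ok in results.items() if not ok)
```

For example, `missing_requirements(check_environment())` returns an empty list when Python, FFmpeg, and Ollama are all in place. The check only looks for the executables on `PATH`; it does not verify that the Ollama server is running or that the model has been pulled.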

Usage

# Single recognition
python -m src.main

# Background listening mode (uses the wake word "小王小王")
python -m src.main -b

# Continuous listening mode
python -m src.main -c
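
The three invocations above map naturally onto a small argument parser. The sketch below shows one way the `-b` and `-c` flags could be handled with `argparse`; it is an assumption about the structure, not a copy of `src/main.py`. The long option names `--background` and `--continuous`, the function names, and the choice to make the two flags mutually exclusive are all hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Create a parser that accepts the -b and -c flags shown in the usage section."""
    parser = argparse.ArgumentParser(
        prog="python -m src.main",
        description="Speech-to-text with AI text enhancement.",
    )
    # Assumption: the two listening modes cannot be combined.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument(
        "-b", "--background", action="store_true",
        help="listen in the background and wait for the wake word",
    )
    mode.add_argument(
        "-c", "--continuous", action="store_true",
        help="listen continuously",
    )
    return parser


def select_mode(argv=None) -> str:
    """Return 'background', 'continuous', or 'single' for the given arguments."""
    args = build_parser().parse_args(argv)
    if args.background:
        return "background"
    if args.continuous:
        return "continuous"
    return "single"  # no flag: one-shot recognition
```

With this layout, `select_mode(["-b"])` selects background mode, and passing no flags falls through to single recognition.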

Project Structure

WhisperPen/
├── src/          # Source code
├── tests/        # Tests
├── config/       # Configuration files
├── data/         # Data files
└── docs/         # Documentation

Documentation

Author

License

MIT License
