Speech Recognition applied to transcribe amateur radio traffic experiments

darienmt/radio-listener

radio-listener

Overview

This repo contains my experiments applying speech recognition to transcribe amateur radio traffic. I'm very new to amateur radio, and I have trouble making out call signs well enough to call a station back or reply to somebody calling me. This project aims to help me learn how the hobby works by reading the transcribed traffic.

Running code

To run the code, create a virtual environment and install the dependencies with these commands:

python3 -m venv .venv
. ./.venv/bin/activate
python3 -m pip install -r ./src/requirements.txt

Then, install PyAudio:

sudo apt-get install python-pyaudio python3-pyaudio
sudo apt-get install portaudio19-dev

Then install ffmpeg for mp3 support in pydub:

sudo apt-get install ffmpeg libavcodec-extra
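
After the installation steps above, a quick check like the following may help confirm that the environment is complete. It is only a sketch: the import names in ASSUMED_MODULES are guesses based on the libraries this README mentions, not a list taken from ./src/requirements.txt, and the helper names are hypothetical.

```python
import importlib.util
import shutil

# Import names assumed from the libraries this README mentions; adjust them
# to match ./src/requirements.txt.
ASSUMED_MODULES = ("speech_recognition", "pyaudio", "pydub", "whisper")


def missing_modules(names=ASSUMED_MODULES):
    """Return the top-level module names not found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def ffmpeg_available():
    """Return True if an ffmpeg executable is on the PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing Python modules:", ", ".join(missing) if missing else "none")
    print("ffmpeg on PATH:", ffmpeg_available())
```

Run it inside the activated .venv so that it inspects the same interpreter the scripts will use.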

Code descriptions

  • ./src/mic_sample.py: Uses the SpeechRecognition library to read the computer's microphone (or another input device), runs OpenAI's Whisper model to recognize the speech, and logs the result to the screen (stdout).

    python ./src/mic_sample.py
  • ./src/mic_reproduction.py: An attempt (not very successful) to play the audio input back on the speaker in real time, using PyAudio with a couple of threads and queues. It asks you to select the input and output devices, then starts listening and playing back.

    python ./src/mic_reproduction.py
  • ./src/select_input.py: A helper that selects an input or output device from the local hardware with PyAudio. Most of the other files use it, since they need this selection.

  • ./src/mic_writer.py: The first script that does something genuinely useful. It listens to an audio input (using SpeechRecognition), recognizes the speech with OpenAI's Whisper medium model, and writes the result to stdout and to a file (./.logs).

    # It will ask to select an input device
    python ./src/mic_writer.py
    
    # Or you can preselect the device index (believe me, you will remember it after a while...)
    python ./src/mic_writer.py --device-index 9
  • ./src/mic_writer_whisper.py: Similar to ./src/mic_writer.py, but it uses OpenAI's Whisper medium model directly and saves the audio as mp3 at ./.audio/{yyyy-mm-dd}/*.mp3.

    # It will ask to select an input device
    python ./src/mic_writer_whisper.py
    
    # Or you can preselect the device index
    python ./src/mic_writer_whisper.py --device-index 9
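
The listen, recognize, and log loop described for ./src/mic_writer.py could look roughly like the sketch below. It is not the repository's code: the function names are hypothetical, ./.logs is treated as a directory holding one file per day (an assumption), and a missing --device-index falls back to the default input device here instead of prompting. The SpeechRecognition calls (Recognizer, Microphone, listen, recognize_whisper) are imported lazily so the helpers can be tried without audio hardware.

```python
import argparse
from datetime import datetime
from pathlib import Path


def parse_args(argv=None):
    """Parse the optional --device-index flag shown in the commands above."""
    parser = argparse.ArgumentParser(description="Transcribe an audio input.")
    parser.add_argument("--device-index", type=int, default=None,
                        help="input device index; default device when omitted")
    return parser.parse_args(argv)


def append_transcript(text, log_dir, now=None):
    """Print one timestamped line and append it to a per-day log file."""
    now = now or datetime.now()
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    line = f"{now:%Y-%m-%d %H:%M:%S} {text.strip()}"
    print(line)
    log_file = log_dir / f"{now:%Y-%m-%d}.log"  # hypothetical file name
    with log_file.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return log_file


def transcribe_forever(device_index, log_dir="./.logs"):
    """Listen, recognize with Whisper's medium model, and log each phrase."""
    import speech_recognition as sr  # third-party; needs PyAudio and Whisper

    recognizer = sr.Recognizer()
    with sr.Microphone(device_index=device_index) as source:
        recognizer.adjust_for_ambient_noise(source)
        while True:
            audio = recognizer.listen(source)  # blocks until a phrase ends
            text = recognizer.recognize_whisper(audio, model="medium")
            if text.strip():
                append_transcript(text, log_dir)


# Example wiring (commented out because it needs a microphone):
# args = parse_args()
# transcribe_forever(args.device_index)
```

Keeping argument parsing and log writing separate from the recognition loop makes those two pieces easy to exercise on their own.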

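The thread-and-queue approach mentioned for ./src/mic_reproduction.py can be illustrated with a small producer/consumer sketch. This is an assumption about the general pattern, not the script's actual code: all names are hypothetical, and stand-in callables replace the audio devices so the example runs without hardware.

```python
import queue
import threading

STOP = object()  # sentinel telling the consumer that capture has finished


def capture(read_chunk, chunks, n_chunks):
    """Producer: read n_chunks buffers and hand them to the playback thread."""
    for _ in range(n_chunks):
        chunks.put(read_chunk())
    chunks.put(STOP)


def playback(write_chunk, chunks):
    """Consumer: write buffers in arrival order until the sentinel shows up."""
    while True:
        chunk = chunks.get()
        if chunk is STOP:
            break
        write_chunk(chunk)


def passthrough(read_chunk, write_chunk, n_chunks, max_buffered=16):
    """Run capture and playback concurrently, joined by a bounded queue."""
    chunks = queue.Queue(maxsize=max_buffered)
    producer = threading.Thread(target=capture, args=(read_chunk, chunks, n_chunks))
    consumer = threading.Thread(target=playback, args=(write_chunk, chunks))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()


# Stand-in devices so the sketch runs without audio hardware.
fake_input = iter(bytes([i]) * 4 for i in range(5))
played = []
passthrough(lambda: next(fake_input), played.append, n_chunks=5)
print(played)
```

With PyAudio, read_chunk would typically wrap an input stream's read call and write_chunk an output stream's write. The bounded queue limits how far playback can fall behind capture, which is one plausible source of the latency noted above.
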
Development

Create venv and install dependencies

python3 -m venv .venv
. ./.venv/bin/activate
python3 -m pip install -r ./src/requirements.txt

Update Dependencies

python3 -m venv .venv
. ./.venv/bin/activate
python3 -m pip freeze > ./src/requirements.txt
