Audio to Text Transcription

This Python script transcribes an audio file using OpenAI's Whisper ASR model and then feeds the raw transcription to GPT-4 to correct any spelling or grammar mistakes, producing a cleaned-up transcript.

Dependencies

This script requires the openai Python library. You can install it using pip:

pip install openai

You'll need an API key from OpenAI to use their services. Once you have a key, set it as an environment variable:

export OPENAI_API_KEY='your-api-key'

What the script does

  1. The script first checks that the files 'subtitle-transcription-prompt.txt' and 'subtitle-correction-prompt.txt' exist in the same directory. These files contain the system prompts for the transcription step and the correction step, respectively.
  2. It then transcribes the audio file using the 'whisper-1' model from the OpenAI API.
  3. The transcription is then passed to the GPT-4 model to generate a corrected transcript (see the sketch below).
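
A minimal sketch of that two-step flow, assuming the current openai Python client (>= 1.0) and a Whisper request with response_format='srt' so the transcript comes back in SubRip form. The file names, prompt handling, and temperature value here are illustrative, not the script's actual code:

```python
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

# Step 2: transcribe the audio with whisper-1, asking for SubRip output.
# With a non-JSON response_format the client returns the transcript as plain text (assumed here).
with open("talk.mp3", "rb") as audio:  # illustrative file name
    raw_srt = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio,
        response_format="srt",
    )

# Step 3: have GPT-4 correct spelling/grammar while keeping the SRT structure intact.
with open("subtitle-correction-prompt.txt", "r", encoding="utf-8") as f:
    correction_prompt = f.read()

response = client.chat.completions.create(
    model="gpt-4",
    temperature=0,  # illustrative value
    messages=[
        {"role": "system", "content": correction_prompt},
        {"role": "user", "content": raw_srt},
    ],
)
corrected_srt = response.choices[0].message.content
```

Writing raw_srt and corrected_srt to separate .srt files gives the original/corrected pair described under Output.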

Configuration

The script requires the following parameters:

  • temperature: The sampling temperature passed to the GPT-4 correction step.
  • system_prompt: The system prompt for the AI model. This is read from the 'subtitle-correction-prompt.txt' file.
  • audio_file: The audio file to transcribe.

These parameters should be passed to the generate_corrected_transcript function.
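
For example, a call might look something like the following. The exact signature, and whether audio_file expects a path or an open file handle, are assumptions based on the parameter names above; check the script for the actual interface:

```python
# Hypothetical invocation; the parameter names come from this README,
# but the concrete signature of generate_corrected_transcript may differ.
with open("subtitle-correction-prompt.txt", "r", encoding="utf-8") as f:
    system_prompt = f.read()

corrected_srt = generate_corrected_transcript(
    temperature=0,               # illustrative value
    system_prompt=system_prompt,
    audio_file="talk.mp3",       # illustrative path
)
```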

Usage

  1. Make sure you have a file named 'subtitle-correction-prompt.txt' in the same directory as the script. This file should contain the system prompt GPT-4 uses to correct the initial transcription.
  2. Set up a virtual environment: python -m venv .venv
  3. Activate it: source .venv/bin/activate
  4. Install requirements: pip install -r requirements.txt
  5. Run the script: python audio_to_srt.py

Output

The transcriptions are saved to files in SubRip (.srt) format. Both the original and post-processed transcriptions are saved for comparison.
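
For reference, each cue in a SubRip file is a sequential index, a start --> end timestamp (millisecond precision after a comma), and one or more lines of text. The content below is purely illustrative, not output of the script:

```
1
00:00:00,000 --> 00:00:03,500
Welcome to the show.

2
00:00:03,500 --> 00:00:07,250
Today we talk about automatic speech recognition.
```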

Future ideas

  1. Accept a YouTube link
  2. Extract the audio track from the YouTube video
  3. Create a simple web front end or API endpoint