MyReplica.ai - Building Your own Personal AI

Part 1: Speech

Description: This part of the project mimics a personal AI that always listens to what you are listening to. Play anything on your desktop (run it for at least a few minutes for better results), and OpenAI's Whisper converts all the speech to text and stores it. When you ask a query, the saved text is loaded into a vector database and the entries most similar to the query are retrieved.
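
The retrieval step described above can be illustrated with a minimal sketch. It is not the project's implementation: the word-count "embedding" is a toy stand-in for a real embedding model and vector database, and the names embed, cosine, and retrieve are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=1):
    """Return the top_k transcript chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    # Hypothetical transcript chunks, for illustration only.
    transcript_chunks = [
        "the speaker explains how gradient descent updates the weights",
        "today's weather forecast predicts rain in the evening",
    ]
    print(retrieve("how does gradient descent work", transcript_chunks))
```

A real setup would swap embed for a sentence-embedding model and store the vectors in a vector database, but the ranking idea stays the same.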

Steps:

Install all the required dependencies

pip install -r requirements.txt

You will need to figure out which audio input is your desktop input and which is your microphone or headphone input. Enabling Stereo Mix exposes desktop audio as a recording device, which lets PyAudio capture it.

Go to Control Panel > Sound > Recording tab and enable Stereo Mix. I used a headphone for mic input, so depending on your audio input device, you will need to update input_device_index in the question_audio_input.py file:

stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                input_device_index=<your suitable device index>,
                frames_per_buffer=CHUNK)

You can list all available devices with this code, which is present in the same file:

import pyaudio

# Print PyAudio's info dictionary for every audio device on the system.
p = pyaudio.PyAudio()
for i in range(p.get_device_count()):
    dev = p.get_device_info_by_index(i)
    print("Device:", dev)
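
Reading the index off the printed list by hand works, but the lookup can also be automated. The sketch below is an assumption, not code from the repository: find_input_device and list_devices are hypothetical helpers that rely on the 'name', 'index', and 'maxInputChannels' keys of PyAudio's device info dictionaries.

```python
def find_input_device(devices, name_hint):
    """Return the index of the first input-capable device whose name contains name_hint."""
    hint = name_hint.lower()
    for dev in devices:
        if dev.get("maxInputChannels", 0) > 0 and hint in dev.get("name", "").lower():
            return dev["index"]
    return None


def list_devices():
    """Collect PyAudio's device info dictionaries (requires PyAudio to be installed)."""
    import pyaudio  # imported here so find_input_device works without PyAudio

    p = pyaudio.PyAudio()
    try:
        return [p.get_device_info_by_index(i) for i in range(p.get_device_count())]
    finally:
        p.terminate()


# Example use (the device label varies by driver, so "Stereo Mix" is an assumption):
#   index = find_input_device(list_devices(), "Stereo Mix")
```

If the function returns None, no input device matched the hint, and the printed device list is the place to check the exact name.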

Next, run three Python files simultaneously in separate terminals. This can also be done from a single process using multithreading.

Run Desktop Audio Input. It listens to your desktop audio in real time and stores it in one-minute intervals.

python experiments/speech/desktop_audio_input.py
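
The one-minute segmenting can be sketched as follows. This is an illustration under assumptions, not the script's actual code: the RATE, CHUNK, CHANNELS, and SAMPLE_WIDTH values are guesses, and record_segment takes any read_chunk(n_frames) callable (with PyAudio this would typically be stream.read).

```python
import wave

RATE = 44100      # samples per second (assumed)
CHUNK = 1024      # frames read per buffer (assumed)
CHANNELS = 2      # stereo desktop audio (assumed)
SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16-bit audio (assumed)


def chunks_per_segment(seconds, rate=RATE, chunk=CHUNK):
    """Number of CHUNK-sized reads needed to cover `seconds` of audio."""
    return int(rate / chunk * seconds)


def write_segment(path, frames, rate=RATE, channels=CHANNELS, sample_width=SAMPLE_WIDTH):
    """Write a list of raw PCM byte buffers to a WAV file."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(rate)
        wf.writeframes(b"".join(frames))


def record_segment(read_chunk, path, seconds=60):
    """Read one segment through `read_chunk(n_frames)` and save it to `path`."""
    frames = [read_chunk(CHUNK) for _ in range(chunks_per_segment(seconds))]
    write_segment(path, frames)
```

Calling record_segment in a loop with a new file name each time yields one WAV file per minute. The same arithmetic gives the number of reads for a fixed 10-second query window.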

Run Desktop Audio to Text. It uses Whisper to convert the recorded audio files to text files.

python experiments/speech/desktop_audio_to_text.py
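
A minimal sketch of this audio-to-text step, assuming the recordings are WAV files in one folder and the transcripts go to another (both folder layouts and all function names here are hypothetical). The Whisper part uses load_model and transcribe from the openai-whisper package; the "base" model size is an assumption.

```python
from pathlib import Path


def pending_audio(audio_dir, text_dir):
    """Return WAV files in audio_dir that have no matching .txt file in text_dir yet."""
    done = {p.stem for p in Path(text_dir).glob("*.txt")}
    return sorted(p for p in Path(audio_dir).glob("*.wav") if p.stem not in done)


def transcribe_pending(audio_dir, text_dir, transcribe):
    """Transcribe every pending file with `transcribe(path) -> str` and save the text."""
    Path(text_dir).mkdir(parents=True, exist_ok=True)
    written = []
    for wav in pending_audio(audio_dir, text_dir):
        out = Path(text_dir) / (wav.stem + ".txt")
        out.write_text(transcribe(str(wav)), encoding="utf-8")
        written.append(out)
    return written


def whisper_transcriber(model_name="base"):
    """Build a transcribe function backed by the openai-whisper package."""
    import whisper  # requires `pip install openai-whisper` and ffmpeg

    model = whisper.load_model(model_name)
    return lambda path: model.transcribe(path)["text"]


# Example use (folder names are placeholders):
#   transcribe_pending("audio", "text", whisper_transcriber())
```

Passing the transcriber in as a function keeps the file bookkeeping testable without loading a model, and skipping files that already have a transcript avoids repeating work when the script polls the folder.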

Run the HuggingFace Agent file. It uses Whisper to listen to your query, then finds the relevant data and replies in audio.

python experiments/speech/text_reply.py

Wait until the agent loads, then press 'w' to start asking a question. The query listening time is hardcoded to 10 seconds.
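
As an alternative to opening three terminals, the scripts could be started from one launcher. The sketch below uses child processes rather than multithreading, and it is an untested assumption that the agent's key handling still works when started this way. build_commands and launch_all are hypothetical names.

```python
import subprocess
import sys

SCRIPTS = [
    "experiments/speech/desktop_audio_input.py",
    "experiments/speech/desktop_audio_to_text.py",
    "experiments/speech/text_reply.py",
]


def build_commands(scripts=SCRIPTS, python=sys.executable):
    """Build one command line per script, using the current interpreter."""
    return [[python, script] for script in scripts]


def launch_all(commands):
    """Start every command as a child process and wait until all of them exit."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        for proc in procs:
            proc.wait()
    except KeyboardInterrupt:
        pass
    finally:
        # Stop any children that are still running.
        for proc in procs:
            if proc.poll() is None:
                proc.terminate()
```

Calling launch_all(build_commands()) from the repository root starts all three scripts, and Ctrl+C stops them together.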
