Softcery Voice Agent

Softcery Voice Agent is a voice-enabled application designed to assist users in discussing product ideas. It combines a Python agent (a Flask token server and a LiveKit worker), a React web client, and external voice processing services (Deepgram for speech-to-text, Cartesia for text-to-speech) to provide an interactive voice conversation.

Introduction

Voice agents represent a significant advancement in human-computer interaction, enabling users to communicate with devices using natural language.

Voice agents are increasingly being integrated into various applications, from virtual assistants and customer service bots to smart home devices and automotive systems. They offer the potential to revolutionize how we interact with technology, making it more accessible and user-friendly.

STT/LLM/TTS

STT (the "ears" of the system): Converts audio into text.
- Providers: Deepgram, Amazon Transcribe, Google Speech-to-Text, Microsoft Azure Speech Service
LLM (the "brain" of the system): Processes transcribed text and generates a response.
- Providers: OpenAI’s GPT, Anthropic’s Claude, Meta’s LLaMA, Google’s Gemini
TTS (the "voice" of the system): Synthesizes text into spoken audio.
- Providers: Deepgram, Cartesia, Microsoft Azure Speech Synthesis

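Each service in this setup is accessed through a RESTful or WebSocket API, so a single conversational turn is a multi-step process (audio to STT, text to the LLM, response to TTS). Every step adds network and processing time, and the delays accumulate into the latency the user hears.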
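To make the latency point concrete, here is a minimal sketch of one conversational turn with per-stage timing. It is not the code used in this repository: `fake_stt`, `fake_llm`, and `fake_tts` are hypothetical stand-ins for the real provider calls, which would each be a network request.

```python
import time
from typing import Callable, Dict, Tuple


# Hypothetical stand-ins for provider calls (assumptions, not real SDK functions).
def fake_stt(audio: bytes) -> str:
    return "what do you think of my product idea"


def fake_llm(text: str) -> str:
    return f"You said: {text}. Tell me more."


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


def timed(stage: str, fn: Callable, arg, timings: Dict[str, float]):
    """Run one stage and record how long it took, in seconds."""
    start = time.perf_counter()
    result = fn(arg)
    timings[stage] = time.perf_counter() - start
    return result


def run_turn(audio: bytes) -> Tuple[bytes, Dict[str, float]]:
    """Run one turn: audio -> text -> reply -> audio, timing each stage."""
    timings: Dict[str, float] = {}
    text = timed("stt", fake_stt, audio, timings)
    reply = timed("llm", fake_llm, text, timings)
    speech = timed("tts", fake_tts, reply, timings)
    # The stages run one after another, so the total is the sum of the three.
    timings["total"] = sum(timings.values())
    return speech, timings
```

Because the stages run sequentially, the total is the sum of the three stage times. Swapping a stub for a real client would show which provider dominates the turn.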

Features

Voice Interaction: Engage with the application using voice commands.
Rate Limiting: Prevents abuse by limiting the number of requests per user.
Real-time Audio Visualization: Visual feedback for audio input and output.
Speech-to-Text Conversion: Utilizes Deepgram to transform spoken language into text for further processing.
Text-to-Speech Synthesis: Employs Cartesia to convert text responses into natural-sounding speech.
Real-time Communication Management: Uses LiveKit to handle real-time interactions and participant events efficiently.
Advanced Language Processing: Leverages Llama 3 served by Groq, accessed through the OpenAI-compatible integration, for processing and generating text-based responses.

Components

Agent:
- Uses Flask to create API endpoints for token generation.
- Rate limiter: Limits both API requests for token generation and voice requests made within a single session over the LiveKit RTC connection.
- Integrates with external voice processing services, including Deepgram and Cartesia.
Web Client:
- Utilizes LiveKit React SDK for real-time communication.
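The README does not say which rate limiting algorithm the agent uses. As an illustration only, the sketch below shows a sliding-window limiter keyed by user or session; the class name, parameters, and the injectable `clock` are all assumptions made for this example.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowRateLimiter:
    """Allow at most `max_requests` per key within the last `window_seconds`."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,  # injectable for testing
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        """Return True and record the request if `key` is under its limit."""
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False
        hits.append(now)
        return True
```

One instance could guard the token endpoint (keyed by client address) and another the voice requests inside a session (keyed by participant identity). Both keys are assumptions about how the limits might be applied.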

Installation

Clone the repository:

```
git clone https://github.com/softcery/softcery-agent.git
cd softcery-agent
```

Agent Setup:
- Navigate to the agent directory.
- Install the required packages:
```
pip install -r requirements.txt
```
Web Client Setup:
- Navigate to the web directory.
- Install the dependencies:
```
npm install
```

Usage

Start the Token Server:
```
python main.py # from `agent` directory
```

Start Voice Agent LiveKit worker:

```
python voice_agent.py dev # from `agent` directory
```

Start the Web Client:
```
npm run dev # from `web` directory
```

Environment Variables

The application requires several environment variables to be set. Create a .env file in the agent and web directories. Examples of these environment variables can be found in the .env.template.sh files.
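A quick way to catch a missing setting before starting the servers is to parse the .env file and check it against a list of required names. The sketch below is illustrative only. The names in `REQUIRED_VARS` are hypothetical placeholders (the authoritative list is in the .env.template.sh files), and the handling of an optional `export ` prefix is an assumption about the file format.

```python
from typing import Dict, Iterable, List

# Hypothetical names for illustration; take the real ones from .env.template.sh.
REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Assumption: lines may carry a shell-style "export " prefix.
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(
    env: Dict[str, str], required: Iterable[str] = REQUIRED_VARS
) -> List[str]:
    """Return the required names that are absent or empty in `env`."""
    return [name for name in required if not env.get(name)]
```

For example, `missing_vars(parse_env_text(open(".env").read()))` returns the names that still need values in that directory's .env file.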
