Welcome to the Conversational AI Hackathon, hosted at UCL! 🚀 This event is your opportunity to dive into the exciting world of Conversational AI and build real-time, voice-driven applications that tackle innovative challenges.
Email {sohaib, jiameng, adnan, lohith, emma}@neuphonic.com if you need any help!
Build a real-time conversational AI solution that delivers seamless, voice-driven interactions for innovative use cases. Your goal is to combine state-of-the-art components to create a functional, impactful system.
Your project will be judged based on the following criteria:
- Functionality (15%)
  - How well does the solution perform its intended task?
  - Does the conversational AI respond appropriately and handle various inputs effectively?
- Innovation & Creativity (40%)
  - Is the idea unique, or does it improve upon existing solutions?
  - Does it demonstrate creative use of conversational AI technology?
- User Experience (15%)
  - Is the AI interaction intuitive and engaging for users?
  - Are the responses natural and contextually relevant?
- Impact & Applicability (30%)
  - How well does the solution address a real-world problem?
  - Can the project be scaled or adapted for broader use cases?
- Introduction
- Setup
- Project Structure
- Code Overview
- How to Run
- Challenges & Ideas
- Contribution Guidelines
- License
The hackathon is designed to give you hands-on experience with:
- Text-to-Speech (TTS): utilising Neuphonic's API for voice synthesis.
- Speech-to-Speech (STS): utilising Neuphonic's API for conversational AI.
In what follows, we'll show you how to get started and use our software!
You will need Python 3.10 or later.
You'll also need an API key, which you can get from beta.neuphonic.com.
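The demo scripts need access to this key. A common pattern, and only a suggestion (the variable name NEUPHONIC_API_KEY below is a placeholder; check the pyneuphonic README for the exact name the SDK expects), is to keep the key out of your code and read it from the environment:

```python
import os

# NEUPHONIC_API_KEY is an example variable name -- check the pyneuphonic
# README for the exact name the SDK expects, or pass the key in explicitly.
api_key = os.environ.get("NEUPHONIC_API_KEY")
if api_key is None:
    raise RuntimeError("Set NEUPHONIC_API_KEY before running the demos.")
```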
If you're on macOS, install Homebrew from https://brew.sh/ if you don't have it already.
In most cases, you will be playing the audio returned from our servers directly on your device.
⚠️ Mac users encountering a 'portaudio.h' file not found error can resolve it by running brew install portaudio.
Clone the repository:
git clone https://github.com/neuphonic/ucl_hackathon.git
cd ucl_hackathon
Create a virtual environment and install the dependencies:
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
├── hackathon_demo_ucl.ipynb # Jupyter notebook going over various helpful examples
├── neuphonic_texttospeech.py # TTS module
├── neuphonic_speech_to_speech.py # Main integration program
├── README.md # Documentation
├── LICENSE # MIT License
└── requirements.txt # Dependencies
Try out the Jupyter notebook hackathon_demo_ucl.ipynb in your IDE!
For more detailed documentation and examples, please refer to the pyneuphonic SDK README, available at the pyneuphonic GitHub repository.
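As a quick orientation, here is a minimal sketch of what synthesising and playing a sentence with the SDK can look like. It is based on the streaming (SSE) example in the pyneuphonic README; the exact class and method names (Neuphonic, TTSConfig, SSEClient, AudioPlayer) may differ between SDK versions, so treat the SDK README as the source of truth:

```python
import os

from pyneuphonic import Neuphonic, TTSConfig
from pyneuphonic.player import AudioPlayer

# Create a client using your API key (here read from an environment variable).
client = Neuphonic(api_key=os.environ.get("NEUPHONIC_API_KEY"))

# Open a server-sent-events TTS client and configure the voice settings.
sse = client.tts.SSEClient()
tts_config = TTSConfig(lang_code="en")

# Stream the synthesised audio straight to the speakers.
with AudioPlayer() as player:
    response = sse.send("Hello from the UCL hackathon!", tts_config=tts_config)
    player.play(response)
```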
The Text-to-Speech module leverages Neuphonic’s API for generating high-quality audio.
Key Functionality:
neuphonic_tts(input_text): converts input text into speech and plays it.
Test this out with
python neuphonic_texttospeech.py
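If you'd rather call the TTS from your own script than from the command line, you can import the function directly. This is a minimal sketch assuming neuphonic_texttospeech.py exposes neuphonic_tts at module level and sits next to your script:

```python
# Assumes neuphonic_texttospeech.py is in the same directory (or on PYTHONPATH)
# and exposes neuphonic_tts at module level.
from neuphonic_texttospeech import neuphonic_tts

# Converts the text to speech via Neuphonic's API and plays it on your device.
neuphonic_tts("Welcome to the Conversational AI Hackathon!")
```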
*** NOTE: The speech recognition is quite sensitive, so you might need to be in a quiet area (and use headphones) for it to work fluidly; otherwise it can pick up background noise.
We've also created a file called neuphonic_speech_to_speech.py, which allows you to talk to a model in a speech-to-speech fashion. It connects a Deepgram ASR, an OpenAI LLM, and the Neuphonic TTS. By installing pyneuphonic, you install all the dependencies required to get it up and running.
- Speak into the microphone, and the system will transcribe your speech in real time.
- The transcribed text is sent to the LLM to generate a response.
- The response is converted to speech using the TTS module and played back to you.
- Repeat the process to continue the conversation.
Test this out with
python neuphonic_speech_to_speech.py
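Conceptually, the loop implemented in neuphonic_speech_to_speech.py looks something like the sketch below. The three callables are placeholders standing in for the Deepgram ASR, OpenAI LLM and Neuphonic TTS pieces the script actually uses, not real API names; read the script itself to see the real wiring:

```python
def conversation_loop(transcribe, generate_reply, speak):
    """Placeholder ASR -> LLM -> TTS loop, repeated once per conversational turn.

    transcribe, generate_reply and speak are stand-ins for the Deepgram ASR,
    OpenAI LLM and Neuphonic TTS calls wired together in
    neuphonic_speech_to_speech.py.
    """
    history = []  # keep the running chat history so replies stay in context
    while True:
        user_text = transcribe()  # 1. capture microphone audio and transcribe it
        if not user_text:
            continue  # skip silence or empty transcripts
        history.append({"role": "user", "content": user_text})

        reply = generate_reply(history)  # 2. ask the LLM for a response
        history.append({"role": "assistant", "content": reply})

        speak(reply)  # 3. synthesise the reply with TTS and play it back
```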
- Real-time performance: Ensure smooth, low-latency interactions.
- Robustness: Handle varied accents, speech rates, and noisy environments.
- Virtual Assistant: Build a personalized voice assistant.
- Interactive Learning: Develop a language learning app.
- Accessibility Tool: Create tools for users with disabilities.
- News Summarisation: Fetch the latest news, generate concise summaries, and deliver them as personalized audio clips.
- Dynamic Storytelling: Create interactive audiobooks, with the story adapting based on mood or context.
- TTS Fitness Coach: A virtual fitness coach that provides real-time, motivational voice instructions during workouts.
- AI Audioguides: Design a tool for generating personalized audioguides for museums or attractions.
All contributions during the hackathon should be:
- Clearly documented.
- Tested to ensure compatibility with the main system.
This project is licensed under the MIT License. See LICENSE for more details.
Happy Hacking! 🎉