Quickstart demo model.predict_clip(data) #51

Open
dzianisv opened this issue Jan 3, 2025 · 3 comments

dzianisv commented Jan 3, 2025

Hi, I am trying to use this library on macOS. Here is my code snippet:

from microwakeword import inference
import os
import urllib.request
import logging
import pyaudio
import numpy as np

# Configure logging
logging.basicConfig(level=logging.INFO)

def require_model():
    model_filename = "hey_jarvis.tflite"
    model_url = "https://github.com/esphome/micro-wake-word-models/raw/refs/heads/main/models/hey_jarvis.tflite"
    
    # Check if the model file exists in the current working directory
    if not os.path.exists(model_filename):
        logging.info(f"{model_filename} not found. Downloading...")
        # Download the model file
        urllib.request.urlretrieve(model_url, model_filename)
        logging.info(f"Downloaded {model_filename}.")
    
    # Return the path to the model file
    return os.path.abspath(model_filename)

model = inference.Model(path=require_model())

def capture_audio_and_predict():
    # Initialize PyAudio
    p = pyaudio.PyAudio()

    # Define audio stream parameters
    stream = p.open(format=pyaudio.paInt16,
                    channels=1,
                    rate=16000,
                    input=True,
                    frames_per_buffer=1024)

    try:
        while True:
            # Read audio data from the stream
            audio_data = stream.read(1024)
            # Convert audio data to numpy array
            data = np.frombuffer(audio_data, dtype=np.int16)
            # Call predict_clip with the captured audio data
            print(model.predict_clip(data))
    except KeyboardInterrupt:
        logging.info("Stopping audio capture.")
    finally:
        # Stop and close the stream
        stream.stop_stream()
        stream.close()
        # Terminate PyAudio
        p.terminate()

# Call the function to start capturing audio and predicting
capture_audio_and_predict()

And here is my Pipfile:

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
pyaudio = "*"
microwakeword = {ref = "main", git = "https://github.com/kahrendt/microWakeWord.git"}

[dev-packages]

[requires]
python_version = "3.10"

I appreciate any help with a simple demo. Thank you!

@dzianisv dzianisv changed the title Quickstart demo Quickstart demo model.predict_clip(data) Jan 3, 2025

dzianisv commented Jan 4, 2025

When I try my demo, it fails:

You can set PIPENV_VERBOSITY=-1 to suppress this warning.
Traceback (most recent call last):
  File "/Users/engineer/workspace/Voice1/main.py", line 1, in <module>
    from microwakeword import inference
  File "/Users/engineer/workspace/Voice1/.venv/lib/python3.10/site-packages/microwakeword/inference.py", line 22, in <module>
    from microwakeword.audio.audio_utils import generate_features_for_clip
ModuleNotFoundError: No module named 'microwakeword.audio'

For some reason, Python can't import the audio module.
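One way to narrow down an error like this is to check which submodules Python can actually locate in the installed package. The snippet below is a minimal sketch using only the standard library; the helper name missing_submodules is hypothetical, and the submodule names are taken from the traceback and import lines in this thread, not from the package's documentation.

```python
import importlib.util


def missing_submodules(package, names):
    """Return the entries of `names` that cannot be located under `package`."""
    missing = []
    for name in names:
        full_name = f"{package}.{name}"
        try:
            # find_spec imports the parent package, then looks up the child.
            spec = importlib.util.find_spec(full_name)
        except ModuleNotFoundError:
            # Raised when the parent package itself cannot be imported.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Names assumed from this thread; adjust to whatever the traceback reports.
    print(missing_submodules("microwakeword", ["audio", "layers", "inference"]))
```

If a name shows up in the result, the corresponding directory or file is probably absent from site-packages/microwakeword, which would point at how the package was installed rather than at the calling code.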


dzianisv commented Jan 4, 2025

I added the following to microwakeword/__init__.py:

from . import audio, inception, inference, utils

but now it fails to find the layers module, which is imported in the inception module. It also still raises

ModuleNotFoundError: No module named 'microwakeword.audio'

for the audio module, which is imported in inference.

@kahrendt
Owner

Those seem like Python install issues, but I'm not super familiar with the details on how this type of stuff works!

I suggest looking at pymicro-wakeword for running models in streaming mode. The inference code included in this repo is not suitable for processing streaming audio, as it generates spectrogram features assuming you have the entire audio clip.
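To illustrate the difference between feeding 1024-sample chunks and feeding an entire clip, here is a minimal sketch that accumulates raw microphone chunks into one fixed-length buffer first. The helper collect_clip and its parameters are hypothetical; the 16 kHz rate and 16-bit samples are assumed from the stream settings in the original post.

```python
SAMPLE_RATE = 16000   # Hz, as in the original post's stream settings
BYTES_PER_SAMPLE = 2  # 16-bit samples (pyaudio.paInt16)


def collect_clip(chunks, clip_seconds, sample_rate=SAMPLE_RATE):
    """Join raw audio chunks until `clip_seconds` of audio has been gathered.

    `chunks` is any iterable of bytes objects, for example successive
    results of stream.read(1024). Returns at most the requested duration;
    if the iterable ends early, the shorter buffer is returned.
    """
    target = int(clip_seconds * sample_rate) * BYTES_PER_SAMPLE
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        if len(buffer) >= target:
            break
    return bytes(buffer[:target])
```

The returned bytes can then be converted with np.frombuffer(clip, dtype=np.int16), as in the original snippet, and passed to predict_clip as one whole clip. This only works on pre-recorded windows; for continuous detection the streaming approach suggested above is the better fit.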
