2 way audio support #344

Closed
NikDevx opened this issue Apr 7, 2021 · 106 comments

Comments

@NikDevx

NikDevx commented Apr 7, 2021

Hello @roleoroleo
Is it possible to add 2-way audio, or just audio support, for the Yi 1080p Home 9FUS (y203c) camera?

@roleoroleo
Owner

Audio is supported in the stream.
2-way audio is a work in progress, but I don't know if I will be able to complete it.

@NikDevx
Author

NikDevx commented Apr 7, 2021

Audio is supported in the stream.
2-way audio is a work in progress, but I don't know if I will be able to complete it.

Can I help you?

I saw something about 2 way audio here homebridge-plugins/homebridge-camera-ffmpeg#738

@roleoroleo
Owner

The problem here is how to inject audio into the device, not how to stream it.
I'm working on a hacked libalsa, but it's very complicated.

@NikDevx
Author

NikDevx commented Apr 8, 2021

Do you mean the problem with detecting the sound card to transfer audio from the device to the camera speaker?

@roleoroleo
Owner

The audio devices are in use by the rmm process.
You can't open them from another process.
As for the mic, we are using a hacked version of tinyalsa that writes all captured samples to a fifo.
Now I'm working on a similar hack to play audio on the speaker.

@NikDevx
Author

NikDevx commented Apr 9, 2021

I got it.
Thanks

@roleoroleo
Owner

Do you want to try a beta?

@NikDevx
Author

NikDevx commented Apr 15, 2021

Yes
I can try tomorrow

@roleoroleo
Owner

y203c_0.4.1.tar.gz

@roleoroleo
Owner

How it works:
1 - Create a raw PCM file (WAV without header): 16 kHz, 16 bit, mono (S16LE).
2 - Copy it to /tmp/audio_in_fifo
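
As a rough illustration of these two steps, here is a minimal Python sketch. It is not part of the firmware: the function names are made up for the example, and it assumes Python is available wherever the script runs (the write step has to reach the FIFO path, so in practice the bytes may need to be generated elsewhere and copied to the camera).

```python
import math
import struct

SAMPLE_RATE = 16000  # 16 kHz, the rate the thread says the speaker path expects


def make_tone(freq_hz: float, seconds: float, amplitude: float = 0.5) -> bytes:
    """Return a sine tone as headerless PCM: signed 16-bit little-endian, mono."""
    n_samples = int(SAMPLE_RATE * seconds)
    peak = int(amplitude * 32767)
    samples = (
        int(peak * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n_samples)
    )
    # "<h" packs one little-endian signed 16-bit integer (S16LE) per sample.
    return b"".join(struct.pack("<h", s) for s in samples)


def write_pcm(pcm: bytes, path: str = "/tmp/audio_in_fifo") -> None:
    """Write the raw bytes to the FIFO (hypothetical helper, same effect as cat)."""
    with open(path, "wb") as fifo:
        fifo.write(pcm)
```

Half a second of audio in this format is 8000 samples, i.e. 16000 bytes, with no header in front of the samples.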

If you want to use a TTS engine, download nanotts from here:
https://github.com/roleoroleo/yi-hack-utils
and use it with this web service:

@NikDevx
Author

NikDevx commented Apr 16, 2021

How it works:
1 - Create a raw PCM file (WAV without header): 16 kHz, 16 bit, mono (S16LE).
2 - Copy it to /tmp/audio_in_fifo

If you want to use a TTS engine, download nanotts from here:
https://github.com/roleoroleo/yi-hack-utils
and use it with this web service:

What does "without header" mean? Do you mean without a file name?
Which version of nanotts should I use: MStar, Allwinner, or Allwinner-v2?

@NikDevx
Author

NikDevx commented Apr 16, 2021

y203c_0.4.1.tar.gz

Will all my configuration data be deleted after flashing?

@roleoroleo
Owner

Yes, save it first from the Maintenance page.

@NikDevx
Author

NikDevx commented Apr 23, 2021

Hello!
This works like a charm
Thanks a lot

@NikDevx
Author

NikDevx commented Apr 24, 2021

Which audio card do you use to send audio to the speaker?

For example, I can send audio to a speaker via ffmpeg.
For my Raspberry Pi camera's speaker I use the ALSA audio card: "audio": "2way -f alsa default"

Maybe I can somehow send audio to the camera's speaker via ffmpeg?

@roleoroleo
Owner

roleoroleo commented Apr 24, 2021

The cam supports ALSA via the tinyalsa lib.
But the device is busy because rmm opens it at boot.
So you can't use it with standard programs.
I created a "fork" by patching the library: the library reads from a fifo file and sends the raw audio to the speaker.
You can send audio by writing to this file (/tmp/audio_in_fifo), but the stream must match the correct format: 16 kHz, 16 bit, mono (S16LE).
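
For anyone starting from an ordinary WAV file, the sketch below shows one way to get the header-less bytes while checking the format first. It is only an illustration using Python's standard `wave` module; `wav_to_raw_pcm` is a hypothetical helper, not something shipped with the hack.

```python
import wave


def wav_to_raw_pcm(wav_path: str) -> bytes:
    """Return the sample data of a WAV file without its header.

    Raises ValueError unless the file is 16 kHz, 16-bit (2 bytes), mono.
    """
    with wave.open(wav_path, "rb") as wav:
        fmt = (wav.getframerate(), wav.getsampwidth(), wav.getnchannels())
        if fmt != (16000, 2, 1):
            raise ValueError(f"expected (16000, 2, 1), got {fmt}")
        # readframes() returns only the samples, so the header is dropped here.
        return wav.readframes(wav.getnframes())
```

The returned bytes are what would then be written to /tmp/audio_in_fifo; a file in any other format is rejected instead of being played at the wrong speed or as noise.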

@NikDevx
Author

NikDevx commented Apr 24, 2021

The cam supports ALSA via the tinyalsa lib.
But the device is busy because rmm opens it at boot.
So you can't use it with standard programs.
I created a "fork" by patching the library: the library reads from a fifo file and sends the raw audio to the speaker.
You can send audio by writing to this file (/tmp/audio_in_fifo), but the stream must match the correct format: 16 kHz, 16 bit, mono (S16LE).

I already used this one

http://IP-CAM:8080/cgi-bin/speak.sh?lang=en-US
POST
Text message in the payload

Anyway thank you!

@roleoroleo
Owner

ffmpeg from local shell or remotely?
Where do you want to run ffmpeg?

@NikDevx
Author

NikDevx commented Apr 25, 2021

ffmpeg on my Raspberry Pi.
I am using this plugin to add the camera to Homebridge: https://github.com/Sunoo/homebridge-camera-ffmpeg#readme
And this plugin includes ffmpeg.

@roleoroleo
Owner

You could try to create a new web service with this code:

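#!/bin/sh

# Minimal CGI reply: HTTP header, a blank line, then an empty JSON object.
printf "Content-type: application/json\r\n\r\n"
printf "{}"

# Forward the POST body (raw audio arriving on stdin) to the speaker fifo.
cat - > /tmp/audio_in_fifo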

Make sure the file has permission 0755.
Then you can send a file as the payload of a POST request to this web service.
I gave it a try and it works.
Combining ffmpeg and curl should probably work.
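
A possible shape for that combination, sketched in Python rather than curl. The ffmpeg options are the ones quoted elsewhere in this thread; `play_on_camera` and the URL argument are placeholders (the URL is whatever name the script above was saved under in cgi-bin), and the sketch assumes ffmpeg is installed on the sending machine and that the camera needs no authentication.

```python
import subprocess
import urllib.request


def ffmpeg_cmd(source: str) -> list[str]:
    """ffmpeg arguments that decode `source` to raw S16LE, 16 kHz, mono on stdout."""
    return ["ffmpeg", "-i", source,
            "-f", "s16le", "-acodec", "pcm_s16le",
            "-ac", "1", "-ar", "16000", "-"]


def play_on_camera(source: str, url: str) -> None:
    """Convert `source` with ffmpeg and POST the raw audio to the CGI script."""
    pcm = subprocess.run(ffmpeg_cmd(source), check=True,
                         stdout=subprocess.PIPE).stdout
    request = urllib.request.Request(url, data=pcm, method="POST")
    with urllib.request.urlopen(request, timeout=30) as response:
        response.read()  # the script answers with an empty JSON object
```

The whole clip is buffered in memory before it is sent, which keeps the sketch short but would not suit a live two-way stream.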

@NikDevx
Author

NikDevx commented Apr 26, 2021

You could try to create a new web service with this code:

#!/bin/sh

printf "Content-type: application/json\r\n\r\n"
printf "{}"

cat - > /tmp/audio_in_fifo

Should I create it via SSH or some other way?

@NikDevx
Author

NikDevx commented Apr 26, 2021

Thanks a lot!

@NikDevx
Author

NikDevx commented Apr 26, 2021

@roleoroleo could you compile a firmware build for me with the speaker service added, please?
I'm having some problems with Docker on macOS Big Sur.

@roleoroleo
Owner

Check the release section.

@NikDevx
Author

NikDevx commented Apr 27, 2021

Check the release section.

You are the best!
Thanks!

@bepece1

bepece1 commented Apr 27, 2021

How can I use the new "speaker" function in version 0.4.1???

@roleoroleo
Owner

speak.sh
http://IP-CAM:8080/cgi-bin/speak.sh?lang=en-US
POST
Text message in the payload

speaker.sh
http://IP-CAM:8080/cgi-bin/speaker.sh
POST
Raw audio in the payload: 16 kHz, 16 bit, mono (S16LE)
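
The two calls could be built like this from Python. This is a sketch only: the host value is a placeholder, port 8080 and the paths are taken from the URLs above, and it assumes the camera's web server is not password protected.

```python
import urllib.parse
import urllib.request


def speak_request(host: str, text: str, lang: str = "en-US") -> urllib.request.Request:
    """POST request for speak.sh: the text to be spoken goes in the body."""
    query = urllib.parse.urlencode({"lang": lang})
    url = f"http://{host}:8080/cgi-bin/speak.sh?{query}"
    return urllib.request.Request(url, data=text.encode("utf-8"), method="POST")


def speaker_request(host: str, pcm: bytes) -> urllib.request.Request:
    """POST request for speaker.sh: raw S16LE, 16 kHz, mono audio in the body."""
    url = f"http://{host}:8080/cgi-bin/speaker.sh"
    return urllib.request.Request(url, data=pcm, method="POST")
```

Either request is sent with `urllib.request.urlopen(request)`, for example `urllib.request.urlopen(speak_request("IP-CAM", "hello"))`.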

@jd1900

jd1900 commented Apr 27, 2021

Any idea of how to tie this with TTS from Home Assistant to send audio to the camera?

@NikDevx
Author

NikDevx commented Apr 27, 2021

speak.sh
http://IP-CAM:8080/cgi-bin/speak.sh?lang=en-US
POST
Text message in the payload

speaker.sh
http://IP-CAM:8080/cgi-bin/speaker.sh
POST
Raw audio in the payload: 16 kHz, 16 bit, mono (S16LE)

@roleoroleo, on the current firmware version the speak and speaker services are not working.

my curl: curl -X POST -d 'hello' http://192.168.0.30:8080/cgi-bin/speak.sh?lang=en-US
zsh: no matches found: http://192.168.0.30:8080/cgi-bin/speak.sh?lang=en-US

It only works without the lang parameter: curl -X POST -d 'hello' http://192.168.0.30:8080/cgi-bin/speak.sh

{
"error":"false",
"description":"hello"
}

But no audio comes out.

↓↓↓↓↓

y203c_0.4.1.tar.gz

Here it was working
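
A side note on the `zsh: no matches found` message above: it most likely comes from the shell rather than from the camera, because zsh treats an unquoted `?` as a filename wildcard, so the request is never sent. Quoting the URL avoids that. The small sketch below uses Python's `shlex` module to show which arguments need quoting; `curl_command` is just an illustrative helper, and this does not explain the missing audio when the request does go through.

```python
import shlex


def curl_command(url: str, text: str) -> str:
    """Build a curl command line with every argument quoted for the shell."""
    return shlex.join(["curl", "-X", "POST", "-d", text, url])


print(curl_command("http://192.168.0.30:8080/cgi-bin/speak.sh?lang=en-US", "hello"))
# curl -X POST -d hello 'http://192.168.0.30:8080/cgi-bin/speak.sh?lang=en-US'
```

Only the URL ends up in single quotes, since it is the only argument containing a character the shell would interpret.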

@tunerooster

tunerooster commented Nov 16, 2021

I hate to say it, but the volume through the media_player interface is too low to be usable. Using both tts and play_media, the volume is too weak. I can likely get around this for play_media by increasing the volume in the .mp3 file, but I can't see a way around it for tts. Any advice is welcome...

Thanks!

P.S. Even without a VOL option, maybe you could just always include a high volume as the default in the ffmpeg conversion: ffmpeg -filter:a "volume=25dB"

P.P.S. Is there an ffmpeg environment variable through which one could set their own ffmpeg options? Or what about the "extra_arguments" in the HA ffmpeg platform variable?
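
For a sense of scale, a decibel value can be converted to a plain multiplier. The sketch below assumes the usual amplitude convention (20 dB per factor of 10), which as far as I know is what ffmpeg's volume filter uses for values written with a dB suffix.

```python
import math


def db_to_gain(db: float) -> float:
    """Linear amplitude multiplier for a gain given in decibels."""
    return 10 ** (db / 20)


def gain_to_db(gain: float) -> float:
    """Gain in decibels for a linear amplitude multiplier."""
    return 20 * math.log10(gain)


print(round(db_to_gain(25), 2))  # 17.78
print(round(gain_to_db(3), 2))   # 9.54
```

So volume=25dB asks for roughly 18 times the original amplitude, which can clip on louder material, while a 3x boost is only about 9.5 dB.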

@tunerooster

I seem to remember seeing " -rtsp_transport tcp" when installing the integration. Could I add -filter:a "volume=25dB" there? I can't find where/how to "reinstall" or "reconfigure" the integration. Do I have to delete it and reinstall to get back to the field which had "-rtsp_transport tcp" in it?

Sorry for the multiple posts...

@tunerooster

I found the "extra_arguments" option in:

.storage/core.config_entries

Should I stop HA, change it there and restart?

{
    "entry_id": "c1ec7feea03a21153635f45a56654e02",
    "version": 1,
    "domain": "yi_hack",
    "title": "yi_hack_a2_3aed03",
    "data": {
        "host": "frontdoor.krautclan.com",
        "port": 8080,
        "username": "root",
        "password": "xxxxxxxxx",
        "extra_arguments": "-rtsp_transport tcp",
        "SERIAL_NUMBER": "00000000000000000000",
        "mac": "xx:xx:xx:xx:xx:xx",
        "PTZ": "no",
        "HACK_NAME": "yi-hack-allwinner-v2",
        "name": "yi_hack_a2_3aed03",
        "RTSP_PORT": "554",
        "MQTT_PREFIX": "frontdoor",
        "TOPIC_BIRTH_WILL": "status",
        "TOPIC_MOTION": "motion",
        "TOPIC_SOUND_DETECTION": "sound_detection",
        "TOPIC_BABY_CRYING": "baby_crying",
        "TOPIC_MOTION_IMAGE": "",
        "MOTION_START_MSG": "motion_start",
        "MOTION_STOP_MSG": "motion_stop",
        "BIRTH_MSG": "online",
        "WILL_MSG": "offline",
        "BABY_CRYING_MSG": "crying",
        "SOUND_DETECTION_MSG": "sound_detected"
    },
    "options": {},
    "system_options": {
        "disable_new_entities": false
    },
    "source": "user",
    "connection_class": "local_poll",
    "unique_id": "xx:xx:xx:xx:xx:xx",
    "disabled_by": null
}

@tunerooster

Still working on this...

I ran a test and using "ps" I was able to see where the yi-hack Home Assistant integration runs ffmpeg as follows:

ffmpeg -i https://mydomain.com/api/tts_proxy/a94a8fe5ccb19ba61c4c0873d391e987982fbbd3_en_-_google_translate.mp3 -f s16le -acodec pcm_s16le -ar 16000 -

However, I do not see the "-rtsp_transport tcp" option in the call to ffmpeg, though it was specified when the integration was initially configured as confirmed in the .storage/core.config_entries file above.

What am I missing here?

Is "extra_arguments" really being used/honored or is this a bug, or am I completely off base?

Thanks for any help!

@tunerooster

tunerooster commented Nov 16, 2021

FYI, I just noticed, in the alpha python code above, the line:

 cmd = ["ffmpeg",  "-i",  in_file_path, "-f", "s16le", "-acodec",  "pcm_s16le", "-ar", "16000", "-y", out_file_path]

makes no reference to "extra_arguments", and shows only the options I see in the "ps" above.

@roleoroleo
Owner

You are right.
I probably forgot to call the variable.
I'll check it.

@tunerooster

tunerooster commented Nov 16, 2021 via email

@tunerooster

tunerooster commented Nov 17, 2021

P.S., I noticed that the Google Home media_players are listed with "Google Cast" integration, whereas the media_player.frontdoor_camera is listed with "Yi Cam with yi-hack" integration. Might it be possible to lift some code from the Google Cast integration?

It looks like, for this to work, the camera has to be "discoverable" as a Chromecast device.

@roleoroleo
Owner

It looks like, for this to work, the camera has to be "discoverable" as a Chromecast device.

I think this isn't a simple task.
But I will study it.

@roleoroleo
Owner

Now I remember.
The extra_arguments parameter is used by the video capture process.
I don't think it's correct to mix those options into this ffmpeg call.
But I added a new parameter, "Boost speaker", that you can check when setting up the integration (default true).
Do you want to try the latest commit?

@tunerooster

tunerooster commented Nov 17, 2021 via email

@vin-w

vin-w commented Nov 17, 2021

I was using

        cmd = ["ffmpeg",  "-i",  media_id, "-f", "s16le", "-acodec",  "pcm_s16le", "-ac", "1",  "-vol", "1024", "-ar", "16000", "-"]
-ac channels: set the number of audio channels
-ac 1 means convert to mono

-vol volume: change the audio volume (256 = normal)
-vol 1024 means 1024/256 = 4 times louder

I only changed this for the MStar camera; for the Allwinner camera the volume was loud enough.

Could we make the audio boost an extra config option, like extra_arguments for video capture?
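
The same arithmetic as a tiny sketch, in case someone wants to translate a -vol value into the volume filter syntax used elsewhere in this thread. Both helper names are invented for the example, and the equivalence between the two options is assumed from the "256 = normal" rule rather than checked against every ffmpeg version.

```python
def vol_to_multiplier(vol: int) -> float:
    """Convert an ffmpeg -vol value (256 = normal) to a plain multiplier."""
    return vol / 256


def vol_to_filter_args(vol: int) -> list[str]:
    """Express the same boost with the audio volume filter."""
    return ["-filter:a", f"volume={vol_to_multiplier(vol):g}"]


print(vol_to_filter_args(1024))  # ['-filter:a', 'volume=4']
```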

@roleoroleo
Owner

Is there a release file somewhere?

It's not a release but you can download the repo.

Could we make the audio boost an extra config option, like extra_arguments for video capture?

Yes, it's an option:
[image: yi-hack configuration window]

@roleoroleo
Owner

But I need help to add the right parameter for all models: mstar, allwinner and allwinner-v2.

@tunerooster

tunerooster commented Nov 19, 2021

As I mentioned above, for Allwinner-v2, I had to use:

ffmpeg -filter:a volume=3

The "3" means 3x the original volume.

However, in my opinion, it would be nice to allow a text field to let the user specify any options needed. The "Boost" check box could be in addition, if the user does not know what ffmpeg arguments to use.

And, if extra_arguments is not used, it should not appear. Is there a reason it is still in the "yi-hack configuration" window shown above?

@tunerooster

tunerooster commented Nov 19, 2021

Not sure this is your problem, but do you know of a way to reconfigure the yi-hack_ha_integration or do I need to delete it and reinstall each time?

And with multiple cameras, do you install it over again, once for each camera?

@tunerooster

tunerooster commented Nov 19, 2021

I think I answered my own questions:

  • Yes, you install it once for each camera.
  • By installing it for camera 2, the new code applies to all existing cameras.
  • Didn't try, but it appears that a fresh install is the same as a reinstall. (If I am wrong on that, someone please correct me.)
  • No need to delete, just install new code under new custom_components/yi_hack directory, then restart HA server.

It appears that the Boost setting applies to all cameras, which works for me. :)

@roleoroleo
Owner

And, if extra_arguments is not used, it should not appear. Is there a reason it is still in the "yi-hack configuration" window shown above?

extra_arguments is used to take the snapshot and to get the stream. I can't remove it.

Not sure this is your problem, but do you know of a way to reconfigure the yi-hack_ha_integration or do I need to delete it and reinstall each time?

No, I think there is no way.

And with multiple cameras, do you install it over again, once for each camera?

Yes, once for each camera.

When you update, you don't need to remove and reinstall the cams.
The new "boost" option defaults to true for all cams installed before the upgrade.

@roleoroleo
Owner

roleoroleo commented Nov 19, 2021

So...
Mstar:
-ac 1 -filter:a volume=4
Allwinner:
nothing
Allwinner-v2:
-ac 1 -filter:a volume=3

The -vol option seems to be deprecated.

What about replacing the boolean option with an integer multiplier?
Boost: x1, x2, x3 and so on
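
To make the proposal concrete, here is one way an integer multiplier could be turned into ffmpeg arguments. It is only a sketch: `BOOST_BY_MODEL` and `tts_ffmpeg_cmd` are names invented for this example, not the integration's actual code, and the base options are copied from the command quoted earlier in the thread.

```python
# Per-model multipliers from the list above; 1 means "nothing" is added.
BOOST_BY_MODEL = {"mstar": 4, "allwinner": 1, "allwinner-v2": 3}


def tts_ffmpeg_cmd(media_id: str, boost: int = 1) -> list[str]:
    """ffmpeg arguments for converting `media_id` to raw audio with a boost."""
    cmd = ["ffmpeg", "-i", media_id,
           "-f", "s16le", "-acodec", "pcm_s16le", "-ar", "16000"]
    if boost > 1:
        # Downmix to mono and multiply the amplitude by the chosen factor.
        cmd += ["-ac", "1", "-filter:a", f"volume={boost}"]
    cmd.append("-")  # write the raw audio to stdout
    return cmd
```

With x1 the command stays exactly as it was before the boost option existed, so existing Allwinner setups would not change.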

@tunerooster

"changing the boolean option with a integer multiplier" would be great! That's all that is needed. Please post when this is implemented and I will download and test it. I downloaded the current code and it works great. Looking forward to the release which lets me set volume=[1234] etc!

Thanks, you have really created a great hack. No other camera hack even comes close!

@tunerooster

Update:

The media_player entity from the "yi-hack_ha_integration" seems to have stopped working since I updated. Entering "test" in the tts screen (see attached) does nothing. I have confirmed the camera and audio are on (i.e., it plays a PCM file through the FIFO fine), but tts plays nothing now. I hope this isn't something I have done wrong.

I do see ffmpeg running when I click "send" as follows:

ffmpeg -i https://mydomain.com/api/tts_proxy/a94a8fe5c...987982fbbd3_en_-_google_translate.mp3 -f s16le -acodec pcm_s16le -ar 16000 -filter:a "volume=25dB" -

but no sound results from the camera.

Can you suggest any other troubleshooting tests?
output.pdf

@roleoroleo
Owner

Do you want to try the latest commit?

@tunerooster

tunerooster commented Nov 19, 2021 via email

@tunerooster

tunerooster commented Nov 20, 2021

Installed the latest commit...

tts.google_say is now working!

However, media_player.play_media is not. From the Developer Tools -> SERVICES menu:

service: media_player.play_media
data:
  media_content_id: /local/audio/hello.mp3
  media_content_type: music
target:
  entity_id:
    - media_player.backdoor_camera

This produces no sound, whereas it works when entity_id is a Google Home Mini.

From DEBUG level output in the HA log I see:

2021-11-19 19:55:35 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection] [139944132205536] Received {'type': 'execute_script', 'sequence': [{'service': 'media_player.play_media', 'data': {'media_content_id': '/local/audio/hello.mp3', 'media_content_type': 'music'}, 'target': {'entity_id': ['media_player.backdoor_camera']}}], 'id': 99}
2021-11-19 19:55:35 INFO (MainThread) [homeassistant.helpers.script.websocket_api_script] websocket_api script: Running websocket_api script
2021-11-19 19:55:35 INFO (MainThread) [homeassistant.helpers.script.websocket_api_script] websocket_api script: Executing step call service

The volume via TTS is now loud and clear (at 3x) and I can see it is correctly set when ffmpeg runs.

Please advise re: media_player.play_media

Thanks as always!

@tunerooster

Hold on...

I just discovered it is working on the frontdoor camera but not the backdoor camera. They are supposed to be identically configured and are the same model.

I deleted and reinstalled the integration for the backdoor camera, but not for the frontdoor. I am not clear where/what changes in a reinstall. I.e., they both are using the same new media_player.py. Should I delete and reinstall the frontdoor camera and see if it breaks it?

@tunerooster

I also noticed this, from .storage/core.config_entries:

                    "host": "backdoor",
                    "boost_speaker": "x 3",
----------------
                    "host": "frontdoor",

I.e., no "boost_speaker" is included in the frontdoor config.

I doubt this is relevant, as it likely only applies to the tts service, but it is the only difference I could find in the configurations. I will reinstall the frontdoor with the x3 boost and see if it causes the frontdoor to quit working via media_player.play_media service.

@tunerooster

OK, I reinstalled the frontdoor with the x3 boost and it is still working, both play_media and tts. So there must be something misconfigured on my backdoor camera. I will keep looking...

Sorry for the premature alarm.

Any ideas/suggestions which may come to mind are welcome. :)

@tunerooster

tunerooster commented Nov 20, 2021

I think I have found the problem...

For the backdoor, in the media_content_id: field, I was using "/local/audio/hello.mp3".

For the frontdoor, I was using "https://mydomain.com/local/audio/hello.mp3"

When I switched to the full URL (with the fully qualified domain name) for the backdoor, it worked.

I was certain that HA appended the domain name (base_url from the configuration.yaml file) when the media_content_id started with a slash (/), but apparently not.

Live and learn... and sorry for the false alarm!

P.S.
Interestingly though, it accepts "/local/audio/hello.mp3" when playing to a Google Home speaker, just not a yi_hack camera, so maybe there is a fix that can be made there...

@roleoroleo
Owner

I think /local/audio (without http) refers to the local file system.
If I download the mp3 file to /tmp and play it with
media_content_id: /tmp/hello.mp3
it works.
It probably depends on how media_id is handled.
In my case it is sent directly to ffmpeg.
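
If the integration were to accept "/local/..." paths the way the Google Cast players do, one possible approach is to expand them to a full URL before handing them to ffmpeg. The sketch below is purely hypothetical: `resolve_media_id` and its `base_url` argument are invented for illustration and do not reflect how the integration currently handles media_id.

```python
from urllib.parse import urljoin, urlparse


def resolve_media_id(media_id: str, base_url: str) -> str:
    """Turn a Home Assistant '/local/...' path into a full URL for ffmpeg."""
    if urlparse(media_id).scheme:
        return media_id  # already a full URL, pass it through unchanged
    if media_id.startswith("/local/"):
        # /local/ is where Home Assistant serves files from its www folder.
        return urljoin(base_url, media_id)
    return media_id  # anything else is left as a filesystem path


print(resolve_media_id("/local/audio/hello.mp3", "https://mydomain.com"))
# https://mydomain.com/local/audio/hello.mp3
```

Paths such as /tmp/hello.mp3 are left alone, so the behaviour described above would keep working.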

@tunerooster

tunerooster commented Nov 20, 2021 via email
