Dahua VTO support #49

Open
luzik opened this issue Sep 19, 2022 · 87 comments
Labels
dahua Dahua cameras, VTO, SIP enhancement New feature or request

Comments

@luzik

luzik commented Sep 19, 2022

The browser asks for microphone permission, but there is no audio from the VTO and the unmute button is disabled.

api:
  base_path: "/go2rtc"
streams:
  vto:
    - rtsp://admin:[email protected]/cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif
[
  {
    "media:0": "video, sendonly, 96 H264/90000",
    "media:1": "audio, sendonly, 97 L16/16000",
    "media:2": "application, sendonly, 107 VND.ONVIF.METADATA/90000",
    "media:3": "audio, recvonly, 97 L16/16000",
    "receive": 824840,
    "remote_addr": "192.168.124.30:554",
    "send": 0,
    "track:0": "96 H264/90000, sinks=1",
    "type": "RTSP client producer",
    "url": "rtsp://192.168.124.30:554/cam/realmonitor?channel=1\u0026subtype=0\u0026unicast=true\u0026proto=Onvif/"
  },
  {
    "remote_addr": "udp4 host 192.168.123.33:63426",
    "send": 826301,
    "type": "WebRTC server consumer",
    "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.4976.0 Safari/537.36"
  }
]
v=0
o=- 8819968399131000794 2 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE 0 1
a=extmap-allow-mixed
a=msid-semantic: WMS mUTzunk2m8gOBSbMpIqZr9ZW7amqcT2Uu8h0
m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 102 123 35 36 127 122 125 107 108 109 124 121 120 119 114 37
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:cSCj
a=ice-pwd:azQ5DYFApCx3rzPQWvEqknTA
a=ice-options:trickle
a=fingerprint:sha-256 2A:0A:DC:2B:B9:B5:CC:A3:07:E0:98:E7:B3:65:7A:2B:98:5E:AA:AE:0E:76:DB:71:3F:52:3D:E8:55:2B:F8:F7
a=setup:actpass
a=mid:0
a=extmap:1 urn:ietf:params:rtp-hdrext:toffset
a=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:3 urn:3gpp:video-orientation
a=extmap:4 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=extmap:5 http://www.webrtc.org/experiments/rtp-hdrext/playout-delay
a=extmap:6 http://www.webrtc.org/experiments/rtp-hdrext/video-content-type
a=extmap:7 http://www.webrtc.org/experiments/rtp-hdrext/video-timing
a=extmap:8 http://www.webrtc.org/experiments/rtp-hdrext/color-space
a=extmap:9 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:10 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=extmap:11 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
a=recvonly
a=rtcp-mux
a=rtcp-rsize
a=rtpmap:96 VP8/90000
a=rtcp-fb:96 goog-remb
a=rtcp-fb:96 transport-cc
a=rtcp-fb:96 ccm fir
a=rtcp-fb:96 nack
a=rtcp-fb:96 nack pli
a=rtpmap:97 rtx/90000
a=fmtp:97 apt=96
a=rtpmap:98 VP9/90000
a=rtcp-fb:98 goog-remb
a=rtcp-fb:98 transport-cc
a=rtcp-fb:98 ccm fir
a=rtcp-fb:98 nack
a=rtcp-fb:98 nack pli
a=fmtp:98 profile-id=0
a=rtpmap:99 rtx/90000
a=fmtp:99 apt=98
a=rtpmap:100 VP9/90000
a=rtcp-fb:100 goog-remb
a=rtcp-fb:100 transport-cc
a=rtcp-fb:100 ccm fir
a=rtcp-fb:100 nack
a=rtcp-fb:100 nack pli
a=fmtp:100 profile-id=2
a=rtpmap:101 rtx/90000
a=fmtp:101 apt=100
a=rtpmap:102 VP9/90000
a=rtcp-fb:102 goog-remb
a=rtcp-fb:102 transport-cc
a=rtcp-fb:102 ccm fir
a=rtcp-fb:102 nack
a=rtcp-fb:102 nack pli
a=fmtp:102 profile-id=1
a=rtpmap:123 rtx/90000
a=fmtp:123 apt=102
a=rtpmap:35 AV1/90000
a=rtcp-fb:35 goog-remb
a=rtcp-fb:35 transport-cc
a=rtcp-fb:35 ccm fir
a=rtcp-fb:35 nack
a=rtcp-fb:35 nack pli
a=rtpmap:36 rtx/90000
a=fmtp:36 apt=35
a=rtpmap:127 H264/90000
a=rtcp-fb:127 goog-remb
a=rtcp-fb:127 transport-cc
a=rtcp-fb:127 ccm fir
a=rtcp-fb:127 nack
a=rtcp-fb:127 nack pli
a=fmtp:127 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42001f
a=rtpmap:122 rtx/90000
a=fmtp:122 apt=127
a=rtpmap:125 H264/90000
a=rtcp-fb:125 goog-remb
a=rtcp-fb:125 transport-cc
a=rtcp-fb:125 ccm fir
a=rtcp-fb:125 nack
a=rtcp-fb:125 nack pli
a=fmtp:125 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=4d001f
a=rtpmap:107 rtx/90000
a=fmtp:107 apt=125
a=rtpmap:108 H264/90000
a=rtcp-fb:108 goog-remb
a=rtcp-fb:108 transport-cc
a=rtcp-fb:108 ccm fir
a=rtcp-fb:108 nack
a=rtcp-fb:108 nack pli
a=fmtp:108 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=64001f
a=rtpmap:109 rtx/90000
a=fmtp:109 apt=108
a=rtpmap:124 H264/90000
a=rtcp-fb:124 goog-remb
a=rtcp-fb:124 transport-cc
a=rtcp-fb:124 ccm fir
a=rtcp-fb:124 nack
a=rtcp-fb:124 nack pli
a=fmtp:124 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42e01f
a=rtpmap:121 rtx/90000
a=fmtp:121 apt=124
a=rtpmap:120 red/90000
a=rtpmap:119 rtx/90000
a=fmtp:119 apt=120
a=rtpmap:114 ulpfec/90000
a=rtpmap:37 flexfec-03/90000
a=rtcp-fb:37 goog-remb
a=rtcp-fb:37 transport-cc
a=fmtp:37 repair-window=10000000
m=audio 9 UDP/TLS/RTP/SAVPF 111 63 103 104 9 0 8 106 105 13 110 112 113 126
c=IN IP4 0.0.0.0
a=rtcp:9 IN IP4 0.0.0.0
a=ice-ufrag:cSCj
a=ice-pwd:azQ5DYFApCx3rzPQWvEqknTA
a=ice-options:trickle
a=fingerprint:sha-256 2A:0A:DC:2B:B9:B5:CC:A3:07:E0:98:E7:B3:65:7A:2B:98:5E:AA:AE:0E:76:DB:71:3F:52:3D:E8:55:2B:F8:F7
a=setup:actpass
a=mid:1
a=extmap:14 urn:ietf:params:rtp-hdrext:ssrc-audio-level
a=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:4 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=extmap:9 urn:ietf:params:rtp-hdrext:sdes:mid
a=sendrecv
a=msid:mUTzunk2m8gOBSbMpIqZr9ZW7amqcT2Uu8h0 9654a211-5d7a-4500-8034-3b17b1c6c7a7
a=rtcp-mux
a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
a=fmtp:111 minptime=10;useinbandfec=1
a=rtpmap:63 red/48000/2
a=fmtp:63 111/111
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:106 CN/32000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:110 telephone-event/48000
a=rtpmap:112 telephone-event/32000
a=rtpmap:113 telephone-event/16000
a=rtpmap:126 telephone-event/8000
a=ssrc:1443778823 cname:5sgtTCGio9Mf4sP8
a=ssrc:1443778823 msid:mUTzunk2m8gOBSbMpIqZr9ZW7amqcT2Uu8h0 9654a211-5d7a-4500-8034-3b17b1c6c7a7
a=ssrc:1443778823 mslabel:mUTzunk2m8gOBSbMpIqZr9ZW7amqcT2Uu8h0
a=ssrc:1443778823 label:9654a211-5d7a-4500-8034-3b17b1c6c7a7

6webrtc.html:1 Uncaught (in promise) DOMException: Failed to execute 'addIceCandidate' on 'RTCPeerConnection': The remote description was null
@calisro

calisro commented Sep 19, 2022

The camera is outputting an audio format that is incompatible with WebRTC, probably AAC. Either transcode it to something WebRTC-friendly or go into your camera settings and change it to a compatible format. If it's like my Amcrest, you can change it to G.711 so you don't need to transcode.

Alternatively, this would work:

  vto:
    - rtsp://admin:[email protected]/cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif
    - ffmpeg:rtsp://admin:[email protected]/cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif#audio=opus

https://developer.mozilla.org/en-US/docs/Web/Media/Formats/WebRTC_codecs#supported_audio_codecs

@AlexxIT AlexxIT added the question Further information is requested label Sep 20, 2022
@AlexxIT
Owner

AlexxIT commented Sep 20, 2022

Both the send and receive tracks use the L16 codec.

L16 is not AAC, but it is also not supported by WebRTC. It's possible to receive audio by transcoding via FFmpeg.

But sending audio (from the microphone) with transcoding is not supported yet.
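
A quick way to check a track against this constraint is a small lookup. This is only a sketch: the function name `is_webrtc_audio_codec` is made up for illustration, and the accepted list is taken from the Chrome SDP offer posted above (opus, G722, PCMU, PCMA), so other browsers may differ.

```bash
#!/bin/bash
# Hypothetical helper: succeeds when a codec string such as "PCMU/8000"
# names an audio codec that appears in the Chrome SDP offer in this thread.
is_webrtc_audio_codec() {
  local name
  # Strip the "/clockrate" suffix and upper-case the name for matching.
  name=$(printf '%s' "${1%%/*}" | tr '[:lower:]' '[:upper:]')
  case "$name" in
    OPUS|G722|PCMU|PCMA) return 0 ;;
    *) return 1 ;;
  esac
}

for codec in "L16/16000" "PCMU/8000" "opus/48000/2"; do
  if is_webrtc_audio_codec "$codec"; then
    echo "$codec: can be passed to the browser as-is"
  else
    echo "$codec: needs transcoding or a codec change on the device"
  fi
done
```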

@luzik
Author

luzik commented Sep 20, 2022

So the Dahua VTO2211G does not allow changing the audio codec via the browser UI. Maybe there is a way using ONVIF, a GET parameter, or the Dahua API?

The documentation shows that the device supports
G.711Mu; G.711u; PCM audio compression
https://www.dahuasecurity.com/my/products/All-Products/Video-Intercoms/IP-Series/Villa-Door-Station/Pro-Series/VTO2211G-WP

VLC shows it as PCM S16 BE (s16b)
and ffprobe as
Stream #0:1: Audio: pcm_s16be, 16000 Hz, 1 channels, s16, 256 kb/s

The VTO does accept audio via the API in u-law. I believe the VTO supports G.711 when using the SIP protocol.

@AlexxIT
Owner

AlexxIT commented Sep 20, 2022

Are you sure you can't change the codec via the camera Web UI?

image

@luzik
Author

luzik commented Sep 20, 2022

Screenshot 2022-09-20 at 19:02:29

Screenshot 2022-09-20 at 19:02:45

@luzik
Author

luzik commented Sep 20, 2022

Hmm, interesting... there is a way to change the audio codec using ONVIF:

blakeblackshear/frigate#2572

@calisro

calisro commented Sep 20, 2022

There's likely a way to change it. I can't change it with Amcrest tools but I could change mine with this:

https://dahuawiki.com/ConfigTool

@luzik
Author

luzik commented Sep 20, 2022

So yeah, it's working. I used this Happytime ONVIF client and I hope these settings will stay forever. If not, I will reopen this issue and ask for help with automating this change.

[
{
"media:0": "video, sendonly, 96 H264/90000",
"media:1": "audio, sendonly, 0 PCMU/8000",
"media:2": "application, sendonly, 107 VND.ONVIF.METADATA/90000",
"media:3": "audio, recvonly, 0 PCMU/8000",
"receive": 8279516,
"remote_addr": "192.168.124.30:554",
"send": 993960,
"track:0": "0 PCMU/8000, sinks=1",
"track:1": "0 PCMU/8000, sinks=1",
"type": "RTSP client producer",
"url": "rtsp://192.168.124.30:554/cam/realmonitor?channel=1\u0026subtype=0\u0026unicast=true\u0026proto=Onvif/"
},
{
"receive": 883680,
"remote_addr": "udp4 host 192.168.123.33:61346",
"send": 8292693,
"type": "WebRTC server consumer",
"user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15"
}
]

@luzik luzik closed this as completed Sep 20, 2022
@felipecrs
Contributor

@luzik is 2-way audio working for you?

@luzik
Author

luzik commented Sep 21, 2022

Yeah, latency is low and we are ready to build a video doorbell intercom. The only problem is switching the audio codec programmatically. I believe that this could be in the scope of this project :)

@bonuzzz

bonuzzz commented Sep 29, 2022

It is possible to change the parameters without an ONVIF client.
To get the available audio/video parameters:
http://admin:[email protected]/cgi-bin/configManager.cgi?action=getConfig&name=Encode
For example, to change the audio bitrate to 8 kHz:
http://admin:[email protected]/cgi-bin/configManager.cgi?action=setConfig&Encode[0].MainFormat[0].Audio.Depth=8

I have a VTO2211G. It should work on other models.
P.S.: the microphone in Chrome doesn't work without an HTTPS connection.
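
To read a single setting out of the getConfig response, a small filter can help. This is a rough sketch that assumes the device answers with one `table.…=value` line per setting (possibly with CRLF line endings); the function name `dahua_config_value` and the sample data are illustrative, not taken from a real response.

```bash
#!/bin/bash
# Hypothetical helper: read "key=value" lines from stdin and print the
# value of the first line whose key matches exactly.
dahua_config_value() {
  local key="$1"
  # Strip carriage returns, then compare the key literally (no regex),
  # because the keys contain characters like "[" and ".".
  tr -d '\r' | awk -v key="$key" '
    !found && index($0, key "=") == 1 {
      print substr($0, length(key) + 2)
      found = 1
    }
  '
}

# Offline demonstration with made-up sample data.
sample='table.Encode[0].MainFormat[0].Audio.Depth=16
table.Encode[0].MainFormat[0].Audio.Frequency=16000'
printf '%s\n' "$sample" | dahua_config_value "table.Encode[0].MainFormat[0].Audio.Frequency"

# Against a real device (address and credentials are placeholders):
# curl -s --digest -u "admin:pass" -g \
#   "http://192.168.1.110/cgi-bin/configManager.cgi?action=getConfig&name=Encode" \
#   | dahua_config_value "table.Encode[0].MainFormat[0].Audio.Frequency"
```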

@luzik
Author

luzik commented Sep 29, 2022

That's fantastic news for me, because my VTO2211G went back to its default, "L16/16000".
Could go2rtc support this behavior? I mean, if the camera returns L16, perform this action and try to reconnect.

@luzik luzik reopened this Sep 29, 2022
@felipecrs
Contributor

This approach is a little intrusive, perhaps some kind of automatic codec conversion would be a little better. Refs:

@luzik
Author

luzik commented Sep 29, 2022

Yeah, you are right, but since conversion uses CPU, maybe "force_vto_codec=true" is the way to go?

@felipecrs
Contributor

I think the ideal solution would be to:

  1. Report the issue to Dahua, ask them to either make the audio codec selectable from the admin page or prevent it from reverting back to the original codec after some time

But I do understand that it's very unlikely that they would provide any support. Another option would be to look for alternative firmwares, like OpenIPC (which may be already compatible).

That's because, supposedly, this is a camera-specific requirement (other cameras/doorbells allow you to set the codec from their UI).


force_vto_codec=true

If I'm not mistaken, this is an ONVIF configuration (not something specific to VTO), which could potentially mean it would work for other cameras, even from other manufacturers. Maybe a better name would be try_adjusting_onvif_audio_codec=true.

@felipecrs
Contributor

Since the VTO seems to retain the audio codec configuration for at least some days, a workaround for now is to create a script that fixes the codec and configure it to run automatically every day at midnight in Home Assistant, for example.

@bonuzzz

bonuzzz commented Sep 29, 2022

To change the bitrate from the console:
curl --digest -u "admin:pass" -g "http://192.168.1.110/cgi-bin/configManager.cgi?action=setConfig&Encode[0].MainFormat[0].Audio.Depth=8"
@felipecrs Dahua stops the stream when you try to change the bitrate, but continues streaming if the bitrate is already the same. So I think it's better to execute the script before starting the stream in go2rtc.
I tried something like:

 vto:
    - exec:curl --digest -u "admin:pass" -g "http://192.168.1.110/cgi-bin/configManager.cgi?action=setConfig&Encode[0].MainFormat[0].Audio.Depth=8" 
    - rtsp://admin:[email protected]/cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif
vto:
  - exec:curl --digest -u "admin:pass" -g "http://192.168.1.110/cgi-bin/configManager.cgi?action=setConfig&Encode[0].MainFormat[0].Audio.Depth=8";ffmpeg -i rtsp://admin:[email protected]/cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif -vcodec copy -c copy -rtsp_transport tcp -f rtsp {output}

@AlexxIT
Owner

AlexxIT commented Sep 30, 2022

You can use the echo source. This is a bash script that should print (echo) the link to the stream, so you can change the codec there.

https://github.com/AlexxIT/go2rtc#source-echo

@felipecrs
Contributor

felipecrs commented Sep 30, 2022

Wow. This is brilliant. The possibilities are endless!

@bonuzzz

bonuzzz commented Sep 30, 2022

I wrote a simple script to check the bitrate parameter before starting the stream. I included a sleep to give the Dahua time to restart its RTSP stream.

#!/bin/bash
# Fetch the current encoder configuration from the VTO.
config=$(curl -s --digest -u "admin:pass" -g "http://192.168.1.110/cgi-bin/configManager.cgi?action=getConfig&name=Encode")
value="table.Encode[0].MainFormat[0].Audio.Depth=16"

# Change the setting only if the unwanted value is present,
# then wait for the VTO to restart its RTSP stream.
if [[ "$config" == *"$value"* ]]; then
  curl -s --digest -u "admin:pass" -g "http://192.168.1.110/cgi-bin/configManager.cgi?action=setConfig&Encode[0].MainFormat[0].Audio.Depth=8"
  sleep 2
fi

@bonuzzz

bonuzzz commented Sep 30, 2022

@AlexxIT if I add several strings to source block like

vto:
  - echo: ....
  - ffmpeg: ....

are they executed sequentially, or is it a bad idea?

@felipecrs
Contributor

felipecrs commented Sep 30, 2022

@bonuzzz you need to convert them all to echo. For example, your first echo should return the rtsp stream string, your second one should return the ffmpeg stream string. (I think)

@felipecrs
Contributor

By the way, I don't know if my VTO is different than yours or not, but the setting I need to adjust is:

table.Encode[0].MainFormat[0].Audio.Frequency=16000

Instead of Depth=16

@felipecrs
Contributor

felipecrs commented Sep 30, 2022

This seems to be working well for me. @luzik I suggest you try it too.

  1. Create the file /config/scripts/get_vto_stream.sh
  2. Place the following contents in it, replacing the host and password with your values:
#!/bin/bash

creds="admin:pass"
host="192.168.1.40"

# Attempt to change sampling rate. It won't harm if it's already set.
curl --silent --digest --user "${creds}" --globoff \
    "http://${host}/cgi-bin/configManager.cgi?action=setConfig&Encode[0].MainFormat[0].Audio.Frequency=8000" >&2

echo "rtsp://${creds}@${host}/cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif"
  3. Give it execution permission: chmod +x /config/scripts/get_vto_stream.sh
  4. Set your go2rtc.yaml:
streams:
  vto:
    - echo:/config/scripts/get_vto_stream.sh

@bonuzzz

bonuzzz commented Sep 30, 2022

@felipecrs

By the way, I don't know if my VTO is different than yours or not, but the setting I need to adjust is:

table.Encode[0].MainFormat[0].Audio.Frequency=16000

Instead of Depth=16

Of course, frequency. Thanks. I didn't always get the mic working in Chrome, so I tested several options besides frequency.
I have the VTO2211G model.
I advise adding a sleep command to your script between switching the frequency and starting the stream, because the Dahua does not always finish restarting its own RTSP stream within the 60 seconds that go2rtc waits for it.
My script also checks the current parameter.
P.S.: 2 seconds is just a sample there; it may need more.
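
Instead of a fixed sleep, the wait could be expressed as a bounded retry loop. The sketch below is an assumption-laden illustration: `retry` and `rtsp_port_open` are hypothetical names, and an open TCP port 554 is only a rough proxy for "the RTSP stream has restarted".

```bash
#!/bin/bash
# Hypothetical helper: run a command until it succeeds, up to a maximum
# number of attempts, sleeping between attempts.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Assumed probe: succeeds once something accepts TCP connections on port 554.
# /dev/tcp is a bash feature, not a real file; the subshell closes the
# descriptor again when it exits.
rtsp_port_open() {
  (exec 3<>"/dev/tcp/$1/554") 2>/dev/null
}

# Example (placeholder address): try 5 times, 2 seconds apart.
# retry 5 2 rtsp_port_open "192.168.1.110"
```

Note that the probe itself can block for a while if the host does not answer at all, so the total wait may be longer than attempts times delay.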

@AlexxIT
Owner

AlexxIT commented Oct 15, 2023

Info about the Dahua URL was in the very first docs:
https://github.com/AlexxIT/go2rtc/tree/v0.1-alpha.1#rtsp-source

@oNaiPs

oNaiPs commented Oct 15, 2023

@oNaiPs can you please clarify how did you change the backchannel codec? I was only able to change the stream codec so far.

@felipecrs check below, you just need to call the following URLs (in your browser, for example):

// change to L16/16000
http://admin:pass@ip_addr/cgi-bin/configManager.cgi?action=setConfig&Encode[0].MainFormat[0].Audio.Compression=PCM&Encode[0].MainFormat[0].Audio.Frequency=16000&Encode[0].MainFormat[0].Audio.Depth=16&Encode[0].MainFormat[0].Audio.Bitrate=64

//change to PCMA/8000
http://admin:pass@ip_addr/cgi-bin/configManager.cgi?action=setConfig&Encode[0].MainFormat[0].Audio.Compression=G.711A&Encode[0].MainFormat[0].Audio.Frequency=8000&Encode[0].MainFormat[0].Audio.Depth=16&Encode[0].MainFormat[0].Audio.Bitrate=64

//get available codecs
http://admin:pass@ip_addr/cgi-bin/encode.cgi?action=getConfigCaps

//get current setting
http://admin:pass@ip_addr/cgi-bin/configManager.cgi?action=getConfig&name=Encode[0].MainFormat[0].Audio
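
Since the two setConfig URLs differ only in their values, they can be generated from the four fields. This is just a sketch: the function name `audio_setconfig_url` and the host address are placeholders, and the parameter names are copied from the URLs above.

```bash
#!/bin/bash
# Hypothetical helper: build the setConfig URL for the main-stream audio
# encoder from the four fields used in the URLs above.
# Usage: audio_setconfig_url <host> <compression> <frequency> <depth> <bitrate>
audio_setconfig_url() {
  local host="$1" prefix="Encode[0].MainFormat[0].Audio"
  local query="action=setConfig"
  query+="&${prefix}.Compression=$2"
  query+="&${prefix}.Frequency=$3"
  query+="&${prefix}.Depth=$4"
  query+="&${prefix}.Bitrate=$5"
  echo "http://${host}/cgi-bin/configManager.cgi?${query}"
}

# PCMA/8000, same values as the second URL above (host is a placeholder).
audio_setconfig_url "192.168.1.110" "G.711A" 8000 16 64

# To apply it, pass the URL to curl with digest auth and globbing disabled
# (the square brackets would otherwise be treated as a URL range):
# curl --silent --digest --globoff --user "admin:pass" \
#   "$(audio_setconfig_url 192.168.1.110 G.711A 8000 16 64)"
```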

@felipecrs
Contributor

Hm... that's funny. When I call such endpoints Encode[0].MainFormat[0], the codec from the backchannel does not change. Only the regular audio track from the main channel is changed.

@oNaiPs

oNaiPs commented Oct 15, 2023

Hm... that's funny. When I call such endpoints Encode[0].MainFormat[0], the codec from the backchannel does not change. Only the regular audio track from the main channel is changed.

@felipecrs apologies, I just noticed the URLs were wrong. Can you check again? (edited above)

@felipecrs
Contributor

Well, the only difference between you and me is that I set only Compression and Frequency, you also change Depth and Bitrate.

Anyway. Let's see if things change for you after the newest firmware.

@oNaiPs

oNaiPs commented Oct 15, 2023

@felipecrs I can confirm that you need to change all of those to make a real change in the backchannel codec (I was trying to simplify before writing the post, and that's why it got broken).
One thing I know is that for 2-way audio to work, the "producer" (RTSP camera) needs to have a "sender" defined, and the "consumer" (WebRTC) needs to have a "receiver" defined. I could only get this to work when my backchannel codec is defined as either PCMA/8000 or PCMU/8000.

I did the firmware upgrade. 2-way audio seems to be working, but I didn't confirm with the monitor (VTH5321GB), which I also upgraded, since it's late here. However, it looks promising, since there's no more garbled sound coming from it (that's the problem I was having after changing the codec in the VTO).

@oNaiPs

oNaiPs commented Oct 15, 2023

@felipecrs I'm curious, what is the second channel for and why do you change it to AAC?

@felipecrs
Contributor

Oh, that's not necessary for you if you don't use Frigate.

Frigate needs an AAC audio track for recordings.

Adjusting the substream to output in AAC prevents me from having to do the conversion with ffmpeg and thus saves me some good CPU cycles.

@patykolak

Hello,

Can you send all of your config files? Like:
go2rtc.yaml
frigate
frigate card 2way audio

@sebastian-bartkowiak

sebastian-bartkowiak commented Nov 13, 2023

Hi everyone - I have a Dahua VTO2311R-WP doorbell and I'm struggling to achieve a solution like:

  • doorbell stream connected to Frigate for recording (ideally with sound)
  • 2-way audio with a video card that opens when someone calls at the door (it would be opened on my wall-mounted Android tablet)
  • retain Dahua's app (DMSS) support for answering calls from the VTO (I have that app installed on my phone, so I can answer the doorbell when I'm outside the house)

Is that configuration possible to achieve? I'm usually quite good at configuring stuff like that, but to be honest the learning curve is a bit too steep here for me, and I feel a bit lost 😄 Currently features 1 and 3 (Frigate and DMSS) are up and running in my setup, so it's down to implementing feature 2 without breaking the behavior of the other ones 😛

Currently I have go2rtc version 1.7.1 installed with Frigate. My go2rtc config is:

streams:
    domofon:
      - rtsp://login:[email protected]/cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif

and the info from VTO:

{
    "producers": [
        {
            "type": "RTSP active producer",
            "url": "rtsp://192.168.1.202:554/cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif/",
            "remote_addr": "192.168.1.202:554",
            "user_agent": "go2rtc/1.7.1",
            "sdp": "v=0\r\no=- 2255458062 2255458062 IN IP4 0.0.0.0\r\ns=Media Server\r\nc=IN IP4 0.0.0.0\r\nt=0 0\r\na=control:*\r\na=packetization-supported:DH\r\na=rtppayload-supported:DH\r\na=range:npt=now-\r\na=x-packetization-supported:IV\r\na=x-rtppayload-supported:IV\r\nm=video 0 RTP/AVP 98\r\na=control:trackID=0\r\na=framerate:25.000000\r\na=rtpmap:98 H265/90000\r\na=fmtp:98 profile-id=1;sprop-sps=QgEBAWAAAAMAAAMAAAMAAAMAlqACgIAtFja5JMmuWMAgAAB9IAAMOCE=;sprop-pps=RAHgdrAmQA==;sprop-vps=QAEMAf//AWAAAAMAAAMAAAMAAAMAlqwJ\r\na=recvonly\r\nm=audio 0 RTP/AVP 97\r\na=control:trackID=1\r\na=rtpmap:97 L16/16000\r\na=recvonly\r\nm=application 0 RTP/AVP 107\r\na=control:trackID=4\r\na=rtpmap:107 vnd.onvif.metadata/90000\r\na=recvonly\r\nm=audio 0 RTP/AVP 97\r\na=control:trackID=5\r\na=rtpmap:97 L16/16000\r\na=sendonly\r\n",
            "medias": [
                "video, recvonly, H265",
                "audio, recvonly, L16/16000",
                "application, recvonly, VND.ONVIF.METADATA",
                "audio, sendonly, L16/16000"
            ],
            "receivers": [
                "97 L16/16000, bytes=145352960, senders=2",
                "98 H265, bytes=496738553, senders=0"
            ],
            "recv": 648398305
        },
        {
            "type": "WebRTC/WebSocket async passive producer",
            "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"
        }
    ],
    "consumers": [
        {
            "type": "WebRTC/WebSocket async passive consumer",
            "remote_addr": "udp4 srflx 37.30.28.38:34891 related 0.0.0.0:0",
            "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36",
            "medias": [
                "video, sendonly, VP8, RTX, VP9, H264, AV1, RED, ULPFEC, FLEXFEC-03",
                "audio, sendonly, OPUS/48000/2, RED/48000/2, G722/8000, PCMU/8000, PCMA/8000, CN/8000, TELEPHONE-EVENT/48000, TELEPHONE-EVENT/8000, PCML"
            ],
            "senders": [
                "8 PCMA/8000, bytes=144560640, receivers=1"
            ],
            "send": 30267384
        },
        {
            "type": "WebRTC/WebSocket async passive consumer",
            "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36",
            "medias": [
                "video, sendonly, VP8, RTX, VP9, H264, AV1, RED, ULPFEC, FLEXFEC-03",
                "audio, sendonly, OPUS/48000/2, RED/48000/2, G722/8000, PCMU/8000, PCMA/8000, CN/8000, TELEPHONE-EVENT/48000, TELEPHONE-EVENT/8000, PCML"
            ],
            "senders": [
                "8 PCMA/8000, bytes=202240, receivers=1"
            ],
            "send": 42344
        }
    ]
}

As far as I understand I have to change the audio codec?
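
To see at a glance which codec each audio section of such an SDP uses, including the backchannel (the `sendonly` audio section here), the SDP text can be summarized with a short filter. This is a sketch only: `sdp_audio_codecs` is a made-up name, and it assumes the SDP has been saved as plain text with real line breaks rather than the escaped `\r\n` shown in the JSON.

```bash
#!/bin/bash
# Hypothetical helper: read an SDP from stdin and print one line per audio
# media section in the form "<direction> <codec>", e.g. "sendonly L16/16000".
sdp_audio_codecs() {
  tr -d '\r' | awk '
    # Print the section collected so far, if it was an audio section.
    function emit() { if (in_audio) print dir, codec }
    /^m=/ { emit(); in_audio = ($0 ~ /^m=audio /); dir = ""; codec = "" }
    in_audio && /^a=rtpmap:/ { sub(/^a=rtpmap:[0-9]+ /, ""); codec = $0 }
    in_audio && /^a=(sendonly|recvonly|sendrecv)$/ { sub(/^a=/, ""); dir = $0 }
    END { emit() }
  '
}

# Offline demonstration with the audio-related lines from the SDP above.
printf '%s\r\n' \
  'v=0' \
  'm=video 0 RTP/AVP 98' 'a=rtpmap:98 H265/90000' 'a=recvonly' \
  'm=audio 0 RTP/AVP 97' 'a=rtpmap:97 L16/16000' 'a=recvonly' \
  'm=audio 0 RTP/AVP 97' 'a=rtpmap:97 L16/16000' 'a=sendonly' \
  | sdp_audio_codecs
```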

@miguelangel-nubla
Contributor

miguelangel-nubla commented Nov 27, 2023

Working config for DHI-VTO2202F-P-S2 with 2 way audio on HD or SD, and SIP with 2 way audio+video:

If you use the app for the VTO, these settings will be changed on demand and the go2rtc streams will break.

Note the script uses different credentials for the configuration and for the stream.
Admin credentials are the ones for the WebUI administrator account.
Onvif credentials: WebUI->Local>Onvif User->Add

#!/bin/bash
set -euo pipefail

readonly admincreds='adminuser:adminpass'
readonly onvifcreds='onvifuser:onvifpass'
readonly host='192.168.0.123'
readonly path="${1}"

params=""
params+="&Encode[0].MainFormat[0].Audio.Compression=G.711A" # Chrome WebRTC backchannel codec restrictions.
params+="&Encode[0].MainFormat[0].Audio.Frequency=8000" # Chrome WebRTC backchannel codec restrictions.
params+="&Encode[0].MainFormat[0].Audio.Bitrate=64"
params+="&Encode[0].ExtraFormat[0].Audio.Compression=AAC" # For use with Frigate
params+="&Encode[0].ExtraFormat[0].Audio.Frequency=16000"
params+="&Encode[0].ExtraFormat[0].Audio.Bitrate=64"

params+="&Encode[0].MainFormat[0].Video.GOP=4" # Optional, read considerations

curl --fail --silent --show-error --digest --globoff --user "${admincreds}" \
  "http://${host}/cgi-bin/configManager.cgi?action=setConfig${params}" >&2

echo "rtsp://${onvifcreds}@${host}${path}"
streams:
  vto_sd_2wayaudio:
    # MainFormat: PCMA/8000, 2-way audio, H264
    - echo:/config/vto_setup_stream.sh /cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif
  vto_hd_2wayaudio:
    # ExtraFormat: H264. Must be first to be served to the client by default.
    - echo:/config/vto_setup_stream.sh /cam/realmonitor?channel=1&subtype=1&unicast=true&proto=Onvif#media=video#backchannel=0
    # MainFormat: PCMA/8000, 2-way audio, H264. Currently no way of disabling video stream while keeping 2 way audio?
    - echo:/config/vto_setup_stream.sh /cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif
  vto_hd_frigate:
    # ExtraFormat: MPEG4-GENERIC/16000, H264
    - echo:/config/vto_setup_stream.sh /cam/realmonitor?channel=1&subtype=1&unicast=true&proto=Onvif#media=audio+video#backchannel=0

You can of course reuse stream connections, but keep in mind backchannel will not be redirected.

My personal configuration

streams:
  vto:
    # ExtraFormat: MPEG4-GENERIC/16000 (AAC), H264. Frigate without transcoding.
    - echo:/config/vto_setup_stream.sh /cam/realmonitor?channel=1&subtype=1&unicast=true&proto=Onvif#media=audio+video#backchannel=0
    # Conversion from best source (AAC) to opus for WebRTC support.
    - ffmpeg:vto#audio=opus#async
    # MainFormat: PCMA/8000, 2-way audio, H264. 2 way audio last resort, so regular WebRTC or Frigate client does not block backchannel for other 2way audio client
    - echo:/config/vto_setup_stream.sh /cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif
  vto_fast:
    # MainFormat: PCMA/8000, 2-way audio, H264
    - echo:/config/vto_setup_stream.sh /cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif

Considerations:

In this camera MainStream is intended for low quality real time and ExtraStream (SubStream in the WebUI) for HD recording.

Using 2 way audio:

Only MainStream or ExtraStream can receive an audio stream at a given time.

If you open a stream to MainStream and another one to ExtraStream, the second one will not have the audio backchannel, and you will have to both close the first one and reconnect the second one to switch.
You can, however, have multiple 2 way audio clients on the same stream, but instead of mixing the audio at the VTO speaker as you would expect, audio is played sequentially by chunks and independently.
Delay starts accumulating and it is unusable until all the sent audio finishes playing, which can take some time even after you close the stream.

SIP is like an extra connection using MainStream; the same quirks apply.
Conclusion: only one audio sender at any given time. For regular audio/video receiving this does not apply.

If you use Frigate, make sure it is not using your 2 way audio channel with #media=audio+video#backchannel=0, as shown in the example

GOP Setting

GOP is the maximum interval between complete frames; lowering it means you receive one sooner and the initial video load is faster. This matters most if you use a low fps (like 4) with a standard GOP like 50: it can take up to ~50/4=12.5s for the video to start. The drawback is that it sends more data more often.
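
The arithmetic can be checked for other settings with a one-line helper. This is a minimal sketch that assumes the GOP value is a frame count, as the example above implies; `gop_max_wait` is a made-up name.

```bash
#!/bin/bash
# Worst-case wait for the first complete frame, in seconds:
# GOP length (frames) divided by the frame rate (frames per second).
gop_max_wait() {
  awk -v gop="$1" -v fps="$2" 'BEGIN { printf "%.2f\n", gop / fps }'
}

gop_max_wait 50 4   # the standard-GOP example above
gop_max_wait 4 4    # GOP=4 as set in the script, assuming the same 4 fps
```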

L16 Codec = PCM

Currently not supported by go2rtc for backchannel.
It does show up on go2rtc via onvif, but http://admin:pass@ip_addr/cgi-bin/encode.cgi?action=getConfigCaps does not list it as supported, strange.
PCMA also achieves go2rtc 2 way audio + SIP and it is way less bandwidth intensive.
Maybe using it will solve the 2 way audio quirks mentioned before?
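
For a rough sense of the bandwidth difference, the raw payload bitrate follows from sample rate times bits per sample. The sketch below ignores RTP/UDP overhead, and `audio_kbps` is a made-up name.

```bash
#!/bin/bash
# Payload bitrate in kbit/s:
# sample rate (Hz) x bits per sample x channels (default 1) / 1000.
audio_kbps() {
  echo $(( $1 * $2 * ${3:-1} / 1000 ))
}

audio_kbps 16000 16   # L16/16000 mono
audio_kbps 8000 8     # PCMA/8000 mono (G.711 stores 8 bits per sample)
```

The two results line up with the 256 kb/s that ffprobe reported for the L16 track earlier in the thread and with the Bitrate=64 value used for G.711A.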


Thanks to @felipecrs and @oNaiPs for the clues.

@felipecrs
Contributor

felipecrs commented Nov 28, 2023

This was a really great summary, @miguelangel-nubla. I am also hopeful that 2-way audio through CGI can help overcome some of these limitations, especially the need of configuring codecs, the slowness of backchannel audio stream, and perhaps even the concurrency problem.

@winconlin

Working config for DHI-VTO2202F-P-S2 with 2 way audio on HD or SD, and SIP with 2 way audio+video:

If you use the app for the VTO, these settings will be changed on demand and the go2rtc streams will break.

Note the script uses different credentials for the configuration and for the stream. Admin credentials are the ones for the WebUI administrator account. Onvif credentials: WebUI->Local>Onvif User->Add

#!/bin/bash
set -euo pipefail

readonly admincreds='adminuser:adminpass'
readonly onvifcreds='onvifuser:onvifpass'
readonly host='192.168.0.123'
readonly path="${1}"

params=""
params+="&Encode[0].MainFormat[0].Audio.Compression=G.711A" # Chrome WebRTC backchannel codec restrictions.
params+="&Encode[0].MainFormat[0].Audio.Frequency=8000" # Chrome WebRTC backchannel codec restrictions.
params+="&Encode[0].MainFormat[0].Audio.Bitrate=64"
params+="&Encode[0].ExtraFormat[0].Audio.Compression=AAC" # For use with Frigate
params+="&Encode[0].ExtraFormat[0].Audio.Frequency=16000"
params+="&Encode[0].ExtraFormat[0].Audio.Bitrate=64"

params+="&Encode[0].MainFormat[0].Video.GOP=4" # Optional, read considerations

curl --fail --silent --show-error --digest --globoff --user "${admincreds}" \
  "http://${host}/cgi-bin/configManager.cgi?action=setConfig${params}" >&2

echo "rtsp://${onvifcreds}@${host}${path}"
streams:
  vto_sd_2wayaudio:
    # MainFormat: PCMA/8000, 2-way audio, H264
    - echo:/config/vto_setup_stream.sh /cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif
  vto_hd_2wayaudio:
    # ExtraFormat: H264. Must be first to be served to the client by default.
    - echo:/config/vto_setup_stream.sh /cam/realmonitor?channel=1&subtype=1&unicast=true&proto=Onvif#media=video#backchannel=0
    # MainFormat: PCMA/8000, 2-way audio, H264. Currently no way of disabling video stream while keeping 2 way audio?
    - echo:/config/vto_setup_stream.sh /cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif
  vto_hd_frigate:
    # ExtraFormat: MPEG4-GENERIC/16000, H264
    - echo:/config/vto_setup_stream.sh /cam/realmonitor?channel=1&subtype=1&unicast=true&proto=Onvif#media=audio+video#backchannel=0

You can of course reuse stream connections, but keep in mind backchannel will not be redirected.

My personal configuration

streams:
  vto:
    # ExtraFormat: MPEG4-GENERIC/16000 (AAC), H264. Frigate without transcoding.
    - echo:/config/vto_setup_stream.sh /cam/realmonitor?channel=1&subtype=1&unicast=true&proto=Onvif#media=audio+video#backchannel=0
    # Conversion from best source (AAC) to opus for WebRTC support.
    - ffmpeg:vto#audio=opus#async
    # MainFormat: PCMA/8000, 2-way audio, H264. 2 way audio last resort, so regular WebRTC or Frigate client does not block backchannel for other 2way audio client
    - echo:/config/vto_setup_stream.sh /cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif
  vto_fast:
    # MainFormat: PCMA/8000, 2-way audio, H264
    - echo:/config/vto_setup_stream.sh /cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif

Considerations:

On this camera, MainStream is intended for low-quality real-time viewing and ExtraStream (SubStream in the WebUI) for HD recording.

Using 2 way audio:

Only one of MainStream or ExtraStream can receive an audio stream at a given time.

If you open a stream to MainStream and another to ExtraStream, the second one will not have the audio backchannel; to switch, you have to close the first one and reconnect the second. You can, however, have multiple 2-way audio clients on the same stream. Instead of mixing the audio at the VTO speaker as you would expect, the VTO plays each client's audio sequentially, in chunks and independently. Delay starts accumulating, and the speaker is unusable until all the queued audio finishes playing, which can take some time even after you close the stream.

SIP behaves like an extra connection using MainStream, so the same quirks apply. Conclusion: only one audio sender at any given time. This does not apply to regular audio/video receiving.

If you use Frigate, make sure it is not taking your 2-way audio channel by adding #media=audio+video#backchannel=0, as shown in the example.

GOP Setting

GOP sets the number of frames between complete frames (keyframes). A lower value means a keyframe arrives sooner, so the initial video load is faster. The effect is most noticeable at a low frame rate: at 4 fps with a standard GOP of 50, the video can take up to ~50/4=12.5s to start. The drawback is that keyframes are sent more often, which means more data.
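
As a rough sketch of that arithmetic, the worst-case wait for the first keyframe is approximately GOP divided by fps. The function name keyframe_wait is only illustrative, and the figure ignores network and decoder latency.

```shell
#!/bin/bash
# Worst-case wait for the first keyframe, in seconds: GOP / fps.
# awk does the division because shell arithmetic is integer-only.
keyframe_wait() {
  local gop="$1" fps="$2"
  awk -v gop="${gop}" -v fps="${fps}" 'BEGIN { printf "%.2f\n", gop / fps }'
}

keyframe_wait 50 4   # GOP 50 at 4 fps -> 12.50
keyframe_wait 4 4    # GOP 4 at 4 fps  -> 1.00
```

With GOP=4 at 4 fps, a keyframe arrives about once per second, which is why the stream starts almost immediately.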

L16 Codec = PCM

Currently not supported by go2rtc for the backchannel. It does show up in go2rtc via ONVIF, but strangely http://admin:pass@ip_addr/cgi-bin/encode.cgi?action=getConfigCaps does not list it as supported. PCMA also gives go2rtc 2-way audio plus SIP, and it is far less bandwidth intensive. Maybe using it will solve the 2-way audio quirks mentioned above?
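
To check which audio codecs your own unit reports, something like the following could be used. It is only a sketch: the function names are made up for illustration, and it assumes the getConfigCaps response is plain text with one key=value pair per line, which may differ between firmware versions.

```shell
#!/bin/bash
# Build the getConfigCaps URL for a host. Credentials are passed to curl
# separately instead of being embedded in the URL.
caps_url() {
  printf 'http://%s/cgi-bin/encode.cgi?action=getConfigCaps' "$1"
}

# Print only the audio-related capability lines.
# Assumption: the response has one key=value pair per line.
list_audio_caps() {
  local host="$1" creds="$2"
  curl --fail --silent --show-error --digest --globoff --user "${creds}" \
    "$(caps_url "${host}")" | grep -i 'audio'
}

# Usage (not run here): list_audio_caps 192.168.0.123 'admin:pass'
```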

Thanks to @felipecrs and @oNaiPs for the clues.

Sorry, I am somewhat of a newbie when it comes to go2rtc and all this, so please bear with me. I have read this whole thread as well as others, and I still have several questions.
My Setup is as follows:

Doorbell: DHI-VTO2202F-P-S2
go2rtc is running on Home Assistant (VM) go2rtc Master Hardware
I have the Home Assistant Dahua Integration from rroller installed and my doorbell configured
I have the frigate lovelace card installed
Frigate is up and running as a separate Docker container on my server (which also runs the Home Assistant VM; all my cams, including the doorbell, are visible both in Frigate itself and on the Home Assistant Frigate Lovelace card)

1.) I don't understand where exactly to put the script. When I edit the YAML as you mentioned above, go2rtc gives me 3 streams, but whenever I click on one I get: mse: streams: fork/exec /config/vto_setup_stream.sh no such file or directory. That is no surprise, but I am at a loss as to where I can put it so that it actually works.
2.) the following is my frigate card in home assistant:

type: custom:frigate-card
cameras:
  - camera_entity: camera.hausturcam
    live_provider: go2rtc
    go2rtc:
      modes:
        - webrtc
menu:
  style: outside
  position: bottom
  buttons:
    microphone:
      enabled: true
      type: toggle
    screenshot:
      enabled: false
    download:
      enabled: false
    fullscreen:
      enabled: false
    snapshots:
      enabled: false
    timeline:
      enabled: false
    media_player:
      enabled: false
    clips:
      enabled: false
    live:
      enabled: false
    cameras:
      enabled: false
    frigate:
      enabled: false
    camera_ui:
      enabled: false
live:
  auto_mute: never
  controls:
    builtin: true
    title:
      mode: none
  layout:
    fit: fill
elements:
  - type: custom:frigate-card-menu-icon
    icon: mdi:volume-high
    tap_action:
      - action: custom:frigate-card-action
        frigate_card_action: unmute
  - type: custom:frigate-card-menu-icon
    icon: mdi:volume-off
    tap_action:
      - action: custom:frigate-card-action
        frigate_card_action: mute
  - type: custom:frigate-card-menu-icon
    icon: mdi:phone
    tap_action:
      - action: call-service
        service: button.press
        service_data:
          entity_id: button.ds_kh9510_answer_call
      - action: custom:frigate-card-action
        frigate_card_action: unmute
      - action: custom:frigate-card-action
        frigate_card_action: microphone_unmute
  - type: custom:frigate-card-menu-icon
    icon: mdi:phone-hangup
    tap_action:
      - action: custom:frigate-card-action
        frigate_card_action: microphone_mute
dimensions:
  aspect_ratio_mode: static
  aspect_ratio: '16:9'

This shows the cam live with buttons for answering and declining a call, as well as for increasing and decreasing the volume.
Answering a call does not work (which I can understand, since I don't have an entity_id button.ds_kh9510_answer_call).
3) I have an automation for the doorbell:

alias: kingel
description: ""
trigger:
  - platform: state
    entity_id:
      - binary_sensor.hausturklingel_button_pressed
    from: null
    to: "on"
condition: []
action:
  - parallel:
      - service: notify.mobile_app_pixel_5
        data:
          message: Es hat geklingelt
          title: Haustür
          image: /api/camera_proxy/camera.hausturklingel_main
      - service: notify.mobile_app_pixel_6_pro
        data:
          message: Es hat geklingelt
          title: Haustür
          image: /api/camera_proxy/camera.hausturklingel_main
      - service: camera.play_stream
        metadata: {}
        data:
          format: hls
          media_player: media_player.flur
        target:
          device_id: f63a33b8fa0d196f0e2ebe780416f8ab
      - delay:
          hours: 0
          minutes: 5
          seconds: 0
          milliseconds: 0
      - service: media_player.media_stop
        metadata: {}
        data: {}
        target:
          device_id: 528dccc697e42cdb68fb35c4a90e6808
mode: single

This sort of works...
It plays a live stream on my Google Nest (with audio, but only receiving audio from the doorbell, and it is quite laggy, I'd say up to 30 seconds).
It does not show the notifications on the smartphones (which I do not care about right now; I just want to get the 2-way audio working on the Frigate card).

Can someone please help me out? Thank you all for your understanding, you're doing great work!

@felipecrs
Contributor

I'd say 2-way audio with VTO just can't be done reliably right now with go2rtc. I'm also observing severe delays.

Maybe this situation will improve with #52.

@winconlin

Well, I would accept the delay as long as it is somewhat working.

@winconlin

I am sorry, maybe I misunderstood. Do you mean that if I change the RTSP stream to multipart, the 2-way audio might(?) work?

@felipecrs
Contributor

No, I just said that maybe someday go2rtc will support 2-way audio through Dahua's CGI interface, and then it may work better.

2-way audio through backchannel with the VTO is already possible. Scroll up in the conversation, many people proposed working solutions.

@AlexxIT
Owner

AlexxIT commented Feb 1, 2024

I need physical access to Dahua with this CGI. Tried to add support remotely without big success.

@felipecrs
Contributor

If you tried to work with a VTO, I'd recommend you first try with a regular Dahua 2-way audio camera. VTO's firmware is a true mess. (I don't have any regular cameras though)

@winconlin

I need physical access to Dahua with this CGI. Tried to add support remotely without big success.

If you happen to live in the neighborhood (so to say) I could send mine to you (if I get it back in working condition)

@felipecrs
Contributor

Just to set expectations of newcomers to this thread:

I am not aware of any reliable approach to use go2rtc (with something like WebRTC card) to replace a SIP-based system for 2-way audio communication with a Dahua VTO.

Today I tried putting together everything I've learned about this to make it work (so far I have been using the SIP card as daily driver), and unfortunately I stumbled upon: #1133

@felipecrs
Contributor

felipecrs commented May 23, 2024

@miguelangel-nubla, one thing I realized is that if I set:

  "Encode[0].ExtraFormat[0].Audio.Compression=AAC"
  "Encode[0].ExtraFormat[0].Audio.Frequency=8000"

AAC to 8000Hz, the VTO will in fact serve 16000. If I set 16000 like you are doing, it will serve 32000Hz.

I suggest you test it as well, I suppose your goal was to have AAC/16000 with your config above and not AAC/32000.

So, I ended up with:

  "Encode[0].MainFormat[0].Audio.Compression=G.711A"
  "Encode[0].ExtraFormat[0].Audio.Compression=AAC"

Because:

  • The default frequency in configuration seems to always be 8000 anyway.
  • 8000 for AAC actually makes the VTO serve it as 16000, as I said.
  • Setting a lower frequency than 8000 (which is actually 16000) for AAC breaks it.
  • The default bitrate is already 64.
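
One way to check what sample rate the VTO actually serves, rather than what was configured, is to probe the stream with ffprobe. This is only a sketch: the function names are illustrative, the stream URL in the usage comment is a placeholder, and it assumes ffprobe is installed and prints codec_name before sample_rate in its CSV output.

```shell
#!/bin/bash
# Print "codec,sample_rate" for the first audio track of a stream.
probe_audio() {
  ffprobe -v error -rtsp_transport tcp -select_streams a:0 \
    -show_entries stream=codec_name,sample_rate -of csv=p=0 "$1"
}

# Extract the sample rate from a "codec,sample_rate" line.
sample_rate_of() {
  local line="$1"
  echo "${line##*,}"
}

# Usage (not run here):
#   sample_rate_of "$(probe_audio 'rtsp://user:pass@host/cam/realmonitor?channel=1&subtype=1')"
```

Comparing the printed value against the configured Audio.Frequency would show whether a given firmware doubles the rate as described above.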

Another realization is that restarting the doorbell is enough for it to reset the audio codecs back to the defaults.
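
To see whether a restart has reset the codecs, the current values can be read back from the VTO. The sketch below assumes the configManager.cgi getConfig action with name=Encode is available on your firmware and that it returns lines such as table.Encode[0].MainFormat[0].Audio.Compression=G.711A; both function names are hypothetical.

```shell
#!/bin/bash
# Read a getConfig response on stdin and print the value of one key.
# Assumption: lines look like "table.Encode[0].MainFormat[0].Audio.Compression=G.711A".
config_value() {
  local key="$1"
  # -F matches the key literally, so "[0]" and "." are not treated as regex syntax.
  grep -F "${key}=" | head -n 1 | cut -d '=' -f 2- | tr -d '\r'
}

# Fetch the Encode table. "$1" is "user:pass@host", as in the script's URL handling.
fetch_encode_config() {
  curl --fail --silent --show-error --digest --globoff \
    "http://$1/cgi-bin/configManager.cgi?action=getConfig&name=Encode"
}

# Usage (not run here):
#   fetch_encode_config 'user:pass@host' | config_value 'Encode[0].ExtraFormat[0].Audio.Compression'
```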

This is my current script for reference:

#!/bin/bash
#
# This script fixes the VTO audio codecs before supplying the stream url to go2rtc.
#
# Examples:
#
#   ./fix_vto_codecs.sh rtsp://user:PASSWORD@VTO_HOST/cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif
#   ./fix_vto_codecs.sh rtsp://user:PASSWORD@VTO_HOST/cam/realmonitor?channel=1&subtype=1
#

set -euo pipefail

if [[ -n "${DEBUG:-}" ]]; then
  set -x
fi

readonly vto_stream_url="${1}"

vto_host_with_creds="${vto_stream_url#"rtsp://"}"
vto_host_with_creds="${vto_host_with_creds%%"/"*}"
readonly vto_host_with_creds

query="action=setConfig"
# PCMA: good for webrtc and 2-way audio
query+="&Encode[0].MainFormat[0].Audio.Compression=G.711A"
# AAC: good for Frigate
query+="&Encode[0].ExtraFormat[0].Audio.Compression=AAC"
readonly query

curl --fail --silent --show-error --digest --globoff \
  "http://${vto_host_with_creds}/cgi-bin/configManager.cgi?${query}" |
  grep -q OK

echo -n "${vto_stream_url}"
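
The host extraction in the script relies on two parameter expansions. A small standalone illustration follows; the function name and the example address are placeholders, not part of the original script.

```shell
#!/bin/bash
# Strip the scheme and everything from the first "/" onward,
# leaving "user:pass@host" (plus ":port" if present).
host_with_creds() {
  local url="$1"
  url="${url#"rtsp://"}"   # remove the leading scheme
  url="${url%%"/"*}"       # remove the longest suffix starting at the first "/"
  echo "${url}"
}

host_with_creds 'rtsp://user:pass@192.0.2.10/cam/realmonitor?channel=1&subtype=0'
# prints: user:pass@192.0.2.10
```

Note that a password containing "/" would be cut short by the second expansion, so such a password would need to be changed or URL-encoded.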

Here is my go2rtc:

  streams:
    vto_two_way_audio:
      # H264, G.711A, 2-way audio
      - echo:/config/scripts/fix_vto_codecs.sh rtsp://admin:PASSWORD@VTO_HOST/cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif
    vto:
      # H264, G.711A
      - echo:/config/scripts/fix_vto_codecs.sh rtsp://admin:PASSWORD@VTO_HOST/cam/realmonitor?channel=1&subtype=0
      # AAC
      - rtsp://127.0.0.1:8554/video_porteiro_hd?audio=aac
    vto_hd:
      # H264, AAC
      - echo:/config/scripts/fix_vto_codecs.sh rtsp://admin:PASSWORD@VTO_HOST/cam/realmonitor?channel=1&subtype=1
      # PCMA
      - rtsp://127.0.0.1:8554/video_porteiro?audio=pcma

Don't trust it 100% yet. I now converted my home to use 2-way audio with go2rtc rather than using SIP, so let me test for some days and I will post the results back here (but obviously everything I said is working ATM).

I am using the Frigate Card dev version, but I have run into several issues, so I prefer not to share my card configuration here until things are solved there.

@felipecrs
Contributor

I am satisfied with my current setup. I am not using VTH nor the Dahua app to answer the doorbell, I am only using Home Assistant itself.

I am documenting it here: https://github.com/felipecrs/dahua-vto-on-home-assistant

@felipecrs
Contributor

I just added a video demonstration on how it's working for me.
