Questions about new producer pause RTP behavior

I was catching up on some updates and noticed this change: "Effectively disable encoder on Producer pause" by jmillan (Pull Request #215, versatica/mediasoup-client, GitHub).

A couple questions here:

  1. It seems to indicate that, since we negotiate as inactive, we should send no RTP (the comment suggests removing zeroRtpOnPause). However, I am still seeing about 1.2 kbps on a paused audio producer, which is about the same as I was seeing with DTX. Is this expected?

  2. When using zeroRtpOnPause I am still getting delayed audio after some time. I believe this is a somewhat known issue, but I thought it was related to replacing the underlying track with null. Is that still needed when renegotiating the source as inactive?

It seems like the “sweet spot” is {disableTrackOnPause: false, zeroRtpOnPause: true}, and I think (still testing) that this prevents the delay bug while still sending zero RTP. Does this match other people’s findings?

edit: Unfortunately, after letting things idle for a while, the delay returned. Looks like zeroRtpOnPause is still a no-go.
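For anyone following along, the combination described above can be written out roughly as below. This is only a sketch: the interface is a local stand-in for the pause-related subset of mediasoup-client's producer options, and `sweetSpot`, `sendTransport` and `track` are names assumed for illustration.

```typescript
// Local stand-in for the pause-related subset of mediasoup-client's
// ProducerOptions; in a real app the type comes from the library.
interface PauseRelatedOptions {
  disableTrackOnPause: boolean; // whether pause() sets track.enabled = false
  zeroRtpOnPause: boolean; // whether pause() swaps the sender track for null
  codecOptions: { opusDtx: boolean };
}

// The combination tested in this thread (hypothetical constant name).
const sweetSpot: PauseRelatedOptions = {
  disableTrackOnPause: false,
  zeroRtpOnPause: true,
  codecOptions: { opusDtx: false }, // DTX off, relying on pause instead
};

// Assumed usage with an existing send transport and audio track:
// const producer = await sendTransport.produce({ track, ...sweetSpot });
```

As the edit above notes, this combination still showed the delay after idling, so treat it as a test configuration rather than a recommendation.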

Have you tried using opusDtx? It might give you the result you’re looking for, at least for audio. I’ve been experimenting with it, but haven’t deployed it widely yet.

Oh yeah, I have been using DTX up until now. However, I have had issues with audio artifacts creeping in under certain speech conditions. Here is one example from the Opus repo: "Loud static noise generated by decoder when encoding using Auto signal type and DTX mode for Opus 1.3.1." (Issue #253, xiph/opus, GitHub). While the issue has been fixed, the fix is not in the version of Opus used by libwebrtc/blink, and there has not been an Opus release since 2019 (they need a release manager).

So in the short term I am looking for a way to still reduce bandwidth without actually using DTX.

Oh, that’s very interesting. I’ll hold off on deploying DTX for now then. Thank you.

It is honestly really minor and most users have not complained about it. I hear it because I am testing this stuff every day. For us, a little quality loss in rare conditions was worth the bandwidth savings. Using Voice Activity Detection client-side and pausing producers seems to be working just about as well as leaving the producer open and using DTX, though. But I would still love to get down to 0 bps when the producer is paused (right now it sits around 1.2 kbps when not using zeroRtpOnPause).
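A figure like the 1.2 kbps above can be derived from two stats samples taken some time apart. The sketch below assumes the samples carry the standard `bytesSent` and `timestamp` (milliseconds) fields of a WebRTC outbound-rtp stats entry, for example as read from `producer.getStats()`; the `SentSample` type and `sentKbps` helper are hypothetical names.

```typescript
// Minimal shape of the two fields needed from an outbound-rtp stats entry.
interface SentSample {
  bytesSent: number; // cumulative bytes sent so far
  timestamp: number; // sample time in milliseconds
}

// Average send bitrate between two samples, in kilobits per second.
// Bits per millisecond is numerically the same as kilobits per second.
function sentKbps(prev: SentSample, cur: SentSample): number {
  const elapsedMs = cur.timestamp - prev.timestamp;
  if (elapsedMs <= 0) return 0; // out-of-order or identical samples
  return ((cur.bytesSent - prev.bytesSent) * 8) / elapsedMs;
}
```

For example, 150 bytes sent over one second works out to 1.2 kbps. Depending on which counters are summed, RTP header bytes may or may not be included, so numbers from different tools are not always directly comparable.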

What are you using client-side for VAD? Do you have some way to access the webrtc VAD or are you using some custom solution?

We are using a semi-custom solution, but it is WebRTC VAD under the hood. We use a WASM build of libfvad (forked from WebRTC VAD): GitHub - OzymandiasTheGreat/libfvad-wasm: Voice activity detection (VAD) library, based on WebRTC's VAD engine built to WASM with Emscripten to run in browsers, Node, and NativeScript. We run this from a WebAudio AudioWorklet for “automatic detection”. We also have an option to set a detection threshold manually, and that mode uses a simple AnalyserNode from WebAudio.
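The manual threshold mode can be sketched roughly as follows. This is not the actual implementation described above, just a minimal illustration assuming an RMS level check with a short hangover so the gate does not close between words; `rms` and `createThresholdGate` are hypothetical names.

```typescript
// Root-mean-square level of one block of samples in the range [-1, 1].
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < samples.length; i++) sum += samples[i] * samples[i];
  return Math.sqrt(sum / samples.length);
}

// Returns a function that reports whether speech is considered active.
// The gate opens as soon as a block reaches the threshold and closes only
// after more than `hangoverBlocks` consecutive quiet blocks.
function createThresholdGate(
  threshold: number,
  hangoverBlocks: number
): (samples: Float32Array) => boolean {
  let open = false;
  let quietBlocks = 0;
  return (samples) => {
    if (rms(samples) >= threshold) {
      open = true;
      quietBlocks = 0;
    } else if (open) {
      quietBlocks += 1;
      if (quietBlocks > hangoverBlocks) open = false;
    }
    return open;
  };
}
```

In the browser, the blocks could be filled with `AnalyserNode.getFloatTimeDomainData()`, and a change in the gate's result could drive `producer.resume()` and `producer.pause()`.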

Someday I would love to experiment with Opus’ built-in VAD as well, as I have heard it is very good.

Ah, very interesting. Thank you!

Hi @alexciarlillo, I know it’s been a while since this thread was opened, but are you still having issues with zeroRtpOnPause sending RTP when the producer is paused?