We’re using mediasoup for a room-based video chat application, with the mediasoup code roughly based on the demo app. We use H264 and simulcast to try to keep CPU and bandwidth requirements low, and all clients typically run Chrome 83+ on a variety of desktop OSes. Clients usually encode two simulcast streams, at 360p and 180p, at roughly 600 kb/s and 100 kb/s respectively.
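For concreteness, here’s a minimal sketch of what that two-layer setup looks like with mediasoup-client (the helper name and the assumption that the camera track is captured at 360p are illustrative, not our actual code):

```ts
import { types as msc } from 'mediasoup-client';

// Publish the camera track with two simulcast layers: ~180p at ~100 kb/s and
// ~360p at ~600 kb/s. Assumes the track is captured at 360p, so
// scaleResolutionDownBy: 2 yields the 180p layer. Lowest layer first, as in
// the mediasoup demo app.
export async function produceSimulcastVideo(
  sendTransport: msc.Transport,
  videoTrack: MediaStreamTrack
): Promise<msc.Producer> {
  return sendTransport.produce({
    track: videoTrack,
    encodings: [
      { scaleResolutionDownBy: 2, maxBitrate: 100_000 }, // ~180p
      { scaleResolutionDownBy: 1, maxBitrate: 600_000 }  // ~360p
    ],
    codecOptions: {
      // Start the encoder reasonably high instead of ramping up from the default.
      videoGoogleStartBitrate: 600
    }
  });
}
```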
We see a few problems with the WebRTC bandwidth estimation (BWE) and have been using the 'bwe' trace events to try to monitor what’s going on. We’re mostly testing on links of 100 Mb/s or more, with the server hosted at a local data centre, so we’re not really expecting bandwidth issues.
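For anyone wanting to watch the same thing, this is roughly how we tap the 'bwe' trace events on the server side (a sketch; the field names availableBitrate / effectiveDesiredBitrate follow the mediasoup v3 docs, so double-check them against your version):

```ts
import { types as mediasoup } from 'mediasoup';

// Enable 'bwe' trace events on a WebRtcTransport and warn whenever the
// estimated available bitrate drops below what the consumers want.
export async function logBweTrace(transport: mediasoup.WebRtcTransport): Promise<void> {
  await transport.enableTraceEvent([ 'bwe' ]);

  transport.on('trace', (trace: mediasoup.TransportTraceEventData) => {
    if (trace.type !== 'bwe')
      return;

    // trace.info is untyped; these field names come from the docs.
    const { availableBitrate, effectiveDesiredBitrate } = trace.info;

    if (availableBitrate < effectiveDesiredBitrate) {
      console.warn(
        'bwe: available %d bps < desired %d bps on transport %s',
        availableBitrate, effectiveDesiredBitrate, transport.id);
    }
  });
}
```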
The most annoying problem is that, intermittently, a client seems to get locked into a low-available-bandwidth state, with the mediasoup server flipping between sending no spatial layer (null) and spatial layer 0. Simply rejoining the room, and hence recreating all transports, typically fixes the issue.
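To spot this state without staring at logs, something like the following watchdog can flag consumers that sit at spatial layer null/0 for too long (a sketch; the 30 s threshold and the onStuck callback that would kick off the rejoin workaround are ours):

```ts
import { types as mediasoup } from 'mediasoup';

// Watch a simulcast consumer's 'layerschange' events and call onStuck() if it
// stays at spatial layer null or 0 for longer than thresholdMs.
export function watchForStuckConsumer(
  consumer: mediasoup.Consumer,
  onStuck: () => void,
  thresholdMs = 30_000
): void {
  let stuckSince: number | null = null;

  consumer.on('layerschange', (layers) => {
    const low = !layers || layers.spatialLayer === 0;

    if (low && stuckSince === null)
      stuckSince = Date.now();
    else if (!low)
      stuckSince = null;
  });

  const timer = setInterval(() => {
    if (stuckSince !== null && Date.now() - stuckSince > thresholdMs) {
      stuckSince = null;
      onStuck();
    }
  }, 5_000);

  consumer.observer.on('close', () => clearInterval(timer));
}
```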
We’ve recently uncommented the following line in RTC/TransportCongestionController.cpp:50; anecdotally this seems to have improved this particular situation.
More generally, we often see from the 'bwe' trace events that the available bitrate is below the desired bitrate even on supposedly reliable, high-bandwidth, low-latency links, and as a result low-bandwidth streams are forwarded when there should be plenty of bandwidth for the high-quality one. Before investigating further, I’m wondering whether there’s any value in trying to update the libwebrtc code that seems to be used for some of this bandwidth estimation. As far as I can see it’s taken from somewhere around m78, and the libwebrtc code gets reorganised a bit after that point, so I’m not sure how straightforward updating it would be.
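As a cheap cross-check of that observation (before touching libwebrtc at all), the consumers’ preferredLayers can be compared periodically with their currentLayers to confirm that lower layers really are being forwarded. A rough sketch, with an arbitrary 10 s interval:

```ts
import { types as mediasoup } from 'mediasoup';

// Periodically log consumers whose forwarded spatial layer is below the one
// they prefer. currentLayers/preferredLayers are standard mediasoup v3
// Consumer getters; currentLayers is unset until media actually flows.
export function reportLayerDeficit(
  consumers: Map<string, mediasoup.Consumer>
): NodeJS.Timeout {
  return setInterval(() => {
    for (const consumer of consumers.values()) {
      const current = consumer.currentLayers;
      const preferred = consumer.preferredLayers;

      if (!preferred)
        continue;

      if (!current || current.spatialLayer < preferred.spatialLayer) {
        console.warn(
          'consumer %s forwarding spatial layer %s, preferred %d',
          consumer.id,
          current ? current.spatialLayer : 'none',
          preferred.spatialLayer);
      }
    }
  }, 10_000);
}
```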
I’m also wondering whether anybody else sees similar issues, or whether I’m barking up the wrong tree looking at the BWE code at all.
It is unfortunate that we are using libwebrtc code for BWE: on the one hand we are not certain that we are using the code 100% properly, and on the other hand we are not certain that it actually works 100% as it should. And we don’t have full control over it.
We’ve experienced this kind of behaviour too, so yes, you are likely barking up the right tree.
We would like to remove it and write our own implementation, but that is not in the near-term plans and we can’t provide any ETA for it.
Any PR for updating the libwebrtc code in the meantime would be appreciated though.
I’ll have a look at updating libwebrtc, but I’m not very familiar with the Chromium codebase, so I’m a little worried that even if I get it working I might break something subtle. Any tips?
No, we did many tuning tests at the time. The comment is very unfortunate.
> We’ve recently uncommented the following line in RTC/TransportCongestionController.cpp:50; anecdotally this seems to have improved this particular situation.
Simply that the video stream seems to recover more consistently to a high-quality layer, where previously it would sometimes get stuck oscillating between spatial layers null and 0.
I tried this because the BWE seemed to recover much more quickly in situations where this line appeared in the logs:
webrtc::ProbeController::RequestProbe() | detected big bandwidth drop, start probing
In the situation where we were getting trapped in the null/0 oscillation we weren’t seeing this line in the logs.
I think the periodic probe may be helping it break out of some kind of corner case by resetting the bandwidth estimate, but I’m really guessing at this point.
After moving to version 3.6.14 (from 3.6.12), we see endless printouts like the ones below, switching back and forth between spatial layers 0 and 1. Could this be the cause?
mediasoup:Channel [pid:34432] RTC::SimulcastConsumer::RTC::SimulcastConsumer::UpdateTargetLayers() | target layers changed [spatial:0, temporal:1, consumerId:38ee785b-a77f-469c-9286-c86ccd2a7df6] +66ms
mediasoup:Channel [pid:34432] RTC::SimulcastConsumer::RTC::SimulcastConsumer::UpdateTargetLayers() | target layers changed [spatial:1, temporal:1, consumerId:38ee785b-a77f-469c-9286-c86ccd2a7df6]