Delayed Message Delivery with Direct Transport

Hi, we recently upgraded mediasoup in production and observed some really weird behaviour in direct transports and data channels. I think I’ve tracked it down to the 3.9.0 release and this PR.

What we observed was extremely high memory usage and a multi-minute delay in chat messages being delivered, but only when we had 8+ people connected to a room. What’s interesting is that the messages would come in waves: everyone in the room would get them all at once, then nothing for an extended period of time. Normal usage of data channels with WebRTC transports appears normal, and normal VoIP traffic was not delayed. We think the issue relates to direct transports used in conjunction with WebRTC transports.

Our architecture looks something like this:

sender webrtc data producer → sender direct data consumer → sender direct data producer → receiver direct data consumer → receiver direct data producer → receiver webrtc data consumer
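
To make the chain easier to reason about, here is a minimal sketch of it as a series of relay hops. This is not our production code and it does not use the mediasoup types: `MessageSource`, `MessageSink`, `InMemoryHop`, `relay` and `buildChain` are hypothetical names, the interfaces only mimic a data consumer that emits `message` events and a data producer with a `send` method, and the in-memory hop stands in for the real transports.

```typescript
// Hypothetical minimal shapes; they only mimic a data consumer emitting
// "message" events and a data producer exposing send(). Not the mediasoup types.
type MessageHandler = (message: string, ppid: number) => void;

interface MessageSource {
  on(event: "message", handler: MessageHandler): void;
}

interface MessageSink {
  send(message: string, ppid: number): void;
}

// In-memory stand-in for one hop: whatever is sent is emitted to its listeners.
class InMemoryHop implements MessageSource, MessageSink {
  private handlers: MessageHandler[] = [];

  on(_event: "message", handler: MessageHandler): void {
    this.handlers.push(handler);
  }

  send(message: string, ppid: number): void {
    for (const handler of this.handlers) handler(message, ppid);
  }
}

// Forward every message from a source to a sink, unchanged.
function relay(source: MessageSource, sink: MessageSink): void {
  source.on("message", (message, ppid) => sink.send(message, ppid));
}

// Build a chain of `hopCount` hops (at least 1) and wire each hop to the next.
function buildChain(hopCount: number): { first: InMemoryHop; last: InMemoryHop } {
  const hops = Array.from({ length: hopCount }, () => new InMemoryHop());
  for (let i = 0; i + 1 < hops.length; i++) relay(hops[i], hops[i + 1]);
  return { first: hops[0], last: hops[hops.length - 1] };
}
```

With `buildChain(6)`, a message sent into `first` passes through the same number of stages as the chain above before it comes out of `last`. In this sketch every hop forwards synchronously, so it only illustrates the topology, not the delay we are seeing.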

We can reproduce this in our testing environment, but I’m not sure how to debug this behaviour. Can someone help explain what we observed?

On both the TypeScript and C++ sides, memory usage only decreased after that PR. A multi-minute delay doesn’t sound good at all.

What version did you upgrade from?

Hi,

We upgraded from 3.8.4. I really wish I had kept graphs from the deploy. The memory usage increased drastically compared to 3.8.4, where it was pretty consistent for us.

It really relates to direct transports. VoIP and normal data channels using WebRTC transports are not affected.

Versions tested: 3.9.9, 3.9.1.

Thankfully we can reproduce this in a test environment.

Can you create a reproducible test case for this? That would simplify debugging significantly.

I’ll try with the demo. It might take me a couple of days.

A simple standalone app would be preferred. I’ll take a look at this as well once I have time.

Ideally it would be a test case that can be included with official tests, then if this is an actual issue, we’d have a regression test for it as well.
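
For what it’s worth, the timing part of such a test could be sketched roughly like this. It is only a sketch under assumptions: `DelayProbe` and its methods are hypothetical names, not part of mediasoup, and it assumes the sending and receiving ends run in the same process so that both timestamps come from the same clock.

```typescript
// Hypothetical helper: records per-message delivery delay in milliseconds.
// Assumes sender and receiver share one clock (e.g. both run in one test process).
class DelayProbe {
  private delaysMs: number[] = [];

  // Wrap a payload together with the time it was sent.
  stamp(payload: string, nowMs: number = Date.now()): string {
    return JSON.stringify({ sentAtMs: nowMs, payload });
  }

  // Parse a stamped message on receipt and record how long delivery took.
  record(stamped: string, nowMs: number = Date.now()): number {
    const { sentAtMs } = JSON.parse(stamped) as { sentAtMs: number };
    const delayMs = nowMs - sentAtMs;
    this.delaysMs.push(delayMs);
    return delayMs;
  }

  // Largest delay recorded so far, or 0 if nothing has been recorded.
  maxDelayMs(): number {
    return this.delaysMs.length === 0 ? 0 : Math.max(...this.delaysMs);
  }

  // True when any recorded delay is above the threshold.
  exceeds(thresholdMs: number): boolean {
    return this.maxDelayMs() > thresholdMs;
  }
}
```

The sender would call `stamp` before sending each message, the final receiver would call `record` on each message it gets, and the test would fail if `exceeds` returns true for whatever threshold is considered acceptable.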

Any news on this topic?

Hi @nazar-pc, this is still an issue. I’ve just been swamped and haven’t been able to carve out time to build a test case.