Media streams produced by FFmpeg or GStreamer are not received via some ISPs

The following scenario works well with some Internet providers but fails with others:

ffmpeg/gstreamer => mediasoup producer (PlainRtpTransport) => mediasoup consumer (browser)

Checking the stats shows that data is received by the consumer but all packets are lost. Meanwhile, media produced by a mic or webcam is received without any issue or packet loss.

However, when I set the browser to use a SOCKS proxy, everything works fine! I guess something is blocked or filtered by those ISPs. Any thoughts? How can I inspect the issue?

Update: the problem affects only video. Audio is fine.

No idea what that means. You may check how well the browser is decoding the received video bytes; there is a stat for that.

Also, this was already discussed somewhere else. There are a LOT of issues when using FFmpeg to produce. Even if no RTP packet is lost, FFmpeg sometimes fails to encode all video frames into RTP packets (under heavy load), so everything looks good at RTP level (no packet loss at all) but video content is missing, so the receiver cannot decode it.

Here’s the stats graph. As you can see, data is transferred but frames soon stop being received:

Yeah, it was me :grin:. In that discussion I was suspicious of ffmpeg, but after testing with gstreamer I got the same result, so I decided to open a new topic.

The key point is that it works fine with some ISPs, so it cannot be an encoding issue; it must be a network or protocol issue.

Super strange. Honestly no idea, but obviously those packets do not seem to be legit RTP packets, or they do not contain properly encoded video. You may wish to run Chrome with full WebRTC logs to see what happens.

Using the following switches I got no useful data. Are they correct?
chrome --enable-logging --v=4 --vmodule=*/webrtc/*=2,*/libjingle/*=2,*=-2

I also got a log from chrome://webrtc-internals but it is in an unknown format. How can I read or convert it?

I do this:

$ cat bin/CanaryWebRtcDebug.sh

#!/bin/bash

# Clean the log.
> ~/Library/Application\ Support/Google/Chrome\ Canary/chrome_debug.log

cd /Applications/Google\ Chrome\ Canary.app/Contents/MacOS/

# ./Google\ Chrome\ Canary --enable-logging --v=1  --vmodule=*/webrtc/*=3,*=-2 --enable-logging=stderr

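# Quote the pattern so the shell does not expand the globs, and pass extra arguments through unchanged.
./Google\ Chrome\ Canary --enable-logging --vmodule='*/webrtc/*=9,*=-2' "$@"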


# Then open the log:
# tail -f ~/Library/Application\ Support/Google/Chrome\ Canary/chrome_debug.log

Honestly no idea.

@ibc
I sent you the log file via private message.

Well, just to clarify, when I say “check the Chrome log” I do not mean “send it to me” :slight_smile:

Anyway: there are TONS of lines like this:

[10472:4796:1121/083211.690:WARNING:nack_module.cc(277)] Sequence number 771 removed from NACK list due to max retries.
[10472:4796:1121/083211.791:WARNING:nack_module.cc(277)] Sequence number 773 removed from NACK list due to max retries.
[10472:4796:1121/083211.791:WARNING:nack_module.cc(277)] Sequence number 774 removed from NACK list due to max retries.
[10472:4796:1121/083211.791:WARNING:nack_module.cc(277)] Sequence number 775 removed from NACK list due to max retries.
[10472:4796:1121/083211.791:WARNING:nack_module.cc(277)] Sequence number 777 removed from NACK list due to max retries.
[10472:10788:1121/083211.791:WARNING:video_receive_stream.cc(759)] No decodable frame in 200 ms, requesting keyframe.

The last one is especially bad (“No decodable frame in 200 ms, requesting keyframe.”). It means that Chrome/libwebrtc did not receive all required video frames, so it cannot render video and needs a video keyframe, so it will send an RTCP PLI. mediasoup will then send a PLI or FIR (depending on negotiated RTP parameters) to the sender, which is ffmpeg, which will just ignore the PLI or FIR because ffmpeg does not support them anyway. But GStreamer does, AFAIR.

Not a bug or limitation or bad behavior in mediasoup.
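
The dependency on the negotiated RTP parameters described above can be modeled roughly as follows. This is only a sketch of the described behavior, not mediasoup's actual code: the function name is hypothetical, and preferring PLI over FIR when both are announced is an assumption.

```typescript
// Shape of one rtcpFeedback entry in a codec's RTP parameters.
interface RtcpFeedback {
  type: string;
  parameter?: string;
}

type KeyframeRequest = 'pli' | 'fir' | 'none';

// Hypothetical model: which keyframe request can be forwarded to the sender,
// given the feedback mechanisms announced in the Producer rtpParameters.
function chooseKeyframeRequest(producerFeedback: RtcpFeedback[]): KeyframeRequest {
  const has = (type: string, parameter: string): boolean =>
    producerFeedback.some((fb) => fb.type === type && fb.parameter === parameter);

  if (has('nack', 'pli')) return 'pli'; // assumption: PLI is preferred when announced
  if (has('ccm', 'fir')) return 'fir';
  return 'none'; // nothing announced: the sender is never asked for a keyframe
}

console.log(chooseKeyframeRequest([])); // none
console.log(chooseKeyframeRequest([{ type: 'nack' }, { type: 'nack', parameter: 'pli' }])); // pli
```

Even when this returns 'pli' or 'fir', the request only helps if the sending engine actually reacts to it.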

Sorry, I just wanted to make sure I’m not missing something in the log. :pray:

And thanks for the analysis. I’ll try the same test with gstreamer.

np. BTW you should periodically check the mediasoup-side stats of the video Producer (ffmpeg or gstreamer) and of the Consumer. Metrics like packetsLost and pliCount are key.
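
A minimal sketch of such periodic checking, assuming the Producer and Consumer objects expose an async getStats() that resolves to an array of entries like the ones posted in this thread. The interfaces, helper names and the 5 second interval are hypothetical.

```typescript
// Subset of the stats fields discussed in this thread; real entries carry more.
interface RtpStat {
  type: string; // 'inbound-rtp' (Producer side) or 'outbound-rtp' (Consumer side)
  packetCount: number;
  packetsLost: number;
  nackCount: number;
  pliCount: number;
  score: number;
}

// Anything exposing getStats(), e.g. a Producer or a Consumer (assumed shape).
interface StatsSource {
  getStats(): Promise<RtpStat[]>;
}

// Turn one stats entry into a short log line.
function formatStat(label: string, stat: RtpStat): string {
  return `${label} [${stat.type}] packets=${stat.packetCount} lost=${stat.packetsLost} ` +
    `nack=${stat.nackCount} pli=${stat.pliCount} score=${stat.score}`;
}

// Fetch and log both sides once.
async function logStatsOnce(producer: StatsSource, consumer: StatsSource): Promise<void> {
  for (const stat of await producer.getStats()) console.log(formatStat('producer', stat));
  for (const stat of await consumer.getStats()) console.log(formatStat('consumer', stat));
}

// Hypothetical helper: repeat the check every intervalMs milliseconds.
function watchStats(producer: StatsSource, consumer: StatsSource, intervalMs = 5000) {
  return setInterval(() => {
    void logStatsOnce(producer, consumer).catch((error) => console.error('getStats() failed', error));
  }, intervalMs);
}
```

Calling clearInterval() on the value returned by watchStats() stops the polling.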

Testing with gstreamer, I got exactly the same result and the same warning messages in the Chrome debug log. I also checked the stats. The producer has zero packetsLost/pliCount, but on the consumer side all packets are lost, while the Transport stats show that data is received. :roll_eyes:

Now we have two facts:

  1. Streaming with ffmpeg and with gstreamer both fail on one of four tested ISPs (the 3 other ISPs are fine, so not an ffmpeg issue).
  2. Streaming from a webcam is okay on the troubled network/ISP (so not a network issue).

The only difference between these two is the transport: WebRtcTransport for the webcam and PlainRtpTransport for ffmpeg/gstreamer. Maybe something is missing in the PlainRtpTransport. :face_with_monocle:

What exactly does “in the consumer side all packets are lost” mean? What do the server-side stats of the Consumer look like? And what does “Transport stats shows that data is received” mean?

Where is ffmpeg/gstreamer placed? On the same server on which mediasoup runs? On a server close to it (so zero loss is expected)? Or on a host far away from the mediasoup server?

And nackCount?

Also, are you announcing support for NACK and PLI in the Producer rtpParameters you use within transport.produce() in mediasoup for ffmpeg/gstreamer? Please print those full rtpParameters. BTW you understand that it’s completely useless to announce NACK or PLI in those RtpParameters if the engine sending RTP (ffmpeg or gstreamer) does not support NACK and PLI, right?
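
For reference, a sketch of Producer rtpParameters that announce NACK and PLI for an H264 stream. The payload type and the H264 parameters are assumptions that must match what the sender really emits; only the SSRC and the mime type come from the stats in this thread.

```typescript
interface RtcpFeedback {
  type: string;
  parameter?: string;
}

// Feedback mechanisms announced for the video codec.
const videoFeedback: RtcpFeedback[] = [
  { type: 'nack' },                   // retransmission requests (NACK)
  { type: 'nack', parameter: 'pli' }, // keyframe requests (PLI)
];

const videoRtpParameters = {
  codecs: [
    {
      mimeType: 'video/H264',
      clockRate: 90000,
      payloadType: 101, // assumption: must match the sender's payload type
      parameters: { 'packetization-mode': 1, 'profile-level-id': '42e01f' }, // assumption
      rtcpFeedback: videoFeedback,
    },
  ],
  encodings: [{ ssrc: 22222222 }], // SSRC seen in the stats of this thread
};

// Printing the full parameters, as requested above.
console.log(JSON.stringify(videoRtpParameters, null, 2));
```

An object like this would be passed as rtpParameters to transport.produce() together with kind: 'video'.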

Yes, of course, let’s assume this is a bug in mediasoup XD.

As I stated before, bytes are received but zero frames are decoded and almost all packets are reported as lost.

Here are the Consumer stats, which also contain the Producer stats:

[{ 
  bitrate: 1753357,
  byteCount: 11967592,
  firCount: 0,
  fractionLost: 0,
  jitter: 0,
  kind: 'video',
  mimeType: 'video/H264',
  nackCount: 0,
  nackPacketCount: 0,
  packetCount: 10977,
  packetsDiscarded: 0,
  packetsLost: 0,
  packetsRepaired: 0,
  packetsRetransmitted: 0,
  pliCount: 0,
  score: 10,
  ssrc: 22222222,
  timestamp: 5785900973,
  type: 'inbound-rtp',
},
{ 
  bitrate: 1773094,
  byteCount: 9543195,
  firCount: 0,
  fractionLost: 167,
  kind: 'video',
  mimeType: 'video/H264',
  nackCount: 3498,
  nackPacketCount: 55662,
  packetCount: 8665,
  packetsDiscarded: 0,
  packetsLost: 5722,
  packetsRepaired: 5785,
  packetsRetransmitted: 55412,
  pliCount: 176,
  roundTripTime: 23.98681640625,
  rtxSsrc: 493105124,
  score: 0,
  ssrc: 783016559,
  timestamp: 5785900973,
  type: 'outbound-rtp',
}]

In the same server.

No, I am not, but I can add them and test again. Although I think that would not help, since all frames seem to be invalid.

What is mediasoup XD?

Well, you will not ignore this annoying amount of lost packets in the Consumer, right?:

  nackCount: 3498,
  nackPacketCount: 55662,
  packetCount: 8665,
  packetsDiscarded: 0,
  packetsLost: 5722,
  packetsRepaired: 5785,
  packetsRetransmitted: 55412,
  pliCount: 176

Which means that, indeed, most packets are lost, on the Producer side or on the Consumer side. But you are not checking the packet loss in the Producer.
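
To put numbers on those Consumer stats, here is a small sketch. It assumes that fractionLost follows the RTCP Receiver Report encoding from RFC 3550 (the lost fraction multiplied by 256) and that packetsLost and packetCount are cumulative counters; the helper names are hypothetical.

```typescript
// Assumption: fractionLost is the 8-bit fixed-point value from RTCP Receiver Reports.
function fractionLostToPercent(fractionLost: number): number {
  return (fractionLost / 256) * 100;
}

// Cumulative packets reported lost, relative to packets sent, as a percentage.
function cumulativeLossPercent(packetsLost: number, packetCount: number): number {
  return packetCount === 0 ? 0 : (packetsLost / packetCount) * 100;
}

// Values copied from the Consumer ('outbound-rtp') stats above.
const consumerStat = { fractionLost: 167, packetCount: 8665, packetsLost: 5722 };

console.log(fractionLostToPercent(consumerStat.fractionLost).toFixed(1) + '% lost in the last report'); // 65.2% ...
console.log(cumulativeLossPercent(consumerStat.packetsLost, consumerStat.packetCount).toFixed(1) + '% lost overall');
```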

I must re-clarify that announcing support for NACK when the device sending RTP does not support NACK is completely useless. However, if you enable NACK, mediasoup will generate NACKs and expose them in the Producer stats, and you will see tons of lost packets on the sender side.

Somehow this is becoming a question about mediasoup when there is no real information about how well your ffmpeg or gstreamer sender is sending media. mediasoup cannot do magic if the sender is sending bad media.

Of course not, I’m just wondering how it might happen!

I’m confused. It’s obvious, as the stats show, that the packet loss is happening on the consumer side. What do you mean by “But you are not checking the packet lost in the Producer”? Is there any difference between the producer stats fetched by Consumer.getStats() and the ones fetched by Producer.getStats()?

I also repeated the test with GStreamer while announcing support for NACK/PLI (users have stated that GStreamer supports them), with the same result.

Okay, if you think this is an ffmpeg/gstreamer issue then I will stop investigating.

Regardless of whether there is packet loss from the Producer to mediasoup, the truth is that a WebRTC endpoint (i.e. a browser) cannot render any received video with such a huge amount of packet loss (as your Consumer stats show). It’s impossible.

So when you say “ISP” you mean the Internet path/connection between mediasoup and the Consumer. Well, it’s clear that such a path is terrible. But then you say:

Well, why “not a network issue”? If such a network is producing a lot of packet loss (which is obvious from the Consumer stats), the browser requests a PLI. If that PLI reaches mediasoup and mediasoup does not send it to ffmpeg/gstreamer (because you did not announce support for PLI in the Producer rtpParameters), or if you did but ffmpeg/gstreamer ignores PLI requests, then ffmpeg/gstreamer will NEVER generate a new video keyframe, so the receiver (the browser) will NEVER be able to render anything, because it needs a video keyframe first.

So, my guess is that your ffmpeg/gstreamer is not receiving PLIs, or it does not support PLI, or it’s ignoring PLIs (it may receive them but it does NOT encode a video keyframe). Now let’s see the Producer stats again:

firCount: 0,
pliCount: 0,

And now the Consumer stats:

firCount: 0,
pliCount: 176,

Wow! 176 PLIs in the Consumer, but 0 in the Producer. How is that possible? mediasoup cannot generate video keyframes, so when it receives a PLI or FIR from a Consumer it just generates a PLI or FIR for the associated Producer… unless the rtpParameters of the Producer did not announce support for PLI or FIR, which is obviously what happens here.
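
The comparison made above can be automated with a small check. This is a sketch with hypothetical names that only uses the pliCount and firCount fields shown in the stats.

```typescript
interface KeyframeRequestCounters {
  pliCount: number;
  firCount: number;
}

// Returns false when the Consumer asked for keyframes but no request
// was ever sent towards the Producer.
function keyframeRequestsReachProducer(
  producerStat: KeyframeRequestCounters,
  consumerStat: KeyframeRequestCounters,
): boolean {
  const requestedByConsumer = consumerStat.pliCount + consumerStat.firCount;
  const sentToProducer = producerStat.pliCount + producerStat.firCount;
  return requestedByConsumer === 0 || sentToProducer > 0;
}

// Counters from the stats above: 0 PLIs in the Producer, 176 in the Consumer.
console.log(keyframeRequestsReachProducer({ pliCount: 0, firCount: 0 }, { pliCount: 176, firCount: 0 })); // false
```

A result of true only shows that requests were sent to the sender, not that the sender answered them with a keyframe.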

So not sure what you expect. Basically you have gstreamer/ffmpeg sending video. No idea if you tell them to generate video keyframes every N seconds (an ugly workaround for ffmpeg, AFAIR, since it does not support PLI or FIR). And, assuming gstreamer supports PLI, you don’t announce support for it in mediasoup, so mediasoup NEVER sends a PLI to gstreamer, so gstreamer may never send a video keyframe.

Basically, in that scenario, any Consumer that loses a single RTP video packet (without being able to get a retransmission for it) will get frozen, undecodable video forever and ever.
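
As an illustration of the “keyframe every N seconds” workaround, the sketch below builds ffmpeg encoder options from a frame rate and an interval. The function name is hypothetical. `-g` (GOP size) and `-keyint_min` are existing ffmpeg encoding options, but how strictly they are honored depends on the encoder in use, which is an assumption here.

```typescript
// Build ffmpeg options that ask the encoder for a keyframe every `seconds` seconds.
// -g sets the maximum number of frames between keyframes, -keyint_min the minimum.
function keyframeIntervalArgs(fps: number, seconds: number): string[] {
  const gop = String(Math.max(1, Math.round(fps * seconds)));
  return ['-g', gop, '-keyint_min', gop];
}

// Example: 30 fps video with a keyframe every 2 seconds.
console.log(keyframeIntervalArgs(30, 2).join(' ')); // -g 60 -keyint_min 60
```

A shorter interval lets a frozen receiver recover sooner, at the cost of a higher bitrate.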

But this is not a bug in mediasoup and, as usual, the only way to help here is by being an expert in ffmpeg or gstreamer. Unfortunately I’m not.

So in your “good ISPs” the consumer can render video because, even if there is packet loss, mediasoup is able to retransmit lost packets to the browser, so there is no need for an eventual keyframe.

In your “bad ISP”, the network is so bad that even packet retransmissions are lost, which means that the browser will eventually need a video keyframe. And your scenario is not ready for that, due to the lack of proper configuration on the sender side and in its announced parameters.

Now, please don’t just announce support for “pli” in the producer rtpParameters. You need to verify whether gstreamer does support it in the way you are launching it.

You were right, the GStreamer command lacked rtcp-fb-nack-pli=(int)1 in the caps argument. But even after adding that I’m still getting the same result :weary:. The stats also show that announcing support for NACK/PLI is in effect:

{ 
  bitrate: 2070298,
  byteCount: 16335139,
  firCount: 0,
  fractionLost: 0,
  jitter: 0,
  kind: 'video',
  mimeType: 'video/H264',
  nackCount: 0,
  nackPacketCount: 0,
  packetCount: 14953,
  packetsDiscarded: 0,
  packetsLost: 0,
  packetsRepaired: 0,
  packetsRetransmitted: 0,
  pliCount: 62,
  score: 10,
  ssrc: 22222222,
  timestamp: 5819796603,
  type: 'inbound-rtp',
},
{ 
  bitrate: 2092838,
  byteCount: 13973735,
  firCount: 0,
  fractionLost: 177,
  kind: 'video',
  mimeType: 'video/H264',
  nackCount: 5056,
  nackPacketCount: 83479,
  packetCount: 12652,
  packetsDiscarded: 0,
  packetsLost: 8381,
  packetsRepaired: 8465,
  packetsRetransmitted: 75820,
  pliCount: 262,
  roundTripTime: 39.00146484375,
  rtxSsrc: 710137139,
  score: 0,
  ssrc: 178838647,
  timestamp: 5819796603,
  type: 'outbound-rtp',
}

BTW we missed this one:

How can a proxy resolve the issue if it is about gstreamer or bad network conditions? :thinking:

If GStreamer is ignoring those PLIs and it’s not generating video keyframes, those PLIs are useless. You should check whether GStreamer is sending keyframes or not. You can use this to trace keyframes.

If there is something I’ve learnt in all these years, it is this: never try to understand why something “fixes” an issue that you don’t yet understand. I won’t waste time on that because it is irrelevant.
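
As a separate illustration (not the tool referred to above), this is a minimal sketch of how an H264 keyframe can be recognized in an RTP payload, following the H264 RTP payload format (RFC 6184). It only looks for IDR slices (NAL unit type 5) in single NAL unit, STAP-A and FU-A packets; the function name is hypothetical and other packetization types are ignored.

```typescript
// Returns true if the RTP payload carries (the start of) an H264 IDR slice.
function isH264KeyframePayload(payload: Uint8Array): boolean {
  if (payload.length < 1) return false;

  const nalType = payload[0] & 0x1f; // low 5 bits of the first byte

  // Single NAL unit packet carrying an IDR slice.
  if (nalType === 5) return true;

  // STAP-A (type 24): 1 header byte, then repeated [2-byte size][NAL unit].
  if (nalType === 24) {
    let offset = 1;
    while (offset + 2 < payload.length) {
      const size = (payload[offset] << 8) | payload[offset + 1];
      if (size === 0) break;
      if ((payload[offset + 2] & 0x1f) === 5) return true;
      offset += 2 + size;
    }
    return false;
  }

  // FU-A (type 28): second byte holds the start bit and the original NAL type.
  if (nalType === 28 && payload.length >= 2) {
    const isStart = (payload[1] & 0x80) !== 0;
    return isStart && (payload[1] & 0x1f) === 5;
  }

  return false;
}

console.log(isH264KeyframePayload(new Uint8Array([0x65, 0x88]))); // true (single IDR NAL unit)
console.log(isH264KeyframePayload(new Uint8Array([0x41, 0x9a]))); // false (non-IDR slice)
```

Counting how often this returns true over a capture of the sender's RTP stream shows whether keyframes are produced after PLIs are sent.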

Okay, I’ll check it. :+1: