Track if Producers or Consumers are alive

vpalmisano · May 27, 2020, 2:29pm

Hi all,
in some cases (e.g. to clean up “stale” transports when a client has stopped sending or receiving RTP packets for longer than a timeout), it would be useful to track whether a Producer or a Consumer is still receiving / sending packets. I tried enableTraceEvent, but I think this could degrade server performance.
How about introducing a lastRtpProcessedTime property (or similar) for each Producer and Consumer? Or, even better, an rtpInactive event?
Thanks

vpalmisano · May 27, 2020, 3:27pm

I’m trying to use transport.getStats() and track the bytesSent and bytesReceived values.
I’m seeing that, even if the browser that created the transport is closed (without closing the transport on the server), the bytesSent value keeps increasing periodically (RTP probe packets?).

ibc · May 27, 2020, 10:37pm

They may be RTP probation packets or regular RTP packets. If the server side transport and/or its consumers were not closed, it’s expected that mediasoup will keep sending RTP to the remote endpoint.

A lastRtpProcessedTime property would be as inefficient as using the trace event, since we would have to send a message from C++ to Node for every received/sent RTP packet to update it.

An rtpInactive event sounds better, but it is not so easy. What about simulcast streams? What if no RTP is sent but just SCTP messages? Is it a per-Producer event or a per-Transport event?

Could you please describe a bit more your scenario & use case and why this is needed for you?

vpalmisano · May 28, 2020, 7:14am

I need a mechanism to clean up the transports that are not sending or receiving data from the corresponding client endpoint. For the moment, I’m using this code, which tracks the bytesReceived value:

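    // Track byte counters and the time of the last received data.
    transport.appData.stats = {
      bytesSent: 0,
      bytesReceived: 0,
      lastActivity: Date.now(), // start the inactivity window now, not at epoch 0
    };
    transport.appData._getStats_t = setInterval(async () => {
      if (transport.closed) {
        // The transport was closed elsewhere: stop polling.
        clearInterval(transport.appData._getStats_t);
        return;
      }
      const stats = await transport.getStats();
      const now = Date.now();
      transport.appData.stats.bytesSent = stats[0].bytesSent;
      if (stats[0].bytesReceived > transport.appData.stats.bytesReceived) {
        transport.appData.stats.bytesReceived = stats[0].bytesReceived;
        transport.appData.stats.lastActivity = now;
      } else if (now - transport.appData.stats.lastActivity > 10 * 1000) {
        // Inactivity detected: stop polling and close the transport.
        clearInterval(transport.appData._getStats_t);
        transport.close();
      }
    }, 5 * 1000);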

copiltembel · May 28, 2020, 2:21pm

I also believe such a mechanism would be beneficial. I use the signaling mechanism to detect if a peer has disconnected, which works but it’s not ideal.

ibc · May 28, 2020, 2:44pm

Please open a feature request in GitHub and we’ll consider it when possible (right now terribly busy).

copiltembel · May 29, 2020, 6:30am

BTW, isn’t receiving a score of 0 in producer.on("score", fn(score)) and consumer.on("score", fn(score)) an indication that those peers are not sending/receiving anything, and so a possible mechanism for cleaning up?

vpalmisano · May 29, 2020, 10:51am

@copiltembel: I think it is not ideal: from my tests when I stop a consumer (closing the browser window without sending any message to the signaling service) I get no score events on server side.

@admins: the problem could be resolved by implementing something like the AudioLevelObserver (https://github.com/versatica/mediasoup/blob/v3/worker/src/RTC/AudioLevelObserver.cpp).

TheSalarKhan · May 29, 2020, 5:20pm

AudioLevelObserver might not be a better approach in my opinion since it might require some CPU etc. But you can do something along the lines of what mediasoup-demo does. It uses a websocket for every connected participant and when the websocket disconnects - tab closes - the server lets go of all the resources for that participant - consumers and producers.

vpalmisano · May 29, 2020, 5:23pm

Yes, this is the ideal case. But what if the websocket server fails to handle the client disconnection?

jmillan · May 30, 2020, 11:03am

My opinion is that your effort should go on making that client disconnection handling reliable in that case.

Looking at RTP reception in order to know if an endpoint is alive is not meaningful unless you control the nature of your endpoints and your mediasoup usage. If you are certain that N (let’s say 10) seconds of RTP inactivity can only mean that the endpoint is not alive, then there you are. Of course, you must take probation etc. into account in case it’s being used.

For more generic and reliable situations I would not rely on RTP reception for considering an endpoint not alive, but use another means for checking periodically whether the endpoint is alive.

ibc · May 31, 2020, 2:18pm

Then you have a bigger problem, since without signaling, your app will lose notifications about new consumers, etc.

We cannot do magic with RTP activity. RTP can drop to 0 if DTX (Discontinuous Transmission) is used. Nothing says that RTP must be continuous. We cannot assume on the server side that an occasional lack of RTP activity means an error. For instance, if you mute your mic or share static video content, it’s perfectly fine that your endpoint does not send any new RTP packet until a change happens.

If you want to monitor from server side whether the remote endpoint is alive or not (at WebRTC / ICE level), you can do that with some kind of ping/pong messages over DataChannel.
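The ping/pong idea can be sketched roughly as follows. This is only a minimal illustration, not mediasoup API: `LivenessTracker`, `startHeartbeat`, the `send` callback and the timeout values are all hypothetical names and numbers chosen for the example. `send` stands for whatever delivers a message to the client (a DataChannel, the signaling WebSocket, etc.).

```javascript
// Hypothetical helper (not part of mediasoup): tracks liveness from pong replies.
class LivenessTracker {
  constructor({ timeoutMs = 10000, now = Date.now } = {}) {
    this.timeoutMs = timeoutMs;
    this.now = now; // injectable clock, defaults to Date.now
    this.lastPongAt = now();
  }

  // Call whenever a "pong" message arrives from the remote endpoint.
  onPong() {
    this.lastPongAt = this.now();
  }

  // True while the last pong is not older than the timeout.
  isAlive() {
    return this.now() - this.lastPongAt <= this.timeoutMs;
  }
}

// Sends a ping every intervalMs and calls onDead() once when pongs stop arriving.
// Returns a function that stops the heartbeat.
function startHeartbeat({ send, tracker, intervalMs = 5000, onDead }) {
  const timer = setInterval(() => {
    if (!tracker.isAlive()) {
      clearInterval(timer);
      onDead();
      return;
    }
    send("ping");
  }, intervalMs);
  return () => clearInterval(timer);
}
```

The application would call `tracker.onPong()` from its own message handler and release the peer's transports in `onDead`. Unlike RTP byte counters, this keeps working when the endpoint is muted or using DTX.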

j1elo · May 24, 2021, 11:26am

Maybe the issue here is assuming a strong coupling between an actual fact (no RTP packets have been seen in N seconds) and its particular semantics (a remote endpoint has abruptly stopped sending data and should be considered dead?.. but that’s just one possible interpretation…).

Just to give another perspective of how this could work; in Kurento we have a flow detection probe that is connected to the audio and video input/output pads of the WebRtcEndpoints. These probes simply trigger an event when data is actually flowing in (or out) to/from these Endpoint pads to/from the rest of the internal pipeline, and the event represents one of two state changes:

From Not Flowing to Flowing: if the last state had been “Not Flowing” but now a data buffer has passed through the pad.
From Flowing to Not Flowing: if the last state had been “Flowing” but now 2 seconds have elapsed without any other buffer passing through the pad.

Now, it is up to the application to decide what to do with these events, but at least having them allows for some fine-grained analysis and lets the app detect unexpected situations.
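The two state changes above can be sketched as a small state machine. This is an illustration only, assuming a hypothetical `FlowDetector` class; it is neither Kurento's implementation nor an existing mediasoup feature. The 2-second default comes from the description above, and the caller is assumed to feed it timestamps (for example from periodically polled byte counters).

```javascript
// Hypothetical flow detector modeled on the two state changes described above.
class FlowDetector {
  constructor({ timeoutMs = 2000, onChange = () => {} } = {}) {
    this.timeoutMs = timeoutMs;
    this.onChange = onChange; // called with "Flowing" or "NotFlowing"
    this.flowing = false;
    this.lastBufferAt = null;
  }

  // Call for every buffer (or batch of data) that passes through.
  onBuffer(now = Date.now()) {
    this.lastBufferAt = now;
    if (!this.flowing) {
      this.flowing = true;
      this.onChange("Flowing");
    }
  }

  // Call periodically, e.g. from a timer, to detect the timeout.
  check(now = Date.now()) {
    if (this.flowing && now - this.lastBufferAt >= this.timeoutMs) {
      this.flowing = false;
      this.onChange("NotFlowing");
    }
  }
}
```

The detector only reports the fact (data is or is not flowing). Deciding whether "NotFlowing" means a dead endpoint, a muted mic, or a static screenshare is left to the application.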

Having these events has proven very useful to provide applications with confirmation coming directly from the media plane (in combination with app-specific logic from the signaling plane). Of course, trivial logic such as assuming “Not Flowing == Disconnected” is too simplistic. But more detailed logic can be programmed in order to check the facts and decide what to do.

(e.g. for people using this event on the main speaker’s endpoint, such as the teacher in a teacher-student app, it is typically adequate to assume that the flow should not suddenly halt in the middle of the session, so if a NotFlowing event occurs, the endpoint gets reconnected. However, this doesn’t necessarily apply to a screenshare, because there might be NotFlowing events if the desktop is static.)

dimoochka · May 25, 2021, 12:39pm

I agree with what ibc wrote. I implement this using a ping/pong over the websocket and time it out after a few seconds of inactivity on both the client and server end. I also hook in a reconnection attempt with the timeout, websocket close/error, and reconnection timeout events and it works surprisingly well. When a client is a mobile device and transitions from cellular to wifi, for example, the reconnection happens within seconds and there’s minimal downtime. Works better than popular video conferencing apps in my own biased opinion.

Unfortunately this method gets pretty janky on unstable connections (lots of unnecessary reconnections/interruptions because I have the timeout set pretty low). However, given the poor signal quality I’m sure the video would be garbage anyway even if I wasn’t forcibly closing it.

jbaudanza · May 25, 2021, 6:30pm

I use a similar method with socketio timeouts (set to 30 seconds). Also, if the socket has a ping timeout or a transport error, I allow 10 seconds for the client to reconnect on another socket and reclaim all the transports. If the socket disconnects gracefully, I clean up all the transports immediately.

All of my clients are on mobile devices so changing networks and IP addresses is pretty common. This method helps with the “walk-out-the-door” problem.
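The grace-period approach described above could look roughly like this. It is a sketch under assumptions, not the poster's actual code: `PeerRegistry`, `closeTransports` and the `graceful` flag are hypothetical, and the application is assumed to map its own socket disconnect reasons to `graceful` true or false.

```javascript
// Hypothetical registry: keeps a peer's transports for a grace period after an
// abrupt disconnect so that a new socket can reclaim them.
class PeerRegistry {
  constructor({ graceMs = 10000, closeTransports }) {
    this.graceMs = graceMs;
    this.closeTransports = closeTransports; // app-supplied cleanup callback
    this.pending = new Map(); // peerId -> timer handle
  }

  // Cancels a pending cleanup; returns true if there was one.
  cancelPending(peerId) {
    const timer = this.pending.get(peerId);
    if (timer === undefined) return false;
    clearTimeout(timer);
    this.pending.delete(peerId);
    return true;
  }

  // graceful: true for a clean disconnect, false for a ping timeout or transport error.
  onDisconnect(peerId, graceful) {
    this.cancelPending(peerId);
    if (graceful) {
      this.closeTransports(peerId); // clean up immediately
      return;
    }
    const timer = setTimeout(() => {
      this.pending.delete(peerId);
      this.closeTransports(peerId); // grace period expired
    }, this.graceMs);
    this.pending.set(peerId, timer);
  }

  // Returns true if the peer reconnected in time and keeps its transports.
  onReconnect(peerId) {
    return this.cancelPending(peerId);
  }
}
```

A reconnecting client would identify itself with the same `peerId` (however the app authenticates it), and the server would reattach the existing transports only when `onReconnect` returns true.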

dimoochka · May 25, 2021, 6:40pm

That’s a cool way of doing it. When you reclaim existing transports, do you check whether the client’s IP is identical to make sure they have the correct transport endpoint? I can see a mobile client dropping off wifi, getting a cellular connection right away and reconnecting, while incoming streams are still being sent to the old endpoint.

jbaudanza · May 25, 2021, 7:02pm

I think if your server detects the client IP address has changed, you’re supposed to restart ICE.

Alternatively, if you’re using the native libwebrtc library (not through the browser), you can set an option called “continual gathering”. This isn’t a standardized part of WebRTC and I’m not sure it would even work with mediasoup, but it’s supposed to make switching networks more graceful.
https://webrtc.googlesource.com/src/+/refs/heads/master/sdk/objc/api/peerconnection/RTCConfiguration.h#53

That being said, I don’t do any of these and everything still seems to work. It’s on my TODO list to dig into why. Maybe mediasoup doesn’t care if the source IP address on the selected tuple changes mid-stream? I dunno. If anyone else on this thread has some insights on this, I’d love to hear them.
