When a peer’s internet connection drops (for instance, when the Wi-Fi is turned off and back on), the peers in the room lose their connection entirely, even after the network is restored.
On a desktop device with Wi-Fi, turn off the Wi-Fi connection.
Turn the Wi-Fi back on.
Ideally, both peers should remain in the room, maintaining their connection and room state after the Wi-Fi is restored. Unfortunately, the peers lose their connection entirely and are not able to rejoin automatically after the internet connection is restored.
I wanted to kindly ask if there is a known best practice or recommended approach within Mediasoup to handle such network disruptions, ensuring that peers can stay connected to the room during brief interruptions (such as turning Wi-Fi on and off), without losing their session entirely.
NOTE: This description intentionally omits security checks for simplicity. Make sure the necessary security checks and procedures are in place for production use.
You could use something like a token/identifier to identify a room session. This identifier is shared with clients when they first join a room and can be presented to the server when the client is rejoining after a network interruption so as to be reconnected back to the room.
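To illustrate the token idea above, here is a minimal server-side sketch. It assumes an in-memory store and a fixed grace period; `RoomSessionStore`, its method names, and the 60-second default are all hypothetical and not part of mediasoup. Like the description above, it leaves out authentication.

```javascript
const { randomUUID } = require('node:crypto');

// Hypothetical in-memory store; names and the grace period are illustrative.
class RoomSessionStore {
  constructor(graceMs = 60_000, now = () => Date.now()) {
    this.graceMs = graceMs;
    this.now = now;
    this.sessions = new Map(); // token -> { roomId, peerId, disconnectedAt }
  }

  // Called on first join: returns the token the client keeps for later.
  issue(roomId, peerId) {
    const token = randomUUID();
    this.sessions.set(token, { roomId, peerId, disconnectedAt: null });
    return token;
  }

  // Called when signaling drops: start the grace period instead of evicting.
  markDisconnected(token) {
    const session = this.sessions.get(token);
    if (session) session.disconnectedAt = this.now();
  }

  // Called on rejoin: returns the session if still within the grace period,
  // otherwise forgets the token and returns null.
  resume(token) {
    const session = this.sessions.get(token);
    if (!session) return null;
    const expired =
      session.disconnectedAt !== null &&
      this.now() - session.disconnectedAt > this.graceMs;
    if (expired) {
      this.sessions.delete(token);
      return null;
    }
    session.disconnectedAt = null;
    return { roomId: session.roomId, peerId: session.peerId };
  }
}
```

A client would store the token returned by `issue()` and present it after the network comes back; a `null` from `resume()` means a full rejoin is needed.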
The mediasoup-demo relies on the WebSocket connection so if the network changes then it’s disconnected and, by design in the demo application, the WebRtcTransports are closed.
If you switch your signaling path to DataChannels (once the WebRTC layer is connected), then you can survive network changes or even temporary network failures. This assumes that you use the “restart ICE” API of mediasoup-client and the mediasoup server properly, for which you may need to carry those messages over HTTP or some other channel, since the WebRTC path is temporarily disconnected at those moments.
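One way to picture the routing described above is a small helper that prefers the DataChannel while the transport is connected and falls back to another channel otherwise. This is only a sketch: `sendSignal`, `pickSignalingPath`, and `httpFallback` are hypothetical names, and `dataChannel` is assumed to be any object exposing `readyState` and `send()` like an `RTCDataChannel`.

```javascript
// Decide which path can carry a signaling message right now.
// A DataChannel may still report 'open' while ICE is disconnected,
// so the transport state is checked as well.
function pickSignalingPath(transportState, dataChannelState) {
  return transportState === 'connected' && dataChannelState === 'open'
    ? 'datachannel'
    : 'http';
}

// Hypothetical sender: `httpFallback` is an application-supplied async
// function (for example a fetch() POST to the app's own endpoint).
async function sendSignal(message, transportState, dataChannel, httpFallback) {
  const payload = JSON.stringify(message);
  const state = dataChannel ? dataChannel.readyState : 'closed';
  const path = pickSignalingPath(transportState, state);
  if (path === 'datachannel') {
    dataChannel.send(payload);
  } else {
    await httpFallback(payload);
  }
  return path;
}
```

With this shape, a restart ICE request issued while the transport is `disconnected` would go over the fallback path automatically.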
I’m not entirely sure, but I recall that MediaSoup was able to handle these types of disconnections in the past. I did notice that if I disconnect the Wi-Fi and reconnect it within a short time, the connection remains active: the transport reconnects and the peers stay in the room.
mediasoup-client:Transport connection state changed to disconnected +12s
index-DxpmqRpy.js:3881 mediasoup-client:Transport connection state changed to disconnected +282ms
index-DxpmqRpy.js:3881 mediasoup-client:Transport connection state changed to connected +3s
index-DxpmqRpy.js:3881 mediasoup-client:Transport connection state changed to connected +71ms
Is there an internal timeout in MediaSoup that governs transport reconnection behavior, or is it handled dynamically based on network conditions?
This is the current logic I’m using, but I’d prefer to avoid having the client rejoin the room, to minimize the impact on the user experience. Ideally, this behavior should be handled silently in the background, ensuring that all peers remain in the room without disruption.
On mediasoup-demo:
When a disconnection and reconnection occur in a short time, the transport is disconnected and reconnected, but the peers stay in the room without needing to rejoin.
When the disconnection lasts for a longer period, such as 30+ seconds, the transport is completely lost (state changes from disconnected to failed) on the Mediasoup side. As a result, upon reconnection, I lose all the peer streams in the room and need to refresh the page to rejoin.
mediasoup-client:Transport connection state changed to disconnected +6s
index-DxpmqRpy.js:3881 mediasoup-client:Transport connection state changed to disconnected +813ms
index-DxpmqRpy.js:3881 mediasoup-client:Transport connection state changed to failed +9s
index-DxpmqRpy.js:3881 mediasoup-client:Transport connection state changed to failed +813ms
I assume we are just talking about WebRtcTransports because WebSocket is out of the scope of mediasoup. After ~30 seconds without receiving ICE Binding Requests (AKA ICE Pings), mediasoup closes the transport as per the ICE specifications.
I’ve set iceConsentTimeout to 60 seconds for testing, but when I turn off Wi-Fi, the transport disconnects and transitions to “failed” before the timeout is reached. Even with iceConsentTimeout set to 0, the behavior remains the same.
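For reference, a setup like the one described above might look roughly as follows on the server. This is a sketch under assumptions: `createTransportWithConsentTimeout` is a hypothetical helper, `router` is assumed to be a mediasoup `Router`, and `listenInfos` is whatever listen configuration the application already uses. The option only governs the server-side check; it does not change how the browser behaves.

```javascript
// Hypothetical helper wrapping router.createWebRtcTransport().
async function createTransportWithConsentTimeout(router, listenInfos, seconds) {
  return router.createWebRtcTransport({
    listenInfos,
    enableUdp: true,
    enableTcp: true,
    // Server-side ICE consent timeout in seconds; 0 disables the check.
    iceConsentTimeout: seconds
  });
}
```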
The goal is to keep the transport alive during longer disconnections (30+ or 60+ seconds), similar to how it seamlessly reconnects after a short Wi-Fi interruption. How can this be achieved in Mediasoup?
Something like this to illustrate the idea:
mediasoup-client: Transport connection state changed to disconnected +12s
index-DxpmqRpy.js:3881 mediasoup-client: Transport connection state changed to disconnected +282ms
After n seconds (e.g., 30+), I turn the Wi-Fi back on, and the transport reconnects without transitioning to the “failed” state:
index-DxpmqRpy.js:3881 mediasoup-client: Transport connection state changed to connected +40s
index-DxpmqRpy.js:3881 mediasoup-client: Transport connection state changed to connected +71ms
The browser will also close its transport side if it doesn’t receive acknowledgment responses (AKA Pongs) for the ICE Binding Requests it sends to the remote party (mediasoup in this case). That’s how the browser behaves and what the ICE specification states.
If the browser doesn’t get a response for a certain period, it automatically closes the transport, even before iceConsentTimeout expires.
This is not controlled by Mediasoup but is a built-in behavior of the browser according to the ICE specification.
Even if Mediasoup keeps the transport open, the browser itself might close it first due to missing Pongs. This is why the transport fails even before reaching the iceConsentTimeout.
Then, there is no way to keep the Mediasoup transport alive. If the browser closes the transport, the only option is to fully rejoin the room by creating a new transport when the connection is restored?
Yes and no. If the browser doesn’t get a response to any of the pings it sends for 30 seconds, the ICE state moves to closed and the transport is closed.
OK, so if the browser closes the transport because it doesn’t receive a response for its ICE pings during the specified period (usually around 30 seconds), then the only option is to fully rejoin the room by creating a new transport when the connection is restored?
I think this is a very interesting topic, and I hope others share alternative strategies that are less invasive than having to rejoin the room in case of unstable connections.
It would be fantastic if a workaround could be developed to handle this directly in Mediasoup. Such a feature would add significant value and improve the user experience during unstable connections!
No, mediasoup cannot do anything here on its own. If the network is down for 30 seconds then both client and server will mark their transports as disconnected. The application can try to reconnect using the restart ICE API. That’s up to the application.
And BTW, if the network is down for 30 seconds that’s not an unstable connection, that’s much worse, especially if we are talking about an audio and video application. All video apps do the same whether or not they implement ICE, and it’s the same in legacy telephony and in the SIP protocol.
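On the server, the restart ICE step mentioned above could be handled along these lines. This is a sketch, not mediasoup's own code: `handleRestartIce` and the `transports` map are hypothetical application pieces, while `restartIce()` is the method on the server-side `WebRtcTransport` that returns fresh ICE parameters.

```javascript
// `transports` is a hypothetical Map of transportId -> mediasoup WebRtcTransport.
async function handleRestartIce(transports, transportId) {
  const transport = transports.get(transportId);
  if (!transport || transport.closed) {
    // The server no longer has this transport, so an ICE restart cannot help;
    // the client has to create a new transport.
    return { ok: false, reason: 'transport-not-found' };
  }
  // Generates new ICE credentials that the client must apply on its side.
  const iceParameters = await transport.restartIce();
  return { ok: true, iceParameters };
}
```

The response has to reach the client over a path that still works, which is why the signaling channel matters here.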
The application can try to reconnect using the restart ICE API. That’s up to the application.
You mean that when the transport disconnects I can call restartIce on the client side, so that the browser’s ICE timer is reset and it attempts to re-establish the connection with new ICE candidates?
If so, a possible solution to extend the transport’s lifespan before closing it could be:
...
case 'disconnected':
  // Connectivity was lost but the transport is still open: try an ICE restart right away.
  if (this.transport && !this.transport.closed) {
    console.warn('Transport disconnected. Attempting to restart ICE...');
    await this.restartIce();
  }
  break;
case 'failed':
  console.warn('Transport connection failed. Attempting to restart ICE...');
  await this.restartIce();
  // Wait a bit and check if the transport is still in the "failed" state
  setTimeout(() => {
    if (this.transport && this.transport.connectionState === 'failed') {
      console.error('ICE restart failed. Closing transport...');
      this.transport.close();
    }
  }, 5000);
  break;
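The snippet above does not show what `this.restartIce()` does. One possible shape, sketched under assumptions, is to ask the server for new ICE parameters and pass them to the mediasoup-client transport, retrying a few times because signaling may also be down while the network is out. `restartIceWithRetry`, `requestIceParameters`, and the retry settings are hypothetical.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// `transport` is assumed to be a mediasoup-client Transport.
// `requestIceParameters` is a hypothetical app function that asks the server
// to restart ICE for the given transport id and resolves with its answer.
async function restartIceWithRetry(transport, requestIceParameters, opts = {}) {
  const { maxAttempts = 5, delayMs = 2000, sleep = defaultSleep } = opts;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (transport.closed) return false;
    try {
      const iceParameters = await requestIceParameters(transport.id);
      await transport.restartIce({ iceParameters });
      // The restart was applied; whether connectivity actually returns is
      // still reported through the transport's connection state.
      return true;
    } catch (error) {
      // Signaling may be unreachable too: back off before the next attempt.
      if (attempt < maxAttempts) await sleep(delayMs * attempt);
    }
  }
  return false;
}
```

A `false` result would be the point at which the application gives up and falls back to creating a new transport.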
Timeouts are part of the spec; many protocols require them to determine whether a user is gone.
Both the client and the server side have to deal with them in their own way, which makes handling this a bit confusing.
When a user is partially lagging, their client can detect enough of it to submit an ICE restart request. If a user disconnects entirely, though, they have to respect the timeouts in place.
You can design your application so it doesn’t really look like a full disconnect to the user. A good example: a user disconnects with 12 cameras up, and 2 users stop broadcasting in the meantime. The disconnected user, with the tab still loaded, still sees 12 cameras, all frozen, plus the chat box, and has access to all their features (to an extent). When they reconnect, the app clears whoever is no longer broadcasting and resets the videos: little visible change, full reintegration.
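The reconnect step in that example comes down to comparing what the UI is still showing with what the server reports after rejoining. A minimal sketch, where `reconcilePeers` and the idea of plain peer id arrays are assumptions made for illustration:

```javascript
// Compare the peers still shown (frozen) in the UI with the peers the server
// reports after reconnecting.
function reconcilePeers(shownPeerIds, currentPeerIds) {
  const shown = new Set(shownPeerIds);
  const current = new Set(currentPeerIds);
  return {
    // Still broadcasting: keep the tile and reattach fresh media to it.
    keep: shownPeerIds.filter((id) => current.has(id)),
    // Stopped broadcasting while we were away: remove the tile.
    remove: shownPeerIds.filter((id) => !current.has(id)),
    // Started broadcasting while we were away: add a tile.
    add: currentPeerIds.filter((id) => !shown.has(id))
  };
}
```

Applying only these differences, rather than rebuilding the page, is what keeps the rejoin from looking like a full disconnect.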