Best practices for port ranges and server configuration for corporate-grade networks

Hi,
we are using mediasoup 3 in our product. Recently we got a couple of reports of missing (“black”) cameras from demo sessions with corporate clients. mediasoup works great in most cases, and we are also using Twilio TURN servers (we have tested them and they work), but for a certain group of clients we get missing cameras. The clients report black cameras because in such environments they use phones for audio, so we don’t know whether the audio streams would work, but I assume the behaviour would be the same as for video.

The problem is that we have no way to test or debug on the problem networks, or to ask the clients to run specific tests or third-party tools. Our best guess at this point is some kind of aggressive firewall, or perhaps some kind of proxy server that is blocking the WebRTC communication (the TCP socket communication is working as expected). We are currently collecting statistics on the server and client side and hope they will shed more light on these issues. In the meantime I would appreciate any feedback from the mediasoup community about similar issues and/or suggestions on how we can identify the source of the problem, as well as any guidelines for mediasoup configuration, TURN servers, or port ranges that are expected to work best in corporate-grade networks.

Thanks!

Some corporate networks blacklist Twilio servers. This surprised us – not sure of the reason for it – but it’s something we see fairly often.

In general, we’ve found that it’s useful to characterize and log STUN/TURN issues, and to write code that allows you to fall back to a secondary provider. We’ve periodically had outages/issues with every TURN service provider.

The two major classes of failure are: 1) no data available from STUN servers (they’re blocked, down, or there’s an issue with credentials), and 2) the servers respond but do not return relay candidates.
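For illustration, here is a rough sketch of the kind of client-side probe we mean (the function name, timeout and server entry are placeholders, not code from our product): it forces relay-only ICE gathering against a single TURN server and reports whether any relay candidate comes back. No relay candidate within the timeout corresponds to one of the two failure classes above.

```ts
// Sketch: probe one TURN server and check whether it returns relay candidates.
async function probeTurn(iceServer: RTCIceServer, timeoutMs = 5000): Promise<boolean> {
  const pc = new RTCPeerConnection({
    iceServers: [iceServer],
    iceTransportPolicy: 'relay' // gather relay candidates only
  });

  // A data channel is enough to trigger ICE gathering.
  pc.createDataChannel('turn-probe');

  const gotRelay = new Promise<boolean>((resolve) => {
    const timer = setTimeout(() => resolve(false), timeoutMs);
    pc.onicecandidate = (event) => {
      // Relay candidates contain "typ relay" in the SDP candidate line.
      if (event.candidate && event.candidate.candidate.includes(' typ relay ')) {
        clearTimeout(timer);
        resolve(true);
      }
    };
  });

  await pc.setLocalDescription(await pc.createOffer());

  const result = await gotRelay;
  pc.close();
  return result;
}
```

Running something like this once per provider (at call setup or on failure) gives you enough to log and to decide whether to switch to the secondary provider.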

Corporate networks do not seem to block Xirsys servers in the same way that they block Twilio servers.

Do those “corporate networks” also block TURN over HTTPS port 443?

Our data set for these old-school, locked-down “corporate networks” is not large, as this is not our target market. But in our experience, TURN over HTTPS on 443 is usually not blocked, even if the Twilio servers are blacklisted.

The networks that seem to block TURN on 443 also often block any upgrade to web sockets. So, YMMV, but if you are using web sockets for signaling, and your signaling connection is working, but TURN is not working, my first guess would be that the issue is with the specific TURN servers, rather than TURN in general.
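As a sketch of what this can look like on the client (hostnames and credentials below are placeholders), the ICE server list includes plain TURN, TURN over TLS on 443, and a secondary provider; the ICE agent gathers candidates from all entries, so a reachable secondary covers the case where the primary is blacklisted:

```ts
// Rough sketch of a client-side ICE server list. Hostnames and credentials
// are placeholders, not real endpoints.
const iceServers: RTCIceServer[] = [
  {
    urls: [
      'turn:turn.example.com:3478?transport=udp',
      'turn:turn.example.com:3478?transport=tcp',
      'turns:turn.example.com:443?transport=tcp' // TURN over TLS on 443
    ],
    username: 'user',
    credential: 'secret'
  },
  {
    // Secondary provider, in case the primary's IP ranges are blacklisted.
    urls: ['turns:turn.backup-provider.example:443?transport=tcp'],
    username: 'user2',
    credential: 'secret2'
  }
];

const pc = new RTCPeerConnection({ iceServers });
```

If I remember correctly, with mediasoup-client this same list goes into the iceServers option you pass when creating the send/recv transports.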

I’m not an expert on this, but TURN over TLS is not the same as HTTPS (even if it uses port 443). That is, traffic middleboxes “may” detect that it’s not HTTPS and decide to block it. However, I don’t think that usually happens.

BTW, I’m not sure what the current status of this thread is. I mean, if you have verified that your TURN is blocked (even TURNS over port 443) then there is little that can be done, or am I missing something?

You’re right, of course, if TURN over TLS is generally blocked, you can’t do anything.

But we have seen cases where specific TURN servers are blocked, we assume because they are part of the published IP ranges used by Twilio. But TURN over TLS, in general, is not blocked.

Clear, thanks. Can we then close this topic?

Thanks for the replies. It is really surprising that some companies are blocking Twilio servers. I will try to update the thread with our findings as well, but as I said we can’t really retest on a problematic network. The thing is that these issues happen in demo sessions (with prospective clients), and if the session does not go well there is rarely another session with the same company. Also, such demo sessions are not very frequent, so I do not expect to have any update soon. Basically we will try to collect more stats and identify what is blocked (UDP, TURN servers, ports). Only after we have a clearer understanding of the problems will we try to add some additional TURN servers.

I have a couple of follow-up questions though:

  1. Do you have any guidelines on how to identify what is blocked (UDP, TURN servers, ports)? At this point we will try using WebRTC stats, but for now it doesn’t seem so straightforward (see the sketch after this list for the kind of check we have in mind).
  2. If the UDP communication is blocked on the client network and the mediasoup transport is created with enableUdp, enableTcp, preferUdp all set to true - does this mean the streams will go through the mediasoup server and we won’t use TURN?
  3. Also, if in the same room (router) we have some peers using UDP and others using TCP (in mediasoup only, no TURN involved), should we expect some issues (for example, peers using TCP don’t see peers using UDP or vice versa)? Currently we see some strange warnings and errors in the mediasoup server logs (similar to ERROR:Worker worker process died unexpectedly). So if you could clarify what exactly enableUdp, enableTcp and preferUdp are meant to do, it might help.
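For reference, this is roughly the getStats() check we have in mind for question 1. It is only a sketch: it assumes access to the underlying RTCPeerConnection (or equivalent transport stats from mediasoup-client), and the stat field names follow the W3C spec, so older browsers may report them differently.

```ts
// Sketch: find the selected candidate pair and log which protocol and
// candidate type are actually carrying media.
async function logSelectedPath(pc: RTCPeerConnection): Promise<void> {
  const stats = await pc.getStats();
  stats.forEach((report) => {
    if (report.type === 'candidate-pair' && report.nominated && report.state === 'succeeded') {
      const local = stats.get(report.localCandidateId);
      const remote = stats.get(report.remoteCandidateId);
      console.log('protocol:', local?.protocol);        // 'udp' or 'tcp'
      console.log('local type:', local?.candidateType); // 'host' | 'srflx' | 'relay'
      console.log('remote type:', remote?.candidateType);
    }
  });
}
```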

Thanks.

Honestly, I never thought about that.

It’s the client that decides which transport to use. TURN candidates (only visible to the client) will be selected if the others fail. There is no magic in mediasoup here: mediasoup just tells the client its candidate preferences (IPs, ports and protocols) given the order of the listenIps entries and the values of preferUdp and preferTcp. Nothing else.
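For example, a server-side transport created like this (a minimal sketch, the IPs are placeholders) offers both UDP and TCP ICE candidates, with the UDP candidate at higher priority because of preferUdp:

```ts
// mediasoup v3, server side: 'router' is an existing mediasoup Router.
const transport = await router.createWebRtcTransport({
  listenIps: [{ ip: '0.0.0.0', announcedIp: '203.0.113.10' }],
  enableUdp: true,
  enableTcp: true,
  preferUdp: true
});

// transport.iceCandidates (sent to the client) are ordered by priority; the
// client's ICE agent tries them and falls back to TCP, or to its own TURN
// candidates if any are configured client-side, when UDP does not work.
```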

It should work without any issue at all. However, it seems there is a bug in TCP that we are not (yet) able to pin down. Please follow this topic and especially the comment I reference.

From now on, please track this new issue in GitHub and let’s continue there:

Thanks, I will follow the issue on GitHub and will update this thread if/when we find some valuable patterns for blocked WebRTC communication.

Please comment out the full if (IsRunning()) block here and let us know if it crashes again.

Just in case: go to YOUR_APP/node_modules/mediasoup/worker/src/RTC/ and edit DtlsTransport.cpp as indicated.

Ok, we will try this, but I cannot give any time estimate for the feedback. If we get some meaningful results I will update this thread and https://github.com/versatica/mediasoup/issues/333.

A lack of “ERROR:Worker worker process died unexpectedly” is what I’m looking for :slight_smile:
If it does not happen again during the next few days, please let us know in the GitHub issue.