PlainRtpTransport: Why require the same source IP:port as given to connect()?

Hi,

I’m finishing an integration demo with Kurento filters, where the architecture is as follows:

Browser >--(WebRTC)--> mediasoup >--(RTP)--> Kurento (applies a filter)
                                             |
Browser <--(WebRTC)--< mediasoup <--(RTP)--<--

Obviously, here “Kurento” could be any external 3rd-party RTP tool such as a custom GStreamer pipeline or a FFmpeg processor. It could even be that the tool receiving RTP is different from the one sending it back to mediasoup.

This setup could be theoretically accomplished with a single PlainRtpTransport that had both a Consumer to send video out, and a Producer to receive video in. However, it won’t work if the external tool doesn’t implement symmetric RTP ports, i.e. the tool needs to send from the very same port that it uses to receive data, which is the port where the PlainRtpTransport has been connect()ed. Or, in the case of two tools, they need to be coordinated to use the exact same port to receive and send data.

If this requirement is not satisfied, then the PlainRtpTransport will just drop the incoming data with the (debug) message ignoring RTP packet from unknown IP:port. For the curious, this check is done in PlainRtpTransport.cpp:605.
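To make the behaviour concrete, here is a minimal sketch of the kind of source check described above. It is only a model written for this post, not the actual C++ code in PlainRtpTransport.cpp; the names Endpoint and isFromConnectedRemote are made up for illustration.

```typescript
// Hypothetical model of the source check; NOT the real mediasoup implementation.
interface Endpoint {
  ip: string;
  port: number;
}

// Returns true only when the packet's source matches the endpoint given to connect().
function isFromConnectedRemote(
  connected: Endpoint | undefined,
  source: Endpoint
): boolean {
  // Before connect() nothing is whitelisted, so every packet is ignored.
  if (connected === undefined) {
    return false;
  }
  return connected.ip === source.ip && connected.port === source.port;
}

const connectedRemote: Endpoint = { ip: "192.168.1.22", port: 3320 };

// Symmetric sender: sends from the same port it listens on, so it is accepted.
const symmetricAccepted = isFromConnectedRemote(connectedRemote, {
  ip: "192.168.1.22",
  port: 3320,
});

// Non-symmetric sender: same host but a random source port, so it is dropped.
const asymmetricAccepted = isFromConnectedRemote(connectedRemote, {
  ip: "192.168.1.22",
  port: 50000,
});

console.log({ symmetricAccepted, asymmetricAccepted });
```

The second call is the situation I ran into: the packet comes from the right host, but from a port other than the one passed to connect().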

My question is: why shouldn’t PlainRtpTransport just listen for incoming data on the IP:port it binds to, regardless of where it comes from? Why such a specific restriction on the source of the incoming packets?

Symmetric RTP is understandably required in 3rd-party endpoints for mechanisms such as COMEDIA to work properly; this is also warned about in the Kurento documentation for users who wish to use its COMEDIA mode. But for general-purpose RTP transmission, it doesn’t look necessary. WebRTC doesn’t use it either. Furthermore, this is not a technical limitation: I just tried commenting out that return, and everything works beautifully.

The easy way to avoid this problem was to just use 2 PlainRtpTransports, one to send and another to receive. It’s just that with a receive-only PlainRtpTransport, the semantics of the connect() method are a bit strange (really you are just “connecting” in order to “whitelist” a remote IP and port, to receive data from it).

I’m not saying you should change this, just would like some context to better understand the issue. Although if you see it fitting, then I’d propose removing that restriction :slight_smile:

For security reasons. Otherwise anyone from anywhere could send RTP packets into the mediasoup plain transport binding port. Note that this cannot happen in WebRTC since those RTP packets must be encrypted with SRTP keys previously negotiated over DTLS after ICE procedures authenticated with user/pwd.

I disagree here. connect() is just about connecting the transport, it’s not about sending media to it. That is, you must connect the transport before you can use it for sending or receiving. Yup, maybe we could have chosen another name instead of connect(), but it’s too late to change it :slight_smile:

That said, I strongly think that, in your scenario, creating two separate plain transports (one for sending and one for receiving) is perfectly elegant and appropriate.

Actually in a receive-only case, when using UDP there is nothing to do after starting to listen on some IP and port… which is what creating the transport already does. Indeed once the transport is created, the C++ code is already listening and receiving data, just ignoring on purpose all incoming packets.

Requiring a further “connect” action (or “start” or whatever, I get the idea) is artificial, just a forced extra step of whitelisting the desired remote port, and getting this concept right, even after reading docs, was confusing at the beginning.

That’s “security through obscurity”, although I guess in practice it kinda works. But ultimately if an RTP stream is to be secured I believe it should rely on proper mechanisms to do so, i.e. using SRTP, instead of arbitrary port restrictions.

In any case the problematic part was sending RTP to mediasoup; it was possible to do thanks to the support for COMEDIA in mediasoup, otherwise it would have been impossible so thanks for adding that.

There is now a complete demo of sending RTP out from mediasoup to Kurento, applying a realtime video filter, and sending the modified video back to mediasoup, in here:

(note that a diagram is supposed to explain the architecture but right now Github seems to have broken image rendering in README pages)

Somehow the plain transport must be told what the remote IP and port(s) are. Since this often involves SDP O/A and it’s not always clear who the SDP offerer is, we must have transport.connect() as a separate method to be called once the remote has its port(s) ready.

We do not (yet) support SRTP (other than in WebRTC transport), so we do need a way to protect the transport from receiving malicious packets from anywhere.

Cool, congrats!

BTW I’ve added Kurento/mediasoup-demos project to the “Examples” section in mediasoup website. Well, actually it will be published once we release a new version of mediasoup with transport-cc and TypeScript, which will happen soon.

Nice!

So, I would like to summarize the observations I’ve made, and provide a short conclusion at the end, in case this is useful for other users who might be searching for info about RTP; you might also find this useful as an addition to the docs:

  • For sending RTP out from mediasoup, when the other party has already communicated which port(s) it uses for listening, transport.connect() must be used to start sending RTP to those ports. Note however the SDP Offer/Answer model is a mechanism where each party states their listen (inbound) port(s), and AFAIK there is no standard method at all to express their outbound port(s), so in the standard common case, transport.connect() will only be useful for sending RTP data out, but won’t accept the other party’s RTCP Receiver Reports data in - because the other party’s outbound ports with all probability will be random and won’t coincide with those indicated in transport.connect().

  • For receiving RTP (and/or RTCP) in to mediasoup, we have two scenarios:

    • When it is possible to make the other party explicitly bind to an agreed known outbound port, then either transport.connect() or COMEDIA can be used in mediasoup.

    • When it is not possible to choose outbound ports in the other party, it becomes mandatory to replace transport.connect() with COMEDIA in mediasoup. This applies to pure SDP Offer/Answer parties such as Kurento’s RtpEndpoint, but also to other tools (e.g. I believe you cannot make FFmpeg bind to specific outbound IP/ports, that’s why your docs’ FFmpeg example requires the use of COMEDIA in the mediasoup side).

Short-form conclusions from these observations are:

  • If the remote program allows setting specific outbound bind ports and performs Symmetric RTP/RTCP, then both send and receive can be performed with a single PlainRtpTransport in mediasoup. RTP and RTCP packets will flow correctly.

  • Otherwise, RTP can be made to work by managing two different PlainRtpTransports in the application, at the cost of a partially broken RTCP flow:

    • One transport using transport.connect() to send RTP/RTCP, that won’t accept incoming RTCP Receiver Reports because they will be sent from the wrong remote port.
    • Another transport using COMEDIA to receive RTP/RTCP, that will try to send RTCP Receiver Reports to the wrong remote port.
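The conclusions above can be condensed into a small decision helper. This is just a sketch of the reasoning, not a mediasoup API: RemoteCapabilities, TransportPlan and planTransports are hypothetical names, and the single boolean is a simplification of "can the remote bind its outbound socket to a known port and do Symmetric RTP/RTCP".

```typescript
// Hypothetical types for illustration only; none of this is mediasoup API.
type ReceiveMode = "connect" | "comedia";

interface RemoteCapabilities {
  // True when the remote tool can bind its outbound socket to a known port
  // and therefore perform Symmetric RTP/RTCP.
  symmetric: boolean;
}

interface TransportPlan {
  transportCount: 1 | 2;
  receiveModes: ReceiveMode[];
  rtcpFullyWorking: boolean;
}

function planTransports(remote: RemoteCapabilities): TransportPlan {
  if (remote.symmetric) {
    // One transport both sends and receives; either mode works for receiving.
    return {
      transportCount: 1,
      receiveModes: ["connect", "comedia"],
      rtcpFullyWorking: true,
    };
  }
  // One connect()ed transport to send plus one COMEDIA transport to receive;
  // RTCP Receiver Reports are lost or misdirected in this setup.
  return {
    transportCount: 2,
    receiveModes: ["comedia"],
    rtcpFullyWorking: false,
  };
}

const symmetricPlan = planTransports({ symmetric: true });
const asymmetricPlan = planTransports({ symmetric: false });
console.log(symmetricPlan, asymmetricPlan);
```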

Cheers!

We assume symmetric RTP and symmetric RTCP (while RTCP-mux is optional, but if unset we do not use contiguous ports for RTP and RTCP). I strongly assume that all RTP engines nowadays do support symmetric RTP and RTCP, don’t they?

Why? I mean, once the other party has generated its SDP offer you can parse its media port and RTCP port and pass them to connect(), am I wrong?

What comes to my mind is that, in your scenario, you are not using a single “RTP stack instance” in your server but two, one for receiving media and another to send it back to mediasoup, so that’s why you cannot use a single plain transport. But I strongly consider that this is a legit scenario for having two plain transports, because the remote party is running two “RTP endpoints”.

I’m… really not that sure. GStreamer is able to do so for sure, because you can configure the “bind-port” property on the udpsink elements at the end of a pipeline. But I have my doubts about FFmpeg, couldn’t find anything although I saw there is a “localport” property that might do the trick. Also I lack experience with other industrial level software such as Adobe Media Server.

Overall though, I think it might be possible to assume that all major RTP implementations will support Symmetric RTP, although it introduces the additional problem of having to signal the bind ports through means different to SDP, because SDP messages cannot do it per se. Which brings us to the next question:

Once the other party has generated an SDP message (either an Offer or an Answer), this SDP message contains the ports in which the other party is listening for packets. Not the ports that the other party will be using to send packets.

As said earlier, there is no provision (that I know of) in the SDP message format to transmit the ports one party is binding to for its outbound traffic. The port in the m= line is a listening port, telling you that you should send your data to that remote port. It doesn’t tell you anything about what remote port will be used to send data to you. The SDP O/A model assumes that each party will freely bind to a random port in order to send data. That of course goes against a strong assumption of symmetric RTP in mediasoup, but it can be worked around with the setup I described.

The devil is in the details, and now I agree with you that it doesn’t hurt that much once the requirements for RTP are clearer. But the SDP Offer/Answer model had made me biased about what to expect, and in this model, a single RTP endpoint can be used to both send and receive RTP, because the outbound port can be chosen freely by each party, thus each party only needs to know where to send their data to the other one. The SDP messages can natively indicate bidirectional communication between endpoints with the media attribute a=sendrecv.

E.g. when you send this SDP Offer:

v=0
o=- 0 0 IN IP4 192.168.1.11
s=-
c=IN IP4 192.168.1.11
t=0 0
m=video 3310 RTP/AVP 120
a=rtcp:3311
a=sendrecv
a=rtpmap:120 VP8/90000
a=ssrc:11111111 cname:test

You are telling the other party:

Hi, I’m expecting to receive your VP8 RTP video at 192.168.1.11:3310 with Payload Type 120, and your RTCP messages at 192.168.1.11:3311; also I will send video to you with SSRC 11111111.

Then the other party would generate this SDP Answer:

v=0
o=- 0 0 IN IP4 192.168.1.22
s=-
c=IN IP4 192.168.1.22
t=0 0
m=video 3320 RTP/AVP 120
a=rtcp:3322
a=sendrecv
a=rtpmap:120 VP8/90000
a=ssrc:22222222 cname:test

Meaning: OK, I’ll be receiving your VP8 RTP video at 192.168.1.22:3320, and your RTCP messages at 192.168.1.22:3322; also I’ll send my video with SSRC 22222222.

With this SDP O/A exchange, both parties know:

  • What destination IP:Port they must send their packets to.
  • What SSRC they must expect from the other party.

In particular, one thing neither party knows is which source port the other party will be sending from. Also, there was no indication in the SDP messages that the other party wants to use symmetric RTP. So, if symmetric RTP is to be used, it must be signaled out of the SDP band.
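As an illustration of what can (and cannot) be extracted from such a message, here is a minimal sketch that reads the listen IP, RTP port and RTCP port out of a single-media SDP like the ones above. parseListenInfo and ListenInfo are hypothetical names; the parser assumes IPv4, one m= section, and falls back to the conventional "RTP port + 1" when no a=rtcp attribute is present. The result only describes where the remote listens, which is what one would pass to connect(); it says nothing about the remote's source ports.

```typescript
// Hypothetical helper for illustration; assumes IPv4 and a single m= section.
interface ListenInfo {
  ip: string;
  port: number;
  rtcpPort: number;
}

function parseListenInfo(sdp: string): ListenInfo {
  let ip: string | undefined;
  let port: number | undefined;
  let rtcpPort: number | undefined;

  for (const rawLine of sdp.split(/\r?\n/)) {
    const line = rawLine.trim();

    const connection = /^c=IN IP4 (\S+)/.exec(line);
    if (connection) {
      ip = connection[1];
      continue;
    }
    const media = /^m=\w+ (\d+) /.exec(line);
    if (media) {
      port = Number(media[1]);
      continue;
    }
    const rtcp = /^a=rtcp:(\d+)/.exec(line);
    if (rtcp) {
      rtcpPort = Number(rtcp[1]);
    }
  }

  if (ip === undefined || port === undefined) {
    throw new Error("SDP lacks a c= or m= line");
  }
  // Without a=rtcp, RTCP conventionally uses the next port after RTP.
  return { ip, port, rtcpPort: rtcpPort ?? port + 1 };
}

const exampleOffer = [
  "v=0",
  "o=- 0 0 IN IP4 192.168.1.11",
  "s=-",
  "c=IN IP4 192.168.1.11",
  "t=0 0",
  "m=video 3310 RTP/AVP 120",
  "a=rtcp:3311",
  "a=sendrecv",
  "a=rtpmap:120 VP8/90000",
  "a=ssrc:11111111 cname:test",
].join("\n");

const offerListenInfo = parseListenInfo(exampleOffer);
console.log(offerListenInfo);
```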

So, it can be worked around as I’ve explained, but it’s important that all these concepts are clear in order to prevent lots of confusion and head scratches :slight_smile:

Ok, so the only remaining topic here is symmetric RTP/RTCP. Here is a thing:

  • RFC 4961: Symmetric RTP / RTP Control Protocol (RTCP)

Somewhere in section 4:

There are no known cases where symmetric RTP or symmetric RTCP are
harmful.

For these reasons, it is RECOMMENDED that symmetric RTP and symmetric
RTCP always be used for bidirectional RTP media streams.

Yep, indeed it’s not signalled in the SDP so endpoint A (mediasoup PlainRtpTransport) should be, in theory, ready to receive RTP/RTCP from the remote endpoint in the port(s) it has bound no matter where the incoming RTP/RTCP comes from. The problem here is about security. Without SRTP anyone could impersonate such a remote endpoint at any time.

However I do also understand your point. We could relax the port stuff and just require that incoming RTP/RTCP comes to the PlainRtpTransport from the ip given in connect(). Would it make sense?

However, we also have a technical limitation. The PlainRtpTransport just handles one or two tuples (TransportTuple C++ instances). Such a TransportTuple, when UDP, means a local IP:port and a remote IP:port. So here is a question:

Are we definitely talking about bidirectional and asymmetric UDP plain RTP connections? That is, A and B exchanging RTP in both directions from random & unannounced ports? If that is really needed then we should change PlainRtpTransport.cpp internals and, instead of just having:

  • this->tuple
  • this->rtcpTuple

it should have:

  • this->sendTuple
  • this->recvTuple
  • this->sendRtcpTuple
  • this->recvRtcpTuple

That, in addition to just forcing the remote source IP to match (and not also the remote port), may make it work. However, it introduces much more complexity in the code.
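A rough model of that idea, only to show the extra state involved: the send tuple comes from connect(), while the receive tuple is learned from the first packet whose source IP matches. This is a hypothetical sketch (RelaxedTransportModel and Tuple are invented names), not how PlainRtpTransport.cpp is or would be written, and RTCP tuples are left out for brevity.

```typescript
// Hypothetical model of a "match IP only" transport; not real mediasoup code.
interface Tuple {
  ip: string;
  port: number;
}

class RelaxedTransportModel {
  private readonly sendTuple: Tuple;
  private recvTuple: Tuple | undefined;

  constructor(connectedRemote: Tuple) {
    // Where we send to: the remote's announced listen IP:port.
    this.sendTuple = connectedRemote;
    this.recvTuple = undefined;
  }

  // Accept a packet when only the source IP matches, and remember its
  // (possibly random) source port as the receive tuple.
  onPacket(source: Tuple): boolean {
    if (source.ip !== this.sendTuple.ip) {
      return false;
    }
    this.recvTuple = { ip: source.ip, port: source.port };
    return true;
  }

  getSendTuple(): Tuple {
    return this.sendTuple;
  }

  getRecvTuple(): Tuple | undefined {
    return this.recvTuple;
  }
}

const model = new RelaxedTransportModel({ ip: "192.168.1.22", port: 3320 });
const acceptedSameIp = model.onPacket({ ip: "192.168.1.22", port: 50000 });
const acceptedOtherIp = model.onPacket({ ip: "10.0.0.99", port: 3320 });
console.log(acceptedSameIp, acceptedOtherIp, model.getRecvTuple());
```

Note that the send and receive tuples now differ, which is exactly the added complexity mentioned above.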

I just wonder whether it’s worth all this instead of just creating two separate PlainRtpTransports.

It would make sense for the case of an RTP peer that does not implement Symmetric RTP. However, we are nitpicking here; just have in mind that technically, it could perfectly happen (i.e. symmetric rtp is an optional feature and sooner or later someone will want to connect a RTP VoIP terminal which is not symmetric and it’s not possible to make it so, with mediasoup as a WebRTC gateway, and ports won’t match, etc).

For the more practical side of things, I’ve made some changes to Kurento’s RtpEndpoint to make sure it is both symmetric and it supports the a=rtcp:<PortNumber> attribute, which wasn’t supported before. So right now we can already achieve perfect RTP integration with mediasoup :slight_smile:

Just to make a last clarification in case you decide to tackle this in the future:

Note that what you say doesn’t solve the issue for non-symmetric RTP endpoints; technically all RTP transports are always bidirectional, because RTP goes in one direction and RTCP goes in the opposite one:

  • A consume-only (send-only) PlainRtpTransport

    • Is connect()ed to the other peer’s RTP and RTCP listen ports, to send both RTP and RTCP SR.
    • Still needs to receive RTCP RR, originated from an unknown port in the other peer (because non-symmetric), so the RTCP RR will be lost when mediasoup discards it.
  • A produce-only (recv-only) PlainRtpTransport

    • Is connect()ed to the other peer’s RTCP listen port, to send RTCP RR.
    • Still needs to receive RTCP SR, originated from an unknown port in the other peer, so the RTCP SR will be lost when mediasoup discards it.

Mmmm, honestly I’m considering that requiring symmetric RTP is fair enough nowadays. It’s not in my mind the idea of supporting a “RTP VoIP terminal which is not symmetric” :slight_smile:

After the whole discussion, the only important thing IMHO is whether FFmpeg and GStreamer can consume RTP (without producing) from a PlainRtpTransport and still send RTCP to mediasoup. For that, comedia: true cannot be used (since there is no RTP from FFmpeg or GStreamer to mediasoup) so we do need to know in advance the RTP and RTCP listening ports of FFmpeg or GStreamer.

You said that this is possible in GStreamer and, AFAIR, the same happens in FFmpeg by setting something like localRtpPort and localRtcpPort somewhere in a URL. If confirmed we are done.

I really do not want to get into the pain of non-symmetric RTP endpoints. Plain RTP is already insecure enough by design; I would rather not make it even more insecure. Thoughts?

Yeah I think it’s fair enough. Also it will keep your implementation simpler. Although don’t underestimate the kind of strange requirements that someone will come upon one day (the typical phrase “it’s a matter of when, not if” applies here).

I’d still suggest that you think of adding SRTP support at some point in the future, because the trick of expecting symmetric RTP, while nice, is just a thin layer of protection compared to proper security mechanisms.

With GStreamer it is definitely possible if one builds the pipeline by hand. Not so sure if one uses the sdpdemux element, which abstracts the whole pipeline creation process and instead uses a single SDP message as the only method to configure the pipeline.

With FFmpeg, I’d need to study it closer. I found it: these are URL parameters named localrtpport=n and localrtcpport=n, documented in the rtp section of the FFmpeg protocols page. I think that the URL “mode” of FFmpeg can only be used to consume RTP (i.e. send RTP to FFmpeg), but it cannot be used to produce RTP (i.e. send RTP from FFmpeg), so that might be only a partial solution. In any case, it also prevents using the more standard method of sending an SDP message to FFmpeg and letting it configure itself as needed.
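For reference, this is roughly how such a URL could be assembled. The rtcpport, localrtpport and localrtcpport query options are the ones listed for FFmpeg's rtp protocol; the helper itself (buildFfmpegRtpUrl and its option names) is hypothetical, and I have not verified that FFmpeg then behaves symmetrically in every mode.

```typescript
// Hypothetical helper; only the query option names come from FFmpeg's rtp protocol.
interface FfmpegRtpUrlOptions {
  host: string;          // remote IP that FFmpeg exchanges RTP with
  port: number;          // remote RTP port
  rtcpPort: number;      // remote RTCP port
  localRtpPort: number;  // local port FFmpeg binds for RTP
  localRtcpPort: number; // local port FFmpeg binds for RTCP
}

function buildFfmpegRtpUrl(options: FfmpegRtpUrlOptions): string {
  const query = [
    `rtcpport=${options.rtcpPort}`,
    `localrtpport=${options.localRtpPort}`,
    `localrtcpport=${options.localRtcpPort}`,
  ].join("&");
  return `rtp://${options.host}:${options.port}?${query}`;
}

const ffmpegUrl = buildFfmpegRtpUrl({
  host: "192.168.1.11",
  port: 3310,
  rtcpPort: 3311,
  localRtpPort: 3320,
  localRtcpPort: 3322,
});
console.log(ffmpegUrl);
// rtp://192.168.1.11:3310?rtcpport=3311&localrtpport=3320&localrtcpport=3322
```

With fixed local ports like these, the same numbers could be given to transport.connect() on the mediasoup side, so that packets coming back from FFmpeg arrive from the expected source.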

I’d need to look more into these modes of GStreamer and FFmpeg. Will let you know if there is any obvious issue there.

Also money :slight_smile:

Absolutely. SRTP will be eventually added to plain and pipe transports.

Why not? After you “send” a SDP to FFmpeg, cannot you get a SDP response from FFmpeg and parse the RTP and RTCP ports in there? (of course assuming symmetric RTP).

I think that’s the problem. I believe FFmpeg won’t do symmetric RTP by default (neither will GStreamer), unless you explicitly tell it to do so, which the SDP cannot. But I need to study it further, to be 100% sure about how to do it.