Is it possible to transfer `Transport` objects between `Router` instances?

Probably a bit off-topic, but interesting nevertheless.

In C++, each Worker represents a single operating system process. Communication is done over IPC and their memory address spaces are isolated, so it’s not possible to transfer anything that can’t be serialized between them.

However, each Worker can host multiple Router instances, all of them living in the same address space, so transferring a Transport or RtpObserver object from one to another would mostly mean removing it from a std map in the source Router and adding it to the map in the destination Router, maybe with some additional (de)initialization calls. Am I right? (I’m leaving out the transfer of Producers and Consumers for simplicity; transferring Transports is probably enough.)
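A toy sketch of that idea, just to make the question concrete. `ToyRouter`, `ToyTransport` and `moveTransport` are hypothetical names invented for illustration; they are not mediasoup's internal classes, and the `routerId` assignments only stand in for whatever real (de)initialization would be needed.

```typescript
// Toy model only: NOT mediasoup's internals, just "remove from one map, add to another".
class ToyTransport {
  readonly id: string;
  routerId: string | null = null;

  constructor(id: string) {
    this.id = id;
  }
}

class ToyRouter {
  readonly id: string;
  private readonly transports = new Map<string, ToyTransport>();

  constructor(id: string) {
    this.id = id;
  }

  has(transportId: string): boolean {
    return this.transports.has(transportId);
  }

  add(transport: ToyTransport): void {
    if (this.transports.has(transport.id)) {
      throw new Error(`transport ${transport.id} already in router ${this.id}`);
    }
    transport.routerId = this.id; // stand-in for initialization in the new router
    this.transports.set(transport.id, transport);
  }

  remove(transportId: string): ToyTransport {
    const transport = this.transports.get(transportId);
    if (transport === undefined) {
      throw new Error(`transport ${transportId} not found in router ${this.id}`);
    }
    this.transports.delete(transportId);
    transport.routerId = null; // stand-in for deinitialization in the old router
    return transport;
  }
}

// Move a transport between two routers that share one address space.
function moveTransport(from: ToyRouter, to: ToyRouter, transportId: string): void {
  // Check the destination first so a failure cannot leave the transport orphaned.
  if (to.has(transportId)) {
    throw new Error(`transport ${transportId} already in router ${to.id}`);
  }
  to.add(from.remove(transportId));
}
```

This deliberately ignores everything a real transport owns (sockets, ICE/DTLS state, Producers and Consumers), which is exactly the part I'm asking about.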

In addition, the Rust implementation uses threads instead of processes, so all of them live in the same memory space and you could (in theory) also transfer Router instances. Also, with Rust’s concept of ownership, it seems almost obvious that transferring the objects would invalidate any access to them in the original thread, making it safe to do. Is this correct?

Is there any technical limitation (besides time, sponsorship, etc.) that prevents doing this? What would be the steps to achieve it?

In case you are wondering, the purpose of this is to allow better load balancing of resources: if some Worker gets too busy, move some of its objects to another one. For example, if a Router is getting a lot of connections, move the other ones out to free up CPU.

Moving between routers in the same worker is pointless from a load-balancing point of view, and transferring objects between threads, including active UDP/TCP sockets, would be very non-trivial implementation-wise, so I don’t think it is worth investing resources into.

Yes, I know :slight_smile:

Non-trivial in the Rust Mediasoup, or the app using it? Both? Is there any solution to move a Transport (or Consumer, or Producer) to a Router on another Worker besides doing a full unplug-and-plug?

There is no way to move a transport to another router (afaik), and I think this would be very tricky to implement on the server side. I’m not even sure how you would transfer ownership of the bound port without potentially losing some packets (maybe this is a non-issue; I am not that familiar with the actual libmediasoup internals in this regard). I have been thinking about this on and off for a while and it’s a problem I have luckily been able to punt on for a bit. However, I think there is a reasonable way it could be done using piping:

Basically, you pipe the existing transport into the new router, then establish a new transport to the new router; once it is established, you tear down the pipe and the old transport. The drawback I see here is that if there are lots of consumers you will have a lot of signaling and twice the consumer creation. There are lots of details to get right (e.g. if the new producer and the piped producer are both up for any period of time, ensure only one is sending data), but this should allow a pretty seamless transition. I was actually thinking about this idea as a way to move users between servers, e.g. if I need to scale up, or if I have deployed a new version and want to start forcibly moving users to new server instances, it should be possible to temporarily pipe from the old server to the new one. Right now I just establish new rooms on the new instances and let connections slowly drain off the old ones as rooms become empty, but this solution is starting to become inadequate as we grow.
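To make the ordering explicit, here is a minimal sketch of those three steps as a tiny state machine. The `Migration` class and the phase names are hypothetical and nothing here calls mediasoup; the only point is that the teardown is refused until the new transport exists.

```typescript
// Hypothetical phases of the pipe-based handover described above.
type Phase = "old-only" | "piped" | "new-established" | "done";

class Migration {
  phase: Phase = "old-only";
  readonly log: string[] = [];

  // Advance only if we are in the phase this step expects.
  private step(expected: Phase, next: Phase, action: string): void {
    if (this.phase !== expected) {
      throw new Error(`cannot "${action}" while in phase "${this.phase}"`);
    }
    this.phase = next;
    this.log.push(action);
  }

  // 1. Pipe the existing media into the new router.
  pipeToNewRouter(): void {
    this.step("old-only", "piped", "pipe old router into new router");
  }

  // 2. Establish the client's new transport on the new router.
  establishNewTransport(): void {
    this.step("piped", "new-established", "establish new transport");
  }

  // 3. Only now tear down the pipe and the old transport.
  tearDownOld(): void {
    this.step("new-established", "done", "tear down pipe and old transport");
  }
}
```

In a real application each step would wrap the actual signaling and server calls, and the "only one is sending data" detail would need its own handling.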

This is an interesting question, and though mediasoup does not allow this behavior, for good reasons, there are many ways to go about this appropriately.

Generally I advise making sure your signaling/chat server is well aware of the usage taking place on the media servers it represents. Knowing how many people can broadcast and be viewed helps you come up with an algorithm you can tune. I have different tiered rooms, including free rooms; the chat server is aware of this and knows how to score/calculate the most efficient producing/consuming/pipe-transporting so that no worker goes above 85-95% CPU.

Ex.: when I publish a broadcast, it costs 1 point on the producer server. The broadcast then costs 1 point per 12 viewers, and I set a maximum point limit per server. At 13 viewers a 2nd point is taken, on the assumption that up to 24 viewers may show up. Depending on room size, the chat server can determine the best weighting factor to spread rooms across the servers as widely as possible, with no lag.
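A rough sketch of that arithmetic, under my reading of the numbers above: viewers are charged one point per started block of 12, so 13 viewers cost 2 points. The `ServerLoad` shape and the "most free points wins" rule in `pickServer` are assumptions made for illustration, not a description of any real deployment.

```typescript
const VIEWERS_PER_POINT = 12;

// Points charged for the viewers of one broadcast: 1 point per started block of 12.
function viewerPoints(viewers: number): number {
  return Math.ceil(viewers / VIEWERS_PER_POINT);
}

// Hypothetical bookkeeping the chat/signaling server keeps per media server.
interface ServerLoad {
  name: string;
  usedPoints: number;
  maxPoints: number;
}

// Pick the server with the most free points that can still fit `cost`.
// Returns null when no server has enough room.
function pickServer(servers: ServerLoad[], cost: number): ServerLoad | null {
  let best: ServerLoad | null = null;
  for (const server of servers) {
    const free = server.maxPoints - server.usedPoints;
    if (free < cost) continue;
    if (best === null || free > best.maxPoints - best.usedPoints) {
      best = server;
    }
  }
  return best;
}
```

For example, `viewerPoints(12)` is 1 and `viewerPoints(13)` is 2; the per-server maximum and the weighting are the knobs you tune for your own hardware.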

To answer your initial question, though: it is possible, but you’d need to know what you’re doing. This is the type of hack used to bring down major networks, so I’m not saying much beyond this…

You can in fact, through BGP routing “hacks”, spoof a network and put yourself between a client and a server, and with that you can act as the responder to the client and completely fool it. The media server would just need to mimic the connection and treat it as if it had established it itself, rather than another server.

We do a lot of securing of our protocols to hopefully prevent this but it’s still possible. :smiley:

The Internet doesn’t work without BGP, and it is increasingly seen as a seriously vulnerable protocol. Because it is what makes all of this work, security experts are pushing to design a new way to sign and confirm these relays. (To sum up, this is how many countries are able to spy on an entire company’s servers and steal billions in cryptocurrencies, but this is getting off topic, so enjoy that; just because you can doesn’t mean you should.)

What you requested breaks protocol rules; it can work, but it is not how the protocol was intended to be used.

Non-trivial on the library side, both in Rust and TypeScript (where sockets would need to be moved between processes). Technically this is possible, but in practice I don’t think mediasoup will ever implement it due to the added complexity.

I would expect multi-threading to be introduced sooner than this (if ever).

Outside of hacking it, there’s a proposed standard published by the Internet Engineering Task Force (IETF) on duplicating RTP streams. If you want more information on it or want to follow up, this is the RFC (RFC 7198, “Duplicating RTP Streams”). From its abstract:

Packet loss is undesirable for real-time multimedia sessions but can
occur due to a variety of reasons including unplanned network
outages. In unicast transmissions, recovering from such an outage
can be difficult depending on the outage duration, due to the
potentially large number of missing packets. In multicast
transmissions, recovery is even more challenging as many receivers
could be impacted by the outage. For this challenge, one solution
that does not incur unbounded delay is to duplicate the packets and
send them in separate redundant streams, provided that the underlying
network satisfies certain requirements. This document explains how
Real-time Transport Protocol (RTP) streams can be duplicated without
breaking RTP or RTP Control Protocol (RTCP) rules.

Could be a good read for those who like to follow up on standards/proposals or know if they are breaking protocol rules. :smiley:

Thank you for your answers, I’m glad you found it an interesting question and food for thought :slight_smile:

So, are you talking about creating a new Consumer connection on the client pointing to the new server, and closing the connection to the old busy one? I was thinking about it too, although it adds complexity in the client, will probably lose some RTP packets and cause glitches, and could end up with otherwise useless and redundant PipeTransports if you don’t move the Producers too (and recreate the Consumers on the new server as well), but it could work…

In Mafalda SFU the idea is precisely to not have to worry about those things, so it does the auto-scaling itself… That’s why I was interested in doing the load balancing this way too :slight_smile: Anyway, the BGP hacking sounds like a cool idea :smiley:

OK, I was thinking that at least in Rust, since they live in the same memory address space, it would just be a matter of moving some pointers, but it seems it needs some initialization and resource reservation too. Thank you anyway for the clarification :slight_smile:

Nice, thank you! :smiley:

Basically yeah. First we would pipe the existing producer from the “old” server into the new server, and notify the client that a new producer is available from the “new” server. Using application logic (identifiers in appData) the client would know to consume this producer, and prefer it over the old producer. Once clients are consuming from the new server, we then move the producer over (this may not even be necessary really; maybe we can just leave the pipe in place). We add the producer to the new server, and again clients are made aware of this and choose to consume it (again, app-level logic dictates which producer of a given participant to consume if multiple are available). Once all clients have begun consuming from the new producer, on the new server, we can close down the old producer and the corresponding pipe. We have now transitioned a given producer and all of its consumers to the new server, and if done correctly there should be almost zero disruption. There may be short periods where we are “double consuming” or “double producing” data from the client’s perspective, but this will ensure nothing is dropped or missed.
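The client-side "which producer do I consume" part could look roughly like the sketch below. The `ProducerInfo` shape and the `participantId` / `generation` fields are made up for illustration (appData is free-form, so the application decides what goes in it); a higher `generation` simply means "announced later in the migration".

```typescript
// Hypothetical description of a producer as announced to the client by signaling.
interface ProducerInfo {
  producerId: string;
  serverId: string;
  // Application-defined fields; these names are assumptions, not mediasoup ones.
  appData: { participantId: string; generation: number };
}

// For each participant, keep only the producer with the highest generation,
// so a newly announced producer on the new server wins over the old one.
function preferredProducers(available: ProducerInfo[]): Map<string, ProducerInfo> {
  const chosen = new Map<string, ProducerInfo>();
  for (const info of available) {
    const current = chosen.get(info.appData.participantId);
    if (current === undefined || info.appData.generation > current.appData.generation) {
      chosen.set(info.appData.participantId, info);
    }
  }
  return chosen;
}
```

Each time signaling announces or removes a producer, the client would recompute this map, start consuming anything newly preferred, and close consumers for producers that lost out.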

Maybe not all these steps are necessary. Maybe you could double produce directly to the new server and transition without the piping. It depends on the needs and goals of your application.

Again, I have not actually done this yet. It’s just an approach I have been idly thinking about. Our biggest issue comes in trying to scale up servers and deploy new versions (and correspondingly terminate old ones). We need a way to forcibly (if slowly) move transports off existing servers and onto different ones, whether it be to spread load or so we can terminate an instance.