Mediasoup horizontal scaling using pipeToRouter

Hello all,
So I made a decent attempt at a horizontal scaling implementation using pipeToRouter. I would like to share some observations and doubts.

Mediasoup server system: Ubuntu 20.04.2 LTS (GNU/Linux 5.4.0-70-generic x86_64), 48 CPU(s).
1 CPU : 1 mediasoup-worker : 1 mediasoup-router
So 48 routers in total.

I have marked the first 24 as producing routers (creating only send transports) and the next 24 as consuming routers (creating only receive transports), following the mediasoup demo pipe branch as a reference.

  1. When a peer joins the room, it creates a send transport on router id 0 and a receive transport on router id 24.
  2. Producers are piped from router id 0 to router id 24.
  3. These steps repeat until a given consuming router reaches a threshold (currently the threshold is calculated by iterating over all transports created on the router and summing the numbers of consumers and data consumers; the value of 500 is experimental and based on CPU capabilities).
  4. When the threshold is reached for a consuming router, receive transports are created on the next router instead (switching the consuming router from router id 24 to router id 25), and the producers for that peer are piped into both consuming router id 24 (pushing the consumer and data consumer counts above the threshold for that router) and router id 25.
  5. This also requires piping all existing producers from existing peers into the new consuming router (router id 25), which in turn increases the number of consumers and data consumers on the current producing router (router id 0). A sketch of this selection-and-piping logic follows the list.
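To make the flow above concrete, here is a minimal sketch of the selection-and-piping logic, assuming app-level bookkeeping (mediasoup does not expose a per-router consumer count, so `routerLoad`, `loadOf()` and `THRESHOLD` are hypothetical structures the application maintains itself; only `pipeToRouter()` is a mediasoup call):

```ts
import type { types } from 'mediasoup';

const THRESHOLD = 500; // experimental, depends on CPU capabilities

// consumers + data consumers currently created on each consuming router;
// the application increments/decrements this as consumers are created/closed
const routerLoad = new Map<types.Router, number>();

const loadOf = (router: types.Router): number => routerLoad.get(router) ?? 0;

// Step 4: pick the first consuming router still below the threshold.
function pickConsumingRouter(consumingRouters: types.Router[]): types.Router {
  return (
    consumingRouters.find((r) => loadOf(r) < THRESHOLD) ??
    consumingRouters[consumingRouters.length - 1]
  );
}

// Steps 2 and 5: pipe a producer from its producing router into every
// consuming router that currently serves peers.
async function pipeProducerEverywhere(
  producingRouter: types.Router,
  producerId: string,
  activeConsumingRouters: types.Router[]
): Promise<void> {
  for (const consumingRouter of activeConsumingRouters) {
    await producingRouter.pipeToRouter({ producerId, router: consumingRouter });
  }
}
```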

So, by adopting this logic, it seems like every producer created on a producing router eventually has to be piped to all the other consuming routers in use, which pushes the limits of that router, be it producing or consuming, well above the threshold as the number of peers increases.

I have tested this with more than 100 users, each with both audio and video producers. The general observation is that after a certain peer count, say 40-50, consuming router id 24 not only exceeds the threshold but, with the piping overhead, peaks at 100% CPU usage on that specific core, causing problems. At the same time, it is still necessary to pipe all producers to all consuming routers.

So I was wondering: is this the limit we can achieve with such an approach (and then switch to scaling between servers), or does the approach reflect some serious misunderstanding of how mediasoup piping between routers works?
Please feel free to provide any comments, suggestions or feedback.

(To provide some numbers:
tested: 100 users, all genuine users with both audio and video, with 3 simulcast layers;
works fine up to 30-35 users with no issues at all.)


When my producer servers go online, my signalling server tells the consumer servers (if any are online) to create PipeTransports and connect them.

In my case I just want producers connecting to consumers, not consumers connecting to consumers, and I make a single request per server, so a consumer won't connect to a producer more than once.

In this state they're just chilling.
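As a rough sketch of what "create PipeTransports and connect them" can look like from the producer server's side (the `signal()` helper and the addresses are hypothetical; `createPipeTransport()`, `tuple` and `connect()` are mediasoup APIs, though exact option and field names vary a bit between mediasoup versions, e.g. `listenIp` vs `listenInfo`):

```ts
import type { types } from 'mediasoup';

async function connectToConsumerServer(
  producerRouter: types.Router,
  // hypothetical signalling helper that relays JSON to a consumer server
  signal: (msg: object) => Promise<{ ip: string; port: number }>
): Promise<types.PipeTransport> {
  // 1. Producer server creates its end of the pipe on the private LAN.
  const localPipe = await producerRouter.createPipeTransport({
    listenIp: '10.0.0.1' // example private address
  });

  // 2. Ask the consumer server to create its end and send back its address/port.
  const remote = await signal({
    action: 'createPipeTransport',
    ip: localPipe.tuple.localIp,
    port: localPipe.tuple.localPort
  });

  // 3. Connect our end to the consumer server's end; done once per server pair.
  await localPipe.connect({ ip: remote.ip, port: remote.port });

  return localPipe;
}
```

From there, each broadcast is forwarded by consuming it on the pipe transport of the producing side and producing it with the resulting RTP parameters on the consuming side, which is essentially what pipeToRouter() automates within a single host.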

Now here’s the catch and you need to really watch this haha

The producer server will be hit little; your biggest hit will be the exponential increase in consumers.

12 video broadcasts consumed by 24 viewers is 12*24-24 = 264 consumers, and if there's audio, way more.
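As a rough helper mirroring the arithmetic quoted above (the exact correction term depends on whether broadcasters also watch each other; this is illustrative only, not anything mediasoup provides):

```ts
// Consumer count grows with broadcasts × viewers, which is why the consuming
// side, not the producing side, becomes the bottleneck.
function estimateVideoConsumers(broadcasts: number, viewers: number): number {
  return broadcasts * viewers - viewers;
}

estimateVideoConsumers(12, 24); // 264, and audio consumers come on top of that
```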

If you plan to do many-to-many, and properly, I'd maybe devote more consumer servers to the party and leave some additional space on your cores so the PipeTransports can re-consume/re-produce. You can safely reuse a single re-consume/re-produce on a consumer server multiple times, so there's some gain there; just close it when the last viewer is done.
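A minimal sketch of that reuse-and-close pattern, assuming app-level reference counting per (producer, consuming router) pair; `PipedEntry` and the map are hypothetical, while `pipeToRouter()` and `close()` are mediasoup APIs:

```ts
import type { types } from 'mediasoup';

interface PipedEntry {
  pipeConsumer: types.Consumer;
  pipeProducer: types.Producer;
  refCount: number;
}

// keyed by `${producerId}:${consumingRouter.id}`
const piped = new Map<string, PipedEntry>();

// Reuse a single pipe per producer and consuming router for any number of viewers.
async function acquirePipedProducer(
  producingRouter: types.Router,
  consumingRouter: types.Router,
  producerId: string
): Promise<types.Producer> {
  const key = `${producerId}:${consumingRouter.id}`;
  let entry = piped.get(key);

  if (!entry) {
    const { pipeConsumer, pipeProducer } = await producingRouter.pipeToRouter({
      producerId,
      router: consumingRouter
    });
    entry = { pipeConsumer: pipeConsumer!, pipeProducer: pipeProducer!, refCount: 0 };
    piped.set(key, entry);
  }

  entry.refCount++;
  return entry.pipeProducer;
}

// Close the pipe only when the last viewer on that consuming router is done.
function releasePipedProducer(producerId: string, consumingRouter: types.Router): void {
  const key = `${producerId}:${consumingRouter.id}`;
  const entry = piped.get(key);
  if (!entry) return;

  if (--entry.refCount === 0) {
    entry.pipeConsumer.close();
    entry.pipeProducer.close();
    piped.delete(key);
  }
}
```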

If you execute this right you can go insane with that many cores.


Okay. Thanks a lot for the suggestion and feedback.
I will give this approach a shot.
I have all the testing architecture ready, so I will provide details and numbers for the total number of peers that can be supported, and also the overall stats for the mediasoup server. :grinning:

If you really want to aim for top-tier efficiency, in my opinion, you could detect when a producer server cannot re-produce/re-consume any further and have a consumer server with spare space send the stream onward, to expand the number of consumers that can get the source.

If this makes sense, good luck; it drove me mad for a bit. haha

Do post results if you wish, but I don't think you can go wrong with this approach, and if done across remote servers you just exponentially increase your power: the idea of single cores adding up to thousands of cores working real hard. :smiley:

If you do this, at first I would suggest running all the servers on a private LAN for the piping, to save on outbound traffic (keep them in the same server region).

Thanks for the suggestions again.
But the point is I still don't understand how this approach will solve the overhead on certain producing and consuming routers as the number of producers increases.
Surely, as new consuming routers are assigned, producers need to be piped from all existing producing routers to all in-use consuming routers.
So in this process the very first producing or consuming routers will quickly pass the threshold for maximum consumers, which also results in 100% CPU peaks and causes problems.

If you plan to do many-to-many, and properly, I'd maybe devote more consumer servers to the party and leave some additional space on your cores so the PipeTransports can re-consume/re-produce. You can safely reuse a single re-consume/re-produce on a consumer server multiple times, so there's some gain there; just close it when the last viewer is done.

The point here is that even the additional space on my cores starts filling up, due to the requirement to pipe producers to all consuming routers for all consumers, which in turn strains the producing routers as well.

I am not sure I follow what you mean by producer servers / consumer servers (are you talking about server-to-server scaling?)
Right now I am focused on achieving max capabilities on a single server unit.

But the point is I still don't understand how this approach will solve the overhead on certain producing and consuming routers as the number of producers increases.
For that part you'd build your own little load-balancer: keep track of who enters/leaves each server, and possibly how many pipes you have online.

Surely, as new consuming routers are assigned, producers need to be piped from all existing producing routers to all in-use consuming routers.
Yes, or however you intend to make these servers aware of each other (generally the consumer connects once to each producer, or worker to worker, but try drawing out your plans both ways).

So in this process the very first producing or consuming routers will quickly pass the threshold for maximum consumers, which also results in 100% CPU peaks and causes problems.
Not at all the case: if you keep track of your limits for each server/core (or worker), you can tell the signaller at any time that this server isn't an option and to use the next one available.
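A minimal sketch of that "little load-balancer" idea, where the signalling server tracks a capacity figure per worker/server and simply skips anything that is full (the `ServerState` shape and the numbers are assumptions, not mediasoup APIs):

```ts
// Hypothetical per-server bookkeeping kept by the signalling server.
interface ServerState {
  id: string;
  consumers: number;    // current consumer + data consumer count, reported by the server
  maxConsumers: number; // soft limit, e.g. ~500 per worker
}

const servers: ServerState[] = [];

// Pick the first server with enough headroom; undefined means
// "spin another one up or refuse the join".
function pickConsumerServer(neededConsumers: number): ServerState | undefined {
  return servers.find((s) => s.consumers + neededConsumers <= s.maxConsumers);
}
```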

The trick here is that a broadcast (produced item) will be piped and consumed on a number of consumer servers, so if the broadcast can't be piped any further because the producer server is maxed out, just know that a consumer server already has it and might be able to send it further. This is advanced and would likely require piping consumer to consumer and keeping track of these additional sharing states.

I am not sure I follow what you mean by producer servers / consumer servers (are you talking about server-to-server scaling?)
Sorry, my lingo is still new here: my servers (workers) are programmed to handle strictly producers (broadcasts) on the producer server and viewers (consumers) on the consumer server.

I am talking about server-to-server scaling, but this can very much be applied at the local level. Also, this is just information that may help; you're welcome to build however you want. This is what I've gotten to test, and it will allow me the scalability I need for thousands. And if I'm truly worried about overload, I can run producer servers at 20-30 members and let it go hard!


Thanks @BronzedBroth for the detailed explanation.
I will pick this up and report back on the topic with all the findings I can get.


Awesome, I just found mostly two facts:
Producers don't kill resources, but you will need to send them out to many consumers.
Consumers will kill your resources, so you will want to handle this. If you get a wicked idea going that clearly gets around this, hell yeah! :smiley:

Hi,
First of all, I am sorry for digging up this relatively old discussion.

I like the concept of splitting producer servers and consumer servers, as from my understanding it makes scaling easier (because I can then just focus on scaling the consumer servers for a meeting room, as they are what will eat up resources once the room grows).

Now the reason why I bumped this old topic:

A single worker can handle up to around 500 consumers according to the docs (and based on some benchmarking on my cloud instance).
So if I don’t know how many people will produce video & audio, how should I know how many peers I can let connect to a single consumer server?

If I knew that I would have 10 producers, I would know that up to 50 different peers could connect to a single consumer server.
But let's say another peer starts producing a video track: then there would be 550 consumers (11 producers × 50 peers who consume) on a consumer server that already has 50 connected peers, so I might run into issues.
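As a back-of-the-envelope check of that worry, assuming the ~500-consumers-per-worker figure from the docs (purely illustrative, not a mediasoup API):

```ts
// With a soft limit of ~500 consumers per worker, the number of peers a
// consumer server can take depends on how many producers exist *right now*,
// so the admission check has to be re-evaluated whenever someone starts producing.
const MAX_CONSUMERS_PER_WORKER = 500;

function maxPeersForConsumerServer(activeProducers: number): number {
  if (activeProducers === 0) return Infinity; // nothing to consume yet
  return Math.floor(MAX_CONSUMERS_PER_WORKER / activeProducers);
}

maxPeersForConsumerServer(10); // 50 peers fit
maxPeersForConsumerServer(11); // only 45: one new producer shrinks the budget
```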

I guess in theory I could then start “moving” peers to other consumer servers, but I imagine it would not be optimal as there would be some delays involved and the signaling might get a bit chaotic.
Or am I overthinking that part and it is fine to move people to a different consumer server?

You define the logic that keeps you from over-filling; understand the pipe-transport overhead, and make assumptions that help you never max out the CPU.

Your theory can help you. I wouldn't worry about moving peers to a new server via a fast disconnect/re-connect elsewhere, but if you can improve the process so it doesn't need to happen, you'll be better off. I utilize many servers, and if one gets knocked offline or fails somehow, another consumer/producer server will be selected.

Overall I'd suggest you get comfortable setting producer limits and consumer weight (for viewers) so you can give away space ahead of time to let a room grow. Maybe introduce dedicated servers for mass sessions: for example, if you anticipate many small rooms but a few large audience rooms, keep 10-30 cores on the side ready at any time to give to that service.

Say my producer server handles 13 broadcasts (audio/video) and a consumer server handles 3 weights (12 viewers per weight); this means 1 broadcast would use 1 weight until the room hits 13 viewers, and no weights until the first viewer. If the consumer server has no slots, the signal server should find the next one available.
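One loose interpretation of that weight accounting as a sketch (the shapes and numbers are assumptions based on the example above, not anything mediasoup provides):

```ts
// A consumer server exposes a fixed number of "weights", each weight covering
// a batch of viewers; the signal server moves on when no slots remain.
interface ConsumerServer {
  id: string;
  weights: number;          // e.g. 3
  viewersPerWeight: number; // e.g. 12
  viewers: number;          // viewers currently admitted
}

function hasSlot(server: ConsumerServer): boolean {
  return server.viewers < server.weights * server.viewersPerWeight;
}

function admitViewer(pool: ConsumerServer[]): ConsumerServer | undefined {
  const server = pool.find(hasSlot);
  if (server) server.viewers++;
  return server; // undefined → the signal server should find/provision another
}
```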


Thank you for explaining, I appreciate it.
I also found your thread about horizontal scaling after posting my comment here; that one also helped me think about how to create a scalable system.

I've managed to get myself up and running, solid for months with no crashes, with this system plus whatever adjustments you feel are needed, and I think I've set many on their way to a properly scoped system. My servers all run single-core for maximum IP output and DDoS resilience, and so far it's probably the hardest platform to DDoS. I get respect for this setup; many love it.

If you follow any of my posts you may find them helpful. I am after a goal, and with this soup being free I will toss a bone, but no source; just the ideology, for those who want to think it through and grasp a true concept that won't fall in on itself.
