CPU/Bandwidth results for many-to-many rooms in production.

Across 15 rooms, with users ranging anywhere from 1 to 11 each (currently), totaling a maximum of 53 viewers and 27 broadcasters.

Results may differ if users send audio-only broadcasts, but this setup uses pipe transports and all the rest for maximum scalability.

Across the 5 servers, overall CPU usage sits at roughly 2–20%, as displayed in the picture.

SPECS: 4 vCores @ 8 GB RAM (HDD, which hardly matters here) and unmetered 1 Gbit/s networking.

Because my setup and rooms expand past a single core's capacity, I spawn and destroy workers to balance the system, utilizing as much CPU as possible without ever reaching 100%.

The design is a bit complex and hogs some CPU initially, but once pipes are fully utilized and viewers increase, it's a work of art.

Something to note: CPU usage is never static. For instance, if a user lags out, they drop packets and stop responding (no CPU usage there). The server sees a widening range of usage as more users pile in, so the best basis for capacity planning is not the average CPU usage but its peak.

Now for the juicy part. Each server uses around 5–20 TB a day in bandwidth, serving each user up to 500 KB/s send/receive.
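As a quick sanity check on those figures (assuming the 500 KB/s per-direction number above), the per-user daily volume and the number of concurrent user-streams a 20 TB/day budget implies work out roughly as:

```typescript
// Rough sanity check of the bandwidth figures above (assumed values).
const perUserBytesPerSec = 500 * 1000; // up to 500 KB/s per direction
const secondsPerDay = 86_400;

// One user streaming all day, one direction: ~43.2 GB.
const perUserGBPerDay = (perUserBytesPerSec * secondsPerDay) / 1e9;

// How many always-on user-streams fit in 20 TB/day?
const dailyBudgetTB = 20;
const concurrentStreams =
  (dailyBudgetTB * 1e12) / (perUserBytesPerSec * secondsPerDay);

console.log(perUserGBPerDay.toFixed(1)); // "43.2"
console.log(Math.round(concurrentStreams)); // 463
```

Real users aren't streaming 24/7, so the practical headcount per server is higher than that worst case.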

RAM usage never exceeds 1–4 GB, so there's not much to talk about here other than that handling is good. You can consider 1 GB of RAM per vCore ample, possibly less!

Any questions, feel free to ask. This was a small test; hopefully it shows you the requirements.


Same sample of viewers but more broadcasts: 41 broadcasts, each potentially transmitting both video and audio from a screen or webcam.

As you can see, once we've made it over the initial heap of many rooms creating pipes, additional broadcasts don't really increase usage, and the spikes you see now do stabilize. As always, though, it's best to keep total CPU below about 80%. Use as many workers as you need, but have a mechanism for closing them.

Memory usage across servers (4 vCores @ 8 GB RAM): I use just shy of 2 GB. I use RTX and all the rest.

I would suggest budgeting 512 MB–1 GB per vCore for the media server. So in my case, with 4 vCores, I could make way for 2–4 GB.

I only hit about 1–3 GB on the 4-vCore machines. Enjoy.

One thing to be careful about is a logical CPU core hitting 100% usage: users lag when that happens, so preventing 100% on any of your logical cores is ideal unless it's temporary. One thing I do to resolve this, since I run many workers per core, is close a worker when a server looks overloaded and send its users a disconnect-on-subscribe message; they find a new server immediately. This action is a bit stressful but worthwhile: if I host 20 workers and the server manages to saturate all 4 cores with 4 of them, I boot the other 16 workers so their users find lower-use space elsewhere.

One thing to keep in mind is that CPU usage is always changing: each produced/piped broadcast can cost anywhere from 0.20% to 0.75%+ CPU. That variability doesn't go away, so I focus on keeping a good middle ground.

Thanks for the detailed insights.

By 53 viewers, do you mean a total of 53 users or 53 consumers?

Why many workers per core?

By viewers I mean total users; the amounts they consume can differ depending on which room they're in and the number of broadcasts.

Many workers per core because you can't necessarily max out a core effectively with any single algorithm. Say I have a large room occupying a worker but only using 60% of its core's CPU; that's 40% wasted. So I'd spawn an additional worker to handle a small room, and destroy that worker when the large room grows to 80% or so.

It's just a matter of balancing the loads. Users just see their cameras trickle when this happens, and there's a cutoff on how fast it can occur, so it's not triggered every second or by a glitch.
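The spawn/destroy logic described above could be sketched roughly like this. To be clear, this is my reading of the approach, not the actual implementation: the `CoreBalancer` class, the 60%/80% thresholds, and the cooldown value are all assumptions for illustration.

```typescript
// Hypothetical sketch: spawn a helper worker when a core's primary worker
// leaves capacity idle, destroy it when the primary needs the core back.
interface WorkerLoad {
  id: number;
  cpuPercent: number; // share of its core this worker currently uses
}

class CoreBalancer {
  private lastRebalance = 0;
  constructor(
    private spawnBelow = 60, // idle capacity worth filling with a small room
    private destroyAbove = 80, // primary (large) room needs the core back
    private cooldownMs = 10_000, // don't rebalance every second / on a glitch
  ) {}

  // Decide what to do for one core given its primary worker's load.
  decide(primary: WorkerLoad, hasHelper: boolean, now: number):
      "spawn-helper" | "destroy-helper" | "none" {
    if (now - this.lastRebalance < this.cooldownMs) return "none";
    if (!hasHelper && primary.cpuPercent <= this.spawnBelow) {
      this.lastRebalance = now;
      return "spawn-helper"; // use the idle ~40% for small rooms
    }
    if (hasHelper && primary.cpuPercent >= this.destroyAbove) {
      this.lastRebalance = now;
      return "destroy-helper"; // large room grew; free the core
    }
    return "none";
  }
}
```

The cooldown is what implements the "not triggered every second or by a glitch" part: even if loads oscillate, rebalancing only fires at a bounded rate.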

Why are we not using the worker of room 1, which is using 60% of its CPU, for room 2?

I think you are dedicating a worker to a room; whenever another room needs to be created, it creates its own worker, right?

We could be using that single worker for two rooms or more! And a single room can use more than one worker.

Mostly due to how load balancing works. For example, imagine a single core could handle 30 produced tracks, but we want to represent that as 15 broadcasts (audio + video each), because we don't know whether a user will activate video later, turn it off, or just run audio.

So imagine we now have 15 audio-only broadcasts, and half the CPU is in use. We know the users may turn on video but not when, so we open another worker that grants another 15 broadcast slots. If for some reason the server goes over 100%, we close workers and the users get shuffled to a new worker immediately.

No worker is dedicated to a room; this is also how I mitigate extremely large attacks.

At this point the videos are not on, so you reserve a worker for those 15 videos that may turn on in the future. Is that so?

There's a single transport per broadcasting user, and it's assumed they may open audio, video, or both. They may not, however, so yes, it's for future purposes. Running multiple workers lets me be efficient with how I weigh and balance the servers.

What happens if I don't run more workers? I could have 15 broadcasts sending audio only, and the server's logic says the server is full even though it may be at 20% CPU usage. :slight_smile:
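That slot-based accounting, where a worker is "full" by reserved broadcast slots rather than by live CPU, could be sketched like this. The slot count and relative costs are assumed numbers, not the real ones:

```typescript
// Sketch: software says "full" based on reserved slots, while actual
// CPU load depends on whether each broadcast is running video yet.
const SLOTS_PER_WORKER = 15; // each slot reserves room for audio + video
const COST_AUDIO_ONLY = 0.5; // rough relative cost while video is off
const COST_AUDIO_VIDEO = 1.0;

function workerState(broadcasts: { video: boolean }[]) {
  const reserved = broadcasts.length; // slots in use
  const live = broadcasts.reduce(
    (sum, b) => sum + (b.video ? COST_AUDIO_VIDEO : COST_AUDIO_ONLY),
    0,
  );
  return {
    full: reserved >= SLOTS_PER_WORKER, // what the balancer reports
    reservedPct: (reserved / SLOTS_PER_WORKER) * 100,
    liveLoadPct: (live / SLOTS_PER_WORKER) * 100, // what the CPU roughly sees
  };
}
```

With 15 audio-only broadcasts, `full` is `true` while `liveLoadPct` is only 50: exactly the "full at 20% CPU" mismatch being discussed, just with illustrative numbers.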

I think I initially misread this line. I thought you were intentionally opening multiple workers per core, which I couldn't digest :D. But on rereading, you're saying you open multiple workers, and in case the number of workers exceeds the number of cores, the OS will of course schedule more than one worker onto a core; in that case you close workers and initiate new ones, possibly on another machine. All clear now.

I think we are drifting off the main topic :smiley:

But one thing from above is still not digestible to me: why do you close the worker when the core is getting full? I mean, there's no need to close that worker; just open a new worker and let them work together.

I mean, if you are not dedicating workers to rooms, this shouldn't be necessary. Let me explain. Say there are two workers, worker_1 and worker_2, on different cores, core_1 and core_2. worker_1 serves room_1 and worker_2 serves room_2; worker_1 is 70% loaded and worker_2 is 80% loaded. Now some users turn on video in room_2, so we know worker_2 is about to be full. Instead of using worker_2, we create another worker, worker_3, do all the piping, and serve users the new producers via worker_3, while worker_2 keeps serving the users of the existing producers.

Sorry, I can't understand this: if a worker is at 20% of its core's capacity, why would it say it's full?

It's software-based load balancing. When a worker's defined limits are exceeded by count, it's considered full. That may not match actual CPU usage, but it's true to what the worker was told to handle.

So you can imagine I told my worker to handle a total of 15 broadcasts; in the future, any of those broadcasts could open video or remain audio-only. I can predict this, but it's only an assumption. In many cases it's not easy to use the CPU effectively.

The opening/closing of workers, for me, is just a way to rebalance servers. In some cases running more workers than cores has its advantages, but I handle this manually: I sometimes get a ton of small rooms whose usage is truly idle, so I spawn temporary workers just for them and other users to soak up the CPU. Once usage rises, I close the additional workers and open more servers.
(Also, I host producer servers and consumer servers, so I may sometimes close a producer server and load a consumer server in its place as viewer or broadcast counts increase, etc.)

Ok, thanks for the answer. This is how I do it at my end:

I have set a limit of 1,500 consumers per worker; a CPU core can handle a little more than that, but to be on the safe side I set it to 1,500.

Whenever I start the Node server, I open n workers, where n is the number of CPU cores. This keeps things simple in the sense that I don't have to open and close workers, since the maximum number of workers is already open at server start.

Each worker keeps track of its load, based mainly on its number of consumers.

Each room keeps track of its workers as well.

Now, whenever someone produces a track, I check for the least-loaded worker of the room; if any of the room's workers has space, the track is produced on that worker, otherwise another least-loaded worker is picked and the track is produced there. For every producer I keep track of its router, and through it its worker.

Whenever someone requests to consume a producer, I get the router and worker from the producer and check whether that worker can serve the user. If it can, all is well; otherwise I pick another least-loaded worker, pipe the producer to it, and the user gets the stream from that worker.

It has been working great and smooth.
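The routing just described could be sketched roughly as follows. The 1,500-consumer limit is from the post; the data structures and function names are assumptions, and real code would of course go through mediasoup routers and pipe transports rather than these plain objects:

```typescript
// Sketch of least-loaded worker routing with a per-worker consumer cap.
const MAX_CONSUMERS = 1500;

interface SimWorker { id: number; consumers: number; }

const leastLoaded = (ws: SimWorker[]) =>
  ws.reduce((a, b) => (b.consumers < a.consumers ? b : a));

// New producer: prefer the room's existing workers if any has space,
// otherwise fall back to the globally least-loaded worker.
function pickProducerWorker(roomWorkers: SimWorker[], all: SimWorker[]): SimWorker {
  const inRoom = roomWorkers.filter(w => w.consumers < MAX_CONSUMERS);
  if (inRoom.length) return leastLoaded(inRoom);
  return leastLoaded(all); // room gains a new worker; piping follows
}

// New consumer: use the producer's own worker if it has space,
// otherwise pipe the producer to the least-loaded worker.
function pickConsumerWorker(producerWorker: SimWorker, all: SimWorker[]):
    { worker: SimWorker; needsPipe: boolean } {
  if (producerWorker.consumers < MAX_CONSUMERS)
    return { worker: producerWorker, needsPipe: false };
  return { worker: leastLoaded(all), needsPipe: true };
}
```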

Imagine a room going from 6 broadcasts × 12 viewers to 12 broadcasts × 24 viewers, then to 24 broadcasts × 48 viewers. You may think each step is a 2× increase in CPU, but you'd be very WRONG: the consumer count is broadcasts × viewers, so each step roughly quadruples resource needs, and you must compensate for that. You could now require many pipes and all sorts.
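To put numbers on that, the consumer count in such a room is roughly broadcasts × viewers, so doubling both quadruples the load at each step:

```typescript
// Each viewer consumes every broadcast, so total consumers multiply.
const consumerCount = (broadcasts: number, viewers: number) =>
  broadcasts * viewers;

console.log(consumerCount(6, 12)); // 72
console.log(consumerCount(12, 24)); // 288  (4x, not 2x)
console.log(consumerCount(24, 48)); // 1152 (4x again)
```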

The idea is that you could load-balance a server to allow 20 max broadcasters, and then one of those broadcasters gets 1,000 viewers and lags out the entire broadcasting server because it needed to pipe to many servers. We could let another server pipe for us and limit that load, but in the end we're left defining how much capacity a single user can use, and carefully.

I don't see a setup like that serving you well in the long run; it can report many inaccuracies.

In other words, I'd have to pass on your design; it makes assumptions about what users can potentially do. It's like saying 1,500 consumers will hit 98% CPU and never surpass it, and that's a lie. You'll definitely be spiking cores well before that and need to prevent it. So know what a user can potentially do, and perhaps rebalance your system with good routing to decide where users go in scenarios where they could have thousands of viewers and take up an entire core just to pipe.

If producing costs 0.25%–0.50% CPU each, how do you fit 1,500 producers? Consumers cost just as much, and single-core processing power hasn't improved enough to justify that count. That cap should be far lower.

It's confusing stuff, but let's not go into details. This is the bread and butter for many, and what's appropriate really comes down to app design; these are just stats. Thank you! :slight_smile:

Thanks for your time, brother, it was a nice discussion.

I have reread the thread, and now I get the whole idea of why you close workers and move them to another server: it's because of the software-based load-balancing algorithm on your side.

One thing I want to ask, if appropriate: what is the benefit of having separate producer and consumer servers?

Another question: throughout the thread your focus was on load-balancing servers based on producers, while I load-balance based on consumers. Can you share some thoughts on why you prefer that?

It allows me to increase broadcasting and/or viewer limits by adding more producer and/or consumer servers.

The main benefit is that you can design optimal fanning systems this way; I can customize and define how producer servers get used for specific users, etc. So if I have a user who needs more piping, I can detect this and handle it a few ways: remove other users from the current producer server and move them to another server, or find a new producer server first, switch to it, and set its max user limit to 1, so we can pipe it 100+ times to many consumer servers.

Funnily enough, this setup makes you highly DDoS-resistant: if a server stops pinging, it recycles itself and new servers become available. So as long as you have more servers than the attack can take down, things restart gracefully.

To add to the fun, you can actually determine which user is DDoSing by watching which consumer servers you hand out to which rooms. If there's a pattern, the same user will be seen on every consumer server right before it goes offline. AKA: don't let them know, but start toggling their account to a low bitrate and make them think they won. :smiley:

I load-balance both, actually. The producer server is load-balanced by the total number of broadcasting users, with assumed expectations as the default; as this grows for a user, the allowed user total is slightly reduced so the server can handle more consumer-server pipes.

The consumer server is balanced by a weight system, which is a bit more complex, because I can't keep re-creating pipe transports for each viewer; I must re-use existing pipes. That looks roughly like:

ConsumerUserCount / UsersAllowedPerWeight % 1 == 0

I can't detail much beyond that this is when my weight factor goes up. If I have a total of 12 weights per consumer server and UsersAllowedPerWeight is 6, then for every 1st, 7th, 13th request (and so on) a weight is added, and when all 12 assigned weights are exceeded, the server is considered overloaded.

This is where you may want to run further workers, since not everyone will utilize the idle space set aside for them. You could have several rooms open and exceed the weight factor with just a single viewer in each room. But this ensures that if more users pile in, they have a server hot and ready at their service, and CPU use stays efficient. It's even more efficient when you can re-use pipes and minimize how many you need to get the job done.
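Under my reading of that weight system (a new weight is taken on the 1st, 7th, 13th... consumer when UsersAllowedPerWeight is 6, with 12 weights per server as the cap), the accounting reduces to a ceiling division. The names and the overload rule here are interpretive, not the actual implementation:

```typescript
// Sketch of the weight accounting: one weight covers up to
// USERS_PER_WEIGHT consumers, so weights used is a ceiling division.
const USERS_PER_WEIGHT = 6;
const MAX_WEIGHTS = 12;

// Weights consumed by a given consumer-user count:
// 1..6 users -> 1 weight, 7..12 -> 2 weights, and so on.
const weightsUsed = (consumerUserCount: number) =>
  Math.ceil(consumerUserCount / USERS_PER_WEIGHT);

// Server is overloaded once demand exceeds its assigned weights.
const overloaded = (consumerUserCount: number) =>
  weightsUsed(consumerUserCount) > MAX_WEIGHTS;
```

With these assumed numbers, 12 weights × 6 users means the 73rd consumer-user is the first to tip the server into "overloaded".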