Because my setup has rooms that expand past a single core's capacity, I spawn and destroy workers to balance the system, using as much CPU as possible without ever letting it reach 100%.
The design is a bit complex and hogs some CPU initially, but once pipes start getting fully utilized and viewer counts increase, it's a work of art.
Something to note: CPU usage is never static. For instance, if a user lags out they drop packets and stop responding (no CPU usage there), while the server sees an increasing range of use as more users pile in. So the best basis for judging CPU usage is not the average but the peak.
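To illustrate the peak-vs-average point, here's a minimal Node.js sketch of tracking peak CPU over a rolling window; the sampling interval, window size, and 80% threshold are placeholders, not values from the setup above:

```ts
// Minimal sketch: track peak (not average) CPU usage over a rolling window.
// Interval, window size and threshold are illustrative assumptions.
import * as os from "os";

function totalCpuTimes() {
  return os.cpus().reduce(
    (acc, cpu) => {
      const t = cpu.times;
      acc.busy += t.user + t.nice + t.sys + t.irq;
      acc.total += t.user + t.nice + t.sys + t.irq + t.idle;
      return acc;
    },
    { busy: 0, total: 0 }
  );
}

const samples: number[] = [];
let prev = totalCpuTimes();

setInterval(() => {
  const cur = totalCpuTimes();
  const usage = (cur.busy - prev.busy) / (cur.total - prev.total); // 0..1
  prev = cur;

  samples.push(usage);
  if (samples.length > 60) samples.shift(); // keep roughly one minute of samples

  const peak = Math.max(...samples);
  if (peak > 0.8) {
    // The peak, not the average, drives the scaling decision:
    // e.g. stop assigning new rooms here, or spawn/close workers.
    console.warn(`peak CPU ${Math.round(peak * 100)}%, rebalance`);
  }
}, 1000);
```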
Now for the juicy part: each server uses around 5-20 TB of bandwidth a day, serving each user up to 500 KB/s send/receive.
RAM never goes above 1-4 GB, so there's nothing much to talk about here other than that handling is good. You can consider 1 GB of RAM per vCore ample, possibly less!
Any questions, feel free to ask. This was a small test; hopefully it shows you the requirements.
As you can see, once we've made it over the hump of many rooms creating pipes, broadcast cost doesn't really increase further and the spikes you see do stabilize. Still, it's best to stay below max CPU, around 80% of overall CPU at most. Use as many workers as you need, but have a mechanism for closing the process cleanly.
One thing to be careful about is a logical core hitting 100% usage; users lag when that happens, so preventing 100% on any logical core is ideal unless it's temporary. Since I run many workers per core, what I do to resolve this is close a worker when a server looks overloaded and send its users a disconnect-on-subscribe message, and they find a new server immediately. This action is a bit stressful but worthwhile: if I host 20 workers and the server manages to saturate all 4 cores with just 4 workers, I boot the other 16 workers so their load finds a lower-use home elsewhere.
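As a rough sketch (not my exact code), shedding an overloaded worker could look something like this, assuming mediasoup workers plus some signaling helper such as a hypothetical `notifyClients`:

```ts
// Illustrative only: close a worker when the host looks overloaded and tell
// its clients to re-subscribe elsewhere. `AppWorker` and `notifyClients`
// are assumed names, not the original implementation.
import type { types as mediasoupTypes } from "mediasoup";

interface AppWorker {
  worker: mediasoupTypes.Worker;
  clientIds: string[];
}

async function shedWorker(
  appWorker: AppWorker,
  notifyClients: (clientIds: string[], msg: object) => void
) {
  // Ask clients to drop their current subscriptions and pick a new server.
  notifyClients(appWorker.clientIds, { type: "disconnect-on-subscribe" });

  // Closing the mediasoup worker tears down its routers and transports.
  appWorker.worker.close();
}
```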
One thing to keep in mind is that CPU usage is always changing; it can be 0.20%-0.75%+ CPU per produced/piped broadcast. That variability doesn't go away, so I have to focus on keeping a good middle ground.
By viewers I mean total users; the amount they consume can differ depending on which room they are in and the number of broadcasts.
The reason for many workers per core is that you can't necessarily max a core out effectively with any single algorithm. Say I have a large room occupying a worker but only using 60% of its core; that's 40% wasted, so I'd spawn an additional worker to handle a small room and destroy that worker when the large room grows to 80% or so.
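Something like the following captures that spawn/destroy dance; the 60%/80% thresholds and helper names are placeholders, not my actual code:

```ts
// Rough sketch of the "filler worker" idea under assumed thresholds.
// All names here are hypothetical stand-ins.
interface FillerWorker { drainAndClose(): Promise<void>; }

interface CoreBalancerDeps {
  coreUsage(): Promise<number>;               // 0..1 usage of the large room's core
  spawnFillerWorker(): Promise<FillerWorker>; // extra worker on the same core
}

const SPAWN_BELOW = 0.6;   // large room leaves ~40% of the core idle
const DESTROY_ABOVE = 0.8; // large room needs the core back

export function makeCoreBalancer(deps: CoreBalancerDeps) {
  let filler: FillerWorker | null = null;

  return async function rebalance() {
    const usage = await deps.coreUsage();

    if (usage < SPAWN_BELOW && !filler) {
      // Fill the idle part of the core with a small room.
      filler = await deps.spawnFillerWorker();
    } else if (usage > DESTROY_ABOVE && filler) {
      // Move the small room elsewhere and give the core back.
      await filler.drainAndClose();
      filler = null;
    }
  };
}
```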
It's just a matter of balancing the loads; users just see their cameras trickle when this happens, and there's a cut-off on how often it can occur, so it isn't triggered every second or by a glitch.
We can use that single worker for two rooms or more! And a single room can, in turn, use more than one worker.
Mostly it comes down to how the load balancing works. For example, imagine a single core can handle 30 produced tracks, but we want to represent that as 15 broadcast slots (audio plus video each), because we don't know whether a user will activate video later, turn it off, run audio only, and so on.
So imagine we now have 15 audio-only broadcasts and half the CPU is in use. We know the users may turn video on but not when, so we open another worker that grants another 15 broadcast slots; if for some reason the server goes over 100%, we close workers and the users get shuffled to a new worker immediately.
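In code the slot accounting might look roughly like this, using the numbers assumed in the example above (30 tracks per core, audio plus video reserved per broadcast):

```ts
// Sketch of slot-based capacity: each broadcast slot reserves audio + video
// up front even while only audio is live. Numbers are from the example above.
const TRACKS_PER_CORE = 30;
const TRACKS_PER_BROADCAST = 2; // audio + video reserved
const SLOTS_PER_WORKER = TRACKS_PER_CORE / TRACKS_PER_BROADCAST; // 15

interface WorkerSlots {
  workerId: string;
  usedSlots: number;
}

function hasFreeSlot(w: WorkerSlots): boolean {
  return w.usedSlots < SLOTS_PER_WORKER;
}

function pickWorker(workers: WorkerSlots[]): WorkerSlots | null {
  // Prefer the worker with the most free slots; null means "open a new one".
  const candidates = workers.filter(hasFreeSlot);
  if (candidates.length === 0) return null;
  return candidates.reduce((a, b) => (a.usedSlots <= b.usedSlots ? a : b));
}
```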
No worker is dedicated to a room; this is how I mitigate extremely large attacks.
There's a single transport for each broadcasting user, and it's assumed they may open audio, video, or both. They may not, however, so yes, it's reserved for future use. Running multiple workers lets me be efficient with how I weigh and balance the servers.
What happens if I don't run more workers? I could have 15 broadcasts sending audio only and the server's logic says the server is full, even though it may be at 20% CPU usage.
I think I initially misread this line. I thought you were saying you were opening multiple workers per core intentionally, and I couldn't digest that :D. But I reread it and found you were saying you open multiple workers, and in case the number of workers exceeds the number of cores on a server, the OS will of course place more than one worker on a core; in that case you close workers and start a new worker, maybe on another machine. All OK now.
I think we are getting off-track from the main topic.
But as per the above, one thing still doesn't make sense to me: why are you closing the worker when the core is getting full? I mean, there's no need to close this worker; just open a new worker and let them work together.
I mean, if you are not dedicating a worker to a room, this shouldn't be necessary. Let me explain. Say there are two workers, worker_1 and worker_2, and assume they are on different cores, core_1 and core_2. worker_1 serves room_1 and worker_2 serves room_2; worker_1 is 70% loaded and worker_2 is 80% loaded. Now say some users turn on video in room_2. We know worker_2 is about to be full, so instead of using worker_2 we create another worker, worker_3; all the piping is done there and users are served from worker_3 for the new producers, while the existing producers keep being served from their current worker.
It's software-based load balancing. When a worker's defined limit is exceeded by count, it's considered full; that may not reflect actual CPU usage, but it reflects what the worker was told to handle.
So you can imagine I'd have told my worker to handle a total of 15 broadcasts; in the future any of those broadcasts could open video or remain audio only. I can predict this, but it's only an assumption, and in many cases it's not easy to use the CPU effectively.
The opening/closing of workers is essentially just how I re-balance servers. In some cases running more workers than cores has its advantages, but I handle this manually: I sometimes get a ton of small rooms whose usage is truly idle… I spawn temporary workers just for them and other users, to use up the CPU. But once usage rises, I close the additional workers and open more servers.
(Also, I host separate producer servers and consumer servers, so I may sometimes close a producer server and load a consumer server in its place as viewer or broadcast counts increase, etc.)
OK, thanks for the answer. This is how I do it at my end:
I have set a limit of 1,500 consumers per worker; a CPU core can handle a little more than that, but to be on the safe side I set it to 1,500.
Whenever I start the Node server I open n workers, where n is the number of CPU cores. This keeps things simple in the sense that I don't have to open and close workers, since the maximum number of workers is already opened on server start.
Each worker keeps track of its load, mainly based on the number of consumers.
Each room keeps track of its workers as well.
Now whenever someone produces a track, I check for the least loaded worker of the room; if any of the room's workers has space, the track is produced on that worker, otherwise another least loaded worker is picked and the track is produced there. For every producer I keep track of its router and, by extension, its worker.
Now whenever someone requests to consume a producer, I get the router and worker from the producer and check whether this worker can serve the user. If it can, all is well; otherwise I pick another least loaded worker, pipe the producer to it, and the user gets the stream from that worker.
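Roughly, the consume path looks like this (simplified; the `WorkerInfo` shape and the way I count consumers are how I'd sketch it here, not my exact code):

```ts
// Sketch of the flow described above, assuming one mediasoup router per
// worker and a simple consumer counter per worker.
import type { types as ms } from "mediasoup";

const CONSUMER_LIMIT = 1500;

interface WorkerInfo {
  worker: ms.Worker;
  router: ms.Router;
  consumerCount: number;
}

function leastLoaded(workers: WorkerInfo[]): WorkerInfo {
  return workers.reduce((a, b) => (a.consumerCount <= b.consumerCount ? a : b));
}

// Pick where a new consumer should live; pipe the producer over if needed.
async function routerForConsume(
  producer: ms.Producer,
  producerWorker: WorkerInfo,
  allWorkers: WorkerInfo[]
): Promise<WorkerInfo> {
  if (producerWorker.consumerCount < CONSUMER_LIMIT) {
    return producerWorker; // the producer's own worker still has space
  }

  const target = leastLoaded(allWorkers);

  // pipeToRouter() makes the producer available on the target router,
  // so the consumer can be created there instead.
  await producerWorker.router.pipeToRouter({
    producerId: producer.id,
    router: target.router,
  });

  return target;
}
```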
Imagine a room going from 6 broadcasts × 12 viewers to 12 broadcasts × 24 viewers, then 24 broadcasts × 48 viewers. You may think each step is a 2x increase in CPU, but you'd be very WRONG. Resource needs grow multiplicatively (consumers scale with broadcasts × viewers), and you must compensate for that. You could now require many more pipes, all sorts of things.
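To make the arithmetic concrete, counting roughly one consumer per broadcast per viewer (ignoring the audio/video split and self-consumption):

```ts
// Back-of-the-envelope consumer counts for the room sizes above.
const steps = [
  { broadcasts: 6, viewers: 12 },
  { broadcasts: 12, viewers: 24 },
  { broadcasts: 24, viewers: 48 },
];

for (const { broadcasts, viewers } of steps) {
  // 72 -> 288 -> 1152: each "2x" step is roughly a 4x jump in consumers.
  console.log(`${broadcasts} x ${viewers} ≈ ${broadcasts * viewers} consumers`);
}
```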
The idea is you could load balance a server to a maximum of 20 broadcasters, then one of those broadcasters gets 1,000 viewers and lags out the entire broadcasting server because it needs to pipe to many servers. We could let another server do the piping for us and limit that load, but in the end we're left with carefully defining how much space a single user is allowed to use.
I don't see a setup like that serving you well in the long run; it can report a lot of inaccuracies.
In other words, I'd have to pass on your design; it's unpredictable about, or too assuming of, what users can potentially do. It's like saying 1,500 consumers will hit 98% CPU and never surpass it, and that isn't true. You'll definitely be spiking cores well before that and need to prevent it. So know what a user can potentially do, and perhaps re-balance your system with good routing that decides where users go in scenarios where they could have thousands of viewers and take up an entire core just to pipe.
If producing costs 0.25%-0.50% of a core each, how do you fit 1,500? Consumers cost just as much, and 1,500 × 0.25% is already several cores' worth on its own. Single-core processing power hasn't improved enough to justify that count; the cap should be far lower.
It's confusing stuff, but let's not go into details. This is the bread and butter for many, and it's really up to the app design what's appropriate; these are just stats. Thank you!
Thanks for your time, brother, it was a nice discussion.
I have reread the thread and now I get the whole idea of why you close the workers and move the load to another server: it's because of the software-based load balancing algorithm you have on your side.
Another thing to ask: throughout the thread your focus was load balancing the servers based on producers rather than consumers, while I load balance based on consumers. Can you share some thoughts on why you prefer that?
It allows me to increase broadcasting and/or viewer limits by adding more producer and/or consumer servers.
The main benefit is that you can design optimal fan-out systems this way; I can customize and define how producer servers get used for specific users, etc. So if I have a user that needs more piping, I can detect this and handle it a few ways: remove the other users on the current producer server and move them to another server, or find a new producer server first, switch the user to it, and set its max user limit to 1 so we can pipe out 100+ times to many consumer servers.
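A sketch of the "dedicate a producer server" option, with made-up helper names and a made-up threshold for what counts as a heavy user:

```ts
// Hypothetical sketch of isolating one heavy broadcaster on its own producer
// server so it can fan out to many consumer servers. Names and numbers are
// illustrative assumptions, not the actual system.
interface ProducerServer {
  id: string;
  maxUsers: number;
  pipeCount: number; // pipes out to consumer servers
  moveUser(userId: string, to: ProducerServer): Promise<void>;
}

const HEAVY_PIPE_THRESHOLD = 20; // assumed: this many pipes marks a "heavy" user

async function isolateHeavyBroadcaster(
  userId: string,
  current: ProducerServer,
  findFreshProducerServer: () => Promise<ProducerServer>
): Promise<ProducerServer> {
  if (current.pipeCount < HEAVY_PIPE_THRESHOLD) return current;

  // Give the heavy user a dedicated producer server...
  const dedicated = await findFreshProducerServer();
  dedicated.maxUsers = 1; // ...so all of its capacity goes to piping out
  await current.moveUser(userId, dedicated);
  return dedicated;
}
```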
Funnily enough, this setup makes you DDoS-proof: if a server stops pinging, it recycles itself and new servers become available. So as long as you have more servers than the attack can take down, things restart gracefully.
To add to the fun, you can actually determine which user is DDoSing by watching which consumer servers you hand out to which rooms. If there's a pattern, the same user will have been on every consumer server right before it went offline. In other words, don't let them know; just start toggling their account to a low bitrate and make them think they won.
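A rough illustration of that pattern matching, purely how it could look rather than what I actually run:

```ts
// Remember which users were assigned to each consumer server; when servers go
// offline, flag any user who appears in every failure. Illustrative only.
const assignments = new Map<string, Set<string>>(); // serverId -> userIds
const failedServers: string[] = [];

function recordAssignment(serverId: string, userId: string): void {
  if (!assignments.has(serverId)) assignments.set(serverId, new Set());
  assignments.get(serverId)!.add(userId);
}

function recordFailure(serverId: string): void {
  failedServers.push(serverId);
}

function suspects(): string[] {
  if (failedServers.length < 3) return []; // need a few failures to see a pattern
  const sets = failedServers.map((id) => assignments.get(id) ?? new Set<string>());
  const [first, ...rest] = sets;
  return [...first].filter((userId) => rest.every((s) => s.has(userId)));
}
```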
I actually load balance both. The producer server is load balanced by the total number of broadcasting users, with assumed expectations baked into the default balancing; as a given user's needs grow, the allowed total of users is adjusted slightly so the server can handle more consumer-server pipes.
The consumer server is balanced with a weight system, which is actually a bit more complex, because I can't keep re-creating pipe transports for each viewer; I have to re-use existing pipes, and that looks roughly like this:
I can't detail much, other than that's when my weight factor goes up. If I have a total of 12 weight per consumer server and UsersAllowedPerWeight is 6, that means a weight is added on every 1st, 7th, 13th request and so on, and once we exceed all 12 assigned weights the server is considered overloaded.
This is where you may want to run further workers, since not everyone will utilize the idle space set aside for them. You could have several rooms open and the weight factor already exceeded with just a single viewer in each room, but this ensures that if more users pile in they have a server hot and ready for them, and CPU use stays efficient. It's even more efficient when you can re-use pipes and minimize how many you need to get the job done.
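To make the weight math concrete, here's one interpretation of it in code, using the 12 / 6 numbers above (an interpretation, not the actual implementation):

```ts
// Weight accounting as described: 12 weights per consumer server,
// UsersAllowedPerWeight = 6, so the 1st, 7th, 13th, ... request each add one.
const MAX_WEIGHT = 12;
const USERS_ALLOWED_PER_WEIGHT = 6;

class ConsumerServerWeight {
  private requests = 0;

  // Weight consumed so far.
  get weight(): number {
    return Math.ceil(this.requests / USERS_ALLOWED_PER_WEIGHT);
  }

  get overloaded(): boolean {
    return this.weight > MAX_WEIGHT;
  }

  addRequest(): void {
    this.requests += 1;
  }
}

// e.g. 12 * 6 = 72 viewer requests fill all 12 weights; the 73rd overloads.
```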