I want to implement a broker server and media server architecture and I have two questions regarding that:
In the broker server, how are media servers managed? Do I need to store all the connections in the broker server? Additionally, how does the broker server handle connections when a media server connects or disconnects?
In the media server, should I create all the workers at once, or just create workers on demand? How do I decide when to make a new worker or reuse the old one?
The broker server stays in sync with the connected media servers. All actions and weight deductions are handled at the broker, so it already has what it needs for future decisions such as where to move users.
Yes, or you can use any discovery server to manage these connections.
When a media server connects, it authenticates itself, and if it is valid the broker keeps a reference to that media server and tracks its usage. On a disconnect, the users on that server are told to drop their connection (if it hasn’t already timed out for them) and find a new server.
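To make that a bit more concrete, here is a minimal sketch of the bookkeeping, not my production code. Every name in it (`BrokerRegistry`, `MediaServerEntry`, `isValidToken`, `notifyUser`, the `"reconnect"` message) is made up for illustration, and the "pick the server with the most remaining weight" rule is just one assumed policy.

```typescript
// Hypothetical types: none of these names come from mediasoup itself.
interface MediaServerEntry {
  id: string;
  weight: number;     // remaining capacity as tracked by the broker
  users: Set<string>; // users the broker has placed on this server
}

class BrokerRegistry {
  private servers = new Map<string, MediaServerEntry>();
  private isValidToken: (token: string) => boolean;
  private notifyUser: (userId: string, message: string) => void;

  constructor(
    isValidToken: (token: string) => boolean,
    notifyUser: (userId: string, message: string) => void,
  ) {
    this.isValidToken = isValidToken;
    this.notifyUser = notifyUser;
  }

  // Called when a media server connects and presents its credentials.
  register(id: string, token: string, weight: number): boolean {
    if (!this.isValidToken(token)) return false;
    this.servers.set(id, { id, weight, users: new Set<string>() });
    return true;
  }

  // Assumed policy: place the user on the server with the most remaining
  // weight, then deduct the cost at the broker.
  assign(userId: string, cost: number): string | undefined {
    let best: MediaServerEntry | undefined;
    for (const server of this.servers.values()) {
      if (server.weight >= cost && (!best || server.weight > best.weight)) {
        best = server;
      }
    }
    if (!best) return undefined;
    best.weight -= cost;
    best.users.add(userId);
    return best.id;
  }

  // Called when a media server's connection drops: forget the server and
  // tell its users to find a new one.
  unregister(id: string): void {
    const entry = this.servers.get(id);
    if (!entry) return;
    this.servers.delete(id);
    for (const userId of entry.users) {
      this.notifyUser(userId, "reconnect");
    }
  }
}
```

In a real broker, `register` and `unregister` would be wired to the open and close events of whatever connection the media server holds to the broker.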
There’s no easy answer here. Creating and destroying workers on demand would be ideal, but that may not be easy to build at first; most commonly you’ll see a human step in and manage workers manually.
I see a lot of people using WebSockets for media server discovery, and they have media server management on the broker side. Is this an ideal way, or is there anything else that I am missing here?
I am planning to use WebSocket connections for media server discovery and TCP connections to communicate between the broker server and media server. Is this an ideal start?
Yes, WebSockets are commonly used. Truthfully, it’s as ideal as you make it; broker servers are complicated to scale. When you reach that point, though, opening multiple zones is an option.
I run a bash script that checks every so often whether overall CPU is way too high (80-90%). When it is, I find the PIDs of all the mediasoup-worker processes, determine which is least used, and stop that process temporarily. If CPU returns to normal, the workers can be restarted.
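My script is bash, but the decision logic looks roughly like this when sketched in TypeScript. The names (`WorkerStat`, `decide`) are hypothetical, the 80% threshold comes from the range above, and the 60% "back to normal" value is an assumption, not a number I am recommending.

```typescript
// Hypothetical shape for what the periodic check collects per process.
interface WorkerStat {
  pid: number;
  cpuPercent: number; // CPU used by this mediasoup-worker process
  paused: boolean;    // true if this check stopped it earlier
}

type Decision =
  | { action: "pause"; pid: number }
  | { action: "resume"; pid: number }
  | { action: "none" };

const HIGH_CPU = 80;   // lower end of the 80-90% range
const NORMAL_CPU = 60; // assumed value for "CPU is back to normal"

function decide(overallCpu: number, workers: WorkerStat[]): Decision {
  if (overallCpu >= HIGH_CPU) {
    // Too hot: pick the least-used worker that is still running.
    const running = workers.filter((w) => !w.paused);
    if (running.length === 0) return { action: "none" };
    const least = running.reduce((a, b) => (b.cpuPercent < a.cpuPercent ? b : a));
    return { action: "pause", pid: least.pid };
  }
  if (overallCpu <= NORMAL_CPU) {
    // Cooled down: bring back one previously paused worker, if any.
    const paused = workers.filter((w) => w.paused)[0];
    if (paused) return { action: "resume", pid: paused.pid };
  }
  // Between the two thresholds: leave everything as it is.
  return { action: "none" };
}
```

The function only decides; acting on the result is left to the caller. On Linux one way to stop and continue a process is `SIGSTOP` and `SIGCONT`, though treat that as an assumption about how you would wire it up.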
I run a weight system so users are guaranteed a slot to broadcast and be consumed by many users. That isn’t always how it plays out, however: a user could have a single viewer and still take a slot without needing much CPU. In other words, we could run out of slots very quickly, so I run more workers to make sure this doesn’t happen, and I automate opening and closing workers to move users to the best server. A worker is already limited by design so that it cannot exceed 100% of a CPU core, so it doesn’t matter if I have 5-15 workers per core as long as workers get closed as usage picks up. This provides slots for many users while ensuring you never overload, as long as you close the lowest-used workers until you’re at one worker per core.
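The "close the lowest-used workers until you’re at one worker per core" rule can be sketched like this. It is a simplified illustration with hypothetical names (`WorkerUsage`, `planClosures`), and it assumes "usage" is a single number per worker, which glosses over how the weights are actually computed.

```typescript
// Hypothetical per-worker usage figure (for example, summed user weights).
interface WorkerUsage {
  pid: number;
  usage: number;
}

// Returns the PIDs to close, lowest usage first, stopping once only
// `cores` workers would remain (one worker per core).
function planClosures(workers: WorkerUsage[], cores: number): number[] {
  const excess = Math.max(0, workers.length - cores);
  return workers
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.usage - b.usage)
    .slice(0, excess)
    .map((w) => w.pid);
}
```

With five workers on a two-core machine this returns the three least-used PIDs in closing order, and it returns an empty list once you are already at or below one worker per core. In practice you would close them one at a time as usage picks up rather than all at once.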
If I receive enough alerts that this is happening, it’s a sign I need more servers or workers, depending on the state of growth.
This sounds like a lot of manual work that needs to be carefully monitored 24/7. The solution you describe also seems to contradict my understanding of mediasoup. First, as directed in the docs, we cannot run more workers than the number of CPUs. Second, once a worker is created it shouldn’t be closed, because users might still be on it, even if there is only one. Closing a worker means closing all of its routers, which in turn means closing all transports, producers, and consumers. I am also looking for a system that is automatic. Anyway, the system seems to be more complicated than I thought. Do you provide consulting as a service, or would you be open to working together on building something really cool?