I am using Node's cluster module to spread load across all CPU cores, and doing the same for socket.io using the Redis adapter to distribute sockets between cores.
A socket can end up connected to any core. The issue I am having is that a user's socket gets connected to core 1 while the room is on core 2, and the workers, producers, etc. (all the references) are on core 2 as well. So the socket can't access any of the state of the room it wants to join, or perform any other action, because they are in separate processes.
I can use Redis pub/sub for inter-process communication. Is there anything else that can be done to tackle this?
What would be the best possible solution here?
Thanks. My rooms and media server were tightly coupled with socket.io before; what I have done now is remove that socket.io coupling from the rooms, workers, etc. Previously socket.io and my media server were part of the same node server, running in non-cluster mode, so everything lived on one core. At some point that had to change, and the time has come.
This is what I have done now:
Socket.io and the media server are still part of the same node server, but I have enabled node clustering so all the server's cores are used, with the media-server code running on one of the cores (for reference-access reasons). A user's socket connects to one of the node processes, and that process then talks to the media-server process via Redis to perform the action. This all works.
This is what I will do in a couple of days:
I will separate the signaling code and the media-server code onto separate servers, and they will communicate via Redis throughout. So there will be one or two (maybe more) signaling servers, but there can be n media servers. This should be the best approach; I will be working on it.
The reason for keeping everything on one core is that I have direct access to all the worker, transport, producer, and consumer objects, since they all live in one process. If I distributed my rooms across cores I would not have direct access to another core's rooms, because it is a separate process, and I would again need Redis or some other IPC. But I do believe they should live in separate processes or servers for scalability.
What motivated me to keep them on one core for the time being is that the actual media work happens in the mediasoup workers, which are already separate processes. My node instance that manages rooms does nothing heavy beyond calling methods on those workers, so it shouldn't consume much CPU. Is this assumption right?
So your n media servers each run on their own machine, and the signaling server runs on a separate machine. Is there then another broker server, or do you call the signaling server the broker server?
When you say the broker server knows the state of the media servers it's holding, what do you mean by "holding"? Aren't the media servers running separately on their own machines? And when you say it "knows the state", do you mean the broker server can access the workers, transports, etc. of the other media servers? Or do you mean it knows which media server to contact, asks that server to perform the action, gets the response, and sends it back to the user?
There’s a lot to take in here, I’ll try to rip through it effectively.
Making as much of your process stateless as possible is the way to go; however, we also want to utilize our servers as fully as we can, and with that as a factor it can be tricky. A pooling mechanism is the most cost-effective approach we have, and it maintains a user base of a few thousand users no problem.
How you design it is up to you. We have dedicated scenarios where a single room gets its own servers, and others where users share whatever is available within the pool.
My media servers connect to a broker server via WebSocket. This mechanism can scale to thousands of cores no problem.
You need to hop outside the single worker (core) to be able to provide bigger rooms. Adjust your code accordingly.
All media servers are independent in their processing abilities, so they are just fed commands and relay media. I run many signaling servers to perform this task; routing may take time, but once routed it's super fast.
Within a single media server, how do you manage things? Let's say the server has 4 cores: would you start node without clustering, so it runs on one core while the workers run in their own processes?
You sort of answered your own question there. Having one server route/control many servers is key/ideal; just do it optimally.
Okay, so you would fork each process, and in theory each process could take 25% of the machine (a single core). If you exceed CPU capacity you need to account for that and write your code differently, or dynamically adjust weight/handling.
I would run 4 processes and dynamically adjust the weight of producing users. If producers become heavy I move them, but the idea is that I pre-calculated each core's ability and allot a slot count accordingly.
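The slot idea above can be sketched as a small allocator: pre-calculate how many producers a core can handle, give each worker process that many slots, and place each new producer on the least-loaded worker. The per-core capacity number here is a made-up placeholder (you would benchmark your own hardware), and moving heavy producers between workers is left out.

```javascript
const SLOTS_PER_CORE = 100; // assumption: measure this on your hardware

// One entry per worker process, each with a pre-allotted slot count.
function createPool(numWorkers) {
  return Array.from({ length: numWorkers }, (_, id) => ({
    id,
    used: 0,
    capacity: SLOTS_PER_CORE,
  }));
}

// Pick the worker with the most free slots; null means the pool is
// full and the caller should spill over to another media server.
function allocate(pool) {
  const best = pool
    .filter((w) => w.used < w.capacity)
    .sort((a, b) => a.used - b.used)[0];
  if (!best) return null;
  best.used += 1;
  return best.id;
}

const pool = createPool(4);
console.log(allocate(pool)); // 0
console.log(allocate(pool)); // 1 (least-loaded wins; ties go in order)
```

When `allocate` returns null, that is the signal to route the room to a different media server entirely.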
So there will be signaling servers, a broker server, and media servers. Users connect to a signaling server; the signaling server talks to the broker server; the broker passes the request to the appropriate media server; the media server sends its response back to the broker; the broker relays it to the signaling server; and the user is notified. Right?
When you say fork a process, which process are you talking about? The mediasoup workers? This is actually my main confusion. What I have in mind is that I start a node server, and that node server starts n mediasoup workers. The node server keeps all the object references: rooms, workers, transports.
Thanks. One last confusion about the broker server: I am assuming this is the socket server that users (say, from the web) will connect to.
Give me a moment. The broker server is not the socket server; the chat server is the socket server. Users connect to it (say, via the web), and the chat server then talks to the broker server. I think that's right?
The chat servers handle the messaging and user state. To keep things scalable, users are routed per room across a maximum of 4 chat servers, and those chat servers share a chat broker that routes the messages, like pub/sub but more controlled.
The broker is split up to handle different things, e.g. login, chat, media, etc. I have a world broker to sort my brokers across the grid.
The idea is that I have one server keeping an eye on the other servers, and I scale that up. I let single cores control other units that have more cores, and when that single core is maxed out, I determine how to stretch the process. With brokers you can literally learn to create or destroy them.
For this discussion we can refer to a broker as a handling server (it still signals, but it handles the I/O).
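Tying the broker answers together: "the broker knows the state" means it does not hold the workers/transports themselves, it only keeps a routing table of which media server owns which room, forwards commands there, and relays the responses. Here is a minimal sketch of that table; the server names, load fields, and placement policy are illustrative assumptions.

```javascript
// The broker's view of the fleet: just ids, room ownership, and load.
const mediaServers = new Map([
  ['media-1', { rooms: new Set(), load: 0 }],
  ['media-2', { rooms: new Set(), load: 0 }],
]);
const roomIndex = new Map(); // roomId -> mediaServerId

// Place a new room on the least-loaded media server.
function placeRoom(roomId) {
  const [serverId, server] = [...mediaServers.entries()]
    .sort((a, b) => a[1].load - b[1].load)[0];
  server.rooms.add(roomId);
  server.load += 1;
  roomIndex.set(roomId, serverId);
  return serverId;
}

// Route a command: the broker only looks up the owner and forwards.
function route(roomId, command) {
  const serverId = roomIndex.get(roomId);
  if (!serverId) throw new Error(`no media server owns ${roomId}`);
  // In the real system this would go out over the broker's WebSocket
  // link to that media server, and the reply would be relayed back.
  return { to: serverId, command };
}

placeRoom('room-9');
console.log(route('room-9', 'createTransport').to); // "media-1"
```

The workers, transports, and producers stay inside the media servers; the broker only ever sees room ids and commands.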