In the scalability documentation it states that “depending on the host CPU capabilities, a mediasoup C++ subprocess can typically handle over ~500 consumers in total”. I am curious how this value was derived and what options exist for improving single-worker performance.
We recently had upwards of 100 users in a single (audio-only) “conference room” and mediasoup performed admirably well, but we did end up maxing out the CPU of our EC2 instance (c5n.large) as we neared 110 users. If my math is right, 100 users correspond to n*(n-1) = 9,900 consumers. Pretty amazing!
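For reference, here is the full-mesh arithmetic behind that figure as a quick sanity check (plain Node.js, no mediasoup required; the 200-user figure corresponds to the target mentioned below):

```javascript
// In an n-user audio room, each user produces one track and consumes
// every other user's track, so the worker hosts n * (n - 1) consumers.
function totalConsumers(n) {
  return n * (n - 1);
}

console.log(totalConsumers(100)); // 9900 consumers for 100 users
console.log(totalConsumers(200)); // 39800 consumers for 200 users
```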
I am just wondering what possible options exist to go further here. Sharding the room across routers (and sub-processes) does not appear to give any advantage, since as you pipe producers into new routers you still end up creating the same number of consumers on them.
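To put some numbers on that observation, here is a toy calculation (not mediasoup code; the even split of users across routers is my assumption) counting both the original media consumers and the extra pipe consumers that `router.pipeToRouter()` introduces:

```javascript
// Toy arithmetic for sharding an n-user audio room across k routers,
// assuming users are split evenly. The total number of media consumers
// does not shrink; sharding only adds pipe consumers on top.
function consumerCounts(n, k) {
  const usersPerRouter = n / k;
  const mediaConsumers = n * (n - 1);              // unchanged by sharding
  const pipeConsumers = n * (k - 1);               // added by piping each producer
  const perRouterMedia = usersPerRouter * (n - 1); // media consumers hosted per router
  return { mediaConsumers, pipeConsumers, perRouterMedia };
}

console.log(consumerCounts(100, 4));
// → { mediaConsumers: 9900, pipeConsumers: 300, perRouterMedia: 2475 }
```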
The obvious solution is to throw bigger CPUs at the problem. We could adjust our instances to focus on fewer but more powerful CPU cores.
I am also curious how the paused state of producers/consumers affects their CPU cost. The vast majority of the participating users were muted, so perhaps that is why we were able to reach such a high number? One option we considered is to destroy (or never initially create) the consumers of a paused (muted) producer, but we are concerned the user experience would suffer: re-creating the necessary consumers when a user unmutes may add too much overhead, and peers could miss some audio.
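The trade-off between the two strategies can be sketched as plain bookkeeping (a toy model only, not the mediasoup API; the class and method names are mine):

```javascript
class Room {
  // strategy "paused": create consumers up front and keep them paused
  //   while the producer is muted (idle consumers, instant unmute);
  // strategy "lazy": create consumers only on unmute (no idle
  //   consumers, but unmute needs a signaling round trip and peers
  //   may miss the first packets of audio).
  constructor(strategy) {
    this.strategy = strategy;
    this.consumers = new Map(); // producerId -> Set of consuming peerIds
  }
  addMutedProducer(producerId, peerIds) {
    const initial = this.strategy === "paused" ? peerIds : [];
    this.consumers.set(producerId, new Set(initial));
  }
  unmute(producerId, peerIds) {
    // "paused" would merely resume existing consumers;
    // "lazy" has to create all of them at this moment.
    this.consumers.set(producerId, new Set(peerIds));
  }
  totalConsumers() {
    let n = 0;
    for (const peers of this.consumers.values()) n += peers.size;
    return n;
  }
}

const peers = ["a", "b", "c"];
const paused = new Room("paused");
paused.addMutedProducer("p1", peers);
console.log(paused.totalConsumers()); // 3 (exist but muted)

const lazy = new Room("lazy");
lazy.addMutedProducer("p1", peers);
console.log(lazy.totalConsumers()); // 0 (nothing while muted)
lazy.unmute("p1", peers);
console.log(lazy.totalConsumers()); // 3 (created on unmute)
```

The model only counts consumers; what it cannot capture is whether a paused consumer's residual CPU cost is low enough to make the "paused" strategy effectively free, which is exactly the question above.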
Going forward we are hoping to be able to support up to 200 users in a single room. Do you think this is at all feasible? Is there any further room for optimization in the C++ library itself that we could look into and contribute back to the project? Is there any way to achieve this within the current implementation/architecture that I am possibly missing?