Very high CPU consumption, Best approach for CPU usage

Hi,
I am developing a video conferencing app. Right now my setup is 1 worker and 1 router.
In one room there were 4 producers and 12 consumers, but CPU consumption goes up to 75%. I am using an 8-core CPU. What are the best approaches we can adopt? Are there any settings we can turn off?
Thanks a lot for your help.

To use the full power of your server, you need to create one worker per CPU core and balance routers across the workers so that each worker has a fair share of participants (producers + consumers).

This is an example of code you would use to solve the problem mkh is describing above.

Basically, you want to create a worker for each virtual CPU core and then distribute new routers evenly across the workers to balance the load.

Here’s a piece of example code I use:

const os = require('os')
const mediasoup = require('mediasoup')

// Get the count of total virtual CPU cores
const numWorkers = os.cpus().length
// Workers created at launch; getWorker() cycles through this array
const workers = []
let lastUsedWorkerIndex = -1

module.exports = {
  // Create all mediasoup workers (runs once at launch)
  createWorkers: async () => {
    console.log("****CREATING MEDIASOUP WORKERS****")
    for(let i = 0; i < numWorkers; i++) {
      const worker = await mediasoup.createWorker({
        logLevel: "debug"
      })
      // Add worker to workers array
      workers.push(worker)
    }
    console.log(`****CREATED ${numWorkers} MEDIASOUP WORKERS ****`)
  },

  // Get a worker from a CPU core, balances load across CPU
  getWorker: () => {
    lastUsedWorkerIndex++
    if (lastUsedWorkerIndex >= numWorkers) {
      lastUsedWorkerIndex = 0;
    }

    return workers[lastUsedWorkerIndex]
  }
}

The only thing to take into consideration here is that getWorker(), the function that decides which worker a new router is created on, is very simplistic. It just cycles through the workers in order (round-robin) and doesn’t check the load of each worker. For example, if Worker 1 had 80% CPU core usage and Worker 2 had 20%, you would obviously want to assign a new router to Worker 2, but getWorker() won’t come to that conclusion. Still, this is a start.
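To make the 80% / 20% example concrete, here is a minimal sketch of a load-aware alternative to getWorker(). It assumes you already keep a per-worker load number somewhere; the `loads` Map and the function name `pickLeastLoadedWorker` are hypothetical and not part of mediasoup, and the worker objects below are plain stand-ins.

```javascript
// Sketch: pick the worker with the lowest measured load.
// `loads` is a hypothetical Map<worker, number> maintained elsewhere
// (for example from periodic resource usage samples).
function pickLeastLoadedWorker (workers, loads) {
  if (workers.length === 0) {
    throw new Error('no workers available')
  }
  let best = workers[0]
  let bestLoad = loads.has(best) ? loads.get(best) : 0
  for (const worker of workers.slice(1)) {
    // Workers without a sample yet are treated as idle (load 0)
    const load = loads.has(worker) ? loads.get(worker) : 0
    if (load < bestLoad) {
      best = worker
      bestLoad = load
    }
  }
  return best
}

// Usage with stand-in worker objects (real ones come from mediasoup.createWorker())
const w1 = { pid: 101 }
const w2 = { pid: 102 }
const loads = new Map([[w1, 0.8], [w2, 0.2]])
console.log(pickLeastLoadedWorker([w1, w2], loads).pid) // prints 102
```

Treating unsampled workers as idle is a design choice that favours freshly created workers; you may prefer to fall back to round-robin until every worker has a sample.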

Note that you can now even check per-Worker resource usage (worker.getResourceUsage()) to decide which Worker to assign a new Router to.

Just in case: https://mediasoup.org/documentation/v3/scalability/

Thank you for this code, I really appreciate it.

Actually, I have tried that and it’s not working. It gives an error like “getResourceUsage is not a function”.

worker = await mediasoup.createWorker(mediasoupOptions.worker);
const usage = await worker.getResourceUsage()

If I’m correct, that function is new in v3.4.0 (see https://github.com/versatica/mediasoup/blob/v3/CHANGELOG.md). I’d check that you are on at least that version.
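If you need the same code to run against both older and newer installs, one option is to feature-detect the method rather than assume it exists. This is only a sketch: `supportsResourceUsage` and `readUsageOrNull` are hypothetical helper names, and `worker` is whatever mediasoup.createWorker() returned.

```javascript
// Sketch: check for getResourceUsage() before calling it.
function supportsResourceUsage (worker) {
  return typeof worker.getResourceUsage === 'function'
}

// Resolves to the usage object, or null when the method is unavailable
async function readUsageOrNull (worker) {
  if (!supportsResourceUsage(worker)) {
    return null
  }
  return worker.getResourceUsage()
}
```

With a worker from an older version this resolves to null instead of throwing “getResourceUsage is not a function”, so the caller can fall back to round-robin.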

I am on 3.2.5. I will update now. Thanks

Interesting!
However, it doesn’t seem that easy to turn the struct into a comparable value.

man getrusage

https://linux.die.net/man/2/getrusage

I don’t even know how to properly use those values. But they are there.

FWIW, here’s how I calculate load. I don’t know if this is 100% correct, but it does appear to work.

The trick here is that ru_stime and ru_utime are monotonically increasing and are represented in milliseconds. Therefore, we just need to sample them at intervals and compute the load over each slice of time: CPU time consumed divided by wall-clock time elapsed.

  // Sample now, then re-arm the timer so sampling repeats every `timeout` ms
  async scheduleResourceUsage (timeout) {
    this.updateResourceUsage()
    this.timeout = setTimeout(() => this.scheduleResourceUsage(timeout), timeout)
  }

  async updateResourceUsage () {
    const { workerLoads, log } = this
    let updatedCPULoad = false
    for (const worker of this.workers) {
      const usage = await worker.getResourceUsage()
      const wall = Date.now()
      const { ru_stime: systemTime, ru_utime: userTime } = usage
      const total = systemTime + userTime
      const last = workerLoads.get(worker)
      let load
      let elapsed
      if (last) {
        const { wall: lastWall, total: lastTotal } = last
        elapsed = wall - lastWall
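        // CPU time (ms) consumed since the previous sample
        const cpuTime = total - lastTotal
        // Fraction of one core used over the slice (1.0 = a full core)
        load = Number((cpuTime / elapsed).toFixed(3))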
        updatedCPULoad = true
      }
      const result = { wall, systemTime, userTime, total, load, elapsed }
      log.silly(`Computed worker-${worker.pid} CPU usage`, result)
      workerLoads.set(worker, result)
    }
    if (updatedCPULoad) {
      return this.emit('updated-resource-usage')
    }
  }
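As a worked example of the arithmetic above, here is the same calculation isolated into a small function. The name `computeLoad` and the sample numbers are made up for illustration; each sample holds the wall-clock time and the sum of ru_utime and ru_stime, both in milliseconds.

```javascript
// Sketch: CPU load of one worker between two samples.
// Each sample is { wall, total }: wall-clock ms and ru_utime + ru_stime in ms.
function computeLoad (previous, current) {
  const elapsed = current.wall - previous.wall
  if (elapsed <= 0) {
    return undefined // no time has passed, so no meaningful load
  }
  return (current.total - previous.total) / elapsed
}

// Illustrative numbers: 500 ms of CPU time consumed over a 2000 ms window
const previous = { wall: 10000, total: 4000 }
const current = { wall: 12000, total: 4500 }
console.log(computeLoad(previous, current)) // prints 0.25
```

A result of 0.25 means the worker used roughly a quarter of one core during that window, which gives a single number you can compare across workers.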

I’m just starting to look through all of the documentation. Great stuff!!! Is there any information regarding how many routers one core/worker can handle or is it all just based upon the 200 - 300 viewers spread across the worker regardless of routers? Or is it 200 - 300 viewers per router?

Thank you for this sample code on measuring load %!