DataChannels -- High CPU usage on NodeJS process

I seem to be having some unexpectedly high CPU usage when using datachannels. I believe a lot of this is determined by the router pipe configurations etc.

The basic setup is that there are three routers on an egress server dedicated to incoming traffic from an ingress server. These three transports on egress then distribute the data to the other routers on egress, which have peer webrtctransports attached. The peers in the room then get the datachannel data via those webrtctransports to clientside.

When I get near 100 peers in the room, all sending a datachannel message at 1 second intervals, the CPU usage of the nodejs process hits 100% and stays close to there. When I back that interval off to 10 seconds, the CPU is very relaxed and usage is low. Note, this is not the mediasoup worker process but the nodejs process itself. The data being sent is a string of about 50 characters from each peer.

This does not happen with audio. With audio, the mediasoup workers increase in processing and the nodejs process increases very little.

This leads me to think that sending datachannel messages is CPU intensive. Is that correct, and more so than for audio/video?

I did a bit of profiling, and it seems that the dataconsumer.js and channel.js are using a lot of ticks to handle events. …?
  ticks  parent  name
  44496   52.0%  /usr/local/bin/node
  24124   54.2%    /usr/local/bin/node
  17328   71.8%      LazyCompile: *<anonymous> /container/node_modules/mediasoup/lib/Channel.js:28:41
                     >> https://github.com/versatica/mediasoup/blob/d01e6676fa0484140bf68f0bdd5da292814f6de0/lib/Channel.js#L28
  17328  100.0%        LazyCompile: *emit events.js:349:44
  17327  100.0%          LazyCompile: *emit domain.js:464:44
   9468   54.6%            LazyCompile: *readableAddChunk internal/streams/readable.js:214:26
   7714   44.5%            LazyCompile: *addChunk internal/streams/readable.js:282:18

Entire profile can be found here: Mediasoup Profile - Pastebin.com

I have experimented with the ordering, time-to-live and max-retries settings of the datachannel, but they do not seem to change the reliance on events.js. My hope was that setting a time-to-live would avoid event hooks such as those counting retries. But this doesn’t seem to be the case.

The data being sent over does not need to be reliable. If a packet drops, the next incoming packet should replace it. In this way, it is ‘stream like’ because we can discard data that is outside the time/cpu budget.
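For reference, this kind of unordered, unreliable delivery is requested through the standard datachannel parameters. Below is a minimal sketch, assuming the client uses mediasoup-client's `produceData()` on a send transport (or a plain `RTCPeerConnection.createDataChannel()`). The helper name `unreliableDataOptions` is hypothetical and only for illustration. These options only affect SCTP retransmission; per the experiments described above, they did not reduce the event load on the nodejs process.

```javascript
// Hypothetical helper: builds options for an unordered, unreliable datachannel.
// Only ONE of maxRetransmits / maxPacketLifeTime may be set on a datachannel.
function unreliableDataOptions({ maxRetransmits, maxPacketLifeTime } = {}) {
  if (maxRetransmits !== undefined && maxPacketLifeTime !== undefined) {
    throw new TypeError('Set maxRetransmits or maxPacketLifeTime, not both');
  }
  const options = { ordered: false }; // late messages are not held back for ordering
  if (maxPacketLifeTime !== undefined) {
    options.maxPacketLifeTime = maxPacketLifeTime; // milliseconds
  } else {
    options.maxRetransmits = maxRetransmits ?? 0; // default: never retransmit
  }
  return options;
}

// Usage (assumes an existing mediasoup-client send transport named sendTransport):
//   const dataProducer = await sendTransport.produceData(unreliableDataOptions());
// or, with a plain RTCPeerConnection named pc:
//   const channel = pc.createDataChannel('state', unreliableDataOptions());
```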

Am I able to configure datachannels to not use the nodejs process to wait on events? Or maybe I have interpreted the profile incorrectly… I welcome any insights into what I may be doing wrong or have misconfigured.

Thanks for reading. :smiley:


You should be able to connect hundreds of users on a single core and handle a few hundred requests a second just fine, provided your code is tuned right and doesn’t block too much.

100 users sending messages to each other once a second is approximately 10,000 requests a second. On a single core you’ll poop the bed for sure: with the queue you just introduced, the event loop will freeze up while it completes each task.

So when you switched to every 10 seconds, the result makes sense: not all workers will go off exactly in sync, so it is more like bursts of approximately 1,000 requests a second, with each task taking 20ms → 1000ms to complete, which gives the CPU time to rest.
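To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes every message is forwarded to every other peer in the room and that sends are spread evenly over the interval; the function name is just for illustration.

```javascript
// Deliveries per second when every peer sends to every other peer.
function deliveriesPerSecond(peers, intervalSeconds) {
  const messagesIn = peers / intervalSeconds; // messages entering the server per second
  return messagesIn * (peers - 1);            // each one is forwarded to all other peers
}

console.log(deliveriesPerSecond(100, 1));  // 9900
console.log(deliveriesPerSecond(100, 10)); // 990
```

The load grows with the square of the room size, so doubling the peers roughly quadruples the deliveries, while stretching the interval only divides them linearly.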

Tough to say!

I’m unsure how to distribute the main mediasoup process over multiple cores. I am currently creating a worker for every available core (minus two left free for the system). From what I understand, the actual mediasoup workers are handling the load just fine and remaining at a low percentage. It’s the nodejs process thread that is the bottleneck, at 95-100% CPU usage.

I could be missing something very critical here if there is a way to distribute the main mediasoup process over multiple cores, along with the workers. That would likely be out of scope for this forum, but would appreciate any thoughts.

JS is single-threaded. External tasks (network or file system I/O, etc) can run in parallel, but the JS functions they call back in the same process are always executed exclusively, one by one. The only way to distribute JS over multiple cores is to launch several nodejs processes.


Yeah, that was my thinking. I don’t even think nodejs clustering will work for this, as all peers would need all other peers’ data.

I will have a play with some things like nodejs version and within and outside of docker to see how my mileage goes.