I'm seeing unexpectedly high CPU usage when using DataChannels. I believe a lot of this is determined by the router/pipe configuration.
The basic setup is that there are three routers on an egress server dedicated to incoming traffic from an ingress server. These three routers on egress then distribute the data to the other routers on egress, which have peer WebRtcTransports attached. The peers in the room then receive the DataChannel data client-side via those WebRtcTransports.
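For reference, the fan-out step above looks roughly like this in my code, assuming mediasoup v3's pipeToRouter() API. The names (ingressRouter, peerRouters, dataProducer) are placeholders for this description, not my actual variables:

```javascript
// Sketch: pipe a DataProducer from the ingress-facing router into each
// router that has peer WebRtcTransports attached. Assumes mediasoup v3,
// where pipeToRouter() accepts a dataProducerId.
async function fanOutDataProducer(ingressRouter, peerRouters, dataProducer) {
  for (const peerRouter of peerRouters) {
    await ingressRouter.pipeToRouter({
      dataProducerId: dataProducer.id,
      router: peerRouter
    });
  }
}
```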
When I get near 100 peers in the room, all sending a DataChannel message at 1-second intervals, the CPU usage of the Node.js process hits 100% and stays close to there. When I back that interval off to 10 seconds, CPU usage is very low. Note that this is the Node.js process itself, not the mediasoup worker processes. The data being sent is a string of about 50 characters from each peer.
This does not happen with audio. With audio, the mediasoup workers increase in processing and the nodejs process increases very little.
This leads me to think that sending DataChannel messages is CPU-intensive, more so than audio/video. Is that correct?
I did a bit of profiling, and it seems that DataConsumer.js and Channel.js are using a lot of ticks to handle events:
 ticks  parent  name
 44496   52.0%  /usr/local/bin/node
 24124   54.2%    /usr/local/bin/node
 17328   71.8%      LazyCompile: *<anonymous> /container/node_modules/mediasoup/lib/Channel.js:28:41
                    >> https://github.com/versatica/mediasoup/blob/d01e6676fa0484140bf68f0bdd5da292814f6de0/lib/Channel.js#L28
 17328  100.0%        LazyCompile: *emit events.js:349:44
 17327  100.0%          LazyCompile: *emit domain.js:464:44
  9468   54.6%            LazyCompile: *readableAddChunk internal/streams/readable.js:214:26
  7714   44.5%              LazyCompile: *addChunk internal/streams/readable.js:282:18
Entire profile can be found here: Mediasoup Profile - Pastebin.com
I have experimented with the ordering, time-to-live and max-retries settings of the DataChannel, but they do not seem to change the time spent in events.js. My hope was that the time-to-live setting would avoid event hooks such as retransmission counting, but this doesn't seem to be the case.
The data being sent does not need to be reliable. If a packet drops, the next incoming packet should replace it. In this way it is 'stream-like', because we can discard data that falls outside the time/CPU budget.
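Concretely, these are the SCTP stream parameters I have been experimenting with for the unreliable, unordered behaviour described above. This is a sketch; the streamId and the 900 ms lifetime are placeholder values, and note that maxPacketLifeTime and maxRetransmits are mutually exclusive, so only one is set:

```javascript
// Unreliable, unordered delivery: stale messages are dropped rather
// than retransmitted, which fits the "next packet replaces the last"
// semantics. Passed to transport.produceData() in mediasoup.
const sctpStreamParameters = {
  streamId: 0,             // placeholder stream id
  ordered: false,          // allow out-of-order delivery
  maxPacketLifeTime: 900   // ms to attempt delivery before discarding
  // maxRetransmits must NOT be set together with maxPacketLifeTime
};
```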
Am I able to configure DataChannels so that the Node.js process does not have to wait on these events? Or maybe I have interpreted the profile incorrectly… I welcome any insights into what I may be doing wrong or have misconfigured.
Thanks for reading.