Is it possible to run "backend" jobs on the server side?


I know that mediasoup is based on an SFU architecture. Is it possible to do face processing (for example, emotion recognition) on the server side?

Example scenario:

5 people are connected in one “room” and send their streams to a Node.js server (which uses the mediasoup API). The server then gets access to each of the streams, recognises emotions (using some external libraries), and applies filters to those streams. Finally, the server sends the modified streams back to all of the participants in that room. Is this possible and efficient using mediasoup?

I know that this scenario is more MCU than SFU, so is it possible to build a hybrid solution using mediasoup?

Yes. Consume those streams in mediasoup with directTransport.consume() to get the raw RTP packets, then parse them (you can use versatica/rtp.js, an RTP stack written in TypeScript), modify them or do whatever you need, and call directTransport.produce() and producer.send() to send them to the mediasoup Router. Then consume that Producer in WebRTC transports so the packets are sent to the receivers.

Everything is documented, please read the docs.
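
The flow described above could be sketched roughly as follows. This is a minimal sketch, not a verified implementation: it assumes `router` is an existing mediasoup Router and `sourceProducer` is a Producer already created on a WebRTC transport. The function name `createProcessedProducer` and the `transform` hook are hypothetical, and reusing `consumer.rtpParameters` for the new Producer is an assumption that may need adjusting for your codecs.

```javascript
// Sketch: route one Producer's RTP through Node.js and back into the Router.
// `transform` is a hypothetical hook; by default packets pass through unchanged.
async function createProcessedProducer(router, sourceProducer, transform = (packet) => packet) {
  const directTransport = await router.createDirectTransport();

  // Consume the client's stream on the direct transport to receive raw RTP.
  const consumer = await directTransport.consume({
    producerId: sourceProducer.id,
    rtpCapabilities: router.rtpCapabilities,
  });

  // Create a Producer on the same direct transport to re-inject packets.
  // Assumption: the consumer's RTP parameters are acceptable for produce().
  const processedProducer = await directTransport.produce({
    kind: consumer.kind,
    rtpParameters: consumer.rtpParameters,
  });

  consumer.on('rtp', (rtpPacket) => {
    const out = transform(rtpPacket);
    // A transform may return null/undefined to drop a packet.
    if (out) processedProducer.send(out);
  });

  return { directTransport, consumer, processedProducer };
}
```

Receivers would then consume `processedProducer.id` on their WebRTC transports instead of the original Producer's id.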


So before I start modifying packets, I would like to make all the traffic go through Node.js, where I could access these RTP packets. Am I right that, to achieve this, I need a data flow like the one in this simplified chart?
[Image: Untitled Diagram (1)]

And to send an unchanged packet from the directTransport back to the mediasoup router, do I have to do something like this for every stream that the clients send?
consumer.on('rtp', (rtpPacket) => { producer.send(rtpPacket) })

If so, how do I inject this rtpPacket into the directTransport? Who has to send data to the directTransport, and when, so that consumer.on('rtp') is triggered?
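
For context on the question above: per the mediasoup documentation, the 'rtp' event is emitted by a Consumer created on a direct transport whenever the Producer it consumes receives a packet, so nothing has to be pushed into the direct transport by hand. A small way to confirm that packets really flow through Node.js is to read the fixed RTP header before forwarding. The sketch below assumes each packet arrives as a Node.js Buffer. `parseRtpHeader` and `relayWithInspection` are hypothetical helper names, and only the 12-byte fixed header from RFC 3550 is read (no CSRC list, extensions, or payload).

```javascript
// Parse the 12-byte fixed RTP header (RFC 3550) from a raw packet Buffer.
function parseRtpHeader(packet) {
  if (!Buffer.isBuffer(packet) || packet.length < 12) {
    throw new RangeError('RTP packet must be a Buffer of at least 12 bytes');
  }
  const version = packet[0] >> 6;
  if (version !== 2) {
    throw new RangeError(`unexpected RTP version ${version}`);
  }
  return {
    version,
    padding: (packet[0] & 0x20) !== 0,
    extension: (packet[0] & 0x10) !== 0,
    csrcCount: packet[0] & 0x0f,
    marker: (packet[1] & 0x80) !== 0,
    payloadType: packet[1] & 0x7f,
    sequenceNumber: packet.readUInt16BE(2),
    timestamp: packet.readUInt32BE(4),
    ssrc: packet.readUInt32BE(8),
  };
}

// Hypothetical helper: report each header, then forward the packet unchanged.
// `consumer` and `producer` are assumed to live on the same direct transport.
function relayWithInspection(consumer, producer, onHeader = () => {}) {
  consumer.on('rtp', (rtpPacket) => {
    onHeader(parseRtpHeader(rtpPacket));
    producer.send(rtpPacket);
  });
}
```

This wiring would be repeated once per consumed stream. Getting at actual video frames would additionally require depacketizing and decoding the payload, which is not shown here.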

I’m also considering this. Using Node.js (rtp.js) for server-side video stream transforms may not be a good idea, since such CPU-intensive work is better done in C++.
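
One way to hand the heavy work to a native process is to send the RTP out of mediasoup over plain UDP instead of handling it in Node.js. The sketch below is only an illustration under stated assumptions: `forwardToExternalProcessor`, `processorIp`, and `processorPort` are hypothetical names for wherever an external (for example C++) processor would listen, and the `listenIp` option shown here is replaced by `listenInfo` in newer mediasoup releases.

```javascript
// Sketch: forward a Producer's RTP to an external process over plain UDP.
// `processorIp` / `processorPort` are hypothetical: the address on which a
// native media processor is assumed to be listening for RTP.
async function forwardToExternalProcessor(router, sourceProducer, processorIp, processorPort) {
  const plainTransport = await router.createPlainTransport({
    listenIp: '127.0.0.1',
    rtcpMux: true,  // RTP and RTCP share a single port
    comedia: false, // the remote address is given explicitly via connect()
  });

  // Tell mediasoup where to send the packets.
  await plainTransport.connect({ ip: processorIp, port: processorPort });

  // Consuming on the plain transport makes mediasoup send the stream there.
  const consumer = await plainTransport.consume({
    producerId: sourceProducer.id,
    rtpCapabilities: router.rtpCapabilities,
  });

  return { plainTransport, consumer };
}
```

The processed media could come back the same way, through a second plain transport on which the external process produces.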

Also, for a flexible pipeline deployment, maybe the transform nodes could be deployed as sidecars, service-mesh style? This would not introduce extra port mapping…