Help expanding Recording example to send and receive please

Hi all,

I'm using the recording example as the basis for an in-house app. I'm totally new to mediasoup and fairly new to WebRTC, but not to programming/servers.

I asked for, and received (thank you), help the other day with getting ffmpeg to stream properly from clients to the server.

So my app now does the basics of what I want: n clients (a max of 5 for now, which is all I need) can connect to the server, and each client streams its local audio to the server. The server then uses ffmpeg to output each one to a different local audio sound card. Works great. (The logic here is that the sound card is a mixing desk; I manipulate the audio there, then send it back to the PC.)

But… I need to also be able to send an audio stream back to each client. Each one will be a different stream (taken from the sound cards using ffmpeg).

I’m totally stuck getting a return stream. Sorry. I really have tried.

I get the basics, I think. I’ve read as much of the docs as I can but I’m going around in circles.

I think I’m right in saying I need to start a new transport for sending back from the server to the clients (one per client). And then I need to connect to that from the client and “produce” from the server to be “consumed” on the client.

But I’m stuck, as stuck as a pig in mud.

I had this working OK on PeerJS, but that was all browser-based and I wanted to move it to Node.js and an SFU approach. It has ended up taking too much of my time, though (this is not my core job).

Can anyone suggest a way out of this, please? Either help me get my thinking straight so I can finish it myself, or, if someone is willing to do it as a paid job, please let me know.

I could go back to my PeerJS solution, but I'd rather not, having spent so much time on this and got so close. It's probably 30 minutes' work, if that, for someone who knows what they are doing. The ffmpeg bit isn't that important, as long as I can get to the point where I can actually send a stream back to the client's browser to be played via an HTML audio element. I can do the ffmpeg bit myself once I get that working.

I’m equally happy to be guided to learn how to do it myself but I can’t spend another week on this!

Many (many) thanks in advance.

Hi there, hope you’re doing great.

So all you need to do is create a plainTransport and create a producer on it that reads the incoming RTP from ffmpeg. Once that producer exists, all that's left is to create consumers for it. So if you have 5 peers connected to mediasoup, you create 5 consumers against the ffmpeg producer, one on each client's transport. After this you'll have the streams on the client side.
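Roughly, in Node.js with mediasoup v3, it would look something like the sketch below. This is untested and only illustrative: the `peer.recvTransport` / `peer.rtpCapabilities` names, the port, payload type and SSRC are placeholders you'll need to match to your own app and to what ffmpeg actually sends.

```js
// Untested sketch, mediasoup v3. Assumes you already have a `router`, and that each
// connected peer object holds its server-side WebRtcTransport (`peer.recvTransport`)
// and the rtpCapabilities its browser sent over signalling (`peer.rtpCapabilities`).

async function createFfmpegProducer(router) {
  // Plain RTP transport that ffmpeg pushes audio into. comedia lets mediasoup
  // learn ffmpeg's address from the first packet it receives.
  const plainTransport = await router.createPlainTransport({
    listenIp: '127.0.0.1',
    rtcpMux: false,
    comedia: true
  });

  console.log('point ffmpeg at port', plainTransport.tuple.localPort);

  // Codec, payload type and SSRC must match what ffmpeg actually sends.
  const producer = await plainTransport.produce({
    kind: 'audio',
    rtpParameters: {
      codecs: [{
        mimeType: 'audio/opus',
        payloadType: 101,
        clockRate: 48000,
        channels: 2
      }],
      encodings: [{ ssrc: 11111111 }]
    }
  });

  return producer;
}

// One consumer per connected peer, all fed by the same ffmpeg producer.
async function consumeForPeer(router, peer, producer) {
  if (!router.canConsume({ producerId: producer.id, rtpCapabilities: peer.rtpCapabilities })) {
    throw new Error('peer cannot consume this producer');
  }

  const consumer = await peer.recvTransport.consume({
    producerId: producer.id,
    rtpCapabilities: peer.rtpCapabilities,
    paused: true // resume once the browser-side consumer exists
  });

  // Send consumer.id, producer.id, consumer.kind and consumer.rtpParameters to the
  // browser over your signalling channel, then call: await consumer.resume();
  return consumer;
}
```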

This is implemented as ‘broadcasters’ in the mediasoup-demo project, where, using a broadcaster, you can stream a media file from the server to all the participants in a room.

If you need support you can reach out to me at (muhammedsalarkhan@gmail.com).

Best,

@TheSalarKhan has the correct solution. To add to what he was saying, you need the following additional transports:

A plainTransport on the server side, FROM ffmpeg TO mediasoup, to carry the stream (producer) taken from a sound card. If you have multiple ffmpeg processes, you’ll probably need multiple plainTransports, I think.
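For reference, the ffmpeg side of that could look something like the following. This is only a sketch: the ALSA input device, payload type, SSRC and ports are assumptions you'll need to match to your sound card and to the plainTransport's tuple/rtcpTuple.

```
# Capture from an ALSA device (Linux; swap in the right input for your platform),
# encode to Opus and push RTP to the plainTransport. PORT comes from
# plainTransport.tuple.localPort, RTCP_PORT from plainTransport.rtcpTuple.localPort.
ffmpeg \
  -f alsa -i hw:1 \
  -acodec libopus -ab 128k -ac 2 -ar 48000 \
  -payload_type 101 -ssrc 11111111 \
  -f rtp "rtp://127.0.0.1:PORT?rtcpport=RTCP_PORT"
```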

A webRtcTransport on the server side with a corresponding recvTransport on the browser side. You will consume() the producer on the server, and the resulting consumer will be transmitted over this transport pair to the browser.
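On the server, per client, that transport could be created roughly like this (again a sketch; `listenIps`/`announcedIp` are placeholders for your network setup):

```js
// Rough sketch: per client, create the server-side WebRtcTransport that pairs with
// the browser's recvTransport, and hand its parameters back over your signalling.
async function createServerSideRecvTransport(router) {
  const transport = await router.createWebRtcTransport({
    listenIps: [{ ip: '0.0.0.0', announcedIp: 'YOUR_PUBLIC_IP' }],
    enableUdp: true,
    enableTcp: true,
    preferUdp: true
  });

  // When the browser transport fires its 'connect' event, call:
  //   await transport.connect({ dtlsParameters });

  // These params go back to the browser for device.createRecvTransport().
  return {
    transport,
    params: {
      id: transport.id,
      iceParameters: transport.iceParameters,
      iceCandidates: transport.iceCandidates,
      dtlsParameters: transport.dtlsParameters
    }
  };
}
```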

You need one sendTransport/webRtcTransport pair (which you’ve already set up) per client, and one recvTransport/webRtcTransport pair per client as well.

Each sendTransport can send an unlimited number of streams (producers), and each recvTransport can receive an unlimited number of streams (consumers).
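And on the browser side, with mediasoup-client, the receiving end looks roughly like this. The `signal()` helper and the `remoteAudio` element are placeholders for whatever signalling layer and page you already have; they're not part of mediasoup-client:

```js
import { Device } from 'mediasoup-client';

// `signal(action, data)` is a placeholder for your own request/response signalling.
async function playServerAudio(signal) {
  const device = new Device();
  await device.load({ routerRtpCapabilities: await signal('getRouterRtpCapabilities') });

  // Ask the server to create its WebRtcTransport and return the params shown above.
  const params = await signal('createRecvTransport');
  const recvTransport = device.createRecvTransport(params);

  recvTransport.on('connect', ({ dtlsParameters }, callback, errback) => {
    signal('connectRecvTransport', { dtlsParameters }).then(callback).catch(errback);
  });

  // Server calls transport.consume() and returns the consumer parameters.
  const { id, producerId, kind, rtpParameters } =
    await signal('consume', { rtpCapabilities: device.rtpCapabilities });

  const consumer = await recvTransport.consume({ id, producerId, kind, rtpParameters });

  // Attach the received track to an <audio> element in the page.
  const audioEl = document.getElementById('remoteAudio');
  audioEl.srcObject = new MediaStream([consumer.track]);
  await audioEl.play();

  // Tell the server to resume the (initially paused) consumer.
  await signal('resumeConsumer', { consumerId: consumer.id });
}
```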

Thanks both, very kind of you. Since I posted, though, I've changed platform (again) to Jitsi Meet. That's not to say it's a better product or anything; it just fitted my needs at the time, especially as I'd gone through several platforms and spent too long on this.

There’s no need to read the rest of this unless you want to as it’s not directly relevant, but just a brief explanation of what I’ve managed to achieve with my system.

In my search for a solution I looked at it again, having dismissed it once, and realised it also gave me a ready-built frontend for my users that I can easily adapt if I want (I've got a local installation of Meet set up, currently running on a small VirtualBox VM of Ubuntu on my Mac).

So Meet works as the frontend for the users without me having to do much of anything, but it gives me the option to customise it. Other solutions would have involved writing a frontend, which would have been more work for me and a more basic result.

It also means I’ve gone away from my original idea of creating a mix-minus audio stream for each client. The users will now just use Meet (with audio set to the highest quality, no echo cancellation etc etc - all users are using decent mics and headphones).

It means they won’t get to hear themselves with the processed sound (compression, eq, gating etc) but that’s not the end of the world. The final stream for recording/streaming will have that.

Using the Jitsi JavaScript API I've got each "recording" client running on its own Raspberry Pi, each connecting to the Jitsi room as a hidden user, and each client only listens for the stream of the person it is tasked to monitor. The client app displays them full screen in an HTML5 element in Chromium, and that goes out via HDMI to a capture card and into my Mac, into Ecamm Live as a camera source.

I have a 4-channel audio interface on one of the RPis; the other three send their audio over the local network using zita-njbridge to this RPi, and the four audio outputs go to my digital mixer (via balanced analogue). The digital mixer is a Behringer X32, which is also a 32-in/32-out USB audio interface. The processed and combined signal comes back to the Mac for recording/broadcasting along with the video. I could take this hardware out of the equation, send the audio streams to the Mac directly and then out to the digital mixer and back again, but I'm trying to keep the load on the Mac down, and I have the 4-channel interface anyway, so why not?

The final piece, which I just cracked last night in demo form and will tidy up today, is saving the local MediaStream going to each client to a file, by sending it from the JavaScript client app over WebSockets to a simple Node.js app that saves it to disk. I need to tidy that up so it has some structure (record/stop buttons etc.) and save the files to an SSD connected to the main RPi.
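In case it helps anyone doing the same, the server end of that is tiny. A stripped-down sketch, assuming the `ws` package; the port and file name are placeholders:

```js
// Node.js receiver sketch (ws v8+: npm install ws). Writes whatever binary chunks
// arrive on the socket straight to a .webm file (MediaRecorder's default in Chromium).
const fs = require('fs');
const { WebSocketServer } = require('ws');

const wss = new WebSocketServer({ port: 8081 }); // placeholder port

wss.on('connection', (socket) => {
  const out = fs.createWriteStream(`recording-${Date.now()}.webm`);
  socket.on('message', (chunk) => out.write(chunk));
  socket.on('close', () => out.end());
});
```

The browser side is just a MediaRecorder running on the local stream, with an `ondataavailable` handler that does `socket.send(event.data)` for each chunk.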

My only word of warning for anyone interested in this type of system is that the RPi 4 has issues displaying full-screen H.264, and it seems the RPi 3 would actually have been a better choice. I got the RPi 4 because it does 4K and has dual HDMI out, but that's academic because the drivers are just not up to scratch, and it's hard to even get one 1080p HTML5 element running smoothly in Chromium without screen tearing.

Once I have this fully finished - as much as anything ever is - and I've run it for a while, I'll write something up somewhere and post a link to it. I appreciate I've used an alternative platform and not mediasoup, but from my (deep) searching into this area it seems there are a few people looking for similar setups, so it may help someone. And long term I may look at redoing it if needed and look again at mediasoup.

Thanks again for your help.