Capturing raw audio data on the server side

I am using this GitHub repo as a base: Dirvann/mediasoup-sfu-webrtc-video-rooms, a simple video conferencing example using the mediasoup SFU.

I would like to extend it so that I can read the raw audio data on the server side and do something with it (specifically, feed it into a Whisper model for transcription). I've tried using the consumer.on('rtp') event, but it does not seem to fire. I should note that I am using createWebRtcTransport and not createDirectTransport, as I was unable to get the latter to work.

Has anyone else dealt with similar issues?

If you are not calling createDirectTransport() to get a DirectTransport, then creating a consumer on it and listening for the 'rtp' event on that consumer, then of course it won't work. If you have a problem doing that, then describe the problem instead of trying things that, as per the documentation, won't work.

That makes sense. Are there any examples available on how to use a direct transport? I am a bit confused as to what I should do: replace WebRtcTransport with DirectTransport, or keep both?
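
For reference, here is a minimal sketch of the DirectTransport approach described in the reply above. As far as I understand it, the DirectTransport is created in addition to the WebRtcTransports that serve the browsers, not instead of them. The helper name tapProducerRtp and the three *Like interfaces are made up for illustration (they only mirror the parts of the mediasoup v3 API used here, so the snippet compiles without the package); createDirectTransport(), transport.consume(), consumer.on('rtp') and consumer.resume() are the real mediasoup calls.

```typescript
// Minimal structural types so this sketch compiles on its own.
// In a real project, use the types exported by the "mediasoup" package.
interface ConsumerLike {
  on(event: "rtp", listener: (rtpPacket: Uint8Array) => void): unknown;
  resume(): Promise<void>;
}
interface DirectTransportLike {
  consume(options: {
    producerId: string;
    rtpCapabilities: unknown;
    paused?: boolean;
  }): Promise<ConsumerLike>;
}
interface RouterLike {
  rtpCapabilities: unknown;
  createDirectTransport(): Promise<DirectTransportLike>;
}

// Hypothetical helper: consume an existing audio producer on a
// DirectTransport and hand every RTP packet to `onRtp`.
async function tapProducerRtp(
  router: RouterLike,
  audioProducerId: string,
  onRtp: (rtpPacket: Uint8Array) => void
): Promise<ConsumerLike> {
  // Server-side transport; the browsers keep using their WebRtcTransports.
  const transport = await router.createDirectTransport();

  // Consume with the router's own capabilities, starting paused so that
  // no packet is emitted before the listener is attached.
  const consumer = await transport.consume({
    producerId: audioProducerId,
    rtpCapabilities: router.rtpCapabilities,
    paused: true,
  });

  consumer.on("rtp", onRtp);
  await consumer.resume();
  return consumer;
}
```

Note that the packets delivered to 'rtp' are still encoded RTP (typically Opus for browser audio), so they would need to be depacketized and decoded before they can be passed to Whisper.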

If you are after capturing the raw audio data on the server, then what you need is a PlainTransport to stream the audio out of mediasoup to an external endpoint or application (ffmpeg, GStreamer) that can decode it and give you the raw audio data. See "Consuming Media in an External Endpoint (RTP Out)" in the mediasoup documentation.
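
A rough sketch of that PlainTransport route, under some assumptions: the external decoder runs on the same machine, the helper names (buildAudioSdp, pipeAudioToExternal) and the *Like interfaces are invented for illustration, and the ports are whatever you choose. createPlainTransport(), transport.connect() and transport.consume() are real mediasoup v3 calls; listenIp is the older option name, and newer releases also accept listenInfo.

```typescript
// Subset of mediasoup's codec parameters that the SDP needs.
interface AudioCodec {
  mimeType: string; // e.g. "audio/opus"
  payloadType: number;
  clockRate: number;
  channels?: number;
}

// Hypothetical helper: describe the outgoing RTP stream in an SDP file
// that ffmpeg or GStreamer can open as an input.
function buildAudioSdp(codec: AudioCodec, ip: string, rtpPort: number): string {
  const codecName = codec.mimeType.split("/")[1];
  const channels = codec.channels ?? 1;
  return [
    "v=0",
    `o=- 0 0 IN IP4 ${ip}`,
    "s=mediasoup audio",
    `c=IN IP4 ${ip}`,
    "t=0 0",
    `m=audio ${rtpPort} RTP/AVP ${codec.payloadType}`,
    `a=rtpmap:${codec.payloadType} ${codecName}/${codec.clockRate}/${channels}`,
    "",
  ].join("\n");
}

// Minimal structural types standing in for the mediasoup ones.
interface PlainConsumerLike {
  rtpParameters: { codecs: AudioCodec[] };
  resume(): Promise<void>;
}
interface PlainTransportLike {
  connect(remote: { ip: string; port: number; rtcpPort?: number }): Promise<void>;
  consume(options: {
    producerId: string;
    rtpCapabilities: unknown;
    paused?: boolean;
  }): Promise<PlainConsumerLike>;
}
interface PlainRouterLike {
  rtpCapabilities: unknown;
  createPlainTransport(options: {
    listenIp: string;
    rtcpMux: boolean;
    comedia: boolean;
  }): Promise<PlainTransportLike>;
}

// Hypothetical helper: send one audio producer to ip:rtpPort as plain RTP.
// The consumer is returned paused; resume it once the decoder is listening.
async function pipeAudioToExternal(
  router: PlainRouterLike,
  producerId: string,
  ip: string,
  rtpPort: number,
  rtcpPort: number
): Promise<{ consumer: PlainConsumerLike; sdp: string }> {
  const transport = await router.createPlainTransport({
    listenIp: "127.0.0.1",
    rtcpMux: false, // separate RTCP port
    comedia: false, // we tell mediasoup where to send via connect()
  });
  await transport.connect({ ip, port: rtpPort, rtcpPort });
  const consumer = await transport.consume({
    producerId,
    rtpCapabilities: router.rtpCapabilities,
    paused: true,
  });
  const sdp = buildAudioSdp(consumer.rtpParameters.codecs[0], ip, rtpPort);
  return { consumer, sdp };
}
```

With the SDP written to a file (say audio.sdp), something along the lines of `ffmpeg -protocol_whitelist file,udp,rtp -i audio.sdp -ar 16000 -ac 1 -f s16le out.pcm` should give 16 kHz mono PCM, which is the format Whisper models generally expect. Treat the exact flags as a starting point rather than a tested recipe.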