Implementation ideas on how to achieve "active speaker" recording

I’ve been playing with mediasoup / ffmpeg for a while and kicking the tires on the recording features.

I’m trying to figure out the best way to create a single recording file that follows the active speaker. Effectively, when a new active speaker is detected, I want to somehow “switch” the producer that feeds the consumer on the PlainTransport that FFmpeg connects to.

This is proving to be a challenge. I went as far as creating a proxy DirectTransport to do the binding manually, listening to the consumer’s on(“rtp”) event and pushing each packet to the PlainTransport’s send(packet) method, but this didn’t work either, because the PlainTransport that FFmpeg connects to does not support the send() API. I’m not even sure this is a viable approach.

I’m trying my best to avoid recording every track separately and then post-processing them all to clip the right segments together, as this is very expensive on the CPU and good audio sync is also difficult to achieve.

Looking for any guidance on the best approach…

You cannot switch a Consumer to consume from a different Producer dynamically. That’s impossible. You may create a Consumer for each Producer, pause() all but the active one, and mangle the received RTP packets to create a “continuous” RTP stream (with constant SSRC, incremental sequence numbers and proper timestamps), but that’s not an easy task at all.
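To give an idea of what that mangling involves, here is a minimal sketch of the header rewriting only. RtpStitcher is a hypothetical helper, not part of mediasoup, and it makes several assumptions: packets arrive in order, every source uses the same codec and payload type, and the timestamp jump applied on a switch (switchGapTicks) is a fixed value you choose rather than one derived from a real clock. It ignores RTCP entirely.

```typescript
// Hypothetical helper (not a mediasoup API): rewrites the fixed RTP header so
// that packets from several sources look like one continuous stream.
class RtpStitcher {
  private readonly outSsrc: number;
  private readonly switchGapTicks: number; // assumed timestamp jump on a source switch
  private currentSsrc: number | null = null;
  private seqOffset = 0;
  private tsOffset = 0;
  private lastOutSeq: number;
  private lastOutTs: number;

  constructor(outSsrc: number, switchGapTicks: number, initialSeq = 0, initialTs = 0) {
    this.outSsrc = outSsrc >>> 0;
    this.switchGapTicks = switchGapTicks;
    // Primed so that the very first packet gets exactly initialSeq / initialTs.
    this.lastOutSeq = (initialSeq - 1) & 0xffff;
    this.lastOutTs = (initialTs - switchGapTicks) >>> 0;
  }

  // Returns a rewritten copy of the packet; the input is left untouched.
  rewrite(packet: Uint8Array): Uint8Array {
    if (packet.length < 12) {
      throw new Error("RTP packet is shorter than the 12-byte fixed header");
    }
    const out = new Uint8Array(packet); // copies the bytes
    const view = new DataView(out.buffer, out.byteOffset, out.byteLength);
    const seq = view.getUint16(2);  // bytes 2-3: sequence number
    const ts = view.getUint32(4);   // bytes 4-7: timestamp
    const ssrc = view.getUint32(8); // bytes 8-11: SSRC

    if (ssrc !== this.currentSsrc) {
      // New source: pick offsets so the output continues from the last
      // packet we emitted (all arithmetic wraps at 16 / 32 bits).
      this.currentSsrc = ssrc;
      this.seqOffset = (this.lastOutSeq + 1 - seq) & 0xffff;
      this.tsOffset = (this.lastOutTs + this.switchGapTicks - ts) >>> 0;
    }

    const outSeq = (seq + this.seqOffset) & 0xffff;
    const outTs = (ts + this.tsOffset) >>> 0;
    view.setUint16(2, outSeq);
    view.setUint32(4, outTs);
    view.setUint32(8, this.outSsrc);
    this.lastOutSeq = outSeq;
    this.lastOutTs = outTs;
    return out;
  }
}
```

You would feed it only the packets of the currently active Consumer and forward the result to whatever delivers RTP to FFmpeg. For video, the recording can only resume cleanly on a keyframe from the new source, so you would probably also call consumer.requestKeyFrame() when switching and drop packets until it arrives; none of that is handled here.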

If you wait some time, we plan to add a frame-based API for sending and consuming in [], but that’s WIP and we cannot provide a proper ETA yet.

A frame-based API would be excellent. I can think of tons of use cases beyond grabbing frames from the “active speaker” into a continuous stream, such as dynamically generating and pushing frames of an avatar image when someone has video turned off but is still speaking.

Will definitely be keeping an eye open for when this feature is released…

Until then, I guess it’s back to the drawing board (and probably painful post-processing).

Nice. We’ll announce it here in this forum when done (so subscribe to the “Announcements” topic if you wish).