Question regarding audioLevelObserver

Hi guys, I’m learning mediasoup with the MiroTalk project. I recently added monitoring of each participant’s audio volume to show who is active. To do that, I add each producer with kind=audio to the audioLevelObserver.

if (kind === 'audio') {
    roomList.get(socket.room_id).addProducerToAudioLevelObserver({ producerId: producer_id });
}

Then, every 800 ms (the interval), when the audioLevelObserver.on('volumes', …) event fires, I check which peers have audio enabled and notify all peers of the volume.

    async startAudioLevelObservation(router) {
        log.debug('Start audioLevelObserver for signaling active speaker...');

        this.audioLevelObserver = await router.createAudioLevelObserver({
            maxEntries: 1,
            threshold: -80,
            interval: 800,
        });

        this.audioLevelObserver.on('volumes', (volumes) => {
            // With maxEntries: 1, volumes[0] is the loudest producer in this interval.
            const { producer: loudest, volume } = volumes[0];
            const audioVolume = Math.round(Math.pow(10, volume / 85) * 10); // 1-10
            if (audioVolume > 2) {
                this.peers.forEach((peer) => {
                    peer.producers.forEach((producer) => {
                        // Report the volume only for the peer that owns the loudest producer.
                        if (producer.id === loudest.id && peer.peer_audio === true) {
                            const data = { peer_id: peer.id, audioVolume: audioVolume };
                            this.io.emit('audioVolume', data);
                        }
                    });
                });
            }
        });
        this.audioLevelObserver.on('silence', () => {
            log.debug('audioLevelObserver', { volume: 'silence' });
        });
    }

    addProducerToAudioLevelObserver(producer) {
        this.audioLevelObserver.addProducer(producer);
    }
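
For reference, the dBvo-to-level mapping used in the 'volumes' handler can be checked in isolation. This is only a sketch of the same formula pulled out into a helper; the name dbToLevel is made up for illustration and is not part of mediasoup or MiroTalk.

```javascript
// Maps a mediasoup volume in dBvo (-127 to 0) to a small integer level,
// using the same formula as the handler above.
function dbToLevel(volumeDb) {
  return Math.round(Math.pow(10, volumeDb / 85) * 10);
}

// A few sample points on the curve.
console.log(dbToLevel(0));   // loudest possible input
console.log(dbToLevel(-40));
console.log(dbToLevel(-60));
console.log(dbToLevel(-80)); // the observer threshold configured above
```

With this formula, the `audioVolume > 2` check only passes for volumes louder than roughly -51 dBvo, so the -80 threshold on the observer itself has little practical effect.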

Is this approach correct? Thanks in advance.

AudioLevelObserver and ActiveSpeakerObserver are usually meant for server-side logic. For the clients, it is probably better to use a client-side API such as getSynchronizationSources → audioLevel.

Could you explain that in more detail, with an example? Where can I find the documentation for getSynchronizationSources → audioLevel? Thanks so much.

It is a method of RTCRtpReceiver. Here is an example of usage. A reference to the mediasoup consumer’s receiver is available as consumer.rtpReceiver.
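
A minimal client-side sketch of that idea, assuming a mediasoup-client consumer whose rtpReceiver is a standard RTCRtpReceiver. The helper names (currentAudioLevel, watchConsumerLevel) and the 200 ms polling interval are made up for illustration, and browsers may not always report audioLevel, so the code treats it as optional.

```javascript
// Returns the loudest audioLevel (0.0 to 1.0) currently reported by the
// receiver's synchronization sources, or 0 if none is reported.
function currentAudioLevel(rtpReceiver) {
  let level = 0;
  for (const source of rtpReceiver.getSynchronizationSources()) {
    // audioLevel is an optional field, so check its type before using it.
    if (typeof source.audioLevel === 'number' && source.audioLevel > level) {
      level = source.audioLevel;
    }
  }
  return level;
}

// Polls an audio consumer and reports its level to a callback.
// Returns a function that stops the polling.
function watchConsumerLevel(consumer, onLevel, intervalMs = 200) {
  const timer = setInterval(() => {
    onLevel(currentAudioLevel(consumer.rtpReceiver));
  }, intervalMs);
  return () => clearInterval(timer);
}
```

Usage would look something like `const stop = watchConsumerLevel(audioConsumer, (level) => updateMeter(level));`, where updateMeter is whatever UI hook the app has.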

Thanks for the tips, but the mediasoup demo doesn’t seem to use that method. :eyes:

I was implementing AudioObservation using AudioLevelObserver and ActiveSpeakerObserver.
My Goal

  • Sending a map of last n active users to the frontend
  • Calculating the dominant speaker among them

Just curious to know whether adding two observers, AudioLevelObserver and ActiveSpeakerObserver, would be inefficient in any way.
Should I use only AudioLevelObserver to calculate both the dominant speaker and the last N peers, or is it fine to use ALO and ASO together for the two different requirements (last N and dominant speaker, respectively)?

Would appreciate some insights @miroslavpejic85 @ibc

ActiveSpeakerObserver should be enough in your use case to detect the dominant speaker. You mentioned sending the data to the peers; what is the purpose of that? If you just want to show who is speaking or not, then it is best to do it on the client side.
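
As a rough server-side sketch of the ActiveSpeakerObserver-only approach: the observer's 'dominantspeaker' event can drive both the dominant speaker and a "last N" list. The helper names (pushLastN, observeDominantSpeaker), the onChange callback and the 300 ms interval are assumptions for illustration; audio producers still have to be registered with observer.addProducer({ producerId }).

```javascript
// Moves producerId to the front of the list and caps the list at n entries.
function pushLastN(lastN, producerId, n) {
  const next = [producerId, ...lastN.filter((id) => id !== producerId)];
  return next.slice(0, n);
}

// Creates an ActiveSpeakerObserver on the router and reports each change of
// dominant speaker together with the most recent n speakers (newest first).
async function observeDominantSpeaker(router, n, onChange) {
  const observer = await router.createActiveSpeakerObserver({ interval: 300 });
  let lastN = [];
  observer.on('dominantspeaker', ({ producer }) => {
    lastN = pushLastN(lastN, producer.id, n);
    onChange({ dominantProducerId: producer.id, lastN });
  });
  return observer;
}
```

Note that this "last N" list is ordered by who most recently became dominant, not by measured volume; if the list needs actual volume levels, AudioLevelObserver with a larger maxEntries would be the place to get them.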

It is better to create a new topic, as this one is quite old now.

Alright, got it. Thanks for the reply!

if you just want to show who is speaking or not then it is best to do it on client side.

Can you explain that in more detail? And what, then, is the purpose of AudioLevelObserver and ActiveSpeakerObserver on the server side? :slight_smile:

In my application I show a green border around the box of the person who is speaking. This is done entirely on the client side, without involving the server, using the hark library (latentflip/hark on GitHub), which converts an audio stream to speech events in the browser. Since we have the streams on the app side, we can do lots of stuff with them without the server’s involvement.
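
To illustrate the general idea behind this kind of client-side detection, here is a small sketch of threshold-plus-hold-time speaking detection. It is not hark's actual implementation; the function name, the default threshold and the hold time are assumptions, and the levels would come from whatever audio measurement the app already has.

```javascript
// Turns a stream of audio levels (0.0 to 1.0) into speaking events.
// A hold time avoids flickering when the level dips briefly between words.
function createSpeakingDetector({ threshold = 0.05, holdMs = 500 } = {}) {
  let speaking = false;
  let lastLoudAt = -Infinity;
  return {
    // Feed one level sample; returns an event name when the state changes,
    // otherwise null.
    update(level, nowMs) {
      if (level >= threshold) {
        lastLoudAt = nowMs;
        if (!speaking) {
          speaking = true;
          return 'speaking';
        }
      } else if (speaking && nowMs - lastLoudAt >= holdMs) {
        speaking = false;
        return 'stopped_speaking';
      }
      return null;
    },
    isSpeaking() {
      return speaking;
    },
  };
}
```

The UI can then toggle the green border whenever update() returns 'speaking' or 'stopped_speaking'.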

You can achieve the same using AudioLevelObserver or ActiveSpeakerObserver on the server side, emitting socket events to the participants about who is speaking or not so that they update their UI. But that uses the server’s resources and means constantly pinging all the participants. Why involve the server if we can do all of that on the client side?

AudioLevelObserver and ActiveSpeakerObserver can be used on the server side for many other purposes. For example, you can detect whether a person is speaking, and when they are not the active speaker, pause their consumers to reduce load on the server and network traffic.
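
A rough sketch of that pause/resume idea, assuming a list of mediasoup consumers (each with producerId, paused, pause() and resume()) and the producer id of the current speaker, for example taken from a 'dominantspeaker' event. The function name is made up, and a real app would also need to tell the clients about the paused consumers. This is probably more suitable for video consumers, since pausing audio could clip the start of someone's speech.

```javascript
// Resumes consumers that belong to the current speaker's producer and pauses
// all others. Consumers already in the right state are left untouched.
function pauseAllExceptSpeaker(consumers, speakerProducerId) {
  return Promise.all(
    consumers.map((consumer) => {
      const isSpeaker = consumer.producerId === speakerProducerId;
      if (isSpeaker && consumer.paused) return consumer.resume();
      if (!isSpeaker && !consumer.paused) return consumer.pause();
      return undefined;
    })
  );
}
```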

These are just two use cases; there can be many others.

I appreciate the information you’ve shared! Thank you.
