Unstable quality on high GPU/CPU usage

Hey,

I am trying to integrate VoIP for FiveM/RedM games using mediasoup in a Chromium process, to replace 3rd-party software like TeamSpeak/Mumble, and I am having issues with unstable quality caused by high CPU/GPU usage. When I alt-tab away from the game everything is smooth, but when I return it starts lagging (CPU/GPU usage is the only difference). Is it possible to run the client in a worker thread to avoid the lagging caused by high CPU/GPU usage, or is there any other way that would still provide access to AudioContext?

For example, Discord uses WebRTC and works smoothly in the same environment.

I don’t think you need to make your life more complicated with worker threads. This is likely not related to mediasoup-client at all, since it is essentially a wrapper on top of WebRTC. I’m not sure if the WebRTC API is even supported in web workers anyway.

In general, playing audio is a very CPU-light task. Assuming the game and your Chromium processes are separate, you can try raising the priority of the Chromium process if you think you are CPU-bound.

You can play audio via an HTML5 audio element, or connect the track to an AudioContext; I haven't experienced any performance difference between the two.
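
For illustration, both approaches look roughly like this (a sketch; `track` here stands for the audio track you get from your mediasoup consumer, i.e. consumer.track):

// Sketch: two ways to play a received MediaStreamTrack.

// 1) HTML5 audio element
const audioEl = new Audio();
audioEl.srcObject = new MediaStream([track]);
audioEl.play();

// 2) AudioContext
const ctx = new AudioContext();
const source = ctx.createMediaStreamSource(new MediaStream([track]));
source.connect(ctx.destination);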

Can you please describe what exactly you’re experiencing? Is it latency or discontinuous sound/stutter? How high is the CPU usage in-game? If it’s less than 100% then it may not be the source of the problem. It’s very unlikely to be GPU related. Is it possible that you are exceeding your network’s bandwidth limits?


Stutter and crackling, during 100% CPU usage. The thing is that Discord in the browser is smooth even in that case. I use an AudioContext, an analyser node, a gain node and an AudioWorklet to calculate the volume before I send audio to mediasoup. On the consumer side I use similar nodes plus a PannerNode for the 3D effect. My bandwidth has a 300 Mbit/s limit. Everyone is experiencing the same issue.

A couple of thoughts:
a) Try to investigate why you're getting 100% usage in the first place. Modern PCs are often GPU-bound for gaming. Do you have substantial CPU usage from Chromium? That would be unusual for an audio-only application, which brings me to my next point …
b) It sounds like you're doing some fancy stuff with AudioContext. If you have a usage issue with Chromium, it's possible that it's related to the actual audio processing. You mention that you don't get this issue with Discord (which, AFAIK, doesn't do any post-processing like you do). Some ideas:

My first go-to solution would just be to increase the priority of the Chromium process. Very easy, and highly likely to get you the results you want. Alternatives would be …

If you're experiencing 100% usage on the sender (producer) side, then you might want to track.clone() the track you get from getUserMedia, use the original for your analyser node, and plug the clone directly into mediasoup unmodified. I'm not sure it would fix your issue, but it's a very trivial hack and worth trying.
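
Roughly like this (a sketch; `sendTransport` stands for your mediasoup-client send transport):

const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const micTrack = stream.getAudioTracks()[0];

// The clone feeds the analysis graph only.
const analysisTrack = micTrack.clone();
const ctx = new AudioContext();
const src = ctx.createMediaStreamSource(new MediaStream([analysisTrack]));
const analyser = ctx.createAnalyser();
src.connect(analyser);

// The original track goes to mediasoup untouched.
const producer = await sendTransport.produce({ track: micTrack });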

If you're experiencing 100% usage on the receiver (consumer) side, then you could consider doing your post-processing inside a web worker or AudioWorklet (not sure if this is possible for your use case). This would move that work to a separate thread, but it probably wouldn't help if that thread ends up on a busy core (and it sounds like all of your cores are busy).
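
For reference, a minimal volume-meter worklet might look something like this (the file name, node name and the 1024-frame batching are just illustrative; `ctx` and `micSource` stand for your AudioContext and source node):

// volume-meter-processor.js — runs on the audio rendering thread
class VolumeMeterProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this._sum = 0;
    this._frames = 0;
  }
  process(inputs) {
    const channel = inputs[0][0]; // first input, first channel (may be absent)
    if (channel) {
      for (let i = 0; i < channel.length; i++) {
        this._sum += channel[i] * channel[i];
      }
      this._frames += channel.length;
      // Post roughly every 1024 frames (~21 ms at 48 kHz) instead of every
      // 128-frame render quantum, to keep cross-thread messaging cheap.
      if (this._frames >= 1024) {
        this.port.postMessage(Math.sqrt(this._sum / this._frames)); // RMS level
        this._sum = 0;
        this._frames = 0;
      }
    }
    return true; // keep the processor alive
  }
}
registerProcessor('volume-meter', VolumeMeterProcessor);

// Main thread: load the module and listen for levels.
await ctx.audioWorklet.addModule('volume-meter-processor.js');
const meter = new AudioWorkletNode(ctx, 'volume-meter');
micSource.connect(meter);
meter.port.onmessage = function (event) {
  const rms = event.data; // current level, roughly every 21 ms
};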

You can also look at the audio stream on affected PCs using chrome://webrtc-internals – that might give you some hints as to what's going on.

Other things to think about:

Make sure that you've configured Opus (assuming that's the codec you're using) to use DTX (free bandwidth reduction) and FEC (forward error correction).

Look into increasing the size of the receive audio buffer (the jitter buffer?). I have no idea if this is possible in WebRTC/mediasoup – they're built to be pretty developer-friendly, so you really shouldn't have to mess with stuff like this.

I tested this without any fancy stuff, just in a regular Chrome browser at 100% CPU usage: a plain loopback with no post-processing, with WebRTC using about 2% on both client and server. There was no difference. I use Opus at 48000 Hz, no DTX or FEC. You don't have to increase the priority for Discord and it still works smoothly, so that is not a solution for production.

You might want to explore some of the possibilities I mentioned in the prior post. Maybe start troubleshooting by turning on FEC or using track.clone() on the sending side - both are very easy to implement.

Another thought - the machine the mediasoup server is running on should not be at 100% CPU. Probably not applicable to you.

It doesn't sound like an issue with mediasoup (which you've gotten working). There's basically no overhead from mediasoup-client once the connection is established. My guess is that it's either something related to WebRTC or to your implementation method.

I tried your advice and am now sending the stream straight to the server, and it seems to work. Thank you man. <3


Before, I was sending outputNode.getAudioTracks()[0], but now I cannot easily add other sources, e.g. from an AudioNode.

Printing something to the console every 10 ms is not a very good idea - nor is doing much of anything else that often. It could easily be throttled by the browser if the tab is out of focus.

That is not the case. I added that log at the end just to check whether the stream was working. Currently it seems okay, but there is still a problem when I want to play a stream from Audio.srcObject via context.createMediaElementSource() and connect it to the same outputNode as the microphone, so I can send music to all consumers in the same track as the mic. It is smooth only when alt-tabbed away from the game; there is stuttering every 2-5 seconds when there isn't enough CPU power.

I suppose it is because of the "download → decode → merge with mic → encode → send" pipeline.

Agree with @nazar-pc. You're logging a new reference to an entire array 100×/sec, which can get pretty intense. You are also doing potentially 100 FFTs per second (I'm not sure how Chrome implements this, but it might be inefficient). I think you can easily raise your interval to 200 ms or more with zero impact on user experience. If you're using this function for UI updates, you can use window.requestAnimationFrame instead of setInterval to eliminate excess updates.
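
Something like this, reusing a single buffer so nothing is allocated per tick (`updateVolumeUi` is just a stand-in for whatever your UI hook is):

// Read the analyser once per rendered frame instead of every 10 ms.
const data = new Uint8Array(analyser.frequencyBinCount);
function drawMeter() {
  analyser.getByteFrequencyData(data); // fills the existing buffer in place
  updateVolumeUi(data);                // hypothetical UI hook
  requestAnimationFrame(drawMeter);
}
requestAnimationFrame(drawMeter);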

Some other options:

track.clone() after all your preprocessing and mixing (but before analysis - probably the most intensive task), which would let you add more sources (see the sketch after this list).

produce() each individual source (this would increase bandwidth consumption, so I think it's better to mix the sources on the sender side).

If the sender is the one with 100% usage, move the gainNode to the receiver end.
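
To make the first option concrete, here's a rough sketch of sender-side mixing (assuming an existing AudioContext `ctx`, a mic stream `micStream`, some source node `musicSourceNode`, and a mediasoup send transport `sendTransport`):

// All sources connect to one MediaStreamAudioDestinationNode ("mix bus"),
// and only that bus's track is produced.
const mixBus = ctx.createMediaStreamDestination();

// Microphone
ctx.createMediaStreamSource(micStream).connect(mixBus);

// Extra sources (music, effects) just connect to the same bus.
const musicGain = ctx.createGain();
musicSourceNode.connect(musicGain);
musicGain.connect(mixBus);

const mixedTrack = mixBus.stream.getAudioTracks()[0];

// Clone for analysis so the produced track itself stays untouched.
const analysisTrack = mixedTrack.clone();

const producer = await sendTransport.produce({ track: mixedTrack });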

I need to track the volume every 10 ms because I pause the producer on silence to save bandwidth. If I increase the interval, the voice gets cut off.

I'm not sure if this would help, but maybe remove the HTML5 audio element you're using for music. Instead of using context.createMediaElementSource(), use Ajax to get the audio buffer and plug that directly into the AudioContext (loader code below).

I’ve found WebAudio to be a little janky when it comes to playing audio tags sourced from a file.

Look into enabling Opus DTX - it will save you bandwidth during silence (maybe enough to remove your setInterval altogether). I don't know how much it'll save, but it's definitely worth experimenting with!
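
And if you do keep the manual pausing, a hold ("hangover") window lets you poll less often without clipping speech onsets: resume immediately on voice, but only pause after sustained silence. A sketch with illustrative thresholds (`getCurrentRms` is a stand-in for reading the latest level from your worklet/analyser; `producer` is your mediasoup-client Producer):

const SILENCE_THRESHOLD = 0.01; // RMS level treated as silence (tune this)
const SILENCE_HOLD_MS = 500;    // keep sending this long after the last voice
let lastVoiceAt = performance.now();

setInterval(function () {
  const rms = getCurrentRms(); // hypothetical: latest level from your meter
  if (rms > SILENCE_THRESHOLD) {
    lastVoiceAt = performance.now();
    if (producer.paused) producer.resume();
  } else if (!producer.paused &&
             performance.now() - lastVoiceAt > SILENCE_HOLD_MS) {
    producer.pause();
  }
}, 50); // 50 ms polling is plenty once there's a hold window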

I looked at DTX but I didn't understand how to enable it in mediasoup. :(

Here’s my own loader code in case it’s easier for you to implement:

// Loads a file via XHR and decodes it into an AudioBufferSourceNode.
function load_audio(file, callback) {
  var request = new XMLHttpRequest();
  request.open("GET", file, true);
  request.responseType = "arraybuffer";

  request.onload = function () {
    // onAudioContext is a helper (not shown) that runs its callback once
    // the global `audiocontext` has been created/resumed.
    onAudioContext(function () {
      audiocontext.decodeAudioData(request.response, function success(buffer) {
        if (!buffer) return; // decode failed
        var source = audiocontext.createBufferSource();
        source.buffer = buffer;

        callback(source);
      });
    });
  };

  request.send();
}
...
load_audio('intro.mp3', function (source) { introSound = source; });
...
var introGain = audiocontext.createGain();
introSound.connect(introGain);
introGain.connect(audiocontext.destination);
// ... then introSound.start() when playback should begin
// (an AudioBufferSourceNode is one-shot and does nothing until started)
...
params.encodings[0] = { maxBitrate: 40000, dtx:true }; // not sure if this option actually does anything regarding DTX, but I have it enabled anyway
params.codecOptions.opusDtx = true; // I think you DEFINITELY need this option for DTX
params.codecOptions.opusFec = true;

transport.produce(params);

Also, there will be something like 100 users connected, so I really need to pause streams on silence. I didn't find an efficient way to do it, and the first chunk is always cut off before the stream is enabled again.

Is that server-side?

I say go for all 3. Use DTX, use FEC (I don’t think there’s a downside to enabling both of these), and use Ajax to buffer your music instead of the audio tag.

Sorry I didn’t clarify, this is all done client side:

...
params.encodings[0] = { maxBitrate: 40000, dtx:true }; // not sure if this option actually does anything regarding DTX, but I have it enabled anyway
params.codecOptions.opusDtx = true; // I think you DEFINITELY need this option for DTX
params.codecOptions.opusFec = true;

transport.produce(params);

Thanks a lot for all this info. I wanted to go with XMLHttpRequest before, but I need realtime playback without downloading the whole track, and an AudioNode was the only solution I found.

I tried it anyway and there was no difference from the AudioNode approach - it was still stuttering every 2 seconds.