Spatial layers have lower resolution

Maybe a generic Simulcast question, but probably of interest anyway.

I have set the usual three Simulcast scalings: 4, 2, 1. When the camera video has a resolution lower than 720p, not all three layers are encoded in Chrome, as pointed out in simulcast - consumer is selecting invalid temporal layer, so for a 640p video I get only two layers. Problem is, when selecting layer two (index: 1) with consumer.setPreferredLayers({spatialLayer: 1}), the video has a videoHeight of 320, as if the scalings were [4, 2, 1] instead of [2, 1].

It somewhat made sense, because those are the scaling values I set, but I was expecting that, since Mediasoup told me Chrome was generating only two Simulcast spatial layers, selecting layer two would give me the highest quality one, without any scaling; the results say otherwise. Should I calculate in advance the number of spatial layers my camera will generate and take that into account when setting the simulcast config? Or can Mediasoup tell me the number of layers it receives from the camera?
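For reference, a minimal sketch of the kind of encodings array described above (rids and bitrates are illustrative, not necessarily the exact values used here):

```javascript
// Hypothetical simulcast encodings with scalings 4, 2, 1, ordered lowest
// layer first as mediasoup expects; rids and bitrates are illustrative.
const encodings = [
  { rid: 'r0', scaleResolutionDownBy: 4, maxBitrate: 100000 },
  { rid: 'r1', scaleResolutionDownBy: 2, maxBitrate: 300000 },
  { rid: 'r2', scaleResolutionDownBy: 1, maxBitrate: 900000 }
];

// For a 480-pixel-high capture, the nominal per-layer heights would be:
const heights = encodings.map((e) => 480 / e.scaleResolutionDownBy);
console.log(heights); // [ 120, 240, 480 ]

// These encodings would then be passed when producing:
// const producer = await transport.produce({ track, encodings });
```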

Just to confirm that you are passing proper encodings to the producer, enable spatial layer 0 in the consumer and check the received resolution.

TIP: The fact that you tell getUserMedia to generate 640p video does NOT mean that the encoder is going to encode that same resolution. It may generate a lower resolution (even for the highest simulcast stream) if there are CPU issues or uplink limitations.

Maybe related to this: Talking heads application, 352x288, strange scalable video coding - #7 by OllieJones

Just to confirm that you are passing proper encodings to the producer, enable spatial layer 0 in the consumer and check the received resolution.

Do you mean consumer.setPreferredLayers({spatialLayer: 0})? That’s interesting, because when doing so, the layerschange event tells me that spatialLayer is null, not 0; besides that, it seems to work correctly. With 1 or 2 it works correctly. I was thinking of opening an issue about that, but didn’t have time to finish investigating the code to be sure it was a fault on my side and not a bug or intended behaviour in Mediasoup.

It may generate a minor resolution (even for the highest simulcast stream) if there are CPU issues or uplink limitation/issues.

I know that, and in fact I’ve seen myself how it was changing in realtime :slight_smile:

Check the mediasoup demo, please. I promise there is no issue here at all. This is super tested.

I was not thinking there would be a bug, but rather that, since both zero and null are “falsy” JavaScript values, there might be some kind of optimization there. We are using a (heavily) modified (old) version of EduMeet, which is a fork of mediasoup-demo, so yes, it would make sense to take a look at a current version as a reference for best practices :slight_smile:

Maybe related to this: Talking heads application, 352x288, strange scalable video coding - #7 by OllieJones

Seems like a similar issue, especially at the end of this comment:

I have just compared the current mediasoup-demo code with our own. Both of them get the total number of spatial and temporal layers with the following code:

		const { spatialLayers, temporalLayers }
			= mediasoupClient.parseScalabilityMode(
				consumer.rtpParameters.encodings[0].scalabilityMode
			);

Thing is, printing them shows 3 spatial and 3 temporal layers, but the logs about the producer score show only two of them:

0: {encodingIdx: 0, rid: "r0", score: 10, ssrc: 127565124}
1: {encodingIdx: 1, rid: "r1", score: 0, ssrc: 611333117}

I’ve checked my webcam specs (the Slimbook Pro2 integrated one) and according to Review #6169 about “Chicony USB 2.0 Camera” | Webcam Reviews | Webcam Test it can produce 640x480@17fps (yes, an odd framerate…), so according to Chrome’s values, it’s correctly generating only 2 spatial layers. Maybe related to that, the consumer scores show that producerScores is a 3-item array, as the scalability mode dictates, but the third entry (index = 2), which would be the highest quality one with a [4, 2, 1] scalability array, has a score of zero. That score makes sense, since the producer is generating only 2 layers, so there is no third one. OTOH, setting index = 1, which is the highest quality layer actually provided by Chrome and my webcam, gives me a 320x240 video, which matches a 2x scale factor:

1. producerScore: 10
2. producerScores: Array(3)
  1. 0: 10
  2. 1: 10
  3. 2: 0
3. score: 10

That leads me to think: maybe we should check the resolution of the video track generated by getUserMedia() before setting the encodings array in transport.produce(), either by hand or in the produce() method itself, so that if the number of layers Chrome can generate for that resolution is less than the desired one, we can remove the encodings with the highest scaling factors?
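A sketch of that idea: a hypothetical helper that drops encodings whose scaling would produce a too-small stream. The 150-pixel floor is an assumption for illustration, not libwebrtc's exact threshold:

```javascript
// Keep only the encodings whose scaled-down height stays above a minimum.
// minHeight = 150 is an illustrative guess, not Chrome's real cutoff.
function trimEncodings(trackHeight, encodings, minHeight = 150) {
  return encodings.filter(
    (e) => trackHeight / (e.scaleResolutionDownBy || 1) >= minHeight
  );
}

const encodings = [
  { scaleResolutionDownBy: 4 },
  { scaleResolutionDownBy: 2 },
  { scaleResolutionDownBy: 1 }
];

// For a 480-pixel-high track the 4x layer (120px) falls below the floor,
// leaving only the 2x and 1x encodings.
console.log(trimEncodings(480, encodings));

// In the browser, trackHeight would come from the captured track:
// const { height } = track.getSettings();
```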

You didn’t answer to this: Spatial layers has lower resolution - #2 by ibc

I repeat: call consumer.setPreferredLayers({ spatialLayer: 0 }) and check the received video resolution. Yes, I said 0, so the lowest one.

Then, rewrite your code with just 2 encodings instead of 3 and see how it behaves.
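A sketch of that check, mapping each spatial layer to its nominal height so the received videoHeight can be compared against it (the 480 capture height is taken from the webcam discussed above; element names are illustrative):

```javascript
// Scalings ordered lowest layer first, matching the encodings order.
const scalings = [4, 2, 1];
const nominalHeight = (captureHeight, spatialLayer) =>
  Math.round(captureHeight / scalings[spatialLayer]);

console.log(nominalHeight(480, 0)); // 120
console.log(nominalHeight(480, 1)); // 240
console.log(nominalHeight(480, 2)); // 480

// Server side, forcing the lowest layer as suggested:
// await consumer.setPreferredLayers({ spatialLayer: 0 });
// Then compare videoElement.videoHeight on the receiver against this table.
```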

PS: There are many factors that make Chrome generate fewer spatial layers (simulcast streams) than requested: resolution of the video input, resolution of the captured video, network conditions, system/CPU status, etc.

I know zero is the lowest one. I have already checked in the logs that when spatial or temporal is set to zero, the one set to zero is returned as null in the layerschange event, and the received video resolution didn’t change; it’s always 320x240. Anyway, I’ll try to set it explicitly and see the actual result, instead of relying only on events and logs.

That’s what I was planning to check first.

Then, we would need to know the actual value dynamically, wouldn’t we?

Yes, and take into account the spatial layer configuration shown in the Chromium source code (including the minimum and maximum bitrates).
Anyway, after the startup phase, the browser may decide to change the encoding resolution on the fly, and also stop sending some layers if high CPU usage is detected.

No, we don’t. Just open the demo and try disabling spatial layers via a click in the info overlay over your own video view.

To be clear: yes, you are having unexpected behavior, but that’s not how things work. Let’s focus on your issue (I suspect wrong usage of the given encodings, but you didn’t paste the exact encodings array you are passing to the producer).

Great, I’ll try it :slight_smile:

So, how can we control this after the startup phase? Do I need to parse the encodings from time to time and adjust to them? Do we have some event to listen to, similar to layerschange, but for this info?

Sorry if this is already in the docs; I swear I’ve read them, but trying to achieve such fine-grained control, maybe I’ve missed something…

I insist: here you are assuming (based on the issue you are having) something that is not true and you are looking for a way to fix something that must not be fixed.

You cannot control it after it has started.

Not control, but get notified when it happens so I can take countermeasures :slight_smile:

What you can see is that at the receiver (client) side the video resolution doesn’t match the nominal simulcast configuration.

I was looking for an explicit event, but yes, I can compare values and emit that event myself, thanks :slight_smile:
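A sketch of such a self-made check (all names illustrative): compare the received resolution against the nominal one for the current layer and react when they diverge:

```javascript
// Returns true when the received height is within `tolerance` (as a
// fraction) of the expected nominal height for the current spatial layer.
function resolutionMatches(receivedHeight, expectedHeight, tolerance = 0.1) {
  return Math.abs(receivedHeight - expectedHeight) / expectedHeight <= tolerance;
}

console.log(resolutionMatches(240, 240)); // true
console.log(resolutionMatches(240, 480)); // false

// In the browser this could run on a timer and emit a custom event:
// setInterval(() => {
//   if (!resolutionMatches(videoElement.videoHeight, expectedHeight))
//     myEmitter.emit('resolutionmismatch');
// }, 2000);
```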