Are temporal layers hardcoded in all browsers?

(First some context, to help future readers that might encounter this same problem; actual questions to be found below)

Lately, I’ve been delving into the technical minutiae of Simulcast (I’m just several years late to the party, right? Oh, well) and I’m seeing with horror that the WebRTC people left the SDP description totally under-specified. In particular, SDP doesn’t declare any temporal layers even if the client will be generating them.

I’m working on a very experimental implementation of a mediasoup SDP Bridge. Using a mixture of mediasoup and mediasoup-client utility functions, each media section of an SDP Offer can be parsed into an RtpSendParameters object, which is then used to create a Producer; this has been working extremely well (for my needs, at least).

An SDP like this…

a=rid:r0 send
a=rid:r1 send
a=rid:r2 send
a=simulcast:send r0;r1;r2

…translates into this with the mediasoup utilities…

  encodings: [
    { rid: 'r0', dtx: false },
    { rid: 'r1', dtx: false },
    { rid: 'r2', dtx: false }
  ]

…and mediasoup server becomes ready to handle 3 RTP streams.
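
To make the mapping above concrete, here is a minimal sketch of how the a=rid lines could be turned into an encodings array. The function name ridsToEncodings is made up for illustration and this is not the code of the mediasoup or mediasoup-client utilities; it only assumes that every "a=rid:<id> send" line becomes one encoding with dtx: false, as in the output shown above.

```javascript
// Hypothetical helper (NOT the mediasoup / mediasoup-client implementation):
// collects every "a=rid:<id> send" line of one SDP media section and
// returns one encoding per rid, in the order the lines appear.
function ridsToEncodings(sdpMediaSection) {
  const encodings = [];
  for (const line of sdpMediaSection.split(/\r?\n/)) {
    const match = /^a=rid:(\S+) send\b/.exec(line.trim());
    if (match) {
      encodings.push({ rid: match[1], dtx: false });
    }
  }
  return encodings;
}

const sampleSection = [
  'a=rid:r0 send',
  'a=rid:r1 send',
  'a=rid:r2 send',
  'a=simulcast:send r0;r1;r2'
].join('\r\n');

const parsedEncodings = ridsToEncodings(sampleSection);
console.log(parsedEncodings);
```

Note that nothing in those SDP lines mentions temporal layers, so no amount of parsing can recover them from this input.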

However, if the sender also includes some temporal layers in each of those encodings, they don’t get signaled in the SDP, and mediasoup doesn’t know about them. This leaves the mediasoup Consumer always stuck in temporal layer 0 (which is always the worst quality one, so in this case, the worst possible framerate).

  • Q1: Is this as of today still the situation? Or are there any recent advancements on this topic that I might have skipped?

  • Q2: Would it be possible at all for mediasoup to “detect” that there are multiple temporal layers on the stream? Or is that exclusively part of the binary stream data, thus mediasoup doesn’t have access to it?

The only solution I’ve found so far is to be explicit when telling mediasoup about the encodings. This means that generating an RtpSendParameters from SDP is no longer enough :( and adding scalabilityMode by hand is required:

  encodings: [
    { rid: 'r0', dtx: false, scalabilityMode: 'L1T3' },
    { rid: 'r1', dtx: false, scalabilityMode: 'L1T3' },
    { rid: 'r2', dtx: false, scalabilityMode: 'L1T3' }
  ]
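
A minimal sketch of that "by hand" step, assuming the encodings were already derived from the SDP. The helper name withScalabilityMode and the default value 'L1T3' are assumptions for illustration; the right value depends on what the sender really produces.

```javascript
// Hypothetical helper: returns a copy of the encodings in which every entry
// that lacks a scalabilityMode gets the given one. The default 'L1T3' is only
// the commonly assumed value, not something read from the SDP.
function withScalabilityMode(encodings, scalabilityMode = 'L1T3') {
  return encodings.map((encoding) =>
    encoding.scalabilityMode ? { ...encoding } : { ...encoding, scalabilityMode }
  );
}

const fromSdp = [
  { rid: 'r0', dtx: false },
  { rid: 'r1', dtx: false },
  { rid: 'r2', dtx: false }
];

const patchedEncodings = withScalabilityMode(fromSdp);
console.log(patchedEncodings);
```

The patched array would then take the place of the SDP-derived encodings in the RtpSendParameters used to create the Producer.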

However, the next task becomes knowing what string to put there. Again I’m finding out how web browsers don’t give you a clue about this, and the corresponding APIs are mostly not implemented even today (seems I’m not _that_ late to the party, after all!).

In Chrome, using RTCRtpTransceiver.sender.getParameters() returns an encodings array where scalabilityMode is nowhere to be seen.
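
For anyone wanting to repeat that check, a small sketch. The function name reportedScalabilityModes is made up; in a browser you would pass transceiver.sender, while the mock object below only imitates the shape I observed (encodings without any scalabilityMode) so the snippet can run outside a browser.

```javascript
// Hypothetical helper: lists what getParameters() reports for each encoding.
// `sender` is expected to look like an RTCRtpSender.
function reportedScalabilityModes(sender) {
  const { encodings = [] } = sender.getParameters();
  return encodings.map((encoding) => ({
    rid: encoding.rid,
    // null means the browser did not expose the field at all.
    scalabilityMode: encoding.scalabilityMode ?? null
  }));
}

// Mock standing in for transceiver.sender (illustrative data only).
const mockSender = {
  getParameters: () => ({
    encodings: [{ rid: 'r0' }, { rid: 'r1' }, { rid: 'r2' }]
  })
};

const reported = reportedScalabilityModes(mockSender);
console.log(reported);
```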

Googling a bit, one finds forum posts where it seems common knowledge that Chrome is hardcoded to add 3 temporal layers to its simulcast streams. However there is no official WebRTC stance or documentation about this fact, that I could find. One can only hope that all browsers behave the same!!

Even mediasoup-client itself assumes 3 as the magic number e.g. for Chrome, Firefox, and Safari.

  • Q3: As of today, is that still the real situation? (or again I might have unknowingly skipped some recent developments in this space). Is the WebRTC SFU community as a whole just hardcoding T3 in their scalabilityMode variables, and hoping that’s the correct value, fingers crossed?

Some extra info; this might be useful for some people.

Chrome does indeed implement the scalabilityMode stuff. It’s just that they haven’t shipped it to the public yet, and it is gated behind a runtime flag.

To enable it, run Chrome with this flag:

chrome --enable-blink-features=RTCSvcScalabilityMode

Then, the scalability modes start appearing in places such as RTCRtpEncodingParameters and RTCRtpCodecCapability. At least in Chrome; I don’t think Firefox has support for any of this, though.
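
A quick way to look at the codec capabilities side, as a sketch. The helper name scalabilityModesByCodec is made up; in a browser the input would be RTCRtpSender.getCapabilities('video'). The mock below is illustrative data only, not actual Chrome output.

```javascript
// Hypothetical helper: maps each codec mimeType to the scalabilityModes it
// advertises (empty array when the field is absent). If several entries share
// a mimeType (e.g. multiple profiles), the last one wins in this sketch.
function scalabilityModesByCodec(capabilities) {
  const result = {};
  for (const codec of capabilities.codecs) {
    result[codec.mimeType] = codec.scalabilityModes || [];
  }
  return result;
}

// Mock standing in for RTCRtpSender.getCapabilities('video') (made-up values).
const mockCapabilities = {
  codecs: [
    { mimeType: 'video/VP8', scalabilityModes: ['L1T1', 'L1T2', 'L1T3'] },
    { mimeType: 'video/rtx' }
  ]
};

const modesByCodec = scalabilityModesByCodec(mockCapabilities);
console.log(modesByCodec);
```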

Yes, unless that flag is used, Chrome hardcodes the number of temporal layers.
And no, temporal layers are not signaled in SDP, so if you know that your sender is producing temporal layers, you must tell mediasoup when creating the Producer by setting the scalabilityMode string. There isn’t much that can be done for now.
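
To illustrate what that string encodes, here is a rough sketch of reading the layer counts out of a scalabilityMode value. The function name layersFromScalabilityMode is made up and this is not mediasoup's own parser; it only assumes the "L<spatial>T<temporal>" prefix seen in values like 'L1T3' and falls back to a single layer when the string is missing or does not match.

```javascript
// Hypothetical helper (NOT mediasoup's parser): extracts spatial and temporal
// layer counts from strings shaped like 'L1T3'. Anything after the
// L<n>T<n> prefix is ignored in this sketch.
function layersFromScalabilityMode(scalabilityMode) {
  const match = /^L(\d+)T(\d+)/.exec(scalabilityMode || '');
  if (!match) {
    // No usable value: assume a single spatial and temporal layer.
    return { spatialLayers: 1, temporalLayers: 1 };
  }
  return {
    spatialLayers: Number(match[1]),
    temporalLayers: Number(match[2])
  };
}

console.log(layersFromScalabilityMode('L1T3'));
console.log(layersFromScalabilityMode(undefined));
```

The second call shows the situation described in the question: with no scalabilityMode, only one temporal layer is assumed.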

Nice, thanks for the confirmation.