Talking heads application, 352x288, strange scalable video coding

Please excuse the slow response to your questions. I want to make sure I have a clue what I’m talking about before answering.

My requirement is low-res talking heads, with manageable bandwidth for mobile. I have something working the way I want it to. But I don’t understand why it works. It is a kludge. I appreciate any wisdom you can share about this.

I use these three encodings:

  const encodings = [
    { maxBitrate: 128000, scaleResolutionDownBy: 4 },
    { maxBitrate: 384000, scaleResolutionDownBy: 2 },
    { maxBitrate: 512000, scaleResolutionDownBy: 1 }
  ]

And these getUserMedia constraints:

  const userMediaConstraints = {
    video: {
      width: { ideal: 704 },
      height: { ideal: 576 },
      frameRate: { min: 10, ideal: 15, max: 15 }
    },
    audio: true
  }

When I do this I get two spatial layers, based on the first two of my three encodings. For the higher-resolution spatial layer, the decoded / displayed videoWidth and videoHeight (on the <video> element) are 352x288, which corresponds to the scaleResolutionDownBy: 2 encoding. This works this way on iOS Safari and on Windows Chrome, Edgium, and Firefox.
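To make the arithmetic concrete, here is a minimal sketch of the nominal size each encoding would have. It assumes the camera really delivers the ideal 704x576 (an "ideal" constraint does not guarantee that) and that each layer is simply the capture size divided by its scaleResolutionDownBy. The helper name expectedLayerSizes is made up for illustration and is not part of any library.

```javascript
// Hypothetical helper: nominal size of each layer, assuming every layer is
// the capture size divided by that encoding's scaleResolutionDownBy.
function expectedLayerSizes(captureWidth, captureHeight, encodings) {
  return encodings.map((encoding) => {
    // A missing scaleResolutionDownBy is treated as 1 (no downscaling).
    const scale = encoding.scaleResolutionDownBy ?? 1;
    return {
      width: Math.floor(captureWidth / scale),
      height: Math.floor(captureHeight / scale),
      maxBitrate: encoding.maxBitrate,
    };
  });
}

const encodings = [
  { maxBitrate: 128000, scaleResolutionDownBy: 4 },
  { maxBitrate: 384000, scaleResolutionDownBy: 2 },
  { maxBitrate: 512000, scaleResolutionDownBy: 1 },
];

// Assumed capture size: the "ideal" values from the constraints above.
const sizes = expectedLayerSizes(704, 576, encodings);
console.log(sizes.map((s) => `${s.width}x${s.height}`).join(', '));
// prints "176x144, 352x288, 704x576"
```

Under those assumptions the 352x288 seen on the <video> element lines up with the middle encoding, and the 704x576 encoding is the one that never shows up as a layer.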

Why does this work this way? Why can I not use just two encodings and getUserMedia (gUM) constraints of 352x288? When I do that, I get only one spatial layer. Again, I seek understanding.
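One way to compare the two setups is to look at what the sender is actually producing, rather than inferring it from the receiving <video> element. The sketch below summarizes the outbound-rtp video entries of a stats report. It assumes direct access to the sending RTCPeerConnection (called pc here, a hypothetical name), and summarizeVideoLayers is a made-up helper. The field names come from the WebRTC statistics API, but not every browser fills in all of them, so missing values are reported as null.

```javascript
// Hypothetical helper: list what each outgoing video stream is sending.
// Accepts anything with a Map-style forEach, such as an RTCStatsReport.
function summarizeVideoLayers(statsReport) {
  const layers = [];
  statsReport.forEach((stat) => {
    if (stat.type !== 'outbound-rtp' || stat.kind !== 'video') return;
    layers.push({
      rid: stat.rid ?? null, // only present when the encodings carry rids
      width: stat.frameWidth ?? null,
      height: stat.frameHeight ?? null,
      fps: stat.framesPerSecond ?? null,
      limitedBy: stat.qualityLimitationReason ?? null,
    });
  });
  return layers;
}

// In a browser, with pc being the sending RTCPeerConnection (assumed):
//   const report = await pc.getStats();
//   console.table(summarizeVideoLayers(report));
```

Running this once with the three-encoding setup and once with the two-encoding, 352x288 setup would show how many outbound video streams exist in each case and at what frame size.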

(I only use one video codec, H.264 Constrained Baseline Profile, profile-level-id 42e01f. Firefox only accepts that profile.)