I’ve been browsing the mediasoup source for a reason why temporal layers from chromium wouldn’t be forwarded from mediasoup, and it looks like it required the ‘framemarking’ header extension in the past, which is no longer in libwebrtc.
Is there currently no way to support temporal layers in H264 using chromium/mediasoup?
ibc
(Iñaki Baz Castillo)
June 27, 2024, 9:07am
I’ve created an issue for this. Help welcome:
opened 09:06AM - 27 Jun 24 UTC
help wanted
The `H264.cpp` payload descriptor parser relies on the frame-marking RTP extension to get information about the temporal layer of the payload (among other fields). However, libwebrtc (AKA Chrome) no longer enables/implements the frame-marking RTP extension. The result is that, even if the client enables temporal layers in H264, mediasoup will never detect them and will consider that all received H264 packets belong to temporal layer 0.
> It also affects the `H264_SVC.cpp` codec, which also relies on the frame-marking RTP extension. In this case it means that mediasoup will consider that all received H264_SVC packets belong to spatial layer 0 and temporal layer 0.
The RTP extensions offered by Chrome when using the H264 codec with temporal layers enabled (by passing a proper `scalabilityMode` value in the `encodings`) are the following:
```
a=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:13 urn:3gpp:video-orientation
a=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=extmap:5 http://www.webrtc.org/experiments/rtp-hdrext/playout-delay
a=extmap:6 http://www.webrtc.org/experiments/rtp-hdrext/video-content-type
a=extmap:7 http://www.webrtc.org/experiments/rtp-hdrext/video-timing
a=extmap:8 http://www.webrtc.org/experiments/rtp-hdrext/color-space
a=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:10 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=extmap:11 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
a=extmap:12 https://aomediacodec.github.io/av1-rtp-spec/#dependency-descriptor-rtp-header-extension
a=extmap:9 http://www.webrtc.org/experiments/rtp-hdrext/video-layers-allocation00
```
Relevant RTP extensions are the following:
### https://aomediacodec.github.io/av1-rtp-spec/#dependency-descriptor-rtp-header-extension
**RTP Payload Format For AV1** extension "describes an RTP payload format for the AV1 video codec".
So clearly this is not valid here.
### http://www.webrtc.org/experiments/rtp-hdrext/video-layers-allocation00
**Video Layers Allocation** extension "is for a video sender to provide information about the target bitrate, resolution and frame rate of each scalability layer in order to aid a selective forwarding middlebox to decide which layer to relay."
So this is **NOT** what we need, since it doesn't indicate which spatial/temporal layer the current packet belongs to. This is an RTP extension that tells the remote side how many spatial/temporal layers we are generating, the target bitrate of each layer and so on.
### Conclusion
AFAIS there is literally no way to detect which spatial/temporal layer a received H264/H264_SVC payload belongs to.
Or perhaps we must parse the codec payload to extract its spatial/temporal layers? Note that we already do a very basic parsing of the H264/H264_SVC payload to detect keyframes:
- https://github.com/versatica/mediasoup/blob/3.14.8/worker/src/RTC/Codecs/H264.cpp#L60
- https://github.com/versatica/mediasoup/blob/3.14.8/worker/src/RTC/Codecs/H264_SVC.cpp#L59
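
To make the "parse the codec payload" idea concrete, here is a minimal sketch of reading layer identifiers from the 3-byte SVC NAL unit header extension defined in H.264 Annex G (and described in RFC 6190). `SvcLayerInfo` and `ParseSvcNalHeader` are hypothetical names, not mediasoup code. The sketch assumes the caller has already unwrapped any STAP-A/FU-A aggregation and points at a single NAL unit. Whether Chrome actually emits prefix NAL units (type 14) or coded slice extensions (type 20) when temporal layers are enabled is an assumption that would need to be verified against real traffic.

```cpp
#include <cstddef>
#include <cstdint>

// Layer identifiers carried in the SVC NAL unit header extension.
struct SvcLayerInfo
{
	uint8_t dependencyId; // DID: spatial (dependency) layer.
	uint8_t qualityId;    // QID: quality layer.
	uint8_t temporalId;   // TID: temporal layer.
	bool idr;             // I bit: the layer representation is an IDR.
};

// Hypothetical helper (not part of mediasoup).
// `data` must point at the first byte of a single NAL unit.
// Returns true and fills `info` only for NAL types 14 and 20.
inline bool ParseSvcNalHeader(const uint8_t* data, size_t len, SvcLayerInfo& info)
{
	// 1 byte of NAL header plus 3 bytes of SVC extension.
	if (data == nullptr || len < 4)
		return false;

	const uint8_t nalType = data[0] & 0x1F;

	// Only prefix NAL units (14) and coded slice extensions (20)
	// carry the extension; plain AVC NAL units do not.
	if (nalType != 14 && nalType != 20)
		return false;

	info.idr          = (data[1] & 0x40) != 0; // R(1) I(1) PRID(6)
	info.dependencyId = (data[2] >> 4) & 0x07; // N(1) DID(3) QID(4)
	info.qualityId    = data[2] & 0x0F;
	info.temporalId   = (data[3] >> 5) & 0x07; // TID(3) U(1) D(1) O(1) RR(2)

	return true;
}
```

If the stream contains only plain AVC NAL units (types 1 to 5, 7, 8 and so on), this function always returns false, so the layer would still have to be inferred some other way.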