Using Chromium in a Docker container to connect to mediasoup in another container fails (with mediasoup >=3.5.8)

I have a complex setup with multiple docker containers, but I’ll try to describe the relevant part. I realize it all may be too vague or unclear, but I’ve given up on this problem, so I am looking for a different point of view.

  • There is a Docker container running mediasoup. A NestJS application takes care of starting the media server and handling the initial communication.
  • There is a Chromium 77 browser in another Docker container that opens a given web page and then streams the audio and video to the mediasoup server in the other container. Of course, the browser uses mediasoup-client.
  • Audio and video are then streamed to another container with ffmpeg that records them to a file.

So far, so good. The thing is, everything was running fine until we upgraded mediasoup to 3.5.11 and mediasoup-client to 3.6.5. After that, upon creating a transport and right after the server attempts to produce, the browser crashes with an error similar to this:

 [198:198:0529/070835.498900:ERROR:webrtc_sdp.cc(420)] Failed to parse: "candidate:2071985302 1 tcp 1350508287 172.18.0.9 54424 typ prflx raddr 172.18.0.9 rport 9 tcptype  generation 0 ufrag Dnkv network-id 1". Reason: Invalid TCP candidate type.

The error makes sense because, as you can see, tcptype is followed only by a blank space, meaning its value is empty.
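
To make the empty tcptype easy to spot in logs, here is a minimal sketch of a candidate-line check. It assumes the usual `candidate:` attribute layout (six fixed fields, `typ <type>`, then name/value pairs) and the tcptype values from RFC 6544; `parseCandidate` and `tcpTypeProblem` are hypothetical helper names, not part of mediasoup or Chromium.

```typescript
// tcptype values defined by RFC 6544.
const VALID_TCP_TYPES: readonly string[] = ["active", "passive", "so"];

interface ParsedCandidate {
  transport: string;
  address: string;
  port: number;
  type: string;
  // Trailing "name value" pairs such as raddr, rport, tcptype, generation.
  extensions: { [name: string]: string };
}

function parseCandidate(line: string): ParsedCandidate {
  // Split on single spaces on purpose: an empty value then shows up as "".
  const tokens = line.replace(/^(a=)?candidate:/, "").split(" ");
  const at = (i: number): string => tokens[i] ?? "";
  if (tokens.length < 8 || at(6) !== "typ") {
    throw new Error("not an ICE candidate line: " + line);
  }
  const extensions: { [name: string]: string } = {};
  for (let i = 8; i + 1 < tokens.length; i += 2) {
    extensions[at(i)] = at(i + 1);
  }
  return {
    transport: at(2).toLowerCase(),
    address: at(4),
    port: Number(at(5)),
    type: at(7),
    extensions,
  };
}

// Returns a description of the problem, or null when tcptype looks fine.
function tcpTypeProblem(candidate: ParsedCandidate): string | null {
  if (candidate.transport !== "tcp") return null; // tcptype only applies to TCP
  const tcpType = candidate.extensions["tcptype"];
  if (tcpType === undefined) return "TCP candidate without a tcptype attribute";
  if (VALID_TCP_TYPES.indexOf(tcpType) === -1) {
    return "invalid tcptype value: " + JSON.stringify(tcpType);
  }
  return null;
}

// The candidate from the Chromium log above.
const failing =
  "candidate:2071985302 1 tcp 1350508287 172.18.0.9 54424 typ prflx " +
  "raddr 172.18.0.9 rport 9 tcptype  generation 0 ufrag Dnkv network-id 1";
console.log(tcpTypeProblem(parseCandidate(failing))); // invalid tcptype value: ""
```

Splitting on single spaces rather than on runs of whitespace is deliberate: it keeps the empty value visible instead of shifting the following tokens.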

However, running Chromium outside of Docker works perfectly fine. I can access the same page and send streams to mediasoup.

My debugging attempts have had little or no success. However, here is the information I managed to gather:

  • The error happens in the device handler (in our case it is Chrome74.ts), in send(), when it runs await this._pc.setLocalDescription(offer). This runs once for audio and succeeds; when it runs for the video stream, it causes the browser error.
  • After the above runs for the video, the browser prints the error message in the terminal.
  • The IP and the port in the error are not in the candidate list that is initially sent by the server (upon transport creation and connecting).
  • The IP in the error message is the IP of the Docker container running Chromium.
  • The above makes me think there is some sort of communication happening directly between the browser and the mediasoup server. Since it is not going through our code, I was unable to trace it.
  • I could not figure out how the tcptype is affected by the changes in 3.5.8. It may not be, but something, somewhere, is generating a broken SDP.

What I tried but did not fix the problem:

  • Trying all server versions from 3.5.8 to 3.5.11
  • Using 3.5.8, but reverting the uv_udp_init_ex() change (while keeping the updated libuv).
  • Using different STUN servers
  • Using or not using TURN servers
  • Using a different version of mediasoup-client (the problem seems to originate from the server)
  • When running router.createWebRtcTransport(), passing enableTcp: false makes the error disappear, but the browser still fails.
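
The enableTcp experiment can be repeated systematically for each protocol combination. The sketch below builds the options for each case; `TransportOptionsSubset` is a hand-written subset of the options accepted by router.createWebRtcTransport() in the 3.5/3.6 series (listenIps, enableUdp, enableTcp, preferUdp), `optionsFor` is a hypothetical helper, and the IP addresses are placeholders.

```typescript
// Hand-written subset of the WebRtcTransport options, for illustration only.
interface ListenIp {
  ip: string;
  announcedIp?: string;
}

interface TransportOptionsSubset {
  listenIps: ListenIp[];
  enableUdp: boolean;
  enableTcp: boolean;
  preferUdp: boolean;
}

type ProtocolMode = "udp-only" | "tcp-only" | "both";

// Builds the options for one test case so each protocol can be checked alone.
function optionsFor(mode: ProtocolMode, listenIps: ListenIp[]): TransportOptionsSubset {
  return {
    listenIps,
    enableUdp: mode !== "tcp-only",
    enableTcp: mode !== "udp-only",
    // Only meaningful when both protocols are offered.
    preferUdp: mode === "both",
  };
}

// Placeholder addresses; replace them with the container's real ones.
const placeholderIps: ListenIp[] = [{ ip: "0.0.0.0", announcedIp: "192.0.2.10" }];
const udpOnly = optionsFor("udp-only", placeholderIps);
console.log(JSON.stringify(udpOnly));
// In the server this would be passed along the lines of:
//   const transport = await router.createWebRtcTransport(udpOnly);
```

Running the same page once per mode shows whether the failure follows a protocol or the transport as a whole.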

The only thing that helped was downgrading the server to 3.5.7, probably because that restores the older version of libuv.

So my questions are: Has anyone come across anything similar? Can you suggest where to look for the source of the data (ICE candidates?) that gets converted into an invalid SDP? Is there something I might be missing, or some place in the code that I need to look into?

Hi @sspanak, I have the same issue after updating to 3.5.8 (all later versions fail too).
Same setup: mediasoup runs inside Docker; it works on 3.5.7 but fails on >=3.5.8.
Did you manage to figure this out?

Please print the options object given to the sendTransport in that Chrome.
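
A minimal sketch of what such a printout could look like. The field names (id, iceCandidates with ip, port, protocol, type, tcpType) are assumptions about the object passed to device.createSendTransport(), and `summarizeTransportOptions` is a hypothetical helper.

```typescript
// Assumed shape of the relevant parts of the send transport options.
interface IceCandidateLike {
  ip: string;
  port: number;
  protocol: string;
  type: string;
  tcpType?: string;
}

interface SendTransportOptionsLike {
  id: string;
  iceCandidates: IceCandidateLike[];
}

// Produces one line per server candidate, flagging TCP candidates without tcpType.
function summarizeTransportOptions(options: SendTransportOptionsLike): string[] {
  const lines: string[] = ["transport " + options.id];
  for (const c of options.iceCandidates) {
    let line = "  " + c.protocol + " " + c.ip + ":" + c.port + " typ " + c.type;
    if (c.protocol === "tcp") {
      line += " tcptype " + (c.tcpType ?? "<missing>");
    }
    lines.push(line);
  }
  return lines;
}

// Placeholder values, not taken from a real session.
const exampleOptions: SendTransportOptionsLike = {
  id: "example-transport-id",
  iceCandidates: [
    { ip: "192.0.2.10", port: 40000, protocol: "udp", type: "host" },
    { ip: "192.0.2.10", port: 40001, protocol: "tcp", type: "host", tcpType: "passive" },
  ],
};
console.log(summarizeTransportOptions(exampleOptions).join("\n"));
```

For a complete dump, `console.log(JSON.stringify(options, null, 2))` right before the createSendTransport() call is enough; the summary only makes the candidate list easier to compare against the address in the error.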

I’ve been chasing this for 5 days now, here is what I know:

  1. Upgrading to mediasoup 3.5.8 (the one with the libuv 1.37 upgrade) broke our system in such a way that video and audio do not get through to the consumer. Nothing changed on our end other than 3.5.8 (yes, I know that’s what everyone says, read on).
  2. It fails at the ICE candidate stage. Again, nothing changed, but we tried setting the listen IP to 0.0.0.0, 127.0.0.1 and the local IP address; none of them worked.
  3. Checking with tcpdump proves that our packets get all the way through; mediasoup in debug mode also shows traces of UDP packets.
  4. I can reproduce it in Docker: putting our application in Docker on a local machine gives the same behavior.
  5. Switching to TCP instead of UDP makes everything work again, inside Docker on the local machine with the same IP configuration.
  6. Running our application outside Docker makes everything work again (TCP and UDP).

At this point I think this is something in the way mediasoup uses libuv UDP combined with something in the Docker environment.

There shouldn’t be any change related to how UDP works, even with that change in libuv. Just in case: have you tried with the latest mediasoup? Also, a simple way to reproduce the issue would help a lot.

I managed to reproduce the issue with the mediasoup demo and Docker.
I created a fork for you to play with at https://github.com/erlichmen/mediasoup-demo with minimal changes.
The repo has two branches 3.5.7 and 3.6.21.

For each branch, run docker/build.sh inside the server folder and then “docker-compose up” to start it.
Before running it, set MEDIASOUP_ANNOUNCED_IP inside docker-compose.yml to your computer’s IP.
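
For reference, a minimal sketch of how the announced IP could end up in the transport's listenIps. MEDIASOUP_ANNOUNCED_IP is the variable mentioned above; MEDIASOUP_LISTEN_IP, the 0.0.0.0 default and the helper name `listenIpsFromEnv` are assumptions made for illustration.

```typescript
interface ListenIpEntry {
  ip: string;
  announcedIp?: string;
}

// Builds the listenIps array from environment-style key/value pairs.
function listenIpsFromEnv(env: { [name: string]: string | undefined }): ListenIpEntry[] {
  // Assumed variable name and default; inside Docker, binding to all interfaces is common.
  const ip = env["MEDIASOUP_LISTEN_IP"] || "0.0.0.0";
  const announcedIp = env["MEDIASOUP_ANNOUNCED_IP"];
  if (ip === "0.0.0.0" && !announcedIp) {
    // A wildcard address cannot be used as an ICE candidate address,
    // so the browser needs a reachable one to be announced instead.
    throw new Error("MEDIASOUP_ANNOUNCED_IP must be set when listening on 0.0.0.0");
  }
  return [announcedIp ? { ip, announcedIp } : { ip }];
}

// Placeholder address; in the server this would be listenIpsFromEnv(process.env).
const exampleListenIps = listenIpsFromEnv({ MEDIASOUP_ANNOUNCED_IP: "192.0.2.10" });
console.log(JSON.stringify(exampleListenIps)); // [{"ip":"0.0.0.0","announcedIp":"192.0.2.10"}]
```

The announced IP has to be an address the browser can actually reach, which is why the host machine's IP is used here rather than the container's internal one.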

Once the server is running, start the app and point your browser to http://localhost:3001. Notice that on branch 3.5.7 webrtc-internals shows a completed ICE candidate flow, while on branch 3.6.21 it fails.

Thanks, I’ll try when I get some time (kinda busy these days). I assume it stops working since 3.5.8 so I’ll try to bisect the differences.

Hi, @erlichmen. No, I gave up when I posted this question.

We have started a full refactoring of our code to make sure we don’t misuse mediasoup. Once we are done, we will try upgrading again.

@ibc, if I remember correctly, there are not many changes from 3.5.7 to 3.5.8. I think the only significant one was switching to uv_udp_init_ex(). I tried reverting that, but not the libuv upgrade, and the error was still appearing. This is all I know so far.

I’m completely unable to run your fork.

  • I assume I must create a config.js in the server folder before running docker/build.sh.
  • I have added a server/certs/... folder with my local certs and added ADD certs certs to the Dockerfile.
  • Once the server is running, I have no idea how to launch the web app. http://localhost:3001 is definitely not working at all.

Sorry, I forgot to add my config.js to the branches (I didn’t notice it was ignored).
I have now added config.js to both branches. You don’t need certs, and I minimized the demo to work over ws.

Thanks, I’ll test it this week, once I get some time.

BTW, just so we don’t lose track of it: could you please report this (with the same steps to reproduce) on GitHub?

We are seeing the same issue. Has anyone been able to make any progress, and was a GitHub issue ever created?

For us, in both Docker and Amazon ECS, TCP works without issue but UDP fails. Downgrading to 3.5.7 makes UDP work again with the exact same configuration. We have verified that the machine is reachable with tcpdump etc.

Status of this issue is the following:

  • There is a fork of the mediasoup-demo project that is supposed to show the problem (here). I’m afraid I’ve tried to run it and ICE does not connect with either the 3.5.7 or the 3.6.21 branch of that fork.
  • I’m on Mac, and I’m not sure whether the Docker issue only happens on Linux (note that Docker runs completely differently on Linux and Mac).
  • So I don’t know how to verify it, and honestly I don’t have time to investigate how to make that fork run as it’s supposed to.

So if anyone could provide more specific details (other than “it also happens to me”) that may help. Until that happens I don’t know how/what to test. If there was a way to demonstrate the issue without Docker it would be much better to debug.

I am on Mac as well and it happens locally via docker. I will attempt to reproduce the issue in the fork and make it more turn-key for you this morning.

Interestingly, while trying to reproduce the issue in the above repo, I could only trigger it by switching the base Docker image to the node-alpine variants. This matches our production configuration as well. I’ve also added some commits so that no local file edits are necessary on OS X.

You can see the working repo here: https://github.com/ghempton/mediasoup-demo/tree/3.5.7

And the failing repo here: https://github.com/ghempton/mediasoup-demo/tree/3.6.25

The only difference between the two is the version of mediasoup. I added some lines to the top of the root README explaining how to get it to work, but will also include them here:

In the repo directory:

 cd server
 ./docker/build.sh
 ./bin/start-osx

In a separate terminal:

 cd app
 npm install
 npm start

Then visit http://localhost:3000/?info=true. The app will function properly in the 3.5.7 branch, but fail in the 3.6.25 branch.

I can confirm that we are also running on Alpine distros and see the same problem.
Interestingly, this narrows down what needs to be analyzed.

[UPDATE] I found this link; it looks like Alpine support for libuv 1.39(+?) is only a month old.

I would appreciate it if you could specify what “fail” means here. Does it fail for the producer browser when it sends RTP to mediasoup? Or does it fail when the consumer browser tries to consume from mediasoup? Please provide accurate info about this to narrow down the issue.

I think the server is failing to send any outbound UDP packets: when I use tcpdump with the latest mediasoup on alpine, I see the incoming UDP packets to the server, but no outbound packets. Switching to 3.5.7 shows UDP packets headed out.
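
A minimal sketch of how such a capture could be summarized per direction. It assumes IPv4 text output from `tcpdump -n` with lines of the form `IP <src>.<port> > <dst>.<port>: UDP, length N`; `countUdpDirections` is a hypothetical helper and the sample lines use placeholder addresses, not data from the actual capture.

```typescript
interface DirectionCounts {
  inbound: number;
  outbound: number;
}

// Counts UDP packets to and from the server IP in tcpdump text output.
function countUdpDirections(tcpdumpOutput: string, serverIp: string): DirectionCounts {
  const counts: DirectionCounts = { inbound: 0, outbound: 0 };
  // Captures the source and destination IPv4 addresses, dropping the ports.
  const pattern = /IP (\d+\.\d+\.\d+\.\d+)\.\d+ > (\d+\.\d+\.\d+\.\d+)\.\d+: UDP/;
  for (const line of tcpdumpOutput.split("\n")) {
    const match = pattern.exec(line);
    if (!match) continue; // not a UDP/IPv4 line
    if (match[2] === serverIp) {
      counts.inbound++;
    } else if (match[1] === serverIp) {
      counts.outbound++;
    }
  }
  return counts;
}

// Placeholder capture: two packets towards the server, none coming back.
const sampleCapture = [
  "07:08:35.498900 IP 192.0.2.20.51000 > 192.0.2.10.40000: UDP, length 100",
  "07:08:35.518900 IP 192.0.2.20.51000 > 192.0.2.10.40000: UDP, length 100",
  "07:08:35.528900 IP 192.0.2.20.51514 > 192.0.2.10.443: Flags [S], seq 1, length 0",
].join("\n");
console.log(JSON.stringify(countUdpDirections(sampleCapture, "192.0.2.10"))); // {"inbound":2,"outbound":0}
```

An outbound count of zero next to a non-zero inbound count is the pattern described above; on 3.5.7 both counts would be expected to be non-zero.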

@ghempton, if you were able to reproduce the crash only with node-alpine, does that mean it works properly with the Node/Debian images? Honestly, it never occurred to me to test this, but I may consider it as an option when I get back to it.