Since updating to mediasoup 3.14.5, we have frequently seen a mediasoup worker terminate unexpectedly, leading to the message "Mediasoup worker died, exiting in 2 seconds...". We followed the troubleshooting steps outlined in this guide, but no core dumps appear in the designated folder. Manually forcing the worker to terminate does not produce a dump either.
It’s worth noting that we don't use the WebRtcServer option with our workers.
Has anyone else experienced a similar issue, and does anyone have any insights into how we might generate the crash dump?
Thank you.
Apologies for the oversight: I forgot to mount the /tmp/cores folder in the docker-compose.yml file. Once the worker crashes again and I have a core dump, I’ll attach all relevant files. Thank you.
I’ve obtained a core dump of the worker. I’m not an expert in GDB, but I ran it inside the Docker container against the core dump file located there.
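For anyone else hitting the same missing-dump problem, a quick check inside the container may help. This is only a sketch under assumptions: `check_core_dir` and the `CORE_DIR` variable are hypothetical names made up for illustration, not part of mediasoup, and /tmp/cores is the folder mentioned above. It prints the core size limit and the kernel core pattern, then checks that the target folder exists and is writable.

```shell
#!/bin/sh
# Hypothetical helper (not part of mediasoup): report whether core dumps
# can land in the given folder.
check_core_dir() {
    dir="$1"
    # A limit of 0 disables core files for processes started from this shell.
    echo "core size limit: $(ulimit -c)"
    # Show where the kernel writes core files, if the setting is readable.
    if [ -r /proc/sys/kernel/core_pattern ]; then
        echo "core pattern: $(cat /proc/sys/kernel/core_pattern)"
    fi
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "ok: $dir exists and is writable"
        return 0
    fi
    echo "problem: $dir is missing or not writable (is it mounted in docker-compose.yml?)"
    return 1
}

# CORE_DIR is an assumed override; the default is the folder from this thread.
check_core_dir "${CORE_DIR:-/tmp/cores}" || true
```

Running it inside the container, in the same environment the worker runs in, shows the limits the worker actually inherits.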
Here’s the output:
/src# gdb /src/node_modules/mediasoup/worker/out/Release/mediasoup-worker /tmp/cores/core.mediasoup-worke.sig11.27
GNU gdb (Debian 13.1-3) 13.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /src/node_modules/mediasoup/worker/out/Release/mediasoup-worker...
(No debugging symbols found in /src/node_modules/mediasoup/worker/out/Release/mediasoup-worker)
warning: Can't open file anon_inode:[io_uring] which was expanded to anon_inode:[io_uring] during file-backed mapping note processing
warning: Can't open file anon_inode:[io_uring] which was expanded to anon_inode:[io_uring] during file-backed mapping note processing
warning: Can't open file anon_inode:[io_uring] which was expanded to anon_inode:[io_uring] during file-backed mapping note processing
warning: Can't open file anon_inode:[io_uring] which was expanded to anon_inode:[io_uring] during file-backed mapping note processing
[New LWP 27]
[New LWP 28]
[New LWP 29]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/src/node_modules/mediasoup/worker/out/Release/mediasoup-worker --logLevel=debu'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00005594a79ec95d in non-virtual thunk to RTC::WebRtcTransport::OnIceServerTupleRemoved(RTC::IceServer const*, RTC::TransportTuple*) ()
[Current thread is 1 (Thread 0x7ff57e279440 (LWP 27))]
(gdb) bt full
#0 0x00005594a79ec95d in non-virtual thunk to RTC::WebRtcTransport::OnIceServerTupleRemoved(RTC::IceServer const*, RTC::TransportTuple*) ()
No symbol table info available.
#1 0x00005594a791a4f2 in RTC::IceServer::OnTimer(TimerHandle*) ()
No symbol table info available.
#2 0x00005594a7cd5fc0 in uv__run_timers ()
No symbol table info available.
#3 0x00005594a7cd9cf8 in uv_run ()
No symbol table info available.
#4 0x00005594a789b959 in DepLibUV::RunLoop() ()
No symbol table info available.
#5 0x00005594a78a659a in Worker::Worker(Channel::ChannelSocket*) ()
No symbol table info available.
#6 0x00005594a789a9f4 in mediasoup_worker_run ()
No symbol table info available.
#7 0x00005594a7899b3c in main ()
No symbol table info available.
The worker binary has no debugging symbols (GDB reports "No debugging symbols found"), so the backtrace lacks file and line information. It does show that the crash occurred in RTC::WebRtcTransport::OnIceServerTupleRemoved, called from RTC::IceServer::OnTimer. Could this be a null pointer dereference or another memory-related issue in the handling of ICE server events or timers within the mediasoup worker?
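In case it is useful for attaching output to a report, the same backtrace can be captured without an interactive session. This is a minimal sketch: `core_backtrace` is a hypothetical helper name, the paths are the ones from the session above, and it assumes gdb is installed in the container. It uses GDB's `-batch` and `-ex` options to run one command and exit.

```shell
#!/bin/sh
# Hypothetical helper: print a full backtrace of every thread from a core file.
core_backtrace() {
    bin="$1"
    core="$2"
    if [ ! -f "$bin" ] || [ ! -f "$core" ]; then
        echo "usage: core_backtrace <worker-binary> <core-file>" >&2
        return 2
    fi
    # -batch makes gdb exit after running the -ex commands (no prompt).
    gdb -batch -ex "thread apply all bt full" "$bin" "$core"
}

# Paths taken from the GDB session in this thread.
core_backtrace /src/node_modules/mediasoup/worker/out/Release/mediasoup-worker \
    /tmp/cores/core.mediasoup-worke.sig11.27 || true
```

Redirecting the output to a file (for example `> backtrace.txt`) gives something that can be attached as-is.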
Thank you for the recommendations!
I’ve rebuilt the worker in Debug mode and forced its use by setting MEDIASOUP_WORKER_BIN='/root/mediasoup/worker/out/Debug/mediasoup-worker'.
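For reference, this is roughly how the variable can be set with a sanity check first. It is a sketch, not an official procedure: `use_worker_bin` and the `WORKER_BIN` override are hypothetical names, and the default path is the one from my setup. It only exports MEDIASOUP_WORKER_BIN if the path points at an executable file, so a typo in the path is caught before the application starts.

```shell
#!/bin/sh
# Hypothetical helper: export MEDIASOUP_WORKER_BIN only if the path is an
# executable file.
use_worker_bin() {
    bin="$1"
    if [ ! -f "$bin" ] || [ ! -x "$bin" ]; then
        echo "not an executable file: $bin" >&2
        return 1
    fi
    MEDIASOUP_WORKER_BIN="$bin"
    export MEDIASOUP_WORKER_BIN
    echo "MEDIASOUP_WORKER_BIN=$MEDIASOUP_WORKER_BIN"
}

# WORKER_BIN is an assumed override; the default is the Debug build path above.
use_worker_bin "${WORKER_BIN:-/root/mediasoup/worker/out/Debug/mediasoup-worker}" || true
```

The script has to be sourced (or the export repeated) in the shell or service definition that starts the Node.js process, otherwise the variable is not inherited by it.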
I’ll keep you updated with the results once the issue resurfaces.