ibc
(Iñaki Baz Castillo)
October 21, 2019, 11:37am
82
@mkh I have a suspicion. We have fixed an ugly bug in the “devel” branch (which is why you have seen so many crashes), and I have now created a “tcp-crash-wip” branch with that fix plus another change that I think could fix the issue (not 100% sure).
Could you please check out this new “tcp-crash-wip” branch in your environment (after first removing the local changes you made in DtlsTransport.cpp, of course), then build it and run it for a few days?
mkh
(Mohsen Khahani)
October 21, 2019, 12:39pm
83
Thanks, I’m okay with those local modifications.
Kidding, I’ll do the update and let you know the result.
mkh
(Mohsen Khahani)
October 23, 2019, 12:25am
84
Crashed once within a day:
$ gdb /var/conference/server/node_modules/mediasoup/worker/out/Release/mediasoup-worker /tmp/cores/core.mediasoup-worke.sig6.9847
GNU gdb (Ubuntu 8.2-0ubuntu1~16.04.1) 8.2
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /var/conference/server/node_modules/mediasoup/worker/out/Release/mediasoup-worker...done.
warning: core file may not match specified executable file.
[New LWP 9847]
[New LWP 9850]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/var/conference/server/node_modules/mediasoup/worker/out/Release/mediasoup-work'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fe71b827428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7fe71cd00740 (LWP 9847))]
(gdb) bt full
#0 0x00007fe71b827428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
resultvar = 0
pd = <optimized out>
pid = 9847
selftid = 9847
#1 0x00007fe71b82902a in __GI_abort () at abort.c:89
save_stage = 2
act = {__sigaction_handler = {sa_handler = 0x6c2f756e672d7875, sa_sigaction = 0x6c2f756e672d7875}, sa_mask = {__val = {
3615882721498522217, 7306587963414228531, 3472337320779395383, 7147554992259804464, 8223625903104157025,
3472328295963438455, 4195155967701168179, 3544958757364577072, 2314885530818457655, 2314885530818453536,
7795484802351636512, 3917909816998060649, 3276497845987585332, 7233459864884047463, 8299627453906824556,
7147554992259795567}}, sa_flags = 809067873, sa_restorer = 0x97}
sigs = {__val = {32, 0 <repeats 15 times>}}
#2 0x00007fe71b8697ea in __libc_message (do_abort=2, fmt=fmt@entry=0x7fe71b982ed8 "*** Error in `%s': %s: 0x%s ***\n")
at ../sysdeps/posix/libc_fatal.c:175
ap = <error reading variable ap (Attempt to dereference a generic pointer.)>
fd = 2
on_2 = <optimized out>
list = <optimized out>
nlist = <optimized out>
cp = <optimized out>
written = <optimized out>
#3 0x00007fe71b872d36 in malloc_printerr (ar_ptr=0x7fe71bbb6b20 <main_arena>, ptr=<optimized out>,
str=0x7fe71b97fc75 "corrupted size vs. prev_size", action=3) at malloc.c:5006
buf = "000000000fd17382"
cp = <optimized out>
ar_ptr = 0x7fe71bbb6b20 <main_arena>
ptr = <optimized out>
str = 0x7fe71b97fc75 "corrupted size vs. prev_size"
action = 3
buf = <optimized out>
cp = <optimized out>
#4 _int_free (av=0x7fe71bbb6b20 <main_arena>, p=<optimized out>, have_lock=0) at malloc.c:4005
size = 686
fb = <optimized out>
nextchunk = 0xfd17630
nextsize = 192
nextinuse = <optimized out>
prevsize = <optimized out>
bck = <optimized out>
fwd = <optimized out>
errstr = 0x0
locked = <optimized out>
ignore = <optimized out>
idx = <optimized out>
old = <optimized out>
old2 = <optimized out>
old_idx = <optimized out>
ignore1 = <optimized out>
ignore2 = <optimized out>
ignore3 = <optimized out>
ignore = <optimized out>
__atg1_result = <optimized out>
ret = <optimized out>
ignore1 = <optimized out>
ignore2 = <optimized out>
ignore3 = <optimized out>
heap = <optimized out>
ignore = <optimized out>
#5 0x00007fe71b87653c in __GI___libc_free (mem=<optimized out>) at malloc.c:2968
ar_ptr = <optimized out>
p = <optimized out>
hook = <optimized out>
#6 0x00000000005d50b8 in onClose (handle=0xfd17540) at ../src/handles/TcpConnection.cpp:48
No locals.
#7 0x000000000074f445 in uv__finish_close (handle=0xfd17540) at ../deps/libuv/src/unix/core.c:293
__PRETTY_FUNCTION__ = "uv__finish_close"
#8 0x000000000074f48c in uv__run_closing_handles (loop=0x2fb2d40) at ../deps/libuv/src/unix/core.c:307
p = 0xfd17540
q = 0x0
#9 0x000000000074f690 in uv_run (loop=0x2fb2d40, mode=UV_RUN_DEFAULT) at ../deps/libuv/src/unix/core.c:377
timeout = 2
r = 1
ran_pending = 1
#10 0x00000000005b2330 in DepLibUV::RunLoop () at ../src/DepLibUV.cpp:52
__FUNCTION__ = "RunLoop"
#11 0x00000000005be549 in Worker::Worker (this=0x7ffd8d2ac890, channel=0x2fb3130) at ../src/Worker.cpp:31
No locals.
#12 0x00000000007472d0 in main (argc=10, argv=0x7ffd8d2aca28) at ../src/main.cpp:123
worker = {<Channel::UnixStreamSocket::Listener> = {_vptr.Listener = 0xca1838 <vtable for Worker+16>}, <SignalsHandler::Listener> = {_vptr.Listener = 0xca1870 <vtable for Worker+72>}, channel = 0x2fb3130, signalsHandler = 0x2fe0510, mapRouters = std::unordered_map with 2 elements = {["0d016050-3648-46ab-adcc-27d085832fd6"] = 0x688e0c0, ["b6bf004c-ef16-421a-8035-a87b2e5cc6b9"] = 0x3203ba0}, closed = false}
__FUNCTION__ = "main"
version = "3.2.5"
channel = 0x2fb3130
(gdb)
ibc
(Iñaki Baz Castillo)
October 23, 2019, 1:24am
85
OK, let’s continue in the GitHub issue:
opened 09:17AM - 07 Oct 19 UTC
Operating system: Linux (not seen yet in OSX)
mediasoup version: 3.x.x
As described here there is something wrong in the TCP stack of...
bug
ibc
(Iñaki Baz Castillo)
October 23, 2019, 1:35am
86
BTW, can you reply in that issue? I cannot find your GitHub username to ping you from GitHub.
ibc
(Iñaki Baz Castillo)
October 23, 2019, 1:40am
87
Also, please run git pull on your “tcp-crash-wip” branch. I forgot to make the same modification in the UnixSocket handler (whose code is similar to the TCP one).
ibc
(Iñaki Baz Castillo)
October 23, 2019, 1:58am
88
BTW: are you calling getStats() or dump() on the server side? How often? On which objects?