Mediasoup worker died, exiting in 2 seconds... #1392

Closed
miroslavpejic85 opened this issue May 8, 2024 · 2 comments · Fixed by #1393
@miroslavpejic85

Bug Report

System Information and Environment:

  • Operating System: Ubuntu 22.04.4 LTS (GNU/Linux 5.15.0-105-generic x86_64)
  • Mediasoup Version: 3.14.5
  • Mediasoup Client Version: 3.7.7
  • Compiler: gcc/g++/c++ 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
  • Node.js Version: v18.17.1
  • Npm Version: 10.7.0
  • Python Version: 3.10.12
  • Docker: 24.0.5, build 24.0.5-0ubuntu1~22.04.1
  • Docker-compose: v2.27.0

For reference: mediasoup.discourse.group

Issue Description:

Since upgrading to Mediasoup version 3.14.5, our Mediasoup workers have frequently terminated unexpectedly. Each termination is reported with the message "Mediasoup worker died, exiting in 2 seconds...".

Troubleshooting Steps:

Following the documentation, we obtained a core dump of the worker, which will be provided as an attachment for further analysis.

(gdb) bt
#0  0x000055604c64eb58 in RTC::TransportTuple::GetProtocol (this=0x556518627283) at ../../../include/RTC/TransportTuple.hpp:92
#1  0x000055604c766313 in RTC::WebRtcTransport::OnIceServerTupleRemoved (this=0x55604e65a2b0, tuple=0x556518627283)
    at ../../../src/RTC/WebRtcTransport.cpp:1183
#2  0x000055604c64ef97 in RTC::IceServer::OnTimer (this=0x55604e615820, timer=0x55604e67fa90) at ../../../src/RTC/IceServer.cpp:935
#3  0x000055604c5ff52e in TimerHandle::OnUvTimer (this=0x55604e67fa90) at ../../../src/handles/TimerHandle.cpp:162
#4  0x000055604c5fe87b in onTimer (handle=0x55604e692230) at ../../../src/handles/TimerHandle.cpp:13
#5  0x000055604cb49005 in uv__run_timers (loop=0x55604e53bff0) at ../../../subprojects/libuv-v1.48.0/src/timer.c:193
#6  0x000055604cb4ec72 in uv_run (loop=0x55604e53bff0, mode=UV_RUN_DEFAULT) at ../../../subprojects/libuv-v1.48.0/src/unix/core.c:466
#7  0x000055604c5cda83 in DepLibUV::RunLoop () at ../../../src/DepLibUV.cpp:98
#8  0x000055604c5e06ed in Worker::Worker (this=0x7fff8c7dd100, channel=0x55604e53c5d0) at ../../../src/Worker.cpp:56
#9  0x000055604c5c3414 in mediasoup_worker_run (argc=16, argv=0x7fff8c7dd348, version=0x7fff8c7dd200 "3.14.5", consumerChannelFd=3, producerChannelFd=4,
    channelReadFn=0x0, channelReadCtx=0x0, channelWriteFn=0x0, channelWriteCtx=0x0) at ../../../src/lib.cpp:142
#10 0x000055604c80ed5f in main (argc=16, argv=0x7fff8c7dd348) at ../../../src/main.cpp:25
(gdb) bt full
#0  0x000055604c64eb58 in RTC::TransportTuple::GetProtocol (this=0x556518627283) at ../../../include/RTC/TransportTuple.hpp:92
No locals.
#1  0x000055604c766313 in RTC::WebRtcTransport::OnIceServerTupleRemoved (this=0x55604e65a2b0, tuple=0x556518627283)
    at ../../../src/RTC/WebRtcTransport.cpp:1183
No locals.
#2  0x000055604c64ef97 in RTC::IceServer::OnTimer (this=0x55604e615820, timer=0x55604e67fa90) at ../../../src/RTC/IceServer.cpp:935
        storedTuple = 0x556518627283
        it = <error reading variable: Cannot access memory at address 0x556518627283>
        __for_range = std::__cxx11::list = {[0] = {hash = 15945316816845144064, udpSocket = 0x55604e6c2030, udpRemoteAddr = 0x55604e6d24a0,
            tcpConnection = 0x0, localAnnouncedAddress = "", udpRemoteAddrStorage = {ss_family = 2,
              __ss_padding = "\335I%)\273\036", '\000' <repeats 111 times>, __ss_align = 0}, protocol = RTC::TransportTuple::Protocol::UDP}}
        __for_begin = <error reading variable: Cannot access memory at address 0x556518627273>
        __for_end = {hash = 1, udpSocket = 0x55604e6d2460, udpRemoteAddr = 0x55604e67fa90, tcpConnection = 0x0, localAnnouncedAddress = <error: Cannot access memory at address 0xe0>, udpRemoteAddrStorage = {ss_family = 1, __ss_padding = "\000\000\000\000\000\000xt\000\000\000\000\000\000\000\004\000\000\000\000\000\000\340YaN`U\000\000\003\000\000\000\000\000\000\000y\000\000\000\000\000\000\000", '\377' <repeats 16 times>, "\003\000\000\000\002\000\000\000\001", '\000' <repeats 31 times>, "a\000\000\000\000\000\000\000\260\326^N`U\000", __ss_align = 0}, protocol = (unknown: 0x80)}
        __FUNCTION__ = "OnTimer"
#3  0x000055604c5ff52e in TimerHandle::OnUvTimer (this=0x55604e67fa90) at ../../../src/handles/TimerHandle.cpp:162
No locals.
#4  0x000055604c5fe87b in onTimer (handle=0x55604e692230) at ../../../src/handles/TimerHandle.cpp:13
No locals.
#5  0x000055604cb49005 in uv__run_timers (loop=0x55604e53bff0) at ../../../subprojects/libuv-v1.48.0/src/timer.c:193
        heap_node = 0x55604e5fec88
        handle = 0x55604e692230
        queue_node = 0x55604e692298
        ready_queue = {next = 0x55604e629318, prev = 0x55604e721bd8}
#6  0x000055604cb4ec72 in uv_run (loop=0x55604e53bff0, mode=UV_RUN_DEFAULT) at ../../../subprojects/libuv-v1.48.0/src/unix/core.c:466
        timeout = 1
        r = 0
        can_sleep = 1
#7  0x000055604c5cda83 in DepLibUV::RunLoop () at ../../../src/DepLibUV.cpp:98
        __FUNCTION__ = "RunLoop"
        ret = 21856
#8  0x000055604c5e06ed in Worker::Worker (this=0x7fff8c7dd100, channel=0x55604e53c5d0) at ../../../src/Worker.cpp:56
No locals.
#9  0x000055604c5c3414 in mediasoup_worker_run (argc=16, argv=0x7fff8c7dd348, version=0x7fff8c7dd200 "3.14.5", consumerChannelFd=3, producerChannelFd=4, channelReadFn=0x0, channelReadCtx=0x0, channelWriteFn=0x0, channelWriteCtx=0x0) at ../../../src/lib.cpp:142
        worker = {<Channel::ChannelSocket::Listener> = {<Channel::ChannelSocket::RequestHandler> = {_vptr.RequestHandler = 0x55604cf6bb10 <vtable for Worker+16>}, <Channel::ChannelSocket::NotificationHandler> = {
              _vptr.NotificationHandler = 0x55604cf6bb58 <vtable for Worker+88>}, <No data fields>}, <SignalHandle::Listener> = {_vptr.Listener = 0x55604cf6bb80 <vtable for Worker+128>}, <RTC::Router::Listener> = {
            _vptr.Listener = 0x55604cf6bba8 <vtable for Worker+168>}, channel = 0x55604e53c5d0, signalHandle = 0x55604e5b3920, shared = 0x55604e5b2be0,
          mapWebRtcServers = {<absl::lts_20230802::container_internal::raw_hash_map<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, RTC::WebRtcServer*>, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::WebRtcServer*> > >> = {<absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, RTC::WebRtcServer*>, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::WebRtcServer*> > >> = {
                settings_ = {<absl::lts_20230802::container_internal::internal_compressed_tuple::CompressedTupleImpl<absl::lts_20230802::container_internal::CompressedTuple<absl::lts_20230802::container_internal::CommonFields, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::WebRtcServer*> > >, absl::lts_20230802::integer_sequence<unsigned long, 0, 1, 2, 3>, true>> = {<absl::lts_20230802::container_internal::internal_compressed_tuple::uses_inheritance> = {<No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::CommonFields, 0, false>> = {
                      value = {<absl::lts_20230802::container_internal::CommonFieldsGenerationInfoDisabled> = {<No data fields>}, control_ = 0x55604cd31d40 <absl::lts_20230802::container_internal::kEmptyGroup+16>, slots_ = 0x0, capacity_ = 0,
                        compressed_tuple_ = {<absl::lts_20230802::container_internal::internal_compressed_tuple::CompressedTupleImpl<absl::lts_20230802::container_internal::CompressedTuple<unsigned long, absl::lts_20230802::container_internal::HashtablezInfoHandle>, absl::lts_20230802::integer_sequence<unsigned long, 0, 1>, true>> = {<absl::lts_20230802::container_internal::internal_compressed_tuple::uses_inheritance> = {<No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<unsigned long, 0, false>> = {
                              value = 0}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::HashtablezInfoHandle, 1, true>> = {<absl::lts_20230802::container_internal::HashtablezInfoHandle> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}}}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::StringHash, 1, true>> = {<absl::lts_20230802::container_internal::StringHash> = {<No data fields>}, <No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::StringEq, 2, true>> = {<absl::lts_20230802::container_internal::StringEq> = {<No data fields>}, <No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::WebRtcServer*> >, 3, true>> = {<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::WebRtcServer*> >> = {<__gnu_cxx::new_allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::WebRtcServer*> >> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}}, <No data fields>}, <No data fields>},
          mapRouters = {<absl::lts_20230802::container_internal::raw_hash_map<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, RTC::Router*>, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::Router*> > >> = {<absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, RTC::Router*>, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::Router*> > >> = {
                settings_ = {<absl::lts_20230802::container_internal::internal_compressed_tuple::CompressedTupleImpl<absl::lts_20230802::container_internal::CompressedTuple<absl::lts_20230802::container_internal::CommonFields, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::Router*> > >, absl::lts_20230802::integer_sequence<unsigned long, 0, 1, 2, 3>, true>> = {<absl::lts_20230802::container_internal::internal_compressed_tuple::uses_inheritance> = {<No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::CommonFields, 0, false>> = {
                      value = {<absl::lts_20230802::container_internal::CommonFieldsGenerationInfoDisabled> = {<No data fields>}, control_ = 0x55604e6bef58, slots_ = 0x55604e6bef70, capacity_ = 3,
                        compressed_tuple_ = {<absl::lts_20230802::container_internal::internal_compressed_tuple::CompressedTupleImpl<absl::lts_20230802::container_internal::CompressedTuple<unsigned long, absl::lts_20230802::container_internal::HashtablezInfoHandle>, absl::lts_20230802::integer_sequence<unsigned long, 0, 1>, true>> = {<absl::lts_20230802::container_internal::internal_compressed_tuple::uses_inheritance> = {<No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<unsigned long, 0, false>> = {
                              value = 2}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::HashtablezInfoHandle, 1, true>> = {<absl::lts_20230802::container_internal::HashtablezInfoHandle> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}}}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::StringHash, 1, true>> = {<absl::lts_20230802::container_internal::StringHash> = {<No data fields>}, <No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<absl::lts_20230802::container_internal::StringEq, 2, true>> = {<absl::lts_20230802::container_internal::StringEq> = {<No data fields>}, <No data fields>}, <absl::lts_20230802::container_internal::internal_compressed_tuple::Storage<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::Router*> >, 3, true>> = {<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::Router*> >> = {<__gnu_cxx::new_allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, RTC::Router*> >> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}}, <No data fields>}, <No data fields>}, closed = false}
        channel = std::unique_ptr<Channel::ChannelSocket> = {get() = 0x55604e53c5d0}
        __FUNCTION__ = "mediasoup_worker_run"
#10 0x000055604c80ed5f in main (argc=16, argv=0x7fff8c7dd348) at ../../../src/main.cpp:25
        __FUNCTION__ = "main"
        version = "3.14.5"
        statusCode = 0
miroslavpejic85 changed the title from "Frequent Mediasoup Worker Termination" to "Mediasoup worker died, exiting in 2 seconds..." on May 8, 2024
@ibc
Member

ibc commented May 8, 2024

As @snnz said in the forum, this looks like the culprit:

https://mediasoup.discourse.group/t/mediasoup-worker-died-exiting-in-2-seconds/6035/7

It looks like after this commit, IceServer::OnTimer may end up calling IceServer::RemoveTuple, in the same way IceServer::~IceServer does.
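One plausible reading of the backtrace, where `__for_range` is a `std::list` and the loop iterator cannot be read, is that an element was removed from the list while a range-based `for` was still walking it. The block below is a minimal sketch of that general hazard and of one common way to avoid it. It is not mediasoup's code and not the change made in #1393: `Tuple`, `TupleOwner`, `RemoveExpired` and the `expired` flag are hypothetical names invented for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a transport tuple (not RTC::TransportTuple).
struct Tuple
{
	std::string remote;
	bool expired;
};

// Hypothetical owner of a tuple list that notifies a listener on removal.
class TupleOwner
{
public:
	using RemovedCallback = std::function<void(const Tuple&)>;

	explicit TupleOwner(RemovedCallback cb) : onRemoved(std::move(cb)) {}

	void Add(std::string remote, bool expired)
	{
		tuples.push_back(Tuple{ std::move(remote), expired });
	}

	// UNSAFE pattern, kept as a comment because running it is undefined behavior:
	//
	//   for (auto& tuple : tuples)
	//       if (tuple.expired)
	//           Remove(&tuple); // erases the node the loop is standing on
	//
	// Safe pattern: notify while the element is still alive, then continue
	// from the iterator returned by std::list::erase.
	std::size_t RemoveExpired()
	{
		std::size_t removed = 0;

		for (auto it = tuples.begin(); it != tuples.end();)
		{
			if (it->expired)
			{
				if (onRemoved)
				{
					onRemoved(*it);
				}
				it = tuples.erase(it); // returns the next valid iterator
				++removed;
			}
			else
			{
				++it;
			}
		}

		return removed;
	}

	std::size_t Size() const
	{
		return tuples.size();
	}

private:
	std::list<Tuple> tuples;
	RemovedCallback onRemoved;
};
```

The key point of the sketch is ordering: the listener sees the tuple before it is erased, and the loop never increments an iterator that points at a destroyed node.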

I am on it.

@ibc
Member

ibc commented May 8, 2024

@miroslavpejic85, PR here: #1393

However, I may need your help if possible. Please follow up in the PR: #1393 (comment)

ibc closed this as completed in #1393 on May 9, 2024