[MDEV-32997] sporadic SIGABRT at my_sleep (timeout) in spider/bugfix.mdev_27575 Created: 2023-12-12  Updated: 2024-02-01

Status: Open
Project: MariaDB Server
Component/s: Storage Engine - Spider
Affects Version/s: 10.5
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Yuchen Pei Assignee: Yuchen Pei
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by MDEV-33343 spider.mdev_28739_simple fails in bui... Closed

 Description   

Examples include 10.5 a356a940d256abd132ea237eb8c84fa78c756276 at
amd64-debian-10
https://buildbot.mariadb.org/#/builders/221/builds/28439/steps/7/logs/stdio

and 10.5 1e80601b826730f11544cc8b58e28c3e73f0cd33 amd64-ubuntu-2204-debug-ps
https://buildbot.mariadb.org/#/builders/534/builds/12422/steps/7/logs/stdio

Not sure why this happens as it never happens locally. The crd threads
should be freed at spider_db_done(), by signaling the thread cond.
spider_db_done is called when mysqld_main() reaches
handle_connections_sockets(), which is way before call to
close_connections(). See also a comment below for details:

https://jira.mariadb.org/browse/MDEV-32997?focusedCommentId=279179&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-279179

spider/bugfix.mdev_27575                 w1 [ fail ]  Found warnings/errors in server log file!
        Test ended at 2023-12-08 09:11:23
line
Attempting backtrace. You can use the following information to find out
^ Found warnings in /home/buildbot/amd64-debian-10/build/mysql-test/var/1/log/mysqld.1.1.err
ok
 
 - found 'core' (0/1)
Core generated by '/home/buildbot/amd64-debian-10/build/sql/mariadbd'
Output from gdb follows. The first stack trace is from the failing thread.
The following stack traces are from all threads (so the failing one is
duplicated).
--------------------------
[New LWP 24496]
[New LWP 24507]
[New LWP 24563]
[New LWP 24612]
[New LWP 24644]
[New LWP 24601]
[New LWP 24602]
[New LWP 24513]
[New LWP 24567]
[New LWP 25337]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/home/buildbot/amd64-debian-10/build/sql/mariadbd --defaults-group-suffix=.1.1'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f8416ca25a1 in pthread_kill () from /lib/x86_64-linux-gnu/libpthread.so.0
[Current thread is 1 (Thread 0x7f841665d040 (LWP 24496))]
#0  0x00007f8416ca25a1 in pthread_kill () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x0000555f9f7a2e47 in my_write_core (sig=<optimized out>) at /home/buildbot/amd64-debian-10/build/mysys/stacktrace.c:424
#2  0x0000555f9f282370 in handle_fatal_signal (sig=6) at /home/buildbot/amd64-debian-10/build/sql/signal_handler.cc:357
#3  <signal handler called>
#4  0x00007f841689bbd7 in select () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x0000555f9f7a0def in my_sleep (m_seconds=m_seconds@entry=1000) at /home/buildbot/amd64-debian-10/build/mysys/my_sleep.c:29
#6  0x0000555f9efcb492 in close_connections () at /home/buildbot/amd64-debian-10/build/sql/mysqld.cc:1813
#7  mysqld_main (argc=<optimized out>, argv=<optimized out>) at /home/buildbot/amd64-debian-10/build/sql/mysqld.cc:5741
#8  0x00007f84167cf09b in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#9  0x0000555f9efbf18a in _start () at /home/buildbot/amd64-debian-10/build/include/mysql/plugin.h:215
 
Thread 10 (Thread 0x7f83d6ffd700 (LWP 25337)):
#0  0x00007f8416ca100c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1  0x00007f8414106ddf in inline_mysql_cond_wait (src_file=0x7f8414165618 "/home/buildbot/amd64-debian-10/build/storage/spider/spd_table.cc", src_line=10171, mutex=0x7f839409d9f0, that=0x7f839409d9b8) at /home/buildbot/amd64-debian-10/build/include/mysql/psi/mysql_thread.h:1233
        result = <optimized out>
        result = <optimized out>
        locker = <optimized out>
        state = <optimized out>
#2  spider_table_bg_crd_action (arg=0x7f839409d998) at /home/buildbot/amd64-debian-10/build/storage/spider/spd_table.cc:10171
        thread = 0x7f839409d998
        share = <optimized out>
        trx = <optimized out>
        error_num = 0
        spider = <optimized out>
        table = <optimized out>
        conns = <optimized out>
        thd = <optimized out>
#3  0x00007f8416c9afa3 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#4  0x00007f84168a406f in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
 
Thread 9 (Thread 0x7f840f7ff700 (LWP 24567)):
#0  0x00007f8416ca100c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1  0x0000555f9f69abc1 in buf_flush_page_cleaner () at /home/buildbot/amd64-debian-10/build/storage/innobase/buf/buf0flu.cc:2297
        soft_lsn_limit = <optimized out>
        oldest_lsn = <optimized out>
        dirty_blocks = <optimized out>
        dirty_pct = <optimized out>
        pct_lwm = <optimized out>
        n_flushed = <optimized out>
        n = <optimized out>
        last_pages = 0
        abstime = {tv_sec = 1702026619, tv_nsec = 263975000}
        lsn_limit = 0
        last_activity_count = 0
#2  0x00007f8416c9afa3 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#3  0x00007f84168a406f in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
 
Thread 8 (Thread 0x7f84159e4700 (LWP 24513)):
#0  0x00007f8416ca135b in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1  0x0000555f9f433799 in inline_mysql_cond_timedwait (src_file=0x555f9f9b6fc0 "/home/buildbot/amd64-debian-10/build/storage/maria/ma_servicethread.c", src_line=116, abstime=0x7f84159e3d20, mutex=0x555fa084c9c0 <LOCK_checkpoint>, that=0x555fa084c980 <COND_checkpoint>) at /home/buildbot/amd64-debian-10/build/include/mysql/psi/mysql_thread.h:1270
        result = <optimized out>
        result = <optimized out>
        locker = <optimized out>
        state = <optimized out>
#2  my_service_thread_sleep (control=control@entry=0x555f9ff87620 <checkpoint_control>, sleep_time=<optimized out>) at /home/buildbot/amd64-debian-10/build/storage/maria/ma_servicethread.c:115
        abstime = {tv_sec = 1702026706, tv_nsec = 155286000}
        res = 0 '\000'
#3  0x0000555f9f42c5b9 in ma_checkpoint_background (arg=0x1e) at /home/buildbot/amd64-debian-10/build/storage/maria/ma_checkpoint.c:725
        interval = 30
        sleeps = 60
        sleep_time = 30
        log_horizon_at_last_checkpoint = 4295014401
        pagecache_flushes_at_last_checkpoint = 0
        pages_bunch_size = 0
        filter_param = {up_to_lsn = 4295014357, max_pages = 0}
        dfile = 0x0
        kfile = 0x0
#4  0x00007f8416c9afa3 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#5  0x00007f84168a406f in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
 
Thread 7 (Thread 0x7f8414385700 (LWP 24602)):
#0  0x00007f8416ca100c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1  0x0000555f9f36234b in inline_mysql_cond_wait (src_file=0x555f9f9aa078 "/home/buildbot/amd64-debian-10/build/sql/log.cc", src_line=10496, mutex=0x555fa004a810 <mysql_bin_log+3184>, that=0x555fa004a840 <mysql_bin_log+3232>) at /home/buildbot/amd64-debian-10/build/include/mysql/psi/mysql_thread.h:1233
        result = <optimized out>
        result = <optimized out>
        locker = <optimized out>
        state = <optimized out>
#2  binlog_background_thread (arg=arg@entry=0x0) at /home/buildbot/amd64-debian-10/build/sql/log.cc:10495
        stop = false
        queue = 0x0
        next = <optimized out>
        thd = 0x7f83c0000c18
#3  0x0000555f9f492bab in pfs_spawn_thread (arg=0x555fa0bbcb48) at /home/buildbot/amd64-debian-10/build/storage/perfschema/pfs.cc:2201
        typed_arg = 0x555fa0bbcb48
        user_arg = 0x0
        user_start_routine = 0x555f9f361eb0 <binlog_background_thread(void*)>
        pfs = <optimized out>
        klass = <optimized out>
#4  0x00007f8416c9afa3 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#5  0x00007f84168a406f in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
 
Thread 6 (Thread 0x7f84143d0700 (LWP 24601)):
#0  0x00007f8416ca100c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1  0x0000555f9f07a7bb in inline_mysql_cond_wait (that=0x555fa0019360 <COND_manager>, mutex=0x555fa00193a0 <LOCK_manager>, src_file=0x555f9f85cce8 "/home/buildbot/amd64-debian-10/build/sql/sql_manager.cc", src_line=103) at /home/buildbot/amd64-debian-10/build/include/mysql/psi/mysql_thread.h:1233
        result = <optimized out>
        result = <optimized out>
        locker = <optimized out>
        state = <optimized out>
#2  handle_manager (arg=arg@entry=0x0) at /home/buildbot/amd64-debian-10/build/sql/sql_manager.cc:103
        cb = <optimized out>
        error = 0
        abstime = {tv_sec = 511101108316, tv_nsec = 0}
        reset_flush_time = true
#3  0x0000555f9f492bab in pfs_spawn_thread (arg=0x555fa0e57fa8) at /home/buildbot/amd64-debian-10/build/storage/perfschema/pfs.cc:2201
        typed_arg = 0x555fa0e57fa8
        user_arg = 0x0
        user_start_routine = 0x555f9f07a510 <handle_manager(void*)>
        pfs = <optimized out>
        klass = <optimized out>
#4  0x00007f8416c9afa3 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#5  0x00007f84168a406f in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
 
Thread 5 (Thread 0x7f841433a700 (LWP 24644)):
#0  0x00007f84167e35fc in sigtimedwait () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1  0x0000555f9efc17db in my_sigwait (code=<synthetic pointer>, sig=0x7f8414339ca8, set=0x7f8414339cb0) at /home/buildbot/amd64-debian-10/build/include/my_pthread.h:196
        siginfo = {si_signo = 7, si_errno = 0, si_code = 0, __pad0 = 0, _sifields = {_pad = {37, 21855, 280, 0, 575, 0, 624, 0, -312, -1, 9, 0, 37, 57, 1, 0, 92, 119, 0, 0, 124, 110, 0, 0, 0, 0, -1476394976, 32643}, _kill = {si_pid = 37, si_uid = 21855}, _timer = {si_tid = 37, si_overrun = 21855, si_sigval = {sival_int = 280, sival_ptr = 0x118}}, _rt = {si_pid = 37, si_uid = 21855, si_sigval = {sival_int = 280, sival_ptr = 0x118}}, _sigchld = {si_pid = 37, si_uid = 21855, si_status = 280, si_utime = 575, si_stime = 624}, _sigfault = {si_addr = 0x555f00000025, si_addr_lsb = 280, _bounds = {_addr_bnd = {_lower = 0x23f, _upper = 0x270}, _pkey = 575}}, _sigpoll = {si_band = 93866510254117, si_fd = 280}, _sigsys = {_call_addr = 0x555f00000025, _syscall = 280, _arch = 0}}}
        siginfo = <optimized out>
#2  signal_hand (arg=arg@entry=0x0) at /home/buildbot/amd64-debian-10/build/sql/mysqld.cc:3007
        error = <optimized out>
        origin = <optimized out>
        set = {__val = {548869, 0 <repeats 15 times>}}
        sig = 0
#3  0x0000555f9f492bab in pfs_spawn_thread (arg=0x555fa0ec4ca8) at /home/buildbot/amd64-debian-10/build/storage/perfschema/pfs.cc:2201
        typed_arg = 0x555fa0ec4ca8
        user_arg = 0x0
        user_start_routine = 0x555f9efc1650 <signal_hand(void*)>
        pfs = <optimized out>
        klass = <optimized out>
#4  0x00007f8416c9afa3 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#5  0x00007f84168a406f in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
 
Thread 4 (Thread 0x7f83deffd700 (LWP 24612)):
#0  0x00007f8416ca135b in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1  0x0000555f9f743c1d in __gthread_cond_timedwait (__abs_timeout=0x7f83deffcd40, __mutex=<optimized out>, __cond=0x555fa0cb8ad0) at /usr/include/x86_64-linux-gnu/c++/8/bits/gthr-default.h:871
No locals.
#2  std::condition_variable::__wait_until_impl<std::chrono::duration<long, std::ratio<1l, 1000000000l> > > (__atime=<optimized out>, __lock=@0x7f83deffcd90: {_M_device = 0x555fa0cb81b8, _M_owns = true}, this=0x555fa0cb8ad0) at /usr/include/c++/8/condition_variable:178
        __s = <optimized out>
        __ns = <optimized out>
        __ts = {tv_sec = 1702026739, tv_nsec = 480040906}
        __s = <optimized out>
        __ns = <optimized out>
        __ts = <optimized out>
#3  std::condition_variable::wait_until<std::chrono::duration<long, std::ratio<1l, 1000000000l> > > (__atime=<optimized out>, __lock=@0x7f83deffcd90: {_M_device = 0x555fa0cb81b8, _M_owns = true}, this=0x555fa0cb8ad0) at /usr/include/c++/8/condition_variable:106
No locals.
#4  std::condition_variable::wait_for<long, std::ratio<1l, 1000l> > (__rtime=@0x555fa0cb81e0: {__r = 60000}, __lock=@0x7f83deffcd90: {_M_device = 0x555fa0cb81b8, _M_owns = true}, this=0x555fa0cb8ad0) at /usr/include/c++/8/condition_variable:143
        __reltime = <optimized out>
        __reltime = <optimized out>
#5  tpool::thread_pool_generic::wait_for_tasks (this=this@entry=0x555fa0cb80a0, lk=@0x7f83deffcd90: {_M_device = 0x555fa0cb81b8, _M_owns = true}, thread_data=thread_data@entry=0x555fa0cb8ad0) at /home/buildbot/amd64-debian-10/build/tpool/tpool_generic.cc:499
No locals.
#6  0x0000555f9f743e5d in tpool::thread_pool_generic::get_task (this=0x555fa0cb80a0, thread_var=0x555fa0cb8ad0, t=0x7f83deffcde8) at /home/buildbot/amd64-debian-10/build/tpool/tpool_generic.cc:552
        lk = {_M_device = 0x555fa0cb81b8, _M_owns = true}
#7  0x0000555f9f74413d in tpool::thread_pool_generic::worker_main (this=0x555fa0cb80a0, thread_var=0x555fa0cb8ad0) at /home/buildbot/amd64-debian-10/build/tpool/tpool_generic.cc:597
        task = 0x555fa0e89c80
#8  0x00007f8416bc5b2f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#9  0x00007f8416c9afa3 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#10 0x00007f84168a406f in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
 
Thread 3 (Thread 0x7f8414dd1700 (LWP 24563)):
#0  0x00007f841689eaf9 in syscall () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1  0x0000555f9f746a65 in my_getevents (min_nr=1, nr=256, ev=0x7f8414dcede0, ctx=<optimized out>) at /home/buildbot/amd64-debian-10/build/tpool/aio_linux.cc:108
        saved_errno = <optimized out>
        ret = <optimized out>
        saved_errno = <optimized out>
        ret = <optimized out>
#2  tpool::aio_linux::getevent_thread_routine (aio=0x555fa0cffe20) at /home/buildbot/amd64-debian-10/build/tpool/aio_linux.cc:108
        ret = <optimized out>
        events = {{data = 0x0, obj = 0x555fa0d208d0, res = 16384, res2 = 0}, {data = 0x0, obj = 0x555fa0d20990, res = 16384, res2 = 0}, {data = 0x0, obj = 0x555fa0d20a50, res = 16384, res2 = 0}, {data = 0x0, obj = 0x555fa0d20b10, res = 16384, res2 = 0}, {data = 0x0, obj = 0x555fa0d20bd0, res = 16384, res2 = 0}, {data = 0x0, obj = 0x555fa0d20c90, res = 16384, res2 = 0}, {data = 0x0, obj = 0x555fa0d20d50, res = 16384, res2 = 0}, {data = 0x0, obj = 0x555fa0d20e10, res = 16384, res2 = 0}, {data = 0x0, obj = 0x555fa0d20ed0, res = 16384, res2 = 0}, {data = 0x0, obj = 0x555fa0d20f90, res = 16384, res2 = 0}, {data = 0x0, obj = 0x555fa0d21050, res = 16384, res2 = 0}, {data = 0x0, obj = 0x555fa0d21110, res = 16384, res2 = 0}, {data = 0x0, obj = 0x0, res = 0, res2 = 0} <repeats 244 times>}
#3  0x00007f8416bc5b2f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#4  0x00007f8416c9afa3 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#5  0x00007f84168a406f in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
 
Thread 2 (Thread 0x7f841655d700 (LWP 24507)):
#0  0x00007f8416ca135b in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1  0x0000555f9f7a6f01 in inline_mysql_cond_timedwait (that=0x555fa08ec960 <COND_timer>, mutex=0x555fa08ec9a0 <LOCK_timer>, src_file=0x555f9fa24190 "/home/buildbot/amd64-debian-10/build/mysys/thr_timer.c", src_line=321, abstime=0x7f841655cdb0) at /home/buildbot/amd64-debian-10/build/include/mysql/psi/mysql_thread.h:1270
        result = <optimized out>
        result = <optimized out>
        locker = <optimized out>
        state = <optimized out>
#2  timer_handler (arg=<optimized out>) at /home/buildbot/amd64-debian-10/build/mysys/thr_timer.c:321
        error = <optimized out>
        top_time = 0x555fa0cb82b8
        now = <optimized out>
        abstime = {tv_sec = 1702026680, tv_nsec = 328723000}
#3  0x00007f8416c9afa3 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#4  0x00007f84168a406f in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
 
Thread 1 (Thread 0x7f841665d040 (LWP 24496)):
#0  0x00007f8416ca25a1 in pthread_kill () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1  0x0000555f9f7a2e47 in my_write_core (sig=<optimized out>) at /home/buildbot/amd64-debian-10/build/mysys/stacktrace.c:424
No locals.
#2  0x0000555f9f282370 in handle_fatal_signal (sig=6) at /home/buildbot/amd64-debian-10/build/sql/signal_handler.cc:357
        curr_time = 1702026678
        tm = {tm_sec = 18, tm_min = 11, tm_hour = 9, tm_mday = 8, tm_mon = 11, tm_year = 123, tm_wday = 5, tm_yday = 341, tm_isdst = 0, tm_gmtoff = 0, tm_zone = 0x555fa0bb6280 "UTC"}
        thd = 0x0
        print_invalid_query_pointer = false
#3  <signal handler called>
No symbol table info available.
#4  0x00007f841689bbd7 in select () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#5  0x0000555f9f7a0def in my_sleep (m_seconds=m_seconds@entry=1000) at /home/buildbot/amd64-debian-10/build/mysys/my_sleep.c:29
        t = {tv_sec = 0, tv_usec = 0}
#6  0x0000555f9efcb492 in close_connections () at /home/buildbot/amd64-debian-10/build/sql/mysqld.cc:1813
        n_threads_awaiting_ack = 0
        n_threads_awaiting_ack = <optimized out>
        waiting_threads = <optimized out>
        i = <optimized out>
        i = <optimized out>
#7  mysqld_main (argc=<optimized out>, argv=<optimized out>) at /home/buildbot/amd64-debian-10/build/sql/mysqld.cc:5741
        please_close_stdin = <optimized out>
        ho_error = <optimized out>
        new_thread_stack_size = <optimized out>
        user = <optimized out>
#8  0x00007f84167cf09b in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#9  0x0000555f9efbf18a in _start () at /home/buildbot/amd64-debian-10/build/include/mysql/plugin.h:215
No symbol table info available.
 
Compressed file /home/buildbot/amd64-debian-10/build/mysql-test/var/1/log/spider/bugfix.mdev_27575/mysqld.1.1/data/core
 - saving '/home/buildbot/amd64-debian-10/build/mysql-test/var/1/log/spider/bugfix.mdev_27575/' to '/home/buildbot/amd64-debian-10/build/mysql-test/var/log/bugfix.mdev_27575/'



 Comments   
Comment by Yuchen Pei [ 2024-01-11 ]

This is happening a bit too often these days. Let's disable the test until we find a solution.

example occurrence:
https://buildbot.mariadb.org/#/builders/221/builds/29320

Comment by Yuchen Pei [ 2024-01-24 ]

Happening to spider/bugfix.mdev_28739 too: https://buildbot.mariadb.net/buildbot/builders/kvm-fulltest/builds/48573/steps/mtr_nm/logs/stdio

Comment by Yuchen Pei [ 2024-02-01 ]

Following up on serg's observation of the stack in MDEV-33343

  • the server gets core in select called from sleep called from close_connections. Why? Well, clearly it's shutdown and whatever it was selecting or sleeping on gets destroyed.
  • the server sleeps in close_connections when it's waiting for the thread counter to become 0
  • threads don't end, because thread stack shows spider_table_bg_sts_action and spider_table_bg_crd_action threads waiting on conditions.
  • Why there are sts/crd bg threads if your intention was to remove spider persistent stats?

These bg threads gather stats and store them in-memory rather than
persistently. According to the stack they are waiting on a condition
(3) which should be signaled (1) when spider is shutdown, which when I
run the test locally, happens when mysqld_main() is at the call of
handle_connections_sockets() (4).

The sleep that waits for the threads until timeout is inside
close_connections() (5), which is called after the call to
handle_connections_sockets(). Thus the question is why the threads
are still alive by then? Below the code blocks and stacks are at 10.5
354e97cd722c6b8721f05e75084a6843e32b6b74.

void spider_free_sts_threads(
  SPIDER_THREAD *spider_thread
) {
//  [... 5 lines elided]
  if (!thread_killed)
  {
    if (spider_thread->thd_wait)
    {
        pthread_cond_signal(&spider_thread->cond); // (1)
    }
    pthread_cond_wait(&spider_thread->sync_cond, &spider_thread->mutex);
  }
//  [... 8 lines elided]
}
void *spider_table_bg_sts_action(
  void *arg
) {
//  [... 65 lines elided]
    if (!thread->queue_first)
    {
      DBUG_PRINT("info",("spider bg sts has no job"));
      thread->thd_wait = TRUE;
      pthread_cond_wait(&thread->cond, &thread->mutex); // (3)
      thread->thd_wait = FALSE;
      if (thd->killed)
        thread->killed = TRUE;
      continue;
    }
//  [... 80 lines elided]
}
//  [... 2 lines elided]
int mysqld_main(int argc, char **argv)
#endif
{
//  [... 411 lines elided]
  handle_connections_sockets(); // (4)
 
  mysql_mutex_lock(&LOCK_start_thread);
  select_thread_in_use=0;
  mysql_cond_broadcast(&COND_start_thread);
  mysql_mutex_unlock(&LOCK_start_thread);
#endif /* _WIN32 */
 
  /* Shutdown requested */
  char *user= shutdown_user.load(std::memory_order_relaxed);
  sql_print_information(ER_DEFAULT(ER_NORMAL_SHUTDOWN), my_progname,
                        user ? user : "unknown");
  if (user)
    my_free(user);
//  [... 11 lines elided]
  close_connections();          // (5)
//  [... 29 lines elided]
}

spider_free_sts_threads > spider_db_done > ha_finalize_handlerton > plugin_deinitialize > reap_plugins > plugin_unlock > ha_close_connection > THD::free_connection > unlink_thd > do_handle_one_connection > handle_one_connection

(hmm, I just noticed this is an expanded version of my analysis in the description...)

Generated at Thu Feb 08 10:35:37 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.