  1. MariaDB Server
  2. MDEV-17294

Server hangs in toku::context::~context upon shutdown



    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.1, 10.2, 10.3, 10.2.18, 10.0
    • Fix Version/s: 10.1, 10.2, 10.3, 10.4, 10.0
    • Labels:
    • Environment:
      libc 2.19, libc 2.18 vm-trusty-amd64-install.qcow2


      After querying INFORMATION_SCHEMA.ALL_PLUGINS, the server hangs on normal shutdown. The error log says "Shutdown complete", but the process does not exit. Stack trace from the hanging process:

      10.2.18 release bintar

      Thread 1 (Thread 0x7fe8400a0780 (LWP 7198)):
      #0  0x00007fe83f2254c0 in __GI___pthread_mutex_lock (mutex=0x7fe8400ae968 <_rtld_global+2312>) at ../nptl/pthread_mutex_lock.c:114
      #1  0x00007fe83fe8c0dd in tls_get_addr_tail (ti=0x7fe8144b6420, dtv=0x7fe8400a1090, the_map=0x7fe7cc033650) at dl-tls.c:722
      #2  0x00007fe81414eb44 in toku::context::~context (this=<optimized out>) at /home/buildbot/buildbot/build/storage/tokudb/PerconaFT/util/context.cc:57
      #3  0x00007fe83e8841a9 in __run_exit_handlers (status=0, listp=0x7fe83ec0a6c8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
      #4  0x00007fe83e8841f5 in __GI_exit (status=<optimized out>) at exit.c:104
      #5  0x00007fe84052e3a9 in mysqld_exit (exit_code=exit_code@entry=0) at /home/buildbot/buildbot/build/sql/mysqld.cc:2181
      #6  0x00007fe840537928 in mysqld_main (argc=7, argv=0x7fe842e27398) at /home/buildbot/buildbot/build/sql/mysqld.cc:6116
      #7  0x00007fe83e869f45 in __libc_start_main (main=0x7fe840512e10 <main(int, char**)>, argc=7, argv=0x7fff3e1ca228, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff3e1ca218) at libc-start.c:287
      #8  0x00007fe84052b72d in _start ()

      The failure happens only on some Linux versions, depending on the glibc version. It is known to happen on trusty with glibc 2.19 (and likely 2.18), but may not be limited to those. As of now, it is reproducible on vm-trusty-amd64-install.qcow2 by doing the following:

      Below is a comment by Sergei Golubchik from Slack:

      I can see that toku::context::~context modifies a __thread variable
      in this glibc it causes a mutex (internal eglibc) to be unlocked:
      dl-tls.c:730 __rtld_lock_unlock_recursive (GL(dl_load_lock));
      where __rtld_lock_unlock_recursive is just a fancy name for __GI___pthread_mutex_unlock and the argument is a fancy name for some internal eglibc mutex (_rtld_global._dl_load_lock.mutex)

      and I'd speculate that as it's called from the exit handler, it's just too late to do this kind of stuff, mutexes are already destroyed
      tokudb bug, not one of ours. but similar to those two I've mentioned above

      I've seen it happen on 10.2.18, but filing for 10.0+ as requested.
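
The pattern described in the comment above can be sketched roughly as follows. This is a simplified illustration, not the PerconaFT source: the names `sketch`, `tl_current`, `current_id` and `default_context` are hypothetical, and the only facts taken from this report are that a `context` destructor runs from the exit handlers and writes a `__thread` variable. Compiled into an ordinary executable, this code should not hang, because thread-local storage is resolved statically there. The hang in the trace involves a dynamically loaded plugin, where the access goes through the dynamic linker's TLS lookup (`tls_get_addr_tail` in frame #1).

```cpp
#include <cassert>

namespace sketch {

class context;

// Thread-local pointer to the innermost active context (hypothetical name).
static thread_local const context *tl_current = nullptr;

// RAII guard: the constructor pushes itself as the current context,
// the destructor restores the previous one.
class context {
public:
    explicit context(int id) : old_(tl_current), id_(id) { tl_current = this; }

    // This write to a thread-local variable is the step that, inside a
    // dlopen()ed shared object, may need the dynamic linker's TLS lock.
    ~context() { tl_current = old_; }

    context(const context &) = delete;
    context &operator=(const context &) = delete;

    int id() const { return id_; }

private:
    const context *old_;
    int id_;
};

// Returns the id of the current context, or -1 if none is active.
inline int current_id() { return tl_current ? tl_current->id() : -1; }

// Object with static storage duration: its destructor runs during exit(),
// which corresponds to frames #2 and #3 of the stack trace.
static const context default_context(0);

}  // namespace sketch
```

A nested `sketch::context c(7);` makes `sketch::current_id()` return 7 until `c` goes out of scope, after which it returns 0 again. The point of the sketch is only that the last such destructor runs very late, from the process exit handlers.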




            serg Sergei Golubchik
            elenst Elena Stepanova