MariaDB Server / MDEV-18665

Server hangs on shutdown after setting wsrep_cluster_address at runtime

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.1, 10.2, 10.3
    • Fix Version/s: 10.1, 10.2, 10.3, 10.4
    • Component/s: wsrep
    • Labels:
      None

      Description

      • start the server normally, with all defaults, without wsrep
      • run

        set global wsrep_provider='/usr/lib/libgalera_smm.so'; set global wsrep_cluster_address='gcomm://'
        

        (adjust the path to the library if necessary).

      • shut down the server

      => The server hangs, seemingly forever. The last lines in the log are:

      10.1 431da59f

      2019-02-20 17:19:42 139659940031232 [Note] /data/bld/10.1/bin/mysqld: Normal shutdown
      2019-02-20 17:19:42 139659940031232 [Note] WSREP: Stop replication
      2019-02-20 17:19:42 139659940031232 [Note] WSREP: Closing send monitor...
      2019-02-20 17:19:42 139659940031232 [Note] WSREP: Closed send monitor.
      2019-02-20 17:19:42 139659940031232 [Note] WSREP: gcomm: terminating thread
      2019-02-20 17:19:42 139659940031232 [Note] WSREP: gcomm: joining thread
      2019-02-20 17:19:42 139659940031232 [Note] WSREP: gcomm: closing backend
      2019-02-20 17:19:42 139659940031232 [Note] WSREP: view((empty))
      2019-02-20 17:19:42 139659940031232 [Note] WSREP: gcomm: closed
      2019-02-20 17:19:42 139658977601280 [Note] WSREP: Received self-leave message.
      2019-02-20 17:19:42 139658977601280 [Note] WSREP: Flow-control interval: [0, 0]
      2019-02-20 17:19:42 139658977601280 [Note] WSREP: Received SELF-LEAVE. Closing connection.
      2019-02-20 17:19:42 139658977601280 [Note] WSREP: Shifting SYNCED -> CLOSED (TO: 0)
      2019-02-20 17:19:42 139658977601280 [Note] WSREP: RECV thread exiting 0: Success
      2019-02-20 17:19:42 139659940031232 [Note] WSREP: recv_thread() joined.
      2019-02-20 17:19:42 139659940031232 [Note] WSREP: Closing replication queue.
      2019-02-20 17:19:42 139659940031232 [Note] WSREP: Closing slave action queue.
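
The log above interleaves messages from two threads. The following is a minimal Python sketch for grouping the messages per thread, which makes it easier to see which thread logged last before the hang. It assumes the third field of each line is the thread id; the names `LOG_LINE`, `group_by_thread` and `SAMPLE` are made up for illustration, and `SAMPLE` holds only a few lines copied from the log.

```python
import re
from collections import defaultdict

# Matches lines such as:
# 2019-02-20 17:19:42 139659940031232 [Note] WSREP: Stop replication
# Assumption: the number after the timestamp is the logging thread's id.
LOG_LINE = re.compile(
    r"^\s*(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<thread>\d+) \[(?P<level>\w+)\] (?P<message>.*)$"
)


def group_by_thread(lines):
    """Return {thread_id: [message, ...]}, keeping the log order."""
    grouped = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # blank or unrecognised lines are skipped
            grouped[match.group("thread")].append(match.group("message"))
    return dict(grouped)


# A few lines copied from the log in this report.
SAMPLE = """\
2019-02-20 17:19:42 139659940031232 [Note] WSREP: gcomm: closed
2019-02-20 17:19:42 139658977601280 [Note] WSREP: Received self-leave message.
2019-02-20 17:19:42 139658977601280 [Note] WSREP: RECV thread exiting 0: Success
2019-02-20 17:19:42 139659940031232 [Note] WSREP: recv_thread() joined.
2019-02-20 17:19:42 139659940031232 [Note] WSREP: Closing slave action queue.
""".splitlines()

if __name__ == "__main__":
    for thread_id, messages in group_by_thread(SAMPLE).items():
        print(thread_id, "last message:", messages[-1])
```

With the full log, the same function can be fed the lines of the error log file instead of `SAMPLE`.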
      

      Stack trace from threads of the hanging process which seem to be up to something:

      Thread 6 (Thread 0x7f051f2adb00 (LWP 26855)):
      #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
      #1  0x000055d937962ec7 in safe_cond_wait (cond=0x55d93839c9a0 <COND_slave_background>, mp=0x55d93839fb00 <LOCK_slave_background>, file=0x55d9379f4500 "/data/src/10.1/include/mysql/psi/mysql_thread.h", line=1165) at /data/src/10.1/mysys/thr_mutex.c:493
      #2  0x000055d93700da02 in inline_mysql_cond_wait (that=0x55d93839c9a0 <COND_slave_background>, mutex=0x55d93839fb00 <LOCK_slave_background>, src_file=0x55d9379f4c5d "/data/src/10.1/sql/slave.cc", src_line=336) at /data/src/10.1/include/mysql/psi/mysql_thread.h:1165
      #3  0x000055d93700e6ea in handle_slave_background (arg=0x0) at /data/src/10.1/sql/slave.cc:336
      #4  0x00007f051ef07494 in start_thread (arg=0x7f051f2adb00) at pthread_create.c:333
      #5  0x00007f051d2c093f in clone () from /lib/x86_64-linux-gnu/libc.so.6
      

      Thread 5 (Thread 0x7f04e7357700 (LWP 26861)):
      #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
      #1  0x00007f04ef5485ee in wait (this=0x7f04e7356d10, cond=...) at galerautils/src/gu_lock.hpp:56
      #2  galera::ServiceThd::thd_func (arg=0x7f04f051a350) at galera/src/galera_service_thd.cpp:30
      #3  0x00007f051ef07494 in start_thread (arg=0x7f04e7357700) at pthread_create.c:333
      #4  0x00007f051d2c093f in clone () from /lib/x86_64-linux-gnu/libc.so.6
      

      Thread 4 (Thread 0x7f051f219b00 (LWP 26864)):
      #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
      #1  0x000055d937962ec7 in safe_cond_wait (cond=0x55d9383c7580 <COND_wsrep_rollback>, mp=0x55d9383c74c0 <LOCK_wsrep_rollback>, file=0x55d937a61800 "/data/src/10.1/include/mysql/psi/mysql_thread.h", line=1165) at /data/src/10.1/mysys/thr_mutex.c:493
      #2  0x000055d93727d6ac in inline_mysql_cond_wait (that=0x55d9383c7580 <COND_wsrep_rollback>, mutex=0x55d9383c74c0 <LOCK_wsrep_rollback>, src_file=0x55d937a61900 "/data/src/10.1/sql/wsrep_thd.cc", src_line=471) at /data/src/10.1/include/mysql/psi/mysql_thread.h:1165
      #3  0x000055d93727f2f2 in wsrep_rollback_process (thd=0x7f04e2816070) at /data/src/10.1/sql/wsrep_thd.cc:471
      #4  0x000055d93726cfa2 in start_wsrep_THD (arg=0x55d93727f1fa <wsrep_rollback_process(THD*)>) at /data/src/10.1/sql/wsrep_mysqld.cc:2064
      #5  0x00007f051ef07494 in start_thread (arg=0x7f051f219b00) at pthread_create.c:333
      #6  0x00007f051d2c093f in clone () from /lib/x86_64-linux-gnu/libc.so.6
      

      Thread 3 (Thread 0x7f051f1cfb00 (LWP 26865)):
      #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
      #1  0x000055d9379631d1 in safe_cond_timedwait (cond=0x55d9383c7480 <COND_wsrep_sst_init>, mp=0x55d9383c73c0 <LOCK_wsrep_sst_init>, abstime=0x7f051f1cd850, file=0x55d937a5ea80 "/data/src/10.1/include/mysql/psi/mysql_thread.h", line=1202) at /data/src/10.1/mysys/thr_mutex.c:547
      #2  0x000055d937270a1b in inline_mysql_cond_timedwait (that=0x55d9383c7480 <COND_wsrep_sst_init>, mutex=0x55d9383c73c0 <LOCK_wsrep_sst_init>, abstime=0x7f051f1cd850, src_file=0x55d937a5ebd0 "/data/src/10.1/sql/wsrep_sst.cc", src_line=1411) at /data/src/10.1/include/mysql/psi/mysql_thread.h:1202
      #3  0x000055d937275912 in wsrep_SE_init_wait () at /data/src/10.1/sql/wsrep_sst.cc:1411
      #4  0x000055d9372668b2 in wsrep_view_handler_cb (app_ctx=0x6e6f00, recv_ctx=0x7f04e2416070, view=0x7f04e249f300, state=0x0, state_len=0, sst_req=0x7f051f1ce170, sst_req_len=0x7f051f1ce178) at /data/src/10.1/sql/wsrep_mysqld.cc:404
      #5  0x00007f04ef57d580 in galera::ReplicatorSMM::process_conf_change (this=0x7f04f0519e00, recv_ctx=0x7f04e2416070, view_info=..., repl_proto=7, next_state=galera::Replicator::S_JOINED, seqno_l=<optimized out>) at galera/src/replicator_smm.cpp:1414
      #6  0x00007f04ef554aab in galera::GcsActionSource::dispatch (this=this@entry=0x7f04f051a488, recv_ctx=recv_ctx@entry=0x7f04e2416070, act=..., exit_loop=@0x7f051f1ce65c: false) at galera/src/gcs_action_source.cpp:139
      #7  0x00007f04ef556586 in galera::GcsActionSource::process (this=0x7f04f051a488, recv_ctx=0x7f04e2416070, exit_loop=@0x7f051f1ce65c: false) at galera/src/gcs_action_source.cpp:181
      #8  0x00007f04ef57cbc3 in galera::ReplicatorSMM::async_recv (this=0x7f04f0519e00, recv_ctx=0x7f04e2416070) at galera/src/replicator_smm.cpp:371
      #9  0x00007f04ef59121b in galera_recv (gh=<optimized out>, recv_ctx=<optimized out>) at galera/src/wsrep_provider.cpp:244
      #10 0x000055d93727ec56 in wsrep_replication_process (thd=0x7f04e2416070) at /data/src/10.1/sql/wsrep_thd.cc:360
      #11 0x000055d93726cfa2 in start_wsrep_THD (arg=0x55d93727eb8f <wsrep_replication_process(THD*)>) at /data/src/10.1/sql/wsrep_mysqld.cc:2064
      #12 0x00007f051ef07494 in start_thread (arg=0x7f051f1cfb00) at pthread_create.c:333
      #13 0x00007f051d2c093f in clone () from /lib/x86_64-linux-gnu/libc.so.6
      

      Thread 2 (Thread 0x7f051d1d6b00 (LWP 26871)):
      #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
      #1  0x000055d937962ec7 in safe_cond_wait (cond=0x55d9383a0200 <COND_thread_count>, mp=0x55d93839efc0 <LOCK_thread_count>, file=0x55d937a5ce00 "/data/src/10.1/include/mysql/psi/mysql_thread.h", line=1165) at /data/src/10.1/mysys/thr_mutex.c:493
      #2  0x000055d937266024 in inline_mysql_cond_wait (that=0x55d9383a0200 <COND_thread_count>, mutex=0x55d93839efc0 <LOCK_thread_count>, src_file=0x55d937a5d210 "/data/src/10.1/sql/wsrep_mysqld.cc", src_line=2369) at /data/src/10.1/include/mysql/psi/mysql_thread.h:1165
      #3  0x000055d93726dfc6 in wsrep_wait_appliers_close (thd=0x0) at /data/src/10.1/sql/wsrep_mysqld.cc:2369
      #4  0x000055d9372686dd in wsrep_stop_replication (thd=0x0) at /data/src/10.1/sql/wsrep_mysqld.cc:901
      #5  0x000055d936fe809d in kill_server (sig_ptr=0x0) at /data/src/10.1/sql/mysqld.cc:1924
      #6  0x000055d936fe8108 in kill_server_thread (arg=0x7f051f2f7220) at /data/src/10.1/sql/mysqld.cc:1961
      #7  0x00007f051ef07494 in start_thread (arg=0x7f051d1d6b00) at pthread_create.c:333
      #8  0x00007f051d2c093f in clone () from /lib/x86_64-linux-gnu/libc.so.6
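
To read the traces above more quickly, it can help to reduce each thread to the first frame that is not a generic condition-variable wrapper. The following is a minimal Python sketch of that idea, not a general gdb parser: the names `summarize`, `WAIT_WRAPPERS` and `SAMPLE` are hypothetical, the wrapper list is an assumption based only on the frames seen in this report, and `SAMPLE` contains a subset of frames copied from Threads 2 and 3.

```python
import re

THREAD_HEADER = re.compile(
    r"^\s*Thread (\d+) \(Thread (0x[0-9a-f]+) \(LWP (\d+)\)\):"
)
# Frame lines look like "#3  0x... in func (args) at file:line"
# or "#0  func () at file:line".
FRAME = re.compile(r"^\s*#(\d+)\s+(?:0x[0-9a-f]+ in )?(\S+) \(")

# Frames that only wrap the wait itself (assumed list, taken from this report).
WAIT_WRAPPERS = {
    "pthread_cond_wait@@GLIBC_2.3.2",
    "pthread_cond_timedwait@@GLIBC_2.3.2",
    "safe_cond_wait",
    "safe_cond_timedwait",
    "inline_mysql_cond_wait",
    "inline_mysql_cond_timedwait",
}


def summarize(backtrace_text):
    """Return {gdb_thread_no: (pthread_id_hex, first_non_wrapper_function)}."""
    summary = {}
    current = None
    for line in backtrace_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = int(header.group(1))
            summary[current] = (header.group(2), None)
            continue
        frame = FRAME.match(line)
        if frame and current is not None:
            pthread_id, found = summary[current]
            func = frame.group(2)
            if found is None and func not in WAIT_WRAPPERS:
                summary[current] = (pthread_id, func)
    return summary


# Subset of frames copied from Threads 3 and 2 of this report.
SAMPLE = """\
Thread 3 (Thread 0x7f051f1cfb00 (LWP 26865)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#3  0x000055d937275912 in wsrep_SE_init_wait () at /data/src/10.1/sql/wsrep_sst.cc:1411

Thread 2 (Thread 0x7f051d1d6b00 (LWP 26871)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#3  0x000055d93726dfc6 in wsrep_wait_appliers_close (thd=0x0) at /data/src/10.1/sql/wsrep_mysqld.cc:2369
#4  0x000055d9372686dd in wsrep_stop_replication (thd=0x0) at /data/src/10.1/sql/wsrep_mysqld.cc:901
"""

if __name__ == "__main__":
    for number, (pthread_id, func) in summarize(SAMPLE).items():
        print(f"Thread {number} ({pthread_id}): {func}")
    # The decimal id in the error log can be compared with gdb's hex ids.
    print(hex(139659940031232))
```

The final line converts the decimal id that appears on most of the log lines to hex, so it can be compared with the `Thread 0x...` values in the gdb headers. This again assumes the log field is the thread id.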
      

      10.1-10.3 are affected. I didn't check 10.4.

            People

            Assignee:
            jplindst Jan Lindström
            Reporter:
            elenst Elena Stepanova