Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-15863

Galera cluster node refuses to shut down

    XMLWordPrintable

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Cannot Reproduce
    • 10.2.14
    • N/A
    • Galera
    • None
    • Centos6

    Description

      In my lab while running sysbench with 8 threads I try and shut down node receiving
      traffic: service mysqld stop refuses to complete, mysqld server is stuck in shutdown mode.

      Details below. Worth nothing this node was also async slave, but receiving very little traffic from it (just pt-heartbeat updates).

       
      2018-04-13 14:48:41 140573072803584 [Note] /usr/sbin/mysqld (initiated by: unknown): Normal shutdown
      2018-04-13 14:48:41 140573072803584 [Note] WSREP: Stop replication
      2018-04-13 14:48:41 140573072803584 [Note] WSREP: Closing send monitor...
      2018-04-13 14:48:41 140573072803584 [Note] WSREP: Closed send monitor.
      2018-04-13 14:48:41 140573072803584 [Note] WSREP: gcomm: terminating thread
      2018-04-13 14:48:41 140573072803584 [Note] WSREP: gcomm: joining thread
      2018-04-13 14:48:41 140573072803584 [Note] WSREP: gcomm: closing backend
      2018-04-13 14:48:41 140573072803584 [Note] WSREP: view(view_id(NON_PRIM,95d49a94,41) memb {
      	cd67d220,0
      } joined {
      } left {
      } partitioned {
      	95d49a94,0
      	da537e8a,0
      })
      2018-04-13 14:48:41 140573072803584 [Note] WSREP: view((empty))
      2018-04-13 14:48:41 140573072803584 [Note] WSREP: gcomm: closed
      2018-04-13 14:48:41 140573092583168 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
      2018-04-13 14:48:41 140573092583168 [Note] WSREP: Flow-control interval: [16, 16]
      2018-04-13 14:48:41 140573092583168 [Note] WSREP: Trying to continue unpaused monitor
      2018-04-13 14:48:41 140573092583168 [Note] WSREP: Received NON-PRIMARY.
      2018-04-13 14:48:41 140573092583168 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 270133)
      2018-04-13 14:48:41 140573092583168 [Note] WSREP: Received self-leave message.
      2018-04-13 14:48:41 140573092583168 [Note] WSREP: Flow-control interval: [0, 0]
      2018-04-13 14:48:41 140573092583168 [Note] WSREP: Trying to continue unpaused monitor
      2018-04-13 14:48:41 140573092583168 [Note] WSREP: Received SELF-LEAVE. Closing connection.
      2018-04-13 14:48:41 140573092583168 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 270133)
      2018-04-13 14:48:41 140573092583168 [Note] WSREP: RECV thread exiting 0: Success
      2018-04-13 14:48:41 140573072803584 [Note] WSREP: recv_thread() joined.
      2018-04-13 14:48:41 140573072803584 [Note] WSREP: Closing replication queue.
      2018-04-13 14:48:41 140573072803584 [Note] WSREP: Closing slave action queue.
      2018-04-13 14:48:41 140573081790208 [Note] WSREP: New cluster view: global state: 75749ce0-3c03-11e8-ba88-67dc780e95e7:270133, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3
      2018-04-13 14:48:41 140573081790208 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      2018-04-13 14:48:41 140573081790208 [Note] WSREP: New cluster view: global state: 75749ce0-3c03-11e8-ba88-67dc780e95e7:270133, view# -1: non-Primary, number of nodes: 0, my index: -1, protocol version 3
      2018-04-13 14:48:41 140573081790208 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      2018-04-13 14:48:41 140573081790208 [Note] WSREP: applier thread exiting (code:0)
      2018-04-13 14:48:41 140573074417408 [Warning] Aborted connection 18 to db: 'sbtest' user: 'root' host: '192.168.2.90' (Got an error reading communication packets)
      2018-04-13 14:48:42 140573412579072 [Note] Slave I/O thread exiting, read up to log 'rbac.000003', position 20269957
      2018-04-13 14:48:43 140573072803584 [Note] WSREP: killing local connection: 15
      2018-04-13 14:48:43 140573072803584 [Note] WSREP: killing local connection: 16
      2018-04-13 14:48:43 140573072803584 [Note] WSREP: killing local connection: 17
      2018-04-13 14:48:43 140573072803584 [Note] WSREP: killing local connection: 22
      2018-04-13 14:48:43 140573072803584 [Note] WSREP: killing local connection: 21
      2018-04-13 14:48:43 140573072803584 [Note] WSREP: killing local connection: 20
      2018-04-13 14:48:43 140573072803584 [Note] WSREP: killing local connection: 19
      

      [mysqld]
       
      binlog_format=row
      default_storage_engine=InnoDB
      innodb_autoinc_lock_mode=2
      log_slave_updates=1
      server_id=1001
       
      innodb_log_file_size=128M
      innodb_log_files_in_group=2
       
      [galera]
      # Mandatory settings
      wsrep_on=ON
      wsrep_provider=/usr/lib64/galera/libgalera_smm.so
      wsrep_cluster_address=gcomm://192.168.2.91,192.168.2.92,192.168.2.93
      #
      
      

          10 libaio::??(libaio.so.1),LinuxAIOHandler::collect(os0file.cc:1915),LinuxAIOHandler::poll(os0file.cc:2060),os_aio_linux_handler(os0file.cc:2114),os_aio_handler(os0file.cc:2114),fil_aio_wait(fil0fil.cc:5348),io_handler_thread(srv0start.cc:342),start_thread(libpthread.so.0),clone(libc.so.6)
            3 pthread_cond_wait,wait(os0event.cc:166),wait_low(os0event.cc:166),os_event_wait_low(os0event.cc:166),srv_resume_thread(srv0srv.cc:924),srv_worker_thread(srv0srv.cc:924),start_thread(libpthread.so.0),clone(libc.so.6)
            1 sigwait(libpthread.so.0),signal_hand(mysqld.cc:3509),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_wait,wait(os0event.cc:166),wait_low(os0event.cc:166),os_event_wait_low(os0event.cc:166),srv_resume_thread(srv0srv.cc:924),srv_purge_coordinator_suspend(srv0srv.cc:924),srv_purge_coordinator_thread(srv0srv.cc:924),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_wait,wait(os0event.cc:166),wait_low(os0event.cc:166),os_event_wait_low(os0event.cc:166),buf_resize_thread(buf0buf.cc:3083),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_wait,wait(os0event.cc:166),wait_low(os0event.cc:166),os_event_wait_low(os0event.cc:166),buf_dump_thread(buf0dump.cc:790),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_wait,wait,galera::ServiceThd::thd_func,start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_wait,inline_mysql_cond_wait(mysql_thread.h:1152),MYSQL_BIN_LOG::wait_for_update_relay_log(log.cc:8275),next_event(slave.cc:7172),exec_relay_log_event(slave.cc:7172),handle_slave_sql(slave.cc:7172),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_wait,inline_mysql_cond_wait(mysql_thread.h:1149),inline_mysql_cond_wait(mysql_thread.h:1149),wsrep_rollback_process(mysql_thread.h:1149),start_wsrep_THD(wsrep_mysqld.cc:1981),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_wait,inline_mysql_cond_wait(mysql_thread.h:1149),inline_mysql_cond_wait(mysql_thread.h:1149),wsrep_close_client_connections(mysql_thread.h:1149),wsrep_stop_replication(wsrep_mysqld.cc:909),kill_server(mysqld.cc:2008),kill_server_thread(mysqld.cc:2043),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_wait,inline_mysql_cond_wait(mysql_thread.h:1149),inline_mysql_cond_wait(mysql_thread.h:1149),thd_destructor_proxy(mysql_thread.h:1149),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_wait,inline_mysql_cond_wait(mysql_thread.h:1149),inline_mysql_cond_wait(mysql_thread.h:1149),handle_slave_background(mysql_thread.h:1149),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_timedwait,os_event::timed_wait(os0event.cc:286),os_event::wait_time_low(os0event.cc:413),srv_monitor_thread(srv0srv.cc:1752),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_timedwait,os_event::timed_wait(os0event.cc:286),os_event::wait_time_low(os0event.cc:413),srv_error_monitor_thread(srv0srv.cc:1916),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_timedwait,os_event::timed_wait(os0event.cc:286),os_event::wait_time_low(os0event.cc:413),pc_sleep_if_needed(buf0flu.cc:2693),buf_flush_page_cleaner_coordinator(buf0flu.cc:2693),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_timedwait,os_event::timed_wait(os0event.cc:286),os_event::wait_time_low(os0event.cc:413),lock_wait_timeout_thread(lock0wait.cc:547),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_timedwait,os_event::timed_wait(os0event.cc:286),os_event::wait_time_low(os0event.cc:413),ib_wqueue_timedwait(ut0wqueue.cc:163),fts_optimize_thread(fts0opt.cc:3035),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_timedwait,os_event::timed_wait(os0event.cc:286),os_event::wait_time_low(os0event.cc:413),dict_stats_thread(dict0stats_bg.cc:451),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_timedwait,inline_mysql_cond_timedwait(mysql_thread.h:1186),timer_handler(mysql_thread.h:1186),start_thread(libpthread.so.0),clone(libc.so.6)
            1 pthread_cond_timedwait,inline_mysql_cond_timedwait(mysql_thread.h:1186),my_service_thread_sleep(mysql_thread.h:1186),ma_checkpoint_background(ma_checkpoint.c:709),start_thread(libpthread.so.0),clone(libc.so.6)
            1 poll(libc.so.6),handle_connections_sockets(mysqld.cc:6589),mysqld_main(mysqld.cc:6055),__libc_start_main(libc.so.6),_start
            1 nanosleep(libpthread.so.0),os_thread_sleep(os0thread.cc:230),srv_master_sleep(srv0srv.cc:2405),srv_master_thread(srv0srv.cc:2405),start_thread(libpthread.so.0),clone(libc.so.6)
            1 nanosleep(libpthread.so.0),os_thread_sleep(os0thread.cc:230),btr_defragment_thread(btr0defragment.cc:772),start_thread(libpthread.so.0),clone(libc.so.6)
      

      Attachments

        Activity

          People

            jplindst Jan Lindström (Inactive)
            rpizzi Rick Pizzi
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.