Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-14282

SIGHUP after rsync SST causes crash in my_hash_element

Details

    Description

      A joiner node performed an rsync SST from another node in the cluster, and when the SST finished, the node saw the following crash:

      2017-11-03 16:44:46 140515631684352 [Note] WSREP: State transfer required: 
      	Group state: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977
      	Local state: 00000000-0000-0000-0000-000000000000:-1
      2017-11-03 16:44:46 140515631684352 [Note] WSREP: New cluster view: global state: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977, view# 102: Primary, number of nodes: 4, my index: 3, protocol version 3
      2017-11-03 16:44:46 140515631684352 [Warning] WSREP: Gap in state sequence. Need state transfer.
      2017-11-03 16:44:46 140504717063936 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '10.119.17.237' --datadir '/app/mysql/galera/'  --defaults-file '/app/mysql/config/galera.cnf'  --parent '8214' --binlog '/app/mysql/galera/db-node1-binlog' '
      2017-11-03 16:44:46 140515631684352 [Note] WSREP: Prepared SST request: rsync|10.119.17.237:4444/rsync_sst
      2017-11-03 16:44:46 140515631684352 [Note] WSREP: REPL Protocols: 7 (3, 2)
      2017-11-03 16:44:46 140515631684352 [Note] WSREP: Assign initial position for certification: 230930977, protocol version: 3
      2017-11-03 16:44:46 140504792139520 [Note] WSREP: Service thread queue flushed.
      2017-11-03 16:44:46 140515631684352 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (b1c4fe17-b433-11e7-bd44-52c2748b6170): 1 (Operation not permitted)
      	 at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
      2017-11-03 16:44:46 140504733837056 [Note] WSREP: Member 3.0 (10-119-17-237.domain.com) requested state transfer from '*any*'. Selected 0.0 (10-119-17-69.domain.com)(SYNCED) as donor.
      2017-11-03 16:44:46 140504733837056 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 230930977)
      2017-11-03 16:44:46 140515631684352 [Note] WSREP: Requesting state transfer: success, donor: 0
      2017-11-03 16:44:46 140515631684352 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977)
      2017-11-03 16:44:49 140504742229760 [Note] WSREP: (d3790118, 'tcp://0.0.0.0:4567') connection to peer d3790118 with addr tcp://10.119.17.237:4567 timed out, no messages seen in PT3S
      2017-11-03 16:44:49 140504742229760 [Note] WSREP: (d3790118, 'tcp://0.0.0.0:4567') turning message relay requesting off
      2017-11-03 17:06:34 140504733837056 [Note] WSREP: 0.0 (10-119-17-69.domain.com): State transfer to 3.0 (10-119-17-237.domain.com) complete.
      WSREP_SST: [INFO] Extracting binlog files: (20171103 17:06:34.145)
      db-g3-bin.000042
      ls: cannot access db-node1-binlog.*: No such file or directory
      WSREP_SST: [INFO] Joiner cleanup. rsync PID: 8293 (20171103 17:06:34.156)
      WSREP_SST: [INFO] Joiner cleanup done. (20171103 17:06:34.662)
      2017-11-03 17:06:34 140515632003328 [Note] WSREP: SST complete, seqno: 230930977
      2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using mutexes to ref count buffer pool pages
      2017-11-03 17:06:34 140515632003328 [Note] InnoDB: The InnoDB memory heap is disabled
      2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      2017-11-03 17:06:34 140515632003328 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
      2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Compressed tables use zlib 1.2.7
      2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using Linux native AIO
      2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using SSE crc32 instructions
      2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Initializing buffer pool, size = 512.0M
      2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Completed initialization of buffer pool
      2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Highest supported file format is Barracuda.
      2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1244892544453
      2017-11-03 17:06:36 140504733837056 [Note] WSREP: Member 0.0 (10-119-17-69.domain.com) synced with group.
      2017-11-03 17:06:42 140515632003328 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
      2017-11-03 17:06:49 140515632003328 [Note] InnoDB: To recover: 2702 pages from log
      2017-11-03 17:06:53 140515632003328 [Note] InnoDB: Starting final batch to recover 2585 pages from redo log
      InnoDB: Last MySQL binlog file position 0 786976686, file name /app/mysql/galera/db-g3-bin.000041
      2017-11-03 17:07:15 140515632003328 [Note] InnoDB: 128 rollback segment(s) are active.
      2017-11-03 17:07:15 140515632003328 [Note] InnoDB: Waiting for purge to start
      2017-11-03 17:07:15 140515632003328 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.36-82.2 started; log sequence number 1244894457235
      2017-11-03 17:07:23 140515632003328 [Note] Plugin 'FEEDBACK' is disabled.
      2017-11-03 17:07:23 140503555213056 [Note] InnoDB: Dumping buffer pool(s) not yet started
      2017-11-03 17:07:23 140515632003328 [Note] Server socket created on IP: '10.119.17.237'.
       
      Status information:
       
      Current dir: /app/mysql/galera/
      Running threads: 2  Stack size: 295936
      Current locks:
      lock: 0x7fc98da48788:
       
      lock: 0x7fc9cdc6a508:
       
      lock: 0x7fcc598b5520:
       
      lock: 0x7fcc598b5440:
       
      lock: 0x7fcc598b4480:
       
      lock: 0x7fcc598b43a0:
       
      lock: 0x7fcc598b42c0:
       
      lock: 0x7fcc598b4640:
       
      lock: 0x7fcc598b2880:
       
      lock: 0x7fcc598b5360:
       
      lock: 0x7fcc598b2f80:
       
      lock: 0x7fcc598b3220:
       
      lock: 0x7fcc598b2ea0:
       
      lock: 0x7fcc598b3140:
       
      lock: 0x7fcc598b2dc0:
       
      lock: 0x7fcc598b3060:
       
      lock: 0x7fcc598b35a0:
       
      lock: 0x7fcc598b3680:
       
      lock: 0x7fcc598b3760:
       
      lock: 0x7fcc598b2ce0:
       
      lock: 0x7fcc598b2a40:
       
      lock: 0x7fcc598b2c00:
       
      lock: 0x7fcc598b2960:
       
      lock: 0x7fcc598b2b20:
       
      lock: 0x7fcc598b3300:
       
      lock: 0x7fcc598b33e0:
       
      lock: 0x7fcc598b34c0:
       
      lock: 0x7fcc598b4fe0:
       
      lock: 0x7fcc598b5280:
       
      lock: 0x7fcc598b51a0:
       
      lock: 0x7fcc598b50c0:
       
      lock: 0x7fcc598b4c60:
       
      lock: 0x7fcc598b4b80:
       
      lock: 0x7fcc598b4aa0:
       
      lock: 0x7fcc598b49c0:
       
      lock: 0x7fcc598b48e0:
       
      lock: 0x7fcc598b4e20:
       
      lock: 0x7fcc598b4800:
       
      lock: 0x7fcc598b4720:
       
      lock: 0x7fcc598b4f00:
       
      lock: 0x7fcc598b4560:
       
      lock: 0x7fcc598b4100:
       
      lock: 0x7fcc598b41e0:
       
      lock: 0x7fcc598b4020:
       
      lock: 0x7fcc598b3f40:
       
      lock: 0x7fcc598b3bc0:
       
      lock: 0x7fcc598b3e60:
       
      lock: 0x7fcc598b3d80:
       
      lock: 0x7fcc598b3ae0:
       
      lock: 0x7fcc598b3ca0:
       
      lock: 0x7fcc598b3840:
       
      lock: 0x7fcc598b3920:
       
      lock: 0x7fcc598b3a00:
       
      lock: 0x7fcc598b4d40:
       
       
      Key caches:
      default
      Buffer_size:     134217728
      Block_size:           1024
      Division_limit:        100
      Age_threshold:         300
      Partitions:              0
      blocks used:             0
      not flushed:             0
      w_requests:              0
      writes:                  0
      r_requests:              0
      reads:                   0
       
       
      handler status:
      read_key:            0
      read_next:           0
      read_rnd             0
      read_first:          0
      write:               0
      delete               0
      update:              0
       
      Table status:
      Opened tables:          2
      Open tables:            2
      Open files:             8
      Open streams:           0
       
      Alarm status:
      Active alarms:   0
      Max used alarms: 0
      Next alarm time: 0
       
      Memory status:
      Non-mmapped space allocated from system: 0
      Number of free chunks:			 1
      Number of fastbin blocks:		 0
      Number of mmapped regions:		 0
      Space in mmapped regions:		 0
      Maximum total allocated space:		 0
      Space available in freed fastbin blocks: 0
      Total allocated space:			 0
      Total free space:			 0
      Top-most, releasable space:		 0
      Estimated memory (with thread stack):    591872
       
       
       
      Events status:
      LLA = Last Locked At  LUA = Last Unlocked At
      WOC = Waiting On Condition  DL = Data Locked
      The Event Scheduler is disabled
       
      2017-11-03 17:07:23 140515632003328 [Note] WSREP: Signalling provider to continue.
      2017-11-03 17:07:23 140515632003328 [Note] WSREP: SST received: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977
      2017-11-03 17:07:23 140515630676736 [Note] Event Scheduler: scheduler thread started with id 4
      171103 17:07:23 [ERROR] mysqld got signal 11 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
       
      Server version: 10.1.28-MariaDB
      key_buffer_size=134217728
      read_buffer_size=131072
      max_used_connections=0
      max_threads=1002
      thread_count=10
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2332069 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x48400
      mysqld(my_print_stacktrace+0x2e)[0x7fcc5902d92e]
      2017-11-03 17:07:23 140515632003328 [Note] Reading of all Master_info entries succeded
      2017-11-03 17:07:23 140515632003328 [Note] Added new Master_info '' to hash table
      2017-11-03 17:07:23 140515632003328 [Note] mysqld: ready for connections.
      Version: '10.1.28-MariaDB'  socket: '/app/mysql/temporaries/galera/mysqld.sock'  port: 3306  MariaDB Server
      2017-11-03 17:07:23 140504733837056 [Note] WSREP: 3.0 (10-119-17-237.domain.com): State transfer from 0.0 (10-119-17-69.domain.com) complete.
      2017-11-03 17:07:23 140504733837056 [Note] WSREP: Shifting JOINER -> JOINED (TO: 230971731)
      mysqld(handle_fatal_signal+0x305)[0x7fcc58b50ce5]
      /lib64/libpthread.so.0(+0xf370)[0x7fcc58161370]
      addr2line: 'mysqld': No such file
      mysqld(my_hash_element+0x13)[0x7fcc5900c023]
      mysqld(_ZN17Master_info_index20flush_all_relay_logsEv+0x93)[0x7fcc58a98c83]
      mysqld(_Z20reload_acl_and_cacheP3THDyP10TABLE_LISTPi+0x5d2)[0x7fcc58ab8672]
      mysqld(signal_hand+0x382)[0x7fcc5892c812]
      /lib64/libpthread.so.0(+0x7dc5)[0x7fcc58159dc5]
      /lib64/libc.so.6(clone+0x6d)[0x7fcc5657976d]
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      

      I'm not sure if it could provide any hint as to the cause of the crash, but the donor's error log contained a lot of messages about long semaphore waits involving log_sys->checkpoint_lock during the time period in which it acted as donor:

      InnoDB: ###### Diagnostic info printed to the standard error stream
      WSREP: avoiding InnoDB self crash due to long semaphore wait of  > 600 seconds
      Server is processing SST donor operation, fatal_cnt now: 0InnoDB: Warning: a long semaphore wait:
      --Thread 140439973771008 has waited at log0log.cc line 2352 for 1265.0 seconds the semaphore:
      S-lock on RW-latch at 0x7fbd82c5d3b8 '&log_sys->checkpoint_lock'
      a writer (thread id 140439973771008) has reserved it in mode  exclusive
      number of readers 0, waiters flag 1, lock_word: 0
      Last time read locked in file log0log.cc line 2352
      Last time write locked in file log0log.cc line 2109
      Holder thread 0 file not yet reserved line 0
      InnoDB: Warning: semaphore wait:
      --Thread 140439973771008 has waited at log0log.cc line 2352 for 1265.0 seconds the semaphore:
      S-lock on RW-latch at 0x7fbd82c5d3b8 '&log_sys->checkpoint_lock'
      a writer (thread id 140439973771008) has reserved it in mode  exclusive
      number of readers 0, waiters flag 1, lock_word: 0
      Last time read locked in file log0log.cc line 2352
      Last time write locked in file log0log.cc line 2109
      Holder thread 0 file not yet reserved line 0
      InnoDB: Warning: Writer thread is waiting this semaphore:
      --Thread 140439973771008 has waited at log0log.cc line 2352 for 1265.0 seconds the semaphore:
      S-lock on RW-latch at 0x7fbd82c5d3b8 '&log_sys->checkpoint_lock'
      a writer (thread id 140439973771008) has reserved it in mode  exclusive
      number of readers 0, waiters flag 1, lock_word: 0
      Last time read locked in file log0log.cc line 2352
      Last time write locked in file log0log.cc line 2109
      Holder thread 0 file not yet reserved line 0
      

      In case it is relevant, the donor node was a replication master, but not a slave.

      After the first SST led to a crash, another SST was attempted, and that one was successful without a crash.

      Attachments

        Issue Links

          Activity

            GeoffMontee Geoff Montee (Inactive) created issue -
            elenst Elena Stepanova made changes -
            Field Original Value New Value
            Assignee Andrii Nikitin [ anikitin ]
            GeoffMontee Geoff Montee (Inactive) made changes -
            Description A joiner node performed an rsync SST from another node in the cluster, and when the SST finished, the node saw the following crash:

            {noformat}
            2017-11-03 17:06:34 140504733837056 [Note] WSREP: 0.0 (10-119-17-69.domain.com): State transfer to 3.0 (10-119-17-237.domain.com) complete.
            WSREP_SST: [INFO] Extracting binlog files: (20171103 17:06:34.145)$
            db-g3-bin.000042$
            ls: cannot access db-devg1-binlog.*: No such file or directory$
            WSREP_SST: [INFO] Joiner cleanup. rsync PID: 8293 (20171103 17:06:34.156)$
            WSREP_SST: [INFO] Joiner cleanup done. (20171103 17:06:34.662)$
            2017-11-03 17:06:34 140515632003328 [Note] WSREP: SST complete, seqno: 230930977$
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using mutexes to ref count buffer pool pages
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: The InnoDB memory heap is disabled$
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used or memory barrier$
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Compressed tables use zlib 1.2.7$
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using Linux native AIO$
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using SSE crc32 instructions$
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Initializing buffer pool, size = 512.0M$
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Completed initialization of buffer pool$
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Highest supported file format is Barracuda.$
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Starting crash recovery from checkpoint LSN= 244892544453$
            2017-11-03 17:06:36 140504733837056 [Note] WSREP: Member 0.0 (10-119-17-69.domain.com) sy ced with group.$
            2017-11-03 17:06:42 140515632003328 [Note] InnoDB: Restoring possible half-written data pages f om the doublewrite buffer...$
            2017-11-03 17:06:49 140515632003328 [Note] InnoDB: To recover: 2702 pages from log$
            2017-11-03 17:06:53 140515632003328 [Note] InnoDB: Starting final batch to recover 2585 pages f om redo log$
            InnoDB: Last MySQL binlog file position 0 786976686, file name /app/mysql/galera/db-g3-bin.0000 1$
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: 128 rollback segment(s) are active.$
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: Waiting for purge to start$
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6 36-82.2 started; log sequence number 1244894457235$
            2017-11-03 17:07:23 140515632003328 [Note] Plugin 'FEEDBACK' is disabled.$
            2017-11-03 17:07:23 140503555213056 [Note] InnoDB: Dumping buffer pool(s) not yet started$
            2017-11-03 17:07:23 140515632003328 [Note] Server socket created on IP: '10.119.17.237'.$
            $
            Status information:$
            $
            Current dir: /app/mysql/galera/$
            Running threads: 2 Stack size: 295936$
            Current locks:$
            lock: 0x7fc98da48788:$
            $
            lock: 0x7fc9cdc6a508:$
            $
            lock: 0x7fcc598b5520:$
            $
            lock: 0x7fcc598b5440:$
            $
            lock: 0x7fcc598b4480:$
            $
            lock: 0x7fcc598b43a0:$
            $
            lock: 0x7fcc598b42c0:$
            $
            lock: 0x7fcc598b4640:$
            $
            lock: 0x7fcc598b2880:$
            $
            lock: 0x7fcc598b5360:$
            $
            lock: 0x7fcc598b2f80:$
            $
            lock: 0x7fcc598b3220:$
            $
            lock: 0x7fcc598b2ea0:$
            $
            lock: 0x7fcc598b3140:$
            $
            lock: 0x7fcc598b2dc0:$
            $
            lock: 0x7fcc598b3060:$
            $
            lock: 0x7fcc598b35a0:$
            $
            lock: 0x7fcc598b3680:$
            $
            lock: 0x7fcc598b3760:$
            $
            lock: 0x7fcc598b2ce0:$
            $
            lock: 0x7fcc598b2a40:$
            $
            lock: 0x7fcc598b2c00:$
            $
            lock: 0x7fcc598b2960:$
            $
            lock: 0x7fcc598b2b20:$
            $
            lock: 0x7fcc598b3300:$
            $
            lock: 0x7fcc598b33e0:$
            $
            lock: 0x7fcc598b34c0:$
            $
            lock: 0x7fcc598b4fe0:$
            $
            lock: 0x7fcc598b5280:$
            $
            lock: 0x7fcc598b51a0:$
            $
            lock: 0x7fcc598b50c0:$
            $
            lock: 0x7fcc598b4c60:$
            $
            lock: 0x7fcc598b4b80:$
            $
            lock: 0x7fcc598b4aa0:$
            $
            lock: 0x7fcc598b49c0:$
            $
            lock: 0x7fcc598b48e0:$
            $
            lock: 0x7fcc598b4e20:$
            $
            lock: 0x7fcc598b4800:$
            $
            lock: 0x7fcc598b4720:$
            $
            lock: 0x7fcc598b4f00:$
            $
            lock: 0x7fcc598b4560:$
            $
            lock: 0x7fcc598b4100:$
            $
            lock: 0x7fcc598b41e0:$
            $
            lock: 0x7fcc598b4020:$
            $
            lock: 0x7fcc598b3f40:$
            $
            lock: 0x7fcc598b3bc0:$
            $
            lock: 0x7fcc598b3e60:$
            $
            lock: 0x7fcc598b3d80:$
            $
            lock: 0x7fcc598b3ae0:$
            $
            lock: 0x7fcc598b3ca0:$
            $
            lock: 0x7fcc598b3840:$
            $
            lock: 0x7fcc598b3920:$
            $
            lock: 0x7fcc598b3a00:$
            $
            lock: 0x7fcc598b4d40:$
            $
            $
            Key caches:$
            default$
            Buffer_size: 134217728$
            Block_size: 1024$
            Division_limit: 100$
            Age_threshold: 300$
            Partitions: 0$
            blocks used: 0$
            not flushed: 0$
            w_requests: 0$
            writes: 0$
            r_requests: 0$
            reads: 0$
            $
            $
            handler status:$
            read_key: 0$
            read_next: 0$
            read_rnd 0$
            read_first: 0$
            write: 0$
            delete 0$
            update: 0$
            $
            Table status:$
            Opened tables: 2$
            Open tables: 2$
            Open files: 8$
            Open streams: 0$
            $
            Alarm status:$
            Active alarms: 0$
            Max used alarms: 0$
            Next alarm time: 0$
            $
            Memory status:$
            Non-mmapped space allocated from system: 0$
            Number of free chunks: 1$
            Number of fastbin blocks: 0$
            Number of mmapped regions: 0$
            Space in mmapped regions: 0$
            Maximum total allocated space: 0$
            Space available in freed fastbin blocks: 0$
            Total allocated space: 0$
            Total free space: 0$
            Top-most, releasable space: 0$
            Estimated memory (with thread stack): 591872$
            $
            $
            $
            Events status:$
            LLA = Last Locked At LUA = Last Unlocked At$
            WOC = Waiting On Condition DL = Data Locked$
            The Event Scheduler is disabled$
            $
            2017-11-03 17:07:23 140515632003328 [Note] WSREP: Signalling provider to continue.$
            2017-11-03 17:07:23 140515632003328 [Note] WSREP: SST received: b1c4fe17-b433-11e7-bd44-52c2748 6170:230930977$
            2017-11-03 17:07:23 140515630676736 [Note] Event Scheduler: scheduler thread started with id 4$
            171103 17:07:23 [ERROR] mysqld got signal 11 ;$
            This could be because you hit a bug. It is also possible that this binary$
            or one of the libraries it was linked against is corrupt, improperly built,$
            or misconfigured. This error can also be caused by malfunctioning hardware.$
            $
            To report this bug, see https://mariadb.com/kb/en/reporting-bugs$
            $
            We will try our best to scrape up some info that will hopefully help$
            diagnose the problem, but since we have already crashed, $
            something is definitely wrong and this may fail.$
            $
            Server version: 10.1.28-MariaDB$
            key_buffer_size=134217728$
            read_buffer_size=131072$
            max_used_connections=0$
            max_threads=1002$
            thread_count=10$
            It is possible that mysqld could use up to $
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2332069 K bytes of memor $
            Hope that's ok; if not, decrease some variables in the equation.$
            $
            Thread pointer: 0x0$
            Attempting backtrace. You can use the following information to find out$
            where mysqld died. If you see no messages after this, something went$
            terribly wrong...$
            stack_bottom = 0x0 thread_stack 0x48400$
            mysqld(my_print_stacktrace+0x2e)[0x7fcc5902d92e]$
            2017-11-03 17:07:23 140515632003328 [Note] Reading of all Master_info entries succeded$
            2017-11-03 17:07:23 140515632003328 [Note] Added new Master_info '' to hash table$
            2017-11-03 17:07:23 140515632003328 [Note] mysqld: ready for connections.$
            Version: '10.1.28-MariaDB' socket: '/app/mysql/temporaries/galera/mysqld.sock' port: 3306 MariaDB Server$
            2017-11-03 17:07:23 140504733837056 [Note] WSREP: 3.0 (10-119-17-237.domain.com): State transfer from 0.0 (10-119-17-69.domain.com) complete.$
            2017-11-03 17:07:23 140504733837056 [Note] WSREP: Shifting JOINER -> JOINED (TO: 230971731)$
            mysqld(handle_fatal_signal+0x305)[0x7fcc58b50ce5]$
            /lib64/libpthread.so.0(+0xf370)[0x7fcc58161370]$
            addr2line: 'mysqld': No such file$
            mysqld(my_hash_element+0x13)[0x7fcc5900c023]$
            mysqld(_ZN17Master_info_index20flush_all_relay_logsEv+0x93)[0x7fcc58a98c83]$
            mysqld(_Z20reload_acl_and_cacheP3THDyP10TABLE_LISTPi+0x5d2)[0x7fcc58ab8672]$
            mysqld(signal_hand+0x382)[0x7fcc5892c812]$
            /lib64/libpthread.so.0(+0x7dc5)[0x7fcc58159dc5]$
            /lib64/libc.so.6(clone+0x6d)[0x7fcc5657976d]$
            The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains$
            information that should help you find out what is causing the crash.$
            {noformat}

            In case it is relevant, the donor node was a replication master, but not a slave.

            After the first SST led to a crash, another SST was attempted, and that one was successful without a crash.
            A joiner node performed an rsync SST from another node in the cluster, and when the SST finished, the node saw the following crash:

            {noformat}
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: State transfer required:
            Group state: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977
            Local state: 00000000-0000-0000-0000-000000000000:-1
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: New cluster view: global state: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977, view# 102: Primary, number of nodes: 4, my index: 3, protocol version 3
            2017-11-03 16:44:46 140515631684352 [Warning] WSREP: Gap in state sequence. Need state transfer.
            2017-11-03 16:44:46 140504717063936 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '10.119.17.237' --datadir '/app/mysql/galera/' --defaults-file '/app/mysql/config/galera.cnf' --parent '8214' --binlog '/app/mysql/galera/db-node1-binlog' '
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Prepared SST request: rsync|10.119.17.237:4444/rsync_sst
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: REPL Protocols: 7 (3, 2)
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Assign initial position for certification: 230930977, protocol version: 3
            2017-11-03 16:44:46 140504792139520 [Note] WSREP: Service thread queue flushed.
            2017-11-03 16:44:46 140515631684352 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (b1c4fe17-b433-11e7-bd44-52c2748b6170): 1 (Operation not permitted)
            at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
            2017-11-03 16:44:46 140504733837056 [Note] WSREP: Member 3.0 (10-119-17-237.domain.com) requested state transfer from '*any*'. Selected 0.0 (10-119-17-69.domain.com)(SYNCED) as donor.
            2017-11-03 16:44:46 140504733837056 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 230930977)
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Requesting state transfer: success, donor: 0
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977)
            2017-11-03 16:44:49 140504742229760 [Note] WSREP: (d3790118, 'tcp://0.0.0.0:4567') connection to peer d3790118 with addr tcp://10.119.17.237:4567 timed out, no messages seen in PT3S
            2017-11-03 16:44:49 140504742229760 [Note] WSREP: (d3790118, 'tcp://0.0.0.0:4567') turning message relay requesting off
            2017-11-03 17:06:34 140504733837056 [Note] WSREP: 0.0 (10-119-17-69.domain.com): State transfer to 3.0 (10-119-17-237.domain.com) complete.
            WSREP_SST: [INFO] Extracting binlog files: (20171103 17:06:34.145)
            db-g3-bin.000042
            ls: cannot access db-node1-binlog.*: No such file or directory
            WSREP_SST: [INFO] Joiner cleanup. rsync PID: 8293 (20171103 17:06:34.156)
            WSREP_SST: [INFO] Joiner cleanup done. (20171103 17:06:34.662)
            2017-11-03 17:06:34 140515632003328 [Note] WSREP: SST complete, seqno: 230930977
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using mutexes to ref count buffer pool pages
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: The InnoDB memory heap is disabled
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Compressed tables use zlib 1.2.7
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using Linux native AIO
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using SSE crc32 instructions
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Initializing buffer pool, size = 512.0M
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Completed initialization of buffer pool
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Highest supported file format is Barracuda.
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1244892544453
            2017-11-03 17:06:36 140504733837056 [Note] WSREP: Member 0.0 (10-119-17-69.domain.com) synced with group.
            2017-11-03 17:06:42 140515632003328 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
            2017-11-03 17:06:49 140515632003328 [Note] InnoDB: To recover: 2702 pages from log
            2017-11-03 17:06:53 140515632003328 [Note] InnoDB: Starting final batch to recover 2585 pages from redo log
            InnoDB: Last MySQL binlog file position 0 786976686, file name /app/mysql/galera/db-g3-bin.000041
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: 128 rollback segment(s) are active.
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: Waiting for purge to start
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.36-82.2 started; log sequence number 1244894457235
            2017-11-03 17:07:23 140515632003328 [Note] Plugin 'FEEDBACK' is disabled.
            2017-11-03 17:07:23 140503555213056 [Note] InnoDB: Dumping buffer pool(s) not yet started
            2017-11-03 17:07:23 140515632003328 [Note] Server socket created on IP: '10.119.17.237'.

            Status information:

            Current dir: /app/mysql/galera/
            Running threads: 2 Stack size: 295936
            Current locks:
            lock: 0x7fc98da48788:

            lock: 0x7fc9cdc6a508:

            lock: 0x7fcc598b5520:

            lock: 0x7fcc598b5440:

            lock: 0x7fcc598b4480:

            lock: 0x7fcc598b43a0:

            lock: 0x7fcc598b42c0:

            lock: 0x7fcc598b4640:

            lock: 0x7fcc598b2880:

            lock: 0x7fcc598b5360:

            lock: 0x7fcc598b2f80:

            lock: 0x7fcc598b3220:

            lock: 0x7fcc598b2ea0:

            lock: 0x7fcc598b3140:

            lock: 0x7fcc598b2dc0:

            lock: 0x7fcc598b3060:

            lock: 0x7fcc598b35a0:

            lock: 0x7fcc598b3680:

            lock: 0x7fcc598b3760:

            lock: 0x7fcc598b2ce0:

            lock: 0x7fcc598b2a40:

            lock: 0x7fcc598b2c00:

            lock: 0x7fcc598b2960:

            lock: 0x7fcc598b2b20:

            lock: 0x7fcc598b3300:

            lock: 0x7fcc598b33e0:

            lock: 0x7fcc598b34c0:

            lock: 0x7fcc598b4fe0:

            lock: 0x7fcc598b5280:

            lock: 0x7fcc598b51a0:

            lock: 0x7fcc598b50c0:

            lock: 0x7fcc598b4c60:

            lock: 0x7fcc598b4b80:

            lock: 0x7fcc598b4aa0:

            lock: 0x7fcc598b49c0:

            lock: 0x7fcc598b48e0:

            lock: 0x7fcc598b4e20:

            lock: 0x7fcc598b4800:

            lock: 0x7fcc598b4720:

            lock: 0x7fcc598b4f00:

            lock: 0x7fcc598b4560:

            lock: 0x7fcc598b4100:

            lock: 0x7fcc598b41e0:

            lock: 0x7fcc598b4020:

            lock: 0x7fcc598b3f40:

            lock: 0x7fcc598b3bc0:

            lock: 0x7fcc598b3e60:

            lock: 0x7fcc598b3d80:

            lock: 0x7fcc598b3ae0:

            lock: 0x7fcc598b3ca0:

            lock: 0x7fcc598b3840:

            lock: 0x7fcc598b3920:

            lock: 0x7fcc598b3a00:

            lock: 0x7fcc598b4d40:


            Key caches:
            default
            Buffer_size: 134217728
            Block_size: 1024
            Division_limit: 100
            Age_threshold: 300
            Partitions: 0
            blocks used: 0
            not flushed: 0
            w_requests: 0
            writes: 0
            r_requests: 0
            reads: 0


            handler status:
            read_key: 0
            read_next: 0
            read_rnd 0
            read_first: 0
            write: 0
            delete 0
            update: 0

            Table status:
            Opened tables: 2
            Open tables: 2
            Open files: 8
            Open streams: 0

            Alarm status:
            Active alarms: 0
            Max used alarms: 0
            Next alarm time: 0

            Memory status:
            Non-mmapped space allocated from system: 0
            Number of free chunks: 1
            Number of fastbin blocks: 0
            Number of mmapped regions: 0
            Space in mmapped regions: 0
            Maximum total allocated space: 0
            Space available in freed fastbin blocks: 0
            Total allocated space: 0
            Total free space: 0
            Top-most, releasable space: 0
            Estimated memory (with thread stack): 591872



            Events status:
            LLA = Last Locked At LUA = Last Unlocked At
            WOC = Waiting On Condition DL = Data Locked
            The Event Scheduler is disabled

            2017-11-03 17:07:23 140515632003328 [Note] WSREP: Signalling provider to continue.
            2017-11-03 17:07:23 140515632003328 [Note] WSREP: SST received: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977
            2017-11-03 17:07:23 140515630676736 [Note] Event Scheduler: scheduler thread started with id 4
            171103 17:07:23 [ERROR] mysqld got signal 11 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.

            To report this bug, see https://mariadb.com/kb/en/reporting-bugs

            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed,
            something is definitely wrong and this may fail.

            Server version: 10.1.28-MariaDB
            key_buffer_size=134217728
            read_buffer_size=131072
            max_used_connections=0
            max_threads=1002
            thread_count=10
            It is possible that mysqld could use up to
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2332069 K bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.

            Thread pointer: 0x0
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x0 thread_stack 0x48400
            mysqld(my_print_stacktrace+0x2e)[0x7fcc5902d92e]
            2017-11-03 17:07:23 140515632003328 [Note] Reading of all Master_info entries succeded
            2017-11-03 17:07:23 140515632003328 [Note] Added new Master_info '' to hash table
            2017-11-03 17:07:23 140515632003328 [Note] mysqld: ready for connections.
            Version: '10.1.28-MariaDB' socket: '/app/mysql/temporaries/galera/mysqld.sock' port: 3306 MariaDB Server
            2017-11-03 17:07:23 140504733837056 [Note] WSREP: 3.0 (10-119-17-237.domain.com): State transfer from 0.0 (10-119-17-69.domain.com) complete.
            2017-11-03 17:07:23 140504733837056 [Note] WSREP: Shifting JOINER -> JOINED (TO: 230971731)
            mysqld(handle_fatal_signal+0x305)[0x7fcc58b50ce5]
            /lib64/libpthread.so.0(+0xf370)[0x7fcc58161370]
            addr2line: 'mysqld': No such file
            mysqld(my_hash_element+0x13)[0x7fcc5900c023]
            mysqld(_ZN17Master_info_index20flush_all_relay_logsEv+0x93)[0x7fcc58a98c83]
            mysqld(_Z20reload_acl_and_cacheP3THDyP10TABLE_LISTPi+0x5d2)[0x7fcc58ab8672]
            mysqld(signal_hand+0x382)[0x7fcc5892c812]
            /lib64/libpthread.so.0(+0x7dc5)[0x7fcc58159dc5]
            /lib64/libc.so.6(clone+0x6d)[0x7fcc5657976d]
            The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            information that should help you find out what is causing the crash.
            {noformat}

            In case it is relevant, the donor node was a replication master, but not a slave.

            After the first SST led to a crash, another SST was attempted, and that one was successful without a crash.
            GeoffMontee Geoff Montee (Inactive) made changes -
            Description A joiner node performed an rsync SST from another node in the cluster, and when the SST finished, the node saw the following crash:

            {noformat}
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: State transfer required:
            Group state: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977
            Local state: 00000000-0000-0000-0000-000000000000:-1
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: New cluster view: global state: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977, view# 102: Primary, number of nodes: 4, my index: 3, protocol version 3
            2017-11-03 16:44:46 140515631684352 [Warning] WSREP: Gap in state sequence. Need state transfer.
            2017-11-03 16:44:46 140504717063936 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '10.119.17.237' --datadir '/app/mysql/galera/' --defaults-file '/app/mysql/config/galera.cnf' --parent '8214' --binlog '/app/mysql/galera/db-node1-binlog' '
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Prepared SST request: rsync|10.119.17.237:4444/rsync_sst
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: REPL Protocols: 7 (3, 2)
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Assign initial position for certification: 230930977, protocol version: 3
            2017-11-03 16:44:46 140504792139520 [Note] WSREP: Service thread queue flushed.
            2017-11-03 16:44:46 140515631684352 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (b1c4fe17-b433-11e7-bd44-52c2748b6170): 1 (Operation not permitted)
            at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
            2017-11-03 16:44:46 140504733837056 [Note] WSREP: Member 3.0 (10-119-17-237.domain.com) requested state transfer from '*any*'. Selected 0.0 (10-119-17-69.domain.com)(SYNCED) as donor.
            2017-11-03 16:44:46 140504733837056 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 230930977)
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Requesting state transfer: success, donor: 0
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977)
            2017-11-03 16:44:49 140504742229760 [Note] WSREP: (d3790118, 'tcp://0.0.0.0:4567') connection to peer d3790118 with addr tcp://10.119.17.237:4567 timed out, no messages seen in PT3S
            2017-11-03 16:44:49 140504742229760 [Note] WSREP: (d3790118, 'tcp://0.0.0.0:4567') turning message relay requesting off
            2017-11-03 17:06:34 140504733837056 [Note] WSREP: 0.0 (10-119-17-69.domain.com): State transfer to 3.0 (10-119-17-237.domain.com) complete.
            WSREP_SST: [INFO] Extracting binlog files: (20171103 17:06:34.145)
            db-g3-bin.000042
            ls: cannot access db-node1-binlog.*: No such file or directory
            WSREP_SST: [INFO] Joiner cleanup. rsync PID: 8293 (20171103 17:06:34.156)
            WSREP_SST: [INFO] Joiner cleanup done. (20171103 17:06:34.662)
            2017-11-03 17:06:34 140515632003328 [Note] WSREP: SST complete, seqno: 230930977
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using mutexes to ref count buffer pool pages
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: The InnoDB memory heap is disabled
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Compressed tables use zlib 1.2.7
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using Linux native AIO
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using SSE crc32 instructions
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Initializing buffer pool, size = 512.0M
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Completed initialization of buffer pool
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Highest supported file format is Barracuda.
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1244892544453
            2017-11-03 17:06:36 140504733837056 [Note] WSREP: Member 0.0 (10-119-17-69.domain.com) synced with group.
            2017-11-03 17:06:42 140515632003328 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
            2017-11-03 17:06:49 140515632003328 [Note] InnoDB: To recover: 2702 pages from log
            2017-11-03 17:06:53 140515632003328 [Note] InnoDB: Starting final batch to recover 2585 pages from redo log
            InnoDB: Last MySQL binlog file position 0 786976686, file name /app/mysql/galera/db-g3-bin.000041
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: 128 rollback segment(s) are active.
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: Waiting for purge to start
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.36-82.2 started; log sequence number 1244894457235
            2017-11-03 17:07:23 140515632003328 [Note] Plugin 'FEEDBACK' is disabled.
            2017-11-03 17:07:23 140503555213056 [Note] InnoDB: Dumping buffer pool(s) not yet started
            2017-11-03 17:07:23 140515632003328 [Note] Server socket created on IP: '10.119.17.237'.

            Status information:

            Current dir: /app/mysql/galera/
            Running threads: 2 Stack size: 295936
            Current locks:
            lock: 0x7fc98da48788:

            lock: 0x7fc9cdc6a508:

            lock: 0x7fcc598b5520:

            lock: 0x7fcc598b5440:

            lock: 0x7fcc598b4480:

            lock: 0x7fcc598b43a0:

            lock: 0x7fcc598b42c0:

            lock: 0x7fcc598b4640:

            lock: 0x7fcc598b2880:

            lock: 0x7fcc598b5360:

            lock: 0x7fcc598b2f80:

            lock: 0x7fcc598b3220:

            lock: 0x7fcc598b2ea0:

            lock: 0x7fcc598b3140:

            lock: 0x7fcc598b2dc0:

            lock: 0x7fcc598b3060:

            lock: 0x7fcc598b35a0:

            lock: 0x7fcc598b3680:

            lock: 0x7fcc598b3760:

            lock: 0x7fcc598b2ce0:

            lock: 0x7fcc598b2a40:

            lock: 0x7fcc598b2c00:

            lock: 0x7fcc598b2960:

            lock: 0x7fcc598b2b20:

            lock: 0x7fcc598b3300:

            lock: 0x7fcc598b33e0:

            lock: 0x7fcc598b34c0:

            lock: 0x7fcc598b4fe0:

            lock: 0x7fcc598b5280:

            lock: 0x7fcc598b51a0:

            lock: 0x7fcc598b50c0:

            lock: 0x7fcc598b4c60:

            lock: 0x7fcc598b4b80:

            lock: 0x7fcc598b4aa0:

            lock: 0x7fcc598b49c0:

            lock: 0x7fcc598b48e0:

            lock: 0x7fcc598b4e20:

            lock: 0x7fcc598b4800:

            lock: 0x7fcc598b4720:

            lock: 0x7fcc598b4f00:

            lock: 0x7fcc598b4560:

            lock: 0x7fcc598b4100:

            lock: 0x7fcc598b41e0:

            lock: 0x7fcc598b4020:

            lock: 0x7fcc598b3f40:

            lock: 0x7fcc598b3bc0:

            lock: 0x7fcc598b3e60:

            lock: 0x7fcc598b3d80:

            lock: 0x7fcc598b3ae0:

            lock: 0x7fcc598b3ca0:

            lock: 0x7fcc598b3840:

            lock: 0x7fcc598b3920:

            lock: 0x7fcc598b3a00:

            lock: 0x7fcc598b4d40:


            Key caches:
            default
            Buffer_size: 134217728
            Block_size: 1024
            Division_limit: 100
            Age_threshold: 300
            Partitions: 0
            blocks used: 0
            not flushed: 0
            w_requests: 0
            writes: 0
            r_requests: 0
            reads: 0


            handler status:
            read_key: 0
            read_next: 0
            read_rnd 0
            read_first: 0
            write: 0
            delete 0
            update: 0

            Table status:
            Opened tables: 2
            Open tables: 2
            Open files: 8
            Open streams: 0

            Alarm status:
            Active alarms: 0
            Max used alarms: 0
            Next alarm time: 0

            Memory status:
            Non-mmapped space allocated from system: 0
            Number of free chunks: 1
            Number of fastbin blocks: 0
            Number of mmapped regions: 0
            Space in mmapped regions: 0
            Maximum total allocated space: 0
            Space available in freed fastbin blocks: 0
            Total allocated space: 0
            Total free space: 0
            Top-most, releasable space: 0
            Estimated memory (with thread stack): 591872



            Events status:
            LLA = Last Locked At LUA = Last Unlocked At
            WOC = Waiting On Condition DL = Data Locked
            The Event Scheduler is disabled

            2017-11-03 17:07:23 140515632003328 [Note] WSREP: Signalling provider to continue.
            2017-11-03 17:07:23 140515632003328 [Note] WSREP: SST received: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977
            2017-11-03 17:07:23 140515630676736 [Note] Event Scheduler: scheduler thread started with id 4
            171103 17:07:23 [ERROR] mysqld got signal 11 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.

            To report this bug, see https://mariadb.com/kb/en/reporting-bugs

            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed,
            something is definitely wrong and this may fail.

            Server version: 10.1.28-MariaDB
            key_buffer_size=134217728
            read_buffer_size=131072
            max_used_connections=0
            max_threads=1002
            thread_count=10
            It is possible that mysqld could use up to
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2332069 K bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.

            Thread pointer: 0x0
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x0 thread_stack 0x48400
            mysqld(my_print_stacktrace+0x2e)[0x7fcc5902d92e]
            2017-11-03 17:07:23 140515632003328 [Note] Reading of all Master_info entries succeded
            2017-11-03 17:07:23 140515632003328 [Note] Added new Master_info '' to hash table
            2017-11-03 17:07:23 140515632003328 [Note] mysqld: ready for connections.
            Version: '10.1.28-MariaDB' socket: '/app/mysql/temporaries/galera/mysqld.sock' port: 3306 MariaDB Server
            2017-11-03 17:07:23 140504733837056 [Note] WSREP: 3.0 (10-119-17-237.domain.com): State transfer from 0.0 (10-119-17-69.domain.com) complete.
            2017-11-03 17:07:23 140504733837056 [Note] WSREP: Shifting JOINER -> JOINED (TO: 230971731)
            mysqld(handle_fatal_signal+0x305)[0x7fcc58b50ce5]
            /lib64/libpthread.so.0(+0xf370)[0x7fcc58161370]
            addr2line: 'mysqld': No such file
            mysqld(my_hash_element+0x13)[0x7fcc5900c023]
            mysqld(_ZN17Master_info_index20flush_all_relay_logsEv+0x93)[0x7fcc58a98c83]
            mysqld(_Z20reload_acl_and_cacheP3THDyP10TABLE_LISTPi+0x5d2)[0x7fcc58ab8672]
            mysqld(signal_hand+0x382)[0x7fcc5892c812]
            /lib64/libpthread.so.0(+0x7dc5)[0x7fcc58159dc5]
            /lib64/libc.so.6(clone+0x6d)[0x7fcc5657976d]
            The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            information that should help you find out what is causing the crash.
            {noformat}

            In case it is relevant, the donor node was a replication master, but not a slave.

            After the first SST led to a crash, another SST was attempted, and that one was successful without a crash.
            A joiner node performed an rsync SST from another node in the cluster, and when the SST finished, the node saw the following crash:

            {noformat}
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: State transfer required:
            Group state: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977
            Local state: 00000000-0000-0000-0000-000000000000:-1
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: New cluster view: global state: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977, view# 102: Primary, number of nodes: 4, my index: 3, protocol version 3
            2017-11-03 16:44:46 140515631684352 [Warning] WSREP: Gap in state sequence. Need state transfer.
            2017-11-03 16:44:46 140504717063936 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '10.119.17.237' --datadir '/app/mysql/galera/' --defaults-file '/app/mysql/config/galera.cnf' --parent '8214' --binlog '/app/mysql/galera/db-node1-binlog' '
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Prepared SST request: rsync|10.119.17.237:4444/rsync_sst
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: REPL Protocols: 7 (3, 2)
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Assign initial position for certification: 230930977, protocol version: 3
            2017-11-03 16:44:46 140504792139520 [Note] WSREP: Service thread queue flushed.
            2017-11-03 16:44:46 140515631684352 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (b1c4fe17-b433-11e7-bd44-52c2748b6170): 1 (Operation not permitted)
            at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
            2017-11-03 16:44:46 140504733837056 [Note] WSREP: Member 3.0 (10-119-17-237.domain.com) requested state transfer from '*any*'. Selected 0.0 (10-119-17-69.domain.com)(SYNCED) as donor.
            2017-11-03 16:44:46 140504733837056 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 230930977)
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Requesting state transfer: success, donor: 0
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977)
            2017-11-03 16:44:49 140504742229760 [Note] WSREP: (d3790118, 'tcp://0.0.0.0:4567') connection to peer d3790118 with addr tcp://10.119.17.237:4567 timed out, no messages seen in PT3S
            2017-11-03 16:44:49 140504742229760 [Note] WSREP: (d3790118, 'tcp://0.0.0.0:4567') turning message relay requesting off
            2017-11-03 17:06:34 140504733837056 [Note] WSREP: 0.0 (10-119-17-69.domain.com): State transfer to 3.0 (10-119-17-237.domain.com) complete.
            WSREP_SST: [INFO] Extracting binlog files: (20171103 17:06:34.145)
            db-g3-bin.000042
            ls: cannot access db-node1-binlog.*: No such file or directory
            WSREP_SST: [INFO] Joiner cleanup. rsync PID: 8293 (20171103 17:06:34.156)
            WSREP_SST: [INFO] Joiner cleanup done. (20171103 17:06:34.662)
            2017-11-03 17:06:34 140515632003328 [Note] WSREP: SST complete, seqno: 230930977
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using mutexes to ref count buffer pool pages
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: The InnoDB memory heap is disabled
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Compressed tables use zlib 1.2.7
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using Linux native AIO
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using SSE crc32 instructions
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Initializing buffer pool, size = 512.0M
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Completed initialization of buffer pool
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Highest supported file format is Barracuda.
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1244892544453
            2017-11-03 17:06:36 140504733837056 [Note] WSREP: Member 0.0 (10-119-17-69.domain.com) synced with group.
            2017-11-03 17:06:42 140515632003328 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
            2017-11-03 17:06:49 140515632003328 [Note] InnoDB: To recover: 2702 pages from log
            2017-11-03 17:06:53 140515632003328 [Note] InnoDB: Starting final batch to recover 2585 pages from redo log
            InnoDB: Last MySQL binlog file position 0 786976686, file name /app/mysql/galera/db-g3-bin.000041
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: 128 rollback segment(s) are active.
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: Waiting for purge to start
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.36-82.2 started; log sequence number 1244894457235
            2017-11-03 17:07:23 140515632003328 [Note] Plugin 'FEEDBACK' is disabled.
            2017-11-03 17:07:23 140503555213056 [Note] InnoDB: Dumping buffer pool(s) not yet started
            2017-11-03 17:07:23 140515632003328 [Note] Server socket created on IP: '10.119.17.237'.

            Status information:

            Current dir: /app/mysql/galera/
            Running threads: 2 Stack size: 295936
            Current locks:
            lock: 0x7fc98da48788:

            lock: 0x7fc9cdc6a508:

            lock: 0x7fcc598b5520:

            lock: 0x7fcc598b5440:

            lock: 0x7fcc598b4480:

            lock: 0x7fcc598b43a0:

            lock: 0x7fcc598b42c0:

            lock: 0x7fcc598b4640:

            lock: 0x7fcc598b2880:

            lock: 0x7fcc598b5360:

            lock: 0x7fcc598b2f80:

            lock: 0x7fcc598b3220:

            lock: 0x7fcc598b2ea0:

            lock: 0x7fcc598b3140:

            lock: 0x7fcc598b2dc0:

            lock: 0x7fcc598b3060:

            lock: 0x7fcc598b35a0:

            lock: 0x7fcc598b3680:

            lock: 0x7fcc598b3760:

            lock: 0x7fcc598b2ce0:

            lock: 0x7fcc598b2a40:

            lock: 0x7fcc598b2c00:

            lock: 0x7fcc598b2960:

            lock: 0x7fcc598b2b20:

            lock: 0x7fcc598b3300:

            lock: 0x7fcc598b33e0:

            lock: 0x7fcc598b34c0:

            lock: 0x7fcc598b4fe0:

            lock: 0x7fcc598b5280:

            lock: 0x7fcc598b51a0:

            lock: 0x7fcc598b50c0:

            lock: 0x7fcc598b4c60:

            lock: 0x7fcc598b4b80:

            lock: 0x7fcc598b4aa0:

            lock: 0x7fcc598b49c0:

            lock: 0x7fcc598b48e0:

            lock: 0x7fcc598b4e20:

            lock: 0x7fcc598b4800:

            lock: 0x7fcc598b4720:

            lock: 0x7fcc598b4f00:

            lock: 0x7fcc598b4560:

            lock: 0x7fcc598b4100:

            lock: 0x7fcc598b41e0:

            lock: 0x7fcc598b4020:

            lock: 0x7fcc598b3f40:

            lock: 0x7fcc598b3bc0:

            lock: 0x7fcc598b3e60:

            lock: 0x7fcc598b3d80:

            lock: 0x7fcc598b3ae0:

            lock: 0x7fcc598b3ca0:

            lock: 0x7fcc598b3840:

            lock: 0x7fcc598b3920:

            lock: 0x7fcc598b3a00:

            lock: 0x7fcc598b4d40:


            Key caches:
            default
            Buffer_size: 134217728
            Block_size: 1024
            Division_limit: 100
            Age_threshold: 300
            Partitions: 0
            blocks used: 0
            not flushed: 0
            w_requests: 0
            writes: 0
            r_requests: 0
            reads: 0


            handler status:
            read_key: 0
            read_next: 0
            read_rnd 0
            read_first: 0
            write: 0
            delete 0
            update: 0

            Table status:
            Opened tables: 2
            Open tables: 2
            Open files: 8
            Open streams: 0

            Alarm status:
            Active alarms: 0
            Max used alarms: 0
            Next alarm time: 0

            Memory status:
            Non-mmapped space allocated from system: 0
            Number of free chunks: 1
            Number of fastbin blocks: 0
            Number of mmapped regions: 0
            Space in mmapped regions: 0
            Maximum total allocated space: 0
            Space available in freed fastbin blocks: 0
            Total allocated space: 0
            Total free space: 0
            Top-most, releasable space: 0
            Estimated memory (with thread stack): 591872



            Events status:
            LLA = Last Locked At LUA = Last Unlocked At
            WOC = Waiting On Condition DL = Data Locked
            The Event Scheduler is disabled

            2017-11-03 17:07:23 140515632003328 [Note] WSREP: Signalling provider to continue.
            2017-11-03 17:07:23 140515632003328 [Note] WSREP: SST received: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977
            2017-11-03 17:07:23 140515630676736 [Note] Event Scheduler: scheduler thread started with id 4
            171103 17:07:23 [ERROR] mysqld got signal 11 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.

            To report this bug, see https://mariadb.com/kb/en/reporting-bugs

            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed,
            something is definitely wrong and this may fail.

            Server version: 10.1.28-MariaDB
            key_buffer_size=134217728
            read_buffer_size=131072
            max_used_connections=0
            max_threads=1002
            thread_count=10
            It is possible that mysqld could use up to
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2332069 K bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.

            Thread pointer: 0x0
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x0 thread_stack 0x48400
            mysqld(my_print_stacktrace+0x2e)[0x7fcc5902d92e]
            2017-11-03 17:07:23 140515632003328 [Note] Reading of all Master_info entries succeded
            2017-11-03 17:07:23 140515632003328 [Note] Added new Master_info '' to hash table
            2017-11-03 17:07:23 140515632003328 [Note] mysqld: ready for connections.
            Version: '10.1.28-MariaDB' socket: '/app/mysql/temporaries/galera/mysqld.sock' port: 3306 MariaDB Server
            2017-11-03 17:07:23 140504733837056 [Note] WSREP: 3.0 (10-119-17-237.domain.com): State transfer from 0.0 (10-119-17-69.domain.com) complete.
            2017-11-03 17:07:23 140504733837056 [Note] WSREP: Shifting JOINER -> JOINED (TO: 230971731)
            mysqld(handle_fatal_signal+0x305)[0x7fcc58b50ce5]
            /lib64/libpthread.so.0(+0xf370)[0x7fcc58161370]
            addr2line: 'mysqld': No such file
            mysqld(my_hash_element+0x13)[0x7fcc5900c023]
            mysqld(_ZN17Master_info_index20flush_all_relay_logsEv+0x93)[0x7fcc58a98c83]
            mysqld(_Z20reload_acl_and_cacheP3THDyP10TABLE_LISTPi+0x5d2)[0x7fcc58ab8672]
            mysqld(signal_hand+0x382)[0x7fcc5892c812]
            /lib64/libpthread.so.0(+0x7dc5)[0x7fcc58159dc5]
            /lib64/libc.so.6(clone+0x6d)[0x7fcc5657976d]
            The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            information that should help you find out what is causing the crash.
            {noformat}

            I'm not sure if it could provide any hint as to the cause of the crash, but the donor's error log contained a lot of messages about long semaphore waits during the time period in which it acted as donor:

            {noformat}
            InnoDB: ###### Diagnostic info printed to the standard error stream
            WSREP: avoiding InnoDB self crash due to long semaphore wait of > 600 seconds
            Server is processing SST donor operation, fatal_cnt now: 0InnoDB: Warning: a long semaphore wait:
            --Thread 140439973771008 has waited at log0log.cc line 2352 for 1265.0 seconds the semaphore:
            S-lock on RW-latch at 0x7fbd82c5d3b8 '&log_sys->checkpoint_lock'
            a writer (thread id 140439973771008) has reserved it in mode exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time read locked in file log0log.cc line 2352
            Last time write locked in file log0log.cc line 2109
            Holder thread 0 file not yet reserved line 0
            InnoDB: Warning: semaphore wait:
            --Thread 140439973771008 has waited at log0log.cc line 2352 for 1265.0 seconds the semaphore:
            S-lock on RW-latch at 0x7fbd82c5d3b8 '&log_sys->checkpoint_lock'
            a writer (thread id 140439973771008) has reserved it in mode exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time read locked in file log0log.cc line 2352
            Last time write locked in file log0log.cc line 2109
            Holder thread 0 file not yet reserved line 0
            InnoDB: Warning: Writer thread is waiting this semaphore:
            --Thread 140439973771008 has waited at log0log.cc line 2352 for 1265.0 seconds the semaphore:
            S-lock on RW-latch at 0x7fbd82c5d3b8 '&log_sys->checkpoint_lock'
            a writer (thread id 140439973771008) has reserved it in mode exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time read locked in file log0log.cc line 2352
            Last time write locked in file log0log.cc line 2109
            Holder thread 0 file not yet reserved line 0
            {noformat}

            In case it is relevant, the donor node was a replication master, but not a slave.

            After the first SST led to a crash, another SST was attempted, and that one was successful without a crash.
            GeoffMontee Geoff Montee (Inactive) made changes -
            Description A joiner node performed an rsync SST from another node in the cluster, and when the SST finished, the node saw the following crash:

            {noformat}
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: State transfer required:
            Group state: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977
            Local state: 00000000-0000-0000-0000-000000000000:-1
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: New cluster view: global state: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977, view# 102: Primary, number of nodes: 4, my index: 3, protocol version 3
            2017-11-03 16:44:46 140515631684352 [Warning] WSREP: Gap in state sequence. Need state transfer.
            2017-11-03 16:44:46 140504717063936 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '10.119.17.237' --datadir '/app/mysql/galera/' --defaults-file '/app/mysql/config/galera.cnf' --parent '8214' --binlog '/app/mysql/galera/db-node1-binlog' '
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Prepared SST request: rsync|10.119.17.237:4444/rsync_sst
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: REPL Protocols: 7 (3, 2)
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Assign initial position for certification: 230930977, protocol version: 3
            2017-11-03 16:44:46 140504792139520 [Note] WSREP: Service thread queue flushed.
            2017-11-03 16:44:46 140515631684352 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (b1c4fe17-b433-11e7-bd44-52c2748b6170): 1 (Operation not permitted)
            at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
            2017-11-03 16:44:46 140504733837056 [Note] WSREP: Member 3.0 (10-119-17-237.domain.com) requested state transfer from '*any*'. Selected 0.0 (10-119-17-69.domain.com)(SYNCED) as donor.
            2017-11-03 16:44:46 140504733837056 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 230930977)
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Requesting state transfer: success, donor: 0
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977)
            2017-11-03 16:44:49 140504742229760 [Note] WSREP: (d3790118, 'tcp://0.0.0.0:4567') connection to peer d3790118 with addr tcp://10.119.17.237:4567 timed out, no messages seen in PT3S
            2017-11-03 16:44:49 140504742229760 [Note] WSREP: (d3790118, 'tcp://0.0.0.0:4567') turning message relay requesting off
            2017-11-03 17:06:34 140504733837056 [Note] WSREP: 0.0 (10-119-17-69.domain.com): State transfer to 3.0 (10-119-17-237.domain.com) complete.
            WSREP_SST: [INFO] Extracting binlog files: (20171103 17:06:34.145)
            db-g3-bin.000042
            ls: cannot access db-node1-binlog.*: No such file or directory
            WSREP_SST: [INFO] Joiner cleanup. rsync PID: 8293 (20171103 17:06:34.156)
            WSREP_SST: [INFO] Joiner cleanup done. (20171103 17:06:34.662)
            2017-11-03 17:06:34 140515632003328 [Note] WSREP: SST complete, seqno: 230930977
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using mutexes to ref count buffer pool pages
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: The InnoDB memory heap is disabled
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Compressed tables use zlib 1.2.7
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using Linux native AIO
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using SSE crc32 instructions
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Initializing buffer pool, size = 512.0M
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Completed initialization of buffer pool
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Highest supported file format is Barracuda.
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1244892544453
            2017-11-03 17:06:36 140504733837056 [Note] WSREP: Member 0.0 (10-119-17-69.domain.com) synced with group.
            2017-11-03 17:06:42 140515632003328 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
            2017-11-03 17:06:49 140515632003328 [Note] InnoDB: To recover: 2702 pages from log
            2017-11-03 17:06:53 140515632003328 [Note] InnoDB: Starting final batch to recover 2585 pages from redo log
            InnoDB: Last MySQL binlog file position 0 786976686, file name /app/mysql/galera/db-g3-bin.000041
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: 128 rollback segment(s) are active.
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: Waiting for purge to start
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.36-82.2 started; log sequence number 1244894457235
            2017-11-03 17:07:23 140515632003328 [Note] Plugin 'FEEDBACK' is disabled.
            2017-11-03 17:07:23 140503555213056 [Note] InnoDB: Dumping buffer pool(s) not yet started
            2017-11-03 17:07:23 140515632003328 [Note] Server socket created on IP: '10.119.17.237'.

            Status information:

            Current dir: /app/mysql/galera/
            Running threads: 2 Stack size: 295936
            Current locks:
            lock: 0x7fc98da48788:

            lock: 0x7fc9cdc6a508:

            lock: 0x7fcc598b5520:

            lock: 0x7fcc598b5440:

            lock: 0x7fcc598b4480:

            lock: 0x7fcc598b43a0:

            lock: 0x7fcc598b42c0:

            lock: 0x7fcc598b4640:

            lock: 0x7fcc598b2880:

            lock: 0x7fcc598b5360:

            lock: 0x7fcc598b2f80:

            lock: 0x7fcc598b3220:

            lock: 0x7fcc598b2ea0:

            lock: 0x7fcc598b3140:

            lock: 0x7fcc598b2dc0:

            lock: 0x7fcc598b3060:

            lock: 0x7fcc598b35a0:

            lock: 0x7fcc598b3680:

            lock: 0x7fcc598b3760:

            lock: 0x7fcc598b2ce0:

            lock: 0x7fcc598b2a40:

            lock: 0x7fcc598b2c00:

            lock: 0x7fcc598b2960:

            lock: 0x7fcc598b2b20:

            lock: 0x7fcc598b3300:

            lock: 0x7fcc598b33e0:

            lock: 0x7fcc598b34c0:

            lock: 0x7fcc598b4fe0:

            lock: 0x7fcc598b5280:

            lock: 0x7fcc598b51a0:

            lock: 0x7fcc598b50c0:

            lock: 0x7fcc598b4c60:

            lock: 0x7fcc598b4b80:

            lock: 0x7fcc598b4aa0:

            lock: 0x7fcc598b49c0:

            lock: 0x7fcc598b48e0:

            lock: 0x7fcc598b4e20:

            lock: 0x7fcc598b4800:

            lock: 0x7fcc598b4720:

            lock: 0x7fcc598b4f00:

            lock: 0x7fcc598b4560:

            lock: 0x7fcc598b4100:

            lock: 0x7fcc598b41e0:

            lock: 0x7fcc598b4020:

            lock: 0x7fcc598b3f40:

            lock: 0x7fcc598b3bc0:

            lock: 0x7fcc598b3e60:

            lock: 0x7fcc598b3d80:

            lock: 0x7fcc598b3ae0:

            lock: 0x7fcc598b3ca0:

            lock: 0x7fcc598b3840:

            lock: 0x7fcc598b3920:

            lock: 0x7fcc598b3a00:

            lock: 0x7fcc598b4d40:


            Key caches:
            default
            Buffer_size: 134217728
            Block_size: 1024
            Division_limit: 100
            Age_threshold: 300
            Partitions: 0
            blocks used: 0
            not flushed: 0
            w_requests: 0
            writes: 0
            r_requests: 0
            reads: 0


            handler status:
            read_key: 0
            read_next: 0
            read_rnd 0
            read_first: 0
            write: 0
            delete 0
            update: 0

            Table status:
            Opened tables: 2
            Open tables: 2
            Open files: 8
            Open streams: 0

            Alarm status:
            Active alarms: 0
            Max used alarms: 0
            Next alarm time: 0

            Memory status:
            Non-mmapped space allocated from system: 0
            Number of free chunks: 1
            Number of fastbin blocks: 0
            Number of mmapped regions: 0
            Space in mmapped regions: 0
            Maximum total allocated space: 0
            Space available in freed fastbin blocks: 0
            Total allocated space: 0
            Total free space: 0
            Top-most, releasable space: 0
            Estimated memory (with thread stack): 591872



            Events status:
            LLA = Last Locked At LUA = Last Unlocked At
            WOC = Waiting On Condition DL = Data Locked
            The Event Scheduler is disabled

            2017-11-03 17:07:23 140515632003328 [Note] WSREP: Signalling provider to continue.
            2017-11-03 17:07:23 140515632003328 [Note] WSREP: SST received: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977
            2017-11-03 17:07:23 140515630676736 [Note] Event Scheduler: scheduler thread started with id 4
            171103 17:07:23 [ERROR] mysqld got signal 11 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.

            To report this bug, see https://mariadb.com/kb/en/reporting-bugs

            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed,
            something is definitely wrong and this may fail.

            Server version: 10.1.28-MariaDB
            key_buffer_size=134217728
            read_buffer_size=131072
            max_used_connections=0
            max_threads=1002
            thread_count=10
            It is possible that mysqld could use up to
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2332069 K bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.

            Thread pointer: 0x0
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x0 thread_stack 0x48400
            mysqld(my_print_stacktrace+0x2e)[0x7fcc5902d92e]
            2017-11-03 17:07:23 140515632003328 [Note] Reading of all Master_info entries succeded
            2017-11-03 17:07:23 140515632003328 [Note] Added new Master_info '' to hash table
            2017-11-03 17:07:23 140515632003328 [Note] mysqld: ready for connections.
            Version: '10.1.28-MariaDB' socket: '/app/mysql/temporaries/galera/mysqld.sock' port: 3306 MariaDB Server
            2017-11-03 17:07:23 140504733837056 [Note] WSREP: 3.0 (10-119-17-237.domain.com): State transfer from 0.0 (10-119-17-69.domain.com) complete.
            2017-11-03 17:07:23 140504733837056 [Note] WSREP: Shifting JOINER -> JOINED (TO: 230971731)
            mysqld(handle_fatal_signal+0x305)[0x7fcc58b50ce5]
            /lib64/libpthread.so.0(+0xf370)[0x7fcc58161370]
            addr2line: 'mysqld': No such file
            mysqld(my_hash_element+0x13)[0x7fcc5900c023]
            mysqld(_ZN17Master_info_index20flush_all_relay_logsEv+0x93)[0x7fcc58a98c83]
            mysqld(_Z20reload_acl_and_cacheP3THDyP10TABLE_LISTPi+0x5d2)[0x7fcc58ab8672]
            mysqld(signal_hand+0x382)[0x7fcc5892c812]
            /lib64/libpthread.so.0(+0x7dc5)[0x7fcc58159dc5]
            /lib64/libc.so.6(clone+0x6d)[0x7fcc5657976d]
            The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            information that should help you find out what is causing the crash.
            {noformat}

            I'm not sure if it could provide any hint as to the cause of the crash, but the donor's error log contained a lot of messages about long semaphore waits during the time period in which it acted as donor:

            {noformat}
            InnoDB: ###### Diagnostic info printed to the standard error stream
            WSREP: avoiding InnoDB self crash due to long semaphore wait of > 600 seconds
            Server is processing SST donor operation, fatal_cnt now: 0InnoDB: Warning: a long semaphore wait:
            --Thread 140439973771008 has waited at log0log.cc line 2352 for 1265.0 seconds the semaphore:
            S-lock on RW-latch at 0x7fbd82c5d3b8 '&log_sys->checkpoint_lock'
            a writer (thread id 140439973771008) has reserved it in mode exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time read locked in file log0log.cc line 2352
            Last time write locked in file log0log.cc line 2109
            Holder thread 0 file not yet reserved line 0
            InnoDB: Warning: semaphore wait:
            --Thread 140439973771008 has waited at log0log.cc line 2352 for 1265.0 seconds the semaphore:
            S-lock on RW-latch at 0x7fbd82c5d3b8 '&log_sys->checkpoint_lock'
            a writer (thread id 140439973771008) has reserved it in mode exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time read locked in file log0log.cc line 2352
            Last time write locked in file log0log.cc line 2109
            Holder thread 0 file not yet reserved line 0
            InnoDB: Warning: Writer thread is waiting this semaphore:
            --Thread 140439973771008 has waited at log0log.cc line 2352 for 1265.0 seconds the semaphore:
            S-lock on RW-latch at 0x7fbd82c5d3b8 '&log_sys->checkpoint_lock'
            a writer (thread id 140439973771008) has reserved it in mode exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time read locked in file log0log.cc line 2352
            Last time write locked in file log0log.cc line 2109
            Holder thread 0 file not yet reserved line 0
            {noformat}

            In case it is relevant, the donor node was a replication master, but not a slave.

            After the first SST led to a crash, another SST was attempted, and that one was successful without a crash.
            A joiner node performed an rsync SST from another node in the cluster, and when the SST finished, the node saw the following crash:

            {noformat}
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: State transfer required:
            Group state: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977
            Local state: 00000000-0000-0000-0000-000000000000:-1
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: New cluster view: global state: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977, view# 102: Primary, number of nodes: 4, my index: 3, protocol version 3
            2017-11-03 16:44:46 140515631684352 [Warning] WSREP: Gap in state sequence. Need state transfer.
            2017-11-03 16:44:46 140504717063936 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '10.119.17.237' --datadir '/app/mysql/galera/' --defaults-file '/app/mysql/config/galera.cnf' --parent '8214' --binlog '/app/mysql/galera/db-node1-binlog' '
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Prepared SST request: rsync|10.119.17.237:4444/rsync_sst
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: REPL Protocols: 7 (3, 2)
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Assign initial position for certification: 230930977, protocol version: 3
            2017-11-03 16:44:46 140504792139520 [Note] WSREP: Service thread queue flushed.
            2017-11-03 16:44:46 140515631684352 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (b1c4fe17-b433-11e7-bd44-52c2748b6170): 1 (Operation not permitted)
            at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
            2017-11-03 16:44:46 140504733837056 [Note] WSREP: Member 3.0 (10-119-17-237.domain.com) requested state transfer from '*any*'. Selected 0.0 (10-119-17-69.domain.com)(SYNCED) as donor.
            2017-11-03 16:44:46 140504733837056 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 230930977)
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: Requesting state transfer: success, donor: 0
            2017-11-03 16:44:46 140515631684352 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977)
            2017-11-03 16:44:49 140504742229760 [Note] WSREP: (d3790118, 'tcp://0.0.0.0:4567') connection to peer d3790118 with addr tcp://10.119.17.237:4567 timed out, no messages seen in PT3S
            2017-11-03 16:44:49 140504742229760 [Note] WSREP: (d3790118, 'tcp://0.0.0.0:4567') turning message relay requesting off
            2017-11-03 17:06:34 140504733837056 [Note] WSREP: 0.0 (10-119-17-69.domain.com): State transfer to 3.0 (10-119-17-237.domain.com) complete.
            WSREP_SST: [INFO] Extracting binlog files: (20171103 17:06:34.145)
            db-g3-bin.000042
            ls: cannot access db-node1-binlog.*: No such file or directory
            WSREP_SST: [INFO] Joiner cleanup. rsync PID: 8293 (20171103 17:06:34.156)
            WSREP_SST: [INFO] Joiner cleanup done. (20171103 17:06:34.662)
            2017-11-03 17:06:34 140515632003328 [Note] WSREP: SST complete, seqno: 230930977
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using mutexes to ref count buffer pool pages
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: The InnoDB memory heap is disabled
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Compressed tables use zlib 1.2.7
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using Linux native AIO
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Using SSE crc32 instructions
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Initializing buffer pool, size = 512.0M
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Completed initialization of buffer pool
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Highest supported file format is Barracuda.
            2017-11-03 17:06:34 140515632003328 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1244892544453
            2017-11-03 17:06:36 140504733837056 [Note] WSREP: Member 0.0 (10-119-17-69.domain.com) synced with group.
            2017-11-03 17:06:42 140515632003328 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
            2017-11-03 17:06:49 140515632003328 [Note] InnoDB: To recover: 2702 pages from log
            2017-11-03 17:06:53 140515632003328 [Note] InnoDB: Starting final batch to recover 2585 pages from redo log
            InnoDB: Last MySQL binlog file position 0 786976686, file name /app/mysql/galera/db-g3-bin.000041
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: 128 rollback segment(s) are active.
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: Waiting for purge to start
            2017-11-03 17:07:15 140515632003328 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.36-82.2 started; log sequence number 1244894457235
            2017-11-03 17:07:23 140515632003328 [Note] Plugin 'FEEDBACK' is disabled.
            2017-11-03 17:07:23 140503555213056 [Note] InnoDB: Dumping buffer pool(s) not yet started
            2017-11-03 17:07:23 140515632003328 [Note] Server socket created on IP: '10.119.17.237'.

            Status information:

            Current dir: /app/mysql/galera/
            Running threads: 2 Stack size: 295936
            Current locks:
            lock: 0x7fc98da48788:

            lock: 0x7fc9cdc6a508:

            lock: 0x7fcc598b5520:

            lock: 0x7fcc598b5440:

            lock: 0x7fcc598b4480:

            lock: 0x7fcc598b43a0:

            lock: 0x7fcc598b42c0:

            lock: 0x7fcc598b4640:

            lock: 0x7fcc598b2880:

            lock: 0x7fcc598b5360:

            lock: 0x7fcc598b2f80:

            lock: 0x7fcc598b3220:

            lock: 0x7fcc598b2ea0:

            lock: 0x7fcc598b3140:

            lock: 0x7fcc598b2dc0:

            lock: 0x7fcc598b3060:

            lock: 0x7fcc598b35a0:

            lock: 0x7fcc598b3680:

            lock: 0x7fcc598b3760:

            lock: 0x7fcc598b2ce0:

            lock: 0x7fcc598b2a40:

            lock: 0x7fcc598b2c00:

            lock: 0x7fcc598b2960:

            lock: 0x7fcc598b2b20:

            lock: 0x7fcc598b3300:

            lock: 0x7fcc598b33e0:

            lock: 0x7fcc598b34c0:

            lock: 0x7fcc598b4fe0:

            lock: 0x7fcc598b5280:

            lock: 0x7fcc598b51a0:

            lock: 0x7fcc598b50c0:

            lock: 0x7fcc598b4c60:

            lock: 0x7fcc598b4b80:

            lock: 0x7fcc598b4aa0:

            lock: 0x7fcc598b49c0:

            lock: 0x7fcc598b48e0:

            lock: 0x7fcc598b4e20:

            lock: 0x7fcc598b4800:

            lock: 0x7fcc598b4720:

            lock: 0x7fcc598b4f00:

            lock: 0x7fcc598b4560:

            lock: 0x7fcc598b4100:

            lock: 0x7fcc598b41e0:

            lock: 0x7fcc598b4020:

            lock: 0x7fcc598b3f40:

            lock: 0x7fcc598b3bc0:

            lock: 0x7fcc598b3e60:

            lock: 0x7fcc598b3d80:

            lock: 0x7fcc598b3ae0:

            lock: 0x7fcc598b3ca0:

            lock: 0x7fcc598b3840:

            lock: 0x7fcc598b3920:

            lock: 0x7fcc598b3a00:

            lock: 0x7fcc598b4d40:


            Key caches:
            default
            Buffer_size: 134217728
            Block_size: 1024
            Division_limit: 100
            Age_threshold: 300
            Partitions: 0
            blocks used: 0
            not flushed: 0
            w_requests: 0
            writes: 0
            r_requests: 0
            reads: 0


            handler status:
            read_key: 0
            read_next: 0
            read_rnd 0
            read_first: 0
            write: 0
            delete 0
            update: 0

            Table status:
            Opened tables: 2
            Open tables: 2
            Open files: 8
            Open streams: 0

            Alarm status:
            Active alarms: 0
            Max used alarms: 0
            Next alarm time: 0

            Memory status:
            Non-mmapped space allocated from system: 0
            Number of free chunks: 1
            Number of fastbin blocks: 0
            Number of mmapped regions: 0
            Space in mmapped regions: 0
            Maximum total allocated space: 0
            Space available in freed fastbin blocks: 0
            Total allocated space: 0
            Total free space: 0
            Top-most, releasable space: 0
            Estimated memory (with thread stack): 591872



            Events status:
            LLA = Last Locked At LUA = Last Unlocked At
            WOC = Waiting On Condition DL = Data Locked
            The Event Scheduler is disabled

            2017-11-03 17:07:23 140515632003328 [Note] WSREP: Signalling provider to continue.
            2017-11-03 17:07:23 140515632003328 [Note] WSREP: SST received: b1c4fe17-b433-11e7-bd44-52c2748b6170:230930977
            2017-11-03 17:07:23 140515630676736 [Note] Event Scheduler: scheduler thread started with id 4
            171103 17:07:23 [ERROR] mysqld got signal 11 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.

            To report this bug, see https://mariadb.com/kb/en/reporting-bugs

            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed,
            something is definitely wrong and this may fail.

            Server version: 10.1.28-MariaDB
            key_buffer_size=134217728
            read_buffer_size=131072
            max_used_connections=0
            max_threads=1002
            thread_count=10
            It is possible that mysqld could use up to
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2332069 K bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.

            Thread pointer: 0x0
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x0 thread_stack 0x48400
            mysqld(my_print_stacktrace+0x2e)[0x7fcc5902d92e]
            2017-11-03 17:07:23 140515632003328 [Note] Reading of all Master_info entries succeded
            2017-11-03 17:07:23 140515632003328 [Note] Added new Master_info '' to hash table
            2017-11-03 17:07:23 140515632003328 [Note] mysqld: ready for connections.
            Version: '10.1.28-MariaDB' socket: '/app/mysql/temporaries/galera/mysqld.sock' port: 3306 MariaDB Server
            2017-11-03 17:07:23 140504733837056 [Note] WSREP: 3.0 (10-119-17-237.domain.com): State transfer from 0.0 (10-119-17-69.domain.com) complete.
            2017-11-03 17:07:23 140504733837056 [Note] WSREP: Shifting JOINER -> JOINED (TO: 230971731)
            mysqld(handle_fatal_signal+0x305)[0x7fcc58b50ce5]
            /lib64/libpthread.so.0(+0xf370)[0x7fcc58161370]
            addr2line: 'mysqld': No such file
            mysqld(my_hash_element+0x13)[0x7fcc5900c023]
            mysqld(_ZN17Master_info_index20flush_all_relay_logsEv+0x93)[0x7fcc58a98c83]
            mysqld(_Z20reload_acl_and_cacheP3THDyP10TABLE_LISTPi+0x5d2)[0x7fcc58ab8672]
            mysqld(signal_hand+0x382)[0x7fcc5892c812]
            /lib64/libpthread.so.0(+0x7dc5)[0x7fcc58159dc5]
            /lib64/libc.so.6(clone+0x6d)[0x7fcc5657976d]
            The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            information that should help you find out what is causing the crash.
            {noformat}

            I'm not sure if it could provide any hint as to the cause of the crash, but the donor's error log contained a lot of messages about long semaphore waits involving log_sys->checkpoint_lock during the time period in which it acted as donor:

            {noformat}
            InnoDB: ###### Diagnostic info printed to the standard error stream
            WSREP: avoiding InnoDB self crash due to long semaphore wait of > 600 seconds
            Server is processing SST donor operation, fatal_cnt now: 0InnoDB: Warning: a long semaphore wait:
            --Thread 140439973771008 has waited at log0log.cc line 2352 for 1265.0 seconds the semaphore:
            S-lock on RW-latch at 0x7fbd82c5d3b8 '&log_sys->checkpoint_lock'
            a writer (thread id 140439973771008) has reserved it in mode exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time read locked in file log0log.cc line 2352
            Last time write locked in file log0log.cc line 2109
            Holder thread 0 file not yet reserved line 0
            InnoDB: Warning: semaphore wait:
            --Thread 140439973771008 has waited at log0log.cc line 2352 for 1265.0 seconds the semaphore:
            S-lock on RW-latch at 0x7fbd82c5d3b8 '&log_sys->checkpoint_lock'
            a writer (thread id 140439973771008) has reserved it in mode exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time read locked in file log0log.cc line 2352
            Last time write locked in file log0log.cc line 2109
            Holder thread 0 file not yet reserved line 0
            InnoDB: Warning: Writer thread is waiting this semaphore:
            --Thread 140439973771008 has waited at log0log.cc line 2352 for 1265.0 seconds the semaphore:
            S-lock on RW-latch at 0x7fbd82c5d3b8 '&log_sys->checkpoint_lock'
            a writer (thread id 140439973771008) has reserved it in mode exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time read locked in file log0log.cc line 2352
            Last time write locked in file log0log.cc line 2109
            Holder thread 0 file not yet reserved line 0
            {noformat}

            In case it is relevant, the donor node was a replication master, but not a slave.

            After the first SST led to a crash, another SST was attempted, and that one was successful without a crash.
            anikitin Andrii Nikitin (Inactive) added a comment - - edited

            Geoff, I am not sure if checkoint on master is related, but the crash happened during something like kill -HUP $(pidof mysqld) (which follows something like mysqladmin debug) on problem node.
            So, the most we can do is to try find out if user was using some custom commands/scripts during/after sst and try to write test case which replicates that behavior.

            anikitin Andrii Nikitin (Inactive) added a comment - - edited Geoff, I am not sure if checkoint on master is related, but the crash happened during something like kill -HUP $(pidof mysqld) (which follows something like mysqladmin debug ) on problem node. So, the most we can do is to try find out if user was using some custom commands/scripts during/after sst and try to write test case which replicates that behavior.
            anikitin Andrii Nikitin (Inactive) made changes -
            Fix Version/s 10.1 [ 16100 ]

            Thanks, Andrii. It sounds like you don't think that this was a crash due to a bug, and instead, some user or other process actually explicitly killed mysqld on the joiner. Do I understand you correctly? If so, do you think the kill could have occurred due to a systemd timeout (i.e. the problem described by MDEV-9202)?

            GeoffMontee Geoff Montee (Inactive) added a comment - Thanks, Andrii. It sounds like you don't think that this was a crash due to a bug, and instead, some user or other process actually explicitly killed mysqld on the joiner. Do I understand you correctly? If so, do you think the kill could have occurred due to a systemd timeout (i.e. the problem described by MDEV-9202 )?

            Any crash is a bug, just it looks that conditions which caused this particular crash are linked to user actions. (Small chance exists that it was sst script which generated that `kill` command , which is reflected in `signal_hand` call in the problem stack trace, but I doubt it). kill -HUP is not dangerous for uptime, or at least shouldn't be.

            anikitin Andrii Nikitin (Inactive) added a comment - Any crash is a bug, just it looks that conditions which caused this particular crash are linked to user actions. (Small chance exists that it was sst script which generated that `kill` command , which is reflected in `signal_hand` call in the problem stack trace, but I doubt it). kill -HUP is not dangerous for uptime, or at least shouldn't be.
            GeoffMontee Geoff Montee (Inactive) added a comment - - edited

            Ah, I see what you mean. You are saying that it looks like the server received a SIGHUP right before it crashed. SIGHUP causes the server to reload privilege tables and flush its logs, which shouldn't cause a problem for a server under normal scenarios, but maybe it crashed the server in this case because the server was not completely initialized yet, as it had just finished receiving its SST.

            It doesn't look like systemd would be responsible for the SIGHUP, because systemd only sends SIGHUP if SendSIGHUP is enabled. That is not enabled in MariaDB's unit file, and it defaults to "no", so it should be disabled for MariaDB.

            I'll ask the user who saw this crash if they know why the server received a SIGHUP in this case.

            GeoffMontee Geoff Montee (Inactive) added a comment - - edited Ah, I see what you mean. You are saying that it looks like the server received a SIGHUP right before it crashed. SIGHUP causes the server to reload privilege tables and flush its logs, which shouldn't cause a problem for a server under normal scenarios, but maybe it crashed the server in this case because the server was not completely initialized yet, as it had just finished receiving its SST. It doesn't look like systemd would be responsible for the SIGHUP, because systemd only sends SIGHUP if SendSIGHUP is enabled. That is not enabled in MariaDB's unit file, and it defaults to "no", so it should be disabled for MariaDB. I'll ask the user who saw this crash if they know why the server received a SIGHUP in this case.
            GeoffMontee Geoff Montee (Inactive) made changes -
            Summary Galera joiner node crashes in my_hash_element after rsync SST completes SIGHUP after rsync SST causes crashe in my_hash_element
            GeoffMontee Geoff Montee (Inactive) added a comment - - edited

            The SIGHUP seems to be caused by Jenkins and/or Ansible somehow.

            GeoffMontee Geoff Montee (Inactive) added a comment - - edited The SIGHUP seems to be caused by Jenkins and/or Ansible somehow.
            GeoffMontee Geoff Montee (Inactive) made changes -
            Summary SIGHUP after rsync SST causes crashe in my_hash_element SIGHUP after rsync SST causes crash in my_hash_element
            serg Sergei Golubchik made changes -
            Assignee Andrii Nikitin [ anikitin ]

            So, while it is a bug, it is caused by a user sending SIGHUP. If someone has enough privileges to send a SIGHUP, one can send a SIGKILL just as well. So the bug is that a user, who can crash mysqld, crashes mysqld. Thus, I believe, it can be downgraded to Minor.

            serg Sergei Golubchik added a comment - So, while it is a bug, it is caused by a user sending SIGHUP. If someone has enough privileges to send a SIGHUP, one can send a SIGKILL just as well. So the bug is that a user, who can crash mysqld, crashes mysqld. Thus, I believe, it can be downgraded to Minor.
            serg Sergei Golubchik made changes -
            Priority Major [ 3 ] Minor [ 4 ]

            Setting to Confirmed based on comments above

            elenst Elena Stepanova added a comment - Setting to Confirmed based on comments above
            elenst Elena Stepanova made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]
            GeoffMontee Geoff Montee (Inactive) made changes -
            jplindst Jan Lindström (Inactive) made changes -
            Assignee Andrei Elkin [ elkin ]

            Elkin Can you have a look as stack seems to contain replication related functions.

            jplindst Jan Lindström (Inactive) added a comment - Elkin Can you have a look as stack seems to contain replication related functions.
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 83622 ] MariaDB v4 [ 143985 ]

            10.1 is EOL.

            janlindstrom Jan Lindström added a comment - 10.1 is EOL.
            janlindstrom Jan Lindström made changes -
            Fix Version/s N/A [ 14700 ]
            Fix Version/s 10.1 [ 16100 ]
            Resolution Won't Fix [ 2 ]
            Status Confirmed [ 10101 ] Closed [ 6 ]
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 171572

            People

              Elkin Andrei Elkin
              GeoffMontee Geoff Montee (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.