Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-8311

Replication slave is stuck without any error

    XMLWordPrintable

    Details

      Description

      When execute show slave status, It seems every thing is OK, but actually slave's sql thread was stuck! The Exec_Master_Log_Pos(mysql-bin.000160 2140880879) didn't change for several days.

      MariaDB [(none)]> show slave status\G;
      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: 192.168.1.113
                        Master_User: repl
                        Master_Port: 5518
                      Connect_Retry: 60
                    Master_Log_File: mysql-bin.000282
                Read_Master_Log_Pos: 531076432
                     Relay_Log_File: relay-bin.000002
                      Relay_Log_Pos: 691
              Relay_Master_Log_File: mysql-bin.000160
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB: 
                Replicate_Ignore_DB: 
                 Replicate_Do_Table: 
             Replicate_Ignore_Table: 
            Replicate_Wild_Do_Table: 
        Replicate_Wild_Ignore_Table: 
                         Last_Errno: 0
                         Last_Error: 
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 2140880879
                    Relay_Log_Space: 124435203348
                    Until_Condition: None
                     Until_Log_File: 
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File: 
                 Master_SSL_CA_Path: 
                    Master_SSL_Cert: 
                  Master_SSL_Cipher: 
                     Master_SSL_Key: 
              Seconds_Behind_Master: 283562
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error: 
                     Last_SQL_Errno: 0
                     Last_SQL_Error: 
        Replicate_Ignore_Server_Ids: 
                   Master_Server_Id: 1895939662
                     Master_SSL_Crl: 
                 Master_SSL_Crlpath: 
                         Using_Gtid: Current_Pos
                        Gtid_IO_Pos: 0-1895939662-13106380
      1 row in set (0.00 sec)

      I use parallel replication, and set SLAVE_PARALLEL_THREADS=16, I found one work thread is always 'Waiting for prior transaction to start commit before starting next transaction', other 15 work threads is always 'Waiting for work from SQL thread',and show engine innodb status, there's no locks.
      Then check error log, find a restart in log 'mysql-bin.000160' at position 2140880879 :

      150606 10:32:34 [Note] /home/mariadb/bin/mysqld: Normal shutdown
      recording                                                                                                                                  60385,1       86%
      150605 18:42:13 [Warning] IP address '192.168.1.117' could not be resolved: Temporary failure in name resolution
      150606 10:32:34 [Note] /home/mariadb/bin/mysqld: Normal shutdown
       
      150606 10:32:34 [Note] Event Scheduler: Purging the queue. 0 events
      150606 10:36:51 [Note] Slave: received end packet from server, apparent master shutdown:
      150606 10:36:51 [Note] Slave I/O thread killed while reading event
      150606 10:36:51 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000260', position 353423150; GTID position 0-1895939662-1585424
      150606 10:44:24 [Note] InnoDB: Using mutexes to ref count buffer pool pages
      150606 10:44:24 [Note] InnoDB: The InnoDB memory heap is disabled
      150606 10:44:24 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      150606 10:44:24 [Note] InnoDB: Compressed tables use zlib 1.2.3
      150606 10:44:24 [Note] InnoDB: Using Linux native AIO
      150606 10:44:24 [Note] InnoDB: Using CPU crc32 instructions
      150606 10:44:24 [Note] InnoDB: Initializing buffer pool, size = 78.1G
      150606 10:44:27 [Note] InnoDB: Completed initialization of buffer pool
      150606 10:44:28 [Note] InnoDB: Highest supported file format is Barracuda.
      150606 10:44:28 [Note] InnoDB: The log sequence numbers 70561972035 and 70561972035 in ibdata files do not match the log sequence number 121831407098 in the ib_logfiles!
      150606 10:44:28 [Note] InnoDB: Database was not shutdown normally!
      150606 10:44:28 [Note] InnoDB: Starting crash recovery.
      150606 10:44:28 [Note] InnoDB: Reading tablespace information from the .ibd files...
      150606 10:44:28 [Note] InnoDB: Restoring possible half-written data pages
      150606 10:44:28 [Note] InnoDB: from the doublewrite buffer...
      InnoDB: 1 transaction(s) which must be rolled back or cleaned up
      InnoDB: in total 4554 row operations to undo
      InnoDB: Trx id counter is 28084736
      InnoDB: Last MySQL binlog file position 0 708607516, file name ../binlog/mysql-bin.000179
      150606 10:44:29 [Note] InnoDB: 128 rollback segment(s) are active.
      InnoDB: Starting in background the rollback of uncommitted transactions
      2015-06-06 10:44:29 7f1c2d7ff700  InnoDB: Rolling back trx with id 28084351, 4554 rows to undo
       
      InnoDB: Progress in percents: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100150606 10:44:30 [Note] InnoDB: Rollback of trx with id 28084351 completed
      2015-06-06 10:44:30 7f1c2d7ff700  InnoDB: Rollback of non-prepared transactions completed
      150606 10:44:34 [Note] InnoDB: Waiting for purge to start
      150606 10:44:34 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.17-65.0 started; log sequence number 121831407098
      150606 10:44:34 [Note] Semi-sync replication initialized for transactions.
      150606 10:44:34 [Note] Semi-sync replication enabled on the master.
      150606 10:44:34 server_audit: MariaDB Audit Plugin version 1.1.7 STARTED.
      150606 10:44:34 [Note] Recovering after a crash using ../binlog/mysql-bin
      150606 10:44:34 [Note] Starting crash recovery...
      150606 10:44:34 [Note] Crash recovery finished.
      150606 10:44:34 [Note] Server socket created on IP: '192.168.1.114'.
      150606 10:44:34 [Note] Event Scheduler: Loaded 0 events
      150606 10:44:46 [Note] /home/mariadb/bin/mysqld: Normal shutdown
       
      150606 10:44:50 [Note] /home/mariadb/bin/mysqld: ready for connections.
      Version: '10.0.12-MariaDB-log'  socket: '/home/mariadb/bin/mysql1.sock'  port: 5518  Source distribution
      150606 10:44:50 [Note] Event Scheduler: Purging the queue. 0 events
      150606 10:44:50 [Note] Slave I/O thread: Start semi-sync replication to master 'repl@192.168.1.113:5518' in log 'mysql-bin.000160' at position 2140880879
      150606 10:44:50 [Note] Slave I/O thread killed while connecting to master
      150606 10:44:50 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000160', position 2140880879; GTID position 0-1895939662-1128069
      150606 10:44:50 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000160' at position 2140880879, relay log '../relaylog/relay-bin.000001' position: 4; GTID position '0-1895939662-1128069'
      150606 10:44:50 [Note] Slave SQL thread exiting, replication stopped in log 'mysql-bin.000160' at position 2140880879; GTID position '0-1895939662-1128069'
      150606 10:44:50 server_audit: STOPPED
      150606 10:44:50 [Note] unregister_replicator OK
      150606 10:44:50 [Note] InnoDB: FTS optimize thread exiting.
      150606 10:44:50 [Note] InnoDB: Starting shutdown...
      150606 10:44:55 [Note] InnoDB: Shutdown completed; log sequence number 121833466167
      150606 10:44:55 [Note] /home/mariadb/bin/mysqld: Shutdown complete
       
      150606 10:45:12 [Note] InnoDB: Using mutexes to ref count buffer pool pages
      150606 10:45:12 [Note] InnoDB: The InnoDB memory heap is disabled
      150606 10:45:12 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      150606 10:45:12 [Note] InnoDB: Compressed tables use zlib 1.2.3
      150606 10:45:12 [Note] InnoDB: Using Linux native AIO
      150606 10:45:12 [Note] InnoDB: Using CPU crc32 instructions
      150606 10:45:12 [Note] InnoDB: Initializing buffer pool, size = 78.1G
      150606 10:45:15 [Note] InnoDB: Completed initialization of buffer pool
      150606 10:45:15 [Note] InnoDB: Highest supported file format is Barracuda.
      150606 10:45:16 [Note] InnoDB: 128 rollback segment(s) are active.
      150606 10:45:16 [Note] InnoDB: Waiting for purge to start
      150606 10:45:16 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.17-65.0 started; log sequence number 121833466167
      150606 10:45:16 [Note] Semi-sync replication initialized for transactions.
      150606 10:45:16 [Note] Semi-sync replication enabled on the master.
      150606 10:45:16 server_audit: MariaDB Audit Plugin version 1.1.7 STARTED.
      150606 10:45:16 [Note] Server socket created on IP: '192.168.1.114'.
      150606 10:45:16 [Note] Event Scheduler: Loaded 0 events
      150606 10:45:16 [Note] /home/mariadb/bin/mysqld: ready for connections.
      Version: '10.0.12-MariaDB-log'  socket: '/home/mariadb/bin/mysql1.sock'  port: 5518  Source distribution
      150606 10:45:16 [Note] Slave I/O thread: Start semi-sync replication to master 'repl@192.168.1.113:5518' in log 'mysql-bin.000160' at position 2140880879
      150606 10:45:16 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000160' at position 2140880879, relay log '../relaylog/relay-bin.000001' position: 4; GTID position '0-1895939662-1128069'
      150606 10:45:16 [Note] Slave I/O thread: connected to master 'repl@192.168.1.113:5518',replication starts at GTID position '0-1895939662-1128069'

        Attachments

          Activity

            People

            Assignee:
            Unassigned Unassigned
            Reporter:
            linzhoukai Zhoukai Lin
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Dates

              Created:
              Updated:
              Resolved:

                Git Integration