[MDEV-8353] STOP SLAVE may hang when replication is in inconsistent state Created: 2015-06-22  Updated: 2019-04-06

Status: Open
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 5.5.42
Fix Version/s: 5.5

Type: Bug Priority: Major
Reporter: Guillaume Lefranc Assignee: Andrei Elkin
Resolution: Unresolved Votes: 0
Labels: None


 Description   

Replication on this slave failed a long time ago, but for some reason the slave is now completely impossible to stop. Here is the output of SHOW SLAVE STATUS:

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: (redacted)
                  Master_User: replicate
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.031549
          Read_Master_Log_Pos: 350895547
               Relay_Log_File: mysql-relay-bin.036016
                Relay_Log_Pos: 350557624
        Relay_Master_Log_File: mysql-bin.031549
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 350557340
              Relay_Log_Space: 350896811
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 42343306
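
The state shown above (IO thread stopped on an error while the SQL thread still reports running) can be detected mechanically. The following is a minimal sketch, not part of the original report: it parses the vertical SHOW SLAVE STATUS text into a dictionary and flags that combination. The function names parse_vertical_status and is_half_stopped are hypothetical, and the check only reflects the fields visible in this report, not any server-side definition of an inconsistent state.

```python
def parse_vertical_status(text):
    """Parse vertical SHOW SLAVE STATUS output into a dict of field -> value."""
    status = {}
    for line in text.splitlines():
        # Skip the row separator and any line that is not "key: value".
        if line.lstrip().startswith("*") or ":" not in line:
            continue
        # Split on the first colon only, so error messages keep their colons.
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def is_half_stopped(status):
    """Hypothetical check: IO thread stopped on an error, SQL thread still running."""
    return (
        status.get("Slave_IO_Running") == "No"
        and status.get("Slave_SQL_Running") == "Yes"
        and status.get("Last_IO_Errno", "0") != "0"
    )


# Excerpt of the status output from this report.
SAMPLE = """\
*************************** 1. row ***************************
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
"""

status = parse_vertical_status(SAMPLE)
print(is_half_stopped(status))  # prints True for the excerpt above
```

A check like this only describes the symptom; it does not explain why STOP SLAVE hangs.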

STOP SLAVE hangs, and the following warning appears in the error log:

150622 14:28:23 [Warning] Slave SQL: Request to stop slave SQL Thread received while applying a group that has non-transactional changes; waiting for completion of the group ... , Error_code : 0

Issuing STOP SLAVE causes threads in KILL state to appear, and because the slave never stops, it is necessary to SIGKILL mysqld. Unfortunately, skip-slave-start seems to be ignored: the IO thread is not running, but mysqld considers that the slave is still running, and I can't reset it (even though I could probably just remove the master.info file and be done with it).



 Comments   
Comment by Elena Stepanova [ 2019-04-06 ]

Elkin, this is a very old entry from the depths of our backlog. Unless you see anything there worth digging into, please feel free to close it with any status of your choosing.
