  1. MariaDB Server
  2. MDEV-31572

STOP SLAVE hangs on 10.3.39

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 10.3.39
    • Fix Version/s: N/A
    • Component/s: Replication
    • Labels: None

    Description

      Hi,

      I am running the newest MariaDB 10.3. STOP SLAVE has been hanging for 5 minutes now. The server is not very busy.

      I am running multi-source replication from two primaries. Both primaries have been shut down.
      The primary of the default master connection (the one without a name) was shut down several months ago; the primary of the other connection was shut down more recently.

      Both replication streams have a REPLICATE_IGNORE_DB set.

      This was the state of the 2 connections before I issued the command:

       
      MariaDB [(none)]> show all slaves status\G
      *************************** 1. row ***************************
                     Connection_name: 
                     Slave_SQL_State: Waiting for the next event in relay log
                      Slave_IO_State: Connecting to master
                         Master_Host: 10.10.A.B
                         Master_User: ansi_repl
                         Master_Port: 3306
                       Connect_Retry: 60
                     Master_Log_File: mysql-bin.015523
                 Read_Master_Log_Pos: 882782146
                      Relay_Log_File: prod-d1-mariadb-01-relay-bin.004518
                       Relay_Log_Pos: 938754441
               Relay_Master_Log_File: mysql-bin.015457
                    Slave_IO_Running: Connecting
                   Slave_SQL_Running: Yes
                     Replicate_Do_DB: 
                 Replicate_Ignore_DB: mysql,some_schema
                  Replicate_Do_Table: 
              Replicate_Ignore_Table: 
             Replicate_Wild_Do_Table: 
         Replicate_Wild_Ignore_Table: 
                          Last_Errno: 0
                          Last_Error: 
                        Skip_Counter: 0
                 Exec_Master_Log_Pos: 938754146
                     Relay_Log_Space: 71381577192
                     Until_Condition: None
                      Until_Log_File: 
                       Until_Log_Pos: 0
                  Master_SSL_Allowed: No
                  Master_SSL_CA_File: 
                  Master_SSL_CA_Path: 
                     Master_SSL_Cert: 
                   Master_SSL_Cipher: 
                      Master_SSL_Key: 
               Seconds_Behind_Master: NULL
      Master_SSL_Verify_Server_Cert: No
                       Last_IO_Errno: 2003
                       Last_IO_Error: error connecting to master 'ansi_repl@10.10.A.B.:3306' - retry-time: 60  maximum-retries: 86400  message: Can't connect to MySQL server on '10.10.A.B.' (113 "No route to host")
                      Last_SQL_Errno: 0
                      Last_SQL_Error: 
         Replicate_Ignore_Server_Ids: 
                    Master_Server_Id: 0
                      Master_SSL_Crl: 
                  Master_SSL_Crlpath: 
                          Using_Gtid: No
                         Gtid_IO_Pos: 
             Replicate_Do_Domain_Ids: 
         Replicate_Ignore_Domain_Ids: 
                       Parallel_Mode: conservative
                           SQL_Delay: 0
                 SQL_Remaining_Delay: NULL
             Slave_SQL_Running_State: Waiting for the next event in relay log
                    Slave_DDL_Groups: 0
      Slave_Non_Transactional_Groups: 0
          Slave_Transactional_Groups: 0
                Retried_transactions: 0
                  Max_relay_log_size: 1073741824
                Executed_log_entries: 0
           Slave_received_heartbeats: 0
              Slave_heartbeat_period: 30.000
                      Gtid_Slave_Pos: 0-1245586661-56708316
      *************************** 2. row ***************************
                     Connection_name: cluster_migration
                     Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
                      Slave_IO_State: Connecting to master
                         Master_Host: 10.10.C.D
                         Master_User: cluster_migr_repl
                         Master_Port: 3306
                       Connect_Retry: 60
                     Master_Log_File: mysql-cluster-01-bin.000044
                 Read_Master_Log_Pos: 723615742
                      Relay_Log_File: prod-d1-mariadb-01-relay-bin-cluster_migration.000050
                       Relay_Log_Pos: 4
               Relay_Master_Log_File: mysql-cluster-01-bin.000044
                    Slave_IO_Running: Connecting
                   Slave_SQL_Running: Yes
                     Replicate_Do_DB: 
                 Replicate_Ignore_DB: mysql
                  Replicate_Do_Table: 
              Replicate_Ignore_Table: 
             Replicate_Wild_Do_Table: 
         Replicate_Wild_Ignore_Table: 
                          Last_Errno: 0
                          Last_Error: 
                        Skip_Counter: 0
                 Exec_Master_Log_Pos: 723615742
                     Relay_Log_Space: 256
                     Until_Condition: None
                      Until_Log_File: 
                       Until_Log_Pos: 0
                  Master_SSL_Allowed: No
                  Master_SSL_CA_File: 
                  Master_SSL_CA_Path: 
                     Master_SSL_Cert: 
                   Master_SSL_Cipher: 
                      Master_SSL_Key: 
               Seconds_Behind_Master: NULL
      Master_SSL_Verify_Server_Cert: No
                       Last_IO_Errno: 2003
                       Last_IO_Error: error connecting to master 'cluster_migr_repl@10.10.C.D:3306' - retry-time: 60  maximum-retries: 86400  message: Can't connect to MySQL server on '10.10.C.D' (113 "No route to host")
                      Last_SQL_Errno: 0
                      Last_SQL_Error: 
         Replicate_Ignore_Server_Ids: 
                    Master_Server_Id: 0
                      Master_SSL_Crl: 
                  Master_SSL_Crlpath: 
                          Using_Gtid: No
                         Gtid_IO_Pos: 
             Replicate_Do_Domain_Ids: 
         Replicate_Ignore_Domain_Ids: 
                       Parallel_Mode: conservative
                           SQL_Delay: 0
                 SQL_Remaining_Delay: NULL
             Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
                    Slave_DDL_Groups: 0
      Slave_Non_Transactional_Groups: 0
          Slave_Transactional_Groups: 0
                Retried_transactions: 0
                  Max_relay_log_size: 1073741824
                Executed_log_entries: 2
           Slave_received_heartbeats: 0
              Slave_heartbeat_period: 30.000
                      Gtid_Slave_Pos: 0-1245586661-56708316

      From the processlist I learned that one SQL thread was still running, with thread ID 363357. I do not know which of the two master connections it belonged to. I killed that thread and it disappeared, but this did not release the STOP SLAVE command.
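
      For reference, a minimal sketch of how the replication threads can be picked out of the processlist. It assumes the usual behaviour that replication threads run as 'system user'; the column selection is only illustrative, and this is not necessarily the exact query the reporter used.

      ```sql
      -- List the replication threads; they are owned by 'system user'.
      -- ID is the value that KILL takes (363357 in this report).
      SELECT ID, COMMAND, TIME, STATE
      FROM information_schema.PROCESSLIST
      WHERE USER = 'system user';
      ```

      The STATE column can be compared with Slave_SQL_State and Slave_IO_State in the status output to narrow down which connection a thread belongs to.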

      Perhaps the SQL thread had already stopped before I issued STOP SLAVE, and STOP SLAVE hangs because of that.

      I tried to stop the other master connection. This worked without issues, but it did not release the first STOP SLAVE command, which still has not returned.
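
      As an illustration of the per-connection commands involved in a multi-source setup, here is a sketch using the connection name cluster_migration from the status output. Which statements the reporter actually ran is an assumption, and nothing here is meant as a fix for the hang.

      ```sql
      -- Stop only the named connection (both its I/O and SQL threads).
      STOP SLAVE 'cluster_migration';

      -- Stop the default (unnamed) connection: select it, then use plain STOP SLAVE.
      SET @@default_master_connection = '';
      STOP SLAVE;

      -- Stop only one thread type of a connection.
      STOP SLAVE 'cluster_migration' IO_THREAD;

      -- Stop every connection at once.
      STOP ALL SLAVES;
      ```

      Stopping the I/O and SQL threads separately can help show which of the two a hanging STOP SLAVE is waiting on.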

      I believe the server must be restarted to release the hung STOP SLAVE, but I am leaving it running in case you want to gather more information from this system.

      Earlier, this system refused to shut down. Perhaps that has the same root cause, since shutting down MariaDB stops the replica connections as one of its steps.

            People

              Assignee: Unassigned
              Reporter: Michaël de Groot (michaeldg)