Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-17515

GTID Replication in optimistic mode deadlock

Details

    Description

      Hi,

      Setup is like this
      Relay(MySQL5.5-MIXED-log_slave_updates)
      ->Relay(MDB10.1.36 MIXED-log_slave_updates )
      -> Slave(MDB10.1.36 MIXED-log_slave_updates/optimistic)
      -> Slave(MDB10.1.36 MIXED-log_slave_updates/conservative)

      • We get deadlock on a slave in optimistic.
      • Other slaves in conservative do not deadlock.
      • Impossible to stop slave when deadlock

      The scenario that trigger such deadlock was some optimize tables on the first relay and mysql restart after optimize

      • the restart let relay master without events from master

       
       Slave_IO_Running: Connecting
                  Slave_SQL_Running: Yes
                      Last_IO_Errno: 2003
                      Last_IO_Error: error reconnecting to master 'replication@x.x.x.x:3306' - retry-time: 60  maximum-retries: 86400  message: Can't connect to MySQL server on 'x.x.x.x' (111 "Connection refused")
      

      The deadlock:

      +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
      | Id     | User         | Host                 | db          | Command | Time   | State                                         | Info                      | Progress |
      +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
      | 190042 | system user  |                      | NULL        | Connect | 196362 | Waiting for master to send event              | NULL                      |    0.000 |
      | 190043 | system user  |                      | NULL        | Connect | 196362 | Waiting for work from SQL thread              | NULL                      |    0.000 |
      | 190044 | system user  |                      | NULL        | Connect | 196362 | Waiting for work from SQL thread              | NULL                      |    0.000 |
      | 190045 | system user  |                      | NULL        | Connect | 196362 | Waiting for work from SQL thread              | NULL                      |    0.000 |
      | 190046 | system user  |                      | NULL        | Connect | 196362 | Waiting for work from SQL thread              | NULL                      |    0.000 |
      | 190047 | system user  |                      | NULL        | Connect | 196362 | Waiting for work from SQL thread              | NULL                      |    0.000 |
      | 190048 | system user  |                      | NULL        | Connect | 196362 | Waiting for work from SQL thread              | NULL                      |    0.000 |
      | 190049 | system user  |                      | tsce_unedic | Connect | 171856 | Waiting for table metadata lock               | OPTIMIZE TABLE `requetes` |    0.000 |
      | 190050 | system user  |                      | NULL        | Connect | 171856 | Waiting for prior transaction to commit       | NULL                      |    0.000 |
      | 190051 | system user  |                      | NULL        | Connect | 177872 | Waiting for room in worker thread event queue | NULL                      |    0.000 |
      | 307368 | root         | localhost            | NULL        | Killed  | 116880 | Killing slave                                 | stop slave                |    0.000 |
      | 434731 | root         | localhost            | NULL        | Killed  |  30480 | Killing slave                                 | stop slave                |    0.000 |
      | 474959 | root         | localhost            | NULL        | Killed  |   3167 | Killing slave                                 | stop slave                |    0.000 |
      | 479253 | root         | localhost            | NULL        | Query   |      0 | init                                          | show processlist          |    0.000 |
      | 479422 | proxysql     | ?:38626 | NULL        | Sleep   |      0 |                                               | NULL                      |    0.000 |
      | 479451 | proxysql     | ?:42168 | NULL        | Sleep   |      0 |                                               | NULL                      |    0.000 |
      | 479490 | netdata      | localhost            | NULL        | Sleep   |      0 |                                               | NULL                      |    0.000 |
      | 479518 | repl_manager | 192.168.185.44:33222 | NULL        | Sleep   |      1 |                                               | NULL                      |    0.000 |
      +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
      18 rows in set (0.00 sec)
      

      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: x.x.x.x
                        Master_User: repl
                        Master_Port: 3306
                      Connect_Retry: 60
                    Master_Log_File: mysql-bin.000035
                Read_Master_Log_Pos: 800365460
                     Relay_Log_File: relay-bin.000004
                      Relay_Log_Pos: 100745890
              Relay_Master_Log_File: mysql-bin.000032
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB: 
                Replicate_Ignore_DB: 
                 Replicate_Do_Table: 
             Replicate_Ignore_Table: 
            Replicate_Wild_Do_Table: 
        Replicate_Wild_Ignore_Table: 
                         Last_Errno: 0
                         Last_Error: 
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 100745602
                    Relay_Log_Space: 4809515065
                    Until_Condition: None
                     Until_Log_File: 
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File: 
                 Master_SSL_CA_Path: 
                    Master_SSL_Cert: 
                  Master_SSL_Cipher: 
                     Master_SSL_Key: 
              Seconds_Behind_Master: 0
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error: 
                     Last_SQL_Errno: 0
                     Last_SQL_Error: 
        Replicate_Ignore_Server_Ids: 
                   Master_Server_Id: 21
                     Master_SSL_Crl: 
                 Master_SSL_Crlpath: 
                         Using_Gtid: Slave_Pos
                        Gtid_IO_Pos: 0-21-28343538
            Replicate_Do_Domain_Ids: 
        Replicate_Ignore_Domain_Ids: 
                      Parallel_Mode: optimistic
      
      

      Attachments

        Issue Links

          Activity

            stephane@skysql.com VAROQUI Stephane created issue -
            stephane@skysql.com VAROQUI Stephane made changes -
            Field Original Value New Value
            Description Hi,

            Setup is like this
            Relay(MySQL5.5-MIXED-log_slave_updates)
                ->Relay(MDB10.1.36 MIXED-log_slave_updates )
                   -> Slave(MDB10.1.36 MIXED-log_slave_updates/optimistic)
                   -> Slave(MDB10.1.36 MIXED-log_slave_updates/conservative)

            - We get deadlock a slave in optimistic
            - Other slaves in conservative do not deadlock.

            The scenario that trigger such deadlock was some optimize tables on the first relay and mysql restart after optimize

            - the restart let no master for a long time so that
            {noformat}
             Slave_IO_Running: Connecting
                        Slave_SQL_Running: Yes
                            Last_IO_Errno: 2003
                            Last_IO_Error: error reconnecting to master 'replication@95.128.41.30:3306' - retry-time: 60 maximum-retries: 86400 message: Can't connect to MySQL server on '95.128.41.30' (111 "Connection refused")
             {format}

            The deadlock:
            {noformat}
            +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
            | Id | User | Host | db | Command | Time | State | Info | Progress |
            +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
            | 190042 | system user | | NULL | Connect | 196362 | Waiting for master to send event | NULL | 0.000 |
            | 190043 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190044 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190045 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190046 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190047 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190048 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190049 | system user | | tsce_unedic | Connect | 171856 | Waiting for table metadata lock | OPTIMIZE TABLE `requetes` | 0.000 |
            | 190050 | system user | | NULL | Connect | 171856 | Waiting for prior transaction to commit | NULL | 0.000 |
            | 190051 | system user | | NULL | Connect | 177872 | Waiting for room in worker thread event queue | NULL | 0.000 |
            | 307368 | root | localhost | NULL | Killed | 116880 | Killing slave | stop slave | 0.000 |
            | 434731 | root | localhost | NULL | Killed | 30480 | Killing slave | stop slave | 0.000 |
            | 474959 | root | localhost | NULL | Killed | 3167 | Killing slave | stop slave | 0.000 |
            | 479253 | root | localhost | NULL | Query | 0 | init | show processlist | 0.000 |
            | 479422 | proxysql | 192.168.185.47:38626 | NULL | Sleep | 0 | | NULL | 0.000 |
            | 479451 | proxysql | 192.168.185.48:42168 | NULL | Sleep | 0 | | NULL | 0.000 |
            | 479490 | netdata | localhost | NULL | Sleep | 0 | | NULL | 0.000 |
            | 479518 | repl_manager | 192.168.185.44:33222 | NULL | Sleep | 1 | | NULL | 0.000 |
            +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
            18 rows in set (0.00 sec)
            {noformat}

            {noformat}
            *************************** 1. row ***************************
                           Slave_IO_State: Waiting for master to send event
                              Master_Host: 192.168.185.40
                              Master_User: repl
                              Master_Port: 3306
                            Connect_Retry: 60
                          Master_Log_File: mysql-bin.000035
                      Read_Master_Log_Pos: 800365460
                           Relay_Log_File: relay-bin.000004
                            Relay_Log_Pos: 100745890
                    Relay_Master_Log_File: mysql-bin.000032
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                          Replicate_Do_DB:
                      Replicate_Ignore_DB:
                       Replicate_Do_Table:
                   Replicate_Ignore_Table:
                  Replicate_Wild_Do_Table:
              Replicate_Wild_Ignore_Table:
                               Last_Errno: 0
                               Last_Error:
                             Skip_Counter: 0
                      Exec_Master_Log_Pos: 100745602
                          Relay_Log_Space: 4809515065
                          Until_Condition: None
                           Until_Log_File:
                            Until_Log_Pos: 0
                       Master_SSL_Allowed: No
                       Master_SSL_CA_File:
                       Master_SSL_CA_Path:
                          Master_SSL_Cert:
                        Master_SSL_Cipher:
                           Master_SSL_Key:
                    Seconds_Behind_Master: 0
            Master_SSL_Verify_Server_Cert: No
                            Last_IO_Errno: 0
                            Last_IO_Error:
                           Last_SQL_Errno: 0
                           Last_SQL_Error:
              Replicate_Ignore_Server_Ids:
                         Master_Server_Id: 21
                           Master_SSL_Crl:
                       Master_SSL_Crlpath:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 0-21-28343538
                  Replicate_Do_Domain_Ids:
              Replicate_Ignore_Domain_Ids:
                            Parallel_Mode: optimistic

            {noformat}
            Hi,

            Setup is like this
            Relay(MySQL5.5-MIXED-log_slave_updates)
                ->Relay(MDB10.1.36 MIXED-log_slave_updates )
                   -> Slave(MDB10.1.36 MIXED-log_slave_updates/optimistic)
                   -> Slave(MDB10.1.36 MIXED-log_slave_updates/conservative)

            - We get deadlock a slave in optimistic.
            - Other slaves in conservative do not deadlock.
             - Impossible to stop slave when deadlock

            The scenario that trigger such deadlock was some optimize tables on the first relay and mysql restart after optimize

            - the restart let no master for a long time so that
            {noformat}
             Slave_IO_Running: Connecting
                        Slave_SQL_Running: Yes
                            Last_IO_Errno: 2003
                            Last_IO_Error: error reconnecting to master 'replication@95.128.41.30:3306' - retry-time: 60 maximum-retries: 86400 message: Can't connect to MySQL server on '95.128.41.30' (111 "Connection refused")
            {noformat}

            The deadlock:
            {noformat}
            +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
            | Id | User | Host | db | Command | Time | State | Info | Progress |
            +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
            | 190042 | system user | | NULL | Connect | 196362 | Waiting for master to send event | NULL | 0.000 |
            | 190043 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190044 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190045 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190046 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190047 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190048 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190049 | system user | | tsce_unedic | Connect | 171856 | Waiting for table metadata lock | OPTIMIZE TABLE `requetes` | 0.000 |
            | 190050 | system user | | NULL | Connect | 171856 | Waiting for prior transaction to commit | NULL | 0.000 |
            | 190051 | system user | | NULL | Connect | 177872 | Waiting for room in worker thread event queue | NULL | 0.000 |
            | 307368 | root | localhost | NULL | Killed | 116880 | Killing slave | stop slave | 0.000 |
            | 434731 | root | localhost | NULL | Killed | 30480 | Killing slave | stop slave | 0.000 |
            | 474959 | root | localhost | NULL | Killed | 3167 | Killing slave | stop slave | 0.000 |
            | 479253 | root | localhost | NULL | Query | 0 | init | show processlist | 0.000 |
            | 479422 | proxysql | 192.168.185.47:38626 | NULL | Sleep | 0 | | NULL | 0.000 |
            | 479451 | proxysql | 192.168.185.48:42168 | NULL | Sleep | 0 | | NULL | 0.000 |
            | 479490 | netdata | localhost | NULL | Sleep | 0 | | NULL | 0.000 |
            | 479518 | repl_manager | 192.168.185.44:33222 | NULL | Sleep | 1 | | NULL | 0.000 |
            +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
            18 rows in set (0.00 sec)
            {noformat}

            {noformat}
            *************************** 1. row ***************************
                           Slave_IO_State: Waiting for master to send event
                              Master_Host: 192.168.185.40
                              Master_User: repl
                              Master_Port: 3306
                            Connect_Retry: 60
                          Master_Log_File: mysql-bin.000035
                      Read_Master_Log_Pos: 800365460
                           Relay_Log_File: relay-bin.000004
                            Relay_Log_Pos: 100745890
                    Relay_Master_Log_File: mysql-bin.000032
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                          Replicate_Do_DB:
                      Replicate_Ignore_DB:
                       Replicate_Do_Table:
                   Replicate_Ignore_Table:
                  Replicate_Wild_Do_Table:
              Replicate_Wild_Ignore_Table:
                               Last_Errno: 0
                               Last_Error:
                             Skip_Counter: 0
                      Exec_Master_Log_Pos: 100745602
                          Relay_Log_Space: 4809515065
                          Until_Condition: None
                           Until_Log_File:
                            Until_Log_Pos: 0
                       Master_SSL_Allowed: No
                       Master_SSL_CA_File:
                       Master_SSL_CA_Path:
                          Master_SSL_Cert:
                        Master_SSL_Cipher:
                           Master_SSL_Key:
                    Seconds_Behind_Master: 0
            Master_SSL_Verify_Server_Cert: No
                            Last_IO_Errno: 0
                            Last_IO_Error:
                           Last_SQL_Errno: 0
                           Last_SQL_Error:
              Replicate_Ignore_Server_Ids:
                         Master_Server_Id: 21
                           Master_SSL_Crl:
                       Master_SSL_Crlpath:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 0-21-28343538
                  Replicate_Do_Domain_Ids:
              Replicate_Ignore_Domain_Ids:
                            Parallel_Mode: optimistic

            {noformat}
            stephane@skysql.com VAROQUI Stephane made changes -
            Description Hi,

            Setup is like this
            Relay(MySQL5.5-MIXED-log_slave_updates)
                ->Relay(MDB10.1.36 MIXED-log_slave_updates )
                   -> Slave(MDB10.1.36 MIXED-log_slave_updates/optimistic)
                   -> Slave(MDB10.1.36 MIXED-log_slave_updates/conservative)

            - We get deadlock a slave in optimistic.
            - Other slaves in conservative do not deadlock.
             - Impossible to stop slave when deadlock

            The scenario that trigger such deadlock was some optimize tables on the first relay and mysql restart after optimize

            - the restart let no master for a long time so that
            {noformat}
             Slave_IO_Running: Connecting
                        Slave_SQL_Running: Yes
                            Last_IO_Errno: 2003
                            Last_IO_Error: error reconnecting to master 'replication@95.128.41.30:3306' - retry-time: 60 maximum-retries: 86400 message: Can't connect to MySQL server on '95.128.41.30' (111 "Connection refused")
            {noformat}

            The deadlock:
            {noformat}
            +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
            | Id | User | Host | db | Command | Time | State | Info | Progress |
            +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
            | 190042 | system user | | NULL | Connect | 196362 | Waiting for master to send event | NULL | 0.000 |
            | 190043 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190044 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190045 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190046 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190047 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190048 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190049 | system user | | tsce_unedic | Connect | 171856 | Waiting for table metadata lock | OPTIMIZE TABLE `requetes` | 0.000 |
            | 190050 | system user | | NULL | Connect | 171856 | Waiting for prior transaction to commit | NULL | 0.000 |
            | 190051 | system user | | NULL | Connect | 177872 | Waiting for room in worker thread event queue | NULL | 0.000 |
            | 307368 | root | localhost | NULL | Killed | 116880 | Killing slave | stop slave | 0.000 |
            | 434731 | root | localhost | NULL | Killed | 30480 | Killing slave | stop slave | 0.000 |
            | 474959 | root | localhost | NULL | Killed | 3167 | Killing slave | stop slave | 0.000 |
            | 479253 | root | localhost | NULL | Query | 0 | init | show processlist | 0.000 |
            | 479422 | proxysql | 192.168.185.47:38626 | NULL | Sleep | 0 | | NULL | 0.000 |
            | 479451 | proxysql | 192.168.185.48:42168 | NULL | Sleep | 0 | | NULL | 0.000 |
            | 479490 | netdata | localhost | NULL | Sleep | 0 | | NULL | 0.000 |
            | 479518 | repl_manager | 192.168.185.44:33222 | NULL | Sleep | 1 | | NULL | 0.000 |
            +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
            18 rows in set (0.00 sec)
            {noformat}

            {noformat}
            *************************** 1. row ***************************
                           Slave_IO_State: Waiting for master to send event
                              Master_Host: 192.168.185.40
                              Master_User: repl
                              Master_Port: 3306
                            Connect_Retry: 60
                          Master_Log_File: mysql-bin.000035
                      Read_Master_Log_Pos: 800365460
                           Relay_Log_File: relay-bin.000004
                            Relay_Log_Pos: 100745890
                    Relay_Master_Log_File: mysql-bin.000032
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                          Replicate_Do_DB:
                      Replicate_Ignore_DB:
                       Replicate_Do_Table:
                   Replicate_Ignore_Table:
                  Replicate_Wild_Do_Table:
              Replicate_Wild_Ignore_Table:
                               Last_Errno: 0
                               Last_Error:
                             Skip_Counter: 0
                      Exec_Master_Log_Pos: 100745602
                          Relay_Log_Space: 4809515065
                          Until_Condition: None
                           Until_Log_File:
                            Until_Log_Pos: 0
                       Master_SSL_Allowed: No
                       Master_SSL_CA_File:
                       Master_SSL_CA_Path:
                          Master_SSL_Cert:
                        Master_SSL_Cipher:
                           Master_SSL_Key:
                    Seconds_Behind_Master: 0
            Master_SSL_Verify_Server_Cert: No
                            Last_IO_Errno: 0
                            Last_IO_Error:
                           Last_SQL_Errno: 0
                           Last_SQL_Error:
              Replicate_Ignore_Server_Ids:
                         Master_Server_Id: 21
                           Master_SSL_Crl:
                       Master_SSL_Crlpath:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 0-21-28343538
                  Replicate_Do_Domain_Ids:
              Replicate_Ignore_Domain_Ids:
                            Parallel_Mode: optimistic

            {noformat}
            Hi,

            Setup is like this
            Relay(MySQL5.5-MIXED-log_slave_updates)
                ->Relay(MDB10.1.36 MIXED-log_slave_updates )
                   -> Slave(MDB10.1.36 MIXED-log_slave_updates/optimistic)
                   -> Slave(MDB10.1.36 MIXED-log_slave_updates/conservative)

            - We get deadlock on a slave in optimistic.
            - Other slaves in conservative do not deadlock.
            - Impossible to stop slave when deadlock

            The scenario that trigger such deadlock was some optimize tables on the first relay and mysql restart after optimize

            - the restart let relay master without events from master

            {noformat}
             Slave_IO_Running: Connecting
                        Slave_SQL_Running: Yes
                            Last_IO_Errno: 2003
                            Last_IO_Error: error reconnecting to master 'replication@x.x.x.x:3306' - retry-time: 60 maximum-retries: 86400 message: Can't connect to MySQL server on 'x.x.x.x' (111 "Connection refused")
            {noformat}

            The deadlock:
            {noformat}
            +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
            | Id | User | Host | db | Command | Time | State | Info | Progress |
            +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
            | 190042 | system user | | NULL | Connect | 196362 | Waiting for master to send event | NULL | 0.000 |
            | 190043 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190044 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190045 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190046 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190047 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190048 | system user | | NULL | Connect | 196362 | Waiting for work from SQL thread | NULL | 0.000 |
            | 190049 | system user | | tsce_unedic | Connect | 171856 | Waiting for table metadata lock | OPTIMIZE TABLE `requetes` | 0.000 |
            | 190050 | system user | | NULL | Connect | 171856 | Waiting for prior transaction to commit | NULL | 0.000 |
            | 190051 | system user | | NULL | Connect | 177872 | Waiting for room in worker thread event queue | NULL | 0.000 |
            | 307368 | root | localhost | NULL | Killed | 116880 | Killing slave | stop slave | 0.000 |
            | 434731 | root | localhost | NULL | Killed | 30480 | Killing slave | stop slave | 0.000 |
            | 474959 | root | localhost | NULL | Killed | 3167 | Killing slave | stop slave | 0.000 |
            | 479253 | root | localhost | NULL | Query | 0 | init | show processlist | 0.000 |
            | 479422 | proxysql | ?:38626 | NULL | Sleep | 0 | | NULL | 0.000 |
            | 479451 | proxysql | ?:42168 | NULL | Sleep | 0 | | NULL | 0.000 |
            | 479490 | netdata | localhost | NULL | Sleep | 0 | | NULL | 0.000 |
            | 479518 | repl_manager | 192.168.185.44:33222 | NULL | Sleep | 1 | | NULL | 0.000 |
            +--------+--------------+----------------------+-------------+---------+--------+-----------------------------------------------+---------------------------+----------+
            18 rows in set (0.00 sec)
            {noformat}

            {noformat}
            *************************** 1. row ***************************
                           Slave_IO_State: Waiting for master to send event
                              Master_Host: x.x.x.x
                              Master_User: repl
                              Master_Port: 3306
                            Connect_Retry: 60
                          Master_Log_File: mysql-bin.000035
                      Read_Master_Log_Pos: 800365460
                           Relay_Log_File: relay-bin.000004
                            Relay_Log_Pos: 100745890
                    Relay_Master_Log_File: mysql-bin.000032
                         Slave_IO_Running: Yes
                        Slave_SQL_Running: Yes
                          Replicate_Do_DB:
                      Replicate_Ignore_DB:
                       Replicate_Do_Table:
                   Replicate_Ignore_Table:
                  Replicate_Wild_Do_Table:
              Replicate_Wild_Ignore_Table:
                               Last_Errno: 0
                               Last_Error:
                             Skip_Counter: 0
                      Exec_Master_Log_Pos: 100745602
                          Relay_Log_Space: 4809515065
                          Until_Condition: None
                           Until_Log_File:
                            Until_Log_Pos: 0
                       Master_SSL_Allowed: No
                       Master_SSL_CA_File:
                       Master_SSL_CA_Path:
                          Master_SSL_Cert:
                        Master_SSL_Cipher:
                           Master_SSL_Key:
                    Seconds_Behind_Master: 0
            Master_SSL_Verify_Server_Cert: No
                            Last_IO_Errno: 0
                            Last_IO_Error:
                           Last_SQL_Errno: 0
                           Last_SQL_Error:
              Replicate_Ignore_Server_Ids:
                         Master_Server_Id: 21
                           Master_SSL_Crl:
                       Master_SSL_Crlpath:
                               Using_Gtid: Slave_Pos
                              Gtid_IO_Pos: 0-21-28343538
                  Replicate_Do_Domain_Ids:
              Replicate_Ignore_Domain_Ids:
                            Parallel_Mode: optimistic

            {noformat}
            elenst Elena Stepanova made changes -
            Assignee Andrei Elkin [ elkin ]
            elenst Elena Stepanova made changes -
            Fix Version/s 10.1 [ 16100 ]
            Elkin Andrei Elkin made changes -
            Assignee Andrei Elkin [ elkin ] Sachin Setiya [ sachin.setiya.007 ]

            From the process list, it looks like we have two transactions T1 (OPTIMIZE TABLE requetes) and T2 (DML on table requetes). And T2 was somehow allowed to start before T1 was complete; this should not have been allowed. OPTIMIZE TABLE should have been marked as DDL, which would prevent T2 from starting until T1 completed.

            Check the binlog on the master of the deadlocked slave (or alternatively the relaylog on the deadlocked slave) that the OPTIMIZE TABLE GTID event is marked as DDL.

            I can see this is an old bug, so probably the original logs are not available, but it might be enough to check the binlogging in a similar setup. The deadlock will be very dependent on timing between different transactions, but the marking of OPTIMIZE TABLE as DDL is hoefully not.

            Hope this helps,

            • Kristian.
            knielsen Kristian Nielsen added a comment - From the process list, it looks like we have two transactions T1 (OPTIMIZE TABLE requetes) and T2 (DML on table requetes). And T2 was somehow allowed to start before T1 was complete; this should not have been allowed. OPTIMIZE TABLE should have been marked as DDL, which would prevent T2 from starting until T1 completed. Check the binlog on the master of the deadlocked slave (or alternatively the relaylog on the deadlocked slave) that the OPTIMIZE TABLE GTID event is marked as DDL. I can see this is an old bug, so probably the original logs are not available, but it might be enough to check the binlogging in a similar setup. The deadlock will be very dependent on timing between different transactions, but the marking of OPTIMIZE TABLE as DDL is hoefully not. Hope this helps, Kristian.
            Elkin Andrei Elkin added a comment -

            knielsen, thanks for the guidence! That must be the case. The width of the pasted show processlist window did not help me in particular to identify OPTIMIZE as an actor .

            Elkin Andrei Elkin added a comment - knielsen , thanks for the guidence! That must be the case. The width of the pasted show processlist window did not help me in particular to identify OPTIMIZE as an actor .
            Elkin Andrei Elkin made changes -
            Labels parallelreplication

            Hi knielsen
            You are right , Optimize table is not treated as DDL

            #191205 14:59:37 server id 1  end_log_pos 352   GTID 0-1-1 ddl
            /*!100101 SET @@session.skip_parallel_replication=0*//*!*/;
            /*!100001 SET @@session.gtid_domain_id=0*//*!*/;
            /*!100001 SET @@session.server_id=1*//*!*/;
            /*!100001 SET @@session.gtid_seq_no=1*//*!*/;
            # at 352
            #191205 14:59:37 server id 1  end_log_pos 437   Query   thread_id=4     exec_time=0     error_code=0
            use `test`/*!*/;
            SET TIMESTAMP=1575538177/*!*/;
            SET @@session.pseudo_thread_id=4/*!*/;
            SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
            SET @@session.sql_mode=1342177280/*!*/;
            SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
            /*!\C latin1 *//*!*/;
            SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
            SET @@session.lc_time_names=0/*!*/;
            SET @@session.collation_database=DEFAULT/*!*/;
            create table t1(a int)
            /*!*/;
            # at 437
            #191205 14:59:37 server id 1  end_log_pos 475   GTID 0-1-2
            /*!100001 SET @@session.gtid_seq_no=2*//*!*/;
            # at 475
            #191205 14:59:37 server id 1  end_log_pos 555   Query   thread_id=4     exec_time=0     error_code=0
            SET TIMESTAMP=1575538177/*!*/;
            optimize table t1
            /*!*/;
            # at 555
            #191205 14:59:37 server id 1  end_log_pos 574   Stop
            DELIMITER ;
            # End of log file
            ROLLBACK /* added by mysqlbinlog */;
            /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
            /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
            

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment - Hi knielsen You are right , Optimize table is not treated as DDL #191205 14:59:37 server id 1 end_log_pos 352 GTID 0-1-1 ddl /*!100101 SET @@session.skip_parallel_replication=0*//*!*/; /*!100001 SET @@session.gtid_domain_id=0*//*!*/; /*!100001 SET @@session.server_id=1*//*!*/; /*!100001 SET @@session.gtid_seq_no=1*//*!*/; # at 352 #191205 14:59:37 server id 1 end_log_pos 437 Query thread_id=4 exec_time=0 error_code=0 use `test`/*!*/; SET TIMESTAMP=1575538177/*!*/; SET @@session.pseudo_thread_id=4/*!*/; SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/; SET @@session.sql_mode=1342177280/*!*/; SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/; /*!\C latin1 *//*!*/; SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/; SET @@session.lc_time_names=0/*!*/; SET @@session.collation_database=DEFAULT/*!*/; create table t1(a int) /*!*/; # at 437 #191205 14:59:37 server id 1 end_log_pos 475 GTID 0-1-2 /*!100001 SET @@session.gtid_seq_no=2*//*!*/; # at 475 #191205 14:59:37 server id 1 end_log_pos 555 Query thread_id=4 exec_time=0 error_code=0 SET TIMESTAMP=1575538177/*!*/; optimize table t1 /*!*/; # at 555 #191205 14:59:37 server id 1 end_log_pos 574 Stop DELIMITER ; # End of log file ROLLBACK /* added by mysqlbinlog */; /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

            Test Case

            --source include/have_binlog_format_row.inc
            --source include/have_innodb.inc
            --source include/master-slave.inc
             
            --connection slave
            stop slave;
            SET GLOBAL slave_parallel_threads=2;
            set global slave_parallel_mode=optimistic;
            start slave;
             
            --connection master
            create table t1(a int) engine=innodb;
            optimize table t1;
            insert into t1 values(1);
            insert into t1  select 1+a from t1;
            insert into t1  select 2+a from t1;
             
            select * from t1;
             
            --sync_slave_with_master
             
             
            --connection master
            drop table t1;
             
            --sync_slave_with_master
            stop slave;
            SET GLOBAL slave_parallel_threads=0;
            set global slave_parallel_mode=conservative;
            start slave;
             
            --source include/rpl_end.inc
            
            

            Debug build it fails with assert

            57      ../sysdeps/unix/sysv/linux/pthread_kill.c: No such file or directory.                                                                                                                 
            [Current thread is 1 (Thread 0x7f2cd46b4700 (LWP 30726))]                                                                                                                                     
            #0  __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:57                                                                                        
            #1  0x00005623573250fa in my_write_core (sig=6) at /home/sachin/10.1/server/mysys/stacktrace.c:477                                                                                            
            #2  0x0000562356cc4575 in handle_fatal_signal (sig=6) at /home/sachin/10.1/server/sql/signal_handler.cc:296                                                                                   
            #3  <signal handler called>                                                                                                                                                                   
            #4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51                                                                                                                     
            #5  0x00007f2ce2834801 in __GI_abort () at abort.c:79                                                                                                                                         
            #6  0x00007f2ce282439a in __assert_fail_base (fmt=0x7f2ce29ab7d8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x562357455408 "(mdl_request->type != MDL_INTENTION_EX
            CLUSIVE && mdl_request->type != MDL_EXCLUSIVE) || !(get_thd()->rgi_slave && get_thd()->rgi_slave->is_parallel_exec && lock->check_if_conflicting_replication_locks"..., file=file@entry=0x5623
            57454e90 "/home/sachin/10.1/server/sql/mdl.cc", line=line@entry=2104, function=function@entry=0x562357455c80 <MDL_context::acquire_lock(MDL_request*, double)::__PRETTY_FUNCTION__> "bool MDL_
            context::acquire_lock(MDL_request*, double)") at assert.c:92
            #7  0x00007f2ce2824412 in __GI___assert_fail (assertion=0x562357455408 "(mdl_request->type != MDL_INTENTION_EXCLUSIVE && mdl_request->type != MDL_EXCLUSIVE) || !(get_thd()->rgi_slave && get_
            thd()->rgi_slave->is_parallel_exec && lock->check_if_conflicting_replication_locks"..., file=0x562357454e90 "/home/sachin/10.1/server/sql/mdl.cc", line=2104, function=0x562357455c80 <MDL_con
            text::acquire_lock(MDL_request*, double)::__PRETTY_FUNCTION__> "bool MDL_context::acquire_lock(MDL_request*, double)") at assert.c:101
            #8  0x0000562356bc4499 in MDL_context::acquire_lock (this=0x7f2c80000e48, mdl_request=0x7f2cd46b04e0, lock_wait_timeout=31536000) at /home/sachin/10.1/server/sql/mdl.cc:2100
            #9  0x0000562356bc4d8a in MDL_context::upgrade_shared_lock (this=0x7f2c80000e48, mdl_ticket=0x7f2c800066f0, new_type=MDL_EXCLUSIVE, lock_wait_timeout=31536000) at /home/sachin/10.1/server/sq
            l/mdl.cc:2318
            #10 0x0000562356a15b3b in wait_while_table_is_used (thd=0x7f2c80000d50, table=0x7f2c8001caf0, function=HA_EXTRA_PREPARE_FOR_RENAME) at /home/sachin/10.1/server/sql/sql_base.cc:1885
            #11 0x0000562356b450d8 in mysql_inplace_alter_table (thd=0x7f2c80000d50, table_list=0x7f2c80008720, table=0x7f2c8001caf0, altered_table=0x7f2c80025b50, ha_alter_info=0x7f2cd46b0910, inplace_
            supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x7f2cd46b0980, alter_ctx=0x7f2cd46b1530) at /home/sachin/10.1/server/sql/sql_table.cc:7196
            #12 0x0000562356b49d54 in mysql_alter_table (thd=0x7f2c80000d50, new_db=0x0, new_name=0x0, create_info=0x7f2cd46b2110, table_list=0x7f2c80008720, alter_info=0x7f2cd46b2080, order_num=0, orde
            r=0x0, ignore=false) at /home/sachin/10.1/server/sql/sql_table.cc:9084
            #13 0x0000562356b4c491 in mysql_recreate_table (thd=0x7f2c80000d50, table_list=0x7f2c80008720, table_copy=false) at /home/sachin/10.1/server/sql/sql_table.cc:9859
            #14 0x0000562356bc7d9e in admin_recreate_table (thd=0x7f2c80000d50, table_list=0x7f2c80008720) at /home/sachin/10.1/server/sql/sql_admin.cc:58
            #15 0x0000562356bcad08 in mysql_admin_table (thd=0x7f2c80000d50, tables=0x7f2c80008720, check_opt=0x7f2c800056f8, operator_name=0x562357456a00 "optimize", lock_type=TL_WRITE, org_open_for_mo
            dify=true, repair_table_use_frm=false, extra_open_options=0, prepare_func=0x0, operator_func=(int (handler::*)(handler * const, THD *, HA_CHECK_OPT *)) 0x562356cce814 <handler::ha_optimize(T
            HD*, st_ha_check_opt*)>, view_operator_func=0x0) at /home/sachin/10.1/server/sql/sql_admin.cc:966
            #16 0x0000562356bcbf21 in Sql_cmd_optimize_table::execute (this=0x7f2c80008d20, thd=0x7f2c80000d50) at /home/sachin/10.1/server/sql/sql_admin.cc:1303
            #17 0x0000562356a88171 in mysql_execute_command (thd=0x7f2c80000d50) at /home/sachin/10.1/server/sql/sql_parse.cc:5442
            #18 0x0000562356a8cd51 in mysql_parse (thd=0x7f2c80000d50, rawbuf=0x7f2c8401ed0b "optimize table t1", length=17, parser_state=0x7f2cd46b3660) at /home/sachin/10.1/server/sql/sql_parse.cc:720
            9
            #19 0x0000562356dbd7a3 in Query_log_event::do_apply_event (this=0x7f2c8401eb60, rgi=0x7f2c8401bc70, query_arg=0x7f2c8401ed0b "optimize table t1", q_len_arg=17) at /home/sachin/10.1/server/sq
            l/log_event.cc:4533
            #20 0x0000562356dbc7b3 in Query_log_event::do_apply_event (this=0x7f2c8401eb60, rgi=0x7f2c8401bc70) at /home/sachin/10.1/server/sql/log_event.cc:4213
            #21 0x00005623569e8083 in Log_event::apply_event (this=0x7f2c8401eb60, rgi=0x7f2c8401bc70) at /home/sachin/10.1/server/sql/log_event.h:1343
            #22 0x00005623569dd6b0 in apply_event_and_update_pos_apply (ev=0x7f2c8401eb60, thd=0x7f2c80000d50, rgi=0x7f2c8401bc70, reason=0) at /home/sachin/10.1/server/sql/slave.cc:3482
            #23 0x00005623569ddc03 in apply_event_and_update_pos_for_parallel (ev=0x7f2c8401eb60, thd=0x7f2c80000d50, rgi=0x7f2c8401bc70) at /home/sachin/10.1/server/sql/slave.cc:3626
            #24 0x0000562356c23b11 in rpt_handle_event (qev=0x7f2c8401e380, rpt=0x7f2c840013d8) at /home/sachin/10.1/server/sql/rpl_parallel.cc:50
            #25 0x0000562356c26977 in handle_rpl_parallel_thread (arg=0x7f2c840013d8) at /home/sachin/10.1/server/sql/rpl_parallel.cc:1274
            #26 0x0000562356fedf59 in pfs_spawn_thread (arg=0x7f2c84001660) at /home/sachin/10.1/server/storage/perfschema/pfs.cc:1861
            #27 0x00007f2ce33136db in start_thread (arg=0x7f2cd46b4700) at pthread_create.c:463
            #28 0x00007f2ce291588f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
            
            

            In RelWithDebInfo build

             
            MariaDB [(none)]> show processlist;
            +----+-------------+-----------------+------+---------+------+-----------------------------------------------------------------------------+-----------------------------------------------------------+----------+
            | Id | User        | Host            | db   | Command | Time | State                                                                       | Info
                     | Progress |
            +----+-------------+-----------------+------+---------+------+-----------------------------------------------------------------------------+-----------------------------------------------------------+----------+
            |  4 | root        | localhost:36092 | test | Sleep   |    8 |                                                                             | NULL
                     |    0.000 |
            |  5 | root        | localhost:36094 | test | Sleep   |    8 |                                                                             | NULL
                     |    0.000 |
            |  8 | root        | localhost:36106 | test | Query   |    8 | Waiting for the slave SQL thread to advance position                        | select master_pos_wait('master-bin.000001', 994,
            300, '') |    0.000 |
            |  9 | root        | localhost:36108 | test | Sleep   |    8 |                                                                             | NULL
                     |    0.000 |
            | 10 | system user |                 | NULL | Connect |    8 | Waiting for master to send event                                            | NULL
                     |    0.000 |
            | 11 | system user |                 | NULL | Connect |    8 | Waiting for prior transaction to commit                                     | NULL
                     |    0.000 |
            | 12 | system user |                 | test | Connect |    8 | Waiting for table metadata lock                                             | optimize table t1
                     |    0.000 |
            | 13 | system user |                 | NULL | Connect |    8 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL
                     |    0.000 |
            | 14 | root        | localhost       | NULL | Query   |    0 | init                                                                        | show processlist
                     |    0.000 |
            +----+-------------+-----------------+------+---------+------+-----------------------------------------------------------------------------+-----------------------------------------------------------+----------+
            
            

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment - Test Case --source include/have_binlog_format_row.inc --source include/have_innodb.inc --source include/master-slave.inc   --connection slave stop slave; SET GLOBAL slave_parallel_threads=2; set global slave_parallel_mode=optimistic; start slave;   --connection master create table t1(a int) engine=innodb; optimize table t1; insert into t1 values(1); insert into t1 select 1+a from t1; insert into t1 select 2+a from t1;   select * from t1;   --sync_slave_with_master     --connection master drop table t1;   --sync_slave_with_master stop slave; SET GLOBAL slave_parallel_threads=0; set global slave_parallel_mode=conservative; start slave;   --source include/rpl_end.inc Debug build it fails with assert 57 ../sysdeps/unix/sysv/linux/pthread_kill.c: No such file or directory. [Current thread is 1 (Thread 0x7f2cd46b4700 (LWP 30726))] #0 __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:57 #1 0x00005623573250fa in my_write_core (sig=6) at /home/sachin/10.1/server/mysys/stacktrace.c:477 #2 0x0000562356cc4575 in handle_fatal_signal (sig=6) at /home/sachin/10.1/server/sql/signal_handler.cc:296 #3 <signal handler called> #4 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #5 0x00007f2ce2834801 in __GI_abort () at abort.c:79 #6 0x00007f2ce282439a in __assert_fail_base (fmt=0x7f2ce29ab7d8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x562357455408 "(mdl_request->type != MDL_INTENTION_EX CLUSIVE && mdl_request->type != MDL_EXCLUSIVE) || !(get_thd()->rgi_slave && get_thd()->rgi_slave->is_parallel_exec && lock->check_if_conflicting_replication_locks"..., file=file@entry=0x5623 57454e90 "/home/sachin/10.1/server/sql/mdl.cc", line=line@entry=2104, function=function@entry=0x562357455c80 <MDL_context::acquire_lock(MDL_request*, double)::__PRETTY_FUNCTION__> "bool MDL_ context::acquire_lock(MDL_request*, double)") at assert.c:92 #7 0x00007f2ce2824412 in __GI___assert_fail (assertion=0x562357455408 "(mdl_request->type != MDL_INTENTION_EXCLUSIVE && mdl_request->type != MDL_EXCLUSIVE) || !(get_thd()->rgi_slave && get_ thd()->rgi_slave->is_parallel_exec && lock->check_if_conflicting_replication_locks"..., file=0x562357454e90 "/home/sachin/10.1/server/sql/mdl.cc", line=2104, function=0x562357455c80 <MDL_con text::acquire_lock(MDL_request*, double)::__PRETTY_FUNCTION__> "bool MDL_context::acquire_lock(MDL_request*, double)") at assert.c:101 #8 0x0000562356bc4499 in MDL_context::acquire_lock (this=0x7f2c80000e48, mdl_request=0x7f2cd46b04e0, lock_wait_timeout=31536000) at /home/sachin/10.1/server/sql/mdl.cc:2100 #9 0x0000562356bc4d8a in MDL_context::upgrade_shared_lock (this=0x7f2c80000e48, mdl_ticket=0x7f2c800066f0, new_type=MDL_EXCLUSIVE, lock_wait_timeout=31536000) at /home/sachin/10.1/server/sq l/mdl.cc:2318 #10 0x0000562356a15b3b in wait_while_table_is_used (thd=0x7f2c80000d50, table=0x7f2c8001caf0, function=HA_EXTRA_PREPARE_FOR_RENAME) at /home/sachin/10.1/server/sql/sql_base.cc:1885 #11 0x0000562356b450d8 in mysql_inplace_alter_table (thd=0x7f2c80000d50, table_list=0x7f2c80008720, table=0x7f2c8001caf0, altered_table=0x7f2c80025b50, ha_alter_info=0x7f2cd46b0910, inplace_ supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x7f2cd46b0980, alter_ctx=0x7f2cd46b1530) at /home/sachin/10.1/server/sql/sql_table.cc:7196 #12 0x0000562356b49d54 in mysql_alter_table (thd=0x7f2c80000d50, new_db=0x0, new_name=0x0, create_info=0x7f2cd46b2110, table_list=0x7f2c80008720, alter_info=0x7f2cd46b2080, order_num=0, orde r=0x0, ignore=false) at /home/sachin/10.1/server/sql/sql_table.cc:9084 #13 0x0000562356b4c491 in mysql_recreate_table (thd=0x7f2c80000d50, table_list=0x7f2c80008720, table_copy=false) at /home/sachin/10.1/server/sql/sql_table.cc:9859 #14 0x0000562356bc7d9e in admin_recreate_table (thd=0x7f2c80000d50, table_list=0x7f2c80008720) at /home/sachin/10.1/server/sql/sql_admin.cc:58 #15 0x0000562356bcad08 in mysql_admin_table (thd=0x7f2c80000d50, tables=0x7f2c80008720, check_opt=0x7f2c800056f8, operator_name=0x562357456a00 "optimize", lock_type=TL_WRITE, org_open_for_mo dify=true, repair_table_use_frm=false, extra_open_options=0, prepare_func=0x0, operator_func=(int (handler::*)(handler * const, THD *, HA_CHECK_OPT *)) 0x562356cce814 <handler::ha_optimize(T HD*, st_ha_check_opt*)>, view_operator_func=0x0) at /home/sachin/10.1/server/sql/sql_admin.cc:966 #16 0x0000562356bcbf21 in Sql_cmd_optimize_table::execute (this=0x7f2c80008d20, thd=0x7f2c80000d50) at /home/sachin/10.1/server/sql/sql_admin.cc:1303 #17 0x0000562356a88171 in mysql_execute_command (thd=0x7f2c80000d50) at /home/sachin/10.1/server/sql/sql_parse.cc:5442 #18 0x0000562356a8cd51 in mysql_parse (thd=0x7f2c80000d50, rawbuf=0x7f2c8401ed0b "optimize table t1", length=17, parser_state=0x7f2cd46b3660) at /home/sachin/10.1/server/sql/sql_parse.cc:720 9 #19 0x0000562356dbd7a3 in Query_log_event::do_apply_event (this=0x7f2c8401eb60, rgi=0x7f2c8401bc70, query_arg=0x7f2c8401ed0b "optimize table t1", q_len_arg=17) at /home/sachin/10.1/server/sq l/log_event.cc:4533 #20 0x0000562356dbc7b3 in Query_log_event::do_apply_event (this=0x7f2c8401eb60, rgi=0x7f2c8401bc70) at /home/sachin/10.1/server/sql/log_event.cc:4213 #21 0x00005623569e8083 in Log_event::apply_event (this=0x7f2c8401eb60, rgi=0x7f2c8401bc70) at /home/sachin/10.1/server/sql/log_event.h:1343 #22 0x00005623569dd6b0 in apply_event_and_update_pos_apply (ev=0x7f2c8401eb60, thd=0x7f2c80000d50, rgi=0x7f2c8401bc70, reason=0) at /home/sachin/10.1/server/sql/slave.cc:3482 #23 0x00005623569ddc03 in apply_event_and_update_pos_for_parallel (ev=0x7f2c8401eb60, thd=0x7f2c80000d50, rgi=0x7f2c8401bc70) at /home/sachin/10.1/server/sql/slave.cc:3626 #24 0x0000562356c23b11 in rpt_handle_event (qev=0x7f2c8401e380, rpt=0x7f2c840013d8) at /home/sachin/10.1/server/sql/rpl_parallel.cc:50 #25 0x0000562356c26977 in handle_rpl_parallel_thread (arg=0x7f2c840013d8) at /home/sachin/10.1/server/sql/rpl_parallel.cc:1274 #26 0x0000562356fedf59 in pfs_spawn_thread (arg=0x7f2c84001660) at /home/sachin/10.1/server/storage/perfschema/pfs.cc:1861 #27 0x00007f2ce33136db in start_thread (arg=0x7f2cd46b4700) at pthread_create.c:463 #28 0x00007f2ce291588f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 In RelWithDebInfo build   MariaDB [(none)]> show processlist; +----+-------------+-----------------+------+---------+------+-----------------------------------------------------------------------------+-----------------------------------------------------------+----------+ | Id | User | Host | db | Command | Time | State | Info | Progress | +----+-------------+-----------------+------+---------+------+-----------------------------------------------------------------------------+-----------------------------------------------------------+----------+ | 4 | root | localhost:36092 | test | Sleep | 8 | | NULL | 0.000 | | 5 | root | localhost:36094 | test | Sleep | 8 | | NULL | 0.000 | | 8 | root | localhost:36106 | test | Query | 8 | Waiting for the slave SQL thread to advance position | select master_pos_wait('master-bin.000001', 994, 300, '') | 0.000 | | 9 | root | localhost:36108 | test | Sleep | 8 | | NULL | 0.000 | | 10 | system user | | NULL | Connect | 8 | Waiting for master to send event | NULL | 0.000 | | 11 | system user | | NULL | Connect | 8 | Waiting for prior transaction to commit | NULL | 0.000 | | 12 | system user | | test | Connect | 8 | Waiting for table metadata lock | optimize table t1 | 0.000 | | 13 | system user | | NULL | Connect | 8 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL | 0.000 | | 14 | root | localhost | NULL | Query | 0 | init | show processlist | 0.000 | +----+-------------+-----------------+------+---------+------+-----------------------------------------------------------------------------+-----------------------------------------------------------+----------+
            alice Alice Sherepa made changes -
            Affects Version/s 10.1 [ 16100 ]
            Affects Version/s 10.2 [ 14601 ]
            Affects Version/s 10.3 [ 22126 ]
            Affects Version/s 10.4 [ 22408 ]
            alice Alice Sherepa made changes -
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            alice Alice Sherepa made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]
            alice Alice Sherepa added a comment - - edited

            Current 10.4,10.5-debug hangs with the test case from the above.

            but assertion is still appearing:

            10.5 054f10365c49b70a386

            Server version: 10.5.5-MariaDB-debug-log
             
            linux/raise.c:51(__GI_raise)[0x7fa4ff6cef47]
            stdlib/abort.c:81(__GI_abort)[0x7fa4ff6d08b1]
            assert/assert.c:89(__assert_fail_base)[0x7fa4ff6c042a]
            ??:0(__assert_fail)[0x7fa4ff6c04a2]
            sql/mdl.cc:2308(MDL_context::acquire_lock(MDL_request*, double))[0x557c45c6ff9f]
            sql/mdl.cc:2535(MDL_context::upgrade_shared_lock(MDL_ticket*, enum_mdl_type, double))[0x557c4568af58]
            sql/sql_base.cc:1311(wait_while_table_is_used(THD*, TABLE*, ha_extra_function))[0x557c45ab65e8]
            sql/sql_table.cc:7994(mysql_inplace_alter_table(THD*, TABLE_LIST*, TABLE*, TABLE*, Alter_inplace_info*, enum_alter_inplace_result, MDL_request*, Alter_table_ctx*))[0x557c45ac9265]
            sql/sql_table.cc:10564(mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool, bool))[0x557c45ad0c4a]
            sql/sql_table.cc:11497(mysql_recreate_table(THD*, TABLE_LIST*, bool))[0x557c45c78fc4]
            sql/sql_admin.cc:57(admin_recreate_table(THD*, TABLE_LIST*))[0x557c45c7fb13]
            sql/sql_admin.cc:1026(mysql_admin_table(THD*, TABLE_LIST*, st_ha_check_opt*, char const*, thr_lock_type, bool, bool, unsigned int, int (*)(THD*, TABLE_LIST*, st_ha_check_opt*), int (handler::*)(THD*, st_ha_check_opt*), int (*)(THD*, TABLE_LIST*, st_ha_check_opt*)))[0x557c45c82a35]
            sql/sql_admin.cc:1375(Sql_cmd_optimize_table::execute(THD*))[0x557c45837b78]
            sql/sql_parse.cc:5951(mysql_execute_command(THD*))[0x557c458453f5]
            sql/sql_parse.cc:7993(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x557c4635c7fd]
            sql/log_event_server.cc:1906(Query_log_event::do_apply_event(rpl_group_info*, char const*, unsigned int))[0x557c46359a17]
            sql/log_event_server.cc:1579(Query_log_event::do_apply_event(rpl_group_info*))[0x557c455c4df9]
            sql/log_event.h:1496(Log_event::apply_event(rpl_group_info*))[0x557c455a6d0e]
            sql/slave.cc:4003(apply_event_and_update_pos_apply(Log_event*, THD*, rpl_group_info*, int))[0x557c455a7a8e]
            sql/slave.cc:4190(apply_event_and_update_pos_for_parallel(Log_event*, THD*, rpl_group_info*))[0x557c45d7ece9]
            sql/rpl_parallel.cc:61(rpt_handle_event(rpl_parallel_thread::queued_event*, rpl_parallel_thread*))[0x557c45d86437]
            sql/rpl_parallel.cc:1343(handle_rpl_parallel_thread)[0x557c4690b19f]
            nptl/pthread_create.c:463(start_thread)[0x7fa5005cb6db]
            x86_64/clone.S:97(clone)[0x7fa4ff7b1a3f]
             
            Query (0x6120009a0453): OPTIMIZE TABLE t1
            

            alice Alice Sherepa added a comment - - edited Current 10.4,10.5-debug hangs with the test case from the above. but assertion is still appearing: 10.5 054f10365c49b70a386 Server version: 10.5.5-MariaDB-debug-log   linux/raise.c:51(__GI_raise)[0x7fa4ff6cef47] stdlib/abort.c:81(__GI_abort)[0x7fa4ff6d08b1] assert/assert.c:89(__assert_fail_base)[0x7fa4ff6c042a] ??:0(__assert_fail)[0x7fa4ff6c04a2] sql/mdl.cc:2308(MDL_context::acquire_lock(MDL_request*, double))[0x557c45c6ff9f] sql/mdl.cc:2535(MDL_context::upgrade_shared_lock(MDL_ticket*, enum_mdl_type, double))[0x557c4568af58] sql/sql_base.cc:1311(wait_while_table_is_used(THD*, TABLE*, ha_extra_function))[0x557c45ab65e8] sql/sql_table.cc:7994(mysql_inplace_alter_table(THD*, TABLE_LIST*, TABLE*, TABLE*, Alter_inplace_info*, enum_alter_inplace_result, MDL_request*, Alter_table_ctx*))[0x557c45ac9265] sql/sql_table.cc:10564(mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool, bool))[0x557c45ad0c4a] sql/sql_table.cc:11497(mysql_recreate_table(THD*, TABLE_LIST*, bool))[0x557c45c78fc4] sql/sql_admin.cc:57(admin_recreate_table(THD*, TABLE_LIST*))[0x557c45c7fb13] sql/sql_admin.cc:1026(mysql_admin_table(THD*, TABLE_LIST*, st_ha_check_opt*, char const*, thr_lock_type, bool, bool, unsigned int, int (*)(THD*, TABLE_LIST*, st_ha_check_opt*), int (handler::*)(THD*, st_ha_check_opt*), int (*)(THD*, TABLE_LIST*, st_ha_check_opt*)))[0x557c45c82a35] sql/sql_admin.cc:1375(Sql_cmd_optimize_table::execute(THD*))[0x557c45837b78] sql/sql_parse.cc:5951(mysql_execute_command(THD*))[0x557c458453f5] sql/sql_parse.cc:7993(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x557c4635c7fd] sql/log_event_server.cc:1906(Query_log_event::do_apply_event(rpl_group_info*, char const*, unsigned int))[0x557c46359a17] sql/log_event_server.cc:1579(Query_log_event::do_apply_event(rpl_group_info*))[0x557c455c4df9] sql/log_event.h:1496(Log_event::apply_event(rpl_group_info*))[0x557c455a6d0e] sql/slave.cc:4003(apply_event_and_update_pos_apply(Log_event*, THD*, rpl_group_info*, int))[0x557c455a7a8e] sql/slave.cc:4190(apply_event_and_update_pos_for_parallel(Log_event*, THD*, rpl_group_info*))[0x557c45d7ece9] sql/rpl_parallel.cc:61(rpt_handle_event(rpl_parallel_thread::queued_event*, rpl_parallel_thread*))[0x557c45d86437] sql/rpl_parallel.cc:1343(handle_rpl_parallel_thread)[0x557c4690b19f] nptl/pthread_create.c:463(start_thread)[0x7fa5005cb6db] x86_64/clone.S:97(clone)[0x7fa4ff7b1a3f]   Query (0x6120009a0453): OPTIMIZE TABLE t1
            alice Alice Sherepa made changes -
            Affects Version/s 10.5 [ 23123 ]
            alice Alice Sherepa made changes -
            Fix Version/s 10.5 [ 23123 ]
            Elkin Andrei Elkin made changes -
            Labels parallelreplication mdl parallelreplication

            A better testcase would be to grep the binlog from OPTIMIZE TABLE to check that it is marked as DDL.
            Even if the included testcase would somehow not hang, it's still a bug that OPTIMIZE TABLE is not marked as DDL.
            Anything that takes table locks or other locks without implementing the parallel replication deadlock detection-and-handling mechanism must be marked as DDL (or non-transactional) as appropriate.

            • Kristian.
            knielsen Kristian Nielsen added a comment - A better testcase would be to grep the binlog from OPTIMIZE TABLE to check that it is marked as DDL. Even if the included testcase would somehow not hang, it's still a bug that OPTIMIZE TABLE is not marked as DDL. Anything that takes table locks or other locks without implementing the parallel replication deadlock detection-and-handling mechanism must be marked as DDL (or non-transactional) as appropriate. Kristian.
            alice Alice Sherepa made changes -
            Labels mdl parallelreplication affects-tests mdl parallelreplication
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 10.1 [ 16100 ]
            sujatha.sivakumar Sujatha Sivakumar (Inactive) made changes -
            Assignee Sachin Setiya [ sachin.setiya.007 ] Sujatha Sivakumar [ sujatha.sivakumar ]
            sujatha.sivakumar Sujatha Sivakumar (Inactive) made changes -
            Status Confirmed [ 10101 ] In Progress [ 3 ]
            sujatha.sivakumar Sujatha Sivakumar (Inactive) added a comment - Hello Elkin Can you please review the changes for MDEV-17515 Patch: https://github.com/MariaDB/server/commit/70377dc9be5205286806e01365f33cbfd32e1d27 BuildBot: http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.2-sujatha Thank you.
            sujatha.sivakumar Sujatha Sivakumar (Inactive) made changes -
            Assignee Sujatha Sivakumar [ sujatha.sivakumar ] Andrei Elkin [ elkin ]
            Status In Progress [ 3 ] In Review [ 10002 ]

            Hello Elkin

            I made some minor changes on top of the above mentioned commit. Please find the following commit
            which includes patches for MDEV-17515 and MDEV-22530. MDEV-22530 changes are implemented on top of MDEV-17515.
            https://github.com/MariaDB/server/commit/44eb746c226d3eabe93d476790bb0c6bb30106b2

            Please review top two commits:

            1. commit 44eb746c226d3eabe93d476790bb0c6bb30106b2
            MDEV-22530: Aborting OPTIMIZE TABLE still logs in binary log and replicates to the Slave server.

            2. commit 502a6b9b3ce119f507cb8ccc9ec034a9953ad3cc
            MDEV-17515: GTID Replication in optimistic mode deadlock

            sujatha.sivakumar Sujatha Sivakumar (Inactive) added a comment - Hello Elkin I made some minor changes on top of the above mentioned commit. Please find the following commit which includes patches for MDEV-17515 and MDEV-22530 . MDEV-22530 changes are implemented on top of MDEV-17515 . https://github.com/MariaDB/server/commit/44eb746c226d3eabe93d476790bb0c6bb30106b2 Please review top two commits: 1. commit 44eb746c226d3eabe93d476790bb0c6bb30106b2 MDEV-22530 : Aborting OPTIMIZE TABLE still logs in binary log and replicates to the Slave server. 2. commit 502a6b9b3ce119f507cb8ccc9ec034a9953ad3cc MDEV-17515 : GTID Replication in optimistic mode deadlock
            Elkin Andrei Elkin made changes -
            Assignee Andrei Elkin [ elkin ] Sachin Setiya [ sachin.setiya.007 ]

            Okay to push , one comment needs to addressed

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment - Okay to push , one comment needs to addressed
            sachin.setiya.007 Sachin Setiya (Inactive) made changes -
            Assignee Sachin Setiya [ sachin.setiya.007 ] Sujatha Sivakumar [ sujatha.sivakumar ]
            Status In Review [ 10002 ] Stalled [ 10000 ]

            MDEV-17515 and MDEV-22530 are related. MDEV-22530's fix is implemented on MDEV-17515's fix.

            Hence both are tested and pushed together.

            10.3: http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.3-sujatha
            ====
            commit 07da206ccd21080351363a2c1f080320367c52bb (HEAD -> 10.3, origin/bb-10.3-sujatha)
            MDEV-22530: Aborting OPTIMIZE TABLE still logs in binary log and replicates to the Slave server.

            10.3 cherry-pick testing.

            commit e683e8c010bd2ffa9415857828e89f94b9ac0550
            MDEV-17515: GTID Replication in optimistic mode deadlock

            10.3 cherry-pick testing.

            10.4: http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.4-sujatha
            ====
            commit 0f34165ec1427eccbf18db9f71744537fe6a1354 (HEAD -> 10.4, origin/bb-10.4-sujatha)
            MDEV-22530: Aborting OPTIMIZE TABLE still logs in binary log and replicates to the Slave server.

            10.4 cherry-pick testing.

            commit 4169e6571c342b57a3ce89e7ae45328c06beb788
            MDEV-17515: GTID Replication in optimistic mode deadlock

            10.4 cherry-pick testing.

            Merge conflicts addressed.
            Result file re-recorded.

            10.5: http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.5-sujatha
            ====
            commit 0f9598100aa37f48e8eb93669396e9fac37c493a (HEAD -> 10.5, origin/bb-10.5-sujatha)
            MDEV-22530: Aborting OPTIMIZE TABLE still logs in binary log and replicates to the Slave server.

            10.5 cherry-pick testing.

            No Merge conflicts

            commit f0c220898fbe8f58393d386aa0cb6da6b7cedf9f
            MDEV-17515: GTID Replication in optimistic mode deadlock

            10.5 cherry-pick testing.

            Moved changes to log_event_server.cc

            sujatha.sivakumar Sujatha Sivakumar (Inactive) added a comment - MDEV-17515 and MDEV-22530 are related. MDEV-22530 's fix is implemented on MDEV-17515 's fix. Hence both are tested and pushed together. 10.3: http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.3-sujatha ==== commit 07da206ccd21080351363a2c1f080320367c52bb (HEAD -> 10.3, origin/bb-10.3-sujatha) MDEV-22530 : Aborting OPTIMIZE TABLE still logs in binary log and replicates to the Slave server. 10.3 cherry-pick testing. commit e683e8c010bd2ffa9415857828e89f94b9ac0550 MDEV-17515 : GTID Replication in optimistic mode deadlock 10.3 cherry-pick testing. 10.4: http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.4-sujatha ==== commit 0f34165ec1427eccbf18db9f71744537fe6a1354 (HEAD -> 10.4, origin/bb-10.4-sujatha) MDEV-22530 : Aborting OPTIMIZE TABLE still logs in binary log and replicates to the Slave server. 10.4 cherry-pick testing. commit 4169e6571c342b57a3ce89e7ae45328c06beb788 MDEV-17515 : GTID Replication in optimistic mode deadlock 10.4 cherry-pick testing. Merge conflicts addressed. Result file re-recorded. 10.5: http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.5-sujatha ==== commit 0f9598100aa37f48e8eb93669396e9fac37c493a (HEAD -> 10.5, origin/bb-10.5-sujatha) MDEV-22530 : Aborting OPTIMIZE TABLE still logs in binary log and replicates to the Slave server. 10.5 cherry-pick testing. No Merge conflicts commit f0c220898fbe8f58393d386aa0cb6da6b7cedf9f MDEV-17515 : GTID Replication in optimistic mode deadlock 10.5 cherry-pick testing. Moved changes to log_event_server.cc
            sujatha.sivakumar Sujatha Sivakumar (Inactive) made changes -
            Fix Version/s 10.2.39 [ 25731 ]
            Fix Version/s 10.3.30 [ 25732 ]
            Fix Version/s 10.4.20 [ 25733 ]
            Fix Version/s 10.5.11 [ 25734 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            Fix Version/s 10.5 [ 23123 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 90205 ] MariaDB v4 [ 155088 ]
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -

            People

              sujatha.sivakumar Sujatha Sivakumar (Inactive)
              stephane@skysql.com VAROQUI Stephane
              Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.