[MDEV-30790] Parallel replicas hang forever when used with conservative&optimistic mode Created: 2023-03-06  Updated: 2023-05-02  Resolved: 2023-05-02

Status: Closed
Project: MariaDB Server
Component/s: Replication, Server, Storage Engine - InnoDB
Affects Version/s: 10.6.11, 10.11.2, 10.6.12
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Lei Zhang Assignee: Angelique Sklavounos (Inactive)
Resolution: Incomplete Votes: 0
Labels: None
Environment:

OS: CentOS Linux release 7.4.1708 (Core) 3.10.0-693.el7.x86_64



 Description   

We have tested 10.3.38, 10.6.11, 10.6.12 and 10.11.2; every version except 10.3.38 shows the same problem. When it occurs, we cannot stop the replica and cannot shut down the mysqld service. The only way out is to run kill -9 `pidof mariadbd` at the OS level.

MariaDB [(none)]> select trx_id,trx_state,trx_started,trx_requested_lock_id,trx_wait_started,trx_weight,trx_query,trx_operation_state,trx_tables_in_use,trx_tables_locked,trx_isolation_level,trx_autocommit_non_locking from information_schema.innodb_trx;
+------------+-----------+---------------------+-----------------------+------------------+------------+-----------+---------------------+-------------------+-------------------+---------------------+----------------------------+
| trx_id     | trx_state | trx_started         | trx_requested_lock_id | trx_wait_started | trx_weight | trx_query | trx_operation_state | trx_tables_in_use | trx_tables_locked | trx_isolation_level | trx_autocommit_non_locking |
+------------+-----------+---------------------+-----------------------+------------------+------------+-----------+---------------------+-------------------+-------------------+---------------------+----------------------------+
| 2106906698 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          4 | NULL      | starting index read |                 1 |                 1 | READ COMMITTED      |                          0 |
| 2106906705 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          4 | NULL      | inserting           |                 1 |                 2 | REPEATABLE READ     |                          0 |
| 2106906707 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          4 | NULL      | inserting           |                 1 |                 2 | REPEATABLE READ     |                          0 |
| 2106906706 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          4 | NULL      | inserting           |                 1 |                 2 | REPEATABLE READ     |                          0 |
| 2106906696 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          5 | NULL      |                     |                 0 |                 2 | REPEATABLE READ     |                          0 |
| 2106906700 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          4 | NULL      | inserting           |                 1 |                 2 | REPEATABLE READ     |                          0 |
| 2106906694 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          5 | NULL      |                     |                 0 |                 2 | REPEATABLE READ     |                          0 |
| 2106906695 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          5 | NULL      |                     |                 0 |                 2 | REPEATABLE READ     |                          0 |
| 2106906692 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          5 | NULL      |                     |                 0 |                 2 | REPEATABLE READ     |                          0 |
| 2106906691 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          5 | NULL      |                     |                 0 |                 2 | REPEATABLE READ     |                          0 |
| 2106906693 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          5 | NULL      |                     |                 0 |                 2 | REPEATABLE READ     |                          0 |
| 2106906689 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          5 | NULL      |                     |                 0 |                 2 | REPEATABLE READ     |                          0 |
| 2106906685 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          5 | NULL      |                     |                 0 |                 2 | REPEATABLE READ     |                          0 |
| 2106906682 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          5 | NULL      |                     |                 0 |                 2 | REPEATABLE READ     |                          0 |
| 2106906686 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          5 | NULL      |                     |                 0 |                 2 | REPEATABLE READ     |                          0 |
| 2106906697 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          5 | NULL      | inserting           |                 1 |                 2 | REPEATABLE READ     |                          0 |
| 2106906708 | RUNNING   | 2023-03-06 10:00:34 | NULL                  | NULL             |          4 | NULL      | inserting           |                 1 |                 2 | REPEATABLE READ     |                          0 |
+------------+-----------+---------------------+-----------------------+------------------+------------+-----------+---------------------+-------------------+-------------------+---------------------+----------------------------+
17 rows in set (0.000 sec)
 
MariaDB [(none)]> show processlist;
+--------+-------------+-----------+------+--------------+------+-----------------------------------------------+------------------+----------+
| Id     | User        | Host      | db   | Command      | Time | State                                         | Info             | Progress |
+--------+-------------+-----------+------+--------------+------+-----------------------------------------------+------------------+----------+
|      5 | system user |           | NULL | Slave_IO     | 7815 | Waiting for master to send event              | NULL             |    0.000 |
|      7 | system user |           | NULL | Slave_worker | 6187 | NULL                                          | NULL             |    0.000 |
|     11 | system user |           | NULL | Slave_worker | 6187 | NULL                                          | NULL             |    0.000 |
|      9 | system user |           | NULL | Slave_worker | 6187 | Waiting for prior transaction to commit       | NULL             |    0.000 |
|      8 | system user |           | NULL | Slave_worker | 6187 | NULL                                          | NULL             |    0.000 |
|     10 | system user |           | NULL | Slave_worker | 6187 | NULL                                          | NULL             |    0.000 |
|     12 | system user |           | NULL | Slave_worker | 6187 | Waiting for prior transaction to commit       | NULL             |    0.000 |
|     14 | system user |           | NULL | Slave_worker | 6187 | Waiting for prior transaction to commit       | NULL             |    0.000 |
|     13 | system user |           | NULL | Slave_worker | 6187 | Waiting for prior transaction to commit       | NULL             |    0.000 |
|     15 | system user |           | NULL | Slave_worker | 6187 | Waiting for prior transaction to commit       | NULL             |    0.000 |
|     16 | system user |           | NULL | Slave_worker | 6187 | Waiting for prior transaction to commit       | NULL             |    0.000 |
|     17 | system user |           | NULL | Slave_worker | 6187 | Waiting for prior transaction to commit       | NULL             |    0.000 |
|     18 | system user |           | NULL | Slave_worker | 6187 | Waiting for prior transaction to commit       | NULL             |    0.000 |
|     19 | system user |           | NULL | Slave_worker | 6187 | Waiting for prior transaction to commit       | NULL             |    0.000 |
|     20 | system user |           | NULL | Slave_worker | 6187 | Waiting for prior transaction to commit       | NULL             |    0.000 |
|     21 | system user |           | NULL | Slave_worker | 6187 | NULL                                          | NULL             |    0.000 |
|     22 | system user |           | NULL | Slave_worker | 6187 | NULL                                          | NULL             |    0.000 |
|      6 | system user |           | NULL | Slave_SQL    | 6194 | Waiting for room in worker thread event queue | NULL             |    0.000 |
| 521724 | root        | localhost | NULL | Query        |    0 | starting                                      | show processlist |    0.000 |
+--------+-------------+-----------+------+--------------+------+-----------------------------------------------+------------------+----------+
19 rows in set (0.000 sec)

# Primary status
MariaDB [(none)]> show master status\G 
            File: mysql-bin.004372
        Position: 65101223
    Binlog_Do_DB: 
Binlog_Ignore_DB: 
 
# Replica status
MariaDB [(none)]> show slave status\G
               Master_Log_File: mysql-bin.004372
           Read_Master_Log_Pos: 65101223
         Relay_Master_Log_File: mysql-bin.003997
           Exec_Master_Log_Pos: 108892490
1 row in set (0.000 sec)
 
MariaDB [(none)]> select sleep(10);show slave status\G
1 row in set (10.000 sec)
 
               Master_Log_File: mysql-bin.004372
           Read_Master_Log_Pos: 65101223
         Relay_Master_Log_File: mysql-bin.003997
           Exec_Master_Log_Pos: 108892490
1 row in set (0.000 sec)
 
MariaDB [(none)]> select sleep(10);show slave status\G
1 row in set (10.000 sec)
 
               Master_Log_File: mysql-bin.004372
           Read_Master_Log_Pos: 65101223
         Relay_Master_Log_File: mysql-bin.003997
           Exec_Master_Log_Pos: 108892490
1 row in set (0.000 sec)
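
The repeated SHOW SLAVE STATUS samples above show the applied position (Relay_Master_Log_File / Exec_Master_Log_Pos) staying fixed while the read position (Master_Log_File / Read_Master_Log_Pos) is far ahead. As a rough sketch of how that check could be automated, the Python below parses two samples of the vertical output and compares them. The function names are illustrative only and are not part of any MariaDB tool. An unchanged position over a short interval can also be caused by a single long-running transaction, so a positive result is a hint rather than proof of a hang.

```python
import re


def parse_slave_status(text):
    """Parse vertical SHOW SLAVE STATUS output into a dict of field -> value."""
    fields = {}
    for line in text.splitlines():
        # Field lines look like "   Exec_Master_Log_Pos: 108892490".
        match = re.match(r"\s*(\w+):\s*(.*)$", line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def sql_thread_stalled(before, after):
    """Return True if the applied position did not move between two samples
    while the replica is still behind what the IO thread has already read."""
    a = parse_slave_status(before)
    b = parse_slave_status(after)
    applied_a = (a["Relay_Master_Log_File"], int(a["Exec_Master_Log_Pos"]))
    applied_b = (b["Relay_Master_Log_File"], int(b["Exec_Master_Log_Pos"]))
    read_b = (b["Master_Log_File"], int(b["Read_Master_Log_Pos"]))
    return applied_a == applied_b and applied_b != read_b


# Values copied from the samples in this report.
sample = """
               Master_Log_File: mysql-bin.004372
           Read_Master_Log_Pos: 65101223
         Relay_Master_Log_File: mysql-bin.003997
           Exec_Master_Log_Pos: 108892490
"""
print(sql_thread_stalled(sample, sample))  # prints True
```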



 Comments   
Comment by Angelique Sklavounos (Inactive) [ 2023-03-30 ]

Hi blylei

When this happens, would it be possible to provide:

  1. full stack trace with: gdb --batch --eval-command="thread apply all bt" <mysqld pid> (https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/#getting-backtraces-from-a-running-mysqld-process-with-gdb-on-linux)
  2. full SHOW SLAVE STATUS\G
  3. SHOW ENGINE INNODB STATUS\G
  4. SHOW PROCESSLIST on Primary server
  5. binlog file covering the time when the slave hangs (it can be uploaded to the private FTP server: https://mariadb.com/kb/en/meta/mariadb-ftp-server/)
  6. error logs (if they show anything)

Thank you
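
Items 1 to 4 of the list above could be gathered in one pass with something like the following Python sketch. It is not an official MariaDB script. It assumes that gdb and the mariadb command-line client are installed on the replica host, that client credentials come from an option file, and that the primary is reachable under a host name you supply. The output file names and the use of gdb's -p option to attach by PID are choices made for this example.

```python
import subprocess


def diagnostic_commands(pid, primary_host, client="mariadb"):
    """Build (output_file, argv) pairs for the requested diagnostics.

    pid          -- PID of the hung server process on the replica
    primary_host -- host name of the primary (supplied by the caller)
    client       -- name of the command-line client binary (assumed)
    """
    backtrace = ["gdb", "--batch",
                 "--eval-command=thread apply all bt", "-p", str(pid)]
    return [
        ("backtrace.txt", backtrace),
        ("slave_status.txt", [client, "-e", "SHOW SLAVE STATUS\\G"]),
        ("innodb_status.txt", [client, "-e", "SHOW ENGINE INNODB STATUS\\G"]),
        # The processlist was requested from the primary, not the replica.
        ("primary_processlist.txt",
         [client, "-h", primary_host, "-e", "SHOW PROCESSLIST"]),
    ]


def collect(commands):
    """Run each command and write its combined output to the named file."""
    for filename, argv in commands:
        with open(filename, "w") as out:
            # check=False: keep collecting even if one command fails.
            subprocess.run(argv, stdout=out,
                           stderr=subprocess.STDOUT, check=False)


# Example (the PID and host name are placeholders):
# collect(diagnostic_commands(12345, "primary.example.com"))
```

The binlog file and error log (items 5 and 6) still have to be copied by hand, since their locations depend on the server configuration.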

Generated at Thu Feb 08 10:18:54 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.