Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-18673

max_statement_time does not work for COMMIT in Galera cluster

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Not a Bug
    • 10.1.37, 10.2.23
    • N/A
    • Galera
    • None

    Description

      Normally max_statement_time setting work for statements in Galera cluster (statement is killed when it runs too long).

      Consider 2 nodes cluster and big enough table (without primary key) created and filled with data like this:

      MariaDB [test]> show create table t1\G
      *************************** 1. row ***************************
      Table: t1
      Create Table: CREATE TABLE `t1` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `val` int(11) DEFAULT NULL,
      PRIMARY KEY (`id`)
      ) ENGINE=InnoDB AUTO_INCREMENT=3054 DEFAULT CHARSET=latin1
      1 row in set (0.00 sec)
       
      MariaDB [test]> create table tbig(c1 int, c2 char(200)) engine=InnoDB;
      Query OK, 0 rows affected (0.30 sec)
       
      MariaDB [test]> insert into tbig values (1, repeat('a', 200)), (2, repeat('b',200));
      Query OK, 2 rows affected (0.13 sec)
      Records: 2 Duplicates: 0 Warnings: 0
       
      MariaDB [test]> insert into tbig select * from tbig;
      Query OK, 2 rows affected (0.09 sec)
      Records: 2 Duplicates: 0 Warnings: 0
       
      ...
       
      MariaDB [test]> insert into tbig select * from tbig;
      Query OK, 524288 rows affected (21.87 sec)
      Records: 524288 Duplicates: 0 Warnings: 0
      

      When we run INSERT ... SELECT we can make sure it does not run too long:

      MariaDB [test]> show variables like 'gtid%';
      +------------------------+---------+
      | Variable_name          | Value   |
      +------------------------+---------+
      | gtid_binlog_pos        | 0-10-25 |
      | gtid_binlog_state      | 0-10-25 |
      | gtid_current_pos       | 0-10-25 |
      | gtid_domain_id         | 0       |
      | gtid_ignore_duplicates | OFF     |
      | gtid_seq_no            | 0       |
      | gtid_slave_pos         |         |
      | gtid_strict_mode       | OFF     |
      +------------------------+---------+
      8 rows in set (0.00 sec)
       
      MariaDB [test]> set statement max_statement_time=5 for insert into tbig select * from tbig limit 1000;
      Query OK, 1000 rows affected (0.14 sec)
      Records: 1000  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> set statement max_statement_time=5 for insert into tbig select * from tbig limit 100000;
      Query OK, 100000 rows affected (3.58 sec)
      Records: 100000  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> set statement max_statement_time=2 for insert into tbig select * from tbig limit 1000000;
      ERROR 1969 (70100): Query execution was interrupted (max_statement_time exceeded)
      MariaDB [test]> show variables like 'gtid%';                                    +------------------------+---------+
      | Variable_name          | Value   |
      +------------------------+---------+
      | gtid_binlog_pos        | 0-10-27 |
      | gtid_binlog_state      | 0-10-27 |
      | gtid_current_pos       | 0-10-27 |
      | gtid_domain_id         | 0       |
      | gtid_ignore_duplicates | OFF     |
      | gtid_seq_no            | 0       |
      | gtid_slave_pos         |         |
      | gtid_strict_mode       | OFF     |
      +------------------------+---------+
      8 rows in set (0.01 sec)
       
      MariaDB [test]> select count(*) from tbig;
      +----------+
      | count(*) |
      +----------+
      |   625288 |
      +----------+
      1 row in set (0.54 sec)
      

      Probably because we interrup it at selcting data stage. When we use explicit transaction and want to make sure COMMIT does NOT take too much time (it may do so as other node is still processing large DELETE from this same table, delete from tbig where c1 = 2

      MariaDB [test]> show processlist;
      +----+-------------+-----------+------+---------+------+----------------------------------------+-------------------------------+----------+
      | Id | User        | Host      | db   | Command | Time | State                                  | Info                          | Progress |
      +----+-------------+-----------+------+---------+------+----------------------------------------+-------------------------------+----------+
      |  1 | system user |           | NULL | Sleep   | 1567 | wsrep aborter idle                     | NULL                          |    0.000 |
      |  2 | system user |           | NULL | Sleep   |    0 | Delete_rows_log_event::find_row(52566) | delete from tbig where c1 = 2 |    0.000 |
      |  3 | system user |           | NULL | Daemon  | NULL | InnoDB purge coordinator               | NULL                          |    0.000 |
      |  4 | system user |           | NULL | Daemon  | NULL | InnoDB purge worker                    | NULL                          |    0.000 |
      |  5 | system user |           | NULL | Daemon  | NULL | InnoDB purge worker                    | NULL                          |    0.000 |
      |  6 | system user |           | NULL | Daemon  | NULL | InnoDB purge worker                    | NULL                          |    0.000 |
      |  7 | system user |           | NULL | Daemon  | NULL | InnoDB shutdown handler                | NULL                          |    0.000 |
      | 11 | root        | localhost | test | Query   |    0 | init                                   | show processlist              |    0.000 |
      +----+-------------+-----------+------+---------+------+----------------------------------------+-------------------------------+----------+
      8 rows in set (0.00 sec)
       
      MariaDB [test]> show variables like 'gtid%';
      +------------------------+---------+
      | Variable_name          | Value   |
      +------------------------+---------+
      | gtid_binlog_pos        | 0-10-21 |
      | gtid_binlog_state      | 0-10-21 |
      | gtid_current_pos       | 0-10-21 |
      | gtid_domain_id         | 0       |
      | gtid_ignore_duplicates | OFF     |
      | gtid_seq_no            | 0       |
      | gtid_slave_pos         |         |
      | gtid_strict_mode       | OFF     |
      +------------------------+---------+
      8 rows in set (0.00 sec)
      

      we can not make max_statement_time work:

      MariaDB [test]> select count(*) from tbig;                                      
      +----------+
      | count(*) |
      +----------+
      |   625288 |
      +----------+
      1 row in set (0.53 sec)
       
      MariaDB [test]> start transaction;
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [test]> insert into tbig select * from tbig;                            
      Query OK, 625288 rows affected (24.04 sec)
      Records: 625288  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> set statement max_statement_time=2 for commit;                  
      Query OK, 0 rows affected (3.43 sec)
       
      MariaDB [test]> select count(*) from tbig;                                      
      +----------+
      | count(*) |
      +----------+
      |  1250576 |
      +----------+
      1 row in set (1.23 sec)
      

      As you can see above, COMMIT took 3.43 seconds to execute while we wanted it to run for not more than 2 seconds.

      Probably that's because COMMIT in Galera (or any DML statement in certifcation/committing stage) is not interruptable, neither by explicit KILL nor by max_statement_time setting). If this is the case, it should be documented as a known limitation, or fixed, so that setting max_statement_time works as expected for all statements.

      Attachments

        Activity

          Does that same max_statement_time on commit work if you do not have Galera ?

          jplindst Jan Lindström (Inactive) added a comment - Does that same max_statement_time on commit work if you do not have Galera ?

          As far as I understand, in case of InnoDB or RocksDB outside Galera COMMIT can not wait anyway, if it fails it should fail immediately.

          I do not have clear idea on how to make COMMIT waiting for some time so that max_statement_time applies, for pure InnoDB.

          valerii Valerii Kravchuk added a comment - As far as I understand, in case of InnoDB or RocksDB outside Galera COMMIT can not wait anyway, if it fails it should fail immediately. I do not have clear idea on how to make COMMIT waiting for some time so that max_statement_time applies, for pure InnoDB.

          Agreed. On standalone InnoDB commit on this case is almost instantaneous. However, in Galera case we need to certify insert on master and at commit send it to slave. I do not think it is even intended that max_statement_time includes the time required to replicate the transaction from master to slave. In normal MySQL replication (row-based) not sure if you could get binlog write to take so much time that max_statement_time would expire and even then not sure if commit can really fail.

          jplindst Jan Lindström (Inactive) added a comment - Agreed. On standalone InnoDB commit on this case is almost instantaneous. However, in Galera case we need to certify insert on master and at commit send it to slave. I do not think it is even intended that max_statement_time includes the time required to replicate the transaction from master to slave. In normal MySQL replication (row-based) not sure if you could get binlog write to take so much time that max_statement_time would expire and even then not sure if commit can really fail.

          greenman Can you please update this limitation to documentation.

          jplindst Jan Lindström (Inactive) added a comment - greenman Can you please update this limitation to documentation.
          greenman Ian Gilfillan added a comment -

          Updated

          greenman Ian Gilfillan added a comment - Updated

          People

            greenman Ian Gilfillan
            valerii Valerii Kravchuk
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.