Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-26712

row events never reset thd->mem_root

Details

    Description

      row event execution starts from Rows_log_event::do_apply_event which calls open_and_lock_tables() which calls lock_tables(), which does

      if (!(ptr=start=(TABLE**) thd->alloc(sizeof(TABLE*)*count)))
      

      thus allocating an array of points of the thd's memroot. This is just few bytes, 8 in the test below. But this memory is never freed, not by row events at least. So, by creating a transaction with a huge number of row events we can consume any amount of memory.

      source include/have_innodb.inc;
      source include/have_binlog_format_row.inc;
      source include/master-slave.inc;
      create table t1 (a int) engine=innodb;
      insert t1 values (1);
       
      sync_slave_with_master;
      stop slave;
      set global max_session_mem_used=10*1024*1024;
      start slave;
       
      connection master;
      let $a=1000000;
      start transaction;
      disable_query_log;
      while ($a) {
        update t1 set a=a+1;
        dec $a;
      }
      enable_query_log;
      commit;
      drop table t1;
      sync_slave_with_master;
      set global max_session_mem_used=default;
      source include/rpl_end.inc;
      

      To fix that, rbr events must do

      free_root(thd->mem_root, MYF(MY_KEEP_PREALLOC));
      

      somewhere. Where it's safe, because it'll destroy anything allocated on the thd memroot.
      This patch appears to be working fine, it frees thd memroot after closing all opened tables:

      --- a/sql/rpl_rli.cc
      +++ b/sql/rpl_rli.cc
      @@ -2326,6 +2326,7 @@ void rpl_group_info::cleanup_context(THD *thd, bool error)
         */
         reset_row_stmt_start_timestamp();
         unset_long_find_row_note_printed();
      +  free_root(thd->mem_root, MYF(MY_KEEP_PREALLOC));
       
         DBUG_EXECUTE_IF("inject_sleep_gtid_100_x_x", {
             if (current_gtid.domain_id == 100)
      

      Attachments

        Issue Links

          Activity

            • Note, when running the test you might see it apparently stuck after rpl_end line. This is mysqltest noticing the replication failure is trying to consume the output of SHOW BINLOG EVENTS. I used to press ^C at that point.
            • This is obviously not a test case to be pushed into mysql-test/. Just sayin'
            • If you're curious, you can run from another terminal mysql --socket=mysql-test/var/tmp/mysqld.2.sock -uroot and then select id,command,state,memory_used from information_schema.processlist; to watch it grow (or not, if you've applied the fix}}.
            serg Sergei Golubchik added a comment - Note, when running the test you might see it apparently stuck after rpl_end line. This is mysqltest noticing the replication failure is trying to consume the output of SHOW BINLOG EVENTS . I used to press ^C at that point. This is obviously not a test case to be pushed into mysql-test/ . Just sayin' If you're curious, you can run from another terminal mysql --socket=mysql-test/var/tmp/mysqld.2.sock -uroot and then select id,command,state,memory_used from information_schema.processlist; to watch it grow (or not, if you've applied the fix}}.
            Elkin Andrei Elkin added a comment -

            To complete the description, the THD::mem_root of the slave applier gets finally released after commit. More precisely at the end of the following Gtid_log_event.

            Elkin Andrei Elkin added a comment - To complete the description, the THD::mem_root of the slave applier gets finally released after commit. More precisely at the end of the following Gtid_log_event.

            Confirmed. Memory starts increasing on the slave after the commit on the master:

            10.5.13 c430aa72abbdccb1ece7f0d65b49a6b48e7c5ba7 (Debug)

            MariaDB [(none)]> select id,command,state,memory_used from information_schema.processlist;
            +----+-----------+------------------------------------------------------+-------------+
            | id | command   | state                                                | memory_used |
            +----+-----------+------------------------------------------------------+-------------+
            | 13 | Query     | Filling schema table                                 |      132368 |
            | 12 | Slave_SQL | Reading event from the relay log                     |     4390792 |
            | 11 | Slave_IO  | Queueing master event to the relay log               |       46392 |
            | 10 | Sleep     |                                                      |       74920 |
            |  9 | Query     | Waiting for the slave SQL thread to advance position |       74920 |
            |  6 | Sleep     |                                                      |       74920 |
            |  5 | Sleep     |                                                      |       73848 |
            +----+-----------+------------------------------------------------------+-------------+
            7 rows in set (0.002 sec)
             
            MariaDB [(none)]> select id,command,state,memory_used from information_schema.processlist;
            +----+-----------+------------------------------------------------------+-------------+
            | id | command   | state                                                | memory_used |
            +----+-----------+------------------------------------------------------+-------------+
            | 13 | Query     | Filling schema table                                 |      132368 |
            | 12 | Slave_SQL | Reading event from the relay log                     |     4783096 |
            | 11 | Slave_IO  | Queueing master event to the relay log               |       46392 |
            | 10 | Sleep     |                                                      |       74920 |
            |  9 | Query     | Waiting for the slave SQL thread to advance position |       74920 |
            |  6 | Sleep     |                                                      |       74920 |
            |  5 | Sleep     |                                                      |       73848 |
            +----+-----------+------------------------------------------------------+-------------+
            7 rows in set (0.002 sec)
            

            And slave error log mysql-test/var/log/mysqld.2.err shows:

            10.5.13 c430aa72abbdccb1ece7f0d65b49a6b48e7c5ba7 (Debug)

            2021-09-29 20:23:24 12 [ERROR] Slave SQL: Error executing row event: 'The MariaDB server is running with the --max-thread-mem-used=10485760 option so it cannot execute this statement', Gtid 0-1-3, Internal MariaDB error code: 1290
            

            Roel Roel Van de Paar added a comment - Confirmed. Memory starts increasing on the slave after the commit on the master: 10.5.13 c430aa72abbdccb1ece7f0d65b49a6b48e7c5ba7 (Debug) MariaDB [(none)]> select id,command,state,memory_used from information_schema.processlist; +----+-----------+------------------------------------------------------+-------------+ | id | command | state | memory_used | +----+-----------+------------------------------------------------------+-------------+ | 13 | Query | Filling schema table | 132368 | | 12 | Slave_SQL | Reading event from the relay log | 4390792 | | 11 | Slave_IO | Queueing master event to the relay log | 46392 | | 10 | Sleep | | 74920 | | 9 | Query | Waiting for the slave SQL thread to advance position | 74920 | | 6 | Sleep | | 74920 | | 5 | Sleep | | 73848 | +----+-----------+------------------------------------------------------+-------------+ 7 rows in set (0.002 sec)   MariaDB [(none)]> select id,command,state,memory_used from information_schema.processlist; +----+-----------+------------------------------------------------------+-------------+ | id | command | state | memory_used | +----+-----------+------------------------------------------------------+-------------+ | 13 | Query | Filling schema table | 132368 | | 12 | Slave_SQL | Reading event from the relay log | 4783096 | | 11 | Slave_IO | Queueing master event to the relay log | 46392 | | 10 | Sleep | | 74920 | | 9 | Query | Waiting for the slave SQL thread to advance position | 74920 | | 6 | Sleep | | 74920 | | 5 | Sleep | | 73848 | +----+-----------+------------------------------------------------------+-------------+ 7 rows in set (0.002 sec) And slave error log mysql-test/var/log/mysqld.2.err shows: 10.5.13 c430aa72abbdccb1ece7f0d65b49a6b48e7c5ba7 (Debug) 2021-09-29 20:23:24 12 [ERROR] Slave SQL: Error executing row event: 'The MariaDB server is running with the --max-thread-mem-used=10485760 option so it cannot execute this statement', Gtid 0-1-3, Internal MariaDB error code: 1290
            Roel Roel Van de Paar added a comment - - edited

            Bug also present in 10.4

            10.4.22 03a10706ecfa5cf0a6252d436e341483f5c0827c (Debug)

            | id | command   | state                                                | memory_used |
            ...
            | 17 | Slave_SQL | Closing tables                                       |     7183648 |
            ...
            | 17 | Slave_SQL | Reading event from the relay log                     |     8180632 |
            

            Roel Roel Van de Paar added a comment - - edited Bug also present in 10.4 10.4.22 03a10706ecfa5cf0a6252d436e341483f5c0827c (Debug) | id | command | state | memory_used | ... | 17 | Slave_SQL | Closing tables | 7183648 | ... | 17 | Slave_SQL | Reading event from the relay log | 8180632 |
            Roel Roel Van de Paar added a comment - - edited

            Bug now confirmed in 10.2 to 10.6. (To execute on 10.2 use ./mtr --suite=main test after placing test into mysql-test/t/test.test).

            10.2.41 (Debug)

            | 17 | Slave_SQL | after apply log event                                |     4205160 |
            ...
            | 17 | Slave_SQL | Reading event from the relay log                     |     9239112 |
            

            10.3.32 b112c9dfaacbcb7c3548414c6f402114663223dc (Debug)

            | 17 | Slave_SQL | Reading event from the relay log                     |     4386400 |
            ...
            | 17 | Slave_SQL | Reading event from the relay log                     |     6249616 |
            

            10.4-10.5: ref comments above.

            10.6.5 1e9c922fa726b22f4522f2a4de0fcb6595404086 (Debug)

            | 12 | Slave_SQL | Reading event from the relay log                     |     2724176 |
            ...
            | 12 | Slave_SQL | Executing                                            |     7186640 |
            

            10.7 However is not affected (tested twice):

            10.7.0 d552e092c9f3e20da078d1b62b976f629f73d3a4 (Debug)

            MariaDB [(none)]> select id,command,state,memory_used from information_schema.processlist;
            +----+-----------+------------------------------------------------------+-------------+
            | id | command   | state                                                | memory_used |
            +----+-----------+------------------------------------------------------+-------------+
            | 13 | Query     | Filling schema table                                 |       88304 |
            | 12 | Slave_SQL | Executing                                            |       45768 |
            | 11 | Slave_IO  | Queueing master event to the relay log               |       45768 |
            | 10 | Sleep     |                                                      |       45768 |
            |  9 | Query     | Waiting for the slave SQL thread to advance position |       45768 |
            |  6 | Sleep     |                                                      |       45768 |
            |  5 | Sleep     |                                                      |       45768 |
            +----+-----------+------------------------------------------------------+-------------+
            7 rows in set (0.002 sec)
             
            MariaDB [(none)]> select id,command,state,memory_used from information_schema.processlist;
            +----+-----------+------------------------------------------------------+-------------+
            | id | command   | state                                                | memory_used |
            +----+-----------+------------------------------------------------------+-------------+
            | 13 | Query     | Filling schema table                                 |       88304 |
            | 12 | Slave_SQL | Reading event from the relay log                     |       45768 |
            | 11 | Slave_IO  | Queueing master event to the relay log               |       45768 |
            | 10 | Sleep     |                                                      |       45768 |
            |  9 | Query     | Waiting for the slave SQL thread to advance position |       45768 |
            |  6 | Sleep     |                                                      |       45768 |
            |  5 | Sleep     |                                                      |       45768 |
            +----+-----------+------------------------------------------------------+-------------+
            7 rows in set (0.003 sec)
            

            Additionally, the 10.7 test succeeds

            10.7.0 d552e092c9f3e20da078d1b62b976f629f73d3a4 (Debug)

            ...
            commit;
            drop table t1;
            connection slave;
            set global max_session_mem_used=default;
            include/rpl_end.inc
            main.test 'innodb,row'                   [ pass ]  396922
            --------------------------------------------------------------------------
            The servers were restarted 0 times
            Spent 396.922 of 401 seconds executing testcases
             
            Completed: All 1 tests were successful.
            

            Roel Roel Van de Paar added a comment - - edited Bug now confirmed in 10.2 to 10.6. (To execute on 10.2 use ./mtr --suite=main test after placing test into mysql-test/t/test.test ). 10.2.41 (Debug) | 17 | Slave_SQL | after apply log event | 4205160 | ... | 17 | Slave_SQL | Reading event from the relay log | 9239112 | 10.3.32 b112c9dfaacbcb7c3548414c6f402114663223dc (Debug) | 17 | Slave_SQL | Reading event from the relay log | 4386400 | ... | 17 | Slave_SQL | Reading event from the relay log | 6249616 | 10.4-10.5: ref comments above. 10.6.5 1e9c922fa726b22f4522f2a4de0fcb6595404086 (Debug) | 12 | Slave_SQL | Reading event from the relay log | 2724176 | ... | 12 | Slave_SQL | Executing | 7186640 | 10.7 However is not affected (tested twice): 10.7.0 d552e092c9f3e20da078d1b62b976f629f73d3a4 (Debug) MariaDB [(none)]> select id,command,state,memory_used from information_schema.processlist; +----+-----------+------------------------------------------------------+-------------+ | id | command | state | memory_used | +----+-----------+------------------------------------------------------+-------------+ | 13 | Query | Filling schema table | 88304 | | 12 | Slave_SQL | Executing | 45768 | | 11 | Slave_IO | Queueing master event to the relay log | 45768 | | 10 | Sleep | | 45768 | | 9 | Query | Waiting for the slave SQL thread to advance position | 45768 | | 6 | Sleep | | 45768 | | 5 | Sleep | | 45768 | +----+-----------+------------------------------------------------------+-------------+ 7 rows in set (0.002 sec)   MariaDB [(none)]> select id,command,state,memory_used from information_schema.processlist; +----+-----------+------------------------------------------------------+-------------+ | id | command | state | memory_used | +----+-----------+------------------------------------------------------+-------------+ | 13 | Query | Filling schema table | 88304 | | 12 | Slave_SQL | Reading event from the relay log | 45768 | | 11 | Slave_IO | Queueing master event to the relay log | 45768 | | 10 | Sleep | | 45768 | | 9 | Query | Waiting for the slave SQL thread to advance position | 45768 | | 6 | Sleep | | 45768 | | 5 | Sleep | | 45768 | +----+-----------+------------------------------------------------------+-------------+ 7 rows in set (0.003 sec) Additionally, the 10.7 test succeeds 10.7.0 d552e092c9f3e20da078d1b62b976f629f73d3a4 (Debug) ... commit; drop table t1; connection slave; set global max_session_mem_used=default; include/rpl_end.inc main.test 'innodb,row' [ pass ] 396922 -------------------------------------------------------------------------- The servers were restarted 0 times Spent 396.922 of 401 seconds executing testcases   Completed: All 1 tests were successful.
            Elkin Andrei Elkin added a comment -

            Sergei, as you started on it anyway, could you please review as well?
            The patch ddf10294831 is in bb-10.2-andrei (will have merge conflicts with 10.4 and 10.5,
            resolved yet locally). Thanks.

            Elkin Andrei Elkin added a comment - Sergei, as you started on it anyway, could you please review as well? The patch ddf10294831 is in bb-10.2-andrei (will have merge conflicts with 10.4 and 10.5, resolved yet locally). Thanks.

            ddf10294831 looks ok to me

            serg Sergei Golubchik added a comment - ddf10294831 looks ok to me
            Elkin Andrei Elkin added a comment -

            For merging 10.2->10.3 and 10.4->10.5 conflicts are resolved in

            bb-10.3-MDEV-26712-andrei,
            bb-10.5-MDEV-26712-andrei.

            Elkin Andrei Elkin added a comment - For merging 10.2->10.3 and 10.4->10.5 conflicts are resolved in bb-10.3- MDEV-26712 -andrei, bb-10.5- MDEV-26712 -andrei.

            People

              Elkin Andrei Elkin
              serg Sergei Golubchik
              Votes:
              2 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.