Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-4240

mariadb 5.3.12 using more memory than MySQL 5.1 for an inefficient query

Details

    • Bug
    • Status: Closed (View Workflow)
    • Minor
    • Resolution: Fixed
    • 5.3.12
    • 5.3.13
    • None
    • RedHat EL6 using mariadb-5.3.12-Linux-x86_64.tar.gz

    Description

      Hello and thank you for mariadb,

      I noticed mariadb 5.3.12 using a lot more memory than MySQL 5.1.48 for a very inefficient query (that should never get near production):

      CREATE TABLE `table_a` (
        `field_c` varchar(8) DEFAULT NULL,
        `field_d` varchar(11) DEFAULT NULL,
        UNIQUE KEY `field_c` (`field_c`),
        UNIQUE KEY `field_d` (`field_d`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1

      CREATE TABLE `table_b` (
        `field_c` char(9) NOT NULL,
        `field_d` char(12) DEFAULT NULL,
        PRIMARY KEY (`field_c`),
        KEY `field_d` (`field_d`)
      ) ENGINE=MyISAM DEFAULT CHARSET=latin1

      table_a has 4,834,708 rows,
      table_b has 6,522,728 rows, I will upload a tar.gz with mysqldumps of the tables.

      The query:

      SELECT
        table_a.field_d
      FROM
        table_a
        INNER JOIN table_b
          ON table_a.field_c = SUBSTR(table_b.field_c, 1, 8) OR table_a.field_d = SUBSTR(table_b.field_d, 1, 11);

      MySQL 5.1.48 takes a while but does return 4,099,991 rows with:
      VmPeak: 9891240 kB
      VmHWM: 7855352 kB
      Staying under a 10 Gigabyte virtual memory ulimit.

      +----+-------------+---------+------+-----------------+------+---------+------+---------+----------+------------------------------------------------+
      | id | select_type | table   | type | possible_keys   | key  | key_len | ref  | rows    | filtered | Extra                                          |
      +----+-------------+---------+------+-----------------+------+---------+------+---------+----------+------------------------------------------------+
      |  1 | SIMPLE      | table_b | ALL  | NULL            | NULL | NULL    | NULL | 6522728 |   100.00 |                                                |
      |  1 | SIMPLE      | table_a | ALL  | field_c,field_d | NULL | NULL    | NULL | 4817421 |   100.00 | Range checked for each record (index map: 0x3) |
      +----+-------------+---------+------+-----------------+------+---------+------+---------+----------+------------------------------------------------+
      select `test`.`table_a`.`field_d` AS `field_d` from `test`.`table_a` join `test`.`table_b` where ((`test`.`table_a`.`field_c` = substr(`test`.`table_b`.`field_c`,1,8)) or (`test`.`table_a`.`field_d` = substr(`test`.`table_b`.`field_d`,1,11)))

      mariadb 5.3.12 hits a 20 Gigabyte virtual memory ulimit, RSS is around 16G.

      +----+-------------+---------+------+-----------------+------+---------+------+---------+----------+------------------------------------------------+
      | id | select_type | table   | type | possible_keys   | key  | key_len | ref  | rows    | filtered | Extra                                          |
      +----+-------------+---------+------+-----------------+------+---------+------+---------+----------+------------------------------------------------+
      |  1 | SIMPLE      | table_b | ALL  | NULL            | NULL | NULL    | NULL | 6522728 |   100.00 |                                                |
      |  1 | SIMPLE      | table_a | ALL  | field_c,field_d | NULL | NULL    | NULL | 4798671 |   100.00 | Range checked for each record (index map: 0x3) |
      +----+-------------+---------+------+-----------------+------+---------+------+---------+----------+------------------------------------------------+
      select `test`.`table_a`.`field_d` AS `field_d` from `test`.`table_a` join `test`.`table_b` where ((`test`.`table_a`.`field_c` = substr(`test`.`table_b`.`field_c`,1,8)) or (`test`.`table_a`.`field_d` = substr(`test`.`table_b`.`field_d`,1,11)))

      I was using the following when reproducing:

      [mysqld]
      key_buffer = 500M
      innodb_buffer_pool_size = 2000M

      Thank you.

      Attachments

        Activity

          The patch:

          === modified file 'sql/opt_range.cc'
          — sql/opt_range.cc 2013-02-28 20:48:47 +0000
          +++ sql/opt_range.cc 2013-03-29 11:05:57 +0000
          @@ -2266,11 +2266,21 @@ int QUICK_ROR_UNION_SELECT::reset()
          if (!scans_inited)
          {
          List_iterator_fast<QUICK_SELECT_I> it(quick_selects);
          +
          + /* Provide a MEM-root to children. */
          + MEM_ROOT *save_mem_root= thd->mem_root;
          + thd->mem_root= &alloc;
          +
          while ((quick= it++))
          {
          if (quick->init_ror_merged_scan(FALSE))
          +

          { + thd->mem_root= save_mem_root; DBUG_RETURN(1); + }

          }
          + thd->mem_root= save_mem_root;
          +
          scans_inited= TRUE;
          }
          queue_remove_all(&queue);

          This patch needs to be improved - there are other kinds of quick selects that may call handler::clone(). They all should make sure that thd->mem_root is pointing to quick select's mem_root. If thd->mem_root is a statement-execution-mem_root, then that mem_root will accumulate a lot of data.

          psergei Sergei Petrunia added a comment - The patch: === modified file 'sql/opt_range.cc' — sql/opt_range.cc 2013-02-28 20:48:47 +0000 +++ sql/opt_range.cc 2013-03-29 11:05:57 +0000 @@ -2266,11 +2266,21 @@ int QUICK_ROR_UNION_SELECT::reset() if (!scans_inited) { List_iterator_fast<QUICK_SELECT_I> it(quick_selects); + + /* Provide a MEM-root to children. */ + MEM_ROOT *save_mem_root= thd->mem_root; + thd->mem_root= &alloc; + while ((quick= it++)) { if (quick->init_ror_merged_scan(FALSE)) + { + thd->mem_root= save_mem_root; DBUG_RETURN(1); + } } + thd->mem_root= save_mem_root; + scans_inited= TRUE; } queue_remove_all(&queue); This patch needs to be improved - there are other kinds of quick selects that may call handler::clone(). They all should make sure that thd->mem_root is pointing to quick select's mem_root. If thd->mem_root is a statement-execution-mem_root, then that mem_root will accumulate a lot of data.

          Committed another variant of the patch.

          psergei Sergei Petrunia added a comment - Committed another variant of the patch.

          Sanja, please reassign back when review is done.

          psergei Sergei Petrunia added a comment - Sanja, please reassign back when review is done.

          OK to push

          sanja Oleksandr Byelkin added a comment - OK to push

          Fix pushed into 5.3

          psergei Sergei Petrunia added a comment - Fix pushed into 5.3

          People

            psergei Sergei Petrunia
            thatsafunnyname Peter (Stig) Edwards
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.