Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-9764

MariaDB does not limit memory used for range optimization

Details

    • 10.2.1-1, 10.2.1-2, 5.5.50, 10.1.15, 10.2.4-1, 10.1.21

    Description

      `Unlike MySQL 5.6 (that provides eq_range_index_dive_limit server variable) and MySQL 5.7 (that additionally provides range_optimizer_max_mem_size server variable) MariaDB 10.x does NOT limit memory used for range optimization in any way (neither directly nor indirectly).

      As a result, for some queries with huge list of value in the IN () list in the WHERE clause (having number of items comparable to the number of rows in the table) MariaDB 10.x does not only execute the query longer than MySQL 5.6 and 5.7 with default settings (as it spends more time on query optimization), but also uses a lot of memory in the process (see https://bugs.mysql.com/bug.php?id=78973 for some estimations and links). With many concurrent connections running this kind of queries we may easily end up with OOM condition for the mysqld process.

      Consider a table like this:

      create table t (id bigint not null auto_increment, ts datetime, primary key(id,ts)) engine=InnoDB;
      insert into t(ts) values(now());
      insert into t(ts) values(now());
      replace into t(ts) select now() from t t1, t t2, t t3, t t4, t t5, t t6, t t7, t t8, t t9, t t10, t t11, t t12, t t13, t t14, t t15,t t16, t t17, t t18;
      analyze table t;
      

      and a query like this:

      SELECT * FROM t WHERE id IN (10000,10001,10002, ... 54106,54107);
      

      Run the query on different versions of MySQL and MariaDB, compare execution times and memory used in the process. It may be easier to see the memory impact on versions that do not allow to monitor memory used per session or "state" by running many copies of the same query concurrently using mysqlslap, for example.

      Attachments

        Issue Links

          Activity

            valerii Valerii Kravchuk added a comment - - edited

            I've checked today with recent MariaDB 10.1.13 built from current sources on my QuadCore box. I've got the following results:

            Execution of the query:

            ...
            | 54107 | 2016-03-21 11:43:03 |
            +-------+---------------------+
            43238 rows in set (0.45 sec)
            
            

            Then I've executed mysqlslap like this:

            mysqlslap -uroot --concurrency=100 --iterations=100 --create-schema=test --query=/tmp/query.sql --delimiter=";" --socket=/tmp/mysql.sock
            
            

            and in another session got the following on memory usage:

            MariaDB [test]> select sum(memory_used)/1024/1024, state, count(*), max(Time), avg(Time) FROM information_schema.processlist group by state with rollup;
            +----------------------------+----------------------+----------+-----------+-----------+
            | sum(memory_used)/1024/1024 | state                | count(*) | max(Time) | avg(Time) |
            +----------------------------+----------------------+----------+-----------+-----------+
            |                 0.06581116 |                      |        1 |         5 |    5.0000 |
            |                 0.08060455 | Filling schema table |        1 |         0 |    0.0000 |
            |              2235.60366058 | statistics           |      100 |         5 |    5.0000 |
            |              2235.75007629 | NULL                 |      102 |         5 |    4.9510 |
            +----------------------------+----------------------+----------+-----------+-----------+
            4 rows in set (2.09 sec)
             
            MariaDB [test]> select sum(memory_used)/1024/1024, state, count(*), max(Time), avg(Time) FROM information_schema.processlist group by state with rollup;
            +----------------------------+----------------------+----------+-----------+-----------+
            | sum(memory_used)/1024/1024 | state                | count(*) | max(Time) | avg(Time) |
            +----------------------------+----------------------+----------+-----------+-----------+
            |                 0.06581116 |                      |        1 |        25 |   25.0000 |
            |                 0.08060455 | Filling schema table |        1 |         0 |    0.0000 |
            |                24.48434448 | preparing            |        2 |        25 |   24.5000 |
            |               673.31947327 | Sending data         |       55 |        25 |   24.6909 |
            |              1968.40044403 | statistics           |       31 |        25 |   24.2258 |
            |              2666.35067749 | NULL                 |       90 |        25 |   24.2556 |
            +----------------------------+----------------------+----------+-----------+-----------+
            6 rows in set (0.08 sec)
            

            Then mysqld process was killed by OOM killer:

            2016-03-21 11:42:12 140198044989504 [Note] /home/openxs/dbs/maria10.1/bin/mysqld: ready for connections.
            Version: '10.1.13-MariaDB'  socket: '/tmp/mysql.sock'  port: 3306  Source distribution
            160321 11:48:51 mysqld_safe Number of processes running now: 0
            160321 11:48:51 mysqld_safe mysqld restarted
            2016-03-21 11:48:51 140211571005504 [Note] /home/openxs/dbs/maria10.1/bin/mysqld (mysqld 10.1.13-MariaDB) starting as process 11422 ...
             
            [openxs@fc23 maria10.1]$ journalctl -b | grep mysqld
            Mar 21 11:48:46 fc23 kernel: mysqld invoked oom-killer: gfp_mask=0x24280ca, order=0, oom_score_adj=0
            Mar 21 11:48:47 fc23 kernel: mysqld cpuset=/ mems_allowed=0
            Mar 21 11:48:47 fc23 kernel: CPU: 3 PID: 11330 Comm: mysqld Not tainted 4.4.4-301.fc23.x86_64 #1
            Mar 21 11:48:49 fc23 kernel: [10505]  1000 10505    29781      111      11       3        0             0 mysqld_safe
            Mar 21 11:48:49 fc23 kernel: [10567]  1000 10567  2122745  1704741    3440      12        0             0 mysqld
            Mar 21 11:48:49 fc23 kernel: Out of memory: Kill process 10567 (mysqld) score 839 or sacrifice child
            Mar 21 11:48:49 fc23 kernel: Killed process 10567 (mysqld) total-vm:8490980kB, anon-rss:6818988kB, file-rss:0kB
            

            With Percona Server 5.6.28-76.1 built from source on the same hardware I've got the following execution time:

            ...
            | 54107 | 2016-03-21 11:55:08 |
            +-------+---------------------+
            43238 rows in set (0.24 sec)
            

            and memory usage while the same mysqlsap was running the query in 100 connections based on top was like this:

            11793 openxs    20   0 5409252 4.062g  15772 S 378.4 52.3   0:40.17 mysqld
            ...
            11793 openxs    20   0 5634456 4.509g  15772 S 384.1 58.1   2:12.86 mysqld
            ...
            11793 openxs    20   0 5739204 4.518g  15772 S 379.4 58.2  22:50.10 mysqld
            ...
            11793 openxs    20   0 5853164 4.671g  15772 S 380.7 60.2  53:07.90 mysqld
            

            The command completed successfully:

            [openxs@fc23 p5.6]$ bin/mysqlslap -uroot --concurrency=100 --iterations=100 --create-schema=test --query=/tmp/query.sql --delimiter=";" --socket=/tmp/mysql.sock
            Benchmark
                    Average number of seconds to run all queries: 8.560 seconds
                    Minimum number of seconds to run all queries: 8.344 seconds
                    Maximum number of seconds to run all queries: 16.744 seconds
                    Number of clients running queries: 100
                    Average number of queries per client: 1
            

            The system used for testing was like this:

            [openxs@fc23 percona-toolkit]$ bin/pt-summary
            # Percona Toolkit System Summary Report ######################
                    Date | 2016-03-21 10:08:35 UTC (local TZ: EET +0200)
                Hostname | fc23
                  Uptime |  1:38,  3 users,  load average: 61.94, 58.26, 36.55
                Platform | Linux
                 Release | Fedora release 23 (Twenty Three)
                  Kernel | 4.4.4-301.fc23.x86_64
            Architecture | CPU = 64-bit, OS = 64-bit
               Threading | NPTL 2.22
                 SELinux | Enforcing
             Virtualized | No virtualization detected
            # Processor ##################################################
              Processors | physical = 1, cores = 4, virtual = 4, hyperthreading = no
                  Speeds | 4x2499.000
                  Models | 4xIntel(R) Core(TM)2 Quad CPU Q8300 @ 2.50GHz
                  Caches | 4x2048 KB
            # Memory #####################################################
                   Total | 7.8G
                    Free | 1.7G
                    Used | physical = 5.3G, swap allocated = 0.0, swap used = 0.0, virtual = 5.3G
                 Buffers | 795.5M
                  Caches | 2.2G
                   Dirty | 49416 kB
                 UsedRSS | 5.3G
              Swappiness | 60
             DirtyPolicy | 20, 10
             DirtyStatus | 0, 0
            ...
            

            valerii Valerii Kravchuk added a comment - - edited I've checked today with recent MariaDB 10.1.13 built from current sources on my QuadCore box. I've got the following results: Execution of the query: ... | 54107 | 2016-03-21 11:43:03 | +-------+---------------------+ 43238 rows in set (0.45 sec) Then I've executed mysqlslap like this: mysqlslap -uroot --concurrency=100 --iterations=100 --create-schema=test --query=/tmp/query.sql --delimiter=";" --socket=/tmp/mysql.sock and in another session got the following on memory usage: MariaDB [test]> select sum(memory_used)/1024/1024, state, count(*), max(Time), avg(Time) FROM information_schema.processlist group by state with rollup; +----------------------------+----------------------+----------+-----------+-----------+ | sum(memory_used)/1024/1024 | state | count(*) | max(Time) | avg(Time) | +----------------------------+----------------------+----------+-----------+-----------+ | 0.06581116 | | 1 | 5 | 5.0000 | | 0.08060455 | Filling schema table | 1 | 0 | 0.0000 | | 2235.60366058 | statistics | 100 | 5 | 5.0000 | | 2235.75007629 | NULL | 102 | 5 | 4.9510 | +----------------------------+----------------------+----------+-----------+-----------+ 4 rows in set (2.09 sec)   MariaDB [test]> select sum(memory_used)/1024/1024, state, count(*), max(Time), avg(Time) FROM information_schema.processlist group by state with rollup; +----------------------------+----------------------+----------+-----------+-----------+ | sum(memory_used)/1024/1024 | state | count(*) | max(Time) | avg(Time) | +----------------------------+----------------------+----------+-----------+-----------+ | 0.06581116 | | 1 | 25 | 25.0000 | | 0.08060455 | Filling schema table | 1 | 0 | 0.0000 | | 24.48434448 | preparing | 2 | 25 | 24.5000 | | 673.31947327 | Sending data | 55 | 25 | 24.6909 | | 1968.40044403 | statistics | 31 | 25 | 24.2258 | | 2666.35067749 | NULL | 90 | 25 | 24.2556 | +----------------------------+----------------------+----------+-----------+-----------+ 6 rows in set (0.08 sec) Then mysqld process was killed by OOM killer: 2016-03-21 11:42:12 140198044989504 [Note] /home/openxs/dbs/maria10.1/bin/mysqld: ready for connections. Version: '10.1.13-MariaDB' socket: '/tmp/mysql.sock' port: 3306 Source distribution 160321 11:48:51 mysqld_safe Number of processes running now: 0 160321 11:48:51 mysqld_safe mysqld restarted 2016-03-21 11:48:51 140211571005504 [Note] /home/openxs/dbs/maria10.1/bin/mysqld (mysqld 10.1.13-MariaDB) starting as process 11422 ...   [openxs@fc23 maria10.1]$ journalctl -b | grep mysqld Mar 21 11:48:46 fc23 kernel: mysqld invoked oom-killer: gfp_mask=0x24280ca, order=0, oom_score_adj=0 Mar 21 11:48:47 fc23 kernel: mysqld cpuset=/ mems_allowed=0 Mar 21 11:48:47 fc23 kernel: CPU: 3 PID: 11330 Comm: mysqld Not tainted 4.4.4-301.fc23.x86_64 #1 Mar 21 11:48:49 fc23 kernel: [10505] 1000 10505 29781 111 11 3 0 0 mysqld_safe Mar 21 11:48:49 fc23 kernel: [10567] 1000 10567 2122745 1704741 3440 12 0 0 mysqld Mar 21 11:48:49 fc23 kernel: Out of memory: Kill process 10567 (mysqld) score 839 or sacrifice child Mar 21 11:48:49 fc23 kernel: Killed process 10567 (mysqld) total-vm:8490980kB, anon-rss:6818988kB, file-rss:0kB With Percona Server 5.6.28-76.1 built from source on the same hardware I've got the following execution time: ... | 54107 | 2016-03-21 11:55:08 | +-------+---------------------+ 43238 rows in set (0.24 sec) and memory usage while the same mysqlsap was running the query in 100 connections based on top was like this: 11793 openxs 20 0 5409252 4.062g 15772 S 378.4 52.3 0:40.17 mysqld ... 11793 openxs 20 0 5634456 4.509g 15772 S 384.1 58.1 2:12.86 mysqld ... 11793 openxs 20 0 5739204 4.518g 15772 S 379.4 58.2 22:50.10 mysqld ... 11793 openxs 20 0 5853164 4.671g 15772 S 380.7 60.2 53:07.90 mysqld The command completed successfully: [openxs@fc23 p5.6]$ bin/mysqlslap -uroot --concurrency=100 --iterations=100 --create-schema=test --query=/tmp/query.sql --delimiter=";" --socket=/tmp/mysql.sock Benchmark Average number of seconds to run all queries: 8.560 seconds Minimum number of seconds to run all queries: 8.344 seconds Maximum number of seconds to run all queries: 16.744 seconds Number of clients running queries: 100 Average number of queries per client: 1 The system used for testing was like this: [openxs@fc23 percona-toolkit]$ bin/pt-summary # Percona Toolkit System Summary Report ###################### Date | 2016-03-21 10:08:35 UTC (local TZ: EET +0200) Hostname | fc23 Uptime | 1:38, 3 users, load average: 61.94, 58.26, 36.55 Platform | Linux Release | Fedora release 23 (Twenty Three) Kernel | 4.4.4-301.fc23.x86_64 Architecture | CPU = 64-bit, OS = 64-bit Threading | NPTL 2.22 SELinux | Enforcing Virtualized | No virtualization detected # Processor ################################################## Processors | physical = 1, cores = 4, virtual = 4, hyperthreading = no Speeds | 4x2499.000 Models | 4xIntel(R) Core(TM)2 Quad CPU Q8300 @ 2.50GHz Caches | 4x2048 KB # Memory ##################################################### Total | 7.8G Free | 1.7G Used | physical = 5.3G, swap allocated = 0.0, swap used = 0.0, virtual = 5.3G Buffers | 795.5M Caches | 2.2G Dirty | 49416 kB UsedRSS | 5.3G Swappiness | 60 DirtyPolicy | 20, 10 DirtyStatus | 0, 0 ...

            Some observations (these are not the direct cause of the problem but related).

            One reason for high memory consuption is:

            (gdb) p sizeof(SEL_TREE)
              $32 = 600
            (gdb) p sizeof(SEL_TREE::keys)
              $36 = 512
            

            MySQL-5.6 also has this big array. I didn't check other versions.

            psergei Sergei Petrunia added a comment - Some observations (these are not the direct cause of the problem but related). One reason for high memory consuption is: (gdb) p sizeof(SEL_TREE) $32 = 600 (gdb) p sizeof(SEL_TREE::keys) $36 = 512 MySQL-5.6 also has this big array. I didn't check other versions.

            opt_range.cc also has some other ways to prevent high memory usage:

            • PARAM::alloced_sel_args (introduced by me in 2007, present in both MySQL and MariaDB)
            • NOT_IN_IGNORE_THRESHOLD(introduced in 2006, present in both MySQL and MariaDB)

            these are not directly related to this bug

            psergei Sergei Petrunia added a comment - opt_range.cc also has some other ways to prevent high memory usage: PARAM::alloced_sel_args (introduced by me in 2007, present in both MySQL and MariaDB) NOT_IN_IGNORE_THRESHOLD(introduced in 2006, present in both MySQL and MariaDB) these are not directly related to this bug

            The patch that introduces @@eq_range_index_dive_limit is here:
            https://github.com/mysql/mysql-server/commit/7a9d83117f039337a948f918c2c4c5d7eb3f5ccc

            it replaces records_in_range() calls with use of index statistics (or some adjusted value?). This will remove [one cause of] the slowdown, but won't help with high memory consumption.

            psergei Sergei Petrunia added a comment - The patch that introduces @@eq_range_index_dive_limit is here: https://github.com/mysql/mysql-server/commit/7a9d83117f039337a948f918c2c4c5d7eb3f5ccc it replaces records_in_range() calls with use of index statistics (or some adjusted value?). This will remove [one cause of] the slowdown, but won't help with high memory consumption.
            valerii Valerii Kravchuk added a comment - See http://bugs.mysql.com/bug.php?id=79450 also.

            The patch that limits memory consumption is here:
            https://github.com/mysql/mysql-server/commit/a7571080513ed2107919ebb1f5633566d3605189

            It is done on a fairly low level, by limiting memory consumption of a MEM_ROOT pool

            (the capability to limit memory use of a MEM_ROOT is also used to limit the amount of memory used by the parser:
            https://github.com/mysql/mysql-server/commit/d18c716d3e6e25c0444b332b83bed358c3e4b137

            psergei Sergei Petrunia added a comment - The patch that limits memory consumption is here: https://github.com/mysql/mysql-server/commit/a7571080513ed2107919ebb1f5633566d3605189 It is done on a fairly low level, by limiting memory consumption of a MEM_ROOT pool (the capability to limit memory use of a MEM_ROOT is also used to limit the amount of memory used by the parser: https://github.com/mysql/mysql-server/commit/d18c716d3e6e25c0444b332b83bed358c3e4b137

            A somewhat similar case: MDEV-10046

            psergei Sergei Petrunia added a comment - A somewhat similar case: MDEV-10046

            Looking at mysql-5.7, found this: if one has a table with extended key

            CREATE TABLE t1 (
              pk int(11) NOT NULL,
              a int(11) DEFAULT NULL,
              ...
              PRIMARY KEY (pk),
              KEY a (a)
            );
            

            and a query that has equalities on a secondary index as well as PK:

            explain select * from t1 
            where a in  (1,2,3,4,5,6,7,8,9,10) and 
                  pk in (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15);
            

            Then it enumerates a combinatorial explosion of ranges:

                "1 <= a <= 1 AND 1 <= pk <= 1",
                "1 <= a <= 1 AND 2 <= pk <= 2",
                "1 <= a <= 1 AND 3 <= pk <= 3",
                "1 <= a <= 1 AND 4 <= pk <= 4",
                ...
                "10 <= a <= 10 AND 13 <= pk <= 13",
                "10 <= a <= 10 AND 14 <= pk <= 14",
                "10 <= a <= 10 AND 15 <= pk <= 15"
            

            but also it calls ha_innobase::records_in_range() for each of them.

            This goes against the idea of not calling records_in_range() for "equality ranges" (full key ranges on unique indexes).

            psergei Sergei Petrunia added a comment - Looking at mysql-5.7, found this: if one has a table with extended key CREATE TABLE t1 ( pk int(11) NOT NULL, a int(11) DEFAULT NULL, ... PRIMARY KEY (pk), KEY a (a) ); and a query that has equalities on a secondary index as well as PK: explain select * from t1 where a in (1,2,3,4,5,6,7,8,9,10) and pk in (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15); Then it enumerates a combinatorial explosion of ranges: "1 <= a <= 1 AND 1 <= pk <= 1", "1 <= a <= 1 AND 2 <= pk <= 2", "1 <= a <= 1 AND 3 <= pk <= 3", "1 <= a <= 1 AND 4 <= pk <= 4", ... "10 <= a <= 10 AND 13 <= pk <= 13", "10 <= a <= 10 AND 14 <= pk <= 14", "10 <= a <= 10 AND 15 <= pk <= 15" but also it calls ha_innobase::records_in_range() for each of them. This goes against the idea of not calling records_in_range() for "equality ranges" (full key ranges on unique indexes).

            Looking at the testcase in the bug report again:

            == 10.1 ==
            explain: 2.63 sec
            query: 3.58 sec
            memory usage:

            • MariaDB's memory tracking
            • information_schema.processlist.memory_used: 67,746,520
            • checking increments in debugger one can see that memory usage is about 9.5M when we enter get_mm_tree.
            • That is, inside get_mm_tree, 58,260,384 bytes of mem are consumed.

            == mysql-5.7 ==
            explain: 1.05 sec
            query: 2.12 sec
            range optimizer is terminated early because it runs out of memory.

            == mysql-5.7, no memory limit for range optimizer ==
            set range_optimizer_max_mem_size=1024*1024*1024;
            explain: 1.21 sec
            query: 2.43 sec (still faster than 10.1)
            memory usage: MySQL 5.7's way:
            update performance_schema.setup_instruments set enabled='YES'
            where name='memory/sql/test_quick_select';
            select * from performance_schema.memory_summary_global_by_event_name
            where event_name like '%test_quick_select%';
            HIGH_NUMBER_OF_BYTES_USED: 10,241,280

            Debugging, one can see that

            • 44,107 SEL_ARG objects are produced
            • eq_range_index_dive_limit limit is hit, and records_in_range() calls are not made.

            == mysql-5.7, no eq_range_index_dive_limit ==
            set eq_range_index_dive_limit=1000*1000;
            explain: 4.0 sec
            query: 5.16 sec
            HIGH_NUMBER_OF_BYTES_USED: 10,241,280

            psergei Sergei Petrunia added a comment - Looking at the testcase in the bug report again: == 10.1 == explain: 2.63 sec query: 3.58 sec memory usage: MariaDB's memory tracking information_schema.processlist.memory_used: 67,746,520 checking increments in debugger one can see that memory usage is about 9.5M when we enter get_mm_tree. That is, inside get_mm_tree, 58,260,384 bytes of mem are consumed. == mysql-5.7 == explain: 1.05 sec query: 2.12 sec range optimizer is terminated early because it runs out of memory. == mysql-5.7, no memory limit for range optimizer == set range_optimizer_max_mem_size=1024*1024*1024; explain: 1.21 sec query: 2.43 sec (still faster than 10.1) memory usage: MySQL 5.7's way: update performance_schema.setup_instruments set enabled='YES' where name='memory/sql/test_quick_select'; select * from performance_schema.memory_summary_global_by_event_name where event_name like '%test_quick_select%'; HIGH_NUMBER_OF_BYTES_USED: 10,241,280 Debugging, one can see that 44,107 SEL_ARG objects are produced eq_range_index_dive_limit limit is hit, and records_in_range() calls are not made. == mysql-5.7, no eq_range_index_dive_limit == set eq_range_index_dive_limit=1000*1000; explain: 4.0 sec query: 5.16 sec HIGH_NUMBER_OF_BYTES_USED: 10,241,280

            Pushed
            https://github.com/MariaDB/server/commit/016790403a4bb6182094870870ce1a1c3e2756dc
            into 10.1.

            The patch doesn't put any limits on memory usage, but it makes memory usage much lower.
            Re-running the test from the previous comments:

            == 10.1-patch1 ==
            explain: 2.54 sec (was 2.63 sec)
            query: 3.48 sec (was: 3.58 sec)

            information_schema.processlist.memory_used: 26,091,264 (was: 67,746,520)
            That is, inside get_mm_tree, 16,605,264 bytes of memory are consumed (was: 58,260,384)

            psergei Sergei Petrunia added a comment - Pushed https://github.com/MariaDB/server/commit/016790403a4bb6182094870870ce1a1c3e2756dc into 10.1. The patch doesn't put any limits on memory usage, but it makes memory usage much lower. Re-running the test from the previous comments: == 10.1-patch1 == explain: 2.54 sec (was 2.63 sec) query: 3.48 sec (was: 3.58 sec) information_schema.processlist.memory_used: 26,091,264 (was: 67,746,520) That is, inside get_mm_tree, 16,605,264 bytes of memory are consumed (was: 58,260,384)

            Note that the above is still higher than the HIGH_NUMBER_OF_BYTES_USED: 10,241,280 number we've got for mysql-5.7, no eq_range_index_dive_limit above.

            psergei Sergei Petrunia added a comment - Note that the above is still higher than the HIGH_NUMBER_OF_BYTES_USED: 10,241,280 number we've got for mysql-5.7, no eq_range_index_dive_limit above.

            MariaDB uses more memory because it allocates more SEL_TREE objects.
            It processes t.key IN (c1,c2,c3 ...) in the same way as t.key=c1 OR t.key=c2 OR t.key=c3 OR ... would be processed:

            • constructs SEL_TREE for each t.key=c_i
            • calls tree_or(sel_tree1, sel_tree2)
              tree_or() always allocates a separate SEL_TREE for its result.

            This causes MariaDB to allocate 2x more SEL_TREE objects than the number of elements in (c1,c2, ... cN) list. For the example, I see 88213 SEL_TREEs allocated.

            MySQL 5.7 with range_optimizer_max_mem_size=1024*1024*1024 setting allocates only 44108 SEL_TREE objects.
            It allocates fewer objects, because tree_or(sel_tree1, sel_tree2) call reuses one of SEL_TREE parameters for the result
            ...

            Looking why MariaDB doesn't reuse... it comes from this patch:

            commit 3019d3972932931d9c4a14bec4183b232355cdcf
            Author:	Igor Babaev <igor@askmonty.org>  Mon Oct 12 08:59:34 2009
            The main patch for WL#24: 
            "index_merge: fair choice between index_merge union and range access"
            

            psergei Sergei Petrunia added a comment - MariaDB uses more memory because it allocates more SEL_TREE objects. It processes t.key IN (c1,c2,c3 ...) in the same way as t.key=c1 OR t.key=c2 OR t.key=c3 OR ... would be processed: constructs SEL_TREE for each t.key=c_i calls tree_or(sel_tree1, sel_tree2) tree_or() always allocates a separate SEL_TREE for its result. This causes MariaDB to allocate 2x more SEL_TREE objects than the number of elements in (c1,c2, ... cN) list. For the example, I see 88213 SEL_TREEs allocated. MySQL 5.7 with range_optimizer_max_mem_size=1024*1024*1024 setting allocates only 44108 SEL_TREE objects. It allocates fewer objects, because tree_or(sel_tree1, sel_tree2) call reuses one of SEL_TREE parameters for the result ... Looking why MariaDB doesn't reuse... it comes from this patch: commit 3019d3972932931d9c4a14bec4183b232355cdcf Author: Igor Babaev <igor@askmonty.org> Mon Oct 12 08:59:34 2009 The main patch for WL#24: "index_merge: fair choice between index_merge union and range access"
            psergei Sergei Petrunia added a comment - - edited

            If we made tree_or not to allocate a SEL_TREE in this case, we would gain (sizeof(SEL_TREE)=120) * 44108=5,292,960 bytes.

            get_mm_tree currently uses 16,605,264 bytes so it would one third.

            psergei Sergei Petrunia added a comment - - edited If we made tree_or not to allocate a SEL_TREE in this case, we would gain (sizeof(SEL_TREE)=120) * 44108=5,292,960 bytes. get_mm_tree currently uses 16,605,264 bytes so it would one third.

            Conclusions from yesterday scrum discussion:
            it is ok to backport @@eq_range_index_dive_limit.
            ( https://github.com/mysql/mysql-server/commit/7a9d83117f039337a948f918c2c4c5d7eb3f5ccc )

            psergei Sergei Petrunia added a comment - Conclusions from yesterday scrum discussion: it is ok to backport @@eq_range_index_dive_limit. ( https://github.com/mysql/mysql-server/commit/7a9d83117f039337a948f918c2c4c5d7eb3f5ccc )
            psergei Sergei Petrunia added a comment - - edited

            Pushed patch2, 7d3d75895d9d29d52c34dd3559cec59731d8d267.
            It reduces memory usage further:

            == 10.1-patch2 ==
            explain: 2.56 sec (no improvement)
            query: 3.50 sec (no improvement)
            information_schema.processlist.memory_used: 20,437,336 (was: 26,091,264)
            Memory used inside get_mm_tree: 10,951,200 (was: 16,605,264)

            The amount of memory used by the range optimizer is now roughly the same what 5.7 would use (see HIGH_NUMBER_OF_BYTES_USED: 10,241,280 above) when one sets the settings to make it apples-to-apples comparison).

            psergei Sergei Petrunia added a comment - - edited Pushed patch2, 7d3d75895d9d29d52c34dd3559cec59731d8d267. It reduces memory usage further: == 10.1-patch2 == explain: 2.56 sec (no improvement) query: 3.50 sec (no improvement) information_schema.processlist.memory_used: 20,437,336 (was: 26,091,264) Memory used inside get_mm_tree: 10,951,200 (was: 16,605,264) The amount of memory used by the range optimizer is now roughly the same what 5.7 would use (see HIGH_NUMBER_OF_BYTES_USED: 10,241,280 above) when one sets the settings to make it apples-to-apples comparison).

            Pushed a fix for MDEV-10175. It doesn't have a lot of effect for in-memory workloads, but may make a difference when records_in_range() calls have to hit the disk.

            psergei Sergei Petrunia added a comment - Pushed a fix for MDEV-10175 . It doesn't have a lot of effect for in-memory workloads, but may make a difference when records_in_range() calls have to hit the disk.

            Still no hard limit on memory used, so still possible to trigger OOM.

            serg Sergei Golubchik added a comment - Still no hard limit on memory used, so still possible to trigger OOM.
            psergei Sergei Petrunia added a comment - - edited

            confirms the earlier findings - recent MariaDB versions take about as much RAM as MySQL does.
            MySQL has a size cap in form of the @@range_optimizer_max_mem_size limit.

            Without the cap.. we have test_quick_select code use about 10M memory for a 300K query. That is, about 30x the size of the query. You can get the multiplier to be bigger if there are multiple potential indexes.

            psergei Sergei Petrunia added a comment - - edited confirms the earlier findings - recent MariaDB versions take about as much RAM as MySQL does. MySQL has a size cap in form of the @@range_optimizer_max_mem_size limit. Without the cap.. we have test_quick_select code use about 10M memory for a 300K query. That is, about 30x the size of the query. You can get the multiplier to be bigger if there are multiple potential indexes.

            Debugging the query explain-mdev9764-12.sql (query text takes 39K)

            explain
            SELECT * FROM t WHERE id IN (10000, 10001, 10002, ....15479, 15480)
            

            The IN list has 5480 elements.

            I can see 5481 SEL_TREE objects alloced. sizeof(SEL_TREE)=120. This is responsible for 120*5480 = 657,600 = 657K of mem usage

            Each SEL_TREE object has

            • one SEL_ARG object taking 104 bytes
            • one key value, which takes 9 bytes
              This adds up to 5481 * (9 + 104)= 619,353 bytes.

            619+658=~1.2M, which agrees with the data on the chart.

            Conclusions:
            In this bug's example

            • memory usage is O(query_size), albeit with a big constant (30x)
            • all consumed memory is used by the final SEL_ARG/SEL_TREE data structure (this means, there is currently no memory waste. There used to be but I've fixed that)
            psergei Sergei Petrunia added a comment - Debugging the query explain-mdev9764-12.sql (query text takes 39K) explain SELECT * FROM t WHERE id IN (10000, 10001, 10002, ....15479, 15480) The IN list has 5480 elements. I can see 5481 SEL_TREE objects alloced. sizeof(SEL_TREE)=120. This is responsible for 120*5480 = 657,600 = 657K of mem usage Each SEL_TREE object has one SEL_ARG object taking 104 bytes one key value, which takes 9 bytes This adds up to 5481 * (9 + 104)= 619,353 bytes. 619+658=~1.2M, which agrees with the data on the chart. Conclusions : In this bug's example memory usage is O(query_size), albeit with a big constant (30x) all consumed memory is used by the final SEL_ARG/SEL_TREE data structure (this means, there is currently no memory waste. There used to be but I've fixed that)

            Queries of this kind, with long IN () lists, are usually dynamically generated by software that may be out of DBA's control. Here hard limits (using any of the way implemented in upstream MySQL (eq_range_index_dive_limit and/or range_optimizer_max_mem_size) help DBA to keep server more safe.

            Also, what about that part of the original report related to time spent on optimization (for no real gain, when number of items in the IN list is approaching number of rows in the table)? Hard limits solve this problem as well.

            valerii Valerii Kravchuk added a comment - Queries of this kind, with long IN () lists, are usually dynamically generated by software that may be out of DBA's control. Here hard limits (using any of the way implemented in upstream MySQL (eq_range_index_dive_limit and/or range_optimizer_max_mem_size) help DBA to keep server more safe. Also, what about that part of the original report related to time spent on optimization (for no real gain, when number of items in the IN list is approaching number of rows in the table)? Hard limits solve this problem as well.

            Ok, finally it became clear that A) other linked MDEV issues are different from this one, and B) although some solutions are possible for this particular issue (will file an MDEV), all of them will be too intrusive to be put into a 10.1 release.

            So, thinking of backporting the @@range_optimizer_max_mem_size fix from MySQL.

            The patch in MySQL is:
            https://github.com/mysql/mysql-server/commit/a7571080513ed2107919ebb1f5633566d3605189

            Possible concerns when applying this on MariaDB 10.1:

            • Adding members into MEM_ROOT structure causes the ABI check to fail (I am not sure how much of an issue that actually is)

            Warning	3170	Memory capacity of 1536000 bytes for
            'range_optimizer_max_mem_size' exceeded. Range optimization was not done for this query.
            

            The wording is not entirely correct, because range optimization may be still done for other tables. Do we want the warning to be printed at all?

            • I suppose we want to preserve the current behavior, so the default should be "no limits" (unlike with MySQL 5.7 that has 8M as the default).
            psergei Sergei Petrunia added a comment - Ok, finally it became clear that A) other linked MDEV issues are different from this one, and B) although some solutions are possible for this particular issue (will file an MDEV), all of them will be too intrusive to be put into a 10.1 release. So, thinking of backporting the @@range_optimizer_max_mem_size fix from MySQL. The patch in MySQL is: https://github.com/mysql/mysql-server/commit/a7571080513ed2107919ebb1f5633566d3605189 Possible concerns when applying this on MariaDB 10.1: Adding members into MEM_ROOT structure causes the ABI check to fail (I am not sure how much of an issue that actually is) The patch adds new errors into sql/share/errmsg-utf8.txt. Can we do that in the middle of the stable 10.1 series (probably Yes, as here's an example of a patch that did that: https://github.com/MariaDB/server/commit/825f51d1aab51d363dc07ec9fe0829af33063883 ? ) With the patch, the queries will produce warnings like this: Warning 3170 Memory capacity of 1536000 bytes for 'range_optimizer_max_mem_size' exceeded. Range optimization was not done for this query. The wording is not entirely correct, because range optimization may be still done for other tables. Do we want the warning to be printed at all? I suppose we want to preserve the current behavior, so the default should be "no limits" (unlike with MySQL 5.7 that has 8M as the default).

            Added a new server option/variable --max-session-mem-used that implements a
            soft limit for the server session status variable mem_used.

            serg Sergei Golubchik added a comment - Added a new server option/variable --max-session-mem-used that implements a soft limit for the server session status variable mem_used .
            Lazzaris Simone added a comment -

            Is it possible for this bug to affect version 10.0.30?

            I'm having some random OOM issues, and I've seend queries with megabytes (literaly) of "IN"/"NOT IN" clause.

            Lazzaris Simone added a comment - Is it possible for this bug to affect version 10.0.30? I'm having some random OOM issues, and I've seend queries with megabytes (literaly) of "IN"/"NOT IN" clause.

            Yes, it affects all versions below what is listed in FixVersion/s field. So, all 5.5 and 10.0 versions.

            serg Sergei Golubchik added a comment - Yes, it affects all versions below what is listed in FixVersion/s field. So, all 5.5 and 10.0 versions.

            People

              serg Sergei Golubchik
              valerii Valerii Kravchuk
              Votes:
              2 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.