Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13783

Index optimiser regression in 10.1.26 and possibly earlier

    XMLWordPrintable

Details

    Description

      Since upgrading some of our databases we have started to notice a significant performance drop due to simple indexes that were previously used, now being ignored by the optimiser. As an example we have nearly identical systems running 10.1.10 and 10.1.26. 10.1.10 is performant, and uses the index correctly, while 10.1.26 is not, and requires a FORCE index to get it to run efficiently.

      Below are details of the two systems with some (superfluous) information redacted/changed. This has been most noticeable on tables we have that have many 100's of millions of rows (up to 700 million), but may have affected smaller ones as well, just that it hasn't been as obvious an issue. Note that this issue may have affected releases between 10.1.10 and 10.1.26, we can only confirm it as being at least in 10.1.26.

      The only 'smoking gun' I can point to possibly at this stage is that the TABLE_ROWS and AVG_ROW_LENGTH data aren't updated on the system with the newer server version, which could point to a data issue after upgrading, however I understand that these stats aren't always correct. Prior to upgrade, this database was performant, but I don't have information as to whether these values were reporting correctly or not. FWIW, we have seen this data not up to dat

      Please let me know if you need any more information. Note that this was one example of this, there was another database with completely different table and data, but with 100s of millions of rows that became similarly inefficient after upgrade, but we don't have the relative detail available for that one like for this example.

      Latest 10.1.26 release with regression : (db_a)

      # mysqld -V
      mysqld  Ver 10.1.26-MariaDB-1~xenial for debian-linux-gnu on x86_64 (mariadb.org binary distribution)
      

      System that is OK with older version of MariaDB : (db_b)

      # mysqld -V
      mysqld  Ver 10.1.10-MariaDB-1~trusty for debian-linux-gnu on x86_64 (mariadb.org binary distribution)
      

      Below command is running a DESCRIBE on simple query of form "describe select * from big_table where char_id in ('abc','def','gef'...... approx 20,000 entries.....)".

      # mysql -uroot db_a < /tmp/x.sql
      id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
      1       SIMPLE  big_table   ALL     char_id     NULL    NULL    NULL    82623   Using where; Using filesort
       
      # mysql -uroot db_b < /tmp/x.sql
      id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
      1       SIMPLE  big_table   range   char_id     char_id     62      NULL    38663   Using where; Using filesort
      

      Note that for only a handful of items in the IN (eg. 5) it uses the index correctly in both cases. It is however orders of magnitude more efficient to force the index with the larger IN clause. We didn't let the one not using the index run to completion as it was taking far too long, and consuming significant db resources, but it was verified multiple times before forcing the index usage which worked around the problem. The one that uses the index properly returns in seconds.

      Table schema, identical on both except for AUTO_INCREMENT, some field names changed :

      CREATE TABLE `big_table` (
        `id` bigint(20) NOT NULL,
        `id1` tinyint(4) NOT NULL,
        `char1` varchar(20) DEFAULT NULL,
        `char_id` varchar(15) NOT NULL,
        `char2` varchar(30) NOT NULL DEFAULT '',
        `char3` varchar(160) NOT NULL DEFAULT '',
        `id2` tinyint(4) NOT NULL DEFAULT '0',
        `id3` tinyint(4) NOT NULL DEFAULT '0',
        `id4` int(11) NOT NULL DEFAULT '-1',
        `id5` int(11) NOT NULL DEFAULT '-1',
        `created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
        `updated_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
        `id6` mediumint(9) DEFAULT NULL,
        `sequence_id` int(11) NOT NULL AUTO_INCREMENT,
        `flag` tinyint(3) unsigned NOT NULL DEFAULT '0',
        `id6` mediumint(9) NOT NULL DEFAULT '-1',
        `id7` mediumint(9) NOT NULL DEFAULT '-1',
        `id8` mediumint(9) NOT NULL DEFAULT '-1',
        `id9` smallint(5) unsigned NOT NULL DEFAULT '0',
        PRIMARY KEY (`id`),
        UNIQUE KEY `sequence_id` (`sequence_id`),
        KEY `char_id` (`char_id`)
      ) ENGINE=TokuDB AUTO_INCREMENT=966463846 DEFAULT CHARSET=utf8mb4 `COMPRESSION`=TOKUDB_FAST;
      

      Schema information from both databases for this table. Note that the stats on the disfunctional db here aren't updated :

      MariaDB [db_a]> SELECT * FROM information_schema.tables WHERE table_schema = DATABASE() and table_name='big_table'\G
      *************************** 1. row ***************************
        TABLE_CATALOG: def
         TABLE_SCHEMA: db_a
           TABLE_NAME: big_table
           TABLE_TYPE: BASE TABLE
               ENGINE: TokuDB
              VERSION: 10
           ROW_FORMAT: Dynamic
           TABLE_ROWS: 0
       AVG_ROW_LENGTH: 0
          DATA_LENGTH: 78503998066
      MAX_DATA_LENGTH: 9223372036854775807
         INDEX_LENGTH: 22612881055
            DATA_FREE: 18446744054885965551
       AUTO_INCREMENT: 938484216
          CREATE_TIME: 2017-06-18 00:53:40
          UPDATE_TIME: 2017-09-11 21:51:12
           CHECK_TIME: NULL
      TABLE_COLLATION: utf8mb4_general_ci
             CHECKSUM: NULL
       CREATE_OPTIONS: `COMPRESSION`=TOKUDB_FAST
        TABLE_COMMENT:
      1 row in set (0.00 sec)
      

      MariaDB [db_b]> SELECT * FROM information_schema.tables WHERE table_schema = DATABASE() and table_name='big_table'\G
      *************************** 1. row ***************************
        TABLE_CATALOG: def
         TABLE_SCHEMA: db_b
           TABLE_NAME: big_table
           TABLE_TYPE: BASE TABLE
               ENGINE: TokuDB
              VERSION: 10
           ROW_FORMAT: Dynamic
           TABLE_ROWS: 713853307
       AVG_ROW_LENGTH: 113
          DATA_LENGTH: 80947464082
      MAX_DATA_LENGTH: 9223372036854775807
         INDEX_LENGTH: 24225657099
            DATA_FREE: 12250841088
       AUTO_INCREMENT: 966464388
          CREATE_TIME: 2017-05-27 05:51:23
          UPDATE_TIME: 2017-09-11 21:53:46
           CHECK_TIME: NULL
      TABLE_COLLATION: utf8mb4_general_ci
             CHECKSUM: NULL
       CREATE_OPTIONS: `COMPRESSION`=TOKUDB_FAST
        TABLE_COMMENT:
      1 row in set (0.01 sec)
      

      Explicit table row information for both :

      MariaDB [db_a]> select count(*) from big_table;
      +-----------+
      | count(*)  |
      +-----------+
      | 672204287 |
      +-----------+
       
      MariaDB [db_b]> select count(*) from big_table;
      +-----------+
      | count(*)  |
      +-----------+
      | 700081095 |
      +-----------+
      1 row in set (3 min 42.56 sec)
      

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              johnbarratt John Barratt
              Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.