Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-8880

Very suboptimal join order is generated for a simple query even if MariaDB query planner knows the other is better in every sense

    XMLWordPrintable

    Details

    • Sprint:
      10.1.11

      Description

      This is the summary of the bug reported here: https://phabricator.wikimedia.org/T113901

      This is the exact CREATE syntax of revision and page:

      CREATE TABLE `revision` (
          `rev_id` int(8) unsigned NOT NULL AUTO_INCREMENT,
          `rev_page` int(8) unsigned NOT NULL DEFAULT '0',
          `rev_text_id` int(8) unsigned NOT NULL DEFAULT '0',
          `rev_comment` varbinary(255) DEFAULT NULL,
          `rev_user` int(5) unsigned NOT NULL DEFAULT '0',
          `rev_user_text` varbinary(255) NOT NULL DEFAULT '',
          `rev_timestamp` varbinary(14) NOT NULL DEFAULT '',
          `rev_minor_edit` tinyint(1) unsigned NOT NULL DEFAULT '0',
          `rev_deleted` tinyint(1) unsigned NOT NULL DEFAULT '0',
          `rev_len` int(8) unsigned DEFAULT NULL,
          `rev_parent_id` int(8) unsigned DEFAULT NULL,
          `rev_sha1` varbinary(32) NOT NULL DEFAULT '',
          `rev_content_model` varbinary(32) DEFAULT NULL,
          `rev_content_format` varbinary(64) DEFAULT NULL,
          PRIMARY KEY (`rev_id`),
          KEY `rev_page_id` (`rev_page`,`rev_id`),
          KEY `rev_timestamp` (`rev_timestamp`,`rev_id`),
          KEY `page_timestamp` (`rev_page`,`rev_timestamp`,`rev_id`),
          KEY `user_timestamp` (`rev_user`,`rev_timestamp`,`rev_id`),
          KEY `usertext_timestamp` (`rev_user_text`,`rev_timestamp`,`rev_id`)
        ) ENGINE=InnoDB AUTO_INCREMENT=683122218 DEFAULT CHARSET=binary

      CREATE TABLE `page` (
        `page_id` int(8) unsigned NOT NULL AUTO_INCREMENT,
        `page_namespace` int(11) NOT NULL DEFAULT '0',
        `page_title` varbinary(255) NOT NULL DEFAULT '',
        `page_restrictions` tinyblob NOT NULL,
        `page_counter` bigint(20) unsigned NOT NULL DEFAULT '0',
        `page_is_redirect` tinyint(1) unsigned NOT NULL DEFAULT '0',
        `page_is_new` tinyint(1) unsigned NOT NULL DEFAULT '0',
        `page_random` double unsigned NOT NULL DEFAULT '0',
        `page_touched` varbinary(14) NOT NULL DEFAULT '',
        `page_links_updated` varbinary(14) DEFAULT NULL,
        `page_latest` int(8) unsigned NOT NULL DEFAULT '0',
        `page_len` int(8) unsigned NOT NULL DEFAULT '0',
        `page_content_model` varbinary(32) DEFAULT NULL,
        PRIMARY KEY (`page_id`),
        UNIQUE KEY `name_title` (`page_namespace`,`page_title`),
        KEY `page_random` (`page_random`),
        KEY `page_len` (`page_len`),
        KEY `page_redirect_namespace_len` (`page_is_redirect`,`page_namespace`,`page_len`)
      ) ENGINE=InnoDB AUTO_INCREMENT=47992700 DEFAULT CHARSET=binary

      Regular explain:

      Tables have approximately the numbers or rows you see on the autoinc.

      mysql> EXPLAIN SELECT rev_id, rev_timestamp, page_id, page_title, page_namespace FROM `revision` INNER JOIN `page` ON ((rev_page = page_id))  ORDER BY rev_timestamp  DESC,rev_id  DESC LIMIT 11\G
        *************************** 1. row ***************************
                   id: 1
          select_type: SIMPLE
                table: page
                 type: index
        possible_keys: PRIMARY
                  key: name_title
              key_len: 261
                  ref: NULL
                 rows: 33836157
                Extra: Using index; Using temporary; Using filesort
        *************************** 2. row ***************************
                   id: 1
          select_type: SIMPLE
                table: revision
                 type: ref
        possible_keys: rev_page_id,page_timestamp
                  key: page_timestamp
              key_len: 4
                  ref: enwiki.page.page_id
                 rows: 8
                Extra: Using index

      Forcing the other plan:

      mysql> EXPLAIN SELECT STRAIGHT_JOIN rev_id, rev_timestamp, page_id, page_title, page_namespace FROM `revision` INNER JOIN `page` ON ((rev_page = page_id))  ORDER BY rev_timestamp  DESC,rev_id  DESC LIMIT 11\G
        *************************** 1. row ***************************
                   id: 1
          select_type: SIMPLE
                table: revision
                 type: index
        possible_keys: rev_page_id,page_timestamp
                  key: rev_timestamp
              key_len: 20
                  ref: NULL
                 rows: 11
                Extra:
        *************************** 2. row ***************************
                   id: 1
          select_type: SIMPLE
                table: page
                 type: eq_ref
        possible_keys: PRIMARY
                  key: PRIMARY
              key_len: 4
                  ref: enwiki.revision.rev_page
                 rows: 1
                Extra:

      The actual query parsed has no issue:

      select `enwiki`.`revision`.`rev_id` AS `rev_id`,`enwiki`.`revision`.`rev_timestamp` AS `rev_timestamp`,`enwiki`.`page`.`page_id`
        AS `page_id`,`enwiki`.`page`.`page_title` AS `page_title`,`enwiki`.`page`.`page_namespace` AS `page_namespace`
        from `enwiki`.`revision` join `enwiki`.`page` where (`enwiki`.`revision`.`rev_page` = `enwiki`.`page`.`page_id`)
        order by `enwiki`.`revision`.`rev_timestamp` desc,`enwiki`.`revision`.`rev_id` desc limit 11

      Even ignoring the indexes does not work (it is not related to covering index):

      mysql> EXPLAIN EXTENDED SELECT rev_id, rev_timestamp, page_id, page_title, page_namespace FROM `revision` INNER JOIN `page` IGNORE INDEX(name_title, PRIMARY) ON ((rev_page = page_id))  ORDER BY rev_timestamp  DESC,rev_id  DESC LIMIT 11\G
        *************************** 1. row ***************************
                   id: 1
          select_type: SIMPLE
                table: page
                 type: ALL
        possible_keys: NULL
                  key: NULL
              key_len: NULL
                  ref: NULL
                 rows: 33836157
             filtered: 100.00
                Extra: Using temporary; Using filesort
        *************************** 2. row ***************************
                   id: 1
          select_type: SIMPLE
                table: revision
                 type: ref
        possible_keys: rev_page_id,page_timestamp
                  key: page_timestamp
              key_len: 4
                  ref: enwiki.page.page_id
                 rows: 8
             filtered: 100.00
                Extra: Using index
        2 rows in set, 1 warning (0.00 sec)

      I think this is a bug on the optimizer.

      Running ANALYZE on both tables, even with use-stat-tables=PREFERABLY does not help (and it shouldn't, row counts are ok, it is that for some reason maria refuses to use the clearly better plan -that itself understands).

      This could be a specific configuration we have, you can see most of it at:
      https://git.wikimedia.org/blob/operations%2Fpuppet.git/1423255184712f7fde17dee338c7ac1519ee44ee/templates%2Fmariadb%2Fproduction.my.cnf.erb

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              psergey Sergei Petrunia
              Reporter:
              jcrespo Jaime Crespo
              Votes:
              4 Vote for this issue
              Watchers:
              7 Start watching this issue

                Dates

                Created:
                Updated: