Details

    • Bug
    • Status: Stalled (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.2.5
    • 10.6
    • Optimizer
    • None
    • Debian 8, x86_64

    Description

      Using the following table:

      CREATE TABLE `obj` (
        `partition` smallint(5) unsigned NOT NULL,
        `oid` bigint(20) unsigned NOT NULL,
        `tid` bigint(20) unsigned NOT NULL,
        `data_id` bigint(20) unsigned DEFAULT NULL,
        `value_tid` bigint(20) unsigned DEFAULT NULL,
        PRIMARY KEY (`partition`,`tid`,`oid`),
        KEY `partition` (`partition`,`oid`,`tid`),
        KEY `data_id` (`data_id`)
      ) ENGINE=ROCKSDB
      

      The following request is far too slow:

      MariaDB [neo0]> analyze SELECT SQL_NO_CACHE `partition`, MAX(tid) FROM obj WHERE `partition`=0;
      +------+-------------+-------+------+-------------------+-----------+---------+-------+--------+-----------+----------+------------+-------------+
      | id   | select_type | table | type | possible_keys     | key       | key_len | ref   | rows   | r_rows    | filtered | r_filtered | Extra       |
      +------+-------------+-------+------+-------------------+-----------+---------+-------+--------+-----------+----------+------------+-------------+
      |    1 | SIMPLE      | obj   | ref  | PRIMARY,partition | partition | 2       | const | 650282 | 647396.00 |   100.00 |     100.00 | Using index |
      +------+-------------+-------+------+-------------------+-----------+---------+-------+--------+-----------+----------+------------+-------------+
      1 row in set (0.40 sec)
      

      It used the wrong index (maybe another issue). With correct index, it's not faster:

      MariaDB [neo0]> analyze SELECT SQL_NO_CACHE `partition`, MAX(tid) FROM obj FORCE INDEX (PRIMARY) WHERE `partition`=0;
      +------+-------------+-------+------+---------------+---------+---------+-------+--------+-----------+----------+------------+-------------+
      | id   | select_type | table | type | possible_keys | key     | key_len | ref   | rows   | r_rows    | filtered | r_filtered | Extra       |
      +------+-------------+-------+------+---------------+---------+---------+-------+--------+-----------+----------+------------+-------------+
      |    1 | SIMPLE      | obj   | ref  | PRIMARY       | PRIMARY | 2       | const | 664465 | 647396.00 |   100.00 |     100.00 | Using index |
      +------+-------------+-------+------+---------------+---------+---------+-------+--------+-----------+----------+------------+-------------+
      1 row in set (0.38 sec)
      

      I want the `partition` column in the result because I'd like to query MAX(tid) for several values of `partition` at the same time (using GROUP BY). Without this column, it works as expected:

      MariaDB [neo0]> SELECT SQL_NO_CACHE MAX(tid) FROM obj WHERE `partition`=0;
      +--------------------+
      | MAX(tid)           |
      +--------------------+
      | 270055958963668676 |
      +--------------------+
      1 row in set (0.00 sec)
      

      Attachments

        Issue Links

          Activity

            varun, ok I tried debugging your example.

            • Confirm, const_result in opt_sum_query is FALSE, because of a in the select list.
            • Other that that, optimization works: find_key_for_maxmin() succeeds in figuring out that MIN(b) can be resolved by using an index lookup.

            If I remove "a" from the select list, the optimization works:

            MariaDB [j50]> explain SELECT SQL_NO_CACHE MIN(b) FROM t3 WHERE a = 0 ;
            +------+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
            | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
            +------+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
            |    1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
            +------+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
            

            compare with

            MariaDB [j50]> explain SELECT SQL_NO_CACHE a, MIN(b) FROM t3 WHERE a = 0 ;
            +------+-------------+-------+------+---------------+---------+---------+-------+------+-------------+
            | id   | select_type | table | type | possible_keys | key     | key_len | ref   | rows | Extra       |
            +------+-------------+-------+------+---------------+---------+---------+-------+------+-------------+
            |    1 | SIMPLE      | t3    | ref  | PRIMARY       | PRIMARY | 4       | const |    4 | Using index |
            +------+-------------+-------+------+---------------+---------+---------+-------+------+-------------+
            

            psergei Sergei Petrunia added a comment - varun , ok I tried debugging your example. Confirm, const_result in opt_sum_query is FALSE, because of a in the select list. Other that that, optimization works: find_key_for_maxmin() succeeds in figuring out that MIN(b) can be resolved by using an index lookup. If I remove "a" from the select list, the optimization works: MariaDB [j50]> explain SELECT SQL_NO_CACHE MIN(b) FROM t3 WHERE a = 0 ; +------+-------------+-------+------+---------------+------+---------+------+------+------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+------------------------------+ | 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away | +------+-------------+-------+------+---------------+------+---------+------+------+------------------------------+ compare with MariaDB [j50]> explain SELECT SQL_NO_CACHE a, MIN(b) FROM t3 WHERE a = 0 ; +------+-------------+-------+------+---------------+---------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+---------+---------+-------+------+-------------+ | 1 | SIMPLE | t3 | ref | PRIMARY | PRIMARY | 4 | const | 4 | Using index | +------+-------------+-------+------+---------------+---------+---------+-------+------+-------------+

            So we have in the select list

            (gdb) p dbug_print_item(item)
              $46 = 0x555556d3fd80 <dbug_item_print_buf> "t3.a"
            (gdb) p item->const_item()
              $47 = false
            

            The WHERE clause has "a=0" but the Item_field object in the select list is not
            marked as a participant in multiple equality (let's denote this as
            SELECT-LIST-NOT-IN-ITEM-EQUAL):

            (gdb) p item
              $49 = (Item_field *) 0x7fffd1b68178
            (gdb) p item->item_equal
              $50 = (Item_equal *) 0x0
            

            A proof that multiple equality is already built:

            (gdb) up
              #1  0x0000555555a620c3 in JOIN::optimize_inner (this=0x7fffd1b694f8) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1291
            (gdb) p dbug_print_item(this->cond_equal.current_level.head())
              $65 = 0x555556d3fd80 <dbug_item_print_buf> "multiple equal(0, t3.a)"
            

            I think it's feasible to develop a patch that will replace the

                  if (!item->const_item())
                    const_result= 0;
            

            with

                  if (!item->const_item() && 
                      cannot_infer_that_item_is_constant(item))
                    const_result= 0;
            

            Here cannot_infer_that_item_is_constant() should use JOIN's multiple equalities.

            • It would be nice if it handled expressions, not just column references
            • There is probably code somewhere that finds which Item_equal a given Item_field belongs to
            • Should we fix SELECT-LIST-NOT-IN-ITEM-EQUAL while making the check? (I am not certain).

            Let's check the above questions with Igor as he's the most knowledgeable about
            multiple equalities.

            psergei Sergei Petrunia added a comment - So we have in the select list (gdb) p dbug_print_item(item) $46 = 0x555556d3fd80 <dbug_item_print_buf> "t3.a" (gdb) p item->const_item() $47 = false The WHERE clause has "a=0" but the Item_field object in the select list is not marked as a participant in multiple equality (let's denote this as SELECT-LIST-NOT-IN-ITEM-EQUAL): (gdb) p item $49 = (Item_field *) 0x7fffd1b68178 (gdb) p item->item_equal $50 = (Item_equal *) 0x0 A proof that multiple equality is already built: (gdb) up #1 0x0000555555a620c3 in JOIN::optimize_inner (this=0x7fffd1b694f8) at /home/psergey/dev-git/10.1/sql/sql_select.cc:1291 (gdb) p dbug_print_item(this->cond_equal.current_level.head()) $65 = 0x555556d3fd80 <dbug_item_print_buf> "multiple equal(0, t3.a)" I think it's feasible to develop a patch that will replace the if (!item->const_item()) const_result= 0; with if (!item->const_item() && cannot_infer_that_item_is_constant(item)) const_result= 0; Here cannot_infer_that_item_is_constant() should use JOIN's multiple equalities. It would be nice if it handled expressions, not just column references There is probably code somewhere that finds which Item_equal a given Item_field belongs to Should we fix SELECT-LIST-NOT-IN-ITEM-EQUAL while making the check? (I am not certain). Let's check the above questions with Igor as he's the most knowledgeable about multiple equalities.

            Notes from the yesterday optimizer call:

            Mixing of aggregates and non-aggregates is not allowed by the SQL standard. MySQL/MariaDB has the infamous extension that allows it. The following is legal:

            SELECT a, column2, MIN(b) FROM t3 WHERE a = 0;
            

            here, the values of a and column2 can be from any row within "the group". And "the group" here consists of all table rows that match the WHERE clause.

            An alternative solution to the one mentioned in the previous comment:

            • Allow non-constant items in the select list.
            • Compute them using column values of the row that we got the MIN(b) from.

            At the moment the index lookup to find MIN(b) is done using HA_EXTRA_KEYREAD. If we need a column that's not a part of the index, we can

            • disable HA_EXTRA_KEYREAD for one of the index lookups
            • save the rowid and make ha_rnd_pos() call to get the full row.

            CC: igor.

            psergei Sergei Petrunia added a comment - Notes from the yesterday optimizer call: Mixing of aggregates and non-aggregates is not allowed by the SQL standard. MySQL/MariaDB has the infamous extension that allows it. The following is legal: SELECT a, column2, MIN(b) FROM t3 WHERE a = 0; here, the values of a and column2 can be from any row within "the group". And "the group" here consists of all table rows that match the WHERE clause. An alternative solution to the one mentioned in the previous comment: Allow non-constant items in the select list. Compute them using column values of the row that we got the MIN(b) from. At the moment the index lookup to find MIN(b) is done using HA_EXTRA_KEYREAD. If we need a column that's not a part of the index, we can disable HA_EXTRA_KEYREAD for one of the index lookups save the rowid and make ha_rnd_pos() call to get the full row. CC: igor .

            As discussed with Igor, functional dependency would take care of this case and it would be easier to check if the item in the select list is a constant or not. So this would be fixed in 10.5.

            varun Varun Gupta (Inactive) added a comment - As discussed with Igor, functional dependency would take care of this case and it would be easier to check if the item in the select list is a constant or not. So this would be fixed in 10.5.

            is there any plans to fix this and https://jira.mariadb.org/browse/MDEV-11588 in 10.6.

            It would be critical and very important to close this after nearly 4 years

            blackout Roman Stingler (Inactive) added a comment - is there any plans to fix this and https://jira.mariadb.org/browse/MDEV-11588 in 10.6. It would be critical and very important to close this after nearly 4 years

            People

              psergei Sergei Petrunia
              jmuchemb Julien Muchembled
              Votes:
              1 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.