XML

Word

Printable

Details

Type: Task
Status: Closed (View Workflow)
Priority: Major
Resolution: Won't Fix
Fix Version/s: N/A
Component/s: Optimizer
Labels:
None

Description

Original setting

Investigate resolving hints later to solve chicken/egg problem where hints are needed for derived tables before all VIEWs are opened. The basic premise is that we open VIEWs (which are treated as derived tables) after lexing the parent select. This select and its query blocks are numbered. It may specify hints that have an effect within VIEWs (like [NO_]MERGE) but we won't know until after we open VIEWs. So we need to reconsider derived merge hints again after opening VIEWs. (In effect, we make a decision regarding the hint treatment during parent query lexing, and then have to revisit it upon opening a VIEW, or at the very least defer such decisions until after we open VIEWs).

From Slack:

Parsing

  open tables

    LOC1: (somewhere inside we check for OPTIMIZER_SWITCH_DERIVED_MERGE  and hint with the same meainig) (edited)

      > here we learn that some tables are views and that they have child select #N , #N+1, #N+2 ...

Name resolution

  somewhere here we check for DERIVED_MERGE again (?)

  Maybe here we could "reconsider" the decision made at LOC1

Defer resolution until

  JOIN object construction ?

  fix_fields for items ?

Second attempt

Starting points:

At the moment, hints cannot control any part of the query plan that comes from constructs inside VIEWs.
The desired workflow is:
- One gets a query using VIEWs.
- One looks at EXPLAIN output and wants to change something.
- One uses select numbers from EXPLAIN in top-level query to tweak the query optimization. Changing VIEW definitions is not required. (And in the main query, we just add a hint at the start. No need to place any QB_NAME somewhere inside the query)

A long-ish text with examples:

create table t1 (

  a int, b int,

  index i1(a), index i2(b)

);

insert into t1 select seq, seq from seq_1_to_1000;

First, make sure the hint works:

explain select * from t1 where a<3 or b<3;

+------+-------------+-------+-------------+---------------+-------+---------+------+------+--------------------------------------+

| id   | select_type | table | type        | possible_keys | key   | key_len | ref  | rows | Extra                                |

+------+-------------+-------+-------------+---------------+-------+---------+------+------+--------------------------------------+

|    1 | SIMPLE      | t1    | index_merge | i1,i2         | i1,i2 | 5,5     | NULL | 4    | Using sort_union(i1,i2); Using where |

+------+-------------+-------+-------------+---------------+-------+---------+------+------+--------------------------------------+

explain select /*+ NO_INDEX_MERGE(t1 i1,i2) */ * from t1 where a<3 or b<3;

+------+-------------+-------+------+---------------+------+---------+------+------+-------------+

| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |

+------+-------------+-------+------+---------------+------+---------+------+------+-------------+

|    1 | SIMPLE      | t1    | ALL  | i1,i2         | NULL | NULL    | NULL | 1000 | Using where |

+------+-------------+-------+------+---------------+------+---------+------+------+-------------+

Ok, it does.

Make sure it works for derived table:

explain select *

from (select * from t1 where a<3 or b<3 limit 100) T;

+------+-------------+------------+-------------+---------------+-------+---------+------+------+--------------------------------------+

| id   | select_type | table      | type        | possible_keys | key   | key_len | ref  | rows | Extra                                |

+------+-------------+------------+-------------+---------------+-------+---------+------+------+--------------------------------------+

|    1 | PRIMARY     | <derived2> | ALL         | NULL          | NULL  | NULL    | NULL | 4    |                                      |

|    2 | DERIVED     | t1         | index_merge | i1,i2         | i1,i2 | 5,5     | NULL | 4    | Using sort_union(i1,i2); Using where |

+------+-------------+------------+-------------+---------------+-------+---------+------+------+--------------------------------------+

explain select /*+ NO_INDEX_MERGE(t1@`select#2` i1,i2) */ *

from

(select * from t1 where a<3 or b<3 limit 100) T;

+------+-------------+------------+------+---------------+------+---------+------+------+-------------+

| id   | select_type | table      | type | possible_keys | key  | key_len | ref  | rows | Extra       |

+------+-------------+------------+------+---------------+------+---------+------+------+-------------+

|    1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL | NULL    | NULL | 100  |             |

|    2 | DERIVED     | t1         | ALL  | i1,i2         | NULL | NULL    | NULL | 1000 | Using where |

+------+-------------+------------+------+---------------+------+---------+------+------+-------------+

Now, let's create a VIEW like a derived table:

create view v1 as

select * from t1 where a <3 or b <3 limit 100;

Hint no longer works:

explain select /*+ NO_INDEX_MERGE(t1@`select#2` i1,i2) */ *

from v1;

...

Warning (Code 4220): Query block name `select#2` is not found for NO_INDEX_MERGE hint

So, should we just move the hint resolution to happen after the VIEWs are opened and connected to the select_lex chain?

Hints use pre-merge numbering

One effect of hint resolution being done so early is that the hints use pre-merge numbering.
Here we use t1@`select#2` despite that EXPLAIN shows the table in select#1 :

explain select /*+ NO_INDEX_MERGE(t1@`select#2` i1,i2) */ *

from

(select * from t1 where a<3 or b<3 ) T;

+------+-------------+-------+------+---------------+------+---------+------+------+-------------+

| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |

+------+-------------+-------+------+---------------+------+---------+------+------+-------------+

|    1 | SIMPLE      | t1    | ALL  | i1,i2         | NULL | NULL    | NULL | 1000 | Using where |

+------+-------------+-------+------+---------------+------+---------+------+------+-------------+

If we use post-merge numbering, hint will fail to resolve:

explain select /*+ NO_INDEX_MERGE(t1@`select#1` i1,i2) */ *

from

(select * from t1 where a<3 or b<3 ) T;

...

Warning (Code 4221): Unresolved table name `t1`@`select#1` for NO_INDEX_MERGE hint

This means currently the number is not "like in EXPLAINs".

MySQL also uses pre-merge numbering: https://dbfiddle.uk/0-gUGRPg

Merging can create table naming ambiguities

In SQL, each table in a SELECT must have a unique alias.
However, derived and semi-join merge can put tables with the same alias into the same select.
Example with semi-join:

explain

select * from t1

where b in ( select b from t1 where a<3 or b < 3);

+------+-------------+-------+-------------+---------------+-------+---------+-----------+------+-------------------------------------------------------+

| id   | select_type | table | type        | possible_keys | key   | key_len | ref       | rows | Extra                                                 |

+------+-------------+-------+-------------+---------------+-------+---------+-----------+------+-------------------------------------------------------+

|    1 | PRIMARY     | t1    | index_merge | i1,i2         | i1,i2 | 5,5     | NULL      | 4    | Using sort_union(i1,i2); Using where; Start temporary |

|    1 | PRIMARY     | t1    | ref         | i2            | i2    | 5       | test.t1.b | 1    | End temporary                                         |

+------+-------------+-------+-------------+---------------+-------+---------+-----------+------+-------------------------------------------------------+

Example with derived table:

explain

select count(*) from t1, (select b from t1 where a<3 or b < 3) T ;

+------+-------------+-------+-------------+---------------+-------+---------+------+------+-------------------------------------------------+

| id   | select_type | table | type        | possible_keys | key   | key_len | ref  | rows | Extra                                           |

+------+-------------+-------+-------------+---------------+-------+---------+------+------+-------------------------------------------------+

|    1 | SIMPLE      | t1    | index_merge | i1,i2         | i1,i2 | 5,5     | NULL | 4    | Using sort_union(i1,i2); Using where            |

|    1 | SIMPLE      | t1    | index       | NULL          | i1    | 5       | NULL | 1000 | Using index; Using join buffer (flat, BNL join) |

+------+-------------+-------+-------------+---------------+-------+---------+------+------+-------------------------------------------------+

What to do

So should hints use pre-merge or post-merge select numbers?

Use pre-merge numbers

+ This is compatible with MySQL
- For a user, it might be hard to find the pre-merge number in large queries/views.
The rule is something like: For a VIEW, one must find the select_number of the select $SL where the VIEW reference was in the upper query. Find if $SL had any UNION peers. Then, add $SL->select_number to the "local" number of table in the VIEW.

Solution: Print the pre-merge number in query_block in EXPLAIN FORMAT=JSON.

Use post-merge numbers

+ Usability: take EXPLAIN, look at select number, use that in hint.
- This would be incompatible with MySQL
- Cannot use post-merge numbers for NO_MERGE hint? ( NO_MERGE hint is checked really early now)
- Merging may create table naming conflicts.

Solutions:

use a different notation for post-merge select numbers (like "sel#N" instead of "select#n").
(Note that we shouldn't use SEL_N that TiDB uses as their numbers are pre-merge (right? TODO check))
use some notation for conflicting table names "tableX#n" ?

CTEs work ok

(for historical purposes: initial discussion was here: https://jira.mariadb.org/browse/MDEV-37260?focusedCommentId=311448&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-311448 , the complaint raised there was partially fixed).

CTEs can be referred by their derived-like select number. (Technically, CTE occurs in query text before the top-level query, but then renumbering is done which puts the top-level query first).

create table ten(a int);

insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

create table one_k(a int);

insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;

create table t1 ( a int, b int);

insert into t1 select a, a from ten;

create table t2 as select * from t1;

alter table t2 add index(a);

explain

with

  TBL1 as (select * from t2 where a < 2 limit 10)

select

  /*+ NO_INDEX(t2@`select#2` a) NO_INDEX(t2@`select#3` a) */  *

from t1, TBL1 as T_1, TBL1 as T_2;

+------+-------------+------------+------+---------------+------+---------+------+------+-------------------------------------------+

| id   | select_type | table      | type | possible_keys | key  | key_len | ref  | rows | Extra                                     |

+------+-------------+------------+------+---------------+------+---------+------+------+-------------------------------------------+

|    1 | PRIMARY     | t1         | ALL  | NULL          | NULL | NULL    | NULL | 10   |                                           |

|    1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL | NULL    | NULL | 10   | Using join buffer (flat, BNL join)        |

|    1 | PRIMARY     | <derived3> | ALL  | NULL          | NULL | NULL    | NULL | 10   | Using join buffer (incremental, BNL join) |

|    3 | DERIVED     | t2         | ALL  | NULL          | NULL | NULL    | NULL | 10   | Using where                               |

|    2 | DERIVED     | t2         | ALL  | NULL          | NULL | NULL    | NULL | 10   | Using where                               |

+------+-------------+------------+------+---------------+------+---------+------+------+-------------------------------------------+

Attachments

Issue Links

is caused by

MDEV-37260 New-style Optimizer Hints: milestone 3

Stalled

relates to

MDEV-38045 New-style optimizer hints: milestone 3.5

Closed

Activity

People

Assignee:: Oleg Smirnov

Reporter:: Dave Gosselin

Votes:: 0 Vote for this issue

Watchers:: 6 Start watching this issue

Dates

Created:: 2025-09-05 16:06

Updated:: 2026-05-18 11:36

Resolved:: 2026-05-18 11:36

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.