[MDEV-33281] Implement hints like in MySQL 8 ? Created: 2024-01-19  Updated: 2024-02-07

Status: Open
Project: MariaDB Server
Component/s: Optimizer
Fix Version/s: None

Type: New Feature Priority: Major
Reporter: Sergei Petrunia Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: compat80

Issue Links:
Duplicate
duplicates MDEV-9078 optimizer hints MySQL 8.0 style Closed
PartOf
is part of MDEV-28906 MySQL 8.0 desired compatibility Open

 Description   

This is to study if/how we should implement query hints compatible with MySQL 8.
(there's a request for specifically that - support compatible syntax for hints)

1. h2. Hints in MySQL
https://dev.mysql.com/doc/refman/8.0/en/optimizer-hints.html

1.1 Syntax

The syntax is

  /*+ hint(s) */

The comment must be at the right place in the query:

SELECT /*+ hint(s) */ ... ;
UPDATE /*+ hint(s) */ ... ; 
... (SELECT /*+ hint(s) */ ) ...;

hints in other parts of query are silently ignored.

Hints have a separate parser in sql_hints.yy, invoked by the main lexer through consume_optimizer_hints().

1.2 Hint effect

There are hints attached to

  • The whole statement
  • [previously named] query_block
  • table
  • table and its index, or set of its indexes.

Errors in hint parsing/processing are emitted as warnings.

Hint processing is "best effort", that is, if the optimizer is not able to follow the directions of some hint it will silently ignore it.

1.3 Supported hints

Hints controlling table access methods

Hint Scope
MRR, NO_MRR Table, index
NO_ICP Table, index
NO_RANGE_OPTIMIZATION Table, index
INDEX_MERGE, NO_INDEX_MERGE Table, index

Hints similar to existing hints:

Hint Scope  
ORDER_INDEX, NO_ORDER_INDEX Index  
GROUP_INDEX, NO_GROUP_INDEX Index  
JOIN_INDEX, NO_JOIN_INDEX Index  
INDEX, NO_INDEX Index Acts as the combination of JOIN_INDEX, GROUP_INDEX, and ORDER_INDEX

Hints controlling access method in the context of join optimization

BKA, NO_BKA Query block, table
BNL, NO_BNL Query block, table

Hints controlling join optimization

Hint Scope
JOIN_FIXED_ORDER Query block
JOIN_ORDER Query block
JOIN_PREFIX Query block
JOIN_SUFFIX Query block

Hints controlling other optimizations:

Hint Scope
DERIVED_CONDITION_PUSHDOWN, NO_DERIVED_CONDITION_PUSHDOWN Query block, table
MERGE, NO_MERGE Table
SEMIJOIN, NO_SEMIJOIN Query block
SUBQUERY Query block

Hints serving the system itself

Hint Scope
QB_NAME Query block

Not really optimizer hints

Hint Scope
RESOURCE_GROUP Global
MAX_EXECUTION_TIME Global
SET_VAR Global

Not applicable

Hint Scope
Only 8.0.20 HASH_JOIN, NO_HASH_JOIN Query block, table
Not supported in MariaDB SKIP_SCAN, NO_SKIP_SCAN Table, index

1.3.1 Extra hints needed in MariaDB

Off the top of the head, we would need:

Hint Scope
ROWID_FILTER, NO_ROWID_FILTER Table, index
SPLIT_MATERIALIZED, NO_SPLIT_MATERIALIZED Table, index

1.4 Data structures

Hints have two kinds of data structures:

  • PT_hint_XXX
  • Opt_hint_XXX

PT_hint_XXX are created by the parser. After the parsing, contextualize() operation is performed (this is fix_fields() right?). During that, some (not all) PT_hint* structures create Opt_hint_XXX data structures and attach them to the appropriate Table_ref (TABLE_LIST) or Query_block (SELECT_LEX).

1.5 Other details

1.5.1 Printout in EXPLAIN

Hints are printed back in EXPLAIN + SHOW WARNINGS output. Hints that weren't processed do not seem to be printed.
Example:

explain select /*+ JOIN_ORDER(t20, t21) JOIN_ORDER(t21,t20) */  * from t20, t21 where t20.a=t21.a;

gives

/* select#1 */ select /*+ JOIN_ORDER(@`select#1` `t20`,`t21`) */ `test`.`t20`.`a` AS `a`...

1.5.2 Implicit Query Block names.

Printout a few lines above shows @`select#1` for an unnamed query block.

One can use this syntax in a query, see (USE-IMPLICIT-QB-NAME-EXAMPLE) below.
However, I was only able to get it to work by:

  • starting the client with --comments
  • using sql_mode='ansi_quotes' and so using double quotes.
    (TODO: verify this)

1.5.3 VIEWs

One can create a VIEW with hints but they seem to be ignored when using the VIEW:
https://gist.github.com/spetrunia/5973166d0686e3787d0f2d37f7faba1d

Also one cannot control query blocks inside a VIEW (mysql client started with --comments so it doesn't truncate at '#' ): (note: use of ansi_quotes is actually not necessary here )

mysql>  set sql_mode='ansi_quotes';
mysql>  explain  select /*+ JOIN_ORDER(@"select#2" twenty,ten) */ * from v1;
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                      |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   20 |   100.00 | NULL                                       |
|  2 | DERIVED     | ten        | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   10 |   100.00 | NULL                                       |
|  2 | DERIVED     | twenty     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   20 |    10.00 | Using where; Using join buffer (hash join) |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
3 rows in set, 2 warnings (0.01 sec)

mysql> show warnings;
+---------+------+-----------------------------------------------------------------+
| Level   | Code | Message                                                         |
+---------+------+-----------------------------------------------------------------+
| Warning | 3127 | Query block name "select#2" is not found for JOIN_ORDER hint    |
| Note    | 1003 | /* select#1 */ select "test2"."v1"."a" AS "a" from "test2"."v1" |
+---------+------+-----------------------------------------------------------------+
2 rows in set (0.00 sec)

For comparison, one can control a derived table this way (USE-IMPLICIT-QB-NAME-EXAMPLE):

mysql>  explain  select /*+ JOIN_ORDER(@"select#2" twenty,ten) */ * from (select "ten"."a" AS "a" from ("ten" join "twenty") where ("ten"."a" = "twenty"."a") limit 1000 ) T;
    -> ;
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                      |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   20 |   100.00 | NULL                                       |
|  2 | DERIVED     | twenty     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   20 |   100.00 | NULL                                       |
|  2 | DERIVED     | ten        | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   10 |    10.00 | Using where; Using join buffer (hash join) |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
3 rows in set, 1 warning (0.00 sec)

1.6 Hint processing

Join order hints are "compiled" into updates to tables' dependency maps.
The dependency map is updated right before the optimizer starts considering the join orders:

  #0  set_join_hint_deps (join=0x7f6b04155590, hint_table_list=0x7f6b041196a0, type=JOIN_ORDER_HINT_ENUM) at /home/psergey/dev-git2/mysql-8.0-dbg2/sql/opt_hints.cc:492
  #1  0x00005624ab0bc7bd in Opt_hints_qb::apply_join_order_hints (this=0x7f6b041546d0, join=0x7f6b04155590) at /home/psergey/dev-git2/mysql-8.0-dbg2/sql/opt_hints.cc:522
  #2  0x00005624aa9ef752 in JOIN::make_join_plan (this=0x7f6b04155590) at /home/psergey/dev-git2/mysql-8.0-dbg2/sql/sql_optimizer.cc:5349
  #3  0x00005624aa9e1f73 in JOIN::optimize (this=0x7f6b04155590, finalize_access_paths=true) at /home/psergey/dev-git2/mysql-8.0-dbg2/sql/sql_optimizer.cc:694
  #4  0x00005624aaaad439 in Query_block::optimize (this=0x7f6b04119048, thd=0x7f6b04001000, finalize_access_paths=true) at /home/psergey/dev-git2/mysql-8.0-dbg2/sql/sql_select.cc:2013
  #5  0x00005624aab61a0e in Query_expression::optimize (this=0x7f6b04118f60, thd=0x7f6b04001000, materialize_destination=0x0, create_iterators=true, finalize_access_paths=true) at /home/psergey/dev-

1.7 Source code

Hints code is here:

   964 sql/opt_hints.cc
   797 sql/opt_hints.h
   585 sql/parse_tree_hints.cc
   351 sql/parse_tree_hints.h
   718 sql/sql_hints.yy
   236 sql/sql_lex_hints.cc
   480 sql/sql_lex_hints.h
  4131 total

This doesn't include the checks that the optimizer makes to follow the hints.

1.8 Hooking hints parser into the main SQL parser

sql/sql_yacc.yy has this:

%token  INSERT_SYM 496                    /* SQL-2003-R */
%token  SELECT_SYM 748                    /* SQL-2003-R */
 
%type <lexer.optimizer_hints> SELECT_SYM INSERT_SYM REPLACE_SYM UPDATE_SYM DELETE_SYM

That is, the parsed hints are the "payload" of SELECT/INSERT/etc tokens.

2 MariaDB's implementation notes

2.1 Error handling

Monty: we need to introduce @@hint_mode with these options:

  • Default: errors for hint conflicts, warnings for missing tables, keys, etc.
  • "warnings strict mode": Everything is detected and produces warnings.
  • (errors for any kind of error conditions)

If the hint is impossible to follow, produce warnings or errors according to above.
(Us giving a warnings where MySQL doesn't isn't considered breach of compatibility)

Hint processing

A question: MySQL's processing of QB_NAME requires hint parsing to use two passes?
Monty: Can "apply" QB_NAME at parsing stage and then we need one pass.



 Comments   
Comment by Sergei Petrunia [ 2024-01-22 ]

Note that named query blocks allow hints to have "non-local" effects: This works:

select /*+ NO_INDEX(t10@sel1 idx2) */ * from (select /*+ QB_NAME(sel1) */ * from t10 where a< 10 ) T ;

explain 
select * from (select /*+ QB_NAME(sel1) */ * from t10 where a< 10 ) T  
union 
select /*+ NO_INDEX(t10@sel1 idx1,idx2) */ * from t10 ;

Generated at Thu Feb 08 10:37:45 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.