[MDEV-537] Make multi-column non-top level subqueries to be executed via index (index/unique subquery) instead of single_select_engine - Jira

Timour Katchaounov (Inactive) created issue - 2012-09-17 16:49

Timour Katchaounov (Inactive) made changes - 2012-10-01 11:30

Description

TODO: the task should be split into two - one for 5.5, one for 10.0

While working on a bug fix in the subquery code, I noticed that some legacy code remains that can be removed to improve/simplify the code:

1. Multi-column non-top level subqueries that could be executed via the unique_subquery/index_subquery methods are executed via the general single_select engine.

If the same queries are transformed into single-column INs, then unique_subquery/index_subquery is chosen.

The problem is that the method Item_in_subselect::create_row_in_to_exists_cond()
adds Item_is_not_null_test and Item_func_trig_cond without looking at the left IN operand. By contrast, the analogous method for single columns does look at it, and
does not add these conditions if the left argument cannot be NULL.

The proposed patch is:

@@ -2290,7 +2303,7 @@ Item_in_subselect::create_row_in_to_exis
                                          ref_pointer_array+i,
                                          (char *)"<no matter>",
                                          (char *)"<list ref>"));
-      if (!abort_on_null)
+      if (!abort_on_null && select_lex->ref_pointer_array[i]->maybe_null)
       {
         Item *having_col_item=
           new Item_is_not_null_test(this,
@@ -2309,10 +2322,6 @@ Item_in_subselect::create_row_in_to_exis
                                            (char *)"<no matter>",
                                            (char *)"<list ref>"));
         item= new Item_cond_or(item, item_isnull);
-        /*
-          TODO: why we create the above for cases where the right part
-          cant be NULL?
-        */
         if (left_expr->element_index(i)->maybe_null)
         {
           if (!(item= new Item_func_trig_cond(item, get_cond_guard(i))))
@@ -2323,6 +2332,11 @@ Item_in_subselect::create_row_in_to_exis
         }
         *having_item= and_items(*having_item, having_col_item);
       }
+      if (!abort_on_null && left_expr->element_index(i)->maybe_null)
+      {
+        if (!(item= new Item_func_trig_cond(item, get_cond_guard(i))))
+          DBUG_RETURN(true);
+      }
       *where_item= and_items(*where_item, item);
     }
   }

2. enum store_key_result can be transformed into boolean

When analyzing subselect_uniquesubquery_engine::copy_ref_key I encountered
enum store_key_result { STORE_KEY_OK, STORE_KEY_FATAL, STORE_KEY_CONV }
It turns out that the last value STORE_KEY_CONV is not used anywhere, and
the above enum is not needed, since the two remaining values can be represented by a bool.

3. Unneeded extra call to engine->exec() in Item_subselect::exec

In MariaDB 5.3 I introduced early subquery optimization. As a result, the logic
in Item_subselect::exec that calls engine->exec a second time if the engine was
changed is not needed. The reason is that in the past, optimization and the engine
change were done lazily during the first call to engine->exec(). Since 5.3 this is
no longer true: the engine is changed before execution, so even the first execution is
done with the right engine.

We cannot simply remove this logic completely, because there are still a few
border cases where optimization is done lazily. Instead, Item_subselect::exec
should check whether the engine was chosen inside that call (and then re-execute)
or before execution (and then not re-execute).

Due Date

2012-09-28

2012-10-12

Sergei Golubchik made changes - 2012-10-12 15:15

Description

*TODO:* the task should be split into two - one for 5.5, one for 10.0

While working on a bug fix in the subquery code, I noticed that some legacy code remains that can be removed to improve/simplify the code:

1. Multi-column non-top level subqueries that could be executed via the unique_subquery/index_subquery methods are executed via the general single_select engine.

If the same queries are transformed into single-column INs, then unique_subquery/index_subquery is chosen.

The problem is that the method {{Item_in_subselect::create_row_in_to_exists_cond()}}
adds {{Item_is_not_null_test}} and {{Item_func_trig_cond}} without looking at the left IN operand. By contrast, the analogous method for single columns does look at it, and
does not add these conditions if the left argument cannot be NULL.

The proposed patch is:

{noformat}
@@ -2290,7 +2303,7 @@ Item_in_subselect::create_row_in_to_exis
                                          ref_pointer_array+i,
                                          (char *)"<no matter>",
                                          (char *)"<list ref>"));
-      if (!abort_on_null)
+      if (!abort_on_null && select_lex->ref_pointer_array[i]->maybe_null)
       {
         Item *having_col_item=
           new Item_is_not_null_test(this,
@@ -2309,10 +2322,6 @@ Item_in_subselect::create_row_in_to_exis
                                            (char *)"<no matter>",
                                            (char *)"<list ref>"));
         item= new Item_cond_or(item, item_isnull);
-        /*
-          TODO: why we create the above for cases where the right part
-          cant be NULL?
-        */
         if (left_expr->element_index(i)->maybe_null)
         {
           if (!(item= new Item_func_trig_cond(item, get_cond_guard(i))))
@@ -2323,6 +2332,11 @@ Item_in_subselect::create_row_in_to_exis
         }
         *having_item= and_items(*having_item, having_col_item);
       }
+      if (!abort_on_null && left_expr->element_index(i)->maybe_null)
+      {
+        if (!(item= new Item_func_trig_cond(item, get_cond_guard(i))))
+          DBUG_RETURN(true);
+      }
       *where_item= and_items(*where_item, item);
     }
   }
{noformat}

2. {{enum store_key_result}} can be transformed into boolean

When analyzing {{subselect_uniquesubquery_engine::copy_ref_key}} I encountered
{noformat}
enum store_key_result { STORE_KEY_OK, STORE_KEY_FATAL, STORE_KEY_CONV }
{noformat}
It turns out that the last value {{STORE_KEY_CONV}} is not used anywhere, and
the above {{enum}} is not needed, since the two remaining values can be represented by a {{bool}}.
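
A minimal sketch of what this simplification could look like, taking the observation that {{STORE_KEY_CONV}} is unused at face value. Only the enum comes from the text above; {{copy_key_part_legacy}}, {{copy_key_part}}, {{same_outcome}} and the {{conversion_ok}} parameter are hypothetical names for illustration, not MariaDB functions. The {{bool}} variant assumes the true-means-failure convention visible in the patch ({{DBUG_RETURN(true)}}).

```cpp
#include <cassert>

// Enum as quoted in the ticket.
enum store_key_result { STORE_KEY_OK, STORE_KEY_FATAL, STORE_KEY_CONV };

// Hypothetical "before" shape: only two of the three values are returned.
inline store_key_result copy_key_part_legacy(bool conversion_ok)
{
  return conversion_ok ? STORE_KEY_OK : STORE_KEY_FATAL;
}

// Hypothetical "after" shape: true means a fatal error, false means success.
inline bool copy_key_part(bool conversion_ok)
{
  return !conversion_ok;
}

// Checks that the bool form carries the same information as the enum form.
inline bool same_outcome(bool conversion_ok)
{
  const store_key_result legacy = copy_key_part_legacy(conversion_ok);
  assert(legacy != STORE_KEY_CONV);  // assumption from the ticket: never produced
  return (legacy == STORE_KEY_FATAL) == copy_key_part(conversion_ok);
}
```

Callers that previously compared the result against {{STORE_KEY_OK}} would test the returned {{bool}} directly.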

3. Unneeded extra call to {{engine->exec()}} in {{Item_subselect::exec}}

In MariaDB 5.3 I introduced early subquery optimization. As a result, the logic
in {{Item_subselect::exec}} that calls {{engine->exec}} a second time if the engine was
changed is not needed. The reason is that in the past, optimization and the engine
change were done lazily during the first call to {{engine->exec()}}. Since 5.3 this is
no longer true: the engine is changed before execution, so even the first execution is
done with the right engine.

We cannot simply remove this logic completely, because there are still a few
border cases where optimization is done lazily. Instead, {{Item_subselect::exec}}
should check whether the engine was chosen inside that call (and then re-execute)
or before execution (and then not re-execute).
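
The check described in the last paragraph could look roughly like the following sketch. All types here ({{Engine}}, {{Subquery}}, {{IndexEngine}}, {{LazyEngine}}) are hypothetical stand-ins, not MariaDB classes. The sketch only assumes, as the text above says, that an engine may be replaced during its first {{exec()}} call, and it uses the true-means-error return convention from the patch.

```cpp
#include <cassert>

struct Subquery;  // forward declaration

// Hypothetical execution engine; exec() returns true on error.
struct Engine {
  int runs = 0;  // how many times this engine was executed
  virtual ~Engine() = default;
  virtual bool exec(Subquery &owner) = 0;
};

// Hypothetical subquery item holding a non-owning pointer to its engine.
struct Subquery {
  Engine *engine;
  explicit Subquery(Engine *e) : engine(e) {}

  bool exec()
  {
    assert(engine != nullptr);
    Engine *before = engine;
    bool error = engine->exec(*this);
    // Re-execute only if the engine was replaced inside this call,
    // i.e. the optimization happened lazily.
    if (!error && engine != before)
      error = engine->exec(*this);
    return error;
  }
};

// Engine chosen ahead of execution: it just runs.
struct IndexEngine : Engine {
  bool exec(Subquery &) override { ++runs; return false; }
};

// Engine that optimizes lazily and swaps itself out on first execution.
struct LazyEngine : Engine {
  Engine *replacement;
  explicit LazyEngine(Engine *r) : replacement(r) {}
  bool exec(Subquery &owner) override
  {
    ++runs;
    owner.engine = replacement;
    return false;
  }
};
```

With an engine chosen before execution, the pointer comparison is false and the subquery runs once. With a lazily optimizing engine, the first call swaps the engine and the subquery is re-executed with the new one.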

Sergei Golubchik made changes - 2012-10-12 17:56

Fix Version/s		5.5.29 [ 11701 ]
Fix Version/s	5.5.28 [ 11200 ]

Timour Katchaounov (Inactive) made changes - 2012-10-18 16:49

Summary

Minor non-semijoin subquery improvements

Make multi-column non-top level subqueries to be executed via index (index/unique subquery) instead of single_select_engine

Timour Katchaounov (Inactive) made changes - 2012-10-18 16:53

Description

Multi-column non-top level subqueries can be executed via the unique_subquery/index_subquery methods instead of the general single_select engine.

If the same queries are transformed into single-column INs, then unique_subquery/index_subquery is chosen. However, in some cases the IN-EXISTS transformation for multi-column subqueries adds unnecessary null-rejecting conditions that prevent the use of the index-based subquery access methods. The problem is that the method {{Item_in_subselect::create_row_in_to_exists_cond()}} adds {{Item_is_not_null_test}} and {{Item_func_trig_cond}} without looking at the left IN operand. By contrast, the analogous method for single columns does look at it, and does not add these conditions if the left argument cannot be NULL.

The proposed patch is:

{noformat}
@@ -2290,7 +2303,7 @@ Item_in_subselect::create_row_in_to_exis
                                          ref_pointer_array+i,
                                          (char *)"<no matter>",
                                          (char *)"<list ref>"));
-      if (!abort_on_null)
+      if (!abort_on_null && select_lex->ref_pointer_array[i]->maybe_null)
       {
         Item *having_col_item=
           new Item_is_not_null_test(this,
@@ -2309,10 +2322,6 @@ Item_in_subselect::create_row_in_to_exis
                                            (char *)"<no matter>",
                                            (char *)"<list ref>"));
         item= new Item_cond_or(item, item_isnull);
-        /*
-          TODO: why we create the above for cases where the right part
-          cant be NULL?
-        */
         if (left_expr->element_index(i)->maybe_null)
         {
           if (!(item= new Item_func_trig_cond(item, get_cond_guard(i))))
@@ -2323,6 +2332,11 @@ Item_in_subselect::create_row_in_to_exis
         }
         *having_item= and_items(*having_item, having_col_item);
       }
+      if (!abort_on_null && left_expr->element_index(i)->maybe_null)
+      {
+        if (!(item= new Item_func_trig_cond(item, get_cond_guard(i))))
+          DBUG_RETURN(true);
+      }
       *where_item= and_items(*where_item, item);
     }
   }
{noformat}

The change is proposed for 10.0 because it will change all affected query plans to use new access methods.
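
To make the effect of the two changed predicates easier to see, here is a simplified textual model of the per-column WHERE condition. It is a sketch, not MariaDB code: {{ColumnPair}}, {{where_cond}} and {{is_plain_equality}} are hypothetical names, the condition is built as a string rather than as {{Item}} objects, the HAVING-side null test is left out, and the inner trigger condition that the patch leaves inside the first block is not modelled.

```cpp
#include <cassert>
#include <string>

// Hypothetical description of one column pair in
// (l1, ..., ln) IN (SELECT r1, ..., rn ...).
struct ColumnPair {
  std::string left;       // element of the left IN operand
  std::string right;      // matching element of the subquery select list
  bool left_maybe_null;   // can the left element be NULL?
  bool right_maybe_null;  // can the right element be NULL?
};

// Textual model of the WHERE condition built for one column pair,
// using the two predicates from the patched code.
inline std::string where_cond(const ColumnPair &c, bool abort_on_null)
{
  assert(!c.left.empty() && !c.right.empty());
  std::string item = c.left + " = " + c.right;
  // Patched first check: add the NULL alternative only for a nullable right side.
  if (!abort_on_null && c.right_maybe_null)
    item = "(" + item + " OR " + c.right + " IS NULL)";
  // Added block: guard the condition only for a nullable left side.
  if (!abort_on_null && c.left_maybe_null)
    item = "trigcond(" + item + ")";
  return item;
}

// True when nothing was wrapped around the equality.
inline bool is_plain_equality(const ColumnPair &c, bool abort_on_null)
{
  return where_cond(c, abort_on_null) == c.left + " = " + c.right;
}
```

In this model, a pair whose two sides are both declared NOT NULL keeps a bare equality, which is the form that, per the description above, lets the index-based access methods be chosen.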

Timour Katchaounov (Inactive) made changes - 2012-10-18 16:55

Fix Version/s		10.0.1 [ 11400 ]
Fix Version/s	5.5.29 [ 11701 ]
Due Date	2012-10-12	2012-10-26

Timour Katchaounov (Inactive) made changes - 2012-10-26 16:02

Due Date

2012-10-26

2012-11-02

Timour Katchaounov (Inactive) made changes - 2012-11-13 14:35

Due Date

2012-11-02

2012-11-16

Sergei Golubchik made changes - 2012-12-05 13:46

Fix Version/s		10.0.2 [ 11900 ]
Fix Version/s	10.0.1 [ 11400 ]

Timour Katchaounov (Inactive) made changes - 2012-12-18 16:22

Due Date

2012-11-16

2012-12-21

Timour Katchaounov (Inactive) made changes - 2012-12-18 16:22

Status

Open [ 1 ]

In Progress [ 3 ]

Timour Katchaounov (Inactive) made changes - 2012-12-21 11:38

Status

In Progress [ 3 ]

Open [ 1 ]

Sergei Golubchik made changes - 2013-01-21 15:23

Due Date

2012-12-21

Timour Katchaounov (Inactive) added a comment - 2013-02-06 11:08

The implementation has been tested by Elena, and is waiting for 10.0.1 to be released in order to be pushed to 10.0.2.

Timour Katchaounov (Inactive) added a comment - 2013-02-07 15:35

Merged and tested with the latest 10.0; pushed to 10.0.2.

Timour Katchaounov (Inactive) made changes - 2013-02-07 15:35

Resolution		Fixed [ 1 ]
Status	Open [ 1 ]	Closed [ 6 ]

Sergei Golubchik made changes - 2014-06-13 15:07

Workflow

defaullt [ 14301 ]

MariaDB v2 [ 46381 ]

Rasmus Johansson (Inactive) made changes - 2015-05-18 17:51

Workflow

MariaDB v2 [ 46381 ]

MariaDB v3 [ 64382 ]

Sergei Golubchik made changes - 2021-12-06 21:22

Workflow

MariaDB v3 [ 64382 ]

MariaDB v4 [ 131980 ]
