|
This looks similar to MDEV-21383.
But the tree I am trying this on is MariaDB 10.4.13 which includes the fix for MDEV-21383
|
|
Examining the optimizer trace, I can see that range optimizer was invoked for table tms and it has picked a range access over primary key:
"rows_for_plan": 10,
|
"cost_for_plan": 12.019,
|
"chosen": true
|
(The same for table tn)
best_access_path also was able to use it:
"plan_prefix": [],
|
"table": "tms",
|
"best_access_path": {
|
"considered_access_paths": [
|
{
|
"access_type": "range",
|
"resulting_rows": 10,
|
"cost": 12.018,
|
"chosen": true
|
}
|
but then:
"attaching_conditions_to_tables": {
|
"original_condition": "tn... ",
|
"attached_conditions_computation": [
|
{
|
"table": "tms",
|
"range_analysis": {
|
"table_scan": {
|
"rows": 92070,
|
"cost": 110486
|
},
|
|
|
EXPLAIN FORMAT=JSON for the SELECT:
EXPLAIN: {
|
"query_block": {
|
"select_id": 1,
|
"table": {
|
"table_name": "hist",
|
"access_type": "ALL",
|
"rows": 99205,
|
"filtered": 100,
|
"attached_condition": "!<in_optimizer>(hist.col1,hist.col1 in (subquery#2))"
|
},
|
"subqueries": [
|
{
|
"query_block": {
|
"select_id": 2,
|
"table": {
|
"table_name": "tms",
|
"access_type": "range",
|
"possible_keys": ["PRIMARY"],
|
"key": "PRIMARY",
|
"key_length": "92",
|
"used_key_parts": ["key1"],
|
"rows": 10,
|
"filtered": 100,
|
"attached_condition": "tms.key1 in ('1','2','3','4','5','6','7','8','9','10')",
|
"using_index": true
|
},
|
"table": {
|
"table_name": "tn",
|
"access_type": "eq_ref",
|
"possible_keys": ["PRIMARY"],
|
"key": "PRIMARY",
|
"key_length": "92",
|
"used_key_parts": ["key1"],
|
"ref": ["test.tms.key1"],
|
"rows": 1,
|
"filtered": 100
|
}
|
}
|
}
|
]
|
}
|
}
|
Restrictions in the subquery:
ON tms.key1 = tn.key1 ...
|
WHERE tn.key1 IN ('1','2','3','4','5','6','7','8','9','10')
|
This was used to infer the condition : {{tms.key1 IN (...) }}.
EXPLAIN FORMAT=JSON for the UPDATE:
| {
|
"query_block": {
|
"select_id": 1,
|
"table": {
|
"update": 1,
|
"table_name": "hist",
|
"access_type": "ALL",
|
"rows": 99205,
|
"attached_condition": "hist.key1 in ('1','2','3','4','5','6','7','8','9','10') and !<in_optimizer>(hist.col1,<exists>(subquery#2))"
|
},
|
"subqueries": [
|
{
|
"query_block": {
|
"select_id": 2,
|
"having_condition": "tn.col1 is null",
|
"table": {
|
"table_name": "tms",
|
"access_type": "index",
|
"possible_keys": ["PRIMARY"],
|
"key": "PRIMARY",
|
"key_length": "92",
|
"used_key_parts": ["key1"],
|
"rows": 116726,
|
"filtered": 0.0086,
|
"attached_condition": "1",
|
"using_index": true
|
},
|
"table": {
|
"table_name": "tn",
|
"access_type": "eq_ref",
|
"possible_keys": ["PRIMARY"],
|
"key": "PRIMARY",
|
"key_length": "92",
|
"used_key_parts": ["key1"],
|
"ref": ["test.tms.key1"],
|
"rows": 1,
|
"filtered": 100,
|
"attached_condition": "tn.key1 in ('1','2','3','4','5','6','7','8','9','10') and (<cache>(hist.col1) = tn.col1 or tn.col1 is null)"
|
}
|
}
|
}
|
]
|
}
|
Note that table tms has
"attached_condition": "1",
|
|
|
The problem is this call in JOIN::optimize_stage2:
conds= substitute_for_best_equal_field(thd, NO_PARTICULAR_TAB, conds,
|
cond_equal, map2table, true);
|
Both SELECT and UPDATE start with a valid conds.
SELECT has:
(gdb) p dbug_print_item(conds)
|
$7 = 0x5555576c4620 <dbug_item_print_buf> "tn.key1 in ('1','2','3','4','5','6','7','8','9','10') and multiple equal(tms.key1, tn.key1)"
|
|
UPDATE has:
(gdb) p dbug_print_item(conds)
|
$30 = 0x5555577037c0 <dbug_item_print_buf> "tn.key1 in ('1','2','3','4','5','6','7','8','9','10') and (<cache>(hist.col1) = tn.col1 or tn.col1 is null) and multiple equal(tms.key1, tn.key1)"
|
For the SELECT, the optimizer performs equality substitution and generates the tms.key1 in (...).
For the UPDATE, this doesn't happen.
|
|
The issue is in Item_in_subselect::inject_in_to_exists_cond. It replaces JOIN::conds with another Item_cond_and. But it doesn't copy the m_cond_equal.current_level.
Then, substitute_for_best_equal_field does not "see" the multiple equality and doesn't perform the equality substitution ...
|
|
Draft: https://gist.github.com/spetrunia/056e9f50c2b965624c4a57446673239a
|
|
Why UPDATE is affected while SELECT is not?
See MDEV-22415. The SELECT uses Materialization and avoids this issue. Single-table UPDATE doesn't use Materialization, and so is affected.
MariaDB [j5]> set optimizer_switch='materialization=off';
|
Query OK, 0 rows affected (0.000 sec)
|
|
MariaDB [j5]> explain select * from t1 hist where hist.col1 NOT IN (SELECT tn.col1 FROM t10 tn JOIN t11 tms ON tms.key1 = tn.key1 WHERE tn.key1 IN ('1','2','3','4','5','6','7','8','9','10') )\G
|
*************************** 1. row ***************************
|
id: 1
|
select_type: PRIMARY
|
table: hist
|
type: ALL
|
possible_keys: NULL
|
key: NULL
|
key_len: NULL
|
ref: NULL
|
rows: 90612
|
Extra: Using where
|
*************************** 2. row ***************************
|
id: 2
|
select_type: DEPENDENT SUBQUERY
|
table: tms
|
type: index
|
possible_keys: PRIMARY
|
key: PRIMARY
|
key_len: 92
|
ref: NULL
|
rows: 80453
|
Extra: Using where; Using index
|
*************************** 3. row ***************************
|
|
|
Review request (sent on Apr, 30th) : https://lists.launchpad.net/maria-developers/msg12215.html
|
|
Take-away from the discussion at the optimizer call:
- Indeed, Item_cond_and with collected multiple equalities has certain non-trivial invariants (MULTI-EQUALITY-INVARIANTS) that must not be violated.
- The code implements "add a conjunct into Item_cond_and while maintaining MULTI-EQUALITY-INVARIANTS" should be moved into a separate function
- It should not be just put into and_items(), because that function is widely used during phases where there is no need to maintain MULTI-EQUALITY-INVARIANTS.
- So we need a new function, tentative name a[n|d]d_items_with_multi_equal
|
|
http://lists.askmonty.org/pipermail/commits/2022-January/014869.html - next variant of the patch. igor, please review.
|
|
My analysis of the bug:
Starting with the commit
commit c8141f53140054282306d17459310fbca94cbf0e
|
Author: Alexander Barkov <bar@mariadb.org> Tue Apr 28 03:06:07 2015
|
Committer: Alexander Barkov <bar@mariadb.org> Tue Apr 28 03:06:07 2015
|
|
MDEV-7950 Item_func::type() takes 0.26% in OLTP RO
|
the function substitute_for_best_equal_field() uses the member Item_cond_and::m_cond_equal to access multiple equality items included into a conjunction. It means that this value should be maintained whenever we change the WHERE condition.
One of such changes is done in the function Item_in_subselect::inject_in_to_exists_cond. Here new conjuncts are injected into the WHERE condition of a subquery. These conjuncts have to be included just before the multiple equalities. This is done in such a way. First multiple equalities are disjoint from the AND list (they are always at the very end of the list with Item_cond_and::cond_equal::current_level pointing to the first multiple equality). Then the remaining conjunct are ANDed with the injected condition producing a new Item_cond_and item. After this the disjoined multiple equalities have to be appended the AND list of this item.
As we create a new Item_cond_and item to where multiple equalities are included we have to set Item_cond_and::m_cond_equal to point to these multiple equalities. This was not done and m_cond_equal remained 0 for the new WHERE condition. The code of the function Item_in_subselect::inject_in_to_exists_cond() had been written long before m_cond_equal was introduced. So there is no wonder that m_cond_equal was not set here.
|
|
So we have to set m_cond_equal for the new WHERE condition. It can be done with the following change:
#if 0
|
List_iterator<Item_equal> li(join_arg->cond_equal->current_level);
|
Item_equal *elem;
|
while ((elem= li++))
|
{
|
and_args->push_back(elem, thd->mem_root);
|
}
|
#else
|
((Item_cond_and *) (join_arg->conds))->m_cond_equal=
|
*join_arg->cond_equal;
|
and_args->append((List<Item> *)&join_arg->cond_equal->current_level);
|
Note that we prefer to append the whole sublist of multiple equalities to the AND list rather than to add multiple equalities one by one. It allows us not to reset the value of join_arg->cond_equal->current_level.
|
|
This bug might affect any query with a subquery that is subject to in-to-exists transformation either because this is an only option to evaluate the predicand containing this subquery (as in the case of UPDATE) or because the setting of the optimizer switch prohibits to use any other ways to do it.
This bug must be fixed in 10.2.
|
|
igor, made it into a commit: bb-10.2-mdev22377, http://lists.askmonty.org/pipermail/commits/2022-February/014877.html
|
|
igor, here's a patch that uses SELECT in the testcase: http://lists.askmonty.org/pipermail/commits/2022-February/014878.html . Also I've pushed it to bb-10.2-mdev22377
|
|
ok to push into 10.2
|