commit 985e788ed645e8282f586150d45d0cdd1172849d
|
Author: Knut Anders Hatlen <knut.hatlen@oracle.com>
|
Date: Wed Dec 2 17:34:58 2015 +0100
|
|
Bug#22089623: ASSERTION IN ITEM_FUNC_TRIG_COND::VAL_INT() WITH SUBQUERY
|
ON LEFT SIDE OF IN
|
|
Problem:
|
|
If the left expression of an IN expression is a row subquery that does
|
not access any tables, an assertion could be raised in debug builds,
|
and wrong results could be returned in release builds.
|
|
The assert failure was introduced by the fix for bug#20729351, but the
|
wrong results were seen even before that fix.
|
|
Analysis:
|
|
The root cause for these problems is an inconsistency in how
|
nullability is represented. For Item_row, the maybe_null flag tells if
|
any of the columns in the row could be NULL. For scalar subqueries and
|
row subqueries represented by Item_singlerow_subselect, maybe_null has
|
this meaning:
|
|
- If the subquery is scalar, maybe_null == true means that the
|
returned column could be NULL, or that the subquery could return an
|
empty result (which would make the scalar subquery evaluate to
|
NULL). This is consistent with how maybe_null is used in Item_row.
|
|
- If the subquery returns multiple columns (a row subquery) and it is
|
not a UNION, maybe_null == true means that the subquery could return
|
empty results. And, conversely, maybe_null == false means that it
|
could not return empty results. It does not say anything about the
|
nullability of the columns. This is not consistent with how
|
maybe_null is used in Item_row and in scalar subqueries.
|
|
- If the subquery is a row query that is a UNION, maybe_null == true
|
means that it accesses tables, whereas maybe_null == false means
|
that it does not access any tables.
|
|
Take these two equivalent statements:
|
|
SELECT (NULL, NULL) IN ...
|
|
SELECT (SELECT NULL, NULL) IN ...
|
|
In the former, the Item_row that represents the left operand of the IN
|
expression, has maybe_null == true. In the latter, the
|
Item_singlerow_subselect that represents the left operand, has
|
maybe_null == false.
|
|
Or take these two equivalent statements:
|
|
SELECT (SELECT 1, 2 WHERE FALSE) IN ...
|
|
SELECT (SELECT 1, 2 WHERE FALSE UNION SELECT 1, 2 WHERE FALSE) IN ...
|
|
In the former of these, the Item_singlerow_subselect has maybe_null ==
|
true. In the latter, maybe_null is false.
|
|
These inconsistencies confuse the code that transforms IN expressions
|
to EXISTS expressions in the following way:
|
|
Item_in_subselect::row_value_transformer() sets up a data structure
|
(pushed_cond_guards) that might be needed in the synthetic predicates
|
that are created later. It only does this if left_expr->maybe_null is
|
true. However, the data structure is needed if any of the values in
|
the left expression could be NULL, and this is not what
|
left_expr->maybe_null tells us if left_expr is an
|
Item_singlerow_subselect, only if it is an Item_row.
|
|
Because of this, it might skip setting up pushed_cond_guards even
|
though it will be needed later. When it is attempted used later, and
|
it is found that it is NULL, it results in an assertion failure or a
|
wrong result.
|
|
Before bug#20729351, all the columns returned by the row subquery
|
(represented by Item_cache objects) had maybe_null set to false, even
|
if the value could be NULL. This was clearly wrong, and it caused
|
wrong results in some cases. The assert wasn't seen, though.
|
|
After bug#20729351, all the columns returned by the row subquery have
|
maybe_null pessimistically set to true. This made the queries take a
|
slightly different code path, which still produced wrong results, but
|
now also raise an assert failure.
|
|
Solution:
|
|
This patch fixes these problems by making the maybe_null flag in
|
Item_singlerow_subselect have a value that is more consistent. Now it
|
is false if and only if we know for sure that it is never going to
|
return empty results, and none of the returned columns are nullable.
|
|
It also makes the maybe_null flag of the individual columns in the row
|
subquery more accurate. Instead of being always false/true as they
|
were before/after 20729351, the maybe_null flag now tells the actual
|
nullability of the column (if the column could have the value NULL, or
|
if the subquery could return an empty result, maybe_null is true).
|
|
Additionally, testing of this fix exposed another, earlier wrong
|
result regression, caused by "WL#6369: EXPLAIN for other thread." A
|
query such as
|
|
SELECT (SELECT 1, 2 UNION SELECT 1, 2) IN (SELECT 1, 2)
|
|
started returning 0 instead of 1 after WL#6369, and it started raising
|
an assert failure after bug#20729351.
|
|
The problem is very similar to bug#22090717, in that the optimization
|
of the IN expression happens earlier now than before. It tries to read
|
the values returned from the UNION subquery before the subquery has
|
been evaluated. The patch enhances the fix for bug#22090717 with an
|
extra check in Item_in_subselect::mark_as_outer() to prevent reading
|
of data from a subquery before the subquery has been evaluated, except
|
in the case where the subquery returns a basic constant (a literal).
|
Could this fix from MySQL be applicable?
commit 985e788ed645e8282f586150d45d0cdd1172849d
Author: Knut Anders Hatlen <knut.hatlen@oracle.com>
Date: Wed Dec 2 17:34:58 2015 +0100
Bug#22089623: ASSERTION IN ITEM_FUNC_TRIG_COND::VAL_INT() WITH SUBQUERY
ON LEFT SIDE OF IN
Problem:
If the left expression of an IN expression is a row subquery that does
not access any tables, an assertion could be raised in debug builds,
and wrong results could be returned in release builds.
The assert failure was introduced by the fix for bug#20729351, but the
wrong results were seen even before that fix.
Analysis:
The root cause for these problems is an inconsistency in how
nullability is represented. For Item_row, the maybe_null flag tells if
any of the columns in the row could be NULL. For scalar subqueries and
row subqueries represented by Item_singlerow_subselect, maybe_null has
this meaning:
- If the subquery is scalar, maybe_null == true means that the
returned column could be NULL, or that the subquery could return an
empty result (which would make the scalar subquery evaluate to
NULL). This is consistent with how maybe_null is used in Item_row.
- If the subquery returns multiple columns (a row subquery) and it is
not a UNION, maybe_null == true means that the subquery could return
empty results. And, conversely, maybe_null == false means that it
could not return empty results. It does not say anything about the
nullability of the columns. This is not consistent with how
maybe_null is used in Item_row and in scalar subqueries.
- If the subquery is a row query that is a UNION, maybe_null == true
means that it accesses tables, whereas maybe_null == false means
that it does not access any tables.
Take these two equivalent statements:
SELECT (NULL, NULL) IN ...
SELECT (SELECT NULL, NULL) IN ...
In the former, the Item_row that represents the left operand of the IN
expression, has maybe_null == true. In the latter, the
Item_singlerow_subselect that represents the left operand, has
maybe_null == false.
Or take these two equivalent statements:
SELECT (SELECT 1, 2 WHERE FALSE) IN ...
SELECT (SELECT 1, 2 WHERE FALSE UNION SELECT 1, 2 WHERE FALSE) IN ...
In the former of these, the Item_singlerow_subselect has maybe_null ==
true. In the latter, maybe_null is false.
These inconsistencies confuse the code that transforms IN expressions
to EXISTS expressions in the following way:
Item_in_subselect::row_value_transformer() sets up a data structure
(pushed_cond_guards) that might be needed in the synthetic predicates
that are created later. It only does this if left_expr->maybe_null is
true. However, the data structure is needed if any of the values in
the left expression could be NULL, and this is not what
left_expr->maybe_null tells us if left_expr is an
Item_singlerow_subselect, only if it is an Item_row.
Because of this, it might skip setting up pushed_cond_guards even
though it will be needed later. When it is attempted used later, and
it is found that it is NULL, it results in an assertion failure or a
wrong result.
Before bug#20729351, all the columns returned by the row subquery
(represented by Item_cache objects) had maybe_null set to false, even
if the value could be NULL. This was clearly wrong, and it caused
wrong results in some cases. The assert wasn't seen, though.
After bug#20729351, all the columns returned by the row subquery have
maybe_null pessimistically set to true. This made the queries take a
slightly different code path, which still produced wrong results, but
now also raise an assert failure.
Solution:
This patch fixes these problems by making the maybe_null flag in
Item_singlerow_subselect have a value that is more consistent. Now it
is false if and only if we know for sure that it is never going to
return empty results, and none of the returned columns are nullable.
It also makes the maybe_null flag of the individual columns in the row
subquery more accurate. Instead of being always false/true as they
were before/after 20729351, the maybe_null flag now tells the actual
nullability of the column (if the column could have the value NULL, or
if the subquery could return an empty result, maybe_null is true).
Additionally, testing of this fix exposed another, earlier wrong
result regression, caused by "WL#6369: EXPLAIN for other thread." A
query such as
SELECT (SELECT 1, 2 UNION SELECT 1, 2) IN (SELECT 1, 2)
started returning 0 instead of 1 after WL#6369, and it started raising
an assert failure after bug#20729351.
The problem is very similar to bug#22090717, in that the optimization
of the IN expression happens earlier now than before. It tries to read
the values returned from the UNION subquery before the subquery has
been evaluated. The patch enhances the fix for bug#22090717 with an
extra check in Item_in_subselect::mark_as_outer() to prevent reading
of data from a subquery before the subquery has been evaluated, except
in the case where the subquery returns a basic constant (a literal).