The attached commit addresses column naming issues with CTEs in the use of prepared statements and procedures.
Usage of either prepared statements or procedures with Common Table Expressions and column renaming may be affected.
There are three related but different issues addressed here.
1) First execution issue. Consider the following
with cte (col1, col2) as (select a as c1, b as c2 from t order by c1) select col1, col2 from cte;
|
During parsing, the Item representing the first column in the derived table (cte) created, then has it's name set to c1.
When elements in the order by column are resolved during context analysis, this Item is found from it's name. Later in context analysis st_select_lex_unit::prepare(), this Item is renamed col1, for use in a more outer statement (in the above example select col1, col2 from cte).
When run as a prepared statement, context analysis is run again before statement execution, so name resolution on elements in order by will no longer find an Item with the name of c1 and will produce an error
ERROR 1054 (42S22): Unknown column 'c1' in 'order clause'
Columns of the first select in the unit representing any potential union determine the names of the resultant set, so only Items in this select are renamed to names specified in the with clause.
It would be reasonable to only restore these names to those after parsing, but doing so will lead to another problem.
2) Second execution issue. Consider this example
prepare st from "with cte (c1,c2) as
|
(select a as col1, sum(b) as col2 from t1 where a > 0 group by col1
|
union select a as col3, sum(b) as col4 from t2 where b > 2 group by col3)
|
select * from cte where c1=1";
|
and what happens to Items in the second select in the union. Parsing sets the name of the first Item to col3, which is then used in resolving the group by clause.
During first execution, the name is left as col3, which is again used to resolve the group by clause.
The optimizer will then potentially push the clause "c1=1" into BOTH selects in the union and in doing so will rename all Items in all selects in the union in that position to c1 to make the clause valid.
During the next execution, the group by clause will no longer be resolvable.
ERROR 1054 (42S22): Unknown column 'col3' in 'group statement'
Examining the JSON output from an explain command on the above query produces
"query_specifications": [
|
{
|
"query_block": {
|
"select_id": 2,
|
"table": {
|
"table_name": "t1",
|
"access_type": "ALL",
|
"rows": 5,
|
"filtered": 100,
|
"attached_condition": "t1.a = 1"
|
}
|
}
|
},
|
{
|
"query_block": {
|
"select_id": 3,
|
"operation": "UNION",
|
"table": {
|
"table_name": "t2",
|
"access_type": "ALL",
|
"rows": 5,
|
"filtered": 100,
|
"attached_condition": "t2.a = 1 and t2.b > 2"
|
but with set @@optimizer_switch="condition_pushdown_for_derived=off";
"query_specifications": [
|
{
|
"query_block": {
|
"select_id": 2,
|
"filesort": {
|
"sort_key": "t1.a",
|
"temporary_table": {
|
"table": {
|
"table_name": "t1",
|
"access_type": "ALL",
|
"rows": 5,
|
"filtered": 100,
|
"attached_condition": "t1.a > 0"
|
}
|
}
|
}
|
}
|
},
|
{
|
"query_block": {
|
"select_id": 3,
|
"operation": "UNION",
|
"filesort": {
|
"sort_key": "t2.a",
|
"temporary_table": {
|
"table": {
|
"table_name": "t2",
|
"access_type": "ALL",
|
"rows": 5,
|
"filtered": 100,
|
"attached_condition": "t2.b > 2"
|
we can see this condition (t1.a=1) pushdown in action.
3) Memory Leak. During parsing, Item names are allocated in statement memory. During context analysis before each execution, Item renaming allocated more memory on statement memory. For statements that have a high number of executions, this would be a noticable memory leak.
A recently introduced compile time flag, activated with the cmake option -DWITH_PROTECT_STATEMENT_MEMROOT:BOOL=ON triggers a failing assert on second execution
prepare stmt from "
|
with cte(c) as (select ? ) select r.c, s.c+10 from cte as r, cte as s;
|
";
|
set @a=2;
|
execute stmt using @a;
|
c s.c+10
|
2 12
|
set @a=5;
|
execute stmt using @a;
|
#6 0x0000557e017dcb5e in alloc_root (mem_root=0x7fceec20a538, length=4) at /home/rex/src/mariadb/server.10.4-MDEV-24961/mysys/my_alloc.c:225
|
#7 0x0000557e00a22163 in Query_arena::alloc (this=0x7fceec000db8, size=4) at /home/rex/src/mariadb/server.10.4-MDEV-24961/sql/sql_class.h:1121
|
#8 0x0000557e00ee4652 in make_name (thd=0x7fceec000da0, str=0x7fceec1f80c8 "c", length=1, cs=0x557e024f5300 <my_charset_utf8_general_ci>, max_octet_length=255)
|
at /home/rex/src/mariadb/server.10.4-MDEV-24961/sql/item.cc:1153
|
#9 0x0000557e00ee4989 in Item::set_name (this=0x7fceec1f85d0, thd=0x7fceec000da0, str=0x7fceec1f80c8 "c", length=1, cs=0x557e024f5300 <my_charset_utf8_general_ci>)
|
at /home/rex/src/mariadb/server.10.4-MDEV-24961/sql/item.cc:1217
|
#10 0x0000557e00dd132f in With_element::rename_columns_of_derived_unit (this=0x7fceec1f8fe0, thd=0x7fceec000da0, unit=0x7fceec1f87b0)
|
The attached commit addresses column naming issues with CTEs in the use of prepared statements and procedures.
Usage of either prepared statements or procedures with Common Table Expressions and column renaming may be affected.
There are three related but different issues addressed here.
1) First execution issue. Consider the following
During parsing, the Item representing the first column in the derived table (cte) created, then has it's name set to c1.
When elements in the order by column are resolved during context analysis, this Item is found from it's name. Later in context analysis st_select_lex_unit::prepare(), this Item is renamed col1, for use in a more outer statement (in the above example select col1, col2 from cte).
When run as a prepared statement, context analysis is run again before statement execution, so name resolution on elements in order by will no longer find an Item with the name of c1 and will produce an error
ERROR 1054 (42S22): Unknown column 'c1' in 'order clause'
Columns of the first select in the unit representing any potential union determine the names of the resultant set, so only Items in this select are renamed to names specified in the with clause.
It would be reasonable to only restore these names to those after parsing, but doing so will lead to another problem.
2) Second execution issue. Consider this example
(select a as col1, sum(b) as col2 from t1 where a > 0 group by col1
union select a as col3, sum(b) as col4 from t2 where b > 2 group by col3)
and what happens to Items in the second select in the union. Parsing sets the name of the first Item to col3, which is then used in resolving the group by clause.
During first execution, the name is left as col3, which is again used to resolve the group by clause.
The optimizer will then potentially push the clause "c1=1" into BOTH selects in the union and in doing so will rename all Items in all selects in the union in that position to c1 to make the clause valid.
During the next execution, the group by clause will no longer be resolvable.
ERROR 1054 (42S22): Unknown column 'col3' in 'group statement'
Examining the JSON output from an explain command on the above query produces
"query_specifications": [
{
"query_block": {
"select_id": 2,
"table": {
"table_name": "t1",
"access_type": "ALL",
"rows": 5,
"filtered": 100,
"attached_condition": "t1.a = 1"
}
}
},
{
"query_block": {
"select_id": 3,
"operation": "UNION",
"table": {
"table_name": "t2",
"access_type": "ALL",
"rows": 5,
"filtered": 100,
"attached_condition": "t2.a = 1 and t2.b > 2"
but with set @@optimizer_switch="condition_pushdown_for_derived=off";
"query_specifications": [
{
"query_block": {
"select_id": 2,
"filesort": {
"sort_key": "t1.a",
"temporary_table": {
"table": {
"table_name": "t1",
"access_type": "ALL",
"rows": 5,
"filtered": 100,
"attached_condition": "t1.a > 0"
}
}
}
}
},
{
"query_block": {
"select_id": 3,
"operation": "UNION",
"filesort": {
"sort_key": "t2.a",
"temporary_table": {
"table": {
"table_name": "t2",
"access_type": "ALL",
"rows": 5,
"filtered": 100,
"attached_condition": "t2.b > 2"
we can see this condition (t1.a=1) pushdown in action.
3) Memory Leak. During parsing, Item names are allocated in statement memory. During context analysis before each execution, Item renaming allocated more memory on statement memory. For statements that have a high number of executions, this would be a noticable memory leak.
A recently introduced compile time flag, activated with the cmake option -DWITH_PROTECT_STATEMENT_MEMROOT:BOOL=ON triggers a failing assert on second execution
with cte(c) as (select ? ) select r.c, s.c+10 from cte as r, cte as s;
c s.c+10
2 12
#6 0x0000557e017dcb5e in alloc_root (mem_root=0x7fceec20a538, length=4) at /home/rex/src/mariadb/server.10.4-MDEV-24961/mysys/my_alloc.c:225
#7 0x0000557e00a22163 in Query_arena::alloc (this=0x7fceec000db8, size=4) at /home/rex/src/mariadb/server.10.4-MDEV-24961/sql/sql_class.h:1121
#8 0x0000557e00ee4652 in make_name (thd=0x7fceec000da0, str=0x7fceec1f80c8 "c", length=1, cs=0x557e024f5300 <my_charset_utf8_general_ci>, max_octet_length=255)
at /home/rex/src/mariadb/server.10.4-MDEV-24961/sql/item.cc:1153
#9 0x0000557e00ee4989 in Item::set_name (this=0x7fceec1f85d0, thd=0x7fceec000da0, str=0x7fceec1f80c8 "c", length=1, cs=0x557e024f5300 <my_charset_utf8_general_ci>)
at /home/rex/src/mariadb/server.10.4-MDEV-24961/sql/item.cc:1217
#10 0x0000557e00dd132f in With_element::rename_columns_of_derived_unit (this=0x7fceec1f8fe0, thd=0x7fceec000da0, unit=0x7fceec1f87b0)