[MDEV-31305] Crash caused by query with aggregation over materialized derived Created: 2023-05-19  Updated: 2024-02-05  Resolved: 2024-02-03

Status: Closed
Project: MariaDB Server
Component/s: Optimizer - CTE
Affects Version/s: 10.5.15, 10.4.28, 10.6.12
Fix Version/s: 10.11.3, 10.4.29, 10.5.20, 10.6.13

Type: Bug Priority: Critical
Reporter: Valerii Kravchuk Assignee: Igor Babaev
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-29210 Assertion `param->field_count > (uint... Confirmed
relates to MDEV-30706 Different results of selects from vie... Closed

 Description   

Consider the following EXPLAIN for the query referencing to CTE d columns as d.*:

WITH demo AS (
  SELECT seq1.seq AS dim1, seq2.seq AS dim2, seq3.seq AS dim3,
         FLOOR(RAND() * 5) AS p
    FROM seq_100_to_105 seq1
         JOIN seq_10_to_15 seq2
         JOIN seq_1_to_5 seq3
)
SELECT d.*, SUM(p)
  FROM demo d

This query works as expected (depending on sql_mode) on 10.6.13 or current code, for example:

Yuliyas-Air:maria10.6 Valerii$ bin/mysql test
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 32
Server version: 10.6.14-MariaDB MariaDB Server
 
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [test]> explain extended
    -> WITH demo AS (
    ->   SELECT seq1.seq AS dim1, seq2.seq AS dim2, seq3.seq AS dim3,
    ->          FLOOR(RAND() * 5) AS p
    ->     FROM seq_100_to_105 seq1
    ->          JOIN seq_10_to_15 seq2
    ->          JOIN seq_1_to_5 seq3
    -> )
    -> SELECT d.*, SUM(p)
    ->   FROM demo d;
+------+-------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------------------------------------+
| id   | select_type | table      | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                                                  |
+------+-------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------------------------------------+
|    1 | PRIMARY     |  | ALL   | NULL          | NULL    | NULL    | NULL | 180  |   100.00 |                                                        |
|    2 | DERIVED     | seq3       | index | NULL          | PRIMARY | 8       | NULL | 5    |   100.00 | Using index                                            |
|    2 | DERIVED     | seq1       | index | NULL          | PRIMARY | 8       | NULL | 6    |   100.00 | Using index; Using join buffer (flat, BNL join)        |
|    2 | DERIVED     | seq2       | index | NULL          | PRIMARY | 8       | NULL | 6    |   100.00 | Using index; Using join buffer (incremental, BNL join) |
+------+-------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------------------------------------+
4 rows in set, 1 warning (0.061 sec)
 
MariaDB [test]> set session sql_mode = "ONLY_FULL_GROUP_BY";
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [test]> explain extended WITH demo AS (   SELECT seq1.seq AS dim1, seq2.seq AS dim2, seq3.seq AS dim3,          FLOOR(RAND() * 5) AS p     FROM seq_100_to_105 seq1          JOIN seq_10_to_15 seq2          JOIN seq_1_to_5 seq3 ) SELECT d.*, SUM(p)   FROM demo d;
ERROR 1140 (42000): Mixing of GROUP columns (MIN(),MAX(),COUNT(),...) with no GROUP columns is illegal if there is no GROUP BY clause

but on 10.6.12 and other affected versions we get a crash:

Yuliyas-Air:maria10.6.12 Valerii$ bin/mysql test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.6.12-MariaDB MariaDB Server
 
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [test]> set sql_mode = 'ONLY_FULL_GROUP_BY';
Query OK, 0 rows affected (0.005 sec)
 
MariaDB [test]> WITH demo AS (
    ->   SELECT seq1.seq AS dim1, seq2.seq AS dim2, seq3.seq AS dim3,
    ->          FLOOR(RAND() * 5) AS p
    ->     FROM seq_100_to_105 seq1
    ->          JOIN seq_10_to_15 seq2
    ->          JOIN seq_1_to_5 seq3
    -> )
    -> SELECT d.*, SUM(p)
    ->   FROM demo d;
ERROR 2013 (HY000): Lost connection to server during query

with the following stack trace (macOS here):

...
2023-05-18 21:21:55 0 [Note] /Users/Valerii/dbs/maria10.6.12/bin/mariadbd: ready for connections.
Version: '10.6.12-MariaDB'  socket: '/tmp/mysql.sock'  port: 3306  MariaDB Server
230519  7:10:13 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.6.12-MariaDB source revision: 4c79e15cc3716f69c044d4287ad2160da8101cdc
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=1
max_threads=153
thread_count=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467971 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7fc0f58dd418
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x700004f28e70 thread_stack 0x49000
Printing to addr2line failed
0   mariadbd                            0x000000011045093c my_print_stacktrace + 60
0   mariadbd                            0x000000010fbf5844 handle_fatal_signal + 740
0   libsystem_platform.dylib            0x00007fff64d90f5a _sigtramp + 26
0   ???                                 0x0000000000000000 0x0 + 0
0   mariadbd                            0x000000010feb021b _ZN4JOIN21make_aggr_tables_infoEv + 3947
0   mariadbd                            0x000000010fea47db _ZN4JOIN15optimize_stage2Ev + 17995
0   mariadbd                            0x000000010fea7de5 _ZN4JOIN14optimize_innerEv + 7525
0   mariadbd                            0x000000010fe99b58 _Z12mysql_selectP3THDP10TABLE_LISTR4ListI4ItemEPS4_jP8st_orderS9_S7_S9_yP13select_resultP18st_select_lex_unitP13st_select_lex + 1144
0   mariadbd                            0x000000010fe995be _Z13handle_selectP3THDP3LEXP13select_resultm + 238
0   mariadbd                            0x000000010fe64a2f _ZL21execute_sqlcom_selectP3THDP10TABLE_LIST + 1231
0   mariadbd                            0x000000010fe5e239 _Z21mysql_execute_commandP3THDb + 2617
0   mariadbd                            0x000000010fe59a6b _Z11mysql_parseP3THDPcjP12Parser_state + 731
0   mariadbd                            0x000000010fe5794b _Z16dispatch_command19enum_server_commandP3THDPcjb + 3899
0   mariadbd                            0x000000010fe5a027 _Z10do_commandP3THDb + 1031
0   mariadbd                            0x000000010ffa4a8e _Z24do_handle_one_connectionP7CONNECTb + 430
0   mariadbd                            0x000000010ffa47ab handle_one_connection + 91
0   mariadbd                            0x0000000110145c99 pfs_spawn_thread + 217
0   libsystem_pthread.dylib             0x00007fff64d9a661 _pthread_body + 340
0   libsystem_pthread.dylib             0x00007fff64d9a50d _pthread_body + 0
0   libsystem_pthread.dylib             0x00007fff64d99bf9 thread_start + 13
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7fc0f48be830): WITH demo AS (
  SELECT seq1.seq AS dim1, seq2.seq AS dim2, seq3.seq AS dim3,
         FLOOR(RAND() * 5) AS p
    FROM seq_100_to_105 seq1
         JOIN seq_10_to_15 seq2
         JOIN seq_1_to_5 seq3
)
SELECT d.*, SUM(p)
  FROM demo d
 
Connection ID (thread ID): 4
Status: NOT_KILLED
...

If we reference proper columns of CTE explicitly, everything works, though:

MariaDB [test]> explain extended
    -> WITH demo AS (
    ->   SELECT seq1.seq AS dim1, seq2.seq AS dim2, seq3.seq AS dim3,
    ->          FLOOR(RAND() * 5) AS p
    ->     FROM seq_100_to_105 seq1
    ->          JOIN seq_10_to_15 seq2
    ->          JOIN seq_1_to_5 seq3
    -> )
    -> SELECT d.dim1, d.dim2, d.dim3, SUM(p)
    ->   FROM demo d;
+------+-------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------------------------------------+
| id   | select_type | table      | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                                                  |
+------+-------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------------------------------------+
|    1 | PRIMARY     | <derived2> | ALL   | NULL          | NULL    | NULL    | NULL | 180  |   100.00 |                                                        |
|    2 | DERIVED     | seq3       | index | NULL          | PRIMARY | 8       | NULL | 5    |   100.00 | Using index                                            |
|    2 | DERIVED     | seq1       | index | NULL          | PRIMARY | 8       | NULL | 6    |   100.00 | Using index; Using join buffer (flat, BNL join)        |
|    2 | DERIVED     | seq2       | index | NULL          | PRIMARY | 8       | NULL | 6    |   100.00 | Using index; Using join buffer (incremental, BNL join) |
+------+-------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------------------------------------+
4 rows in set, 1 warning (0.005 sec)

but if we reference d.p, we crash again the same way:

MariaDB [test]> explain extended WITH demo AS (   SELECT seq1.seq AS dim1, seq2.seq AS dim2, seq3.seq AS dim3,          FLOOR(RAND() * 5) AS p     FROM seq_100_to_105 seq1          JOIN seq_10_to_15 seq2          JOIN seq_1_to_5 seq3 ) SELECT d.dim1, d.dim2, d.dim3, d.p, SUM(p)   FROM demo d;
ERROR 2013 (HY000): Lost connection to server during query

As I can not reproduce the crash on 10.6.13 the bug is fixed somehow, but I can not find any existing MDEV based on the stack trace, so it may be just by chance and the problem may be re-introduced later. Hence this bug report.



 Comments   
Comment by Alice Sherepa [ 2023-05-19 ]

No crash after

commit ccec9b1de95a66b7597bc30e0a60bd61866f225d (HEAD)
Author: Igor Babaev <igor@askmonty.org>
Date:   Wed Mar 1 22:49:27 2023 -0800
 
    MDEV-30706 Different results of selects from view and CTE with same definition
    MDEV-30668 Set function aggregated in outer select used in view definition
    
    This patch fixes two bugs concerning views whose specifications contain
    subqueries with set functions aggregated in outer selects.
    Due to the first bug those such views that have implicit grouping were
    considered as mergeable. This led to wrong result sets for selects from
    these views.
    Due to the second bug the aggregation select was determined incorrectly and
    this led to bogus error messages.
    The patch added several test cases for these two bugs and for four other
    duplicate bugs.
    The patch also enables view-protocol for many other test cases.
    
    Approved by Oleksandr Byelkin <sanja@mariadb.com>

before that commit:

mysqld:  /sql/sql_select.cc:25585: bool setup_copy_fields(THD*, TMP_TABLE_PARAM*, Ref_ptr_array, List<Item>&, List<Item>&, uint, List<Item>&): Assertion `param->field_count > (uint) (copy - copy_start)' failed.
230519 12:57:44 [ERROR] mysqld got signal 6 ;
 
Server version: 10.4.29-MariaDB-debug-log source revision: a6a906d76697d5487418d51b251a3070ac6e8c80
 
/lib/x86_64-linux-gnu/libc.so.6(+0x33fd6)[0x7f744b975fd6]
sql/sql_select.cc:25586(setup_copy_fields(THD*, TMP_TABLE_PARAM*, Bounds_checked_array<Item*>, List<Item>&, List<Item>&, unsigned int, List<Item>&))[0x555b788c93aa]
sql/sql_select.cc:3708(JOIN::make_aggr_tables_info())[0x555b78827e51]
sql/sql_select.cc:3084(JOIN::optimize_stage2())[0x555b78820b16]
sql/sql_select.cc:2370(JOIN::optimize_inner())[0x555b78819363]
sql/sql_select.cc:1687(JOIN::optimize())[0x555b7881207b]
sql/sql_select.cc:4794(mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x555b78833205]
sql/sql_select.cc:439(handle_select(THD*, LEX*, select_result*, unsigned long))[0x555b78803f3c]
sql/sql_parse.cc:6454(execute_sqlcom_select(THD*, TABLE_LIST*))[0x555b78771dbd]
sql/sql_parse.cc:3966(mysql_execute_command(THD*))[0x555b7875f6a2]
sql/sql_parse.cc:7986(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x555b7877b19d]
sql/sql_parse.cc:1860(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x555b787519f3]
sql/sql_parse.cc:1378(do_command(THD*))[0x555b7874e534]
sql/sql_connect.cc:1420(do_handle_one_connection(CONNECT*))[0x555b78b51d65]
sql/sql_connect.cc:1325(handle_one_connection)[0x555b78b51609]
perfschema/pfs.cc:1871(pfs_spawn_thread)[0x555b797daca1]
nptl/pthread_create.c:478(start_thread)[0x7f744be90609]
addr2line: DWARF error: section .debug_info is larger than its filesize! (0x93ef57 vs 0x530ea0)
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7f744ba61133]
 
Query (0x62b0000b6290): WITH demo AS (
SELECT seq1.seq AS dim1, seq2.seq AS dim2, seq3.seq AS dim3,
FLOOR(RAND() * 5) AS p
FROM seq_100_to_105 seq1
JOIN seq_10_to_15 seq2
JOIN seq_1_to_5 seq3
)
SELECT d.*, SUM(p)
FROM demo d

230519 13:14:19 [ERROR] mysqld got signal 11 ;
 
Server version: 10.6.12-MariaDB source revision: 4c79e15cc3716f69c044d4287ad2160da8101cdc
 
sigaction.c:0(__restore_rt)[0x7fb1c8f3f420]
sql/sql_select.cc:26344(JOIN::make_sum_func_list(List<Item>&, List<Item>&, bool))[0x5631eac1e6a2]
sql/sql_select.cc:3912(JOIN::make_aggr_tables_info())[0x5631eac2b41e]
sql/sql_select.cc:3273(JOIN::optimize_stage2())[0x5631eac387f9]
sql/sql_select.cc:2530(JOIN::optimize_inner())[0x5631eac3af7d]
sql/sql_select.cc:1845(JOIN::optimize())[0x5631eac3cc12]
sql/sql_select.cc:5048(mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x5631eac3cce8]
sql/sql_select.cc:566(handle_select(THD*, LEX*, select_result*, unsigned long))[0x5631eac3d5f4]
sql/sql_parse.cc:6262(execute_sqlcom_select(THD*, TABLE_LIST*))[0x5631eaa9fbe6]
sql/sql_parse.cc:3949(mysql_execute_command(THD*, bool))[0x5631eabdf032]
sql/sql_parse.cc:8022(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x5631eabe13fb]
sql/sql_parse.cc:1955(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool))[0x5631eabe35ba]
sql/sql_parse.cc:1411(do_command(THD*, bool))[0x5631eabe4ca3]
sql/sql_connect.cc:1416(do_handle_one_connection(CONNECT*, bool))[0x5631eacdb1a7]
sql/sql_connect.cc:1324(handle_one_connection)[0x5631eacdb444]
perfschema/pfs.cc:2204(pfs_spawn_thread)[0x5631eb06bc2c]
nptl/pthread_create.c:478(start_thread)[0x7fb1c8f33609]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7fb17c010b70): WITH demo AS (
  SELECT seq1.seq AS dim1, seq2.seq AS dim2, seq3.seq AS dim3,
         FLOOR(RAND() * 5) AS p
    FROM seq_100_to_105 seq1
         JOIN seq_10_to_15 seq2
         JOIN seq_1_to_5 seq3
)
SELECT d.*, SUM(p)
  FROM demo d

Comment by Alice Sherepa [ 2023-05-22 ]

julien.fritsch the bug is probably fixed by that commit, but igor should take a look into it and decide 1) is it really fixed 2) should we add this test and then close it.

Comment by Igor Babaev [ 2024-02-01 ]

If we consider the tree before the patch for MDEV-30706 we see that the query using the equivalent derived table d instead of CTE demo

SELECT d.*, SUM(p)
  FROM (
  SELECT seq1.seq AS dim1, seq2.seq AS dim2, seq3.seq AS dim3,
         FLOOR(RAND(13) * 5) AS p
    FROM seq_100_to_105 seq1
         JOIN seq_10_to_15 seq2
         JOIN seq_1_to_5 seq3
) d;

fails at the assertion

          DBUG_ASSERT (param->field_count > (uint) (copy - copy_start));

from the function setup_copy_fields().
In debugger we see that the value of param->field_count is 3 while the value of (uint) (copy - copy_start)) is also 3.
At the same time if we run the query

select v.*, sum(v.p) from v;

that uses the view v having the same specification as the derived table d

CREATE VIEW v AS
  SELECT seq1.seq AS dim1, seq2.seq AS dim2, seq3.seq AS dim3,
         FLOOR(RAND(13) * 5) AS p
    FROM seq_100_to_105 seq1
         JOIN seq_10_to_15 seq2
         JOIN seq_1_to_5 seq3;

everything works fine and we get:

MariaDB [test]> select v1.*, sum(v1.p) from v1;
+------+------+------+------+-----------+
| dim1 | dim2 | dim3 | p    | sum(v1.p) |
+------+------+------+------+-----------+
|  100 |   10 |    1 |    2 |       371 |
+------+------+------+------+-----------+

In debugger we see that when we come to the above assertion we have the value of param->field_count equal to 5.
The value of param->field is calculated in the function count_field_types() using the parameter 'fields'. In both cases we have 5 elements in the list fields. Yet in the case with view the elements are:
v.dim1, v.dim2, v.dim3, v.p, sum(v.p)
while in the case of derived table they are
seq1.seq, seq2.seq, seq3.seq, floor(rand(13) * 5), sum(floor(rand(13) * 5)).
Both the view v and the derived table d are supposed to be materialized because their specification contains the function RAND. However the 'field' list in the case with the derived table d looks like as if the derived table were merged.
After the patch for MDEV-30706 we have the following 'fields' list for the derived table
d.dim1, d.dim2, d.dim3, d.p, sum(d.p),
param->field _count is evaluated to 5 and the query returns the right result set.

MariaDB [test]> SELECT d.*, SUM(p)
    ->   FROM (
    ->   SELECT seq1.seq AS dim1, seq2.seq AS dim2, seq3.seq AS dim3,
    ->          FLOOR(RAND(13) * 5) AS p
    ->     FROM seq_100_to_105 seq1
    ->          JOIN seq_10_to_15 seq2
    ->          JOIN seq_1_to_5 seq3
    -> ) d;
+------+------+------+------+--------+
| dim1 | dim2 | dim3 | p    | SUM(p) |
+------+------+------+------+--------+
|  100 |   10 |    1 |    2 |    371 |
+------+------+------+------+--------+

Comment by Igor Babaev [ 2024-02-03 ]

This bug is closed as fixed by the patch for MDEV-30706. See also my comment containing an analysis of this bug. A test case for this bug was pushed into 10.4.

Generated at Thu Feb 08 10:22:50 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.