[MDEV-31113] Server crashes in store_length / Type_handler_string_result::make_sort_key with DISTINCT and group function Created: 2023-04-23  Updated: 2023-07-26  Resolved: 2023-04-28

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0
Fix Version/s: 11.1.1, 10.11.3, 11.0.2, 10.4.29, 10.5.20, 10.6.13, 10.8.8, 10.9.6, 10.10.4

Type: Bug Priority: Blocker
Reporter: Elena Stepanova Assignee: Oleg Smirnov
Resolution: Fixed Votes: 0
Labels: regression


 Description   

Note: The query in the test case doesn't look meaningful, but I'm not sure the problem is limited to such nonsensical queries. If the analysis shows that it indeed does, I don't object demoting it from a blocker.

CREATE TABLE t (f INT);
INSERT INTO t VALUES (1),(2);
SELECT DISTINCT CONVERT(STDDEV(f), CHAR(16)) AS f1, UUID() AS f2 FROM t GROUP BY f2 WITH ROLLUP;
 
# Cleanup
DROP TABLE t;

10.4 2e1c532b

#3  <signal handler called>
#4  0x0000559fad00b719 in store_length (to=0x7f74ae61d3f9 <error: Cannot access memory at address 0x7f74ae61d3f9>, length=6, pack_length=2779096485) at /data/src/10.4/sql/filesort.cc:970
#5  0x0000559fad00ba35 in Type_handler_string_result::make_sort_key (this=0x559fae437f10 <type_handler_varchar>, to=0x7f745407798e '\245' <repeats 200 times>..., item=0x7f7454015018, sort_field=0x7f7454155fa0, tmp_buffer=0x7f7464b72e60) at /data/src/10.4/sql/filesort.cc:1037
#6  0x0000559facd47ea9 in make_sort_key (sortorder=0x7f7454155fa0, key_buffer=0x7f745407798d "\001", '\245' <repeats 199 times>..., tmp_value=0x7f7464b72e60) at /data/src/10.4/sql/sql_select.cc:24322
#7  0x0000559facd488bc in remove_dup_with_hash_index (thd=0x7f7454000da0, table=0x7f74540537d8, field_count=1, first_field=0x7f7454054b58, sortorder=0x7f7454155fa0, key_length=126, having=0x0) at /data/src/10.4/sql/sql_select.cc:24534
#8  0x0000559facd47d30 in st_join_table::remove_duplicates (this=0x7f74540180f8) at /data/src/10.4/sql/sql_select.cc:24296
#9  0x0000559facd40fb2 in join_init_read_record (tab=0x7f74540180f8) at /data/src/10.4/sql/sql_select.cc:21753
#10 0x0000559facd578ed in AGGR_OP::end_send (this=0x7f7454019210) at /data/src/10.4/sql/sql_select.cc:29577
#11 0x0000559facd3e731 in sub_select_postjoin_aggr (join=0x7f7454016420, join_tab=0x7f74540180f8, end_of_records=true) at /data/src/10.4/sql/sql_select.cc:20544
#12 0x0000559facd3eaaf in sub_select (join=0x7f7454016420, join_tab=0x7f7454017d50, end_of_records=true) at /data/src/10.4/sql/sql_select.cc:20779
#13 0x0000559facd3e751 in sub_select_postjoin_aggr (join=0x7f7454016420, join_tab=0x7f7454017d50, end_of_records=true) at /data/src/10.4/sql/sql_select.cc:20546
#14 0x0000559facd3eaaf in sub_select (join=0x7f7454016420, join_tab=0x7f74540179a8, end_of_records=true) at /data/src/10.4/sql/sql_select.cc:20779
#15 0x0000559facd3e16a in do_select (join=0x7f7454016420, procedure=0x0) at /data/src/10.4/sql/sql_select.cc:20368
#16 0x0000559facd1173c in JOIN::exec_inner (this=0x7f7454016420) at /data/src/10.4/sql/sql_select.cc:4590
#17 0x0000559facd10810 in JOIN::exec (this=0x7f7454016420) at /data/src/10.4/sql/sql_select.cc:4372
#18 0x0000559facd1200f in mysql_select (thd=0x7f7454000da0, tables=0x7f7454015260, wild_num=0, fields=..., conds=0x0, og_num=1, order=0x0, group=0x7f7454015a80, having=0x0, proc_param=0x0, select_options=2147748609, result=0x7f74540163f8, unit=0x7f7454004cd8, select_lex=0x7f7454014900) at /data/src/10.4/sql/sql_select.cc:4811
#19 0x0000559facd00ddf in handle_select (thd=0x7f7454000da0, lex=0x7f7454004c18, result=0x7f74540163f8, setup_tables_done_option=0) at /data/src/10.4/sql/sql_select.cc:442
#20 0x0000559faccc3c0e in execute_sqlcom_select (thd=0x7f7454000da0, all_tables=0x7f7454015260) at /data/src/10.4/sql/sql_parse.cc:6463
#21 0x0000559faccba110 in mysql_execute_command (thd=0x7f7454000da0) at /data/src/10.4/sql/sql_parse.cc:3966
#22 0x0000559faccc7d83 in mysql_parse (thd=0x7f7454000da0, rawbuf=0x7f74540147e8 "SELECT DISTINCT CONVERT(STDDEV(f), CHAR(16)) AS f1, UUID() AS f2 FROM t GROUP BY f2 WITH ROLLUP", length=95, parser_state=0x7f7464b743c0, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:7998
#23 0x0000559faccb37b5 in dispatch_command (command=COM_QUERY, thd=0x7f7454000da0, packet=0x7f745400ac41 "", packet_length=95, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:1857
#24 0x0000559faccb201c in do_command (thd=0x7f7454000da0) at /data/src/10.4/sql/sql_parse.cc:1378
#25 0x0000559face529ca in do_handle_one_connection (connect=0x559fb0eefa00) at /data/src/10.4/sql/sql_connect.cc:1420
#26 0x0000559face52732 in handle_one_connection (arg=0x559fb0eefa00) at /data/src/10.4/sql/sql_connect.cc:1324
#27 0x0000559fad3a2a48 in pfs_spawn_thread (arg=0x559fb0f35930) at /data/src/10.4/storage/perfschema/pfs.cc:1869
#28 0x00007f746abc8fd4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#29 0x00007f746ac495bc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

The failure started happening on 10.4+ after this commit:

commit 476b24d084e7e717310155bb986eb086d3c1e1a6
Author: Monty
Date:   Thu Feb 16 14:19:33 2023 +0200
 
    MDEV-20057 Distinct SUM on CROSS JOIN and grouped returns wrong result



 Comments   
Comment by Elena Stepanova [ 2023-04-24 ]

I suppose this failure is related to the initially reported:

mysqld: /home/vsts/src/sql/filesort.cc:2144: virtual void Type_handler_string_result::sort_length(THD*, const Type_std_attributes*, SORT_FIELD_ATTR*) const: Assertion `sortorder->length <= 0xFFFFFFFFL - sor
torder->suffix_length' failed.
230424  6:58:39 [ERROR] mysqld got signal 6 ;

Comment by Sergei Petrunia [ 2023-04-24 ]

The problem starts here:

  #0  Type_handler_string_result::make_sort_key (this=0x5555575aeef0 <type_handler_varchar>, to=0x7fffd40ba68e '\245' <repeats 200 times>..., item=0x7fffd4017168, sort_field=0x7fffd409cca0, tmp_buffer=0x7fffeb548eb0) at /home/psergey/dev-git2/10.4-look/sql/filesort.cc:1026
  #1  0x0000555555e1f4b8 in make_sort_key (sortorder=0x7fffd409cca0, key_buffer=0x7fffd40ba68d "\001", '\245' <repeats 199 times>..., tmp_value=0x7fffeb548eb0) at /home/psergey/dev-git2/10.4-look/sql/sql_select.cc:24322
  #2  0x0000555555e1fe9e in remove_dup_with_hash_index (thd=0x7fffd4000d50, table=0x7fffd40787a8, field_count=1, first_field=0x7fffd4079b28, sortorder=0x7fffd409cca0, key_length=126, having=0x0) at /home/psergey/dev-git2/10.4-look/sql/sql_select.cc:24534
  #3  0x0000555555e1f335 in st_join_table::remove_duplicates (this=0x7fffd401a248) at /home/psergey/dev-git2/10.4-look/sql/sql_select.cc:24296

We crash because of non-sensical value:

(gdb) print sort_field_length
  $26 = 1515870827

Which comes from

     uint sort_field_length= sort_field->length - sort_field->suffix_length;

where

(gdb) p sort_field->length
  $23 = 16
(gdb) p sort_field->suffix_length
  $24 = 2779096485
(gdb) p/x sort_field->suffix_length
  $25 = 0xa5a5a5a5

Comment by Sergei Petrunia [ 2023-04-24 ]

Doesn't seem to be directly related to MDEV-20057 .

I see a mismatch between
Type_handler_string_result::make_sort_key() and Type_handler_string_result::sortlength().

We use collation latin1_swedish_ci. latin1_swedish_ci!= &my_charset_bin but also for ituse_strnxfrm(...)= false, so Type_handler_string_result::sortlength() function doesn't set suffix_length.

But the code in Type_handler_string_result::make_sort_key() will try to read suffix_length if use_strnxfrm(cs)==false.

oleg.smirnov, can you check the above reasoning. Any clue? Please ask bar if necessary.

Comment by Oleg Smirnov [ 2023-04-25 ]

psergei, please review the patch bb-10.4-mdev-31113.

Comment by Sergei Petrunia [ 2023-04-25 ]

oleg.smirnov, please make Type_handler_string_result::sortlength() to assign suffix_length in all cases when Type_handler_string_result::make_sort_key() needs it.

That way, the code will be structured better.

Comment by Michael Widenius [ 2023-04-25 ]

An alternative, more future proof patch, is the following (in the same function):

+ if (!(sortorder= (SORT_FIELD*) my_malloc((fields->elements+1) *
+ sizeof(SORT_FIELD),
+ MYF(MY_WME))))

Add MY_ZEROFILL to MY_WME above.

Comment by Oleg Smirnov [ 2023-04-25 ]

After discussion during an optimizer call we decided to proceed with the bzero() approach.

Comment by Sergei Petrunia [ 2023-04-26 ]

Ok to push

Comment by Oleg Smirnov [ 2023-04-28 ]

Pushed to 10.4

Generated at Thu Feb 08 10:21:23 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.