MariaDB Server, MDEV-31743

Server crash in store_length, assertion failure in Type_handler_string_result::sort_length

Details

    Description

      I'll set the fix version tentatively to 10.4+, even though it's not failing there (see bisect results below); please remove 10.4 if you find out it's 10.5-specific.

      --source include/have_innodb.inc
       
      SELECT DISTINCT IF(BENCHMARK(1,VAR_POP(CHECK_TIME)),(ENCODE(PARTITION_EXPRESSION, 'x')),0) AS f FROM information_schema.PARTITIONS GROUP BY VALUE(PARTITION_METHOD) WITH ROLLUP;
      

      10.5 1a5c4c2d non-debug

      #2  <signal handler called>
      #3  0x0000560ef74f0a3a in store_length (pack_length=<optimized out>, length=1, to=0x7fd6b03575a8 <error: Cannot access memory at address 0x7fd6b03575a8>) at /data/src/10.5/sql/filesort.cc:1100
      #4  store_length (to=0x7fd6b03575a8 <error: Cannot access memory at address 0x7fd6b03575a8>, length=1, pack_length=<optimized out>) at /data/src/10.5/sql/filesort.cc:1087
      #5  0x0000560ef74f0bec in Type_handler_string_result::make_sort_key_part (this=<optimized out>, to=0x7fd5b03575a9 "", item=<optimized out>, sort_field=0x7fd5b02e78b8, tmp_buffer=<optimized out>) at /data/src/10.5/sql/filesort.cc:1166
      #6  0x0000560ef7320756 in make_sort_key (sortorder=sortorder@entry=0x7fd5b02e78b8, key_buffer=0x7fd5b03575a8 "\001", tmp_value=tmp_value@entry=0x7fd620386780) at /data/src/10.5/sql/sql_select.cc:24703
      #7  0x0000560ef734a573 in remove_dup_with_compare (having=0x0, keylength=4, sortorder=0x7fd5b02e78b8, first_field=0x7fd5b0141a48, table=0x7fd5b01411b0, thd=0x7fd5b0000c68) at /data/src/10.5/sql/sql_select.cc:24776
      #8  st_join_table::remove_duplicates (this=this@entry=0x7fd5b0100d30) at /data/src/10.5/sql/sql_select.cc:24681
      #9  0x0000560ef734bf38 in join_init_read_record (tab=0x7fd5b0100d30) at /data/src/10.5/sql/sql_select.cc:22113
      #10 0x0000560ef73594b5 in AGGR_OP::end_send (this=0x7fd5b0101ff0) at /data/src/10.5/sql/sql_select.cc:29933
      #11 0x0000560ef7359868 in sub_select_postjoin_aggr (join=0x7fd5b0013758, join_tab=0x7fd5b0100d30, end_of_records=<optimized out>) at /data/src/10.5/sql/sql_select.cc:20873
      #12 0x0000560ef736326a in do_select (procedure=<optimized out>, join=0x7fd5b0013758) at /data/src/10.5/sql/sql_select.cc:20697
      #13 JOIN::exec_inner (this=this@entry=0x7fd5b0013758) at /data/src/10.5/sql/sql_select.cc:4602
      #14 0x0000560ef7363670 in JOIN::exec (this=this@entry=0x7fd5b0013758) at /data/src/10.5/sql/sql_select.cc:4382
      #15 0x0000560ef7361666 in mysql_select (thd=thd@entry=0x7fd5b0000c68, tables=0x7fd5b00117b8, fields=..., conds=0x0, og_num=1, order=0x0, group=<optimized out>, having=<optimized out>, proc_param=<optimized out>, select_options=<optimized out>, result=<optimized out>, unit=<optimized out>, select_lex=<optimized out>) at /data/src/10.5/sql/sql_select.cc:4859
      #16 0x0000560ef736200f in handle_select (thd=thd@entry=0x7fd5b0000c68, lex=lex@entry=0x7fd5b0004b90, result=result@entry=0x7fd5b0013730, setup_tables_done_option=setup_tables_done_option@entry=0) at /data/src/10.5/sql/sql_select.cc:450
      #17 0x0000560ef72f2300 in execute_sqlcom_select (thd=thd@entry=0x7fd5b0000c68, all_tables=0x7fd5b00117b8) at /data/src/10.5/sql/sql_parse.cc:6331
      #18 0x0000560ef72feb0a in mysql_execute_command (thd=thd@entry=0x7fd5b0000c68) at /data/src/10.5/sql/sql_parse.cc:4008
      #19 0x0000560ef73000a6 in mysql_parse (thd=0x7fd5b0000c68, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>, is_com_multi=<optimized out>, is_next_command=<optimized out>) at /data/src/10.5/sql/sql_parse.cc:8106
      #20 0x0000560ef7301dc5 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7fd5b0000c68, packet=packet@entry=0x7fd5b00080c9 "", packet_length=packet_length@entry=175, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /data/src/10.5/sql/sql_parse.cc:1990
      #21 0x0000560ef7303a60 in do_command (thd=0x7fd5b0000c68) at /data/src/10.5/sql/sql_parse.cc:1375
      #22 0x0000560ef73f6082 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x560ef9ac81b8, put_in_cache=put_in_cache@entry=true) at /data/src/10.5/sql/sql_connect.cc:1416
      #23 0x0000560ef73f62ed in handle_one_connection (arg=arg@entry=0x560ef9ac81b8) at /data/src/10.5/sql/sql_connect.cc:1318
      #24 0x0000560ef77297cb in pfs_spawn_thread (arg=0x560ef9586df8) at /data/src/10.5/storage/perfschema/pfs.cc:2201
      #25 0x00007fd6272a7fd4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
      #26 0x00007fd6273285bc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

      10.5 1a5c4c2d debug

      mariadbd: /data/src/10.5/sql/filesort.cc:2130: virtual void Type_handler_string_result::sort_length(THD*, const Type_std_attributes*, SORT_FIELD_ATTR*) const: Assertion `sortorder->length <= 0xFFFFFFFFL - sortorder->suffix_length' failed.
      230719 21:34:47 [ERROR] mysqld got signal 6 ;
       
      #9  0x00007fe50c453df2 in __GI___assert_fail (assertion=0x556f24916200 "sortorder->length <= 0xFFFFFFFFL - sortorder->suffix_length", file=0x556f24913e20 "/data/src/10.5/sql/filesort.cc", line=2130, function=0x556f24916160 "virtual void Type_handler_string_result::sort_length(THD*, const Type_std_attributes*, SORT_FIELD_ATTR*) const") at ./assert/assert.c:101
      #10 0x0000556f22c17bba in Type_handler_string_result::sort_length (this=0x556f2676bc80 <type_handler_long_blob>, thd=0x62b00007e218, item=0x62b000086428, sortorder=0x60d000097db8) at /data/src/10.5/sql/filesort.cc:2130
      #11 0x0000556f22586b90 in st_join_table::remove_duplicates (this=0x6290002a4488) at /data/src/10.5/sql/sql_select.cc:24632
      #12 0x0000556f22573ae5 in join_init_read_record (tab=0x6290002a4488) at /data/src/10.5/sql/sql_select.cc:22113
      #13 0x0000556f225ae2fc in AGGR_OP::end_send (this=0x6290002a5148) at /data/src/10.5/sql/sql_select.cc:29933
      #14 0x0000556f2256c07f in sub_select_postjoin_aggr (join=0x62b0000885f0, join_tab=0x6290002a4488, end_of_records=true) at /data/src/10.5/sql/sql_select.cc:20873
      #15 0x0000556f2256cb35 in sub_select (join=0x62b0000885f0, join_tab=0x6290002a40d8, end_of_records=true) at /data/src/10.5/sql/sql_select.cc:21119
      #16 0x0000556f2256c0ae in sub_select_postjoin_aggr (join=0x62b0000885f0, join_tab=0x6290002a40d8, end_of_records=true) at /data/src/10.5/sql/sql_select.cc:20875
      #17 0x0000556f2256cb35 in sub_select (join=0x62b0000885f0, join_tab=0x6290002a3d28, end_of_records=true) at /data/src/10.5/sql/sql_select.cc:21119
      #18 0x0000556f2256b296 in do_select (join=0x62b0000885f0, procedure=0x0) at /data/src/10.5/sql/sql_select.cc:20697
      #19 0x0000556f224f6ceb in JOIN::exec_inner (this=0x62b0000885f0) at /data/src/10.5/sql/sql_select.cc:4602
      #20 0x0000556f224f42d0 in JOIN::exec (this=0x62b0000885f0) at /data/src/10.5/sql/sql_select.cc:4382
      #21 0x0000556f224f85eb in mysql_select (thd=0x62b00007e218, tables=0x62b0000865d0, fields=..., conds=0x0, og_num=1, order=0x0, group=0x62b000086fa8, having=0x0, proc_param=0x0, select_options=2684619521, result=0x62b0000885c0, unit=0x62b0000823c8, select_lex=0x62b000085400) at /data/src/10.5/sql/sql_select.cc:4859
      #22 0x0000556f224c927c in handle_select (thd=0x62b00007e218, lex=0x62b000082300, result=0x62b0000885c0, setup_tables_done_option=0) at /data/src/10.5/sql/sql_select.cc:450
      #23 0x0000556f22432b5d in execute_sqlcom_select (thd=0x62b00007e218, all_tables=0x62b0000865d0) at /data/src/10.5/sql/sql_parse.cc:6331
      #24 0x0000556f2242165f in mysql_execute_command (thd=0x62b00007e218) at /data/src/10.5/sql/sql_parse.cc:4008
      #25 0x0000556f2243daab in mysql_parse (thd=0x62b00007e218, rawbuf=0x62b000085238 "SELECT DISTINCT IF(BENCHMARK(1,VAR_POP(CHECK_TIME)),(ENCODE(PARTITION_EXPRESSION, 'x')),0) AS f FROM information_schema.PARTITIONS GROUP BY VALUE(PARTITION_METHOD) WITH ROLLUP", length=175, parser_state=0x7fe4fe2efc10, is_com_multi=false, is_next_command=false) at /data/src/10.5/sql/sql_parse.cc:8106
      #26 0x0000556f2241363c in dispatch_command (command=COM_QUERY, thd=0x62b00007e218, packet=0x629000280219 "", packet_length=175, is_com_multi=false, is_next_command=false) at /data/src/10.5/sql/sql_parse.cc:1891
      #27 0x0000556f2240ffcf in do_command (thd=0x62b00007e218) at /data/src/10.5/sql/sql_parse.cc:1375
      #28 0x0000556f2285d243 in do_handle_one_connection (connect=0x6080000039b8, put_in_cache=true) at /data/src/10.5/sql/sql_connect.cc:1416
      #29 0x0000556f2285cc0b in handle_one_connection (arg=0x608000003938) at /data/src/10.5/sql/sql_connect.cc:1318
      #30 0x0000556f234a60e2 in pfs_spawn_thread (arg=0x61500000c898) at /data/src/10.5/storage/perfschema/pfs.cc:2201
      #31 0x00007fe50c4a7fd4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
      #32 0x00007fe50c5285bc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
      

      The failure started happening on 10.5+ after this merge in 10.5.20:

      commit ac5a534a4caa6c86762e721dfe7183be2fee29ca
      Merge: e093e5abbed eaebe8b5600
      Author: Oleksandr Byelkin
      Date:   Fri Mar 31 21:32:41 2023 +0200
       
          Merge remote-tracking branch '10.4' into 10.5
      

      It is not reproducible with the provided test case on 10.4, so I cannot bisect it further.
      However, given the similarities with MDEV-31113, and the fact that commit 476b24d08, which caused that bug, was part of the merge above, that commit is my suspect.

          Activity

            oleg.smirnov Oleg Smirnov added a comment - - edited

            Simplified test case:

            create table t1 (a int, b longtext, c varchar(18));
             
            insert into t1 values (1, 'Aa123456', 'abc'), (2, 'Bb7897777', 'bcd');
             
            select distinct if(sum(a), b, 0) from t1 group by value(c) with rollup;
             
            drop table t1;
            

            The problem is that long strings are processed incorrectly in JOIN_TAB::remove_duplicates(). It is caused by a conflict between two commits:

            10.4

            commit 476b24d084e7e717310155bb986eb086d3c1e1a6
            Author: Monty <monty@mariadb.org>
            Date:   Thu Feb 16 14:19:33 2023 +0200
             
                MDEV-20057 Distinct SUM on CROSS JOIN and grouped returns wrong result
            

            10.5

            commit b753ac066bc26acda9deb707a31c112f1bbf9ec2
            Author: Varun Gupta <varun.gupta@mariadb.com>
            Date:   Tue Mar 10 04:56:38 2020 +0530
             
                MDEV-21580: Allow packed sort keys in sort buffer
            

            The second commit (in 10.5) changed how sort buffers are handled, but the first commit was made in 10.4 and was unaware of that change. That is why the bug started appearing after the merge of 10.4 into 10.5.

            More precisely, before Varun's commit the comparable string length was capped unconditionally:

              set_if_smaller(sortorder->length, thd->variables.max_sort_length);
            

            while after the commit it is done under the condition:

            if (is_variable_sized())
                set_if_smaller(length, thd->variables.max_sort_length);
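
The difference between the two variants can be modelled in isolation. This is a minimal sketch, not the server code: `SortFieldModel`, `cap_before()` and `cap_after()` are hypothetical names, and the value 1024 used in the usage note is just an example value for `max_sort_length`. Only the idea of capping the length, conditionally or not, is taken from the snippets above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for SORT_FIELD_ATTR; not the server's definition.
struct SortFieldModel {
  enum Type { FIXED_SIZE, VARIABLE_SIZE };
  Type type = FIXED_SIZE;      // what a caller gets if it never sets the type
  std::uint32_t length = 0;
  bool is_variable_sized() const { return type == VARIABLE_SIZE; }
};

// Model of the old behaviour: the cap is applied unconditionally.
inline void cap_before(SortFieldModel &f, std::uint32_t max_sort_length) {
  f.length = std::min(f.length, max_sort_length);
}

// Model of the new behaviour: the cap is applied only to variable-sized fields.
inline void cap_after(SortFieldModel &f, std::uint32_t max_sort_length) {
  if (f.is_variable_sized())
    f.length = std::min(f.length, max_sort_length);
}
```

In this model, a field with `length = 0xFFFFFFFF` and the type left at `FIXED_SIZE` is reduced to 1024 by `cap_before(f, 1024)` but keeps its full length after `cap_after(f, 1024)`, which is the situation a caller ends up in if it does not set the type.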
            

            For is_variable_sized() to work correctly, SORT_FIELD::type must be set properly. This patch seems to fix the bug:

            diff --git a/sql/sql_select.cc b/sql/sql_select.cc
            index afa5593bd7e..4411e430706 100644
            --- a/sql/sql_select.cc
            +++ b/sql/sql_select.cc
            @@ -24628,6 +24628,9 @@ JOIN_TAB::remove_duplicates()
                   {
                     /* Item is not stored in temporary table, remember it */
                     sorder->item= item;
            +        sorder->type= sorder->item->type_handler()->is_packable() ?
            +                      SORT_FIELD_ATTR::VARIABLE_SIZE :
            +                      SORT_FIELD_ATTR::FIXED_SIZE;
                     /* Calculate sorder->length */
                     item->type_handler()->sort_length(thd, item, sorder);
                     sorder++;
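
To illustrate why the missing type matters, the following sketch models the length calculation with 32-bit arithmetic. It is a simplified model under stated assumptions, not the code in filesort.cc: `SortAttr`, `LengthResult`, `type_for()` and `string_sort_length()` are hypothetical, as are the 4-byte suffix and the maximum length of 0xFFFFFFFF assumed for a LONGBLOB-typed item. The bound being checked is the one quoted in the assertion message of the debug trace.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for SORT_FIELD_ATTR; not the server's definition.
struct SortAttr {
  enum Type { FIXED_SIZE, VARIABLE_SIZE };
  Type type = FIXED_SIZE;
  std::uint32_t length = 0;
  std::uint32_t suffix_length = 0;
};

struct LengthResult {
  bool within_bound;     // length <= 0xFFFFFFFF - suffix_length
  std::uint32_t total;   // length + suffix_length, modulo 2^32
};

// Mirrors the choice made by the patch: packable types are variable-sized.
inline SortAttr::Type type_for(bool is_packable) {
  return is_packable ? SortAttr::VARIABLE_SIZE : SortAttr::FIXED_SIZE;
}

// Simplified model of a string sort-length calculation (an assumption,
// not the real Type_handler_string_result::sort_length()).
inline LengthResult string_sort_length(SortAttr attr,
                                       std::uint32_t max_length,
                                       std::uint32_t max_sort_length,
                                       std::uint32_t suffix_length) {
  attr.length = max_length;
  if (attr.type == SortAttr::VARIABLE_SIZE)
    attr.length = std::min(attr.length, max_sort_length);
  attr.suffix_length = suffix_length;
  const bool within_bound = attr.length <= 0xFFFFFFFFu - attr.suffix_length;
  const std::uint32_t total = attr.length + attr.suffix_length;  // unsigned wraparound
  return LengthResult{within_bound, total};
}
```

With the type left at `FIXED_SIZE`, `string_sort_length(attr, 0xFFFFFFFFu, 1024, 4)` violates the bound in this model and the wrapped total is 3, so a buffer sized from it would be far too small for the data. With `attr.type = type_for(true)`, the length is capped first and the total is 1028. Whether the non-debug crash in store_length arose in exactly this way is an assumption.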
            

            oleg.smirnov Oleg Smirnov added a comment -

            @Monty, can you please review https://github.com/MariaDB/server/pull/2715 or branch bb-10.5-mdev-31743?

            By the way, wouldn't it make sense to employ packed sort keys in JOIN_TAB::remove_duplicates()?


            monty Michael Widenius added a comment -

            Patch looks ok.

            Regarding packed keys in JOIN_TAB::remove_duplicates():
            they are not needed to save space, as we only keep two rows in memory at any time
            (the row we are checking and the row we are comparing against).

            However, we could store pointers to blob keys separately in a list and modify the row comparison to
            also check that list. This would allow us to do DISTINCT for blobs of any size.

            In other words, modify the code here:

              if (compare_record(table, first_field) == 0 &&
                  (!keylength ||
                   memcmp(key_buffer, key_buffer2, keylength) == 0))

            to:

              if (compare_record(table, first_field) == 0 &&
                  (!keylength ||
                   memcmp(key_buffer, key_buffer2, keylength) == 0 &&
                   compare_all_blobs() == 0))
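
The suggested comparison could be sketched as follows. This is a standalone model, not a patch: `RowKey`, `rows_equal()` and the body of `compare_all_blobs()` are hypothetical (only the name `compare_all_blobs` comes from the comment above), and `compare_record()` is left out because it depends on server table structures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical model of the comparison data kept for one row.
struct RowKey {
  std::vector<unsigned char> key_buffer;  // fixed-size, possibly truncated key
  std::vector<std::string> blobs;         // full blob values kept outside the key
};

// memcmp-style result: 0 when every blob matches, non-zero otherwise.
inline int compare_all_blobs(const RowKey &a, const RowKey &b) {
  if (a.blobs.size() != b.blobs.size())
    return 1;
  for (std::size_t i = 0; i < a.blobs.size(); ++i)
    if (a.blobs[i] != b.blobs[i])
      return 1;
  return 0;
}

// Mirrors the operator precedence of the condition quoted above:
// the blob check only runs when there is a key to compare.
inline bool rows_equal(const RowKey &a, const RowKey &b, std::size_t keylength) {
  if (a.key_buffer.size() < keylength || b.key_buffer.size() < keylength)
    return false;  // guard for the model only; buffers must hold keylength bytes
  return !keylength ||
         (std::memcmp(a.key_buffer.data(), b.key_buffer.data(), keylength) == 0 &&
          compare_all_blobs(a, b) == 0);
}
```

In this model, two rows whose truncated keys are byte-identical but whose blobs differ past the truncation point are no longer treated as duplicates, which is the point of checking the separate blob list.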

            monty Michael Widenius added a comment -

            Ok to push
            oleg.smirnov Oleg Smirnov added a comment -

            Pushed to 10.5 and created MDEV-31834 for the suggested optimization.


            psergei Sergei Petrunia added a comment -

            Note for the documentation: queries of the form

            SELECT DISTINCT function_returning_string(aggregate_func(...)) ... GROUP BY ...

            could produce an assertion failure or a wrong query result.

            People

              oleg.smirnov Oleg Smirnov
              elenst Elena Stepanova