Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-6255

DUPLICATE KEY Errors on SELECT .. GROUP BY that uses temporary and filesort

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 5.5.37, 10.0.10
    • 10.0.14
    • Optimizer
    • None
    • Windows, Linux

    Description

      Import the table provided into a MariaDB 5.5 and/or 10.0 instance, ensure your max_heap_table_size and tmp_table_size are 32M or less, and issue these queries:

      SELECT * FROM t_album2 GROUP BY Composer;
      SELECT Composer FROM t_album2 GROUP BY Composer;

      It returns DUPLICATE KEY errors:
      ERROR 1034 (HY000): Duplicate key 1 for record at 583508 against record at 11380

      Here is the table structure:

      CREATE TABLE `t_album` (
      `id_album` int(11) NOT NULL AUTO_INCREMENT,
      `Composer` varchar(255) COLLATE utf8_swedish_ci NOT NULL DEFAULT '',
      PRIMARY KEY (`id_album`)
      ) ENGINE=InnoDB AUTO_INCREMENT=1050654 DEFAULT CHARSET=utf8 COLLATE=utf8_swedish_ci;

      The GROUP BY must be in the query to trigger the error.

      Notice that if I increase max_heap_table_size and tmp_table_size (512M seems to work here), then the query runs as expected.

      SET SESSION max_heap_table_size=536870912; 
      SET SESSION tmp_table_size=536870912;

      After-thought: With the same tmp/heap table sizes, the query runs fine in MySQL 5.5.37, though the bug looks somewhat similar to this open MySQL bug:

      http://bugs.mysql.com/bug.php?id=58081

      Attachments

        Activity

          On a debug version, the same scenario causes an assertion failure:

          mysqld: 5.5/storage/maria/ha_maria.cc:1982: virtual int ha_maria::enable_indexes(uint): Assertion `thd->killed != 0' failed.
          140521 17:42:15 [ERROR] mysqld got signal 6 ;

          #6  0x00007f235ecda621 in *__GI___assert_fail (assertion=0xeeae27 "thd->killed != 0", file=<optimized out>, line=1982, function=0xeec240 "virtual int ha_maria::enable_indexes(uint)") at assert.c:81
          #7  0x0000000000b811f2 in ha_maria::enable_indexes (this=0x7f233597cc78, mode=2) at 5.5/storage/maria/ha_maria.cc:1982
          #8  0x0000000000b81829 in ha_maria::end_bulk_insert (this=0x7f233597cc78) at 5.5/storage/maria/ha_maria.cc:2167
          #9  0x000000000061e12e in handler::ha_end_bulk_insert (this=0x7f233597cc78) at 5.5/sql/handler.h:1937
          #10 0x0000000000688008 in create_internal_tmp_table_from_heap2 (thd=0x7f23457e1060, table=0x7f233587a078, start_recinfo=0x7f233587aea8, recinfo=0x7f2335895cd8, error=135, ignore_last_dupp_key_error=true, hton=0x7f235e52d560, proc_info=0xd971d3 "converting HEAP to Aria", is_duplicate=0x7f2360605453) at 5.5/sql/sql_select.cc:16212
          #11 0x0000000000687b86 in create_internal_tmp_table_from_heap (thd=0x7f23457e1060, table=0x7f233587a078, start_recinfo=0x7f233587aea8, recinfo=0x7f2335895cd8, error=135, ignore_last_dupp_key_error=true, is_duplicate=0x7f2360605453) at 5.5/sql/sql_select.cc:15954
          #12 0x000000000068c97e in end_write (join=0x7f2335895a80, join_tab=0x7f2335896fc0, end_of_records=false) at 5.5/sql/sql_select.cc:18129
          #13 0x0000000000689c1d in evaluate_join_record (join=0x7f2335895a80, join_tab=0x7f2335896ca0, error=0) at 5.5/sql/sql_select.cc:17033
          #14 0x00000000006896e6 in sub_select (join=0x7f2335895a80, join_tab=0x7f2335896ca0, end_of_records=false) at 5.5/sql/sql_select.cc:16853
          #15 0x0000000000688db4 in do_select (join=0x7f2335895a80, fields=0x0, table=0x7f233587a078, procedure=0x0) at 5.5/sql/sql_select.cc:16476
          #16 0x0000000000666289 in JOIN::exec (this=0x7f2335895a80) at 5.5/sql/sql_select.cc:2445
          #17 0x0000000000668579 in mysql_select (thd=0x7f23457e1060, rref_pointer_array=0x7f23457e4cc8, tables=0x7f2335895298, wild_num=1, fields=..., conds=0x0, og_num=1, order=0x0, group=0x7f2335895988, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7f2335895a60, unit=0x7f23457e4380, select_lex=0x7f23457e4a58) at 5.5/sql/sql_select.cc:3089
          #18 0x000000000065f079 in handle_select (thd=0x7f23457e1060, lex=0x7f23457e42d0, result=0x7f2335895a60, setup_tables_done_option=0) at 5.5/sql/sql_select.cc:319
          #19 0x0000000000638693 in execute_sqlcom_select (thd=0x7f23457e1060, all_tables=0x7f2335895298) at 5.5/sql/sql_parse.cc:4688
          #20 0x0000000000631875 in mysql_execute_command (thd=0x7f23457e1060) at 5.5/sql/sql_parse.cc:2233
          #21 0x000000000063ada2 in mysql_parse (thd=0x7f23457e1060, rawbuf=0x7f2335895078 "SELECT * FROM t_album2 GROUP BY Composer", length=40, parser_state=0x7f2360606680) at 5.5/sql/sql_parse.cc:5799
          #22 0x000000000062eded in dispatch_command (command=COM_QUERY, thd=0x7f23457e1060, packet=0x7f234a7cb061 "SELECT * FROM t_album2 GROUP BY Composer", packet_length=40) at 5.5/sql/sql_parse.cc:1079
          #23 0x000000000062df79 in do_command (thd=0x7f23457e1060) at 5.5/sql/sql_parse.cc:793
          #24 0x000000000072f872 in do_handle_one_connection (thd_arg=0x7f23457e1060) at 5.5/sql/sql_connect.cc:1266
          #25 0x000000000072f331 in handle_one_connection (arg=0x7f23457e1060) at 5.5/sql/sql_connect.cc:1181
          #26 0x00007f236023ab50 in start_thread (arg=<optimized out>) at pthread_create.c:304
          #27 0x00007f235ed89a7d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

          Stack trace from:

          revision-id: monty@askmonty.org-20140517074259-8etfmks2mr7n2gbi
          revno: 4190
          branch-nick: 5.5

          elenst Elena Stepanova added a comment - On a debug version, the same scenario causes an assertion failure: mysqld: 5.5/storage/maria/ha_maria.cc:1982: virtual int ha_maria::enable_indexes(uint): Assertion `thd->killed != 0' failed. 140521 17:42:15 [ERROR] mysqld got signal 6 ; #6 0x00007f235ecda621 in *__GI___assert_fail (assertion=0xeeae27 "thd->killed != 0", file=<optimized out>, line=1982, function=0xeec240 "virtual int ha_maria::enable_indexes(uint)") at assert.c:81 #7 0x0000000000b811f2 in ha_maria::enable_indexes (this=0x7f233597cc78, mode=2) at 5.5/storage/maria/ha_maria.cc:1982 #8 0x0000000000b81829 in ha_maria::end_bulk_insert (this=0x7f233597cc78) at 5.5/storage/maria/ha_maria.cc:2167 #9 0x000000000061e12e in handler::ha_end_bulk_insert (this=0x7f233597cc78) at 5.5/sql/handler.h:1937 #10 0x0000000000688008 in create_internal_tmp_table_from_heap2 (thd=0x7f23457e1060, table=0x7f233587a078, start_recinfo=0x7f233587aea8, recinfo=0x7f2335895cd8, error=135, ignore_last_dupp_key_error=true, hton=0x7f235e52d560, proc_info=0xd971d3 "converting HEAP to Aria", is_duplicate=0x7f2360605453) at 5.5/sql/sql_select.cc:16212 #11 0x0000000000687b86 in create_internal_tmp_table_from_heap (thd=0x7f23457e1060, table=0x7f233587a078, start_recinfo=0x7f233587aea8, recinfo=0x7f2335895cd8, error=135, ignore_last_dupp_key_error=true, is_duplicate=0x7f2360605453) at 5.5/sql/sql_select.cc:15954 #12 0x000000000068c97e in end_write (join=0x7f2335895a80, join_tab=0x7f2335896fc0, end_of_records=false) at 5.5/sql/sql_select.cc:18129 #13 0x0000000000689c1d in evaluate_join_record (join=0x7f2335895a80, join_tab=0x7f2335896ca0, error=0) at 5.5/sql/sql_select.cc:17033 #14 0x00000000006896e6 in sub_select (join=0x7f2335895a80, join_tab=0x7f2335896ca0, end_of_records=false) at 5.5/sql/sql_select.cc:16853 #15 0x0000000000688db4 in do_select (join=0x7f2335895a80, fields=0x0, table=0x7f233587a078, procedure=0x0) at 5.5/sql/sql_select.cc:16476 #16 0x0000000000666289 in JOIN::exec (this=0x7f2335895a80) at 5.5/sql/sql_select.cc:2445 #17 0x0000000000668579 in mysql_select (thd=0x7f23457e1060, rref_pointer_array=0x7f23457e4cc8, tables=0x7f2335895298, wild_num=1, fields=..., conds=0x0, og_num=1, order=0x0, group=0x7f2335895988, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7f2335895a60, unit=0x7f23457e4380, select_lex=0x7f23457e4a58) at 5.5/sql/sql_select.cc:3089 #18 0x000000000065f079 in handle_select (thd=0x7f23457e1060, lex=0x7f23457e42d0, result=0x7f2335895a60, setup_tables_done_option=0) at 5.5/sql/sql_select.cc:319 #19 0x0000000000638693 in execute_sqlcom_select (thd=0x7f23457e1060, all_tables=0x7f2335895298) at 5.5/sql/sql_parse.cc:4688 #20 0x0000000000631875 in mysql_execute_command (thd=0x7f23457e1060) at 5.5/sql/sql_parse.cc:2233 #21 0x000000000063ada2 in mysql_parse (thd=0x7f23457e1060, rawbuf=0x7f2335895078 "SELECT * FROM t_album2 GROUP BY Composer", length=40, parser_state=0x7f2360606680) at 5.5/sql/sql_parse.cc:5799 #22 0x000000000062eded in dispatch_command (command=COM_QUERY, thd=0x7f23457e1060, packet=0x7f234a7cb061 "SELECT * FROM t_album2 GROUP BY Composer", packet_length=40) at 5.5/sql/sql_parse.cc:1079 #23 0x000000000062df79 in do_command (thd=0x7f23457e1060) at 5.5/sql/sql_parse.cc:793 #24 0x000000000072f872 in do_handle_one_connection (thd_arg=0x7f23457e1060) at 5.5/sql/sql_connect.cc:1266 #25 0x000000000072f331 in handle_one_connection (arg=0x7f23457e1060) at 5.5/sql/sql_connect.cc:1181 #26 0x00007f236023ab50 in start_thread (arg=<optimized out>) at pthread_create.c:304 #27 0x00007f235ed89a7d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 Stack trace from: revision-id: monty@askmonty.org-20140517074259-8etfmks2mr7n2gbi revno: 4190 branch-nick: 5.5

          The reason it fails in the debug version is that we have an assert to catch any bugs when re-creating keys (under normal circumstances this should never fail).

          The problem here is that one of the composers contains 0x20C2A0 (space, no-break space) at the end (id_album= 116996). This end-string is sorted differently in my_hash_sort() and my_strnncollsp() which causes a duplicated key when we convert the internal hash table to an aria table.

          The fix is to ensure that both sort methods will work identically.

          monty Michael Widenius added a comment - The reason it fails in the debug version is that we have an assert to catch any bugs when re-creating keys (under normal circumstances this should never fail). The problem here is that one of the composers contains 0x20C2A0 (space, no-break space) at the end (id_album= 116996). This end-string is sorted differently in my_hash_sort() and my_strnncollsp() which causes a duplicated key when we convert the internal hash table to an aria table. The fix is to ensure that both sort methods will work identically.
          bar Alexander Barkov added a comment - - edited

          This SQL script demonstrates the same problem using partitions:

          CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET cp1251 COLLATE cp1251_ukrainian_ci);
          INSERT INTO t1 VALUES (0x20),(0x60),(0x6060),(0x606060);
          SELECT * FROM t1 WHERE a=0x60;
          ALTER TABLE t1  PARTITION BY KEY(a) PARTITIONS 3;
          SELECT * FROM t1 WHERE a=0x60;
          DROP TABLE t1;

          The first query correctly returns all values:

          MariaDB [test]> SELECT * FROM t1 WHERE a=0x60;
          +------+
          | a    |
          +------+
          |      |
          | `    |
          | ``   |
          | ```  |
          +------+
          4 rows in set (0.00 sec)

          The second query returns only one value:

          MariaDB [test]> SELECT * FROM t1 WHERE a=0x60;
          +------+
          | a    |
          +------+
          | `    |
          +------+
          1 row in set (0.00 sec)

          This happens because different hash values are erroneously calculated for the four equal values,
          so the values do not go into the same partition.

          Note, 0x20 and 0x60 are equal in cp1251_ukrainian_ci.
          http://collation-charts.org/mysql60/mysql604.cp1251_ukrainian_ci.html

          bar Alexander Barkov added a comment - - edited This SQL script demonstrates the same problem using partitions: CREATE TABLE t1 (a VARCHAR (10) CHARACTER SET cp1251 COLLATE cp1251_ukrainian_ci); INSERT INTO t1 VALUES (0x20),(0x60),(0x6060),(0x606060); SELECT * FROM t1 WHERE a=0x60; ALTER TABLE t1 PARTITION BY KEY (a) PARTITIONS 3; SELECT * FROM t1 WHERE a=0x60; DROP TABLE t1; The first query correctly returns all values: MariaDB [test]> SELECT * FROM t1 WHERE a=0x60; +------+ | a | +------+ | | | ` | | `` | | ``` | +------+ 4 rows in set (0.00 sec) The second query returns only one value: MariaDB [test]> SELECT * FROM t1 WHERE a=0x60; +------+ | a | +------+ | ` | +------+ 1 row in set (0.00 sec) This happens because different hash values are erroneously calculated for the four equal values, so the values do not go into the same partition. Note, 0x20 and 0x60 are equal in cp1251_ukrainian_ci. http://collation-charts.org/mysql60/mysql604.cp1251_ukrainian_ci.html

          Fix pushed

          monty Michael Widenius added a comment - Fix pushed

          People

            monty Michael Widenius
            ccalender Chris Calender (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.