[MDEV-7122] Assertion `0' failed in subselect_hash_sj_engine::init Created: 2014-11-17  Updated: 2016-02-10  Resolved: 2016-02-10

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 5.5, 10.0
Fix Version/s: 5.5.48

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Vicențiu Ciorbaru
Resolution: Fixed Votes: 0
Labels: upstream-fixed

Sprint: 5.5.47-1, 5.5.48-0

 Description   

The bug report was initially reported upstream as http://bugs.mysql.com/bug.php?id=74889. On MariaDB 5.5/10.0, the same test case causes a different assertion failure.

Even if it's related, the upstream issue is said to be fixed in 5.6.22 (it's not public yet, so couldn't check), and since it's not InnoDB, the bugfix won't be automatically merged to 10.0.

5.5/sql/item_subselect.cc:4334: bool subselect_hash_sj_engine::init(List<Item>*, uint): Assertion `0' failed.
141117 15:23:25 [ERROR] mysqld got signal 6 ;

#6  0x00007f68cdf706f1 in *__GI___assert_fail (assertion=0xe1b1d8 "0", file=<optimized out>, line=4334, function=0xe1d620 "bool subselect_hash_sj_engine::init(List<Item>*, uint)") at assert.c:81
#7  0x0000000000885b5f in subselect_hash_sj_engine::init (this=0x7f68c898b5f0, tmp_columns=0x7f68c8a87a98, subquery_id=2) at /home/elenst/bzr/5.5/sql/item_subselect.cc:4334
#8  0x0000000000882633 in Item_in_subselect::setup_mat_engine (this=0x7f68c8a88c58) at /home/elenst/bzr/5.5/sql/item_subselect.cc:2766
#9  0x000000000077ba23 in JOIN::choose_subquery_plan (this=0x7f68c8968698, join_tables=1) at /home/elenst/bzr/5.5/sql/opt_subselect.cc:5539
#10 0x000000000066e626 in make_join_statistics (join=0x7f68c8968698, tables_list=..., conds=0x0, keyuse_array=0x7f68c89689a0) at /home/elenst/bzr/5.5/sql/sql_select.cc:3826
#11 0x00000000006654ce in JOIN::optimize (this=0x7f68c8968698) at /home/elenst/bzr/5.5/sql/sql_select.cc:1229
#12 0x000000000087c95c in Item_in_subselect::optimize (this=0x7f68c8a88c58, out_rows=0x7f68c93b4518, cost=0x7f68c93b4510) at /home/elenst/bzr/5.5/sql/item_subselect.cc:689
#13 0x000000000077ad5b in setup_jtbm_semi_joins (join=0x7f68c8968078, join_list=0x7f68c9953bd0, join_where=0x7f68c8968490) at /home/elenst/bzr/5.5/sql/opt_subselect.cc:5181
#14 0x0000000000664c56 in JOIN::optimize (this=0x7f68c8968078) at /home/elenst/bzr/5.5/sql/sql_select.cc:1084
#15 0x000000000066bd37 in mysql_select (thd=0x7f68c9950060, rref_pointer_array=0x7f68c9953cc8, tables=0x7f68c8a87298, wild_num=1, fields=..., conds=0x7f68c8a88c58, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7f68c8a88ed8, unit=0x7f68c9953380, select_lex=0x7f68c9953a58) at /home/elenst/bzr/5.5/sql/sql_select.cc:3080
#16 0x0000000000662899 in handle_select (thd=0x7f68c9950060, lex=0x7f68c99532d0, result=0x7f68c8a88ed8, setup_tables_done_option=0) at /home/elenst/bzr/5.5/sql/sql_select.cc:319
#17 0x000000000063bee0 in execute_sqlcom_select (thd=0x7f68c9950060, all_tables=0x7f68c8a87298) at /home/elenst/bzr/5.5/sql/sql_parse.cc:4689
#18 0x00000000006350c2 in mysql_execute_command (thd=0x7f68c9950060) at /home/elenst/bzr/5.5/sql/sql_parse.cc:2234
#19 0x000000000063e5ee in mysql_parse (thd=0x7f68c9950060, rawbuf=0x7f68c8a87078 "SELECT * FROM t1 WHERE a IN(SELECT MIN(a)FROM t1)", length=49, parser_state=0x7f68c93b5630) at /home/elenst/bzr/5.5/sql/sql_parse.cc:5800
#20 0x0000000000632609 in dispatch_command (command=COM_QUERY, thd=0x7f68c9950060, packet=0x7f68c9a07061 "SELECT * FROM t1 WHERE a IN(SELECT MIN(a)FROM t1)", packet_length=49) at /home/elenst/bzr/5.5/sql/sql_parse.cc:1079
#21 0x0000000000631795 in do_command (thd=0x7f68c9950060) at /home/elenst/bzr/5.5/sql/sql_parse.cc:793
#22 0x00000000007338f2 in do_handle_one_connection (thd_arg=0x7f68c9950060) at /home/elenst/bzr/5.5/sql/sql_connect.cc:1266
#23 0x00000000007333b1 in handle_one_connection (arg=0x7f68c9950060) at /home/elenst/bzr/5.5/sql/sql_connect.cc:1181
#24 0x0000000000b6b28d in pfs_spawn_thread (arg=0x7f68c9971ca0) at /home/elenst/bzr/5.5/storage/perfschema/pfs.cc:1015
#25 0x00007f68cfd6ab50 in start_thread (arg=<optimized out>) at pthread_create.c:304
#26 0x00007f68ce02120d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

Test case

SET SESSION big_tables=1;
CREATE TABLE t1(a char(255)DEFAULT'',KEY(a(10))) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
INSERT INTO t1 VALUES(0),(0),(0);
SELECT * FROM t1 WHERE a IN(SELECT MIN(a)FROM t1);

Stack trace from 5.5

revision-id: sanja@askmonty.org-20141115211833-waisauprym8dbgzi
date: 2014-11-15 22:18:33 +0100
build-date: 2014-11-17 16:09:29 +0400
revno: 4350
branch-nick: 5.5



 Comments   
Comment by Daniel Black [ 2015-10-23 ]

I had a look and couldn't see anything buts its probably here somewhere https://github.com/mysql/mysql-server/compare/mysql-5.5.21...mysql-5.5.22

Comment by Sergei Petrunia [ 2015-12-09 ]

So, it is trying to initialize the query plan for non-merged semi-join materialization strategy (aka JTBM=Join TaB Materialization).

And the assert is here:

  if (tmp_table->s->keys == 0)
  {

That is, we assert when we have created a temporary table, but it turned out that the table has no keys (=> and hence it is not suitable for doing Materialization, without keys one can't remove duplicates or make efficient lookups).

Comment by Sergei Petrunia [ 2015-12-09 ]

Note that without SET SESSION big_tables=1 the query works, and its EXPLAIN is:

MariaDB [test]> explain SELECT * FROM t1 WHERE a IN(SELECT MIN(a)FROM t1);
+------+--------------+-------------+------+---------------+------+---------+------+------+-------------------------------------------------+
| id   | select_type  | table       | type | possible_keys | key  | key_len | ref  | rows | Extra                                           |
+------+--------------+-------------+------+---------------+------+---------+------+------+-------------------------------------------------+
|    1 | PRIMARY      | <subquery2> | ALL  | distinct_key  | NULL | NULL    | NULL |    1 |                                                 |
|    1 | PRIMARY      | t1          | ALL  | a             | NULL | NULL    | NULL |    3 | Using where; Using join buffer (flat, BNL join) |
|    2 | MATERIALIZED | t1          | ALL  | NULL          | NULL | NULL    | NULL |    3 |                                                 |
+------+--------------+-------------+------+---------------+------+---------+------+------+-------------------------------------------------+

Comment by Sergei Petrunia [ 2015-12-09 ]

Debugging create_tmp_table().

without big_tables=1

  • it creates a table with table->file= (ha_heap)
  • table->s->keys==1

with big_tables=1:
The execution starts to differ here:

  /* If result table is small; use a heap */
  /* future: storage engine selection can be made dynamic? */
  if (blob_count || using_unique_constraint
      || (thd->variables.big_tables && !(select_options & SELECT_SMALL_RESULT))
      || (select_options & TMP_TABLE_FORCE_MYISAM)
      || thd->variables.tmp_table_size == 0)
  {
    share->db_plugin= ha_lock_engine(0, TMP_ENGINE_HTON);
    table->file= get_new_handler(share, &table->mem_root,
                                 share->db_type());

table->file is a ha_maria object.

The execution proceeds in the same way as in the working case, until we reach
this place:

  if (!do_not_open)
  {
    if (share->db_type() == TMP_ENGINE_HTON)
    {
      if (create_internal_tmp_table(table, param->keyinfo, param->start_recinfo,
                                    &param->recinfo, select_options))
        goto err;
    }
    if (open_tmp_table(table))
      goto err;
  }

create_internal_tmp_table() is only called when table->file is ha_maria object.
Inside, we get here:

    if (keyinfo->key_length > table->file->max_key_length() ||
	keyinfo->user_defined_key_parts > table->file->max_key_parts() ||
	share->uniques)
    {
      if (!share->uniques && !(keyinfo->flags & HA_NOSAME))
      {
        my_error(ER_INTERNAL_ERROR, MYF(0),
                 "Using too big key for internal temp tables");
        DBUG_RETURN(1);
      }
 
      /* Can't create a key; Make a unique constraint instead of a key */
      share->keys=    0;

The outermost if is taken, the innermost is not. We can see: it is unable to create a key and falls back to using a "constraint" instead.

Comment by Sergei Petrunia [ 2015-12-09 ]

The reason it is unable to create a key:

(gdb) p keyinfo->key_length
  $42 = 1020
(gdb) p keyinfo->user_defined_key_parts
  $43 = 1
(gdb) p table->file->max_key_length()
  $44 = 1000
(gdb) p table->file->max_key_parts()
  $45 = 32

Comment by Sergei Petrunia [ 2015-12-09 ]

This is a long-known problem:

  • we can use Materialization only when we're able to create a temp. table with an appropriate key.
  • the code that determines the condition for this is buried in multiple inside create_tmp_table().
  • Hence, when deciding whether to use Materialization, we need to copy the check. Apparently it was done incorrectly.
Comment by Sergei Petrunia [ 2015-12-09 ]

Searching through OPTIMIZER_SWITCH_MATERIALIZATION finds the function that
makes the check: subquery_types_allow_materialization(). That function should also make an equivalent of the check that is made in create_internal_tmp_table:

    if (keyinfo->key_length > table->file->max_key_length() ||
	keyinfo->user_defined_key_parts > table->file->max_key_parts() ||

Comment by Vicențiu Ciorbaru [ 2016-02-08 ]

Hi Sergei,

Can you please review the patch for this bug? The patch is on the commits mailing list:

http://lists.askmonty.org/pipermail/commits/2016-February/008975.html

Regards,
Vicentiu

Comment by Sergei Petrunia [ 2016-02-09 ]

Checking limits in storage engines:

#define MAX_REF_PARTS 32			/* Max parts used as ref */
#define MAX_KEY_LENGTH 3072			/* max possible key */

(gdb) p join_tab[0].table->file
  $8 = (ha_maria *) 0x7fff7001c0b8
(gdb) p join_tab[0].table->file->max_key_length()
  $9 = 1000
(gdb) p join_tab[0].table->file->max_key_parts()
  $10 = 32

(gdb) p join_tab[0].table->file
  $11 = (ha_myisam *) 0x7fff700351f8
(gdb) p join_tab[0].table->file->max_key_length()
  $12 = 1000
(gdb) p join_tab[0].table->file->max_key_parts()
  $13 = 32

(gdb) p join_tab[0].table->file
  $16 = (ha_heap *) 0x7fff7003ea18
(gdb) p join_tab[0].table->file->max_key_length()
  $17 = 3072
(gdb) p join_tab[0].table->file->max_key_parts()
  $18 = 32

Comment by Sergei Petrunia [ 2016-02-09 ]

Looking at the above:

  • table->file->max_key_parts() == MAX_REF_PARTS for all engines
  • table->file->max_key_length() < MAX_KEY_LENGTH for some engines.

Discussed the issue with serg. Conclusions:

<serg> and when a temp table is created, add an assert that h->max_key_parts() >= MAX_REF_PARTS

for MAX_KEY_LENGTH, we need to #define MAX_TMP_TABLE_KEY_LENGTH, and then

  • add assert(s) that created temp. tables have table->file->max_key_length() >= MAX_TMP_TABLE_KEY_LENGTH.
  • subquery_types_allow_materialization() should then not chose materialziation if the expected key length exceeds MAX_TMP_TABLE_KEY_LENGTH.

cvicentiu any objections? If not pelase implement this.

Comment by Vicențiu Ciorbaru [ 2016-02-10 ]

Due to differences in storage engines, we ended up implementing an #ifdef'ed solution. Based on the temporary table storage engine, a new function called tmp_table_max_key_

{length|parts}

() returns the required values.

Comment by Vicențiu Ciorbaru [ 2016-02-10 ]

Fixed with:
https://github.com/MariaDB/server/commit/775cccca9f1502ae2f4cf1417d0f94d4872d4630
https://github.com/MariaDB/server/commit/3c5c04bd2bccbfeb62a86bdc5610b1dcea378dd3

Generated at Thu Feb 08 07:17:10 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.