[MDEV-21809] Server crashes in Item_cond::get_mm_tree/prune_partitions Created: 2020-02-24  Updated: 2023-12-05

Status: Stalled
Project: MariaDB Server
Component/s: Partitioning
Affects Version/s: 10.3, 10.4, 10.5
Fix Version/s: 10.4, 10.5

Type: Bug Priority: Major
Reporter: Alice Sherepa Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: None


 Description   

Server crashes in Item_cond::get_mm_tree/prune_partitions

--source include/have_partition.inc
 
CREATE TABLE t1 (i1 int) ;
INSERT INTO t1 VALUES (1),(2),(3);
 
CREATE TABLE t2 (pk int NOT NULL, i int, PRIMARY KEY (pk)) PARTITION BY KEY (pk) PARTITIONS 2;
INSERT INTO t2 VALUES (1,1),(2,1),(3,1);
 
UPDATE (t1 JOIN t2 ON (t2.pk = t1.i1 OR 0))
SET t2.i = 5;

Reproducible with MyISAM/Aria/InnoDB, on debug and non-debug builds, on 10.3-10.5:

10.3 affe7fabc7baa36083e7632eb

Thread 1 (Thread 0x7f5407583700 (LWP 17954)):
#0  __pthread_kill (threadid=<optimized out>, signo=11) at ../sysdeps/unix/sysv/linux/pthread_kill.c:62
#1  0x000055898abc33e2 in my_write_core (sig=11) at /10.3/mysys/stacktrace.c:481
#2  0x000055898a34480f in handle_fatal_signal (sig=11) at /10.3/sql/signal_handler.cc:343
#3  <signal handler called>
#4  0x000055898a4c7903 in Item_cond::get_mm_tree (this=0x7f53f00141c8, param=0x7f5407581460, cond_ptr=0x7f54075813a8) at /10.3/sql/opt_range.cc:7660
#5  0x000055898a4bd945 in prune_partitions (thd=0x7f53f0000af0, table=0x7f53f0076340, pprune_cond=0x7f53f00141c8) at /10.3/sql/opt_range.cc:3528
#6  0x000055898a06e4e4 in JOIN::optimize_inner (this=0x7f53f0014d18) at /10.3/sql/sql_select.cc:1830
#7  0x000055898a06cfcb in JOIN::optimize (this=0x7f53f0014d18) at /10.3/sql/sql_select.cc:1488
#8  0x000055898a0770c8 in mysql_select (thd=0x7f53f0000af0, tables=0x7f53f0013010, wild_num=0, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=1342177408, result=0x7f53f0014c40, unit=0x7f53f00049b8, select_lex=0x7f53f0005140) at /10.3/sql/sql_select.cc:4283
#9  0x000055898a12f002 in mysql_multi_update (thd=0x7f53f0000af0, table_list=0x7f53f0013010, fields=0x7f53f0005268, values=0x7f53f0005770, conds=0x0, options=0, handle_duplicates=DUP_ERROR, ignore=false, unit=0x7f53f00049b8, select_lex=0x7f53f0005140, result=0x7f5407581e60) at /10.3/sql/sql_update.cc:1805
#10 0x000055898a02857e in mysql_execute_command (thd=0x7f53f0000af0) at /10.3/sql/sql_parse.cc:4369
#11 0x000055898a034246 in mysql_parse (thd=0x7f53f0000af0, rawbuf=0x7f53f0012818 "UPDATE (t1 JOIN t2 ON (t2.pk = t1.i1 OR 0))\nSET t2.i = 5", length=56, parser_state=0x7f5407582460, is_com_multi=false, is_next_command=false) at /10.3/sql/sql_parse.cc:7817
#12 0x000055898a020d6d in dispatch_command (command=COM_QUERY, thd=0x7f53f0000af0, packet=0x7f53f0123221 "UPDATE (t1 JOIN t2 ON (t2.pk = t1.i1 OR 0))\nSET t2.i = 5", packet_length=56, is_com_multi=false, is_next_command=false) at /10.3/sql/sql_parse.cc:1856
#13 0x000055898a01f67b in do_command (thd=0x7f53f0000af0) at /10.3/sql/sql_parse.cc:1402
#14 0x000055898a19838c in do_handle_one_connection (connect=0x55898c98c610) at /10.3/sql/sql_connect.cc:1403
#15 0x000055898a1980c8 in handle_one_connection (arg=0x55898c98c610) at /10.3/sql/sql_connect.cc:1308
#16 0x000055898ab49af0 in pfs_spawn_thread (arg=0x55898c8d86f0) at /10.3/storage/perfschema/pfs.cc:1869
#17 0x00007f5412e176ba in start_thread (arg=0x7f5407583700) at pthread_create.c:333
#18 0x00007f54122ac41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109



 Comments   
Comment by Igor Babaev [ 2020-06-09 ]

Assigned to Sergei Petrunia because the bug is definitely in his code.

Comment by Sergei Petrunia [ 2020-07-04 ]

Analysis:

  • Item_cond::get_mm_tree crashes because its list is empty, that is,
    it is an "OR of zero items". This should not be happening.
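
To make the "OR of zero items" failure mode concrete, here is a minimal sketch using a toy model, not the server's Item classes. ToyCond and toy_range_count are hypothetical names introduced only for illustration. A traversal that unconditionally reads the first argument of an OR node has nothing to read when the list is empty; the sketch shows one defensive shape where an empty OR yields "no range tree" instead. It is not a proposed fix for opt_range.cc.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for a condition item; NOT the server's Item class.
struct ToyCond {
    bool is_or = false;                // true for an OR node, false for a leaf
    std::vector<const ToyCond*> args;  // children of an OR node
    int leaf_ranges = 1;               // pretend number of ranges for a leaf
};

// Returns the number of ranges a condition would produce, or std::nullopt
// when no range tree can be built. An OR node with zero arguments is
// treated as "no tree" rather than being dereferenced.
std::optional<int> toy_range_count(const ToyCond& cond) {
    if (!cond.is_or)
        return cond.leaf_ranges;
    if (cond.args.empty())
        return std::nullopt;  // guard against the "OR of zero items" case
    int total = 0;
    for (const ToyCond* arg : cond.args) {
        std::optional<int> sub = toy_range_count(*arg);
        if (!sub)
            return std::nullopt;  // one unusable branch makes the OR unusable
        total += *sub;
    }
    return total;
}
```

In this toy model an empty OR node simply reports that no tree is available, which a caller can treat as "cannot prune".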

The partition pruning module got this Item_cond_or from get_sargable_cond()
which returned it here:

  if (table->pos_in_table_list->on_expr)
  {
    /*
      This is an inner table from a single-table LEFT JOIN, "t1 LEFT JOIN
      t2 ON cond". Use the condition cond.
    */
    retval= &table->pos_in_table_list->on_expr;

which also seems wrong: the query is an inner join, so on_expr should have been moved into the WHERE clause.

Comment by Sergei Petrunia [ 2020-07-05 ]

A bit more detail on how the ON expression is processed:

Initially it is

(t2.pk = t1.i1 OR 0)

simplify_joins moves it to the WHERE (to join->conds)

then, the "0" is removed here:

  #3  0x0000555555cb1e84 in Item_cond::remove_eq_conds (this=0x7fff6c015378, thd=0x7fff6c000d50, cond_value=0x7fff6c0161a8, top_level_arg=true) at /home/psergey/dev-git/10.3-cl/sql/sql_select.cc:16507
  #4  0x0000555555cb1629 in optimize_cond (join=0x7fff6c015e98, conds=0x7fff6c015378, join_list=0x7fff6c005540, ignore_on_conds=false, cond_value=0x7fff6c0161a8, cond_equal=0x7fff6c0162d0, flags=1) at /home/psergey/dev-git/10.3-cl/sql/sql_select.cc:16236
  #5  0x0000555555c88aeb in JOIN::optimize_inner (this=0x7fff6c015e98) at /home/psergey/dev-git/10.3-cl/sql/sql_select.cc:1738
  #6  0x0000555555c87c6f in JOIN::optimize (this=0x7fff6c015e98) at /home/psergey/dev-git/10.3-cl/sql/sql_select.cc:1497
  #7  0x0000555555c91d48 in mysql_select (thd=0x7fff6c000d50, tables=0x7fff6c0141c0, wild_num=0, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=1342177408, result=0x7fff6c015dc0, unit=0x7fff6c004c18, select_lex=0x7fff6c0053a0) at /home/psergey/dev-git/10.3-cl/sql/sql_select.cc:4301

This means the Item_cond_or becomes a "unary OR", so execution reaches this point in Item_cond::remove_eq_conds:

  if (((Item_cond*) cond)->argument_list()->elements == 1)
  {                                                // Remove list
    item= ((Item_cond*) cond)->argument_list()->head();
    ((Item_cond*) cond)->argument_list()->empty();
    return item;
  }

(Note the empty() call: this is how the "0-way OR" is produced.)
The item returned is:

(gdb) print item
  $158 = (Item_equal *) 0x7fff6c017030

All of this is OK.
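
The collapse step quoted above can be sketched with a toy model. ToyItem, collapse_unary and args_seen_by_stale_alias are hypothetical names, not server code. The sketch shows why the step is harmless for the caller that takes the returned item, yet leaves a zero-argument node behind for anything that still holds a pointer to the original OR object.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical stand-in for a condition node; NOT the server's Item_cond.
struct ToyItem {
    std::list<ToyItem*> args;  // empty for leaf items
};

// Same shape as the quoted snippet: when the node has exactly one argument,
// detach that argument, empty the list, and return the argument.
ToyItem* collapse_unary(ToyItem* cond) {
    if (cond->args.size() == 1) {
        ToyItem* only = cond->args.front();
        cond->args.clear();  // the original node now has zero arguments
        return only;
    }
    return cond;
}

// Returns how many arguments a stale pointer to the OR node sees after the
// collapse. The "stale" pointer models any structure that was not updated
// to point at the returned item.
std::size_t args_seen_by_stale_alias() {
    ToyItem eq;
    ToyItem or_node;
    or_node.args.push_back(&eq);

    ToyItem* stale = &or_node;                  // reference that is never updated
    ToyItem* where = collapse_unary(&or_node);  // the caller now uses `eq`
    (void)where;

    return stale->args.size();
}
```

The caller that replaces its condition with the returned item is fine; only a holder of the old pointer observes the empty list.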

Comment by Sergei Petrunia [ 2020-07-05 ]

Then, we reach here:

#ifdef WITH_PARTITION_STORAGE_ENGINE
  {
    TABLE_LIST *tbl;
    List_iterator_fast<TABLE_LIST> li(select_lex->leaf_tables);
    while ((tbl= li++))
    {
      Item **prune_cond= get_sargable_cond(this, tbl->table);
      tbl->table->all_partitions_pruned_away=
        prune_partitions(thd, tbl->table, *prune_cond);
    }
  }
#endif

select_lex->leaf_tables has two elements.
The first one is t1.
The second one is t2. It has an interesting property:

(gdb) p tbl
  $182 = (TABLE_LIST *) 0x7fff6c014868
(gdb) p tbl->alias.str
  $183 = 0x7fff6c014860 "t2"
 
(gdb) p tbl->table->pos_in_table_list
  $184 = (TABLE_LIST *) 0x7fff6c0164a0
(gdb) p tbl->table->pos_in_table_list->alias.str
  $185 = 0x7fff6c014860 "t2"

One expects tbl->table->pos_in_table_list == tbl, but here it is a different TABLE_LIST object, which also represents table t2! How is that possible?

This second object was created in multi_update::prepare, here:

  /*
    Save tables beeing updated in update_tables
    update_table->shared is position for table
    Don't use key read on tables that are updated
  */
 
  update.empty();
  ti.rewind();
  while ((table_ref= ti++))
  {
    /* TODO: add support of view of join support */
    if (table_ref->is_jtbm())
      continue;
    TABLE *table=table_ref->table;
    leaf_table_count++;
    if (tables_to_update & table->map)
    {
      TABLE_LIST *tl= (TABLE_LIST*) thd->memdup(table_ref,
                                                sizeof(*tl));
      if (!tl)
        DBUG_RETURN(1);
      update.link_in_list(tl, &tl->next_local);
      tl->shared= table_count++;
      table->no_keyread=1;
      table->covering_keys.clear_all();
      table->pos_in_table_list= tl;
      table->prepare_triggers_for_update_stmt_or_event();
      table->reset_default_fields();
    }

I am not sure what its purpose is.
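
One way the byte-wise copy could explain the stale on_expr is sketched below with a toy model. This is a hypothesis consistent with the observations above, not something confirmed in this ticket. It assumes that when the ON expression is moved into WHERE, only the original TABLE_LIST has its on_expr cleared. ToyExpr, ToyTableRef, ToyTable and copy_keeps_on_expr are hypothetical names, and std::memcpy stands in for thd->memdup.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-ins; NOT the server's Item, TABLE_LIST or TABLE.
struct ToyExpr { int id; };

struct ToyTableRef {
    const char* alias;
    ToyExpr* on_expr;
};
static_assert(std::is_trivially_copyable<ToyTableRef>::value,
              "byte-wise copy is only valid for trivially copyable types");

struct ToyTable {
    ToyTableRef* pos_in_table_list;
};

// Returns true when the copy still carries the ON expression after the
// original has had it detached.
bool copy_keeps_on_expr() {
    ToyExpr on{1};
    ToyTableRef original{"t2", &on};
    ToyTable table{&original};

    // Step 1 (modelled on multi_update::prepare): duplicate the table
    // reference byte-wise and repoint the table at the duplicate.
    ToyTableRef copy;
    std::memcpy(&copy, &original, sizeof(copy));
    table.pos_in_table_list = &copy;

    // Step 2 (ASSUMPTION): the ON expression is moved to WHERE and cleared
    // on the original reference only; the duplicate is not touched.
    ToyExpr* where = original.on_expr;
    original.on_expr = nullptr;
    (void)where;

    return original.on_expr == nullptr &&
           table.pos_in_table_list->on_expr != nullptr;
}
```

If the assumption holds, a lookup through table->pos_in_table_list would still find an ON expression for an inner join, which matches what get_sargable_cond() returned.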

Comment by Julien Fritsch [ 2023-12-05 ]

Automated message:
----------------------------
Since this issue has not been updated for 6 weeks, it's time to move it back to Stalled.

Generated at Thu Feb 08 09:09:58 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.