  1. MariaDB Server
  2. MDEV-23248

Server crashes in mi_extra / ha_partition::loop_extra_alter upon REORGANIZE

Details

    Description

      This is a seemingly deterministic test case that uses the max_session_mem_used trick to imitate KILL_QUERY. If you choose to debug with this test case, don't use a Valgrind build; the trick doesn't work there.

      The failure is not specific to max_session_mem_used, though: the second test case is a "normal" two-thread test with an explicit KILL. Because of the race between the threads it is non-deterministic, so run it with --repeat if it doesn't fail right away. You may also need to adjust the amount of data and/or the number of partitions to tune the race condition.
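
      To illustrate why a session memory limit can stand in for an explicit KILL, here is a toy C++ model, not server code. It assumes that exceeding the limit marks the session as killed and that a long-running statement polls that flag between steps. `Session`, `account_alloc` and `run_steps` are hypothetical names made up for this sketch.

      ```cpp
      #include <cstddef>

      // Hypothetical kill states; only the two needed for the illustration.
      enum class KillState { NotKilled, KillQuery };

      // Toy session: tracks accounted memory against a per-session limit.
      struct Session {
          std::size_t mem_used = 0;
          std::size_t max_mem_used;
          KillState killed = KillState::NotKilled;

          explicit Session(std::size_t limit) : max_mem_used(limit) {}

          // Account an allocation; going over the limit marks the query as killed.
          void account_alloc(std::size_t bytes) {
              mem_used += bytes;
              if (mem_used > max_mem_used)
                  killed = KillState::KillQuery;
          }
      };

      // A long operation that allocates on every step and checks the kill flag
      // afterwards. Returns the number of steps that completed.
      std::size_t run_steps(Session& s, std::size_t steps, std::size_t bytes_per_step) {
          std::size_t done = 0;
          while (done < steps) {
              s.account_alloc(bytes_per_step);
              if (s.killed != KillState::NotKilled)
                  break;  // same exit path an explicit KILL would take
              ++done;
          }
          return done;
      }
      ```

      In this model the interruption point depends only on how much memory the statement accounts, which is why the first test case behaves deterministically, while a KILL from another connection depends on thread timing.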

      Test case 1

      --source include/have_partition.inc
       
      CREATE TABLE t1 (a INT, b INT) ENGINE=MyISAM PARTITION BY RANGE (a) SUBPARTITION BY HASH (a) SUBPARTITIONS 2 (PARTITION p1 VALUES LESS THAN (100), PARTITION p2 VALUES LESS THAN MAXVALUE);
      INSERT INTO t1 VALUES (4,6),(0,9);
      UPDATE t1 SET a = 7;
       
      SET max_session_mem_used= @@max_session_mem_used + 1024;
      ALTER TABLE t1 REORGANIZE PARTITION p1,p2 INTO (PARTITION p1 VALUES LESS THAN (5), PARTITION p2 VALUES LESS THAN MAXVALUE);
       
      # Cleanup
      DROP TABLE t1;
      

      Test case 2

      --source include/have_sequence.inc
      --source include/have_partition.inc
       
      CREATE TABLE t1 (a INT, b INT) ENGINE=MyISAM PARTITION BY RANGE (a) SUBPARTITION BY HASH (a) SUBPARTITIONS 70 (PARTITION p1 VALUES LESS THAN (100), PARTITION p2 VALUES LESS THAN MAXVALUE);
      INSERT INTO t1 SELECT 4, 6 FROM seq_1_to_131072;
      UPDATE t1 SET a = 7;
       
      --connect (con1,localhost,root,,)
      --let conid= `select connection_id()`
      --send
        ALTER TABLE t1 REORGANIZE PARTITION p1,p2 INTO (PARTITION p1 VALUES LESS THAN (5), PARTITION p2 VALUES LESS THAN MAXVALUE);
       
      --connection default
      --sleep 0.1
      --eval KILL QUERY $conid
       
      # Cleanup
      DROP TABLE t1;
      

      10.3 af83ed9f

      #3  <signal handler called>
      #4  0x000055997af07a87 in mi_extra (info=0x0, function=HA_EXTRA_FORCE_REOPEN, extra_arg=0x0) at /data/src/10.3/storage/myisam/mi_extra.c:42
      #5  0x000055997aee5a95 in ha_myisam::extra (this=0x7f083c014a88, operation=HA_EXTRA_FORCE_REOPEN) at /data/src/10.3/storage/myisam/ha_myisam.cc:2058
      #6  0x000055997afb85ec in ha_partition::loop_extra_alter (this=0x7f083c0975b8, operation=HA_EXTRA_FORCE_REOPEN) at /data/src/10.3/sql/ha_partition.cc:9154
      #7  0x000055997afb7c4b in ha_partition::extra (this=0x7f083c0975b8, operation=HA_EXTRA_FORCE_REOPEN) at /data/src/10.3/sql/ha_partition.cc:8882
      #8  0x000055997a3c43fe in wait_while_table_is_used (thd=0x7f083c000af0, table=0x7f083c096510, function=HA_EXTRA_FORCE_REOPEN) at /data/src/10.3/sql/sql_base.cc:1264
      #9  0x000055997a947483 in handle_alter_part_error (lpt=0x7f084d184fc0, action_completed=false, drop_partition=false, frm_install=false, close_table=true) at /data/src/10.3/sql/sql_partition.cc:6979
      #10 0x000055997a948c1d in fast_alter_partition_table (thd=0x7f083c000af0, table=0x7f083c096510, alter_info=0x7f084d186420, create_info=0x7f084d1864e0, table_list=0x7f083c0129c0, db=0x7f084d185900, table_name=0x7f084d185910) at /data/src/10.3/sql/sql_partition.cc:7530
      #11 0x000055997a546194 in mysql_alter_table (thd=0x7f083c000af0, new_db=0x7f083c0051d8, new_name=0x7f083c0055a0, create_info=0x7f084d1864e0, table_list=0x7f083c0129c0, alter_info=0x7f084d186420, order_num=0, order=0x0, ignore=false) at /data/src/10.3/sql/sql_table.cc:9685
      #12 0x000055997a5d5672 in Sql_cmd_alter_table::execute (this=0x7f083c013928, thd=0x7f083c000af0) at /data/src/10.3/sql/sql_alter.cc:512
      #13 0x000055997a46578a in mysql_execute_command (thd=0x7f083c000af0) at /data/src/10.3/sql/sql_parse.cc:6022
      #14 0x000055997a46af3f in mysql_parse (thd=0x7f083c000af0, rawbuf=0x7f083c012818 "ALTER TABLE t1 REORGANIZE PARTITION p1,p2 INTO (PARTITION p1 VALUES LESS THAN (5), PARTITION p2 VALUES LESS THAN MAXVALUE)", length=122, parser_state=0x7f084d1875e0, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:7810
      #15 0x000055997a457786 in dispatch_command (command=COM_QUERY, thd=0x7f083c000af0, packet=0x7f083c1234d1 "ALTER TABLE t1 REORGANIZE PARTITION p1,p2 INTO (PARTITION p1 VALUES LESS THAN (5), PARTITION p2 VALUES LESS THAN MAXVALUE)", packet_length=122, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:1848
      #16 0x000055997a45609e in do_command (thd=0x7f083c000af0) at /data/src/10.3/sql/sql_parse.cc:1393
      #17 0x000055997a5cf671 in do_handle_one_connection (connect=0x55997ddcd540) at /data/src/10.3/sql/sql_connect.cc:1403
      #18 0x000055997a5cf3d3 in handle_one_connection (arg=0x55997ddcd540) at /data/src/10.3/sql/sql_connect.cc:1308
      #19 0x000055997af870b8 in pfs_spawn_thread (arg=0x55997dde85f0) at /data/src/10.3/storage/perfschema/pfs.cc:1869
      #20 0x00007f0854f114a4 in start_thread (arg=0x7f084d188700) at pthread_create.c:456
      #21 0x00007f0853045d0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
      

      Reproducible on 10.3-10.5, on debug, non-debug and ASAN builds alike (but see the earlier note about Valgrind and the first test case).

      Couldn't reproduce on 10.2.

          Activity

            monty Michael Widenius added a comment - - edited

            The problem was that mysql_change_partitions() closes all handler files in case of error, which was not properly reflected in
            fast_alter_partition_table(). This caused handle_alter_part_error() to try to close already closed tables, which caused the crash.

            This bug is a duplicate of MDEV-23248
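
            The failure pattern described above can be sketched in a few lines of C++. This is a simplified illustration under stated assumptions, not the actual patch: `EngineFile`, `PartHandler`, `change_partitions`, `handle_error` and the `handlers_open` flag are hypothetical stand-ins. The point is only that the caller has to tell the error handler whether the failed step already closed the handlers.

            ```cpp
            #include <cassert>
            #include <cstddef>
            #include <memory>
            #include <vector>

            // Hypothetical stand-in for the engine-level file object.
            struct EngineFile { int extra_calls = 0; };

            // Hypothetical per-partition handler; `file` is null once closed,
            // mirroring the info=0x0 argument seen in the backtrace.
            struct PartHandler {
                std::unique_ptr<EngineFile> file = std::make_unique<EngineFile>();
                void close() { file.reset(); }
                bool is_open() const { return file != nullptr; }
                void extra() { ++file->extra_calls; }  // undefined behaviour if closed
            };

            // Simulates the failing step: on error it closes every handler
            // before returning, as the comment above describes.
            bool change_partitions(std::vector<PartHandler>& parts, bool fail) {
                if (!fail)
                    return true;
                for (auto& p : parts)
                    p.close();
                return false;
            }

            // Error handler: touches the handlers only when the caller reports
            // that they are still open. Returns how many handlers were touched.
            std::size_t handle_error(std::vector<PartHandler>& parts, bool handlers_open) {
                std::size_t touched = 0;
                if (!handlers_open)
                    return touched;  // already closed by the failed step
                for (auto& p : parts) {
                    assert(p.is_open());
                    p.extra();
                    ++touched;
                }
                return touched;
            }
            ```

            Calling `handle_error(parts, true)` after a failed `change_partitions` would dereference a null `file` in this model, which corresponds to the crash in the report; passing `false` skips the handlers.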


            Fix pushed into 10.3

            There will be a separate fix for 10.5 that will be pushed shortly.
            This is because the code in 10.5 is a bit different and the 10.5 patch
            also fixes issues in S3 related to the same bug.


            People

              monty Michael Widenius
              elenst Elena Stepanova