Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-11415

Remove excessive undo logging during ALTER TABLE…ALGORITHM=COPY

Details

    Description

      MySQL 5.6 introduced ALTER TABLE…ALGORITHM=INPLACE, which avoids writing undo log for individual records that are copied. No change was made for ALGORITHM=COPY.
      There was a previous work-around that would ensure that if the server was killed during ALTER TABLE, rollback would not take too much time:

      commit fd069e2bb36a3c1c1f26d65dd298b07e6d83ac8b
      Author: marko@hundin.mysql.fi <>
      Date:   Wed Nov 3 21:32:48 2004 +0200
       
          InnoDB: commit after every 10000 rows in ALTER TABLE
      

      The proper fix would be to disable the undo logging for the individual rows altogether. This would still leave an orphan #sql table (until crash-safe atomic DDL is implemented), which can be dropped by the user. That fix was implemented in MySQL 8.0, and MariaDB should backport it:

      commit b5f211744b49d620d00fdecd13d9af09ef26c15b
      Author: Thirunarayanan Balathandayuthapani <thirunarayanan.balathandayuth@oracle.com>
      Date:   Wed Dec 2 16:09:15 2015 +0530
       
          Bug #17479594   AVOID INTERMEDIATE COMMIT WHILE DOING
                          ALTER TABLE ALGORITHM=COPY
          
          Problem:
          =======
            During ALTER TABLE, we commit and restart the transaction
          for every 10,000 rows, so that the rollback after recovery would not take
          so long.
          
          Fix:
          ====
          Supress the undo logging during copy alter operation. If fts_index is
          present then insert directly into fts auxiliary table rather
          than doing at commit time.
          
          ha_innobase::num_write_row: Remove the variable.
          ha_innobase::write_row(): Remove the hack for committing every 10000 rows.
          row_lock_table_for_mysql(): Remove the extra 2 parameters.
          lock_get_src_table(), lock_is_table_exclusive(): Remove the function.
          
          Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
          Reviewed-by: Shaohua Wang <shaohua.wang@oracle.com>
          Reviewed-by: Jon Olav Hauglid <jon.hauglid@oracle.com>
          RB: 10419
      

      Attachments

        Issue Links

          Activity

            Actually, if MDEV-515 implements bulk insert into empty tables or partitions, it would automatically take care of this ALGORITHM=COPY. Quite simply, ha_innobase::write_row() would detect that the table is empty, and then enable bulk insert mode for the table until the end of the transaction. The ha_innobase::num_write_row and related code would of course be removed.

            marko Marko Mäkelä added a comment - Actually, if MDEV-515 implements bulk insert into empty tables or partitions, it would automatically take care of this ALGORITHM=COPY. Quite simply, ha_innobase::write_row() would detect that the table is empty, and then enable bulk insert mode for the table until the end of the transaction. The ha_innobase::num_write_row and related code would of course be removed.
            marko Marko Mäkelä added a comment - - edited

            The initial version of the contribution would reduce Galera functionality with the following:

            diff --git a/sql/sys_vars.cc b/sql/sys_vars.cc
            index 0e5192d72ad..5341286375f 100644
            --- a/sql/sys_vars.cc
            +++ b/sql/sys_vars.cc
            @@ -5408,12 +5408,6 @@ static Sys_var_ulong Sys_wsrep_mysql_replication_bundle(
                    GLOBAL_VAR(wsrep_mysql_replication_bundle), CMD_LINE(REQUIRED_ARG),
                    VALID_RANGE(0, 1000), DEFAULT(0), BLOCK_SIZE(1));
             
            -static Sys_var_mybool Sys_wsrep_load_data_splitting(
            -       "wsrep_load_data_splitting", "To commit LOAD DATA "
            -       "transaction after every 10K rows inserted",
            -       GLOBAL_VAR(wsrep_load_data_splitting), 
            -       CMD_LINE(OPT_ARG), DEFAULT(TRUE));
            -
             static Sys_var_mybool Sys_wsrep_slave_FK_checks(
                    "wsrep_slave_FK_checks", "Should slave thread do "
                    "foreign key constraint checks",
            

            Because of this, LOAD DATA…INFILE would always be atomic in Galera, and the transaction could become too large for replication. This would be possibly fixed by MDEV-515 or Galera 4.
            As pointed out by several colleagues, this is unacceptable, and therefore we must keep the wsrep_load_data_splitting parameter functional.

            marko Marko Mäkelä added a comment - - edited The initial version of the contribution would reduce Galera functionality with the following: diff --git a/sql/sys_vars.cc b/sql/sys_vars.cc index 0e5192d72ad..5341286375f 100644 --- a/sql/sys_vars.cc +++ b/sql/sys_vars.cc @@ -5408,12 +5408,6 @@ static Sys_var_ulong Sys_wsrep_mysql_replication_bundle( GLOBAL_VAR(wsrep_mysql_replication_bundle), CMD_LINE(REQUIRED_ARG), VALID_RANGE(0, 1000), DEFAULT(0), BLOCK_SIZE(1)); -static Sys_var_mybool Sys_wsrep_load_data_splitting( - "wsrep_load_data_splitting", "To commit LOAD DATA " - "transaction after every 10K rows inserted", - GLOBAL_VAR(wsrep_load_data_splitting), - CMD_LINE(OPT_ARG), DEFAULT(TRUE)); - static Sys_var_mybool Sys_wsrep_slave_FK_checks( "wsrep_slave_FK_checks", "Should slave thread do " "foreign key constraint checks", Because of this, LOAD DATA…INFILE would always be atomic in Galera, and the transaction could become too large for replication. This would be possibly fixed by MDEV-515 or Galera 4. As pointed out by several colleagues, this is unacceptable, and therefore we must keep the wsrep_load_data_splitting parameter functional.

            I revised the wsrep_load_data_splitting hack and compared the contributed code to the original MySQL 8.0 changes. The revised contribution is waiting for results from bb-10.3-marko.

            marko Marko Mäkelä added a comment - I revised the wsrep_load_data_splitting hack and compared the contributed code to the original MySQL 8.0 changes. The revised contribution is waiting for results from bb-10.3-marko .
            marko Marko Mäkelä added a comment - I ported this to 10.2 as well.

            This one seems to be happening only on bb-10.2-marko, but not on 10.2 95f39339444:

            bb-10.2-marko 1670cfe7d7734b

            2018-01-29 16:29:27 139781054519040 [ERROR] InnoDB: Record in index `a` of table `test`.`t1` was not found on update: TUPLE (info_bits=0, 2 fields): {[3]foo(0x060F0F),[6]      (0x000000000203)} at: COMPACT RECORD(info_bits=32, 2 fields): {[3]foo(0x060F0F),[6]      (0x000000000202)}
            2018-01-29 16:29:27 139781054519040 [Note] InnoDB: GIS MBR INFO: 2.67715e+59 and 2.75699e+40, 1.97554e+122, 9.92615e+44
             
            mysqld: /data/src/bb-10.2-marko/storage/innobase/row/row0upd.cc:2440: dberr_t row_upd_sec_index_entry(upd_node_t*, que_thr_t*): Assertion `0' failed.
            180129 16:29:27 [ERROR] mysqld got signal 6 ;
             
            #7  0x00007f215cd6bee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
            #8  0x000055fe187401f7 in row_upd_sec_index_entry (node=0x7f210c037270, thr=0x7f210c037570) at /data/src/bb-10.2-marko/storage/innobase/row/row0upd.cc:2440
            #9  0x000055fe1874086d in row_upd_sec_step (node=0x7f210c037270, thr=0x7f210c037570) at /data/src/bb-10.2-marko/storage/innobase/row/row0upd.cc:2553
            #10 0x000055fe18742b7d in row_upd (node=0x7f210c037270, thr=0x7f210c037570) at /data/src/bb-10.2-marko/storage/innobase/row/row0upd.cc:3306
            #11 0x000055fe18742ed4 in row_upd_step (thr=0x7f210c037570) at /data/src/bb-10.2-marko/storage/innobase/row/row0upd.cc:3423
            #12 0x000055fe186e6953 in row_update_for_mysql (prebuilt=0x7f210c036ad8) at /data/src/bb-10.2-marko/storage/innobase/row/row0mysql.cc:1923
            #13 0x000055fe185ab074 in ha_innobase::delete_row (this=0x7f210c131308, record=0x7f210c148198 "\376\003foo") at /data/src/bb-10.2-marko/storage/innobase/handler/ha_innodb.cc:9344
            #14 0x000055fe182a1166 in handler::ha_delete_row (this=0x7f210c131308, buf=0x7f210c148198 "\376\003foo") at /data/src/bb-10.2-marko/sql/handler.cc:6061
            #15 0x000055fe1842f908 in mysql_delete (thd=0x7f210c000b00, table_list=0x7f210c0125b0, conds=0x0, order_list=0x7f210c005038, limit=18446744073709551614, options=0, result=0x0) at /data/src/bb-10.2-marko/sql/sql_delete.cc:583
            #16 0x000055fe1802497a in mysql_execute_command (thd=0x7f210c000b00) at /data/src/bb-10.2-marko/sql/sql_parse.cc:4616
            #17 0x000055fe1802ecc3 in mysql_parse (thd=0x7f210c000b00, rawbuf=0x7f210c0124e8 "DELETE FROM t1", length=14, parser_state=0x7f215019c200, is_com_multi=false, is_next_command=false) at /data/src/bb-10.2-marko/sql/sql_parse.cc:7898
            #18 0x000055fe1801cc5d in dispatch_command (command=COM_QUERY, thd=0x7f210c000b00, packet=0x7f210c008941 "DELETE FROM t1", packet_length=14, is_com_multi=false, is_next_command=false) at /data/src/bb-10.2-marko/sql/sql_parse.cc:1806
            #19 0x000055fe1801b5c0 in do_command (thd=0x7f210c000b00) at /data/src/bb-10.2-marko/sql/sql_parse.cc:1360
            #20 0x000055fe181697c0 in do_handle_one_connection (connect=0x55fe1b32edb0) at /data/src/bb-10.2-marko/sql/sql_connect.cc:1335
            #21 0x000055fe1816954d in handle_one_connection (arg=0x55fe1b32edb0) at /data/src/bb-10.2-marko/sql/sql_connect.cc:1241
            #22 0x000055fe18588c24 in pfs_spawn_thread (arg=0x55fe1b3117d0) at /data/src/bb-10.2-marko/storage/perfschema/pfs.cc:1862
            #23 0x00007f215ea42494 in start_thread (arg=0x7f215019d700) at pthread_create.c:333
            #24 0x00007f215ce2893f in clone () from /lib/x86_64-linux-gnu/libc.so.6
            

            --source include/have_innodb.inc
             
            CREATE TABLE t1 (a VARCHAR(128)) ENGINE=InnoDB;
            INSERT INTO t1 VALUES ('foo'),('foo');
            ALTER IGNORE TABLE t1 ADD UNIQUE(a), ALGORITHM=COPY;
            DELETE FROM t1;
             
            # Cleanup
            DROP TABLE t1;
            

            elenst Elena Stepanova added a comment - This one seems to be happening only on bb-10.2-marko, but not on 10.2 95f39339444: bb-10.2-marko 1670cfe7d7734b 2018-01-29 16:29:27 139781054519040 [ERROR] InnoDB: Record in index `a` of table `test`.`t1` was not found on update: TUPLE (info_bits=0, 2 fields): {[3]foo(0x060F0F),[6] (0x000000000203)} at: COMPACT RECORD(info_bits=32, 2 fields): {[3]foo(0x060F0F),[6] (0x000000000202)} 2018-01-29 16:29:27 139781054519040 [Note] InnoDB: GIS MBR INFO: 2.67715e+59 and 2.75699e+40, 1.97554e+122, 9.92615e+44   mysqld: /data/src/bb-10.2-marko/storage/innobase/row/row0upd.cc:2440: dberr_t row_upd_sec_index_entry(upd_node_t*, que_thr_t*): Assertion `0' failed. 180129 16:29:27 [ERROR] mysqld got signal 6 ;   #7 0x00007f215cd6bee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6 #8 0x000055fe187401f7 in row_upd_sec_index_entry (node=0x7f210c037270, thr=0x7f210c037570) at /data/src/bb-10.2-marko/storage/innobase/row/row0upd.cc:2440 #9 0x000055fe1874086d in row_upd_sec_step (node=0x7f210c037270, thr=0x7f210c037570) at /data/src/bb-10.2-marko/storage/innobase/row/row0upd.cc:2553 #10 0x000055fe18742b7d in row_upd (node=0x7f210c037270, thr=0x7f210c037570) at /data/src/bb-10.2-marko/storage/innobase/row/row0upd.cc:3306 #11 0x000055fe18742ed4 in row_upd_step (thr=0x7f210c037570) at /data/src/bb-10.2-marko/storage/innobase/row/row0upd.cc:3423 #12 0x000055fe186e6953 in row_update_for_mysql (prebuilt=0x7f210c036ad8) at /data/src/bb-10.2-marko/storage/innobase/row/row0mysql.cc:1923 #13 0x000055fe185ab074 in ha_innobase::delete_row (this=0x7f210c131308, record=0x7f210c148198 "\376\003foo") at /data/src/bb-10.2-marko/storage/innobase/handler/ha_innodb.cc:9344 #14 0x000055fe182a1166 in handler::ha_delete_row (this=0x7f210c131308, buf=0x7f210c148198 "\376\003foo") at /data/src/bb-10.2-marko/sql/handler.cc:6061 #15 0x000055fe1842f908 in mysql_delete (thd=0x7f210c000b00, table_list=0x7f210c0125b0, conds=0x0, order_list=0x7f210c005038, limit=18446744073709551614, options=0, result=0x0) at /data/src/bb-10.2-marko/sql/sql_delete.cc:583 #16 0x000055fe1802497a in mysql_execute_command (thd=0x7f210c000b00) at /data/src/bb-10.2-marko/sql/sql_parse.cc:4616 #17 0x000055fe1802ecc3 in mysql_parse (thd=0x7f210c000b00, rawbuf=0x7f210c0124e8 "DELETE FROM t1", length=14, parser_state=0x7f215019c200, is_com_multi=false, is_next_command=false) at /data/src/bb-10.2-marko/sql/sql_parse.cc:7898 #18 0x000055fe1801cc5d in dispatch_command (command=COM_QUERY, thd=0x7f210c000b00, packet=0x7f210c008941 "DELETE FROM t1", packet_length=14, is_com_multi=false, is_next_command=false) at /data/src/bb-10.2-marko/sql/sql_parse.cc:1806 #19 0x000055fe1801b5c0 in do_command (thd=0x7f210c000b00) at /data/src/bb-10.2-marko/sql/sql_parse.cc:1360 #20 0x000055fe181697c0 in do_handle_one_connection (connect=0x55fe1b32edb0) at /data/src/bb-10.2-marko/sql/sql_connect.cc:1335 #21 0x000055fe1816954d in handle_one_connection (arg=0x55fe1b32edb0) at /data/src/bb-10.2-marko/sql/sql_connect.cc:1241 #22 0x000055fe18588c24 in pfs_spawn_thread (arg=0x55fe1b3117d0) at /data/src/bb-10.2-marko/storage/perfschema/pfs.cc:1862 #23 0x00007f215ea42494 in start_thread (arg=0x7f215019d700) at pthread_create.c:333 #24 0x00007f215ce2893f in clone () from /lib/x86_64-linux-gnu/libc.so.6 --source include/have_innodb.inc   CREATE TABLE t1 (a VARCHAR (128)) ENGINE=InnoDB; INSERT INTO t1 VALUES ( 'foo' ),( 'foo' ); ALTER IGNORE TABLE t1 ADD UNIQUE (a), ALGORITHM=COPY; DELETE FROM t1;   # Cleanup DROP TABLE t1;

            For ALTER IGNORE we must for now keep the undo logging enabled, so that the latest row can be rolled back in case of an error.
            A better fix would be to empty the undo log after each successfully inserted row.

            marko Marko Mäkelä added a comment - For ALTER IGNORE we must for now keep the undo logging enabled, so that the latest row can be rolled back in case of an error. A better fix would be to empty the undo log after each successfully inserted row.

            Other failures encountered on bb-10.2-marko 1670cfe7d7734b and bb-10.3-marko f2acc0efdc8591f appear to be unrelated. I've started another test run on bb-10.2-marko 72689ee5c with the fix for ALTER IGNORE problem above.

            The following bugs interfered with the previous test run and made many tests end prematurely (most likely will cause problems in the new run too):

            MDEV-13828 - crash upon RENAME TABLE
            MDEV-14697 - crash upon ALTER TABLE
            MDEV-14750 - heap-use-after-free upon SHOW CREATE TABLE
            MDEV-14862 - crash upon DELETE from table
            MDEV-14864 - crash upon ALTER TABLE
            MDEV-15114 - heap-use-after-free upon REPLACE into table
            MDEV-15115 - crash upon CREATE or ALTER SEQUENCE
            MDEV-15117 - crash upon CREATE + FLUSH SEQUENCE

            elenst Elena Stepanova added a comment - Other failures encountered on bb-10.2-marko 1670cfe7d7734b and bb-10.3-marko f2acc0efdc8591f appear to be unrelated. I've started another test run on bb-10.2-marko 72689ee5c with the fix for ALTER IGNORE problem above. The following bugs interfered with the previous test run and made many tests end prematurely (most likely will cause problems in the new run too): MDEV-13828 - crash upon RENAME TABLE MDEV-14697 - crash upon ALTER TABLE MDEV-14750 - heap-use-after-free upon SHOW CREATE TABLE MDEV-14862 - crash upon DELETE from table MDEV-14864 - crash upon ALTER TABLE MDEV-15114 - heap-use-after-free upon REPLACE into table MDEV-15115 - crash upon CREATE or ALTER SEQUENCE MDEV-15117 - crash upon CREATE + FLUSH SEQUENCE

            Note: After this fix, ALTER IGNORE TABLE will allocate and write undo log for all written rows. This could be fixed as a follow-up change. We only need to remember the last written row so that it can be rolled back in case of an error, such as a duplicate key in

            ALTER IGNORE TABLE t ADD UNIQUE INDEX(non_unique_column);
            

            marko Marko Mäkelä added a comment - Note: After this fix, ALTER IGNORE TABLE will allocate and write undo log for all written rows. This could be fixed as a follow-up change. We only need to remember the last written row so that it can be rolled back in case of an error, such as a duplicate key in ALTER IGNORE TABLE t ADD UNIQUE INDEX (non_unique_column);

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.