Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13724

ALTER TABLE…ALGORITHM=INPLACE is modifying the source table

Details

    Description

      http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest/builds/10072/steps/test_2/logs/stdio

      sys_vars.log_slow_admin_statements_func 'innodb' w3 [ fail ]
              Test ended at 2017-09-02 15:21:43
       
      CURRENT_TEST: sys_vars.log_slow_admin_statements_func
      --- /mnt/buildbot/build/mariadb-10.3.2/mysql-test/suite/sys_vars/r/log_slow_admin_statements_func.result	2017-09-02 12:56:48.000000000 -0400
      +++ /mnt/buildbot/build/mariadb-10.3.2/mysql-test/suite/sys_vars/r/log_slow_admin_statements_func.reject	2017-09-02 15:21:43.124060071 -0400
      @@ -27,10 +27,14 @@
       OPTIMIZE TABLE log_slow_admin_statements;
       Table	Op	Msg_type	Msg_text
       test.log_slow_admin_statements	optimize	note	Table does not support optimize, doing recreate + analyze instead
      -test.log_slow_admin_statements	optimize	status	OK
      +test.log_slow_admin_statements	optimize	error	Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
      +test.log_slow_admin_statements	optimize	status	Operation failed
      +Warnings:
      +Error	1296	Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
       CHECK TABLE log_slow_admin_statements EXTENDED;
       Table	Op	Msg_type	Msg_text
      -test.log_slow_admin_statements	check	status	OK
      +test.log_slow_admin_statements	check	Warning	InnoDB: Index PRIMARY is marked as corrupted
      +test.log_slow_admin_statements	check	error	Corrupt
       DROP TABLE log_slow_admin_statements;
       SELECT sql_text FROM mysql.slow_log WHERE sql_text LIKE '%TABLE log_slow_admin_statements%';
       sql_text
       
      mysqltest: Result length mismatch
      

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova added a comment - - edited

            Various problems with this test are reproducible on 10.3 non-debug build running with --repeat=N --mem. For me usually N=10 is enough, but it might vary on different machines.

            With this test (an excerpt from the original test case) i'm also quite often getting the assertion failure:

            -- source include/have_innodb.inc
             
            CREATE TABLE log_slow_admin_statements (
                    i INT PRIMARY KEY AUTO_INCREMENT,
                    j VARCHAR(255)
            ) ENGINE=InnoDB;
             
            INSERT INTO log_slow_admin_statements(j) values (REPEAT('A', 255));
            INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;
            INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;
            INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;
            INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;
            INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;
            INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;
            INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;
            INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;
            INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;
            INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;
            INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;
            INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;
            INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;
            INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;
             
            OPTIMIZE TABLE log_slow_admin_statements;
             
            # Cleanup
            DROP TABLE log_slow_admin_statements;
            

            10.3 non-debug 83f9422f0555b6

            2017-09-06 16:19:12 0 [ERROR] InnoDB: Database page corruption on disk or a failed file read of tablespace test/log_slow_admin_statements page [page id: space=8, page number=240]. You may have to recover from a backup.
            2017-09-06 16:19:12 0 [Note] InnoDB: Page dump in ascii and hex (16384 bytes):
            ...
            InnoDB: End of page dump
            2017-09-06 16:19:12 0 [Note] InnoDB: Uncompressed page, stored checksum in field1 2074857521, calculated checksums for field1: crc32 3550863345/3589883547, innodb 
            2491050503,  page type 17855 == INDEX.none 3735928559, stored checksum in field2 2074857521, calculated checksums for field2: crc32 3550863345/3589883547, innodb 8
            5042567, none 3735928559,  page LSN 0 16959928, low 4 bytes of LSN at page end 16959928, page number (if stored to page already) 244, space id (if created with >= 
            MySQL-4.1.1 and stored already) 8
            2017-09-06 16:19:12 0 [Note] InnoDB: Page may be an index page where index id is 25
            2017-09-06 16:19:12 0 [Note] InnoDB: Index 25 is `PRIMARY` in table `test`.`log_slow_admin_statements`
            2017-09-06 16:19:12 0 [Note] InnoDB: It is also possible that your operating system has corrupted its own file cache and rebooting your computer removes the error.
             If the corrupt page is an index page. You can also try to fix the corruption by dumping, dropping, and reimporting the corrupt table. You can use CHECK TABLE to s
            can your table for corruption. Please refer to http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html for information about forcing recovery.
            2017-09-06 16:19:12 0x7f4334cbe700  InnoDB: Assertion failure in file /data/src/10.3/storage/innobase/buf/buf0lru.cc line 2148
            InnoDB: Failing assertion: bpage->buf_fix_count == 0
             
            #4  0x00007f43417013fa in abort () from /lib/x86_64-linux-gnu/libc.so.6
            #5  0x0000564cc8e55287 in ut_dbg_assertion_failed (expr=expr@entry=0x564cc9721a98 "bpage->buf_fix_count == 0", file=file@entry=0x564cc9721548 "/data/src/10.3/storage/innobase/buf/buf0lru.cc", line=line@entry=2148) at /data/src/10.3/storage/innobase/ut/ut0dbg.cc:61
            #6  0x0000564cc9479710 in buf_LRU_block_remove_hashed (bpage=bpage@entry=0x7f4336bb0358, zip=zip@entry=true) at /data/src/10.3/storage/innobase/buf/buf0lru.cc:2148
            #7  0x0000564cc947e650 in buf_LRU_free_one_page (bpage=bpage@entry=0x7f4336bb0358) at /data/src/10.3/storage/innobase/buf/buf0lru.cc:2412
            #8  0x0000564cc9458e66 in buf_mark_space_corrupt (bpage=0x7f4336bb0358) at /data/src/10.3/storage/innobase/buf/buf0buf.cc:5761
            #9  buf_page_io_complete (bpage=0x7f4336bb0358, evict=evict@entry=false) at /data/src/10.3/storage/innobase/buf/buf0buf.cc:6006
            #10 0x0000564cc94bcc5c in fil_aio_wait (segment=segment@entry=3) at /data/src/10.3/storage/innobase/fil/fil0fil.cc:5500
            #11 0x0000564cc93d60c8 in io_handler_thread (arg=<optimized out>) at /data/src/10.3/storage/innobase/srv/srv0start.cc:340
            #12 0x00007f434363d494 in start_thread (arg=0x7f4334cbe700) at pthread_create.c:333
            #13 0x00007f43417b593f in clone () from /lib/x86_64-linux-gnu/libc.so.6
            

            The problem was apparently introduced (or revealed) by this commit:

            commit ff81faf670e083e1da4f3e7bce33fe9104410b2c
            Author: Marko Mäkelä <marko.makela@mariadb.com>
            Date:   Thu Aug 31 18:03:50 2017 +0300
             
                MDEV-13654 Various crashes due to DB_TRX_ID mismatch in table-rebuilding ALTER TABLE…LOCK=NONE
            

            elenst Elena Stepanova added a comment - - edited Various problems with this test are reproducible on 10.3 non-debug build running with --repeat=N --mem . For me usually N=10 is enough, but it might vary on different machines. With this test (an excerpt from the original test case) i'm also quite often getting the assertion failure: -- source include/have_innodb.inc   CREATE TABLE log_slow_admin_statements ( i INT PRIMARY KEY AUTO_INCREMENT, j VARCHAR (255) ) ENGINE=InnoDB;   INSERT INTO log_slow_admin_statements(j) values (REPEAT( 'A' , 255)); INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements; INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements; INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements; INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements; INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements; INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements; INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements; INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements; INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements; INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements; INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements; INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements; INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements; INSERT INTO log_slow_admin_statements(j) SELECT j FROM log_slow_admin_statements;   OPTIMIZE TABLE log_slow_admin_statements;   # Cleanup DROP TABLE log_slow_admin_statements; 10.3 non-debug 83f9422f0555b6 2017-09-06 16:19:12 0 [ERROR] InnoDB: Database page corruption on disk or a failed file read of tablespace test/log_slow_admin_statements page [page id: space=8, page number=240]. You may have to recover from a backup. 2017-09-06 16:19:12 0 [Note] InnoDB: Page dump in ascii and hex (16384 bytes): ... InnoDB: End of page dump 2017-09-06 16:19:12 0 [Note] InnoDB: Uncompressed page, stored checksum in field1 2074857521, calculated checksums for field1: crc32 3550863345/3589883547, innodb 2491050503, page type 17855 == INDEX.none 3735928559, stored checksum in field2 2074857521, calculated checksums for field2: crc32 3550863345/3589883547, innodb 8 5042567, none 3735928559, page LSN 0 16959928, low 4 bytes of LSN at page end 16959928, page number (if stored to page already) 244, space id (if created with >= MySQL-4.1.1 and stored already) 8 2017-09-06 16:19:12 0 [Note] InnoDB: Page may be an index page where index id is 25 2017-09-06 16:19:12 0 [Note] InnoDB: Index 25 is `PRIMARY` in table `test`.`log_slow_admin_statements` 2017-09-06 16:19:12 0 [Note] InnoDB: It is also possible that your operating system has corrupted its own file cache and rebooting your computer removes the error. If the corrupt page is an index page. You can also try to fix the corruption by dumping, dropping, and reimporting the corrupt table. You can use CHECK TABLE to s can your table for corruption. Please refer to http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html for information about forcing recovery. 2017-09-06 16:19:12 0x7f4334cbe700 InnoDB: Assertion failure in file /data/src/10.3/storage/innobase/buf/buf0lru.cc line 2148 InnoDB: Failing assertion: bpage->buf_fix_count == 0   #4 0x00007f43417013fa in abort () from /lib/x86_64-linux-gnu/libc.so.6 #5 0x0000564cc8e55287 in ut_dbg_assertion_failed (expr=expr@entry=0x564cc9721a98 "bpage->buf_fix_count == 0", file=file@entry=0x564cc9721548 "/data/src/10.3/storage/innobase/buf/buf0lru.cc", line=line@entry=2148) at /data/src/10.3/storage/innobase/ut/ut0dbg.cc:61 #6 0x0000564cc9479710 in buf_LRU_block_remove_hashed (bpage=bpage@entry=0x7f4336bb0358, zip=zip@entry=true) at /data/src/10.3/storage/innobase/buf/buf0lru.cc:2148 #7 0x0000564cc947e650 in buf_LRU_free_one_page (bpage=bpage@entry=0x7f4336bb0358) at /data/src/10.3/storage/innobase/buf/buf0lru.cc:2412 #8 0x0000564cc9458e66 in buf_mark_space_corrupt (bpage=0x7f4336bb0358) at /data/src/10.3/storage/innobase/buf/buf0buf.cc:5761 #9 buf_page_io_complete (bpage=0x7f4336bb0358, evict=evict@entry=false) at /data/src/10.3/storage/innobase/buf/buf0buf.cc:6006 #10 0x0000564cc94bcc5c in fil_aio_wait (segment=segment@entry=3) at /data/src/10.3/storage/innobase/fil/fil0fil.cc:5500 #11 0x0000564cc93d60c8 in io_handler_thread (arg=<optimized out>) at /data/src/10.3/storage/innobase/srv/srv0start.cc:340 #12 0x00007f434363d494 in start_thread (arg=0x7f4334cbe700) at pthread_create.c:333 #13 0x00007f43417b593f in clone () from /lib/x86_64-linux-gnu/libc.so.6 The problem was apparently introduced (or revealed) by this commit: commit ff81faf670e083e1da4f3e7bce33fe9104410b2c Author: Marko Mäkelä <marko.makela@mariadb.com> Date: Thu Aug 31 18:03:50 2017 +0300   MDEV-13654 Various crashes due to DB_TRX_ID mismatch in table-rebuilding ALTER TABLE…LOCK=NONE

            The OPTIMIZE TABLE can be replaced with

            ALTER TABLE log_sow_admin_statements FORCE, ALGORITHM=INPLACE;
            

            and the failure can still be repeated. With ALGORITHM=COPY, there is no corruption.

            marko Marko Mäkelä added a comment - The OPTIMIZE TABLE can be replaced with ALTER TABLE log_sow_admin_statements FORCE , ALGORITHM=INPLACE; and the failure can still be repeated. With ALGORITHM=COPY, there is no corruption.

            MDEV-13654 is actually modifying the contents of a buffer pool page, instead of modifying a copy of the record. Removing the offending writes makes the crash go away:

            diff --git a/storage/innobase/row/row0merge.cc b/storage/innobase/row/row0merge.cc
            index 63b1e2a4105..27752cf4c0f 100644
            --- a/storage/innobase/row/row0merge.cc
            +++ b/storage/innobase/row/row0merge.cc
            @@ -2302,13 +2302,6 @@ row_merge_read_clustered_index(
             			for which history is not going to be
             			available after the rebuild operation.
             			This essentially mimics row_purge_reset_trx_id(). */
            -			byte* ptr = static_cast<byte*>(
            -				row->fields[new_trx_id_col].data);
            -
            -			memset(ptr, 0, DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
            -			ptr[DATA_TRX_ID_LEN] = 1U
            -				<< (ROLL_PTR_INSERT_FLAG_POS - CHAR_BIT
            -				    * (DATA_ROLL_PTR_LEN - 1));
             		}
             
             		if (add_autoinc != ULINT_UNDEFINED) {
            

            This change would make the original table appear corrupted while the table is being copied. I guess that the read request that is noticing the corruption would be from a purge thread.

            marko Marko Mäkelä added a comment - MDEV-13654 is actually modifying the contents of a buffer pool page, instead of modifying a copy of the record. Removing the offending writes makes the crash go away: diff --git a/storage/innobase/row/row0merge.cc b/storage/innobase/row/row0merge.cc index 63b1e2a4105..27752cf4c0f 100644 --- a/storage/innobase/row/row0merge.cc +++ b/storage/innobase/row/row0merge.cc @@ -2302,13 +2302,6 @@ row_merge_read_clustered_index( for which history is not going to be available after the rebuild operation. This essentially mimics row_purge_reset_trx_id(). */ - byte* ptr = static_cast<byte*>( - row->fields[new_trx_id_col].data); - - memset(ptr, 0, DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN); - ptr[DATA_TRX_ID_LEN] = 1U - << (ROLL_PTR_INSERT_FLAG_POS - CHAR_BIT - * (DATA_ROLL_PTR_LEN - 1)); } if (add_autoinc != ULINT_UNDEFINED) { This change would make the original table appear corrupted while the table is being copied. I guess that the read request that is noticing the corruption would be from a purge thread.

            People

              marko Marko Mäkelä
              alice Alice Sherepa
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.