MDEV-13534 (MariaDB Server)

InnoDB STATS_PERSISTENT fails to ignore garbage delete-mark flag on node pointer pages

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.2.7
    • Fix Version/s: 10.2.9
    • Environment: CentOS 7, Linux 3.10.0-514.26.2.el7.x86_64 #1 SMP Tue Jul 4 15:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

    Description

      MariaDB crashed with an assertion failure. At the time, a large transaction was running with (according to the rollback log) approximately 4191650 row operations.

      2017-08-15 15:33:20 0x7fc018037700  InnoDB: Assertion failure in file /home/buildbot/buildbot/build/mariadb-10.2.7/storage/innobase/dict/dict0stats.cc line 1572
      InnoDB: Failing assertion: offsets_rec != NULL
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mysqld startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
      InnoDB: about forcing recovery.
      170815 15:33:20 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.
       
      Server version: 10.2.7-MariaDB
      key_buffer_size=16777216
      read_buffer_size=2097152
      max_used_connections=101
      max_threads=102
      thread_count=26
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 436307 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x49000
      /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7fc13e53d37e]
      /usr/sbin/mysqld(handle_fatal_signal+0x30d)[0x7fc13df82fad]
      /lib64/libpthread.so.0(+0xf370)[0x7fc13d4f0370]
      /lib64/libc.so.6(gsignal+0x37)[0x7fc13ba791d7]
      /lib64/libc.so.6(abort+0x148)[0x7fc13ba7a8c8]
      /usr/sbin/mysqld(+0x422d82)[0x7fc13dd42d82]
      /usr/sbin/mysqld(+0xa727c0)[0x7fc13e3927c0]
      /usr/sbin/mysqld(+0xa749c7)[0x7fc13e3949c7]
      /usr/sbin/mysqld(+0xa781ed)[0x7fc13e3981ed]
      /usr/sbin/mysqld(+0xa7a283)[0x7fc13e39a283]
      /lib64/libpthread.so.0(+0x7dc5)[0x7fc13d4e8dc5]
      /lib64/libc.so.6(clone+0x6d)[0x7fc13bb3b76d]
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      

          Activity

            dicode Tim Westervoorde added a comment -

            This seems valid for my case. A table with ~450k rows is completely rebuilt, along with a second table with 1.7m rows that references it. So there are a lot of INSERT / DELETE statements in a short period of time, within a single transaction.

            If there is anything I can debug for you, let me know. I can trigger this by running the same update script over and over again.

            Kind regards,
            Tim

            marko Marko Mäkelä added a comment -

            A prerequisite of this bug is that a non-root node pointer page is filled with records that carry the (garbage) delete-mark flag. There is no problem if all root page records are ‘delete-marked’ or if all leaf page records are delete-marked. So, the minimum required index tree height is 3, that is, 2 levels of node pointers above the leaf level.

            Because the maximum PRIMARY KEY length is so small (768 bytes on innodb_page_size=4k; 3072 bytes on innodb_page_size=16k), I thought that it would be infeasible to create an ‘organic’ test for this. Yes, I could pad the leaf page records with non-key columns, but if each node pointer page holds at least 4 records, I’d still need quite a few rows before a node pointer page is split.

            I was able to reproduce this by using debug instrumentation, adapting the great innodb.innodb_bug14676111 test painstakingly created by Yasufumi Kinoshita:

            --source include/have_innodb.inc
            --source include/have_debug.inc

            CREATE TABLE t(a INT UNSIGNED PRIMARY KEY)
            ENGINE=InnoDB STATS_PERSISTENT=1 STATS_SAMPLE_PAGES=1;

            BEGIN;
            # Create an index tree of height 3.
            # This is adapted from innodb.innodb_bug14676111.

            SET @save_debug = @@GLOBAL.innodb_limit_optimistic_insert_debug;
            SET GLOBAL innodb_limit_optimistic_insert_debug=2;

            INSERT t VALUES(1),(5);
            DELETE FROM t;
            INSERT t VALUES(4);
            DELETE FROM t;
            INSERT t VALUES(3);
            DELETE FROM t;
            SET GLOBAL innodb_limit_optimistic_insert_debug = @save_debug;

            connect(con1, localhost, root,,);
            ANALYZE TABLE t;
            disconnect con1;

            connection default;
            DROP TABLE t;
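The height requirement above can be illustrated with idealized B-tree arithmetic. The sketch below is only a model under stated assumptions: `tree_height` and `has_non_root_node_pointer_page` are hypothetical helper names, every non-leaf page is assumed to be packed with exactly `fanout` child pointers, and InnoDB's real page-split behaviour (including the effect of innodb_limit_optimistic_insert_debug in the test) is not modelled.

```cpp
#include <cassert>
#include <cstddef>

// Height of an idealized, fully packed B-tree.
// Height 1 means the root is itself the only leaf page.
// Assumption: every non-leaf page holds up to `fanout` child pointers.
std::size_t tree_height(std::size_t n_leaf_pages, std::size_t fanout) {
    assert(n_leaf_pages >= 1);
    assert(fanout >= 2);  // with fanout < 2 the loop below would not terminate

    std::size_t height = 1;
    std::size_t pages = n_leaf_pages;
    while (pages > 1) {
        // Number of parent pages needed to point at `pages` children.
        pages = (pages + fanout - 1) / fanout;
        ++height;
    }
    return height;
}

// A node pointer page that is not the root can exist only when there are
// at least two levels above the leaf level, that is, height >= 3.
bool has_non_root_node_pointer_page(std::size_t n_leaf_pages,
                                    std::size_t fanout) {
    return tree_height(n_leaf_pages, fanout) >= 3;
}
```

In this model, with a fan-out of 4, four leaf pages still fit under the root alone, while a fifth leaf page forces a second level of node pointers. That is why a test without debug instrumentation would need many wide rows before the prerequisite is met.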

            marko Marko Mäkelä added a comment -

            I introduced this bug in MDEV-12698 in MariaDB 10.2.7 with the following change:

            @@ -1392,13 +1388,10 @@ dict_stats_scan_page(
             	Because offsets1,offsets2 should be big enough,
             	this memory heap should never be used. */
             	mem_heap_t*	heap			= NULL;
            -	const rec_t*	(*get_next)(const rec_t*);
            -
            -	if (scan_method == COUNT_ALL_NON_BORING_AND_SKIP_DEL_MARKED) {
            -		get_next = page_rec_get_next_non_del_marked;
            -	} else {
            -		get_next = page_rec_get_next_const;
            -	}
            +	const rec_t*	(*get_next)(const rec_t*)
            +		= srv_stats_include_delete_marked
            +		? page_rec_get_next_const
            +		: page_rec_get_next_non_del_marked;
             
             	const bool	should_count_external_pages = n_external_pages != NULL;
             

            This change wrongly causes the bogus (garbage) delete-mark flags to be considered on node pointer pages.

            The only MariaDB Server releases that are affected by this bug are 10.2.7 and 10.2.8.
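The effect described above can be reproduced in a small self-contained model. This is a sketch, not InnoDB code: `Rec`, `Page`, `count_visited`, `scan_unguarded` and `scan_leaf_guarded` are hypothetical names, and the leaf-only guard is merely one plausible way to ignore the flag on node pointer pages. The actual change shipped in 10.2.9 is not shown in this report.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a record; not InnoDB's rec_t.
struct Rec {
    int  key;
    bool delete_mark;  // meaningful on leaf pages, garbage on node pointer pages
};

// Hypothetical stand-in for an index page.
struct Page {
    bool             is_leaf;
    std::vector<Rec> recs;
};

// Count the records a scan would visit, optionally skipping flagged ones.
static std::size_t count_visited(const Page& page, bool skip_delete_marked) {
    std::size_t n = 0;
    for (const Rec& rec : page.recs) {
        if (skip_delete_marked && rec.delete_mark) {
            continue;
        }
        ++n;
    }
    return n;
}

// Models the regression: the global setting alone picks the iterator,
// so the flag is honoured on every page level.
std::size_t scan_unguarded(const Page& page, bool include_delete_marked) {
    return count_visited(page, !include_delete_marked);
}

// Assumed guard: honour the flag only where it is meaningful (leaf pages).
std::size_t scan_leaf_guarded(const Page& page, bool include_delete_marked) {
    return count_visited(page, page.is_leaf && !include_delete_marked);
}
```

For a node pointer page whose records all carry the flag, `scan_unguarded(page, false)` visits no record at all, which corresponds to the situation behind the failing `offsets_rec != NULL` assertion. `scan_leaf_guarded` still visits every node pointer record and skips flagged records on leaf pages only.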

            anikitin Andrii Nikitin (Inactive) added a comment -

            To work around the problem (besides upgrading once 10.2.9 is released), one may try disabling persistent statistics.

            Put the following line into the [mysqld] section of the .cnf file:

            innodb_stats_persistent=0

            To disable persistent statistics on a running server without a restart, execute this SQL command:

            set global innodb_stats_persistent=0;

            gpfeng.cs Guangpu Feng added a comment -

            set global innodb_stats_persistent=0;

            does not work around this bug; verified in 10.2.8.

            People

              marko Marko Mäkelä
              dicode Tim Westervoorde
              Votes: 0
              Watchers: 5
