Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-15916

Change buffer crash during TRUNCATE or DROP TABLE

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.2.2, 10.3.0
    • 10.2.15, 10.3.7
    • None
    • CentOS Linux release 7.4.1708 (Core)
      3.10.0-693.21.1.el7.x86_64 #1 SMP
      VMWare client

    Description

      Whilst truncating a bunch of tables from a Python script, the server fell over on a table that was already empty.

      Attachments

        Issue Links

          Activity

            Copied from the error log to make it searchable:

            2018-04-18  9:30:10 140615806613248 [Note] InnoDB: Buffer pool(s) load completed at 180418  9:30:10
            2018-04-18  9:36:22 140616634459904 [ERROR] InnoDB: ibuf cursor restoration fails!. ibuf record inserted to page 609:23
            2018-04-18  9:36:22 140616634459904 [ERROR] InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
            PHYSICAL RECORD: n_fields 1; 1-byte offsets; info bits 0
             0: len 8; hex 696e66696d756d00; asc infimum ;;
            PHYSICAL RECORD: n_fields 6; 1-byte offsets; info bits 32
             0: len 4; hex 00000261; asc    a;;
             1: len 1; hex 00; asc  ;;
             2: len 4; hex 00000017; asc     ;;
             3: len 16; hex 023d01010c0f003c00c0860300048000; asc  =     <        ;;
             4: SQL NULL, size 0 ;
             5: len 4; hex 8000028f; asc     ;;
            DATA TUPLE: 3 fields;
             0: len 4; hex 00000261; asc    a;;
             1: len 1; hex 00; asc  ;;
             2: len 4; hex 00000000; asc     ;;
            PHYSICAL RECORD: n_fields 6; 1-byte offsets; info bits 32
             0: len 4; hex 00000261; asc    a;;
             1: len 1; hex 00; asc  ;;
             2: len 4; hex 00000017; asc     ;;
             3: len 16; hex 023d01010c0f003c00c0860300048000; asc  =     <        ;;
             4: SQL NULL, size 0 ;
             5: len 4; hex 8000028f; asc     ;;
            2018-04-18  9:36:22 140616634459904 [ERROR] [FATAL] InnoDB: Failed to restore ibuf position.
            180418  9:36:22 [ERROR] mysqld got signal 6 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.
             
            To report this bug, see https://mariadb.com/kb/en/reporting-bugs
             
            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed, 
            something is definitely wrong and this may fail.
             
            Server version: 10.2.14-MariaDB
            key_buffer_size=16777216
            read_buffer_size=131072
            max_used_connections=7
            max_threads=153
            thread_count=13
            It is possible that mysqld could use up to 
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 352557 K  bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.
             
            Thread pointer: 0x7fe378000a88
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x7fe3dc8bdd70 thread_stack 0x30000
            /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x5641a76fdcce]
            /usr/sbin/mysqld(handle_fatal_signal+0x355)[0x5641a7189605]
            sigaction.c:0(__restore_rt)[0x7fe3f816a5e0]
            :0(__GI_raise)[0x7fe3f66771f7]
            :0(__GI_abort)[0x7fe3f66788e8]
            /usr/sbin/mysqld(+0x9ae963)[0x5641a74c4963]
            ut/ut0ut.cc:792(ib::fatal::~fatal())[0x5641a739850a]
            ibuf/ibuf0ibuf.cc:4296(ibuf_restore_pos(unsigned long, unsigned long, dtuple_t const*, unsigned long, btr_pcur_t*, mtr_t*))[0x5641a7399f18]
            ibuf/ibuf0ibuf.cc:4396(ibuf_delete_rec(unsigned long, unsigned long, btr_pcur_t*, dtuple_t const*, mtr_t*))[0x5641a739b5d9]
            fil/fil0fil.cc:3234(fil_reinit_space_header_for_table(dict_table_t*, unsigned long, trx_t*))[0x5641a757356f]
            row/row0trunc.cc:2038(row_truncate_table_for_mysql(dict_table_t*, trx_t*))[0x5641a7457399]
            handler/ha_innodb.cc:13192(ha_innobase::truncate())[0x5641a7371f08]
            sql/sql_truncate.cc:261(Sql_cmd_truncate_table::handler_truncate(THD*, TABLE_LIST*, bool))[0x5641a72f6e60]
            sql/sql_truncate.cc:449(Sql_cmd_truncate_table::truncate_table(THD*, TABLE_LIST*))[0x5641a72f72f0]
            sql/sql_truncate.cc:498(Sql_cmd_truncate_table::execute(THD*))[0x5641a72f7396]
            sql/sql_parse.cc:6220(mysql_execute_command(THD*))[0x5641a6ff3d0b]
            sql/sql_parse.cc:7915(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x5641a6ffbb2e]
            sql/sql_parse.cc:1815(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x5641a6ffeb33]
            sql/sql_parse.cc:1371(do_command(THD*))[0x5641a6fff76b]
            sql/sql_connect.cc:1335(do_handle_one_connection(CONNECT*))[0x5641a70c6d3a]
            sql/sql_connect.cc:1243(handle_one_connection)[0x5641a70c6e5d]
            pthread_create.c:0(start_thread)[0x7fe3f8162e25]
            /lib64/libc.so.6(clone+0x6d)[0x7fe3f673a34d]
             
            Trying to get some variables.
            Some pointers may be invalid and cause the dump to abort.
            Query (0x7fe37800f170): truncate aggregation_cause_kpi_fleet_day_at
            Connection ID (thread ID): 304
            Status: NOT_KILLED
             
            Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on
            

            elenst Elena Stepanova added a comment - Copied from the error log to make it searchable: 2018-04-18 9:30:10 140615806613248 [Note] InnoDB: Buffer pool(s) load completed at 180418 9:30:10 2018-04-18 9:36:22 140616634459904 [ERROR] InnoDB: ibuf cursor restoration fails!. ibuf record inserted to page 609:23 2018-04-18 9:36:22 140616634459904 [ERROR] InnoDB: Submit a detailed bug report to https://jira.mariadb.org/ PHYSICAL RECORD: n_fields 1; 1-byte offsets; info bits 0 0: len 8; hex 696e66696d756d00; asc infimum ;; PHYSICAL RECORD: n_fields 6; 1-byte offsets; info bits 32 0: len 4; hex 00000261; asc a;; 1: len 1; hex 00; asc ;; 2: len 4; hex 00000017; asc ;; 3: len 16; hex 023d01010c0f003c00c0860300048000; asc = < ;; 4: SQL NULL, size 0 ; 5: len 4; hex 8000028f; asc ;; DATA TUPLE: 3 fields; 0: len 4; hex 00000261; asc a;; 1: len 1; hex 00; asc ;; 2: len 4; hex 00000000; asc ;; PHYSICAL RECORD: n_fields 6; 1-byte offsets; info bits 32 0: len 4; hex 00000261; asc a;; 1: len 1; hex 00; asc ;; 2: len 4; hex 00000017; asc ;; 3: len 16; hex 023d01010c0f003c00c0860300048000; asc = < ;; 4: SQL NULL, size 0 ; 5: len 4; hex 8000028f; asc ;; 2018-04-18 9:36:22 140616634459904 [ERROR] [FATAL] InnoDB: Failed to restore ibuf position. 180418 9:36:22 [ERROR] mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware.   To report this bug, see https://mariadb.com/kb/en/reporting-bugs   We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail.   Server version: 10.2.14-MariaDB key_buffer_size=16777216 read_buffer_size=131072 max_used_connections=7 max_threads=153 thread_count=13 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 352557 K bytes of memory Hope that's ok; if not, decrease some variables in the equation.   Thread pointer: 0x7fe378000a88 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x7fe3dc8bdd70 thread_stack 0x30000 /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x5641a76fdcce] /usr/sbin/mysqld(handle_fatal_signal+0x355)[0x5641a7189605] sigaction.c:0(__restore_rt)[0x7fe3f816a5e0] :0(__GI_raise)[0x7fe3f66771f7] :0(__GI_abort)[0x7fe3f66788e8] /usr/sbin/mysqld(+0x9ae963)[0x5641a74c4963] ut/ut0ut.cc:792(ib::fatal::~fatal())[0x5641a739850a] ibuf/ibuf0ibuf.cc:4296(ibuf_restore_pos(unsigned long, unsigned long, dtuple_t const*, unsigned long, btr_pcur_t*, mtr_t*))[0x5641a7399f18] ibuf/ibuf0ibuf.cc:4396(ibuf_delete_rec(unsigned long, unsigned long, btr_pcur_t*, dtuple_t const*, mtr_t*))[0x5641a739b5d9] fil/fil0fil.cc:3234(fil_reinit_space_header_for_table(dict_table_t*, unsigned long, trx_t*))[0x5641a757356f] row/row0trunc.cc:2038(row_truncate_table_for_mysql(dict_table_t*, trx_t*))[0x5641a7457399] handler/ha_innodb.cc:13192(ha_innobase::truncate())[0x5641a7371f08] sql/sql_truncate.cc:261(Sql_cmd_truncate_table::handler_truncate(THD*, TABLE_LIST*, bool))[0x5641a72f6e60] sql/sql_truncate.cc:449(Sql_cmd_truncate_table::truncate_table(THD*, TABLE_LIST*))[0x5641a72f72f0] sql/sql_truncate.cc:498(Sql_cmd_truncate_table::execute(THD*))[0x5641a72f7396] sql/sql_parse.cc:6220(mysql_execute_command(THD*))[0x5641a6ff3d0b] sql/sql_parse.cc:7915(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x5641a6ffbb2e] sql/sql_parse.cc:1815(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x5641a6ffeb33] sql/sql_parse.cc:1371(do_command(THD*))[0x5641a6fff76b] sql/sql_connect.cc:1335(do_handle_one_connection(CONNECT*))[0x5641a70c6d3a] sql/sql_connect.cc:1243(handle_one_connection)[0x5641a70c6e5d] pthread_create.c:0(start_thread)[0x7fe3f8162e25] /lib64/libc.so.6(clone+0x6d)[0x7fe3f673a34d]   Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (0x7fe37800f170): truncate aggregation_cause_kpi_fleet_day_at Connection ID (thread ID): 304 Status: NOT_KILLED   Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on

            hayden, can you please look up the table name:

            SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES WHERE SPACE=609;
            

            The SHOW CREATE TABLE would help. It could be completely unrelated to the current operation.

            TRUNCATE TABLE should not use the InnoDB change buffer at all. Due to the nature of the change buffer, the cause could have been some earlier corruption, possibly caused when using an older version of the server. For example, MDEV-15199 was fixed in 10.2.13; it affected all versions since 10.2.2.

            marko Marko Mäkelä added a comment - hayden , can you please look up the table name: SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES WHERE SPACE =609; The SHOW CREATE TABLE would help. It could be completely unrelated to the current operation. TRUNCATE TABLE should not use the InnoDB change buffer at all. Due to the nature of the change buffer, the cause could have been some earlier corruption, possibly caused when using an older version of the server. For example, MDEV-15199 was fixed in 10.2.13; it affected all versions since 10.2.2.

            I think that there is a simple race condition in the condition fil_space_get_flags(space) == ULINT_UNDEFINED in ibuf_restore_pos(). The condition would not hold while a tablespace is being dropped or truncated; it only holds after the tablespace no longer exists.

            I encountered this same error on buildbot during the DROP TABLE of the test innodb.mdev-15707.

            Also, it seems unreasonable to let InnoDB abort the server in this case. It is not really a fatal error. Yes, it could lead to secondary index corruption (MDEV-9663), but such corruption could always be cured by dropping and re-creating the secondary indexes.

            marko Marko Mäkelä added a comment - I think that there is a simple race condition in the condition fil_space_get_flags(space) == ULINT_UNDEFINED in ibuf_restore_pos() . The condition would not hold while a tablespace is being dropped or truncated; it only holds after the tablespace no longer exists. I encountered this same error on buildbot during the DROP TABLE of the test innodb.mdev-15707 . Also, it seems unreasonable to let InnoDB abort the server in this case. It is not really a fatal error. Yes, it could lead to secondary index corruption ( MDEV-9663 ), but such corruption could always be cured by dropping and re-creating the secondary indexes.

            I tried repeating the failure with innodb.mdev-15707 manually, even modifying the test and adding some sleeps to the code, but had no success. I have a simple fix that involves invoking fil_space_acquire(), which will properly return NULL if truncate or drop is in progress. It would also display a file name, and remove the intentional crash in case of failure:

            diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc
            index 7041ad53f0d..f20354e7588 100644
            --- a/storage/innobase/ibuf/ibuf0ibuf.cc
            +++ b/storage/innobase/ibuf/ibuf0ibuf.cc
            @@ -4274,15 +4274,12 @@ ibuf_restore_pos(
             		return(TRUE);
             	}
             
            -	if (fil_space_get_flags(space) == ULINT_UNDEFINED) {
            -		/* The tablespace has been dropped.  It is possible
            -		that another thread has deleted the insert buffer
            -		entry.  Do not complain. */
            -		ibuf_btr_pcur_commit_specify_mtr(pcur, mtr);
            -	} else {
            -		ib::error() << "ibuf cursor restoration fails!."
            +	if (fil_space_t* s = fil_space_acquire_silent(space)) {
            +		ib::error() << "ibuf cursor restoration fails!"
             			" ibuf record inserted to page "
            -			<< space << ":" << page_no;
            +			<< space << ":" << page_no
            +			<< " in file " << s->chain.start->name;
            +		fil_space_release(s);
             
             		ib::error() << BUG_REPORT_MSG;
             
            @@ -4292,10 +4289,9 @@ ibuf_restore_pos(
             
             		rec_print_old(stderr,
             			      page_rec_get_next(btr_pcur_get_rec(pcur)));
            -
            -		ib::fatal() << "Failed to restore ibuf position.";
             	}
             
            +	ibuf_btr_pcur_commit_specify_mtr(pcur, mtr);
             	return(FALSE);
             }
             
            

            marko Marko Mäkelä added a comment - I tried repeating the failure with innodb.mdev-15707 manually, even modifying the test and adding some sleeps to the code, but had no success. I have a simple fix that involves invoking fil_space_acquire() , which will properly return NULL if truncate or drop is in progress. It would also display a file name, and remove the intentional crash in case of failure: diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc index 7041ad53f0d..f20354e7588 100644 --- a/storage/innobase/ibuf/ibuf0ibuf.cc +++ b/storage/innobase/ibuf/ibuf0ibuf.cc @@ -4274,15 +4274,12 @@ ibuf_restore_pos( return(TRUE); } - if (fil_space_get_flags(space) == ULINT_UNDEFINED) { - /* The tablespace has been dropped. It is possible - that another thread has deleted the insert buffer - entry. Do not complain. */ - ibuf_btr_pcur_commit_specify_mtr(pcur, mtr); - } else { - ib::error() << "ibuf cursor restoration fails!." + if (fil_space_t* s = fil_space_acquire_silent(space)) { + ib::error() << "ibuf cursor restoration fails!" " ibuf record inserted to page " - << space << ":" << page_no; + << space << ":" << page_no + << " in file " << s->chain.start->name; + fil_space_release(s); ib::error() << BUG_REPORT_MSG; @@ -4292,10 +4289,9 @@ ibuf_restore_pos( rec_print_old(stderr, page_rec_get_next(btr_pcur_get_rec(pcur))); - - ib::fatal() << "Failed to restore ibuf position."; } + ibuf_btr_pcur_commit_specify_mtr(pcur, mtr); return(FALSE); }

            People

              marko Marko Mäkelä
              hayden Hayden Clark
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.