[MDEV-15916] Change buffer crash during TRUNCATE or DROP TABLE - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.2.2, 10.3.0
Fix Version/s: 10.2.15, 10.3.7
Component/s: Storage Engine - InnoDB
Labels:
None
Environment:
CentOS Linux release 7.4.1708 (Core)
3.10.0-693.21.1.el7.x86_64 #1 SMP
VMWare client

Description

Whilst truncating a bunch of tables from a Python script, the server fell over on a table that was already empty.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

error.log
2018-04-18 09:43
12 kB
Hayden Clark

Issue Links

relates to

MDEV-9663 InnoDB assertion failure: *cursor->index->name == TEMP_INDEX_PREFIX, or !cursor->index->is_committed()

Closed

MDEV-13256 innodb.truncate_debug fails in buildbot

Closed

Activity

Ascending order - Click to sort in descending order

Elena Stepanova added a comment - 2018-04-18 10:58

Copied from the error log to make it searchable:

2018-04-18  9:30:10 140615806613248 [Note] InnoDB: Buffer pool(s) load completed at 180418  9:30:10

2018-04-18  9:36:22 140616634459904 [ERROR] InnoDB: ibuf cursor restoration fails!. ibuf record inserted to page 609:23

2018-04-18  9:36:22 140616634459904 [ERROR] InnoDB: Submit a detailed bug report to https://jira.mariadb.org/

PHYSICAL RECORD: n_fields 1; 1-byte offsets; info bits 0

 0: len 8; hex 696e66696d756d00; asc infimum ;;

PHYSICAL RECORD: n_fields 6; 1-byte offsets; info bits 32

 0: len 4; hex 00000261; asc    a;;

 1: len 1; hex 00; asc  ;;

 2: len 4; hex 00000017; asc     ;;

 3: len 16; hex 023d01010c0f003c00c0860300048000; asc  =     <        ;;

 4: SQL NULL, size 0 ;

 5: len 4; hex 8000028f; asc     ;;

DATA TUPLE: 3 fields;

 0: len 4; hex 00000261; asc    a;;

 1: len 1; hex 00; asc  ;;

 2: len 4; hex 00000000; asc     ;;

PHYSICAL RECORD: n_fields 6; 1-byte offsets; info bits 32

 0: len 4; hex 00000261; asc    a;;

 1: len 1; hex 00; asc  ;;

 2: len 4; hex 00000017; asc     ;;

 3: len 16; hex 023d01010c0f003c00c0860300048000; asc  =     <        ;;

 4: SQL NULL, size 0 ;

 5: len 4; hex 8000028f; asc     ;;

2018-04-18  9:36:22 140616634459904 [ERROR] [FATAL] InnoDB: Failed to restore ibuf position.

180418  9:36:22 [ERROR] mysqld got signal 6 ;

This could be because you hit a bug. It is also possible that this binary

or one of the libraries it was linked against is corrupt, improperly built,

or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help

diagnose the problem, but since we have already crashed,

something is definitely wrong and this may fail.

Server version: 10.2.14-MariaDB

key_buffer_size=16777216

read_buffer_size=131072

max_used_connections=7

max_threads=153

thread_count=13

It is possible that mysqld could use up to

key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 352557 K  bytes of memory

Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7fe378000a88

Attempting backtrace. You can use the following information to find out

where mysqld died. If you see no messages after this, something went

terribly wrong...

stack_bottom = 0x7fe3dc8bdd70 thread_stack 0x30000

/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x5641a76fdcce]

/usr/sbin/mysqld(handle_fatal_signal+0x355)[0x5641a7189605]

sigaction.c:0(__restore_rt)[0x7fe3f816a5e0]

:0(__GI_raise)[0x7fe3f66771f7]

:0(__GI_abort)[0x7fe3f66788e8]

/usr/sbin/mysqld(+0x9ae963)[0x5641a74c4963]

ut/ut0ut.cc:792(ib::fatal::~fatal())[0x5641a739850a]

ibuf/ibuf0ibuf.cc:4296(ibuf_restore_pos(unsigned long, unsigned long, dtuple_t const*, unsigned long, btr_pcur_t*, mtr_t*))[0x5641a7399f18]

ibuf/ibuf0ibuf.cc:4396(ibuf_delete_rec(unsigned long, unsigned long, btr_pcur_t*, dtuple_t const*, mtr_t*))[0x5641a739b5d9]

fil/fil0fil.cc:3234(fil_reinit_space_header_for_table(dict_table_t*, unsigned long, trx_t*))[0x5641a757356f]

row/row0trunc.cc:2038(row_truncate_table_for_mysql(dict_table_t*, trx_t*))[0x5641a7457399]

handler/ha_innodb.cc:13192(ha_innobase::truncate())[0x5641a7371f08]

sql/sql_truncate.cc:261(Sql_cmd_truncate_table::handler_truncate(THD*, TABLE_LIST*, bool))[0x5641a72f6e60]

sql/sql_truncate.cc:449(Sql_cmd_truncate_table::truncate_table(THD*, TABLE_LIST*))[0x5641a72f72f0]

sql/sql_truncate.cc:498(Sql_cmd_truncate_table::execute(THD*))[0x5641a72f7396]

sql/sql_parse.cc:6220(mysql_execute_command(THD*))[0x5641a6ff3d0b]

sql/sql_parse.cc:7915(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x5641a6ffbb2e]

sql/sql_parse.cc:1815(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x5641a6ffeb33]

sql/sql_parse.cc:1371(do_command(THD*))[0x5641a6fff76b]

sql/sql_connect.cc:1335(do_handle_one_connection(CONNECT*))[0x5641a70c6d3a]

sql/sql_connect.cc:1243(handle_one_connection)[0x5641a70c6e5d]

pthread_create.c:0(start_thread)[0x7fe3f8162e25]

/lib64/libc.so.6(clone+0x6d)[0x7fe3f673a34d]

Trying to get some variables.

Some pointers may be invalid and cause the dump to abort.

Query (0x7fe37800f170): truncate aggregation_cause_kpi_fleet_day_at

Connection ID (thread ID): 304

Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on

Elena Stepanova added a comment - 2018-04-18 10:58 Copied from the error log to make it searchable: 2018-04-18 9:30:10 140615806613248 [Note] InnoDB: Buffer pool(s) load completed at 180418 9:30:10 2018-04-18 9:36:22 140616634459904 [ERROR] InnoDB: ibuf cursor restoration fails!. ibuf record inserted to page 609:23 2018-04-18 9:36:22 140616634459904 [ERROR] InnoDB: Submit a detailed bug report to https://jira.mariadb.org/ PHYSICAL RECORD: n_fields 1; 1-byte offsets; info bits 0 0: len 8; hex 696e66696d756d00; asc infimum ;; PHYSICAL RECORD: n_fields 6; 1-byte offsets; info bits 32 0: len 4; hex 00000261; asc a;; 1: len 1; hex 00; asc ;; 2: len 4; hex 00000017; asc ;; 3: len 16; hex 023d01010c0f003c00c0860300048000; asc = < ;; 4: SQL NULL, size 0 ; 5: len 4; hex 8000028f; asc ;; DATA TUPLE: 3 fields; 0: len 4; hex 00000261; asc a;; 1: len 1; hex 00; asc ;; 2: len 4; hex 00000000; asc ;; PHYSICAL RECORD: n_fields 6; 1-byte offsets; info bits 32 0: len 4; hex 00000261; asc a;; 1: len 1; hex 00; asc ;; 2: len 4; hex 00000017; asc ;; 3: len 16; hex 023d01010c0f003c00c0860300048000; asc = < ;; 4: SQL NULL, size 0 ; 5: len 4; hex 8000028f; asc ;; 2018-04-18 9:36:22 140616634459904 [ERROR] [FATAL] InnoDB: Failed to restore ibuf position. 180418 9:36:22 [ERROR] mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. To report this bug, see https://mariadb.com/kb/en/reporting-bugs We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. Server version: 10.2.14-MariaDB key_buffer_size=16777216 read_buffer_size=131072 max_used_connections=7 max_threads=153 thread_count=13 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 352557 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x7fe378000a88 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x7fe3dc8bdd70 thread_stack 0x30000 /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x5641a76fdcce] /usr/sbin/mysqld(handle_fatal_signal+0x355)[0x5641a7189605] sigaction.c:0(__restore_rt)[0x7fe3f816a5e0] :0(__GI_raise)[0x7fe3f66771f7] :0(__GI_abort)[0x7fe3f66788e8] /usr/sbin/mysqld(+0x9ae963)[0x5641a74c4963] ut/ut0ut.cc:792(ib::fatal::~fatal())[0x5641a739850a] ibuf/ibuf0ibuf.cc:4296(ibuf_restore_pos(unsigned long, unsigned long, dtuple_t const*, unsigned long, btr_pcur_t*, mtr_t*))[0x5641a7399f18] ibuf/ibuf0ibuf.cc:4396(ibuf_delete_rec(unsigned long, unsigned long, btr_pcur_t*, dtuple_t const*, mtr_t*))[0x5641a739b5d9] fil/fil0fil.cc:3234(fil_reinit_space_header_for_table(dict_table_t*, unsigned long, trx_t*))[0x5641a757356f] row/row0trunc.cc:2038(row_truncate_table_for_mysql(dict_table_t*, trx_t*))[0x5641a7457399] handler/ha_innodb.cc:13192(ha_innobase::truncate())[0x5641a7371f08] sql/sql_truncate.cc:261(Sql_cmd_truncate_table::handler_truncate(THD*, TABLE_LIST*, bool))[0x5641a72f6e60] sql/sql_truncate.cc:449(Sql_cmd_truncate_table::truncate_table(THD*, TABLE_LIST*))[0x5641a72f72f0] sql/sql_truncate.cc:498(Sql_cmd_truncate_table::execute(THD*))[0x5641a72f7396] sql/sql_parse.cc:6220(mysql_execute_command(THD*))[0x5641a6ff3d0b] sql/sql_parse.cc:7915(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x5641a6ffbb2e] sql/sql_parse.cc:1815(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x5641a6ffeb33] sql/sql_parse.cc:1371(do_command(THD*))[0x5641a6fff76b] sql/sql_connect.cc:1335(do_handle_one_connection(CONNECT*))[0x5641a70c6d3a] sql/sql_connect.cc:1243(handle_one_connection)[0x5641a70c6e5d] pthread_create.c:0(start_thread)[0x7fe3f8162e25] /lib64/libc.so.6(clone+0x6d)[0x7fe3f673a34d] Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (0x7fe37800f170): truncate aggregation_cause_kpi_fleet_day_at Connection ID (thread ID): 304 Status: NOT_KILLED Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on

Marko Mäkelä added a comment - 2018-04-18 14:16

hayden, can you please look up the table name:

SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES WHERE SPACE=609;

The SHOW CREATE TABLE would help. It could be completely unrelated to the current operation.

TRUNCATE TABLE should not use the InnoDB change buffer at all. Due to the nature of the change buffer, the cause could have been some earlier corruption, possibly caused when using an older version of the server. For example, ~~MDEV-15199~~ was fixed in 10.2.13; it affected all versions since 10.2.2.

Marko Mäkelä added a comment - 2018-04-18 14:16 hayden , can you please look up the table name: SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES WHERE SPACE =609; The SHOW CREATE TABLE would help. It could be completely unrelated to the current operation. TRUNCATE TABLE should not use the InnoDB change buffer at all. Due to the nature of the change buffer, the cause could have been some earlier corruption, possibly caused when using an older version of the server. For example, MDEV-15199 was fixed in 10.2.13; it affected all versions since 10.2.2.

Marko Mäkelä added a comment - 2018-05-09 06:35

I think that there is a simple race condition in the condition fil_space_get_flags(space) == ULINT_UNDEFINED in ibuf_restore_pos(). The condition would not hold while a tablespace is being dropped or truncated; it only holds after the tablespace no longer exists.

I encountered this same error on buildbot during the DROP TABLE of the test innodb.mdev-15707.

Also, it seems unreasonable to let InnoDB abort the server in this case. It is not really a fatal error. Yes, it could lead to secondary index corruption (~~MDEV-9663~~), but such corruption could always be cured by dropping and re-creating the secondary indexes.

Marko Mäkelä added a comment - 2018-05-09 06:35 I think that there is a simple race condition in the condition fil_space_get_flags(space) == ULINT_UNDEFINED in ibuf_restore_pos() . The condition would not hold while a tablespace is being dropped or truncated; it only holds after the tablespace no longer exists. I encountered this same error on buildbot during the DROP TABLE of the test innodb.mdev-15707 . Also, it seems unreasonable to let InnoDB abort the server in this case. It is not really a fatal error. Yes, it could lead to secondary index corruption ( MDEV-9663 ), but such corruption could always be cured by dropping and re-creating the secondary indexes.

Marko Mäkelä added a comment - 2018-05-09 09:06

I tried repeating the failure with innodb.mdev-15707 manually, even modifying the test and adding some sleeps to the code, but had no success. I have a simple fix that involves invoking fil_space_acquire(), which will properly return NULL if truncate or drop is in progress. It would also display a file name, and remove the intentional crash in case of failure:

diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc

index 7041ad53f0d..f20354e7588 100644

--- a/storage/innobase/ibuf/ibuf0ibuf.cc

+++ b/storage/innobase/ibuf/ibuf0ibuf.cc

@@ -4274,15 +4274,12 @@ ibuf_restore_pos(

 		return(TRUE);

-	if (fil_space_get_flags(space) == ULINT_UNDEFINED) {

-		/* The tablespace has been dropped.  It is possible

-		that another thread has deleted the insert buffer

-		entry.  Do not complain. */

-		ibuf_btr_pcur_commit_specify_mtr(pcur, mtr);

-	} else {

-		ib::error() << "ibuf cursor restoration fails!."

+	if (fil_space_t* s = fil_space_acquire_silent(space)) {

+		ib::error() << "ibuf cursor restoration fails!"

 			" ibuf record inserted to page "

-			<< space << ":" << page_no;

+			<< space << ":" << page_no

+			<< " in file " << s->chain.start->name;

+		fil_space_release(s);

 		ib::error() << BUG_REPORT_MSG;

@@ -4292,10 +4289,9 @@ ibuf_restore_pos(

 		rec_print_old(stderr,

 			      page_rec_get_next(btr_pcur_get_rec(pcur)));

-		ib::fatal() << "Failed to restore ibuf position.";

+	ibuf_btr_pcur_commit_specify_mtr(pcur, mtr);

 	return(FALSE);

Marko Mäkelä added a comment - 2018-05-09 09:06 I tried repeating the failure with innodb.mdev-15707 manually, even modifying the test and adding some sleeps to the code, but had no success. I have a simple fix that involves invoking fil_space_acquire() , which will properly return NULL if truncate or drop is in progress. It would also display a file name, and remove the intentional crash in case of failure: diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc index 7041ad53f0d..f20354e7588 100644 --- a/storage/innobase/ibuf/ibuf0ibuf.cc +++ b/storage/innobase/ibuf/ibuf0ibuf.cc @@ -4274,15 +4274,12 @@ ibuf_restore_pos( return(TRUE); } - if (fil_space_get_flags(space) == ULINT_UNDEFINED) { - /* The tablespace has been dropped. It is possible - that another thread has deleted the insert buffer - entry. Do not complain. */ - ibuf_btr_pcur_commit_specify_mtr(pcur, mtr); - } else { - ib::error() << "ibuf cursor restoration fails!." + if (fil_space_t* s = fil_space_acquire_silent(space)) { + ib::error() << "ibuf cursor restoration fails!" " ibuf record inserted to page " - << space << ":" << page_no; + << space << ":" << page_no + << " in file " << s->chain.start->name; + fil_space_release(s); ib::error() << BUG_REPORT_MSG; @@ -4292,10 +4289,9 @@ ibuf_restore_pos( rec_print_old(stderr, page_rec_get_next(btr_pcur_get_rec(pcur))); - - ib::fatal() << "Failed to restore ibuf position."; } + ibuf_btr_pcur_commit_specify_mtr(pcur, mtr); return(FALSE); }

MariaDB Server

Change buffer crash during TRUNCATE or DROP TABLE

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration