[MDEV-13542] Crashing on a corrupted page is unhelpful - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 5.5(EOL), 10.0(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL)
Fix Version/s: 10.6.9, 10.7.5, 10.8.4, 10.9.2
Component/s: Storage Engine - InnoDB
Labels:

Description

Since the very first version, InnoDB was aborting the whole MySQL or MariaDB server when trying to access a corrupted page.

Earlier, small steps have been taken to improve MariaDB stability, such as ~~MDEV-12253~~.

We must enable error handling for corrupted pages instead of allowing the server to crash. This is not only about detecting a checksum failure when reading a corrupted page (like suggested in MySQL Bug#10132 Crashing the server on corrupt InnoDB page is unhelpful, filed on April 25, 2005), but also about performing some operations on corrupted pages whose checksum appears valid.

In some background operations, such as the purge of committed transaction history, errors may be reported to the server error log only, or silently ignored. While executing SQL, errors should be reported back to the client, and the transaction could be rolled back.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

cursor_restore_supremum.patch
2 kB
2022-05-19 14:55

Issue Links

blocks

MDEV-18519 0x7f0118195700 InnoDB: Assertion failure in file /home/buildbot/buildbot/build/mariadb-10.3.7/storage/innobase/btr/btr0cur.cc line 4308

Closed

MDEV-18976 Implement a CHECKSUM redo log record for improved validation

Closed

MDEV-21098 Crash in rec_get_offsets_func() due to invalid rec_get_status()

Closed

MDEV-22388 Corrupted undo log record leads to server crash

Closed

MDEV-27593 Crashing on I/O error is unhelpful

Closed

MDEV-28457 Crash in page_dir_find_owner_slot()

Closed

MDEV-29603 btr_cur_open_at_index_side() is missing some consistency checks

Closed

MDEV-34830 LSN in the future is not being treated as serious corruption

Closed

causes

MDEV-28845 InnoDB: Failing assertion: bpage->can_relocate() in buf0lru.cc

Closed

MDEV-28950 Assertion `*err == DB_SUCCESS' failed in btr_page_split_and_insert

Closed

MDEV-29321 Percona XtraDB 5.7 can't be upgrade to MariaDB 10.6 or above

Closed

MDEV-29374 Frequent "Data structure corruption" in InnoDB after OOM

Closed

MDEV-29383 Assertion mysql_mutex_assert_owner(&log_sys.flush_order_mutex) failed in mtr_t::commit()

Closed

MDEV-29435 CHECK TABLE forgets to release latches after reporting failure

Closed

MDEV-31333 fsp_free_page() fails to move the extent from FSP_FREE_FRAG to FSP_FREE list

Closed

MDEV-35430 my_snprintf fixes for 10.6+

Closed

is duplicated by

MDEV-14170 InnoDB: Failing assertion: sibling_mode == RW_NO_LATCH || btr_page_get_next(get_block->frame, mtr) == page_get_page_no(page)

Closed

MDEV-16497 Server crashes on corrupted pages during DROP TABLE

Closed

MDEV-26550 [FATAL] [ERROR] mysqld got signal 6 ;InnoDB: Aborting because of a corrupt database page.

Closed

MDEV-27865 buf_page_read_complete assertions during during innodb_init

Closed

MDEV-28148 InnoDB slave crash / signal 6

Closed

MDEV-28574 Assertion failure in fil_invalid_page_access_msg() while running CHECK TABLE to find the details on corruption suspected

Closed

MDEV-28786 InnoDB crash leads to pagesize comparison failure

Closed

MDEV-29126 DB is crashing repeatedly with error message "InnoDB: Rec offset 99, cur1 offset 1723, cur2 offset 16095"

Closed

MDEV-29130 InnoDB: Assertion failure in file storage/innobase/fsp/fsp0fsp.cc line 1562

Closed

MDEV-29150 DB continuously rebooting

Closed

MDEV-33398 Server crashes when executing DML with innodb_checksum_algorithm set to strict_innodb

Closed

relates to

MDEV-12112 corruption in encrypted table may be overlooked

Closed

MDEV-13680 InnoDB may crash when btr_page_alloc() fails

Closed

MDEV-17810 Improve error printout when decryption fails or we identify page as both encrypted and unencrypted

Closed

MDEV-17957 Make Innodb_checksum_algorithm stricter for strict_* values

Closed

MDEV-17958 Make bug-endian innodb_checksum_algorithm=crc32 optional

Closed

MDEV-18025 Mariabackup fails to detect corrupted page_compressed=1 tables

Closed

MDEV-18026 Make mariabackup option to double read the pages from data file.

Closed

MDEV-18455 Crash On Table structure Change on very large table - With filesystem not having enough space, rendering the table (and server) unusable

Closed

MDEV-18691 Failing assertion: page_is_comp(next_page) == page_is_comp(page) in btr0pcur.cc

Closed

MDEV-18932 MariaDB 10.3.10-10.3.13 corrupts table and refuses to start with assertion in row0sel.cc 2986

Closed

MDEV-19081 DB crashes periodicly with - InnoDB: Assertion failure

Closed

MDEV-19541 InnoDB crashes when trying to recover a corrupted page

Closed

MDEV-21030 MariaDB keepts crashing on start

Closed

MDEV-21513 Fix some crashes in ALTER TABLE…IMPORT TABLESPACE

Closed

MDEV-21863 InnoDB: Assertion failure rem0rec.cc 1878

Closed

MDEV-22373 Unable to find a record to delete-mark ends up crashing mysqld process after upgrading from 10.1.43 to 10.4

Closed

MDEV-22481 Server crash after access to table with imported tablespace (after restart)

Closed

MDEV-22624 mariadb used for wordpress restarts and crashes continuously

Closed

MDEV-23344 InnoDB: Fatal: Trying to access page number 57 in space 182 space name zabbix/items.

Closed

MDEV-23734 InnoDB: File (unknown): 'read' returned OS error 205. Cannot continue operation

Closed

MDEV-26780 Tablespaces corrupt, MariaDB does not start

Closed

MDEV-26893 innodb assertion on startup - rem0rec:877

Closed

MDEV-28143 Data table corruption/crashing on btrfs

Open

MDEV-28975 AWS RDS mariadb 10.5.12 crashes upon OS update

Closed

MDEV-29082 InnoDB: Failing assertion: !cursor->index->is_committed()

Closed

MDEV-29938 InnoDB: Assertion failure in btr0pcur.cc line 532

Open

MDEV-30132 Crash after recovery, with InnoDB: Tried to read ... bytes at offset

Closed

MDEV-31353 InnoDB recovery hangs after reporting corruption

Closed

MDEV-32269 InnoDB after ALTER TABLE…IMPORT TABLESPACE may not be crash safe

Closed

MDEV-33325 Crash in flst_read_addr on corrupted data

Closed

MDEV-35413 InnoDB: Cannot load compressed BLOB (ROW_FORMAT=COMPRESSED table)

Closed

MDEV-35729 InnoDB: Trying to read page … which is outside the tablespace bounds

Closed

MDEV-9663 InnoDB assertion failure: *cursor->index->name == TEMP_INDEX_PREFIX, or !cursor->index->is_committed()

Closed

MDEV-11125 Introduce a reduced doublewrite mode, handling error detection only

Closed

MDEV-12253 Buffer pool blocks are accessed after they have been freed

Closed

MDEV-12288 Reset DB_TRX_ID when the history is removed, to speed up MVCC

Closed

MDEV-13536 DB_TRX_ID is not actually being reset when the history is removed

Closed

MDEV-13559 encryption.innodb-redo-badkey failed in buildbot

Closed

MDEV-14132 innodb.innodb-64k failed in buildbot, lost connection to server

Closed

MDEV-14481 Execute InnoDB crash recovery in the background

Closed

MDEV-16763 SIGNAL 6 ERROR

Closed

MDEV-17482 InnoDB fails to say which fatal error fsync() returned

Closed

MDEV-17638 Improve error message about corruption of encrypted page

Closed

MDEV-17641 start slave and mariadb server crash

Closed

MDEV-17947 mariabackup SST encounters InnoDB tablespace errors during prepare, ignores them, server crashes

Closed

MDEV-18115 Remove dummy tablespace for the redo log

Closed

MDEV-18529 InnoDB wrongly skips decryption of encrypted page if unencrypted and post-encrypted checksums match

Closed

MDEV-18606 innodb crashes on large update and it gets corrupted

Closed

MDEV-19711 Assertion failure during IMPORT TABLESPACE

Closed

MDEV-20931 ALTER...IMPORT can crash the server

Closed

MDEV-24287 mysqlcheck unexpectedly causes table to be marked as corrupt

Closed

MDEV-24412 Mariadb 10.5: InnoDB: Upgrade after a crash is not supported

Closed

MDEV-24578 MariaDB 10.5 fails to join Galera cluster of MariaDB 10.1, root cause SST failure, symptom is InnoDB: Assertion failure

Closed

MDEV-26022 InnoDB page data size mismatch crash

Closed

MDEV-26537 InnoDB corrupts files due to incorrect st_blksize calculation

Closed

MDEV-27053 Crash on assertion failure in btr0cur.cc - apparent index corruption

Closed

MDEV-27538 MariaDB server startup crash

Closed

MDEV-28002 Failing assertion: btr_page_get_prev(next_page, mtr) == btr_pcur_get_block(cursor)->page.id.page_no()

Closed

MDEV-28174 InnoDB: Assertion failure

Closed

MDEV-28349 Provide "crash safe" options for CHECK TABLE and ALTER TABLE ... CHECK PARTITION ...

Open

MDEV-29276 [FATAL] InnoDB: Page old data size 7898 new data size 7922, page old max ins size 8288 new max ins size 8264

Closed

MDEV-30397 InnoDB crash due to DB_FAIL reported for a corrupted page

Closed

MDEV-31245 Assertion Failure results in crash

Closed

MDEV-31767 InnoDB tables are being flagged as corrupted on an I/O bound server

Closed

MDEV-32044 Mariadb crash after upgrading to 11.0.3: Failing assertion: local_len >= BTR_EXTERN_FIELD_REF_SIZE

Closed

MDEV-34233 InnoDB crashes due to corrupted ibdata1 (Assertion failure in innodb.undo_page)

Closed

MDEV-34773 Troubles with DB

Closed

SAMU-289 Loading...

links to

Intentionally cause an I/O error in Linux?

(3 blocks, 8 causes, 11 is duplicated by, 68 relates to, 1 links to)

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä created issue - 2017-08-16 06:02

Marko Mäkelä made changes - 2017-08-16 06:02

Field	Original Value	New Value
Link		This issue relates to ~~MDEV-12253~~ [ ~~MDEV-12253~~ ]

Marko Mäkelä made changes - 2017-08-16 06:02

Link

This issue relates to ~~MDEV-12288~~ [ ~~MDEV-12288~~ ]

Marko Mäkelä made changes - 2017-08-16 06:02

Link

This issue relates to ~~MDEV-13536~~ [ ~~MDEV-13536~~ ]

Marko Mäkelä made changes - 2017-08-17 10:28

Link

This issue relates to ~~MDEV-13559~~ [ ~~MDEV-13559~~ ]

Marko Mäkelä made changes - 2017-08-21 08:37

Link

This issue relates to ~~MDEV-11125~~ [ ~~MDEV-11125~~ ]

Marko Mäkelä made changes - 2017-08-30 12:29

Link

This issue relates to ~~MDEV-13680~~ [ ~~MDEV-13680~~ ]

Marko Mäkelä made changes - 2017-10-26 12:08

Link

This issue relates to ~~MDEV-14132~~ [ ~~MDEV-14132~~ ]

Marko Mäkelä made changes - 2018-06-15 12:01

Status

Open [ 1 ]

Confirmed [ 10101 ]

Marko Mäkelä made changes - 2018-06-15 12:01

Fix Version/s		10.3 [ 22126 ]
Fix Version/s		10.4 [ 22408 ]

Marko Mäkelä made changes - 2018-06-15 12:03

Link

This issue is duplicated by ~~MDEV-16497~~ [ ~~MDEV-16497~~ ]

Marko Mäkelä added a comment - 2018-09-17 06:38

As part of fixing this, we should also ensure that when fil_space_verify_crypt_checksum() has reported corruption, access to the corrupted page will not be allowed, because that could cause various assertion failures inside InnoDB, like this one somewhere inside a B-tree:

10.2.12

2018-09-15 22:01:17 139673797834496 [ERROR] InnoDB:  Page 39432:264336 may be corrupted. Post encryption checksum 2055092086 stored [1704382167:1704382167] key_version 1

2018-09-15 22:01:17 0x7f08571ab700  InnoDB: Assertion failure in file /home/buildbot/buildbot/build/storage/innobase/rem/rem0rec.cc line 580

Marko Mäkelä added a comment - 2018-09-17 06:38 As part of fixing this, we should also ensure that when fil_space_verify_crypt_checksum() has reported corruption, access to the corrupted page will not be allowed, because that could cause various assertion failures inside InnoDB, like this one somewhere inside a B-tree: 10.2.12 2018-09-15 22:01:17 139673797834496 [ERROR] InnoDB: Page 39432:264336 may be corrupted. Post encryption checksum 2055092086 stored [1704382167:1704382167] key_version 1 2018-09-15 22:01:17 0x7f08571ab700 InnoDB: Assertion failure in file /home/buildbot/buildbot/build/storage/innobase/rem/rem0rec.cc line 580

Marko Mäkelä made changes - 2018-10-12 09:37

Link

This issue relates to ~~MDEV-16763~~ [ ~~MDEV-16763~~ ]

Marko Mäkelä made changes - 2018-11-08 06:27

Link

This issue relates to ~~MDEV-17638~~ [ ~~MDEV-17638~~ ]

Marko Mäkelä added a comment - 2018-11-08 08:09 - edited

Also for ROW_FORMAT=COMPRESSED tables, it looks like corrupted pages will be wrongly added to the buffer pool. I was under the impression that MySQL would remove the corrupted page from the buffer pool (or not add it), but that definitely is not the case in MariaDB.

Adding a corrupted page to the buffer pool would cause a crash anywhere when the page contents is accessed. Here is an example from CHECK TABLE:

10.3.10
Version: '10.3.10-MariaDB-log' socket: '' port: 3306 mariadb.org binary distribution
2018-11-03 16:40:21 2467 [ERROR] InnoDB: Unable to decompress XXX.ibd[page id: space=1922, page number=865]
2018-11-03 16:40:21 2470 [ERROR] Got error 126 when reading table 'XXX'
2018-11-03 16:40:21 2473 [ERROR] Got error 126 when reading table 'XXX'
2018-11-03 16:40:21 2475 [ERROR] Got error 126 when reading table 'XXX'
2018-11-03 16:40:21 2467 [ERROR] Got error 192 when reading table 'XXX'
...
2018-11-03 16:40:23 2489 [ERROR] Got error 1877 when reading table 'XXX'
2018-11-03 16:46:47 29 [ERROR] InnoDB: Wrong owned count 8, 1, rec 129
2018-11-03 16:46:47 29 [ERROR] InnoDB: Apparent corruption in space 1922 page 865 index `PRIMARY`
2018-11-03 16:46:47 29 [ERROR] InnoDB: In page 865 of index `PRIMARY` of table `XXX`.`XXX`
…
Thread pointer: 0x1649b2d4d98
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
mysqld.exe!page_dir_find_owner_slot()[page0page.cc:99]
mysqld.exe!page_rec_get_prev_const()[page0page.ic:762]
mysqld.exe!btr_validate_level()[btr0btr.cc:5119]
mysqld.exe!btr_validate_index()[btr0btr.cc:5474]
mysqld.exe!ha_innobase::check()[ha_innodb.cc:14404]
mysqld.exe!handler::ha_check()[handler.cc:4199]
mysqld.exe!mysql_admin_table()[sql_admin.cc:803]
mysqld.exe!Sql_cmd_check_table::execute()[sql_admin.cc:1348]
mysqld.exe!mysql_execute_command()[sql_parse.cc:6295]
mysqld.exe!mysql_parse()[sql_parse.cc:8096]

Marko Mäkelä added a comment - 2018-11-08 08:09 - edited Also for ROW_FORMAT=COMPRESSED tables, it looks like corrupted pages will be wrongly added to the buffer pool. I was under the impression that MySQL would remove the corrupted page from the buffer pool (or not add it), but that definitely is not the case in MariaDB. Adding a corrupted page to the buffer pool would cause a crash anywhere when the page contents is accessed. Here is an example from CHECK TABLE : 10.3.10 Version: '10.3.10-MariaDB-log' socket: '' port: 3306 mariadb.org binary distribution 2018-11-03 16:40:21 2467 [ERROR] InnoDB: Unable to decompress XXX.ibd[page id: space=1922, page number=865] 2018-11-03 16:40:21 2470 [ERROR] Got error 126 when reading table 'XXX' 2018-11-03 16:40:21 2473 [ERROR] Got error 126 when reading table 'XXX' 2018-11-03 16:40:21 2475 [ERROR] Got error 126 when reading table 'XXX' 2018-11-03 16:40:21 2467 [ERROR] Got error 192 when reading table 'XXX' ... 2018-11-03 16:40:23 2489 [ERROR] Got error 1877 when reading table 'XXX' 2018-11-03 16:46:47 29 [ERROR] InnoDB: Wrong owned count 8, 1, rec 129 2018-11-03 16:46:47 29 [ERROR] InnoDB: Apparent corruption in space 1922 page 865 index `PRIMARY` 2018-11-03 16:46:47 29 [ERROR] InnoDB: In page 865 of index `PRIMARY` of table `XXX`.`XXX` … Thread pointer: 0x1649b2d4d98 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... mysqld.exe!page_dir_find_owner_slot()[page0page.cc:99] mysqld.exe!page_rec_get_prev_const()[page0page.ic:762] mysqld.exe!btr_validate_level()[btr0btr.cc:5119] mysqld.exe!btr_validate_index()[btr0btr.cc:5474] mysqld.exe!ha_innobase::check()[ha_innodb.cc:14404] mysqld.exe!handler::ha_check()[handler.cc:4199] mysqld.exe!mysql_admin_table()[sql_admin.cc:803] mysqld.exe!Sql_cmd_check_table::execute()[sql_admin.cc:1348] mysqld.exe!mysql_execute_command()[sql_parse.cc:6295] mysqld.exe!mysql_parse()[sql_parse.cc:8096]

Marko Mäkelä made changes - 2018-11-08 08:09

Labels

corruption recovery

corruption encryption page_compression recovery

Marko Mäkelä made changes - 2018-11-08 11:08

Link

This issue relates to ~~MDEV-17641~~ [ ~~MDEV-17641~~ ]

Geoff Montee (Inactive) added a comment - 2018-12-10 07:56

InnoDB also currently crashes when a tablespace is missing or inconsistent with the data dictionary. For example, see the crash from ~~MDEV-17947~~:

2018-12-08  2:19:46 139957174704896 [ERROR] InnoDB: Trying to do i/o to a tablespace which exists without .ibd data file. i/o type 10, space id 1355633, page no 0, i/o length 16384 bytes

2018-12-08  2:19:46 139957174704896 [ERROR] InnoDB: Trying to access tablespace [space=1355633: page=0] but the tablespace does not exist or is just being dropped.

2018-12-08  2:19:46 139957174704896 [ERROR] InnoDB: tablespace id is 1355633 in the data dictionary but in file ./DB1/TAB_TEMP.ibd it is 1354713!

2018-12-08  2:19:46 139957174704896 [ERROR] InnoDB: Trying to do i/o to a tablespace which exists without .ibd data file. i/o type 10, space id 1355633, page no 0, i/o length 16384 bytes

2018-12-08  2:19:46 139957174704896 [ERROR] InnoDB: Trying to access tablespace [space=1355633: page=0] but the tablespace does not exist or is just being dropped.

2018-12-08  2:19:46 139957174704896 [ERROR] InnoDB: Unable to read tablespace 1355633 page no 0 into the buffer pool after 100 attempts. The most probable cause of this error may be that the table has been corrupted. You can try to fix this problem by using innodb_force_recovery. Please see http://dev.mysql.com/doc/refman/5.6/en/ for more details. Aborting...

2018-12-08 02:19:46 7f4a51ae9b00  InnoDB: Assertion failure in thread 139957174704896 in file ha_innodb.cc line 22061

InnoDB: We intentionally generate a memory trap.

InnoDB: Submit a detailed bug report to https://jira.mariadb.org/

InnoDB: If you get repeated assertion failures or crashes, even

InnoDB: immediately after the mysqld startup, there may be

InnoDB: corruption in the InnoDB tablespace. Please refer to

InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html

InnoDB: about forcing recovery.

181208  2:19:46 [ERROR] mysqld got signal 6 ;

This could be because you hit a bug. It is also possible that this binary

or one of the libraries it was linked against is corrupt, improperly built,

or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help

diagnose the problem, but since we have already crashed,

something is definitely wrong and this may fail.

Server version: 10.1.35-MariaDB

key_buffer_size=16777216

read_buffer_size=262144

max_used_connections=5

max_threads=2002

thread_count=4

It is possible that mysqld could use up to

key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1595581 K  bytes of memory

Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f41af9ed008

Attempting backtrace. You can use the following information to find out

where mysqld died. If you see no messages after this, something went

terribly wrong...

stack_bottom = 0x7f4a51ae9140 thread_stack 0x48400

mysys/stacktrace.c:268(my_print_stacktrace)[0x55a87a7089bb]

sql/signal_handler.cc:168(handle_fatal_signal)[0x55a87a264d85]

/lib64/libpthread.so.0(+0x36aca0f7e0)[0x7f4a518167e0]

/lib64/libc.so.6(gsignal+0x35)[0x7f4a50086495]

/lib64/libc.so.6(abort+0x175)[0x7f4a50087c75]

handler/ha_innodb.cc:22034(ib_logf(ib_log_level_t, char const*, ...))[0x55a87a48fe53]

buf/buf0buf.cc:3268(buf_page_get_gen(unsigned long, unsigned long, unsigned long, unsigned long, buf_block_t*, unsigned long, char const*, unsigned long, mtr_t*, dberr_t*))[0x55a87a5d90eb]

fil/fil0fil.cc:3921(fsp_flags_try_adjust(unsigned long, unsigned long))[0x55a87a6352b3]

fil/fil0fil.cc:5492(fil_space_for_table_exists_in_mem(unsigned long, char const*, bool, bool, bool, mem_block_info_t*, unsigned long, unsigned long))[0x55a87a635bc5]

dict/dict0load.cc:2409(dict_load_table(char const*, unsigned long, dict_err_ignore_t))[0x55a87a618fb4]

dict/dict0dict.cc:1155(dict_table_open_on_name(char const*, unsigned long, unsigned long, dict_err_ignore_t))[0x55a87a606e30]

handler/ha_innodb.cc:6364(ha_innobase::open(char const*, int, unsigned int))[0x55a87a494753]

sql/handler.cc:2533(handler::ha_open(TABLE*, char const*, int, unsigned int))[0x55a87a26928e]

sql/table.cc:2975(open_table_from_share(THD*, TABLE_SHARE*, char const*, unsigned int, unsigned int, unsigned int, TABLE*, bool))[0x55a87a190023]

sql/sql_base.cc:2565(open_table(THD*, TABLE_LIST*, Open_table_context*))[0x55a87a0a61dd]

sql/sql_base.cc:4091(open_and_process_table)[0x55a87a0aa732]

sql/sql_show.cc:1166(mysqld_show_create_get_fields(THD*, TABLE_LIST*, List<Item>*, String*))[0x55a87a14b762]

sql/sql_show.cc:1260(mysqld_show_create(THD*, TABLE_LIST*))[0x55a87a14c0a7]

sql/sql_parse.cc:3740(mysql_execute_command(THD*))[0x55a87a0ed485]

sql/sql_parse.cc:7463(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x55a87a0f22e5]

sql/sql_parse.cc:7268(wsrep_mysql_parse)[0x55a87a0f23f4]

sql/sql_parse.cc:1493(dispatch_command(enum_server_command, THD*, char*, unsigned int))[0x55a87a0f4807]

sql/sql_parse.cc:1126(do_command(THD*))[0x55a87a0f52b7]

sql/sql_connect.cc:1330(do_handle_one_connection(THD*))[0x55a87a1b699f]

sql/sql_connect.cc:1244(handle_one_connection)[0x55a87a1b6ad7]

/lib64/libpthread.so.0(+0x36aca07aa1)[0x7f4a5180eaa1]

/lib64/libc.so.6(clone+0x6d)[0x7f4a5013cbdd]

Trying to get some variables.

Some pointers may be invalid and cause the dump to abort.

Query (0x7f41b68d0020): show create table `TAB_TEMP`

Connection ID (thread ID): 108197

Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off

Would you also want to prevent crashes in situations like that?

Geoff Montee (Inactive) added a comment - 2018-12-10 07:56 InnoDB also currently crashes when a tablespace is missing or inconsistent with the data dictionary. For example, see the crash from MDEV-17947 : 2018-12-08 2:19:46 139957174704896 [ERROR] InnoDB: Trying to do i/o to a tablespace which exists without .ibd data file. i/o type 10, space id 1355633, page no 0, i/o length 16384 bytes 2018-12-08 2:19:46 139957174704896 [ERROR] InnoDB: Trying to access tablespace [space=1355633: page=0] but the tablespace does not exist or is just being dropped. 2018-12-08 2:19:46 139957174704896 [ERROR] InnoDB: tablespace id is 1355633 in the data dictionary but in file ./DB1/TAB_TEMP.ibd it is 1354713! 2018-12-08 2:19:46 139957174704896 [ERROR] InnoDB: Trying to do i/o to a tablespace which exists without .ibd data file. i/o type 10, space id 1355633, page no 0, i/o length 16384 bytes 2018-12-08 2:19:46 139957174704896 [ERROR] InnoDB: Trying to access tablespace [space=1355633: page=0] but the tablespace does not exist or is just being dropped. 2018-12-08 2:19:46 139957174704896 [ERROR] InnoDB: Unable to read tablespace 1355633 page no 0 into the buffer pool after 100 attempts. The most probable cause of this error may be that the table has been corrupted. You can try to fix this problem by using innodb_force_recovery. Please see http://dev.mysql.com/doc/refman/5.6/en/ for more details. Aborting... 2018-12-08 02:19:46 7f4a51ae9b00 InnoDB: Assertion failure in thread 139957174704896 in file ha_innodb.cc line 22061 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to https://jira.mariadb.org/ InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: about forcing recovery. 181208 2:19:46 [ERROR] mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. To report this bug, see https://mariadb.com/kb/en/reporting-bugs We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. Server version: 10.1.35-MariaDB key_buffer_size=16777216 read_buffer_size=262144 max_used_connections=5 max_threads=2002 thread_count=4 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1595581 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x7f41af9ed008 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x7f4a51ae9140 thread_stack 0x48400 mysys/stacktrace.c:268(my_print_stacktrace)[0x55a87a7089bb] sql/signal_handler.cc:168(handle_fatal_signal)[0x55a87a264d85] /lib64/libpthread.so.0(+0x36aca0f7e0)[0x7f4a518167e0] /lib64/libc.so.6(gsignal+0x35)[0x7f4a50086495] /lib64/libc.so.6(abort+0x175)[0x7f4a50087c75] handler/ha_innodb.cc:22034(ib_logf(ib_log_level_t, char const*, ...))[0x55a87a48fe53] buf/buf0buf.cc:3268(buf_page_get_gen(unsigned long, unsigned long, unsigned long, unsigned long, buf_block_t*, unsigned long, char const*, unsigned long, mtr_t*, dberr_t*))[0x55a87a5d90eb] fil/fil0fil.cc:3921(fsp_flags_try_adjust(unsigned long, unsigned long))[0x55a87a6352b3] fil/fil0fil.cc:5492(fil_space_for_table_exists_in_mem(unsigned long, char const*, bool, bool, bool, mem_block_info_t*, unsigned long, unsigned long))[0x55a87a635bc5] dict/dict0load.cc:2409(dict_load_table(char const*, unsigned long, dict_err_ignore_t))[0x55a87a618fb4] dict/dict0dict.cc:1155(dict_table_open_on_name(char const*, unsigned long, unsigned long, dict_err_ignore_t))[0x55a87a606e30] handler/ha_innodb.cc:6364(ha_innobase::open(char const*, int, unsigned int))[0x55a87a494753] sql/handler.cc:2533(handler::ha_open(TABLE*, char const*, int, unsigned int))[0x55a87a26928e] sql/table.cc:2975(open_table_from_share(THD*, TABLE_SHARE*, char const*, unsigned int, unsigned int, unsigned int, TABLE*, bool))[0x55a87a190023] sql/sql_base.cc:2565(open_table(THD*, TABLE_LIST*, Open_table_context*))[0x55a87a0a61dd] sql/sql_base.cc:4091(open_and_process_table)[0x55a87a0aa732] sql/sql_show.cc:1166(mysqld_show_create_get_fields(THD*, TABLE_LIST*, List<Item>*, String*))[0x55a87a14b762] sql/sql_show.cc:1260(mysqld_show_create(THD*, TABLE_LIST*))[0x55a87a14c0a7] sql/sql_parse.cc:3740(mysql_execute_command(THD*))[0x55a87a0ed485] sql/sql_parse.cc:7463(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x55a87a0f22e5] sql/sql_parse.cc:7268(wsrep_mysql_parse)[0x55a87a0f23f4] sql/sql_parse.cc:1493(dispatch_command(enum_server_command, THD*, char*, unsigned int))[0x55a87a0f4807] sql/sql_parse.cc:1126(do_command(THD*))[0x55a87a0f52b7] sql/sql_connect.cc:1330(do_handle_one_connection(THD*))[0x55a87a1b699f] sql/sql_connect.cc:1244(handle_one_connection)[0x55a87a1b6ad7] /lib64/libpthread.so.0(+0x36aca07aa1)[0x7f4a5180eaa1] /lib64/libc.so.6(clone+0x6d)[0x7f4a5013cbdd] Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (0x7f41b68d0020): show create table `TAB_TEMP` Connection ID (thread ID): 108197 Status: NOT_KILLED Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off Would you also want to prevent crashes in situations like that?

Julien Fritsch made changes - 2018-12-10 10:16

Priority

Major [ 3 ]

Critical [ 2 ]

Geoff Montee (Inactive) made changes - 2018-12-10 17:48

Link

This issue relates to ~~MDEV-17947~~ [ ~~MDEV-17947~~ ]

Geoff Montee (Inactive) made changes - 2018-12-11 17:22

Link

This issue relates to ~~MDEV-17958~~ [ ~~MDEV-17958~~ ]

Geoff Montee (Inactive) made changes - 2018-12-11 17:22

Link

This issue relates to ~~MDEV-17957~~ [ ~~MDEV-17957~~ ]

Geoff Montee (Inactive) made changes - 2018-12-11 17:23

Link

This issue relates to ~~MDEV-17810~~ [ ~~MDEV-17810~~ ]

Geoff Montee (Inactive) made changes - 2018-12-13 18:01

Link

This issue relates to ~~MDEV-12112~~ [ ~~MDEV-12112~~ ]

Geoff Montee (Inactive) made changes - 2018-12-17 18:57

Link

This issue relates to ~~MDEV-18025~~ [ ~~MDEV-18025~~ ]

Geoff Montee (Inactive) made changes - 2018-12-17 18:57

Link

This issue relates to ~~MDEV-18026~~ [ ~~MDEV-18026~~ ]

Julien Fritsch made changes - 2019-01-28 17:18

Support case ID

23523 24286 not-21898 not-22741 not-23404 not-23578

24286 not-21898 not-22741 not-23404 not-23578 not-23523

Julien Fritsch made changes - 2019-01-28 17:20

Support case ID

24286 not-21898 not-22741 not-23404 not-23523 not-23578

24286 not-21898 not-22741 not-23404 not-23523 not-23578 not-24286

Julien Fritsch made changes - 2019-01-28 17:20

Support case ID

24286 not-21898 not-22741 not-23404 not-23523 not-23578 not-24286

not-21898 not-22741 not-23404 not-23523 not-23578 not-24286

Julien Fritsch made changes - 2019-01-28 17:29

Support case ID

not-21898 not-22741 not-23404 not-23523 not-23578 not-24286

not-21898 not-22741 not-23404 not-23523 not-23578 not-24286 test

Julien Fritsch made changes - 2019-01-28 17:30

Support case ID

not-21898 not-22741 not-23404 not-23523 not-23578 not-24286 test

not-21898 not-22741 not-23404 not-23523 not-23578 not-24286

Marko Mäkelä made changes - 2019-01-31 09:35

Assignee

Marko Mäkelä [ marko ]

Vladislav Lesin [ vlad.lesin ]

Geoff Montee (Inactive) made changes - 2019-02-09 23:28

Link

This issue relates to ~~MDEV-18529~~ [ ~~MDEV-18529~~ ]

Marko Mäkelä made changes - 2019-02-19 07:56

Link

This issue relates to ~~MDEV-18606~~ [ ~~MDEV-18606~~ ]

Richard Stracke made changes - 2019-03-18 08:04

Support case ID

not-21898 not-22741 not-23404 not-23523 not-23578 not-24286

not-21898 not-22741 not-23404 not-23523 not-23578 not-24286 27019

Marko Mäkelä made changes - 2019-03-18 09:53

Link

This issue relates to ~~MDEV-17482~~ [ ~~MDEV-17482~~ ]

Jira Update Service made changes - 2019-03-19 08:49

Support case ID

27019 not-21898 not-22741 not-23404 not-23523 not-23578 not-24286

not-21898 not-22741 not-23404 not-23523 not-23578 not-24286 not-27019

Marko Mäkelä made changes - 2019-03-21 15:52

Link

This issue relates to ~~MDEV-18932~~ [ ~~MDEV-18932~~ ]

Marko Mäkelä made changes - 2019-03-26 16:36

Link

This issue relates to ~~MDEV-18691~~ [ ~~MDEV-18691~~ ]

Marko Mäkelä made changes - 2019-03-29 14:07

Link

This issue relates to ~~MDEV-19081~~ [ ~~MDEV-19081~~ ]

Thirunarayanan Balathandayuthapani added a comment - 2019-05-06 08:57

There is another crash because of error case mishandling:

#6  0x00007f7afa8dabd7 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x564c4d8a7512 "bpage->buf_fix_count == 0", file=file@entry=0x564c4d8a5f08 "/home/thiru/mariarepo/10.4/storage/innobase/buf/buf0buf.cc", line=line@entry=5811, function=function@entry=0x564c4d8abfa0 <buf_mark_space_corrupt(buf_page_t*, fil_space_t const*)::__PRETTY_FUNCTION__> "void buf_mark_space_corrupt(buf_page_t*, const fil_space_t*)") at assert.c:92

#7  0x00007f7afa8dac82 in __GI___assert_fail (assertion=0x564c4d8a7512 "bpage->buf_fix_count == 0", file=0x564c4d8a5f08 "/home/thiru/mariarepo/10.4/storage/innobase/buf/buf0buf.cc", line=5811, function=0x564c4d8abfa0 <buf_mark_space_corrupt(buf_page_t*, fil_space_t const*)::__PRETTY_FUNCTION__> "void buf_mark_space_corrupt(buf_page_t*, const fil_space_t*)") at assert.c:101

#8  0x0000564c4d2cddf3 in buf_mark_space_corrupt (bpage=0x7f7aef82ec50, space=0x564c504c3bd0) at /home/thiru/mariarepo/10.4/storage/innobase/buf/buf0buf.cc:5811

#9  0x0000564c4d2ced38 in buf_page_io_complete (bpage=0x7f7aef82ec50, dblwr=false, evict=false) at /home/thiru/mariarepo/10.4/storage/innobase/buf/buf0buf.cc:6107

#10 0x0000564c4d2f62ea in buf_read_page_low (err=0x7f7ad6ffaef4, sync=true, type=0, mode=132, page_id=..., zip_size=0, unzip=false, ignore_missing_space=false) at /home/thiru/mariarepo/10.4/storage/innobase/buf/buf0rea.cc:202

#11 0x0000564c4d2f6bb7 in buf_read_page (page_id=..., zip_size=0) at /home/thiru/mariarepo/10.4/storage/innobase/buf/buf0rea.cc:410

#12 0x0000564c4d2c95bc in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=2, guess=0x0, mode=10, file=0x564c4d845878 "/home/thiru/mariarepo/10.4/storage/innobase/row/row0row.cc", line=1219, mtr=0x7f7ad6ffc890, err=0x7f7ad6ffb388) at /home/thiru/mariarepo/10.4/storage/innobase/buf/buf0buf.cc:4384

#13 0x0000564c4d28adb5 in btr_cur_search_to_nth_level_func (index=0x7f7ac8022318, level=0, tuple=0x7f7ac000c688, mode=PAGE_CUR_LE, latch_mode=2, cursor=0x564c5054d3a0, ahi_latch=0x0, file=0x564c4d845878 "/home/thiru/mariarepo/10.4/storage/innobase/row/row0row.cc", line=1219, mtr=0x7f7ad6ffc890, autoinc=0) at /home/thiru/mariarepo/10.4/storage/innobase/btr/btr0cur.cc:1544

#14 0x0000564c4d1c8d4a in btr_pcur_open_low (index=0x7f7ac8022318, level=0, tuple=0x7f7ac000c688, mode=PAGE_CUR_LE, latch_mode=2, cursor=0x564c5054d3a0, file=0x564c4d845878 "/home/thiru/mariarepo/10.4/storage/innobase/row/row0row.cc", line=1219, autoinc=0, mtr=0x7f7ad6ffc890) at /home/thiru/mariarepo/10.4/storage/innobase/include/btr0pcur.ic:459

#15 0x0000564c4d1cbf3b in row_search_on_row_ref (pcur=0x564c5054d3a0, mode=2, table=0x7f7ac8020cd8, ref=0x7f7ac000c688, mtr=0x7f7ad6ffc890) at /home/thiru/mariarepo/10.4/storage/innobase/row/row0row.cc:1219

#16 0x0000564c4d1c079c in row_purge_reposition_pcur (mode=2, node=0x564c5054d2f8, mtr=0x7f7ad6ffc890) at /home/thiru/mariarepo/10.4/storage/innobase/row/row0purge.cc:78

#17 0x0000564c4d1c26f2 in row_purge_reset_trx_id (node=0x564c5054d2f8, mtr=0x7f7ad6ffc890) at /home/thiru/mariarepo/10.4/storage/innobase/row/row0purge.cc:798

#18 0x0000564c4d1c3db0 in row_purge_record_func (node=0x564c5054d2f8, undo_rec=0x7f7ac80110c8 "", thr=0x564c5054d128, updated_extern=false) at /home/thiru/mariarepo/10.4/storage/innobase/row/row0purge.cc:1205

#19 0x0000564c4d1c3f80 in row_purge (node=0x564c5054d2f8, undo_rec=0x7f7ac80110c8 "", thr=0x564c5054d128) at /home/thiru/mariarepo/10.4/storage/innobase/row/row0purge.cc:1254

#20 0x0000564c4d1c41a9 in row_purge_step (thr=0x564c5054d128) at /home/thiru/mariarepo/10.4/storage/innobase/row/row0purge.cc:1315

#21 0x0000564c4d146e89 in que_thr_step (thr=0x564c5054d128) at /home/thiru/mariarepo/10.4/storage/innobase/que/que0que.cc:1042

#22 0x0000564c4d1470bf in que_run_threads_low (thr=0x564c5054d128) at /home/thiru/mariarepo/10.4/storage/innobase/que/que0que.cc:1104

#23 0x0000564c4d1472b1 in que_run_threads (thr=0x564c5054d128) at /home/thiru/mariarepo/10.4/storage/innobase/que/que0que.cc:1144

#24 0x0000564c4d201ee4 in srv_task_execute () at /home/thiru/mariarepo/10.4/storage/innobase/srv/srv0srv.cc:2437

#25 0x0000564c4d20209a in srv_worker_thread (arg=0x0) at /home/thiru/mariarepo/10.4/storage/innobase/srv/srv0srv.cc:2485

Thirunarayanan Balathandayuthapani added a comment - 2019-05-06 08:57 There is another crash because of error case mishandling: #6 0x00007f7afa8dabd7 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x564c4d8a7512 "bpage->buf_fix_count == 0", file=file@entry=0x564c4d8a5f08 "/home/thiru/mariarepo/10.4/storage/innobase/buf/buf0buf.cc", line=line@entry=5811, function=function@entry=0x564c4d8abfa0 <buf_mark_space_corrupt(buf_page_t*, fil_space_t const*)::__PRETTY_FUNCTION__> "void buf_mark_space_corrupt(buf_page_t*, const fil_space_t*)") at assert.c:92 #7 0x00007f7afa8dac82 in __GI___assert_fail (assertion=0x564c4d8a7512 "bpage->buf_fix_count == 0", file=0x564c4d8a5f08 "/home/thiru/mariarepo/10.4/storage/innobase/buf/buf0buf.cc", line=5811, function=0x564c4d8abfa0 <buf_mark_space_corrupt(buf_page_t*, fil_space_t const*)::__PRETTY_FUNCTION__> "void buf_mark_space_corrupt(buf_page_t*, const fil_space_t*)") at assert.c:101 #8 0x0000564c4d2cddf3 in buf_mark_space_corrupt (bpage=0x7f7aef82ec50, space=0x564c504c3bd0) at /home/thiru/mariarepo/10.4/storage/innobase/buf/buf0buf.cc:5811 #9 0x0000564c4d2ced38 in buf_page_io_complete (bpage=0x7f7aef82ec50, dblwr=false, evict=false) at /home/thiru/mariarepo/10.4/storage/innobase/buf/buf0buf.cc:6107 #10 0x0000564c4d2f62ea in buf_read_page_low (err=0x7f7ad6ffaef4, sync=true, type=0, mode=132, page_id=..., zip_size=0, unzip=false, ignore_missing_space=false) at /home/thiru/mariarepo/10.4/storage/innobase/buf/buf0rea.cc:202 #11 0x0000564c4d2f6bb7 in buf_read_page (page_id=..., zip_size=0) at /home/thiru/mariarepo/10.4/storage/innobase/buf/buf0rea.cc:410 #12 0x0000564c4d2c95bc in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=2, guess=0x0, mode=10, file=0x564c4d845878 "/home/thiru/mariarepo/10.4/storage/innobase/row/row0row.cc", line=1219, mtr=0x7f7ad6ffc890, err=0x7f7ad6ffb388) at /home/thiru/mariarepo/10.4/storage/innobase/buf/buf0buf.cc:4384 #13 0x0000564c4d28adb5 in btr_cur_search_to_nth_level_func (index=0x7f7ac8022318, level=0, tuple=0x7f7ac000c688, mode=PAGE_CUR_LE, latch_mode=2, cursor=0x564c5054d3a0, ahi_latch=0x0, file=0x564c4d845878 "/home/thiru/mariarepo/10.4/storage/innobase/row/row0row.cc", line=1219, mtr=0x7f7ad6ffc890, autoinc=0) at /home/thiru/mariarepo/10.4/storage/innobase/btr/btr0cur.cc:1544 #14 0x0000564c4d1c8d4a in btr_pcur_open_low (index=0x7f7ac8022318, level=0, tuple=0x7f7ac000c688, mode=PAGE_CUR_LE, latch_mode=2, cursor=0x564c5054d3a0, file=0x564c4d845878 "/home/thiru/mariarepo/10.4/storage/innobase/row/row0row.cc", line=1219, autoinc=0, mtr=0x7f7ad6ffc890) at /home/thiru/mariarepo/10.4/storage/innobase/include/btr0pcur.ic:459 #15 0x0000564c4d1cbf3b in row_search_on_row_ref (pcur=0x564c5054d3a0, mode=2, table=0x7f7ac8020cd8, ref=0x7f7ac000c688, mtr=0x7f7ad6ffc890) at /home/thiru/mariarepo/10.4/storage/innobase/row/row0row.cc:1219 #16 0x0000564c4d1c079c in row_purge_reposition_pcur (mode=2, node=0x564c5054d2f8, mtr=0x7f7ad6ffc890) at /home/thiru/mariarepo/10.4/storage/innobase/row/row0purge.cc:78 #17 0x0000564c4d1c26f2 in row_purge_reset_trx_id (node=0x564c5054d2f8, mtr=0x7f7ad6ffc890) at /home/thiru/mariarepo/10.4/storage/innobase/row/row0purge.cc:798 #18 0x0000564c4d1c3db0 in row_purge_record_func (node=0x564c5054d2f8, undo_rec=0x7f7ac80110c8 "", thr=0x564c5054d128, updated_extern=false) at /home/thiru/mariarepo/10.4/storage/innobase/row/row0purge.cc:1205 #19 0x0000564c4d1c3f80 in row_purge (node=0x564c5054d2f8, undo_rec=0x7f7ac80110c8 "", thr=0x564c5054d128) at /home/thiru/mariarepo/10.4/storage/innobase/row/row0purge.cc:1254 #20 0x0000564c4d1c41a9 in row_purge_step (thr=0x564c5054d128) at /home/thiru/mariarepo/10.4/storage/innobase/row/row0purge.cc:1315 #21 0x0000564c4d146e89 in que_thr_step (thr=0x564c5054d128) at /home/thiru/mariarepo/10.4/storage/innobase/que/que0que.cc:1042 #22 0x0000564c4d1470bf in que_run_threads_low (thr=0x564c5054d128) at /home/thiru/mariarepo/10.4/storage/innobase/que/que0que.cc:1104 #23 0x0000564c4d1472b1 in que_run_threads (thr=0x564c5054d128) at /home/thiru/mariarepo/10.4/storage/innobase/que/que0que.cc:1144 #24 0x0000564c4d201ee4 in srv_task_execute () at /home/thiru/mariarepo/10.4/storage/innobase/srv/srv0srv.cc:2437 #25 0x0000564c4d20209a in srv_worker_thread (arg=0x0) at /home/thiru/mariarepo/10.4/storage/innobase/srv/srv0srv.cc:2485

Marko Mäkelä added a comment - 2019-05-21 15:42 - edited

~~MDEV-19541~~ reports a crash during crash recovery, related to a slightly modified test of ~~MDEV-12699~~.

Marko Mäkelä added a comment - 2019-05-21 15:42 - edited MDEV-19541 reports a crash during crash recovery, related to a slightly modified test of MDEV-12699 .

Marko Mäkelä made changes - 2019-05-21 15:53

Link

This issue relates to ~~MDEV-19541~~ [ ~~MDEV-19541~~ ]

Marko Mäkelä made changes - 2019-11-14 20:04

Link

This issue relates to ~~MDEV-18519~~ [ ~~MDEV-18519~~ ]

Marko Mäkelä made changes - 2019-12-17 15:58

Link

This issue relates to ~~MDEV-18115~~ [ ~~MDEV-18115~~ ]

Marko Mäkelä made changes - 2020-01-03 18:28

Remote Link

This issue links to "Intentionally cause an I/O error in Linux? (Web Link)" [ 29401 ]

Marko Mäkelä added a comment - 2020-01-03 18:29

For testing on Linux, we can create a block device with injected bad blocks, like this:

dd if=/dev/zero of=/var/lib/virtualblock.img bs=512 count=1048576

losetup /dev/loop0 /var/lib/virtualblock.img

dmsetup create errdev0 << EOF

0 261144 linear /dev/loop0 0

261144 5 error

261149 787427 linear /dev/loop0 261139

EOF

The device with the ‘hole’ in it will typically be created in /dev/mapper.

Marko Mäkelä added a comment - 2020-01-03 18:29 For testing on Linux, we can create a block device with injected bad blocks, like this: dd if=/dev/zero of=/var/lib/virtualblock.img bs=512 count=1048576 losetup /dev/loop0 /var/lib/virtualblock.img dmsetup create errdev0 << EOF 0 261144 linear /dev/loop0 0 261144 5 error 261149 787427 linear /dev/loop0 261139 EOF The device with the ‘hole’ in it will typically be created in /dev/mapper .

Marko Mäkelä made changes - 2020-01-17 14:14

Link

This issue relates to ~~MDEV-21513~~ [ ~~MDEV-21513~~ ]

Marko Mäkelä made changes - 2020-02-25 11:36

Link

This issue relates to ~~MDEV-21030~~ [ ~~MDEV-21030~~ ]

Marko Mäkelä made changes - 2020-02-25 11:41

Link

This issue relates to ~~MDEV-21098~~ [ ~~MDEV-21098~~ ]

Marko Mäkelä made changes - 2020-02-26 08:09

Link

This issue relates to ~~MDEV-14481~~ [ ~~MDEV-14481~~ ]

Marko Mäkelä made changes - 2020-03-09 13:47

Link

This issue relates to ~~MDEV-21863~~ [ ~~MDEV-21863~~ ]

Marko Mäkelä made changes - 2020-04-28 06:21

Link

This issue includes ~~MDEV-22388~~ [ ~~MDEV-22388~~ ]

Marko Mäkelä made changes - 2020-05-19 14:34

Link

This issue relates to ~~MDEV-22624~~ [ ~~MDEV-22624~~ ]

Sergei Golubchik made changes - 2020-05-20 13:28

Link

This issue relates to ~~MDEV-20931~~ [ ~~MDEV-20931~~ ]

Marko Mäkelä made changes - 2020-06-04 08:03

Link

This issue relates to ~~MDEV-22481~~ [ ~~MDEV-22481~~ ]

Julien Fritsch made changes - 2020-07-22 17:17

Fix Version/s

10.6 [ 24028 ]

Julien Fritsch made changes - 2020-07-22 17:18

Fix Version/s

10.5 [ 23123 ]

Marko Mäkelä made changes - 2020-07-24 14:27

Link

This issue relates to ~~MDEV-22373~~ [ ~~MDEV-22373~~ ]

Marko Mäkelä made changes - 2020-07-24 16:53

Link

This issue blocks ~~MDEV-21098~~ [ ~~MDEV-21098~~ ]

Marko Mäkelä made changes - 2020-07-24 16:53

Link

This issue relates to ~~MDEV-21098~~ [ ~~MDEV-21098~~ ]

Marko Mäkelä made changes - 2020-08-03 07:03

Link

This issue blocks ~~MDEV-18519~~ [ ~~MDEV-18519~~ ]

Marko Mäkelä made changes - 2020-08-03 07:12

Link

This issue relates to ~~MDEV-18519~~ [ ~~MDEV-18519~~ ]

Sergei Golubchik made changes - 2020-09-01 17:36

Fix Version/s

10.6 [ 24028 ]

Julien Fritsch made changes - 2020-09-09 06:57

Assignee

Vladislav Lesin [ vlad.lesin ]

Marko Mäkelä [ marko ]

Marko Mäkelä made changes - 2020-09-14 12:18

Link

This issue relates to ~~MDEV-23734~~ [ ~~MDEV-23734~~ ]

Marko Mäkelä made changes - 2020-10-23 05:19

Link

This issue relates to ~~MDEV-19711~~ [ ~~MDEV-19711~~ ]

Marko Mäkelä made changes - 2020-10-28 10:04

Link

This issue relates to ~~MDEV-23344~~ [ ~~MDEV-23344~~ ]

Marko Mäkelä added a comment - 2020-10-28 10:04

~~MDEV-23344~~ reports a crash when trying to access a page that does not exist in the tablespace.

Marko Mäkelä added a comment - 2020-10-28 10:04 MDEV-23344 reports a crash when trying to access a page that does not exist in the tablespace.

Marko Mäkelä made changes - 2020-12-17 17:24

Link

This issue relates to ~~MDEV-24412~~ [ ~~MDEV-24412~~ ]

Marko Mäkelä made changes - 2021-01-13 09:34

Link

This issue relates to ~~MDEV-24578~~ [ ~~MDEV-24578~~ ]

Marko Mäkelä made changes - 2021-04-26 07:42

Link

This issue relates to ~~MDEV-24287~~ [ ~~MDEV-24287~~ ]

Rob Schwyzer (Inactive) made changes - 2021-06-24 20:27

Labels

corruption encryption page_compression recovery

ServiceNow corruption encryption page_compression recovery

Rob Schwyzer (Inactive) made changes - 2021-07-02 19:59

Labels

ServiceNow corruption encryption page_compression recovery

76qDvLB8Gju6Hs7nk3VY3EX42G795W5z corruption encryption page_compression recovery

Sergei Golubchik made changes - 2021-08-12 21:25

Link

This issue blocks MDEV-15827 [ MDEV-15827 ]

Sergei Golubchik made changes - 2021-08-13 22:34

Labels

76qDvLB8Gju6Hs7nk3VY3EX42G795W5z corruption encryption page_compression recovery

corruption encryption page_compression recovery

Marko Mäkelä made changes - 2021-09-06 09:52

Link

This issue is duplicated by ~~MDEV-26550~~ [ ~~MDEV-26550~~ ]

Marko Mäkelä made changes - 2021-09-06 15:19

Link

This issue relates to ~~MDEV-26537~~ [ ~~MDEV-26537~~ ]

Marko Mäkelä made changes - 2021-10-07 05:25

Link

This issue relates to ~~MDEV-26780~~ [ ~~MDEV-26780~~ ]

Marko Mäkelä made changes - 2021-10-25 04:26

Link

This issue relates to ~~MDEV-26893~~ [ ~~MDEV-26893~~ ]

Marko Mäkelä made changes - 2021-11-16 06:12

Link

This issue relates to ~~MDEV-27053~~ [ ~~MDEV-27053~~ ]

Sergei Golubchik made changes - 2021-12-06 21:36

Workflow

MariaDB v3 [ 82110 ]

MariaDB v4 [ 143964 ]

Marko Mäkelä made changes - 2022-02-17 17:43

Link

This issue relates to ~~MDEV-27865~~ [ ~~MDEV-27865~~ ]

Sergei Golubchik made changes - 2022-02-22 14:26

Fix Version/s		10.10 [ 27530 ]
Fix Version/s	10.2 [ 14601 ]
Fix Version/s	10.3 [ 22126 ]
Fix Version/s	10.4 [ 22408 ]
Fix Version/s	10.5 [ 23123 ]

Marko Mäkelä made changes - 2022-03-04 09:07

Link

This issue relates to ~~MDEV-28002~~ [ ~~MDEV-28002~~ ]

Marko Mäkelä made changes - 2022-03-25 14:01

Link

This issue relates to ~~MDEV-28174~~ [ ~~MDEV-28174~~ ]

Marko Mäkelä made changes - 2022-04-19 13:57

Link

This issue relates to MDEV-28349 [ MDEV-28349 ]

Marko Mäkelä added a comment - 2022-05-02 12:15

When it comes to reading corrupted pages into the buffer pool, we seem to already do the right thing with regard to buf_page_is_corrupted() and buf_page_t::read_complete(). One relevant change was ~~MDEV-19541~~.

It looks like the main part of the fix would be to check that error results from buf_page_get_gen() will be propagated from the caller.

I would exclude that in change buffer merge from this. We have seen crashes in btr_page_reorganize() that were invoked from ibuf_insert_to_index_page_low() during ibuf_merge_or_delete_for_page(). Due to the design of the change buffer, such crashes can also occur during CHECK TABLE, where some extra attention has been spent to avoid crashes due to corruption. We already disabled the change buffer by default (~~MDEV-27734~~) and deprecated it (~~MDEV-27735~~) due to corruption that we are unable to reproduce in our internal testing.

Marko Mäkelä added a comment - 2022-05-02 12:15 When it comes to reading corrupted pages into the buffer pool, we seem to already do the right thing with regard to buf_page_is_corrupted() and buf_page_t::read_complete() . One relevant change was MDEV-19541 . It looks like the main part of the fix would be to check that error results from buf_page_get_gen() will be propagated from the caller. I would exclude that in change buffer merge from this. We have seen crashes in btr_page_reorganize() that were invoked from ibuf_insert_to_index_page_low() during ibuf_merge_or_delete_for_page() . Due to the design of the change buffer, such crashes can also occur during CHECK TABLE , where some extra attention has been spent to avoid crashes due to corruption. We already disabled the change buffer by default ( MDEV-27734 ) and deprecated it ( MDEV-27735 ) due to corruption that we are unable to reproduce in our internal testing.

Marko Mäkelä added a comment - 2022-05-02 14:09

It looks like we already have a large part of these robustness fixes present in 10.5 (added by ~~MDEV-15528~~ and related changes) or 10.6 (added in ~~MDEV-25506~~ and changes related to atomic DDL). I already went through all direct callers of buf_page_get_gen(), and there is not that much to be fixed. I still have to check all callers of some 12 functions and then implement some fault injection in buf_page_get_gen() so that we can run some stress tests.

I would declare that those linked open bugs that are related to some other forms of corruption (say, a garbage BLOB pointer or DB_ROLL_PTR, or an incorrect key specified for decrypting an encrypted page) can be out of the scope of this fix. This fix should mainly be about handling failures to read a page.

Marko Mäkelä added a comment - 2022-05-02 14:09 It looks like we already have a large part of these robustness fixes present in 10.5 (added by MDEV-15528 and related changes) or 10.6 (added in MDEV-25506 and changes related to atomic DDL). I already went through all direct callers of buf_page_get_gen() , and there is not that much to be fixed. I still have to check all callers of some 12 functions and then implement some fault injection in buf_page_get_gen() so that we can run some stress tests. I would declare that those linked open bugs that are related to some other forms of corruption (say, a garbage BLOB pointer or DB_ROLL_PTR , or an incorrect key specified for decrypting an encrypted page) can be out of the scope of this fix. This fix should mainly be about handling failures to read a page.

Marko Mäkelä made changes - 2022-05-02 14:10

Status

Confirmed [ 10101 ]

In Progress [ 3 ]

Marko Mäkelä made changes - 2022-05-16 11:05

Link

This issue is duplicated by ~~MDEV-28574~~ [ ~~MDEV-28574~~ ]

Marko Mäkelä made changes - 2022-05-16 14:55

Link

This issue is duplicated by ~~MDEV-27865~~ [ ~~MDEV-27865~~ ]

Marko Mäkelä made changes - 2022-05-16 14:55

Link

This issue relates to ~~MDEV-27865~~ [ ~~MDEV-27865~~ ]

Matthias Leich added a comment - 2022-05-19 14:54

The RQG testing on the development tree

     origin/bb-10.6-MDEV-13542 dca6937a30eeee56f46fc47ac85656243227e4d2 2022-05-16T16:31:15+03:00

showed frequent

- SEGV's with a backtrace like in the description on top of MDEV-25257

- cases where the server error log contains the message '[ERROR] mysqld got signal 11' but the server process

   did not disappear even after more than 300s waiting --> no core file nor rr trace

After applying Thiru's initial patch for MDEV-25257

    diff --git a/storage/innobase/dict/dict0load.cc b/storage/innobase/dict/dict0load.cc

index 20916c2c96e..2c3d48b9573 100644

--- a/storage/innobase/dict/dict0load.cc

+++ b/storage/innobase/dict/dict0load.cc

@@ -1383,6 +1383,7 @@ dict_load_columns(

                        the flag is set before the table is created. */

                        if (table->fts == NULL) {

                                table->fts = fts_create(table);

+                               table->fts->cache = fts_cache_create(table);

                        ut_a(table->fts->doc_col == ULINT_UNDEFINED);

 to the tree bb-10.6-MDEV-13542 both bad effects were no more seen again in testing.

Now some new dominant problem showed up.

[rr 3532891 209362]mysqld: /data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc:2682: buf_block_t* buf_page_get_low(page_id_t, ulint, ulint, buf_block_t*, ulint, mtr_t*, dberr_t*, bool): Assertion `mode == 11 || mode == 12 || block->zip_size() == zip_size' failed.

(rr) bt

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50

#1  0x00007fcf37482859 in __GI_abort () at abort.c:79

#2  0x00007fcf37482729 in __assert_fail_base (fmt=0x7fcf37618588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55996a841da0 "mode == 11 || mode == 12 || block->zip_size() == zip_size",

    file=0x55996a83d680 "/data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc", line=2682, function=<optimized out>) at assert.c:92

#3  0x00007fcf37493f36 in __GI___assert_fail (assertion=0x55996a841da0 "mode == 11 || mode == 12 || block->zip_size() == zip_size", file=0x55996a83d680 "/data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc",

    line=2682, function=0x55996a841880 "buf_block_t* buf_page_get_low(page_id_t, ulint, ulint, buf_block_t*, ulint, mtr_t*, dberr_t*, bool)") at assert.c:101

#4  0x00005599696fa35f in buf_page_get_low (page_id=..., zip_size=0, rw_latch=8, guess=0x0, mode=16, mtr=0x7fcf1a61aec0, err=0x7fcf1a61a770, allow_ibuf_merge=false)

    at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc:2682

#5  0x00005599696fbaad in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=8, guess=0x0, mode=16, mtr=0x7fcf1a61aec0, err=0x7fcf1a61a770, allow_ibuf_merge=false)

    at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc:2984

#6  0x00005599696ae711 in btr_cur_pess_upd_restore_supremum (block=0x7fcf2aea57e0, rec=0x7fcf2b49e781 "", mtr=0x7fcf1a61aec0) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/btr/btr0cur.cc:4782

#7  0x00005599696b1f88 in btr_cur_pessimistic_update (flags=10, cursor=0x61100005f180, offsets=0x7fcf1a61aac0, offsets_heap=0x7fcf1a61abc0, entry_heap=0x619000bf9080, big_rec=0x7fcf1a61aaa0, update=0x62100026f3f0, cmpl_info=0,

    thr=0x62100026f750, trx_id=1719, mtr=0x7fcf1a61aec0) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/btr/btr0cur.cc:5243

#8  0x000055996956ff1b in row_upd_clust_rec (flags=0, node=0x62100026f2c0, index=0x616002f91a08, offsets=0x619000bf9850, offsets_heap=0x7fcf1a61abc0, thr=0x62100026f750, mtr=0x7fcf1a61aec0)

    at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2456

#9  0x0000559969571812 in row_upd_clust_step (node=0x62100026f2c0, thr=0x62100026f750) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2729

#10 0x0000559969571f69 in row_upd (node=0x62100026f2c0, thr=0x62100026f750) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2792

#11 0x0000559969572a6b in row_upd_step (thr=0x62100026f750) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2934

#12 0x00005599694d46e5 in row_update_for_mysql (prebuilt=0x62100026e988) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0mysql.cc:1691

#13 0x000055996914b046 in ha_innobase::update_row (this=0x61d0007f94b8, old_row=0x61a0003c80f8 <incomplete sequence \375>, new_row=0x61a0003c7eb8 "\265")

    at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/handler/ha_innodb.cc:8668

#14 0x00005599688687dd in handler::ha_update_row (this=0x61d0007f94b8, old_data=0x61a0003c80f8 <incomplete sequence \375>, new_data=0x61a0003c7eb8 "\265") at /data/Server/bb-10.6-MDEV-13542G/sql/handler.cc:7602

#15 0x00005599683484c9 in mysql_update (thd=0x62b00009a218, table_list=0x62b0000a13f8, fields=..., values=..., conds=0x0, order_num=0, order=0x0, limit=18446744073709551602, ignore=false, found_return=0x7fcf1a61c2f0,

    updated_return=0x7fcf1a61c310) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_update.cc:1087

#16 0x000055996806794f in mysql_execute_command (thd=0x62b00009a218, is_called_from_prepared_stmt=false) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:4423

#17 0x000055996807fd40 in mysql_parse (thd=0x62b00009a218, rawbuf=0x62b0000a1238 "UPDATE t2 SET col_text = REPEAT(SUBSTR(CAST( 2361 AS CHAR),1,1), @fill_amount)  /* E_R Thread2 QNO 1139 CON_ID 16 */", length=116,

    parser_state=0x7fcf1a61cb20) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:8045

#18 0x0000559968058133 in dispatch_command (command=COM_QUERY, thd=0x62b00009a218, packet=0x629000c53219 " UPDATE t2 SET col_text = REPEAT(SUBSTR(CAST( 2361 AS CHAR),1,1), @fill_amount)  /* E_R Thread2 QNO 1139 CON_ID 16 */ ",

    packet_length=118, blocking=true) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:1912

#19 0x0000559968055369 in do_command (thd=0x62b00009a218, blocking=true) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:1409

#20 0x0000559968458c53 in do_handle_one_connection (connect=0x608000002eb8, put_in_cache=true) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_connect.cc:1418

#21 0x00005599684584df in handle_one_connection (arg=0x608000002838) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_connect.cc:1312

#22 0x00007fcf379ac609 in start_thread (arg=<optimized out>) at pthread_create.c:477

#23 0x00007fcf3757f293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(rr)

pluto:/data/results/1652901598/TBR-1503/dev/shm/rqg/1652901598/57/1/rr

This problem disappeared after applying Thiru's cursor_restore_supremum.patch.

Matthias Leich added a comment - 2022-05-19 14:54 The RQG testing on the development tree origin/bb-10.6-MDEV-13542 dca6937a30eeee56f46fc47ac85656243227e4d2 2022-05-16T16:31:15+03:00 showed frequent - SEGV's with a backtrace like in the description on top of MDEV-25257 - cases where the server error log contains the message '[ERROR] mysqld got signal 11' but the server process did not disappear even after more than 300s waiting --> no core file nor rr trace After applying Thiru's initial patch for MDEV-25257 diff --git a/storage/innobase/dict/dict0load.cc b/storage/innobase/dict/dict0load.cc index 20916c2c96e..2c3d48b9573 100644 --- a/storage/innobase/dict/dict0load.cc +++ b/storage/innobase/dict/dict0load.cc @@ -1383,6 +1383,7 @@ dict_load_columns( the flag is set before the table is created. */ if (table->fts == NULL) { table->fts = fts_create(table); + table->fts->cache = fts_cache_create(table); } ut_a(table->fts->doc_col == ULINT_UNDEFINED); to the tree bb-10.6-MDEV-13542 both bad effects were no more seen again in testing. Now some new dominant problem showed up. [rr 3532891 209362]mysqld: /data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc:2682: buf_block_t* buf_page_get_low(page_id_t, ulint, ulint, buf_block_t*, ulint, mtr_t*, dberr_t*, bool): Assertion `mode == 11 || mode == 12 || block->zip_size() == zip_size' failed. (rr) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x00007fcf37482859 in __GI_abort () at abort.c:79 #2 0x00007fcf37482729 in __assert_fail_base (fmt=0x7fcf37618588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55996a841da0 "mode == 11 || mode == 12 || block->zip_size() == zip_size", file=0x55996a83d680 "/data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc", line=2682, function=<optimized out>) at assert.c:92 #3 0x00007fcf37493f36 in __GI___assert_fail (assertion=0x55996a841da0 "mode == 11 || mode == 12 || block->zip_size() == zip_size", file=0x55996a83d680 "/data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc", line=2682, function=0x55996a841880 "buf_block_t* buf_page_get_low(page_id_t, ulint, ulint, buf_block_t*, ulint, mtr_t*, dberr_t*, bool)") at assert.c:101 #4 0x00005599696fa35f in buf_page_get_low (page_id=..., zip_size=0, rw_latch=8, guess=0x0, mode=16, mtr=0x7fcf1a61aec0, err=0x7fcf1a61a770, allow_ibuf_merge=false) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc:2682 #5 0x00005599696fbaad in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=8, guess=0x0, mode=16, mtr=0x7fcf1a61aec0, err=0x7fcf1a61a770, allow_ibuf_merge=false) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/buf/buf0buf.cc:2984 #6 0x00005599696ae711 in btr_cur_pess_upd_restore_supremum (block=0x7fcf2aea57e0, rec=0x7fcf2b49e781 "", mtr=0x7fcf1a61aec0) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/btr/btr0cur.cc:4782 #7 0x00005599696b1f88 in btr_cur_pessimistic_update (flags=10, cursor=0x61100005f180, offsets=0x7fcf1a61aac0, offsets_heap=0x7fcf1a61abc0, entry_heap=0x619000bf9080, big_rec=0x7fcf1a61aaa0, update=0x62100026f3f0, cmpl_info=0, thr=0x62100026f750, trx_id=1719, mtr=0x7fcf1a61aec0) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/btr/btr0cur.cc:5243 #8 0x000055996956ff1b in row_upd_clust_rec (flags=0, node=0x62100026f2c0, index=0x616002f91a08, offsets=0x619000bf9850, offsets_heap=0x7fcf1a61abc0, thr=0x62100026f750, mtr=0x7fcf1a61aec0) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2456 #9 0x0000559969571812 in row_upd_clust_step (node=0x62100026f2c0, thr=0x62100026f750) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2729 #10 0x0000559969571f69 in row_upd (node=0x62100026f2c0, thr=0x62100026f750) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2792 #11 0x0000559969572a6b in row_upd_step (thr=0x62100026f750) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0upd.cc:2934 #12 0x00005599694d46e5 in row_update_for_mysql (prebuilt=0x62100026e988) at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/row/row0mysql.cc:1691 #13 0x000055996914b046 in ha_innobase::update_row (this=0x61d0007f94b8, old_row=0x61a0003c80f8 <incomplete sequence \375>, new_row=0x61a0003c7eb8 "\265") at /data/Server/bb-10.6-MDEV-13542G/storage/innobase/handler/ha_innodb.cc:8668 #14 0x00005599688687dd in handler::ha_update_row (this=0x61d0007f94b8, old_data=0x61a0003c80f8 <incomplete sequence \375>, new_data=0x61a0003c7eb8 "\265") at /data/Server/bb-10.6-MDEV-13542G/sql/handler.cc:7602 #15 0x00005599683484c9 in mysql_update (thd=0x62b00009a218, table_list=0x62b0000a13f8, fields=..., values=..., conds=0x0, order_num=0, order=0x0, limit=18446744073709551602, ignore=false, found_return=0x7fcf1a61c2f0, updated_return=0x7fcf1a61c310) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_update.cc:1087 #16 0x000055996806794f in mysql_execute_command (thd=0x62b00009a218, is_called_from_prepared_stmt=false) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:4423 #17 0x000055996807fd40 in mysql_parse (thd=0x62b00009a218, rawbuf=0x62b0000a1238 "UPDATE t2 SET col_text = REPEAT(SUBSTR(CAST( 2361 AS CHAR),1,1), @fill_amount) /* E_R Thread2 QNO 1139 CON_ID 16 */", length=116, parser_state=0x7fcf1a61cb20) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:8045 #18 0x0000559968058133 in dispatch_command (command=COM_QUERY, thd=0x62b00009a218, packet=0x629000c53219 " UPDATE t2 SET col_text = REPEAT(SUBSTR(CAST( 2361 AS CHAR),1,1), @fill_amount) /* E_R Thread2 QNO 1139 CON_ID 16 */ ", packet_length=118, blocking=true) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:1912 #19 0x0000559968055369 in do_command (thd=0x62b00009a218, blocking=true) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_parse.cc:1409 #20 0x0000559968458c53 in do_handle_one_connection (connect=0x608000002eb8, put_in_cache=true) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_connect.cc:1418 #21 0x00005599684584df in handle_one_connection (arg=0x608000002838) at /data/Server/bb-10.6-MDEV-13542G/sql/sql_connect.cc:1312 #22 0x00007fcf379ac609 in start_thread (arg=<optimized out>) at pthread_create.c:477 #23 0x00007fcf3757f293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 (rr) pluto:/data/results/1652901598/TBR-1503/dev/shm/rqg/1652901598/57/1/rr This problem disappeared after applying Thiru's cursor_restore_supremum.patch.

Matthias Leich made changes - 2022-05-19 14:55

Attachment

cursor_restore_supremum.patch [ 63815 ]

Marko Mäkelä added a comment - 2022-05-23 15:05

thiru, thank you for cursor_restore_supremum.patch. I hope that my simpler solution works as well.

Even today, the fault injection exposed a couple new crashes, which I fixed:

./mtr --parallel=auto --big-test --force --max-test-fail=0 --mysqld=--debug=d,intermittent_read_failure

My main effort today was to prevent crashes on change buffer merge failure (motivated by MDEV-28349) as well as on page split or merge. Such corruption should not be a direct consequence of a page checksum mismatch, but something else (say, ~~MDEV-28312~~, a bug in Galera snapshot transfer).

Marko Mäkelä added a comment - 2022-05-23 15:05 thiru , thank you for cursor_restore_supremum.patch . I hope that my simpler solution works as well. Even today, the fault injection exposed a couple new crashes, which I fixed: ./mtr --parallel=auto --big-test --force --max-test-fail=0 --mysqld=--debug=d,intermittent_read_failure My main effort today was to prevent crashes on change buffer merge failure (motivated by MDEV-28349 ) as well as on page split or merge. Such corruption should not be a direct consequence of a page checksum mismatch, but something else (say, MDEV-28312 , a bug in Galera snapshot transfer).

Marko Mäkelä made changes - 2022-05-23 15:33

Fix Version/s		10.6 [ 24028 ]
Fix Version/s		10.7 [ 24805 ]
Fix Version/s		10.8 [ 26121 ]
Fix Version/s		10.9 [ 26905 ]

Marko Mäkelä made changes - 2022-06-02 11:35

Link

This issue blocks ~~MDEV-18976~~ [ ~~MDEV-18976~~ ]

Marko Mäkelä made changes - 2022-06-06 12:57

Link

This issue blocks ~~MDEV-28457~~ [ ~~MDEV-28457~~ ]

Marko Mäkelä made changes - 2022-06-06 14:56

issue.field.resolutiondate

2022-06-06 14:56:03.0

2022-06-06 14:56:03.901

Marko Mäkelä made changes - 2022-06-06 14:56

Fix Version/s		10.6.9 [ 27507 ]
Fix Version/s		10.7.5 [ 27505 ]
Fix Version/s		10.8.4 [ 27503 ]
Fix Version/s		10.9.2 [ 27115 ]
Fix Version/s	10.6 [ 24028 ]
Fix Version/s	10.7 [ 24805 ]
Fix Version/s	10.8 [ 26121 ]
Fix Version/s	10.9 [ 26905 ]
Fix Version/s	10.10 [ 27530 ]
Resolution		Fixed [ 1 ]
Status	In Progress [ 3 ]	Closed [ 6 ]

Marko Mäkelä made changes - 2022-06-07 14:59

Link

This issue blocks MDEV-15827 [ MDEV-15827 ]

Marko Mäkelä made changes - 2022-06-07 15:07

Link

This issue includes ~~MDEV-22388~~ [ ~~MDEV-22388~~ ]

Marko Mäkelä made changes - 2022-06-07 15:08

Link

This issue blocks ~~MDEV-22388~~ [ ~~MDEV-22388~~ ]

Marko Mäkelä made changes - 2022-06-07 15:20

Link

This issue is duplicated by ~~MDEV-14170~~ [ ~~MDEV-14170~~ ]

Marko Mäkelä made changes - 2022-06-15 07:30

Link

This issue causes ~~MDEV-28845~~ [ ~~MDEV-28845~~ ]

Marko Mäkelä made changes - 2022-06-22 09:18

Link

This issue is duplicated by ~~MDEV-28148~~ [ ~~MDEV-28148~~ ]

Marko Mäkelä made changes - 2022-06-22 09:25

Link

This issue relates to MDEV-28143 [ MDEV-28143 ]

Marko Mäkelä made changes - 2022-06-27 09:30

Link

This issue causes ~~MDEV-28950~~ [ ~~MDEV-28950~~ ]

Marko Mäkelä made changes - 2022-06-29 05:25

Link

This issue relates to ~~MDEV-28975~~ [ ~~MDEV-28975~~ ]

Marko Mäkelä made changes - 2022-07-14 14:07

Link

This issue relates to ~~MDEV-29082~~ [ ~~MDEV-29082~~ ]

Marko Mäkelä made changes - 2022-07-20 19:49

Link

This issue is duplicated by ~~MDEV-29126~~ [ ~~MDEV-29126~~ ]

Marko Mäkelä made changes - 2022-07-20 19:55

Link

This issue is duplicated by ~~MDEV-29130~~ [ ~~MDEV-29130~~ ]

Marko Mäkelä made changes - 2022-07-22 16:46

Link

This issue is duplicated by ~~MDEV-29150~~ [ ~~MDEV-29150~~ ]

Marko Mäkelä made changes - 2022-07-26 13:10

Link

This issue is duplicated by ~~MDEV-28786~~ [ ~~MDEV-28786~~ ]

Marko Mäkelä made changes - 2022-08-05 13:21

Description

InnoDB is notoriously aborting the whole MariaDB server if it is trying to access a corrupted page.
~~MDEV-12253~~ alleviated the issue somewhat, but not for background processing, such as the purge of no-longer-needed history.
This bug exists in all InnoDB and XtraDB versions.
MariaDB essentially must fix [MySQL Bug#10132 Crashing the server on corrupt InnoDB page is unhelpful|https://bugs.mysql.com/bug.php?id=10132] that was filed on April 25, 2005.

In 10.3 with ~~MDEV-12288~~ and ~~MDEV-13536~~, the InnoDB purge threads will access pages more often, causing all sorts of debug assertion failures in the B-tree code. To work around this problem, tests that intentionally cause B-tree pages to be unreadable are amended with the following parameters, so that everything will be purged before the server is restarted to deal with the corruption:
{noformat}
--innodb-purge-rseg-truncate-frequency=1
--skip-innodb-fast-shutdown
{noformat}

Since the very first version, InnoDB was aborting the whole MySQL or MariaDB server when trying to access a corrupted page.

~~MDEV-12253~~ alleviated the issue somewhat, but not for background processing, such as the purge of no-longer-needed history.
This bug exists in all InnoDB and XtraDB versions.
MariaDB essentially must fix [MySQL Bug#10132 Crashing the server on corrupt InnoDB page is unhelpful|https://bugs.mysql.com/bug.php?id=10132] that was filed on April 25, 2005.

In 10.3 with ~~MDEV-12288~~ and ~~MDEV-13536~~, the InnoDB purge threads will access pages more often, causing all sorts of debug assertion failures in the B-tree code. To work around this problem, tests that intentionally cause B-tree pages to be unreadable are amended with the following parameters, so that everything will be purged before the server is restarted to deal with the corruption:
{noformat}
--innodb-purge-rseg-truncate-frequency=1
--skip-innodb-fast-shutdown
{noformat}

Marko Mäkelä made changes - 2022-08-05 13:27

Description

Since the very first version, InnoDB was aborting the whole MySQL or MariaDB server when trying to access a corrupted page.

Earlier, small steps have been taken to improve MariaDB stability, such as ~~MDEV-12253~~.

We must enable error handling for corrupted pages instead of allowing the server to crash. This is not only about detecting a checksum failure when reading a corrupted page (like suggested in [MySQL Bug#10132 Crashing the server on corrupt InnoDB page is unhelpful|https://bugs.mysql.com/bug.php?id=10132], filed on April 25, 2005), but also about performing some operations on corrupted pages whose checksum appears valid.

In some background operations, such as the purge of committed transaction history, errors may be reported to the server error log only, or silently ignored. While executing SQL, errors should be reported back to the client, and the transaction could be rolled back.

Marko Mäkelä made changes - 2022-08-18 13:01

Link

This issue causes ~~MDEV-29321~~ [ ~~MDEV-29321~~ ]

Marko Mäkelä made changes - 2022-08-25 06:50

Link

This issue relates to ~~MDEV-29374~~ [ ~~MDEV-29374~~ ]

Marko Mäkelä made changes - 2022-08-26 08:15

Link

This issue causes ~~MDEV-29383~~ [ ~~MDEV-29383~~ ]

Marko Mäkelä made changes - 2022-08-26 08:50

Link

This issue causes MENT-1609 [ MENT-1609 ]

Marko Mäkelä made changes - 2022-08-31 09:04

Link

This issue causes ~~MDEV-29374~~ [ ~~MDEV-29374~~ ]

Marko Mäkelä made changes - 2022-08-31 09:06

Link

This issue relates to ~~MDEV-29374~~ [ ~~MDEV-29374~~ ]

Marko Mäkelä made changes - 2022-09-01 06:59

Link

This issue causes ~~MDEV-29435~~ [ ~~MDEV-29435~~ ]

Marko Mäkelä made changes - 2022-09-22 09:30

Link

This issue blocks ~~MDEV-29603~~ [ ~~MDEV-29603~~ ]

Marko Mäkelä made changes - 2022-11-03 06:55

Link

This issue relates to MDEV-29938 [ MDEV-29938 ]

Marko Mäkelä made changes - 2022-11-03 07:18

Link

This issue relates to ~~MDEV-29276~~ [ ~~MDEV-29276~~ ]

Marko Mäkelä made changes - 2022-11-03 07:33

Link

This issue relates to ~~MDEV-26022~~ [ ~~MDEV-26022~~ ]

Marko Mäkelä made changes - 2022-11-23 10:00

Link

This issue relates to ~~MDEV-9663~~ [ ~~MDEV-9663~~ ]

Marko Mäkelä made changes - 2022-11-30 07:15

Link

This issue relates to ~~MDEV-30132~~ [ ~~MDEV-30132~~ ]

Marko Mäkelä made changes - 2022-11-30 08:59

Link

This issue is duplicated by MENT-1669 [ MENT-1669 ]

Marko Mäkelä made changes - 2023-02-08 16:22

Link

This issue relates to ~~MDEV-30397~~ [ ~~MDEV-30397~~ ]

Ralf Gebhardt made changes - 2023-03-02 09:44

Link

This issue relates to MENT-1749 [ MENT-1749 ]

Marko Mäkelä made changes - 2023-05-12 05:36

Link

This issue relates to ~~MDEV-31245~~ [ ~~MDEV-31245~~ ]

Marko Mäkelä made changes - 2023-05-24 09:12

Link

This issue causes ~~MDEV-31333~~ [ ~~MDEV-31333~~ ]

Marko Mäkelä made changes - 2023-05-26 13:05

Link

This issue relates to ~~MDEV-31353~~ [ ~~MDEV-31353~~ ]

Marko Mäkelä made changes - 2023-07-25 08:00

Link

This issue relates to ~~MDEV-31767~~ [ ~~MDEV-31767~~ ]

Marko Mäkelä made changes - 2023-08-01 12:01

Link

This issue blocks ~~MDEV-27593~~ [ ~~MDEV-27593~~ ]

Marko Mäkelä made changes - 2023-08-03 12:59

Link

This issue relates to ~~MDEV-27538~~ [ ~~MDEV-27538~~ ]

Marko Mäkelä made changes - 2023-08-31 05:13

Link

This issue relates to ~~MDEV-32044~~ [ ~~MDEV-32044~~ ]

Marko Mäkelä made changes - 2023-09-27 12:37

Link

This issue relates to ~~MDEV-32269~~ [ ~~MDEV-32269~~ ]

Marko Mäkelä made changes - 2024-01-29 07:04

Link

This issue relates to ~~MDEV-33325~~ [ ~~MDEV-33325~~ ]

Marko Mäkelä made changes - 2024-02-07 08:48

Link

This issue is duplicated by ~~MDEV-33398~~ [ ~~MDEV-33398~~ ]

Rob Schwyzer (Inactive) made changes - 2024-04-04 19:53

Remote Link

This issue links to "Page (MariaDB Confluence)" [ 36696 ]

Rob Schwyzer (Inactive) made changes - 2024-04-05 18:34

Remote Link

This issue links to "Page (MariaDB Confluence)" [ 36696 ]

Marko Mäkelä made changes - 2024-05-24 10:23

Link

This issue relates to ~~MDEV-34233~~ [ ~~MDEV-34233~~ ]

Jira Automation (IT) made changes - 2024-07-04 08:22

Zendesk Related Tickets		201658 110821 114635 110277 135951 119185 118441 132996 109663 161967 179756 175774
Zendesk active tickets		201658

Marko Mäkelä made changes - 2024-08-19 09:57

Link

This issue relates to ~~MDEV-34773~~ [ ~~MDEV-34773~~ ]

Marko Mäkelä made changes - 2024-08-28 12:37

Link

This issue blocks ~~MDEV-34830~~ [ ~~MDEV-34830~~ ]

Marko Mäkelä made changes - 2024-11-14 06:29

Link

This issue relates to ~~MDEV-35413~~ [ ~~MDEV-35413~~ ]

Jimmy Hú made changes - 2024-11-17 20:24

Link

This issue causes ~~MDEV-35430~~ [ ~~MDEV-35430~~ ]

Marko Mäkelä made changes - 2025-01-03 06:49

Link

This issue relates to ~~MDEV-35729~~ [ ~~MDEV-35729~~ ]

Alessandro Vetere made changes - 2025-03-04 10:47

Remote Link

This issue links to "SAMU-289 (Jira)" [ 37435 ]

Thirunarayanan Balathandayuthapani made changes - 2025-03-21 06:56

Link

This issue relates to ~~MDEV-18455~~ [ ~~MDEV-18455~~ ]

People

Assignee:: Marko Mäkelä

Reporter:: Marko Mäkelä

Votes:: 7 Vote for this issue

Watchers:: 25 Start watching this issue

Dates

Created:: 2017-08-16 06:02

Updated:: 2025-03-21 06:56

Resolved:: 2022-06-06 14:56

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration