[MDEV-11502] Assert in _fil_io because of node->size=0 during backup of encrypted FULLTEXT Created: 2016-12-08  Updated: 2016-12-13  Resolved: 2016-12-13

Status: Closed
Project: MariaDB Server
Component/s: Backup
Affects Version/s: None
Fix Version/s: 10.1.20

Type: Bug Priority: Major
Reporter: Andrii Nikitin (Inactive) Assignee: Vladislav Vaintroub
Resolution: Fixed Votes: 0
Labels: None


 Description   

Created from MDEV-11483

Assertion triggers sometimes during backup of encrypted empty FULLTEXT index :

161207 11:43:40 [01] Copying .\test\fts_0000000000000014_000000000000001e_index_6.ibd to F:\git\mysql-test\var\tmp\backup\test\fts_0000000000000014_000000000000001e_index_6.ibd
InnoDB: Error: trying to access page number 0 in space 15,
InnoDB: space name test/FTS_0000000000000014_000000000000001e_INDEX_6,
InnoDB: which is outside the tablespace bounds.
InnoDB: Byte offset 0, len 16384, i/o type 10.
InnoDB: If you get this error at mysqld startup, please check that
InnoDB: your my.cnf matches the ibdata files that you have in the
InnoDB: MySQL server.
2016-12-07 11:43:40 194c InnoDB: Assertion failure in thread 6476 in file fil0fil.cc line 6265

It is hard to reproduce in some environments. Following test case shows the crash 95% of runs for me:
https://github.com/MariaDB/server/blob/bb-10.1-wlad-xtrabackup/mysql-test/suite/xtrabackup/xb_fulltext_encrypted.test



 Comments   
Comment by Andrii Nikitin (Inactive) [ 2016-12-08 ]

Stack during assert:
xtrabackup.exe!abort()[abort.cpp:64]
xtrabackup.exe!_fil_io()[fil0fil.cc:6283]
xtrabackup.exe!fil_read()[fil0fil.cc:289]
xtrabackup.exe!fil_space_get_crypt_data()[fil0fil.cc:7654]
xtrabackup.exe!buf_page_is_corrupted()[buf0buf.cc:724]
xtrabackup.exe!fil_space_verify_crypt_checksum()[fil0crypt.cc:982]
xtrabackup.exe!xb_fil_cur_read()[fil_cur.cc:349]
xtrabackup.exe!xtrabackup_copy_datafile()[xtrabackup.cc:2418]
xtrabackup.exe!data_copy_thread_func()[xtrabackup.cc:2934]

Following condition in _fil_io() triggers assert because node->size has 0 (when trying to read encryption info (to verify checksum of page which was successfully read from file in datadir) ):
/* Check that at least the start offset is within the bounds of a
single-table tablespace, including rollback tablespaces. */
if (UNIV_UNLIKELY(node->size <= block_offset)
&& space->id != 0 && space->purpose == FIL_TABLESPACE)

{ fil_report_invalid_page_access( block_offset, space_id, space->name, byte_offset, len, type); ut_error; }
Comment by Andrii Nikitin (Inactive) [ 2016-12-08 ]

I am able to repeat the same assert without FULLTEXT, working on it.
It looks probability of crash is increased if machine is busy with some additional heavy jobs.

Comment by Andrii Nikitin (Inactive) [ 2016-12-08 ]

Just for reference it looks buildbot had the same assert at least once (also with encryption):
http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest2/builds/6441/steps/test_3/logs/stdio

encryption.innodb_encryption_discard_import
InnoDB: Error: trying to access page number 0 in space 7,
InnoDB: space name test/t1,
InnoDB: which is outside the tablespace bounds.
InnoDB: Byte offset 0, len 16384, i/o type 10.
2016-12-06 08:46:31 a8dfcb40 InnoDB: Assertion failure in thread 2833238848 in file fil0fil.cc line 5933

Comment by Andrii Nikitin (Inactive) [ 2016-12-08 ]

I am still not sure how to reliably reproduce it , but for me the crash is definitely gone after I replaced following line in fil0fil.cc :: fil_load_single_table_tablespace() :

if (!fil_node_create(fsp->filepath, 0, fsp->id, FALSE)) {
with
if (!fil_node_create(fsp->filepath, size, fsp->id, FALSE)) {

Comment says that zero size is OK, because correct size will be initialized during fil_node_open_file() . But I did put diagnostic message into that function and confirm that fil_node_open_file() is never called in xtrabackup for problem files. Further investigation shows that reason is following line down in fil_load_single_table_tablespace() :
if (IS_XTRABACKUP() && srv_backup_mode && !srv_close_files) {
...
node->open = TRUE;

I am not sure why it works in most of cases and at which point correct node->size is initialized, but hope that this investigation is helpful

Comment by Andrii Nikitin (Inactive) [ 2016-12-09 ]

good that buildbot caught the problem as well:
https://buildbot.askmonty.org/buildbot/builders/labrador/builds/8576/steps/test/logs/stdio

InnoDB: Error: trying to access page number 0 in space 18,
InnoDB: space name test/FTS_0000000000000014_0000000000000025_INDEX_3,
InnoDB: which is outside the tablespace bounds.
InnoDB: Byte offset 0, len 16384, i/o type 10.
2016-12-09 09:54:19 106a91000 InnoDB: Assertion failure in thread 4406710272 in file fil0fil.cc line 6283

Comment by Andrii Nikitin (Inactive) [ 2016-12-09 ]

It looks that suggested change to pass `size` instead of `0` in fil_node_create in fil0fil.cc leads to occasional error in tests "InnoDB: File (unknown): 'read' returned OS error 0. Cannot continue operation". Sad that it doesn't work, because crash doesn't happen to me anymore.

Comment by Vladislav Vaintroub [ 2016-12-09 ]

Lets hope that https://github.com/mariadb/server/commit/150dbda406cc33196f4643231ec74235844fddc7 fixed it.

Generated at Thu Feb 08 07:50:26 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.