[MDEV-25783] CHECK TABLE harvests InnoDB: Index 'abdcef' contains 10001 entries, should be 10000 - Jira

Marko Mäkelä added a comment - 2021-05-27 11:41

Thank you. Initial investigation suggests that this bug should be avoided by setting innodb_change_buffering=none.

I analyzed the trace of the recovery a little. Buffered changes will be applied to many pages, including an index with the identifier 0x1d, where 10,001 records are counted instead of 10,000.

Furthermore, I tried recovery of the data directory on my local system and got the following:

mariadbd --datadir /dev/shm/data --innodb-page-size=4k --tc-heuristic-recover=1

10.6 f078788e293c0eed416e1bff0a4a3437d4085abb

2021-05-27 14:33:17 0 [Note] InnoDB: Rollback of non-prepared transactions completed

2021-05-27 14:33:24 0 [ERROR] InnoDB: tried to purge non-delete-marked record in index `abdcef` of table `test`.`tab1`: tuple: TUPLE (info_bits=0, 7 fields): {[6]profit(0x70726F666974),[10]vwmjgafhrr(0x76776D6A676166687272),[3]wmj(0x776D6A),[6]clinic(0x636C696E6963),[1]j(0x6A),[1]u(0x75),[4]    (0x0000130A)}, record: RECORD(info_bits=0, 7 fields): {[6]profit(0x70726F666974),[10]vwmjgafhrr(0x76776D6A676166687272),[3]wmj(0x776D6A),[6]clinic(0x636C696E6963),[1]j(0x6A),[1]u(0x75),[4]    (0x0000130A)}

So, we could be hitting the problems that were reported in ~~MDEV-22373~~.

Last, while debugging the recovery in the rr replay trace, I noticed a number of unnecessary calls that would be prevented by the following patch:

diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc

index 24e6e549da1..1159fce261c 100644

--- a/storage/innobase/buf/buf0buf.cc

+++ b/storage/innobase/buf/buf0buf.cc

@@ -3347,7 +3347,7 @@ static buf_block_t* buf_page_create_low(page_id_t page_id, ulint zip_size,

     if (block->page.ibuf_exist)

-      if (!recv_recovery_is_on())

+      if (page_id < page_id_t{SRV_TMP_SPACE_ID, 0} && !recv_recovery_is_on())

         ibuf_merge_or_delete_for_page(nullptr, page_id, zip_size);

       block->page.ibuf_exist= false;

I do not plan to apply that patch; it only illustrates the problem. I think that we should ever allow the ~~MDEV-19514~~ related ibuf_exist flag to exist at this point of time. The change buffer is a persistent data structure and therefore never used for temporary tables. This might be a sign of a ~~MDEV-19514~~ related regression in 10.5 and 10.6.

Marko Mäkelä added a comment - 2021-05-27 11:41 Thank you. Initial investigation suggests that this bug should be avoided by setting innodb_change_buffering=none . I analyzed the trace of the recovery a little. Buffered changes will be applied to many pages, including an index with the identifier 0x1d, where 10,001 records are counted instead of 10,000. Furthermore, I tried recovery of the data directory on my local system and got the following: mariadbd --datadir /dev/shm/data --innodb-page-size=4k --tc-heuristic-recover=1 10.6 f078788e293c0eed416e1bff0a4a3437d4085abb 2021-05-27 14:33:17 0 [Note] InnoDB: Rollback of non-prepared transactions completed 2021-05-27 14:33:24 0 [ERROR] InnoDB: tried to purge non-delete-marked record in index `abdcef` of table `test`.`tab1`: tuple: TUPLE (info_bits=0, 7 fields): {[6]profit(0x70726F666974),[10]vwmjgafhrr(0x76776D6A676166687272),[3]wmj(0x776D6A),[6]clinic(0x636C696E6963),[1]j(0x6A),[1]u(0x75),[4] (0x0000130A)}, record: RECORD(info_bits=0, 7 fields): {[6]profit(0x70726F666974),[10]vwmjgafhrr(0x76776D6A676166687272),[3]wmj(0x776D6A),[6]clinic(0x636C696E6963),[1]j(0x6A),[1]u(0x75),[4] (0x0000130A)} So, we could be hitting the problems that were reported in MDEV-22373 . Last, while debugging the recovery in the rr replay trace, I noticed a number of unnecessary calls that would be prevented by the following patch: diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc index 24e6e549da1..1159fce261c 100644 --- a/storage/innobase/buf/buf0buf.cc +++ b/storage/innobase/buf/buf0buf.cc @@ -3347,7 +3347,7 @@ static buf_block_t* buf_page_create_low(page_id_t page_id, ulint zip_size, if (block->page.ibuf_exist) { - if (!recv_recovery_is_on()) + if (page_id < page_id_t{SRV_TMP_SPACE_ID, 0} && !recv_recovery_is_on()) ibuf_merge_or_delete_for_page(nullptr, page_id, zip_size); block->page.ibuf_exist= false; } I do not plan to apply that patch; it only illustrates the problem. I think that we should ever allow the MDEV-19514 related ibuf_exist flag to exist at this point of time. The change buffer is a persistent data structure and therefore never used for temporary tables. This might be a sign of a MDEV-19514 related regression in 10.5 and 10.6.

Marko Mäkelä added a comment - 2021-05-27 13:40

I made a mistake in my previous comment. There actually are two calls to that function, and the unnecessary call was the other one. So, it does not look like ~~MDEV-19514~~ caused any correctness regression. In ibuf_merge_or_delete_for_page() we do an early return if the page number belongs to the temporary tablespace. But, we could optimize that check so that the function call would be avoided for temporary tables.

Marko Mäkelä added a comment - 2021-05-27 13:40 I made a mistake in my previous comment. There actually are two calls to that function, and the unnecessary call was the other one. So, it does not look like MDEV-19514 caused any correctness regression. In ibuf_merge_or_delete_for_page() we do an early return if the page number belongs to the temporary tablespace. But, we could optimize that check so that the function call would be avoided for temporary tables.

Marko Mäkelä added a comment - 2021-05-27 14:26

data.tar.bz2 is a copy of the data directory from the server, for which I have rr replay traces before and after the crash. To reduce the file size so that it would be acceptable for this system, I removed some unrelated files and started the server with innodb_force_recovery=3 and innodb_log_file_size=1m so that the 98-megabyte redo log file was replaced with a 1-megabyte one, without running rollback or purge (or thanks to ~~MDEV-19514~~, change buffer merge). The problem is still repeatable with that data directory.

I added a little instrumentation on my local system:

diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc

index 178f94684f2..df8f0ba568c 100644

--- a/storage/innobase/ibuf/ibuf0ibuf.cc

+++ b/storage/innobase/ibuf/ibuf0ibuf.cc

@@ -4224,6 +4224,8 @@ void ibuf_merge_or_delete_for_page(buf_block_t *block, const page_id_t page_id,

 		page_id.space(), page_id.page_no(), heap);

 	if (block != NULL) {

+		if (mach_read_from_8(PAGE_HEADER + PAGE_INDEX_ID + block->frame) == 0x1d)

+			ib::info() << "applying to page " << page_id;

 		/* Move the ownership of the x-latch on the page to this OS

 		thread, so that we can acquire a second x-latch on it. This

 		is needed for the insert operations to the index page to pass

This will produce me some page numbers of the corrupted index:

mariadbd --innodb-page-size=4k --datadir /dev/shm/data --innodb-change-buffer-dump=1 --tc-heuristic-recover=1 >|/dev/shm/dump 2>&1

10.6 f078788e293c0eed416e1bff0a4a3437d4085abb with patch
2021-05-27 17:02:46 0 [Note] InnoDB: applying to page [page id: space=5, page number=1792]
2021-05-27 17:02:46 0 [Note] InnoDB: Rolled back recovered transaction 6894
2021-05-27 17:02:46 0 [Note] InnoDB: Rolled back recovered transaction 6758
2021-05-27 17:02:46 0 [Note] InnoDB: applying to page [page id: space=5, page number=1822]
…
2021-05-27 17:02:55 0 [ERROR] InnoDB: tried to purge non-delete-marked record in index `abdcef` of table `test`.`tab1`: tuple: …

Actually, if I disable purge by additionally specifying innodb_force_recovery=2, then the server will avoid the crash, and during CHECK TABLE test.tab1; it will display a few more messages for the change buffer merge. And indeed, the exact same error would be reported to the client:

InnoDB: Index 'abdcef' contains 10001 entries, should be 10000.

At the start of the server error log we would have the contents of the change buffer. So, it should be possible to see which of the change buffer records are for the pages of this index.

Because we have a copy of the data directory available, it is possible to modify the code so that CHECK TABLE will log each visited record in these indexes. Then we can determine which record is extraneous. Once we know what exactly to look for, I can go back to the rr replay trace of the server that was victim of SIGKILL. This approach has worked in an excellent way for many crash recovery bugs ever since we introduced https://rr-project.org to our workflow at the start of 2020.

Marko Mäkelä added a comment - 2021-05-27 14:26 data.tar.bz2 is a copy of the data directory from the server, for which I have rr replay traces before and after the crash. To reduce the file size so that it would be acceptable for this system, I removed some unrelated files and started the server with innodb_force_recovery=3 and innodb_log_file_size=1m so that the 98-megabyte redo log file was replaced with a 1-megabyte one, without running rollback or purge (or thanks to MDEV-19514 , change buffer merge). The problem is still repeatable with that data directory. I added a little instrumentation on my local system: diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc index 178f94684f2..df8f0ba568c 100644 --- a/storage/innobase/ibuf/ibuf0ibuf.cc +++ b/storage/innobase/ibuf/ibuf0ibuf.cc @@ -4224,6 +4224,8 @@ void ibuf_merge_or_delete_for_page(buf_block_t *block, const page_id_t page_id, page_id.space(), page_id.page_no(), heap); if (block != NULL) { + if (mach_read_from_8(PAGE_HEADER + PAGE_INDEX_ID + block->frame) == 0x1d) + ib::info() << "applying to page " << page_id; /* Move the ownership of the x-latch on the page to this OS thread, so that we can acquire a second x-latch on it. This is needed for the insert operations to the index page to pass This will produce me some page numbers of the corrupted index: mariadbd --innodb-page-size=4k --datadir /dev/shm/data --innodb-change-buffer-dump=1 --tc-heuristic-recover=1 >|/dev/shm/dump 2>&1 10.6 f078788e293c0eed416e1bff0a4a3437d4085abb with patch 2021-05-27 17:02:46 0 [Note] InnoDB: applying to page [page id: space=5, page number=1792] 2021-05-27 17:02:46 0 [Note] InnoDB: Rolled back recovered transaction 6894 2021-05-27 17:02:46 0 [Note] InnoDB: Rolled back recovered transaction 6758 2021-05-27 17:02:46 0 [Note] InnoDB: applying to page [page id: space=5, page number=1822] … 2021-05-27 17:02:55 0 [ERROR] InnoDB: tried to purge non-delete-marked record in index `abdcef` of table `test`.`tab1`: tuple: … Actually, if I disable purge by additionally specifying innodb_force_recovery=2 , then the server will avoid the crash, and during CHECK TABLE test.tab1; it will display a few more messages for the change buffer merge. And indeed, the exact same error would be reported to the client: InnoDB: Index 'abdcef' contains 10001 entries, should be 10000. At the start of the server error log we would have the contents of the change buffer. So, it should be possible to see which of the change buffer records are for the pages of this index. Because we have a copy of the data directory available, it is possible to modify the code so that CHECK TABLE will log each visited record in these indexes. Then we can determine which record is extraneous. Once we know what exactly to look for, I can go back to the rr replay trace of the server that was victim of SIGKILL. This approach has worked in an excellent way for many crash recovery bugs ever since we introduced https://rr-project.org to our workflow at the start of 2020.

Marko Mäkelä added a comment - 2021-05-31 14:55

Today, I concluded that ~~MDEV-25745~~ (which we reproduced with a similar test case) is completely unrelated to this failure and specific to the 10.5 and 10.6 release series. I extracted the page numbers of change buffer merge to the corrupted secondary index with the following filter of the server error log (using the above patch), and after running CHECK TABLE tab1:

sql/mariadbd --datadir /dev/shm/data --innodb-page-size=4k --tc-heuristic-recover=1 --innodb-change-buffer-dump --innodb-force-recovery=2  > /dev/shm/ibuf 2>&1

sed -ne 's/.*InnoDB: applying to page \[page id: space=5, page number=\(.*\)\]/\1/p' ibuf|sort -nu > page_numbers.txt

This produces 103 page numbers. We must convert them to hexadecimal so that we can filter the change buffer entries for this index:

perl -pe 'chop;$_=sprintf " 2: len 4; hex %08x\n",int($_)' < page_numbers.txt  > f.txt

grep -A 8 -F -f f.txt ibuf

There are 745 records for those pages. Next, we must restore the data directory and restart the server, to dump all the records that are being visited by the CHECK TABLE. This is our entire patch:

diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc

index 3c7f2de2a67..fdfba81c5a6 100644

--- a/storage/innobase/ibuf/ibuf0ibuf.cc

+++ b/storage/innobase/ibuf/ibuf0ibuf.cc

@@ -4275,6 +4275,8 @@ void ibuf_merge_or_delete_for_page(buf_block_t *block, const page_id_t page_id,

 		page_id.space(), page_id.page_no(), heap);

 	if (block != NULL) {

+		if (mach_read_from_8(PAGE_HEADER + PAGE_INDEX_ID + block->frame) == 0x1d)

+			ib::info() << "applying to page " << page_id;

 		/* Move the ownership of the x-latch on the page to this OS

 		thread, so that we can acquire a second x-latch on it. This

 		is needed for the insert operations to the index page to pass

diff --git a/storage/innobase/row/row0mysql.cc b/storage/innobase/row/row0mysql.cc

index ce0746f20c6..dc24a02d4ad 100644

--- a/storage/innobase/row/row0mysql.cc

+++ b/storage/innobase/row/row0mysql.cc

@@ -4792,6 +4792,8 @@ row_scan_index_for_mysql(

 	offsets = rec_get_offsets(rec, index, offsets_, index->n_core_fields,

 				  ULINT_UNDEFINED, &heap);

+	if (index->is_clust() || index->id == 0x1d)

+		ib::info() << rec_printer(rec, offsets).str();

 	if (prev_entry != NULL) {

 		matched_fields = 0;

diff --git a/storage/innobase/rem/rem0rec.cc b/storage/innobase/rem/rem0rec.cc

index c4886101499..edf4de839cf 100644

--- a/storage/innobase/rem/rem0rec.cc

+++ b/storage/innobase/rem/rem0rec.cc

@@ -2584,12 +2584,12 @@ rec_print(

 			o << '['

 			  << local_len

 			  << '+' << BTR_EXTERN_FIELD_REF_SIZE << ']';

-			ut_print_buf(o, data, local_len);

+			ut_print_buf_hex(o, data, local_len);

 			ut_print_buf_hex(o, data + local_len,

 					 BTR_EXTERN_FIELD_REF_SIZE);

 		} else {

 			o << '[' << len << ']';

-			ut_print_buf(o, data, len);

+			ut_print_buf_hex(o, data, len);

In the invocation, we do not need to dump change buffer anymore:

tar xjf 57891_data.tar.bz2

mariadbd --datadir /dev/shm/data --innodb-page-size=4k --tc-heuristic-recover=1 --innodb-force-recovery=2  > /dev/shm/ibuf2 2>&1

After running CHECK TABLE tab1, we can see all records in the server error log, like this:

2021-05-31 17:15:30 3 [Note] InnoDB: RECORD(info_bits=0, 9 fields): {[4](0x00000001),[6](0x000000000000),[7](0x80000000000000),[2](0x6173),[3](0x6D646B),[7](0x62696C6C696F6E),[4](0x636F7570),[8](0x7375727669766F72),[8](0x737472696B696E67)}

…

2021-05-31 17:15:35 3 [Note] InnoDB: RECORD(info_bits=0, 7 fields): {[1](0x61),[1](0x61),[1](0x6B),[6](0x617373756D65),[1](0x6C),[1](0x67),[4](0x00000421)}

We must check that the first field in the clustered index records (9 fields) correspond to the last field in each secondary index record. The easiest way should be to rewrite the clustered index records into secondary index records, and to sort the output. As a first cut, we can remove pk,DB_TRX_ID,DB_ROLL_PTR from the start and append pk to the end:

sed -ne 's/.*\(RECORD(info_bits=.*\), 9 fields): {\([^,]*\),[^,]*,[^,]*,\(.*\)}/\1, 7 fields): {\3,\2}/p' < ibuf2

The fields of the clustered index are pk,DB_TRX_ID,DB_ROLL_PTR,a,e,c,b,f,d. The fields of the secondary index are a,b,d,c,e,f,pk. The above would map the records to a,e,c,b,f,d,pk. We must permute a little more. We do not really care preserving the info_bits, because it is always 0. We are nearly running out of numbered regular expression subexpressions:

sed -ne 's/.*9 fields): {\([^,]*\),[^,]*,[^,]*,\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\)}/{\2,\5,\7,\4,\3,\6,\1}/p' < ibuf2 |sort -u > abdcef.txt

sed -ne 's/.*7 fields): //p' < ibuf2|sort|diff -u - abdcef.txt

Yes, we have to sort both outputs, because the first sort command is not using the same sort order as InnoDB would, in the second output. This identifies the extra record:

--- -	2021-05-31 17:35:33.547076221 +0300

+++ abdcef.txt	2021-05-31 17:33:37.308757738 +0300

@@ -7671,7 +7671,6 @@

 {[6](0x707265747479),[1](0x76),[2](0x7A6F),[10](0x68656C69636F70746572),[1](0x7A),[1](0x6C),[4](0x00002337)}

 {[6](0x707265747479),[11](0x6775666D766E7570777969),[5](0x75666D766E),[1](0x73),[6](0x726775666D76),[1](0x6F),[4](0x00000671)}

 {[6](0x707269736F6E),[6](0x646776766872),[12](0x706172746963756C61726C79),[1](0x6D),[1](0x68),[9](0x686973746F7269616E),[4](0x000009F2)}

-{[6](0x70726F666974),[10](0x76776D6A676166687272),[3](0x776D6A),[6](0x636C696E6963),[1](0x6A),[1](0x75),[4](0x0000130A)}

 {[6](0x70726F666974),[10](0x76776D6A676166687272),[3](0x776D6A),[6](0x636C696E6963),[9](0x73657875616C697479),[1](0x75),[4](0x0000130A)}

 {[6](0x70726F666974),[8](0x6D696E6973746572),[1](0x64),[6](0x746869727479),[7](0x74726F75626C65),[3](0x747279),[4](0x000017AC)}

 {[6](0x70726F6D7074),[3](0x656767),[6](0x6F72616E6765),[1](0x69),[9](0x6D65657272627A636C),[6](0x61637475616C),[4](0x00000873)}

The next step is to determine which page contained this extra record. In ASCII, it is as follows:
(a,b,d,c,e,f,pk)=('profit','vwmjgafhrr','wmj','clinic','j','u',4874)
In GNU Emacs, I did hexl-find-file on test/tab1.ibd and isearch-forward for the ASCII string, after shutting down the server after the CHECK TABLE failure. There are two matches:

0028b560: 1913 1006 0000 d011 09a3 7072 6f66 6974  ..........profit

0028b570: 7677 6d6a 6761 6668 7272 776d 6a63 6c69  vwmjgafhrrwmjcli

0028b580: 6e69 636a 7500 0013 0a00 0003 662b 2723  nicju.......f+'#

…

00366080: 1913 1006 0000 100f 081c 7072 6f66 6974  ..........profit

00366090: 7677 6d6a 6761 6668 7272 776d 6a63 6c69  vwmjgafhrrwmjcli

003660a0: 6e69 636a 7500 0013 0a1e 1a13 0f0a 0908  nicju...........

The page with the first match contains 1 in PAGE_LEVEL, so it is the second match for page 0x366 (870) that counts. Now that we know what to look for, let us check the innodb_change_buffer_dump output:

PHYSICAL RECORD: n_fields 11; 1-byte offsets; info bits 0

 0: len 4; hex 00000005; asc     ;;

 1: len 1; hex 00; asc  ;;

 2: len 4; hex 00000366; asc    f;;

 3: len 30; hex 00000000010f00640008010f00640008010f00640008010f00640008010f; asc        d     d     d     d    ; (total 46 bytes);

 4: len 10; hex 70726f6a656374696f6e; asc projection;;

 5: len 6; hex 6b6a6b647367; asc kjkdsg;;

 6: len 1; hex 7a; asc z;;

 7: len 9; hex 6a6b6a6b6473676569; asc jkjkdsgei;;

 8: len 5; hex 7472756c79; asc truly;;

 9: len 1; hex 68; asc h;;

 10: len 4; hex 00001659; asc    Y;;

This was the only buffered change for our page 5:0x366. We can see that the data does not match ('projection' is not 'profit'). Also the pk value 0x1659 is different. So, this failure cannot be explained by failure to merge buffered changes.

We must evaluate the rr replay trace of the server run that ended in SIGKILL, to find out which update is related to the extra record, and why the secondary index record was not removed.

Marko Mäkelä added a comment - 2021-05-31 14:55 Today, I concluded that MDEV-25745 (which we reproduced with a similar test case) is completely unrelated to this failure and specific to the 10.5 and 10.6 release series. I extracted the page numbers of change buffer merge to the corrupted secondary index with the following filter of the server error log (using the above patch), and after running CHECK TABLE tab1 : sql/mariadbd --datadir /dev/shm/data --innodb-page-size=4k --tc-heuristic-recover=1 --innodb-change-buffer-dump --innodb-force-recovery=2 > /dev/shm/ibuf 2>&1 sed -ne 's/.*InnoDB: applying to page \[page id: space=5, page number=$.*$\]/\1/p' ibuf|sort -nu > page_numbers.txt This produces 103 page numbers. We must convert them to hexadecimal so that we can filter the change buffer entries for this index: perl -pe 'chop;$_=sprintf " 2: len 4; hex %08x\n",int($_)' < page_numbers.txt > f.txt grep -A 8 -F -f f.txt ibuf There are 745 records for those pages. Next, we must restore the data directory and restart the server, to dump all the records that are being visited by the CHECK TABLE . This is our entire patch: diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc index 3c7f2de2a67..fdfba81c5a6 100644 --- a/storage/innobase/ibuf/ibuf0ibuf.cc +++ b/storage/innobase/ibuf/ibuf0ibuf.cc @@ -4275,6 +4275,8 @@ void ibuf_merge_or_delete_for_page(buf_block_t *block, const page_id_t page_id, page_id.space(), page_id.page_no(), heap); if (block != NULL) { + if (mach_read_from_8(PAGE_HEADER + PAGE_INDEX_ID + block->frame) == 0x1d) + ib::info() << "applying to page " << page_id; /* Move the ownership of the x-latch on the page to this OS thread, so that we can acquire a second x-latch on it. This is needed for the insert operations to the index page to pass diff --git a/storage/innobase/row/row0mysql.cc b/storage/innobase/row/row0mysql.cc index ce0746f20c6..dc24a02d4ad 100644 --- a/storage/innobase/row/row0mysql.cc +++ b/storage/innobase/row/row0mysql.cc @@ -4792,6 +4792,8 @@ row_scan_index_for_mysql( offsets = rec_get_offsets(rec, index, offsets_, index->n_core_fields, ULINT_UNDEFINED, &heap); + if (index->is_clust() || index->id == 0x1d) + ib::info() << rec_printer(rec, offsets).str(); if (prev_entry != NULL) { matched_fields = 0; diff --git a/storage/innobase/rem/rem0rec.cc b/storage/innobase/rem/rem0rec.cc index c4886101499..edf4de839cf 100644 --- a/storage/innobase/rem/rem0rec.cc +++ b/storage/innobase/rem/rem0rec.cc @@ -2584,12 +2584,12 @@ rec_print( o << '[' << local_len << '+' << BTR_EXTERN_FIELD_REF_SIZE << ']'; - ut_print_buf(o, data, local_len); + ut_print_buf_hex(o, data, local_len); ut_print_buf_hex(o, data + local_len, BTR_EXTERN_FIELD_REF_SIZE); } else { o << '[' << len << ']'; - ut_print_buf(o, data, len); + ut_print_buf_hex(o, data, len); } } In the invocation, we do not need to dump change buffer anymore: tar xjf 57891_data.tar.bz2 mariadbd --datadir /dev/shm/data --innodb-page-size=4k --tc-heuristic-recover=1 --innodb-force-recovery=2 > /dev/shm/ibuf2 2>&1 After running CHECK TABLE tab1 , we can see all records in the server error log, like this: 2021-05-31 17:15:30 3 [Note] InnoDB: RECORD(info_bits=0, 9 fields): {[4](0x00000001),[6](0x000000000000),[7](0x80000000000000),[2](0x6173),[3](0x6D646B),[7](0x62696C6C696F6E),[4](0x636F7570),[8](0x7375727669766F72),[8](0x737472696B696E67)} … 2021-05-31 17:15:35 3 [Note] InnoDB: RECORD(info_bits=0, 7 fields): {[1](0x61),[1](0x61),[1](0x6B),[6](0x617373756D65),[1](0x6C),[1](0x67),[4](0x00000421)} We must check that the first field in the clustered index records (9 fields) correspond to the last field in each secondary index record. The easiest way should be to rewrite the clustered index records into secondary index records, and to sort the output. As a first cut, we can remove pk,DB_TRX_ID,DB_ROLL_PTR from the start and append pk to the end: sed -ne 's/.*$RECORD(info_bits=.*$, 9 fields): {$[^,]*$,[^,]*,[^,]*,$.*$}/\1, 7 fields): {\3,\2}/p' < ibuf2 The fields of the clustered index are pk,DB_TRX_ID,DB_ROLL_PTR,a,e,c,b,f,d. The fields of the secondary index are a,b,d,c,e,f,pk. The above would map the records to a,e,c,b,f,d,pk. We must permute a little more. We do not really care preserving the info_bits , because it is always 0. We are nearly running out of numbered regular expression subexpressions: sed -ne 's/.*9 fields): {$[^,]*$,[^,]*,[^,]*,$[^,]*$,$[^,]*$,$[^,]*$,$[^,]*$,$[^,]*$,$[^,]*$}/{\2,\5,\7,\4,\3,\6,\1}/p' < ibuf2 |sort -u > abdcef.txt sed -ne 's/.*7 fields): //p' < ibuf2|sort|diff -u - abdcef.txt Yes, we have to sort both outputs, because the first sort command is not using the same sort order as InnoDB would, in the second output. This identifies the extra record: --- - 2021-05-31 17:35:33.547076221 +0300 +++ abdcef.txt 2021-05-31 17:33:37.308757738 +0300 @@ -7671,7 +7671,6 @@ {[6](0x707265747479),[1](0x76),[2](0x7A6F),[10](0x68656C69636F70746572),[1](0x7A),[1](0x6C),[4](0x00002337)} {[6](0x707265747479),[11](0x6775666D766E7570777969),[5](0x75666D766E),[1](0x73),[6](0x726775666D76),[1](0x6F),[4](0x00000671)} {[6](0x707269736F6E),[6](0x646776766872),[12](0x706172746963756C61726C79),[1](0x6D),[1](0x68),[9](0x686973746F7269616E),[4](0x000009F2)} -{[6](0x70726F666974),[10](0x76776D6A676166687272),[3](0x776D6A),[6](0x636C696E6963),[1](0x6A),[1](0x75),[4](0x0000130A)} {[6](0x70726F666974),[10](0x76776D6A676166687272),[3](0x776D6A),[6](0x636C696E6963),[9](0x73657875616C697479),[1](0x75),[4](0x0000130A)} {[6](0x70726F666974),[8](0x6D696E6973746572),[1](0x64),[6](0x746869727479),[7](0x74726F75626C65),[3](0x747279),[4](0x000017AC)} {[6](0x70726F6D7074),[3](0x656767),[6](0x6F72616E6765),[1](0x69),[9](0x6D65657272627A636C),[6](0x61637475616C),[4](0x00000873)} The next step is to determine which page contained this extra record. In ASCII, it is as follows: (a,b,d,c,e,f,pk)=('profit','vwmjgafhrr','wmj','clinic','j','u',4874) In GNU Emacs, I did hexl-find-file on test/tab1.ibd and isearch-forward for the ASCII string, after shutting down the server after the CHECK TABLE failure. There are two matches: 0028b560: 1913 1006 0000 d011 09a3 7072 6f66 6974 ..........profit 0028b570: 7677 6d6a 6761 6668 7272 776d 6a63 6c69 vwmjgafhrrwmjcli 0028b580: 6e69 636a 7500 0013 0a00 0003 662b 2723 nicju.......f+'# … 00366080: 1913 1006 0000 100f 081c 7072 6f66 6974 ..........profit 00366090: 7677 6d6a 6761 6668 7272 776d 6a63 6c69 vwmjgafhrrwmjcli 003660a0: 6e69 636a 7500 0013 0a1e 1a13 0f0a 0908 nicju........... The page with the first match contains 1 in PAGE_LEVEL , so it is the second match for page 0x366 (870) that counts. Now that we know what to look for, let us check the innodb_change_buffer_dump output: PHYSICAL RECORD: n_fields 11; 1-byte offsets; info bits 0 0: len 4; hex 00000005; asc ;; 1: len 1; hex 00; asc ;; 2: len 4; hex 00000366; asc f;; 3: len 30; hex 00000000010f00640008010f00640008010f00640008010f00640008010f; asc d d d d ; (total 46 bytes); 4: len 10; hex 70726f6a656374696f6e; asc projection;; 5: len 6; hex 6b6a6b647367; asc kjkdsg;; 6: len 1; hex 7a; asc z;; 7: len 9; hex 6a6b6a6b6473676569; asc jkjkdsgei;; 8: len 5; hex 7472756c79; asc truly;; 9: len 1; hex 68; asc h;; 10: len 4; hex 00001659; asc Y;; This was the only buffered change for our page 5:0x366. We can see that the data does not match ('projection' is not 'profit'). Also the pk value 0x1659 is different. So, this failure cannot be explained by failure to merge buffered changes. We must evaluate the rr replay trace of the server run that ended in SIGKILL, to find out which update is related to the extra record, and why the secondary index record was not removed.

Marko Mäkelä added a comment - 2021-05-31 15:23

The clustered index record with the pk value of the extraneous secondary record is as follows:

{[4](0x0000130A),[6](0x000000000EDD),[7](0x77000004F10330),[6](0x70726F666974),[9](0x73657875616C697479),[6](0x636C696E6963),[10](0x76776D6A676166687272),[1](0x75),[3](0x776D6A)}

In clear text: (pk,DB_TRX_ID,DB_ROLL_PTR,a,e,c,b,f,d)=(4874,0xedd,(update),'profit','sexuality','clinic','vwmjgafhrr','u').
In the secondary index record format, it is as follows:

{[6](0x70726F666974),[10](0x76776D6A676166687272),[3](0x776D6A),[6](0x636C696E6963),[9](0x73657875616C697479),[1](0x75),[4](0x0000130A)}

In the diff output of the previous comment, that record would be the successor of our extraneous record. The successor of the record should be at 0x36681c in the file (in the same page with the extraneous record):

00366810: 2322 1913 1006 0001 600f 00b6 7072 6f66  #"......`...prof

00366820: 6974 7677 6d6a 6761 6668 7272 776d 6a63  itvwmjgafhrrwmjc

00366830: 6c69 6e69 6373 6578 7561 6c69 7479 7500  linicsexualityu.

00366840: 0013 0a28 2421 1815 0c08 0001 680f 0540  ...($!......h..@

I think that we must find out when the page 5:0x366 (5:870) was accessed before the SIGKILL, or alternatively, when the clustered index record was modified. Because the lenghts are varying a lot, also clustered index records ought to be moving at least within the page, but maybe also between pages. The last location of the clustered index record is:

002a9280: 6478 3431 3026 2017 110a 0400 0058 1302  dx410& ......X..

002a9290: d400 0013 0a00 0000 000e dd77 0000 04f1  ...........w....

002a92a0: 0330 7072 6f66 6974 7365 7875 616c 6974  .0profitsexualit

002a92b0: 7963 6c69 6e69 6376 776d 6a67 6166 6872  yclinicvwmjgafhr

002a92c0: 7275 776d 6a42 3b34 2a22 1511 0a04 0000  ruwmjB;4*"......

It could help to trace changes of PAGE_MAX_TRX_ID in the secondary index page, and also to check what happened during the transaction 0xedd.

Marko Mäkelä added a comment - 2021-05-31 15:23 The clustered index record with the pk value of the extraneous secondary record is as follows: {[4](0x0000130A),[6](0x000000000EDD),[7](0x77000004F10330),[6](0x70726F666974),[9](0x73657875616C697479),[6](0x636C696E6963),[10](0x76776D6A676166687272),[1](0x75),[3](0x776D6A)} In clear text: (pk,DB_TRX_ID,DB_ROLL_PTR,a,e,c,b,f,d)=(4874,0xedd,(update),'profit','sexuality','clinic','vwmjgafhrr','u'). In the secondary index record format, it is as follows: {[6](0x70726F666974),[10](0x76776D6A676166687272),[3](0x776D6A),[6](0x636C696E6963),[9](0x73657875616C697479),[1](0x75),[4](0x0000130A)} In the diff output of the previous comment, that record would be the successor of our extraneous record. The successor of the record should be at 0x36681c in the file (in the same page with the extraneous record): 00366810: 2322 1913 1006 0001 600f 00b6 7072 6f66 #"......`...prof 00366820: 6974 7677 6d6a 6761 6668 7272 776d 6a63 itvwmjgafhrrwmjc 00366830: 6c69 6e69 6373 6578 7561 6c69 7479 7500 linicsexualityu. 00366840: 0013 0a28 2421 1815 0c08 0001 680f 0540 ...($!......h..@ I think that we must find out when the page 5:0x366 (5:870) was accessed before the SIGKILL, or alternatively, when the clustered index record was modified. Because the lenghts are varying a lot, also clustered index records ought to be moving at least within the page, but maybe also between pages. The last location of the clustered index record is: 002a9280: 6478 3431 3026 2017 110a 0400 0058 1302 dx410& ......X.. 002a9290: d400 0013 0a00 0000 000e dd77 0000 04f1 ...........w.... 002a92a0: 0330 7072 6f66 6974 7365 7875 616c 6974 .0profitsexualit 002a92b0: 7963 6c69 6e69 6376 776d 6a67 6166 6872 yclinicvwmjgafhr 002a92c0: 7275 776d 6a42 3b34 2a22 1511 0a04 0000 ruwmjB;4*"...... It could help to trace changes of PAGE_MAX_TRX_ID in the secondary index page, and also to check what happened during the transaction 0xedd .

Marko Mäkelä added a comment - 2021-05-31 17:25

Transaction 0xedd executed {{UPDATE `tab1` SET e = 'sexuality' WHERE pk = 4874 /* E_R Thread3 QNO 52 CON_ID 15 */ }}. Thankfully, it consisted of just that single statement. Now we only have to figure out what exactly happened to pages 5:870 and 5:681 during the transaction, or during the purge of the index record. It is suspicious that the delete-mark flag was not set in the record that starts at 0x36608a. Delete-marking the record should have been part of this transaction.
At the time of the transaction commit, we can thankfully find both pages in the buffer pool:

define find_page_105

  set $id=((unsigned long long)$arg0)<<32|$arg1

  set $chunk=buf_pool.chunks

  set $i=buf_pool.n_chunks

  set $found=0

  while ($i-- && !$found)

    set $j = $chunk->size

    set $b = $chunk->blocks

    while ($j--)

      if ($b->page.id_.m_id==$id)

        print $b

        set $found=1

        loop_break

end

      set $b++

end

    set $chunk++

end

end

(rr) find_page_105 5 0x366

$1 = (buf_block_t *) 0x7fccc6b651a0

(rr) find_page_105 5 0x2a9

$2 = (buf_block_t *) 0x7fccc6b7fd40

(rr) p/x $1.frame[0x8a-6]@6

$3 = {0x0, 0x0, 0x10, 0xf, 0x8, 0x1c}

We can see that indeed there is no delete-mark on the record that was supposed to be delete-marked. Let us check when that record was last accessed:

(rr) p $1.frame+0x8a-6

$6 = (byte *) 0x7fccc6de0084 ""

(rr) awatch *(byte*)0x7fccc6de0084@8

Hardware access (read/write) watchpoint 2: *(byte*)0x7fccc6de0084@8

(rr) when

Current event: 928166

(rr) reverse-continue

Continuing.

Thread 13 hit Hardware access (read/write) watchpoint 2: *(byte*)0x7fccc6de0084@8

Value = "\000\000\020\017\b\034pr"

(rr) bt

#0  0x000055b6ea4ce754 in page_cur_insert_rec_low (cur=cur@entry=0x7fccb432a4a8, index=index@entry=0x617000056088, rec=rec@entry=0x6190003eb315 "profitvwmjgafhrrwmjclinicsexualityu", offsets=<optimized out>,

    mtr=mtr@entry=0x7fccb432a860) at /usr/include/c++/9/bits/stl_algobase.h:198

#1  0x000055b6ea909073 in page_cur_tuple_insert (cursor=cursor@entry=0x7fccb432a4a8, tuple=tuple@entry=0x6190003e7c08, index=index@entry=0x617000056088, offsets=offsets@entry=0x7fccb432a3a0,

    heap=heap@entry=0x7fccb432a480, n_ext=<optimized out>, mtr=0x7fccb432a860) at /data/Server/10.5_old/storage/innobase/include/page0cur.ic:285

#2  0x000055b6ea910199 in btr_cur_optimistic_insert (flags=flags@entry=0, cursor=cursor@entry=0x7fccb432a4a0, offsets=offsets@entry=0x7fccb432a3a0, heap=heap@entry=0x7fccb432a480,

    entry=entry@entry=0x6190003e7c08, rec=rec@entry=0x7fccb432a3c0, big_rec=<optimized out>, n_ext=0, thr=<optimized out>, mtr=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:3562

#3  0x000055b6ea5e2b93 in row_ins_sec_index_entry_low (flags=flags@entry=0, mode=<optimized out>, mode@entry=33, index=index@entry=0x617000056088, offsets_heap=offsets_heap@entry=0x6190003eb280,

    heap=heap@entry=0x6190003ead80, entry=entry@entry=0x6190003e7c08, trx_id=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3109

#4  0x000055b6ea5efdc1 in row_ins_sec_index_entry (index=index@entry=0x617000056088, entry=entry@entry=0x6190003e7c08, thr=thr@entry=0x62100068be48, check_foreign=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3308

#5  0x000055b6ea71e375 in row_upd_sec_index_entry (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2236

#6  0x000055b6ea71e6e6 in row_upd_sec_step (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2263

#7  0x000055b6ea71f271 in row_upd (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3024

#8  0x000055b6ea71fc6b in row_upd_step (thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3139

#9  0x000055b6ea64108d in row_update_for_mysql (prebuilt=0x62100068b188) at /data/Server/10.5_old/storage/innobase/row/row0mysql.cc:1844

#10 0x000055b6ea22fec9 in ha_innobase::update_row (this=0x61d0008e44b8, old_row=<optimized out>, new_row=<optimized out>) at /data/Server/10.5_old/storage/innobase/handler/ha_innodb.cc:8443

#11 0x000055b6e980670b in handler::ha_update_row (this=0x61d0008e44b8, old_data=0x61a000331520 "\300\006profit", new_data=0x61a0003312b8 "\300\006profit") at /data/Server/10.5_old/sql/handler.cc:7207

#12 0x000055b6e92edc77 in mysql_update (thd=thd@entry=0x62b0000ee218, table_list=<optimized out>, fields=..., values=..., conds=<optimized out>, order_num=<optimized out>, order=<optimized out>,

    limit=18446744073709551615, ignore=<optimized out>, found_return=<optimized out>, updated_return=<optimized out>) at /data/Server/10.5_old/sql/sql_update.cc:1077

#13 0x000055b6e8fda762 in mysql_execute_command (thd=thd@entry=0x62b0000ee218) at /data/Server/10.5_old/sql/sql_limit.h:67

#14 0x000055b6e8f9a0d6 in mysql_parse (thd=thd@entry=0x62b0000ee218, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x7fccb432e530, is_com_multi=is_com_multi@entry=false,

    is_next_command=is_next_command@entry=false) at /data/Server/10.5_old/sql/sql_parse.cc:8099

#15 0x000055b6e8fc8c1c in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x62b0000ee218,

    packet=packet@entry=0x629000028219 "UPDATE `tab1` SET e = 'sexuality' WHERE pk = 4874 /* E_R Thread3 QNO 52 CON_ID 15 */ ", packet_length=packet_length@entry=85, is_com_multi=is_com_multi@entry=false,

    is_next_command=is_next_command@entry=false) at /data/Server/10.5_old/sql/sql_class.h:1268

(rr) p cur.rec

$7 = (rec_t *) 0x7fccc6de008a "profitvwmjgafhrrwmjclinicju"

So, we are inserting the updated record after the record that we failed to delete-mark. Let us trace further backwards. We had also updated the next-record pointer a little earlier:

#0  0x000055b6ea4be1e1 in mach_write_to_2 (b=0x7fccc6de0088 "\b\266profitvwmjgafhrrwmjclinicju", n=2076) at /data/Server/10.5_old/storage/innobase/include/mach0data.ic:61

#1  0x000055b6ea4cdfb3 in page_cur_insert_rec_low (cur=cur@entry=0x7fccb432a4a8, index=index@entry=0x617000056088, rec=rec@entry=0x6190003eb315 "profitvwmjgafhrrwmjclinicsexualityu", offsets=<optimized out>,

    mtr=mtr@entry=0x7fccb432a860) at /data/Server/10.5_old/storage/innobase/include/page0page.h:216

…

(rr) rc

…

#0  0x000055b6ea3d0756 in mach_read_from_2 (b=0x7fccc6de0088 "") at /data/Server/10.5_old/storage/innobase/include/mach0data.ic:84

#1  rec_get_next_offs (comp=0, rec=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju") at /data/Server/10.5_old/storage/innobase/include/rem0rec.ic:291

#2  page_rec_get_next_low (rec=rec@entry=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju", comp=0) at /data/Server/10.5_old/storage/innobase/include/page0page.ic:434

#3  0x000055b6ea403f1b in page_rec_get_next_const (rec=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju") at /data/Server/10.5_old/storage/innobase/include/page0page.h:276

#4  lock_rec_insert_check_and_lock (flags=flags@entry=0, rec=rec@entry=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju", block=0x7fccc6b651a0, index=index@entry=0x617000056088, thr=thr@entry=0x62100068be48,

    mtr=mtr@entry=0x7fccb432a860, inherit=<optimized out>) at /data/Server/10.5_old/storage/innobase/lock/lock0lock.cc:5129

#5  0x000055b6ea8fa376 in btr_cur_ins_lock_and_undo (flags=flags@entry=0, cursor=cursor@entry=0x7fccb432a4a0, entry=entry@entry=0x6190003e7c08, thr=thr@entry=0x62100068be48, mtr=mtr@entry=0x7fccb432a860,

    inherit=inherit@entry=0x7fccb4329bc0) at /data/Server/10.5_old/storage/innobase/include/btr0cur.ic:61

#6  0x000055b6ea9100ed in btr_cur_optimistic_insert (flags=flags@entry=0, cursor=cursor@entry=0x7fccb432a4a0, offsets=offsets@entry=0x7fccb432a3a0, heap=heap@entry=0x7fccb432a480,

    entry=entry@entry=0x6190003e7c08, rec=rec@entry=0x7fccb432a3c0, big_rec=<optimized out>, n_ext=0, thr=<optimized out>, mtr=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:3529

#7  0x000055b6ea5e2b93 in row_ins_sec_index_entry_low (flags=flags@entry=0, mode=<optimized out>, mode@entry=33, index=index@entry=0x617000056088, offsets_heap=offsets_heap@entry=0x6190003eb280,

    heap=heap@entry=0x6190003ead80, entry=entry@entry=0x6190003e7c08, trx_id=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3109

#8  0x000055b6ea5efdc1 in row_ins_sec_index_entry (index=index@entry=0x617000056088, entry=entry@entry=0x6190003e7c08, thr=thr@entry=0x62100068be48, check_foreign=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3308

#9  0x000055b6ea71e375 in row_upd_sec_index_entry (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2236

#10 0x000055b6ea71e6e6 in row_upd_sec_step (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2263

#11 0x000055b6ea71f271 in row_upd (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3024

#12 0x000055b6ea71fc6b in row_upd_step (thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3139

#13 0x000055b6ea64108d in row_update_for_mysql (prebuilt=0x62100068b188) at /data/Server/10.5_old/storage/innobase/row/row0mysql.cc:1844

#14 0x000055b6ea22fec9 in ha_innobase::update_row (this=0x61d0008e44b8, old_row=<optimized out>, new_row=<optimized out>) at /data/Server/10.5_old/storage/innobase/handler/ha_innodb.cc:8443

(rr) reverse-finish

…

#0  0x000055b6ea5cf150 in mach_read_from_2 (b=0x7fccc6de0086 "\020\017") at /data/Server/10.5_old/storage/innobase/include/mach0data.ic:84

#1  rec_get_bit_field_2 (shift=1, mask=2046, offs=4, rec=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju") at /data/Server/10.5_old/storage/innobase/include/rem0rec.ic:177

#2  rec_get_n_fields_old (rec=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju") at /data/Server/10.5_old/storage/innobase/include/rem0rec.ic:389

#3  rec_get_n_fields (index=0x617000056088, rec=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju") at /data/Server/10.5_old/storage/innobase/include/rem0rec.ic:431

#4  rec_n_fields_is_sane (index=index@entry=0x617000056088, rec=rec@entry=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju", entry=entry@entry=0x6190003e7c08)

    at /data/Server/10.5_old/storage/innobase/include/rem0rec.ic:462

#5  0x000055b6ea5e1cd0 in row_ins_sec_index_entry_low (flags=flags@entry=0, mode=mode@entry=33, index=index@entry=0x617000056088, offsets_heap=offsets_heap@entry=0x6190003eb280, heap=heap@entry=0x6190003ead80,

    entry=entry@entry=0x6190003e7c08, trx_id=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3007

#6  0x000055b6ea5efdc1 in row_ins_sec_index_entry (index=index@entry=0x617000056088, entry=entry@entry=0x6190003e7c08, thr=thr@entry=0x62100068be48, check_foreign=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3308

#7  0x000055b6ea71e375 in row_upd_sec_index_entry (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2236

(rr) reverse-finish

…

#17 rw_lock_x_lock_low (lock=0x7fccc6b518a0, pass=<optimized out>, file_name=<optimized out>, line=<optimized out>) at /data/Server/10.5_old/storage/innobase/sync/sync0rw.cc:506

#18 0x000055b6ea78ba04 in rw_lock_x_lock_func (lock=lock@entry=0x7fccc6b518a0, pass=pass@entry=0, file_name=file_name@entry=0x55b6eb78c580 "/data/Server/10.5_old/storage/innobase/include/mtr0mtr.ic",

    line=line@entry=145) at /data/Server/10.5_old/storage/innobase/sync/sync0rw.cc:661

#19 0x000055b6ea942e19 in mtr_t::x_latch_at_savepoint (this=this@entry=0x7fccb432a860, savepoint=<optimized out>, block=<optimized out>) at /data/Server/10.5_old/storage/innobase/include/mtr0mtr.ic:145

#20 0x000055b6ea937e28 in btr_cur_search_to_nth_level_func (index=index@entry=0x617000056088, level=level@entry=0, tuple=tuple@entry=0x6190003e7c08, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>,

    latch_mode@entry=545, cursor=cursor@entry=0x7fccb432a4a0, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:2236

#21 0x000055b6ea5e1765 in row_ins_sec_index_entry_low (flags=flags@entry=0, mode=mode@entry=33, index=index@entry=0x617000056088, offsets_heap=offsets_heap@entry=0x6190003eb280, heap=heap@entry=0x6190003ead80,

    entry=entry@entry=0x6190003e7c08, trx_id=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:2971

#22 0x000055b6ea5efdc1 in row_ins_sec_index_entry (index=index@entry=0x617000056088, entry=entry@entry=0x6190003e7c08, thr=thr@entry=0x62100068be48, check_foreign=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3308

#23 0x000055b6ea71e375 in row_upd_sec_index_entry (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2236

#24 0x000055b6ea71e6e6 in row_upd_sec_step (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2263

#25 0x000055b6ea71f271 in row_upd (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3024

#26 0x000055b6ea71fc6b in row_upd_step (thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3139

#27 0x000055b6ea64108d in row_update_for_mysql (prebuilt=0x62100068b188) at /data/Server/10.5_old/storage/innobase/row/row0mysql.cc:1844

#28 0x000055b6ea22fec9 in ha_innobase::update_row (this=0x61d0008e44b8, old_row=<optimized out>, new_row=<optimized out>) at /data/Server/10.5_old/storage/innobase/handler/ha_innodb.cc:8443

This was still in the "insert" part of the "update", after the record should have been delete-marked. I was not able to reverse-finish across an emulated system call, so I had to restart debugging in the forward direction from an earlier point of logical time. As expected for this change buffer test case, the page was loaded to the buffer pool rather recently. It was not in the buffer pool at the start of our transaction:

(rr) when

Current event: 916240

(rr) bt

#0  trx_sys_t::register_rw (trx=0x7fccc7108540, this=<optimized out>) at /data/Server/10.5_old/storage/innobase/include/trx0sys.h:1084

#1  trx_set_rw_mode (trx=trx@entry=0x7fccc7108540) at /data/Server/10.5_old/storage/innobase/trx/trx0trx.cc:2317

#2  0x000055b6ea40c0bd in lock_table (flags=flags@entry=0, table=0x618000053508, mode=LOCK_IX, thr=thr@entry=0x62100068b988) at /data/Server/10.5_old/storage/innobase/lock/lock0lock.cc:3766

#3  0x000055b6ea6e85c2 in row_search_mvcc (buf=buf@entry=0x61a0003312b8 "\377", mode=PAGE_CUR_GE, prebuilt=<optimized out>, match_mode=<optimized out>, direction=direction@entry=0)

    at /data/Server/10.5_old/storage/innobase/row/row0sel.cc:4608

#4  0x000055b6ea2370fe in ha_innobase::index_read (this=0x61d0008e44b8, buf=0x61a0003312b8 "\377", key_ptr=0x6210020cda00 "\n\023", key_len=4, find_flag=HA_READ_KEY_EXACT)

    at /data/Server/10.5_old/storage/innobase/handler/ha_innodb.cc:8891

At this point of time, the page 5:870 does not exist in the buffer pool. Here we insert something into the change buffer root page during the execution of this transaction, for a different index (fbcdea):

#0  buf_page_get_low (page_id={m_id = 4}, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x7fccc6b3c480, mode=mode@entry=10,

    file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:2904

#1  0x000055b6ea9b45bd in buf_page_get_gen (page_id=<optimized out>, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x7fccc6b3c480, mode=mode@entry=10,

    file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3482

#2  0x000055b6ea93403e in btr_cur_search_to_nth_level_func (index=index@entry=0x617000002e88, level=<optimized out>, tuple=tuple@entry=0x61900813d108, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>,

    latch_mode@entry=36, cursor=cursor@entry=0x7fccb43292d0, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1612

#3  0x000055b6ea3a66f0 in btr_pcur_open_low (index=0x617000002e88, level=level@entry=0, tuple=tuple@entry=0x61900813d108, mode=mode@entry=PAGE_CUR_LE, latch_mode=latch_mode@entry=36,

    cursor=cursor@entry=0x7fccb43292d0, file=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=3276, autoinc=<optimized out>, mtr=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/include/btr0pcur.ic:441

#4  0x000055b6ea3beeb1 in ibuf_insert_low (mode=mode@entry=36, op=op@entry=IBUF_OP_DELETE_MARK, no_counter=no_counter@entry=0, entry=entry@entry=0x61900813e508, entry_size=entry_size@entry=44,

    index=index@entry=0x617000054b88, page_id=<optimized out>, zip_size=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3276

#5  0x000055b6ea3c0883 in ibuf_insert (op=op@entry=IBUF_OP_DELETE_MARK, entry=entry@entry=0x61900813e508, index=index@entry=0x617000054b88, page_id=<optimized out>, zip_size=zip_size@entry=0,

    thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3603

#6  0x000055b6ea934d0a in btr_cur_search_to_nth_level_func (index=index@entry=0x617000054b88, level=<optimized out>, tuple=tuple@entry=0x61900813e508, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>,

    latch_mode@entry=4098, cursor=cursor@entry=0x7fccb432bb50, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1663

#7  0x000055b6ea6ba59c in btr_pcur_open_low (index=index@entry=0x617000054b88, level=level@entry=0, tuple=tuple@entry=0x61900813e508, mode=mode@entry=PAGE_CUR_LE, latch_mode=latch_mode@entry=4098,

    cursor=cursor@entry=0x7fccb432bb50, file=0x55b6eb6ed660 "/data/Server/10.5_old/storage/innobase/row/row0row.cc", line=1303, autoinc=<optimized out>, mtr=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/include/btr0pcur.ic:441

#8  0x000055b6ea6bc3d8 in row_search_index_entry (index=index@entry=0x617000054b88, entry=entry@entry=0x61900813e508, mode=mode@entry=4098, pcur=pcur@entry=0x7fccb432bb50, mtr=mtr@entry=0x7fccb432c380)

    at /data/Server/10.5_old/storage/innobase/row/row0row.cc:1303

#9  0x000055b6ea71c5b2 in row_upd_sec_index_entry (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2067

Finally, we are checking if our index page exists in the buffer pool:

4: /x page_id = {m_id = 0x500000366}

(rr) when

Current event: 916872

(rr) bt

#0  buf_page_get_low (page_id={m_id = 21474837350}, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=guess@entry=0x0, mode=mode@entry=11,

    file=file@entry=0x55b6eb6ed660 "/data/Server/10.5_old/storage/innobase/row/row0row.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:2904

#1  0x000055b6ea9b45bd in buf_page_get_gen (page_id=<optimized out>, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=guess@entry=0x0, mode=mode@entry=11,

    file=file@entry=0x55b6eb6ed660 "/data/Server/10.5_old/storage/innobase/row/row0row.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3482

#2  0x000055b6ea93403e in btr_cur_search_to_nth_level_func (index=index@entry=0x617000056088, level=<optimized out>, tuple=tuple@entry=0x6190003e7c08, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>,

    latch_mode@entry=4098, cursor=cursor@entry=0x7fccb432bb50, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1612

#3  0x000055b6ea6ba59c in btr_pcur_open_low (index=index@entry=0x617000056088, level=level@entry=0, tuple=tuple@entry=0x6190003e7c08, mode=mode@entry=PAGE_CUR_LE, latch_mode=latch_mode@entry=4098,

    cursor=cursor@entry=0x7fccb432bb50, file=0x55b6eb6ed660 "/data/Server/10.5_old/storage/innobase/row/row0row.cc", line=1303, autoinc=<optimized out>, mtr=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/include/btr0pcur.ic:441

#4  0x000055b6ea6bc3d8 in row_search_index_entry (index=index@entry=0x617000056088, entry=entry@entry=0x6190003e7c08, mode=mode@entry=4098, pcur=pcur@entry=0x7fccb432bb50, mtr=mtr@entry=0x7fccb432c380)

    at /data/Server/10.5_old/storage/innobase/row/row0row.cc:1303

…

(rr) finish

…

Value returned is $24 = (buf_block_t *) 0x0

We decide to buffer the delete-mark operation. During that exact time, the previous contents of the buffer pool page frame where page 5:870 will be loaded will be evicted:

Thread 15 hit Hardware watchpoint 5: -location block.page.id_.m_id

Old value = 21474836572

New value = 18446744073709551615

0x000055b6eaa09e73 in buf_page_t::set_corrupt_id (this=0x7fccc6b651a0) at /data/Server/10.5_old/storage/innobase/include/buf0types.h:130

130	  page_id_t(uint64_t id) : m_id(id) {}

(rr) thread apply 17 backtrace

Thread 17 (Thread 1020881.1039616):

…

#11 __pthread_cond_wait (cond=cond@entry=0x60b00001beb0, mutex=mutex@entry=0x60b00001be88) at pthread_cond_wait.c:638

#12 0x000055b6ea4bc736 in os_event::wait (this=0x60b00001be70) at /data/Server/10.5_old/storage/innobase/include/sync0types.h:418

#13 os_event::wait_low (this=0x60b00001be70, reset_sig_count=2130) at /data/Server/10.5_old/storage/innobase/os/os0event.cc:325

#14 0x000055b6ea4bcfe4 in os_event_wait_low (event=<optimized out>, reset_sig_count=<optimized out>) at /data/Server/10.5_old/storage/innobase/os/os0event.cc:502

#15 0x000055b6ea77d032 in sync_array_wait_event (arr=arr@entry=0x611000001440, cell=@0x7fccb43273e0: 0x7fcccfa1bd00) at /data/Server/10.5_old/storage/innobase/sync/sync0arr.cc:456

#16 0x000055b6ea78a43e in rw_lock_s_lock_spin (lock=lock@entry=0x7fccc6b3c500, pass=pass@entry=0, file_name=file_name@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=line@entry=3276) at /data/Server/10.5_old/storage/innobase/sync/sync0rw.cc:371

#17 0x000055b6ea99a720 in rw_lock_s_lock_func (line=<optimized out>, file_name=<optimized out>, pass=0, lock=0x7fccc6b3c500) at /data/Server/10.5_old/storage/innobase/include/sync0rw.ic:294

#18 buf_page_mtr_lock (block=block@entry=0x7fccc6b3c480, rw_latch=rw_latch@entry=1, mtr=mtr@entry=0x7fccb4329480, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=line@entry=3276) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:2843

#19 0x000055b6ea9b3cc0 in buf_page_get_low (page_id={m_id = 1080863910568919049}, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=<optimized out>, guess@entry=0x7fccc6b3c480, mode=mode@entry=10, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3413

#20 0x000055b6ea9b45bd in buf_page_get_gen (page_id=<optimized out>, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x7fccc6b3c480, mode=mode@entry=10, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3482

#21 0x000055b6ea93403e in btr_cur_search_to_nth_level_func (index=index@entry=0x617000002e88, level=<optimized out>, tuple=tuple@entry=0x6190003e7208, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=36, cursor=cursor@entry=0x7fccb43292d0, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1612

#22 0x000055b6ea3a66f0 in btr_pcur_open_low (index=0x617000002e88, level=level@entry=0, tuple=tuple@entry=0x6190003e7208, mode=mode@entry=PAGE_CUR_LE, latch_mode=latch_mode@entry=36, cursor=cursor@entry=0x7fccb43292d0, file=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=3276, autoinc=<optimized out>, mtr=<optimized out>) at /data/Server/10.5_old/storage/innobase/include/btr0pcur.ic:441

#23 0x000055b6ea3beeb1 in ibuf_insert_low (mode=mode@entry=36, op=op@entry=IBUF_OP_DELETE_MARK, no_counter=no_counter@entry=0, entry=entry@entry=0x6190003e7c08, entry_size=entry_size@entry=44, index=index@entry=0x617000056088, page_id=<optimized out>, zip_size=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3276

#24 0x000055b6ea3c0883 in ibuf_insert (op=op@entry=IBUF_OP_DELETE_MARK, entry=entry@entry=0x6190003e7c08, index=index@entry=0x617000056088, page_id=<optimized out>, zip_size=zip_size@entry=0, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3603

#25 0x000055b6ea934d0a in btr_cur_search_to_nth_level_func (index=index@entry=0x617000056088, level=<optimized out>, tuple=tuple@entry=0x6190003e7c08, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=4098, cursor=cursor@entry=0x7fccb432bb50, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1663

#26 0x000055b6ea6ba59c in btr_pcur_open_low (index=index@entry=0x617000056088, level=level@entry=0, tuple=tuple@entry=0x6190003e7c08, mode=mode@entry=PAGE_CUR_LE, latch_mode=latch_mode@entry=4098, cursor=cursor@entry=0x7fccb432bb50, file=0x55b6eb6ed660 "/data/Server/10.5_old/storage/innobase/row/row0row.cc", line=1303, autoinc=<optimized out>, mtr=<optimized out>) at /data/Server/10.5_old/storage/innobase/include/btr0pcur.ic:441

#27 0x000055b6ea6bc3d8 in row_search_index_entry (index=index@entry=0x617000056088, entry=entry@entry=0x6190003e7c08, mode=mode@entry=4098, pcur=pcur@entry=0x7fccb432bb50, mtr=mtr@entry=0x7fccb432c380) at /data/Server/10.5_old/storage/innobase/row/row0row.cc:1303

#28 0x000055b6ea71c5b2 in row_upd_sec_index_entry (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2067

(rr) continue

…

Thread 17 hit Breakpoint 9, buf_page_get_low (page_id={m_id = 1363}, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=10,

    file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:2904

2904		ut_ad((mtr == NULL) == (mode == BUF_EVICT_IF_IN_POOL));

(rr) bt

#0  buf_page_get_low (page_id={m_id = 1363}, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=10,

    file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:2904

#1  0x000055b6ea9b45bd in buf_page_get_gen (page_id=<optimized out>, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=10,

    file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3482

#2  0x000055b6ea93403e in btr_cur_search_to_nth_level_func (index=index@entry=0x617000002e88, level=<optimized out>, tuple=tuple@entry=0x6190003e7208, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>,

    latch_mode@entry=36, cursor=cursor@entry=0x7fccb43292d0, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1612

#3  0x000055b6ea3a66f0 in btr_pcur_open_low (index=0x617000002e88, level=level@entry=0, tuple=tuple@entry=0x6190003e7208, mode=mode@entry=PAGE_CUR_LE, latch_mode=latch_mode@entry=36,

    cursor=cursor@entry=0x7fccb43292d0, file=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=3276, autoinc=<optimized out>, mtr=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/include/btr0pcur.ic:441

#4  0x000055b6ea3beeb1 in ibuf_insert_low (mode=mode@entry=36, op=op@entry=IBUF_OP_DELETE_MARK, no_counter=no_counter@entry=0, entry=entry@entry=0x6190003e7c08, entry_size=entry_size@entry=44,

    index=index@entry=0x617000056088, page_id=<optimized out>, zip_size=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3276

#5  0x000055b6ea3c0883 in ibuf_insert (op=op@entry=IBUF_OP_DELETE_MARK, entry=entry@entry=0x6190003e7c08, index=index@entry=0x617000056088, page_id=<optimized out>, zip_size=zip_size@entry=0,

    thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3603

…

The page 5:870 is still missing from the buffer pool, but the insert into the change buffer is proceeding further to access multiple change buffer pages. The delete-mark operation is finally buffered:

#2  0x000055b6ea4ccca2 in page_cur_insert_rec_low (cur=cur@entry=0x7fccb43292d8, index=index@entry=0x617000002e88, rec=rec@entry=0x612000108559 "", offsets=<optimized out>, mtr=mtr@entry=0x7fccb4329480)

    at /data/Server/10.5_old/storage/innobase/page/page0cur.cc:1449

#3  0x000055b6ea909073 in page_cur_tuple_insert (cursor=cursor@entry=0x7fccb43292d8, tuple=tuple@entry=0x6190003e7208, index=index@entry=0x617000002e88, offsets=offsets@entry=0x7fccb4329230,

    heap=heap@entry=0x7fccb4329210, n_ext=<optimized out>, mtr=0x7fccb4329480) at /data/Server/10.5_old/storage/innobase/include/page0cur.ic:285

#4  0x000055b6ea910199 in btr_cur_optimistic_insert (flags=flags@entry=3, cursor=cursor@entry=0x7fccb43292d0, offsets=offsets@entry=0x7fccb4329230, heap=heap@entry=0x7fccb4329210,

    entry=entry@entry=0x6190003e7208, rec=rec@entry=0x7fccb4329270, big_rec=<optimized out>, n_ext=0, thr=<optimized out>, mtr=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:3562

#5  0x000055b6ea3bf9ab in ibuf_insert_low (mode=mode@entry=36, op=op@entry=IBUF_OP_DELETE_MARK, no_counter=no_counter@entry=0, entry=entry@entry=0x6190003e7c08, entry_size=entry_size@entry=44,

    index=index@entry=0x617000056088, page_id=<optimized out>, zip_size=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3403

#6  0x000055b6ea3c0883 in ibuf_insert (op=op@entry=IBUF_OP_DELETE_MARK, entry=entry@entry=0x6190003e7c08, index=index@entry=0x617000056088, page_id=<optimized out>, zip_size=zip_size@entry=0,

    thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3603

…

(rr) when

Current event: 918452

At this point, I was forced to restart rr replay, because watchpoints could no longer be set (not even after disabling or removing old watchpoints). Even after restarting, I cannot continue using watchpoints across that point. But, I was able to catch the event of loading the page, using a breakpoint:

Thread 35 hit Breakpoint 3, buf_page_read_complete (bpage=bpage@entry=0x7fccc6b651a0, node=...) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:4099

4099	{

1: /x *(char*)0x7fccc6dcfe7f@20 = {0x0 <repeats 20 times>}

(rr) p/x bpage.id_

$1 = {m_id = 0x500000366}

(rr) thr a 17 bt

Thread 17 (Thread 1020881.1039623):

#0  0x00007fccd36abe6b in ?? () from /lib/x86_64-linux-gnu/libasan.so.5

#1  0x00007fccd36987ca in __asan_region_is_poisoned () from /lib/x86_64-linux-gnu/libasan.so.5

#2  0x000055b6ea568862 in rec_offs_get_n_alloc (offsets=0x7fccb3f24300) at /data/Server/10.5_old/storage/innobase/include/rem0rec.h:613

#3  rec_offs_n_fields (offsets=0x7fccb3f24300) at /data/Server/10.5_old/storage/innobase/include/rem0rec.h:629

#4  rec_offs_nth_type (n=1, offsets=0x7fccb3f24300) at /data/Server/10.5_old/storage/innobase/include/rem0rec.h:642

#5  rec_offs_nth_default (n=1, offsets=0x7fccb3f24300) at /data/Server/10.5_old/storage/innobase/include/rem0rec.h:653

#6  cmp_rec_rec (rec1=rec1@entry=0x7fccc6d391ec "nmnorthwestdhb", rec2=rec2@entry=0x7fccc6d391bd "nliimjkemnsoblanketadvicesorry", offsets1=0x7fccb3f24300, offsets2=0x7fccb3f245e0, index=index@entry=0x6170011ef688, nulls_unequal=nulls_unequal@entry=false, matched_fields=<optimized out>) at /data/Server/10.5_old/storage/innobase/rem/rem0cmp.cc:949

#7  0x000055b6ea4fa98f in page_validate (page=<optimized out>, index=<optimized out>) at /data/Server/10.5_old/storage/innobase/page/page0page.cc:2211

#8  0x000055b6ea3bd510 in ibuf_merge_or_delete_for_page (block=block@entry=0x7fccc6b56c00, page_id=..., zip_size=zip_size@entry=0) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:4384

#9  0x000055b6ea9b43c2 in buf_page_get_low (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=<optimized out>, guess@entry=0x0, mode=mode@entry=16, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3401

#10 0x000055b6ea9b45bd in buf_page_get_gen (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=guess@entry=0x0, mode=mode@entry=16, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3482

#11 0x000055b6ea3b5657 in ibuf_read_merge_pages (space_ids=space_ids@entry=0x7fccb3f26400, page_nos=page_nos@entry=0x7fccb3f26440, n_stored=<optimized out>) at /data/Server/10.5_old/storage/innobase/include/buf0types.h:127

#12 0x000055b6ea3c006d in ibuf_insert_low (mode=mode@entry=36, op=op@entry=IBUF_OP_INSERT, no_counter=no_counter@entry=0, entry=entry@entry=0x6190003e5908, entry_size=entry_size@entry=39, index=index@entry=0x617000056788, page_id=..., zip_size=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3480

#13 0x000055b6ea3c0883 in ibuf_insert (op=op@entry=IBUF_OP_INSERT, entry=entry@entry=0x6190003e5908, index=index@entry=0x617000056788, page_id=..., zip_size=zip_size@entry=0, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3603

#14 0x000055b6ea934ebb in btr_cur_search_to_nth_level_func (index=index@entry=0x617000056788, level=level@entry=0, tuple=tuple@entry=0x6190003e5908, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=514, cursor=cursor@entry=0x7fccb3f284a0, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1650

#15 0x000055b6ea5e1765 in row_ins_sec_index_entry_low (flags=flags@entry=0, mode=mode@entry=2, index=index@entry=0x617000056788, offsets_heap=offsets_heap@entry=0x619000051e80, heap=heap@entry=0x619002954080, entry=entry@entry=0x6190003e5908, trx_id=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:2971

#16 0x000055b6ea5ef6f8 in row_ins_sec_index_entry (index=index@entry=0x617000056788, entry=entry@entry=0x6190003e5908, thr=thr@entry=0x62100060b248, check_foreign=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3294

While buffering an insert into the index acbdef (which is not the being-corrupted index abdcef), we are conducting a change buffer merge for page 5:260, because the buffer is over capacity. (Why not just refuse to buffer more?)

rr replay was again acting up, so I was forced to restart once more. I was able to catch the stack trace of the thread that completed the page read:

#0  buf_page_read_complete (bpage=bpage@entry=0x7fccc6b651a0, node=...) at /data/Server/10.5_old/storage/innobase/include/buf0buf.h:912

#1  0x000055b6eaa1d02e in buf_read_page_low (err=err@entry=0x7fccaf316850, space=<optimized out>, sync=sync@entry=true, mode=mode@entry=132, page_id=..., zip_size=zip_size@entry=0, unzip=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/buf/buf0rea.cc:360

#2  0x000055b6eaa22a1d in buf_read_page (page_id=..., zip_size=zip_size@entry=0) at /data/Server/10.5_old/storage/innobase/buf/buf0rea.cc:489

#3  0x000055b6ea9b1c0e in buf_page_get_low (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=guess@entry=0x0, mode=mode@entry=10,

    file=file@entry=0x55b6eb6b06c0 "/data/Server/10.5_old/storage/innobase/row/row0ins.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3024

#4  0x000055b6ea9b45bd in buf_page_get_gen (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=guess@entry=0x0, mode=mode@entry=10,

    file=file@entry=0x55b6eb6b06c0 "/data/Server/10.5_old/storage/innobase/row/row0ins.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3482

#5  0x000055b6ea93403e in btr_cur_search_to_nth_level_func (index=index@entry=0x617000056088, level=level@entry=0, tuple=tuple@entry=0x6190071f2b08, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>,

    latch_mode@entry=514, cursor=cursor@entry=0x7fccaf3184a0, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1612

#6  0x000055b6ea5e1765 in row_ins_sec_index_entry_low (flags=flags@entry=0, mode=mode@entry=2, index=index@entry=0x617000056088, offsets_heap=offsets_heap@entry=0x619002931f80, heap=heap@entry=0x619002935680,

    entry=entry@entry=0x6190071f2b08, trx_id=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:2971

#7  0x000055b6ea5ef6f8 in row_ins_sec_index_entry (index=index@entry=0x617000056088, entry=entry@entry=0x6190071f2b08, thr=thr@entry=0x621000a78e48, check_foreign=<optimized out>)

    at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3294

#8  0x000055b6ea71e375 in row_upd_sec_index_entry (node=node@entry=0x621000a78a38, thr=thr@entry=0x621000a78e48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2236

#9  0x000055b6ea71e6e6 in row_upd_sec_step (node=node@entry=0x621000a78a38, thr=thr@entry=0x621000a78e48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2263

#10 0x000055b6ea71f271 in row_upd (node=node@entry=0x621000a78a38, thr=thr@entry=0x621000a78e48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3024

#11 0x000055b6ea71fc6b in row_upd_step (thr=thr@entry=0x621000a78e48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3139

#12 0x000055b6ea64108d in row_update_for_mysql (prebuilt=0x621000a78188) at /data/Server/10.5_old/storage/innobase/row/row0mysql.cc:1844

#13 0x000055b6ea22fec9 in ha_innobase::update_row (this=0x61d0008642b8, old_row=<optimized out>, new_row=<optimized out>) at /data/Server/10.5_old/storage/innobase/handler/ha_innodb.cc:8443

#14 0x000055b6e980670b in handler::ha_update_row (this=0x61d0008642b8, old_data=0x61a0002b1120 "\300\006proper", new_data=0x61a0002b0eb8 "\300\006proper") at /data/Server/10.5_old/sql/handler.cc:7207

#15 0x000055b6e92edc77 in mysql_update (thd=thd@entry=0x62b0001ff218, table_list=<optimized out>, fields=..., values=..., conds=<optimized out>, order_num=<optimized out>, order=<optimized out>,

    limit=18446744073709551615, ignore=<optimized out>, found_return=<optimized out>, updated_return=<optimized out>) at /data/Server/10.5_old/sql/sql_update.cc:1077

#16 0x000055b6e8fda762 in mysql_execute_command (thd=thd@entry=0x62b0001ff218) at /data/Server/10.5_old/sql/sql_limit.h:67

#17 0x000055b6e8f9a0d6 in mysql_parse (thd=thd@entry=0x62b0001ff218, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x7fccaf31c530, is_com_multi=is_com_multi@entry=false,

    is_next_command=is_next_command@entry=false) at /data/Server/10.5_old/sql/sql_parse.cc:8099

#18 0x000055b6e8fc8c1c in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x62b0001ff218,

    packet=packet@entry=0x62900124d219 "UPDATE `tab1` SET c = 'eastern' WHERE pk = 9999 /* E_R Thread21 QNO 56 CON_ID 34 */ ", packet_length=packet_length@entry=84, is_com_multi=is_com_multi@entry=false,

    is_next_command=is_next_command@entry=false) at /data/Server/10.5_old/sql/sql_class.h:1268

The reason why the change buffer merge fails is that the change buffer record that we find is for a different page, not for our 5:0x366:

4345			/* Check if the entry is for this index page */

4346			if (ibuf_rec_get_page_no(&mtr, rec) != page_id.page_no()

4347			    || ibuf_rec_get_space(&mtr, rec) != page_id.space()) {

4349				if (block != NULL) {

4350					page_header_reset_last_insert(block, &mtr);

4351				}

4353				goto reset_bit;

(rr) p/x *rec@1000

$8 = {0x0, 0x0, 0x0, 0x5, 0x0, 0x0, 0x0, 0x3, 0x6b, …}

The change buffer page where the record was expected to be found was 1358. And indeed, in that page the closest entries to 0x366 are for 0x35c and 0x36b:

00000420: 4140 3c37 0905 0400 0058 1704 8900 0000  A@<7.....X......

00000430: 0500 0000 035c 0005 0000 010f 0064 0008  .....\.......d..

…

00000480: 0905 0400 0060 1704 ee00 0000 0500 0000  .....`..........

00000490: 036b 0000 0100 010f 0064 0008 010f 0064  .k.......d.....d

That is, the change buffer entry for the delete-marked record somehow disappeared from the change buffer tree.

I hurried to debug this trace today, because we will likely lose access to the system for at least a week, and it is possible that the data will be lost. It has been a long day, so I will not be able to debug this trace to locate the ultimate root cause. But, at least now we have a clearer idea what is going wrong, and a documented procedure for debugging this.

Marko Mäkelä added a comment - 2021-05-31 17:25 Transaction 0xedd executed {{UPDATE `tab1` SET e = 'sexuality' WHERE pk = 4874 /* E_R Thread3 QNO 52 CON_ID 15 */ }}. Thankfully, it consisted of just that single statement. Now we only have to figure out what exactly happened to pages 5:870 and 5:681 during the transaction, or during the purge of the index record. It is suspicious that the delete-mark flag was not set in the record that starts at 0x36608a. Delete-marking the record should have been part of this transaction. At the time of the transaction commit, we can thankfully find both pages in the buffer pool: define find_page_105 set $id=((unsigned long long)$arg0)<<32|$arg1 set $chunk=buf_pool.chunks set $i=buf_pool.n_chunks set $found=0 while ($i-- && !$found) set $j = $chunk->size set $b = $chunk->blocks while ($j--) if ($b->page.id_.m_id==$id) print $b set $found=1 loop_break end set $b++ end set $chunk++ end end (rr) find_page_105 5 0x366 $1 = (buf_block_t *) 0x7fccc6b651a0 (rr) find_page_105 5 0x2a9 $2 = (buf_block_t *) 0x7fccc6b7fd40 (rr) p/x $1.frame[0x8a-6]@6 $3 = {0x0, 0x0, 0x10, 0xf, 0x8, 0x1c} We can see that indeed there is no delete-mark on the record that was supposed to be delete-marked. Let us check when that record was last accessed: (rr) p $1.frame+0x8a-6 $6 = (byte *) 0x7fccc6de0084 "" (rr) awatch *(byte*)0x7fccc6de0084@8 Hardware access (read/write) watchpoint 2: *(byte*)0x7fccc6de0084@8 (rr) when Current event: 928166 (rr) reverse-continue Continuing. Thread 13 hit Hardware access (read/write) watchpoint 2: *(byte*)0x7fccc6de0084@8 Value = "\000\000\020\017\b\034pr" (rr) bt #0 0x000055b6ea4ce754 in page_cur_insert_rec_low (cur=cur@entry=0x7fccb432a4a8, index=index@entry=0x617000056088, rec=rec@entry=0x6190003eb315 "profitvwmjgafhrrwmjclinicsexualityu", offsets=<optimized out>, mtr=mtr@entry=0x7fccb432a860) at /usr/include/c++/9/bits/stl_algobase.h:198 #1 0x000055b6ea909073 in page_cur_tuple_insert (cursor=cursor@entry=0x7fccb432a4a8, tuple=tuple@entry=0x6190003e7c08, index=index@entry=0x617000056088, offsets=offsets@entry=0x7fccb432a3a0, heap=heap@entry=0x7fccb432a480, n_ext=<optimized out>, mtr=0x7fccb432a860) at /data/Server/10.5_old/storage/innobase/include/page0cur.ic:285 #2 0x000055b6ea910199 in btr_cur_optimistic_insert (flags=flags@entry=0, cursor=cursor@entry=0x7fccb432a4a0, offsets=offsets@entry=0x7fccb432a3a0, heap=heap@entry=0x7fccb432a480, entry=entry@entry=0x6190003e7c08, rec=rec@entry=0x7fccb432a3c0, big_rec=<optimized out>, n_ext=0, thr=<optimized out>, mtr=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:3562 #3 0x000055b6ea5e2b93 in row_ins_sec_index_entry_low (flags=flags@entry=0, mode=<optimized out>, mode@entry=33, index=index@entry=0x617000056088, offsets_heap=offsets_heap@entry=0x6190003eb280, heap=heap@entry=0x6190003ead80, entry=entry@entry=0x6190003e7c08, trx_id=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3109 #4 0x000055b6ea5efdc1 in row_ins_sec_index_entry (index=index@entry=0x617000056088, entry=entry@entry=0x6190003e7c08, thr=thr@entry=0x62100068be48, check_foreign=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3308 #5 0x000055b6ea71e375 in row_upd_sec_index_entry (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2236 #6 0x000055b6ea71e6e6 in row_upd_sec_step (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2263 #7 0x000055b6ea71f271 in row_upd (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3024 #8 0x000055b6ea71fc6b in row_upd_step (thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3139 #9 0x000055b6ea64108d in row_update_for_mysql (prebuilt=0x62100068b188) at /data/Server/10.5_old/storage/innobase/row/row0mysql.cc:1844 #10 0x000055b6ea22fec9 in ha_innobase::update_row (this=0x61d0008e44b8, old_row=<optimized out>, new_row=<optimized out>) at /data/Server/10.5_old/storage/innobase/handler/ha_innodb.cc:8443 #11 0x000055b6e980670b in handler::ha_update_row (this=0x61d0008e44b8, old_data=0x61a000331520 "\300\006profit", new_data=0x61a0003312b8 "\300\006profit") at /data/Server/10.5_old/sql/handler.cc:7207 #12 0x000055b6e92edc77 in mysql_update (thd=thd@entry=0x62b0000ee218, table_list=<optimized out>, fields=..., values=..., conds=<optimized out>, order_num=<optimized out>, order=<optimized out>, limit=18446744073709551615, ignore=<optimized out>, found_return=<optimized out>, updated_return=<optimized out>) at /data/Server/10.5_old/sql/sql_update.cc:1077 #13 0x000055b6e8fda762 in mysql_execute_command (thd=thd@entry=0x62b0000ee218) at /data/Server/10.5_old/sql/sql_limit.h:67 #14 0x000055b6e8f9a0d6 in mysql_parse (thd=thd@entry=0x62b0000ee218, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x7fccb432e530, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /data/Server/10.5_old/sql/sql_parse.cc:8099 #15 0x000055b6e8fc8c1c in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x62b0000ee218, packet=packet@entry=0x629000028219 "UPDATE `tab1` SET e = 'sexuality' WHERE pk = 4874 /* E_R Thread3 QNO 52 CON_ID 15 */ ", packet_length=packet_length@entry=85, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /data/Server/10.5_old/sql/sql_class.h:1268 (rr) p cur.rec $7 = (rec_t *) 0x7fccc6de008a "profitvwmjgafhrrwmjclinicju" So, we are inserting the updated record after the record that we failed to delete-mark. Let us trace further backwards. We had also updated the next-record pointer a little earlier: #0 0x000055b6ea4be1e1 in mach_write_to_2 (b=0x7fccc6de0088 "\b\266profitvwmjgafhrrwmjclinicju", n=2076) at /data/Server/10.5_old/storage/innobase/include/mach0data.ic:61 #1 0x000055b6ea4cdfb3 in page_cur_insert_rec_low (cur=cur@entry=0x7fccb432a4a8, index=index@entry=0x617000056088, rec=rec@entry=0x6190003eb315 "profitvwmjgafhrrwmjclinicsexualityu", offsets=<optimized out>, mtr=mtr@entry=0x7fccb432a860) at /data/Server/10.5_old/storage/innobase/include/page0page.h:216 … (rr) rc … #0 0x000055b6ea3d0756 in mach_read_from_2 (b=0x7fccc6de0088 "") at /data/Server/10.5_old/storage/innobase/include/mach0data.ic:84 #1 rec_get_next_offs (comp=0, rec=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju") at /data/Server/10.5_old/storage/innobase/include/rem0rec.ic:291 #2 page_rec_get_next_low (rec=rec@entry=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju", comp=0) at /data/Server/10.5_old/storage/innobase/include/page0page.ic:434 #3 0x000055b6ea403f1b in page_rec_get_next_const (rec=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju") at /data/Server/10.5_old/storage/innobase/include/page0page.h:276 #4 lock_rec_insert_check_and_lock (flags=flags@entry=0, rec=rec@entry=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju", block=0x7fccc6b651a0, index=index@entry=0x617000056088, thr=thr@entry=0x62100068be48, mtr=mtr@entry=0x7fccb432a860, inherit=<optimized out>) at /data/Server/10.5_old/storage/innobase/lock/lock0lock.cc:5129 #5 0x000055b6ea8fa376 in btr_cur_ins_lock_and_undo (flags=flags@entry=0, cursor=cursor@entry=0x7fccb432a4a0, entry=entry@entry=0x6190003e7c08, thr=thr@entry=0x62100068be48, mtr=mtr@entry=0x7fccb432a860, inherit=inherit@entry=0x7fccb4329bc0) at /data/Server/10.5_old/storage/innobase/include/btr0cur.ic:61 #6 0x000055b6ea9100ed in btr_cur_optimistic_insert (flags=flags@entry=0, cursor=cursor@entry=0x7fccb432a4a0, offsets=offsets@entry=0x7fccb432a3a0, heap=heap@entry=0x7fccb432a480, entry=entry@entry=0x6190003e7c08, rec=rec@entry=0x7fccb432a3c0, big_rec=<optimized out>, n_ext=0, thr=<optimized out>, mtr=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:3529 #7 0x000055b6ea5e2b93 in row_ins_sec_index_entry_low (flags=flags@entry=0, mode=<optimized out>, mode@entry=33, index=index@entry=0x617000056088, offsets_heap=offsets_heap@entry=0x6190003eb280, heap=heap@entry=0x6190003ead80, entry=entry@entry=0x6190003e7c08, trx_id=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3109 #8 0x000055b6ea5efdc1 in row_ins_sec_index_entry (index=index@entry=0x617000056088, entry=entry@entry=0x6190003e7c08, thr=thr@entry=0x62100068be48, check_foreign=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3308 #9 0x000055b6ea71e375 in row_upd_sec_index_entry (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2236 #10 0x000055b6ea71e6e6 in row_upd_sec_step (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2263 #11 0x000055b6ea71f271 in row_upd (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3024 #12 0x000055b6ea71fc6b in row_upd_step (thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3139 #13 0x000055b6ea64108d in row_update_for_mysql (prebuilt=0x62100068b188) at /data/Server/10.5_old/storage/innobase/row/row0mysql.cc:1844 #14 0x000055b6ea22fec9 in ha_innobase::update_row (this=0x61d0008e44b8, old_row=<optimized out>, new_row=<optimized out>) at /data/Server/10.5_old/storage/innobase/handler/ha_innodb.cc:8443 (rr) reverse-finish … #0 0x000055b6ea5cf150 in mach_read_from_2 (b=0x7fccc6de0086 "\020\017") at /data/Server/10.5_old/storage/innobase/include/mach0data.ic:84 #1 rec_get_bit_field_2 (shift=1, mask=2046, offs=4, rec=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju") at /data/Server/10.5_old/storage/innobase/include/rem0rec.ic:177 #2 rec_get_n_fields_old (rec=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju") at /data/Server/10.5_old/storage/innobase/include/rem0rec.ic:389 #3 rec_get_n_fields (index=0x617000056088, rec=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju") at /data/Server/10.5_old/storage/innobase/include/rem0rec.ic:431 #4 rec_n_fields_is_sane (index=index@entry=0x617000056088, rec=rec@entry=0x7fccc6de008a "profitvwmjgafhrrwmjclinicju", entry=entry@entry=0x6190003e7c08) at /data/Server/10.5_old/storage/innobase/include/rem0rec.ic:462 #5 0x000055b6ea5e1cd0 in row_ins_sec_index_entry_low (flags=flags@entry=0, mode=mode@entry=33, index=index@entry=0x617000056088, offsets_heap=offsets_heap@entry=0x6190003eb280, heap=heap@entry=0x6190003ead80, entry=entry@entry=0x6190003e7c08, trx_id=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3007 #6 0x000055b6ea5efdc1 in row_ins_sec_index_entry (index=index@entry=0x617000056088, entry=entry@entry=0x6190003e7c08, thr=thr@entry=0x62100068be48, check_foreign=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3308 #7 0x000055b6ea71e375 in row_upd_sec_index_entry (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2236 (rr) reverse-finish … #17 rw_lock_x_lock_low (lock=0x7fccc6b518a0, pass=<optimized out>, file_name=<optimized out>, line=<optimized out>) at /data/Server/10.5_old/storage/innobase/sync/sync0rw.cc:506 #18 0x000055b6ea78ba04 in rw_lock_x_lock_func (lock=lock@entry=0x7fccc6b518a0, pass=pass@entry=0, file_name=file_name@entry=0x55b6eb78c580 "/data/Server/10.5_old/storage/innobase/include/mtr0mtr.ic", line=line@entry=145) at /data/Server/10.5_old/storage/innobase/sync/sync0rw.cc:661 #19 0x000055b6ea942e19 in mtr_t::x_latch_at_savepoint (this=this@entry=0x7fccb432a860, savepoint=<optimized out>, block=<optimized out>) at /data/Server/10.5_old/storage/innobase/include/mtr0mtr.ic:145 #20 0x000055b6ea937e28 in btr_cur_search_to_nth_level_func (index=index@entry=0x617000056088, level=level@entry=0, tuple=tuple@entry=0x6190003e7c08, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=545, cursor=cursor@entry=0x7fccb432a4a0, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:2236 #21 0x000055b6ea5e1765 in row_ins_sec_index_entry_low (flags=flags@entry=0, mode=mode@entry=33, index=index@entry=0x617000056088, offsets_heap=offsets_heap@entry=0x6190003eb280, heap=heap@entry=0x6190003ead80, entry=entry@entry=0x6190003e7c08, trx_id=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:2971 #22 0x000055b6ea5efdc1 in row_ins_sec_index_entry (index=index@entry=0x617000056088, entry=entry@entry=0x6190003e7c08, thr=thr@entry=0x62100068be48, check_foreign=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3308 #23 0x000055b6ea71e375 in row_upd_sec_index_entry (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2236 #24 0x000055b6ea71e6e6 in row_upd_sec_step (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2263 #25 0x000055b6ea71f271 in row_upd (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3024 #26 0x000055b6ea71fc6b in row_upd_step (thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3139 #27 0x000055b6ea64108d in row_update_for_mysql (prebuilt=0x62100068b188) at /data/Server/10.5_old/storage/innobase/row/row0mysql.cc:1844 #28 0x000055b6ea22fec9 in ha_innobase::update_row (this=0x61d0008e44b8, old_row=<optimized out>, new_row=<optimized out>) at /data/Server/10.5_old/storage/innobase/handler/ha_innodb.cc:8443 This was still in the "insert" part of the "update", after the record should have been delete-marked. I was not able to reverse-finish across an emulated system call, so I had to restart debugging in the forward direction from an earlier point of logical time. As expected for this change buffer test case, the page was loaded to the buffer pool rather recently. It was not in the buffer pool at the start of our transaction: (rr) when Current event: 916240 (rr) bt #0 trx_sys_t::register_rw (trx=0x7fccc7108540, this=<optimized out>) at /data/Server/10.5_old/storage/innobase/include/trx0sys.h:1084 #1 trx_set_rw_mode (trx=trx@entry=0x7fccc7108540) at /data/Server/10.5_old/storage/innobase/trx/trx0trx.cc:2317 #2 0x000055b6ea40c0bd in lock_table (flags=flags@entry=0, table=0x618000053508, mode=LOCK_IX, thr=thr@entry=0x62100068b988) at /data/Server/10.5_old/storage/innobase/lock/lock0lock.cc:3766 #3 0x000055b6ea6e85c2 in row_search_mvcc (buf=buf@entry=0x61a0003312b8 "\377", mode=PAGE_CUR_GE, prebuilt=<optimized out>, match_mode=<optimized out>, direction=direction@entry=0) at /data/Server/10.5_old/storage/innobase/row/row0sel.cc:4608 #4 0x000055b6ea2370fe in ha_innobase::index_read (this=0x61d0008e44b8, buf=0x61a0003312b8 "\377", key_ptr=0x6210020cda00 "\n\023", key_len=4, find_flag=HA_READ_KEY_EXACT) at /data/Server/10.5_old/storage/innobase/handler/ha_innodb.cc:8891 At this point of time, the page 5:870 does not exist in the buffer pool. Here we insert something into the change buffer root page during the execution of this transaction, for a different index (fbcdea): #0 buf_page_get_low (page_id={m_id = 4}, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x7fccc6b3c480, mode=mode@entry=10, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:2904 #1 0x000055b6ea9b45bd in buf_page_get_gen (page_id=<optimized out>, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x7fccc6b3c480, mode=mode@entry=10, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3482 #2 0x000055b6ea93403e in btr_cur_search_to_nth_level_func (index=index@entry=0x617000002e88, level=<optimized out>, tuple=tuple@entry=0x61900813d108, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=36, cursor=cursor@entry=0x7fccb43292d0, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1612 #3 0x000055b6ea3a66f0 in btr_pcur_open_low (index=0x617000002e88, level=level@entry=0, tuple=tuple@entry=0x61900813d108, mode=mode@entry=PAGE_CUR_LE, latch_mode=latch_mode@entry=36, cursor=cursor@entry=0x7fccb43292d0, file=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=3276, autoinc=<optimized out>, mtr=<optimized out>) at /data/Server/10.5_old/storage/innobase/include/btr0pcur.ic:441 #4 0x000055b6ea3beeb1 in ibuf_insert_low (mode=mode@entry=36, op=op@entry=IBUF_OP_DELETE_MARK, no_counter=no_counter@entry=0, entry=entry@entry=0x61900813e508, entry_size=entry_size@entry=44, index=index@entry=0x617000054b88, page_id=<optimized out>, zip_size=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3276 #5 0x000055b6ea3c0883 in ibuf_insert (op=op@entry=IBUF_OP_DELETE_MARK, entry=entry@entry=0x61900813e508, index=index@entry=0x617000054b88, page_id=<optimized out>, zip_size=zip_size@entry=0, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3603 #6 0x000055b6ea934d0a in btr_cur_search_to_nth_level_func (index=index@entry=0x617000054b88, level=<optimized out>, tuple=tuple@entry=0x61900813e508, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=4098, cursor=cursor@entry=0x7fccb432bb50, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1663 #7 0x000055b6ea6ba59c in btr_pcur_open_low (index=index@entry=0x617000054b88, level=level@entry=0, tuple=tuple@entry=0x61900813e508, mode=mode@entry=PAGE_CUR_LE, latch_mode=latch_mode@entry=4098, cursor=cursor@entry=0x7fccb432bb50, file=0x55b6eb6ed660 "/data/Server/10.5_old/storage/innobase/row/row0row.cc", line=1303, autoinc=<optimized out>, mtr=<optimized out>) at /data/Server/10.5_old/storage/innobase/include/btr0pcur.ic:441 #8 0x000055b6ea6bc3d8 in row_search_index_entry (index=index@entry=0x617000054b88, entry=entry@entry=0x61900813e508, mode=mode@entry=4098, pcur=pcur@entry=0x7fccb432bb50, mtr=mtr@entry=0x7fccb432c380) at /data/Server/10.5_old/storage/innobase/row/row0row.cc:1303 #9 0x000055b6ea71c5b2 in row_upd_sec_index_entry (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2067 Finally, we are checking if our index page exists in the buffer pool: 4: /x page_id = {m_id = 0x500000366} (rr) when Current event: 916872 (rr) bt #0 buf_page_get_low (page_id={m_id = 21474837350}, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=guess@entry=0x0, mode=mode@entry=11, file=file@entry=0x55b6eb6ed660 "/data/Server/10.5_old/storage/innobase/row/row0row.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:2904 #1 0x000055b6ea9b45bd in buf_page_get_gen (page_id=<optimized out>, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=guess@entry=0x0, mode=mode@entry=11, file=file@entry=0x55b6eb6ed660 "/data/Server/10.5_old/storage/innobase/row/row0row.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3482 #2 0x000055b6ea93403e in btr_cur_search_to_nth_level_func (index=index@entry=0x617000056088, level=<optimized out>, tuple=tuple@entry=0x6190003e7c08, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=4098, cursor=cursor@entry=0x7fccb432bb50, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1612 #3 0x000055b6ea6ba59c in btr_pcur_open_low (index=index@entry=0x617000056088, level=level@entry=0, tuple=tuple@entry=0x6190003e7c08, mode=mode@entry=PAGE_CUR_LE, latch_mode=latch_mode@entry=4098, cursor=cursor@entry=0x7fccb432bb50, file=0x55b6eb6ed660 "/data/Server/10.5_old/storage/innobase/row/row0row.cc", line=1303, autoinc=<optimized out>, mtr=<optimized out>) at /data/Server/10.5_old/storage/innobase/include/btr0pcur.ic:441 #4 0x000055b6ea6bc3d8 in row_search_index_entry (index=index@entry=0x617000056088, entry=entry@entry=0x6190003e7c08, mode=mode@entry=4098, pcur=pcur@entry=0x7fccb432bb50, mtr=mtr@entry=0x7fccb432c380) at /data/Server/10.5_old/storage/innobase/row/row0row.cc:1303 … (rr) finish … Value returned is $24 = (buf_block_t *) 0x0 We decide to buffer the delete-mark operation. During that exact time, the previous contents of the buffer pool page frame where page 5:870 will be loaded will be evicted: Thread 15 hit Hardware watchpoint 5: -location block.page.id_.m_id Old value = 21474836572 New value = 18446744073709551615 0x000055b6eaa09e73 in buf_page_t::set_corrupt_id (this=0x7fccc6b651a0) at /data/Server/10.5_old/storage/innobase/include/buf0types.h:130 130 page_id_t(uint64_t id) : m_id(id) {} (rr) thread apply 17 backtrace Thread 17 (Thread 1020881.1039616): … #11 __pthread_cond_wait (cond=cond@entry=0x60b00001beb0, mutex=mutex@entry=0x60b00001be88) at pthread_cond_wait.c:638 #12 0x000055b6ea4bc736 in os_event::wait (this=0x60b00001be70) at /data/Server/10.5_old/storage/innobase/include/sync0types.h:418 #13 os_event::wait_low (this=0x60b00001be70, reset_sig_count=2130) at /data/Server/10.5_old/storage/innobase/os/os0event.cc:325 #14 0x000055b6ea4bcfe4 in os_event_wait_low (event=<optimized out>, reset_sig_count=<optimized out>) at /data/Server/10.5_old/storage/innobase/os/os0event.cc:502 #15 0x000055b6ea77d032 in sync_array_wait_event (arr=arr@entry=0x611000001440, cell=@0x7fccb43273e0: 0x7fcccfa1bd00) at /data/Server/10.5_old/storage/innobase/sync/sync0arr.cc:456 #16 0x000055b6ea78a43e in rw_lock_s_lock_spin (lock=lock@entry=0x7fccc6b3c500, pass=pass@entry=0, file_name=file_name@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=line@entry=3276) at /data/Server/10.5_old/storage/innobase/sync/sync0rw.cc:371 #17 0x000055b6ea99a720 in rw_lock_s_lock_func (line=<optimized out>, file_name=<optimized out>, pass=0, lock=0x7fccc6b3c500) at /data/Server/10.5_old/storage/innobase/include/sync0rw.ic:294 #18 buf_page_mtr_lock (block=block@entry=0x7fccc6b3c480, rw_latch=rw_latch@entry=1, mtr=mtr@entry=0x7fccb4329480, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=line@entry=3276) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:2843 #19 0x000055b6ea9b3cc0 in buf_page_get_low (page_id={m_id = 1080863910568919049}, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=<optimized out>, guess@entry=0x7fccc6b3c480, mode=mode@entry=10, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3413 #20 0x000055b6ea9b45bd in buf_page_get_gen (page_id=<optimized out>, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x7fccc6b3c480, mode=mode@entry=10, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3482 #21 0x000055b6ea93403e in btr_cur_search_to_nth_level_func (index=index@entry=0x617000002e88, level=<optimized out>, tuple=tuple@entry=0x6190003e7208, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=36, cursor=cursor@entry=0x7fccb43292d0, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1612 #22 0x000055b6ea3a66f0 in btr_pcur_open_low (index=0x617000002e88, level=level@entry=0, tuple=tuple@entry=0x6190003e7208, mode=mode@entry=PAGE_CUR_LE, latch_mode=latch_mode@entry=36, cursor=cursor@entry=0x7fccb43292d0, file=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=3276, autoinc=<optimized out>, mtr=<optimized out>) at /data/Server/10.5_old/storage/innobase/include/btr0pcur.ic:441 #23 0x000055b6ea3beeb1 in ibuf_insert_low (mode=mode@entry=36, op=op@entry=IBUF_OP_DELETE_MARK, no_counter=no_counter@entry=0, entry=entry@entry=0x6190003e7c08, entry_size=entry_size@entry=44, index=index@entry=0x617000056088, page_id=<optimized out>, zip_size=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3276 #24 0x000055b6ea3c0883 in ibuf_insert (op=op@entry=IBUF_OP_DELETE_MARK, entry=entry@entry=0x6190003e7c08, index=index@entry=0x617000056088, page_id=<optimized out>, zip_size=zip_size@entry=0, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3603 #25 0x000055b6ea934d0a in btr_cur_search_to_nth_level_func (index=index@entry=0x617000056088, level=<optimized out>, tuple=tuple@entry=0x6190003e7c08, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=4098, cursor=cursor@entry=0x7fccb432bb50, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1663 #26 0x000055b6ea6ba59c in btr_pcur_open_low (index=index@entry=0x617000056088, level=level@entry=0, tuple=tuple@entry=0x6190003e7c08, mode=mode@entry=PAGE_CUR_LE, latch_mode=latch_mode@entry=4098, cursor=cursor@entry=0x7fccb432bb50, file=0x55b6eb6ed660 "/data/Server/10.5_old/storage/innobase/row/row0row.cc", line=1303, autoinc=<optimized out>, mtr=<optimized out>) at /data/Server/10.5_old/storage/innobase/include/btr0pcur.ic:441 #27 0x000055b6ea6bc3d8 in row_search_index_entry (index=index@entry=0x617000056088, entry=entry@entry=0x6190003e7c08, mode=mode@entry=4098, pcur=pcur@entry=0x7fccb432bb50, mtr=mtr@entry=0x7fccb432c380) at /data/Server/10.5_old/storage/innobase/row/row0row.cc:1303 #28 0x000055b6ea71c5b2 in row_upd_sec_index_entry (node=node@entry=0x62100068ba38, thr=thr@entry=0x62100068be48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2067 (rr) continue … Thread 17 hit Breakpoint 9, buf_page_get_low (page_id={m_id = 1363}, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=10, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:2904 2904 ut_ad((mtr == NULL) == (mode == BUF_EVICT_IF_IN_POOL)); (rr) bt #0 buf_page_get_low (page_id={m_id = 1363}, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=10, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:2904 #1 0x000055b6ea9b45bd in buf_page_get_gen (page_id=<optimized out>, zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=10, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3482 #2 0x000055b6ea93403e in btr_cur_search_to_nth_level_func (index=index@entry=0x617000002e88, level=<optimized out>, tuple=tuple@entry=0x6190003e7208, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=36, cursor=cursor@entry=0x7fccb43292d0, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1612 #3 0x000055b6ea3a66f0 in btr_pcur_open_low (index=0x617000002e88, level=level@entry=0, tuple=tuple@entry=0x6190003e7208, mode=mode@entry=PAGE_CUR_LE, latch_mode=latch_mode@entry=36, cursor=cursor@entry=0x7fccb43292d0, file=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=3276, autoinc=<optimized out>, mtr=<optimized out>) at /data/Server/10.5_old/storage/innobase/include/btr0pcur.ic:441 #4 0x000055b6ea3beeb1 in ibuf_insert_low (mode=mode@entry=36, op=op@entry=IBUF_OP_DELETE_MARK, no_counter=no_counter@entry=0, entry=entry@entry=0x6190003e7c08, entry_size=entry_size@entry=44, index=index@entry=0x617000056088, page_id=<optimized out>, zip_size=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3276 #5 0x000055b6ea3c0883 in ibuf_insert (op=op@entry=IBUF_OP_DELETE_MARK, entry=entry@entry=0x6190003e7c08, index=index@entry=0x617000056088, page_id=<optimized out>, zip_size=zip_size@entry=0, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3603 … The page 5:870 is still missing from the buffer pool, but the insert into the change buffer is proceeding further to access multiple change buffer pages. The delete-mark operation is finally buffered: #2 0x000055b6ea4ccca2 in page_cur_insert_rec_low (cur=cur@entry=0x7fccb43292d8, index=index@entry=0x617000002e88, rec=rec@entry=0x612000108559 "", offsets=<optimized out>, mtr=mtr@entry=0x7fccb4329480) at /data/Server/10.5_old/storage/innobase/page/page0cur.cc:1449 #3 0x000055b6ea909073 in page_cur_tuple_insert (cursor=cursor@entry=0x7fccb43292d8, tuple=tuple@entry=0x6190003e7208, index=index@entry=0x617000002e88, offsets=offsets@entry=0x7fccb4329230, heap=heap@entry=0x7fccb4329210, n_ext=<optimized out>, mtr=0x7fccb4329480) at /data/Server/10.5_old/storage/innobase/include/page0cur.ic:285 #4 0x000055b6ea910199 in btr_cur_optimistic_insert (flags=flags@entry=3, cursor=cursor@entry=0x7fccb43292d0, offsets=offsets@entry=0x7fccb4329230, heap=heap@entry=0x7fccb4329210, entry=entry@entry=0x6190003e7208, rec=rec@entry=0x7fccb4329270, big_rec=<optimized out>, n_ext=0, thr=<optimized out>, mtr=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:3562 #5 0x000055b6ea3bf9ab in ibuf_insert_low (mode=mode@entry=36, op=op@entry=IBUF_OP_DELETE_MARK, no_counter=no_counter@entry=0, entry=entry@entry=0x6190003e7c08, entry_size=entry_size@entry=44, index=index@entry=0x617000056088, page_id=<optimized out>, zip_size=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3403 #6 0x000055b6ea3c0883 in ibuf_insert (op=op@entry=IBUF_OP_DELETE_MARK, entry=entry@entry=0x6190003e7c08, index=index@entry=0x617000056088, page_id=<optimized out>, zip_size=zip_size@entry=0, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3603 … (rr) when Current event: 918452 At this point, I was forced to restart rr replay , because watchpoints could no longer be set (not even after disabling or removing old watchpoints). Even after restarting, I cannot continue using watchpoints across that point. But, I was able to catch the event of loading the page, using a breakpoint: Thread 35 hit Breakpoint 3, buf_page_read_complete (bpage=bpage@entry=0x7fccc6b651a0, node=...) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:4099 4099 { 1: /x *(char*)0x7fccc6dcfe7f@20 = {0x0 <repeats 20 times>} (rr) p/x bpage.id_ $1 = {m_id = 0x500000366} (rr) thr a 17 bt Thread 17 (Thread 1020881.1039623): #0 0x00007fccd36abe6b in ?? () from /lib/x86_64-linux-gnu/libasan.so.5 #1 0x00007fccd36987ca in __asan_region_is_poisoned () from /lib/x86_64-linux-gnu/libasan.so.5 #2 0x000055b6ea568862 in rec_offs_get_n_alloc (offsets=0x7fccb3f24300) at /data/Server/10.5_old/storage/innobase/include/rem0rec.h:613 #3 rec_offs_n_fields (offsets=0x7fccb3f24300) at /data/Server/10.5_old/storage/innobase/include/rem0rec.h:629 #4 rec_offs_nth_type (n=1, offsets=0x7fccb3f24300) at /data/Server/10.5_old/storage/innobase/include/rem0rec.h:642 #5 rec_offs_nth_default (n=1, offsets=0x7fccb3f24300) at /data/Server/10.5_old/storage/innobase/include/rem0rec.h:653 #6 cmp_rec_rec (rec1=rec1@entry=0x7fccc6d391ec "nmnorthwestdhb", rec2=rec2@entry=0x7fccc6d391bd "nliimjkemnsoblanketadvicesorry", offsets1=0x7fccb3f24300, offsets2=0x7fccb3f245e0, index=index@entry=0x6170011ef688, nulls_unequal=nulls_unequal@entry=false, matched_fields=<optimized out>) at /data/Server/10.5_old/storage/innobase/rem/rem0cmp.cc:949 #7 0x000055b6ea4fa98f in page_validate (page=<optimized out>, index=<optimized out>) at /data/Server/10.5_old/storage/innobase/page/page0page.cc:2211 #8 0x000055b6ea3bd510 in ibuf_merge_or_delete_for_page (block=block@entry=0x7fccc6b56c00, page_id=..., zip_size=zip_size@entry=0) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:4384 #9 0x000055b6ea9b43c2 in buf_page_get_low (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=<optimized out>, guess@entry=0x0, mode=mode@entry=16, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3401 #10 0x000055b6ea9b45bd in buf_page_get_gen (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=guess@entry=0x0, mode=mode@entry=16, file=file@entry=0x55b6eb61a3e0 "/data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3482 #11 0x000055b6ea3b5657 in ibuf_read_merge_pages (space_ids=space_ids@entry=0x7fccb3f26400, page_nos=page_nos@entry=0x7fccb3f26440, n_stored=<optimized out>) at /data/Server/10.5_old/storage/innobase/include/buf0types.h:127 #12 0x000055b6ea3c006d in ibuf_insert_low (mode=mode@entry=36, op=op@entry=IBUF_OP_INSERT, no_counter=no_counter@entry=0, entry=entry@entry=0x6190003e5908, entry_size=entry_size@entry=39, index=index@entry=0x617000056788, page_id=..., zip_size=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3480 #13 0x000055b6ea3c0883 in ibuf_insert (op=op@entry=IBUF_OP_INSERT, entry=entry@entry=0x6190003e5908, index=index@entry=0x617000056788, page_id=..., zip_size=zip_size@entry=0, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/ibuf/ibuf0ibuf.cc:3603 #14 0x000055b6ea934ebb in btr_cur_search_to_nth_level_func (index=index@entry=0x617000056788, level=level@entry=0, tuple=tuple@entry=0x6190003e5908, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=514, cursor=cursor@entry=0x7fccb3f284a0, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1650 #15 0x000055b6ea5e1765 in row_ins_sec_index_entry_low (flags=flags@entry=0, mode=mode@entry=2, index=index@entry=0x617000056788, offsets_heap=offsets_heap@entry=0x619000051e80, heap=heap@entry=0x619002954080, entry=entry@entry=0x6190003e5908, trx_id=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:2971 #16 0x000055b6ea5ef6f8 in row_ins_sec_index_entry (index=index@entry=0x617000056788, entry=entry@entry=0x6190003e5908, thr=thr@entry=0x62100060b248, check_foreign=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3294 While buffering an insert into the index acbdef (which is not the being-corrupted index abdcef ), we are conducting a change buffer merge for page 5:260, because the buffer is over capacity. (Why not just refuse to buffer more?) rr replay was again acting up, so I was forced to restart once more. I was able to catch the stack trace of the thread that completed the page read: #0 buf_page_read_complete (bpage=bpage@entry=0x7fccc6b651a0, node=...) at /data/Server/10.5_old/storage/innobase/include/buf0buf.h:912 #1 0x000055b6eaa1d02e in buf_read_page_low (err=err@entry=0x7fccaf316850, space=<optimized out>, sync=sync@entry=true, mode=mode@entry=132, page_id=..., zip_size=zip_size@entry=0, unzip=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0rea.cc:360 #2 0x000055b6eaa22a1d in buf_read_page (page_id=..., zip_size=zip_size@entry=0) at /data/Server/10.5_old/storage/innobase/buf/buf0rea.cc:489 #3 0x000055b6ea9b1c0e in buf_page_get_low (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=guess@entry=0x0, mode=mode@entry=10, file=file@entry=0x55b6eb6b06c0 "/data/Server/10.5_old/storage/innobase/row/row0ins.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3024 #4 0x000055b6ea9b45bd in buf_page_get_gen (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=guess@entry=0x0, mode=mode@entry=10, file=file@entry=0x55b6eb6b06c0 "/data/Server/10.5_old/storage/innobase/row/row0ins.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.5_old/storage/innobase/buf/buf0buf.cc:3482 #5 0x000055b6ea93403e in btr_cur_search_to_nth_level_func (index=index@entry=0x617000056088, level=level@entry=0, tuple=tuple@entry=0x6190071f2b08, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=514, cursor=cursor@entry=0x7fccaf3184a0, ahi_latch=<optimized out>, file=<optimized out>, line=<optimized out>, mtr=<optimized out>, autoinc=<optimized out>) at /data/Server/10.5_old/storage/innobase/btr/btr0cur.cc:1612 #6 0x000055b6ea5e1765 in row_ins_sec_index_entry_low (flags=flags@entry=0, mode=mode@entry=2, index=index@entry=0x617000056088, offsets_heap=offsets_heap@entry=0x619002931f80, heap=heap@entry=0x619002935680, entry=entry@entry=0x6190071f2b08, trx_id=<optimized out>, thr=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:2971 #7 0x000055b6ea5ef6f8 in row_ins_sec_index_entry (index=index@entry=0x617000056088, entry=entry@entry=0x6190071f2b08, thr=thr@entry=0x621000a78e48, check_foreign=<optimized out>) at /data/Server/10.5_old/storage/innobase/row/row0ins.cc:3294 #8 0x000055b6ea71e375 in row_upd_sec_index_entry (node=node@entry=0x621000a78a38, thr=thr@entry=0x621000a78e48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2236 #9 0x000055b6ea71e6e6 in row_upd_sec_step (node=node@entry=0x621000a78a38, thr=thr@entry=0x621000a78e48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:2263 #10 0x000055b6ea71f271 in row_upd (node=node@entry=0x621000a78a38, thr=thr@entry=0x621000a78e48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3024 #11 0x000055b6ea71fc6b in row_upd_step (thr=thr@entry=0x621000a78e48) at /data/Server/10.5_old/storage/innobase/row/row0upd.cc:3139 #12 0x000055b6ea64108d in row_update_for_mysql (prebuilt=0x621000a78188) at /data/Server/10.5_old/storage/innobase/row/row0mysql.cc:1844 #13 0x000055b6ea22fec9 in ha_innobase::update_row (this=0x61d0008642b8, old_row=<optimized out>, new_row=<optimized out>) at /data/Server/10.5_old/storage/innobase/handler/ha_innodb.cc:8443 #14 0x000055b6e980670b in handler::ha_update_row (this=0x61d0008642b8, old_data=0x61a0002b1120 "\300\006proper", new_data=0x61a0002b0eb8 "\300\006proper") at /data/Server/10.5_old/sql/handler.cc:7207 #15 0x000055b6e92edc77 in mysql_update (thd=thd@entry=0x62b0001ff218, table_list=<optimized out>, fields=..., values=..., conds=<optimized out>, order_num=<optimized out>, order=<optimized out>, limit=18446744073709551615, ignore=<optimized out>, found_return=<optimized out>, updated_return=<optimized out>) at /data/Server/10.5_old/sql/sql_update.cc:1077 #16 0x000055b6e8fda762 in mysql_execute_command (thd=thd@entry=0x62b0001ff218) at /data/Server/10.5_old/sql/sql_limit.h:67 #17 0x000055b6e8f9a0d6 in mysql_parse (thd=thd@entry=0x62b0001ff218, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x7fccaf31c530, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /data/Server/10.5_old/sql/sql_parse.cc:8099 #18 0x000055b6e8fc8c1c in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x62b0001ff218, packet=packet@entry=0x62900124d219 "UPDATE `tab1` SET c = 'eastern' WHERE pk = 9999 /* E_R Thread21 QNO 56 CON_ID 34 */ ", packet_length=packet_length@entry=84, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /data/Server/10.5_old/sql/sql_class.h:1268 The reason why the change buffer merge fails is that the change buffer record that we find is for a different page, not for our 5:0x366: 4345 /* Check if the entry is for this index page */ 4346 if (ibuf_rec_get_page_no(&mtr, rec) != page_id.page_no() 4347 || ibuf_rec_get_space(&mtr, rec) != page_id.space()) { 4348 4349 if (block != NULL) { 4350 page_header_reset_last_insert(block, &mtr); 4351 } 4352 4353 goto reset_bit; (rr) p/x *rec@1000 $8 = {0x0, 0x0, 0x0, 0x5, 0x0, 0x0, 0x0, 0x3, 0x6b, …} The change buffer page where the record was expected to be found was 1358. And indeed, in that page the closest entries to 0x366 are for 0x35c and 0x36b: 00000420: 4140 3c37 0905 0400 0058 1704 8900 0000 A@<7.....X...... 00000430: 0500 0000 035c 0005 0000 010f 0064 0008 .....\.......d.. … 00000480: 0905 0400 0060 1704 ee00 0000 0500 0000 .....`.......... 00000490: 036b 0000 0100 010f 0064 0008 010f 0064 .k.......d.....d That is, the change buffer entry for the delete-marked record somehow disappeared from the change buffer tree. I hurried to debug this trace today, because we will likely lose access to the system for at least a week, and it is possible that the data will be lost. It has been a long day, so I will not be able to debug this trace to locate the ultimate root cause. But, at least now we have a clearer idea what is going wrong, and a documented procedure for debugging this.

Roel Van de Paar added a comment - 2021-05-31 22:37

Thank you very much marko.

Perhaps a dumb question below, but dumb questions tend to have a tendency to trigger not-so-dumb ideas in the smart.

"That is, the change buffer entry for the delete-marked record somehow disappeared from the change buffer tree."
> Is it possible to add various assertions to catch this and/or further/better define the root cause?

Roel Van de Paar added a comment - 2021-05-31 22:37 Thank you very much marko . Perhaps a dumb question below, but dumb questions tend to have a tendency to trigger not-so-dumb ideas in the smart. "That is, the change buffer entry for the delete-marked record somehow disappeared from the change buffer tree." > Is it possible to add various assertions to catch this and/or further/better define the root cause?

Marko Mäkelä added a comment - 2021-06-01 11:00

Roel, I was not able to find the exact root cause before losing access to the trace. I suspect that it is a race condition between two change buffer operations. I liberally add assertions to debug builds when I have a clear idea. For this case, I do not have it yet.

I checked another trace that was attached to ~~MDEV-22373~~, and in that case, the corruption will be introduced after the last server startup. I downloaded the two latest versions of the data directory, and CHECK TABLE did not report any failure. I did repeat failure by starting up the server on the final data directory. I had to convince CHECK TABLE that it should do what I mean:

diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc

index dacb1eb98b4..7ba6076b878 100644

--- a/storage/innobase/handler/ha_innodb.cc

+++ b/storage/innobase/handler/ha_innodb.cc

@@ -14787,7 +14787,7 @@ ha_innobase::check(

 	m_prebuilt->trx->op_info = "checking table";

-	if (m_prebuilt->table->corrupted) {

+	if (0 && m_prebuilt->table->corrupted) {

 		/* If some previous operation has marked the table as

 		corrupted in memory, and has not propagated such to

 		clustered index, we will do so here */

diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc

index fccd87ab416..f4f7a699400 100644

--- a/storage/innobase/ibuf/ibuf0ibuf.cc

+++ b/storage/innobase/ibuf/ibuf0ibuf.cc

@@ -4413,6 +4413,8 @@ void ibuf_merge_or_delete_for_page(buf_block_t *block, const page_id_t page_id,

 		page_id.space(), page_id.page_no(), heap);

 	if (block != NULL) {

+		if (mach_read_from_8(PAGE_HEADER + PAGE_INDEX_ID + block->frame) == 176)

+			ib::info() << "applying to page " << page_id;

 		/* Move the ownership of the x-latch on the page to this OS

 		thread, so that we can acquire a second x-latch on it. This

 		is needed for the insert operations to the index page to pass

diff --git a/storage/innobase/include/dict0mem.h b/storage/innobase/include/dict0mem.h

index 9a8675a0a1d..ec4f3d18047 100644

--- a/storage/innobase/include/dict0mem.h

+++ b/storage/innobase/include/dict0mem.h

@@ -2379,9 +2379,7 @@ inline bool dict_index_t::is_instant() const

 inline bool dict_index_t::is_corrupted() const

-	return UNIV_UNLIKELY(online_status >= ONLINE_INDEX_ABORTED

-			     || (type & DICT_CORRUPT)

-			     || (table && table->corrupted));

+	return false;

 inline void dict_index_t::clear_instant_add()

diff --git a/storage/innobase/rem/rem0rec.cc b/storage/innobase/rem/rem0rec.cc

index 581637be073..ec3d3c2714d 100644

--- a/storage/innobase/rem/rem0rec.cc

+++ b/storage/innobase/rem/rem0rec.cc

@@ -2610,12 +2610,12 @@ rec_print(

 			o << '['

 			  << local_len

 			  << '+' << BTR_EXTERN_FIELD_REF_SIZE << ']';

-			ut_print_buf(o, data, local_len);

+			ut_print_buf_hex(o, data, local_len);

 			ut_print_buf_hex(o, data + local_len,

 					 BTR_EXTERN_FIELD_REF_SIZE);

 		} else {

 			o << '[' << len << ']';

-			ut_print_buf(o, data, len);

+			ut_print_buf_hex(o, data, len);

diff --git a/storage/innobase/row/row0mysql.cc b/storage/innobase/row/row0mysql.cc

index a6c75f7f450..a878c401257 100644

--- a/storage/innobase/row/row0mysql.cc

+++ b/storage/innobase/row/row0mysql.cc

@@ -4791,6 +4791,8 @@ row_scan_index_for_mysql(

 	offsets = rec_get_offsets(rec, index, offsets_, index->n_core_fields,

 				  ULINT_UNDEFINED, &heap);

+	if (index->is_clust() || index->id == 176)

+		ib::info() << rec_printer(rec, offsets).str();

 	if (prev_entry != NULL) {

 		matched_fields = 0;

This table has fewer fields in the indexes:

CREATE TABLE `iwlinks_2` (

  `iwl_from` int(10) unsigned NOT NULL DEFAULT 0,

  `iwl_prefix` varbinary(20) NOT NULL DEFAULT '',

  `iwl_title` varbinary(255) NOT NULL DEFAULT '',

  PRIMARY KEY (`iwl_from`,`iwl_prefix`,`iwl_title`),

  KEY `iwl_prefix_title_from` (`iwl_prefix`,`iwl_title`,`iwl_from`)

) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Multiple indexes are corrupted, but I will concentrate on the first one that was reported. In the output generated by my patch, I must transform clustered index records
(iwl_from,iwl_prefix,iwl_title,DB_TRX_ID,DB_ROW_ID) into (iwl_prefix,iwl_title,iwl_from). Also in this case, we always have info_bits=0 in the output.

sed -ne 's/.*5 fields): {\([^,]*\),\([^,]*\),\([^,]*\),\[6\][^,]*,\[7\][^,]*}/{\2,\3,\1}/p' < ibuf3 |sort -u >sec.txt

sed -ne 's/.*3 fields): //p' < ibuf3|sort -u|diff -u - sec.txt|grep '^[+-]{'

This yields 108 lines that start with -, matching the message that CHECK TABLE reported:

InnoDB: Index 'iwl_prefix_title_from' contains 4895 entries, should be 4787.

Let us choose one of the shortest records:

-{[0](0x),[1](0x73),[4](0x00000004)}

-{[0](0x),[1](0x75),[4](0x00000004)}

-{[0](0x),[1](0x75),[4](0x00000007)}

-{[0](0x),[1](0x76),[4](0x00000005)}

-{[0](0x),[1](0x79),[4](0x00000006)}

The smallest counted PRIMARY KEY value starts with iwl_from>=8. Also in the data file, there is no smaller record in the clustered index. So, we might check in the rr replay trace what happened when those records were purged. In the data directory right before the startup, we have many delete-marked records in page 12 of that tablespace, starting with iwl_from=0. So, there must be some purge action present in the rr replay trace.

Marko Mäkelä added a comment - 2021-06-01 11:00 Roel , I was not able to find the exact root cause before losing access to the trace. I suspect that it is a race condition between two change buffer operations. I liberally add assertions to debug builds when I have a clear idea. For this case, I do not have it yet. I checked another trace that was attached to MDEV-22373 , and in that case, the corruption will be introduced after the last server startup. I downloaded the two latest versions of the data directory, and CHECK TABLE did not report any failure. I did repeat failure by starting up the server on the final data directory. I had to convince CHECK TABLE that it should do what I mean: diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc index dacb1eb98b4..7ba6076b878 100644 --- a/storage/innobase/handler/ha_innodb.cc +++ b/storage/innobase/handler/ha_innodb.cc @@ -14787,7 +14787,7 @@ ha_innobase::check( m_prebuilt->trx->op_info = "checking table"; - if (m_prebuilt->table->corrupted) { + if (0 && m_prebuilt->table->corrupted) { /* If some previous operation has marked the table as corrupted in memory, and has not propagated such to clustered index, we will do so here */ diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc index fccd87ab416..f4f7a699400 100644 --- a/storage/innobase/ibuf/ibuf0ibuf.cc +++ b/storage/innobase/ibuf/ibuf0ibuf.cc @@ -4413,6 +4413,8 @@ void ibuf_merge_or_delete_for_page(buf_block_t *block, const page_id_t page_id, page_id.space(), page_id.page_no(), heap); if (block != NULL) { + if (mach_read_from_8(PAGE_HEADER + PAGE_INDEX_ID + block->frame) == 176) + ib::info() << "applying to page " << page_id; /* Move the ownership of the x-latch on the page to this OS thread, so that we can acquire a second x-latch on it. This is needed for the insert operations to the index page to pass diff --git a/storage/innobase/include/dict0mem.h b/storage/innobase/include/dict0mem.h index 9a8675a0a1d..ec4f3d18047 100644 --- a/storage/innobase/include/dict0mem.h +++ b/storage/innobase/include/dict0mem.h @@ -2379,9 +2379,7 @@ inline bool dict_index_t::is_instant() const inline bool dict_index_t::is_corrupted() const { - return UNIV_UNLIKELY(online_status >= ONLINE_INDEX_ABORTED - || (type & DICT_CORRUPT) - || (table && table->corrupted)); + return false; } inline void dict_index_t::clear_instant_add() diff --git a/storage/innobase/rem/rem0rec.cc b/storage/innobase/rem/rem0rec.cc index 581637be073..ec3d3c2714d 100644 --- a/storage/innobase/rem/rem0rec.cc +++ b/storage/innobase/rem/rem0rec.cc @@ -2610,12 +2610,12 @@ rec_print( o << '[' << local_len << '+' << BTR_EXTERN_FIELD_REF_SIZE << ']'; - ut_print_buf(o, data, local_len); + ut_print_buf_hex(o, data, local_len); ut_print_buf_hex(o, data + local_len, BTR_EXTERN_FIELD_REF_SIZE); } else { o << '[' << len << ']'; - ut_print_buf(o, data, len); + ut_print_buf_hex(o, data, len); } } diff --git a/storage/innobase/row/row0mysql.cc b/storage/innobase/row/row0mysql.cc index a6c75f7f450..a878c401257 100644 --- a/storage/innobase/row/row0mysql.cc +++ b/storage/innobase/row/row0mysql.cc @@ -4791,6 +4791,8 @@ row_scan_index_for_mysql( offsets = rec_get_offsets(rec, index, offsets_, index->n_core_fields, ULINT_UNDEFINED, &heap); + if (index->is_clust() || index->id == 176) + ib::info() << rec_printer(rec, offsets).str(); if (prev_entry != NULL) { matched_fields = 0; This table has fewer fields in the indexes: CREATE TABLE `iwlinks_2` ( `iwl_from` int (10) unsigned NOT NULL DEFAULT 0, `iwl_prefix` varbinary(20) NOT NULL DEFAULT '' , `iwl_title` varbinary(255) NOT NULL DEFAULT '' , PRIMARY KEY (`iwl_from`,`iwl_prefix`,`iwl_title`), KEY `iwl_prefix_title_from` (`iwl_prefix`,`iwl_title`,`iwl_from`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; Multiple indexes are corrupted, but I will concentrate on the first one that was reported. In the output generated by my patch, I must transform clustered index records (iwl_from,iwl_prefix,iwl_title,DB_TRX_ID,DB_ROW_ID) into (iwl_prefix,iwl_title,iwl_from). Also in this case, we always have info_bits=0 in the output. sed -ne 's/.*5 fields): {$[^,]*$,$[^,]*$,$[^,]*$,\[6\][^,]*,\[7\][^,]*}/{\2,\3,\1}/p' < ibuf3 |sort -u >sec.txt sed -ne 's/.*3 fields): //p' < ibuf3|sort -u|diff -u - sec.txt|grep '^[+-]{' This yields 108 lines that start with - , matching the message that CHECK TABLE reported: InnoDB: Index 'iwl_prefix_title_from' contains 4895 entries, should be 4787. Let us choose one of the shortest records: -{[0](0x),[1](0x73),[4](0x00000004)} -{[0](0x),[1](0x75),[4](0x00000004)} -{[0](0x),[1](0x75),[4](0x00000007)} -{[0](0x),[1](0x76),[4](0x00000005)} -{[0](0x),[1](0x79),[4](0x00000006)} The smallest counted PRIMARY KEY value starts with iwl_from>=8 . Also in the data file, there is no smaller record in the clustered index. So, we might check in the rr replay trace what happened when those records were purged. In the data directory right before the startup, we have many delete-marked records in page 12 of that tablespace, starting with iwl_from=0 . So, there must be some purge action present in the rr replay trace.

Marko Mäkelä added a comment - 2021-06-01 13:23

The 10.4-based trace is difficult to debug, because it contains so many tables and so much corruption. I got another trace that is similar to the 10.5-based one that I was analyzing yesterday until we lost access to the environment. In the new trace, the index ID is 27 instead of 29, and the index name is abcdfe instead of abdcef.
The clustered index fields are: (pk,DB_TRX_ID,DB_ROLL_PTR,b,c,a,f,e,d)
The secondary index fields are: (a,b,c,d,f,e,pk).

I will adapt the commands from yesterday’s session:

sed -ne 's/.*9 fields): {\([^,]*\),[^,]*,[^,]*,\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\)}/{\4,\2,\3,\7,\5,\6,\1}/p' < ibuf2 |sort -u > abcdfe.txt

sed -ne 's/.*7 fields): //p' < ibuf2|sort|diff -u - abcdfe.txt

Again, this returns exactly one orphan record:

--- -	2021-06-01 16:09:42.307132457 +0300

+++ abcdfe.txt	2021-06-01 16:08:10.457148645 +0300

@@ -1091,7 +1091,6 @@

 {[1](0x65),[5](0x636C617373),[6](0x737476757962),[10](0x636F6D6D697373696F6E),[5](0x7476757962),[3](0x767579),[4](0x00000DA8)}

 {[1](0x65),[5](0x636C656172),[8](0x6D6276786F766E70),[1](0x76),[2](0x6276),[12](0x76786F766E706D67616E7770),[4](0x00000EF8)}

 {[1](0x65),[5](0x6372616674),[1](0x79),[1](0x72),[5](0x626C61636B),[1](0x68),[4](0x00001237)}

-{[1](0x65),[5](0x6374667278),[12](0x74667278707670706B6F7770),[1](0x61),[6](0x667278707670),[1](0x72),[4](0x00001523)}

 {[1](0x65),[5](0x6374667278),[12](0x74667278707670706B6F7770),[1](0x61),[6](0x667278707670),[8](0x73656E74656E6365),[4](0x00001523)}

 {[1](0x65),[5](0x6375727665),[5](0x6E77756261),[8](0x6F62736572766572),[3](0x777562),[10](0x756261757775636B6673),[4](0x000003EE)}

 {[1](0x65),[5](0x677265656B),[1](0x61),[2](0x6474),[4](0x70617373),[9](0x696E74656772697479),[4](0x00000054)}

Clear text: (a,b,c,d,f,e,pk)=('ctfrx','tfrxpvppkowp','a','frxpvp','r',4691).
In the above diff output, we can see that below the extra record there is another record with the same pk=4691 that only differs in the value of the column e ('r' is replaced with 'sentence').
The clustered index record with pk=4691 was:

{[4](0x00001523),[6](0x000000000FB6),[7](0x64000004CB043D),[5](0x6374667278),[12](0x74667278707670706B6F7770),[1](0x65),[6](0x667278707670),[8](0x73656E74656E6365),[1](0x61)}

Clear text: (pk,DB_TRX_ID,DB_ROLL_PTR,b,c,a,f,e,d)=(4691,0xfb6,(update),'ctfrx','tfrxpvppkowp','e','frxpvp','sentence','a'). (Note: 2 columns seem to differ.)

I will pause debugging this for now, because actually I planned to finish something else first. This trace is expected to be preserved. Note to self:

ssh pluto

rr replay /data/Results/1622543479/CR-MDEV-25783--TBR-36/dev/shm/vardir/1622543479/45/1/rr/latest-trace

Marko Mäkelä added a comment - 2021-06-01 13:23 The 10.4-based trace is difficult to debug, because it contains so many tables and so much corruption. I got another trace that is similar to the 10.5-based one that I was analyzing yesterday until we lost access to the environment. In the new trace, the index ID is 27 instead of 29, and the index name is abcdfe instead of abdcef. The clustered index fields are: (pk,DB_TRX_ID,DB_ROLL_PTR,b,c,a,f,e,d) The secondary index fields are: (a,b,c,d,f,e,pk). I will adapt the commands from yesterday’s session: sed -ne 's/.*9 fields): {$[^,]*$,[^,]*,[^,]*,$[^,]*$,$[^,]*$,$[^,]*$,$[^,]*$,$[^,]*$,$[^,]*$}/{\4,\2,\3,\7,\5,\6,\1}/p' < ibuf2 |sort -u > abcdfe.txt sed -ne 's/.*7 fields): //p' < ibuf2|sort|diff -u - abcdfe.txt Again, this returns exactly one orphan record: --- - 2021-06-01 16:09:42.307132457 +0300 +++ abcdfe.txt 2021-06-01 16:08:10.457148645 +0300 @@ -1091,7 +1091,6 @@ {[1](0x65),[5](0x636C617373),[6](0x737476757962),[10](0x636F6D6D697373696F6E),[5](0x7476757962),[3](0x767579),[4](0x00000DA8)} {[1](0x65),[5](0x636C656172),[8](0x6D6276786F766E70),[1](0x76),[2](0x6276),[12](0x76786F766E706D67616E7770),[4](0x00000EF8)} {[1](0x65),[5](0x6372616674),[1](0x79),[1](0x72),[5](0x626C61636B),[1](0x68),[4](0x00001237)} -{[1](0x65),[5](0x6374667278),[12](0x74667278707670706B6F7770),[1](0x61),[6](0x667278707670),[1](0x72),[4](0x00001523)} {[1](0x65),[5](0x6374667278),[12](0x74667278707670706B6F7770),[1](0x61),[6](0x667278707670),[8](0x73656E74656E6365),[4](0x00001523)} {[1](0x65),[5](0x6375727665),[5](0x6E77756261),[8](0x6F62736572766572),[3](0x777562),[10](0x756261757775636B6673),[4](0x000003EE)} {[1](0x65),[5](0x677265656B),[1](0x61),[2](0x6474),[4](0x70617373),[9](0x696E74656772697479),[4](0x00000054)} Clear text: (a,b,c,d,f,e,pk)=('ctfrx','tfrxpvppkowp','a','frxpvp','r',4691). In the above diff output, we can see that below the extra record there is another record with the same pk=4691 that only differs in the value of the column e ('r' is replaced with 'sentence'). The clustered index record with pk=4691 was: {[4](0x00001523),[6](0x000000000FB6),[7](0x64000004CB043D),[5](0x6374667278),[12](0x74667278707670706B6F7770),[1](0x65),[6](0x667278707670),[8](0x73656E74656E6365),[1](0x61)} Clear text: (pk,DB_TRX_ID,DB_ROLL_PTR,b,c,a,f,e,d)=(4691,0xfb6,(update),'ctfrx','tfrxpvppkowp','e','frxpvp','sentence','a'). (Note: 2 columns seem to differ.) I will pause debugging this for now, because actually I planned to finish something else first. This trace is expected to be preserved. Note to self: ssh pluto rr replay /data/Results/1622543479/CR-MDEV-25783--TBR-36/dev/shm/vardir/1622543479/45/1/rr/latest-trace

Marko Mäkelä added a comment - 2021-06-04 08:10

I made a mistake above: 0x1523 is of course 5411, while 4691 would be 0x1253. Also, I forgot to decode the first column 'e'. The correct secondary index record is:
(a,b,c,d,f,e,pk)=('e','ctfrx','tfrxpvppkowp','a','frxpvp','r',5411).
That record is found in data2.tar.bz2 (after a start-up and shutdown with innodb_force_recovery=2) in the leaf page 5:0x601 starting at offset 0x45d:

00601450: 1e1a 1913 1206 0100 00d0 0f0a 6f65 6374  ............oect

00601460: 6672 7874 6672 7870 7670 706b 6f77 7061  frxtfrxpvppkowpa

00601470: 6672 7870 7670 7200 0015 2324 2016 130b  frxpvpr...#$ ...

The clustered index record starts at offset 0x26f:

00315260: 3231 2923 2216 110a 0400 0050 1302 b000  21)#"......P....

00315270: 0015 2300 0000 000f b664 0000 04cb 043d  ..#......d.....=

00315280: 6374 6672 7874 6672 7870 7670 706b 6f77  ctfrxtfrxpvppkow

00315290: 7065 6672 7870 7670 7365 6e74 656e 6365  pefrxpvpsentence

003152a0: 612d 2726 2519 1211 0a04 0000 5813 02ec  a-'&%.......X...

We see DB_TRX_ID=0xfb6 followed by a DB_ROLL_PTR whose MSB is not set (so it must be an update). Curiously, the delete-mark flag is not set in either record. In the secondary index page, we have PAGE_MAX_TRX_ID=0x1034.

After already uploading that file, I realized that I should have run innodb_force_recovery=3 to prevent the rollback of any transactions, instead of only innodb_force_recovery=2 to prevent the purge of history. But, I checked that both page contents will be unaffected by the change. In 10.5 we do not apply the change buffer in the background, thanks to ~~MDEV-19514~~.
I also checked that the output of --innodb-change-buffer-dump for our secondary index page did not differ between those two recovery runs. The change buffer contains the following records for the page:

PHYSICAL RECORD: n_fields 11; 1-byte offsets; info bits 0

 0: len 4; hex 00000005; asc     ;;

 1: len 1; hex 00; asc  ;;

 2: len 4; hex 00000601; asc     ;;

 3: len 30; hex 00000100010f00640008010f00640008010f00640008010f00640008010f; asc        d     d     d     d    ; (total 46 bytes);

 4: len 1; hex 65; asc e;;

 5: len 11; hex 627079646e6a746b667368; asc bpydnjtkfsh;;

 6: len 8; hex 6865726974616765; asc heritage;;

 7: len 7; hex 7079646e6a746b; asc pydnjtk;;

 8: len 13; hex 70617274696369706174696f6e; asc participation;;

 9: len 7; hex 6d6f6e74686c79; asc monthly;;

 10: len 4; hex 00001190; asc     ;;

PHYSICAL RECORD: n_fields 11; 1-byte offsets; info bits 0

 0: len 4; hex 00000005; asc     ;;

 1: len 1; hex 00; asc  ;;

 2: len 4; hex 00000601; asc     ;;

 3: len 30; hex 00010000010f00640008010f00640008010f00640008010f00640008010f; asc        d     d     d     d    ; (total 46 bytes);

 4: len 1; hex 65; asc e;;

 5: len 11; hex 627079646e6a746b667368; asc bpydnjtkfsh;;

 6: len 8; hex 6865726974616765; asc heritage;;

 7: len 12; hex 72656c6174696f6e73686970; asc relationship;;

 8: len 13; hex 70617274696369706174696f6e; asc participation;;

 9: len 7; hex 6d6f6e74686c79; asc monthly;;

 10: len 4; hex 00001190; asc     ;;

PHYSICAL RECORD: n_fields 11; 1-byte offsets; info bits 0

 0: len 4; hex 00000005; asc     ;;

 1: len 1; hex 00; asc  ;;

 2: len 4; hex 00000601; asc     ;;

 3: len 30; hex 00020100010f00640008010f00640008010f00640008010f00640008010f; asc        d     d     d     d    ; (total 46 bytes);

 4: len 1; hex 65; asc e;;

 5: len 1; hex 61; asc a;;

 6: len 1; hex 7a; asc z;;

 7: len 1; hex 74; asc t;;

 8: len 9; hex 657a7468776e666a6d; asc ezthwnfjm;;

 9: len 1; hex 62; asc b;;

 10: len 4; hex 00001e99; asc     ;;

PHYSICAL RECORD: n_fields 11; 1-byte offsets; info bits 0

 0: len 4; hex 00000005; asc     ;;

 1: len 1; hex 00; asc  ;;

 2: len 4; hex 00000601; asc     ;;

 3: len 30; hex 00030000010f00640008010f00640008010f00640008010f00640008010f; asc        d     d     d     d    ; (total 46 bytes);

 4: len 1; hex 65; asc e;;

 5: len 1; hex 61; asc a;;

 6: len 1; hex 7a; asc z;;

 7: len 9; hex 6d756e69636970616c; asc municipal;;

 8: len 9; hex 657a7468776e666a6d; asc ezthwnfjm;;

 9: len 1; hex 62; asc b;;

 10: len 4; hex 00001e99; asc     ;;

PHYSICAL RECORD: n_fields 11; 1-byte offsets; info bits 0

 0: len 4; hex 00000005; asc     ;;

 1: len 1; hex 00; asc  ;;

 2: len 4; hex 00000601; asc     ;;

 3: len 30; hex 00040100010f00640008010f00640008010f00640008010f00640008010f; asc        d     d     d     d    ; (total 46 bytes);

 4: len 1; hex 65; asc e;;

 5: len 1; hex 63; asc c;;

 6: len 2; hex 7865; asc xe;;

 7: len 1; hex 62; asc b;;

 8: len 1; hex 75; asc u;;

 9: len 9; hex 736474737976766b6f; asc sdtsyvvko;;

 10: len 4; hex 00000833; asc    3;;

Field 3 contains the operation counter and type followed by some metadata. Fields 4 to 10 correspond to the secondary index fields. We can see that none of these records is our problematic record, because field 10 never is 0x1523 (the value of pk), and none of the records contain the string 'ctfrx'.

All signs indicate that unlike in the previous similar trace, this one had corrupted the secondary index already before the server had been killed. Therefore, I must analyze the trace of the run of the server that was intentionally killed.

Marko Mäkelä added a comment - 2021-06-04 08:10 I made a mistake above: 0x1523 is of course 5411, while 4691 would be 0x1253. Also, I forgot to decode the first column 'e'. The correct secondary index record is: (a,b,c,d,f,e,pk)=('e','ctfrx','tfrxpvppkowp','a','frxpvp','r',5411). That record is found in data2.tar.bz2 (after a start-up and shutdown with innodb_force_recovery=2 ) in the leaf page 5:0x601 starting at offset 0x45d: 00601450: 1e1a 1913 1206 0100 00d0 0f0a 6f65 6374 ............oect 00601460: 6672 7874 6672 7870 7670 706b 6f77 7061 frxtfrxpvppkowpa 00601470: 6672 7870 7670 7200 0015 2324 2016 130b frxpvpr...#$ ... The clustered index record starts at offset 0x26f: 00315260: 3231 2923 2216 110a 0400 0050 1302 b000 21)#"......P.... 00315270: 0015 2300 0000 000f b664 0000 04cb 043d ..#......d.....= 00315280: 6374 6672 7874 6672 7870 7670 706b 6f77 ctfrxtfrxpvppkow 00315290: 7065 6672 7870 7670 7365 6e74 656e 6365 pefrxpvpsentence 003152a0: 612d 2726 2519 1211 0a04 0000 5813 02ec a-'&%.......X... We see DB_TRX_ID=0xfb6 followed by a DB_ROLL_PTR whose MSB is not set (so it must be an update). Curiously, the delete-mark flag is not set in either record. In the secondary index page, we have PAGE_MAX_TRX_ID=0x1034 . After already uploading that file, I realized that I should have run innodb_force_recovery=3 to prevent the rollback of any transactions, instead of only innodb_force_recovery=2 to prevent the purge of history. But, I checked that both page contents will be unaffected by the change. In 10.5 we do not apply the change buffer in the background, thanks to MDEV-19514 . I also checked that the output of --innodb-change-buffer-dump for our secondary index page did not differ between those two recovery runs. The change buffer contains the following records for the page: PHYSICAL RECORD: n_fields 11; 1-byte offsets; info bits 0 0: len 4; hex 00000005; asc ;; 1: len 1; hex 00; asc ;; 2: len 4; hex 00000601; asc ;; 3: len 30; hex 00000100010f00640008010f00640008010f00640008010f00640008010f; asc d d d d ; (total 46 bytes); 4: len 1; hex 65; asc e;; 5: len 11; hex 627079646e6a746b667368; asc bpydnjtkfsh;; 6: len 8; hex 6865726974616765; asc heritage;; 7: len 7; hex 7079646e6a746b; asc pydnjtk;; 8: len 13; hex 70617274696369706174696f6e; asc participation;; 9: len 7; hex 6d6f6e74686c79; asc monthly;; 10: len 4; hex 00001190; asc ;; PHYSICAL RECORD: n_fields 11; 1-byte offsets; info bits 0 0: len 4; hex 00000005; asc ;; 1: len 1; hex 00; asc ;; 2: len 4; hex 00000601; asc ;; 3: len 30; hex 00010000010f00640008010f00640008010f00640008010f00640008010f; asc d d d d ; (total 46 bytes); 4: len 1; hex 65; asc e;; 5: len 11; hex 627079646e6a746b667368; asc bpydnjtkfsh;; 6: len 8; hex 6865726974616765; asc heritage;; 7: len 12; hex 72656c6174696f6e73686970; asc relationship;; 8: len 13; hex 70617274696369706174696f6e; asc participation;; 9: len 7; hex 6d6f6e74686c79; asc monthly;; 10: len 4; hex 00001190; asc ;; PHYSICAL RECORD: n_fields 11; 1-byte offsets; info bits 0 0: len 4; hex 00000005; asc ;; 1: len 1; hex 00; asc ;; 2: len 4; hex 00000601; asc ;; 3: len 30; hex 00020100010f00640008010f00640008010f00640008010f00640008010f; asc d d d d ; (total 46 bytes); 4: len 1; hex 65; asc e;; 5: len 1; hex 61; asc a;; 6: len 1; hex 7a; asc z;; 7: len 1; hex 74; asc t;; 8: len 9; hex 657a7468776e666a6d; asc ezthwnfjm;; 9: len 1; hex 62; asc b;; 10: len 4; hex 00001e99; asc ;; PHYSICAL RECORD: n_fields 11; 1-byte offsets; info bits 0 0: len 4; hex 00000005; asc ;; 1: len 1; hex 00; asc ;; 2: len 4; hex 00000601; asc ;; 3: len 30; hex 00030000010f00640008010f00640008010f00640008010f00640008010f; asc d d d d ; (total 46 bytes); 4: len 1; hex 65; asc e;; 5: len 1; hex 61; asc a;; 6: len 1; hex 7a; asc z;; 7: len 9; hex 6d756e69636970616c; asc municipal;; 8: len 9; hex 657a7468776e666a6d; asc ezthwnfjm;; 9: len 1; hex 62; asc b;; 10: len 4; hex 00001e99; asc ;; PHYSICAL RECORD: n_fields 11; 1-byte offsets; info bits 0 0: len 4; hex 00000005; asc ;; 1: len 1; hex 00; asc ;; 2: len 4; hex 00000601; asc ;; 3: len 30; hex 00040100010f00640008010f00640008010f00640008010f00640008010f; asc d d d d ; (total 46 bytes); 4: len 1; hex 65; asc e;; 5: len 1; hex 63; asc c;; 6: len 2; hex 7865; asc xe;; 7: len 1; hex 62; asc b;; 8: len 1; hex 75; asc u;; 9: len 9; hex 736474737976766b6f; asc sdtsyvvko;; 10: len 4; hex 00000833; asc 3;; Field 3 contains the operation counter and type followed by some metadata. Fields 4 to 10 correspond to the secondary index fields. We can see that none of these records is our problematic record, because field 10 never is 0x1523 (the value of pk ), and none of the records contain the string 'ctfrx' . All signs indicate that unlike in the previous similar trace, this one had corrupted the secondary index already before the server had been killed. Therefore, I must analyze the trace of the run of the server that was intentionally killed.

Marko Mäkelä added a comment - 2021-06-04 10:30

I checked the contents of the secondary index page each time it was read into the buffer pool. The record never changed.

Thread 4 hit Breakpoint 1, buf_page_read_complete (bpage=0x7f0812127e40,

    node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101

4101	  const page_id_t id(bpage->id());

1: /x bpage.id_.m_id = 0x500000601

(rr) when

Current event: 228935

(rr) display/x ((buf_block_t*)bpage)->frame[0x457]@36

2: /x ((buf_block_t*)bpage)->frame[0x457]@36 = {0x0, 0x0, 0xd0, 0xf, 0x4, 0x88, 0x65, 0x63, 0x74, 0x66, 0x72, 0x78, 0x74, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x70, 0x6b, 0x6f, 0x77, 0x70, 0x61, 0x66, 0x72,

  0x78, 0x70, 0x76, 0x70, 0x72, 0x0, 0x0, 0x15, 0x23}

(rr) command 1

Type commands for breakpoint(s) 1, one per line.

End with a line saying just "end".

>when

>c

>end

(rr) c

Continuing.

[Switching to Thread 1599113.1605076]

Thread 15 hit Breakpoint 1, buf_page_read_complete (bpage=0x7f08121488c0, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101

4101	  const page_id_t id(bpage->id());

1: /x bpage.id_.m_id = 0x500000601

2: /x ((buf_block_t*)bpage)->frame[0x457]@36 = {0x0, 0x0, 0xd0, 0xf, 0x4, 0x88, 0x65, 0x63, 0x74, 0x66, 0x72, 0x78, 0x74, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x70, 0x6b, 0x6f, 0x77, 0x70, 0x61, 0x66, 0x72,

  0x78, 0x70, 0x76, 0x70, 0x72, 0x0, 0x0, 0x15, 0x23}

Current event: 616159

[Switching to Thread 1599113.1608534]

Thread 28 hit Breakpoint 1, buf_page_read_complete (bpage=0x7f08121198a0, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101

4101	  const page_id_t id(bpage->id());

1: /x bpage.id_.m_id = 0x500000601

2: /x ((buf_block_t*)bpage)->frame[0x457]@36 = {0x0, 0x0, 0xd0, 0xf, 0x4, 0x88, 0x65, 0x63, 0x74, 0x66, 0x72, 0x78, 0x74, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x70, 0x6b, 0x6f, 0x77, 0x70, 0x61, 0x66, 0x72,

  0x78, 0x70, 0x76, 0x70, 0x72, 0x0, 0x0, 0x15, 0x23}

Current event: 818385

[Switching to Thread 1599113.1608599]

Thread 46 hit Breakpoint 1, buf_page_read_complete (bpage=0x7f0812116460, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101

4101	  const page_id_t id(bpage->id());

1: /x bpage.id_.m_id = 0x500000601

2: /x ((buf_block_t*)bpage)->frame[0x457]@36 = {0x0, 0x0, 0xd0, 0xf, 0x4, 0x88, 0x65, 0x63, 0x74, 0x66, 0x72, 0x78, 0x74, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x70, 0x6b, 0x6f, 0x77, 0x70, 0x61, 0x66, 0x72,

  0x78, 0x70, 0x76, 0x70, 0x72, 0x0, 0x0, 0x15, 0x23}

Current event: 1149964

[Switching to Thread 1599113.1608512]

Thread 21 hit Breakpoint 1, buf_page_read_complete (bpage=0x7f08120fa6e0, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101

4101	  const page_id_t id(bpage->id());

1: /x bpage.id_.m_id = 0x500000601

2: /x ((buf_block_t*)bpage)->frame[0x457]@36 = {0x0, 0x0, 0xd0, 0xf, 0x4, 0x88, 0x65, 0x63, 0x74, 0x66, 0x72, 0x78, 0x74, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x70, 0x6b, 0x6f, 0x77, 0x70, 0x61, 0x66, 0x72,

  0x78, 0x70, 0x76, 0x70, 0x72, 0x0, 0x0, 0x15, 0x23}

Current event: 1228265

Thread 47 received signal SIGKILL, Killed.

Here is the start of the transaction with ID 0xfb6 (which last modified the clustered index record):

Thread 17 hit Breakpoint 12, trx_set_rw_mode (trx=0x7f08126d29a8) at /data/Server/10.5/storage/innobase/trx/trx0trx.cc:2320

2320		if (trx->read_view.is_open()) {

4: /x trx->id = 0xfb6

(rr) bt

#0  trx_set_rw_mode (trx=0x7f08126d29a8) at /data/Server/10.5/storage/innobase/trx/trx0trx.cc:2320

#1  0x0000562d87c70ca0 in lock_table (flags=0, table=0x618000063508, mode=LOCK_IX, thr=0x6210002ee988) at /data/Server/10.5/storage/innobase/lock/lock0lock.cc:3766

#2  0x0000562d87ee479e in row_search_mvcc (buf=0x61a0001b12b8 "\377", mode=PAGE_CUR_GE, prebuilt=0x6210002ee188, match_mode=1, direction=0) at /data/Server/10.5/storage/innobase/row/row0sel.cc:4608

#3  0x0000562d87aec741 in ha_innobase::index_read (this=0x61d0007c42b8, buf=0x61a0001b12b8 "\377", key_ptr=0x6210032f0200 "#\025", key_len=4, find_flag=HA_READ_KEY_EXACT)

    at /data/Server/10.5/storage/innobase/handler/ha_innodb.cc:8866

#4  0x0000562d871f0c6b in handler::index_read_map (this=0x61d0007c42b8, buf=0x61a0001b12b8 "\377", key=0x6210032f0200 "#\025", keypart_map=1, find_flag=HA_READ_KEY_EXACT) at /data/Server/10.5/sql/handler.h:3818

#5  0x0000562d871cad30 in handler::ha_index_read_map (this=0x61d0007c42b8, buf=0x61a0001b12b8 "\377", key=0x6210032f0200 "#\025", keypart_map=1, find_flag=HA_READ_KEY_EXACT)

    at /data/Server/10.5/sql/handler.cc:3123

#6  0x0000562d871df416 in handler::read_range_first (this=0x61d0007c42b8, start_key=0x61d0007c43b8, end_key=0x61d0007c43d8, eq_range_arg=true, sorted=true) at /data/Server/10.5/sql/handler.cc:6200

#7  0x0000562d86ee5ef1 in handler::multi_range_read_next (this=0x61d0007c42b8, range_info=0x7f07fc233790) at /data/Server/10.5/sql/multi_range_read.cc:516

#8  0x0000562d86ee623e in Mrr_simple_index_reader::get_next (this=0x61d0007c48e0, range_info=0x7f07fc233790) at /data/Server/10.5/sql/multi_range_read.cc:553

#9  0x0000562d86eef20d in DsMrr_impl::dsmrr_next (this=0x61d0007c4790, range_info=0x7f07fc233790) at /data/Server/10.5/sql/multi_range_read.cc:1653

#10 0x0000562d87b1bc3e in ha_innobase::multi_range_read_next (this=0x61d0007c42b8, range_info=0x7f07fc233790) at /data/Server/10.5/storage/innobase/handler/ha_innodb.cc:20401

#11 0x0000562d875a8e84 in QUICK_RANGE_SELECT::get_next (this=0x6130000e50c0) at /data/Server/10.5/sql/opt_range.cc:12647

#12 0x0000562d875da15c in rr_quick (info=0x7f07fc233ba0) at /data/Server/10.5/sql/records.cc:403

#13 0x0000562d868c27c2 in READ_RECORD::read_record (this=0x7f07fc233ba0) at /data/Server/10.5/sql/records.h:80

#14 0x0000562d86d16f11 in mysql_update (thd=0x62b0001c0218, table_list=0x62b0001c73d0, fields=..., values=..., conds=0x62b0001c7f40, order_num=0, order=0x0, limit=18446744073709551615, ignore=false,

    found_return=0x7f07fc2343c0, updated_return=0x7f07fc2343e0) at /data/Server/10.5/sql/sql_update.cc:996

#15 0x0000562d86a534ff in mysql_execute_command (thd=0x62b0001c0218) at /data/Server/10.5/sql/sql_parse.cc:4468

#16 0x0000562d86a6b5b6 in mysql_parse (thd=0x62b0001c0218, rawbuf=0x62b0001c7238 "UPDATE `tab1` SET e = 'sentence' WHERE pk = 5411 /* E_R Thread17 QNO 53 CON_ID 30 */", length=84, parser_state=0x7f07fc234cb0,

    is_com_multi=false, is_next_command=false) at /data/Server/10.5/sql/sql_parse.cc:8096

…

(rr) break ibuf0ibuf.cc:3179 thread 12

Breakpoint 13 at 0x562d87c3feb8: file /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc, line 3179.

(rr) cond 13 index->id==27

(rr) c

Continuing.

Thread 17 hit Breakpoint 1, buf_page_read_complete (bpage=0x7f081214f980, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101

4101	  const page_id_t id(bpage->id());

1: /x bpage.id_.m_id = 0x500000315

Current event: 1148257

First, our transaction performed a synchronous read of the clustered index page (5:0x315). Soon thereafter, another thread completed a synchronous read of the secondary index page, because (just like in the previous trace that I analyzed) it decided that some room needs to be allocated in the change buffer, while attempting to buffer another insert (coincidentally, to the same index, and same index page 0x602):

Thread 30 hit Breakpoint 1, buf_page_read_complete (bpage=0x7f0812116460, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101

4101	  const page_id_t id(bpage->id());

1: /x bpage.id_.m_id = 0x500000601

2: /x ((buf_block_t*)bpage)->frame[0x457]@36 = {0x0, 0x0, 0xd0, 0xf, 0x4, 0x88, 0x65, 0x63, 0x74, 0x66, 0x72, 0x78, 0x74, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x70, 0x6b, 0x6f, 0x77, 0x70, 0x61, 0x66, 0x72,

  0x78, 0x70, 0x76, 0x70, 0x72, 0x0, 0x0, 0x15, 0x23}

Current event: 1149964

(rr) bt

#0  buf_page_read_complete (bpage=0x7f0812116460, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101

#1  0x0000562d8812ad23 in buf_read_page_low (err=0x7f07f82b9410, space=0x615000562f00, sync=true, mode=132, page_id=..., zip_size=0, unzip=false) at /data/Server/10.5/storage/innobase/buf/buf0rea.cc:360

#2  0x0000562d8812bb4f in buf_read_page (page_id=..., zip_size=0) at /data/Server/10.5/storage/innobase/buf/buf0rea.cc:489

#3  0x0000562d880d55e6 in buf_page_get_low (page_id=..., zip_size=0, rw_latch=2, guess=0x0, mode=16, file=0x562d891851a0 "/data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc", line=2328, mtr=0x7f07f82b9ce0,

    err=0x7f07f82b9b20, allow_ibuf_merge=true) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:3024

#4  0x0000562d880d7392 in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=2, guess=0x0, mode=16, file=0x562d891851a0 "/data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc", line=2328, mtr=0x7f07f82b9ce0,

    err=0x7f07f82b9b20, allow_ibuf_merge=true) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:3482

#5  0x0000562d87c3b7a4 in ibuf_read_merge_pages (space_ids=0x7f07f82ba590, page_nos=0x7f07f82ba5d0, n_stored=3) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:2325

#6  0x0000562d87c412b2 in ibuf_insert_low (mode=36, op=IBUF_OP_INSERT, no_counter=0, entry=0x619000a88508, entry_size=55, index=0x617000035388, page_id=..., zip_size=0, thr=0x621000801a48)

    at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3480

#7  0x0000562d87c41a50 in ibuf_insert (op=IBUF_OP_INSERT, entry=0x619000a88508, index=0x617000035388, page_id=..., zip_size=0, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3603

#8  0x0000562d880690be in btr_cur_search_to_nth_level_func (index=0x617000035388, level=0, tuple=0x619000a88508, mode=PAGE_CUR_LE, latch_mode=2, cursor=0x7f07f82bc470, ahi_latch=0x0,

    file=0x562d8925ae40 "/data/Server/10.5/storage/innobase/row/row0ins.cc", line=2974, mtr=0x7f07f82bc830, autoinc=0) at /data/Server/10.5/storage/innobase/btr/btr0cur.cc:1650

#9  0x0000562d87e13938 in row_ins_sec_index_entry_low (flags=0, mode=2, index=0x617000035388, offsets_heap=0x619000a87a80, heap=0x619000f39880, entry=0x619000a88508, trx_id=0, thr=0x621000801a48)

    at /data/Server/10.5/storage/innobase/row/row0ins.cc:2971

#10 0x0000562d87e152da in row_ins_sec_index_entry (index=0x617000035388, entry=0x619000a88508, thr=0x621000801a48, check_foreign=true) at /data/Server/10.5/storage/innobase/row/row0ins.cc:3294

#11 0x0000562d87f0b593 in row_upd_sec_index_entry (node=0x621000801638, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/row/row0upd.cc:2236

#12 0x0000562d87f0bbbe in row_upd_sec_step (node=0x621000801638, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/row/row0upd.cc:2263

#13 0x0000562d87f112d2 in row_upd (node=0x621000801638, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/row/row0upd.cc:3024

#14 0x0000562d87f11a9c in row_upd_step (thr=0x621000801a48) at /data/Server/10.5/storage/innobase/row/row0upd.cc:3139

#15 0x0000562d87e5dedb in row_update_for_mysql (prebuilt=0x621000800d88) at /data/Server/10.5/storage/innobase/row/row0mysql.cc:1844

#16 0x0000562d87aea754 in ha_innobase::update_row (this=0x61d0008246b8, old_row=0x61a000211b20 "\300\trecipient", new_row=0x61a0002118b8 "\300\vcelebration")

    at /data/Server/10.5/storage/innobase/handler/ha_innodb.cc:8445

#17 0x0000562d871e7213 in handler::ha_update_row (this=0x61d0008246b8, old_data=0x61a000211b20 "\300\trecipient", new_data=0x61a0002118b8 "\300\vcelebration") at /data/Server/10.5/sql/handler.cc:7207

#18 0x0000562d86d1753d in mysql_update (thd=0x62b000299218, table_list=0x62b0002683d0, fields=..., values=..., conds=0x62b000268f48, order_num=0, order=0x0, limit=18446744073709551615, ignore=false,

    found_return=0x7f07f82bf3c0, updated_return=0x7f07f82bf3e0) at /data/Server/10.5/sql/sql_update.cc:1077

#19 0x0000562d86a534ff in mysql_execute_command (thd=0x62b000299218) at /data/Server/10.5/sql/sql_parse.cc:4468

#20 0x0000562d86a6b5b6 in mysql_parse (thd=0x62b000299218, rawbuf=0x62b000268238 "UPDATE `tab1` SET b = 'celebration' WHERE pk = 6083 /* E_R Thread33 QNO 59 CON_ID 45 */", length=87,

    parser_state=0x7f07f82bfcb0, is_com_multi=false, is_next_command=false) at /data/Server/10.5/sql/sql_parse.cc:8096

At this point, our ‘victim’ transaction is waiting for a latch on the change buffer root page (page 4). It has already decided to write to the change buffer, even though the page actually exists in the buffer pool:

#20 0x0000562d880d7392 in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=1, guess=0x7f0812107aa0, mode=10, file=0x562d891851a0 "/data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc", line=3276, mtr=0x7f07fc230590, err=0x7f07fc22efd0, allow_ibuf_merge=false) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:3482

#21 0x0000562d88068b9b in btr_cur_search_to_nth_level_func (index=0x617000002e88, level=0, tuple=0x61900165e008, mode=PAGE_CUR_LE, latch_mode=36, cursor=0x7f07fc2303e0, ahi_latch=0x0, file=0x562d891851a0 "/data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc", line=3276, mtr=0x7f07fc230590, autoinc=0) at /data/Server/10.5/storage/innobase/btr/btr0cur.cc:1614

#22 0x0000562d87c31daa in btr_pcur_open_low (index=0x617000002e88, level=0, tuple=0x61900165e008, mode=PAGE_CUR_LE, latch_mode=36, cursor=0x7f07fc2303e0, file=0x562d891851a0 "/data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc", line=3276, autoinc=0, mtr=0x7f07fc230590) at /data/Server/10.5/storage/innobase/include/btr0pcur.ic:441

#23 0x0000562d87c405e1 in ibuf_insert_low (mode=36, op=IBUF_OP_DELETE_MARK, no_counter=0, entry=0x61900165f908, entry_size=43, index=0x617000035388, page_id=..., zip_size=0, thr=0x6210002eee48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3276

#24 0x0000562d87c41a50 in ibuf_insert (op=IBUF_OP_DELETE_MARK, entry=0x61900165f908, index=0x617000035388, page_id=..., zip_size=0, thr=0x6210002eee48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3603

That thread would complete the change buffer merge. Our ‘victim’ transaction is still suspended, waiting to open the change buffer leaf page. Meanwhile, the page 5:0x601 is evicted again on write completion:

#1  0x0000562d88127104 in buf_page_t::free_file_page (this=0x7f0812116460) at /data/Server/10.5/storage/innobase/include/buf0buf.h:956

#2  0x0000562d8811be7d in buf_LRU_block_free_hashed_page (block=0x7f0812116460) at /data/Server/10.5/storage/innobase/buf/buf0lru.cc:130

#3  0x0000562d88122bf4 in buf_LRU_free_page (bpage=0x7f0812116460, zip=true) at /data/Server/10.5/storage/innobase/buf/buf0lru.cc:962

#4  0x0000562d881095eb in buf_page_write_complete (request=...) at /data/Server/10.5/storage/innobase/buf/buf0flu.cc:417

#5  0x0000562d881f4f84 in fil_aio_callback (request=...) at /data/Server/10.5/storage/innobase/fil/fil0fil.cc:3372

#6  0x0000562d87d123cb in io_callback (cb=0x7f081268b140) at /data/Server/10.5/storage/innobase/os/os0file.cc:3604

#7  0x0000562d883218c8 in tpool::simulated_aio::simulated_aio_callback (param=0x7f081268b140) at /data/Server/10.5/tpool/aio_simulated.cc:162

…

(rr) when

Current event: 1151722

Finally, soon thereafter, our interesting transaction will resume execution:

(rr) c

Continuing.

[Switching to Thread 1599113.1608542]

Thread 17 hit Breakpoint 16, ibuf_insert_low (mode=36, op=IBUF_OP_DELETE_MARK, no_counter=0, entry=0x61900165f908, entry_size=43, index=0x617000035388, page_id=..., zip_size=0, thr=0x6210002eee48)

    at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3282

3282		buffered = ibuf_get_volume_buffered(&pcur,

3: /x index.id = 0x1b

(rr) when

Current event: 1151763

We will insert the record in change buffer page 0x356, right after the page infimum, before the first record that is buffered for the immediately following page 5:0x602. So far, so good. But, what is not so good is the following:

(rr) p page_cursor.rec

$35 = (rec_t *) 0x7f0812556065 "infimum"

(rr) p/x page_cursor.block.page.id_

$36 = {m_id = 0x356}

(rr) p *rec

$37 = (rec_t *) 0x7f081255614b ""

(rr) dump binary memory /dev/shm/ibuf356.bin page_cursor.block.frame  page_cursor.block.frame+srv_page_size

(rr) p/x **rec@90

$38 = {0x0, 0x0, 0x0, 0x5, 0x0, 0x0, 0x0, 0x6, 0x1, 0x0, 0x0, 0x1, 0x0, 0x1, 0xf, 0x0, 0x64, 0x0, 0x8, 0x1, 0xf, 0x0, 0x64, 0x0, 0x8, 0x1, 0xf, 0x0, 0x64, 0x0, 0x8, 0x1, 0xf, 0x0, 0x64, 0x0, 0x8, 0x1, 0xf,

  0x0, 0x64, 0x0, 0x8, 0x1, 0xf, 0x0, 0x64, 0x0, 0x8, 0x86, 0x3, 0x0, 0x4, 0x80, 0x0, 0x65, 0x63, 0x74, 0x66, 0x72, 0x78, 0x74, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x70, 0x6b, 0x6f, 0x77, 0x70, 0x61, 0x66,

  0x72, 0x78, 0x70, 0x76, 0x70, 0x72, 0x0, 0x0, 0x15, 0x23, 0x0, 0x61, 0x5d, 0x56, 0x50}

(rr) set $r=*rec

(rr) awatch $r[-2]@2

Hardware access (read/write) watchpoint 21: $r[-2]@2

(rr) when

Current event: 1152412

(rr) c

Continuing.

Thread 17 hit Breakpoint 19, mtr_t::commit (this=0x3f) at /data/Server/10.5/storage/innobase/mtr/mtr0mtr.cc:393

393	{

(rr) disa 19

(rr) c

Continuing.

[Switching to Thread 1599113.1608599]

Thread 30 hit Hardware access (read/write) watchpoint 21: $r[-2]@2

Value = "\001\262"

mach_read_from_2 (b=0x7f0812556149 "\001\262") at /data/Server/10.5/storage/innobase/include/mach0data.ic:84

84		return(uint16_t(uint16_t(b[0]) << 8 | b[1]));

(rr) bt

#0  mach_read_from_2 (b=0x7f0812556149 "\001\262") at /data/Server/10.5/storage/innobase/include/mach0data.ic:84

#1  0x0000562d87c58cbb in rec_get_next_offs (rec=0x7f081255614b "", comp=0) at /data/Server/10.5/storage/innobase/include/rem0rec.ic:291

#2  0x0000562d87c6d421 in lock_update_delete (block=0x7f0812151240, rec=0x7f081255614b "") at /data/Server/10.5/storage/innobase/lock/lock0lock.cc:3260

#3  0x0000562d88084cf8 in btr_cur_optimistic_delete_func (cursor=0x7f07f82b9b70, flags=0, mtr=0x7f07f82b9ce0) at /data/Server/10.5/storage/innobase/btr/btr0cur.cc:5595

#4  0x0000562d87c45be3 in ibuf_delete_rec (page_id=..., pcur=0x7f07f82b9b70, search_tuple=0x6160018ff708, mtr=0x7f07f82b9ce0) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:4078

#5  0x0000562d87c3bcd3 in ibuf_read_merge_pages (space_ids=0x7f07f82ba590, page_nos=0x7f07f82ba5d0, n_stored=3) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:2361

#6  0x0000562d87c412b2 in ibuf_insert_low (mode=36, op=IBUF_OP_INSERT, no_counter=0, entry=0x619000a88508, entry_size=55, index=0x617000035388, page_id=..., zip_size=0, thr=0x621000801a48)

    at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3480

#7  0x0000562d87c41a50 in ibuf_insert (op=IBUF_OP_INSERT, entry=0x619000a88508, index=0x617000035388, page_id=..., zip_size=0, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3603

Thread 30 that was in the process of merging the change buffer to that page is deleting our precious record immediately afterwards, even though the page no longer exists in the buffer pool and we cannot know if all those buffered records were actually applied. At the time we inserted the record in Thread 17, that culprit Thread 30 was waiting for a latch on that same change buffer leaf page 0:0x356:

#12 0x0000562d880600eb in btr_cur_latch_leaves (block=0x7f0812151240, latch_mode=2, cursor=0x7f07f82b9b70, mtr=0x7f07f82b9ce0) at /data/Server/10.5/storage/innobase/btr/btr0cur.cc:249

#13 0x0000562d8806a570 in btr_cur_search_to_nth_level_func (index=0x617000002e88, level=0, tuple=0x6160018ff708, mode=PAGE_CUR_GE, latch_mode=2, cursor=0x7f07f82b9b70, ahi_latch=0x0, file=0x562d891851a0 "/data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc", line=2342, mtr=0x7f07f82b9ce0, autoinc=0) at /data/Server/10.5/storage/innobase/btr/btr0cur.cc:1848

#14 0x0000562d87c31daa in btr_pcur_open_low (index=0x617000002e88, level=0, tuple=0x6160018ff708, mode=PAGE_CUR_GE, latch_mode=2, cursor=0x7f07f82b9b70, file=0x562d891851a0 "/data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc", line=2342, autoinc=0, mtr=0x7f07f82b9ce0) at /data/Server/10.5/storage/innobase/include/btr0pcur.ic:441

#15 0x0000562d87c3b9ae in ibuf_read_merge_pages (space_ids=0x7f07f82ba590, page_nos=0x7f07f82ba5d0, n_stored=3) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:2342

#16 0x0000562d87c412b2 in ibuf_insert_low (mode=36, op=IBUF_OP_INSERT, no_counter=0, entry=0x619000a88508, entry_size=55, index=0x617000035388, page_id=..., zip_size=0, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3480

#17 0x0000562d87c41a50 in ibuf_insert (op=IBUF_OP_INSERT, entry=0x619000a88508, index=0x617000035388, page_id=..., zip_size=0, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3603

The wrongful deletion was added by me in order to fix a hang in change buffer merge (~~MDEV-20934~~), whose root cause should have been the recovery bug that we fixed in ~~MDEV-24449~~.

I think that we must revert the work-around of MDEV-20394.

Marko Mäkelä added a comment - 2021-06-04 10:30 I checked the contents of the secondary index page each time it was read into the buffer pool. The record never changed. Thread 4 hit Breakpoint 1, buf_page_read_complete (bpage=0x7f0812127e40, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101 4101 const page_id_t id(bpage->id()); 1: /x bpage.id_.m_id = 0x500000601 (rr) when Current event: 228935 (rr) display/x ((buf_block_t*)bpage)->frame[0x457]@36 2: /x ((buf_block_t*)bpage)->frame[0x457]@36 = {0x0, 0x0, 0xd0, 0xf, 0x4, 0x88, 0x65, 0x63, 0x74, 0x66, 0x72, 0x78, 0x74, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x70, 0x6b, 0x6f, 0x77, 0x70, 0x61, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x72, 0x0, 0x0, 0x15, 0x23} (rr) command 1 Type commands for breakpoint(s) 1, one per line. End with a line saying just "end". >when >c >end (rr) c Continuing. [Switching to Thread 1599113.1605076] Thread 15 hit Breakpoint 1, buf_page_read_complete (bpage=0x7f08121488c0, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101 4101 const page_id_t id(bpage->id()); 1: /x bpage.id_.m_id = 0x500000601 2: /x ((buf_block_t*)bpage)->frame[0x457]@36 = {0x0, 0x0, 0xd0, 0xf, 0x4, 0x88, 0x65, 0x63, 0x74, 0x66, 0x72, 0x78, 0x74, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x70, 0x6b, 0x6f, 0x77, 0x70, 0x61, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x72, 0x0, 0x0, 0x15, 0x23} Current event: 616159 [Switching to Thread 1599113.1608534] Thread 28 hit Breakpoint 1, buf_page_read_complete (bpage=0x7f08121198a0, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101 4101 const page_id_t id(bpage->id()); 1: /x bpage.id_.m_id = 0x500000601 2: /x ((buf_block_t*)bpage)->frame[0x457]@36 = {0x0, 0x0, 0xd0, 0xf, 0x4, 0x88, 0x65, 0x63, 0x74, 0x66, 0x72, 0x78, 0x74, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x70, 0x6b, 0x6f, 0x77, 0x70, 0x61, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x72, 0x0, 0x0, 0x15, 0x23} Current event: 818385 [Switching to Thread 1599113.1608599] Thread 46 hit Breakpoint 1, buf_page_read_complete (bpage=0x7f0812116460, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101 4101 const page_id_t id(bpage->id()); 1: /x bpage.id_.m_id = 0x500000601 2: /x ((buf_block_t*)bpage)->frame[0x457]@36 = {0x0, 0x0, 0xd0, 0xf, 0x4, 0x88, 0x65, 0x63, 0x74, 0x66, 0x72, 0x78, 0x74, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x70, 0x6b, 0x6f, 0x77, 0x70, 0x61, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x72, 0x0, 0x0, 0x15, 0x23} Current event: 1149964 [Switching to Thread 1599113.1608512] Thread 21 hit Breakpoint 1, buf_page_read_complete (bpage=0x7f08120fa6e0, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101 4101 const page_id_t id(bpage->id()); 1: /x bpage.id_.m_id = 0x500000601 2: /x ((buf_block_t*)bpage)->frame[0x457]@36 = {0x0, 0x0, 0xd0, 0xf, 0x4, 0x88, 0x65, 0x63, 0x74, 0x66, 0x72, 0x78, 0x74, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x70, 0x6b, 0x6f, 0x77, 0x70, 0x61, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x72, 0x0, 0x0, 0x15, 0x23} Current event: 1228265 Thread 47 received signal SIGKILL, Killed. Here is the start of the transaction with ID 0xfb6 (which last modified the clustered index record): Thread 17 hit Breakpoint 12, trx_set_rw_mode (trx=0x7f08126d29a8) at /data/Server/10.5/storage/innobase/trx/trx0trx.cc:2320 2320 if (trx->read_view.is_open()) { 4: /x trx->id = 0xfb6 (rr) bt #0 trx_set_rw_mode (trx=0x7f08126d29a8) at /data/Server/10.5/storage/innobase/trx/trx0trx.cc:2320 #1 0x0000562d87c70ca0 in lock_table (flags=0, table=0x618000063508, mode=LOCK_IX, thr=0x6210002ee988) at /data/Server/10.5/storage/innobase/lock/lock0lock.cc:3766 #2 0x0000562d87ee479e in row_search_mvcc (buf=0x61a0001b12b8 "\377", mode=PAGE_CUR_GE, prebuilt=0x6210002ee188, match_mode=1, direction=0) at /data/Server/10.5/storage/innobase/row/row0sel.cc:4608 #3 0x0000562d87aec741 in ha_innobase::index_read (this=0x61d0007c42b8, buf=0x61a0001b12b8 "\377", key_ptr=0x6210032f0200 "#\025", key_len=4, find_flag=HA_READ_KEY_EXACT) at /data/Server/10.5/storage/innobase/handler/ha_innodb.cc:8866 #4 0x0000562d871f0c6b in handler::index_read_map (this=0x61d0007c42b8, buf=0x61a0001b12b8 "\377", key=0x6210032f0200 "#\025", keypart_map=1, find_flag=HA_READ_KEY_EXACT) at /data/Server/10.5/sql/handler.h:3818 #5 0x0000562d871cad30 in handler::ha_index_read_map (this=0x61d0007c42b8, buf=0x61a0001b12b8 "\377", key=0x6210032f0200 "#\025", keypart_map=1, find_flag=HA_READ_KEY_EXACT) at /data/Server/10.5/sql/handler.cc:3123 #6 0x0000562d871df416 in handler::read_range_first (this=0x61d0007c42b8, start_key=0x61d0007c43b8, end_key=0x61d0007c43d8, eq_range_arg=true, sorted=true) at /data/Server/10.5/sql/handler.cc:6200 #7 0x0000562d86ee5ef1 in handler::multi_range_read_next (this=0x61d0007c42b8, range_info=0x7f07fc233790) at /data/Server/10.5/sql/multi_range_read.cc:516 #8 0x0000562d86ee623e in Mrr_simple_index_reader::get_next (this=0x61d0007c48e0, range_info=0x7f07fc233790) at /data/Server/10.5/sql/multi_range_read.cc:553 #9 0x0000562d86eef20d in DsMrr_impl::dsmrr_next (this=0x61d0007c4790, range_info=0x7f07fc233790) at /data/Server/10.5/sql/multi_range_read.cc:1653 #10 0x0000562d87b1bc3e in ha_innobase::multi_range_read_next (this=0x61d0007c42b8, range_info=0x7f07fc233790) at /data/Server/10.5/storage/innobase/handler/ha_innodb.cc:20401 #11 0x0000562d875a8e84 in QUICK_RANGE_SELECT::get_next (this=0x6130000e50c0) at /data/Server/10.5/sql/opt_range.cc:12647 #12 0x0000562d875da15c in rr_quick (info=0x7f07fc233ba0) at /data/Server/10.5/sql/records.cc:403 #13 0x0000562d868c27c2 in READ_RECORD::read_record (this=0x7f07fc233ba0) at /data/Server/10.5/sql/records.h:80 #14 0x0000562d86d16f11 in mysql_update (thd=0x62b0001c0218, table_list=0x62b0001c73d0, fields=..., values=..., conds=0x62b0001c7f40, order_num=0, order=0x0, limit=18446744073709551615, ignore=false, found_return=0x7f07fc2343c0, updated_return=0x7f07fc2343e0) at /data/Server/10.5/sql/sql_update.cc:996 #15 0x0000562d86a534ff in mysql_execute_command (thd=0x62b0001c0218) at /data/Server/10.5/sql/sql_parse.cc:4468 #16 0x0000562d86a6b5b6 in mysql_parse (thd=0x62b0001c0218, rawbuf=0x62b0001c7238 "UPDATE `tab1` SET e = 'sentence' WHERE pk = 5411 /* E_R Thread17 QNO 53 CON_ID 30 */", length=84, parser_state=0x7f07fc234cb0, is_com_multi=false, is_next_command=false) at /data/Server/10.5/sql/sql_parse.cc:8096 … (rr) break ibuf0ibuf.cc:3179 thread 12 Breakpoint 13 at 0x562d87c3feb8: file /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc, line 3179. (rr) cond 13 index->id==27 (rr) c Continuing. Thread 17 hit Breakpoint 1, buf_page_read_complete (bpage=0x7f081214f980, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101 4101 const page_id_t id(bpage->id()); 1: /x bpage.id_.m_id = 0x500000315 Current event: 1148257 First, our transaction performed a synchronous read of the clustered index page (5:0x315). Soon thereafter, another thread completed a synchronous read of the secondary index page, because (just like in the previous trace that I analyzed) it decided that some room needs to be allocated in the change buffer, while attempting to buffer another insert (coincidentally, to the same index, and same index page 0x602): Thread 30 hit Breakpoint 1, buf_page_read_complete (bpage=0x7f0812116460, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101 4101 const page_id_t id(bpage->id()); 1: /x bpage.id_.m_id = 0x500000601 2: /x ((buf_block_t*)bpage)->frame[0x457]@36 = {0x0, 0x0, 0xd0, 0xf, 0x4, 0x88, 0x65, 0x63, 0x74, 0x66, 0x72, 0x78, 0x74, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x70, 0x6b, 0x6f, 0x77, 0x70, 0x61, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x72, 0x0, 0x0, 0x15, 0x23} Current event: 1149964 (rr) bt #0 buf_page_read_complete (bpage=0x7f0812116460, node=...) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:4101 #1 0x0000562d8812ad23 in buf_read_page_low (err=0x7f07f82b9410, space=0x615000562f00, sync=true, mode=132, page_id=..., zip_size=0, unzip=false) at /data/Server/10.5/storage/innobase/buf/buf0rea.cc:360 #2 0x0000562d8812bb4f in buf_read_page (page_id=..., zip_size=0) at /data/Server/10.5/storage/innobase/buf/buf0rea.cc:489 #3 0x0000562d880d55e6 in buf_page_get_low (page_id=..., zip_size=0, rw_latch=2, guess=0x0, mode=16, file=0x562d891851a0 "/data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc", line=2328, mtr=0x7f07f82b9ce0, err=0x7f07f82b9b20, allow_ibuf_merge=true) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:3024 #4 0x0000562d880d7392 in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=2, guess=0x0, mode=16, file=0x562d891851a0 "/data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc", line=2328, mtr=0x7f07f82b9ce0, err=0x7f07f82b9b20, allow_ibuf_merge=true) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:3482 #5 0x0000562d87c3b7a4 in ibuf_read_merge_pages (space_ids=0x7f07f82ba590, page_nos=0x7f07f82ba5d0, n_stored=3) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:2325 #6 0x0000562d87c412b2 in ibuf_insert_low (mode=36, op=IBUF_OP_INSERT, no_counter=0, entry=0x619000a88508, entry_size=55, index=0x617000035388, page_id=..., zip_size=0, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3480 #7 0x0000562d87c41a50 in ibuf_insert (op=IBUF_OP_INSERT, entry=0x619000a88508, index=0x617000035388, page_id=..., zip_size=0, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3603 #8 0x0000562d880690be in btr_cur_search_to_nth_level_func (index=0x617000035388, level=0, tuple=0x619000a88508, mode=PAGE_CUR_LE, latch_mode=2, cursor=0x7f07f82bc470, ahi_latch=0x0, file=0x562d8925ae40 "/data/Server/10.5/storage/innobase/row/row0ins.cc", line=2974, mtr=0x7f07f82bc830, autoinc=0) at /data/Server/10.5/storage/innobase/btr/btr0cur.cc:1650 #9 0x0000562d87e13938 in row_ins_sec_index_entry_low (flags=0, mode=2, index=0x617000035388, offsets_heap=0x619000a87a80, heap=0x619000f39880, entry=0x619000a88508, trx_id=0, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/row/row0ins.cc:2971 #10 0x0000562d87e152da in row_ins_sec_index_entry (index=0x617000035388, entry=0x619000a88508, thr=0x621000801a48, check_foreign=true) at /data/Server/10.5/storage/innobase/row/row0ins.cc:3294 #11 0x0000562d87f0b593 in row_upd_sec_index_entry (node=0x621000801638, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/row/row0upd.cc:2236 #12 0x0000562d87f0bbbe in row_upd_sec_step (node=0x621000801638, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/row/row0upd.cc:2263 #13 0x0000562d87f112d2 in row_upd (node=0x621000801638, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/row/row0upd.cc:3024 #14 0x0000562d87f11a9c in row_upd_step (thr=0x621000801a48) at /data/Server/10.5/storage/innobase/row/row0upd.cc:3139 #15 0x0000562d87e5dedb in row_update_for_mysql (prebuilt=0x621000800d88) at /data/Server/10.5/storage/innobase/row/row0mysql.cc:1844 #16 0x0000562d87aea754 in ha_innobase::update_row (this=0x61d0008246b8, old_row=0x61a000211b20 "\300\trecipient", new_row=0x61a0002118b8 "\300\vcelebration") at /data/Server/10.5/storage/innobase/handler/ha_innodb.cc:8445 #17 0x0000562d871e7213 in handler::ha_update_row (this=0x61d0008246b8, old_data=0x61a000211b20 "\300\trecipient", new_data=0x61a0002118b8 "\300\vcelebration") at /data/Server/10.5/sql/handler.cc:7207 #18 0x0000562d86d1753d in mysql_update (thd=0x62b000299218, table_list=0x62b0002683d0, fields=..., values=..., conds=0x62b000268f48, order_num=0, order=0x0, limit=18446744073709551615, ignore=false, found_return=0x7f07f82bf3c0, updated_return=0x7f07f82bf3e0) at /data/Server/10.5/sql/sql_update.cc:1077 #19 0x0000562d86a534ff in mysql_execute_command (thd=0x62b000299218) at /data/Server/10.5/sql/sql_parse.cc:4468 #20 0x0000562d86a6b5b6 in mysql_parse (thd=0x62b000299218, rawbuf=0x62b000268238 "UPDATE `tab1` SET b = 'celebration' WHERE pk = 6083 /* E_R Thread33 QNO 59 CON_ID 45 */", length=87, parser_state=0x7f07f82bfcb0, is_com_multi=false, is_next_command=false) at /data/Server/10.5/sql/sql_parse.cc:8096 At this point, our ‘victim’ transaction is waiting for a latch on the change buffer root page (page 4). It has already decided to write to the change buffer, even though the page actually exists in the buffer pool: #20 0x0000562d880d7392 in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=1, guess=0x7f0812107aa0, mode=10, file=0x562d891851a0 "/data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc", line=3276, mtr=0x7f07fc230590, err=0x7f07fc22efd0, allow_ibuf_merge=false) at /data/Server/10.5/storage/innobase/buf/buf0buf.cc:3482 #21 0x0000562d88068b9b in btr_cur_search_to_nth_level_func (index=0x617000002e88, level=0, tuple=0x61900165e008, mode=PAGE_CUR_LE, latch_mode=36, cursor=0x7f07fc2303e0, ahi_latch=0x0, file=0x562d891851a0 "/data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc", line=3276, mtr=0x7f07fc230590, autoinc=0) at /data/Server/10.5/storage/innobase/btr/btr0cur.cc:1614 #22 0x0000562d87c31daa in btr_pcur_open_low (index=0x617000002e88, level=0, tuple=0x61900165e008, mode=PAGE_CUR_LE, latch_mode=36, cursor=0x7f07fc2303e0, file=0x562d891851a0 "/data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc", line=3276, autoinc=0, mtr=0x7f07fc230590) at /data/Server/10.5/storage/innobase/include/btr0pcur.ic:441 #23 0x0000562d87c405e1 in ibuf_insert_low (mode=36, op=IBUF_OP_DELETE_MARK, no_counter=0, entry=0x61900165f908, entry_size=43, index=0x617000035388, page_id=..., zip_size=0, thr=0x6210002eee48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3276 #24 0x0000562d87c41a50 in ibuf_insert (op=IBUF_OP_DELETE_MARK, entry=0x61900165f908, index=0x617000035388, page_id=..., zip_size=0, thr=0x6210002eee48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3603 That thread would complete the change buffer merge. Our ‘victim’ transaction is still suspended, waiting to open the change buffer leaf page. Meanwhile, the page 5:0x601 is evicted again on write completion: #1 0x0000562d88127104 in buf_page_t::free_file_page (this=0x7f0812116460) at /data/Server/10.5/storage/innobase/include/buf0buf.h:956 #2 0x0000562d8811be7d in buf_LRU_block_free_hashed_page (block=0x7f0812116460) at /data/Server/10.5/storage/innobase/buf/buf0lru.cc:130 #3 0x0000562d88122bf4 in buf_LRU_free_page (bpage=0x7f0812116460, zip=true) at /data/Server/10.5/storage/innobase/buf/buf0lru.cc:962 #4 0x0000562d881095eb in buf_page_write_complete (request=...) at /data/Server/10.5/storage/innobase/buf/buf0flu.cc:417 #5 0x0000562d881f4f84 in fil_aio_callback (request=...) at /data/Server/10.5/storage/innobase/fil/fil0fil.cc:3372 #6 0x0000562d87d123cb in io_callback (cb=0x7f081268b140) at /data/Server/10.5/storage/innobase/os/os0file.cc:3604 #7 0x0000562d883218c8 in tpool::simulated_aio::simulated_aio_callback (param=0x7f081268b140) at /data/Server/10.5/tpool/aio_simulated.cc:162 … (rr) when Current event: 1151722 Finally, soon thereafter, our interesting transaction will resume execution: (rr) c Continuing. [Switching to Thread 1599113.1608542] Thread 17 hit Breakpoint 16, ibuf_insert_low (mode=36, op=IBUF_OP_DELETE_MARK, no_counter=0, entry=0x61900165f908, entry_size=43, index=0x617000035388, page_id=..., zip_size=0, thr=0x6210002eee48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3282 3282 buffered = ibuf_get_volume_buffered(&pcur, 3: /x index.id = 0x1b (rr) when Current event: 1151763 We will insert the record in change buffer page 0x356, right after the page infimum, before the first record that is buffered for the immediately following page 5:0x602. So far, so good. But, what is not so good is the following: (rr) p page_cursor.rec $35 = (rec_t *) 0x7f0812556065 "infimum" (rr) p/x page_cursor.block.page.id_ $36 = {m_id = 0x356} (rr) p *rec $37 = (rec_t *) 0x7f081255614b "" (rr) dump binary memory /dev/shm/ibuf356.bin page_cursor.block.frame page_cursor.block.frame+srv_page_size (rr) p/x **rec@90 $38 = {0x0, 0x0, 0x0, 0x5, 0x0, 0x0, 0x0, 0x6, 0x1, 0x0, 0x0, 0x1, 0x0, 0x1, 0xf, 0x0, 0x64, 0x0, 0x8, 0x1, 0xf, 0x0, 0x64, 0x0, 0x8, 0x1, 0xf, 0x0, 0x64, 0x0, 0x8, 0x1, 0xf, 0x0, 0x64, 0x0, 0x8, 0x1, 0xf, 0x0, 0x64, 0x0, 0x8, 0x1, 0xf, 0x0, 0x64, 0x0, 0x8, 0x86, 0x3, 0x0, 0x4, 0x80, 0x0, 0x65, 0x63, 0x74, 0x66, 0x72, 0x78, 0x74, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x70, 0x6b, 0x6f, 0x77, 0x70, 0x61, 0x66, 0x72, 0x78, 0x70, 0x76, 0x70, 0x72, 0x0, 0x0, 0x15, 0x23, 0x0, 0x61, 0x5d, 0x56, 0x50} (rr) set $r=*rec (rr) awatch $r[-2]@2 Hardware access (read/write) watchpoint 21: $r[-2]@2 (rr) when Current event: 1152412 (rr) c Continuing. Thread 17 hit Breakpoint 19, mtr_t::commit (this=0x3f) at /data/Server/10.5/storage/innobase/mtr/mtr0mtr.cc:393 393 { (rr) disa 19 (rr) c Continuing. [Switching to Thread 1599113.1608599] Thread 30 hit Hardware access (read/write) watchpoint 21: $r[-2]@2 Value = "\001\262" mach_read_from_2 (b=0x7f0812556149 "\001\262") at /data/Server/10.5/storage/innobase/include/mach0data.ic:84 84 return(uint16_t(uint16_t(b[0]) << 8 | b[1])); (rr) bt #0 mach_read_from_2 (b=0x7f0812556149 "\001\262") at /data/Server/10.5/storage/innobase/include/mach0data.ic:84 #1 0x0000562d87c58cbb in rec_get_next_offs (rec=0x7f081255614b "", comp=0) at /data/Server/10.5/storage/innobase/include/rem0rec.ic:291 #2 0x0000562d87c6d421 in lock_update_delete (block=0x7f0812151240, rec=0x7f081255614b "") at /data/Server/10.5/storage/innobase/lock/lock0lock.cc:3260 #3 0x0000562d88084cf8 in btr_cur_optimistic_delete_func (cursor=0x7f07f82b9b70, flags=0, mtr=0x7f07f82b9ce0) at /data/Server/10.5/storage/innobase/btr/btr0cur.cc:5595 #4 0x0000562d87c45be3 in ibuf_delete_rec (page_id=..., pcur=0x7f07f82b9b70, search_tuple=0x6160018ff708, mtr=0x7f07f82b9ce0) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:4078 #5 0x0000562d87c3bcd3 in ibuf_read_merge_pages (space_ids=0x7f07f82ba590, page_nos=0x7f07f82ba5d0, n_stored=3) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:2361 #6 0x0000562d87c412b2 in ibuf_insert_low (mode=36, op=IBUF_OP_INSERT, no_counter=0, entry=0x619000a88508, entry_size=55, index=0x617000035388, page_id=..., zip_size=0, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3480 #7 0x0000562d87c41a50 in ibuf_insert (op=IBUF_OP_INSERT, entry=0x619000a88508, index=0x617000035388, page_id=..., zip_size=0, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3603 Thread 30 that was in the process of merging the change buffer to that page is deleting our precious record immediately afterwards, even though the page no longer exists in the buffer pool and we cannot know if all those buffered records were actually applied. At the time we inserted the record in Thread 17, that culprit Thread 30 was waiting for a latch on that same change buffer leaf page 0:0x356: #12 0x0000562d880600eb in btr_cur_latch_leaves (block=0x7f0812151240, latch_mode=2, cursor=0x7f07f82b9b70, mtr=0x7f07f82b9ce0) at /data/Server/10.5/storage/innobase/btr/btr0cur.cc:249 #13 0x0000562d8806a570 in btr_cur_search_to_nth_level_func (index=0x617000002e88, level=0, tuple=0x6160018ff708, mode=PAGE_CUR_GE, latch_mode=2, cursor=0x7f07f82b9b70, ahi_latch=0x0, file=0x562d891851a0 "/data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc", line=2342, mtr=0x7f07f82b9ce0, autoinc=0) at /data/Server/10.5/storage/innobase/btr/btr0cur.cc:1848 #14 0x0000562d87c31daa in btr_pcur_open_low (index=0x617000002e88, level=0, tuple=0x6160018ff708, mode=PAGE_CUR_GE, latch_mode=2, cursor=0x7f07f82b9b70, file=0x562d891851a0 "/data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc", line=2342, autoinc=0, mtr=0x7f07f82b9ce0) at /data/Server/10.5/storage/innobase/include/btr0pcur.ic:441 #15 0x0000562d87c3b9ae in ibuf_read_merge_pages (space_ids=0x7f07f82ba590, page_nos=0x7f07f82ba5d0, n_stored=3) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:2342 #16 0x0000562d87c412b2 in ibuf_insert_low (mode=36, op=IBUF_OP_INSERT, no_counter=0, entry=0x619000a88508, entry_size=55, index=0x617000035388, page_id=..., zip_size=0, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3480 #17 0x0000562d87c41a50 in ibuf_insert (op=IBUF_OP_INSERT, entry=0x619000a88508, index=0x617000035388, page_id=..., zip_size=0, thr=0x621000801a48) at /data/Server/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:3603 The wrongful deletion was added by me in order to fix a hang in change buffer merge ( MDEV-20934 ), whose root cause should have been the recovery bug that we fixed in MDEV-24449 . I think that we must revert the work-around of MDEV-20394 .

Marko Mäkelä added a comment - 2021-06-04 12:09

Due to ~~MDEV-19514~~ (which entered 10.5.0), the ~~MDEV-20934~~ logic would kick in also during normal server operation and not only during a slow shutdown (innodb_fast_shutdown=0).

In ~~MDEV-22373~~, spachev claimed the following about 10.4:

innodb_fast_shutdown=0 never produces a post-restart count mismatch corruption, while innodb_fast_shutdown=1 very reliably does

Thus, that corruption cannot be caused by ~~MDEV-20934~~, because before 10.5 the work-around would only be enabled during a shutdown with innodb_fast_shutdown=0.

Marko Mäkelä added a comment - 2021-06-04 12:09 Due to MDEV-19514 (which entered 10.5.0), the MDEV-20934 logic would kick in also during normal server operation and not only during a slow shutdown ( innodb_fast_shutdown=0 ). In MDEV-22373 , spachev claimed the following about 10.4: innodb_fast_shutdown=0 never produces a post-restart count mismatch corruption, while innodb_fast_shutdown=1 very reliably does Thus, that corruption cannot be caused by MDEV-20934 , because before 10.5 the work-around would only be enabled during a shutdown with innodb_fast_shutdown=0 .

Marko Mäkelä added a comment - 2021-06-04 12:27

I am a little undecided regarding whether to fix this in older versions than MariaDB 10.5. Normally, nobody should be using innodb_fast_shutdown=0. It is not really required even before upgrades (see ~~MDEV-23755~~).

If we removed the corruption work-around that was added in ~~MDEV-20934~~ from release builds of 10.2, 10.3, 10.4, then users who suffer from corrupted change buffer data structures (presumably due to having used old versions where ~~MDEV-24449~~ was not yet fixed) would be out of luck. Using a debug build and a special debug option ought to be impractically slow when there are terabytes of data. For the record, I was considering to push the following:

diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc

index 46649187ca0..b32b52c2a46 100644

--- a/storage/innobase/ibuf/ibuf0ibuf.cc

+++ b/storage/innobase/ibuf/ibuf0ibuf.cc

@@ -4444,6 +4444,23 @@ ibuf_merge_or_delete_for_page(

 			/* No changes are buffered for this page. */

 			fil_space_release(space);

+#ifndef DBUG_OFF

+			/* The following code works around a hang when

+			the change buffer is corrupted, likely due to

+			the race condition in crash recovery that was

+			fixed in MDEV-24449. But, it also introduces

+			corruption by itself in the following

+			scenario:

+			(1) We released the bitmap page latch.

+			(2) Another thread buffers changes to the page.

+			(3) We will wrongly discard those newly

+			buffered changes below.

+			This code will be available in debug builds, so that

+			users may try to fix a shutdown hang that occurs due

+			to a corrupted change buffer. */

 			if (UNIV_UNLIKELY(srv_shutdown_state)

 			    && !srv_fast_shutdown

 			    && (!block

@@ -4453,8 +4470,10 @@ ibuf_merge_or_delete_for_page(

 				shutdown, in case the bitmap bits are

 				wrongly clear even though buffered

 				changes exist. */

-				ibuf_delete_recs(page_id);

+				DBUG_EXECUTE_IF("ibuf_merge_corruption",

+						ibuf_delete_recs(page_id););

+#endif

 			return;

On 10.5, I would carry on and disable it, because 10.5 never was affected by ~~MDEV-24449~~, and a potential other cause of change buffer corruption (~~MDEV-24709~~) was fixed already in 10.5.9.

Maybe in 10.5 we should also remove the do_merge logic of ibuf_insert_low() and simply refuse to buffer more operations if the change buffer is full. In that way, in 10.5 outside a slow shutdown, change buffer merges should only occur when the secondary index leaf page is actively being read.

Marko Mäkelä added a comment - 2021-06-04 12:27 I am a little undecided regarding whether to fix this in older versions than MariaDB 10.5. Normally, nobody should be using innodb_fast_shutdown=0 . It is not really required even before upgrades (see MDEV-23755 ). If we removed the corruption work-around that was added in MDEV-20934 from release builds of 10.2, 10.3, 10.4, then users who suffer from corrupted change buffer data structures (presumably due to having used old versions where MDEV-24449 was not yet fixed) would be out of luck. Using a debug build and a special debug option ought to be impractically slow when there are terabytes of data. For the record, I was considering to push the following: diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc index 46649187ca0..b32b52c2a46 100644 --- a/storage/innobase/ibuf/ibuf0ibuf.cc +++ b/storage/innobase/ibuf/ibuf0ibuf.cc @@ -4444,6 +4444,23 @@ ibuf_merge_or_delete_for_page( /* No changes are buffered for this page. */ fil_space_release(space); +#ifndef DBUG_OFF + /* The following code works around a hang when + the change buffer is corrupted, likely due to + the race condition in crash recovery that was + fixed in MDEV-24449. But, it also introduces + corruption by itself in the following + scenario: + + (1) We released the bitmap page latch. + (2) Another thread buffers changes to the page. + (3) We will wrongly discard those newly + buffered changes below. + + This code will be available in debug builds, so that + users may try to fix a shutdown hang that occurs due + to a corrupted change buffer. */ + if (UNIV_UNLIKELY(srv_shutdown_state) && !srv_fast_shutdown && (!block @@ -4453,8 +4470,10 @@ ibuf_merge_or_delete_for_page( shutdown, in case the bitmap bits are wrongly clear even though buffered changes exist. */ - ibuf_delete_recs(page_id); + DBUG_EXECUTE_IF("ibuf_merge_corruption", + ibuf_delete_recs(page_id);); } +#endif return; } } On 10.5, I would carry on and disable it, because 10.5 never was affected by MDEV-24449 , and a potential other cause of change buffer corruption ( MDEV-24709 ) was fixed already in 10.5.9. Maybe in 10.5 we should also remove the do_merge logic of ibuf_insert_low() and simply refuse to buffer more operations if the change buffer is full. In that way, in 10.5 outside a slow shutdown, change buffer merges should only occur when the secondary index leaf page is actively being read.

Matthias Leich added a comment - 2021-06-07 14:13

Attempts to replay the problem with RQG

==================================

GIT_SHOW: HEAD, origin/bb-10.6-wlad, origin/HEAD, origin/10.6, 10.6 3d6eb7afcfb47cb432f790d0b25c8c1f4ec5bcf3 2021-06-06T13:21:03+02:00

      ~ 200 RQG tests

     Two replays of the effect described above (Corruption detected after SIGKILL + restart with success).

     12 runs show

           [ 'TBR-857', 'mysqld: .{1,150}ibuf0ibuf.cc:.{1,30} void ibuf_set_del_mark.{1,200}: Assertion .0. failed.' ],

           without crash injection and recovery.

       8 runs show

         [ 'TBR-835', '\[ERROR\] InnoDB: Record in index `(?!FTS_DOC_ID_INDEX).{0,200}` of table .{1,100} was not found on update: ' ],

                        Tests using FULLTEXT indexes might miss records in FTS_DOC_ID_INDEX (-> whatever other bug).

                        But in the current case FULLTEXT indexes are not used at all. And a record within a index created by the user is missing.

         [ 'TBR-915-MDEV-25004', 'mysqld: .{1,200}row0upd.cc:.{1,20} dberr_t row_upd_sec_index_entry\(upd_node_t\*, que_thr_t\*\): Assertion .0. failed' ],

                       This looks like MDEV-25004 but versioned tables are not involved in the current case.

         without crash injection and recovery.

      Given the results above I am not able to tell if the current bug is a plain crash recovery problem or caused by "misbehavior" before SIGKILL

      followed by restart invoking crash recovery.

bb-10.6-MDEV-25506 GIT_SHOW: HEAD 8850f7d854465bee24639ceb2148baf7b0ec1b39 2021-06-04T19:54:45+03:00

which contains a fix for the current bug but also other fixes.

      ~ 600 RQG tests

      One time

       [ 'TBR-977', 'mysqld: .{1,200}buf0buf.h:.{1,20}: buf_page_t\* buf_pool_t::page_hash_get_low\(page_id_t, ulint\): Assertion \`bpage->in_page_hash\' failed.+The RQG run ended with status STATUS_SERVER_CRASHED' ],

      without crash injection followed by recovery.

      This failure seems to be a sibling of  MDEV-25776 (here it happens after crash recovery).

Matthias Leich added a comment - 2021-06-07 14:13 Attempts to replay the problem with RQG ================================== GIT_SHOW: HEAD, origin/bb-10.6-wlad, origin/HEAD, origin/10.6, 10.6 3d6eb7afcfb47cb432f790d0b25c8c1f4ec5bcf3 2021-06-06T13:21:03+02:00 ~ 200 RQG tests Two replays of the effect described above (Corruption detected after SIGKILL + restart with success). 12 runs show [ 'TBR-857', 'mysqld: .{1,150}ibuf0ibuf.cc:.{1,30} void ibuf_set_del_mark.{1,200}: Assertion .0. failed.' ], without crash injection and recovery. 8 runs show [ 'TBR-835', '\[ERROR\] InnoDB: Record in index `(?!FTS_DOC_ID_INDEX).{0,200}` of table .{1,100} was not found on update: ' ], Tests using FULLTEXT indexes might miss records in FTS_DOC_ID_INDEX (-> whatever other bug). But in the current case FULLTEXT indexes are not used at all. And a record within a index created by the user is missing. [ 'TBR-915-MDEV-25004', 'mysqld: .{1,200}row0upd.cc:.{1,20} dberr_t row_upd_sec_index_entry$upd_node_t\*, que_thr_t\*$: Assertion .0. failed' ], This looks like MDEV-25004 but versioned tables are not involved in the current case. without crash injection and recovery. Given the results above I am not able to tell if the current bug is a plain crash recovery problem or caused by "misbehavior" before SIGKILL followed by restart invoking crash recovery. bb-10.6-MDEV-25506 GIT_SHOW: HEAD 8850f7d854465bee24639ceb2148baf7b0ec1b39 2021-06-04T19:54:45+03:00 which contains a fix for the current bug but also other fixes. ~ 600 RQG tests One time [ 'TBR-977', 'mysqld: .{1,200}buf0buf.h:.{1,20}: buf_page_t\* buf_pool_t::page_hash_get_low$page_id_t, ulint$: Assertion \`bpage->in_page_hash\' failed.+The RQG run ended with status STATUS_SERVER_CRASHED' ], without crash injection followed by recovery. This failure seems to be a sibling of MDEV-25776 (here it happens after crash recovery).

Marko Mäkelä added a comment - 2021-06-07 16:52

My fix did not optimize the ibuf_insert_low() logic, because we do not have time for the extensive testing right now.

The problematic code will be omitted from 10.5.11 onwards. It can be enabled in debug builds by SET GLOBAL debug_dbug='+d,ibuf_merge_corruption';. In 10.5, this bug has a much larger impact than in older major versions, because in 10.5 the incorrect code would kick in during normal operation, without involving any startup or shutdown. For older releases, the ~~MDEV-20934~~ logic is only enabled during shutdown with innodb_fast_shutdown=0. Normally, there should be no reason to use innodb_fast_shutdown=0, and besides, one could SET GLOBAL innodb_change_buffering=none; before the slow shutdown to work around this problem.

Marko Mäkelä added a comment - 2021-06-07 16:52 My fix did not optimize the ibuf_insert_low() logic, because we do not have time for the extensive testing right now. The problematic code will be omitted from 10.5.11 onwards. It can be enabled in debug builds by SET GLOBAL debug_dbug='+d,ibuf_merge_corruption'; . In 10.5, this bug has a much larger impact than in older major versions, because in 10.5 the incorrect code would kick in during normal operation, without involving any startup or shutdown. For older releases, the MDEV-20934 logic is only enabled during shutdown with innodb_fast_shutdown=0 . Normally, there should be no reason to use innodb_fast_shutdown=0 , and besides, one could SET GLOBAL innodb_change_buffering=none; before the slow shutdown to work around this problem.

MariaDB Server

CHECK TABLE harvests InnoDB: Index 'abdcef' contains 10001 entries, should be 10000

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration