[MDEV-28168] Compressed page is read for non-compressed space during recovery - Jira

Details

Type: Bug
Status: Confirmed (View Workflow)
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.6
Fix Version/s: 10.5, 10.6
Component/s: Storage Engine - InnoDB
Labels:
None

Description

During ~~MDEV-27557~~ testing we found the following crash:

#2  0x000055cbfd573981 in ib::fatal::~fatal (this=0x27936b9ce500, __in_chrg=<optimized out>)

    at ./storage/innobase/ut/ut0ut.cc:519

#3  0x000055cbfd6257b0 in fil_page_type_validate (space=0x612000024f40, page=0x7fed98338000 "")

    at ./storage/innobase/include/fil0fil.inl:131

#4  0x000055cbfd626f5e in buf_page_decrypt_after_read (bpage=0x7fed97b0d1a0, node=...)

    at ./storage/innobase/buf/buf0buf.cc:472

#5  0x000055cbfd6386d3 in buf_page_t::read_complete (this=0x7fed97b0d1a0, node=...)

    at ./storage/innobase/buf/buf0buf.cc:3617

#6  0x000055cbfd7523ae in fil_aio_callback (request=...) at ./storage/innobase/fil/fil0fil.cc:2930

buf_page_decrypt_after_read() does decompression only if node.space->is_compressed() && buf_page_is_compressed(dst_frame, flags) condition is true. For our case node.space->is_compressed() returns false as both FSP_FLAGS_FCRC32_GET_COMPRESSED_ALGO() and FSP_FLAGS_HAS_PAGE_COMPRESSION() are false. The space flags is 0x16. What means the space is not compressed. But the 15-th bit of FIL_PAGE_TYPE is set, what means the full_crc32 page is compressed, i.e. buf_page_is_compressed(dst_frame, flags) should return true.

So we have compressed page on non-compressed space. The question is how is it possible?

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

page.bin
32 kB
2022-03-24 11:43
page-0.bin
32 kB
2022-03-24 11:43

Issue Links

relates to

MDEV-18644 Support FULL_CRC32 for compressed pages.

Closed

Activity

Ascending order - Click to sort in descending order

Vladislav Lesin added a comment - 2022-03-24 11:44

In the attachment there is page itself page.bin, and the first page of the space page-0.bin.

Vladislav Lesin added a comment - 2022-03-24 11:44 In the attachment there is page itself page.bin , and the first page of the space page-0.bin .

Marko Mäkelä added a comment - 2022-03-24 12:10

I think that we need an rr replay of the server that was killed before the failed recovery.

My current hypothesis is that this is repeatable with the default innodb_checksum_algorithm=full_crc32 and a test that runs something like this:

CREATE TABLE t ENGINE=InnoDB AS SELECT * FROM seq_1_to_10000;

ALTER TABLE t page_compressed=1;

ALTER TABLE t page_compressed=0;

ALTER TABLE t page_compressed=1;

ALTER TABLE t page_compressed=0;

…

and kills the server during the execution of one of the ALTER TABLE statements.

Disabling page_compressed should always cause the table to be rebuilt. It should be possible to enable page_compressed=1 instantly, by just updating the tablespace flags in the first page, because a page_compressed file may contain uncompressed pages. That is apparently being violated.

Here, the first page does not identify the table as being in page_compressed=1 format, but a page is in the compressed format nevertheless.

If I remember correctly, changes to page_compressed should not be reflected by any SYS_ tables, so this could theoretically also be a failure of DDL recovery, which currently largely depends on the DB_TRX_ID fields that are read from records of SYS_TABLES, SYS_INDEXES, and SYS_COLUMNS.

Marko Mäkelä added a comment - 2022-03-24 12:10 I think that we need an rr replay of the server that was killed before the failed recovery. My current hypothesis is that this is repeatable with the default innodb_checksum_algorithm=full_crc32 and a test that runs something like this: CREATE TABLE t ENGINE=InnoDB AS SELECT * FROM seq_1_to_10000; ALTER TABLE t page_compressed=1; ALTER TABLE t page_compressed=0; ALTER TABLE t page_compressed=1; ALTER TABLE t page_compressed=0; … and kills the server during the execution of one of the ALTER TABLE statements. Disabling page_compressed should always cause the table to be rebuilt. It should be possible to enable page_compressed=1 instantly, by just updating the tablespace flags in the first page, because a page_compressed file may contain uncompressed pages. That is apparently being violated. Here, the first page does not identify the table as being in page_compressed=1 format, but a page is in the compressed format nevertheless. If I remember correctly, changes to page_compressed should not be reflected by any SYS_ tables, so this could theoretically also be a failure of DDL recovery, which currently largely depends on the DB_TRX_ID fields that are read from records of SYS_TABLES , SYS_INDEXES , and SYS_COLUMNS .

Vladislav Lesin added a comment - 2022-03-24 16:02 - edited

The history if space->flags changing:

1. the space creating, the initial value is 0x16:

0x0000558cf5f0b453 in fil_space_t::create (id=281, flags=22, purpose=FIL_TYPE_TABLESPACE, crypt_data=0x6130000011c0, mode=FIL_ENCRYPTION_DEFAULT)

    at ./storage/innobase/fil/fil0fil.cc:940

940             space->flags = flags;

(rr) bt

#0  0x0000558cf5f0b453 in fil_space_t::create (id=281, flags=22, purpose=FIL_TYPE_TABLESPACE, crypt_data=0x6130000011c0,

    mode=FIL_ENCRYPTION_DEFAULT) at ./storage/innobase/fil/fil0fil.cc:940

#1  0x0000558cf5f17add in fil_ibd_load (space_id=281, filename=0x65e06169e03a "./test/t7.ibd", space=@0x7fff50b3bb50: 0x0)

    at ./storage/innobase/fil/fil0fil.cc:2647

#2  0x0000558cf5a05d1b in fil_name_process (name=0x65e06169e03a "./test/t7.ibd", len=13, space_id=281, deleted=false, lsn=333147302,

    store=0x7fff50b3ed50) at ./storage/innobase/log/log0recv.cc:1201

#3  0x0000558cf5a11458 in recv_sys_t::parse (this=0x558cf7d10340 <recv_sys>, checkpoint_lsn=324596146, store=0x7fff50b3ed50, apply=false)

    at ./storage/innobase/log/log0recv.cc:2578

2. redo record type WRITE is applied, flags value 0x76, so bits 6,7 are set, what means FSP_FLAGS_FCRC32_GET_COMPRESSED_ALGO() should return 3, i.e. the space is compressed:

Thread 1 hit Hardware watchpoint 1: -location node.space->flags

Old value = 22

New value = 118

fil_space_set_recv_size_and_flags (id=281, size=0, flags=118)

    at ./storage/innobase/fil/fil0fil.cc:1113

1113      mysql_mutex_unlock(&fil_system.mutex);

(rr) bt

#0  fil_space_set_recv_size_and_flags (id=281, size=0, flags=118)

    at ./storage/innobase/fil/fil0fil.cc:1113

#1  0x0000558cf5a0fd07 in recv_sys_t::parse (this=0x558cf7d10340 <recv_sys>, checkpoint_lsn=324596146, store=0x7fff50b3ed50, apply=true)

    at ./storage/innobase/log/log0recv.cc:2409

#2  0x0000558cf5a1a39e in recv_scan_log_recs (store=0x7fff50b3ed50, log_block=0x7b102cbf6000 "", checkpoint_lsn=324596146, start_lsn=332460032,

    end_lsn=332591104, contiguous_lsn=0x7fff50b3ee80, group_scanned_lsn=0x558cf8a6e408 <log_sys+520>)

    at ./storage/innobase/log/log0recv.cc:3554

3. page 0 recovery, sets space flags to 0x16 again, i.e. cleans "compressed" flags:

Old value = 118

New value = 22

recv_recover_page (block=0x769534a7ba00, mtr=..., p=

      {first = {m_id = 1206885810176}, second = {state = page_recv_t::RECV_BEING_PROCESSED, last_offset = 62, log = {head = 0x7695351d7ca8, tail = 0x769534afe6f8}}}, space=0x612000024f40, init=0x0) at ./storage/innobase/log/log0recv.cc:2744

2744                                            + frame);

(rr) bt

#0  recv_recover_page (block=0x769534a7ba00, mtr=..., p=

      {first = {m_id = 1206885810176}, second = {state = page_recv_t::RECV_BEING_PROCESSED, last_offset = 62, log = {head = 0x7695351d7ca8, tail = 0x769534afe6f8}}}, space=0x612000024f40, init=0x0) at ./storage/innobase/log/log0recv.cc:2744

#1  0x0000558cf5a150de in recv_recover_page (space=0x612000024f40, bpage=0x769534a7ba00)

    at ./storage/innobase/log/log0recv.cc:2906

#2  0x0000558cf5e014d9 in buf_page_t::read_complete (this=0x769534a7ba00, node=...)

    at ./storage/innobase/buf/buf0buf.cc:3721

#3  0x0000558cf5f1a3ae in fil_aio_callback (request=...) at ./storage/innobase/fil/fil0fil.cc:2930

And then the recovery of page 67 starts and the crash happens.

The tablespace flags on disk is 0x16, i.e. the space without compression. It looks like the page was not flushed. When redo log record on (2) was parsed, the the correct space flags were set. And then, when the page 0 was read, the space flags was set from non-flushed before killing page 0, from FSP_HEADER_OFFSET + FSP_SPACE_FLAGS (0x36) offset.

It looks like the issue is in recovery itself.

Vladislav Lesin added a comment - 2022-03-24 16:02 - edited The history if space->flags changing: 1. the space creating, the initial value is 0x16: 0x0000558cf5f0b453 in fil_space_t::create (id=281, flags=22, purpose=FIL_TYPE_TABLESPACE, crypt_data=0x6130000011c0, mode=FIL_ENCRYPTION_DEFAULT) at ./storage/innobase/fil/fil0fil.cc:940 940 space->flags = flags; (rr) bt #0 0x0000558cf5f0b453 in fil_space_t::create (id=281, flags=22, purpose=FIL_TYPE_TABLESPACE, crypt_data=0x6130000011c0, mode=FIL_ENCRYPTION_DEFAULT) at ./storage/innobase/fil/fil0fil.cc:940 #1 0x0000558cf5f17add in fil_ibd_load (space_id=281, filename=0x65e06169e03a "./test/t7.ibd", space=@0x7fff50b3bb50: 0x0) at ./storage/innobase/fil/fil0fil.cc:2647 #2 0x0000558cf5a05d1b in fil_name_process (name=0x65e06169e03a "./test/t7.ibd", len=13, space_id=281, deleted=false, lsn=333147302, store=0x7fff50b3ed50) at ./storage/innobase/log/log0recv.cc:1201 #3 0x0000558cf5a11458 in recv_sys_t::parse (this=0x558cf7d10340 <recv_sys>, checkpoint_lsn=324596146, store=0x7fff50b3ed50, apply=false) at ./storage/innobase/log/log0recv.cc:2578 2. redo record type WRITE is applied, flags value 0x76, so bits 6,7 are set, what means FSP_FLAGS_FCRC32_GET_COMPRESSED_ALGO() should return 3, i.e. the space is compressed: Thread 1 hit Hardware watchpoint 1: -location node.space->flags Old value = 22 New value = 118 fil_space_set_recv_size_and_flags (id=281, size=0, flags=118) at ./storage/innobase/fil/fil0fil.cc:1113 1113 mysql_mutex_unlock(&fil_system.mutex); (rr) bt #0 fil_space_set_recv_size_and_flags (id=281, size=0, flags=118) at ./storage/innobase/fil/fil0fil.cc:1113 #1 0x0000558cf5a0fd07 in recv_sys_t::parse (this=0x558cf7d10340 <recv_sys>, checkpoint_lsn=324596146, store=0x7fff50b3ed50, apply=true) at ./storage/innobase/log/log0recv.cc:2409 #2 0x0000558cf5a1a39e in recv_scan_log_recs (store=0x7fff50b3ed50, log_block=0x7b102cbf6000 "", checkpoint_lsn=324596146, start_lsn=332460032, end_lsn=332591104, contiguous_lsn=0x7fff50b3ee80, group_scanned_lsn=0x558cf8a6e408 <log_sys+520>) at ./storage/innobase/log/log0recv.cc:3554 3. page 0 recovery, sets space flags to 0x16 again, i.e. cleans "compressed" flags: Old value = 118 New value = 22 recv_recover_page (block=0x769534a7ba00, mtr=..., p= {first = {m_id = 1206885810176}, second = {state = page_recv_t::RECV_BEING_PROCESSED, last_offset = 62, log = {head = 0x7695351d7ca8, tail = 0x769534afe6f8}}}, space=0x612000024f40, init=0x0) at ./storage/innobase/log/log0recv.cc:2744 2744 + frame); (rr) bt #0 recv_recover_page (block=0x769534a7ba00, mtr=..., p= {first = {m_id = 1206885810176}, second = {state = page_recv_t::RECV_BEING_PROCESSED, last_offset = 62, log = {head = 0x7695351d7ca8, tail = 0x769534afe6f8}}}, space=0x612000024f40, init=0x0) at ./storage/innobase/log/log0recv.cc:2744 #1 0x0000558cf5a150de in recv_recover_page (space=0x612000024f40, bpage=0x769534a7ba00) at ./storage/innobase/log/log0recv.cc:2906 #2 0x0000558cf5e014d9 in buf_page_t::read_complete (this=0x769534a7ba00, node=...) at ./storage/innobase/buf/buf0buf.cc:3721 #3 0x0000558cf5f1a3ae in fil_aio_callback (request=...) at ./storage/innobase/fil/fil0fil.cc:2930 And then the recovery of page 67 starts and the crash happens. The tablespace flags on disk is 0x16, i.e. the space without compression. It looks like the page was not flushed. When redo log record on (2) was parsed, the the correct space flags were set. And then, when the page 0 was read, the space flags was set from non-flushed before killing page 0, from FSP_HEADER_OFFSET + FSP_SPACE_FLAGS (0x36) offset. It looks like the issue is in recovery itself.

Marko Mäkelä added a comment - 2022-03-24 17:23

That adjustment of fil_space_t::flags in recv_recover_page() may be redundant.

If recv_sys_t::apply() wrote "Starting a batch to recover" instead of "Starting final batch", then I think that this is pretty clear: the tablespace flags in page 0 would only be set to the final value during a later batch.

Marko Mäkelä added a comment - 2022-03-24 17:23 That adjustment of fil_space_t::flags in recv_recover_page() may be redundant. If recv_sys_t::apply() wrote "Starting a batch to recover" instead of "Starting final batch", then I think that this is pretty clear: the tablespace flags in page 0 would only be set to the final value during a later batch.

Marko Mäkelä added a comment - 2022-03-25 11:09

I suspect that this may have been broken ever since ~~MDEV-18644~~. The possibly redundant and harmful assignment would be this one (10.4 version, before the code was refactored in ~~MDEV-12353~~):

diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc

index 4359fb6b308..cca94530e4c 100644

--- a/storage/innobase/log/log0recv.cc

+++ b/storage/innobase/log/log0recv.cc

@@ -1503,7 +1503,6 @@ recv_parse_or_apply_log_rec_body(

 				switch (offs) {

 				case FSP_HEADER_OFFSET + FSP_SPACE_FLAGS:

-					space->flags = val;

 					break;

 				case FSP_HEADER_OFFSET + FSP_SIZE:

 					space->size_in_header = val;

Possibly, we should adjust all these fil_space_t fields once when initially parsing the redo log, before starting to apply any records to any data pages. In ~~MDEV-11556~~ we only did that for the FSP_SIZE, so that files could be extended to fil_space_t::recv_size before starting to access any data pages.

Marko Mäkelä added a comment - 2022-03-25 11:09 I suspect that this may have been broken ever since MDEV-18644 . The possibly redundant and harmful assignment would be this one (10.4 version, before the code was refactored in MDEV-12353 ): diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc index 4359fb6b308..cca94530e4c 100644 --- a/storage/innobase/log/log0recv.cc +++ b/storage/innobase/log/log0recv.cc @@ -1503,7 +1503,6 @@ recv_parse_or_apply_log_rec_body( switch (offs) { case FSP_HEADER_OFFSET + FSP_SPACE_FLAGS: - space->flags = val; break; case FSP_HEADER_OFFSET + FSP_SIZE: space->size_in_header = val; Possibly, we should adjust all these fil_space_t fields once when initially parsing the redo log, before starting to apply any records to any data pages. In MDEV-11556 we only did that for the FSP_SIZE , so that files could be extended to fil_space_t::recv_size before starting to access any data pages.

Marko Mäkelä added a comment - 2024-12-10 15:19

~~MDEV-34830~~ could have changed some of this logic, and MDEV-35226 might be related to this. Also ~~MDEV-34898~~ may change something related to this.

Marko Mäkelä added a comment - 2024-12-10 15:19 MDEV-34830 could have changed some of this logic, and MDEV-35226 might be related to this. Also MDEV-34898 may change something related to this.

MariaDB Server

Compressed page is read for non-compressed space during recovery

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration