[MDEV-28784] mariadb-backup prepare fails with srv_start() returned 37 (Data structure corruption). - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Duplicate
Affects Version/s: 10.8.4
Fix Version/s: 10.6.10, 10.7.6, 10.8.5, 10.9.3, 10.10.2
Component/s: Backup, Storage Engine - InnoDB
Labels:
- corruption
- rr-profile-analyzed

Description

Scenario:

1. Start the server and load some initial data.

2. Several sessions run a DML mix

3. During 2. is ongoing the run

    mariabackup --backup

    mariabackup --prepare on the backuped data

The latter fails with

# 2022-06-07T15:46:45 [3674472] | [rr 3711177 3917]2022-06-07 15:44:57 0 [Note] InnoDB: Completed initialization of buffer pool

# 2022-06-07T15:46:45 [3674472] | [rr 3711177 4041]2022-06-07 15:44:57 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=4258151

# 2022-06-07T15:46:45 [3674472] | [rr 3711177 4976]2022-06-07 15:45:00 0 [Note] InnoDB: Starting final batch to recover 314 pages from redo log.

# 2022-06-07T15:46:45 [3674472] | [rr 3711177 5686]2022-06-07 15:45:03 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.

# 2022-06-07T15:46:45 [3674472] | [rr 3711177 5690][00] 2022-06-07 15:45:03 mariadb-backup: srv_start() returned 37 (Data structure corruption).

origin/bb-10.8-MDEV-28708 f431888b636328b487504e24d71753cdaf5e6189 2022-06-07T12:15:27+03:00

Marko assumes that this problem is most probably valid for all official MariaDB versions >=10.6.

pluto:/data/results/1654601840/TBR-1535/home/mleich/rqg/1654601840/219/1/rr

      # rr trace of the source DB server during mariabackup --backup and later

      # That ends with RQG killing that server because of the trouble with mariabackup --prepare.

pluto:/data/results/1654601840/TBR-1535/home/mleich/rqg/1654601840/219/1_clone/rr

      # rr trace of mariabackup --prepare ...

      # mariabackup-0 is the rr trace of mariabackup --backup

pluto:/data/results/1654601840/TBR-1535/home/mleich/rqg/1654601840/219/1/data_backup

       # Filecopy of the data written by  mariabackup --backup and before the --prepare

RQG

===

# git clone https://github.com/mleich1/rqg --branch experimental RQG

# GIT_SHOW: HEAD -> experimental, origin/experimental fcae46996ce85de6bb0844799b69ed6ee8add272 2022-06-01T20:04:31+02:00

# rqg.pl  : Version 4.0.5 (2022-04)

# $RQG_HOME/rqg.pl \

# --gendata=conf/mariadb/oltp.zz \

# --max_gd_duration=900 \

# --grammar=conf/mariadb/oltp.yy \

# --reporters=Mariabackup_linux \

# --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \

# --mysqld=--loose-idle_write_transaction_timeout=0 \

# --mysqld=--loose-idle_transaction_timeout=0 \

# --mysqld=--loose-idle_readonly_transaction_timeout=0 \

# --mysqld=--connect_timeout=60 \

# --mysqld=--interactive_timeout=28800 \

# --mysqld=--slave_net_timeout=60 \

# --mysqld=--net_read_timeout=30 \

# --mysqld=--net_write_timeout=60 \

# --mysqld=--loose-table_lock_wait_timeout=50 \

# --mysqld=--wait_timeout=28800 \

# --mysqld=--lock-wait-timeout=86400 \

# --mysqld=--innodb-lock-wait-timeout=50 \

# --no-mask \

# --queries=10000000 \

# --seed=random \

# --reporters=Backtrace \

# --reporters=ErrorLog \

# --reporters=Deadlock1 \

# --validators=None \

# --mysqld=--log_output=none \

# --mysqld=--log_bin_trust_function_creators=1 \

# --mysqld=--loose-debug_assert_on_not_freed_memory=0 \

# --engine=InnoDB \

# --restart_timeout=240 \

# --mysqld=--plugin-load-add=file_key_management.so \

# --mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \

# --mysqld=--plugin-load-add=provider_lzo.so \

# --mysqld=--plugin-load-add=provider_bzip2.so \

# --mysqld=--plugin-load-add=provider_lzma.so \

# --mysqld=--plugin-load-add=provider_snappy.so \

# --mysqld=--plugin-load-add=provider_lz4.so \

# --duration=300 \

# --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \

# --mysqld=--loose-innodb_read_only_compressed=OFF \

# --mysqld=--innodb_stats_persistent=on \

# --mysqld=--innodb_adaptive_hash_index=off \

# --mysqld=--log-bin \

# --mysqld=--sync-binlog=1 \

# --mysqld=--loose-innodb_evict_tables_on_commit_debug=off \

# --mysqld=--loose-max-statement-time=30 \

# --threads=9 \

# --mysqld=--innodb-use-native-aio=0 \

# --mysqld=--loose-gdb \

# --mysqld=--loose-debug-gdb \

# --rr=Extended \

# --rr_options=--wait \

# --mysqld=--loose_innodb_change_buffering=purges \

# --mysqld=--innodb_rollback_on_timeout=ON \

# --vardir_type=slow \

# --mysqld=--innodb_page_size=16K \

# --mysqld=--innodb-buffer-pool-size=8M \

# --no_mask \

# --mtr-build-thread=948 \

# --batch \

# <local settings>

Error pattern for RQG

[ 'TBR-1535', 'ERROR: Reporter Mariabackup : First prepare returned.+ Executing backup: .+ Executing first prepare: .+mariadb-backup: srv_start\(\) returned 37 \(Data structure corruption\).+RESULT: The RQG run ended with status STATUS_BACKUP_FAILURE' ],

Attachments

Issue Links

duplicates

MDEV-29374 Frequent "Data structure corruption" in InnoDB after OOM

Closed

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2022-06-09 10:45

Thank you. I think that a similar problem occurred during the testing of ~~MDEV-13542~~.

I did some initial analysis with the following:

ssh pluto

rr replay /data/results/1654601840/TBR-1535/home/mleich/rqg/1654601840/219/1_clone/rr/latest-trace

watch -l recv_sys.found_corrupt_fs

continue

break buf_page_t::read_complete

reverse-continue

print/x *frame@srv_page_size

The page is all-zero, so it did pass the consistency check of mariadb-backup --backup. But, we do have log records to be applied to this page. Hence, the page should actually be nonzero.

I think that in order to find the exact reason of the failure, the rr replay trace of mariadb-backup --backup will have to be analyzed. It might be something that cannot be fixed without implementing some type of server-assisted backup (MDEV-14992) where the server would stream log and data pages to the client.

Because this trace is in the 10.8 ~~MDEV-14425~~ format, the log files should be easier to read, since they only contain plain log records, no block headers or footers.

Marko Mäkelä added a comment - 2022-06-09 10:45 Thank you. I think that a similar problem occurred during the testing of MDEV-13542 . I did some initial analysis with the following: ssh pluto rr replay /data/results/1654601840/TBR-1535/home/mleich/rqg/1654601840/219/1_clone/rr/latest-trace watch -l recv_sys.found_corrupt_fs continue break buf_page_t::read_complete reverse-continue print/x *frame@srv_page_size The page is all-zero, so it did pass the consistency check of mariadb-backup --backup . But, we do have log records to be applied to this page. Hence, the page should actually be nonzero. I think that in order to find the exact reason of the failure, the rr replay trace of mariadb-backup --backup will have to be analyzed. It might be something that cannot be fixed without implementing some type of server-assisted backup ( MDEV-14992 ) where the server would stream log and data pages to the client. Because this trace is in the 10.8 MDEV-14425 format, the log files should be easier to read, since they only contain plain log records, no block headers or footers.

Marko Mäkelä added a comment - 2023-03-23 14:16

An attempt to run a recent mariadb-backup --prepare on the corrupted backup provides a little more diagnostics:

10.8 e4b83f0febbd24b04687d4814871c758cbede111
2023-03-23 13:59:39 0 [Note] InnoDB: Starting final batch to recover 314 pages from redo log.
2023-03-23 13:59:39 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=7, page number=255]; set innodb_force_recovery to ignore
[00] 2023-03-23 13:59:39 mariadb-backup: srv_start() returned 37 (Data structure corruption).

I see that we could spend a little more effort and display the file name as well here.

The branch on which this was reproduced is 1 commit ahead of 10.8 cf57fa8d879d3621122acd26bfa21b605be79064. It includes the initial commit of ~~MDEV-13542~~ but not the fixes of ~~MDEV-29374~~ or ~~MDEV-29383~~. The corruption is being flagged because the page is filled with NUL bytes.

10.8 e4b83f0febbd24b04687d4814871c758cbede111
#2 0x0000564bf54690b9 in buf_pool_t::corrupted_evict (
this=this@entry=0x564bf5942b00 <buf_pool>,
bpage=bpage@entry=0x7f44edc211c0, state=state@entry=2147483648)
at /mariadb/10.8/storage/innobase/buf/buf0lru.cc:1261
#3 0x0000564bf5445619 in buf_page_t::read_complete (this=0x7f44edc211c0,
node=@0x564bf6420300: {space = 0x564bf64201c0, name = 0x564bf6420350 "test/oltp3.ibd", handle = {m_file = 12}, on_ssd = 0, punch_hole = 0, atomic_write = 0, is_raw_disk = 0, deferred = 0, size = 704, init_size = 0, max_size = 4294967295, being_extended = {m = std::atomic<bool> = { false }}, chain = {prev = 0x0, next = 0x0}, block_size = 4096})

Page 255 would be at the end of this range of the file:

2e0000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

400000 …

During the execution of the server that ran concurrently with the backup, the function buf_page_t::write_complete() was never invoked on this page. At log_sys.lsn=4997963 the page was removed from buf_pool.flush_list:

#2  0x00005620f96aa7f2 in buf_page_t::clear_oldest_modification (

    this=0x7fb4ba113b60)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/include/buf0buf.h:2047

#3  0x00005620f96ab7fa in buf_pool_t::release_freed_page (

    this=0x5620fb732f40 <buf_pool>, bpage=0x7fb4ba113b60)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/buf/buf0flu.cc:728

#4  0x00005620f96abe37 in buf_page_t::flush (this=0x7fb4ba113b60, lru=false,

    space=0x612000097540)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/buf/buf0flu.cc:756

#5  0x00005620f96a259a in buf_do_flush_list_batch (max_n=200,

    lsn=18446744073709551615)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/buf/buf0flu.cc:1426

#6  0x00005620f96a2f23 in buf_flush_list (max_n=200, lsn=18446744073709551615)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/buf/buf0flu.cc:1502

#7  0x00005620f96a800e in buf_flush_page_cleaner ()

At this time, the latest checkpoint LSN was 3495408. The backup preparation started at checkpoint LSN 4258151. After the page had been freed, the checkpoint LSN was updated to 3925936 and subsequently to 4258151. Now, why is recovery interested in a freed page? Here is where we recovery registers its interest in the page:

10.8 e4b83f0febbd24b04687d4814871c758cbede111
Thread 1 hit Breakpoint 6, recv_sys_t::add (
this=this@entry=0x564bf594e920 <recv_sys>,
it=it@entry={first = {m_id = 30064771327}, second = {state = page_recv_t::RECV_NOT_PROCESSED, last_offset = 1, log = {head = 0x0, tail = 0x0}}},
start_lsn=start_lsn@entry=4296217, lsn=4296441,
l=0x7f44f8d8e4b2 " \200B\a\200\177\006\205<", len=210)
at /mariadb/10.8/storage/innobase/log/log0recv.cc:1993

The log record is INSERT_HEAP_DYNAMIC. Let us see under which circumstances this log record had been written:

Thread 16 hit Hardware watchpoint 12: -location log_sys.lsn._M_i

Old value = 4296217

New value = 4296441

0x00005620f92e65e6 in std::__atomic_base<unsigned long>::store (

    __m=std::memory_order_relaxed, __i=4296441, this=0x5620fc347b40 <log_sys>)

    at /usr/include/c++/9/bits/atomic_base.h:397

397		__atomic_store_n(&_M_i, __i, int(__m));

(rr) when

Current event: 295265

(rr) bt

#0  0x00005620f92e65e6 in std::__atomic_base<unsigned long>::store (

    __m=std::memory_order_relaxed, __i=4296441, this=0x5620fc347b40 <log_sys>)

    at /usr/include/c++/9/bits/atomic_base.h:397

#1  log_t::append_prepare<false> (this=0x5620fc347b40 <log_sys>, size=224,

    ex=false)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/mtr/mtr0mtr.cc:1129

#2  0x00005620f92d72f8 in mtr_t::finish_write (this=0x7fb4a5b3dcb0, len=224)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/mtr/mtr0mtr.cc:1302

#3  0x00005620f92d69ec in mtr_t::do_write (this=0x7fb4a5b3dcb0)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/mtr/mtr0mtr.cc:1280

#4  0x00005620f92d22f6 in mtr_t::commit (this=0x7fb4a5b3dcb0)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/mtr/mtr0mtr.cc:715

#5  0x00005620f93f75c7 in row_ins_clust_index_entry_low (flags=0, mode=2,

    index=0x616000906108, n_uniq=1, entry=0x6160054f7908, n_ext=0,

    thr=0x6210004148d8)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/row/row0ins.cc:2797

Here is our page of interest:

(rr) p *(mtr_memo_slot_t*)m_memo.m_first_block.m_data@3

$7 = {{object = 0x0, type = MTR_MEMO_S_LOCK}, {object = 0x7fb4ba2139c0,

    type = MTR_MEMO_PAGE_SX_FIX}, {object = 0x7fb4ba113b60,

    type = MTR_MEMO_PAGE_X_MODIFY}}

(rr) p/x *(buf_block_t*)0x7fb4ba113b60

$8 = {page = {id_ = {m_id = 0x7000000ff}, hash = 0x0, oldest_modification_ = {

      m = {<std::__atomic_base<unsigned long>> = {_M_i = 0x39ed8a}…

At this point of time, we still had log_sys.last_checkpoint_lsn=3377749. The recv_sys.lsn during recovery advances all the way to 5447384. This write started after the checkpoint LSN 4258151, but the oldest modification to the page was 0x39ed8a=3796362, before the checkpoint. Therefore, recovery is right in complaining that it fails to read a copy of the page so that it can apply this log record to it. Why was the block freed and never written out? Let us check this call to buf_page_free() at LSN=4824182:

#0  buf_page_free (

    space=0x5620f97bc18a <fseg_free_page(unsigned char*, fil_space_t*, unsigned int, mtr_t*, bool)+789>, page=32692, mtr=0x7fb4b70ca960)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/buf/buf0buf.cc:2184

#1  0x00005620f95b99c1 in btr_page_free (index=0x616000906108,

    block=0x7fb4ba113b60, mtr=0x7fb4b70cbbe0, blob=false, space_latched=false)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/btr/btr0btr.cc:619

#2  0x00005620f95d154e in btr_compress (cursor=0x61a00000e5a8, adjust=false,

    mtr=0x7fb4b70cbbe0)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/btr/btr0btr.cc:3850

#3  0x00005620f9621ac5 in btr_cur_compress_if_useful (cursor=0x61a00000e5a8,

    adjust=false, mtr=0x7fb4b70cbbe0)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/btr/btr0cur.cc:5393

#4  0x00005620f9624d47 in btr_cur_pessimistic_delete (err=0x7fb4b70cb8d0,

    has_reserved_extents=0, cursor=0x61a00000e5a8, flags=0, rollback=false,

    mtr=0x7fb4b70cbbe0)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/btr/btr0cur.cc:5838

#5  0x00005620f947c36b in row_purge_remove_clust_if_poss_low (

    node=0x61a00000e508, mode=522)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/row/row0purge.cc:220

#6  0x00005620f947c619 in row_purge_remove_clust_if_poss (node=0x61a00000e508)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/row/row0purge.cc:260

#7  0x00005620f947e4fb in row_purge_del_mark (node=0x61a00000e508)

…

(rr) display

1: log_sys.lsn._M_i = 4824182

3: buf_pool.flush_list.start.oldest_modification_.m._M_i = 4816180

4: /x id_.m_id = <error: current stack frame does not contain a variable named `this'>

6: log_sys.last_checkpoint_lsn.m._M_i = 3377749

(rr) when

Current event: 393629

This looks like something that was fixed in ~~MDEV-29374~~, which revised buf_page_free() as follows:

diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc

index eaf0f955a1f..c1e9dedf72d 100644

--- a/storage/innobase/buf/buf0buf.cc

+++ b/storage/innobase/buf/buf0buf.cc

@@ -2128,7 +2128,7 @@ void buf_page_free(fil_space_t *space, uint32_t page, mtr_t *mtr)

     btr_search_drop_page_hash_index(block);

 #endif /* BTR_CUR_HASH_ADAPT */

   block->page.set_freed(block->page.state());

-  mtr->memo_push(block, MTR_MEMO_PAGE_X_FIX);

+  mtr->memo_push(block, MTR_MEMO_PAGE_X_MODIFY);

 /** Get read access to a compressed page (usually of type

That revision should not actually make any difference in this case, because at the time this thread invokes mtr_t::commit(), we do have another record that marked the page as modified in the mini-transaction:

(rr) p *(mtr_memo_slot_t*)m_memo.m_first_block.m_data@20

$26 = {{object = 0x616000906278, type = MTR_MEMO_SX_LOCK}, {

    object = 0x7fb4ba2139c0, type = MTR_MEMO_PAGE_X_FIX}, {

    object = 0x7fb4ba113b60, type = MTR_MEMO_BUF_FIX}, {

    object = 0x7fb4ba015560, type = MTR_MEMO_PAGE_X_FIX}, {

    object = 0x7fb4ba113b60, type = MTR_MEMO_PAGE_X_MODIFY}, {

    object = 0x7fb4ba514860, type = MTR_MEMO_PAGE_X_FIX}, {

    object = 0x612000097540, type = MTR_MEMO_SPACE_X_LOCK}, {

    object = 0x7fb4ba213750, type = MTR_MEMO_PAGE_SX_FIX}, {

    object = 0x7fb4ba2139c0, type = MTR_MEMO_BUF_FIX}, {

    object = 0x7fb4ba2139c0, type = MTR_MEMO_PAGE_X_MODIFY}, {

    object = 0x7fb4ba015560, type = MTR_MEMO_PAGE_X_MODIFY}, {

    object = 0x7fb4ba015560, type = MTR_MEMO_PAGE_X_MODIFY}, {

    object = 0x7fb4ba514860, type = MTR_MEMO_PAGE_X_MODIFY}, {

    object = 0x7fb4ba2139c0, type = MTR_MEMO_PAGE_SX_FIX}, {

    object = 0x7fb4ba2138f0, type = MTR_MEMO_PAGE_SX_MODIFY}, {

    object = 0x7fb4ba213750, type = MTR_MEMO_PAGE_SX_MODIFY}, {

    object = 0x7fb4ba213750, type = MTR_MEMO_PAGE_SX_MODIFY}, {

    object = 0x7fb4ba113b60, type = MTR_MEMO_PAGE_X_FIX}, {

    object = 0x7fb4ba2139c0, type = MTR_MEMO_BUF_FIX}, {

    object = 0x7fb4ba2139c0, type = MTR_MEMO_PAGE_X_FIX}}

Besides, our block 0x7fb4ba113b60 still carries the oldest_modification LSN of 3796362.

The mini-transaction is going to write the following log records, at LSN=4824182:

(rr) p/x *m_log.m_first_block.m_data@39

$33 = {0x29, 0x7, 0x80, 0x7f, 0x9, 0x8e, 0x16, 0x0, 0x80, 0x49,

  0x30, 0x5, 0x7, 0x80, 0x29, 0x27, 0xe, 0x2b, 0xec, 0x80, 0x38, 0x0, 0x0, 0x0, 0x0,

  0x2b, 0x23, 0x0, 0x2, 0x0, 0x35, 0xb3, 0x2b, 0x0, 0x1a, 0xb0, 0xaa, 0x30,

  0x19}

The first record is DELETE_ROW_FORMAT_DYNAMIC for the page 7:255. It is followed by 2 records for page 7:169, written by btr_page_reorganize_low(). For some reason, no FREE_PAGE record for our page 255 was written by this mini-transaction. I double checked that during the recovery, no FREE_PAGE record for the page was parsed:

diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc

index 793f7b327c8..88ed55f577a 100644

--- a/storage/innobase/log/log0recv.cc

+++ b/storage/innobase/log/log0recv.cc

@@ -1994,6 +1994,7 @@ inline void recv_sys_t::add(map::iterator it, lsn_t start_lsn, lsn_t lsn,

   switch (*l & 0x70) {

   case FREE_PAGE: case INIT_PAGE:

+    ut_ad(page_id.page_no() != 255);

     recs.will_not_read();

     mlog_init.add(page_id, start_lsn); /* FIXME: remove this! */

     /* fall through */

@@ -2547,6 +2548,7 @@ inline recv_sys_t::parse_mtr_result recv_sys_t::parse(store_t store, source &l)

       ut_ad(freed.find(id) == freed.end());

       switch (b & 0x70) {

       case FREE_PAGE:

+        ut_ad(id.page_no() != 255);

         ut_ad(freed.emplace(id).second);

         last_offset= 1; /* the next record must not be same_page  */

         goto free_or_init_page;

Neither assertion was hit.

One more possible explanation could have been that the change under test was different from what was finally pushed as ~~MDEV-28708~~. Even if this difference had caused buf_pool.flush_list to be in wrong order, it should not explain this recovery error, because the checkpoint from which the recovery started was executed much after the time the page had been evicted from the buffer pool.

The last clue in the ~~MDEV-29374~~ commit message is useful:

fseg_free_page_low(): Remove an accidentally added return statement
that prevented mtr_t::free() from being called. This fixes a regression
that was introduced in
commit 0b47c126e31cddda1e94588799599e138400bcf8 (~~MDEV-13542~~).

Let us check for that:

(rr) bt

#0  fseg_free_page_low (

    seg_inode=0xabfbd9f3423bf700 <error: Cannot access memory at address 0xabfbd9f3423bf700>, iblock=0x612000097540, space=0x7f00b70cbc48, offset=22048,

    mtr=0x5620f92d8535 <mtr_t::memo_contains(fil_space_t const&, bool)+365>,

    ahi=12)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/fsp/fsp0fsp.cc:2486

#1  0x00005620f97bc18a in fseg_free_page (seg_header=0x7fb4ba24804a "",

    space=0x612000097540, offset=255, mtr=0x7fb4b70cbbe0, have_latch=false)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/fsp/fsp0fsp.cc:2614

#2  0x00005620f95b9932 in btr_page_free (index=0x616000906108,

    block=0x7fb4ba113b60, mtr=0x7fb4b70cbbe0, blob=false, space_latched=false)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/btr/btr0btr.cc:613

#3  0x00005620f95d154e in btr_compress (cursor=0x61a00000e5a8, adjust=false,

    mtr=0x7fb4b70cbbe0)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/btr/btr0btr.cc:3850

(rr) finish

Run till exit from #0  fseg_free_page_low (

    seg_inode=0xabfbd9f3423bf700 <error: Cannot access memory at address 0xabfbd9f3423bf700>, iblock=0x612000097540, space=0x7f00b70cbc48, offset=22048,

    mtr=0x5620f92d8535 <mtr_t::memo_contains(fil_space_t const&, bool)+365>,

    ahi=12)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/fsp/fsp0fsp.cc:2486

fseg_free_page (seg_header=0x7fb4ba24804a "", space=0x612000097540,

    offset=255, mtr=0x7fb4b70cbbe0, have_latch=false)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/fsp/fsp0fsp.cc:2614

2614	    return fseg_free_page_low(seg_inode, iblock, space, offset, mtr);

1: log_sys.lsn._M_i = 4824182

3: buf_pool.flush_list.start.oldest_modification_.m._M_i = 4816180

4: /x id_.m_id = <error: current stack frame does not contain a variable named `this'>

6: log_sys.last_checkpoint_lsn.m._M_i = 3377749

Value returned is $35 = DB_SUCCESS

(rr) disable display

(rr) reverse-step

fseg_free_page_low (seg_inode=0x7fb4ba2440f2 "", iblock=0x7fb4ba2138f0,

    space=0x612000097540, offset=255, mtr=0x7fb4b70cbbe0, ahi=false)

    at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/fsp/fsp0fsp.cc:2586

2586	}

(rr) reverse-next

2486	{

(rr)

2581			return err;

(rr) list

2576			}

2577			err = fsp_free_extent(space, offset, mtr);

2578			if (UNIV_UNLIKELY(err != DB_SUCCESS)) {

2579				return err;

2580			}

2581			return err;

2582		}

2584		mtr->free(*space, static_cast<uint32_t>(offset));

2585		return DB_SUCCESS;

That is, the accidental stray return statement that was removed in ~~MDEV-29374~~ caused this bug. In other words, this bug report is a duplicate of ~~MDEV-29374~~.

Marko Mäkelä added a comment - 2023-03-23 14:16 An attempt to run a recent mariadb-backup --prepare on the corrupted backup provides a little more diagnostics: 10.8 e4b83f0febbd24b04687d4814871c758cbede111 2023-03-23 13:59:39 0 [Note] InnoDB: Starting final batch to recover 314 pages from redo log. 2023-03-23 13:59:39 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=7, page number=255]; set innodb_force_recovery to ignore [00] 2023-03-23 13:59:39 mariadb-backup: srv_start() returned 37 (Data structure corruption). I see that we could spend a little more effort and display the file name as well here. The branch on which this was reproduced is 1 commit ahead of 10.8 cf57fa8d879d3621122acd26bfa21b605be79064. It includes the initial commit of MDEV-13542 but not the fixes of MDEV-29374 or MDEV-29383 . The corruption is being flagged because the page is filled with NUL bytes. 10.8 e4b83f0febbd24b04687d4814871c758cbede111 #2 0x0000564bf54690b9 in buf_pool_t::corrupted_evict ( this=this@entry=0x564bf5942b00 <buf_pool>, bpage=bpage@entry=0x7f44edc211c0, state=state@entry=2147483648) at /mariadb/10.8/storage/innobase/buf/buf0lru.cc:1261 #3 0x0000564bf5445619 in buf_page_t::read_complete (this=0x7f44edc211c0, node=@0x564bf6420300: {space = 0x564bf64201c0, name = 0x564bf6420350 "test/oltp3.ibd", handle = {m_file = 12}, on_ssd = 0, punch_hole = 0, atomic_write = 0, is_raw_disk = 0, deferred = 0, size = 704, init_size = 0, max_size = 4294967295, being_extended = {m = std::atomic<bool> = { false }}, chain = {prev = 0x0, next = 0x0}, block_size = 4096}) Page 255 would be at the end of this range of the file: 2e0000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 400000 … During the execution of the server that ran concurrently with the backup, the function buf_page_t::write_complete() was never invoked on this page. At log_sys.lsn=4997963 the page was removed from buf_pool.flush_list : #2 0x00005620f96aa7f2 in buf_page_t::clear_oldest_modification ( this=0x7fb4ba113b60) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/include/buf0buf.h:2047 #3 0x00005620f96ab7fa in buf_pool_t::release_freed_page ( this=0x5620fb732f40 <buf_pool>, bpage=0x7fb4ba113b60) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/buf/buf0flu.cc:728 #4 0x00005620f96abe37 in buf_page_t::flush (this=0x7fb4ba113b60, lru=false, space=0x612000097540) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/buf/buf0flu.cc:756 #5 0x00005620f96a259a in buf_do_flush_list_batch (max_n=200, lsn=18446744073709551615) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/buf/buf0flu.cc:1426 #6 0x00005620f96a2f23 in buf_flush_list (max_n=200, lsn=18446744073709551615) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/buf/buf0flu.cc:1502 #7 0x00005620f96a800e in buf_flush_page_cleaner () At this time, the latest checkpoint LSN was 3495408. The backup preparation started at checkpoint LSN 4258151. After the page had been freed, the checkpoint LSN was updated to 3925936 and subsequently to 4258151. Now, why is recovery interested in a freed page? Here is where we recovery registers its interest in the page: 10.8 e4b83f0febbd24b04687d4814871c758cbede111 Thread 1 hit Breakpoint 6, recv_sys_t::add ( this=this@entry=0x564bf594e920 <recv_sys>, it=it@entry={first = {m_id = 30064771327}, second = {state = page_recv_t::RECV_NOT_PROCESSED, last_offset = 1, log = {head = 0x0, tail = 0x0}}}, start_lsn=start_lsn@entry=4296217, lsn=4296441, l=0x7f44f8d8e4b2 " \200B\a\200\177\006\205<", len=210) at /mariadb/10.8/storage/innobase/log/log0recv.cc:1993 The log record is INSERT_HEAP_DYNAMIC . Let us see under which circumstances this log record had been written: Thread 16 hit Hardware watchpoint 12: -location log_sys.lsn._M_i Old value = 4296217 New value = 4296441 0x00005620f92e65e6 in std::__atomic_base<unsigned long>::store ( __m=std::memory_order_relaxed, __i=4296441, this=0x5620fc347b40 <log_sys>) at /usr/include/c++/9/bits/atomic_base.h:397 397 __atomic_store_n(&_M_i, __i, int(__m)); (rr) when Current event: 295265 (rr) bt #0 0x00005620f92e65e6 in std::__atomic_base<unsigned long>::store ( __m=std::memory_order_relaxed, __i=4296441, this=0x5620fc347b40 <log_sys>) at /usr/include/c++/9/bits/atomic_base.h:397 #1 log_t::append_prepare<false> (this=0x5620fc347b40 <log_sys>, size=224, ex=false) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/mtr/mtr0mtr.cc:1129 #2 0x00005620f92d72f8 in mtr_t::finish_write (this=0x7fb4a5b3dcb0, len=224) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/mtr/mtr0mtr.cc:1302 #3 0x00005620f92d69ec in mtr_t::do_write (this=0x7fb4a5b3dcb0) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/mtr/mtr0mtr.cc:1280 #4 0x00005620f92d22f6 in mtr_t::commit (this=0x7fb4a5b3dcb0) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/mtr/mtr0mtr.cc:715 #5 0x00005620f93f75c7 in row_ins_clust_index_entry_low (flags=0, mode=2, index=0x616000906108, n_uniq=1, entry=0x6160054f7908, n_ext=0, thr=0x6210004148d8) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/row/row0ins.cc:2797 Here is our page of interest: (rr) p *(mtr_memo_slot_t*)m_memo.m_first_block.m_data@3 $7 = {{object = 0x0, type = MTR_MEMO_S_LOCK}, {object = 0x7fb4ba2139c0, type = MTR_MEMO_PAGE_SX_FIX}, {object = 0x7fb4ba113b60, type = MTR_MEMO_PAGE_X_MODIFY}} (rr) p/x *(buf_block_t*)0x7fb4ba113b60 $8 = {page = {id_ = {m_id = 0x7000000ff}, hash = 0x0, oldest_modification_ = { m = {<std::__atomic_base<unsigned long>> = {_M_i = 0x39ed8a}… At this point of time, we still had log_sys.last_checkpoint_lsn=3377749 . The recv_sys.lsn during recovery advances all the way to 5447384. This write started after the checkpoint LSN 4258151, but the oldest modification to the page was 0x39ed8a=3796362, before the checkpoint. Therefore, recovery is right in complaining that it fails to read a copy of the page so that it can apply this log record to it. Why was the block freed and never written out? Let us check this call to buf_page_free() at LSN=4824182: #0 buf_page_free ( space=0x5620f97bc18a <fseg_free_page(unsigned char*, fil_space_t*, unsigned int, mtr_t*, bool)+789>, page=32692, mtr=0x7fb4b70ca960) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/buf/buf0buf.cc:2184 #1 0x00005620f95b99c1 in btr_page_free (index=0x616000906108, block=0x7fb4ba113b60, mtr=0x7fb4b70cbbe0, blob=false, space_latched=false) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/btr/btr0btr.cc:619 #2 0x00005620f95d154e in btr_compress (cursor=0x61a00000e5a8, adjust=false, mtr=0x7fb4b70cbbe0) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/btr/btr0btr.cc:3850 #3 0x00005620f9621ac5 in btr_cur_compress_if_useful (cursor=0x61a00000e5a8, adjust=false, mtr=0x7fb4b70cbbe0) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/btr/btr0cur.cc:5393 #4 0x00005620f9624d47 in btr_cur_pessimistic_delete (err=0x7fb4b70cb8d0, has_reserved_extents=0, cursor=0x61a00000e5a8, flags=0, rollback=false, mtr=0x7fb4b70cbbe0) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/btr/btr0cur.cc:5838 #5 0x00005620f947c36b in row_purge_remove_clust_if_poss_low ( node=0x61a00000e508, mode=522) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/row/row0purge.cc:220 #6 0x00005620f947c619 in row_purge_remove_clust_if_poss (node=0x61a00000e508) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/row/row0purge.cc:260 #7 0x00005620f947e4fb in row_purge_del_mark (node=0x61a00000e508) … (rr) display 1: log_sys.lsn._M_i = 4824182 3: buf_pool.flush_list.start.oldest_modification_.m._M_i = 4816180 4: /x id_.m_id = <error: current stack frame does not contain a variable named `this'> 6: log_sys.last_checkpoint_lsn.m._M_i = 3377749 (rr) when Current event: 393629 This looks like something that was fixed in MDEV-29374 , which revised buf_page_free() as follows: diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc index eaf0f955a1f..c1e9dedf72d 100644 --- a/storage/innobase/buf/buf0buf.cc +++ b/storage/innobase/buf/buf0buf.cc @@ -2128,7 +2128,7 @@ void buf_page_free(fil_space_t *space, uint32_t page, mtr_t *mtr) btr_search_drop_page_hash_index(block); #endif /* BTR_CUR_HASH_ADAPT */ block->page.set_freed(block->page.state()); - mtr->memo_push(block, MTR_MEMO_PAGE_X_FIX); + mtr->memo_push(block, MTR_MEMO_PAGE_X_MODIFY); } /** Get read access to a compressed page (usually of type That revision should not actually make any difference in this case, because at the time this thread invokes mtr_t::commit() , we do have another record that marked the page as modified in the mini-transaction: (rr) p *(mtr_memo_slot_t*)m_memo.m_first_block.m_data@20 $26 = {{object = 0x616000906278, type = MTR_MEMO_SX_LOCK}, { object = 0x7fb4ba2139c0, type = MTR_MEMO_PAGE_X_FIX}, { object = 0x7fb4ba113b60, type = MTR_MEMO_BUF_FIX}, { object = 0x7fb4ba015560, type = MTR_MEMO_PAGE_X_FIX}, { object = 0x7fb4ba113b60, type = MTR_MEMO_PAGE_X_MODIFY}, { object = 0x7fb4ba514860, type = MTR_MEMO_PAGE_X_FIX}, { object = 0x612000097540, type = MTR_MEMO_SPACE_X_LOCK}, { object = 0x7fb4ba213750, type = MTR_MEMO_PAGE_SX_FIX}, { object = 0x7fb4ba2139c0, type = MTR_MEMO_BUF_FIX}, { object = 0x7fb4ba2139c0, type = MTR_MEMO_PAGE_X_MODIFY}, { object = 0x7fb4ba015560, type = MTR_MEMO_PAGE_X_MODIFY}, { object = 0x7fb4ba015560, type = MTR_MEMO_PAGE_X_MODIFY}, { object = 0x7fb4ba514860, type = MTR_MEMO_PAGE_X_MODIFY}, { object = 0x7fb4ba2139c0, type = MTR_MEMO_PAGE_SX_FIX}, { object = 0x7fb4ba2138f0, type = MTR_MEMO_PAGE_SX_MODIFY}, { object = 0x7fb4ba213750, type = MTR_MEMO_PAGE_SX_MODIFY}, { object = 0x7fb4ba213750, type = MTR_MEMO_PAGE_SX_MODIFY}, { object = 0x7fb4ba113b60, type = MTR_MEMO_PAGE_X_FIX}, { object = 0x7fb4ba2139c0, type = MTR_MEMO_BUF_FIX}, { object = 0x7fb4ba2139c0, type = MTR_MEMO_PAGE_X_FIX}} Besides, our block 0x7fb4ba113b60 still carries the oldest_modification LSN of 3796362. The mini-transaction is going to write the following log records, at LSN=4824182: (rr) p/x *m_log.m_first_block.m_data@39 $33 = {0x29, 0x7, 0x80, 0x7f, 0x9, 0x8e, 0x16, 0x0, 0x80, 0x49, 0x30, 0x5, 0x7, 0x80, 0x29, 0x27, 0xe, 0x2b, 0xec, 0x80, 0x38, 0x0, 0x0, 0x0, 0x0, 0x2b, 0x23, 0x0, 0x2, 0x0, 0x35, 0xb3, 0x2b, 0x0, 0x1a, 0xb0, 0xaa, 0x30, 0x19} The first record is DELETE_ROW_FORMAT_DYNAMIC for the page 7:255. It is followed by 2 records for page 7:169, written by btr_page_reorganize_low() . For some reason, no FREE_PAGE record for our page 255 was written by this mini-transaction. I double checked that during the recovery, no FREE_PAGE record for the page was parsed: diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc index 793f7b327c8..88ed55f577a 100644 --- a/storage/innobase/log/log0recv.cc +++ b/storage/innobase/log/log0recv.cc @@ -1994,6 +1994,7 @@ inline void recv_sys_t::add(map::iterator it, lsn_t start_lsn, lsn_t lsn, switch (*l & 0x70) { case FREE_PAGE: case INIT_PAGE: + ut_ad(page_id.page_no() != 255); recs.will_not_read(); mlog_init.add(page_id, start_lsn); /* FIXME: remove this! */ /* fall through */ @@ -2547,6 +2548,7 @@ inline recv_sys_t::parse_mtr_result recv_sys_t::parse(store_t store, source &l) ut_ad(freed.find(id) == freed.end()); switch (b & 0x70) { case FREE_PAGE: + ut_ad(id.page_no() != 255); ut_ad(freed.emplace(id).second); last_offset= 1; /* the next record must not be same_page */ goto free_or_init_page; Neither assertion was hit. One more possible explanation could have been that the change under test was different from what was finally pushed as MDEV-28708 . Even if this difference had caused buf_pool.flush_list to be in wrong order, it should not explain this recovery error, because the checkpoint from which the recovery started was executed much after the time the page had been evicted from the buffer pool. The last clue in the MDEV-29374 commit message is useful: fseg_free_page_low(): Remove an accidentally added return statement that prevented mtr_t::free() from being called. This fixes a regression that was introduced in commit 0b47c126e31cddda1e94588799599e138400bcf8 ( MDEV-13542 ). Let us check for that: (rr) bt #0 fseg_free_page_low ( seg_inode=0xabfbd9f3423bf700 <error: Cannot access memory at address 0xabfbd9f3423bf700>, iblock=0x612000097540, space=0x7f00b70cbc48, offset=22048, mtr=0x5620f92d8535 <mtr_t::memo_contains(fil_space_t const&, bool)+365>, ahi=12) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/fsp/fsp0fsp.cc:2486 #1 0x00005620f97bc18a in fseg_free_page (seg_header=0x7fb4ba24804a "", space=0x612000097540, offset=255, mtr=0x7fb4b70cbbe0, have_latch=false) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/fsp/fsp0fsp.cc:2614 #2 0x00005620f95b9932 in btr_page_free (index=0x616000906108, block=0x7fb4ba113b60, mtr=0x7fb4b70cbbe0, blob=false, space_latched=false) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/btr/btr0btr.cc:613 #3 0x00005620f95d154e in btr_compress (cursor=0x61a00000e5a8, adjust=false, mtr=0x7fb4b70cbbe0) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/btr/btr0btr.cc:3850 (rr) finish Run till exit from #0 fseg_free_page_low ( seg_inode=0xabfbd9f3423bf700 <error: Cannot access memory at address 0xabfbd9f3423bf700>, iblock=0x612000097540, space=0x7f00b70cbc48, offset=22048, mtr=0x5620f92d8535 <mtr_t::memo_contains(fil_space_t const&, bool)+365>, ahi=12) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/fsp/fsp0fsp.cc:2486 fseg_free_page (seg_header=0x7fb4ba24804a "", space=0x612000097540, offset=255, mtr=0x7fb4b70cbbe0, have_latch=false) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/fsp/fsp0fsp.cc:2614 2614 return fseg_free_page_low(seg_inode, iblock, space, offset, mtr); 1: log_sys.lsn._M_i = 4824182 3: buf_pool.flush_list.start.oldest_modification_.m._M_i = 4816180 4: /x id_.m_id = <error: current stack frame does not contain a variable named `this'> 6: log_sys.last_checkpoint_lsn.m._M_i = 3377749 Value returned is $35 = DB_SUCCESS (rr) disable display (rr) reverse-step fseg_free_page_low (seg_inode=0x7fb4ba2440f2 "", iblock=0x7fb4ba2138f0, space=0x612000097540, offset=255, mtr=0x7fb4b70cbbe0, ahi=false) at /data/Server/bb-10.8-MDEV-28708A/storage/innobase/fsp/fsp0fsp.cc:2586 2586 } (rr) reverse-next 2486 { (rr) 2581 return err; (rr) list 2576 } 2577 err = fsp_free_extent(space, offset, mtr); 2578 if (UNIV_UNLIKELY(err != DB_SUCCESS)) { 2579 return err; 2580 } 2581 return err; 2582 } 2583 2584 mtr->free(*space, static_cast<uint32_t>(offset)); 2585 return DB_SUCCESS; That is, the accidental stray return statement that was removed in MDEV-29374 caused this bug. In other words, this bug report is a duplicate of MDEV-29374 .

People

Assignee:: Marko Mäkelä

Reporter:: Matthias Leich

Votes:: 1 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 2022-06-09 10:24

Updated:: 2023-03-23 14:19

Resolved:: 2023-03-23 14:19

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server