[MDEV-21347] innodb_log_optimize_ddl=OFF is not crash safe

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 10.2.17, 10.3.9, 10.4.0, 10.5.0, 10.5
Fix Version/s: 10.2.33, 10.3.24, 10.4.14, 10.5.5
Component/s: Storage Engine - InnoDB
Labels:
- corruption

Description

In MDEV-16809, MariaDB introduced an option to enable redo logging for bulk-load index creation, which originally skipped redo logging.

This is a nice feature: it solves the problem of PXB (Percona XtraBackup) backups with concurrent DDL. We ported it to our MySQL branch and filed a feature request with MySQL upstream at https://bugs.mysql.com/bug.php?id=92099.

But recently we encountered several data corruption cases in our production environment, and after some investigation we found that this feature is not crash safe. If mysqld restarts abnormally during DDL, crash recovery itself crashes. I managed to reproduce this bug using the latest MariaDB (10.5, source fetched from GitHub).

Thread 7 "mysqld" received signal SIGSEGV, Segmentation fault.

>>> bt

#0  page_rec_find_owner_rec (rec=0x7fff7830c09d "\200") at /home/fungo/Projects/mariadb-server/storage/innobase/include/page0page.ic:770

#1  page_cur_insert_rec_low (cur=cur@entry=0x7fff33ffd6f0, index=index@entry=0x7fff2c0016f0, rec=rec@entry=0x7fff33ffdab7 "\200", offsets=<optimized out>, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/page/page0cur.cc:1532

#2  0x0000555556173aaa in page_cur_rec_insert (mtr=0x7fff33ffe420, offsets=<optimized out>, index=0x7fff2c0016f0, rec=0x7fff33ffdab7 "\200", cursor=0x7fff33ffd6f0) at /home/fungo/Projects/mariadb-server/storage/innobase/include/page0cur.ic:319

#3  page_cur_parse_insert_rec (is_short=is_short@entry=false, ptr=<optimized out>, end_ptr=end_ptr@entry=0x55555ab30b51 "", block=block@entry=0x7fff78001590, index=0x7fff2c0016f0, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/page/page0cur.cc:1146

#4  0x0000555556159559 in recv_parse_or_apply_log_rec_body (type=MLOG_COMP_REC_INSERT, ptr=<optimized out>, end_ptr=0x55555ab30b51 "", space_id=6, page_no=15, apply=apply@entry=true, block=block@entry=0x7fff78001590, mtr=mtr@entry=0x7fff33ffe420) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:1497

#5  0x0000555556159d7b in recv_recover_page (block=block@entry=0x7fff78001590, mtr=..., p=..., init=init@entry=0x0) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:1951

#6  0x0000555555b4fb25 in recv_recover_page (bpage=bpage@entry=0x7fff78001590) at /home/fungo/Projects/mariadb-server/storage/innobase/log/log0recv.cc:2048

#7  0x000055555627945a in buf_page_io_complete (bpage=bpage@entry=0x7fff78001590, dblwr=dblwr@entry=true, evict=evict@entry=false) at /home/fungo/Projects/mariadb-server/storage/innobase/buf/buf0buf.cc:5993

#8  0x00005555562d8098 in fil_aio_callback (cb=cb@entry=0x555557a60e60) at /home/fungo/Projects/mariadb-server/storage/innobase/fil/fil0fil.cc:4375

#9  0x000055555616e363 in io_callback (cb=0x555557a60e60) at /home/fungo/Projects/mariadb-server/storage/innobase/os/os0file.cc:3880

#10 0x0000555556350707 in tpool::task_group::execute (this=0x555557a3d480, t=0x555557a60ea8) at /home/fungo/Projects/mariadb-server/tpool/task_group.cc:55

#11 0x000055555634efc1 in tpool::thread_pool_generic::worker_main (this=0x5555579fe090, thread_var=0x555557a0d8a0) at /home/fungo/Projects/mariadb-server/tpool/tpool_generic.cc:509

#12 0x00007ffff6c36360 in ?? () from /lib64/libstdc++.so.6

#13 0x00007ffff7bc6e25 in start_thread () from /lib64/libpthread.so.0

#14 0x00007ffff6398f1d in clone () from /lib64/libc.so.6

>>> p rec

$1 = (rec_t *) 0x0

Analysis:

The reason is as follows:
1. A redo log record is written (page_cur_insert_rec_write_log()) along with every record insertion in PageBulk::insert().
2. The index page header info is fixed up at BtrBulk::pageCommit() by invoking PageBulk::finishPage(); then PageBulk::commit() commits the mtr (releasing buf_fix_count and the rwlock, and writing the local redo to the global buffer).
3. But we may commit the mtr too early, via PageBulk::release().

If the mtr is committed by PageBulk::release(), the rwlock is released and the dirty page can be flushed to disk, even though the index page header info has not yet been fixed up.
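
The hazard described above can be illustrated with a toy model. This is only a sketch: ToyPage, header_n_recs and flush() are hypothetical names made up for illustration and do not correspond to InnoDB's actual data structures. It assumes only what the analysis states, namely that record bodies are written at insert time while the page header is made consistent later, in a separate finish step.

```python
# Toy model only: all names here are hypothetical, not InnoDB's.
import copy
from dataclasses import dataclass, field


@dataclass
class ToyPage:
    records: list = field(default_factory=list)  # record bodies written by insert()
    header_n_recs: int = 0                        # header field, fixed up only by finish()

    def insert(self, rec):
        # Like the bulk loader in the analysis: write the record, leave the header alone.
        self.records.append(rec)

    def finish(self):
        # The header becomes consistent with the records only here.
        self.header_n_recs = len(self.records)

    def is_consistent(self):
        return self.header_n_recs == len(self.records)


def flush(page):
    """Return the page image that would reach disk if the page were flushed now."""
    return copy.deepcopy(page)


page = ToyPage()
for rec in ("a", "b", "c"):
    page.insert(rec)

# Early release: the latch is dropped, so a flush may happen before finish().
early_image = flush(page)

page.finish()
late_image = flush(page)

print(early_image.is_consistent())  # False: records on disk, stale header
print(late_image.is_consistent())   # True
```

In this model, the image flushed between the early release and the finish step is the problematic one: it holds the records but a header that does not describe them.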

So if
1. a checkpoint happens after PageBulk::release() and before PageBulk::commit(), and
2. mysqld is killed (by OOM or otherwise) before InnoDB makes a new checkpoint,
3. then the crash recovery that follows step 2 will crash as shown in the stack trace above.
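
The sequence above can be sketched as a small simulation. This is a toy model under stated assumptions, not InnoDB's recovery code: replay(), RecoveryError and the (lsn, record) tuples are hypothetical, and the model assumes that replaying an insert relies on the page header matching the records already on the page, and that redo at or below the checkpoint LSN is skipped.

```python
# Toy model only: replay() and RecoveryError are hypothetical names.
class RecoveryError(Exception):
    pass


def replay(disk_records, disk_header_n_recs, redo, checkpoint_lsn):
    """Apply redo records newer than the checkpoint to a page image from disk."""
    records = list(disk_records)
    n_recs = disk_header_n_recs
    for lsn, rec in redo:
        if lsn <= checkpoint_lsn:
            continue  # assumed to be reflected in the on-disk page already
        if n_recs != len(records):
            # Assumption of this model: replay needs a header that matches the page.
            raise RecoveryError(
                f"header says {n_recs} records, page holds {len(records)}"
            )
        records.append(rec)
        n_recs += 1
    return records


redo = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]

# Case 1: the page was never flushed; recovery starts from an empty page.
recovered = replay([], 0, redo, checkpoint_lsn=0)
print(recovered)  # ['a', 'b', 'c', 'd']

# Case 2: the page was flushed after the early release (3 records, stale header),
# the checkpoint advanced to LSN 3, and the server was killed after LSN 4.
try:
    replay(["a", "b", "c"], 0, redo, checkpoint_lsn=3)
    failed = False
except RecoveryError as exc:
    failed = True
    print("recovery failed:", exc)
```

Only the second case fails, which matches the report: the checkpoint must land between the early release and the final commit for the stale header to survive into recovery.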

Below is how I manually reproduced this (gdb assistance is needed):

0. In my.cnf, set innodb_sort_buffer_size=65536 to make sure PageBulk::release() will be invoked.

1. Load data into t1:
create table t1(id int auto_increment, name varchar(30), primary key(id)) engine=innodb;

insert into t1 values (1, "MySQL"), (2, "MariaDB"), (3, "AlisQL"), (4, "PolarDB"), (5, "hahaha");

insert into t1(name) select a.name from t1 a, t1 b limit 5000;
insert into t1(name) select a.name from t1 a, t1 b limit 5000;
insert into t1(name) select a.name from t1 a, t1 b limit 5000;
insert into t1(name) select a.name from t1 a, t1 b limit 5000;
insert into t1(name) select a.name from t1 a, t1 b limit 5000;
insert into t1(name) select a.name from t1 a, t1 b limit 5000;

2. Set a gdb breakpoint at PageBulk::release().

3. Run
optimize table t1;

4. gdb will stop at PageBulk::release().
Let PageBulk::release() return by issuing two or more finish commands.

Then use gdb to manually call os_thread_sleep(30*1000000) to block the current bulk load thread; this gives the page cleaner enough time to flush all dirty pages and advance the checkpoint.

5. After os_thread_sleep() returns, use the same gdb trick as in step 4 to block the page cleaner thread indefinitely, for example by calling os_thread_sleep(3000*1000000).

6. The optimize query from step 3 will then finish.

7. `show engine innodb status` will show that there is some un-checkpointed redo.

8. Then kill -9 mysqld; the subsequent crash recovery will crash.
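
For step 7, the amount of un-checkpointed redo can be read off the LOG section of the status output as the difference between the current LSN and the last checkpoint LSN. The helper below is a minimal sketch: uncheckpointed_redo_bytes() is a hypothetical name, the sample text uses made-up numbers, and it assumes the "Log sequence number" and "Last checkpoint at" labels appear in that form in the server version being tested.

```python
import re


def uncheckpointed_redo_bytes(status_text):
    """Return current LSN minus last checkpoint LSN from InnoDB status text."""
    lsn = re.search(r"^Log sequence number\s+(\d+)", status_text, re.MULTILINE)
    ckpt = re.search(r"^Last checkpoint at\s+(\d+)", status_text, re.MULTILINE)
    if not lsn or not ckpt:
        raise ValueError("LOG section fields not found in status text")
    return int(lsn.group(1)) - int(ckpt.group(1))


# Illustrative sample only; the numbers are invented, not taken from the report.
sample = """\
---
LOG
---
Log sequence number 1000500
Log flushed up to   1000500
Pages flushed up to 1000100
Last checkpoint at  1000100
"""

print(uncheckpointed_redo_bytes(sample))  # 400
```

A non-zero result before the kill -9 in step 8 indicates that recovery will have redo to apply.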

Issue Links

blocks

MDEV-23156 [ERROR] InnoDB: Record <number1> is above rec heap top <number2> hit during restart after crash (Closed)

is caused by

MDEV-16809 Allow full redo logging for ALTER TABLE (Closed)

relates to

MDEV-10217 innodb.innodb_bug59641 fails sporadically in buildbot: InnoDB: Failing assertion: current_rec != insert_rec in file page0cur.c line 1052 (Closed)

MDEV-15110 InnoDB crash on recovery (Closed)

MDEV-23720 Change innodb_log_optimize_ddl=OFF by default (Closed)

MDEV-22190 After IMPORT: InnoDB: Record 126 is above rec heap top 120 (Closed)

People

Assignee: Marko Mäkelä

Reporter: Fungo Wang

Votes: 0

Watchers: 7

Dates

Created: 2019-12-18 09:15

Updated: 2020-09-11 13:52

Resolved: 2020-07-16 04:28
