[MDEV-23986] [ERROR] [FATAL] InnoDB: Page ... name ... page_type ... key_version 1 lsn ... compressed_len ... Created: 2020-10-19  Updated: 2020-10-21  Resolved: 2020-10-21

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.5
Fix Version/s: 10.5.7

Type: Bug Priority: Major
Reporter: Matthias Leich Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: debug, page_compression, regression, rr-profile

Issue Links:
Blocks
blocks MDEV-23855 InnoDB log checkpointing causes regre... Closed
Problem/Incident
is caused by MDEV-18644 Support FULL_CRC32 for compressed pages. Closed
Relates
relates to MDEV-23199 page_compression flag is missing for ... Closed
relates to MDEV-12353 Efficient InnoDB redo log record format Closed

 Description   

The problem looks roughly like the closed bug https://jira.mariadb.org/browse/MDEV-23199.
Work flow:
1. Start the DB server and generate some data.
2. Stress the DB server with DDL and DML
     Encryption+Compression are involved, but no explicite XA commands or replication.
3. SIGKILL the DB server during 2.
4. Attempt to restart the DB Server
[rr 2043283 3708]2020-10-19 16:19:14 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=154755265
...
2020-10-19 16:19:16 0 [Note] InnoDB: Ignoring data file './test/#sql-alter-1c8fbe-12.ibd' with space ID 145. Another data file called ./test/t3.ibd exists with the same space ID.
2020-10-19 16:19:17 0 [Note] InnoDB: Transaction 351694 was in the XA prepared state.
2020-10-19 16:19:17 0 [ERROR] [FATAL] InnoDB: Page 1115494424:3 name test/t5 page_type 32770 key_version 1 lsn 150031025 compressed_len 26538
201019 16:19:17 [ERROR] mysqld got signal 6 ;
...
(rr) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x000014c09952a859 in __GI_abort () at abort.c:79
#2  0x0000563e8df53670 in ib::fatal::~fatal (this=0x7ffe6b4e1810, __in_chrg=<optimized out>) at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/ut/ut0ut.cc:580
#3  0x0000563e8e05a9ae in fil_page_type_validate (space=0x615000004198, page=page@entry=0x1bf109030000 "") at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/include/mach0data.ic:84
#4  0x0000563e8e061e78 in buf_page_decrypt_after_read (bpage=bpage@entry=0x1bf107b12718, node=...) at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/buf/buf0buf.cc:497
#5  0x0000563e8e0661a2 in buf_page_read_complete (bpage=bpage@entry=0x1bf107b12718, node=...) at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/buf/buf0buf.cc:4200
#6  0x0000563e8e0ec03d in buf_read_page_low (err=err@entry=0x7ffe6b4e3590, space=<optimized out>, sync=sync@entry=true, mode=mode@entry=132, page_id=..., zip_size=zip_size@entry=0, unzip=<optimized out>)
    at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/buf/buf0rea.cc:360
#7  0x0000563e8e0ec2c5 in buf_read_page (page_id=..., zip_size=zip_size@entry=0) at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/buf/buf0rea.cc:489
#8  0x0000563e8e072b53 in buf_page_get_low (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=10, 
    file=file@entry=0x563e8efe4e80 "/home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/dict/dict0load.cc", line=<optimized out>, mtr=<optimized out>, err=<optimized out>, allow_ibuf_merge=<optimized out>)
    at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/buf/buf0buf.cc:3119
#9  0x0000563e8e075541 in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=10, file=file@entry=0x563e8efe4e80 "/home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/dict/dict0load.cc", 
    line=2988, mtr=0x7ffe6b4e4960, err=0x0, allow_ibuf_merge=false) at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/buf/buf0buf.cc:3576
#10 0x0000563e8e17ffb3 in dict_load_table_one (name=..., ignore_err=ignore_err@entry=DICT_ERR_IGNORE_RECOVER_LOCK, fk_tables=std::deque with 0 elements) at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/include/fil0fil.h:674
#11 0x0000563e8e1816e0 in dict_load_table (name=0x614000005938 "test/t5", ignore_err=ignore_err@entry=DICT_ERR_IGNORE_RECOVER_LOCK) at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/include/dict0types.h:114
#12 0x0000563e8e1957c5 in dict_load_table_on_id (table_id=table_id@entry=137, ignore_err=ignore_err@entry=DICT_ERR_IGNORE_RECOVER_LOCK) at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/dict/dict0load.cc:3176
#13 0x0000563e8e1401ea in dict_table_open_on_id_low (table_id=table_id@entry=137, ignore_err=<optimized out>, cached_only=<optimized out>) at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/dict/dict0dict.cc:222
#14 0x0000563e8e15321b in dict_table_open_on_id (table_id=137, dict_locked=dict_locked@entry=false, table_op=table_op@entry=DICT_TABLE_OP_LOAD_TABLESPACE, thd=thd@entry=0x0, mdl=mdl@entry=0x0)
    at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/dict/dict0dict.cc:948
#15 0x0000563e8df1a459 in trx_resurrect_table_locks (trx=trx@entry=0x4a1e648a29d8, undo=undo@entry=0x6120000049d8) at /usr/include/c++/9/bits/stl_tree.h:356
#16 0x0000563e8df2001c in trx_resurrect (undo=undo@entry=0x6120000049d8, rseg=rseg@entry=0x613000004f18, start_time=start_time@entry=1603124357, start_time_micro=start_time_micro@entry=2644134905403, rows_to_undo=rows_to_undo@entry=0x7ffe6b4e6b40, 
    is_old_insert=is_old_insert@entry=false) at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/trx/trx0trx.cc:713
#17 0x0000563e8df2220a in trx_lists_init_at_db_start () at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/trx/trx0trx.cc:774
#18 0x0000563e8de23a44 in srv_start (create_new_db=<optimized out>) at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/srv/srv0start.cc:1510
#19 0x0000563e8d8c8fac in innodb_init (p=<optimized out>) at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/handler/ha_innodb.cc:3896
#20 0x0000563e8cc894c8 in ha_initialize_handlerton (plugin=0x62100002e010) at /home/mleich/Server/bb-10.5-MDEV-23855/sql/handler.cc:645
#21 0x0000563e8c3fba8e in plugin_initialize (tmp_root=tmp_root@entry=0x7ffe6b4ead10, plugin=plugin@entry=0x62100002e010, argc=argc@entry=0x563e8fdf2ec0 <remaining_argc>, argv=argv@entry=0x615000000d78, options_only=<optimized out>)
    at /home/mleich/Server/bb-10.5-MDEV-23855/sql/sql_plugin.cc:1459
#22 0x0000563e8c3ff34d in plugin_init (argc=argc@entry=0x563e8fdf2ec0 <remaining_argc>, argv=<optimized out>, flags=0) at /home/mleich/Server/bb-10.5-MDEV-23855/sql/sql_plugin.cc:1751
#23 0x0000563e8c064074 in init_server_components () at /home/mleich/Server/bb-10.5-MDEV-23855/sql/mysqld.cc:4913
#24 0x0000563e8c06eb96 in mysqld_main (argc=46, argv=<optimized out>) at /home/mleich/Server/bb-10.5-MDEV-23855/sql/mysqld.cc:5496
#25 0x0000563e8c04cf76 in main (argc=<optimized out>, argv=<optimized out>) at /home/mleich/Server/bb-10.5-MDEV-23855/sql/main.cc:25
(rr) quit
 
origin/bb-10.5-MDEV-23855 6155c7e55fafe2c0bd03c41e181e3d4531cfe0c5 2020-10-19T18:08:53+03:00 compiled with debug+ASAN+Og
Per Marko: The problem is not related to MDEV-23855.
 
Hitting this problem is extreme rare.
 
RQG
====
git clone https://github.com/mleich1/rqg --branch experimental RQG
 
perl rqg.pl \
--grammar=conf/mariadb/innodb_compression_encryption.yy \
--gendata=conf/mariadb/innodb_compression_encryption.zz \
--mysqld=--plugin-load-add=file_key_management.so \
--mysqld=--loose-file-key-management-filename=/data/mleich/RQG_mleich/conf/mariadb/encryption_keys.txt \
--mysqld=--innodb-encrypt-log \
--mysqld=--innodb-encrypt-tables \
--reporters=CrashRecovery1 \
--mysqld=--innodb_use_native_aio=1 \
--mysqld=--innodb_lock_schedule_algorithm=fcfs \
--mysqld=--loose-idle_write_transaction_timeout=0 \
--mysqld=--loose-idle_transaction_timeout=0 \
--mysqld=--loose-idle_readonly_transaction_timeout=0 \
--mysqld=--connect_timeout=60 \
--mysqld=--interactive_timeout=28800 \
--mysqld=--slave_net_timeout=60 \
--mysqld=--net_read_timeout=30 \
--mysqld=--net_write_timeout=60 \
--mysqld=--loose-table_lock_wait_timeout=50 \
--mysqld=--wait_timeout=28800 \
--mysqld=--lock-wait-timeout=86400 \
--mysqld=--innodb-lock-wait-timeout=50 \
--no-mask \
--queries=10000000 \
--seed=random \
--reporters=Backtrace \
--reporters=ErrorLog \
--reporters=Deadlock1 \
--validators=None \
--mysqld=--log_output=none \
--mysqld=--log-bin \
--mysqld=--log_bin_trust_function_creators=1 \
--mysqld=--loose-debug_assert_on_not_freed_memory=0 \
--engine=InnoDB \
--restart_timeout=120 \
--duration=400 \
--mysqld=--loose-innodb_fatal_semaphore_wait_threshold=400 \
--mysqld=--loose-innodb-sync-debug \
--mysqld=--innodb_stats_persistent=on \
--mysqld=--loose-max-statement-time=30 \
--threads=9 \
--mysqld=--innodb_page_size=64K \
--mysqld=--innodb-buffer-pool-size=24M \
--duration=400 \
--no_mask \
--workdir=<local settings> \
--vardir=<local settings> \
--mtr-build-thread=<local settings> \
--basedir1=<local settings> \
--script_debug=_nix_ \
--rr=Server \
--rr_options=--chaos
 



 Comments   
Comment by Matthias Leich [ 2020-10-19 ]

rr:/home/mleich/RQG/storage/1603121566/CR-29/dev/shm/vardir/1603121566/83/1/rr
_RR_TRACE_DIR="." rr replay --mark-stdio mysqld-0    # Fate of the server till SIGKILL  ~ event 2208622
_RR_TRACE_DIR="." rr replay --mark-stdio mysqld-1    # Fate of the server during restart attempt
 
rr:/home/mleich/RQG/storage/1603121566/000458.tgz -- Archive with remains of the test
rr:/home/mleich/RQG/storage/1603121566/000458.log -- RQG log
RQG error pattern
 [ 'CR-29' , '\[ERROR\] \[FATAL\] InnoDB: Page .{1,20} name .{1,50} page_type .{1,20} key_version' ],
                                                   

Comment by Marko Mäkelä [ 2020-10-19 ]

thiru, please debug this further. I think that the problem is that we are compressing or decompressing the 64KiB page in the wrong way. The page appears to compress to 512 bytes. Well before the SIGKILL we had the following:

Thread 18 hit Breakpoint 2, buf_flush_page (bpage=bpage@entry=0x317b21012200, lru=lru@entry=false, space=space@entry=0x61500007b518) at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/buf/buf0flu.cc:777
777	{
1: /x bpage.id_ = {m_id = 0x7a00000003}
(rr) when
Current event: 1998110
(rr) set $b=(buf_block_t*)bpage
(rr) watch -l $b.page.io_fix_
(rr) continue
Continuing.
2020-10-19 16:18:20 20 [Note] InnoDB: Online DDL : End of reading clustered index of the table and create temporary files
[Switching to Thread 1871806.1959551]Thread 65 hit Hardware watchpoint 2: -location $b.page.io_fix_
Old value = {m = {_M_i = BUF_IO_WRITE}}
New value = {m = {_M_i = BUF_IO_NONE}}
buf_page_write_complete (bpage=bpage@entry=0x317b21012200, request=@0x713d7afd1ab8: {bpage = 0x317b21012200, node = 0x0, type = IORequest::WRITE_ASYNC}, dblwr=<optimized out>)
    at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/buf/buf0flu.cc:330
330	  mysql_mutex_lock(&buf_pool.flush_list_mutex);
(rr) when
Current event: 1999902
(rr) p/x *$b.frame@srv_page_size
$1 = {0x0, 0x0, 0x0, 0x1, 0x0, 0x0, 0x0, 0x3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x0, 0x0, 0x0, 0x0, 0x8, 0xf1, 0x4a, 0xb1, 0x45, 0xbf, 0x0 <repeats 11 times>, 0x7a, 0x0, 0x2, 0x0, 0xac, 0x80, 0x6, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x5, 0x0, 0x0, 0x0, 0x4, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xdc, 0x7e, 0x0, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xed, 0x0, 0x0, 0x0, 0x7a, 0x0, 0x0, 0x0, 0x2, 0x0, 0xf2, 0x0, 0x0, 0x0, 0x7a, 0x0, 0x0, 0x0, 0x2, 0x0, 0x32, 0x1, 0x0, 0x2, 0x0, 0x1a, 0x69, 0x6e, 0x66, 0x69, 0x6d, 0x75, 0x6d, 0x0, 0x5, 0x0, 0xb, 0x0, 0x0, 0x73, 0x75, 0x70, 0x72, 0x65, 0x6d, 0x75, 0x6d, 0x10, 0x0, 0x11, 0x0, 0xd, 0x0, 0x0, 0x0, 0x1, 0x0, 0x0, 0x0, 0x5, 0x0, 0x0, 0x19, 0x0, 0xd, 0x0, 0x0, 0x1, 0x29, 0x0, 0x0, 0x0, 0x6, 0x0, 0x0, 0x21, 0x0, 0xd, 0x0, 0x0, 0x2, 0x51, 0x0, 0x0, 0x0, 0x8, 0x0, 0x0, 0x29, 0xff, 0xcc, 0x0, 0x0, 0x3, 0x79, 0x0, 0x0, 0x0, 0x9, 0x0 <repeats 65353 times>, 0x70, 0x0, 0x63, 0x8, 0xf1, 0x4a, 0xb1, 0xa5, 0x40, 0x3b, 0xd7}
(rr) up
#3  0x00005585a470aa79 in tpool::simulated_aio::simulated_aio_callback (param=0x70e574832a30) at /home/mleich/Server/bb-10.5-MDEV-23855/tpool/aio_simulated.cc:162
162	    cb->m_callback(cb);
(rr) p *cb
$2 = {<iocb> = {data = 0x0, key = 0, aio_rw_flags = 0, aio_lio_opcode = 0, aio_reqprio = 0, aio_fildes = 0, u = {c = {buf = 0x0, nbytes = 0, offset = 0, __pad3 = 0, flags = 0, resfd = 0}, v = {vec = 0x0, nr = 0, offset = 0}, poll = {events = 0, __pad1 = 0}, saddr = {addr = 0x0, len = 0}}}, m_fh = 46, m_opcode = tpool::aio_opcode::AIO_PWRITE, m_offset = 196608, m_buffer = 0x1493c8db0000, m_len = 512, m_callback = 0x5585a3e98567 <io_callback(tpool::aiocb*)>, m_group = 0x6120000024d0, m_ret_len = 512, m_err = 0, m_internal = 0x0, m_internal_task = {_vptr.task = 0x5585a5a9b208 <vtable for tpool::task+16>, m_func = 0x5585a470a8be <tpool::simulated_aio::simulated_aio_callback(void*)>, m_arg = 0x70e574832a30, m_group = 0x6120000024d0}, m_userdata = "\030\t\001\000\300`\000\000\000\"\001!{1\000\000\000\000\000\000\000\000\000\000\021\000\000\000\377\377\377\377\000\"\001!{1\000"}
(rr) p/x *(char*)cb.m_buffer@512
$3 = {0x0, 0x0, 0x0, 0x1, 0x0, 0x0, 0x0, 0x3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x0, 0x0, 0x0, 0x0, 0x8, 0xf1, 0x4a, 0xb1, 0x80, 0x2, 0xd7, 0x42, 0x68, 0x22, 0x38, 0xcf, 0x78, 0x81, 0x42, 0x7d, 0x18, 0x18, 0x67, 0xaa, 0x7b, 0xef, 0x59, 0xb0, 0x76, 0xa8, 0xb8, 0xc0, 0x13, 0xb5, 0x1a, 0xb8, 0x76, 0xd1, 0xa0, 0x77, 0xa, 0xe4, 0x7b, 0x36, 0x69, 0x54, 0xf, 0xeb, 0x47, 0x79, 0xc0, 0xe8, 0xbc, 0x13, 0xbb, 0x8d, 0x29, 0xb7, 0xb3, 0x42, 0x2f, 0xe, 0xa1, 0x5, 0x53, 0x8f, 0x7d, 0xc4, 0x63, 0x6d, 0x82, 0xa2, 0x59, 0x74, 0x4e, 0xc1, 0x85, 0xb2, 0xd8, 0xd9, 0x5b, 0x39, 0x47, 0x31, 0x48, 0x60, 0x66, 0x34, 0x10, 0xb8, 0xcb, 0xc3, 0x60, 0xfa, 0x3a, 0xd9, 0xe7, 0x16, 0x4b, 0x54, 0xe0, 0x5e, 0xbf, 0xd7, 0x48, 0xa, 0x2e, 0x90, 0x82, 0xf5, 0xb2, 0xde, 0x24, 0x7e, 0x8d, 0x81, 0x17, 0x5, 0x2a, 0x12, 0x48, 0x91, 0x95, 0x1, 0x2f, 0x3c, 0x56, 0x92, 0x3c, 0x98, 0x6a, 0x23, 0xdf, 0x6d, 0x64, 0x40, 0x15, 0x77, 0xe6, 0xc3, 0x1b, 0x15, 0xc4, 0x80, 0x48, 0x39, 0xfa, 0x6d, 0x58, 0x59, 0x17, 0x8b, 0xea, 0xba, 0xca, 0xbe, 0x92, 0x60, 0x43, 0x97, 0x51, 0xb4, 0xa8, 0x1, 0x74, 0xcb, 0x7, 0xcb, 0xe3, 0x97, 0x9f, 0xa3, 0xcf, 0x28, 0x88, 0x7f, 0xef, 0xa9, 0x8e, 0xfa, 0xc2, 0x2b, 0xd0, 0x3c, 0x39, 0x80, 0x94, 0x7c, 0xed, 0x3e, 0x80, 0xf8, 0x8a, 0xaa, 0x15, 0x31, 0xc, 0x73, 0x5c, 0xb1, 0xaf, 0x48, 0x7f, 0xb5, 0x13, 0xe6, 0x57, 0xf, 0x94, 0x3b, 0xa4, 0xc2, 0x60, 0xe7, 0xe7, 0xde, 0xa4, 0x7c, 0x3, 0xaa, 0x4f, 0xf3, 0xb0, 0xd0, 0x29, 0xb4, 0xbc, 0xeb, 0xa7, 0x32, 0xa5, 0x6c, 0xa7, 0x8d, 0x1e, 0xc, 0x99, 0xde, 0x18, 0x7d, 0x74, 0xfa, 0xce, 0x73, 0x1, 0xf7, 0x8b, 0xad, 0x5, 0x43, 0x23, 0x3b, 0x7c, 0x2c, 0xef, 0xfc, 0xce, 0xfa, 0xe9, 0xb0, 0xd2, 0x1d, 0xb2, 0xd7, 0xc9, 0x81, 0xcd, 0x87, 0x2b, 0x74, 0xb8, 0x6b, 0x84, 0x56, 0x42, 0xf7, 0x79, 0xda, 0x34, 0xe2, 0x38, 0xb7, 0x24, 0x53, 0x1d, 0xb5, 0xa2, 0x5c, 0x4, 0x12, 0x9a, 0x2e, 0x5e, 0xdd, 0x23, 0xae, 0x76, 0x42, 0x5f, 0x0, 0x4, 0x2f, 0x2f, 0xb4, 0xa6, 0xde, 0x7e, 0x38, 0x46, 0x8e, 0x9a, 0xb1, 0x4d, 0xdb, 0xca, 0x1f, 0x5d, 0x9b, 0x3, 0x5f, 0xdb, 0xa6, 0x6c, 0xb7, 0x6a, 0xbb, 0xed, 0x6d, 0xb5, 0x4b, 0x76, 0x3, 0xeb, 0xe2, 0x38, 0x99, 0x33, 0xdf, 0xce, 0x7f, 0x6a, 0x16, 0x26, 0x50, 0x8a, 0xc4, 0x51, 0x9, 0xc0, 0xef, 0xad, 0x2d, 0xb7, 0xde, 0x9, 0xb6, 0x37, 0x9a, 0xc8, 0x4b, 0x7e, 0xef, 0x1, 0x37, 0x4, 0x2e, 0x7d, 0xb2, 0xba, 0x19, 0x2f, 0xc2, 0x8c, 0xea, 0x86, 0xf8, 0x12, 0x9a, 0xd6, 0x9f, 0x81, 0x1e, 0x33, 0xee, 0xb8, 0x37, 0x3, 0xad, 0xa9, 0x10, 0xe6, 0x8b, 0x16, 0x97, 0xf8, 0x3d, 0xfb, 0x4e, 0xf5, 0xfb, 0x93, 0xa6, 0x41, 0x80, 0x75, 0x7e, 0x22, 0x43, 0xbd, 0x83, 0x49, 0xc7, 0xf, 0xe2, 0x2c, 0x23, 0x3, 0x9d, 0x7e, 0x5e, 0x43, 0x1a, 0x0, 0x82, 0x43, 0x93, 0xfc, 0xaa, 0x35, 0xe5, 0xfa, 0x54, 0xce, 0xda, 0x7e, 0x2e, 0x87, 0x4, 0xdf, 0x22, 0x1, 0x29, 0xa6, 0x21, 0x1c, 0x3b, 0xd0, 0x23, 0x73, 0xfb, 0x95, 0xae, 0xb9, 0x0, 0x2c, 0x62, 0x20, 0x37, 0xda, 0x44, 0x31, 0xa7, 0x29, 0x43, 0xa8, 0x8a, 0xc3, 0x90, 0xb6, 0x4b, 0x28, 0xa7, 0x63, 0x54, 0x91, 0x2b, 0x18, 0x85, 0x98, 0x87, 0xb8, 0x6f, 0xb2, 0x89, 0x3d, 0x61, 0xb7, 0x8e, 0x77, 0x61, 0xa7, 0xfd, 0x2a, 0x41, 0xa6, 0xf1}

The $3 matches exactly the contents of *page@512 when recovery is reporting page corruption.

We must find out what we are doing wrong in the compression or encryption.

I suspect that the affected version is 10.4 (which introduced innodb_checksum_algorithm=full_crc32) or an even earlier version.

Comment by Marko Mäkelä [ 2020-10-20 ]

I debugged this a bit further. The assertion fails because we apparently decrypt the 512 bytes wrong. In fil_encrypt_buf_for_full_crc32(), the bytes before encrypting are as follows:

(rr) p/x *src_frame@size
$3 = {0x0, 0x0, 0x0, 0x1, 0x0, 0x0, 0x0, 0x3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x0, 0x0, 0x0, 0x0, 0x8, 0xf1, 0x4a, 0xb1, 0x80, 0x2, 0x78, 0x9c, 0xed, 0xce, 0x31, 0x4e, 0x42, 0x51, 0x10, 0x5, 0xd0, 0xfb, 0x41, 0xd, 0xc6, 0x98, 0x48, 0xa9, 0x9d, 0x25, 0xad, 0xad, 0x8d, 0x8d, 0xd, 0x9d, 0x6b, 0x30, 0x9a, 0x50, 0x60, 0x88, 0x84, 0x42, 0xb, 0xe3, 0x66, 0xec, 0x6c, 0x58, 0x1, 0x1b, 0xb0, 0xb7, 0x71, 0xb, 0x16, 0xb8, 0x1, 0x9c, 0xf, 0x4, 0x37, 0x60, 0x79, 0x4e, 0xf2, 0x5e, 0xe6, 0x26, 0xf3, 0xde, 0x4c, 0x92, 0x26, 0x49, 0x77, 0xb5, 0x55, 0x75, 0x7a, 0xcb, 0xe1, 0xfc, 0x7a, 0x91, 0x3f, 0xcf, 0xe9, 0xe4, 0xfd, 0xf5, 0x60, 0x9b, 0xf6, 0xeb, 0xec, 0x6d, 0xca, 0xaf, 0x97, 0xf5, 0xeb, 0xd6, 0xf7, 0xba, 0x2f, 0xd5, 0xf9, 0xb3, 0xab, 0x2e, 0x9a, 0xba, 0xce, 0x46, 0xf, 0xf7, 0xa3, 0xf1, 0x6c, 0x5c, 0xef, 0x8e, 0x92, 0xe9, 0x6c, 0xf2, 0x78, 0x57, 0xe9, 0x24, 0xfd, 0x1c, 0x67, 0x33, 0xbc, 0x3e, 0x3c, 0x6d, 0x43, 0x33, 0xa8, 0x50, 0x53, 0xce, 0xdb, 0xd0, 0xb9, 0x69, 0x37, 0x49, 0x6, 0xab, 0x8f, 0x5a, 0xef, 0xa9, 0xc2, 0x61, 0x0 <repeats 63 times>, 0xf8, 0x5f, 0x93, 0xdc, 0xf6, 0x96, 0xc3, 0xf9, 0xdb, 0xd5, 0xe5, 0xe7, 0x2f, 0x84, 0x10, 0x1f, 0xfa, 0x0 <repeats 252 times>, 0x1f, 0xbd, 0xed, 0x5e}

It is a little curious that we have a sequence of 252 zero bytes at the end. The total length of the preceding block is 256 bytes, while the 252 NUL bytes and 4 CRC-32C bytes would make up another 256 bytes.

Note that the length is encoded as mach_read_from_2(src_frame + FIL_PAGE_TYPE) == 0x8002, which will trip the assertion failure on recovery. Those bytes were intentionally written by fil_page_compress_for_full_crc32():

	memcpy(out_buf, buf, header_len);
	out_buf[FIL_PAGE_TYPE] = 1U << (FIL_PAGE_COMPRESS_FCRC32_MARKER - 8);
	out_buf[FIL_PAGE_TYPE + 1] = byte(write_size >> 8);
	/* Clean up the buffer for the remaining write_size (except checksum) */

It turns out that since MDEV-12026, we have two assertions in the code path in buf_page_decrypt_after_read(), and the first one is obviously failing since MDEV-18644. Only after both assertions, we will check the full_crc32 flag:

		ut_d(fil_page_type_validate(space, dst_frame));
 
		/* decrypt using crypt_buf to dst_frame */
		if (!fil_space_decrypt(space, slot->crypt_buf, dst_frame)) {
			slot->release();
			goto decrypt_failed;
		}
 
		ut_d(fil_page_type_validate(space, dst_frame));
 
		if ((space->full_crc32() && page_compressed)
		    || fil_page_is_compressed_encrypted(dst_frame)) {
			goto decompress_with_slot;
		}

I think that we must skip all the assertions for the full_crc32 format. There is a third one right after the decompress_with_slot: label. That one should fail for unencrypted page_compressed pages as well.

Please try to recover the dataset after fixing the assertions. As far as I can tell, it should be easy to write a test case based on this analysis. It might be as simple as FLUSH TABLES…FOR EXPORT followed by ALTER TABLE…DISCARD TABLESPACE and ALTER TABLE…IMPORT TABLESPACE.

Comment by Marko Mäkelä [ 2020-10-20 ]

It is possible that the changes in MDEV-23855 broke the log checkpoint. We seem to be writing log checkpoint information for a too new LSN. Apparently, page 0 of the tablespace 122 had not been written to the data file at the time the SIGKILL was delivered to the server. Yet, crash recovery resumed from a much newer LSN. I will debug this further.

Comment by Marko Mäkelä [ 2020-10-20 ]

At the time of the SIGKILL, the latest checkpoint LSN is 153136497. When that checkpoint was written, we invoked log_write_checkpoint_info(end_lsn=154755265), which is what the start-up message displays.

However, the start-up message is wrong, and the actual checkpoint LSN matches what we had last written before the server was killed:

Thread 1 hit Hardware watchpoint 2: -location recv_sys.mlog_checkpoint_lsn
 
Old value = 0
New value = 154755265
recv_sys_t::parse (this=this@entry=0x563e8fb46340 <recv_sys>, 
    checkpoint_lsn=checkpoint_lsn@entry=153136497, 
    store=store@entry=0x7ffe6b4e5ca0, apply=apply@entry=false)
    at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/log/log0recv.cc:2118
2118	            l+= 8;
(rr) list
2113	          {
2114	            /* There can be multiple FILE_CHECKPOINT for the same LSN. */
2115	            if (mlog_checkpoint_lsn)
2116	              continue;
2117	            mlog_checkpoint_lsn= recovered_lsn;
2118	            l+= 8;
2119	            recovered_offset= l - buf;
2120	            return true;
2121	          }
2122	          continue;
(rr) p/x *(char*)l@8
$4 = {0x0, 0x0, 0x0, 0x0, 0x9, 0x20, 0xad, 0x71}
(rr) p 0x920ad71
$5 = 153136497

In the buffer pool before the server was killed, page 122:0 carries the oldest_modification of 153500937 and FIL_PAGE_LSN of 153500943 (both of which are after the checkpoint).

The question should be: Why was fil_space_set_recv_size_and_flags(id=122) not invoked? So, let us look at the recovery once again:

Thread 1 hit Breakpoint 4, recv_sys_t::parse (
    this=this@entry=0x563e8fb46340 <recv_sys>, 
    checkpoint_lsn=checkpoint_lsn@entry=153136497, 
    store=store@entry=0x7ffe6b4e5ca0, apply=apply@entry=true)
    at /home/mleich/Server/bb-10.5-MDEV-23855/storage/innobase/log/log0recv.cc:1804
1804	  if (UNIV_UNLIKELY(end_lsn > scanned_lsn))
1: end_lsn = 153500943
(rr) when
Current event: 3993
(rr) p/x *log@6
$6 = {0x34, 0x7a, 0x0, 0x39, 0x37, 0x0}

The entire mini-transaction is a single WRITE record for page 122:0, at byte offset 0x39, writing the byte 0x37.

Updates of FSP_SPACE_FLAGS are supposed to always be written in full, that is, the start offset must not be more than FSP_HEADER_OFFSET+FSP_SPACE_FLAGS=0x36, and the offset+length must not be less than 0x3a. The problem is unrelated to MDEV-23855, and we are simply missing a mtr_t::FORCED in the write.

We appear to have several writes of FSP_SPACE_FLAGS that are missing the mtr_t::FORCED flag to ensure full logging. I will produce a fix shortly.

Comment by Marko Mäkelä [ 2020-10-20 ]

The problem appears to be that the MDEV-23199 fix was not merged correctly to 10.5.

Generated at Thu Feb 08 09:26:35 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.