[MDEV-24445] Using innodb_undo_tablespaces corrupts system tablespace Created: 2020-12-18  Updated: 2020-12-21  Resolved: 2020-12-19

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.5.5, 10.5.6, 10.5.7, 10.5.8
Fix Version/s: 10.5.9

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: corruption, regression-10.5

Issue Links:
Problem/Incident
causes MDEV-24455 Assertion `!m_freed_space' failed in ... Closed
is caused by MDEV-8139 Fix scrubbing Closed

 Description   

In the rewrite of MDEV-8139 (based on changes that were made in MDEV-15528), we introduced a wrong assumption that any persistent tablespace that is not an .ibd file is the system tablespace. This assumption is broken when innodb_undo_tablespaces (undo001… files) are being used. By default, we have innodb_undo_tablespaces=0 (the persistent undo log is being stored in the system tablespace).

A simple test with innodb_file_per_table=0 and innodb_undo_tablespaces=3 reproduced corruption where a page in the system tablespace had been filled with zeroes (except for FIL_NULL stored in FIL_PAGE_PREV and FIL_PAGE_NEXT).



 Comments   
Comment by Matthias Leich [ 2020-12-18 ]

Workflow:
1. Start the server
2. Concurrent DDL/DML load
3. During 2. is ongoing kill the server process
4. Attempt to restart the server
 
Snip of the server error log during restart
2020-12-16 13:50:52 0 [Note] InnoDB: Using Linux native AIO
2020-12-16 13:50:52 0 [Note] InnoDB: Initializing buffer pool, total size = 50331648, chunk size = 50331648
2020-12-16 13:50:52 0 [Note] InnoDB: Completed initialization of buffer pool
2020-12-16 13:50:52 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2020-12-16 13:50:52 0 [Note] InnoDB: Opened 3 undo tablespaces
2020-12-16 13:50:52 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=143068293,148594426
2020-12-16 13:50:56 0 [Note] InnoDB: Opened 3 undo tablespaces
2020-12-16 13:50:56 0 [Note] InnoDB: 2 transaction(s) which must be rolled back or cleaned up in total 1168 row operations to undo
2020-12-16 13:50:56 0 [Note] InnoDB: Trx id counter is 652358
2020-12-16 13:50:56 0 [Note] InnoDB: Starting final batch to recover 2163 pages from redo log.
mysqld: /Server/10.6B/storage/innobase/log/log0recv.cc:274: log_phys_t::apply_status log_phys_t::apply(const buf_block_t&, uint16_t&) const: Assertion `mach_read_from_4(frame + 4U) == block.page.id().page_no()' failed.
201216 13:50:56 [ERROR] mysqld got signal 6 ;
 
(gdb) bt
#0  __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
#1  0x0000561d0158b81e in my_write_core (sig=6) at /Server/10.6B/mysys/stacktrace.c:424
#2  0x0000561cffffe8d2 in handle_fatal_signal (sig=6) at /Server/10.6B/sql/signal_handler.cc:330
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x00007fd081128859 in __GI_abort () at abort.c:79
#6  0x00007fd081128729 in __assert_fail_base (fmt=0x7fd0812be588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x561d02519c00 "mach_read_from_4(frame + 4U) == block.page.id().page_no()", 
    file=0x561d025195a0 "/Server/10.6B/storage/innobase/log/log0recv.cc", line=274, function=<optimized out>) at assert.c:92
#7  0x00007fd081139f36 in __GI___assert_fail (assertion=0x561d02519c00 "mach_read_from_4(frame + 4U) == block.page.id().page_no()", file=0x561d025195a0 "/Server/10.6B/storage/innobase/log/log0recv.cc", line=274, 
    function=0x561d02519980 "log_phys_t::apply_status log_phys_t::apply(const buf_block_t&, uint16_t&) const") at assert.c:101
#8  0x0000561d00d357a9 in log_phys_t::apply (this=0x7fd0695110b0, block=..., last_offset=@0x608000153764: 0) at /Server/10.6B/storage/innobase/log/log0recv.cc:274
#9  0x0000561d00d22ca8 in recv_recover_page (block=0x7fd069163280, mtr=..., Python Exception <class 'AttributeError'> 'NoneType' object has no attribute 'pointer': 
p=..., space=0x613000001018, init=0x607000007840) at /Server/10.6B/storage/innobase/log/log0recv.cc:2312
#10 0x0000561d00d3cd71 in recv_sys_t::recover_low (this=0x561d037fc060 <recv_sys>, page_id=..., Python Exception <class 'AttributeError'> 'NoneType' object has no attribute 'pointer': 
p=..., mtr=..., b=0x7fd069163280) at /Server/10.6B/storage/innobase/log/log0recv.cc:2558
#11 0x0000561d00d25a1a in recv_sys_t::apply (this=0x561d037fc060 <recv_sys>, last_batch=true) at /Server/10.6B/storage/innobase/log/log0recv.cc:2661
#12 0x0000561d00fbf6da in srv_start (create_new_db=false) at /Server/10.6B/storage/innobase/srv/srv0start.cc:1489
#13 0x0000561d00b22722 in innodb_init (p=0x615000002b18) at /Server/10.6B/storage/innobase/handler/ha_innodb.cc:3649
#14 0x0000561d00007a23 in ha_initialize_handlerton (plugin=0x621000033010) at /Server/10.6B/sql/handler.cc:645
#15 0x0000561cff8411b0 in plugin_initialize (tmp_root=0x7ffc88e119f0, plugin=0x621000033010, argc=0x561d03d7b000 <remaining_argc>, argv=0x615000000ff8, options_only=false) at /Server/10.6B/sql/sql_plugin.cc:1459
#16 0x0000561cff84313e in plugin_init (argc=0x561d03d7b000 <remaining_argc>, argv=0x615000000ff8, flags=0) at /Server/10.6B/sql/sql_plugin.cc:1751
#17 0x0000561cff4fc663 in init_server_components () at /Server/10.6B/sql/mysqld.cc:4950
#18 0x0000561cff4fe662 in mysqld_main (argc=47, argv=0x615000000ff8) at /Server/10.6B/sql/mysqld.cc:5538
#19 0x0000561cff4e815d in main (argc=46, argv=0x7ffc88e140a8) at /Server/10.6B/sql/main.cc:25
(gdb)
 
origin/10.6, 10.6 43d3dad11468bf45d2098ebca70002c49104f298 2020-12-15T14:29:40+02:00
 
RQG
-------
git clone https://github.com/mleich1/rqg --branch experimental RQG
 
perl rqg.pl \             
--grammar=conf/mariadb/innodb_compression_encryption.yy \
--gendata=conf/mariadb/innodb_compression_encryption.zz \
--mysqld=--innodb-encrypt-log \
--mysqld=--innodb-encrypt-tables \
--mysqld=--innodb_undo_tablespaces=3 \
--mysqld=--innodb_immediate_scrub_data_uncompressed=1 \
--mysqld=--innodb_file_per_table=0 \
--mysqld=--innodb_use_native_aio=1 \
--mysqld=--innodb_lock_schedule_algorithm=fcfs \
--mysqld=--loose-idle_write_transaction_timeout=0 \
--mysqld=--loose-idle_transaction_timeout=0 \
--mysqld=--loose-idle_readonly_transaction_timeout=0 \
--mysqld=--connect_timeout=60 \
--mysqld=--interactive_timeout=28800 \
--mysqld=--slave_net_timeout=60 \
--mysqld=--net_read_timeout=30 \
--mysqld=--net_write_timeout=60 \
--mysqld=--loose-table_lock_wait_timeout=50 \
--mysqld=--wait_timeout=28800 \
--mysqld=--lock-wait-timeout=86400 \
--mysqld=--innodb-lock-wait-timeout=50 \
--no-mask \
--queries=10000000 \
--seed=random \
--reporters=Backtrace \
--reporters=ErrorLog \
--reporters=Deadlock1 \
--reporters=CrashRecovery1 \
--validators=None \
--mysqld=--log_output=none \
--mysqld=--log-bin \
--mysqld=--log_bin_trust_function_creators=1 \
--mysqld=--loose-debug_assert_on_not_freed_memory=0 \
--engine=InnoDB \
--restart_timeout=120 \
--max_gd_duration=1000 \
--mysqld=--plugin-load-add=file_key_management.so \
--mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
--duration=300 \
--mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
--mysqld=--loose-max-statement-time=30 \
--threads=33 \
--mysqld=--innodb_page_size=4K \
--mysqld=--innodb-buffer-pool-size=8M \
--duration=300 \
--no_mask \
--workdir=<local settings> \
--vardir=<local settings> \
--mtr-build-thread=<local settings> \
--basedir1=<local settings> \
--script_debug=_nix_
 
The automatic test simplification showed that
- --innodb_immediate_scrub_data_uncompressed=1 is not required
- the number of threads could be reduced to ~ 3
- statements enforcing the use of compression (in grammar innodb_compression_encryption.yy) are not required.
 
Simplified YY grammar
-----------------------------------
alter:
    ALTER TABLE _table FORCE ;
 
insert:
    INSERT IGNORE INTO _table ( _field_int ) VALUES ( _smallint_unsigned ) ;
 
query:
    alter |
    alter |
    alter |
    alter |
    alter |
    alter |
    alter |
    |
    |
    insert |
    insert |
    insert |
    insert |
    insert |
    insert |
    insert ;
 
query_init:
    ;
 
thread_connect:
    ;

Generated at Thu Feb 08 09:30:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.