[MDEV-29558] Crash recovery: SHOW KEYS .. harvests 1696: Failed to read from the .par file Created: 2022-09-16  Updated: 2023-09-24

Status: Open
Project: MariaDB Server
Component/s: Partitioning, Storage Engine - InnoDB
Affects Version/s: 10.7.6
Fix Version/s: 11.0

Type: Bug Priority: Major
Reporter: Matthias Leich Assignee: Aleksey Midenkov
Resolution: Unresolved Votes: 0
Labels: rr-profile-analyzed

Attachments: File DD-DIFF-partitioning-mini.yy    
Issue Links:
Blocks
is blocked by MDEV-27180 Fully atomic partitioning DDL operations Stalled
Relates
relates to MDEV-27180 Fully atomic partitioning DDL operations Stalled
relates to MDEV-27618 Atomic DDL is not very atomic on part... Open

 Description   

origin/10.7 f3785f099c2f0f251f39632928e822328abe9a02 2022-09-13T08:48:40+03:00
 
Scenario:
1, Start the server and generate some initial data.
2. One session runs a DDL/DML mix.
3. Intentional kill of the server
4. Restart of the server with success
5. SHOW KEYS FROM `test`.`s` harvested 1696: Failed to read from the .par file
6. RQG reacts with killing the DB server
 
# git clone https://github.com/mleich1/rqg --branch experimental RQG
#
# GIT_SHOW: HEAD -> experimental, origin/experimental 9c333aebe04973365c0b91b363957f00362f2605 2022-09-12T18:19:28+02:00
# rqg.pl  : Version 4.0.6 (2022-05)
#
# $RQG_HOME/rqg.pl \
# --duration=150 \
# --queries=10000000 \
# --no_mask \
# --seed=random \
# --max_gd_duration=1200 \
# --rpl_mode=none \
# --engine=InnoDB \
# --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
# --mysqld=--log-output=none \
# --mysqld=--loose_innodb_lock_schedule_algorithm=fcfs \
# --mysqld=--loose-idle_transaction_timeout=0 \
# --mysqld=--loose-max-statement-time=30 \
# --mysqld=--loose-plugin-load-add=provider_lz4.so \
# --mysqld=--log_bin_trust_function_creators=1 \
# --mysqld=--loose-table_lock_wait_timeout=50 \
# --mysqld=--net_read_timeout=30 \
# --mysqld=--innodb_page_size=16K \
# --mysqld=--interactive_timeout=28800 \
# --mysqld=--file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
# --mysqld=--loose-idle_write_transaction_timeout=0 \
# --mysqld=--loose-innodb_read_only_compressed=OFF \
# --mysqld=--connect_timeout=60 \
# --mysqld=--sync-binlog=1 \
# --mysqld=--plugin-load-add=file_key_management.so \
# --mysqld=--net_write_timeout=60 \
# --mysqld=--innodb-lock-wait-timeout=50 \
# --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
# --mysqld=--loose-innodb_evict_tables_on_commit_debug=on \
# --mysqld=--lock-wait-timeout=86400 \
# --mysqld=--innodb-buffer-pool-size=8M \
# --mysqld=--wait_timeout=28800 \
# --mysqld=--loose-idle_readonly_transaction_timeout=0 \
# --mysqld=--slave_net_timeout=60 \
# --mysqld=--log-bin \
# --reporters=Backtrace,CrashRecovery1,Deadlock1,ErrorLog \
# --validators=None \
# --grammar=conf/mariadb/partitions_innodb.yy \
# --threads=1 \
# <local settings>
 
1. - 3. of the scenario
pluto:/data/results/1663590139/DD-DIFF-partitioning$ _RR_TRACE_DIR=./1/rr/ rr replay --mark-stdio mysqld-1
4. - 6. of the scenario
pluto:/data/results/1663590139/DD-DIFF-partitioning$ _RR_TRACE_DIR=./1/rr/ rr replay --mark-stdio mysqld-2



 Comments   
Comment by Matthias Leich [ 2022-09-16 ]

The automatic RQG test simplifier is just running.

Comment by Elena Stepanova [ 2022-09-18 ]

FYI, I have a similar error in MDEV-29566 which doesn't involve crash recovery.

Comment by Matthias Leich [ 2022-09-19 ]

Per simplified RQG test:
Just one sessions runs a stream of diced
CREATE TABLE IF NOT EXISTS b (`col_int_nokey` INTEGER,`col_int_key` INTEGER NOT NULL,KEY (`col_int_key`)) ENGINE = INNODB ;
ALTER TABLE b PARTITION BY RANGE ( `col_int_nokey` ) <have four partitions and change their range>.
I had no luck in my attempts to generate some MTR based tests.

Comment by Marko Mäkelä [ 2022-09-19 ]

The last durable write by InnoDB occurred as part of a RENAME operation of an ALTER TABLE…PARTITION BY:

10.7 f3785f099c2f0f251f39632928e822328abe9a02

(rr) continue
Continuing.
 
Thread 4 hit Hardware watchpoint 1: -location log_sys.flushed_to_disk_lsn._M_i
 
Old value = 2971106
New value = 2971551
0x0000556ea20bacf2 in std::__atomic_base<unsigned long>::store (__m=std::memory_order_release, __i=2971551, this=0x556ea511c948 <log_sys+8>) at /usr/include/c++/9/bits/atomic_base.h:397
397		__atomic_store_n(&_M_i, __i, int(__m));
(rr) bt
#0  0x0000556ea20bacf2 in std::__atomic_base<unsigned long>::store (__m=std::memory_order_release, __i=2971551, this=0x556ea511c948 <log_sys+8>) at /usr/include/c++/9/bits/atomic_base.h:397
#1  log_t::set_flushed_lsn (this=0x556ea511c940 <log_sys>, lsn=2971551) at /data/Server/10.7C/storage/innobase/include/log0log.h:638
#2  0x0000556ea20b43e3 in log_write_flush_to_disk_low (lsn=2971551) at /data/Server/10.7C/storage/innobase/log/log0log.cc:651
#3  0x0000556ea20b5e26 in log_write_and_flush () at /data/Server/10.7C/storage/innobase/log/log0log.cc:903
#4  0x0000556ea21238e3 in mtr_t::commit_file (this=0x4ed447f98b00, space=..., name=0x603000153670 "./test/b#P#p2.ibd") at /data/Server/10.7C/storage/innobase/mtr/mtr0mtr.cc:660
#5  0x0000556ea25ce0a8 in fil_space_t::rename (this=0x612000204c40, path=0x603000153670 "./test/b#P#p2.ibd", log=true, replace=false) at /data/Server/10.7C/storage/innobase/fil/fil0fil.cc:1811
#6  0x0000556ea253487a in dict_table_t::rename_tablespace (this=0x618000042908, new_name=..., replace=false) at /data/Server/10.7C/storage/innobase/dict/dict0dict.cc:1501
#7  0x0000556ea2534ed7 in dict_table_rename_in_cache (table=0x618000042908, new_name=..., replace_new_file=false) at /data/Server/10.7C/storage/innobase/dict/dict0dict.cc:1541
#8  0x0000556ea229236b in row_rename_table_for_mysql (old_name=0x4ed447f9b350 "test/#sql-alter-3fe742-12#P#p2", new_name=0x4ed447f9b110 "test/b#P#p2", trx=0x640000041940, use_fk=true)
    at /data/Server/10.7C/storage/innobase/row/row0mysql.cc:2853
#9  0x0000556ea1f18dda in innobase_rename_table (trx=0x640000041940, from=0x4ed447f9c4d0 "./test/#sql-alter-3fe742-12#P#p2", to=0x4ed447f9c760 "./test/b#P#p2", use_fk=true)
    at /data/Server/10.7C/storage/innobase/handler/ha_innodb.cc:13784
#10 0x0000556ea1f1bfef in ha_innobase::rename_table (this=0x629000db29d8, from=0x4ed447f9c4d0 "./test/#sql-alter-3fe742-12#P#p2", to=0x4ed447f9c760 "./test/b#P#p2")
    at /data/Server/10.7C/storage/innobase/handler/ha_innodb.cc:14183
#11 0x0000556ea1669ab3 in handler::ha_rename_table (this=0x629000db29d8, from=0x4ed447f9c4d0 "./test/#sql-alter-3fe742-12#P#p2", to=0x4ed447f9c760 "./test/b#P#p2") at /data/Server/10.7C/sql/handler.cc:5338
#12 0x0000556ea1de4182 in ha_partition::del_ren_table (this=0x629000db1238, from=0x4ed447f9cc20 "./test/#sql-alter-3fe742-12", to=0x4ed447f9ce60 "./test/b") at /data/Server/10.7C/sql/ha_partition.cc:2507
#13 0x0000556ea1dda1a3 in ha_partition::rename_table (this=0x629000db1238, from=0x4ed447f9cc20 "./test/#sql-alter-3fe742-12", to=0x4ed447f9ce60 "./test/b") at /data/Server/10.7C/sql/ha_partition.cc:662
#14 0x0000556ea1669ab3 in handler::ha_rename_table (this=0x629000db1238, from=0x4ed447f9cc20 "./test/#sql-alter-3fe742-12", to=0x4ed447f9ce60 "./test/b") at /data/Server/10.7C/sql/handler.cc:5338
#15 0x0000556ea10ce630 in mysql_rename_table (base=0x61500000ad18, old_db=0x4ed447f9f0f0, old_name=0x4ed447f9f120, new_db=0x4ed447f9f0f0, new_name=0x4ed447f9f110, id=0x4ed447f9f140, flags=1)
    at /data/Server/10.7C/sql/sql_table.cc:5097
#16 0x0000556ea10f42b4 in mysql_alter_table (thd=0x62b00021b218, new_db=0x62b00021fcb8, new_name=0x62b0002200d0, create_info=0x4ed447fa0550, table_list=0x629000f284f8, alter_info=0x4ed447fa0420, order_num=0, 
    order=0x0, ignore=false, if_exists=false) at /data/Server/10.7C/sql/sql_table.cc:10733
#17 0x0000556ea1280e2f in Sql_cmd_alter_table::execute (this=0x629000f29618, thd=0x62b00021b218) at /data/Server/10.7C/sql/sql_alter.cc:544
#18 0x0000556ea0e4c0be in mysql_execute_command (thd=0x62b00021b218, is_called_from_prepared_stmt=false) at /data/Server/10.7C/sql/sql_parse.cc:5989
#19 0x0000556ea0e5851c in mysql_parse (thd=0x62b00021b218, 
    rawbuf=0x629000f28238 "ALTER TABLE b PARTITION BY RANGE ( `col_int_nokey` ) ( PARTITION p0 VALUES LESS THAN ( 3 ),\nPARTITION p1 VALUES LESS THAN ( 226 ),\nPARTITION p2 VALUES LESS THAN ( 58910 ),\nPARTITION p3 VALUES LESS THA"..., length=247, parser_state=0x4ed447fa1a20) at /data/Server/10.7C/sql/sql_parse.cc:8028

This class of operations on partitioned tables is not expected to be crash safe before MDEV-27180.

Generated at Thu Feb 08 10:09:32 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.