Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11
Description
For the dataset in MDEV-30068, a failure occurred during crash recovery with 10.6-based branch:
2022-11-21 8:59:32 0 [ERROR] [FATAL] InnoDB: Trying to write 65536 bytes at 2359296 outside the bounds of the file: ./test/t5.ibd
|
Before the failure, there was one call to fil_system_t::extend_to_recv_size() from recv_sys.apply_log(false).
The failure would occur during "background recovery":
#5 0x00005640dd514e7d in fil_invalid_page_access_msg (name=<optimized out>,
|
offset=offset@entry=2359296, len=len@entry=65536, is_read=<optimized out>)
|
at /data/Server/bb-10.6-MDEV-29835extra/storage/innobase/fil/fil0fil.cc:2639
|
#6 0x00005640dd5287e6 in fil_space_t::io (this=this@entry=0x6120000034c0,
|
type=@0x7ffc0c342410: {bpage = 0x45cb7349a870, slot = 0x631000028820, node = 0x0, type = IORequest::WRITE_LRU}, offset=2359296, len=<optimized out>,
|
buf=buf@entry=0x6f6f7afb0000, bpage=bpage@entry=0x45cb7349a870)
|
at /data/Server/bb-10.6-MDEV-29835extra/storage/innobase/fil/fil0fil.cc:2715
|
#7 0x00005640dd41fdf2 in buf_page_t::flush (this=this@entry=0x45cb7349a870,
|
lru=lru@entry=true, space=space@entry=0x6120000034c0)
|
at /data/Server/bb-10.6-MDEV-29835extra/storage/innobase/buf/buf0flu.cc:900
|
#8 0x00005640dd4152a1 in buf_flush_LRU_list_batch (max=max@entry=32,
|
n=n@entry=0x7ffc0c342600)
|
at /data/Server/bb-10.6-MDEV-29835extra/storage/innobase/buf/buf0flu.cc:1299
|
#9 0x00005640dd416295 in buf_do_LRU_batch (max=max@entry=32)
|
at /data/Server/bb-10.6-MDEV-29835extra/storage/innobase/buf/buf0flu.cc:1337
|
#10 0x00005640dd416590 in buf_flush_LRU (max_n=32)
|
at /data/Server/bb-10.6-MDEV-29835extra/storage/innobase/buf/buf0flu.cc:1678
|
#11 0x00005640dd42f58d in buf_LRU_get_free_block (
|
have_mutex=have_mutex@entry=false)
|
at /data/Server/bb-10.6-MDEV-29835extra/storage/innobase/buf/buf0lru.cc:488
|
#12 0x00005640dcf678d0 in recv_sys_t::recover_low (
|
this=0x5640dec48760 <recv_sys>, page_id=<optimized out>)
|
at /data/Server/bb-10.6-MDEV-29835extra/storage/innobase/log/log0recv.cc:3155
|
#13 0x00005640dd3e4508 in recv_sys_t::recover (page_id=<optimized out>,
|
this=<optimized out>)
|
at /data/Server/bb-10.6-MDEV-29835extra/storage/innobase/include/log0recv.h:415
|
#14 buf_page_get_gen (page_id={m_id = 303}, zip_size=zip_size@entry=0,
|
rw_latch=rw_latch@entry=1, guess=guess@entry=0x45cb7349a870,
|
…
|
#25 0x00005640dd2cc19a in trx_resurrect (undo=undo@entry=0x611000043fc0, rseg=rseg@entry=0x5640decf2340 <trx_sys+30080>, start_time=start_time@entry=1669049972, start_time_micro=start_time_micro@entry=284020700207, rows_to_undo=rows_to_undo@entry=0x7ffc0c346840) at /data/Server/bb-10.6-MDEV-29835extra/storage/innobase/trx/trx0trx.cc:701
|
#26 0x00005640dd2cc843 in trx_lists_init_at_db_start () at /data/Server/bb-10.6-MDEV-29835extra/storage/innobase/trx/trx0trx.cc:752
|
#27 0x00005640dd23d5f0 in srv_start (create_new_db=<optimized out>) at /data/Server/bb-10.6-MDEV-29835extra/storage/innobase/srv/srv0start.cc:1485
|
#28 0x00005640dcdb1952 in innodb_init (p=<optimized out>) at /data/Server/bb-10.6-MDEV-29835extra/storage/innobase/handler/ha_innodb.cc:4293
|
#29 0x00005640dc30fd21 in ha_initialize_handlerton (plugin=0x62100002e010) at /data/Server/bb-10.6-MDEV-29835extra/sql/handler.cc:659
|
As far as I can tell, the problem should be fixed by extending the tablespace before invoking buf_LRU_get_free_block(), at least in recv_sys_t::recover_low() and possibly elsewhere (definitely when MDEV-14481 is implemented).
I was not able to reproduce the error for a copy of the data directory, so some additional testing will be needed.
Attachments
Issue Links
- is caused by
-
MDEV-24626 Remove synchronous write of page0 and flushing file during file creation
- Closed
- relates to
-
MDEV-30132 Crash after recovery, with InnoDB: Tried to read ... bytes at offset
- Closed
-
MDEV-11556 InnoDB redo log apply fails to adjust data file sizes
- Closed
-
MDEV-14481 Execute InnoDB crash recovery in the background
- Closed
-
MDEV-30068 Confusing error message when encryption is not available on recovery
- Closed
RQG
===
# git clone https://github.com/mleich1/rqg --branch experimental RQG
#
# GIT_SHOW: HEAD -> experimental, origin/experimental ec83f640bab8f5ea31f07b9bc118d508abae3b35 2022-11-17T15:46:44+01:00
# rqg.pl : Version 4.0.6 (2022-09)
#
# $RQG_HOME/rqg.pl \
# --grammar=conf/mariadb/table_stress_innodb_nocopy1.yy \
# --gendata=conf/mariadb/table_stress.zz \
# --gendata_sql=conf/mariadb/table_stress.sql \
# --reporters=CrashRecovery1 \
# --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \
# --mysqld=--loose-idle_write_transaction_timeout=0 \
# --mysqld=--loose-idle_transaction_timeout=0 \
# --mysqld=--loose-idle_readonly_transaction_timeout=0 \
# --mysqld=--connect_timeout=60 \
# --mysqld=--interactive_timeout=28800 \
# --mysqld=--slave_net_timeout=60 \
# --mysqld=--net_read_timeout=30 \
# --mysqld=--net_write_timeout=60 \
# --mysqld=--loose-table_lock_wait_timeout=50 \
# --mysqld=--wait_timeout=28800 \
# --mysqld=--lock-wait-timeout=86400 \
# --mysqld=--innodb-lock-wait-timeout=50 \
# --no-mask \
# --queries=10000000 \
# --seed=random \
# --reporters=Backtrace \
# --reporters=ErrorLog \
# --reporters=Deadlock1 \
# --validators=None \
# --mysqld=--log_output=none \
# --mysqld=--log_bin_trust_function_creators=1 \
# --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
# --engine=InnoDB \
# --restart_timeout=240 \
# --mysqld=--plugin-load-add=file_key_management.so \
# --mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
# --mysqld=--plugin-load-add=provider_lzo.so \
# --mysqld=--plugin-load-add=provider_bzip2.so \
# --mysqld=--plugin-load-add=provider_lzma.so \
# --mysqld=--plugin-load-add=provider_snappy.so \
# --mysqld=--plugin-load-add=provider_lz4.so \
# --duration=300 \
# --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
# --mysqld=--innodb_file_per_table=1 \
# --mysqld=--loose-innodb_read_only_compressed=OFF \
# --mysqld=--innodb_stats_persistent=off \
# --mysqld=--innodb_adaptive_hash_index=off \
# --mysqld=--log-bin \
# --mysqld=--sync-binlog=1 \
# --mysqld=--loose-innodb_evict_tables_on_commit_debug=on \
# --mysqld=--loose-max-statement-time=30 \
# --threads=33 \
# --mysqld=--innodb-use-native-aio=0 \
# --mysqld=--loose-gdb \
# --mysqld=--loose-debug-gdb \
# --rr=Extended \
# --rr_options=--chaos --wait \
# --mysqld=--innodb_rollback_on_timeout=ON \
# --mysqld=--innodb_page_size=64K \
# --mysqld=--innodb-buffer-pool-size=24M \
# <local settings>
sdp:/data/results/1669039949/TBR-1663
Error pattern
[ 'TBR-1663', 'INFO: Reporter \'CrashRecovery.\': Attempting database recovery using the server.+\[Note\] InnoDB: Starting crash recovery from checkpoint LSN.+\[ERROR\] \[FATAL\] InnoDB: Trying to write .{1,20} bytes at .{1,20} outside the bounds of the file: .{1,250}\.ibd.+RESULT: The RQG run ended with status STATUS_RECOVERY_FAILURE' ],