[MDEV-25110] [ERROR] [FATAL] InnoDB: Trying to write 16384 bytes at 1032192 outside the bounds Created: 2021-03-10  Updated: 2021-03-18  Resolved: 2021-03-11

Status: Closed
Project: MariaDB Server
Component/s: mariabackup, Storage Engine - InnoDB
Affects Version/s: 10.5.7, 10.5.8, 10.5.9
Fix Version/s: 10.5.10

Type: Bug Priority: Blocker
Reporter: Matthias Leich Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: rr-profile-analyzed

Issue Links:
Blocks
Problem/Incident
is caused by MDEV-23855 InnoDB log checkpointing causes regre... Closed
Relates

 Description   

mariadb-10.5.9 3a8ca9096ea82ca61811450775511533d6cb1bb4 2021-02-19T10:37:51+01:00 + a code of  MDEV-25031
 
1. Start server and create some initial data
2. Several sessions run a concurrent  DDL/DML mix
3. During 2. is ongoing
     mariabackup --backup ... to <other_dir>
      mariabackup --prepare ... in other_dir
This "prepare" fails with
2021-03-10 13:11:24 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=162939,15421408
# 2021-03-10T13:12:35 [3326538] | 2021-03-10 13:11:30 0 [Note] InnoDB: Starting final batch to recover 497 pages from redo log.
# 2021-03-10T13:12:35 [3326538] | 2021-03-10 13:11:33 0 [ERROR] [FATAL] InnoDB: Trying to write 16384 bytes at 1032192 outside the bounds of the file: test/table100_innodb_int_autoinc.ibd
# 2021-03-10T13:12:35 [3326538] | 210310 13:11:33 [ERROR] mysqld got signal 6 ;
 
(rr) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x000073605f7e8859 in __GI_abort () at abort.c:79
#2  0x000055db8e6edb37 in ib::fatal::~fatal (this=0x439a0eee0570, __in_chrg=<optimized out>) at /Server/10.5.9-MDEV-25031/storage/innobase/ut/ut0ut.cc:576
#3  0x000055db8e0d00ab in fil_report_invalid_page_access (name=0x60600000a3b8 "test/table100_innodb_int_autoinc.ibd", offset=1032192, len=16384, is_read=false) at /Server/10.5.9-MDEV-25031/storage/innobase/fil/fil0fil.cc:3321
#4  0x000055db8e0d0e46 in fil_space_t::io (this=0x615000003a18, type=..., offset=1032192, len=16384, buf=0x7a864000, bpage=0x79c30be8) at /Server/10.5.9-MDEV-25031/storage/innobase/fil/fil0fil.cc:3412
#5  0x000055db8e009653 in buf_flush_page (bpage=0x79c30be8, lru=false, space=0x615000003a18) at /Server/10.5.9-MDEV-25031/storage/innobase/buf/buf0flu.cc:962
#6  0x000055db8e00de54 in buf_do_flush_list_batch (max_n=200, lsn=18446744073709551615) at /Server/10.5.9-MDEV-25031/storage/innobase/buf/buf0flu.cc:1472
#7  0x000055db8e00e9b4 in buf_flush_lists (max_n=200, lsn=18446744073709551615) at /Server/10.5.9-MDEV-25031/storage/innobase/buf/buf0flu.cc:1581
#8  0x000055db8e013626 in buf_flush_page_cleaner () at /Server/10.5.9-MDEV-25031/storage/innobase/buf/buf0flu.cc:2168
#9  0x00007feed4e78609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#10 0x000073605f8e5293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(rr)
pluto:/home/mleich/RQG_O/storage/1615376730/TBR-932/dev/shm/vardir/1615376730/58/1_clone/rr



 Comments   
Comment by Marko Mäkelä [ 2021-03-10 ]

I suspect that this was caused by the changes in MDEV-23855. In the trace that I analyzed, the log is being applied in multiple batches, and the data file is being extended multiple times according to the parsed redo log. However, the data file was opened already early during the batch, and therefore fil_space_t::prepare() will not be invoked. That function would normally extend the data file to the recovered size before anything is going to be written to the file.

Generated at Thu Feb 08 09:35:13 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.