Details
Bug
Status: Closed
Blocker
Resolution: Fixed
11.2.3, 11.3.2, 11.4.1, 11.2.4
Description
This bug was found in connection with MDEV-34212.
Some InnoDB tests, most notably innodb.table_flags,64k, would occasionally fail. I can reproduce this locally on a MemorySanitizer build some of the time. The attached file data.tar.xz is a copy of the data directory of a failed test in a 10.4-based branch that contains a fix of MDEV-34212.
I was able to reproduce the problem with an even simpler test case:
tf.test
--source include/have_innodb.inc
SELECT @@innodb_page_size;
and the following options:
tf.opt
--innodb-undo-tablespaces=0 --innodb-page-size=64k --innodb-buffer-pool-size=20m
I was able to produce an rr replay by repeatedly running the following on a MemorySanitizer build:
while ./mtr --boot-rr --parallel=60 innodb.tf{,,,,,,,,,}{,}{,,}; do :; done
From the rr replay trace, I concluded that the last write of undo page 50 is being discarded due to a condition in buf_page_t::flush():
if (UNIV_UNLIKELY(lsn < space->get_create_lsn()))
{
  ut_ad(space->purpose == FIL_TYPE_TABLESPACE);
  goto freed;
}
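To make the effect of that check concrete, here is a minimal, self-contained C++ sketch. It is not InnoDB code: `Space`, `Page` and `flush_would_write()` are hypothetical stand-ins, and the sketch assumes that `lsn` in the quoted condition is the LSN of the page's latest modification. Only the comparison against the create LSN mirrors the snippet above.

```cpp
#include <cassert>
#include <cstdint>

using lsn_t = std::uint64_t;

// Hypothetical stand-in for fil_space_t: only the field used by the check.
struct Space {
  lsn_t create_lsn = 0;
  lsn_t get_create_lsn() const { return create_lsn; }
};

// Hypothetical stand-in for a dirty page: the LSN of its latest modification.
struct Page {
  lsn_t lsn;
};

// Models the decision in the quoted condition: a page whose latest change is
// older than the tablespace's create LSN is treated as freed and not written.
bool flush_would_write(const Page &page, const Space &space) {
  return !(page.lsn < space.get_create_lsn());
}
```

With `create_lsn` left at 0, every page is written. Once `create_lsn` is raised to 58443 (the value passed to set_create_lsn() in this report), a page last modified at an earlier LSN, for example an invented 58000, is skipped. That is the fate of undo page 50 described here.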
The bug seems to be a conflict between the supposedly final buffer pool flushing and fsp_system_tablespace_truncate(). Here are some stack traces, from the same thread:
buf_page_t::flush()
buf_do_flush_list_batch (max_n=2000, lsn=18446744073709551615)
buf_flush_list_holding_mutex (max_n=max_n@entry=2000, lsn=lsn@entry=18446744073709551615)
buf_flush_list (max_n=2000, lsn=lsn@entry=18446744073709551615)
buf_flush_buffer_pool ()
logs_empty_and_mark_files_at_shutdown ()
innodb_shutdown ()
innobase_end ()
The system tablespace truncation had been invoked a little earlier in the same thread:
fil_space_t::set_create_lsn (this=0x71100001b080, lsn=58443)
mtr_t::commit_shrink (...)
fsp_system_tablespace_truncate ()
innobase_end () at /mariadb/11.4/storage/innobase/handler/ha_innodb.cc:4269
Issue Links
is caused by
MDEV-14795 InnoDB system tablespace cannot be shrunk (Closed)
MDEV-32452 InnoDB system tablespace is not shrunk on slow shutdown (Closed)
thiru suggested the following fix:
diff --git a/storage/innobase/mtr/mtr0mtr.cc b/storage/innobase/mtr/mtr0mtr.cc
index 90a2007a48d..8db52ac1f47 100644
--- a/storage/innobase/mtr/mtr0mtr.cc
+++ b/storage/innobase/mtr/mtr0mtr.cc
@@ -583,8 +583,9 @@ void mtr_t::commit_shrink(fil_space_t &space, uint32_t size)
   if (space.id == TRX_SYS_SPACE)
     srv_sys_space.set_last_file_size(file->size);
+  else
+    space.set_create_lsn(m_commit_lsn);
-  space.set_create_lsn(m_commit_lsn);
   mysql_mutex_unlock(&fil_system.mutex);
This is obvious and should have been caught in the MDEV-14795 review already. The purpose of fil_space_t::create_lsn is to denote the logical time when the entire tablespace was rewritten or created. It is supposed to be used on the undo tablespaces only. The system tablespace is never re-created, but its size may be reduced.
With this patch, my MSAN-based test does not fail. I will let it run for a few more minutes, because the failure was sporadic.
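The difference the patch makes can be sketched with a small, self-contained C++ model. This is an illustration under stated assumptions, not the actual InnoDB code: `Space`, `shrink_unpatched()`, `shrink_patched()` and `page_is_discarded()` are hypothetical names, `SYSTEM_SPACE_ID` plays the role of TRX_SYS_SPACE, and the page LSN used in the usage note is invented. Only the branch structure follows the diff in this comment, and the discard test follows the condition quoted from buf_page_t::flush().

```cpp
#include <cassert>
#include <cstdint>

using lsn_t = std::uint64_t;

// Stand-in for TRX_SYS_SPACE (hypothetical name for this sketch).
constexpr std::uint32_t SYSTEM_SPACE_ID = 0;

// Hypothetical stand-in for fil_space_t with just the fields needed here.
struct Space {
  std::uint32_t id;
  lsn_t create_lsn = 0;
  explicit Space(std::uint32_t id_) : id(id_) {}
};

// Shrink as it behaved before the patch: create_lsn is always advanced.
void shrink_unpatched(Space &space, lsn_t commit_lsn) {
  space.create_lsn = commit_lsn;
}

// Shrink following the patch: only non-system tablespaces advance create_lsn.
void shrink_patched(Space &space, lsn_t commit_lsn) {
  if (space.id != SYSTEM_SPACE_ID)
    space.create_lsn = commit_lsn;
}

// Same comparison as the flush-time check: pages older than create_lsn
// are discarded instead of being written.
bool page_is_discarded(lsn_t page_lsn, const Space &space) {
  return page_lsn < space.create_lsn;
}
```

In this model, shrinking the system tablespace at LSN 58443 with `shrink_unpatched()` causes a page last modified at LSN 58000 to be discarded by the later flush, while `shrink_patched()` leaves `create_lsn` untouched so the page is still written. A tablespace with a nonzero id keeps the old behavior under both variants.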