[MDEV-23439] Assertion `size == space->size' failed in buf_read_ahead_random Created: 2020-08-10  Updated: 2020-10-06  Resolved: 2020-08-12

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.2, 10.3, 10.4
Fix Version/s: 10.2.35, 10.3.26, 10.4.16

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: debug, not-10.5, rr-profile-analyzed

Attachments: File mdev23439a.test.gz    
Issue Links:
Relates
relates to MDEV-15053 Reduce buf_pool_t::mutex contention Closed
relates to MDEV-23190 Assertion `id.page_no() < space.size'... Closed

 Description   

10.2 845e3c9801d

mysqld: /home/elenst/MDEV-23439/10.2/storage/innobase/buf/buf0rea.cc:296: ulint buf_read_ahead_random(page_id_t, const page_size_t&, ulint): Assertion `size == space->size' failed.
200810 10:58:31 [ERROR] mysqld got signal 6 ;
 
#3  0x00007f1d678274a2 in __GI___assert_fail (assertion=0x5648a81141e1 "size == space->size", file=0x5648a8114580 "/home/elenst/MDEV-23439/10.2/storage/innobase/buf/buf0rea.cc", 
    line=296, 
    function=0x5648a8115000 <buf_read_ahead_random(page_id_t, page_size_t const&, unsigned long)::__PRETTY_FUNCTION__> "ulint buf_read_ahead_random(page_id_t, const page_size_t&, ulint)") at assert.c:101
#4  0x00005648a7ba04e9 in buf_read_ahead_random (page_id=..., page_size=..., inside_ibuf=0) at /home/elenst/MDEV-23439/10.2/storage/innobase/buf/buf0rea.cc:296
#5  0x00005648a7b7757a in buf_page_get_low (page_id=..., page_size=..., rw_latch=2, guess=0x0, mode=10, 
    file=0x5648a8135fe0 "/home/elenst/MDEV-23439/10.2/storage/innobase/fil/fil0fil.cc", line=4497, mtr=0x7f1d2c0a86d0, err=0x0)
    at /home/elenst/MDEV-23439/10.2/storage/innobase/buf/buf0buf.cc:4390
#6  0x00005648a7b78c49 in buf_page_get_gen (page_id=..., page_size=..., rw_latch=2, guess=0x0, mode=10, 
    file=0x5648a8135fe0 "/home/elenst/MDEV-23439/10.2/storage/innobase/fil/fil0fil.cc", line=4497, mtr=0x7f1d2c0a86d0, err=0x0)
    at /home/elenst/MDEV-23439/10.2/storage/innobase/buf/buf0buf.cc:4908
#7  0x00005648a7c012e0 in fsp_flags_try_adjust (space_id=1918, flags=41) at /home/elenst/MDEV-23439/10.2/storage/innobase/fil/fil0fil.cc:4495
#8  0x00005648a7c015d0 in fil_space_for_table_exists_in_mem (id=1918, name=0x7f1ce00d2f90 "test/#sql2-20ea-e", table_flags=41)
    at /home/elenst/MDEV-23439/10.2/storage/innobase/fil/fil0fil.cc:4555
#9  0x00005648a7a4e32d in row_drop_single_table_tablespace (space_id=1918, tablename=0x7f1ce00d2f90 "test/#sql2-20ea-e", filepath=0x7f1ce00f5b50 "./test/#sql2-20ea-e.ibd", 
    table_flags=41) at /home/elenst/MDEV-23439/10.2/storage/innobase/row/row0mysql.cc:3253
#10 0x00005648a7a4f65c in row_drop_table_for_mysql (name=0x7f1d2c0a94f0 "test/#sql2-20ea-e", trx=0x7f1d650a1268, sqlcom=SQLCOM_ALTER_TABLE, create_failed=false, nonatomic=true)
    at /home/elenst/MDEV-23439/10.2/storage/innobase/row/row0mysql.cc:3740
#11 0x00005648a792a488 in ha_innobase::delete_table (this=0x7f1ce0013590, name=0x7f1d2c0aacf0 "./test/#sql2-20ea-e", sqlcom=SQLCOM_ALTER_TABLE)
    at /home/elenst/MDEV-23439/10.2/storage/innobase/handler/ha_innodb.cc:13304
#12 0x00005648a7914ff8 in ha_innobase::delete_table (this=0x7f1ce0013590, name=0x7f1d2c0aacf0 "./test/#sql2-20ea-e")
    at /home/elenst/MDEV-23439/10.2/storage/innobase/handler/ha_innodb.cc:13429
#13 0x00005648a7712d1e in handler::ha_delete_table (this=0x7f1ce0013590, name=0x7f1d2c0aacf0 "./test/#sql2-20ea-e") at /home/elenst/MDEV-23439/10.2/sql/handler.cc:4473
#14 0x00005648a770d441 in ha_delete_table (thd=0x7f1ce0000d50, table_type=0x5648ab13fa00, path=0x7f1d2c0aacf0 "./test/#sql2-20ea-e", db=0x7f1ce0011b30 "test", 
    alias=0x7f1d2c0ab0d0 "#sql2-20ea-e", generate_warning=false) at /home/elenst/MDEV-23439/10.2/sql/handler.cc:2442
#15 0x00005648a75545ff in quick_rm_table (thd=0x7f1ce0000d50, base=0x5648ab13fa00, db=0x7f1ce0011b30 "test", table_name=0x7f1d2c0ab0d0 "#sql2-20ea-e", flags=3, table_path=0x0)
    at /home/elenst/MDEV-23439/10.2/sql/sql_table.cc:2734
#16 0x00005648a7567ac8 in mysql_alter_table (thd=0x7f1ce0000d50, new_db=0x7f1ce0011b30 "test", new_name=0x0, create_info=0x7f1d2c0ac2c0, table_list=0x7f1ce0011518, 
    alter_info=0x7f1d2c0ac210, order_num=0, order=0x0, ignore=true) at /home/elenst/MDEV-23439/10.2/sql/sql_table.cc:9876
#17 0x00005648a75e390b in Sql_cmd_alter_table::execute (this=0x7f1ce0011b60, thd=0x7f1ce0000d50) at /home/elenst/MDEV-23439/10.2/sql/sql_alter.cc:333
#18 0x00005648a7496537 in mysql_execute_command (thd=0x7f1ce0000d50) at /home/elenst/MDEV-23439/10.2/sql/sql_parse.cc:5964
#19 0x00005648a749b4cf in mysql_parse (thd=0x7f1ce0000d50, 
    rawbuf=0x7f1ce0011338 "ALTER /* QNO 122120 CON_ID 15 */ IGNORE TABLE alt_t5 /* 100301 WAIT 1 */ /*!100200 DROP CONSTRAINT IF EXISTS k */ /* COMMENT 'Skipped DROP CONSTRAINT' */", 
    length=153, parser_state=0x7f1d2c0ad640, is_com_multi=false, is_next_command=false) at /home/elenst/MDEV-23439/10.2/sql/sql_parse.cc:7733
#20 0x00005648a74897db in dispatch_command (command=COM_QUERY, thd=0x7f1ce0000d50, packet=0x7f1ce0008b11 "", packet_length=153, is_com_multi=false, is_next_command=false)
    at /home/elenst/MDEV-23439/10.2/sql/sql_parse.cc:1823
#21 0x00005648a7488258 in do_command (thd=0x7f1ce0000d50) at /home/elenst/MDEV-23439/10.2/sql/sql_parse.cc:1377
#22 0x00005648a75de491 in do_handle_one_connection (connect=0x5648abc8afe0) at /home/elenst/MDEV-23439/10.2/sql/sql_connect.cc:1336
#23 0x00005648a75de1fc in handle_one_connection (arg=0x5648abc8afe0) at /home/elenst/MDEV-23439/10.2/sql/sql_connect.cc:1241
#24 0x00007f1d6852e6db in start_thread (arg=0x7f1d2c0ae700) at pthread_create.c:463
#25 0x00007f1d67918a3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

A test case for the reference is attached, although I don't know how useful it is for debugging, it is very big in size. Execution time in shm is actually not that bad, it fails for me on 10.2 debug in 2-3 min, but on disk it can take almost an hour. The test case is to be run with

--mem --mysqld=--innodb --mysqld=--default-storage-engine=InnoDB --mysqld=--partition --mysqld=--loose-innodb-stats-persistent=on --mysqld=--character-set-server=utf8 --mysqld=--max-allowed-packet=128M --mysqld=--innodb-buffer-pool-size=128M --mysqld=--innodb-log-file-size=48M --mysqld=--log_bin_trust_function_creators=OFF --mysqld=--key_buffer_size=128M --mysqld=--performance-schema=OFF --mysqld=--log_output=FILE --mysqld=--max-statement-time=20 --mysqld=--loose-max-statement-time=10 --mysqld=--innodb-lock-wait-timeout=3 --mysqld=--lock-wait-timeout=5

(not all options are important, but these has been used successfully).

Only reproducible with this test case on 10.2, but the failure has been seen on 10.2-10.4. No observations for 10.5.

rr profile is available.



 Comments   
Comment by Marko Mäkelä [ 2020-08-10 ]

MDEV-23190 recently changed something in this area. With an rr replay trace, it should be possible to diagnose and fix this properly.

When it comes to the fsp_flags_try_adjust() call, I noted the following in MDEV-16115 last week:

it should make absolutely no sense to invoke the MDEV-11623 fixup when we had already opened the tablespace, and especially not when we are about to drop the tablespace.

Without debugging an rr replay trace, I cannot confirm that the mismatch would be completely avoided by avoiding that call.

Comment by Marko Mäkelä [ 2020-08-12 ]

The file was originally test/alt_t5.ibd, and at the time of the assertion failure, it had been renamed to test/#sql2-fe54-e.ibd, apparently during the final part of an ALTER TABLE…ALGORITHM=COPY operation.

In the rr replay trace, right before the assertion failure os_file_get_size() reports 65536 bytes, which converts to 8 pages (ROW_FORMAT=COMPRESSED using 8192-byte pages), while space->size is 6 pages. Much earlier, there was an os_file_set_size() call for 65536 bytes. I suspect that something is wrong right at the clustered index creation:

10.2 845e3c9801d17905bf2392f27cda66669c00f75f

#0  os_file_set_size (name=0x7f954414fb60 "./test/alt_t5.ibd", file=54, 
    size=65536, is_sparse=false)
    at /home/mariadb/MDEV-23439/10.2/storage/innobase/os/os0file.cc:5400
#1  0x000055bdbeef566e in fil_space_extend_must_retry (space=0x7f95440cfdf0, 
    node=0x7f95442255e0, size=6, success=0x7f95840f3aef)
    at /home/mariadb/MDEV-23439/10.2/storage/innobase/fil/fil0fil.cc:968
#2  0x000055bdbeef5e68 in fil_space_extend (space=0x7f95440cfdf0, size=6)
    at /home/mariadb/MDEV-23439/10.2/storage/innobase/fil/fil0fil.cc:1146
#3  0x000055bdbef11591 in fsp_try_extend_data_file_with_pages (
    space=0x7f95440cfdf0, page_no=5, header=0x7f95b5b64026 "", 
    mtr=0x7f95840f3f10)
    at /home/mariadb/MDEV-23439/10.2/storage/innobase/fsp/fsp0fsp.cc:861
#4  0x000055bdbef16f9a in fsp_reserve_free_pages (space=0x7f95440cfdf0, 
    space_header=0x7f95b5b64026 "", size=4, mtr=0x7f95840f3f10, n_pages=2)
    at /home/mariadb/MDEV-23439/10.2/storage/innobase/fsp/fsp0fsp.cc:2688
#5  0x000055bdbef171d0 in fsp_reserve_free_extents (n_reserved=0x7f95840f3ca8, 
    space_id=1918, n_ext=2, alloc_type=FSP_NORMAL, mtr=0x7f95840f3f10, 
    n_pages=2)
    at /home/mariadb/MDEV-23439/10.2/storage/innobase/fsp/fsp0fsp.cc:2763
#6  0x000055bdbef14d60 in fseg_create_general (space_id=1918, page=3, 

Comment by Marko Mäkelä [ 2020-08-12 ]

The file size is being bloated on purpose in fil_space_extend_must_retry():

	/* fil_read_first_page() expects UNIV_PAGE_SIZE bytes.
	fil_node_open_file() expects at least 4 * UNIV_PAGE_SIZE bytes.*/
	os_offset_t new_size = std::max(
		os_offset_t(size - file_start_page_no) * page_size,
		os_offset_t(FIL_IBD_FILE_INITIAL_SIZE * UNIV_PAGE_SIZE));
 
	*success = os_file_set_size(node->name, node->handle, new_size,
		FSP_FLAGS_HAS_PAGE_COMPRESSION(space->flags));

Without that std::max(), we would attempt to expand the file to 49152 bytes (6 pages) instead of 65536 bytes (4*innodb_page_size).

I believe that the least disruptive fix is to relax the debug assertion to account for this.

Comment by Marko Mäkelä [ 2020-08-12 ]

The debug assertion had been removed in 10.5 in MDEV-15053. I think that it is simplest to remove the assertion in older versions as well.

Generated at Thu Feb 08 09:22:25 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.