  MariaDB Server / MDEV-34845

buf_flush_buffer_pool(): Assertion `!os_aio_pending_reads()' failed

Details

    Description

      Scenario:
      1. Start a server and load some initial data.
      2. Several connections run random SQL concurrently.
      3. While 2. is ongoing, run cycles consisting of
      3.1 mariabackup --backup <to location A>
      3.2 mariabackup --prepare <on location A>
      3.3 Start a server on the data in location A
      3.4 Run certain checks on the data of that server
      3.5 Shut down that server
      At some point the shutdown of that server failed with
      2024-08-29 14:23:12 0 [Note] /data/Server_bin/10.6-MDEV-34830_debug_Og/bin/mariadbd (initiated by: root[root] @ localhost [127.0.0.1]): Normal shutdown
      2024-08-29 14:23:12 0 [Warning] Aborted connection 6 to db: 'mysql' user: 'root' host: 'localhost' (KILLED)
      2024-08-29 14:23:13 0 [Note] InnoDB: FTS optimize thread exiting.
      2024-08-29 14:23:13 0 [Note] InnoDB: Starting shutdown...
      2024-08-29 14:23:13 0 [Note] InnoDB: Dumping buffer pool(s) to /dev/shm/rqg/1724928849/197/1_clone/data/ib_buffer_pool
      2024-08-29 14:23:13 0 [Note] InnoDB: Restricted to 126 pages due to innodb_buf_pool_dump_pct=25
      2024-08-29 14:23:13 0 [Note] InnoDB: Buffer pool(s) dump completed at 240829 14:23:13

      2024-08-29T14:23:14 [3420387] | [rr 3472874 110589]mariadbd: /data/Server/10.6-MDEV-34830/storage/innobase/buf/buf0flu.cc:2591: void buf_flush_buffer_pool(): Assertion `!os_aio_pending_reads()' failed.

      RQG
      ===
      # git clone https://github.com/mleich1/rqg --branch <pick the right branch> RQG
      #
      # GIT_SHOW: HEAD -> master, origin/master, origin/HEAD de161f09647823897338538e4f146ea6350eac29 2024-08-28T14:46:53+02:00
      # rqg.pl  : Version 4.5.0 (2023-12)
      #
      # $RQG_HOME/rqg.pl \
      # --grammar=conf/mariadb/innodb_compression_encryption.yy \
      # --gendata=conf/mariadb/innodb_compression_encryption.zz \
      # --max_gd_duration=1800 \
      # --mysqld=--innodb-encrypt-log \
      # --mysqld=--innodb-encrypt-tables \
      # --reporters=Mariabackup_linux \
      # --mysqld=--loose-innodb-log-file-size=200M \
      # --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \
      # --mysqld=--loose-idle_write_transaction_timeout=0 \
      # --mysqld=--loose-idle_transaction_timeout=0 \
      # --mysqld=--loose-idle_readonly_transaction_timeout=0 \
      # --mysqld=--connect_timeout=60 \
      # --mysqld=--interactive_timeout=28800 \
      # --mysqld=--slave_net_timeout=60 \
      # --mysqld=--net_read_timeout=30 \
      # --mysqld=--net_write_timeout=60 \
      # --mysqld=--loose-table_lock_wait_timeout=50 \
      # --mysqld=--wait_timeout=28800 \
      # --no_mask \
      # --queries=10000000 \
      # --seed=random \
      # --reporters=None \
      # --reporters=ErrorLog \
      # --reporters=Deadlock \
      # --validators=None \
      # --mysqld=--log_output=none \
      # --mysqld=--log_bin_trust_function_creators=1 \
      # --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
      # --engine=InnoDB \
      # --restart_timeout=240 \
      # --mysqld=--plugin-load-add=file_key_management.so \
      # --mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
      # --mysqld=--plugin-load-add=provider_lzo.so \
      # --mysqld=--plugin-load-add=provider_bzip2.so \
      # --mysqld=--plugin-load-add=provider_lzma.so \
      # --mysqld=--plugin-load-add=provider_snappy.so \
      # --mysqld=--plugin-load-add=provider_lz4.so \
      # --mysqld=--loose-innodb_compression_level=1 \
      # --duration=300 \
      # --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
      # --mysqld=--loose-innodb_data_file_path=ibdata1:1M:autoextend:autoshrink \
      # --redefine=conf/mariadb/redefine_innodb_log_write_ahead_size.yy \
      # --mysqld=--lock-wait-timeout=15 \
      # --mysqld=--innodb-lock-wait-timeout=10 \
      # --mysqld=--sql_mode=STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION \
      # --mysqld=--innodb_file_per_table=0 \
      # --mysqld=--loose-innodb_read_only_compressed=OFF \
      # --mysqld=--loose-innodb-sync-debug \
      # --mysqld=--innodb_stats_persistent=off \
      # --mysqld=--innodb_adaptive_hash_index=off \
      # --mysqld=--innodb_random_read_ahead=OFF \
      # --mysqld=--loose-innodb_evict_tables_on_commit_debug=off \
      # --threads=1 \
      # --rr=Extended \
      # --rr_options=--chaos --wait \
      # --mysqld=--innodb_undo_log_truncate=OFF \
      # --mysqld=--loose_innodb_change_buffering=none \
      # --mysqld=--innodb_undo_tablespaces=0 \
      # --mysqld=--innodb_rollback_on_timeout=OFF \
      # --mysqld=--innodb_page_size=16K \
      # --mysqld=--innodb-buffer-pool-size=8M \
      # --vardir_type=fast \
      # <local settings>
       
      Error pattern:
      [ 'TBR-2011', '(mariadbd|mysqld): .{1,300}buf0flu.cc:.{1,10}: void buf_flush_buffer_pool\(\): Assertion \`\!os_aio_pending_reads\(\)\' failed' ],
      

        Activity

          marko Marko Mäkelä added a comment:

          In the rr replay trace, one page read that had been initiated by buf_read_ahead_linear() was pending. I think that we must simply replace the failing assertion with a wait:

          diff --git a/storage/innobase/buf/buf0flu.cc b/storage/innobase/buf/buf0flu.cc
          index 46129a037b0..494cd7b3f42 100644
          --- a/storage/innobase/buf/buf0flu.cc
          +++ b/storage/innobase/buf/buf0flu.cc
          @@ -2588,12 +2588,12 @@ ATTRIBUTE_COLD void buf_flush_page_cleaner_init()
           /** Flush the buffer pool on shutdown. */
           ATTRIBUTE_COLD void buf_flush_buffer_pool()
           {
          -  ut_ad(!os_aio_pending_reads());
             ut_ad(!buf_page_cleaner_is_active);
             ut_ad(!buf_flush_sync_lsn);
           
             service_manager_extend_timeout(INNODB_EXTEND_TIMEOUT_INTERVAL,
                                            "Waiting to flush the buffer pool");
          +  os_aio_wait_until_no_pending_reads(false);
           
             mysql_mutex_lock(&buf_pool.flush_list_mutex);
           
          

          Note: In 10.5 the code is different. The assertion ut_ad(!buf_pool.any_io_pending()) at the end of the function could theoretically fail.

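          To illustrate the shape of this fix, here is a minimal, self-contained C++ sketch. It is hypothetical code, not InnoDB internals (all names in it are made up); it only shows the difference between asserting that no reads are pending and waiting for the pending-read count to drain, which is what the call to os_aio_wait_until_no_pending_reads() achieves in the patch above.

          // Hypothetical, simplified illustration (not InnoDB code): a pending-read
          // counter that a shutdown path can either assert on (racy) or wait on (safe).
          #include <chrono>
          #include <condition_variable>
          #include <mutex>
          #include <thread>

          static std::mutex mtx;
          static std::condition_variable cv;
          static int pending_reads = 0;   // protected by mtx

          // Called when an asynchronous page read is submitted.
          static void read_submitted() {
            std::lock_guard<std::mutex> lock(mtx);
            ++pending_reads;
          }

          // Completion callback, executed later by an I/O or thread-pool thread.
          static void read_completed() {
            std::lock_guard<std::mutex> lock(mtx);
            if (--pending_reads == 0)
              cv.notify_all();
          }

          // The shape of the fix: block until the last outstanding read has finished.
          static void wait_until_no_pending_reads() {
            std::unique_lock<std::mutex> lock(mtx);
            cv.wait(lock, [] { return pending_reads == 0; });
          }

          int main() {
            read_submitted();                       // e.g. a linear read-ahead request
            std::thread io([] {                     // its completion arrives "late"
              std::this_thread::sleep_for(std::chrono::milliseconds(50));
              read_completed();
            });
            // The old code effectively asserted pending_reads == 0 here,
            // which races with the completion running on the other thread.
            wait_until_no_pending_reads();          // the safe alternative: wait
            io.join();
            return 0;
          }

          In the real server the counter and the wait primitive live in the asynchronous I/O layer (os0file.cc / tpool); the sketch only mirrors the ordering argument.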

          debarun Debarun Banerjee added a comment:

          Indeed, from the RR trace we can see there is one read IO in progress for page ID 447.

          #0  0x0000000070000002 in syscall_traced ()
          #1  0x000055b382a08525 in _raw_syscall () at /home/ubuntu/rr/src/preload/raw_syscall.S:120
          #2  0x000055b382a01949 in traced_raw_syscall (call=0x7f4acdba2fa0) at /home/ubuntu/rr/src/preload/syscallbuf.c:350
          #3  0x000055b382a062df in sys_futex (call=<optimized out>) at /home/ubuntu/rr/src/preload/syscallbuf.c:2012
          #4  syscall_hook_internal (call=0x7f4acdba2fa0) at /home/ubuntu/rr/src/preload/syscallbuf.c:4097
          #5  syscall_hook (call=0x7f4acdba2fa0) at /home/ubuntu/rr/src/preload/syscallbuf.c:4274
          #6  0x000055b382a01353 in _syscall_hook_trampoline () at /home/ubuntu/rr/src/preload/syscall_hook.S:308
          #7  0x000055b382a013bd in __morestack () at /home/ubuntu/rr/src/preload/syscall_hook.S:443
          #8  0x000055b382a013c4 in _syscall_hook_trampoline_48_3d_01_f0_ff_ff () at /home/ubuntu/rr/src/preload/syscall_hook.S:457
          #9  0x000055b385631893 in syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
          #10 0x000055b383e0b6b8 in srw_mutex_impl<true>::wake (this=this@entry=0x7f4acefa4bb8) at /data/Server/10.6-MDEV-34830/storage/innobase/sync/srw_lock.cc:255
          #11 0x000055b383bd895a in srw_mutex_impl<true>::wr_unlock (this=0x7f4acefa4bb8) at /data/Server/10.6-MDEV-34830/storage/innobase/include/srw_lock.h:158
          #12 ssux_lock_impl<true>::wr_unlock (this=0x7f4acefa4bb8) at /data/Server/10.6-MDEV-34830/storage/innobase/include/srw_lock.h:324
          #13 sux_lock<ssux_lock_impl<true> >::u_or_x_unlock (this=0x7f4acefa4bb8, allow_readers=<optimized out>, claim_ownership=<optimized out>)
              at /data/Server/10.6-MDEV-34830/storage/innobase/include/sux_lock.h:298
          #14 0x000055b383eceeaa in sux_lock<ssux_lock_impl<true> >::x_unlock (claim_ownership=true, this=0x7f4acefa4bb8) at /data/Server/10.6-MDEV-34830/storage/innobase/include/sux_lock.h:305
          #15 buf_page_t::read_complete (this=0x7f4acefa4ba0, node=...) at /data/Server/10.6-MDEV-34830/storage/innobase/buf/buf0buf.cc:3780
          #16 0x000055b383f89678 in IORequest::read_complete (this=this@entry=0x55b38619d058, io_error=0) at /data/Server/10.6-MDEV-34830/storage/innobase/fil/fil0fil.cc:2941
          #17 0x000055b383cd1647 in read_io_callback (c=0x55b38619cfe0) at /data/Server/10.6-MDEV-34830/storage/innobase/os/os0file.cc:3453
          #18 0x000055b3840183f3 in tpool::task_group::execute (this=0x55b38619cf20, t=t@entry=0x55b38619d038) at /data/Server/10.6-MDEV-34830/tpool/task_group.cc:55
          

          It seems quite possible that some read-ahead async page fetch was requested; if we have never waited for that I/O to complete, this seems like a good place to do so.

          I agree with the fix.

          It does seem to be relevant for 10.5 too, though, unless we are somehow waiting for such completion earlier.

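          The point that nobody ever waited for this read can be illustrated with another hypothetical, simplified C++ model (again not InnoDB code): a fire-and-forget read-ahead whose completion callback runs later on a worker thread, loosely analogous to read_io_callback() and buf_page_t::read_complete() in the trace above.

          // Hypothetical model (not InnoDB code) of a fire-and-forget read-ahead:
          // the submitting thread never waits for the result, so the completion
          // callback can run arbitrarily late on a worker thread.
          #include <chrono>
          #include <cstdio>
          #include <functional>
          #include <future>
          #include <thread>
          #include <vector>

          // Simulated asynchronous page read: the callback runs on a worker thread
          // once the "I/O" finishes; nothing joins or waits on it at submit time.
          static std::future<void> submit_async_read(unsigned page_id,
                                                     std::function<void(unsigned)> on_complete) {
            return std::async(std::launch::async, [page_id, on_complete] {
              std::this_thread::sleep_for(std::chrono::milliseconds(30)); // "disk latency"
              on_complete(page_id);                                       // completion callback
            });
          }

          int main() {
            std::vector<std::future<void>> in_flight;

            // A read-ahead heuristic decides to prefetch a page nobody has asked for yet.
            in_flight.push_back(submit_async_read(447, [](unsigned page_id) {
              std::printf("completed read of page %u\n", page_id);
            }));

            // The submitting thread immediately moves on; a shutdown could begin here
            // while the read above is still in flight. Before the final buffer pool
            // flush, something has to drain the in-flight reads:
            for (auto& f : in_flight)
              f.wait();
            return 0;
          }

          In the server, that drain is what the added os_aio_wait_until_no_pending_reads(false) call provides before buf_flush_buffer_pool() proceeds.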

          marko Marko Mäkelä added a comment:

          Yes, for 10.5 a similar fix would seem to be applicable. It might even explain MDEV-26733.

          For 10.6, I was wondering whether there might be a potential impact: a change buffer merge could be initiated at the very last moment, and the write for it would be lost, resulting in corruption. This does not seem to be possible, because srv_shutdown() should have kept looping until the change buffer is empty. So, this bug should only have an impact on debug-instrumented builds.

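          A hypothetical sketch (not the actual srv_shutdown() code) of the drain-before-flush ordering that this argument relies on: the final flush is only reached once the pending change buffer work has gone to zero, so a merge initiated at the last moment has already been accounted for.

          #include <atomic>

          // Stand-in for the amount of change buffer work still pending.
          static std::atomic<int> change_buffer_entries{3};

          // Stand-in for one merge pass; in the sketch it just consumes one entry.
          static void merge_some_change_buffer() {
            if (change_buffer_entries.load() > 0)
              --change_buffer_entries;
          }

          // Keep looping until no change buffer work remains, and only then
          // proceed to the final buffer pool flush.
          static void shutdown_drain_then_flush() {
            while (change_buffer_entries.load() != 0)
              merge_some_change_buffer();
            // the final buffer pool flush would happen here, after the drain loop
          }

          int main() { shutdown_drain_then_flush(); return 0; }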

          People

            Assignee: marko Marko Mäkelä
            Reporter: mleich Matthias Leich