  MariaDB Server / MDEV-34845

buf_flush_buffer_pool(): Assertion `!os_aio_pending_reads()' failed

Details

    Description

      Scenario:
      1. Start a server and load some initial data.
      2. Several connections run random SQL concurrently.
      3. While 2. is ongoing, run cycles consisting of
      3.1 mariabackup --backup <to location A>
      3.2 mariabackup --prepare <on location A>
      3.3 Start a server on the data in location A
      3.4 Run certain checks on the data of that server
      3.5 Shut down that server
      At some point the shutdown of that server failed with
      2024-08-29 14:23:12 0 [Note] /data/Server_bin/10.6-MDEV-34830_debug_Og/bin/mariadbd (initiated by: root[root] @ localhost [127.0.0.1]): Normal shutdown
      2024-08-29 14:23:12 0 [Warning] Aborted connection 6 to db: 'mysql' user: 'root' host: 'localhost' (KILLED)
      2024-08-29 14:23:13 0 [Note] InnoDB: FTS optimize thread exiting.
      2024-08-29 14:23:13 0 [Note] InnoDB: Starting shutdown...
      2024-08-29 14:23:13 0 [Note] InnoDB: Dumping buffer pool(s) to /dev/shm/rqg/1724928849/197/1_clone/data/ib_buffer_pool
      2024-08-29 14:23:13 0 [Note] InnoDB: Restricted to 126 pages due to innodb_buf_pool_dump_pct=25
      2024-08-29 14:23:13 0 [Note] InnoDB: Buffer pool(s) dump completed at 240829 14:23:13

      2024-08-29T14:23:14 [3420387] | [rr 3472874 110589]mariadbd: /data/Server/10.6-MDEV-34830/storage/innobase/buf/buf0flu.cc:2591: void buf_flush_buffer_pool(): Assertion `!os_aio_pending_reads()' failed.

      RQG
      ===
      # git clone https://github.com/mleich1/rqg --branch <pick the right branch> RQG
      #
      # GIT_SHOW: HEAD -> master, origin/master, origin/HEAD de161f09647823897338538e4f146ea6350eac29 2024-08-28T14:46:53+02:00
      # rqg.pl  : Version 4.5.0 (2023-12)
      #
      # $RQG_HOME/rqg.pl \
      # --grammar=conf/mariadb/innodb_compression_encryption.yy \
      # --gendata=conf/mariadb/innodb_compression_encryption.zz \
      # --max_gd_duration=1800 \
      # --mysqld=--innodb-encrypt-log \
      # --mysqld=--innodb-encrypt-tables \
      # --reporters=Mariabackup_linux \
      # --mysqld=--loose-innodb-log-file-size=200M \
      # --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \
      # --mysqld=--loose-idle_write_transaction_timeout=0 \
      # --mysqld=--loose-idle_transaction_timeout=0 \
      # --mysqld=--loose-idle_readonly_transaction_timeout=0 \
      # --mysqld=--connect_timeout=60 \
      # --mysqld=--interactive_timeout=28800 \
      # --mysqld=--slave_net_timeout=60 \
      # --mysqld=--net_read_timeout=30 \
      # --mysqld=--net_write_timeout=60 \
      # --mysqld=--loose-table_lock_wait_timeout=50 \
      # --mysqld=--wait_timeout=28800 \
      # --no_mask \
      # --queries=10000000 \
      # --seed=random \
      # --reporters=None \
      # --reporters=ErrorLog \
      # --reporters=Deadlock \
      # --validators=None \
      # --mysqld=--log_output=none \
      # --mysqld=--log_bin_trust_function_creators=1 \
      # --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
      # --engine=InnoDB \
      # --restart_timeout=240 \
      # --mysqld=--plugin-load-add=file_key_management.so \
      # --mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
      # --mysqld=--plugin-load-add=provider_lzo.so \
      # --mysqld=--plugin-load-add=provider_bzip2.so \
      # --mysqld=--plugin-load-add=provider_lzma.so \
      # --mysqld=--plugin-load-add=provider_snappy.so \
      # --mysqld=--plugin-load-add=provider_lz4.so \
      # --mysqld=--loose-innodb_compression_level=1 \
      # --duration=300 \
      # --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
      # --mysqld=--loose-innodb_data_file_path=ibdata1:1M:autoextend:autoshrink \
      # --redefine=conf/mariadb/redefine_innodb_log_write_ahead_size.yy \
      # --mysqld=--lock-wait-timeout=15 \
      # --mysqld=--innodb-lock-wait-timeout=10 \
      # --mysqld=--sql_mode=STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION \
      # --mysqld=--innodb_file_per_table=0 \
      # --mysqld=--loose-innodb_read_only_compressed=OFF \
      # --mysqld=--loose-innodb-sync-debug \
      # --mysqld=--innodb_stats_persistent=off \
      # --mysqld=--innodb_adaptive_hash_index=off \
      # --mysqld=--innodb_random_read_ahead=OFF \
      # --mysqld=--loose-innodb_evict_tables_on_commit_debug=off \
      # --threads=1 \
      # --rr=Extended \
      # --rr_options=--chaos --wait \
      # --mysqld=--innodb_undo_log_truncate=OFF \
      # --mysqld=--loose_innodb_change_buffering=none \
      # --mysqld=--innodb_undo_tablespaces=0 \
      # --mysqld=--innodb_rollback_on_timeout=OFF \
      # --mysqld=--innodb_page_size=16K \
      # --mysqld=--innodb-buffer-pool-size=8M \
      # --vardir_type=fast \
      # <local settings>
       
      Error pattern:
      [ 'TBR-2011', '(mariadbd|mysqld): .{1,300}buf0flu.cc:.{1,10}: void buf_flush_buffer_pool\(\): Assertion \`\!os_aio_pending_reads\(\)\' failed' ],
      

        Activity

          marko Marko Mäkelä added a comment:

          In the rr replay trace, one page read that had been initiated by buf_read_ahead_linear() was pending. I think that we must simply replace the failing assertion with a wait:

          diff --git a/storage/innobase/buf/buf0flu.cc b/storage/innobase/buf/buf0flu.cc
          index 46129a037b0..494cd7b3f42 100644
          --- a/storage/innobase/buf/buf0flu.cc
          +++ b/storage/innobase/buf/buf0flu.cc
          @@ -2588,12 +2588,12 @@ ATTRIBUTE_COLD void buf_flush_page_cleaner_init()
           /** Flush the buffer pool on shutdown. */
           ATTRIBUTE_COLD void buf_flush_buffer_pool()
           {
          -  ut_ad(!os_aio_pending_reads());
             ut_ad(!buf_page_cleaner_is_active);
             ut_ad(!buf_flush_sync_lsn);
           
             service_manager_extend_timeout(INNODB_EXTEND_TIMEOUT_INTERVAL,
                                            "Waiting to flush the buffer pool");
          +  os_aio_wait_until_no_pending_reads(false);
           
             mysql_mutex_lock(&buf_pool.flush_list_mutex);
           
          

          Note: In 10.5 the code is different. The assertion ut_ad(!buf_pool.any_io_pending()) at the end of the function could theoretically fail.

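          To illustrate the shape of this fix, here is a minimal, self-contained C++ sketch. It is hypothetical code, not InnoDB internals (all names in it are made up); it only shows the difference between asserting that no reads are pending and waiting for the pending-read count to drain, which is what the call to os_aio_wait_until_no_pending_reads() achieves in the patch above.

          // Hypothetical, simplified illustration (not InnoDB code): a pending-read
          // counter that a shutdown path can either assert on (racy) or wait on (safe).
          #include <chrono>
          #include <condition_variable>
          #include <mutex>
          #include <thread>

          static std::mutex mtx;
          static std::condition_variable cv;
          static int pending_reads = 0;   // protected by mtx

          // Called when an asynchronous page read is submitted.
          static void read_submitted() {
            std::lock_guard<std::mutex> lock(mtx);
            ++pending_reads;
          }

          // Completion callback, executed later by an I/O or thread-pool thread.
          static void read_completed() {
            std::lock_guard<std::mutex> lock(mtx);
            if (--pending_reads == 0)
              cv.notify_all();
          }

          // The shape of the fix: block until the last outstanding read has finished.
          static void wait_until_no_pending_reads() {
            std::unique_lock<std::mutex> lock(mtx);
            cv.wait(lock, [] { return pending_reads == 0; });
          }

          int main() {
            read_submitted();                       // e.g. a linear read-ahead request
            std::thread io([] {                     // its completion arrives "late"
              std::this_thread::sleep_for(std::chrono::milliseconds(50));
              read_completed();
            });
            // The old code effectively asserted pending_reads == 0 here,
            // which races with the completion running on the other thread.
            wait_until_no_pending_reads();          // the safe alternative: wait
            io.join();
            return 0;
          }

          In the real server the counter and the wait primitive live in the asynchronous I/O layer (os0file.cc / tpool); the sketch only mirrors the ordering argument.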

          debarun Debarun Banerjee added a comment:

          Indeed, from the RR trace we can see there is one read IO in progress for page ID 447.

          #0  0x0000000070000002 in syscall_traced ()
          #1  0x000055b382a08525 in _raw_syscall () at /home/ubuntu/rr/src/preload/raw_syscall.S:120
          #2  0x000055b382a01949 in traced_raw_syscall (call=0x7f4acdba2fa0) at /home/ubuntu/rr/src/preload/syscallbuf.c:350
          #3  0x000055b382a062df in sys_futex (call=<optimized out>) at /home/ubuntu/rr/src/preload/syscallbuf.c:2012
          #4  syscall_hook_internal (call=0x7f4acdba2fa0) at /home/ubuntu/rr/src/preload/syscallbuf.c:4097
          #5  syscall_hook (call=0x7f4acdba2fa0) at /home/ubuntu/rr/src/preload/syscallbuf.c:4274
          #6  0x000055b382a01353 in _syscall_hook_trampoline () at /home/ubuntu/rr/src/preload/syscall_hook.S:308
          #7  0x000055b382a013bd in __morestack () at /home/ubuntu/rr/src/preload/syscall_hook.S:443
          #8  0x000055b382a013c4 in _syscall_hook_trampoline_48_3d_01_f0_ff_ff () at /home/ubuntu/rr/src/preload/syscall_hook.S:457
          #9  0x000055b385631893 in syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
          #10 0x000055b383e0b6b8 in srw_mutex_impl<true>::wake (this=this@entry=0x7f4acefa4bb8) at /data/Server/10.6-MDEV-34830/storage/innobase/sync/srw_lock.cc:255
          #11 0x000055b383bd895a in srw_mutex_impl<true>::wr_unlock (this=0x7f4acefa4bb8) at /data/Server/10.6-MDEV-34830/storage/innobase/include/srw_lock.h:158
          #12 ssux_lock_impl<true>::wr_unlock (this=0x7f4acefa4bb8) at /data/Server/10.6-MDEV-34830/storage/innobase/include/srw_lock.h:324
          #13 sux_lock<ssux_lock_impl<true> >::u_or_x_unlock (this=0x7f4acefa4bb8, allow_readers=<optimized out>, claim_ownership=<optimized out>)
              at /data/Server/10.6-MDEV-34830/storage/innobase/include/sux_lock.h:298
          #14 0x000055b383eceeaa in sux_lock<ssux_lock_impl<true> >::x_unlock (claim_ownership=true, this=0x7f4acefa4bb8) at /data/Server/10.6-MDEV-34830/storage/innobase/include/sux_lock.h:305
          #15 buf_page_t::read_complete (this=0x7f4acefa4ba0, node=...) at /data/Server/10.6-MDEV-34830/storage/innobase/buf/buf0buf.cc:3780
          #16 0x000055b383f89678 in IORequest::read_complete (this=this@entry=0x55b38619d058, io_error=0) at /data/Server/10.6-MDEV-34830/storage/innobase/fil/fil0fil.cc:2941
          #17 0x000055b383cd1647 in read_io_callback (c=0x55b38619cfe0) at /data/Server/10.6-MDEV-34830/storage/innobase/os/os0file.cc:3453
          #18 0x000055b3840183f3 in tpool::task_group::execute (this=0x55b38619cf20, t=t@entry=0x55b38619d038) at /data/Server/10.6-MDEV-34830/tpool/task_group.cc:55
          

          It seems quite possible that some read-ahead async page fetch was requested; if we have never waited for that I/O to complete, this seems like a good place to do so.

          I agree with the fix.

          It does seem to be relevant for 10.5 too, though, unless we are somehow waiting for such completion earlier.

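          The point that nobody ever waited for this read can be illustrated with another hypothetical, simplified C++ model (again not InnoDB code): a fire-and-forget read-ahead whose completion callback runs later on a worker thread, loosely analogous to read_io_callback() and buf_page_t::read_complete() in the trace above.

          // Hypothetical model (not InnoDB code) of a fire-and-forget read-ahead:
          // the submitting thread never waits for the result, so the completion
          // callback can run arbitrarily late on a worker thread.
          #include <chrono>
          #include <cstdio>
          #include <functional>
          #include <future>
          #include <thread>
          #include <vector>

          // Simulated asynchronous page read: the callback runs on a worker thread
          // once the "I/O" finishes; nothing joins or waits on it at submit time.
          static std::future<void> submit_async_read(unsigned page_id,
                                                     std::function<void(unsigned)> on_complete) {
            return std::async(std::launch::async, [page_id, on_complete] {
              std::this_thread::sleep_for(std::chrono::milliseconds(30)); // "disk latency"
              on_complete(page_id);                                       // completion callback
            });
          }

          int main() {
            std::vector<std::future<void>> in_flight;

            // A read-ahead heuristic decides to prefetch a page nobody has asked for yet.
            in_flight.push_back(submit_async_read(447, [](unsigned page_id) {
              std::printf("completed read of page %u\n", page_id);
            }));

            // The submitting thread immediately moves on; a shutdown could begin here
            // while the read above is still in flight. Before the final buffer pool
            // flush, something has to drain the in-flight reads:
            for (auto& f : in_flight)
              f.wait();
            return 0;
          }

          In the server, that drain is what the added os_aio_wait_until_no_pending_reads(false) call provides before buf_flush_buffer_pool() proceeds.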

          marko Marko Mäkelä added a comment:

          Yes, for 10.5 a similar fix would seem to be applicable. It might even explain MDEV-26733.

          For 10.6, I was wondering whether there might be a potential impact: a change buffer merge could be initiated at the very last moment, and the write for it would be lost, resulting in corruption. This does not seem to be possible, because srv_shutdown() should have kept looping until the change buffer is empty. So, this bug should only have an impact on debug-instrumented builds.

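          A hypothetical sketch (not the actual srv_shutdown() code) of the drain-before-flush ordering that this argument relies on: the final flush is only reached once the pending change buffer work has gone to zero, so a merge initiated at the last moment has already been accounted for.

          #include <atomic>

          // Stand-in for the amount of change buffer work still pending.
          static std::atomic<int> change_buffer_entries{3};

          // Stand-in for one merge pass; in the sketch it just consumes one entry.
          static void merge_some_change_buffer() {
            if (change_buffer_entries.load() > 0)
              --change_buffer_entries;
          }

          // Keep looping until no change buffer work remains, and only then
          // proceed to the final buffer pool flush.
          static void shutdown_drain_then_flush() {
            while (change_buffer_entries.load() != 0)
              merge_some_change_buffer();
            // the final buffer pool flush would happen here, after the drain loop
          }

          int main() { shutdown_drain_then_flush(); return 0; }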

          People

            Assignee: marko Marko Mäkelä
            Reporter: mleich Matthias Leich