[MDEV-28796] mariabackup --prepare fails terminate called after throwing an instance of 'std::system_error' Created: 2022-06-10  Updated: 2022-07-08  Resolved: 2022-07-08

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6.9
Fix Version/s: 10.6.9

Type: Bug Priority: Major
Reporter: Matthias Leich Assignee: Vladislav Vaintroub
Resolution: Duplicate Votes: 0
Labels: rr-profile

Issue Links:
Duplicate
duplicates MDEV-28935 crash in io_slots::release Closed

 Description   

Scenario:
1. Start the server and load some initial data.
2. Several sessions run a DDL/DML mix
3. During 2. is ongoing the run
    mariabackup --backup
    mariabackup --prepare on the backuped data
The latter fails with
[rr 3134170 4577]2022-06-09 18:10:17 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1719498,4966836
[rr 3134170 6127]2022-06-09 18:10:20 0 [Note] InnoDB: Starting final batch to recover 1123 pages from redo log.
[rr 3134170 9572]2022-06-09 18:10:25 0 [Note] InnoDB: Last binlog file './mysql-bin.000005', position 369807
[rr 3134170 9605][00] 2022-06-09 18:10:25 Last binlog file ./mysql-bin.000005, position 369807
[rr 3134170 9982]terminate called after throwing an instance of '[rr 3134170 9984]std::system_error[rr 3134170 9986]'
[rr 3134170 9988]  what():  [rr 3134170 9990]Operation not permitted[rr 3134170 9992]
[rr 3134170 10012]220609 18:10:25 [rr 3134170 10014][ERROR] mysqld got signal 6 ;
 
pluto:/data/results/1654790072/003738/1_clone/rr
(rr) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x000014d02624f859 in __GI_abort () at abort.c:79
#2  0x0000524b46881911 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x0000524b4688d38c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x0000524b4688d3f7 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x0000524b4688d6a9 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x0000524b4688473f in std::__throw_system_error(int) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00005628bbed3e72 in std::unique_lock<std::mutex>::lock (this=0x102f683c6960) at /usr/include/c++/9/bits/unique_lock.h:136
#8  0x00005628bbed41a1 in std::unique_lock<std::mutex>::unique_lock (this=0x102f683c6960, __m=...) at /usr/include/c++/9/bits/unique_lock.h:71
#9  0x00005628bbf00b56 in tpool::cache<tpool::aiocb>::put (this=0x0, ele=0x1b0b639935a0) at /data/Server/bb-10.6-MDEV-21136-records_in_range-experimental/tpool/tpool_structs.h:100
#10 0x00005628bbeff923 in io_slots::release (this=0x0, aiocb=0x1b0b639935a0) at /data/Server/bb-10.6-MDEV-21136-records_in_range-experimental/storage/innobase/os/os0file.cc:106
#11 0x00005628bbefb810 in io_callback (cb=0x1b0b639935a0) at /data/Server/bb-10.6-MDEV-21136-records_in_range-experimental/storage/innobase/os/os0file.cc:3534
#12 0x00005628bc2ff8ce in tpool::task_group::execute (this=0x612000022750, t=0x1b0b639935f8) at /data/Server/bb-10.6-MDEV-21136-records_in_range-experimental/tpool/task_group.cc:55
#13 0x00005628bc300197 in tpool::task::execute (this=0x1b0b639935f8) at /data/Server/bb-10.6-MDEV-21136-records_in_range-experimental/tpool/task.cc:32
#14 0x00005628bc2ec7d9 in tpool::thread_pool_generic::worker_main (this=0x618000000080, thread_var=0x63000000fb80) at /data/Server/bb-10.6-MDEV-21136-records_in_range-experimental/tpool/tpool_generic.cc:588
#15 0x00005628bc2ff32f in std::__invoke_impl<void, void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__f=
    @0x6040000080e8: (void (tpool::thread_pool_generic::*)(class tpool::thread_pool_generic * const, struct tpool::worker_data *)) 0x5628bc2ec5f6 <tpool::thread_pool_generic::worker_main(tpool::worker_data*)>, 
    __t=@0x6040000080e0: 0x618000000080) at /usr/include/c++/9/bits/invoke.h:73
#16 0x00005628bc2ff0da in std::__invoke<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__fn=
    @0x6040000080e8: (void (tpool::thread_pool_generic::*)(class tpool::thread_pool_generic * const, struct tpool::worker_data *)) 0x5628bc2ec5f6 <tpool::thread_pool_generic::worker_main(tpool::worker_data*)>)
    at /usr/include/c++/9/bits/invoke.h:95
#17 0x00005628bc2fef6b in std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::_M_invoke<0ul, 1ul, 2ul> (this=0x6040000080d8)
    at /usr/include/c++/9/thread:244
#18 0x00005628bc2feeb5 in std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::operator() (this=0x6040000080d8)
    at /usr/include/c++/9/thread:251
#19 0x00005628bc2fee1a in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::_M_run (this=0x6040000080d0)
    at /usr/include/c++/9/thread:195
#20 0x0000524b468b9de4 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#21 0x000046c472eb6609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#22 0x000014d02634c293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(rr)
 
I assume this problem affects the current official 10.6 because I replayed it on two development trees.
origin/bb-10.6-MDEV-21136-records_in_range-experimental 1f39a8b73437c93accd63cff5147fd305cd0ead5 2022-06-09T18:10:55+03:00
origin/10.6 e11b82f8f5c7006f7bf271f0db8e8c514cc9baad 2022-06-09T13:34:52+03:00
+ one Patch by Thiru
 
RQG
===
# git clone https://github.com/mleich1/rqg --branch experimental RQG
#
# GIT_SHOW: HEAD -> experimental, origin/experimental fcae46996ce85de6bb0844799b69ed6ee8add272 2022-06-01T20:04:31+02:00
# rqg.pl  : Version 4.0.6 (2022-05)
#
# $RQG_HOME/rqg.pl \
# --gendata=conf/mariadb/oltp.zz \
# --max_gd_duration=900 \
# --grammar=conf/mariadb/oltp.yy \
# --reporters=Mariabackup_linux \
# --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \
# --mysqld=--loose-idle_write_transaction_timeout=0 \
# --mysqld=--loose-idle_transaction_timeout=0 \
# --mysqld=--loose-idle_readonly_transaction_timeout=0 \
# --mysqld=--connect_timeout=60 \
# --mysqld=--interactive_timeout=28800 \
# --mysqld=--slave_net_timeout=60 \
# --mysqld=--net_read_timeout=30 \
# --mysqld=--net_write_timeout=60 \
# --mysqld=--loose-table_lock_wait_timeout=50 \
# --mysqld=--wait_timeout=28800 \
# --mysqld=--lock-wait-timeout=86400 \
# --mysqld=--innodb-lock-wait-timeout=50 \
# --no-mask \
# --queries=10000000 \
# --seed=random \
# --reporters=Backtrace \
# --reporters=ErrorLog \
# --reporters=Deadlock1 \
# --validators=None \
# --mysqld=--log_output=none \
# --mysqld=--log_bin_trust_function_creators=1 \
# --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
# --engine=InnoDB \
# --restart_timeout=240 \
# --mysqld=--plugin-load-add=file_key_management.so \
# --mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
# --mysqld=--plugin-load-add=provider_lzo.so \
# --mysqld=--plugin-load-add=provider_bzip2.so \
# --mysqld=--plugin-load-add=provider_lzma.so \
# --mysqld=--plugin-load-add=provider_snappy.so \
# --mysqld=--plugin-load-add=provider_lz4.so \
# --duration=300 \
# --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
# --mysqld=--loose-innodb_read_only_compressed=OFF \
# --mysqld=--loose-innodb-sync-debug \
# --mysqld=--innodb_stats_persistent=on \
# --mysqld=--innodb_adaptive_hash_index=off \
# --mysqld=--log-bin \
# --mysqld=--sync-binlog=1 \
# --mysqld=--loose-innodb_evict_tables_on_commit_debug=off \
# --mysqld=--loose-max-statement-time=30 \
# --threads=1 \
# --mysqld=--innodb-use-native-aio=0 \
# --mysqld=--loose-gdb \
# --mysqld=--loose-debug-gdb \
# --rr=Extended \
# --mysqld=--innodb_rollback_on_timeout=OFF \
# --vardir_type=slow \
# --mysqld=--innodb_page_size=4K \
# --mysqld=--innodb-buffer-pool-size=5M \
# --no_mask \
# --batch \
# <local settings>



 Comments   
Comment by Marko Mäkelä [ 2022-06-10 ]

The crashing thread is invoking the following undefined behaviour (this==nullptr):

#0  io_slots::release (this=0x0, aiocb=0x1b0b639935a0)
    at /data/Server/bb-10.6-MDEV-21136-records_in_range-experimental/storage/innobase/os/os0file.cc:106
#1  0x00005628bbefb810 in io_callback (cb=0x1b0b639935a0)
    at /data/Server/bb-10.6-MDEV-21136-records_in_range-experimental/storage/innobase/os/os0file.cc:3534

At the same time, another thread is executing the following:

#14 0x00005628bbeff96c in io_slots::wait (this=0x6120000226c0) at /data/Server/bb-10.6-MDEV-21136-records_in_range-experimental/storage/innobase/os/os0file.cc:117
#15 0x00005628bbeffa10 in io_slots::~io_slots (this=0x6120000226c0, __in_chrg=<optimized out>) at /data/Server/bb-10.6-MDEV-21136-records_in_range-experimental/storage/innobase/os/os0file.cc:132
#16 0x00005628bbf02484 in std::default_delete<io_slots>::operator() (this=0x5628bedc6880 <read_slots>, __ptr=0x6120000226c0) at /usr/include/c++/9/bits/unique_ptr.h:81
#17 0x00005628bbf0166c in std::unique_ptr<io_slots, std::default_delete<io_slots> >::reset (this=0x5628bedc6880 <read_slots>, __p=0x6120000226c0) at /usr/include/c++/9/bits/unique_ptr.h:402
#18 0x00005628bbefc14b in os_aio_free () at /data/Server/bb-10.6-MDEV-21136-records_in_range-experimental/storage/innobase/os/os0file.cc:3725
#19 0x00005628bc101200 in innodb_shutdown () at /data/Server/bb-10.6-MDEV-21136-records_in_range-experimental/storage/innobase/srv/srv0start.cc:1984
#20 0x00005628ba6d33ca in xtrabackup_prepare_func (argv=0x615000001c80) at /data/Server/bb-10.6-MDEV-21136-records_in_range-experimental/extra/mariabackup/xtrabackup.cc:6095

Generated at Thu Feb 08 10:03:32 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.