Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version: 10.6.9
Description
1. Create a table.
2. Take the first (full) part of the backup:
sudo mariadb-backup --backup -utest_user -pct570c3521fCCR# --target-dir=backup1 --throttle=2
3. Create another table.
4. Try to take the second, incremental part of the backup:
sudo mariadb-backup --backup -utest_user -pct570c3521fCCR# --target-dir=backup1_2 --incremental-basedir=backup1 --throttle=2
Result: the process hangs forever at:
[00] 2022-10-27 11:44:33 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
[00] 2022-10-27 11:44:33 mariabackup: The latest check point (for incremental): '42171'
mariabackup: Stopping log copying thread.....[00] 2022-10-27 11:44:34 >> log scanned up to (751195)
throttle-1.sh - script to reproduce
Attachments
- log_--throttle=2.log (45 kB)
- throttle-1.sh (1 kB)
- throttle10000000.log (38 kB)
Issue Links
- is caused by MDEV-21452 "Use condition variables and normal mutexes instead of InnoDB os_event and mutex" (Closed)
Activity
An attempt to attach GDB to the hung mariadb-backup process:
Attaching to process 323145
[New LWP 323146]
[New LWP 323147]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
futex_wait (private=<optimized out>, expected=4294967292, futex_word=0x55ae1baf7384 <log_copying_stop+36>) at ../sysdeps/nptl/futex-internal.h:141
141 ../sysdeps/nptl/futex-internal.h: No such file or directory.
(gdb) bt
#0 futex_wait (private=<optimized out>, expected=4294967292, futex_word=0x55ae1baf7384 <log_copying_stop+36>) at ../sysdeps/nptl/futex-internal.h:141
#1 futex_wait_simple (private=<optimized out>, expected=4294967292, futex_word=0x55ae1baf7384 <log_copying_stop+36>) at ../sysdeps/nptl/futex-internal.h:172
#2 __pthread_cond_destroy (cond=0x55ae1baf7360 <log_copying_stop>) at pthread_cond_destroy.c:54
#3 0x000055ae1a9afe14 in stop_backup_threads (running=<optimized out>) at ./extra/mariabackup/xtrabackup.cc:4377
#4 0x000055ae1a9c17fe in xtrabackup_backup_low () at ./extra/mariabackup/xtrabackup.cc:4417
#5 xtrabackup_backup_func () at ./extra/mariabackup/xtrabackup.cc:4774
#6 main_low (argv=<optimized out>) at ./extra/mariabackup/xtrabackup.cc:6941
#7 0x000055ae1a96b332 in main (argc=<optimized out>, argv=<optimized out>) at ./extra/mariabackup/xtrabackup.cc:6731

The process appears to be stopped at https://github.com/MariaDB/server/blob/mariadb-10.6.9/extra/mariabackup/xtrabackup.cc#L4377
It looks like marko touched that code last, assigned to him in the absence of better criteria.
tturenko, your GDB output shows additional threads (LWP = light-weight process). Can you please post stack traces of all threads (thread apply all backtrace)?
I do not see anything obviously wrong in the code around the condition variable log_copying_stop. There does seem to be a race condition around the flag log_copying_running: it is mostly protected by log_sys.mutex, but stop_backup_threads() polls it without holding any mutex. That flag should be declared as Atomic_relaxed<bool> for the polling to be well defined. But this should not make any difference to the generated AMD64 object code.
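For illustration, a minimal standalone sketch of the declaration meant above. It uses std::atomic<bool> with explicit relaxed ordering, which is roughly what MariaDB's Atomic_relaxed<bool> wrapper boils down to; the function bodies are condensed stand-ins, not the actual xtrabackup.cc code:

#include <atomic>

// The flag is normally written under log_sys.mutex, but
// stop_backup_threads() polls it without any mutex; declaring it as a
// relaxed atomic makes that polling pattern well defined.
static std::atomic<bool> log_copying_running{true};

// Condensed stand-in for the log copying thread's exit path.
static void log_copying_thread_exit()
{
  log_copying_running.store(false, std::memory_order_relaxed);
}

// Condensed stand-in for the polling loop in stop_backup_threads().
static bool copying_still_running()
{
  return log_copying_running.load(std::memory_order_relaxed);
}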
(gdb) thread apply all bt

Thread 3 (Thread 0x7f2eab9fe700 (LWP 28723)):
#0 syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1 0x000055ee42dc480d in my_getevents (min_nr=1, nr=256, ev=0x7f2eab9fbde0, ctx=<optimized out>) at ./tpool/aio_linux.cc:104
#2 tpool::aio_linux::getevent_thread_routine (aio=0x55ee452c1030) at ./tpool/aio_linux.cc:104
#3 0x00007f2eb06fdde4 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007f2eb0817609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#5 0x00007f2eb0403133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7f2eb0139700 (LWP 28722)):
#0 futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7f2eb0138df0, clockid=<optimized out>, expected=0, futex_word=0x55ee43fc500c <COND_timer+44>) at ../sysdeps/nptl/futex-internal.h:320
#1 __pthread_cond_wait_common (abstime=0x7f2eb0138df0, clockid=<optimized out>, mutex=0x55ee43fc5020 <LOCK_timer>, cond=0x55ee43fc4fe0 <COND_timer>) at pthread_cond_wait.c:520
#2 __pthread_cond_timedwait (cond=cond@entry=0x55ee43fc4fe0 <COND_timer>, mutex=mutex@entry=0x55ee43fc5020 <LOCK_timer>, abstime=abstime@entry=0x7f2eb0138df0) at pthread_cond_wait.c:665
#3 0x000055ee42e32b55 in inline_mysql_cond_timedwait (that=0x55ee43fc4fe0 <COND_timer>, mutex=0x55ee43fc5020 <LOCK_timer>, src_file=0x55ee430d7950 "/home/buildbot/buildbot/build/mariadb-10.6.9/mysys/thr_timer.c", src_line=321, abstime=0x7f2eb0138df0) at ./include/mysql/psi/mysql_thread.h:1088
#4 timer_handler (arg=<optimized out>) at ./mysys/thr_timer.c:321
#5 0x00007f2eb0817609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6 0x00007f2eb0403133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7f2eb014e800 (LWP 28721)):
#0 futex_wait (private=<optimized out>, expected=4294967292, futex_word=0x55ee436f2384 <log_copying_stop+36>) at ../sysdeps/nptl/futex-internal.h:141
#1 futex_wait_simple (private=<optimized out>, expected=4294967292, futex_word=0x55ee436f2384 <log_copying_stop+36>) at ../sysdeps/nptl/futex-internal.h:172
#2 __pthread_cond_destroy (cond=0x55ee436f2360 <log_copying_stop>) at pthread_cond_destroy.c:54
#3 0x000055ee425aae14 in stop_backup_threads (running=<optimized out>) at ./extra/mariabackup/xtrabackup.cc:4377
#4 0x000055ee425bc7fe in xtrabackup_backup_low () at ./extra/mariabackup/xtrabackup.cc:4417
#5 xtrabackup_backup_func () at ./extra/mariabackup/xtrabackup.cc:4774
#6 main_low (argv=<optimized out>) at ./extra/mariabackup/xtrabackup.cc:6941
#7 0x000055ee42566332 in main (argc=<optimized out>, argv=<optimized out>) at ./extra/mariabackup/xtrabackup.cc:6731
Thank you. This looks somewhat similar to the mystery in MDEV-29843. There, pthread_cond_signal() was waiting for something for no apparent reason; here, we have pthread_cond_destroy() waiting. The condition variable log_copying_stop should never be accessed by any of the other threads. Can you please show the output of the following as well?

print log_copying_stop

What is the version number of the pthread implementation? Usually it is GNU libc.
Can you try to repeat the experiment with a cmake -DWITH_ASAN=ON instrumented executable? Would it flag any heap-use-after-free or other signs of memory corruption?
The function pthread_cond_destroy() is apparently waiting for a field __wrefs to reach 0. I found some related discussion, including an idea of a fix:

diff --git a/extra/mariabackup/xtrabackup.cc b/extra/mariabackup/xtrabackup.cc
index 4d2da0082aa..ba95b8beb88 100644
--- a/extra/mariabackup/xtrabackup.cc
+++ b/extra/mariabackup/xtrabackup.cc
@@ -4375,6 +4375,7 @@ static void stop_backup_threads(bool running)
     std::this_thread::sleep_for(std::chrono::milliseconds(200));
   }
   putc('\n', stderr);
+  mysql_cond_broadcast(&log_copying_stop);
   mysql_cond_destroy(&log_copying_stop);
 }
I see that we are only using timed waits when waiting for the condition variable. If the waits are terminated by a timeout, it seems possible that the condition variable ‘thinks’ that some waits are pending.
The patch does not avoid the hang when using GNU libc 2.31-0ubuntu9.9. The pthread_cond_broadcast() in it (as well as in the latest source code) is only reading __wrefs, not modifying it. There is a comment in pthread_cond_destroy.c:

/* See __pthread_cond_wait for a high-level description of the algorithm.

   A correct program must make sure that no waiters are blocked on the condvar
   when it is destroyed, and that there are no concurrent signals or
   broadcasts.  To wake waiters reliably, the program must signal or
   broadcast while holding the mutex or after having held the mutex.  It must
   also ensure that no signal or broadcast are still pending to unblock
   waiters; IOW, because waiters can wake up spuriously, the program must
   effectively ensure that destruction happens after the execution of those
   signal or broadcast calls.
   Thus, we can assume that all waiters that are still accessing the condvar
   have been woken.  We wait until they have confirmed to have woken up by
   decrementing __wrefs.  */
It looks like some more interlocking is needed.
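As a sketch of the interlocking that the glibc comment calls for (hypothetical; not the actual fix that went into the tree): broadcast while holding the mutex, and destroy only after every waiter has confirmed, under that same mutex, that it has left the wait loop:

#include <pthread.h>
#include <sched.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool stop = false;      // guarded by mtx
static unsigned waiters = 0;   // guarded by mtx

// Waiter side: the waiter count is maintained under the mutex, so the
// destroyer can tell when no thread can still touch the condvar.
static void wait_until_stopped()
{
  pthread_mutex_lock(&mtx);
  ++waiters;
  while (!stop)
    pthread_cond_wait(&cond, &mtx);  // a timed wait would work the same way
  --waiters;
  pthread_mutex_unlock(&mtx);
}

// Stopper side: broadcast under the mutex, then destroy only after all
// waiters have confirmed wakeup by decrementing the counter.
static void stop_and_destroy()
{
  pthread_mutex_lock(&mtx);
  stop = true;
  pthread_cond_broadcast(&cond);
  while (waiters != 0)             // polling only to keep the sketch short
  {
    pthread_mutex_unlock(&mtx);
    sched_yield();
    pthread_mutex_lock(&mtx);
  }
  pthread_mutex_unlock(&mtx);
  pthread_cond_destroy(&cond);     // now no waiter can be blocked on cond
}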
I can repeat a hang with the following:

diff --git a/mysql-test/suite/mariabackup/incremental_backup.test b/mysql-test/suite/mariabackup/incremental_backup.test
index ec38af4c27d..5db5646ce57 100644
--- a/mysql-test/suite/mariabackup/incremental_backup.test
+++ b/mysql-test/suite/mariabackup/incremental_backup.test
@@ -33,7 +33,7 @@ INSERT INTO t VALUES(1);
 
 echo # Create full backup , modify table, then create incremental/differential backup;
 --disable_result_log
-exec $XTRABACKUP --defaults-file=$MYSQLTEST_VARDIR/my.cnf --backup --parallel=10 --target-dir=$basedir;
+exec $XTRABACKUP --defaults-file=$MYSQLTEST_VARDIR/my.cnf --backup --parallel=10 --target-dir=$basedir --throttle=2;
 --enable_result_log
 BEGIN;
 INSERT INTO t VALUES(0);
However, it is a different hang: log_copying_thread() would keep invoking xtrabackup_copy_logfile() every 2 seconds, apparently never finishing, even though there should be no log to copy.
My above change to the test does not always reproduce an actual hang. In one case that I debugged (innodb_page_size=32k, innodb_undo_tablespaces=0), it only limited the speed of copying files to about 2 operations per second. When I set a data watchpoint on io_ticket in GDB, small progress was being made, copying some non-InnoDB files. Eventually the test did complete, in 3 to 4 minutes. But a subsequent test, with innodb_undo_tablespaces=3, did hang, with only 2 threads active: timer_handler waiting for something, and the thread that tries to execute mysql_cond_destroy(&log_copying_stop).
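For context, a condensed sketch of how the throttling works, paraphrased from reading xtrabackup.cc (names echo the source, everything else is simplified and may differ in detail): copy threads consume one I/O ticket per operation and block when the budget is exhausted, while a watcher thread refills the budget about once per second, which is why --throttle=2 caps copying at roughly 2 operations per second:

#include <chrono>
#include <condition_variable>
#include <mutex>

static std::mutex m;
static std::condition_variable log_copying_stop_cv;  // watcher's 1-second pacing
static std::condition_variable wait_throttle;        // copy threads block here
static unsigned io_ticket = 0;                       // I/O budget for this second
static bool running = true;
static const unsigned throttle = 2;                  // --throttle=2

// Role of io_watching_thread(): once per second, refill the ticket
// budget and wake any copy threads that ran out.  After MDEV-21452 the
// pacing wait happens on the log_copying_stop condition variable, which
// is why this thread depends on that object being initialized.
void watcher()
{
  std::unique_lock<std::mutex> lk(m);
  while (running)
  {
    log_copying_stop_cv.wait_for(lk, std::chrono::seconds(1));
    io_ticket = throttle;
    wait_throttle.notify_all();
  }
}

// Called by copy threads before each I/O: take one ticket, or block
// until the watcher refills the budget.
void throttle_io()
{
  std::unique_lock<std::mutex> lk(m);
  while (io_ticket == 0)
    wait_throttle.wait(lk);
  --io_ticket;
}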
I suspect that there could be a problem that two threads are executing timed waits on that condition variable. I will try to reduce that to 1 and see what would happen.
Finally, I figured it out. The io_watching_thread() was being created before the condition variable log_copying_stop had been initialized in the main thread. So, the first mysql_cond_timedwait() could be executed concurrently with the initialization of the condition variable.
I would also clean up some other race conditions around this.
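A condensed, hypothetical reconstruction of the ordering bug just described (identifiers echo xtrabackup.cc; everything else is simplified): creating the waiter thread before initializing the condition variable lets pthread_cond_timedwait() race with pthread_cond_init() on the same object, which is undefined behaviour and consistent with the corrupted __wrefs that pthread_cond_destroy() later waits on:

#include <pthread.h>
#include <ctime>
#include <thread>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t log_copying_stop;   // initialized in main()

// Stand-in for io_watching_thread(): waits on log_copying_stop with a
// timeout.  If this runs before pthread_cond_init() below, the wait and
// the initialization race on the same object: undefined behaviour.
static void io_watching_thread()
{
  timespec abstime;
  clock_gettime(CLOCK_REALTIME, &abstime);
  abstime.tv_sec += 1;
  pthread_mutex_lock(&mtx);
  pthread_cond_timedwait(&log_copying_stop, &mtx, &abstime);
  pthread_mutex_unlock(&mtx);
}

int main()
{
  // Buggy order (what this report describes):
  //   std::thread watcher(io_watching_thread);          // may start waiting...
  //   pthread_cond_init(&log_copying_stop, nullptr);    // ...while this runs
  //
  // Fixed order: initialize before creating any thread that waits.
  pthread_cond_init(&log_copying_stop, nullptr);
  std::thread watcher(io_watching_thread);
  watcher.join();
  pthread_cond_destroy(&log_copying_stop);
  return 0;
}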
The following test seems to run about as fast with my fix, and it rather reliably reproduces the hang (5 of 10 runs completed in 6 seconds, and 5 hung when I tried it):
diff --git a/mysql-test/suite/mariabackup/incremental_backup.test b/mysql-test/suite/mariabackup/incremental_backup.test
index fed2d5e60a9..62e4f9c6875 100644
--- a/mysql-test/suite/mariabackup/incremental_backup.test
+++ b/mysql-test/suite/mariabackup/incremental_backup.test
@@ -34,7 +34,7 @@ INSERT INTO t VALUES(1);
 
 echo # Create full backup , modify table, then create incremental/differential backup;
 --disable_result_log
-exec $XTRABACKUP --defaults-file=$MYSQLTEST_VARDIR/my.cnf --backup --parallel=10 --target-dir=$basedir;
+exec $XTRABACKUP --defaults-file=$MYSQLTEST_VARDIR/my.cnf --backup --parallel=10 --target-dir=$basedir --throttle=1000;
 --enable_result_log
 BEGIN;
 INSERT INTO t VALUES(0);
This race condition was caused by MDEV-21452, which made io_watching_thread() depend on the condition variable log_copying_stop.
Added the log for `--throttle=10000000` (attachment throttle10000000.log).