[MDEV-12703] xtrabackup 2.4.7 may crash with --kill-long-queries in 10.2 (kill_long_selects.sh) Created: 2017-05-05  Updated: 2017-05-05  Resolved: 2017-05-05

Status: Closed
Project: MariaDB Server
Component/s: Backup
Affects Version/s: 10.2.5
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Andrii Nikitin (Inactive) Assignee: Andrii Nikitin (Inactive)
Resolution: Not a Bug Votes: 0
Labels: None


 Description   

This test https://github.com/percona/percona-xtrabackup/blob/2.4/storage/innobase/xtrabackup/test/t/kill_long_selects.sh shows the following behavior in MariaDB / XtraBackup combinations:

MariaDB                XtraBackup  Result
10.1.23                2.3.8       PASS
10.1.23                2.4.7       PASS
10.2.5                 2.4.7       FAIL (XtraBackup crash)
10.2 (current branch)  2.4.7       FAIL (XtraBackup crash)

With MySQL:
MySQL                  XtraBackup  Result
5.7.18                 2.4.7       PASS

2017-05-05 12:17:33: bash: ===> /usr/bin/innobackupex --defaults-file=/dev/shm/__var0/var1/my.cnf --no-version-check /dev/shm/__var0/var1/full --kill-long-queries-timeout=3 --kill-long-query-type=select
170505 12:17:33 innobackupex: Starting the backup operation
 
IMPORTANT: Please check that the backup run completes successfully.
           At the end of a successful backup run innobackupex
           prints "completed OK!".
 
170505 12:17:33 Connecting to MySQL server host: localhost, user: root, password: not set, port: not set, socket: /dev/shm/__var0/tmp/mysql.sock.fsALNy
Using server version 10.2.5-MariaDB-log
/usr/bin/innobackupex version 2.4.7 based on MySQL server 5.7.13 Linux (x86_64) (revision id: 6f7a799)
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /dev/shm/__var0/var1/data
xtrabackup: open files limit requested 0, set to 1024
xtrabackup: using the following InnoDB configuration:
xtrabackup:   innodb_data_home_dir = .
xtrabackup:   innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 2
xtrabackup:   innodb_log_file_size = 50331648
InnoDB: Number of pools: 1
170505 12:17:33 >> log scanned up to (1625674)
xtrabackup: Generating a list of tablespaces
InnoDB: Allocated tablespace ID 4 for test/t1, old maximum was 0
170505 12:17:34 [01] Copying ./ibdata1 to /dev/shm/__var0/var1/full/2017-05-05_12-17-33/ibdata1
170505 12:17:34 [01]        ...done
170505 12:17:34 [01] Copying ./test/t1.ibd to /dev/shm/__var0/var1/full/2017-05-05_12-17-33/test/t1.ibd
170505 12:17:34 [01]        ...done
170505 12:17:34 [01] Copying ./mysql/gtid_slave_pos.ibd to /dev/shm/__var0/var1/full/2017-05-05_12-17-33/mysql/gtid_slave_pos.ibd
170505 12:17:34 [01]        ...done
170505 12:17:34 [01] Copying ./mysql/innodb_index_stats.ibd to /dev/shm/__var0/var1/full/2017-05-05_12-17-33/mysql/innodb_index_stats.ibd
170505 12:17:34 [01]        ...done
170505 12:17:34 [01] Copying ./mysql/innodb_table_stats.ibd to /dev/shm/__var0/var1/full/2017-05-05_12-17-33/mysql/innodb_table_stats.ibd
170505 12:17:34 [01]        ...done
170505 12:17:34 >> log scanned up to (1625674)
170505 12:17:35 Executing FLUSH TABLES WITH READ LOCK...
170505 12:17:35 Kill query timeout 3 seconds.
170505 12:17:35 >> log scanned up to (1625674)
170505 12:17:36 >> log scanned up to (1625674)
170505 12:17:37 >> log scanned up to (1625674)
170505 12:17:38 Connecting to MySQL server host: localhost, user: root, password: not set, port: not set, socket: /dev/shm/__var0/tmp/mysql.sock.fsALNy
10:17:38 UTC - xtrabackup got signal 11 ;
This could be because you hit a bug or data is corrupted.
This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x10000
/usr/bin/innobackupex(my_print_stacktrace+0x3b)[0x55a6ac5ac52b]
/usr/bin/innobackupex(handle_fatal_signal+0x291)[0x55a6ac420711]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11630)[0x7ff033990630]
/lib/x86_64-linux-gnu/libc.so.6(+0x3b725)[0x7ff03192f725]
/usr/bin/innobackupex(+0x58a0cc)[0x55a6abd1e0cc]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ca)[0x7ff0339866ca]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x5f)[0x7ff0319fc0af]

This may be a major issue for 10.2 and needs to be investigated accordingly.



 Comments   
Comment by Andrii Nikitin (Inactive) [ 2017-05-05 ]

It looks like the function wait_for_connection_count() in the test is not compatible with 10.2, because 10.2 has additional background daemon connections. The different behavior is most probably caused by that.
No problem was identified in 10.2 after my investigation.
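
To illustrate why a raw connection count can be off on 10.2, here is a minimal sketch of a count that skips background threads. It is not the code of wait_for_connection_count() from the Percona test. The helper names count_client_connections and sample_processlist are hypothetical, the sample rows are made up for illustration, and the sketch assumes that the background daemon connections appear in SHOW PROCESSLIST with Command = "Daemon" in the fifth column.

```shell
# Hypothetical helper: count processlist rows that belong to real client sessions.
# Reads tab-separated SHOW PROCESSLIST rows on stdin.
# Assumption: background threads are reported with Command = "Daemon" (column 5).
count_client_connections() {
  awk -F'\t' '$5 != "Daemon" { n++ } END { print n + 0 }'
}

# Illustrative rows standing in for server output (not taken from this ticket).
sample_processlist() {
  printf '1\tsystem user\t\tNULL\tDaemon\tNULL\tInnoDB purge coordinator\tNULL\n'
  printf '2\tsystem user\t\tNULL\tDaemon\tNULL\tInnoDB purge worker\tNULL\n'
  printf '8\troot\tlocalhost\ttest\tQuery\t5\tUser sleep\tSELECT SLEEP(1000)\n'
  printf '9\troot\tlocalhost\tNULL\tQuery\t0\tinit\tSHOW PROCESSLIST\n'
}

# A plain line count sees every row, including the daemon threads.
raw_count=$(sample_processlist | wc -l | tr -d ' ')

# The filtered count sees only the client sessions.
client_count=$(sample_processlist | count_client_connections)

echo "raw=$raw_count clients=$client_count"
```

Against a live server the same filter could be fed from `mysql -N -B -e 'SHOW PROCESSLIST'`. A test that waits for an exact number of connections would then compare against the filtered count rather than the raw one.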
