[MDEV-28210] SIGSEGV in the test galera.galera_sst_rsync2
Created: 2022-04-01  Updated: 2022-04-02  Resolved: 2022-04-02

Status: Closed
Project: MariaDB Server
Component/s: Galera SST
Affects Version/s: 10.3
Fix Version/s: 10.3.35

Type: Bug
Priority: Major
Reporter: Marko Mäkelä
Assignee: Jan Lindström (Inactive)
Resolution: Fixed
Votes: 0
Labels: crash, not-10.4


Description

The test galera.galera_sst_rsync2 fails with SIGSEGV on 10.3, but not on later versions, on various builders, as follows:

10.3 1e859d4abcfd7e3b2c238e5dc8c909254661082a

galera.galera_sst_rsync2 'innodb,release' w2 [ fail ]
        Test ended at 2022-03-29 21:26:18
 
CURRENT_TEST: galera.galera_sst_rsync2
sh: line 1: 28830 Segmentation fault      /usr/sbin/mysqld --defaults-group-suffix=.2 --defaults-file=/dev/shm/var/2/my.cnf --log-error=/dev/shm/var/tmp/2/galera_wsrep_recover.log --innodb --wsrep-recover > /dev/shm/var/tmp/2/galera_wsrep_recover.log 2>&1
mysqltest: In included file "./suite/galera/include/galera_wsrep_recover.inc": 
included from ./suite/galera/include/galera_st_kill_slave.inc at line 58:
included from /usr/share/mysql-test/suite/galera/t/galera_sst_rsync2.test at line 11:
At line 8: exec of '/usr/sbin/mysqld --defaults-group-suffix=.2 --defaults-file=/dev/shm/var/2/my.cnf --log-error=/dev/shm/var/tmp/2/galera_wsrep_recover.log --innodb --wsrep-recover > /dev/shm/var/tmp/2/galera_wsrep_recover.log 2>&1' failed, error: 35584, status: 139, errno: 2
Output from before failure:
Performing --wsrep-recover ...

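Locally, I reproduced this under rr by patching the include file so that the crashing mysqld --wsrep-recover invocation is recorded: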

diff --git a/mysql-test/suite/galera/include/galera_wsrep_recover.inc b/mysql-test/suite/galera/include/galera_wsrep_recover.inc
index aa2f0e2e777..7857a76daf1 100644
--- a/mysql-test/suite/galera/include/galera_wsrep_recover.inc
+++ b/mysql-test/suite/galera/include/galera_wsrep_recover.inc
@@ -5,7 +5,7 @@ if ($wsrep_recover_additional)
 }
 if (!$wsrep_recover_additional)
 {
---exec $MYSQLD --defaults-group-suffix=.$galera_wsrep_recover_server_id --defaults-file=$MYSQLTEST_VARDIR/my.cnf --log-error=$MYSQL_TMP_DIR/galera_wsrep_recover.log --innodb --wsrep-recover > $MYSQL_TMP_DIR/galera_wsrep_recover.log 2>&1
+--exec rr record $MYSQLD --defaults-group-suffix=.$galera_wsrep_recover_server_id --defaults-file=$MYSQLTEST_VARDIR/my.cnf --log-error=$MYSQL_TMP_DIR/galera_wsrep_recover.log --innodb --wsrep-recover > $MYSQL_TMP_DIR/galera_wsrep_recover.log 2>&1
 }
 
 --perl

mkdir /dev/shm/rr
_RR_TRACE_DIR=/dev/shm/rr ./mtr galera.galera_sst_rsync

The server crashes because the global variable wsrep is a null pointer when wsrep_post_commit() dereferences it:

#0  0x000055dcd8f28e63 in wsrep_post_commit (thd=0x7f8f58001618, all=<optimized out>) at /mariadb/10.3/sql/wsrep_hton.cc:158
158	         wsrep->post_rollback(wsrep, &thd->wsrep_ws_handle))
(rr) p wsrep
$1 = (wsrep_t *) 0x0

According to awatch wsrep, the crashing statement was the only read or write access to that variable during the entire run, so the pointer was never assigned before it was dereferenced.
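
As a rough illustration of the failure pattern only (this is not the fix that went into 10.3.35, and it is not the real wsrep API): a global provider handle that is never assigned stays null, so any hook that calls through it must check it first. All names below (provider_t, provider, post_commit_guarded, noop_post_rollback) are hypothetical stand-ins.

```cpp
#include <cassert>

// Hypothetical stand-in for a provider handle such as wsrep_t; NOT the real API.
struct provider_t {
    int (*post_rollback)(provider_t* self, void* ws_handle);
};

// Global handle; stays null until a provider is loaded, which is the state
// observed for `wsrep` in the rr trace.
static provider_t* provider = nullptr;

enum class hook_result { skipped_no_provider, ok, failed };

// Guarded hook: returns early instead of dereferencing a null global.
hook_result post_commit_guarded(void* ws_handle) {
    if (provider == nullptr) {
        return hook_result::skipped_no_provider;
    }
    // A loaded provider is expected to supply the callback.
    assert(provider->post_rollback != nullptr);
    return provider->post_rollback(provider, ws_handle) == 0
               ? hook_result::ok
               : hook_result::failed;
}

// Trivial callback for demonstration; a real provider would do actual work.
static int noop_post_rollback(provider_t*, void*) { return 0; }
```

With provider left unset, post_commit_guarded(nullptr) returns skipped_no_provider instead of crashing. After pointing provider at an object whose post_rollback is noop_post_rollback, it returns ok. Whether the right fix is a guard like this or not reaching the hook at all during --wsrep-recover is a separate question that this sketch does not answer.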

