Description
One of our customers is facing an issue after setting up replication between two 3-node Galera clusters.
[Test Environment]
- MariaDB 10.1.30
- OP Node (Cluster) : MDBD0, MDBD1, MDBD2
- DR Node (Cluster) : MDBDG0, MDBDG1, MDBDG2
- Replication (Dual) : MDBD2 -> MDBDG2, MDBDG2 -> MDBD2
1. Switch the DB service from OP3 (MDBD2) to DR3 (MDBDG2), then back up the OP3 data.
2. Send the OP3 data backup file to DR3.
3. Delete the DR3 data files and restore the OP3 data backup file on DR3.
4. Replication sync completes.
5. Create a new table on the DR3 DB.
- Replication to OP3 completes.
- Replication to DR1, DR2, OP1, and OP2 completes via Galera Cluster.
6. However, the OP3 (MDBD2) DB goes down.
When a CREATE TABLE AS SELECT (CTAS) statement was executed on OP-MDBD2 or DR-MDBDG2, we found that an error occurred if the table read by the SELECT statement contained no data.
[Test Scenarios]
Case#1
Execute CTAS on a table (sbtest1) with no data on OP-MDBD2. (Same result on DR-MDBDG2.)
CREATE TABLE IF NOT EXISTS temp1 AS (SELECT * FROM sbtest1);
Result : The following error occurs
DR-MDBDG2 Error Log
2019-01-31 16:45:06 140115764099840 [Warning] WSREP: SQL statement was ineffective, THD: 8, buf: 129
schema: (null)
QUERY: (null)
=> Skipping replication
2019-01-31 16:45:06 140115764099840 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 0-202-5957873, Internal MariaDB error code: 1047
2019-01-31 16:45:06 140115764099840 [Note] Slave SQL thread exiting, replication stopped in log 'maria-bin.000004' at position 790493
Case#2
Execute CTAS on a table (sbtest2) that contains data on OP-MDBD2. (Same result on DR-MDBDG2.)
Result : Both OP-MDBD2 and DR-MDBDG2 are normal
Case#3
Execute CTAS on a table (sbtest3) with no data on OP-MDBD0 or OP-MDBD1. (Same result on DR-MDBDG0 or DR-MDBDG1.)
Result : All normal
Case#4
Execute 'create table sbtest4 (id int(10), primary key (id));' statement on all nodes instead of CTAS
Result : All normal
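The GTID in the Case#1 error line identifies the event the slave SQL thread was applying when it stopped. MariaDB GTIDs have the form domain_id-server_id-sequence, so the middle field gives the originating server. The following is a minimal Python sketch for pulling those fields out of such a line, assuming the log format shown in Case#1; the names `SlaveError` and `parse_slave_error` are hypothetical helpers, not part of MariaDB.

```python
import re
from typing import NamedTuple, Optional


class SlaveError(NamedTuple):
    domain_id: int
    server_id: int
    seq_no: int
    error_code: int


# Matches lines such as:
# ... [ERROR] Slave SQL: Node has dropped from cluster, Gtid 0-202-5957873,
#     Internal MariaDB error code: 1047
_PATTERN = re.compile(
    r"\[ERROR\] Slave SQL: .*?Gtid (\d+)-(\d+)-(\d+), "
    r"Internal MariaDB error code: (\d+)"
)


def parse_slave_error(line: str) -> Optional[SlaveError]:
    """Return the GTID fields and error code, or None if the line does not match."""
    m = _PATTERN.search(line)
    if m is None:
        return None
    return SlaveError(*(int(g) for g in m.groups()))


sample = (
    "2019-01-31 16:45:06 140115764099840 [ERROR] Slave SQL: Node has dropped "
    "from cluster, Gtid 0-202-5957873, Internal MariaDB error code: 1047"
)
print(parse_slave_error(sample))
```

For the Case#1 line this yields server ID 202 and error code 1047, which can then be compared against the server_id configured on each node.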
Error log details:
2019-01-17 18:42:05 139854293756672 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 0-106-24925607, Internal MariaDB error code: 1047
2019-01-17 18:42:05 139854293756672 [Note] Slave SQL thread exiting, replication stopped in log 'maria-bin.000004' at position 12399630
2019-01-17 18:42:05 139854293756672 [Note] WSREP: Slave error due to node temporarily non-primarySQL slave will continue
2019-01-17 18:42:05 139854293756672 [Note] WSREP: slave restart: 7
2019-01-17 18:42:05 139854293756672 [Note] WSREP: ready state reached
2019-01-17 18:42:05 139854293756672 [Note] Slave SQL thread initialized, starting replication in log 'maria-bin.000004' at position 12399630, relay log './relay-log.000002' position: 537
2019-01-17 18:42:05 139854293756672 [Warning] WSREP: SQL statement was ineffective, THD: 459, buf: 458
schema: (null)
QUERY: (null)
=> Skipping replication
2019-01-17 18:42:05 139854293756672 [ERROR] WSREP: FSM: no such a transition ROLLED_BACK -> ROLLED_BACK
190117 18:42:05 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.1.30-MariaDB
key_buffer_size=33554432
read_buffer_size=1048576
max_used_connections=100
max_threads=302
thread_count=101
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1585270 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f325bf63008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f325d7fe298 thread_stack 0x48400
/db/mariadb/app/bin/mysqld(my_print_stacktrace+0x2e)[0xc192be]
/db/mariadb/app/bin/mysqld(handle_fatal_signal+0x4bf)[0x77177f]
/lib64/libpthread.so.0(+0xf680)[0x7f3420d8a680]
/lib64/libc.so.6(gsignal+0x37)[0x7f341fb96207]
/lib64/libc.so.6(abort+0x148)[0x7f341fb978f8]
/usr/lib64/galera/libgalera_smm.so(_ZN6galera3FSMINS_9TrxHandle5StateENS1_10TransitionENS_10EmptyGuardENS_11EmptyActionEE8shift_toES2_+0x17c)[0x7f341d9925cc]
/usr/lib64/galera/libgalera_smm.so(_ZN6galera13ReplicatorSMM13post_rollbackEPNS_9TrxHandleE+0x26)[0x7f341d9883b6]
/usr/lib64/galera/libgalera_smm.so(galera_post_rollback+0x48)[0x7f341d9997d8]
/db/mariadb/app/bin/mysqld[0x6fc960]
/db/mariadb/app/bin/mysqld(_Z17ha_rollback_transP3THDb+0x12e)[0x774ece]
/db/mariadb/app/bin/mysqld(_Z15ha_commit_transP3THDb+0x32a)[0x77704a]
/db/mariadb/app/bin/mysqld(_Z12trans_commitP3THD+0x4c)[0x6a721c]
/db/mariadb/app/bin/mysqld(_ZN13Xid_log_event14do_apply_eventEP14rpl_group_info+0xcd)[0x85dd6d]
/db/mariadb/app/bin/mysqld[0x537583]
/db/mariadb/app/bin/mysqld[0x54152d]
/db/mariadb/app/bin/mysqld(handle_slave_sql+0x150b)[0x54315b]
/lib64/libpthread.so.0(+0x7dd5)[0x7f3420d82dd5]
/lib64/libc.so.6(clone+0x6d)[0x7f341fc5eb3d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x0): is an invalid pointer
Connection ID (thread ID): 459
Status: NOT_KILLED
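In the error log above, the crash is preceded by a "SQL statement was ineffective" warning and then an "FSM: no such a transition" error carrying the same value in the third log field. To check other nodes' logs for the same sequence, a scan along the following lines may help. This is only a sketch: it assumes the third field identifies the logging thread, and `find_fsm_crash_signature` is a hypothetical name.

```python
import re

# date, time, thread field, level, message
_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2})\s+(\d{1,2}:\d{2}:\d{2})\s+(\d+)\s+\[(\w+)\]\s+(.*)$"
)


def find_fsm_crash_signature(lines):
    """Return (thread, warning_time, error_time) tuples where an
    'ineffective statement' warning is later followed by an FSM
    transition error logged with the same thread field."""
    pending = {}  # thread field -> timestamp of the latest warning
    hits = []
    for raw in lines:
        m = _LINE.match(raw.strip())
        if m is None:
            continue  # continuation lines such as 'schema: (null)'
        date, time, thread, level, msg = m.groups()
        stamp = f"{date} {time}"
        if level == "Warning" and "SQL statement was ineffective" in msg:
            pending[thread] = stamp
        elif (level == "ERROR" and "FSM: no such a transition" in msg
              and thread in pending):
            hits.append((thread, pending.pop(thread), stamp))
    return hits


sample_log = [
    "2019-01-17 18:42:05 139854293756672 [Warning] WSREP: SQL statement was ineffective, THD: 459, buf: 458",
    "schema: (null)",
    "QUERY: (null)",
    "=> Skipping replication",
    "2019-01-17 18:42:05 139854293756672 [ERROR] WSREP: FSM: no such a transition ROLLED_BACK -> ROLLED_BACK",
]
print(find_fsm_crash_signature(sample_log))
```

To scan a real log, pass an open file object instead of `sample_log`; the path depends on the installation.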
Running 10.6.8 on a 2-node Galera cluster that is a slave of a 4-node cluster.
I see these errors constantly. The node has not actually dropped from the cluster, but replication stops.
I had to set this in the config files: slave-skip-errors = 1062,1032,1047
:~$ more /var/log/mysql/error.log | grep 'error code: 1047'
2022-07-06 17:49:29 22 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-134218456, Internal MariaDB error code: 1047
2022-07-06 18:04:08 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-134237226, Internal MariaDB error code: 1047
2022-07-06 18:15:05 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-134261407, Internal MariaDB error code: 1047
2022-07-06 18:28:10 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-134289196, Internal MariaDB error code: 1047
2022-07-06 20:38:20 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-134506531, Internal MariaDB error code: 1047
2022-07-06 21:44:37 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-134605458, Internal MariaDB error code: 1047
2022-07-06 23:45:16 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-134767147, Internal MariaDB error code: 1047
2022-07-07 4:06:57 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-135164750, Internal MariaDB error code: 1047
2022-07-07 4:07:27 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-135164926, Internal MariaDB error code: 1047
2022-07-07 4:07:57 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-135165095, Internal MariaDB error code: 1047
2022-07-07 5:20:50 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-135256917, Internal MariaDB error code: 1047
2022-07-07 5:23:50 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-135259071, Internal MariaDB error code: 1047
2022-07-07 5:28:22 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-135265888, Internal MariaDB error code: 1047
2022-07-07 8:11:12 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-135498294, Internal MariaDB error code: 1047
2022-07-07 8:47:23 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-135561143, Internal MariaDB error code: 1047
2022-07-07 9:01:58 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-135589935, Internal MariaDB error code: 1047
2022-07-07 9:35:27 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-135667939, Internal MariaDB error code: 1047
2022-07-07 10:00:08 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-135742581, Internal MariaDB error code: 1047
2022-07-07 10:33:49 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-135832362, Internal MariaDB error code: 1047
2022-07-07 10:47:54 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-135854614, Internal MariaDB error code: 1047
2022-07-07 17:37:43 59 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-136777435, Internal MariaDB error code: 1047
2022-07-07 19:29:19 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-136962997, Internal MariaDB error code: 1047
2022-07-07 22:47:39 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-137255559, Internal MariaDB error code: 1047
2022-07-07 22:50:40 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-137260577, Internal MariaDB error code: 1047
2022-07-07 22:54:12 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-137264387, Internal MariaDB error code: 1047
2022-07-07 23:02:45 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-137274695, Internal MariaDB error code: 1047
2022-07-07 23:06:17 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-137280680, Internal MariaDB error code: 1047
2022-07-08 9:20:48 70 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-136668818, Internal MariaDB error code: 1047
2022-07-08 9:27:50 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-136976555, Internal MariaDB error code: 1047
2022-07-08 9:47:28 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-137509014, Internal MariaDB error code: 1047
2022-07-08 9:47:58 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-137511512, Internal MariaDB error code: 1047
2022-07-08 9:54:32 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-137683032, Internal MariaDB error code: 1047
2022-07-08 10:09:36 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-137765495, Internal MariaDB error code: 1047
2022-07-08 12:23:48 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-138147848, Internal MariaDB error code: 1047
2022-07-08 15:12:10 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-138624469, Internal MariaDB error code: 1047
2022-07-08 16:56:40 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-138875426, Internal MariaDB error code: 1047
2022-07-08 21:46:08 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139344078, Internal MariaDB error code: 1047
2022-07-08 21:47:38 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139344635, Internal MariaDB error code: 1047
2022-07-09 0:22:10 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139582275, Internal MariaDB error code: 1047
2022-07-09 0:28:12 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139590546, Internal MariaDB error code: 1047
2022-07-09 0:30:45 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139592431, Internal MariaDB error code: 1047
2022-07-09 0:55:27 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139639948, Internal MariaDB error code: 1047
2022-07-09 2:26:22 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139781107, Internal MariaDB error code: 1047
2022-07-09 4:22:26 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139901575, Internal MariaDB error code: 1047
2022-07-09 4:22:56 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139901738, Internal MariaDB error code: 1047
2022-07-09 4:23:27 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139901843, Internal MariaDB error code: 1047
2022-07-09 4:25:27 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139904667, Internal MariaDB error code: 1047
2022-07-09 4:27:58 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139906404, Internal MariaDB error code: 1047
2022-07-09 4:31:29 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139909323, Internal MariaDB error code: 1047
2022-07-09 4:32:30 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139910326, Internal MariaDB error code: 1047
2022-07-09 4:59:39 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-139934164, Internal MariaDB error code: 1047
2022-07-09 6:32:37 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-140050542, Internal MariaDB error code: 1047
2022-07-09 6:33:37 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-140051158, Internal MariaDB error code: 1047
2022-07-09 9:09:56 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-140289491, Internal MariaDB error code: 1047
2022-07-09 9:13:57 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-140298910, Internal MariaDB error code: 1047
2022-07-09 10:04:44 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-140422469, Internal MariaDB error code: 1047
2022-07-09 10:05:14 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-140423459, Internal MariaDB error code: 1047
2022-07-09 11:23:08 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-140658361, Internal MariaDB error code: 1047
2022-07-09 11:34:11 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-140691115, Internal MariaDB error code: 1047
2022-07-09 12:41:02 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-140887442, Internal MariaDB error code: 1047
2022-07-09 18:38:53 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-141874681, Internal MariaDB error code: 1047
2022-07-09 18:50:27 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-141899781, Internal MariaDB error code: 1047
2022-07-09 20:31:01 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-142073856, Internal MariaDB error code: 1047
2022-07-10 1:38:59 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-142522499, Internal MariaDB error code: 1047
2022-07-10 1:39:29 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-142522743, Internal MariaDB error code: 1047
2022-07-10 5:52:43 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-142769138, Internal MariaDB error code: 1047
2022-07-10 6:00:16 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-142777243, Internal MariaDB error code: 1047
2022-07-10 6:05:47 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-142785959, Internal MariaDB error code: 1047
2022-07-10 8:01:53 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-142919642, Internal MariaDB error code: 1047
2022-07-10 10:21:38 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-143230384, Internal MariaDB error code: 1047
2022-07-10 12:39:16 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-143637638, Internal MariaDB error code: 1047
2022-07-10 18:14:52 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-143637638, Internal MariaDB error code: 1047
2022-07-11 8:44:47 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-145469727, Internal MariaDB error code: 1047
2022-07-11 8:48:21 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-145513744, Internal MariaDB error code: 1047
2022-07-11 10:30:08 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-145774325, Internal MariaDB error code: 1047
2022-07-11 11:11:46 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-145894655, Internal MariaDB error code: 1047
2022-07-11 11:32:22 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-145958788, Internal MariaDB error code: 1047
2022-07-11 19:16:03 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-147117308, Internal MariaDB error code: 1047
2022-07-11 21:16:05 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-147287276, Internal MariaDB error code: 1047
2022-07-12 2:47:23 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-147688986, Internal MariaDB error code: 1047
2022-07-12 4:17:51 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-147775476, Internal MariaDB error code: 1047
2022-07-12 6:41:04 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-147931798, Internal MariaDB error code: 1047
2022-07-12 8:48:51 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-148086554, Internal MariaDB error code: 1047
2022-07-12 9:32:35 429 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-148162858, Internal MariaDB error code: 1047
2022-07-12 10:37:23 4499 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-148275864, Internal MariaDB error code: 1047
2022-07-14 11:15:27 56675 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-148276355, Internal MariaDB error code: 1047
2022-07-18 14:12:51 56746 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-165314277, Internal MariaDB error code: 1047
2022-07-18 15:08:07 902 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-165473209, Internal MariaDB error code: 1047
2022-07-19 10:21:54 25565 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-166151934, Internal MariaDB error code: 1047
I intend to bring both clusters into sync some day, but this error keeps stopping the slave, which is a show-stopper for using the slave cluster at all.
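To see how often the slave stops, the grep output above can be reduced to a per-day count. The following is a minimal Python sketch, assuming the 10.6 log line format shown above; `count_1047_per_day` is a hypothetical helper name.

```python
import re
from collections import Counter

# Captures the date of every 'Slave SQL' error line that ends in code 1047.
# The hour may be one or two digits, as in the log above.
_ERR_1047 = re.compile(
    r"^(\d{4}-\d{2}-\d{2})\s+\d{1,2}:\d{2}:\d{2}\s+\d+\s+"
    r"\[ERROR\] Slave SQL: .*error code: 1047\b"
)


def count_1047_per_day(lines):
    """Return a dict mapping date -> number of 1047 slave errors that day."""
    counts = Counter()
    for line in lines:
        m = _ERR_1047.match(line.strip())
        if m:
            counts[m.group(1)] += 1
    return dict(sorted(counts.items()))


sample_lines = [
    "2022-07-06 17:49:29 22 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-134218456, Internal MariaDB error code: 1047",
    "2022-07-06 18:04:08 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-134237226, Internal MariaDB error code: 1047",
    "2022-07-07 4:06:57 21 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 100-1-135164750, Internal MariaDB error code: 1047",
]
print(count_1047_per_day(sample_lines))
```

Passing an open handle on the error log instead of `sample_lines` gives the counts for the whole file.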