One of our customers is facing an issue after implementing replication between two 3-node Galera clusters.
[Test Environment]
- MariaDB 10.1.30
- OP Node (Cluster) : MDBD0, MDBD1, MDBD2
- DR Node (Cluster) : MDBDG0, MDBDG1, MDBDG2
- Replication (Dual) : MDBD2 -> MDBDG2, MDBDG2 -> MDBD2
1. Switch the DB service from OP3 (MDBD2) to DR3 (MDBDG2), and take a data backup of OP3.
2. Send the OP3 data backup file to DR3.
3. Delete the DR3 data files and restore the OP3 data backup on DR3.
4. Replication sync completed.
5. Create a new table on the DR3 DB.
- Replication to OP3 completed.
- Replication to DR1, DR2, OP1 and OP2 completed via Galera cluster.
6. However, the OP3 (MDBD2) DB went down.
We found that when a CREATE TABLE AS SELECT (CTAS) statement is executed on OP-MDBD2 or DR-MDBDG2, an error occurs if the table being selected from contains no data.
[Test Scenarios]
Case#1
Execute CTAS on an empty table (sbtest1) on OP-MDBD2. (Same result on DR-MDBDG2.)
CREATE TABLE IF NOT EXISTS temp1 AS (SELECT * FROM sbtest1);
Result : The following error occurs
DR-MDBDG2 Error Log
2019-01-31 16:45:06 140115764099840 [Warning] WSREP: SQL statement was ineffective, THD: 8, buf: 129
schema: (null)
QUERY: (null)
=> Skipping replication
2019-01-31 16:45:06 140115764099840 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 0-202-5957873, Internal MariaDB error code: 1047
2019-01-31 16:45:06 140115764099840 [Note] Slave SQL thread exiting, replication stopped in log 'maria-bin.000004' at position 790493
Case#2
Execute CTAS on a table (sbtest2) that contains data on OP-MDBD2. (Same result on DR-MDBDG2.)
Result : Both OP-MDBD2 and DR-MDBDG2 are normal
Case#3
Execute CTAS on an empty table (sbtest3) on OP-MDBD0 or OP-MDBD1. (Same result on DR-MDBDG0 or DR-MDBDG1.)
Result : All normal
Case#4
Execute 'create table sbtest4 (id int(10), primary key (id));' statement on all nodes instead of CTAS
Result : All normal
Error log details:
2019-01-17 18:42:05 139854293756672 [ERROR] Slave SQL: Node has dropped from cluster, Gtid 0-106-24925607, Internal MariaDB error code: 1047
2019-01-17 18:42:05 139854293756672 [Note] Slave SQL thread exiting, replication stopped in log 'maria-bin.000004' at position 12399630
2019-01-17 18:42:05 139854293756672 [Note] WSREP: Slave error due to node temporarily non-primarySQL slave will continue
2019-01-17 18:42:05 139854293756672 [Note] WSREP: slave restart: 7
2019-01-17 18:42:05 139854293756672 [Note] WSREP: ready state reached
2019-01-17 18:42:05 139854293756672 [Note] Slave SQL thread initialized, starting replication in log 'maria-bin.000004' at position 12399630, relay log './relay-log.000002' position: 537
2019-01-17 18:42:05 139854293756672 [Warning] WSREP: SQL statement was ineffective, THD: 459, buf: 458
schema: (null)
QUERY: (null)
=> Skipping replication
2019-01-17 18:42:05 139854293756672 [ERROR] WSREP: FSM: no such a transition ROLLED_BACK -> ROLLED_BACK
190117 18:42:05 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.1.30-MariaDB
key_buffer_size=33554432
read_buffer_size=1048576
max_used_connections=100
max_threads=302
thread_count=101
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1585270 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f325bf63008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f325d7fe298 thread_stack 0x48400
/db/mariadb/app/bin/mysqld(my_print_stacktrace+0x2e)[0xc192be]
/db/mariadb/app/bin/mysqld(handle_fatal_signal+0x4bf)[0x77177f]
/lib64/libpthread.so.0(+0xf680)[0x7f3420d8a680]
/lib64/libc.so.6(gsignal+0x37)[0x7f341fb96207]
/lib64/libc.so.6(abort+0x148)[0x7f341fb978f8]
/usr/lib64/galera/libgalera_smm.so(_ZN6galera3FSMINS_9TrxHandle5StateENS1_10TransitionENS_10EmptyGuardENS_11EmptyActionEE8shift_toES2_+0x17c)[0x7f341d9925cc]
/usr/lib64/galera/libgalera_smm.so(_ZN6galera13ReplicatorSMM13post_rollbackEPNS_9TrxHandleE+0x26)[0x7f341d9883b6]
/usr/lib64/galera/libgalera_smm.so(galera_post_rollback+0x48)[0x7f341d9997d8]
/db/mariadb/app/bin/mysqld[0x6fc960]
/db/mariadb/app/bin/mysqld(_Z17ha_rollback_transP3THDb+0x12e)[0x774ece]
/db/mariadb/app/bin/mysqld(_Z15ha_commit_transP3THDb+0x32a)[0x77704a]
/db/mariadb/app/bin/mysqld(_Z12trans_commitP3THD+0x4c)[0x6a721c]
/db/mariadb/app/bin/mysqld(_ZN13Xid_log_event14do_apply_eventEP14rpl_group_info+0xcd)[0x85dd6d]
/db/mariadb/app/bin/mysqld[0x537583]
/db/mariadb/app/bin/mysqld[0x54152d]
/db/mariadb/app/bin/mysqld(handle_slave_sql+0x150b)[0x54315b]
/lib64/libpthread.so.0(+0x7dd5)[0x7f3420d82dd5]
/lib64/libc.so.6(clone+0x6d)[0x7f341fc5eb3d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x0): is an invalid pointer
Connection ID (thread ID): 459
Status: NOT_KILLED
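To check other nodes' error logs for the same failure pattern, something like the following Python sketch could help. It is not part of the original report. It assumes the error log uses the line layout seen in the excerpts above (timestamp, thread id, bracketed level, message), and the function names `parse` and `find_signature` are made up for illustration. It flags an "ineffective statement" warning that is followed on the same thread by either the 1047 slave error or the FSM transition error.

```python
import re

# Line layout as seen in the excerpts of this report, e.g.
# "2019-01-17 18:42:05 139854293756672 [ERROR] message".
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<thread>\d+) "
    r"\[(?P<level>\w+)\] (?P<msg>.*)$"
)

# Message fragments copied verbatim from the logs in this report.
INEFFECTIVE = "WSREP: SQL statement was ineffective"
FOLLOW_UPS = (
    "Node has dropped from cluster",
    "FSM: no such a transition ROLLED_BACK -> ROLLED_BACK",
)


def parse(lines):
    """Yield (timestamp, thread, level, message) for lines matching LOG_LINE.

    Continuation lines such as "schema: (null)" do not match and are skipped.
    """
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m:
            yield m["ts"], m["thread"], m["level"], m["msg"]


def find_signature(lines):
    """Return (warning_ts, error_ts, error_msg) tuples for each occurrence."""
    pending = {}  # thread id -> timestamp of the last "ineffective" warning
    hits = []
    for ts, thread, level, msg in parse(lines):
        if level == "Warning" and INEFFECTIVE in msg:
            pending[thread] = ts
        elif (level == "ERROR" and thread in pending
              and any(fragment in msg for fragment in FOLLOW_UPS)):
            hits.append((pending.pop(thread), ts, msg))
    return hits
```

A possible use is `find_signature(open(path))`, where `path` is the node's error log file. An empty result only means this particular sequence was not found, not that the node is healthy.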
Thanks @Richard
Which node shows this error, B2?
So the problem is perhaps that replication is routed back to the originating node. I will try to reproduce it next with such a multi-master setup.
What are the server_id values on each node? The server_id check should, in principle, cut replication cycles.
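As a starting point for the server_id question: a MariaDB GTID has the form domain_id-server_id-sequence, so the GTIDs in the error logs above already name the server_id that originated each failing event. The Python sketch below (the helper name `parse_gtid` is made up for illustration) splits the two GTIDs quoted in this report. Which node each server_id belongs to is not stated here and would have to be confirmed by checking @@server_id on every node.

```python
from typing import NamedTuple


class Gtid(NamedTuple):
    domain_id: int
    server_id: int
    seq_no: int


def parse_gtid(text: str) -> Gtid:
    """Split a MariaDB GTID string 'domain-server-sequence' into its parts."""
    domain, server, seq = (int(part) for part in text.strip().split("-"))
    return Gtid(domain, server, seq)


# GTIDs copied from the "Node has dropped from cluster" errors in this report.
for raw in ("0-202-5957873", "0-106-24925607"):
    gtid = parse_gtid(raw)
    print(f"{raw}: originated on server_id {gtid.server_id}")
```

If the server_id in a failing GTID matches the node that is applying the event, that would be consistent with the event having been routed back to its origin.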