Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Version: 5.5.28a-galera
Description
General description
The RQG grammars, the data file, and the server command lines follow below.
I have a 3-node cluster (2 servers and the arbitrator).
Test data consists of one table with two columns: an int PK and a char(1) column with a non-unique key. The table contains 20 rows.
Test flow on the 1st node:
2 threads
one runs UPDATE <table> SET <non-pk> = ... ORDER BY ... LIMIT 8
another one runs KILL QUERY <first thread>
Test flow on the 2nd node:
Single thread runs UPDATE <table> SET <non-pk> = ... LIMIT 3
After a few minutes, on a release build, replication on the first node fails with:
[ERROR] Slave SQL: Error executing row event: 'Query execution was interrupted', Error_code: 1317
[Warning] WSREP: RBR event 2 Update_rows apply warning: 1317, 9845
[Warning] WSREP: failed to replay trx: source: 6b5ee6b1-418e-11e2-0800-2723c17dce38 version: 2 local: 1 state: REPLAYING flags: 1 conn_id: 7 trx_id: 30651 seqnos (l: 9880, g: 9845, s: 9843, d: 9844, ts: 1355009139336500474)
[Warning] WSREP: Failed to apply app buffer: d<CC><C3>P^S^A, seqno: 9845, status: WSREP_FATAL
at galera/src/replicator_smm.cpp:apply_wscoll():49
at galera/src/replicator_smm.cpp:apply_trx_ws():120
[ERROR] WSREP: trx_replay failed for: 5, query: UPDATE `table20_innodb_int_autoinc` SET `col_char_1_key` = 'l' ORDER BY `col_char_1_key`,`pk` LIMIT 5
[ERROR] Aborting
After that the server hangs: it does not shut down, but it does not accept any connections either.
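For triage, it can help to confirm that the replay warning and the "Failed to apply app buffer" warning refer to the same write set. The following is a minimal sketch that pulls the `seqnos (...)` fields out of the warning quoted above. The function name `parse_seqnos` is hypothetical, and reading `g` as the global sequence number is an assumption based on the matching `seqno: 9845` in the log, not something the log itself states.

```python
import re

# The replay warning exactly as it appears in the error log above.
REPLAY_WARNING = (
    "[Warning] WSREP: failed to replay trx: source: "
    "6b5ee6b1-418e-11e2-0800-2723c17dce38 version: 2 local: 1 state: REPLAYING "
    "flags: 1 conn_id: 7 trx_id: 30651 seqnos (l: 9880, g: 9845, s: 9843, "
    "d: 9844, ts: 1355009139336500474)"
)


def parse_seqnos(line):
    """Return the key/value pairs inside 'seqnos (...)' as a dict of ints.

    Hypothetical helper: returns an empty dict if the line has no seqnos part.
    """
    match = re.search(r"seqnos \(([^)]*)\)", line)
    if match is None:
        return {}
    pairs = (item.split(":") for item in match.group(1).split(","))
    return {key.strip(): int(value) for key, value in pairs}


if __name__ == "__main__":
    seqnos = parse_seqnos(REPLAY_WARNING)
    # Assumption: 'g' is the global seqno, which should equal the seqno
    # reported by the "Failed to apply app buffer" warning.
    print(seqnos["g"])
```

In this log the `g` value equals the `seqno` in the "Failed to apply app buffer" warning and the number at the end of the `Update_rows apply warning` line, so all three messages appear to describe the same replayed transaction.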
On a debug build, it aborts with:
mysqld: maria-5.5-galera/sql/sql_error.h:76: uint Diagnostics_area::sql_errno() const: Assertion `m_status == DA_ERROR' failed.
[ERROR] mysqld got signal 6 ;
#6 0x00007fe6359edd4d in __GI___assert_fail (assertion=0xd495da "m_status == DA_ERROR", file=<optimized out>, line=76, function=<optimized out>) at assert.c:81
#7 0x000000000057bf15 in Diagnostics_area::sql_errno (this=0x7fe600004240) at /home/elenst/maria-5.5-galera/sql/sql_error.h:76
#8 0x00000000008cd766 in Rows_log_event::do_apply_event (this=0x40f6450, rli=0x40f3300) at maria-5.5-galera/sql/log_event.cc:8280
#9 0x0000000000592e8c in Log_event::apply_event (this=0x40f6450, rli=0x40f3300) at maria-5.5-galera/sql/log_event.h:1230
#10 0x00000000006284e9 in wsrep_apply_rbr (thd=0x7fe600000910, rbr_buf=0x40f61f0 "T\231\303P\023\001", buf_len=0) at maria-5.5-galera/sql/sql_parse.cc:8098
#11 0x0000000000628ad5 in wsrep_apply_cb (ctx=0x7fe600000910, buf=0x40f61f0, buf_len=168, global_seqno=637) at maria-5.5-galera/sql/sql_parse.cc:8177
#12 0x00007fe634d31abf in apply_wscoll (trx=..., apply_cb=0x628a27 <wsrep_apply_cb(void*, void const*, unsigned long, long)>, recv_ctx=0x7fe600000910) at galera/src/replicator_smm.cpp:37
#13 apply_trx_ws (recv_ctx=0x7fe600000910, apply_cb=0x628a27 <wsrep_apply_cb(void*, void const*, unsigned long, long)>, commit_cb=0x628d32 <wsrep_commit_cb(void*, long, bool)>, trx=...) at galera/src/replicator_smm.cpp:81
#14 0x00007fe634d3600f in galera::ReplicatorSMM::replay_trx (this=0x291ded0, trx=0x41cd280, trx_ctx=0x7fe600000910) at galera/src/replicator_smm.cpp:821
#15 0x00007fe634d4db76 in galera_replay_trx (gh=<optimized out>, trx_handle=<optimized out>, recv_ctx=0x7fe600000910) at galera/src/wsrep_provider.cpp:658
#16 0x000000000062330b in wsrep_mysql_parse (thd=0x7fe600000910, rawbuf=0x4083178 "UPDATE `table20_innodb_int_autoinc` SET `col_char_1_key` = 'l' ORDER BY `col_char_1_key`,`pk` LIMIT 5", length=101, parser_state=0x7fe60789b550) at maria-5.5-galera/sql/sql_parse.cc:6085
#17 0x000000000061543b in dispatch_command (command=COM_QUERY, thd=0x7fe600000910, packet=0x7fe600006561 "UPDATE `table20_innodb_int_autoinc` SET `col_char_1_key` = 'l' ORDER BY `col_char_1_key`,`pk` LIMIT 5", packet_length=101) at maria-5.5-galera/sql/sql_parse.cc:1230
#18 0x000000000061429e in do_command (thd=0x7fe600000910) at maria-5.5-galera/sql/sql_parse.cc:890
#19 0x000000000071c1b8 in do_handle_one_connection (thd_arg=0x7fe600000910) at maria-5.5-galera/sql/sql_connect.cc:1278

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x4083178): UPDATE `table20_innodb_int_autoinc` SET `col_char_1_key` = 'l' ORDER BY `col_char_1_key`,`pk` LIMIT 5
Connection ID (thread ID): 7
Status: KILL_QUERY

branch: maria-5.5-galera
revision-id: seppo.jaakola@codership.com-20121130113629-lhwlr2ncrib15h18
date: 2012-11-30 13:36:29 +0200
revno: 3358
Command lines:
maria-5.5-galera/sql/mysqld --defaults-file=maria-5.5-galera/mydef1.cnf --datadir=maria-5.5-galera/data1 --wsrep_provider=galera-23.2.2-src/libgalera_smm.so --wsrep_sst_method=rsync --core --default-storage-engine=InnoDB --innodb_autoinc_lock_mode=2 --innodb_locks_unsafe_for_binlog=1 --binlog-format=row --innodb_flush_log_at_trx_commit=0 --log-error=log.err --basedir=maria-5.5-galera/ --port=8306 --loose-lc-messages-dir=maria-5.5-galera/sql/share --socket=/tmp/elenst-galera-1.sock --tmpdir=maria-5.5-galera/data1/tmp --general-log=1 --wsrep_cluster_address=gcomm:// --core --log-bin=master-bin

maria-5.5-galera/sql/mysqld --defaults-file=maria-5.5-galera/mydef2.cnf --datadir=maria-5.5-galera/data2 --wsrep_provider=galera-23.2.2-src/libgalera_smm.so --wsrep_sst_method=rsync --core --default-storage-engine=InnoDB --innodb_autoinc_lock_mode=2 --innodb_locks_unsafe_for_binlog=1 --binlog-format=row --innodb_flush_log_at_trx_commit=0 --log-error=log.err --basedir=maria-5.5-galera/ --port=8307 --loose-lc-messages-dir=maria-5.5-galera/sql/share --socket=/tmp/elenst-galera-2.sock --tmpdir=maria-5.5-galera/data2/tmp --general-log=1 --wsrep_cluster_address=gcomm://127.0.0.1:4567?gmcast.listen_addr=tcp://127.0.0.1:4566 --core --log-bin=master-bin
(The mydefX.cnf files are irrelevant; they only contain datadirs and ports.)
The test is run via RQG, one instance per node. The data file is the same for both instances; the grammars differ slightly.
RQG command lines:
perl gentest.pl --gendata=1.zz --threads=2 --queries=100M --duration=21600 --dsn=dbi:mysql:host=127.0.0.1:port=8306:user=root:database=test --grammar=1a.yy

perl gentest.pl --gendata=1.zz --threads=1 --queries=100M --duration=21600 --dsn=dbi:mysql:host=127.0.0.1:port=8307:user=root:database=test --grammar=1b.yy
Data file (1.zz):
$tables = {
    rows => [ 20 ],
    engines => [ 'InnoDB' ]
};

$fields = {
    types => [ 'char(1)' ],
    pk => [ 'int' ],
    indexes => [ 'key' ]
};

$data = {
    numbers => [ 'digit' ],
    strings => [ 'letter' ]
}
Grammar for the 1st node (1a.yy):
thread2_init:
    SELECT CONNECTION_ID() INTO @killer;

thread2:
    KILL QUERY @killer - 1 ;

query:
    UPDATE _table SET _field_no_pk = _varchar(1) ORDER BY _field_list LIMIT 8 ;
Grammar for the 2nd node (1b.yy):
query:
    UPDATE _table SET _field_no_pk = _char(1) LIMIT 3 ;
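To make the two grammars easier to read, here is a rough Python sketch of the kind of statements they expand to. It is not how RQG works internally. The table and column names are copied from the statement in the error log, the function names are hypothetical, and treating both `_varchar(1)` and `_char(1)` as "one random letter" is an assumption based on the `strings => [ 'letter' ]` setting in the data file.

```python
import random

# Names copied from the failing statement in the error log; RQG derives
# them from the generated schema at run time.
TABLE = "table20_innodb_int_autoinc"
NON_PK_FIELD = "col_char_1_key"
ALL_FIELDS = ["col_char_1_key", "pk"]


def expand_update(limit, with_order_by, rng):
    """Build one UPDATE shaped like the `query` rule (illustrative only)."""
    # Assumption: the value placeholder yields a single lowercase letter.
    letter = rng.choice("abcdefghijklmnopqrstuvwxyz")
    stmt = f"UPDATE `{TABLE}` SET `{NON_PK_FIELD}` = '{letter}'"
    if with_order_by:
        columns = ",".join(f"`{name}`" for name in ALL_FIELDS)
        stmt += f" ORDER BY {columns}"
    return f"{stmt} LIMIT {limit}"


def kill_statement(killer_connection_id):
    """thread2 saves its own CONNECTION_ID() and targets the id just below it."""
    return f"KILL QUERY {killer_connection_id - 1}"


if __name__ == "__main__":
    rng = random.Random()
    print(expand_update(8, True, rng))    # node 1, updating thread (1a.yy)
    print(kill_statement(8))              # node 1, thread2, hypothetical id 8
    print(expand_update(3, False, rng))   # node 2 (1b.yy)
```

The `@killer - 1` trick only hits the updating thread if the two RQG connections on node 1 received consecutive connection IDs. With a hypothetical killer connection ID of 8, the target would be 7, which is the connection ID shown in the crash report.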
An example of a GRA file produced upon the failure is attached.