[MDEV-6740] Galera crash in rpl_sql_thread_info/cached_charset_compare Created: 2014-09-13  Updated: 2014-09-22  Resolved: 2014-09-22

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.0.13-galera
Fix Version/s: 10.0.14-galera

Type: Bug Priority: Blocker
Reporter: Kolbe Kegel (Inactive) Assignee: Nirbhay Choubey (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

RHEL 6.5 x86-64


Attachments: File mdev_6740.sh     File test_galera_sync.py    

 Description   

I'm doing INSERT on one node and UPDATE on another. It often leads to a crash on the node where I'm executing UPDATE.

Thread pointer: 0x0x7f711aff3008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f710b5b4ce0 thread_stack 0x48000
/usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xb9541b]
/usr/sbin/mysqld(handle_fatal_signal+0x398)[0x744b48]
/lib64/libpthread.so.0[0x3166e0f710]
/lib64/libc.so.6[0x3166b3f3c0]
/usr/sbin/mysqld(_ZNK19rpl_sql_thread_info22cached_charset_compareEPc+0x20)[0x69b460]
/usr/sbin/mysqld(_ZN15Query_log_event14do_apply_eventEP14rpl_group_infoPKcj+0x86f)[0x802bdf]
/usr/sbin/mysqld(_Z14wsrep_apply_cbPvPKvmjPK14wsrep_trx_meta+0x525)[0x6f1c05]
/usr/lib64/galera/libgalera_smm.so(_ZNK6galera9TrxHandle5applyEPvPF15wsrep_cb_statusS1_PKvmjPK14wsrep_trx_metaERS6_+0xb1)[0x7f71409542c1]
/usr/lib64/galera/libgalera_smm.so(+0x1aaf95)[0x7f714098bf95]
/usr/lib64/galera/libgalera_smm.so(_ZN6galera13ReplicatorSMM10replay_trxEPNS_9TrxHandleEPv+0x12e)[0x7f714098c85e]
/usr/lib64/galera/libgalera_smm.so(galera_replay_trx+0x5c)[0x7f71409a045c]
/usr/sbin/mysqld(_Z24wsrep_replay_transactionP3THD+0x2de)[0x6f379e]
/usr/sbin/mysqld[0x5e24b0]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x16d0)[0x5e3c30]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x132)[0x5e4402]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x54b)[0x6a31cb]
/usr/sbin/mysqld(handle_one_connection+0x42)[0x6a32c2]
/lib64/libpthread.so.0[0x3166e079d1]
/lib64/libc.so.6(clone+0x6d)[0x3166ae886d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f711b19f351): UPDATE t1 SET v=v+1 WHERE k=30
Connection ID (thread ID): 5
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on

Tail of general log:

140912 17:02:44 1 Query INSERT INTO t1 (k,j) VALUES (29,11)
1 Query INSERT INTO t1 (k,j) VALUES (29,12)
1 Query INSERT INTO t1 (k,j) VALUES (29,13)
1 Query INSERT INTO t1 (k,j) VALUES (29,14)
1 Query INSERT INTO t1 (k,j) VALUES (29,15)
1 Query INSERT INTO t1 (k,j) VALUES (29,16)
1 Query INSERT INTO t1 (k,j) VALUES (29,17)
1 Query INSERT INTO t1 (k,j) VALUES (29,18)
1 Query INSERT INTO t1 (k,j) VALUES (29,19)
5 Query UPDATE t1 SET v=v+1 WHERE k=29
1 Query INSERT INTO t1 (k,j) VALUES (30,0)
1 Query INSERT INTO t1 (k,j) VALUES (30,1)
1 Query INSERT INTO t1 (k,j) VALUES (30,2)
1 Query INSERT INTO t1 (k,j) VALUES (30,3)
1 Query INSERT INTO t1 (k,j) VALUES (30,4)
1 Query INSERT INTO t1 (k,j) VALUES (30,5)
1 Query INSERT INTO t1 (k,j) VALUES (30,6)
1 Query INSERT INTO t1 (k,j) VALUES (30,7)
1 Query INSERT INTO t1 (k,j) VALUES (30,8)
1 Query INSERT INTO t1 (k,j) VALUES (30,9)
1 Query INSERT INTO t1 (k,j) VALUES (30,10)
1 Query INSERT INTO t1 (k,j) VALUES (30,11)
1 Query INSERT INTO t1 (k,j) VALUES (30,12)
1 Query INSERT INTO t1 (k,j) VALUES (30,13)
1 Query INSERT INTO t1 (k,j) VALUES (30,14)
1 Query INSERT INTO t1 (k,j) VALUES (30,15)
1 Query INSERT INTO t1 (k,j) VALUES (30,16)
1 Query INSERT INTO t1 (k,j) VALUES (30,17)
5 Query UPDATE t1 SET v=v+1 WHERE k=30
1 Query INSERT INTO t1 (k,j) VALUES (30,18)
1 Query INSERT INTO t1 (k,j) VALUES (30,19)
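
The backtrace in the description mixes frames with mangled C++ symbols, frames with only an offset, and frames with only an address. As a reading aid, here is a minimal sketch that splits each frame into its parts so the mangled names can be passed to a demangler such as c++filt. The helper names (`parse_frame`, `mangled_symbols`) are hypothetical and not part of MariaDB or Galera; the pattern assumes the frame layout seen in this report.

```python
import re

# Frame layouts seen in the backtrace above:
#   /usr/sbin/mysqld(_Z10do_commandP3THD+0x132)[0x5e4402]   symbol + offset
#   /usr/lib64/galera/libgalera_smm.so(+0x1aaf95)[0x7f...]   offset only
#   /usr/sbin/mysqld[0x5e24b0]                               address only
FRAME_RE = re.compile(
    r"^(?P<module>[^(\[]+)"                                   # binary or shared object
    r"(?:\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-f]+)\))?"   # optional (symbol+offset)
    r"\[(?P<address>0x[0-9a-f]+)\]$"                          # return address
)


def parse_frame(line):
    """Return a dict with module, symbol, offset and address, or None
    if the line is not a backtrace frame."""
    match = FRAME_RE.match(line.strip())
    return match.groupdict() if match else None


def mangled_symbols(lines):
    """Collect the Itanium-ABI mangled names (those starting with _Z)."""
    symbols = []
    for line in lines:
        frame = parse_frame(line)
        if frame and frame["symbol"] and frame["symbol"].startswith("_Z"):
            symbols.append(frame["symbol"])
    return symbols
```

Printing the result of `mangled_symbols()` one name per line and piping it through `c++filt` gives readable function signatures for the mysqld and libgalera_smm frames.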



 Comments   
Comment by Nirbhay Choubey (Inactive) [ 2014-09-18 ]

kolbe, I have tried to come up with a simple test that does INSERT and UPDATE on 2 nodes, but it doesn't lead to a crash. Can you try this, perhaps with a modified table structure and queries, and see if it reproduces the issue?

Comment by Kolbe Kegel (Inactive) [ 2014-09-18 ]

In what way would you like me to modify the table structure or queries? I was easily able to reproduce this issue when doing my original testing ... so I'm not very inclined to modify my test unless you can give me some guidance about what you'd like me to do and why.

Can I help you gather additional information about what is happening here?

I did this testing on 3 EC2 instances, and reproducing the problem has been very easy. I can give you access to the EC2 instances if you'd like.

Here's the table structure:

CREATE TABLE `t1` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`k` int(11) DEFAULT NULL,
`v` int(11) DEFAULT '0',
`j` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `k` (`k`)
) ENGINE=InnoDB;

Comment by Kolbe Kegel (Inactive) [ 2014-09-18 ]

Test program. Give an integer as the 1st positional parameter and it'll be used as the value for wsrep_sync_wait.

Comment by Nirbhay Choubey (Inactive) [ 2014-09-18 ]

I was referring to the test script that I uploaded.

Comment by Kolbe Kegel (Inactive) [ 2014-09-18 ]

For one thing, your table doesn't have a primary key, which is not good for Galera. You also create a new connection for each statement you send to the server, which adds a lot of overhead and may slow things down so much that it alone avoids the problem. Plus, your test program does all the inserts and then does the updates, which is definitely different from what I was doing. Sorry for the confusion. You'll see I've attached my test program so you can get a better view of what I was doing.
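
The attached script is not reproduced in this export, so the following is only a rough sketch of the workload shape described above and visible in the general log: one long-lived connection per node, INSERTs on one node and UPDATEs for the same `k` on another, overlapping in time. The function names, the `ROWS_PER_KEY` constant, and the use of a generic DB-API cursor are assumptions for illustration, not the contents of test_galera_sync.py.

```python
ROWS_PER_KEY = 20  # assumption: the general log shows j running from 0 to 19 per k


def session_setup(sync_wait):
    """Statement sent once per connection to set wsrep_sync_wait."""
    return "SET SESSION wsrep_sync_wait = %d" % int(sync_wait)


def insert_statements(k, rows=ROWS_PER_KEY):
    """INSERTs issued on the first node for one value of k."""
    for j in range(rows):
        yield "INSERT INTO t1 (k,j) VALUES (%d,%d)" % (k, j)


def update_statement(k):
    """UPDATE issued on the second node while inserts for k may still be running."""
    return "UPDATE t1 SET v=v+1 WHERE k=%d" % k


def run(cursor, statements):
    """Send every statement through the same cursor, i.e. one reused
    connection rather than a new connection per statement."""
    for sql in statements:
        cursor.execute(sql)
```

In a real run, `run()` would be called from two threads or processes, each holding its own connection to a different node, so that the UPDATE for a given `k` can overlap with the INSERTs for that `k` as it does in the log tail. How the original script coordinates the two streams is not shown in this thread.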

Comment by Nirbhay Choubey (Inactive) [ 2014-09-18 ]

Ok, will try your script.

Comment by Nirbhay Choubey (Inactive) [ 2014-09-19 ]

http://lists.askmonty.org/pipermail/commits/2014-September/006606.html

Comment by Jan Lindström (Inactive) [ 2014-09-22 ]

Ok to push.

Comment by Nirbhay Choubey (Inactive) [ 2014-09-22 ]

Pushed to maria-10.0-galera.
http://bazaar.launchpad.net/~maria-captains/maria/maria-10.0-galera/revision/3893
