[MDEV-5883] Deadlock when attempting to update field with same value Created: 2014-03-17  Updated: 2014-05-12  Resolved: 2014-05-12

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 5.5.35-galera, 5.5.36-galera
Fix Version/s: 5.5.38-galera

Type: Bug Priority: Major
Reporter: Ralf Kilian Assignee: Nirbhay Choubey (Inactive)
Resolution: Not a Bug Votes: 0
Labels: galera
Environment:

Main system:

  • CentOS 6.3 x64
  • MariaDB-Galera-server-5.5.36-1.el6.x86_64
  • galera-25.3.2-1.rhel6.x86_64

Tested with multiple systems and versions (see description for details).



 Description   

We have a Galera cluster containing two MariaDB servers with multi-master replication.

When I try to update a field inside a table using the value the field already has (e. g. update value "test" with "test") I get the following error, no matter if integer, boolean or string.

ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

Below you can find the steps how to reproduce this. I also tried that on a MariaDB server without Galera and there it was executed without any errors:

Query OK, 0 rows affected (0.01 sec)
Rows matched: 1 Changed: 0 Warnings: 0

I've already posted this inside the Galera bug list and received the following answer:

Can't be reproduced with the reference Codership binaries. This must be a MariaDB-specific bug and should be filed with them.

Used systems:

  • CentOS 6.3 x64
    • MariaDB-Galera-server-5.5.35-1.el6.x86_64
    • galera-25.3.2-1.rhel6.x86_64
  • CentOS 6.3 x64
    • MariaDB-Galera-server-5.5.36-1.el6.x86_64
    • galera-25.3.2-1.rhel6.x86_64
  • CentOS 6.5 x64
    • MariaDB-server-5.5.36-1.el6.x86_64
    • no Galera packages installed (the case where no error occurred)

Edit: This bug seems to be already fixed in MariaDB version 10.

Successfully tested with:
- CentOS 6.3 x64
- MariaDB-Galera-server-10.0.7-1.x86_64
- galera-25.3.2-1.rhel6.x86_64

How to reproduce:

CREATE DATABASE mytest;
USE mytest;
CREATE TABLE mytable (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    myint INT DEFAULT 0,
    mybool BOOL DEFAULT 0,
    mystr VARCHAR(100) DEFAULT ""
);
INSERT INTO `mytest`.`mytable` (`myint`, `mystr`) VALUES ('0', 'test');
UPDATE mytable SET myint = 0 WHERE id = 1;
UPDATE mytable SET mybool = 0 WHERE id = 1;
UPDATE mytable SET mystr = "test" WHERE id = 1;



 Comments   
Comment by Nirbhay Choubey (Inactive) [ 2014-05-02 ]

Hi Ralf,
I tried to reproduce this issue on a variety of platforms : precise (32 & 64), CentOS 6.5 (32 & 64), different
server versions : 5.5.35, 5.5.36, 5.5.37 and galera version 25.3.2, 25.3.5. But couldn't!

Node 1
=====
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 5
Server version: 5.5.37-MariaDB-wsrep-debug Source distribution, wsrep_25.10.r3980

Copyright (c) 2000, 2014, Oracle, SkySQL Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [test]> CREATE DATABASE mytest;
Query OK, 1 row affected (0.00 sec)

MariaDB [test]> USE mytest;
Database changed
MariaDB [mytest]> CREATE TABLE mytable (
-> id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
-> myint INT DEFAULT 0,
-> mybool BOOL DEFAULT 0,
-> mystr VARCHAR(100) DEFAULT ""
-> );
Query OK, 0 rows affected (0.05 sec)

MariaDB [mytest]> INSERT INTO `mytest`.`mytable` (`myint`, `mystr`) VALUES ('0', 'test');
Query OK, 1 row affected (0.00 sec)

MariaDB [mytest]> UPDATE mytable SET myint = 0 WHERE id = 1;
Query OK, 0 rows affected (0.03 sec)
Rows matched: 1 Changed: 0 Warnings: 0

MariaDB [mytest]> UPDATE mytable SET mybool = 0 WHERE id = 1;
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1 Changed: 0 Warnings: 0

MariaDB [mytest]> UPDATE mytable SET mybool = 0 WHERE id = 1;
Query OK, 0 rows affected (0.05 sec)
Rows matched: 1 Changed: 0 Warnings: 0

MariaDB [mytest]> select * from mytable;
--------------------+

id myint mybool mystr

--------------------+

1 0 0 test

--------------------+
1 row in set (0.01 sec)

Node 2
=====
MariaDB [test]> select * from mytest.mytable;
--------------------+

id myint mybool mystr

--------------------+

1 0 0 test

--------------------+
1 row in set (0.00 sec)

Is there something specific that I am missing (besides CentOS 6.3 as I presume
it shouldn't make much difference)?

Comment by Ralf Kilian [ 2014-05-02 ]

Hello, Nirbhay!

Thank you for your answer!

I tried to reproduce this issue on a variety of platforms (...) But couldn't!

This is pretty weird.

besides CentOS 6.3 as I presume it shouldn't make much difference

I also think that this should not make much difference.

Here is the config behind the replication, maybe there is something wrong or missing and this causes the error in some way?

# Galera replication lines (taken from "/etc/my.cnf.d/server.cnf")
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.1.2"
wsrep_sst_method=rsync
wsrep_sst_auth=root:password
bind-address=192.168.1.1
skip-name-resolve

However, we're also using these config lines for the replication with MariaDB 10 and everything works fine there.

Thanks in advance!

Comment by Nirbhay Choubey (Inactive) [ 2014-05-02 ]

Still no luck. The wsrep_ config options that you have shared are all standard ones and
required for a node to join the cluster (except of wsrep_sst_auth, which is not necessary
in case of 'rsync', but ok if you leave it there). Can you please share all the config options
which are being passed on server start?

Comment by Ralf Kilian [ 2014-05-06 ]

Hello, Nirbhay!

except of wsrep_sst_auth, which is not necessary in case of 'rsync'

Thanks for the hint, I've removed it (was a left over from the past).

Can you please share all the config options which are being passed on server start?

Yes I can. I'm using the default init script to start the service and this is my config /etc/my.cnf.d/server.cnf.

# this is read by the standalone daemon and embedded servers
[server]
 
# this is only for the mysqld standalone daemon
[mysqld]
character-set-server=utf8
default_week_format=3
event-scheduler=1
innodb_buffer_pool_size=1024M
log-error=/var/log/mysqld.log
log-slow-queries=/var/log/mysqld-slow.log
general_log_file=/var/log/mysqld-general.log
general_log=1
max_allowed_packet=16M
symbolic-links=1
user=mysql
 
# Should be disabled for productive use
performance_schema=on
 
# this is only for embedded server
[embedded]
 
# This group is only read by MariaDB-5.5 servers.
# If you use the same .cnf file for MariaDB of different versions,
# use this group for options that older servers don't understand
[mysqld-5.5]
 
# These two groups are only read by MariaDB servers, not by MySQL.
# If you use the same .cnf file for MySQL and MariaDB,
# you can put MariaDB-only options here
[mariadb]
port=3308
character-set-server=utf8
 
# Galera replication lines
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.1.2"
wsrep_sst_method=rsync
bind-address=192.168.1.1
skip-name-resolve
 
[mariadb-5.5]
character-set-server=utf8

Thanks in advance!

Comment by Nirbhay Choubey (Inactive) [ 2014-05-07 ]

Hi Ralf,
It looks like binlog-format has not been set? Are you setting it through command line?
In galera cluster, nodes should always be started in "row" binary log format.
On 5.5, I am able to reproduce this issue only when node has binlog format set to
statement (default). However, same works on 10.0-galera due to a minor (and possibly
harmless) difference in one of the innodb handler functions.

Comment by Ralf Kilian [ 2014-05-12 ]

Hello, Nirbhay!

It looks like binlog-format has not been set?

Thanks a lot for the information! It was not set via command line, but after setting the binlog-format to row (inside the config) it works.

However, same works on 10.0-galera due to a minor (and possibly harmless) difference

Should I explicitly set the binlog-format there anyway?

Thanks in advance!

Comment by Nirbhay Choubey (Inactive) [ 2014-05-12 ]

> Should I explicitly set the binlog-format there anyway?

Yes, you should.

Comment by Ralf Kilian [ 2014-05-12 ]

Yes, you should.

Thanks a lot. Would mixed format be okay does it have to be row format for some reason?

Thanks in advance!

Comment by Nirbhay Choubey (Inactive) [ 2014-05-12 ]

Hi Ralf,
Galera uses certification based replication which requires PK of all rows
changed in a transaction alongside the actual change (collectively known
as writeset) in order to determine if the changes can be applied (conflict-free).
The PK information won't be available if the binary logging is statement based
or mixed (which is mostly statement based).
HTH
Thanks!

Comment by Ralf Kilian [ 2014-05-12 ]

I see. Thanks again for your help!

Generated at Thu Feb 08 07:07:46 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.