[MDEV-6333] A deadlock occurred on Galera Clustering - Jira

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Incomplete
Affects Version/s: 5.5.37-galera
Fix Version/s: 5.5.41-galera
Component/s: Galera
Labels:
- galera

Description

Our System is using Galera Cluster.

MariaDB [test]> show variables like '%wsrep%';

Variable_name	Value
wsrep_OSU_method	TOI
wsrep_auto_increment_control	ON
wsrep_causal_reads	OFF
wsrep_certify_nonPK	ON
wsrep_cluster_address	gcomm://xx.xxx.xx.x1,xx.xxx.xx.x2,xx.xxx.xx.x3
wsrep_cluster_name	GC
wsrep_convert_LOCK_to_trx	OFF
wsrep_data_home_dir	/var/lib/mysql/
wsrep_dbug_option
wsrep_debug	OFF
wsrep_desync	OFF
wsrep_drupal_282555_workaround	OFF
wsrep_forced_binlog_format	NONE
wsrep_load_data_splitting	ON
wsrep_log_conflicts	OFF
wsrep_max_ws_rows	131072
wsrep_max_ws_size	1073741824
wsrep_mysql_replication_bundle	0
wsrep_node_address	xx.xxx.xx.x2
wsrep_node_incoming_address	AUTO
wsrep_node_name	GC-1
wsrep_notify_cmd
wsrep_on	ON
wsrep_provider	/usr/lib64/galera/libgalera_smm.so
wsrep_provider_options	base_host = xx.xxx.xx.x2; base_port = 4567; cert.log_conflicts = no; debug = no; evs.causal_keepalive_period = PT1S; evs.debug_log_mask = 0x1; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.info_log_mask = 0; evs.install_timeout = PT15S; evs.join_retrans_period = PT1S; evs.keepalive_period = PT1S; evs.max_install_timeouts = 1; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.use_aggregate = true; evs.user_send_window = 2; evs.version = 0; evs.view_forget_timeout = P1D; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.listen_addr = tcp://0.0.0.0:4567; gmcast.mcast_addr = ; gmcast.mcast_ttl = 1; gmcast.peer_timeout = PT3S; gmcast.segment = 0; gmcast.time_wait = PT5S; gmcast.version = 0; ist.recv_addr = xx.xxx.xx.x2; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.linger = PT20S; pc.npvo = false; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = P30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 5; socket.checksum = 2;
wsrep_recover	OFF
wsrep_replicate_myisam	OFF
wsrep_restart_slave	OFF
wsrep_retry_autocommit	1
wsrep_slave_threads	1
wsrep_sst_auth
wsrep_sst_donor
wsrep_sst_donor_rejects_queries	OFF
wsrep_sst_method	rsync
wsrep_sst_receive_address	AUTO
wsrep_start_position	8e663ba7-f123-11e3-88dc-dfe448d1c69c:1339436

The auto_increment settings on our DB nodes are:
Node #1

MariaDB [test]> show variables like '%auto_increment%';

+------------------------------+-------+
| Variable_name                | Value |
+------------------------------+-------+
| auto_increment_increment     | 3     |
| auto_increment_offset        | 1     |
| wsrep_auto_increment_control | ON    |
+------------------------------+-------+

Node #2

MariaDB [test]> show variables like '%auto_increment%';

+------------------------------+-------+
| Variable_name                | Value |
+------------------------------+-------+
| auto_increment_increment     | 3     |
| auto_increment_offset        | 2     |
| wsrep_auto_increment_control | ON    |
+------------------------------+-------+

Node #3

MariaDB [test]> show variables like '%auto_increment%';

+------------------------------+-------+
| Variable_name                | Value |
+------------------------------+-------+
| auto_increment_increment     | 3     |
| auto_increment_offset        | 3     |
| wsrep_auto_increment_control | ON    |
+------------------------------+-------+
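
For illustration, the key ranges these settings produce can be sketched in Python, assuming the usual MariaDB rule that generated values take the form auto_increment_offset + N * auto_increment_increment. The function name and node labels below are only illustrative and are not part of our system.

```python
def auto_increment_values(offset, increment, count):
    """Return the first `count` values of the form offset + N * increment."""
    return [offset + n * increment for n in range(count)]


# auto_increment_offset per node, copied from the output above.
node_offsets = {"Node #1": 1, "Node #2": 2, "Node #3": 3}
INCREMENT = 3  # auto_increment_increment, identical on all three nodes

sequences = {
    name: auto_increment_values(offset, INCREMENT, 4)
    for name, offset in node_offsets.items()
}

for name, values in sequences.items():
    print(name, values)
```

The three sequences never overlap, so inserts issued on different nodes should not collide on generated keys. These settings only govern generated keys, though; they do not separate updates that touch the same existing row from different nodes.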

These settings match the ones described in the post on blog.mariadb.org ( https://blog.mariadb.org/auto-increments-in-galera/ ).
However, in our system deadlocks still occur while running update logic inside a transaction.
Our system consists of 2 agents (active-active) and 3 DB nodes (Galera Cluster).
If agents 1 and 2 both connect to the same node (e.g. DB node 1), no deadlock occurs.
But if agent 1 connects to DB node 1 and agent 2 connects to DB node 2, deadlocks occur.
What is the problem, and how can I resolve this deadlock?
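
One common client-side mitigation, sketched here under assumptions, is to retry a transaction that fails with a deadlock error (MariaDB error 1213). `DeadlockError`, `run_with_retry`, and `flaky_update` are hypothetical names; a real agent would catch its driver's own exception and run real SQL instead of the simulated transaction used here.

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver error carrying MariaDB error 1213."""


def run_with_retry(transaction, max_attempts=3, backoff_seconds=0.0):
    """Call `transaction`; on DeadlockError retry up to `max_attempts` times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            time.sleep(backoff_seconds * attempt)  # linear backoff between tries


# Simulated transaction: fails twice with a deadlock, then succeeds.
attempts = []


def flaky_update():
    attempts.append(1)
    if len(attempts) < 3:
        raise DeadlockError(
            "Deadlock found when trying to get lock; try restarting transaction"
        )
    return "committed"


result = run_with_retry(flaky_update)
print(result, "after", len(attempts), "attempts")
```

This does not remove the underlying conflict between the two agents; it only makes the agent tolerate an aborted transaction by running it again.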

People

Assignee: Nirbhay Choubey (Inactive)

Reporter: shin

Votes: 0

Watchers: 3

Dates

Created: 2014-06-12 04:37

Updated: 2014-12-23 00:16

Resolved: 2014-12-23 00:16