MariaDB Server / MDEV-6333

A deadlock occurred on Galera Cluster




      Our system uses Galera Cluster.


      MariaDB [test]> show variables like '%wsrep%';

      Variable_name Value
      wsrep_OSU_method TOI
      wsrep_auto_increment_control ON
      wsrep_causal_reads OFF
      wsrep_certify_nonPK ON
      wsrep_cluster_address gcomm://xx.xxx.xx.x1,xx.xxx.xx.x2,xx.xxx.xx.x3
      wsrep_cluster_name GC
      wsrep_convert_LOCK_to_trx OFF
      wsrep_data_home_dir /var/lib/mysql/
      wsrep_debug OFF
      wsrep_desync OFF
      wsrep_drupal_282555_workaround OFF
      wsrep_forced_binlog_format NONE
      wsrep_load_data_splitting ON
      wsrep_log_conflicts OFF
      wsrep_max_ws_rows 131072
      wsrep_max_ws_size 1073741824
      wsrep_mysql_replication_bundle 0
      wsrep_node_address xx.xxx.xx.x2
      wsrep_node_incoming_address AUTO
      wsrep_node_name GC-1
      wsrep_on ON
      wsrep_provider /usr/lib64/galera/libgalera_smm.so
      wsrep_provider_options base_host = xx.xxx.xx.x2; base_port = 4567; cert.log_conflicts = no; debug = no; evs.causal_keepalive_period = PT1S; evs.debug_log_mask = 0x1; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.info_log_mask = 0; evs.install_timeout = PT15S; evs.join_retrans_period = PT1S; evs.keepalive_period = PT1S; evs.max_install_timeouts = 1; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.use_aggregate = true; evs.user_send_window = 2; evs.version = 0; evs.view_forget_timeout = P1D; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.listen_addr = tcp://; gmcast.mcast_addr = ; gmcast.mcast_ttl = 1; gmcast.peer_timeout = PT3S; gmcast.segment = 0; gmcast.time_wait = PT5S; gmcast.version = 0; ist.recv_addr = xx.xxx.xx.x2; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.linger = PT20S; pc.npvo = false; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = P30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 5; socket.checksum = 2;
      wsrep_recover OFF
      wsrep_replicate_myisam OFF
      wsrep_restart_slave OFF
      wsrep_retry_autocommit 1
      wsrep_slave_threads 1
      wsrep_sst_donor_rejects_queries OFF
      wsrep_sst_method rsync
      wsrep_sst_receive_address AUTO
      wsrep_start_position 8e663ba7-f123-11e3-88dc-dfe448d1c69c:1339436

      Our DB nodes' auto_increment settings are as follows.
      Node #1

      MariaDB [test]> show variables like '%auto_increment%';
      | Variable_name                | Value |
      | auto_increment_increment     | 3     |
      | auto_increment_offset        | 1     |
      | wsrep_auto_increment_control | ON    |

      Node #2

      MariaDB [test]> show variables like '%auto_increment%';
      | Variable_name                | Value |
      | auto_increment_increment     | 3     |
      | auto_increment_offset        | 2     |
      | wsrep_auto_increment_control | ON    |

      Node #3

      MariaDB [test]> show variables like '%auto_increment%';
      | Variable_name                | Value |
      | auto_increment_increment     | 3     |
      | auto_increment_offset        | 3     |
      | wsrep_auto_increment_control | ON    |

      These settings match the ones described in the post on blog.mariadb.org ( https://blog.mariadb.org/auto-increments-in-galera/ ).
      However, in our system, deadlocks still occur while running update logic inside a transaction.
      Our system consists of 2 agents (active-active) and 3 DB nodes (Galera Cluster).
      If agents 1 and 2 both connect to the same node (e.g. DB node 1), no deadlock occurs.
      But if agent 1 connects to DB node 1 and agent 2 connects to DB node 2, deadlocks occur.
      What is the problem, and how can I resolve these deadlocks?
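
To make the intent of the auto_increment settings concrete, here is a minimal sketch of the ID sequences each node would hand out with increment 3 and offsets 1, 2 and 3. It is a simplified model for illustration only, not MariaDB's actual allocator; the function name `auto_increment_ids` and the node labels are hypothetical.

```python
from itertools import count, islice


def auto_increment_ids(offset, increment):
    """Yield offset, offset + increment, offset + 2 * increment, ...

    Simplified model (assumption): ignores existing rows and gaps.
    """
    return count(start=offset, step=increment)


# (auto_increment_offset, auto_increment_increment) as listed for each node.
nodes = {"node1": (1, 3), "node2": (2, 3), "node3": (3, 3)}

# First four generated IDs per node.
first_ids = {
    name: list(islice(auto_increment_ids(offset, increment), 4))
    for name, (offset, increment) in nodes.items()
}

for name, ids in first_ids.items():
    print(name, ids)

# No ID is generated by more than one node.
all_ids = [i for ids in first_ids.values() for i in ids]
assert len(all_ids) == len(set(all_ids))
```

Under this model the nodes produce 1, 4, 7, ... / 2, 5, 8, ... / 3, 6, 9, ..., so concurrently inserted rows get distinct keys. The sketch covers only key generation for inserts; it does not model updates to rows that already exist.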




            nirbhay_c Nirbhay Choubey (Inactive)
            shin shin