MariaDB Server / MDEV-27410

Galera cluster hangs after one node reboots

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 10.5.12
    • CentOS 7.5
      MariaDB 10.5.12
      Galera provider version: 26.4.9

    Description

      Happy new year!

      We have a 3-node Galera cluster in production, which we recently upgraded from 10.3.10 to 10.5.12.
      Clients access the DB cluster through Keepalived + HAProxy.

      Occasionally, after one node reboots, the Galera cluster hangs: SELECT statements still work, but UPDATE and DELETE statements never complete.
      The application gets the error "Lock wait timeout exceeded; try restarting transaction", and no error appears in the MySQL logs. I can run SELECT statements in a terminal, but any UPDATE or DELETE statement hangs.
      wsrep_local_state_comment is "Synced" on all 3 nodes. wsrep_last_committed is static on all 3 nodes and no longer advances, and on one node it is lower than on the other two.
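
The symptom above (a wsrep_last_committed value that stops advancing, with one node behind the others) can be checked mechanically. The following is only a minimal sketch: the node names and sequence numbers are made up, and the helper names `find_lagging_nodes` and `is_stalled` are hypothetical. It assumes the values were collected by hand on each node with `SHOW GLOBAL STATUS LIKE 'wsrep_last_committed';`. A counter that does not move is only suspicious while writes are being attempted; on an idle cluster it is normal.

```python
from typing import Dict, List


def find_lagging_nodes(last_committed: Dict[str, int]) -> List[str]:
    """Return nodes whose wsrep_last_committed is below the cluster maximum."""
    if not last_committed:
        return []
    highest = max(last_committed.values())
    return sorted(node for node, seqno in last_committed.items() if seqno < highest)


def is_stalled(first: Dict[str, int], second: Dict[str, int]) -> bool:
    """True if no node advanced between two samples taken some time apart."""
    return bool(first) and all(second.get(node) == seqno for node, seqno in first.items())


# Hypothetical samples, taken a few seconds apart while an UPDATE is hanging.
sample_1 = {"node1": 1500, "node2": 1500, "node3": 1487}
sample_2 = {"node1": 1500, "node2": 1500, "node3": 1487}

print(find_lagging_nodes(sample_2))   # ['node3']
print(is_stalled(sample_1, sample_2))  # True
```

In this made-up data, node3 is the one whose value differs from the other two.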

      To recover the cluster, it works either to restart the MariaDB instance whose wsrep_last_committed differs from the others, or to restart the whole cluster with --wsrep-new-cluster.
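
For the second recovery path (a full restart with --wsrep-new-cluster), the usual Galera practice is to bootstrap from the most advanced node. The sketch below is an illustration under assumptions, not something taken from this report: it assumes the cluster is fully stopped and that each node's grastate.dat file (in the data directory) has been read into a string. The file contents shown are made up, and `parse_grastate` and `pick_bootstrap_node` are hypothetical helper names. A seqno of -1 means the position is unknown (for example after an unclean shutdown), so such nodes are skipped here.

```python
from typing import Dict, Optional


def parse_grastate(text: str) -> Dict[str, str]:
    """Parse the 'key: value' lines of a grastate.dat file, skipping comments."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def pick_bootstrap_node(grastate_by_node: Dict[str, str]) -> Optional[str]:
    """Prefer a node marked safe_to_bootstrap, else the highest known seqno."""
    best_node: Optional[str] = None
    best_seqno = -1
    for node, text in sorted(grastate_by_node.items()):
        fields = parse_grastate(text)
        if fields.get("safe_to_bootstrap") == "1":
            return node
        try:
            seqno = int(fields.get("seqno", "-1"))
        except ValueError:
            continue
        if seqno > best_seqno:  # seqno -1 (unknown position) never wins
            best_node, best_seqno = node, seqno
    return best_node


# Made-up file contents for illustration only.
TEMPLATE = "# GALERA saved state\nversion: 2.1\nuuid: {uuid}\nseqno: {seqno}\nsafe_to_bootstrap: 0\n"
files = {
    "node1": TEMPLATE.format(uuid="example-uuid", seqno=1500),
    "node2": TEMPLATE.format(uuid="example-uuid", seqno=1500),
    "node3": TEMPLATE.format(uuid="example-uuid", seqno=1487),
}
print(pick_bootstrap_node(files))  # node1
```

If the function returns None, no node has a known position, and the positions would first have to be recovered (for example with `mysqld --wsrep-recover`) before choosing a node.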

      The hang is much more likely when one node is powered off and another node is then rebooted.
      Any advice would be appreciated, thanks!

          People

            Assignee: Unassigned
            Reporter: PhilJing (philJ)