Details

    • 10.0.23

    Description

      After upgrading to MariaDB-10.0.22-Galera, we seem to trigger a condition that causes a complete lockup of the database. However, neither the error log nor the InnoDB engine status appears to recognize it as such.

      Setup: three CentOS 6 x86_64 servers, two running MariaDB-10.0.22-Galera and one running the Garb daemon. The first MariaDB server (Server1) handles all application queries; the second (Server2) runs incremental backups every 5 minutes.

      When we run a performance test on the application (and thus create a simple query load on Server1), the database enters a locked state after some time.

      The conditions we have been able to isolate:

      • Servers are running MariaDB-10.0.22-Galera (does not occur with 10.0.21)
      • Servers are running in cluster mode
      • Application queries run against Server1
      • Backup queries run against Server2

      The queries used by the backup software on Server2 are attached (backup.txt). The processlist of Server1 after the issue occurs is attached (processlist.txt), along with a gdb backtrace of all threads on Server1 (backtrace.txt). Server2 has an empty processlist at this time, and backups can still run from Server2 while Server1 is in this locked state. The locked state does not time out; the only way to recover is to restart mysqld.

      Unfortunately, I have not yet been able to create a test case that reproduces the issue on an isolated system. I can, however, trigger the issue at will by running the application performance test in our setup, so I can gather additional information.
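
      When gathering additional information, a captured processlist such as processlist.txt can be condensed to the threads that appear stuck. The following is a minimal sketch, not part of the original report. It assumes the processlist was saved as tab-separated text with a header row (for example from the mysql client in batch mode running SHOW FULL PROCESSLIST). The function names, the 60-second threshold, and the sample rows are hypothetical and do not come from the attachments.

```python
import csv
import io
from collections import Counter

# Made-up sample in the tab-separated layout assumed above (header row first).
SAMPLE = (
    "Id\tUser\tHost\tdb\tCommand\tTime\tState\tInfo\n"
    "11\tapp\tlocalhost\tappdb\tSleep\t500\t\tNULL\n"
    "12\tapp\tlocalhost\tappdb\tQuery\t300\tupdating\tUPDATE t SET x = 1\n"
    "13\tapp\tlocalhost\tappdb\tQuery\t2\tinit\tSELECT 1\n"
)


def parse_processlist(text):
    """Parse tab-separated processlist text into a list of dicts keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def stuck_threads(rows, min_seconds=60):
    """Return non-sleeping threads whose Time column is at least min_seconds."""
    stuck = []
    for row in rows:
        if row.get("Command") == "Sleep":
            continue  # idle connections are not interesting here
        try:
            seconds = int(row.get("Time"))
        except (TypeError, ValueError):
            continue  # Time missing or not numeric (e.g. NULL)
        if seconds >= min_seconds:
            stuck.append(row)
    return stuck


def summarize_states(rows):
    """Count how many threads are in each State value."""
    return Counter(row.get("State") or "(none)" for row in rows)


if __name__ == "__main__":
    suspects = stuck_threads(parse_processlist(SAMPLE))
    for state, count in summarize_states(suspects).most_common():
        print(f"{count:5d}  {state}")
```

      Grouping by State makes it easier to see whether most threads are blocked at the same point, which can then be compared with the gdb backtrace.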

      Any help to find and resolve this issue would be greatly appreciated.

      Attachments

        1. mysql_repeat.sh
          0.4 kB
        2. Server1 backtrace.txt
          1.01 MB
        3. Server1 processlist.txt
          141 kB
        4. Server2 backup.txt
          26 kB

        Activity

          nirbhay_c Nirbhay Choubey (Inactive) added a comment - seppo Can you please review the following patch? http://lists.askmonty.org/pipermail/commits/2015-December/008781.html
          bradjorgensen Brad Jorgensen added a comment (edited)

          I think I'm experiencing the same issue when running innobackupex against my Galera cluster. What I know so far is that the server locks up when xtrabackup issues its FLUSH TABLES WITH READ LOCK (FTWRL). I found that it is only a problem with more than one node in the cluster, so for now I have to shut down all but one node to run a backup. All of our application traffic is currently directed to one node; however, a few monitoring queries write to a single table on every node in the cluster. For us, the problem arises regardless of which node the backup runs on.

          I posted a message to the mailing list with more information on my problem:
          https://lists.launchpad.net/maria-discuss/msg03178.html

          nirbhay_c Nirbhay Choubey (Inactive) added a comment -
          https://github.com/MariaDB/server/commit/fe4047dc39090f626408d91999dd4a8f0869ab13
          https://github.com/MariaDB/server/commit/89a264809d660fb5a4e7d43e9324b1f529a3a1d7

          People

            nirbhay_c Nirbhay Choubey (Inactive)
            thys Thijs Houtenbos
            Votes: 1
            Watchers: 5
