  1. MariaDB Server
  2. MDEV-14423

Galera Cluster behaves asynchronously

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Not a Bug
    • Affects Version/s: 10.2.10
    • Fix Version/s: N/A
    • Component/s: Galera
    • Labels: None
    • Environment: 3-node MariaDB Galera cluster running on Ubuntu 16.04 VMs.

    Description

      We've noticed something troubling with our cluster: DELETEs, INSERTs, and UPDATEs don't seem to be applied synchronously across Galera nodes. To test this, I wrote a simple bash script that inserts records into a table on one node and immediately reads the row count from a different node. Depending on the latency between the servers, I get the wrong number of rows from the second node in as many as 39% of attempts. If I understand the nature of Galera Cluster correctly, this should never happen: when I update one node, it shouldn't confirm the update until all nodes have received the data. This is what I've done:

      1. I created a database called 'Testing' with a single table:

      CREATE TABLE `Table1` (
      `Id` INT(11) NOT NULL AUTO_INCREMENT,
      `Text` VARCHAR(50) NOT NULL,
      PRIMARY KEY (`Id`)
      )
      COLLATE='utf8_general_ci'
      ENGINE=InnoDB;

      2. I wrote a bash script that does the following:

      sql='DELETE FROM Table1;';
      mysql -h$node1 -u$user -p$pass -sN -D 'Testing' -e "$sql";

      sql="INSERT INTO Table1(Text) VALUES('Text1'), ('Text2'), ('Text3'), ('Text4');";
      mysql -h$node1 -u$user -p$pass -sN -D 'Testing' -e "$sql";

      sql='SELECT COUNT(*) FROM Table1;';
      result=$(mysql -h$node2 -u$user -p$pass -sN -D 'Testing' -e "$sql");

      As you can see, DELETE and INSERT are sent to node1, while SELECT is issued on node2.

      3. Some percentage of the time (between 0.1% and 40%, depending on the nodes selected and on timing and network lag), the SELECT reports 0 rows.

      4. If I add a 1-second delay between the INSERT and the SELECT, I always get the correct number of rows.

      5. I can't see any errors on the MariaDB nodes, and replication seems to be working as far as I can tell.
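The observation in step 4 is consistent with node2 applying the write set slightly after node1 has acknowledged the commit. As a minimal sketch of how to test that without a fixed delay, the reading session can be asked to wait until the node has caught up before it runs the SELECT, using the `wsrep_sync_wait` session variable. The function name `count_rows_synced` and its argument order are hypothetical, and whether this setting suits a given cluster is an assumption rather than something stated in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the attached script): read the row count
# from a node after asking this session to wait until the node has applied
# the write sets that were already replicated to it.
# wsrep_sync_wait=1 enables that causality check for READ statements.
count_rows_synced() {
    local node="$1" user="$2" pass="$3"
    local sql='SET SESSION wsrep_sync_wait=1; SELECT COUNT(*) FROM Table1;'
    mysql -h"$node" -u"$user" -p"$pass" -sN -D 'Testing' -e "$sql"
}
```

Calling `count_rows_synced "$node2" "$user" "$pass"` in place of the plain SELECT keeps the rest of the test unchanged. The check adds some latency to each read, so it is probably best enabled only for sessions that need to see their own recent writes on another node.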

      I'm attaching the test script I wrote. You have to modify it to set your username and password, then call it with the hostnames of two nodes as parameters, like so:

      ./test_galera.sh mynode1.domain.com mynode2.domain.com
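The attachment itself is not reproduced here. A rough sketch of what a script with this calling convention might look like, built from the statements in step 2, is shown below. It is not the attached script: the function names, the optional third `trials` argument, and the `GALERA_USER` / `GALERA_PASS` environment variables with their placeholder defaults are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch of a reproduction loop; NOT the script attached to the issue.
# GALERA_USER / GALERA_PASS and their defaults are placeholders.
user="${GALERA_USER:-testuser}"
pass="${GALERA_PASS:-testpass}"

# Run one SQL statement against the Testing database on the given node.
run_sql() {
    mysql -h"$1" -u"$user" -p"$pass" -sN -D 'Testing' -e "$2"
}

# Write four rows through node1, then immediately count them on node2.
# Returns 0 only when node2 already reports all four rows.
run_trial() {
    local node1="$1" node2="$2" result
    run_sql "$node1" 'DELETE FROM Table1;'
    run_sql "$node1" "INSERT INTO Table1(Text) VALUES('Text1'), ('Text2'), ('Text3'), ('Text4');"
    result=$(run_sql "$node2" 'SELECT COUNT(*) FROM Table1;')
    [ "$result" = "4" ]
}

# Repeat the trial and print how many reads returned an unexpected count.
main() {
    local node1="${1:-}" node2="${2:-}" trials="${3:-100}" stale=0 i=0
    if [ -z "$node1" ] || [ -z "$node2" ]; then
        echo "usage: $0 node1 node2 [trials]" >&2
        return 2
    fi
    while [ "$i" -lt "$trials" ]; do
        run_trial "$node1" "$node2" || stale=$((stale + 1))
        i=$((i + 1))
    done
    echo "stale reads: $stale of $trials"
}

# With no arguments the file only defines the functions, so it can be sourced.
[ "$#" -eq 0 ] || main "$@"
```

Counting failures over many trials, rather than checking a single read, is what makes the percentage figures quoted above measurable; a call such as `./test_galera.sh mynode1.domain.com mynode2.domain.com 200` would run 200 trials under these assumptions.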

      Please help. Unless I'm missing something obvious, this is a critical issue.

      Let me know if I can provide any additional information to help resolve this.

      Thank you.

            People

              Assignee: anikitin Andrii Nikitin (Inactive)
              Reporter: kvasserman Konstantin Vasserman
              Votes: 0
              Watchers: 5
