MariaDB Server / MDEV-23324

After (re)starting multiple slaves, parallel updates seem to compromise ACID

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.4.13
    • Fix Version/s: 10.4
    • Component/s: Replication
    • Labels: None
    • Environment:
      binlog_format = ROW
      gtid_strict_mode = 1
      multi-master setup
      slaves:
      * Parallel_Mode: conservative (default)
      * Using_Gtid: Current_Pos

    Description

      I have a table with a counter that multiple masters update with relative increments (testHits = testHits + 1). One of the slaves, which replicates from two masters, stopped replicating after an unrelated failure: one master deleted a row that the other master had inserted earlier, apparently because of some replication lag.

      After restarting replication from both masters (using START ALL SLAVES), the data on the slave no longer matched the two masters. Each master updated the same counter 6 times, but the slave ended up with a final value of 11 instead of 12:

      Redacted binlog from master1 (server id 1), with irrelevant Account fields removed and the ID redacted:

      #200722  8:28:32 server id 1  end_log_pos 12219708 CRC32 0xf87669df     Annotate_rows:
      #Q> INSERT INTO `Account` (`id`, `testHits`) VALUES (9, 0)
      #200722  8:50:18 server id 1  end_log_pos 12992418 CRC32 0x239cb8e2     Annotate_rows:
      #Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
       
      #200727  8:26:22 server id 1  end_log_pos 12421103 CRC32 0x712982e4     Annotate_rows:
      #Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
      #200727  8:26:36 server id 1  end_log_pos 12431646 CRC32 0x3e2f3505     Annotate_rows:
      #Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
      #200727  8:26:43 server id 1  end_log_pos 12436404 CRC32 0x8ab1931a     Annotate_rows:
      #Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
       
      #200728  7:02:02 server id 1  end_log_pos 340028656 CRC32 0xcf9797a5     Annotate_rows:
      #Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
      #200728  7:02:19 server id 1  end_log_pos 340032448 CRC32 0x71109d4b     Annotate_rows:
      #Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
      

      Binlog from the other master (server id 3):

      #200722  8:31:20 server id 3  end_log_pos 4420014 CRC32 0xb6bc7350     Annotate_rows:
      #Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
      #200722  9:49:32 server id 3  end_log_pos 5237738 CRC32 0x9627872a     Annotate_rows:
      #Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
       
      #200727 14:04:32 server id 3  end_log_pos 8793329 CRC32 0x2651f5ea     Annotate_rows:
      #Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
      #200727 14:04:45 server id 3  end_log_pos 8794212 CRC32 0x619663f3     Annotate_rows:
      #Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
      #200727 14:12:01 server id 3  end_log_pos 8821614 CRC32 0xef4ded02     Annotate_rows:
      #Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
       
      #200728  7:02:11 server id 3  end_log_pos 3108883 CRC32 0x6c2bf4fd     Annotate_rows:
      #Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
      

      Replication halted at 200728 04:03:11. Both masters had testHits=12, but this particular slave ended up with testHits=11.
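
The expected value of 12 can be cross-checked against the two excerpts above. The following is a minimal sketch, assuming the excerpts are complete for id 9 and that every annotated statement stands for exactly one applied increment. The names `HEADER`, `INCREMENT`, `count_increments` and `BINLOG_EXCERPTS` are made up for this illustration and are not part of any MariaDB tool.

```python
import re
from collections import Counter

# Header line printed before each annotated statement in the excerpts.
HEADER = re.compile(r"^#\d{6}\s+\d{1,2}:\d{2}:\d{2} server id (\d+)\s")
# The annotated relative increment on the counter row.
INCREMENT = re.compile(
    r"^#Q> UPDATE `Account` SET `testHits` = `testHits` \+ 1 WHERE `id` = 9$"
)


def count_increments(binlog_text):
    """Return a Counter mapping server id -> number of annotated increments."""
    counts = Counter()
    server_id = None
    for raw in binlog_text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            server_id = int(header.group(1))
        elif line.startswith("#Q>"):
            if server_id is not None and INCREMENT.match(line):
                counts[server_id] += 1
            server_id = None  # each header annotates a single statement
    return counts


# Lines copied from the two excerpts in this report (blank lines dropped).
BINLOG_EXCERPTS = """\
#200722  8:28:32 server id 1  end_log_pos 12219708 CRC32 0xf87669df     Annotate_rows:
#Q> INSERT INTO `Account` (`id`, `testHits`) VALUES (9, 0)
#200722  8:50:18 server id 1  end_log_pos 12992418 CRC32 0x239cb8e2     Annotate_rows:
#Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
#200727  8:26:22 server id 1  end_log_pos 12421103 CRC32 0x712982e4     Annotate_rows:
#Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
#200727  8:26:36 server id 1  end_log_pos 12431646 CRC32 0x3e2f3505     Annotate_rows:
#Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
#200727  8:26:43 server id 1  end_log_pos 12436404 CRC32 0x8ab1931a     Annotate_rows:
#Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
#200728  7:02:02 server id 1  end_log_pos 340028656 CRC32 0xcf9797a5     Annotate_rows:
#Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
#200728  7:02:19 server id 1  end_log_pos 340032448 CRC32 0x71109d4b     Annotate_rows:
#Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
#200722  8:31:20 server id 3  end_log_pos 4420014 CRC32 0xb6bc7350     Annotate_rows:
#Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
#200722  9:49:32 server id 3  end_log_pos 5237738 CRC32 0x9627872a     Annotate_rows:
#Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
#200727 14:04:32 server id 3  end_log_pos 8793329 CRC32 0x2651f5ea     Annotate_rows:
#Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
#200727 14:04:45 server id 3  end_log_pos 8794212 CRC32 0x619663f3     Annotate_rows:
#Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
#200727 14:12:01 server id 3  end_log_pos 8821614 CRC32 0xef4ded02     Annotate_rows:
#Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
#200728  7:02:11 server id 3  end_log_pos 3108883 CRC32 0x6c2bf4fd     Annotate_rows:
#Q> UPDATE `Account` SET `testHits` = `testHits` + 1 WHERE `id` = 9
"""

counts = count_increments(BINLOG_EXCERPTS)
# The INSERT sets testHits to 0, so the final value should equal the
# total number of increments across both masters.
expected_total = 0 + sum(counts.values())
print(dict(counts), expected_total)
```

Under those assumptions the count comes to 6 increments per server id and an expected final value of 12, so the slave's 11 is one increment short.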

      Is this expected behaviour?

          People

            Assignee: Andrei Elkin
            Reporter: sjon