MariaDB Server / MDEV-8031

Parallel replication stops on "connection killed" error (probably incorrectly handled deadlock kill)

      Description

      Parallel replication stopped like this:

      150419 11:44:05 [ERROR] Slave SQL: Connection was killed, Gtid 0-187203009-1533130924, Internal MariaDB error code: 1927
      150419 11:44:05 [Warning] Slave: Connection was killed Error_code: 1927
      150419 11:44:05 [Warning] Slave: Deadlock found when trying to get lock; try restarting transaction Error_code: 1213
      150419 11:44:05 [Warning] Slave: Connection was killed Error_code: 1927
      150419 11:44:05 [Warning] Slave: Connection was killed Error_code: 1927
      150419 11:44:05 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'binlog.000294' position 193637713

      This happened in conservative mode on 10.1.4. The bug may also apply to
      10.0; this is still to be determined.

      The problem is likely a rare race; it has occurred only once so far, for a
      user on a highly loaded system.

      The current theory is that a transaction is deadlock killed due to a normal
      deadlock condition (1927). That error is converted to a deadlock error for
      transaction retry (1213). Then more deadlock kills arrive, which is normal
      (more 1927). For some reason, those subsequent deadlock kills are not
      converted into deadlock errors followed by a transaction retry (this would
      be the bug).
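
      The theory can be illustrated with a toy model. This is only a sketch
      under stated assumptions, not the server's actual code: the names
      SlaveError, apply_with_retry and convert_every_kill are hypothetical,
      and the "only the first kill is converted" behaviour is one possible
      way the suspected race could show up, not a confirmed cause. Only the
      two error codes (1927 and 1213) are taken from the log above.

```python
# Toy model only: every name here is hypothetical, not a MariaDB internal.
ER_LOCK_DEADLOCK = 1213      # "Deadlock found when trying to get lock"
ER_CONNECTION_KILLED = 1927  # "Connection was killed"


class SlaveError(Exception):
    """Raised when the simulated SQL thread stops with a non-retryable error."""

    def __init__(self, code):
        super().__init__(f"slave stopped with error {code}")
        self.code = code


def apply_with_retry(outcomes, convert_every_kill=True, max_retries=10):
    """Replay a scripted list of attempt outcomes for one transaction.

    outcomes: one entry per attempt, either an error code or None (success).
    convert_every_kill: when False, only the first deadlock kill is
        converted to a deadlock error, mimicking the suspected bug.
    Returns the number of retries needed before the transaction applied.
    """
    converted = False
    for attempt, outcome in enumerate(outcomes):
        if outcome is None:
            return attempt
        code = outcome
        # A deadlock kill (1927) should become a retryable deadlock (1213).
        if code == ER_CONNECTION_KILLED and (convert_every_kill or not converted):
            code = ER_LOCK_DEADLOCK
            converted = True
        # Anything other than a deadlock error stops the slave.
        if code != ER_LOCK_DEADLOCK or attempt >= max_retries:
            raise SlaveError(code)
    # The script ended without a successful attempt.
    raise SlaveError(ER_LOCK_DEADLOCK)
```

      With the scripted outcomes [1927, 1927, None], the expected behaviour
      (convert_every_kill=True) retries twice and then applies the
      transaction. With convert_every_kill=False, the second kill is left as
      1927 and the model stops with "Connection was killed", which matches
      the shape of the log above.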

            People

            • Assignee:
              knielsen Kristian Nielsen
              Reporter:
              knielsen Kristian Nielsen