  1. MariaDB Server
  2. MDEV-30794

port --rollback-xa of MDEV-21168 fixes to 10.5+

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Incomplete
    • 10.5, 10.6, 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11
    • N/A
    • Backup
    • None

    Description

      The --rollback-xa option from MDEV-21168 may be needed when
      the backup image is to be restored on a general, non-slave server. Such a server will not communicate with
      the backup donor, so it cannot automatically learn the commit-or-rollback decisions for prepared user XA transactions.
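For illustration only, here is a minimal sketch of what resolving prepared user XA transactions by hand could look like once the restored server is running. It assumes the operator has fetched the rows of `XA RECOVER` (columns formatID, gtrid_length, bqual_length, data) with some client and already knows the desired outcome. The helper name `xa_resolution_statements` is hypothetical and not part of any MariaDB tool.

```python
from typing import Iterable, List, Tuple

# One row of XA RECOVER output: (formatID, gtrid_length, bqual_length, data).
XaRow = Tuple[int, int, int, bytes]


def xa_resolution_statements(rows: Iterable[XaRow],
                             decision: str = "ROLLBACK") -> List[str]:
    """Build one XA COMMIT / XA ROLLBACK statement per prepared transaction.

    Hypothetical helper: it only formats SQL text and does not talk to a server.
    """
    if decision not in ("COMMIT", "ROLLBACK"):
        raise ValueError("decision must be 'COMMIT' or 'ROLLBACK'")

    statements = []
    for format_id, gtrid_len, bqual_len, data in rows:
        if len(data) != gtrid_len + bqual_len:
            raise ValueError("data length does not match gtrid/bqual lengths")
        # The data column is gtrid immediately followed by bqual.
        gtrid = data[:gtrid_len]
        bqual = data[gtrid_len:gtrid_len + bqual_len]
        # Hex literals avoid quoting problems with binary xid parts.
        statements.append(
            f"XA {decision} X'{gtrid.hex().upper()}',"
            f"X'{bqual.hex().upper()}',{format_id}"
        )
    return statements


if __name__ == "__main__":
    for stmt in xa_resolution_statements([(1, 3, 2, b"abcde")]):
        print(stmt)
```

The sketch does not decide anything itself: whether to commit or roll back still has to come from whoever knows the outcome on the donor side.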

          Activity

            vlad.lesin Vladislav Lesin added a comment -

            As I remember from our long discussions during the MDEV-742 and MDEV-21168 implementation, the overall logic is the following. If the restored backup is used to start a slave, then the MDEV-742 fix should solve the issue for 10.5+, because the binlog received from the master should contain the information about what to do with prepared XA's. If we restore the backup for any other purpose, then we use --tc-heuristic-recover. This follows the logic that we don't roll back transactions during "mariabackup --prepare" execution. Maybe we should mention this somewhere in "mariabackup --help" to avoid confusing users. So, I think, there is no need to implement MDEV-30794.
            Elkin Andrei Elkin added a comment -

            valerii, after further elaboration on Slack, it looks like this ticket does not address the customer's trouble. I suggest we review their backup procedure and maybe analyze the backup image itself to explain the possible non-user-XA prepared transactions in it.

            vlad.lesin Vladislav Lesin added a comment - - edited

            As we discussed in the Slack thread I referred to above, the prepare and commit of non-explicit XA's (non-user, "normal", whatever; i.e., the XA's that were not started explicitly with an 'XA START' statement) must be protected with the MDL_BACKUP_COMMIT lock in ha_commit_trans(). This means there must be no prepared but uncommitted XA's of this kind in the InnoDB redo log. If we see such XA's there, something went wrong during the backup process, and we need to find out what exactly. It's expected that after we find and fix it, there will be no prepared and uncommitted non-explicit XA's in the InnoDB redo log for, at least, 10.6+, and --rollback-xa will not be needed.

            Currently, if such a non-explicit XA prevents the server from starting normally, there is the server option --tc-heuristic-recover=rollback, which rolls back prepared XA's on server start. Our customers could use this option. To make this more automation-friendly, we could parse such XA's out of the InnoDB redo log during --prepare (and/or even during --backup) and notify mariabackup users with special messages in the backup log.

            The question to the support team is the following. pandi.gurusamy, could you please clarify whether our customers insist on implementing this issue, or whether a workaround with --tc-heuristic-recover=rollback is fine for them?
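The workaround described above could be scripted roughly as follows. This is a sketch under stated assumptions, not a recommended procedure: the helper name `restore_with_heuristic_rollback` and the backup path are hypothetical, the server binary is assumed to be `mariadbd`, and a real deployment would need its usual server options as well. The code only builds the command lines and runs nothing.

```python
import shlex
from typing import List


def restore_with_heuristic_rollback(target_dir: str) -> List[List[str]]:
    """Return the command lines for a restore that rolls back prepared XA's.

    Hypothetical helper: it only assembles argument lists so they can be
    reviewed or passed to a job runner.
    """
    return [
        # Apply the redo log to the backup image; prepared transactions
        # are left as they are by this step.
        ["mariabackup", "--prepare", f"--target-dir={target_dir}"],
        # Copy the prepared image into the server's data directory.
        ["mariabackup", "--copy-back", f"--target-dir={target_dir}"],
        # Start the server so that prepared XA's are rolled back on startup.
        ["mariadbd", "--tc-heuristic-recover=rollback"],
    ]


if __name__ == "__main__":
    # "/backups/full" is an assumed example path.
    for argv in restore_with_heuristic_rollback("/backups/full"):
        print(shlex.join(argv))
```

Note that the option applies to all prepared transactions found at startup, so it is only appropriate when none of them are expected to be committed.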


            People

              valerii Valerii Kravchuk
              Elkin Andrei Elkin
              Votes: 1
              Watchers: 7