Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-23982

Mariabackup hangs during backup

    XMLWordPrintable

Details

    Description

      MDEV-13318 introduced a condition to Mariabackup that can cause it to hang if the server goes idle after a particular kind of redo log write where the last written redo log block has no payload after the 12-byte header. In log0recv.cc we would treat blocks with less than 12 bytes of length as corrupted, but a block with exactly 12 bytes is fine. However, Mariabackup would disregard blocks that have exactly 12 bytes of length.

      The problem was found in the MDEV-23855 branch via a test failure like this:

      mariabackup.incremental_encrypted '4k,innodb' w2 [ fail ]  timeout after 900 seconds
              Test ended at 2020-10-19 08:16:36
       
      Test case timeout after 900 seconds
       
      == /dev/shm/var/2/log/incremental_encrypted.log == 
      call mtr.add_suppression("InnoDB: New log files created");
      CREATE TABLE t(i INT) ENGINE INNODB ENCRYPTED=YES;
      INSERT INTO t VALUES(1);
      # Create full backup , modify table, then create incremental/differential backup
      SET GLOBAL innodb_flush_log_at_trx_commit = 1;
      INSERT INTO t VALUES(2);
      SELECT * FROM t;
      i
      1
      2
       
       == /dev/shm/var/2/tmp/analyze-timeout-mysqld.1.err ==
      mysqltest: Could not open connection 'default' after 500 attempts: 2002 Can't connect to local MySQL server through socket '/dev/shm/var/tmp/2/mysqld.1.sock' (111)
      

      The MariaDB server process is idle, and the reason for the hang was that mariabackup --backup in xtrabackup_copy_logfile() would never reach the end of the log, and instead kept complaining that the last block is corrupted, which is it not:

      [00] 2020-10-19 12:56:20 >> log scanned up to (59392)
      [00] 2020-10-19 12:56:21 garbage block: 59392,12
      [00] 2020-10-19 12:56:21 >> log scanned up to (59392)
      [00] 2020-10-19 12:56:22 garbage block: 59392,12
      [00] 2020-10-19 12:56:22 >> log scanned up to (59392)
      

      The fix is simple: accept a log block with no payload bytes:

      diff --git a/extra/mariabackup/xtrabackup.cc b/extra/mariabackup/xtrabackup.cc
      index 13c10c2921d..8ad9af9a9c9 100644
      --- a/extra/mariabackup/xtrabackup.cc
      +++ b/extra/mariabackup/xtrabackup.cc
      @@ -2650,7 +2650,7 @@ static lsn_t xtrabackup_copy_log(lsn_t start_lsn, lsn_t end_lsn, bool last)
       			/* We got a full log block. */
       			scanned_lsn += data_len;
       		} else if (data_len >= log_sys.trailer_offset()
      -			   || data_len <= LOG_BLOCK_HDR_SIZE) {
      +			   || data_len < LOG_BLOCK_HDR_SIZE) {
       			/* We got a garbage block (abrupt end of the log). */
       			msg(0,"garbage block: " LSN_PF ",%zu",scanned_lsn, data_len);
       			break;
      

      Attachments

        Issue Links

          Activity

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.