Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-20219

PURGE BINARY LOGS does not always purge

Details

    Description

      Purging binary logs does not always purge and delete it. This typically happens when only one binary log is there (little traffic or short purge interval). This is reproducible and does NOT happen with a competitive product.

      Good case:

      ll ~/database/mariadb-103/binlog/ ~/bck/mariadb-103/binlog/ ; cat ~/database/mariadb-103/binlog/binlog.index

      /home/mysql/bck/mariadb-103/binlog/:
      total 1560
      rw-rr- 1 mysql mysql 779746 Jul 30 18:02 bck_binlog.000390.gz
      rw-rr- 1 mysql mysql 424587 Jul 30 18:03 bck_binlog.000391.gz
      rw-rr- 1 mysql mysql 388928 Jul 30 18:04 bck_binlog.000392.gz

      /home/mysql/database/mariadb-103/binlog/:
      total 30912
      rw-rw--- 1 mysql mysql 13367733 Jul 30 18:07 binlog.000393
      rw-rw--- 1 mysql mysql 18268189 Jul 30 18:12 binlog.000394
      rw-rw--- 1 mysql mysql 108 Jul 30 18:07 binlog.index
      rw-rw--- 1 mysql mysql 12 Jul 25 09:56 binlog.state

      /home/mysql/database/mariadb-103/binlog/binlog.000393
      /home/mysql/database/mariadb-103/binlog/binlog.000394

      bman --target=brman:secret@127.0.0.1:3318 --type=binlog --policy=binlog --mode=logical --backupdir=/home/mysql/bck/mariadb-103

      Start backup at 2019-07-30_18:12:39
      Binlog Index file is: /home/mysql/database/mariadb-103/binlog/binlog.index
      Getting lock: /home/oli/fromdual_devel/brman/lck/binlog-logical-binlog.lock
      Releasing lock: /home/oli/fromdual_devel/brman/lck/binlog-logical-binlog.lock
      FLUSH /*!50503 BINARY */ LOGS

      Copy /home/mysql/database/mariadb-103/binlog/binlog.000393 to /home/mysql/bck/mariadb-103/binlog/bck_binlog.000393
      Binary log binlog.000393 begin datetime is: 2019-07-30 18:04:10 and end datetime is: 2019-07-30 18:07:34
      Do MD5 checksum of /home/mysql/bck/mariadb-103/binlog/bck_binlog.000393
      md5sum --binary /home/mysql/bck/mariadb-103/binlog/bck_binlog.000393
      md5 = 54c59d6ec5f14d04176065f50634c7dd
      /usr/bin/pigz -6 /home/mysql/bck/mariadb-103/binlog/bck_binlog.000393
      PURGE BINARY LOGS TO 'binlog.000394'

      Copy /home/mysql/database/mariadb-103/binlog/binlog.000394 to /home/mysql/bck/mariadb-103/binlog/bck_binlog.000394
      Binary log binlog.000394 begin datetime is: 2019-07-30 18:07:34 and end datetime is: 2019-07-30 18:12:39
      Do MD5 checksum of /home/mysql/bck/mariadb-103/binlog/bck_binlog.000394
      md5sum --binary /home/mysql/bck/mariadb-103/binlog/bck_binlog.000394
      md5 = 01538bee0e2de03eaa42b385d369387a
      /usr/bin/pigz -6 /home/mysql/bck/mariadb-103/binlog/bck_binlog.000394
      PURGE BINARY LOGS TO 'binlog.000395'
      End backup at 2019-07-30 18:12:40 (rc=0)

      ll ~/database/mariadb-103/binlog/ ~/bck/mariadb-103/binlog/ ; cat ~/database/mariadb-103/binlog/binlog.index

      /home/mysql/bck/mariadb-103/binlog/:
      total 7152
      rw-rr- 1 mysql mysql 779746 Jul 30 18:02 bck_binlog.000390.gz
      rw-rr- 1 mysql mysql 424587 Jul 30 18:03 bck_binlog.000391.gz
      rw-rr- 1 mysql mysql 388928 Jul 30 18:04 bck_binlog.000392.gz
      rw-rr- 1 mysql mysql 2330609 Jul 30 18:07 bck_binlog.000393.gz
      rw-rr- 1 mysql mysql 3390627 Jul 30 18:12 bck_binlog.000394.gz

      /home/mysql/database/mariadb-103/binlog/:
      total 540
      rw-rw--- 1 mysql mysql 542899 Jul 30 18:12 binlog.000395
      rw-rw--- 1 mysql mysql 54 Jul 30 18:12 binlog.index
      rw-rw--- 1 mysql mysql 12 Jul 25 09:56 binlog.state

      /home/mysql/database/mariadb-103/binlog/binlog.000395

      Bad case:

      bman --target=brman:secret@127.0.0.1:3318 --type=binlog --policy=binlog --mode=logical --backupdir=/home/mysql/bck/mariadb-103

      Start backup at 2019-07-30_18:12:56
      Binlog Index file is: /home/mysql/database/mariadb-103/binlog/binlog.index
      Getting lock: /home/oli/fromdual_devel/brman/lck/binlog-logical-binlog.lock
      Releasing lock: /home/oli/fromdual_devel/brman/lck/binlog-logical-binlog.lock
      FLUSH /*!50503 BINARY */ LOGS

      Copy /home/mysql/database/mariadb-103/binlog/binlog.000395 to /home/mysql/bck/mariadb-103/binlog/bck_binlog.000395
      Binary log binlog.000395 begin datetime is: 2019-07-30 18:12:39 and end datetime is: 2019-07-30 18:12:56
      Do MD5 checksum of /home/mysql/bck/mariadb-103/binlog/bck_binlog.000395
      md5sum --binary /home/mysql/bck/mariadb-103/binlog/bck_binlog.000395
      md5 = 6b83ea44eb4a86cff21aa0e68d78e7d0
      /usr/bin/pigz -6 /home/mysql/bck/mariadb-103/binlog/bck_binlog.000395
      PURGE BINARY LOGS TO 'binlog.000396'
      End backup at 2019-07-30 18:12:56 (rc=0)

      ll ~/database/mariadb-103/binlog/ ~/bck/mariadb-103/binlog/ ; cat ~/database/mariadb-103/binlog/binlog.index

      /home/mysql/bck/mariadb-103/binlog/:
      total 7344
      rw-rr- 1 mysql mysql 779746 Jul 30 18:02 bck_binlog.000390.gz
      rw-rr- 1 mysql mysql 424587 Jul 30 18:03 bck_binlog.000391.gz
      rw-rr- 1 mysql mysql 388928 Jul 30 18:04 bck_binlog.000392.gz
      rw-rr- 1 mysql mysql 2330609 Jul 30 18:07 bck_binlog.000393.gz
      rw-rr- 1 mysql mysql 3390627 Jul 30 18:12 bck_binlog.000394.gz
      rw-rr- 1 mysql mysql 195160 Jul 30 18:12 bck_binlog.000395.gz

      /home/mysql/database/mariadb-103/binlog/:
      total 1260
      rw-rw--- 1 mysql mysql 1116483 Jul 30 18:12 binlog.000395
      rw-rw--- 1 mysql mysql 161749 Jul 30 18:12 binlog.000396
      rw-rw--- 1 mysql mysql 108 Jul 30 18:12 binlog.index
      rw-rw--- 1 mysql mysql 12 Jul 25 09:56 binlog.state

      /home/mysql/database/mariadb-103/binlog/binlog.000395
      /home/mysql/database/mariadb-103/binlog/binlog.000396

      This makes behaviour completely unpredictable for more sophisticated backup software and follow-up backup jobs will fail.

      Attachments

        1. bck_binlog.000604.gz
          33 kB
        2. bck_binlog.000605
          54 kB
        3. bck_binlog.000605.gz
          9 kB
        4. binlog.000606
          228 kB
        5. chef_mariadb-103_error.log
          8 kB
        6. problematic_binary_logs_335x.tar.gz
          92 kB
        7. purge.sh
          0.5 kB

        Issue Links

          Activity

            I tried it again with and without a Slave. The results lead me to the conclusion that it is about 95% sure that it is the Slave:

            Without a Slave:

            mysql@chef:~ [mariadb-103, 3318]> while [ 1 ] ; do /home/mysql/product/brman-2.2.0/bin/bman --target=brman:secret@127.0.0.1:3318 --type=binlog --policy=binlog --mode=logical --backupdir=/home/mysql/bck/mariadb-103 | grep 'End backup' ; sleep $(echo $(($RANDOM % 20))) ; done
            End backup at 2021-09-23 08:07:05 (rc=0)
            ... 79 times the same result
            End backup at 2021-09-23 08:18:39 (rc=0)

            Starting the Slave:

            mysql@chef:~ [mariadb-103, 3318]> while [ 1 ] ; do /home/mysql/product/brman-2.2.0/bin/bman --target=brman:secret@127.0.0.1:3318 --type=binlog --policy=binlog --mode=logical --backupdir=/home/mysql/bck/mariadb-103 | grep 'End backup' ; sleep $(echo $(($RANDOM % 20))) ; done
            End backup at 2021-09-23 08:19:03 (rc=0)
            ... 3 times the same result
            End backup at 2021-09-23 08:19:34 (rc=0)
            End backup at 2021-09-23 08:19:39 (rc=0)
            ERROR: Zipped file /home/mysql/bck/mariadb-103/binlog/bck_binlog.000774.gz already exists and you did not specify --backup-overwrite. (rc=4178)
            End backup at 2021-09-23 08:19:56 (rc=4178)

            Removed the problematic file. Then:

            End backup at 2021-09-23 08:20:40 (rc=0)
            ... 24 times the same result
            End backup at 2021-09-23 08:24:52 (rc=0)
            End backup at 2021-09-23 08:24:52 (rc=0)
            ERROR: Zipped file /home/mysql/bck/mariadb-103/binlog/bck_binlog.000804.gz already exists and you did not specify --backup-overwrite. (rc=4178)
            End backup at 2021-09-23 08:25:03 (rc=4178)

            Removed the problematic file. Then:

            End backup at 2021-09-23 08:26:50 (rc=0)
            ... 56 times the same result
            End backup at 2021-09-23 08:36:25 (rc=0)
            End backup at 2021-09-23 08:36:30 (rc=0)
            ERROR: Zipped file /home/mysql/bck/mariadb-103/binlog/bck_binlog.000872.gz already exists and you did not specify --backup-overwrite. (rc=4178)
            End backup at 2021-09-23 08:36:40 (rc=4178)

            Stop slave. Then:

            End backup at 2021-09-23 08:38:02 (rc=0)
            ... 270 times the same result
            End backup at 2021-09-23 09:20:43 (rc=0)

            oli Oli Sennhauser added a comment - I tried it again with and without a Slave. The results lead me to the conclusion that it is about 95% sure that it is the Slave: Without a Slave: mysql@chef:~ [mariadb-103, 3318] > while [ 1 ] ; do /home/mysql/product/brman-2.2.0/bin/bman --target=brman:secret@127.0.0.1:3318 --type=binlog --policy=binlog --mode=logical --backupdir=/home/mysql/bck/mariadb-103 | grep 'End backup' ; sleep $(echo $(($RANDOM % 20))) ; done End backup at 2021-09-23 08:07:05 (rc=0) ... 79 times the same result End backup at 2021-09-23 08:18:39 (rc=0) Starting the Slave: mysql@chef:~ [mariadb-103, 3318] > while [ 1 ] ; do /home/mysql/product/brman-2.2.0/bin/bman --target=brman:secret@127.0.0.1:3318 --type=binlog --policy=binlog --mode=logical --backupdir=/home/mysql/bck/mariadb-103 | grep 'End backup' ; sleep $(echo $(($RANDOM % 20))) ; done End backup at 2021-09-23 08:19:03 (rc=0) ... 3 times the same result End backup at 2021-09-23 08:19:34 (rc=0) End backup at 2021-09-23 08:19:39 (rc=0) ERROR: Zipped file /home/mysql/bck/mariadb-103/binlog/bck_binlog.000774.gz already exists and you did not specify --backup-overwrite. (rc=4178) End backup at 2021-09-23 08:19:56 (rc=4178) Removed the problematic file. Then: End backup at 2021-09-23 08:20:40 (rc=0) ... 24 times the same result End backup at 2021-09-23 08:24:52 (rc=0) End backup at 2021-09-23 08:24:52 (rc=0) ERROR: Zipped file /home/mysql/bck/mariadb-103/binlog/bck_binlog.000804.gz already exists and you did not specify --backup-overwrite. (rc=4178) End backup at 2021-09-23 08:25:03 (rc=4178) Removed the problematic file. Then: End backup at 2021-09-23 08:26:50 (rc=0) ... 56 times the same result End backup at 2021-09-23 08:36:25 (rc=0) End backup at 2021-09-23 08:36:30 (rc=0) ERROR: Zipped file /home/mysql/bck/mariadb-103/binlog/bck_binlog.000872.gz already exists and you did not specify --backup-overwrite. (rc=4178) End backup at 2021-09-23 08:36:40 (rc=4178) Stop slave. Then: End backup at 2021-09-23 08:38:02 (rc=0) ... 270 times the same result End backup at 2021-09-23 09:20:43 (rc=0)

            It was running now for 6 hours without a problem... If we calculate every 30 seconds a run in average this makes about 2400 runs. But then it broke again! WITHOUT a Slave attached...

            ...
            End backup at 2021-09-23 15:18:40 (rc=0)
            End backup at 2021-09-23 15:18:47 (rc=0)
            ERROR: Zipped file /home/mysql/bck/mariadb-103/binlog/bck_binlog.003351.gz already exists and you did not specify --backup-overwrite. (rc=4178)
            End backup at 2021-09-23 15:18:51 (rc=4178)

            So it looks like this happens more probable WITH Slave. But it also happens rarely WITHOUT Slave... Does this make sense?

            oli Oli Sennhauser added a comment - It was running now for 6 hours without a problem... If we calculate every 30 seconds a run in average this makes about 2400 runs. But then it broke again! WITHOUT a Slave attached... ... End backup at 2021-09-23 15:18:40 (rc=0) End backup at 2021-09-23 15:18:47 (rc=0) ERROR: Zipped file /home/mysql/bck/mariadb-103/binlog/bck_binlog.003351.gz already exists and you did not specify --backup-overwrite. (rc=4178) End backup at 2021-09-23 15:18:51 (rc=4178) So it looks like this happens more probable WITH Slave. But it also happens rarely WITHOUT Slave... Does this make sense?
            bnestere Brandon Nesterenko added a comment - - edited

            Interesting! Can you attach the problematic binary log file (live from the base directory), its gzipped backup version (assuming it was not overwritten?), and the next binary log file? So that would be binlog.003351, bck_binlog.003351.gz, and binlog.003352.

            Edit: The previous binlog may be helpful too, binlog.003350

            bnestere Brandon Nesterenko added a comment - - edited Interesting! Can you attach the problematic binary log file (live from the base directory), its gzipped backup version (assuming it was not overwritten?), and the next binary log file? So that would be binlog.003351, bck_binlog.003351.gz, and binlog.003352. Edit: The previous binlog may be helpful too, binlog.003350

            Hmm, my hunch didn't pan out.

            Thanks for attaching the binlog files. Can you also attach the applicable server/error logs?

            bnestere Brandon Nesterenko added a comment - Hmm, my hunch didn't pan out. Thanks for attaching the binlog files. Can you also attach the applicable server/error logs?

            Uploaded.

            oli Oli Sennhauser added a comment - Uploaded.

            People

              bnestere Brandon Nesterenko
              oli Oli Sennhauser
              Votes:
              2 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.