  MariaDB Server / MDEV-8134

The relay log is not flushed after slave-relay-log.999999 appears

Details

    Description

      I am using MariaDB 10.0.14 on CentOS 6.5.
      As mentioned in the title, the relay logs are no longer flushed once
      slave-relay-log.999999 appears when using the
      "Slave_parallel_threads:10" setting, with the settings shown below.

      • binlog_format: ROW
      • Slave_parallel_threads:10

      Everything works fine except that the "slave-relay-log.******" files
      remain on disk, which will eventually fill the disk.
      If I change the value of Slave_parallel_threads from 10 to 0,
      the logs are flushed. However, "PK duplicate warning" errors are logged
      afterwards.

      Is there any setting that should be used together with Slave_parallel_threads?
      Any help would be appreciated.

      Best regards,

          Activity

            David Zhao added a comment (edited)

            I ran into this issue a few days ago as well. I have located the bug and fixed it; my patch is attached.

            The problem is that Relay_log_info::inc_group_relay_log_pos() compares two log names with strcmp(). That works while the sequence numbers of both names have the same number of digits (6 digits), but once the number reaches 7 digits, 999999 compares greater than 1000000, which is wrong. Hence the bug.

            Besides this bug, I also found another one caused by the same mistake in handle_queued_pos_update(), which could cause parallel replication issues once the log name sequence number reaches 7 digits.

            999999.diff
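
To illustrate the comparison problem described in this comment, here is a minimal C sketch. It is not the code from the attached patch: compare_log_names() is a hypothetical helper written for this example, and it assumes log names of the form "<base>.<sequence>". It orders names by the numeric value of the extension instead of by raw string comparison.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not taken from the attached patch.
   Compares two log file names of the assumed form "<base>.<sequence>".
   Returns <0, 0 or >0, like strcmp(). */
int compare_log_names(const char *a, const char *b)
{
  const char *dot_a = strrchr(a, '.');
  const char *dot_b = strrchr(b, '.');

  /* Without an extension on both names, fall back to plain strcmp(). */
  if (dot_a == NULL || dot_b == NULL)
    return strcmp(a, b);

  size_t len_a = (size_t)(dot_a - a);
  size_t len_b = (size_t)(dot_b - b);

  /* Different base names: nothing numeric to compare, use strcmp(). */
  if (len_a != len_b || strncmp(a, b, len_a) != 0)
    return strcmp(a, b);

  /* Same base name: compare the sequence numbers as integers. */
  unsigned long seq_a = strtoul(dot_a + 1, NULL, 10);
  unsigned long seq_b = strtoul(dot_b + 1, NULL, 10);
  return (seq_a > seq_b) - (seq_a < seq_b);
}
```

With plain strcmp(), "slave-relay-log.999999" sorts after "slave-relay-log.1000000" because '9' is greater than '1' at the first differing character. The helper above instead reports 999999 as the older file, which is the ordering that relay log purging needs.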

            mpflaum Maria M Pflaum (Inactive) added a comment (edited)

            julien.fritsch It turns out I don't need a custom build for the customer. Sorry for the confusion. I just need to let them know when it's available in the next 10.2.x CS release.

            sujatha.sivakumar Sujatha Sivakumar (Inactive) added a comment

            Hello Andrei,

            Can you please review the fix for MDEV-8134?

            Patch: https://github.com/MariaDB/server/commit/ffbb7348f67fdb78bb6c19cbf3d8d890e9ea29d7

            BuildBot: bb-10.2-sujatha

            Regarding the use of 'binlog_id' for file name comparison: this variable is dedicated to binary logs, and its usage is tightly coupled with the binlog checkpoint. It increases monotonically during server runtime, starting from 1. If the binary logs are cleared with RESET MASTER, binlog_id is not reset; it is reset to 1 only upon server restart. Hence we cannot use this variable for file name comparison.

            sachin.setiya.007 Sachin Setiya (Inactive) added a comment

            Okay to push

            sujatha.sivakumar Sujatha Sivakumar (Inactive) added a comment

            The fix is implemented in 10.2.37. The patch has been cherry-picked to higher versions and tested. No major merge conflicts were observed.

            10.3: https://github.com/MariaDB/server/commit/811dac176c10973d90f110480d2d353c452d78b9
            10.4: https://github.com/MariaDB/server/commit/0f2e60f07763de230baa057818c161ce3d59b994
            10.5: https://github.com/MariaDB/server/commit/3aeb78968f662629abd467773f7924fd66772022
            10.6: https://github.com/MariaDB/server/commit/822e224170f99478b09ddfe03d316fee8b36a3ec

            People

              sujatha.sivakumar Sujatha Sivakumar (Inactive)
              irani ilhwan Kim
              Votes: 5
              Watchers: 12
