MariaDB Server / MDEV-32953

main.rpl_mysqldump_slave Fails with "Master binlog wasn't deleted" Assertion

Details

    Description

      After MDEV-32611, main.rpl_mysqldump_slave sporadically fails with the following:

      main.rpl_mysqldump_slave 'row'           w5 [ fail ]
              Test ended at 2023-12-05 18:06:55
       
      CURRENT_TEST: main.rpl_mysqldump_slave
      mysqldump: Got error: 1049: "Unknown database 'no_such_db'" when selecting the database
      mysqltest: At line 125: Master binlog wasn't deleted after mariadb-dump with --delete-master-logs.
       
      The result from queries just before the failure was: 
      < snip >
      -- CHANGE MASTER TO MASTER_LOG_FILE='slave-bin.000001', MASTER_LOG_POS=BINLOG_START;
      -- CHANGE MASTER TO MASTER_USE_GTID=slave_pos;
      -- SET GLOBAL gtid_slave_pos='0-2-1003';
       
      3. --master-data --single-transaction
       
      -- CHANGE MASTER TO MASTER_LOG_FILE='slave-bin.000001', MASTER_LOG_POS=BINLOG_START;
      CHANGE MASTER TO MASTER_USE_GTID=slave_pos;
      SET GLOBAL gtid_slave_pos='0-2-1003';
      connection master;
      CREATE TABLE t (
      id int
      );
      insert into t values (1); 
      insert into t values (2); 
      drop table t;
      -- CHANGE MASTER TO MASTER_LOG_FILE='master-bin.000002', MASTER_LOG_POS=BINLOG_START;
      -- SET GLOBAL gtid_slave_pos='0-1-1005';
      # postdump_first_binary_log_filename: master-bin.000001
      # postdump_binlog_filename: master-bin.000002
       
      More results from queries before failure can be found in /home/buildbot/amd64-debian-10/build/mysql-test/var/5/log/rpl_mysqldump_slave.log
      

      The problem is that --delete-master-logs purges the binary logs immediately after flushing them. An active binlog dump thread can still be reading the old log when the purge executes, which prevents that file from being deleted. The purge code checks that no thread is actively using a log before deleting it:

      static my_bool log_in_use_callback(THD *thd, const char *log_name)
      {
        my_bool result= 0;
        mysql_mutex_lock(&thd->LOCK_thd_data);
        if (auto linfo= thd->current_linfo)
          result= !strcmp(log_name, linfo->log_file_name);
        mysql_mutex_unlock(&thd->LOCK_thd_data);
        return result;
      }
       
       
      bool log_in_use(const char* log_name)
      {
        return server_threads.iterate(log_in_use_callback, log_name);
      }
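
      The effect of this check on a purge can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the server code: Reader and purge_before are hypothetical names, locking is omitted, and the purge is assumed to stop at the first log that is still in use.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a server thread. Only the name of the binlog
// file it is currently reading is modelled (empty means "none").
struct Reader {
  std::string current_log;
};

// Simplified analogue of log_in_use(): true if any reader is on log_name.
// The real code takes LOCK_thd_data per thread; that is left out here.
bool log_in_use(const std::vector<Reader> &readers, const std::string &log_name)
{
  for (const Reader &r : readers)
    if (r.current_log == log_name)
      return true;
  return false;
}

// Hypothetical purge: drops logs that precede `target`, oldest first, and
// stops at the first log a reader still holds (an assumption of this sketch).
// Returns the logs that remain.
std::vector<std::string> purge_before(std::vector<std::string> logs,
                                      const std::string &target,
                                      const std::vector<Reader> &readers)
{
  std::size_t n = 0;
  while (n < logs.size() && logs[n] != target && !log_in_use(readers, logs[n]))
    ++n;
  logs.erase(logs.begin(), logs.begin() + static_cast<std::ptrdiff_t>(n));
  return logs;
}
```

      With a reader still on master-bin.000001, purging up to master-bin.000002 leaves both files in place, which matches the failure above. Once the reader has moved on to master-bin.000002, the same purge removes master-bin.000001.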
      

      Users who run mysqldump --delete-master-logs while binlog dump threads are connected can hit the same problem. The available workarounds are:
      1) Re-run PURGE BINARY LOGS once the replicas have caught up.
      2) Temporarily disconnect the replicas from the primary while running the dump (though this option requires master_use_gtid=Slave_pos).

            People

              bnestere Brandon Nesterenko