Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25358

MEMORY tables replication - corrupted binlog upon master start




      The binary log contains illegal/extra characters on master for MEMORY tables after mariadb restart.

      There are 2 (multi-)master clusters: old-node01/02 and new-node01/02. Old nodes are running Mariadb 10.1, new nodes are 10.3. New-node01 is connected as a slave to old-node02, replication is working ok.
      After setting up replication between new-node01 -> new-node02, the slave thread stops nearly immediatelly, stating that there is an error in the query - some invalid chars in the query's end.
      Dumping the corresponding binary logfile on new-node01 - which is the master of new-node02 - it seems, that the query in the binlog is really corrupted, there are some special chars after the actual query. My investigation showed, that the error always happens with MEMORY tables.

      The output of the mysqlbinlog on new-node01:

      # at 2361
      #210407 11:13:34 server id 4  end_log_pos 2403 CRC32 0x90080d27         GTID 100-4-37336878143 trans
      /*!100001 SET @@session.gtid_seq_no=37336878143*//*!*/;
      # at 2403
      #210407 11:13:34 server id 4  end_log_pos 2527 CRC32 0x27183d2f         Query   thread_id=341297541     exec_time=318   error_code=0
      SET TIMESTAMP=1617786814/*!*/;
      DELETE FROM `mydb1`.`collect_output`e<FB>#<98>
      # at 2527
      #210407 11:13:34 server id 4  end_log_pos 2598 CRC32 0x427f0ed7         Table_map: `mydb1`.`collect_output` mapped to number 1330

      Please notice the illegal chars at the end of the „DELETE FROM" line.
      (IMHO the DELETE FROM is normal, since new-node01 is starting up from shutdown and mydb1.collect_output is a MEMORY table -> will delete the contents, so that the replications slaves have the same data).

      Table definition for one of the memory tables:

      MariaDB @db01 [mydb1]>show create table collect_output;

      CREATE TABLE `collect_output` (
        `local_data_id` mediumint(8) unsigned NOT NULL DEFAULT 0,
        `rrd_name` varchar(19) NOT NULL DEFAULT '',
        `time` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
        `output` varchar(255) NOT NULL,
        PRIMARY KEY (`local_data_id`,`rrd_name`,`time`) USING BTREE

      Since we're talking about huge amount of data, I'm setting up the replication the following way:
      On new-node01:

      1. stop slave;
      2. flush tables with read lock;
      3. show master status;
      4. shutdown;

      Then I copy over all the data with rsync to new-node02 (mariadb is stopped as well). When the copy is done, I start back the new-node01, and start new-node02 with skip-slave-start, so I can change the master data do the values aquired before the copy.

      I've done the following a couple of times, always leading to the same result. All of the MEMORY tables have the same "bad binlog" entries, if I skip them on the slave (new-node02), than the replications starts up and works ok, even for the MEMORY tables).




            Unassigned Unassigned
            viper Tamas DAJKA
            0 Vote for this issue
            4 Start watching this issue



                Git Integration