[MDEV-12162] binlog_row_image minimal Not Working Created: 2017-03-01  Updated: 2020-09-08  Resolved: 2020-09-08

Status: Closed
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.1.17, 10.1, 10.2
Fix Version/s: N/A

Type: Bug Priority: Minor
Reporter: Brad Jorgensen Assignee: Andrei Elkin
Resolution: Duplicate Votes: 1
Labels: binlog, replication
Environment:

3 10.1.17 servers in galera cluster


Attachments: Text File binlog_row_image.txt     File my.cnf     Text File mysqlbinlog_binlog_row_image_minimal.txt    
Issue Links:
Duplicate
duplicates MDEV-14605 ON UPDATE CURRENT_TIMESTAMP fields by... Closed

 Description   

I'm removing about 200GB of blobs from my servers in a galera cluster (don't get me started on them being in the database to begin). I have had `binlog_row_image` set to `MINIMAL` for at least several months.

The rows I'm updating to remove the blobs are in a table with an `INT` PK called `id` and a `MEDIUMTEXT` column called `message`. For consistency for now, instead of deleting the rows, I am updating the `messsage` to `'s3'` (we are storing the data on S3 now).

I assumed that with `binlog_row_image` set to `MINIMAL` the binlog would only store the old PK (`id`) and the new blob (`message`), but the size of the newly written binlogs is consistent with the amount of data that is being removed and `mysqlbinlog` shows the full old blob. I also tried using `NOBLOB` for testing, but the binlogs looked the same.

I thought it might be that galera caused the full image to be written but our async slaves with basically the same config are also writing the full image.

I have attached the config for one of the servers in the galera cluster, the others are basically the same.

I have moved to updating a similar table as part of the same migration. The table structure is different, but I'm still performing basically the same operation. Here is the create statement for it:

CREATE TABLE `contacts_messages` (
	`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
	`acctID` INT(10) UNSIGNED NOT NULL,
	`stamp` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
	`activity_id` INT(10) UNSIGNED NOT NULL,
	`user_id` INT(10) UNSIGNED NOT NULL,
	`contact_id` INT(10) UNSIGNED NOT NULL,
	`subject` VARCHAR(255) NOT NULL,
	`message` TEXT NOT NULL,
	`to_email` VARCHAR(255) NULL DEFAULT NULL,
	`cc_email` VARCHAR(255) NULL DEFAULT NULL,
	`attachments` TEXT NULL,
	`opened_at` TIMESTAMP NULL DEFAULT NULL,
	PRIMARY KEY (`id`),
	INDEX `contact_id` (`contact_id`),
	INDEX `acctID` (`acctID`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
ROW_FORMAT=DYNAMIC
AUTO_INCREMENT=56049803
;

This is a sample of the update query I'm running. `contacts_messages_upload_status` is a table I was using to track which blobs were uploaded to S3.

UPDATE `contacts_messages` SET `message` = 's3' WHERE message != 's3' AND id >= 12109100 AND id < 12109200 AND id IN (SELECT id FROM contacts_messages_upload_status WHERE id >= 12109100 AND id < 12109200)

I added `SELECT @@binlog_row_image` to the top of the script I'm running to do these updates and it does return `MINIMAL`.



 Comments   
Comment by Elena Stepanova [ 2017-03-03 ]

Yes, please provide the table structure (SHOW CREATE TABLE ...), an example of the UPDATE statement that you are using, SELECT @@binlog_row_image from the same session where UPDATE was executed, and an example of the resulting binlog from the same server.

Just in case – when you are saying that you've had binlog_row_image=MINIMAL for months, I assume you restarted the server after you set it, right?

Comment by Brad Jorgensen [ 2017-03-06 ]

I updated the original description. And, yes, I did restart the servers since adding `binlog_row_image=MINIMAL` to the config file and they have been running with that setting. I will attach an excerpt of the binlog showing for one of the update statements.

Comment by Brad Jorgensen [ 2017-03-06 ]

I attached an example binlog entry for one of the updates. It has the row event decoded so I could redact the original blob data. Without it decoding the row data, the same event has a consistently large `BINLOG '...'` block.

Comment by Brad Jorgensen [ 2017-03-06 ]

After looking at the example UPDATE query I added here outside of the code context I normally see it in, I tried removing the message != 's3' condition from the query and it appears that is doesn't log the old blob data anymore. It looks like since I select the old body so it skips already updated rows, that column is marked as needing to be read when writing the binlog row data. It seems odd, though, that since it has the primary key it still writes the other column's data. Unless I'm thinking about this wrong and the binlog writes the full possibly updated row data for the entire query even if not all of the rows would be updated by the query.

Comment by Elena Stepanova [ 2017-03-13 ]

Indeed, it appears to depend on using blob in the WHERE clause, and I agree that it shouldn't. As a workaround, I suppose you can remove the condition, it doesn't do anything useful in your case anyway.

Nor reproducible with MySQL 5.6, 5.7.

MTR-like test-case (only to reproduce, not to be put into the regression suite as is!)

--source include/have_innodb.inc
--source include/have_binlog_format_row.inc
 
SET binlog_row_image= 'MINIMAL';
 
CREATE TABLE `contacts_messages` (
    id INT PRIMARY KEY,
    message BLOB
) ENGINE=InnoDB;
 
CREATE TABLE `contacts_messages_upload_status` (
    id INT PRIMARY KEY
) ENGINE=InnoDB;
 
INSERT INTO `contacts_messages_upload_status` VALUES (1),(2);
INSERT INTO `contacts_messages` VALUES (1,'foo'),(2,'bar');
 
UPDATE `contacts_messages` SET `message` = 's3' WHERE message != 's3';
 
--let $datadir= `SELECT @@datadir`
 
SHOW BINLOG EVENTS;
FLUSH BINARY LOGS;
 
--exec $MYSQL_BINLOG --verbose --base64-output=DECODE-ROWS $datadir/master-bin.000001
 
DROP TABLE `contacts_messages`, `contacts_messages_upload_status`;

Comment by Claudio Nanni [ 2017-07-18 ]

Hi,

I have found out that what makes the difference (at least in my case) is having a column defined as:

timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP

In particular it must be present 'ON UPDATE CURRENT_TIMESTAMP'

This even with a Primary Key on the table.

I attach a simple test case.
binlog_row_image.txt

Comment by Alice Sherepa [ 2020-09-08 ]

Fixed by commit f9ac7032cbc4b7 ) (I checked 2 cases- in the 2 above comments )

commit f9ac7032cbc4b7b9af9c8122e6ebfd91af9fbaf9 (HEAD)
Author: Andrei Elkin <andrei.elkin@mariadb.com>
Date:   Sun Jan 20 19:00:16 2019 +0200
 
    MDEV-14605 Changes to "ON UPDATE CURRENT_TIMESTAMP" fields are not
               always logged properly with binlog_row_image=MINIMAL

Generated at Thu Feb 08 07:55:36 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.