Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-12162

binlog_row_image minimal Not Working

Details

    • Bug
    • Status: Closed (View Workflow)
    • Minor
    • Resolution: Duplicate
    • 10.1.17, 10.1(EOL), 10.2(EOL)
    • N/A
    • Replication
    • 3 10.1.17 servers in galera cluster

    Description

      I'm removing about 200GB of blobs from my servers in a galera cluster (don't get me started on them being in the database to begin). I have had `binlog_row_image` set to `MINIMAL` for at least several months.

      The rows I'm updating to remove the blobs are in a table with an `INT` PK called `id` and a `MEDIUMTEXT` column called `message`. For consistency for now, instead of deleting the rows, I am updating the `messsage` to `'s3'` (we are storing the data on S3 now).

      I assumed that with `binlog_row_image` set to `MINIMAL` the binlog would only store the old PK (`id`) and the new blob (`message`), but the size of the newly written binlogs is consistent with the amount of data that is being removed and `mysqlbinlog` shows the full old blob. I also tried using `NOBLOB` for testing, but the binlogs looked the same.

      I thought it might be that galera caused the full image to be written but our async slaves with basically the same config are also writing the full image.

      I have attached the config for one of the servers in the galera cluster, the others are basically the same.

      I have moved to updating a similar table as part of the same migration. The table structure is different, but I'm still performing basically the same operation. Here is the create statement for it:

      CREATE TABLE `contacts_messages` (
      	`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
      	`acctID` INT(10) UNSIGNED NOT NULL,
      	`stamp` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
      	`activity_id` INT(10) UNSIGNED NOT NULL,
      	`user_id` INT(10) UNSIGNED NOT NULL,
      	`contact_id` INT(10) UNSIGNED NOT NULL,
      	`subject` VARCHAR(255) NOT NULL,
      	`message` TEXT NOT NULL,
      	`to_email` VARCHAR(255) NULL DEFAULT NULL,
      	`cc_email` VARCHAR(255) NULL DEFAULT NULL,
      	`attachments` TEXT NULL,
      	`opened_at` TIMESTAMP NULL DEFAULT NULL,
      	PRIMARY KEY (`id`),
      	INDEX `contact_id` (`contact_id`),
      	INDEX `acctID` (`acctID`)
      )
      COLLATE='utf8_general_ci'
      ENGINE=InnoDB
      ROW_FORMAT=DYNAMIC
      AUTO_INCREMENT=56049803
      ;
      

      This is a sample of the update query I'm running. `contacts_messages_upload_status` is a table I was using to track which blobs were uploaded to S3.

      UPDATE `contacts_messages` SET `message` = 's3' WHERE message != 's3' AND id >= 12109100 AND id < 12109200 AND id IN (SELECT id FROM contacts_messages_upload_status WHERE id >= 12109100 AND id < 12109200)
      

      I added `SELECT @@binlog_row_image` to the top of the script I'm running to do these updates and it does return `MINIMAL`.

      Attachments

        Issue Links

          Activity

            Yes, please provide the table structure (SHOW CREATE TABLE ...), an example of the UPDATE statement that you are using, SELECT @@binlog_row_image from the same session where UPDATE was executed, and an example of the resulting binlog from the same server.

            Just in case – when you are saying that you've had binlog_row_image=MINIMAL for months, I assume you restarted the server after you set it, right?

            elenst Elena Stepanova added a comment - Yes, please provide the table structure ( SHOW CREATE TABLE ... ), an example of the UPDATE statement that you are using, SELECT @@binlog_row_image from the same session where UPDATE was executed, and an example of the resulting binlog from the same server. Just in case – when you are saying that you've had binlog_row_image=MINIMAL for months, I assume you restarted the server after you set it, right?

            I updated the original description. And, yes, I did restart the servers since adding `binlog_row_image=MINIMAL` to the config file and they have been running with that setting. I will attach an excerpt of the binlog showing for one of the update statements.

            bradjorgensen Brad Jorgensen added a comment - I updated the original description. And, yes, I did restart the servers since adding `binlog_row_image=MINIMAL` to the config file and they have been running with that setting. I will attach an excerpt of the binlog showing for one of the update statements.

            I attached an example binlog entry for one of the updates. It has the row event decoded so I could redact the original blob data. Without it decoding the row data, the same event has a consistently large `BINLOG '...'` block.

            bradjorgensen Brad Jorgensen added a comment - I attached an example binlog entry for one of the updates. It has the row event decoded so I could redact the original blob data. Without it decoding the row data, the same event has a consistently large `BINLOG '...'` block.

            After looking at the example UPDATE query I added here outside of the code context I normally see it in, I tried removing the message != 's3' condition from the query and it appears that is doesn't log the old blob data anymore. It looks like since I select the old body so it skips already updated rows, that column is marked as needing to be read when writing the binlog row data. It seems odd, though, that since it has the primary key it still writes the other column's data. Unless I'm thinking about this wrong and the binlog writes the full possibly updated row data for the entire query even if not all of the rows would be updated by the query.

            bradjorgensen Brad Jorgensen added a comment - After looking at the example UPDATE query I added here outside of the code context I normally see it in, I tried removing the message != 's3' condition from the query and it appears that is doesn't log the old blob data anymore. It looks like since I select the old body so it skips already updated rows, that column is marked as needing to be read when writing the binlog row data. It seems odd, though, that since it has the primary key it still writes the other column's data. Unless I'm thinking about this wrong and the binlog writes the full possibly updated row data for the entire query even if not all of the rows would be updated by the query.
            elenst Elena Stepanova added a comment - - edited

            Indeed, it appears to depend on using blob in the WHERE clause, and I agree that it shouldn't. As a workaround, I suppose you can remove the condition, it doesn't do anything useful in your case anyway.

            Nor reproducible with MySQL 5.6, 5.7.

            MTR-like test-case (only to reproduce, not to be put into the regression suite as is!)

            --source include/have_innodb.inc
            --source include/have_binlog_format_row.inc
             
            SET binlog_row_image= 'MINIMAL';
             
            CREATE TABLE `contacts_messages` (
                id INT PRIMARY KEY,
                message BLOB
            ) ENGINE=InnoDB;
             
            CREATE TABLE `contacts_messages_upload_status` (
                id INT PRIMARY KEY
            ) ENGINE=InnoDB;
             
            INSERT INTO `contacts_messages_upload_status` VALUES (1),(2);
            INSERT INTO `contacts_messages` VALUES (1,'foo'),(2,'bar');
             
            UPDATE `contacts_messages` SET `message` = 's3' WHERE message != 's3';
             
            --let $datadir= `SELECT @@datadir`
             
            SHOW BINLOG EVENTS;
            FLUSH BINARY LOGS;
             
            --exec $MYSQL_BINLOG --verbose --base64-output=DECODE-ROWS $datadir/master-bin.000001
             
            DROP TABLE `contacts_messages`, `contacts_messages_upload_status`;
            

            elenst Elena Stepanova added a comment - - edited Indeed, it appears to depend on using blob in the WHERE clause, and I agree that it shouldn't. As a workaround, I suppose you can remove the condition, it doesn't do anything useful in your case anyway. Nor reproducible with MySQL 5.6, 5.7. MTR-like test-case (only to reproduce, not to be put into the regression suite as is!) --source include/have_innodb.inc --source include/have_binlog_format_row.inc   SET binlog_row_image= 'MINIMAL' ;   CREATE TABLE `contacts_messages` ( id INT PRIMARY KEY , message BLOB ) ENGINE=InnoDB;   CREATE TABLE `contacts_messages_upload_status` ( id INT PRIMARY KEY ) ENGINE=InnoDB;   INSERT INTO `contacts_messages_upload_status` VALUES (1),(2); INSERT INTO `contacts_messages` VALUES (1, 'foo' ),(2, 'bar' );   UPDATE `contacts_messages` SET `message` = 's3' WHERE message != 's3' ;   --let $datadir= `SELECT @@datadir`   SHOW BINLOG EVENTS; FLUSH BINARY LOGS;   --exec $MYSQL_BINLOG --verbose --base64-output=DECODE-ROWS $datadir/master-bin.000001   DROP TABLE `contacts_messages`, `contacts_messages_upload_status`;
            claudio.nanni Claudio Nanni added a comment -

            Hi,

            I have found out that what makes the difference (at least in my case) is having a column defined as:

            timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP

            In particular it must be present 'ON UPDATE CURRENT_TIMESTAMP'

            This even with a Primary Key on the table.

            I attach a simple test case.
            binlog_row_image.txt

            claudio.nanni Claudio Nanni added a comment - Hi, I have found out that what makes the difference (at least in my case) is having a column defined as: timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP In particular it must be present 'ON UPDATE CURRENT_TIMESTAMP' This even with a Primary Key on the table. I attach a simple test case. binlog_row_image.txt
            alice Alice Sherepa added a comment -

            Fixed by commit f9ac7032cbc4b7 ) (I checked 2 cases- in the 2 above comments )

            commit f9ac7032cbc4b7b9af9c8122e6ebfd91af9fbaf9 (HEAD)
            Author: Andrei Elkin <andrei.elkin@mariadb.com>
            Date:   Sun Jan 20 19:00:16 2019 +0200
             
                MDEV-14605 Changes to "ON UPDATE CURRENT_TIMESTAMP" fields are not
                           always logged properly with binlog_row_image=MINIMAL
            

            alice Alice Sherepa added a comment - Fixed by commit f9ac7032cbc4b7 ) (I checked 2 cases- in the 2 above comments ) commit f9ac7032cbc4b7b9af9c8122e6ebfd91af9fbaf9 (HEAD) Author: Andrei Elkin <andrei.elkin@mariadb.com> Date: Sun Jan 20 19:00:16 2019 +0200   MDEV-14605 Changes to "ON UPDATE CURRENT_TIMESTAMP" fields are not always logged properly with binlog_row_image=MINIMAL

            People

              Elkin Andrei Elkin
              bradjorgensen Brad Jorgensen
              Votes:
              1 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.