Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-21611

mariabackup should update binlog position in InnoDB during prepare

Details

    Description

      Mariabackup currently stores the backup's binary log position in the file xtrabackup_binlog_info.

      https://mariadb.com/kb/en/files-created-by-mariabackup/#xtrabackup_binlog_info

      InnoDB also stores the data's corresponding binary log position in its global tablespace.

      It seems as though mariabackup does not currently update the binary log position stored in InnoDB during the prepare. This has a tendency to confuse users, since they see mismatching binary log positions.

      For example, they might see something like this in the prepare output written by mariabackup:

      2020-01-31 1:58:55 0 \[Note\] InnoDB: Last binlog file '/binlogs/mysql\-bin.014960', position 54640630
      \[00\] 2020-01-31 01:58:56 Last binlog file /binlogs/mysql\-bin.014960, position 54640630
      \[00\] 2020-01-31 01:58:57 completed OK!
      

      And they might see a mismatched binary log position in xtrabackup_binlog_info:

      $ cat /mariabackup/base/2020-01-31_00-06-18/xtrabackup_binlog\_info
      mysql\-bin.014961 449
      

      To prevent confusion, when mariabackup prepares a backup, it should probably parse the binary log position from xtrabackup_binlog_info, and then use that position to update the binary log position in InnoDB.

      Attachments

        Issue Links

          Activity

            mariabackup must not update binlog position during prepare, as it fetches it under either FTWRL (10.[23]) or under BLOCK_COMMIT(10.4+), which must block any binary log updates. This is more likely the bug on server-side locking. We already found similar bug - MDEV-23586, it's affected version is 10.4.14. It might cause the problem described in this report. At the moment we don't know affected versions of this bug, we need to develop test case to determine it. So I set Fix Version to N/A and update it after test is developed.

            vlad.lesin Vladislav Lesin added a comment - mariabackup must not update binlog position during prepare, as it fetches it under either FTWRL (10. [23] ) or under BLOCK_COMMIT(10.4+), which must block any binary log updates. This is more likely the bug on server-side locking. We already found similar bug - MDEV-23586 , it's affected version is 10.4.14. It might cause the problem described in this report. At the moment we don't know affected versions of this bug, we need to develop test case to determine it. So I set Fix Version to N/A and update it after test is developed.

            Folks I ran into this today while attempting to rebuild a host for one of our paying customers.

            select @@version;
            +---------------------+
            | @@version           |
            +---------------------+
            | 10.4.12-MariaDB-log |
            +---------------------+
            1 row in set (0.002 sec)
            

            mariabackup --version
            mariabackup based on MariaDB server 10.4.12-MariaDB Linux (x86_64)
            

            Took backup from slave host #1, so I could rebuild the slave host #2 and point it to master:
            Backup log from source host (slave #1):

            [00] 2020-11-02 11:41:49 >> log scanned up to (7475872827036)
            [00] 2020-11-02 11:41:49 Streaming xtrabackup_slave_info
            [00] 2020-11-02 11:41:49         ...done
            [00] 2020-11-02 11:41:49 Streaming xtrabackup_binlog_info
            [00] 2020-11-02 11:41:49         ...done
            [00] 2020-11-02 11:41:49 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
            [00] 2020-11-02 11:41:49 mariabackup: The latest check point (for incremental): '7475872826954'
            mariabackup: Stopping log copying thread.[00] 2020-11-02 11:41:49 >> log scanned up to (7475872827036)
             
            [00] 2020-11-02 11:41:49 >> log scanned up to (7475872827036)
            [00] 2020-11-02 11:41:49 Executing BACKUP STAGE END
            [00] 2020-11-02 11:41:49 All tables unlocked
            [00] 2020-11-02 11:41:49 Streaming ib_buffer_pool to <STDOUT>
            [00] 2020-11-02 11:41:49         ...done
            [00] 2020-11-02 11:41:49 Backup created in directory '/var/lib/mysql/xtrabackup_backupfiles/'
            [00] 2020-11-02 11:41:49 MySQL binlog position: filename 'mariadb-bin.014113', position '360999250', GTID of the last change '0-40-3424141692,1-1-3051765148'
            [00] 2020-11-02 11:41:49 MySQL slave binlog position: master host '10.0.90.226', gtid_slave_pos 0-40-3424141692,1-1-3051765148
            [00] 2020-11-02 11:41:49 Streaming backup-my.cnf
            [00] 2020-11-02 11:41:49         ...done
            [00] 2020-11-02 11:41:49 Streaming xtrabackup_info
            [00] 2020-11-02 11:41:49         ...done
            [00] 2020-11-02 11:41:49 Redo log (from LSN 7475818325701 to 7475872827036) was copied.
            [00] 2020-11-02 11:41:49 completed OK!
            

            Prepare log from slave #2:

            mariabackup based on MariaDB server 10.4.12-MariaDB Linux (x86_64)
            [00] 2020-11-02 11:42:16 cd to /var/lib/mysql/
            [00] 2020-11-02 11:42:16 This target seems to be not prepared yet.
            [00] 2020-11-02 11:42:16 mariabackup: using the following InnoDB configuration for recovery:
            [00] 2020-11-02 11:42:16 innodb_data_home_dir = .
            [00] 2020-11-02 11:42:16 innodb_data_file_path = ibdata1:12M:autoextend
            [00] 2020-11-02 11:42:16 innodb_log_group_home_dir = .
            [00] 2020-11-02 11:42:16 InnoDB: Using Linux native AIO
            [00] 2020-11-02 11:42:16 Starting InnoDB instance for recovery.
            [00] 2020-11-02 11:42:16 mariabackup: Using 104857600 bytes for buffer pool (set by --use-memory parameter)
            2020-11-02 11:42:16 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
            2020-11-02 11:42:16 0 [Note] InnoDB: Uses event mutexes
            2020-11-02 11:42:16 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
            2020-11-02 11:42:16 0 [Note] InnoDB: Number of pools: 1
            2020-11-02 11:42:16 0 [Note] InnoDB: Using SSE2 crc32 instructions
            2020-11-02 11:42:16 0 [Note] InnoDB: Initializing buffer pool, total size = 100M, instances = 1, chunk size = 100M
            2020-11-02 11:42:16 0 [Note] InnoDB: Completed initialization of buffer pool
            2020-11-02 11:42:16 0 [Note] InnoDB: page_cleaner coordinator priority: -20
            2020-11-02 11:42:16 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=7475820437561
            2020-11-02 11:42:17 0 [Note] InnoDB: Starting a batch to recover 2476 pages from redo log.
            2020-11-02 11:42:19 0 [Note] InnoDB: Starting a batch to recover 2402 pages from redo log.
            2020-11-02 11:42:21 0 [Note] InnoDB: Starting final batch to recover 1806 pages from redo log.
            2020-11-02 11:42:22 0 [Note] InnoDB: Last binlog file '/var/log/mysql/mariadb-bin.014113', position 360999250
            [00] 2020-11-02 11:42:22 Last binlog file /var/log/mysql/mariadb-bin.014113, position 360999250
            [00] 2020-11-02 11:42:23 completed OK!
            

            xtrabackup_slave_info:

            # cat xtrabackup_slave_info
            SET GLOBAL gtid_slave_pos = '0-40-3424141692,1-1-3051765148';
            CHANGE MASTER TO master_use_gtid = slave_pos
            

            I set the coordinates as indicated and attempted to start replication by pointing to the master, which failed as shown below:

            I redacted creds for obvious reasons

            MariaDB [(none)]> SET GLOBAL gtid_slave_pos = '0-40-3424141692,1-1-3051765148';
            Query OK, 0 rows affected (0.030 sec)
             
            MariaDB [(none)]> CHANGE MASTER TO MASTER_HOST='10.0.90.226',
                -> MASTER_USER='*****',
                -> MASTER_PASSWORD='*****',
                -> MASTER_USE_GTID=slave_pos;
            Query OK, 0 rows affected (0.006 sec)
             
             
            MariaDB [(none)]> start slave;
            Query OK, 0 rows affected (0.013 sec)
             
            MariaDB [(none)]> show slave status\G
            *************************** 1. row ***************************
                            Slave_IO_State: Waiting for master to send event
                               Master_Host: 10.0.90.226
                               Master_User: mariadb
                               Master_Port: 3306
                             Connect_Retry: 60
                           Master_Log_File: mariadb-bin.016949
                       Read_Master_Log_Pos: 353215123
                            Relay_Log_File: mariadb-relay.000002
                             Relay_Log_Pos: 13408
                     Relay_Master_Log_File: mariadb-bin.016949
                          Slave_IO_Running: Yes
                         Slave_SQL_Running: No
                           Replicate_Do_DB: 
                       Replicate_Ignore_DB: 
                        Replicate_Do_Table: 
                    Replicate_Ignore_Table: 
                   Replicate_Wild_Do_Table: 
               Replicate_Wild_Ignore_Table: 
                                Last_Errno: 1062
                                Last_Error: Could not execute Write_rows_v1 event on table action.log; Duplicate entry '324452013-2020-11-02' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mariadb-bin.016949, end_log_pos 322668826
                              Skip_Counter: 0
                       Exec_Master_Log_Pos: 322668453
                           Relay_Log_Space: 30560385
                           Until_Condition: None
                            Until_Log_File: 
                             Until_Log_Pos: 0
                        Master_SSL_Allowed: No
                        Master_SSL_CA_File: 
                        Master_SSL_CA_Path: 
                           Master_SSL_Cert: 
                         Master_SSL_Cipher: 
                            Master_SSL_Key: 
                     Seconds_Behind_Master: NULL
             Master_SSL_Verify_Server_Cert: No
                             Last_IO_Errno: 0
                             Last_IO_Error: 
                            Last_SQL_Errno: 1062
                            Last_SQL_Error: Could not execute Write_rows_v1 event on table action.log; Duplicate entry '324452013-2020-11-02' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mariadb-bin.016949, end_log_pos 322668826
               Replicate_Ignore_Server_Ids: 
                          Master_Server_Id: 1
                            Master_SSL_Crl: 
                        Master_SSL_Crlpath: 
                                Using_Gtid: Slave_Pos
                               Gtid_IO_Pos: 1-1-3051772851,0-40-3424141692
                   Replicate_Do_Domain_Ids: 
               Replicate_Ignore_Domain_Ids: 
                             Parallel_Mode: conservative
                                 SQL_Delay: 0
                       SQL_Remaining_Delay: NULL
                   Slave_SQL_Running_State: 
                          Slave_DDL_Groups: 0
            Slave_Non_Transactional_Groups: 0
                Slave_Transactional_Groups: 3259
            1 row in set (0.000 sec)
            

            dalmeida Daniel Almeida (Inactive) added a comment - Folks I ran into this today while attempting to rebuild a host for one of our paying customers. select @@version; +---------------------+ | @@version | +---------------------+ | 10.4.12-MariaDB-log | +---------------------+ 1 row in set (0.002 sec) mariabackup --version mariabackup based on MariaDB server 10.4.12-MariaDB Linux (x86_64) Took backup from slave host #1, so I could rebuild the slave host #2 and point it to master: Backup log from source host (slave #1): [00] 2020-11-02 11:41:49 >> log scanned up to (7475872827036) [00] 2020-11-02 11:41:49 Streaming xtrabackup_slave_info [00] 2020-11-02 11:41:49 ...done [00] 2020-11-02 11:41:49 Streaming xtrabackup_binlog_info [00] 2020-11-02 11:41:49 ...done [00] 2020-11-02 11:41:49 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS... [00] 2020-11-02 11:41:49 mariabackup: The latest check point (for incremental): '7475872826954' mariabackup: Stopping log copying thread.[00] 2020-11-02 11:41:49 >> log scanned up to (7475872827036)   [00] 2020-11-02 11:41:49 >> log scanned up to (7475872827036) [00] 2020-11-02 11:41:49 Executing BACKUP STAGE END [00] 2020-11-02 11:41:49 All tables unlocked [00] 2020-11-02 11:41:49 Streaming ib_buffer_pool to <STDOUT> [00] 2020-11-02 11:41:49 ...done [00] 2020-11-02 11:41:49 Backup created in directory '/var/lib/mysql/xtrabackup_backupfiles/' [00] 2020-11-02 11:41:49 MySQL binlog position: filename 'mariadb-bin.014113', position '360999250', GTID of the last change '0-40-3424141692,1-1-3051765148' [00] 2020-11-02 11:41:49 MySQL slave binlog position: master host '10.0.90.226', gtid_slave_pos 0-40-3424141692,1-1-3051765148 [00] 2020-11-02 11:41:49 Streaming backup-my.cnf [00] 2020-11-02 11:41:49 ...done [00] 2020-11-02 11:41:49 Streaming xtrabackup_info [00] 2020-11-02 11:41:49 ...done [00] 2020-11-02 11:41:49 Redo log (from LSN 7475818325701 to 7475872827036) was copied. [00] 2020-11-02 11:41:49 completed OK! Prepare log from slave #2: mariabackup based on MariaDB server 10.4.12-MariaDB Linux (x86_64) [00] 2020-11-02 11:42:16 cd to /var/lib/mysql/ [00] 2020-11-02 11:42:16 This target seems to be not prepared yet. [00] 2020-11-02 11:42:16 mariabackup: using the following InnoDB configuration for recovery: [00] 2020-11-02 11:42:16 innodb_data_home_dir = . [00] 2020-11-02 11:42:16 innodb_data_file_path = ibdata1:12M:autoextend [00] 2020-11-02 11:42:16 innodb_log_group_home_dir = . [00] 2020-11-02 11:42:16 InnoDB: Using Linux native AIO [00] 2020-11-02 11:42:16 Starting InnoDB instance for recovery. [00] 2020-11-02 11:42:16 mariabackup: Using 104857600 bytes for buffer pool (set by --use-memory parameter) 2020-11-02 11:42:16 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins 2020-11-02 11:42:16 0 [Note] InnoDB: Uses event mutexes 2020-11-02 11:42:16 0 [Note] InnoDB: Compressed tables use zlib 1.2.7 2020-11-02 11:42:16 0 [Note] InnoDB: Number of pools: 1 2020-11-02 11:42:16 0 [Note] InnoDB: Using SSE2 crc32 instructions 2020-11-02 11:42:16 0 [Note] InnoDB: Initializing buffer pool, total size = 100M, instances = 1, chunk size = 100M 2020-11-02 11:42:16 0 [Note] InnoDB: Completed initialization of buffer pool 2020-11-02 11:42:16 0 [Note] InnoDB: page_cleaner coordinator priority: -20 2020-11-02 11:42:16 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=7475820437561 2020-11-02 11:42:17 0 [Note] InnoDB: Starting a batch to recover 2476 pages from redo log. 2020-11-02 11:42:19 0 [Note] InnoDB: Starting a batch to recover 2402 pages from redo log. 2020-11-02 11:42:21 0 [Note] InnoDB: Starting final batch to recover 1806 pages from redo log. 2020-11-02 11:42:22 0 [Note] InnoDB: Last binlog file '/var/log/mysql/mariadb-bin.014113', position 360999250 [00] 2020-11-02 11:42:22 Last binlog file /var/log/mysql/mariadb-bin.014113, position 360999250 [00] 2020-11-02 11:42:23 completed OK! xtrabackup_slave_info: # cat xtrabackup_slave_info SET GLOBAL gtid_slave_pos = '0-40-3424141692,1-1-3051765148'; CHANGE MASTER TO master_use_gtid = slave_pos I set the coordinates as indicated and attempted to start replication by pointing to the master, which failed as shown below: I redacted creds for obvious reasons MariaDB [(none)]> SET GLOBAL gtid_slave_pos = '0-40-3424141692,1-1-3051765148'; Query OK, 0 rows affected (0.030 sec)   MariaDB [(none)]> CHANGE MASTER TO MASTER_HOST='10.0.90.226', -> MASTER_USER='*****', -> MASTER_PASSWORD='*****', -> MASTER_USE_GTID=slave_pos; Query OK, 0 rows affected (0.006 sec)     MariaDB [(none)]> start slave; Query OK, 0 rows affected (0.013 sec)   MariaDB [(none)]> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.0.90.226 Master_User: mariadb Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mariadb-bin.016949 Read_Master_Log_Pos: 353215123 Relay_Log_File: mariadb-relay.000002 Relay_Log_Pos: 13408 Relay_Master_Log_File: mariadb-bin.016949 Slave_IO_Running: Yes Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1062 Last_Error: Could not execute Write_rows_v1 event on table action.log; Duplicate entry '324452013-2020-11-02' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mariadb-bin.016949, end_log_pos 322668826 Skip_Counter: 0 Exec_Master_Log_Pos: 322668453 Relay_Log_Space: 30560385 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1062 Last_SQL_Error: Could not execute Write_rows_v1 event on table action.log; Duplicate entry '324452013-2020-11-02' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mariadb-bin.016949, end_log_pos 322668826 Replicate_Ignore_Server_Ids: Master_Server_Id: 1 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: Slave_Pos Gtid_IO_Pos: 1-1-3051772851,0-40-3424141692 Replicate_Do_Domain_Ids: Replicate_Ignore_Domain_Ids: Parallel_Mode: conservative SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave_DDL_Groups: 0 Slave_Non_Transactional_Groups: 0 Slave_Transactional_Groups: 3259 1 row in set (0.000 sec)

            People

              Unassigned Unassigned
              GeoffMontee Geoff Montee (Inactive)
              Votes:
              2 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.