Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-26322

Last binlog file and position are "empty" in mariabackup --prepare output

Details

    Description

      Normally (till 10.5.x) mariabackup prepare stage reports last binlog file and position at the end, for example:

      openxs@ao756:~/dbs/maria10.4$  bin/mariabackup --no-defaults --binlog-info=on --prepare --target-dir=/tmp/backup
      bin/mariabackup based on MariaDB server 10.4.22-MariaDB Linux (x86_64)
      [00] 2021-08-08 16:46:12 cd to /tmp/backup/
      [00] 2021-08-08 16:46:12 open files limit requested 0, set to 1024
      [00] 2021-08-08 16:46:12 This target seems to be not prepared yet.
      [00] 2021-08-08 16:46:12 mariabackup: using the following InnoDB configuration for recovery:
      [00] 2021-08-08 16:46:12 innodb_data_home_dir = .
      [00] 2021-08-08 16:46:12 innodb_data_file_path = ibdata1:10M:autoextend
      [00] 2021-08-08 16:46:12 innodb_log_group_home_dir = .
      [00] 2021-08-08 16:46:12 InnoDB: Using Linux native AIO
      [00] 2021-08-08 16:46:12 Starting InnoDB instance for recovery.
      [00] 2021-08-08 16:46:12 mariabackup: Using 104857600 bytes for buffer pool (set by --use-memory parameter)
      2021-08-08 16:46:12 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      2021-08-08 16:46:12 0 [Note] InnoDB: Uses event mutexes
      2021-08-08 16:46:12 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
      2021-08-08 16:46:12 0 [Note] InnoDB: Number of pools: 1
      2021-08-08 16:46:12 0 [Note] InnoDB: Using SSE2 crc32 instructions
      2021-08-08 16:46:12 0 [Note] InnoDB: Initializing buffer pool, total size = 100M, instances = 1, chunk size = 100M
      2021-08-08 16:46:13 0 [Note] InnoDB: Completed initialization of buffer pool
      2021-08-08 16:46:13 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
      2021-08-08 16:46:13 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=651139114
      2021-08-08 16:46:15 0 [Note] InnoDB: Starting a batch to recover 291 pages from redo log.
      2021-08-08 16:46:17 0 [Note] InnoDB: Starting final batch to recover 167 pages from redo log.
      2021-08-08 16:46:17 0 [Note] InnoDB: Last binlog file './ao756-bin.000002', position 31344186
      [00] 2021-08-08 16:46:18 Last binlog file ./ao756-bin.000002, position 31344186
      [00] 2021-08-08 16:46:18 completed OK!
      

      But in 10.5.x it's reported as "empty":

      openxs@ao756:~/dbs/maria10.5$ bin/mariabackup --no-defaults --binlog-info=on --prepare --target-dir=/tmp/backup
      bin/mariabackup based on MariaDB server 10.5.13-MariaDB Linux (x86_64)
      [00] 2021-08-08 14:58:54 cd to /tmp/backup/
      [00] 2021-08-08 14:58:54 open files limit requested 0, set to 1024
      [00] 2021-08-08 14:58:54 This target seems to be not prepared yet.
      [00] 2021-08-08 14:58:54 mariabackup: using the following InnoDB configuration for recovery:
      [00] 2021-08-08 14:58:54 innodb_data_home_dir = .
      [00] 2021-08-08 14:58:54 innodb_data_file_path = ibdata1:10M:autoextend
      [00] 2021-08-08 14:58:54 innodb_log_group_home_dir = .
      [00] 2021-08-08 14:58:54 InnoDB: Using Linux native AIO
      [00] 2021-08-08 14:58:54 Starting InnoDB instance for recovery.
      [00] 2021-08-08 14:58:54 mariabackup: Using 104857600 bytes for buffer pool (set by --use-memory parameter)
      2021-08-08 14:58:54 0 [Note] InnoDB: Uses event mutexes
      2021-08-08 14:58:54 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
      2021-08-08 14:58:54 0 [Note] InnoDB: Number of pools: 1
      2021-08-08 14:58:54 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
      2021-08-08 14:58:54 0 [Note] InnoDB: Using Linux native AIO
      2021-08-08 14:58:54 0 [Note] InnoDB: Initializing buffer pool, total size = 104857600, chunk size = 104857600
      2021-08-08 14:58:54 0 [Note] InnoDB: Completed initialization of buffer pool
      2021-08-08 14:58:55 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1050962794,1120684838
      2021-08-08 14:59:00 0 [Note] InnoDB: Starting a batch to recover 155 pages from redo log.
      2021-08-08 14:59:02 0 [Note] InnoDB: Starting a batch to recover 155 pages from redo log.
      2021-08-08 14:59:04 0 [Note] InnoDB: Starting a batch to recover 157 pages from redo log.
      2021-08-08 14:59:06 0 [Note] InnoDB: Starting a batch to recover 162 pages from redo log.
      2021-08-08 14:59:07 0 [Note] InnoDB: Starting final batch to recover 358 pages from redo log.
      [00] 2021-08-08 14:59:08 Last binlog file , position 0
      [00] 2021-08-08 14:59:08 completed OK!
      

      It looks like a regression bug, but maybe there is a good reason NOT to report it (where is it documented then?) In the latter case why this output appears at the log at all? It adds no value, just a confusion.

      See also MDEV-21611 and related bugs there. Cross checking vs the information reported at --backup stage and xtrabackup_binlog_info is important for troubleshooting in some cases.

      Attachments

        Issue Links

          Activity

            The function trx_rseg_update_binlog_offset() is still there, but it was refactored in 10.5 by MDEV-21174 and MDEV-12353.

            marko Marko Mäkelä added a comment - The function trx_rseg_update_binlog_offset() is still there, but it was refactored in 10.5 by MDEV-21174 and MDEV-12353 .

            alice, can you try to produce an mtr test case for this? I just checked

            ./mtr --manual-rr innodb.binlog_consistent
            

            in a 10.6-based branch (which I believed to be similar enough to 10.5) and I do see the information being filled in. So, it is not obviously broken in the database server.

            (rr) break trx_rseg_update_binlog_offset
            Breakpoint 1 at 0x5601cc6e4989: file /mariadb/10.6/storage/innobase/trx/trx0rseg.cc, line 697.
            (rr) c
            Continuing.
            2021-08-26 19:02:50 0 [Note] /dev/shm/10.6/sql/mariadbd (server 10.6.5-MariaDB-debug-log) starting as process 2684728 ...
            …
            Thread 2 hit Breakpoint 1, trx_rseg_update_binlog_offset (
                rseg_header=rseg_header@entry=0x7f47361d63f0, 
                trx=trx@entry=0x7f4736a3a1b8, mtr=mtr@entry=0x7f472c579bd0)
                at /mariadb/10.6/storage/innobase/trx/trx0rseg.cc:697
            --Type <RET> for more, q to quit, c to continue without paging--
            697	{
            (rr) bt
            #0  trx_rseg_update_binlog_offset (
                rseg_header=rseg_header@entry=0x7f47361d63f0, 
                trx=trx@entry=0x7f4736a3a1b8, mtr=mtr@entry=0x7f472c579bd0)
                at /mariadb/10.6/storage/innobase/trx/trx0rseg.cc:697
            …
            #10 0x00005601cc30dcba in TC_LOG::run_commit_ordered (this=this@entry=0x5601cd2b1e80 <mysql_bin_log>, thd=0x7f4718001a28, all=false) at /mariadb/10.6/sql/log.cc:9258
            #11 0x00005601cc315c39 in MYSQL_BIN_LOG::trx_group_commit_leader (this=this@entry=0x5601cd2b1e80 <mysql_bin_log>, leader=leader@entry=0x7f472c57a420) at /mariadb/10.6/sql/log.cc:8489
            #12 0x00005601cc3160f4 in MYSQL_BIN_LOG::write_transaction_to_binlog_events (this=this@entry=0x5601cd2b1e80 <mysql_bin_log>, entry=entry@entry=0x7f472c57a420) at /mariadb/10.6/sql/log.cc:8070
            …
            (rr) n
            698		DBUG_LOG("trx", "trx_mysql_binlog_offset: " << trx->mysql_log_offset);
            (rr) p trx->mysql_log_offset
            $1 = 687
            (rr) p trx->mysql_log_
            mysql_log_file_name  mysql_log_offset     
            (rr) p trx->mysql_log_file_name 
            $2 = 0x7f4718081ac8 "./master-bin.000001"
            

            Maybe it is broken somewhere else. Maybe Mariabackup is not copying the binlog files? I am not too familiar with the binlog, so I would appreciate a self-contained test case that I can execute in mtr.

            marko Marko Mäkelä added a comment - alice , can you try to produce an mtr test case for this? I just checked ./mtr --manual-rr innodb.binlog_consistent in a 10.6-based branch (which I believed to be similar enough to 10.5) and I do see the information being filled in. So, it is not obviously broken in the database server. (rr) break trx_rseg_update_binlog_offset Breakpoint 1 at 0x5601cc6e4989: file /mariadb/10.6/storage/innobase/trx/trx0rseg.cc, line 697. (rr) c Continuing. 2021-08-26 19:02:50 0 [Note] /dev/shm/10.6/sql/mariadbd (server 10.6.5-MariaDB-debug-log) starting as process 2684728 ... … Thread 2 hit Breakpoint 1, trx_rseg_update_binlog_offset ( rseg_header=rseg_header@entry=0x7f47361d63f0, trx=trx@entry=0x7f4736a3a1b8, mtr=mtr@entry=0x7f472c579bd0) at /mariadb/10.6/storage/innobase/trx/trx0rseg.cc:697 --Type <RET> for more, q to quit, c to continue without paging-- 697 { (rr) bt #0 trx_rseg_update_binlog_offset ( rseg_header=rseg_header@entry=0x7f47361d63f0, trx=trx@entry=0x7f4736a3a1b8, mtr=mtr@entry=0x7f472c579bd0) at /mariadb/10.6/storage/innobase/trx/trx0rseg.cc:697 … #10 0x00005601cc30dcba in TC_LOG::run_commit_ordered (this=this@entry=0x5601cd2b1e80 <mysql_bin_log>, thd=0x7f4718001a28, all=false) at /mariadb/10.6/sql/log.cc:9258 #11 0x00005601cc315c39 in MYSQL_BIN_LOG::trx_group_commit_leader (this=this@entry=0x5601cd2b1e80 <mysql_bin_log>, leader=leader@entry=0x7f472c57a420) at /mariadb/10.6/sql/log.cc:8489 #12 0x00005601cc3160f4 in MYSQL_BIN_LOG::write_transaction_to_binlog_events (this=this@entry=0x5601cd2b1e80 <mysql_bin_log>, entry=entry@entry=0x7f472c57a420) at /mariadb/10.6/sql/log.cc:8070 … (rr) n 698 DBUG_LOG("trx", "trx_mysql_binlog_offset: " << trx->mysql_log_offset); (rr) p trx->mysql_log_offset $1 = 687 (rr) p trx->mysql_log_ mysql_log_file_name mysql_log_offset (rr) p trx->mysql_log_file_name $2 = 0x7f4718081ac8 "./master-bin.000001" Maybe it is broken somewhere else. Maybe Mariabackup is not copying the binlog files? I am not too familiar with the binlog, so I would appreciate a self-contained test case that I can execute in mtr .
            alice Alice Sherepa added a comment -

            marko, I repeated on 10.5-10.6, 10.4 indeed reports position, while 10.5 shows 0. for mtr I used mariabackup.binlog test - just check /mysql-test/var/log/mysqltest.log after running the test - it shows "Last binlog file , position 0"

            alice Alice Sherepa added a comment - marko , I repeated on 10.5-10.6, 10.4 indeed reports position, while 10.5 shows 0. for mtr I used mariabackup.binlog test - just check /mysql-test/var/log/mysqltest.log after running the test - it shows "Last binlog file , position 0"

            I diagnosed this with the following patch:

            diff --git a/mysql-test/suite/mariabackup/binlog.test b/mysql-test/suite/mariabackup/binlog.test
            index b2f08c2d7bb..1587c399d25 100644
            --- a/mysql-test/suite/mariabackup/binlog.test
            +++ b/mysql-test/suite/mariabackup/binlog.test
            @@ -12,7 +12,7 @@ SHOW VARIABLES like 'log_bin';
             exec $XTRABACKUP --defaults-file=$MYSQLTEST_VARDIR/my.cnf  --backup --target-dir=$basedir;
             --enable_result_log
             
            -exec $XTRABACKUP --prepare --binlog-info=1 --target-dir=$basedir ;
            +exec rr record $XTRABACKUP --prepare --binlog-info=1 --skip-innodb-use-native-aio --target-dir=$basedir ;
             
             let SEARCH_FILE=$MYSQLTEST_VARDIR/log/current_test;
             --let SEARCH_PATTERN= Last binlog file .*, position .*
            

            According to the rr replay, we do not initialize the variables at all:

            awatch trx_sys.recovered_binlog_offset
            continue
            

            10.6 4690442411b75ea0fc1a6aacabe767d6fb1d1f62

            Thread 1 hit Hardware access (read/write) watchpoint 1: trx_sys.recovered_binlog_offset
             
            Value = 0
            xtrabackup_prepare_func (argv=argv@entry=0x5568f49988e8)
                at /mariadb/10.6/extra/mariabackup/xtrabackup.cc:5920
            5920			msg("Last binlog file %s, position %lld",
            

            In 10.4, the initialization happens with the following stack trace:

            10.4 865e5b6405d163c5591b400689717af80e99427f

            #0  trx_rseg_mem_restore (rseg=rseg@entry=0x563fdcd04b40, 
                max_trx_id=@0x7ffe0446c448: 0, mtr=mtr@entry=0x7ffe0446c500)
                at /mariadb/10.4/storage/innobase/trx/trx0rseg.cc:473
            #1  0x0000563fd97fa18b in trx_rseg_array_init ()
                at /mariadb/10.4/storage/innobase/trx/trx0rseg.cc:602
            #2  0x0000563fd980717c in trx_lists_init_at_db_start ()
                at /mariadb/10.4/storage/innobase/trx/trx0trx.cc:743
            #3  0x0000563fd97b87dd in srv_start (create_new_db=<optimized out>)
                at /mariadb/10.4/storage/innobase/srv/srv0start.cc:1896
            #4  0x0000563fd8e4b670 in innodb_init ()
                at /mariadb/10.4/extra/mariabackup/xtrabackup.cc:2231
            #5  0x0000563fd8e5dc29 in xtrabackup_prepare_func (
                argv=argv@entry=0x563fdc5d3bd0)
                at /mariadb/10.4/extra/mariabackup/xtrabackup.cc:6113
            #6  0x0000563fd8e5e78b in main_low (argv=0x563fdc5d3bd0)
                at /mariadb/10.4/extra/mariabackup/xtrabackup.cc:7065
            #7  0x0000563fd8e5e988 in main (argc=5, argv=0x7ffe04471468)
                at /mariadb/10.4/extra/mariabackup/xtrabackup.cc:6861
            

            This was broken in MariaDB Server 10.5.7 by a change that aimed to speed up Mariabackup.

            I think that it should suffice to read the binlog information from the rollback segment header pages, without reading all the undo pages that are reachable via the header pages.
            serg has stated earlier that the binlog position should not be stored in InnoDB in the first place. Until MDEV-22351 was fixed, MariaDB 10.3 and later could report bogus information after RESET MASTER.

            I think that this is a regression that deserves to be fixed. It is a separate discussion whether a future major release could stop storing binlog information in InnoDB.

            marko Marko Mäkelä added a comment - I diagnosed this with the following patch: diff --git a/mysql-test/suite/mariabackup/binlog.test b/mysql-test/suite/mariabackup/binlog.test index b2f08c2d7bb..1587c399d25 100644 --- a/mysql-test/suite/mariabackup/binlog.test +++ b/mysql-test/suite/mariabackup/binlog.test @@ -12,7 +12,7 @@ SHOW VARIABLES like 'log_bin'; exec $XTRABACKUP --defaults-file=$MYSQLTEST_VARDIR/my.cnf --backup --target-dir=$basedir; --enable_result_log -exec $XTRABACKUP --prepare --binlog-info=1 --target-dir=$basedir ; +exec rr record $XTRABACKUP --prepare --binlog-info=1 --skip-innodb-use-native-aio --target-dir=$basedir ; let SEARCH_FILE=$MYSQLTEST_VARDIR/log/current_test; --let SEARCH_PATTERN= Last binlog file .*, position .* According to the rr replay , we do not initialize the variables at all: awatch trx_sys.recovered_binlog_offset continue 10.6 4690442411b75ea0fc1a6aacabe767d6fb1d1f62 Thread 1 hit Hardware access (read/write) watchpoint 1: trx_sys.recovered_binlog_offset   Value = 0 xtrabackup_prepare_func (argv=argv@entry=0x5568f49988e8) at /mariadb/10.6/extra/mariabackup/xtrabackup.cc:5920 5920 msg("Last binlog file %s, position %lld", In 10.4, the initialization happens with the following stack trace: 10.4 865e5b6405d163c5591b400689717af80e99427f #0 trx_rseg_mem_restore (rseg=rseg@entry=0x563fdcd04b40, max_trx_id=@0x7ffe0446c448: 0, mtr=mtr@entry=0x7ffe0446c500) at /mariadb/10.4/storage/innobase/trx/trx0rseg.cc:473 #1 0x0000563fd97fa18b in trx_rseg_array_init () at /mariadb/10.4/storage/innobase/trx/trx0rseg.cc:602 #2 0x0000563fd980717c in trx_lists_init_at_db_start () at /mariadb/10.4/storage/innobase/trx/trx0trx.cc:743 #3 0x0000563fd97b87dd in srv_start (create_new_db=<optimized out>) at /mariadb/10.4/storage/innobase/srv/srv0start.cc:1896 #4 0x0000563fd8e4b670 in innodb_init () at /mariadb/10.4/extra/mariabackup/xtrabackup.cc:2231 #5 0x0000563fd8e5dc29 in xtrabackup_prepare_func ( argv=argv@entry=0x563fdc5d3bd0) at /mariadb/10.4/extra/mariabackup/xtrabackup.cc:6113 #6 0x0000563fd8e5e78b in main_low (argv=0x563fdc5d3bd0) at /mariadb/10.4/extra/mariabackup/xtrabackup.cc:7065 #7 0x0000563fd8e5e988 in main (argc=5, argv=0x7ffe04471468) at /mariadb/10.4/extra/mariabackup/xtrabackup.cc:6861 This was broken in MariaDB Server 10.5.7 by a change that aimed to speed up Mariabackup . I think that it should suffice to read the binlog information from the rollback segment header pages, without reading all the undo pages that are reachable via the header pages. serg has stated earlier that the binlog position should not be stored in InnoDB in the first place. Until MDEV-22351 was fixed, MariaDB 10.3 and later could report bogus information after RESET MASTER . I think that this is a regression that deserves to be fixed. It is a separate discussion whether a future major release could stop storing binlog information in InnoDB.
            vlad.lesin Vladislav Lesin added a comment - - edited

            marko, yes, you are correct, the cause of the bug is 59a0236da4810a5ae243e77cb7a9c6262639a6e4. But I don't understand why this commit was necessary, because both trx_lists_init_at_db_start() and trx_rseg_mem_restore() contain special cases for srv_operation == SRV_OPERATION_RESTORE condition, and on this condition only rseg headers are read:

            dberr_t trx_lists_init_at_db_start()                                            
            {                                                                               
                    ut_a(srv_is_being_started);                                             
                    ut_ad(!srv_was_started);                                                
                                                                                            
                    if (srv_operation == SRV_OPERATION_RESTORE) {                           
                            /* mariabackup --prepare only deals with                        
                            the redo log and the data files, not with                       
                            transactions or the data dictionary. */                         
                            return trx_rseg_array_init();                                   
                    }                                                                       
            ...                                                                             
            }                                                                               
                                                                                            
            static dberr_t trx_rseg_mem_restore(...)                                        
            {                                                                                           
            ...                                                                                                                             
                            const byte* binlog_name = TRX_RSEG + TRX_RSEG_BINLOG_NAME                         
                                    + rseg_hdr->frame;                                      
                            if (*binlog_name) {                                             
                            ...                                                                                 
                            }                                                               
            ...                                                                             
                    if (srv_operation == SRV_OPERATION_RESTORE) {                                                                      
                            /* mariabackup --prepare only deals with                        
                            the redo log and the data files, not with                       
                            transactions or the data dictionary. */                         
                            return DB_SUCCESS;                                              
                    }                                                                       
            ...                                                                                                                      
            }
            

            So the fix I propose is to revert that commit. The only doubt I have is the reason of that commit. Was there any performance testing or support cases which caused that fix?

            Note: binlog position is written with trx_rseg_update_binlog_offset().

            vlad.lesin Vladislav Lesin added a comment - - edited marko , yes, you are correct, the cause of the bug is 59a0236da4810a5ae243e77cb7a9c6262639a6e4. But I don't understand why this commit was necessary, because both trx_lists_init_at_db_start() and trx_rseg_mem_restore() contain special cases for srv_operation == SRV_OPERATION_RESTORE condition, and on this condition only rseg headers are read: dberr_t trx_lists_init_at_db_start() { ut_a(srv_is_being_started); ut_ad(!srv_was_started); if (srv_operation == SRV_OPERATION_RESTORE) { /* mariabackup --prepare only deals with the redo log and the data files, not with transactions or the data dictionary. */ return trx_rseg_array_init(); } ... } static dberr_t trx_rseg_mem_restore(...) { ... const byte * binlog_name = TRX_RSEG + TRX_RSEG_BINLOG_NAME + rseg_hdr->frame; if (*binlog_name) { ... } ... if (srv_operation == SRV_OPERATION_RESTORE) { /* mariabackup --prepare only deals with the redo log and the data files, not with transactions or the data dictionary. */ return DB_SUCCESS; } ... } So the fix I propose is to revert that commit. The only doubt I have is the reason of that commit. Was there any performance testing or support cases which caused that fix? Note: binlog position is written with trx_rseg_update_binlog_offset().

            vlad.lesin, you are right, that change should be reverted and a test case be added, so that this bug will not be reintroduced again.

            I did not realize that we actually must read the rollback segment header pages so that the binlog position can be reported. The call trx_rseg_read_wsrep_checkpoint() probably is unnecessary in mariadb-backup, but it should not incur significant additional cost. The bulk of the cost should be reading the 128 rollback segment header pages.

            marko Marko Mäkelä added a comment - vlad.lesin , you are right, that change should be reverted and a test case be added, so that this bug will not be reintroduced again. I did not realize that we actually must read the rollback segment header pages so that the binlog position can be reported. The call trx_rseg_read_wsrep_checkpoint() probably is unnecessary in mariadb-backup , but it should not incur significant additional cost. The bulk of the cost should be reading the 128 rollback segment header pages.

            People

              vlad.lesin Vladislav Lesin
              valerii Valerii Kravchuk
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.