Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-27919

mariabackup --log-copy-interval is measured in milliseconds in 10.5 and in microseconds in 10.6

Details

    Description

      Hi, all MariaBackup developers.

      We have two MariaDB clusters, which both version 10.0.x

      And recently upgraded from 10.0.x to 10.6.2
      But I found that the log ">> log scanned up to (...)" print too fast, overwhelming the stdout/stderr. (the log file size is occupied by several gigabytes)

      But as I tested, when I upgraded from 10.0.x to 10.5.x, the log ">> log scanned up to (...)" is not so overwhelming. (the log file size is occupied by several megabytes)

      FYI, Recovered from the same database backup.

      It sounds like a bug in mariabackup, since the --log-copy-interval (default is 1000ms) can't control the log output frequency.

      # 10.5 MariaBackup
      # One log per second
      [00] 2022-02-14 17:50:25 >> log scanned up to (46200984432003)
      [00] 2022-02-14 17:50:26 >> log scanned up to (46200984454330)
      [00] 2022-02-14 17:50:27 >> log scanned up to (46200984469322)
      [00] 2022-02-14 17:50:28 >> log scanned up to (46200984491853)
      [00] 2022-02-14 17:50:29 >> log scanned up to (46200984529292)
      [00] 2022-02-14 17:50:30 >> log scanned up to (46200984561329)
       
      # 10.6 MariaBackup
      # --log-copy-interval=1000 is not working for 10.6
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221905582)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221905582)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221905582)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221905582)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221905582)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221905582)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221905582)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221905582)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221905582)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221905582)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221905582)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221906153)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221906805)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221906805)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221906805)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221907586)
      [00] 2022-02-22 17:11:48 >> log scanned up to (46276221908255)
      

      Attachments

        Issue Links

          Activity

            Our regression tests (mtr) filter out all output of mariabackup or mariadb-backup.

            I noticed this spamming when removing the log block structure and log block parser in MDEV-14425 for 10.8, but I did not investigate it very deep.

            If this really is a regression in 10.6, then MDEV-21452 could have caused this.

            marko Marko Mäkelä added a comment - Our regression tests ( mtr ) filter out all output of mariabackup or mariadb-backup . I noticed this spamming when removing the log block structure and log block parser in MDEV-14425 for 10.8, but I did not investigate it very deep. If this really is a regression in 10.6, then MDEV-21452 could have caused this.

            In 10.5 polling interval is measured in microseconds:

            static os_thread_ret_t DECLARE_THREAD(log_copying_thread)(void*)                
            {                                                                               
            ...                                                                             
                    for (;;) {                                                              
                            os_event_reset(log_copying_stop);                               
                            os_event_wait_time_low(log_copying_stop,                        
                                                   xtrabackup_log_copy_interval * 1000U,    
                                                   0);                                      
                            if (xtrabackup_copy_logfile()) {                                
                                    break;                                                  
                            }                                                               
            ...                                                                             
                    }                                                                       
            ...                                                                             
            }                                                                               
                                                                                            
            os_event_wait_time_low(                                                         
            /*===================*/                                                         
                    os_event_t      event,                  /*!< in/out: event to wait */   
                    ulint           time_in_usec,           /*!< in: timeout in             
                                                            microseconds, or                
                                                            OS_SYNC_INFINITE_TIME */        
                    int64_t         reset_sig_count)        /*!< in: zero or the value
            

            While in 10.6 the interval is measured in nanoseconds:

            static void log_copying_thread()                                                
            {                                                                               
            ...                                                                             
              while (!xtrabackup_copy_logfile() &&                                          
                     (!metadata_to_lsn || metadata_to_lsn > log_copy_scanned_lsn))          
              {                                                                             
                timespec abstime;                                                           
                set_timespec_nsec(abstime, 1000ULL * xtrabackup_log_copy_interval);         
                mysql_cond_timedwait(&log_copying_stop, &log_sys.mutex, &abstime);          
              }                                                                             
            ...                                                                             
            }
            

            The workaround is to increase --log-copy-interval by a thousand times until it's fixed.

            vlad.lesin Vladislav Lesin added a comment - In 10.5 polling interval is measured in microseconds: static os_thread_ret_t DECLARE_THREAD(log_copying_thread)( void *) { ... for (;;) { os_event_reset(log_copying_stop); os_event_wait_time_low(log_copying_stop, xtrabackup_log_copy_interval * 1000U, 0 ); if (xtrabackup_copy_logfile()) { break ; } ... } ... } os_event_wait_time_low( /*===================*/ os_event_t event, /*!< in/out: event to wait */ ulint time_in_usec, /*!< in: timeout in microseconds, or OS_SYNC_INFINITE_TIME */ int64_t reset_sig_count) /*!< in: zero or the value While in 10.6 the interval is measured in nanoseconds: static void log_copying_thread() { ... while (!xtrabackup_copy_logfile() && (!metadata_to_lsn || metadata_to_lsn > log_copy_scanned_lsn)) { timespec abstime; set_timespec_nsec(abstime, 1000ULL * xtrabackup_log_copy_interval); mysql_cond_timedwait(&log_copying_stop, &log_sys.mutex, &abstime); } ... } The workaround is to increase --log-copy-interval by a thousand times until it's fixed.

            The bug was caused by 38fd7b7d9170.

            vlad.lesin Vladislav Lesin added a comment - The bug was caused by 38fd7b7d9170.

            Thank you for the analysis. I’m glad that it was so trivial after all. I hope that you can figure out some way to test it in mtr.

            marko Marko Mäkelä added a comment - Thank you for the analysis. I’m glad that it was so trivial after all. I hope that you can figure out some way to test it in mtr .

            People

              vlad.lesin Vladislav Lesin
              hcooper Harvey Cooper
              Votes:
              1 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.