Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-27931

A page in innodb_checksum_algorithm=innodb is wrongly claimed to be corrupted

    XMLWordPrintable

Details

    Description

      Hi, all MariaDB developers.

      My team recently performed a successful MariaDB server upgrade, from 10.0.17 to 10.6.7.
      The upgrade procedures is alike:

      • Prepare MariaDB 10.0.17 backup(previously it was backed up by xtrabackup 2.2.9)
        xtrabackup 2.2.9 backup data was totally good.
        dbchecksum for ibdata1 was totally good.
      • Copy prepared backup data into another VM.
      • Upgrade MariaDB to 10.6.7 (mysql_update was utilized) in that VM.

      After a successful upgrade, we can verify that MariaDB 10.6.7 was good about the prepared data, no error log, business level sanity check was totally fine.

      We haven't done CHECK TABLE because we set innodb_file_per_table=1 in my.cnf, ibdata1 should mostly be undo tablespace data(occupied around 200G in size), so I though running CHECK TABLE won't help either.

      But when I run MariaDB 10.6.7 innodbchecksum for the ibdata1, it'll report page1 error. All data upgraded from 10.0.17 to 10.6.7 will report corruption.

      The command I executed for backup the upgraded databased:
      mariabackup --defaults-file=/etc/mysql/my.cnf --socket=/tmp/my.sock --user=xxxxx --password=xxxxxxxx --slave-info --backup --log-innodb-page-corruption --target-dir /tmp/tstempYF2mY9/prep_2022.02.24_093401.414814

      Logs
      (P.S. corrupt page content was excluded intentionally, it contain some sensitive data, if you need that, I may need to do some data redaction)

      [01] 2022-02-24 09:34:01 Copying ibdata1 to /tmp/tstempYF2mY9/prep_2022.02.24_093401.414814/ibdata1
       
      [01] 2022-02-24 09:40:52 Database page corruption detected at page 3994534, retrying...
       
      [01] 2022-02-24 09:40:52 Database page corruption detected at page 3994534, retrying...
       
      [01] 2022-02-24 09:40:52 Database page corruption detected at page 3994534, retrying...
       
      [01] 2022-02-24 09:40:52 Database page corruption detected at page 3994534, retrying...
       
      [01] 2022-02-24 09:40:54 Error: failed to read page after 10 retries. File ibdata1 seems to be corrupted. WARNING!!! The corruption is ignored due to log-innodb-page-corruption option, the backup can contain corrupted data.
       
      2022-02-24 09:40:54 0x7f6663fff700  InnoDB: Assertion failure in file /home/buildbot/buildbot/build/extra/mariabackup/xtrabackup.cc line 804
       
      InnoDB: Failing assertion: p
       
      InnoDB: We intentionally generate a memory trap.
       
      InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
       
      InnoDB: If you get repeated assertion failures or crashes, even
       
      InnoDB: immediately after the mariadbd startup, there may be
       
      InnoDB: corruption in the InnoDB tablespace. Please refer to
       
      InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
       
      InnoDB: about forcing recovery.
       
      220224  9:40:54 [ERROR] mysqld got signal 6 ;
       
      This could be because you hit a bug. It is also possible that this binary
       
      or one of the libraries it was linked against is corrupt, improperly built,
       
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
       
      diagnose the problem, but since we have already crashed,
       
      something is definitely wrong and this may fail.
       
      Server version: 10.6.7-MariaDB
       
      key_buffer_size=0
       
      read_buffer_size=131072
       
      max_used_connections=0
       
      max_threads=1
       
      thread_count=0
       
      It is possible that mysqld could use up to
       
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6133 K  bytes of memory
       
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0
       
      Attempting backtrace. You can use the following information to find out
       
      where mysqld died. If you see no messages after this, something went
       
      terribly wrong...
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442274318)
       
      stack_bottom = 0x0 thread_stack 0x49000
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442274318)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442274318)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442274318)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442274318)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442274318)
       
      addr2line: 'mariabackup': No such file
       
      Printing to addr2line failed
       
      mariabackup(my_print_stacktrace+0x2e)[0x55b01b984b7e]
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442274971)
       
      mariabackup(handle_fatal_signal+0x307)[0x55b01b3b6ef7]
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442275535)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276086)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276086)
       
      sigaction.c:0(__restore_rt)[0x7f666d7d35f0]
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276086)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276086)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276086)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276086)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276086)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276086)  
       
      :0(__GI_raise)[0x7f666cc1f337]  
       
      :0(__GI_abort)[0x7f666cc20a28]
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276086)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276086)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276086)
       
      addr2line: 'mariabackup': No such file
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276086)  
       
      mariabackup(+0x6b3275)[0x55b01afd2275]
       
      mariabackup(+0x6ff34a)[0x55b01b01e34a]        
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276829)
       
      mariabackup(_ZN14CorruptedPages16add_page_no_lockEPKcmjb+0x10a)[0x55b01b02b1aa]
       
      mariabackup(_ZN14CorruptedPages8add_pageEPKcmj+0x37)[0x55b01b02b247]
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276829)
       
      mariabackup(_Z15xb_fil_cur_readP12xb_fil_cur_tR14CorruptedPages+0x307)[0x55b01b0397a7]
       
      mariabackup(+0x708b35)[0x55b01b027b35]
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276829)
       
      mariabackup(+0x708dcc)[0x55b01b027dcc]
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442276829)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442277491)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442278246)
       
      ??:0(std::this_thread::__sleep_for(std::chrono::duration<long, std::ratio<1l, 1l> >, std::chrono::duration<long, std::ratio<1l, 1000000000l> >))[0x7f666d36e070]
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442278898)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442278898)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442279634)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442279634)
       
      pthread_create.c:0(start_thread)[0x7f666d7cbe65] 
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442279634)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442280417)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442280417)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442280417)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442280417)
       
      ??:0(__clone)[0x7f666cce788d]
       
      The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
       
      information that should help you find out what is causing the crash.
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442280417)
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442280417)
       
      Writing a core file...
       
      Working directory at /nutstore/mysql/db4
       
      [00] 2022-02-24 09:40:54 >> log scanned up to (46654442281016)
       
      Resource Limits:
       
      Limit                     Soft Limit           Hard Limit           Units
       
      Max cpu time              unlimited            unlimited            seconds   
       
      Max file size             unlimited            unlimited            bytes
       
      Max data size             unlimited            unlimited            bytes
       
      Max stack size            8388608              unlimited            bytes
       
      Max core file size        0                    unlimited            bytes
       
      Max resident set          unlimited            unlimited            bytes
       
      Max processes             4096                 1030594              processes
       
      Max open files            1024                 4096                 files
       
      Max locked memory         65536                65536                bytes
       
      Max address space         unlimited            unlimited            bytes
       
      Max file locks            unlimited            unlimited            locks
       
      Max pending signals       1030594              1030594              signals
       
      Max msgqueue size         819200               819200               bytes
       
      Max nice priority         0                    0              
       
      Max realtime priority     0                    0
       
      Max realtime timeout      unlimited            unlimited            us
       
      Core pattern: core
      

      Since I add the --log-innodb-page-corruption cli option, I expect page corruption will be logged but not to halt the backup procedure, and after data preparing, the page corruption will be gone, since it happened in unallocated system table space?(need confirmation)

      Attachments

        1. page1.bin
          16 kB
          Marko Mäkelä

        Issue Links

          Activity

            People

              marko Marko Mäkelä
              hcooper Harvey Cooper
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.