Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32120

[ERROR] InnoDB: Expected tablespace id 1525 but found 1537 in the file ./test/t1.ibd during mariabackup --backup

Details

    Description

      origin/HEAD, origin/11.3, 11.3 8ad1e26b1bafa4ed9928306efc10c047f2274108 2023-09-04T09:36:44+04:00
       
      Two sessions run concurrent a DDL/DML mix.
      During that is ongoing mariabackup --backup runs and fails.
      Snip from the mariabackup output
      # 2023-09-06T16:59:05 [3561544] | [rr 3628106 1781][00] 2023-09-06 16:59:04 mariabackup: Generating a list of tablespaces
      # 2023-09-06T16:59:05 [3561544] | [rr 3628106 2535]2023-09-06 16:59:05 0 [ERROR] InnoDB: Expected tablespace id 1525 but found 1537 in the file ./test/t1.ibd
      # 2023-09-06T16:59:05 [3561544] | [rr 3628106 2544][00] FATAL ERROR: 2023-09-06 16:59:05 Failed to validate first page of the file test/t1, error 62
       
      sdp:/data1/results/1694027773/TBR-1350$ _RR_TRACE_DIR=./1_clone/rr rr replay --mark-stdio
       
      RQG
      ===
      # git clone https://github.com/mleich1/rqg --branch <pick the right branch> RQG
      #
      # GIT_SHOW: HEAD -> master, origin/master, origin/HEAD d24e1493fb180e50a6aa37006bfb613ef8e35861 2023-09-06T15:46:05+02:00
      # rqg.pl  : Version 4.4.0 (2023-08)
      #
      # $RQG_HOME/rqg.pl \
      # --grammar=conf/runtime/alter_online.yy \
      # --gendata=conf/runtime/alter_online.zz \
      # --reporters=Mariabackup_linux \
      # --mysqld=--loose-innodb-log-file-size=200M \
      # --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \
      # --mysqld=--loose-idle_write_transaction_timeout=0 \
      # --mysqld=--loose-idle_transaction_timeout=0 \
      # --mysqld=--loose-idle_readonly_transaction_timeout=0 \
      # --mysqld=--connect_timeout=60 \
      # --mysqld=--interactive_timeout=28800 \
      # --mysqld=--slave_net_timeout=60 \
      # --mysqld=--net_read_timeout=30 \
      # --mysqld=--net_write_timeout=60 \
      # --mysqld=--loose-table_lock_wait_timeout=50 \
      # --mysqld=--wait_timeout=28800 \
      # --no_mask \
      # --queries=10000000 \
      # --seed=random \
      # --reporters=None \
      # --reporters=ErrorLog \
      # --reporters=Deadlock \
      # --validators=None \
      # --mysqld=--log_output=none \
      # --mysqld=--log_bin_trust_function_creators=1 \
      # --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
      # --engine=InnoDB \
      # --restart_timeout=240 \
      # --mysqld=--plugin-load-add=file_key_management.so \
      # --mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
      # --mysqld=--plugin-load-add=provider_lzo.so \
      # --mysqld=--plugin-load-add=provider_bzip2.so \
      # --mysqld=--plugin-load-add=provider_lzma.so \
      # --mysqld=--plugin-load-add=provider_snappy.so \
      # --mysqld=--plugin-load-add=provider_lz4.so \
      # --mysqld=--loose-innodb_compression_level=1 \
      # --duration=300 \
      # --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
      # --mysqld=--lock-wait-timeout=15 \
      # --mysqld=--innodb-lock-wait-timeout=10 \
      # --mysqld=--loose-innodb_fast_shutdown=0 \
      # --mysqld=--sql_mode=traditional \
      # --mysqld=--loose-innodb_read_only_compressed=OFF \
      # --mysqld=--innodb_stats_persistent=off \
      # --mysqld=--innodb_adaptive_hash_index=off \
      # --mysqld=--innodb_random_read_ahead=ON \
      # --mysqld=--innodb_read_ahead_threshold=0 \
      # --mysqld=--loose-innodb_evict_tables_on_commit_debug=off \
      # --mysqld=--loose-max-statement-time=30 \
      # --threads=2 \
      # --mysqld=--innodb_use_native_aio=1 \
      # --mysqld=--innodb_undo_log_truncate=OFF \
      # --mysqld=--loose_innodb_change_buffering=changes \
      # --mysqld=--innodb_undo_tablespaces=3 \
      # --mysqld=--innodb_rollback_on_timeout=OFF \
      # --mysqld=--innodb_page_size=32K \
      # --mysqld=--innodb-buffer-pool-size=24M \
      # <local settings>
      

      Attachments

        Issue Links

          Activity

            This is somewhat similar to MDEV-33011.

            marko Marko Mäkelä added a comment - This is somewhat similar to MDEV-33011 .

            We only have an rr replay trace of mariadb-backup --backup, and nothing on the server that was running concurrently.

            This seems to be a race condition within xb_load_tablespaces(). The file test/t1.ibd was first read here, as file descriptor 12:

            #0  __libc_pread64 (fd=12, buf=0x55863e850000, count=65536, offset=0) at ../sysdeps/unix/sysv/linux/pread64.c:24
            #1  0x000055863cf18e6b in os_file_io (in_type=<optimized out>, file=<optimized out>, buf=<optimized out>, n=65536, offset=0, err=0x7ffe13e2b30c) at /data/Server/11.3/storage/innobase/os/os0file.cc:2715
            #2  0x000055863cf19976 in os_file_pread (err=0x7ffe13e2b30c, offset=0, n=65536, buf=<optimized out>, file=<optimized out>, 
                  type=@0x55863d363400: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_MAYBE_PARTIAL}) at /data/Server/11.3/storage/innobase/os/os0file.cc:2873
            #3  os_file_read_func (type=@0x55863d363400: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_MAYBE_PARTIAL}, file=<optimized out>, buf=<optimized out>, offset=offset@entry=0, n=65536, 
                o=o@entry=0x7ffe13e2b358) at /data/Server/11.3/storage/innobase/os/os0file.cc:2903
            #4  0x000055863ce9e55b in Datafile::read_first_page (this=0x55863e6ec590, read_only_mode=<optimized out>) at /data/Server/11.3/storage/innobase/include/os0file.h:96
            #5  0x000055863ce9ea05 in Datafile::validate_first_page (this=this@entry=0x55863e6ec590) at /data/Server/11.3/storage/innobase/fsp/fsp0file.cc:479
            #6  0x000055863c8ac6d2 in xb_load_single_table_tablespace (dirname=<optimized out>, filname=<optimized out>, is_remote=<optimized out>, skip_node_page0=<optimized out>, defer_space_id=0)
                at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:3561
            #7  0x000055863c8aac49 in enumerate_ibd_files (callback=callback@entry=0x55863c8ac4f0 <xb_load_single_table_tablespace(char const*, char const*, bool, bool, uint32_t)>)
                at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:3956
            #8  0x000055863c8aad4a in xb_load_tablespaces () at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:4056
            #9  xb_load_tablespaces () at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:4012
            #10 0x000055863c8b4c9f in xtrabackup_backup_func () at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:4772
            #11 main_low (argv=0x55863e5f4170) at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:7028
            #12 0x000055863c863d94 in main (argc=<optimized out>, argv=<optimized out>) at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:6828
            

            We created a fil_space_t object 0x55863e6ecf80 based on the contents of the page, with the tablespace ID 1525. Then, for this tablespace, we reopen the file as file descriptor 17 and reread the first page:

            #0  pread64 (__offset=<optimized out>, __nbytes=<optimized out>, __buf=<optimized out>, __fd=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/unistd.h:82
            #1  SyncFileIO::execute (request=@0x7ffe13e2b050: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_SYNC}, this=<optimized out>) at /data/Server/11.3/storage/innobase/os/os0file.cc:695
            #2  SyncFileIO::execute (this=<optimized out>, request=@0x7ffe13e2b050: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_SYNC}) at /data/Server/11.3/storage/innobase/os/os0file.cc:687
            #3  0x000055863cf18e6b in os_file_io (in_type=<optimized out>, file=<optimized out>, buf=<optimized out>, n=32768, offset=0, err=0x7ffe13e2b25c) at /data/Server/11.3/storage/innobase/os/os0file.cc:2715
            #4  0x000055863cf19976 in os_file_pread (err=0x7ffe13e2b25c, offset=0, n=32768, buf=0x8000, file=<optimized out>, type=@0x55863d373540: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_SYNC})
                at /data/Server/11.3/storage/innobase/os/os0file.cc:2873
            #5  os_file_read_func (type=@0x55863d373540: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_SYNC}, file=<optimized out>, buf=buf@entry=0x55863e838000, offset=offset@entry=0, n=n@entry=32768, 
                o=o@entry=0x0) at /data/Server/11.3/storage/innobase/os/os0file.cc:2903
            #6  0x000055863cf1ae13 in fil_node_t::read_page0 (this=this@entry=0x55863e6ed0b0) at /data/Server/11.3/storage/innobase/include/os0file.h:96
            #7  0x000055863ce8e824 in fil_node_open_file_low (node=0x55863e6ed0b0) at /data/Server/11.3/storage/innobase/fil/fil0fil.cc:393
            #8  0x000055863ce8ed88 in fil_space_t::read_page0 (this=0x55863e6ecf80) at /data/Server/11.3/storage/innobase/fil/fil0fil.cc:1142
            #9  fil_space_t::read_page0 (this=this@entry=0x55863e6ecf80) at /data/Server/11.3/storage/innobase/fil/fil0fil.cc:1125
            #10 0x000055863c8aca21 in xb_load_single_table_tablespace (dirname=<optimized out>, filname=<optimized out>, is_remote=<optimized out>, skip_node_page0=<optimized out>, defer_space_id=<optimized out>)
                at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:3603
            #11 0x000055863c8aac49 in enumerate_ibd_files (callback=callback@entry=0x55863c8ac4f0 <xb_load_single_table_tablespace(char const*, char const*, bool, bool, uint32_t)>)
                at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:3956
            #12 0x000055863c8aad4a in xb_load_tablespaces () at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:4056
            #13 xb_load_tablespaces () at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:4012
            #14 0x000055863c8b4c9f in xtrabackup_backup_func () at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:4772
            #15 main_low (argv=0x55863e5f4170) at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:7028
            #16 0x000055863c863d94 in main (argc=<optimized out>, argv=<optimized out>) at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:6828
            

            Between these two page read operations, the server had renamed the file and created a new one. If we had retained an open file handle and not closed it in between, there would have been no discrepancy. Based on FILE_ records in the redo log, we should be able to figure out the file creation and renaming for the current test/t1.ibd file.

            In MDEV-25909 I suggested that we should simplify this code. The Datafile class is basically duplicating some of fil_node_t and Tablespace is duplicating fil_space_t.

            marko Marko Mäkelä added a comment - We only have an rr replay trace of mariadb-backup --backup , and nothing on the server that was running concurrently. This seems to be a race condition within xb_load_tablespaces() . The file test/t1.ibd was first read here, as file descriptor 12: #0 __libc_pread64 (fd=12, buf=0x55863e850000, count=65536, offset=0) at ../sysdeps/unix/sysv/linux/pread64.c:24 #1 0x000055863cf18e6b in os_file_io (in_type=<optimized out>, file=<optimized out>, buf=<optimized out>, n=65536, offset=0, err=0x7ffe13e2b30c) at /data/Server/11.3/storage/innobase/os/os0file.cc:2715 #2 0x000055863cf19976 in os_file_pread (err=0x7ffe13e2b30c, offset=0, n=65536, buf=<optimized out>, file=<optimized out>, type=@0x55863d363400: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_MAYBE_PARTIAL}) at /data/Server/11.3/storage/innobase/os/os0file.cc:2873 #3 os_file_read_func (type=@0x55863d363400: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_MAYBE_PARTIAL}, file=<optimized out>, buf=<optimized out>, offset=offset@entry=0, n=65536, o=o@entry=0x7ffe13e2b358) at /data/Server/11.3/storage/innobase/os/os0file.cc:2903 #4 0x000055863ce9e55b in Datafile::read_first_page (this=0x55863e6ec590, read_only_mode=<optimized out>) at /data/Server/11.3/storage/innobase/include/os0file.h:96 #5 0x000055863ce9ea05 in Datafile::validate_first_page (this=this@entry=0x55863e6ec590) at /data/Server/11.3/storage/innobase/fsp/fsp0file.cc:479 #6 0x000055863c8ac6d2 in xb_load_single_table_tablespace (dirname=<optimized out>, filname=<optimized out>, is_remote=<optimized out>, skip_node_page0=<optimized out>, defer_space_id=0) at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:3561 #7 0x000055863c8aac49 in enumerate_ibd_files (callback=callback@entry=0x55863c8ac4f0 <xb_load_single_table_tablespace(char const*, char const*, bool, bool, uint32_t)>) at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:3956 #8 0x000055863c8aad4a in xb_load_tablespaces () at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:4056 #9 xb_load_tablespaces () at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:4012 #10 0x000055863c8b4c9f in xtrabackup_backup_func () at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:4772 #11 main_low (argv=0x55863e5f4170) at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:7028 #12 0x000055863c863d94 in main (argc=<optimized out>, argv=<optimized out>) at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:6828 We created a fil_space_t object 0x55863e6ecf80 based on the contents of the page, with the tablespace ID 1525. Then, for this tablespace, we reopen the file as file descriptor 17 and reread the first page: #0 pread64 (__offset=<optimized out>, __nbytes=<optimized out>, __buf=<optimized out>, __fd=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/unistd.h:82 #1 SyncFileIO::execute (request=@0x7ffe13e2b050: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_SYNC}, this=<optimized out>) at /data/Server/11.3/storage/innobase/os/os0file.cc:695 #2 SyncFileIO::execute (this=<optimized out>, request=@0x7ffe13e2b050: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_SYNC}) at /data/Server/11.3/storage/innobase/os/os0file.cc:687 #3 0x000055863cf18e6b in os_file_io (in_type=<optimized out>, file=<optimized out>, buf=<optimized out>, n=32768, offset=0, err=0x7ffe13e2b25c) at /data/Server/11.3/storage/innobase/os/os0file.cc:2715 #4 0x000055863cf19976 in os_file_pread (err=0x7ffe13e2b25c, offset=0, n=32768, buf=0x8000, file=<optimized out>, type=@0x55863d373540: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_SYNC}) at /data/Server/11.3/storage/innobase/os/os0file.cc:2873 #5 os_file_read_func (type=@0x55863d373540: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_SYNC}, file=<optimized out>, buf=buf@entry=0x55863e838000, offset=offset@entry=0, n=n@entry=32768, o=o@entry=0x0) at /data/Server/11.3/storage/innobase/os/os0file.cc:2903 #6 0x000055863cf1ae13 in fil_node_t::read_page0 (this=this@entry=0x55863e6ed0b0) at /data/Server/11.3/storage/innobase/include/os0file.h:96 #7 0x000055863ce8e824 in fil_node_open_file_low (node=0x55863e6ed0b0) at /data/Server/11.3/storage/innobase/fil/fil0fil.cc:393 #8 0x000055863ce8ed88 in fil_space_t::read_page0 (this=0x55863e6ecf80) at /data/Server/11.3/storage/innobase/fil/fil0fil.cc:1142 #9 fil_space_t::read_page0 (this=this@entry=0x55863e6ecf80) at /data/Server/11.3/storage/innobase/fil/fil0fil.cc:1125 #10 0x000055863c8aca21 in xb_load_single_table_tablespace (dirname=<optimized out>, filname=<optimized out>, is_remote=<optimized out>, skip_node_page0=<optimized out>, defer_space_id=<optimized out>) at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:3603 #11 0x000055863c8aac49 in enumerate_ibd_files (callback=callback@entry=0x55863c8ac4f0 <xb_load_single_table_tablespace(char const*, char const*, bool, bool, uint32_t)>) at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:3956 #12 0x000055863c8aad4a in xb_load_tablespaces () at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:4056 #13 xb_load_tablespaces () at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:4012 #14 0x000055863c8b4c9f in xtrabackup_backup_func () at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:4772 #15 main_low (argv=0x55863e5f4170) at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:7028 #16 0x000055863c863d94 in main (argc=<optimized out>, argv=<optimized out>) at /data/Server/11.3/extra/mariabackup/xtrabackup.cc:6828 Between these two page read operations, the server had renamed the file and created a new one. If we had retained an open file handle and not closed it in between, there would have been no discrepancy. Based on FILE_ records in the redo log, we should be able to figure out the file creation and renaming for the current test/t1.ibd file. In MDEV-25909 I suggested that we should simplify this code. The Datafile class is basically duplicating some of fil_node_t and Tablespace is duplicating fil_space_t .

            People

              thiru Thirunarayanan Balathandayuthapani
              mleich Matthias Leich
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.