Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-31488

Restart on backupped data fails with InnoDB: Failing assertion: !i || prev_id + 1 == space_id at srv0start.cc line 803

Details

    Description

      origin/preview-11.2-preview 79f9e94195dc8a007debd27ec4e8d2c15f094a97 2023-06-13T17:23:05+10:00
      Scenario:
      1. Start source server, generate some initial data
      2. Several sessions run concurrent SQL
      3. During 2. is ongoing
          mariabackup --backup data from source server to some other location
          mariabackup --prepare on that other location
          Try to start a DB server on that other location.
          This fails with
          #  2023-06-13 17:32:44 0 [Note] InnoDB: Initializing buffer pool, total size = 8.000MiB, chunk size = 1.000MiB
         #  2023-06-13 17:32:44 0 [Note] InnoDB: Completed initialization of buffer pool
         #  2023-06-13 17:32:44 0 [ERROR] InnoDB: Inconsistent tablespace ID in file .//undo001
         #  2023-06-13 17:32:44 0 [ERROR] InnoDB: Unable to read first page of file .//undo001
         #  2023-06-13 17:32:44 0 [ERROR] InnoDB: Inconsistent tablespace ID in file .//undo002
         #  2023-06-13 17:32:44 0 [ERROR] InnoDB: Unable to read first page of file .//undo002
         #  2023-06-13 17:32:44 0x7f90afc1f7c0  InnoDB: Assertion failure in file /data/Server/preview-11.2-preview/storage/innobase/srv/srv0start.cc line 803
         #  2023-06-13T17:34:05 [3112262] | InnoDB: Failing assertion: !i || prev_id + 1 == space_id
       
      # 2023-06-13T17:34:05 [3112262] Thread 1 (Thread 0x7f90afc1f7c0 (LWP 3205380)):
      # 2023-06-13T17:34:05 [3112262] #0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
      # 2023-06-13T17:34:05 [3112262] #1  0x0000564dc73f8c9f in my_write_core (sig=sig@entry=6) at /data/Server/preview-11.2-preview/mysys/stacktrace.c:424
      # 2023-06-13T17:34:05 [3112262] #2  0x0000564dc6fa5220 in handle_fatal_signal (sig=6) at /data/Server/preview-11.2-preview/sql/signal_handler.cc:360
      # 2023-06-13T17:34:05 [3112262] #3  <signal handler called>
      # 2023-06-13T17:34:05 [3112262] #4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      # 2023-06-13T17:34:05 [3112262] #5  0x00007f90afc49859 in __GI_abort () at abort.c:79
      # 2023-06-13T17:34:05 [3112262] #6  0x0000564dc6c04db3 in ut_dbg_assertion_failed (expr=expr@entry=0x564dc772be92 "!i || prev_id + 1 == space_id", file=file@entry=0x564dc772b158 "/data/Server/preview-11.2-preview/storage/innobase/srv/srv0start.cc", line=line@entry=803) at /data/Server/preview-11.2-preview/storage/innobase/ut/ut0dbg.cc:60
      # 2023-06-13T17:34:05 [3112262] #7  0x0000564dc6bfe915 in srv_all_undo_tablespaces_open (n_undo=<optimized out>, create_new_undo=<optimized out>) at /data/Server/preview-11.2-preview/storage/innobase/srv/srv0start.cc:803
      # 2023-06-13T17:34:05 [3112262] #8  srv_undo_tablespaces_init (create_new_undo=<optimized out>, mtr=0x7ffc2ee88650) at /data/Server/preview-11.2-preview/storage/innobase/srv/srv0start.cc:877
      # 2023-06-13T17:34:05 [3112262] #9  0x0000564dc6bffaa8 in srv_start (create_new_db=false) at /data/Server/preview-11.2-preview/storage/innobase/srv/srv0start.cc:1422
      # 2023-06-13T17:34:05 [3112262] #10 0x0000564dc720434a in innodb_init (p=<optimized out>) at /data/Server/preview-11.2-preview/storage/innobase/handler/ha_innodb.cc:4168
      # 2023-06-13T17:34:05 [3112262] #11 0x0000564dc6fa80fa in ha_initialize_handlerton (plugin=0x564dca0c2bf0) at /data/Server/preview-11.2-preview/sql/handler.cc:687
      # 2023-06-13T17:34:05 [3112262] #12 0x0000564dc6d7a40e in plugin_initialize (tmp_root=tmp_root@entry=0x7ffc2ee89ab0, plugin=plugin@entry=0x564dca0c2bf0, argc=argc@entry=0x564dc7e759e0 <remaining_argc>, argv=argv@entry=0x564dca00f1c8, options_only=<optimized out>) at /data/Server/preview-11.2-preview/sql/sql_plugin.cc:1465
      # 2023-06-13T17:34:05 [3112262] #13 0x0000564dc6d7b313 in plugin_init (argc=argc@entry=0x564dc7e759e0 <remaining_argc>, argv=<optimized out>, flags=0) at /data/Server/preview-11.2-preview/sql/sql_plugin.cc:1758
      # 2023-06-13T17:34:05 [3112262] #14 0x0000564dc6c6641d in init_server_components () at /data/Server/preview-11.2-preview/sql/mysqld.cc:5246
      # 2023-06-13T17:34:05 [3112262] #15 0x0000564dc6c6b1b7 in mysqld_main (argc=<optimized out>, argv=<optimized out>) at /data/Server/preview-11.2-preview/sql/mysqld.cc:5875
      # 2023-06-13T17:34:05 [3112262] #16 0x00007f90afc4b0b3 in __libc_start_main (main=0x564dc6c239c0 <main(int, char**)>, argc=37, argv=0x7ffc2ee8c9e8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffc2ee8c9d8) at ../csu/libc-start.c:308
      # 2023-06-13T17:34:05 [3112262] #17 0x0000564dc6c6208e in _start () at /data/Server/preview-11.2-preview/include/mysql/plugin.h:215
       
      pluto:/data/results/1686674959/TBR-1927$ gdb -c ./1_clone/data/core /data/Server_bin/preview-11.2-preview_RelWithDebInfo/bin/mysqld
       
      RQG
      ====
      # git clone https://github.com/mleich1/rqg --branch <pick the right branch> RQG
      #
      # GIT_SHOW: HEAD -> master, origin/master, origin/HEAD a8c8cd1021480e64ef979d326a2543c577d1068a 2023-06-09T18:22:34+02:00
      # rqg.pl  : Version 4.2.1 (2022-12)
      #
      # $RQG_HOME/rqg.pl \
      # --grammar=conf/mariadb/table_stress_innodb_dml.yy \
      # --gendata=conf/mariadb/table_stress.zz \
      # --gendata_sql=conf/mariadb/table_stress.sql \
      # --reporters=Mariabackup_linux \
      # --mysqld=--loose-innodb-log-file-size=200M \
      # --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \
      # --mysqld=--loose-idle_write_transaction_timeout=0 \
      # --mysqld=--loose-idle_transaction_timeout=0 \
      # --mysqld=--loose-idle_readonly_transaction_timeout=0 \
      # --mysqld=--connect_timeout=60 \
      # --mysqld=--interactive_timeout=28800 \
      # --mysqld=--slave_net_timeout=60 \
      # --mysqld=--net_read_timeout=30 \
      # --mysqld=--net_write_timeout=60 \
      # --mysqld=--loose-table_lock_wait_timeout=50 \
      # --mysqld=--wait_timeout=28800 \
      # --no_mask \
      # --queries=10000000 \
      # --seed=random \
      # --reporters=None \
      # --reporters=ErrorLog \
      # --reporters=Deadlock \
      # --validators=None \
      # --mysqld=--log_output=none \
      # --mysqld=--log_bin_trust_function_creators=1 \
      # --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
      # --engine=InnoDB \
      # --restart_timeout=240 \
      # --mysqld=--plugin-load-add=file_key_management.so \
      # --mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
      # --mysqld=--plugin-load-add=provider_lzo.so \
      # --mysqld=--plugin-load-add=provider_bzip2.so \
      # --mysqld=--plugin-load-add=provider_lzma.so \
      # --mysqld=--plugin-load-add=provider_snappy.so \
      # --mysqld=--plugin-load-add=provider_lz4.so \
      # --mysqld=--loose-innodb_compression_level=1 \
      # --duration=300 \
      # --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
      # --mysqld=--lock-wait-timeout=15 \
      # --mysqld=--innodb-lock-wait-timeout=10 \
      # --mysqld=--loose-innodb_fast_shutdown=0 \
      # --mysqld=--innodb_file_per_table=0 \
      # --mysqld=--loose-innodb_read_only_compressed=OFF \
      # --mysqld=--innodb_stats_persistent=on \
      # --mysqld=--innodb_adaptive_hash_index=on \
      # --mysqld=--innodb_random_read_ahead=OFF \
      # --mysqld=--log-bin \
      # --mysqld=--sync-binlog=1 \
      # --mysqld=--loose-innodb_evict_tables_on_commit_debug=on \
      # --mysqld=--loose-max-statement-time=30 \
      # --threads=33 \
      # --mysqld=--innodb_use_native_aio=1 \
      # --mysqld=--innodb_undo_log_truncate=ON \
      # --mysqld=--innodb_undo_tablespaces=3 \
      # --mysqld=--innodb_rollback_on_timeout=OFF \
      # --mysqld=--innodb_page_size=16K \
      # --mysqld=--innodb-buffer-pool-size=8M \
      # <local settings>
      

      Attachments

        Issue Links

          Activity

            alexcool Alex added a comment -

            10.11.4 same problem

            alexcool Alex added a comment - 10.11.4 same problem

            MariaDB Server 10.11 was the first version to include MDEV-19229.

            marko Marko Mäkelä added a comment - MariaDB Server 10.11 was the first version to include MDEV-19229 .

            thiru, can you debug this?

            ssh pluto
            gdb -c /data/results/1686674959/TBR-1927/1_clone/data/core /data/Server_bin/preview-11.2-preview_RelWithDebInfo/bin/mysqld
            

            (gdb) frame 7
            #7  0x0000564dc6bfe915 in srv_all_undo_tablespaces_open (
                n_undo=<optimized out>, create_new_undo=<optimized out>)
                at /data/Server/preview-11.2-preview/storage/innobase/srv/srv0start.cc:803
            803	    ut_a(!i || prev_id + 1 == space_id);
            (gdb) info locals
            space_id = 37
            i = 1
            prev_id = 37

            marko Marko Mäkelä added a comment - thiru , can you debug this? ssh pluto gdb -c /data/results/1686674959/TBR-1927/1_clone/data/core /data/Server_bin/preview-11.2-preview_RelWithDebInfo/bin/mysqld (gdb) frame 7 #7 0x0000564dc6bfe915 in srv_all_undo_tablespaces_open ( n_undo=<optimized out>, create_new_undo=<optimized out>) at /data/Server/preview-11.2-preview/storage/innobase/srv/srv0start.cc:803 803 ut_a(!i || prev_id + 1 == space_id); (gdb) info locals … space_id = 37 i = 1 prev_id = 37 …

            PAGE_INIT redo for page0 wasn't applied for undo tablespace. This is because we remove the redo log records during trim().

            This problem should be fixable by MDEV-31487 patch

            thiru Thirunarayanan Balathandayuthapani added a comment - PAGE_INIT redo for page0 wasn't applied for undo tablespace. This is because we remove the redo log records during trim(). This problem should be fixable by MDEV-31487 patch

            Failing assertion: !i || prev_id + 1 == space_id  was only seen once (> 10000 RQG runs with Mariabackup on the tree).
            And that was on origin/preview-11.2-preview 7c7de020b576a42bd381a9060c3076f85da0d9d4 2023-06-18T23:49:17+02:00
            That problem was not observed during the excessive testing
            - 11.2 2867894ac6ca23fc2f82cfad2dd510351f597325 2023-06-28T09:39:28+03:00
            - origin/bb-11.2-MDEV-14795 f4ae37836c3a6632a4284be853675bcbe3f565d5 2023-06-28T15:48:11+05:30
            But with the low likelihood to replay it given its impossible to tell its reason is now somehow fixed.
            

            mleich Matthias Leich added a comment - Failing assertion: !i || prev_id + 1 == space_id was only seen once (> 10000 RQG runs with Mariabackup on the tree). And that was on origin/preview-11.2-preview 7c7de020b576a42bd381a9060c3076f85da0d9d4 2023-06-18T23:49:17+02:00 That problem was not observed during the excessive testing - 11.2 2867894ac6ca23fc2f82cfad2dd510351f597325 2023-06-28T09:39:28+03:00 - origin/bb-11.2-MDEV-14795 f4ae37836c3a6632a4284be853675bcbe3f565d5 2023-06-28T15:48:11+05:30 But with the low likelihood to replay it given its impossible to tell its reason is now somehow fixed.

            thiru was able to successfully recover the backup after applying the fix of MDEV-31487. Therefore, I think that we can conclude that this harder-to-reproduce symptom is actually a duplicate of the easier-to-reproduce bug MDEV-31487.

            marko Marko Mäkelä added a comment - thiru was able to successfully recover the backup after applying the fix of MDEV-31487 . Therefore, I think that we can conclude that this harder-to-reproduce symptom is actually a duplicate of the easier-to-reproduce bug MDEV-31487 .

            People

              thiru Thirunarayanan Balathandayuthapani
              mleich Matthias Leich
              Votes:
              2 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.