Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-27836

Assertion `n_fields <= ulint(index->n_fields) + 1' failed

Details

    Description

      origin/10.8 3553cfd0254f69f2fbbe3c32601f2a3e2c606ba3 2022-02-14T09:43:29+02:00
       
      [rr 541663 5930]2022-02-14 13:31:46 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=49871
      [rr 541663 7153]mysqld: /data/Server/10.8D/storage/innobase/rem/rem0rec.cc:304: void rec_init_offsets_comp_ordinary(const rec_t*, const dict_index_t*, rec_offs*, ulint, const dict_col_t::def_t*, rec_leaf_format) [with bool mblob = true; bool redundant_temp = false; rec_t = unsigned char; rec_offs = short unsigned int; ulint = long unsigned int]: Assertion `n_fields <= ulint(index->n_fields) + 1' failed.   (error pattern TBR-1362)
      (rr) bt
      #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      #1  0x00006a6d4e2cd859 in __GI_abort () at abort.c:79
      #2  0x00006a6d4e2cd729 in __assert_fail_base (fmt=0x6a6d4e463588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55f9ae26f700 "n_fields <= ulint(index->n_fields) + 1", 
          file=0x55f9ae26b3a0 "/data/Server/10.8D/storage/innobase/rem/rem0rec.cc", line=304, function=<optimized out>) at assert.c:92
      #3  0x00006a6d4e2def36 in __GI___assert_fail (assertion=0x55f9ae26f700 "n_fields <= ulint(index->n_fields) + 1", file=0x55f9ae26b3a0 "/data/Server/10.8D/storage/innobase/rem/rem0rec.cc", line=304, 
          function=0x55f9ae26f540 "void rec_init_offsets_comp_ordinary(const rec_t*, const dict_index_t*, rec_offs*, ulint, const dict_col_t::def_t*, rec_leaf_format) [with bool mblob = true; bool redundant_temp = false; rec_t = unsign"...) at assert.c:101
      #4  0x000055f9acfbc8ca in rec_init_offsets_comp_ordinary<true> (rec=0x11ea567e00e5 "\200", index=0x61600034de08, offsets=0x620000001108, n_core=5, def_val=0x0, format=REC_LEAF_INSTANT)
          at /data/Server/10.8D/storage/innobase/rem/rem0rec.cc:304
      #5  0x000055f9acfaf8e0 in rec_get_offsets_func (rec=0x11ea567e00e5 "\200", index=0x61600034de08, offsets=0x620000001108, n_core=5, n_fields=18446744073709551615, 
          file=0x55f9ae3b7660 "/data/Server/10.8D/storage/innobase/btr/btr0cur.cc", line=586, heap=0x7fff3f678e70) at /data/Server/10.8D/storage/innobase/rem/rem0rec.cc:943
      #6  0x000055f9ad223f42 in btr_cur_instant_init_low (index=0x61600034de08, mtr=0x7fff3f679570) at /data/Server/10.8D/storage/innobase/btr/btr0cur.cc:586
      #7  0x000055f9ad224bef in btr_cur_instant_init (table=0x618000002d08) at /data/Server/10.8D/storage/innobase/btr/btr0cur.cc:656
      #8  0x000055f9ad347d59 in dict_load_table_one (name=..., ignore_err=DICT_ERR_IGNORE_RECOVER_LOCK, fk_tables=std::deque with 0 elements) at /data/Server/10.8D/storage/innobase/dict/dict0load.cc:2391
      #9  0x000055f9ad348a88 in dict_sys_t::load_table (this=0x55f9af375b40 <dict_sys>, name=..., ignore=DICT_ERR_IGNORE_RECOVER_LOCK) at /data/Server/10.8D/storage/innobase/dict/dict0load.cc:2465
      #10 0x000055f9ad349385 in dict_load_table_on_id (table_id=20, ignore_err=DICT_ERR_IGNORE_RECOVER_LOCK) at /data/Server/10.8D/storage/innobase/dict/dict0load.cc:2553
      #11 0x000055f9ad30f6b5 in dict_table_open_on_id (table_id=20, dict_locked=false, table_op=DICT_TABLE_OP_LOAD_TABLESPACE, thd=0x0, mdl=0x0) at /data/Server/10.8D/storage/innobase/dict/dict0dict.cc:851
      #12 0x000055f9ad19fb0b in trx_resurrect_table_locks (trx=0x5d870634e9a8, undo=0x6110000084c0) at /data/Server/10.8D/storage/innobase/trx/trx0trx.cc:602
      #13 0x000055f9ad1a0a7a in trx_resurrect (undo=0x6110000084c0, rseg=0x55f9af2dc800 <trx_sys+25472>, start_time=1644845508, start_time_micro=2426016653853, rows_to_undo=0x7fff3f67bef0)
          at /data/Server/10.8D/storage/innobase/trx/trx0trx.cc:685
      #14 0x000055f9ad1a11d1 in trx_lists_init_at_db_start () at /data/Server/10.8D/storage/innobase/trx/trx0trx.cc:735
      #15 0x000055f9ad133bc6 in srv_start (create_new_db=false) at /data/Server/10.8D/storage/innobase/srv/srv0start.cc:1226
      #16 0x000055f9accb82dd in innodb_init (p=0x615000003298) at /data/Server/10.8D/storage/innobase/handler/ha_innodb.cc:4203
      #17 0x000055f9ac42677f in ha_initialize_handlerton (plugin=0x621000037268) at /data/Server/10.8D/sql/handler.cc:649
      #18 0x000055f9abc6b0f9 in plugin_initialize (tmp_root=0x7fff3f67f2d0, plugin=0x621000037268, argc=0x55f9af6c57a0 <remaining_argc>, argv=0x6160000003f8, options_only=false) at /data/Server/10.8D/sql/sql_plugin.cc:1462
      #19 0x000055f9abc6ce5e in plugin_init (argc=0x55f9af6c57a0 <remaining_argc>, argv=0x6160000003f8, flags=0) at /data/Server/10.8D/sql/sql_plugin.cc:1755
      #20 0x000055f9ab8adad1 in init_server_components () at /data/Server/10.8D/sql/mysqld.cc:5088
      #21 0x000055f9ab8af64a in mysqld_main (argc=44, argv=0x6160000003f8) at /data/Server/10.8D/sql/mysqld.cc:5704
      #22 0x000055f9ab89df0d in main (argc=43, argv=0x7fff3f681af8) at /data/Server/10.8D/sql/main.cc:34
      (rr)
       
      pluto:/data/results/1644845348/TBR-1310/dev/shm/rqg/1644845348/99/1/rr
      _RR_TRACE_DIR="." rr replay --mark-stdio mysqld-1
          # Fate of server till the process was killed
      _RR_TRACE_DIR="." rr replay --mark-stdio mysqld-2
          # Restart attempt
       
      RQG
      ===
      # git clone https://github.com/mleich1/rqg --branch experimental RQG
      #
      # GIT_SHOW: HEAD -> experimental, origin/experimental ecc3543d118097ad1bf311c35d7910bb3352d9c7 2022-02-11T11:25:50+01:00
      # rqg.pl  : Version 4.0.4 (2021-12)
      #
      # $RQG_HOME/rqg.pl \
      # --duration=100 \
      # --queries=10000000 \
      # --no_mask \
      # --seed=random \
      # --rpl_mode=none \
      # --engine=InnoDB \
      # --max_gd_duration=1200 \
      # --gendata=conf/mariadb/table_stress.zz \
      # --gendata_sql=conf/mariadb/table_stress.sql \
      # --mysqld=--plugin-load-add=file_key_management.so \
      # --mysqld=--slave_net_timeout=60 \
      # --mysqld=--loose-table_lock_wait_timeout=50 \
      # --mysqld=--loose-plugin-load-add=provider_lz4.so \
      # --mysqld=--wait_timeout=28800 \
      # --mysqld=--loose-innodb_evict_tables_on_commit_debug=ON \
      # --mysqld=--lock-wait-timeout=86400 \
      # --mysqld=--net_write_timeout=60 \
      # --mysqld=--loose-idle_readonly_transaction_timeout=0 \
      # --mysqld=--interactive_timeout=28800 \
      # --mysqld=--log_bin_trust_function_creators=1 \
      # --mysqld=--loose-max-statement-time=30 \
      # --mysqld=--connect_timeout=60 \
      # --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
      # --mysqld=--innodb-lock-wait-timeout=50 \
      # --mysqld=--log-bin \
      # --mysqld=--net_read_timeout=30 \
      # --mysqld=--loose_innodb_lock_schedule_algorithm=fcfs \
      # --mysqld=--sync-binlog=1 \
      # --mysqld=--file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
      # --mysqld=--innodb-buffer-pool-size=256M \
      # --mysqld=--loose-idle_transaction_timeout=0 \
      # --mysqld=--loose-idle_write_transaction_timeout=0 \
      # --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
      # --mysqld=--log-output=none \
      # --mysqld=--loose-innodb_read_only_compressed=OFF \
      # --mysqld=--innodb_page_size=8K \
      # --reporters=Backtrace,CrashRecovery1,Deadlock1,ErrorLog \
      # --validators=None \
      # --grammar=TBR-1362-mic.yy \
      # --threads=2 \
      # <local settings>
       
      The test replays sometimes
      - Assertion `!table || (ignore_err & ~DICT_ERR_IGNORE_FK_NOKEY) || !table->is_readable() || !table->corrupted' failed  (RQG error pattern TBR-1322)
      - Assertion `n_fields <= ulint(index->n_fields) + 1' failed (error pattern TBR-1310)
      - MDEV-27798
      
      

      Attachments

        1. simp_TBR-1362.cfg
          46 kB
          Matthias Leich
        2. TBR-1362-mic.yy
          1 kB
          Matthias Leich

        Issue Links

          Activity

            The table contains a huge number of instantly dropped and added columns because of the following in TBR-1362-mic.yy:

            ALTER IGNORE TABLE table_names ADD COLUMN col_int_copy INTEGER;
            ALTER IGNORE TABLE table_names DROP COLUMN col_int;
            ALTER IGNORE TABLE table_names CHANGE COLUMN col_int_copy col_int INTEGER;
            

            The IGNORE will be ignored; it only matters for things like ADD UNIQUE INDEX (in which case InnoDB would refuse native operation and force ALGORITHM=COPY).

            We have n_fields=817=n_core_fields+1+811 and index->n_fields=815. The index fields are (col1,DB_TRX_ID,DB_ROLL_PTR,col2,(lots of instantly dropped columns),col_int_copy).
            The 811=rec_get_n_add_field() was read from the record header. The metadata record is located in the index root page, at byte offset 0xe5 from the start. The number of instantly added fields is encoded in the 2 bytes at offset 0xde, containing 0x06ab. Those header bytes are there already at the time when the server was killed. The server was killed before btr_cur_pessimistic_update() returned during the following:

            #1  0x0000564aa7c0f555 in btr_page_reorganize_low (cursor=0x4d1c51d95d58, index=0x61600200d708, mtr=0x4d1c51d95e80) at /data/Server/10.8D/storage/innobase/btr/btr0btr.cc:1293
            #2  0x0000564aa7c12d53 in btr_page_reorganize (cursor=0x4d1c51d95d58, index=0x61600200d708, mtr=0x4d1c51d95e80) at /data/Server/10.8D/storage/innobase/btr/btr0btr.cc:1548
            #3  0x0000564aa7c702c5 in btr_cur_pessimistic_update (flags=10, cursor=0x4d1c51d95d50, offsets=0x4d1c51d95cb0, offsets_heap=0x4d1c51d95cd0, entry_heap=0x61900027f180, big_rec=0x4d1c51d95cf0, 
                update=0x62d000bfe488, cmpl_info=1, thr=0x6250001ed0a8, trx_id=10594, mtr=0x4d1c51d95e80) at /data/Server/10.8D/storage/innobase/btr/btr0cur.cc:5122
            #4  0x0000564aa778d13e in innobase_instant_try (ha_alter_info=0x4d1c51d97c40, ctx=0x629000e2ad80, altered_table=0x4d1c51d98340, table=0x619000285a98, trx=0x1f8e08bb1e88)
                at /data/Server/10.8D/storage/innobase/handler/handler0alter.cc:5914
            #5  0x0000564aa77de9d2 in commit_try_norebuild (ha_alter_info=0x4d1c51d97c40, ctx=0x629000e2ad80, altered_table=0x4d1c51d98340, old_table=0x619000285a98, trx=0x1f8e08bb1e88, table_name=0x61b00008805d "t1")
                at /data/Server/10.8D/storage/innobase/handler/handler0alter.cc:10408
            #6  0x0000564aa77b6d0e in ha_innobase::commit_inplace_alter_table (this=0x61d0006216b8, altered_table=0x4d1c51d98340, ha_alter_info=0x4d1c51d97c40, commit=true)
                at /data/Server/10.8D/storage/innobase/handler/handler0alter.cc:11086
            …
            #12 0x0000564aa664a37d in mysql_parse (thd=0x62b0001dc218, rawbuf=0x629000e29238 "ALTER IGNORE TABLE t1 DROP COLUMN col_int  /* E_R Thread2 QNO 2437 CON_ID 20 */", length=79, parser_state=0x4d1c51d9bb20)
                at /data/Server/10.8D/sql/sql_parse.cc:8027
            

            Apparently, the being-dropped column col_int was not recovered from the table during recovery, even though it should have been, because the DROP COLUMN col_int operation had not been committed yet.

            mleich, was this fixed by MDEV-27234 (10.6.8, 10.7.4, 10.8.3, 10.9.1)?

            marko Marko Mäkelä added a comment - The table contains a huge number of instantly dropped and added columns because of the following in TBR-1362-mic.yy : ALTER IGNORE TABLE table_names ADD COLUMN col_int_copy INTEGER ; ALTER IGNORE TABLE table_names DROP COLUMN col_int; ALTER IGNORE TABLE table_names CHANGE COLUMN col_int_copy col_int INTEGER ; The IGNORE will be ignored; it only matters for things like ADD UNIQUE INDEX (in which case InnoDB would refuse native operation and force ALGORITHM=COPY ). We have n_fields=817=n_core_fields+1+811 and index->n_fields=815 . The index fields are (col1,DB_TRX_ID,DB_ROLL_PTR,col2,(lots of instantly dropped columns),col_int_copy). The 811=rec_get_n_add_field() was read from the record header. The metadata record is located in the index root page, at byte offset 0xe5 from the start. The number of instantly added fields is encoded in the 2 bytes at offset 0xde, containing 0x06ab. Those header bytes are there already at the time when the server was killed. The server was killed before btr_cur_pessimistic_update() returned during the following: #1 0x0000564aa7c0f555 in btr_page_reorganize_low (cursor=0x4d1c51d95d58, index=0x61600200d708, mtr=0x4d1c51d95e80) at /data/Server/10.8D/storage/innobase/btr/btr0btr.cc:1293 #2 0x0000564aa7c12d53 in btr_page_reorganize (cursor=0x4d1c51d95d58, index=0x61600200d708, mtr=0x4d1c51d95e80) at /data/Server/10.8D/storage/innobase/btr/btr0btr.cc:1548 #3 0x0000564aa7c702c5 in btr_cur_pessimistic_update (flags=10, cursor=0x4d1c51d95d50, offsets=0x4d1c51d95cb0, offsets_heap=0x4d1c51d95cd0, entry_heap=0x61900027f180, big_rec=0x4d1c51d95cf0, update=0x62d000bfe488, cmpl_info=1, thr=0x6250001ed0a8, trx_id=10594, mtr=0x4d1c51d95e80) at /data/Server/10.8D/storage/innobase/btr/btr0cur.cc:5122 #4 0x0000564aa778d13e in innobase_instant_try (ha_alter_info=0x4d1c51d97c40, ctx=0x629000e2ad80, altered_table=0x4d1c51d98340, table=0x619000285a98, trx=0x1f8e08bb1e88) at /data/Server/10.8D/storage/innobase/handler/handler0alter.cc:5914 #5 0x0000564aa77de9d2 in commit_try_norebuild (ha_alter_info=0x4d1c51d97c40, ctx=0x629000e2ad80, altered_table=0x4d1c51d98340, old_table=0x619000285a98, trx=0x1f8e08bb1e88, table_name=0x61b00008805d "t1") at /data/Server/10.8D/storage/innobase/handler/handler0alter.cc:10408 #6 0x0000564aa77b6d0e in ha_innobase::commit_inplace_alter_table (this=0x61d0006216b8, altered_table=0x4d1c51d98340, ha_alter_info=0x4d1c51d97c40, commit=true) at /data/Server/10.8D/storage/innobase/handler/handler0alter.cc:11086 … #12 0x0000564aa664a37d in mysql_parse (thd=0x62b0001dc218, rawbuf=0x629000e29238 "ALTER IGNORE TABLE t1 DROP COLUMN col_int /* E_R Thread2 QNO 2437 CON_ID 20 */", length=79, parser_state=0x4d1c51d9bb20) at /data/Server/10.8D/sql/sql_parse.cc:8027 Apparently, the being-dropped column col_int was not recovered from the table during recovery, even though it should have been, because the DROP COLUMN col_int operation had not been committed yet. mleich , was this fixed by MDEV-27234 (10.6.8, 10.7.4, 10.8.3, 10.9.1)?
            mleich Matthias Leich added a comment - - edited

            origin/10.6 f2f18e20ebfff2b744a08ba169bcedd1477b228c 2022-06-23T13:17:20+03:00
            contains the fix for MDEV-27234 and reports to be a 10.6.9.
            No replay of the assert described on top within 1600 RQG tests executed.
            Therefore I vote for closing the bug.
            

            mleich Matthias Leich added a comment - - edited origin/10.6 f2f18e20ebfff2b744a08ba169bcedd1477b228c 2022-06-23T13:17:20+03:00 contains the fix for MDEV-27234 and reports to be a 10.6.9. No replay of the assert described on top within 1600 RQG tests executed. Therefore I vote for closing the bug.

            People

              mleich Matthias Leich
              mleich Matthias Leich
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.