Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-16515

InnoDB: Failing assertion: ++retries < 10000 in file dict0dict.cc line 2737

Details

    Description

      10.1 9f848da640dd6c3f44d56eae18204370ae7f835c

      2018-06-17 04:31:32 7fb03ad51b00  InnoDB: Assertion failure in thread 140394878016256 in file dict0dict.cc line 2737
      InnoDB: Failing assertion: ++retries < 10000
       
      #5  0x00007fb0635c3028 in abort () from /lib/x86_64-linux-gnu/libc.so.6
      #6  0x0000557f3f6b3832 in dict_index_remove_from_cache_low (table=0x557f431fcd28, index=0x557f4308a998, lru_evict=0) at /home/travis/src/storage/xtradb/dict/dict0dict.cc:2737
      #7  0x0000557f3f6b24de in dict_table_remove_from_cache_low (table=0x557f431fcd28, lru_evict=0) at /home/travis/src/storage/xtradb/dict/dict0dict.cc:2114
      #8  0x0000557f3f6b2895 in dict_table_remove_from_cache (table=0x557f431fcd28) at /home/travis/src/storage/xtradb/dict/dict0dict.cc:2151
      #9  0x0000557f3f594b5e in row_drop_table_for_mysql (name=0x557f42f57590 "test/#sql-ib3458-1040254509", trx=0x7fb00000bf98, drop_db=false, create_failed=0, nonatomic=false) at /home/travis/src/storage/xtradb/row/row0mysql.cc:4430
      #10 0x0000557f3f58544a in row_merge_drop_table (trx=0x7fb00000bf98, table=0x557f431fcd28) at /home/travis/src/storage/xtradb/row/row0merge.cc:3823
      #11 0x0000557f3f4a7768 in ha_innobase::commit_inplace_alter_table (this=0x7fb0000aa158, altered_table=0x7fb0000cac40, ha_alter_info=0x7fb03ad4dad0, commit=true) at /home/travis/src/storage/xtradb/handler/handler0alter.cc:6323
      #12 0x0000557f3f13e52f in handler::ha_commit_inplace_alter_table (this=0x7fb0000aa158, altered_table=0x7fb0000cac40, ha_alter_info=0x7fb03ad4dad0, commit=true) at /home/travis/src/sql/handler.cc:4225
      #13 0x0000557f3efb471b in mysql_inplace_alter_table (thd=0x557f42fea610, table_list=0x7fb00003b0c8, table=0x7fb0000d7390, altered_table=0x7fb0000cac40, ha_alter_info=0x7fb03ad4dad0, inplace_supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x7fb03ad4db40, alter_ctx=0x7fb03ad4e6e0) at /home/travis/src/sql/sql_table.cc:7222
      #14 0x0000557f3efb9085 in mysql_alter_table (thd=0x557f42fea610, new_db=0x7fb00003b6b8 "test", new_name=0x0, create_info=0x7fb03ad4f2d0, table_list=0x7fb00003b0c8, alter_info=0x7fb03ad4f240, order_num=0, order=0x0, ignore=true) at /home/travis/src/sql/sql_table.cc:9048
      #15 0x0000557f3f02b4a8 in Sql_cmd_alter_table::execute (this=0x7fb00003b770, thd=0x557f42fea610) at /home/travis/src/sql/sql_alter.cc:325
      #16 0x0000557f3eef9f39 in mysql_execute_command (thd=0x557f42fea610) at /home/travis/src/sql/sql_parse.cc:5696
      #17 0x0000557f3eefe6a1 in mysql_parse (thd=0x557f42fea610, rawbuf=0x7fb00003add8 "ALTER ONLINE IGNORE TABLE `table_partitioned` DROP FOREIGN KEY IF EXISTS j, DROP COLUMN IF EXISTS tscol5, ALGORITHM=COPY, FORCE, LOCK=NONE, ALGORITHM=INPLACE, DROP COLUMN IF EXISTS `field4`, ALGORITHM"..., length=246, parser_state=0x7fb03ad50630) at /home/travis/src/sql/sql_parse.cc:7449
      #18 0x0000557f3eeed0ed in dispatch_command (command=COM_QUERY, thd=0x557f42fea610, packet=0x557f43149741 "ALTER ONLINE IGNORE TABLE `table_partitioned` DROP FOREIGN KEY IF EXISTS j, DROP COLUMN IF EXISTS tscol5, ALGORITHM=COPY, FORCE, LOCK=NONE, ALGORITHM=INPLACE, DROP COLUMN IF EXISTS `field4`, ALGORITHM"..., packet_length=247) at /home/travis/src/sql/sql_parse.cc:1492
      #19 0x0000557f3eeebe6d in do_command (thd=0x557f42fea610) at /home/travis/src/sql/sql_parse.cc:1121
      #20 0x0000557f3f026a95 in do_handle_one_connection (thd_arg=0x557f42fea610) at /home/travis/src/sql/sql_connect.cc:1330
      #21 0x0000557f3f0267f9 in handle_one_connection (arg=0x557f42fea610) at /home/travis/src/sql/sql_connect.cc:1242
      #22 0x00007fb063f64184 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
      #23 0x00007fb063686ffd in clone () from /lib/x86_64-linux-gnu/libc.so.6
      

      ALTER ONLINE IGNORE TABLE `table_partitioned` DROP FOREIGN KEY IF EXISTS j, DROP COLUMN IF EXISTS tscol5, ALGORITHM=COPY, FORCE, LOCK=NONE, ALGORITHM=INPLACE, DROP COLUMN IF EXISTS `field4`, ALGORITHM=DEFAULT, CHECKSUM 1
      

      Full threads are attached. Datadir, coredump, binary, and logs are available on perro.

      Attachments

        Issue Links

          Activity

            It also happened in buildbot:
            http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest-big/builds/1872

            rpl.rpl_row_mixing_engines 'row,xtradb'  w2 [ fail ]
                    Test ended at 2018-06-14 08:58:04
             
            CURRENT_TEST: rpl.rpl_row_mixing_engines
            mysqltest: In included file "./extra/rpl_tests/rpl_mixing_engines.inc": 
            included from ./extra/rpl_tests/rpl_mixing_engines.test at line 1666:
            included from /mnt/buildbot/build/mariadb-10.1.34/mysql-test/suite/rpl/t/rpl_row_mixing_engines.test at line 12:
            At line 562: query 'DROP TABLE IF EXISTS tt_xx_$tb_id, nt_xx_$tb_id' failed: 2013: Lost connection to MySQL server during query
             
            The result from queries just before the failure was:
            < snip >
            Log_name	Pos	Event_type	Server_id	End_log_pos	Info
            master-bin.000001	#	Gtid	#	#	BEGIN GTID #-#-#
            master-bin.000001	#	Table_map	#	#	table_id: # (test.tt_1)
            master-bin.000001	#	Write_rows_v1	#	#	table_id: # flags: STMT_END_F
            master-bin.000001	#	Xid	#	#	COMMIT /* XID */
            -e-e-e-e-e-e-e-e-e-e-e- >> C << -e-e-e-e-e-e-e-e-e-e-e-
            -b-b-b-b-b-b-b-b-b-b-b- >> trunc-CS-T B ISe-T<-N T C << -b-b-b-b-b-b-b-b-b-b-b-
            include/show_binlog_events.inc
            Log_name	Pos	Event_type	Server_id	End_log_pos	Info
            master-bin.000001	#	Gtid	#	#	GTID #-#-#
            master-bin.000001	#	Query	#	#	use `test`; TRUNCATE TABLE tt_xx_7
            master-bin.000001	#	Gtid	#	#	BEGIN GTID #-#-#
            master-bin.000001	#	Table_map	#	#	table_id: # (test.tt_1)
            master-bin.000001	#	Write_rows_v1	#	#	table_id: # flags: STMT_END_F
            master-bin.000001	#	Xid	#	#	COMMIT /* XID */
            -e-e-e-e-e-e-e-e-e-e-e- >> trunc-CS-T B ISe-T<-N T C << -e-e-e-e-e-e-e-e-e-e-e-
             
            rpl_mixing_engines.inc [commands=drop-CS]
            -b-b-b-b-b-b-b-b-b-b-b- >> drop-CS << -b-b-b-b-b-b-b-b-b-b-b-
            DROP TABLE IF EXISTS tt_xx_7, nt_xx_7;
             
            More results from queries before failure can be found in /mnt/buildbot/build/mariadb-10.1.34/mysql-test/var/2/log/rpl_row_mixing_engines.log
             
             
            Server [mysqld.1 - pid: 27930, winpid: 27930, exit: 256] failed during test run
            Server log from this test:
            ----------SERVER LOG START-----------
            2018-06-14 08:58:00 7ff418576b00  InnoDB: Assertion failure in thread 140686357129984 in file dict0dict.cc line 2737
            InnoDB: Failing assertion: ++retries < 10000
            InnoDB: We intentionally generate a memory trap.
            InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
            InnoDB: If you get repeated assertion failures or crashes, even
            InnoDB: immediately after the mysqld startup, there may be
            InnoDB: corruption in the InnoDB tablespace. Please refer to
            InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
            InnoDB: about forcing recovery.
            180614  8:58:00 [ERROR] mysqld got signal 6 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.
             
            To report this bug, see https://mariadb.com/kb/en/reporting-bugs
             
            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed, 
            something is definitely wrong and this may fail.
             
            Server version: 10.1.34-MariaDB
            key_buffer_size=1048576
            read_buffer_size=131072
            max_used_connections=7
            max_threads=153
            thread_count=6
            It is possible that mysqld could use up to 
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 62940 K  bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.
             
            Thread pointer: 0x62a00077a208
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x7ff418576030 thread_stack 0x48400
            /usr/lib/x86_64-linux-gnu/libasan.so.2(+0x4a077)[0x7ff42e9d3077]
            /mnt/buildbot/build/mariadb-10.1.34/sql/mysqld(my_print_stacktrace+0xdc)[0x564d571994d9]
            /mnt/buildbot/build/mariadb-10.1.34/sql/mysqld(handle_fatal_signal+0x766)[0x564d56638056]
            /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7ff42d87a390]
            /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7ff42ce4a428]
            /lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7ff42ce4c02a]
            /mnt/buildbot/build/mariadb-10.1.34/sql/mysqld(+0x142b472)[0x564d56f2f472]
            /mnt/buildbot/build/mariadb-10.1.34/sql/mysqld(+0x14401ce)[0x564d56f441ce]
            /mnt/buildbot/build/mariadb-10.1.34/sql/mysqld(+0x14409b0)[0x564d56f449b0]
            dict/dict0dict.cc:2738(dict_index_remove_from_cache_low(dict_table_t*, dict_index_t*, unsigned long))[0x564d56d78c60]
            dict/dict0dict.cc:2110(dict_table_remove_from_cache_low(dict_table_t*, unsigned long))[0x564d56bbec8a]
            dict/dict0dict.cc:2152(dict_table_remove_from_cache(dict_table_t*))[0x564d5664ecb7]
            row/row0mysql.cc:4432(row_drop_table_for_mysql(char const*, trx_t*, bool, unsigned long, bool))[0x564d5664eec5]
            handler/ha_innodb.cc:13188(ha_innobase::delete_table(char const*))[0x564d563745fb]
            sql/handler.cc:4320(handler::ha_delete_table(char const*))[0x564d563761bf]
            sql/handler.cc:2384(ha_delete_table(THD*, handlerton*, char const*, char const*, char const*, bool))[0x564d5620f203]
            sql/sql_table.cc:2472(mysql_rm_table_no_locks(THD*, TABLE_LIST*, bool, bool, bool, bool, bool))[0x564d5621a4f9]
            sql/sql_table.cc:2084(mysql_rm_table(THD*, TABLE_LIST*, char, char))[0x564d562200ca]
            sql/sql_parse.cc:4254(mysql_execute_command(THD*))[0x564d56225dfc]
            sql/sql_parse.cc:7450(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x564d564615e5]
            sql/sql_parse.cc:1515(dispatch_command(enum_server_command, THD*, char*, unsigned int))[0x564d56461ac2]
            sql/sql_parse.cc:1121(do_command(THD*))[0x564d56a4c649]
            /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7ff42d8706ba]
            /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7ff42cf1b82d]
             
            Trying to get some variables.
            Some pointers may be invalid and cause the dump to abort.
            Query (0x62b000ba6220): DROP TABLE IF EXISTS tt_xx_7, nt_xx_7
            Connection ID (thread ID): 1747
            Status: NOT_KILLED
             
            Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
             
            The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            information that should help you find out what is causing the crash.
            

            elenst Elena Stepanova added a comment - It also happened in buildbot: http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest-big/builds/1872 rpl.rpl_row_mixing_engines 'row,xtradb' w2 [ fail ] Test ended at 2018-06-14 08:58:04   CURRENT_TEST: rpl.rpl_row_mixing_engines mysqltest: In included file "./extra/rpl_tests/rpl_mixing_engines.inc": included from ./extra/rpl_tests/rpl_mixing_engines.test at line 1666: included from /mnt/buildbot/build/mariadb-10.1.34/mysql-test/suite/rpl/t/rpl_row_mixing_engines.test at line 12: At line 562: query 'DROP TABLE IF EXISTS tt_xx_$tb_id, nt_xx_$tb_id' failed: 2013: Lost connection to MySQL server during query   The result from queries just before the failure was: < snip > Log_name Pos Event_type Server_id End_log_pos Info master-bin.000001 # Gtid # # BEGIN GTID #-#-# master-bin.000001 # Table_map # # table_id: # (test.tt_1) master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F master-bin.000001 # Xid # # COMMIT /* XID */ -e-e-e-e-e-e-e-e-e-e-e- >> C << -e-e-e-e-e-e-e-e-e-e-e- -b-b-b-b-b-b-b-b-b-b-b- >> trunc-CS-T B ISe-T<-N T C << -b-b-b-b-b-b-b-b-b-b-b- include/show_binlog_events.inc Log_name Pos Event_type Server_id End_log_pos Info master-bin.000001 # Gtid # # GTID #-#-# master-bin.000001 # Query # # use `test`; TRUNCATE TABLE tt_xx_7 master-bin.000001 # Gtid # # BEGIN GTID #-#-# master-bin.000001 # Table_map # # table_id: # (test.tt_1) master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F master-bin.000001 # Xid # # COMMIT /* XID */ -e-e-e-e-e-e-e-e-e-e-e- >> trunc-CS-T B ISe-T<-N T C << -e-e-e-e-e-e-e-e-e-e-e-   rpl_mixing_engines.inc [commands=drop-CS] -b-b-b-b-b-b-b-b-b-b-b- >> drop-CS << -b-b-b-b-b-b-b-b-b-b-b- DROP TABLE IF EXISTS tt_xx_7, nt_xx_7;   More results from queries before failure can be found in /mnt/buildbot/build/mariadb-10.1.34/mysql-test/var/2/log/rpl_row_mixing_engines.log     Server [mysqld.1 - pid: 27930, winpid: 27930, exit: 256] failed during test run Server log from this test: ----------SERVER LOG START----------- 2018-06-14 08:58:00 7ff418576b00 InnoDB: Assertion failure in thread 140686357129984 in file dict0dict.cc line 2737 InnoDB: Failing assertion: ++retries < 10000 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to https://jira.mariadb.org/ InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: about forcing recovery. 180614 8:58:00 [ERROR] mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware.   To report this bug, see https://mariadb.com/kb/en/reporting-bugs   We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail.   Server version: 10.1.34-MariaDB key_buffer_size=1048576 read_buffer_size=131072 max_used_connections=7 max_threads=153 thread_count=6 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 62940 K bytes of memory Hope that's ok; if not, decrease some variables in the equation.   Thread pointer: 0x62a00077a208 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x7ff418576030 thread_stack 0x48400 /usr/lib/x86_64-linux-gnu/libasan.so.2(+0x4a077)[0x7ff42e9d3077] /mnt/buildbot/build/mariadb-10.1.34/sql/mysqld(my_print_stacktrace+0xdc)[0x564d571994d9] /mnt/buildbot/build/mariadb-10.1.34/sql/mysqld(handle_fatal_signal+0x766)[0x564d56638056] /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7ff42d87a390] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7ff42ce4a428] /lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7ff42ce4c02a] /mnt/buildbot/build/mariadb-10.1.34/sql/mysqld(+0x142b472)[0x564d56f2f472] /mnt/buildbot/build/mariadb-10.1.34/sql/mysqld(+0x14401ce)[0x564d56f441ce] /mnt/buildbot/build/mariadb-10.1.34/sql/mysqld(+0x14409b0)[0x564d56f449b0] dict/dict0dict.cc:2738(dict_index_remove_from_cache_low(dict_table_t*, dict_index_t*, unsigned long))[0x564d56d78c60] dict/dict0dict.cc:2110(dict_table_remove_from_cache_low(dict_table_t*, unsigned long))[0x564d56bbec8a] dict/dict0dict.cc:2152(dict_table_remove_from_cache(dict_table_t*))[0x564d5664ecb7] row/row0mysql.cc:4432(row_drop_table_for_mysql(char const*, trx_t*, bool, unsigned long, bool))[0x564d5664eec5] handler/ha_innodb.cc:13188(ha_innobase::delete_table(char const*))[0x564d563745fb] sql/handler.cc:4320(handler::ha_delete_table(char const*))[0x564d563761bf] sql/handler.cc:2384(ha_delete_table(THD*, handlerton*, char const*, char const*, char const*, bool))[0x564d5620f203] sql/sql_table.cc:2472(mysql_rm_table_no_locks(THD*, TABLE_LIST*, bool, bool, bool, bool, bool))[0x564d5621a4f9] sql/sql_table.cc:2084(mysql_rm_table(THD*, TABLE_LIST*, char, char))[0x564d562200ca] sql/sql_parse.cc:4254(mysql_execute_command(THD*))[0x564d56225dfc] sql/sql_parse.cc:7450(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x564d564615e5] sql/sql_parse.cc:1515(dispatch_command(enum_server_command, THD*, char*, unsigned int))[0x564d56461ac2] sql/sql_parse.cc:1121(do_command(THD*))[0x564d56a4c649] /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7ff42d8706ba] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7ff42cf1b82d]   Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (0x62b000ba6220): DROP TABLE IF EXISTS tt_xx_7, nt_xx_7 Connection ID (thread ID): 1747 Status: NOT_KILLED   Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off   The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash.

            In the core dump, we have index->space=3426 and index->page=FIL_NULL, but the adaptive hash index is not empty:

            (gdb) p *info
            $1 = {ref_count = 56, root_guess = 0x7fb03fd8bb90, hash_analysis = 2292, 
              last_hash_succ = 1, n_hash_potential = 2280, n_fields = 0, n_bytes = 1, 
              left_side = 0, magic_n = 1112765}
            

            The info->root_guess points to a different tablespace:

            (gdb) p $1.root_guess.page
            $10 = {space = 3268, offset = 3, buf_fix_count = 0, io_fix = 0, state = 5, 
              flush_type = 0, buf_pool_index = 0, zip = {data = 0x0, m_start = 0, 
                m_external = false, m_end = 0, m_nonempty = 0, n_blobs = 0, ssize = 0}, 
              write_size = 0, encrypted = false, real_size = 0, slot = 0x0, hash = 0x0, 
              in_page_hash = 1, in_zip_hash = 0, list = {prev = 0x7fb03ffcd0f0, 
                next = 0x7fb03f6b70c8}, in_flush_list = 0, in_free_list = 0, 
              newest_modification = 161156672, oldest_modification = 0, LRU = {
                prev = 0x7fb03f9df820, next = 0x7fb03fd8c5a0}, in_LRU_list = 1, old = 0, 
              freed_page_clock = 11594, access_time = 201517474, is_corrupt = 0, 
              file_page_was_freed = 0}
            

            Assuming that the info->root_guess points to a page that was valid for the index, the issue would seem to be that table->space points to something else, and the call

            buf_LRU_drop_page_hash_for_tablespace(table);
            

            in the loop would have no effect.

            In MariaDB before 10.2, TRUNCATE TABLE could reassign dict_table_t::space. row_truncate_table_for_mysql() does call buf_LRU_drop_page_hash_for_tablespace(table) before that, but it does not check if all the pages were dropped, I/O-fixed or buffer-fixed pages would seem to be skipped.

            I will try to check the fil_system->space_list and the buffer pool contents, to determine if the problem could have been TRUNCATE TABLE.

            marko Marko Mäkelä added a comment - In the core dump, we have index->space=3426 and index->page=FIL_NULL , but the adaptive hash index is not empty: (gdb) p *info $1 = {ref_count = 56, root_guess = 0x7fb03fd8bb90, hash_analysis = 2292, last_hash_succ = 1, n_hash_potential = 2280, n_fields = 0, n_bytes = 1, left_side = 0, magic_n = 1112765} The info->root_guess points to a different tablespace: (gdb) p $1.root_guess.page $10 = {space = 3268, offset = 3, buf_fix_count = 0, io_fix = 0, state = 5, flush_type = 0, buf_pool_index = 0, zip = {data = 0x0, m_start = 0, m_external = false, m_end = 0, m_nonempty = 0, n_blobs = 0, ssize = 0}, write_size = 0, encrypted = false, real_size = 0, slot = 0x0, hash = 0x0, in_page_hash = 1, in_zip_hash = 0, list = {prev = 0x7fb03ffcd0f0, next = 0x7fb03f6b70c8}, in_flush_list = 0, in_free_list = 0, newest_modification = 161156672, oldest_modification = 0, LRU = { prev = 0x7fb03f9df820, next = 0x7fb03fd8c5a0}, in_LRU_list = 1, old = 0, freed_page_clock = 11594, access_time = 201517474, is_corrupt = 0, file_page_was_freed = 0} Assuming that the info->root_guess points to a page that was valid for the index, the issue would seem to be that table->space points to something else, and the call buf_LRU_drop_page_hash_for_tablespace(table); in the loop would have no effect. In MariaDB before 10.2, TRUNCATE TABLE could reassign dict_table_t::space . row_truncate_table_for_mysql() does call buf_LRU_drop_page_hash_for_tablespace(table) before that, but it does not check if all the pages were dropped, I/O-fixed or buffer-fixed pages would seem to be skipped. I will try to check the fil_system->space_list and the buffer pool contents, to determine if the problem could have been TRUNCATE TABLE .

            Only the tablespace 3426 exists:

            $16 = (fil_space_t *) 0x7fb00018e738
            (gdb) p *$16
            $17 = {name = 0x7fb0000b8428 "test/#sql-ib3458-1040254509", id = 3426, 
              tablespace_version = 3427, stop_new_ops = false, purpose = 501, chain = {
                count = 1, start = 0x7fb000115b68, end = 0x7fb000115b68}, size = 6, 
              recv_size = 0, flags = 0, n_reserved_extents = 0, n_pending_flushes = 0, … }
            

            I think that it is plausible that the tablespace (with original ID 3268) was truncated. The size is only 6 pages. The old root_guess page had PAGE_LEVEL=1 and 60 node pointers, and InnoDB tablespaces never shrink outside truncate, so the only way how fil_space_t::size can be only 6 (instead of something bigger than 60) should be TRUNCATE TABLE.

            I found exactly 56 blocks in the buffer pool with block->index == index, matching info->ref_count. All block->page.space numbers point to a yet different tablespace 2828, which does not exist either. The tablespace ID and the page number in each block->frame match those of the block->page. So, it looks like the tablespace ID in the info->root_guess is incorrect (because the block was used for some other table that was later dropped or rebuilt), and the original tablespace ID before TRUNCATE TABLE was 2828, not 3268.

            I think that the most plausible explanation for this failure is that TRUNCATE TABLE was executed, and it did not drop all of the adaptive hash index because some pages were I/O-fixed (likely for flushing dirty pages) or buffer-fixed at that time.

            marko Marko Mäkelä added a comment - Only the tablespace 3426 exists: $16 = (fil_space_t *) 0x7fb00018e738 (gdb) p *$16 $17 = {name = 0x7fb0000b8428 "test/#sql-ib3458-1040254509", id = 3426, tablespace_version = 3427, stop_new_ops = false, purpose = 501, chain = { count = 1, start = 0x7fb000115b68, end = 0x7fb000115b68}, size = 6, recv_size = 0, flags = 0, n_reserved_extents = 0, n_pending_flushes = 0, … } I think that it is plausible that the tablespace (with original ID 3268) was truncated. The size is only 6 pages. The old root_guess page had PAGE_LEVEL=1 and 60 node pointers, and InnoDB tablespaces never shrink outside truncate, so the only way how fil_space_t::size can be only 6 (instead of something bigger than 60) should be TRUNCATE TABLE. I found exactly 56 blocks in the buffer pool with block->index == index , matching info->ref_count . All block->page.space  numbers point to a yet different tablespace 2828, which does not exist either. The tablespace ID and the page number in each block->frame match those of the block->page . So, it looks like the tablespace ID in the info->root_guess is incorrect (because the block was used for some other table that was later dropped or rebuilt), and the original tablespace ID before TRUNCATE TABLE was 2828, not 3268. I think that the most plausible explanation for this failure is that TRUNCATE TABLE was executed, and it did not drop all of the adaptive hash index because some pages were I/O-fixed (likely for flushing dirty pages) or buffer-fixed at that time.

            In MariaDB 5.5, we would not drop the adaptive hash indexes at all during ALTER TABLE…DISCARD TABLESPACE or TRUNCATE TABLE; we would only warn that they are not empty. In MariaDB 10.0 we would call fil_discard_tablespace() in both cases.

            The active dropping was added recently in MariaDB 10.1.34 as part of the MDEV-16283/MDEV-14727/MDEV-14491 fix.

            This failure can be classified as a regression of that fix. I think that the most feasible solution is to fix this in 10.1+ only. I believe that 10.2 could fail in a different way due to TRUNCATE TABLE. Because the tablespace ID would not change, the adaptive hash index would be dropped in the end in this scenario. But we could find false positives when searching for records through the adaptive hash index.

            marko Marko Mäkelä added a comment - In MariaDB 5.5, we would not drop the adaptive hash indexes at all during ALTER TABLE…DISCARD TABLESPACE or TRUNCATE TABLE ; we would only warn that they are not empty. In MariaDB 10.0 we would call fil_discard_tablespace() in both cases. The active dropping was added recently in MariaDB 10.1.34 as part of the MDEV-16283 / MDEV-14727 / MDEV-14491 fix. This failure can be classified as a regression of that fix. I think that the most feasible solution is to fix this in 10.1+ only. I believe that 10.2 could fail in a different way due to TRUNCATE TABLE . Because the tablespace ID would not change, the adaptive hash index would be dropped in the end in this scenario. But we could find false positives when searching for records through the adaptive hash index.

            As noted in MDEV-16759, this fix appears to have fixed a case of InnoDB adaptive hash index corruption that lead to permanent corruption of InnoDB tables (MDEV-9663). It is not known yet if there are other causes of MDEV-9663 than TRUNCATE TABLE while innodb_adaptive_hash_index is at its default setting (ON).

            marko Marko Mäkelä added a comment - As noted in MDEV-16759 , this fix appears to have fixed a case of InnoDB adaptive hash index corruption that lead to permanent corruption of InnoDB tables ( MDEV-9663 ). It is not known yet if there are other causes of MDEV-9663 than TRUNCATE TABLE while innodb_adaptive_hash_index is at its default setting (ON).

            People

              marko Marko Mäkelä
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.