Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-14773

DROP TABLE hangs for InnoDB table with FULLTEXT index

Details

    • 10.2.14

    Description

      Statements like ALTER TABLE t. MODIFY COLUMN c DECIMAL(10,2) for the InnoDB table may hand at dropping of temporary table stage when the table has FULLTEXT index(es) defined.

      Full backtrace of relevant threads may look as follows:

      Thread 28 (Thread 0x7f5436be3700 (LWP 19822)):
      #0 ib_vector_last (table=) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/storage/innobase/include/ut0vec.ic:206
      No locals.
      #1 fts_sync_index_check (table=) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/storage/innobase/fts/fts0fts.cc:4049
      word = 0x7f5313f36c10
      rbt_node = 0x7f5313f36bf0
      #2 fts_sync (table=) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/storage/innobase/fts/fts0fts.cc:4292
      index_cache = 0x7f5313f98788
      i = <value optimized out>
      error = <value optimized out>
      cache = 0x7f53135ae2f0
      #3 fts_sync_table (table=) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/storage/innobase/fts/fts0fts.cc:4356
      err = DB_SUCCESS
      #4 0x00007f588972de62 in fts_optimize_sync_table (table_id=) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/storage/innobase/fts/fts0opt.cc:2960
      table = 0x7f531359e6d0
      #5 0x00007f588972e26b in fts_optimize_thread (arg=0x7f58a47957d0) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/storage/innobase/fts/fts0opt.cc:3088
      msg = 0x7f5313fb28c0
      heap_alloc = 0x7f542c0016e0
      current = <value optimized out>
      done = <value optimized out>
      n_tables = <value optimized out>
      n_optimize = <value optimized out>
      wq = 0x7f58a47957d0
      tables = 0x7f542c000a30
      #6 0x00007f5888851aa1 in start_thread () from /lib64/libpthread.so.0
      No symbol table info available.
      #7 0x00007f5886f69bcd in clone () from /lib64/libc.so.6
      No symbol table info available.
       
      ...
       
      Thread 6 (Thread 0x7f586c243700 (LWP 29207)):
      #0 0x00007f588885900d in nanosleep () from /lib64/libpthread.so.0
      No symbol table info available.
      #1 0x00007f58895505d0 in os_thread_sleep (tm=) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/storage/innobase/os/os0thread.cc:230
      t = {tv_sec = 0, tv_nsec = 250000000}
      #2 0x00007f58895a6409 in row_drop_table_for_mysql (name=0x7f586c23e1e0 "edoc_devel_oh/#sql-4914_19d3d", trx=0x7f586d8a5318, drop_db=false, create_failed=0, nonatomic=true) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/storage/innobase/row/row0mysql.cc:3639
      err = <value optimized out>
      table = 0x7f531359e6d0
      page_nos = 0x5a3d0a08
      filepath = 0x0
      tablename = 0x0
      locked_dictionary = true
      info = 0x0
      page_no = <value optimized out>
      foreign = <value optimized out>
      heap = 0x0
      #3 0x00007f58894ebec5 in ha_innobase::delete_table (this=) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/storage/innobase/handler/ha_innodb.cc:13671
      err = <value optimized out>
      trx_in_innodb = {m_trx = 0x7f586d8a5698}
      iter =
      thd = 0x7f534c0009a8
      norm_name = "edoc_devel_oh/#sql-4914_19d3d", '\000' <repeats 19 times>, "\001;\204\211X\177\000\000\200\067\204\211X\177\000\000\000\001\204\211X\177", '\000' <repeats 18 times>"\377, \377\377\377\377\377\377\377", '\000' <repeats 16 times>, "\020\000\000\000\b\000\000\000\000\001\000\000\000\000\000\000\003\000\000\000\000\000\000\000 \000\000LS\177\000\000\220\v\000\000\000\000\000\000\020\020\001\000\000\000\000\000\340\342#lX\177\000\000\t\263\357\206X\177\000\000\360\342#lX\177\000\000\001\326\061\211X\177\000\000]\255\\\023S\177\000\000\002\000\000\000\000\000\000\000\340\342#l^\000\000\000.\000\000\000\000\000\000\000\005", '\000' <repeats 35 times>, "S\177\000\000\000\000\000\000\000\000\000\000\021\004", '\000' <repeats 14 times>...
      parent_trx = 0x7f586d8a5698
      trx = 0x7f586d8a5318
      name_len = <value optimized out>
      #4 0x00007f58892e2b50 in THD::rm_temporary_table (this=) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/sql/temporary_tables.cc:676
      error = false
      file = 0x7f5313591e60
      frm_path = "./edoc_devel_oh/#sql-4914_19d3d.frm\000X\177\000\000\260\346#lX\177", '\000' <repeats 18 times>"\210, \r\000\000\000\000\000\000\020\020\000\000\000\000\000\000\200\r\000\000\000\000\000\000V|iT_\374\026\207X\205Y\023\036\000\000\000\000\000\000\000\036\000\000\000\006\000\000\000\000\000\000\000=\353L\211\362\001\000\000\030\257\234\211X\177\000\000\360\220Y\023S\177\000\000@\347#lX\177\000\000\260\246L\211X\177\000\000\000\000\000\000\000\000\000\000\360\220Y\023S\177\000\000\200\347#lX\177\000\000\000\000\000\000\000\000\000\000\362\001\000\000\000\000\000\000\001", '\000' <repeats 15 times>, "\036\000\000\000\000\000\000\000\360\347#lX\177\000\000r k\211X\177\000\000"...
      #5 0x00007f58892e2fa1 in THD::free_tmp_table_share (this=) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/sql/temporary_tables.cc:1445
      No locals.
      #6 0x00007f58892e48a3 in THD::drop_temporary_table (this=0x7f534c0009a8, table=0x7f5313598558, is_trans=) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/sql/temporary_tables.cc:639
      share = 0x7f53135b2a98
      tab = <value optimized out>
      locked = false
      result = false
      it = {list = <value optimized out>, current = 0x0}
      #7 0x00007f5889238c0f in mysql_alter_table (thd=0x7f534c0009a8, new_db=) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/sql/sql_table.cc:9653
       
      ...
       
      #10 0x00007f58891b9d2a in mysql_parse (thd=0x7f534c0009a8, rawbuf=) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/sql/sql_parse.cc:7887
      found_semicolon = <value optimized out>
      lex = 0x7f534c0042c0
      err = <value optimized out>
      __FUNCTION__ = "mysql_parse"
      #11 0x00007f58891bc09e in dispatch_command (command=COM_QUERY, thd=0x7f534c0009a8, packet=) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/sql/sql_parse.cc:1805
      packet_end = 0x7f534c00f0b1 ""
      parser_state = {m_input = {m_compute_digest = false}, m_lip = {m_thd = 0x7f534c0009a8, yylineno = 1, yylval = 0x7f586c2412a0, lookahead_token = -1, lookahead_yylval = 0x0, m_ptr = 0x7f534c00f0b2 "\r", m_tok_start = 0x7f534c00f0b2 "\r", m_tok_end = 0x7f534c00f0b2 "\r", m_end_of_query = 0x7f534c00f0b1 "", m_tok_start_prev = 0x7f534c00f0b1 "", m_buf = 0x7f534c00f080 "ALTER TABLE unit MODIFY COLUMN ceu DECIMAL (10,2)", m_buf_length = 49, m_echo = true, m_echo_saved = false, m_cpp_buf = 0x7f534c00f120 "ALTER TABLE unit MODIFY COLUMN ceu DECIMAL (10,2)", m_cpp_ptr = 0x7f534c00f151 "", m_cpp_tok_start = 0x7f534c00f151 "", m_cpp_tok_start_prev = 0x7f534c00f151 "", m_cpp_tok_end = 0x7f534c00f151 "", m_body_utf8 = 0x0, m_body_utf8_ptr = 0x20 <Address 0x20 out of bounds>, m_cpp_utf8_processed_ptr = 0x0, next_state = MY_LEX_END, found_semicolon = 0x0, tok_bitmap = 127 '\177', ignore_space = false, stmt_prepare_mode = false, multi_statements = true, in_comment = NO_COMMENT, in_comment_saved = 28, m_cpp_text_start = 0x7f534c00f14f "2)", m_cpp_text_end = 0x7f534c00f150 ")", m_underscore_cs = 0x0, m_digest = 0x0}, m_yacc = {yacc_yyss = 0x0, yacc_yyvs = 0x0, m_set_signal_info = {m_item = {0x0 <repeats 12 times>}}, m_lock_type = TL_READ_DEFAULT, m_mdl_type = MDL_SHARED_READ}, m_digest_psi = 0x0}
      net = 0x7f586c242150
      do_end_of_statement = true
      __FUNCTION__ = "dispatch_command"
      error = false
      drop_more_results = false
      #12 0x00007f58891bc65e in do_command (thd=0x7f534c0009a8) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/sql/sql_parse.cc:1360
      return_value = <value optimized out>
      packet = <value optimized out>
      packet_length = 50
      net = 0x7f534c000be8
      command = COM_QUERY
      #13 0x00007f5889279faf in do_handle_one_connection (connect=) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/sql/sql_connect.cc:1354
      create_user = true
      thr_create_utime = <value optimized out>
      thd = 0x7f534c0009a8
      #14 0x00007f588927a0d4 in handle_one_connection (arg=0x7f58b8f07c28) at /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.11/sql/sql_connect.cc:1260
      connect = 0x7f58b8f07c28
      #15 0x00007f5888851aa1 in start_thread () from /lib64/libpthread.so.0
      No symbol table info available.
      #16 0x00007f5886f69bcd in clone () from /lib64/libc.so.6
      No symbol table info available.
      

      Attachments

        Issue Links

          Activity

            The effect seems easily enough reproducible with this injection:

            --- a/sql/sql_table.cc
            +++ b/sql/sql_table.cc
            @@ -9443,20 +9443,21 @@ bool mysql_alter_table(THD *thd,char *new_db, char *new_name,
                 DBUG_EXECUTE_IF("abort_copy_table", {
                     my_error(ER_LOCK_WAIT_TIMEOUT, MYF(0));
                     goto err_new_table_cleanup;
                   });
                 if (copy_data_between_tables(thd, table, new_table,
                                              alter_info->create_list, ignore,
                                              order_num, order, &copied, &deleted,
                                              alter_info->keys_onoff,
                                              &alter_ctx))
                   goto err_new_table_cleanup;
            +    DBUG_EXECUTE_IF("pretend_copy_data_between_tables_failed",goto err_new_table_cleanup;);
               }
               else
               {
                 if (!table->s->tmp_table &&
                     wait_while_table_is_used(thd, table, HA_EXTRA_FORCE_REOPEN))
                   goto err_new_table_cleanup;
                 THD_STAGE_INFO(thd, stage_manage_keys);
                 alter_table_manage_keys(table, table->file->indexes_are_disabled(),
                                         alter_info->keys_onoff);
                 if (trans_commit_stmt(thd) || trans_commit_implicit(thd))
            

            The test case to be run with the injection on the debug build : mdev14773-2.test . It's not a minimal test case, but close enough, and it seems to reproduce the problem reliably.

            10.2 02b7dc7bec4ff9 + injection above

            Thread 13 (Thread 0x7fa3755f5700 (LWP 11571)):
            #0  0x000055b9a3176e6e in ib_vector_last (vec=0x7fa33408fba0) at /data/src/10.2-bug/storage/innobase/include/ut0vec.ic:204
            #1  0x000055b9a3180f50 in fts_sync_index_check (index_cache=0x7fa33415d1a8) at /data/src/10.2-bug/storage/innobase/fts/fts0fts.cc:4055
            #2  0x000055b9a318182c in fts_sync (sync=0x7fa33415d120, unlock_cache=true, wait=false, has_dict=true) at /data/src/10.2-bug/storage/innobase/fts/fts0fts.cc:4298
            #3  0x000055b9a3181a71 in fts_sync_table (table=0x7fa334172098, unlock_cache=true, wait=false, has_dict=true) at /data/src/10.2-bug/storage/innobase/fts/fts0fts.cc:4362
            #4  0x000055b9a31912ba in fts_optimize_sync_table (table_id=31) at /data/src/10.2-bug/storage/innobase/fts/fts0opt.cc:2960
            #5  0x000055b9a319166c in fts_optimize_thread (arg=0x55b9a5ca0e80) at /data/src/10.2-bug/storage/innobase/fts/fts0opt.cc:3088
            #6  0x00007fa3a7821494 in start_thread (arg=0x7fa3755f5700) at pthread_create.c:333
            #7  0x00007fa3a5c0793f in clone () from /lib/x86_64-linux-gnu/libc.so.6
             
            Thread 2 (Thread 0x7fa3a421f700 (LWP 11583)):
            #0  0x00007fa3a782a7dd in nanosleep () at ../sysdeps/unix/syscall-template.S:84
            #1  0x000055b9a2f06771 in os_thread_sleep (tm=250000) at /data/src/10.2-bug/storage/innobase/os/os0thread.cc:230
            #2  0x000055b9a2f89b85 in row_drop_table_for_mysql (name=0x7fa3a421ada0 "test/#sql-2d17_9", trx=0x7fa390184938, drop_db=false, create_failed=0, nonatomic=true) at /data/src/10.2-bug/storage/innobase/row/row0mysql.cc:3557
            #3  0x000055b9a2e52ea3 in ha_innobase::delete_table (this=0x7fa3340158b8, name=0x7fa33415c2a8 "./test/#sql-2d17_9") at /data/src/10.2-bug/storage/innobase/handler/ha_innodb.cc:13698
            #4  0x000055b9a2b3bbca in handler::ha_delete_table (this=0x7fa3340158b8, name=0x7fa33415c2a8 "./test/#sql-2d17_9") at /data/src/10.2-bug/sql/handler.cc:4335
            #5  0x000055b9a2ab0b19 in THD::rm_temporary_table (this=0x7fa334000b00, base=0x55b9a526d290, path=0x7fa33415c2a8 "./test/#sql-2d17_9") at /data/src/10.2-bug/sql/temporary_tables.cc:676
            #6  0x000055b9a2ab26e5 in THD::free_tmp_table_share (this=0x7fa334000b00, share=0x7fa33415bd90, delete_table=true) at /data/src/10.2-bug/sql/temporary_tables.cc:1445
            #7  0x000055b9a2ab0994 in THD::drop_temporary_table (this=0x7fa334000b00, table=0x7fa33409ce10, is_trans=0x0, delete_table=true) at /data/src/10.2-bug/sql/temporary_tables.cc:639
            #8  0x000055b9a29956af in mysql_alter_table (thd=0x7fa334000b00, new_db=0x7fa334011730 "test", new_name=0x0, create_info=0x7fa3a421cea0, table_list=0x7fa334011120, alter_info=0x7fa3a421cdf0, order_num=0, order=0x0, ignore=false) at /data/src/10.2-bug/sql/sql_table.cc:9651
            #9  0x000055b9a2a0e126 in Sql_cmd_alter_table::execute (this=0x7fa334011830, thd=0x7fa334000b00) at /data/src/10.2-bug/sql/sql_alter.cc:324
            #10 0x000055b9a28c9e5f in mysql_execute_command (thd=0x7fa334000b00) at /data/src/10.2-bug/sql/sql_parse.cc:6209
            #11 0x000055b9a28ce78a in mysql_parse (thd=0x7fa334000b00, rawbuf=0x7fa334011018 "ALTER TABLE t1 MODIFY COLUMN f18 DECIMAL(10,2)", length=46, parser_state=0x7fa3a421e250, is_com_multi=false, is_next_command=false) at /data/src/10.2-bug/sql/sql_parse.cc:7900
            #12 0x000055b9a28bc698 in dispatch_command (command=COM_QUERY, thd=0x7fa334000b00, packet=0x7fa334019241 "ALTER TABLE t1 MODIFY COLUMN f18 DECIMAL(10,2)", packet_length=46, is_com_multi=false, is_next_command=false) at /data/src/10.2-bug/sql/sql_parse.cc:1805
            #13 0x000055b9a28baff6 in do_command (thd=0x7fa334000b00) at /data/src/10.2-bug/sql/sql_parse.cc:1360
            #14 0x000055b9a2a08e17 in do_handle_one_connection (connect=0x55b9a5cce450) at /data/src/10.2-bug/sql/sql_connect.cc:1335
            #15 0x000055b9a2a08ba4 in handle_one_connection (arg=0x55b9a5cce450) at /data/src/10.2-bug/sql/sql_connect.cc:1241
            #16 0x00007fa3a7821494 in start_thread (arg=0x7fa3a421f700) at pthread_create.c:333
            #17 0x00007fa3a5c0793f in clone () from /lib/x86_64-linux-gnu/libc.so.6
            

            MariaDB [test]> show processlist;
            +----+-------------+-----------------+------+---------+------+--------------------------+------------------------------------------------+----------+
            | Id | User        | Host            | db   | Command | Time | State                    | Info                                           | Progress |
            +----+-------------+-----------------+------+---------+------+--------------------------+------------------------------------------------+----------+
            |  3 | system user |                 | NULL | Daemon  | NULL | InnoDB purge worker      | NULL                                           |    0.000 |
            |  2 | system user |                 | NULL | Daemon  | NULL | InnoDB purge worker      | NULL                                           |    0.000 |
            |  1 | system user |                 | NULL | Daemon  | NULL | InnoDB purge coordinator | NULL                                           |    0.000 |
            |  4 | system user |                 | NULL | Daemon  | NULL | InnoDB purge worker      | NULL                                           |    0.000 |
            |  5 | system user |                 | NULL | Daemon  | NULL | InnoDB shutdown handler  | NULL                                           |    0.000 |
            |  9 | root        | localhost:60418 | test | Query   |   38 | enabling keys            | ALTER TABLE t1 MODIFY COLUMN f18 DECIMAL(10,2) |    0.000 |
            | 10 | root        | localhost:60419 | test | Query   |    0 | init                     | show processlist                               |    0.000 |
            +----+-------------+-----------------+------+---------+------+--------------------------+------------------------------------------------+----------+
            7 rows in set (0.00 sec)
            

            The server doesn't hang, it keeps looping inside fts_sync*, hence different stack traces.

            I don't know why exactly copy_data_between_tables would fail for the user, it appears there can be different reasons, the function seems to have many exit points.
            Since, as per marko's analysis (which haven't been posted here yet, but it should be), FTS is only marginally fixable in current versions, maybe it makes sense to investigate the cause of copy_data_between_tables failure (assuming that's indeed what failed on the user's side – so far it's the only route that brought me to this exact pair of stack traces and ALTER hanging exactly in enabling keys). The effect seems to depend on the amount or length or contents of the data in the table. If I remove a part of the data from my attached test case, then instead of the hang I get the assertion failure on the debug build:

            mysqld: /data/src/10.2-bug/sql/protocol.cc:587: void Protocol::end_statement(): Assertion `0' failed.
            171227 21:22:09 [ERROR] mysqld got signal 6 ;
             
            #6  0x00007ff4cbfcbe37 in __assert_fail_base () from /lib/x86_64-linux-gnu/libc.so.6
            #7  0x00007ff4cbfcbee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
            #8  0x000055fa8fed0703 in Protocol::end_statement (this=0x7ff478001098) at /data/src/10.2-bug/sql/protocol.cc:587
            #9  0x000055fa8ff97192 in dispatch_command (command=COM_QUERY, thd=0x7ff478000b00, packet=0x7ff478008941 "", packet_length=46, is_com_multi=false, is_next_command=false) at /data/src/10.2-bug/sql/sql_parse.cc:2361
            #10 0x000055fa8ff93ff6 in do_command (thd=0x7ff478000b00) at /data/src/10.2-bug/sql/sql_parse.cc:1360
            #11 0x000055fa900e1e17 in do_handle_one_connection (connect=0x55fa94255340) at /data/src/10.2-bug/sql/sql_connect.cc:1335
            #12 0x000055fa900e1ba4 in handle_one_connection (arg=0x55fa94255340) at /data/src/10.2-bug/sql/sql_connect.cc:1241
            #13 0x000055fa90500814 in pfs_spawn_thread (arg=0x55fa9425f8b0) at /data/src/10.2-bug/storage/perfschema/pfs.cc:1863
            #14 0x00007ff4cdca2494 in start_thread (arg=0x7ff4c4438700) at pthread_create.c:333
            #15 0x00007ff4cc08893f in clone () from /lib/x86_64-linux-gnu/libc.so.6
             
            Some pointers may be invalid and cause the dump to abort.
            Query (0x7ff4780124e8): ALTER TABLE t1 MODIFY COLUMN f18 DECIMAL(10,2)
            Connection ID (thread ID): 9
            Status: NOT_KILLED
            

            It's not a very helpful assertion, but the point is, if the user tries to change the amount of data in the problematic table, maybe there will be more information about the underlying problem, e.g. some SQL errors, which will help to find a viable workaround.

            Meanwhile, since marko already worked on the issue, I'm assigning it to him.

            Note: The hang/loop is reproducible on 10.2 and 10.1, I didn't try 10.0 or 10.3 or MySQL.

            elenst Elena Stepanova added a comment - The effect seems easily enough reproducible with this injection: --- a/sql/sql_table.cc +++ b/sql/sql_table.cc @@ -9443,20 +9443,21 @@ bool mysql_alter_table(THD *thd,char *new_db, char *new_name, DBUG_EXECUTE_IF("abort_copy_table", { my_error(ER_LOCK_WAIT_TIMEOUT, MYF(0)); goto err_new_table_cleanup; }); if (copy_data_between_tables(thd, table, new_table, alter_info->create_list, ignore, order_num, order, &copied, &deleted, alter_info->keys_onoff, &alter_ctx)) goto err_new_table_cleanup; + DBUG_EXECUTE_IF("pretend_copy_data_between_tables_failed",goto err_new_table_cleanup;); } else { if (!table->s->tmp_table && wait_while_table_is_used(thd, table, HA_EXTRA_FORCE_REOPEN)) goto err_new_table_cleanup; THD_STAGE_INFO(thd, stage_manage_keys); alter_table_manage_keys(table, table->file->indexes_are_disabled(), alter_info->keys_onoff); if (trans_commit_stmt(thd) || trans_commit_implicit(thd)) The test case to be run with the injection on the debug build : mdev14773-2.test . It's not a minimal test case, but close enough, and it seems to reproduce the problem reliably. 10.2 02b7dc7bec4ff9 + injection above Thread 13 (Thread 0x7fa3755f5700 (LWP 11571)): #0 0x000055b9a3176e6e in ib_vector_last (vec=0x7fa33408fba0) at /data/src/10.2-bug/storage/innobase/include/ut0vec.ic:204 #1 0x000055b9a3180f50 in fts_sync_index_check (index_cache=0x7fa33415d1a8) at /data/src/10.2-bug/storage/innobase/fts/fts0fts.cc:4055 #2 0x000055b9a318182c in fts_sync (sync=0x7fa33415d120, unlock_cache=true, wait=false, has_dict=true) at /data/src/10.2-bug/storage/innobase/fts/fts0fts.cc:4298 #3 0x000055b9a3181a71 in fts_sync_table (table=0x7fa334172098, unlock_cache=true, wait=false, has_dict=true) at /data/src/10.2-bug/storage/innobase/fts/fts0fts.cc:4362 #4 0x000055b9a31912ba in fts_optimize_sync_table (table_id=31) at /data/src/10.2-bug/storage/innobase/fts/fts0opt.cc:2960 #5 0x000055b9a319166c in fts_optimize_thread (arg=0x55b9a5ca0e80) at /data/src/10.2-bug/storage/innobase/fts/fts0opt.cc:3088 #6 0x00007fa3a7821494 in start_thread (arg=0x7fa3755f5700) at pthread_create.c:333 #7 0x00007fa3a5c0793f in clone () from /lib/x86_64-linux-gnu/libc.so.6   Thread 2 (Thread 0x7fa3a421f700 (LWP 11583)): #0 0x00007fa3a782a7dd in nanosleep () at ../sysdeps/unix/syscall-template.S:84 #1 0x000055b9a2f06771 in os_thread_sleep (tm=250000) at /data/src/10.2-bug/storage/innobase/os/os0thread.cc:230 #2 0x000055b9a2f89b85 in row_drop_table_for_mysql (name=0x7fa3a421ada0 "test/#sql-2d17_9", trx=0x7fa390184938, drop_db=false, create_failed=0, nonatomic=true) at /data/src/10.2-bug/storage/innobase/row/row0mysql.cc:3557 #3 0x000055b9a2e52ea3 in ha_innobase::delete_table (this=0x7fa3340158b8, name=0x7fa33415c2a8 "./test/#sql-2d17_9") at /data/src/10.2-bug/storage/innobase/handler/ha_innodb.cc:13698 #4 0x000055b9a2b3bbca in handler::ha_delete_table (this=0x7fa3340158b8, name=0x7fa33415c2a8 "./test/#sql-2d17_9") at /data/src/10.2-bug/sql/handler.cc:4335 #5 0x000055b9a2ab0b19 in THD::rm_temporary_table (this=0x7fa334000b00, base=0x55b9a526d290, path=0x7fa33415c2a8 "./test/#sql-2d17_9") at /data/src/10.2-bug/sql/temporary_tables.cc:676 #6 0x000055b9a2ab26e5 in THD::free_tmp_table_share (this=0x7fa334000b00, share=0x7fa33415bd90, delete_table=true) at /data/src/10.2-bug/sql/temporary_tables.cc:1445 #7 0x000055b9a2ab0994 in THD::drop_temporary_table (this=0x7fa334000b00, table=0x7fa33409ce10, is_trans=0x0, delete_table=true) at /data/src/10.2-bug/sql/temporary_tables.cc:639 #8 0x000055b9a29956af in mysql_alter_table (thd=0x7fa334000b00, new_db=0x7fa334011730 "test", new_name=0x0, create_info=0x7fa3a421cea0, table_list=0x7fa334011120, alter_info=0x7fa3a421cdf0, order_num=0, order=0x0, ignore=false) at /data/src/10.2-bug/sql/sql_table.cc:9651 #9 0x000055b9a2a0e126 in Sql_cmd_alter_table::execute (this=0x7fa334011830, thd=0x7fa334000b00) at /data/src/10.2-bug/sql/sql_alter.cc:324 #10 0x000055b9a28c9e5f in mysql_execute_command (thd=0x7fa334000b00) at /data/src/10.2-bug/sql/sql_parse.cc:6209 #11 0x000055b9a28ce78a in mysql_parse (thd=0x7fa334000b00, rawbuf=0x7fa334011018 "ALTER TABLE t1 MODIFY COLUMN f18 DECIMAL(10,2)", length=46, parser_state=0x7fa3a421e250, is_com_multi=false, is_next_command=false) at /data/src/10.2-bug/sql/sql_parse.cc:7900 #12 0x000055b9a28bc698 in dispatch_command (command=COM_QUERY, thd=0x7fa334000b00, packet=0x7fa334019241 "ALTER TABLE t1 MODIFY COLUMN f18 DECIMAL(10,2)", packet_length=46, is_com_multi=false, is_next_command=false) at /data/src/10.2-bug/sql/sql_parse.cc:1805 #13 0x000055b9a28baff6 in do_command (thd=0x7fa334000b00) at /data/src/10.2-bug/sql/sql_parse.cc:1360 #14 0x000055b9a2a08e17 in do_handle_one_connection (connect=0x55b9a5cce450) at /data/src/10.2-bug/sql/sql_connect.cc:1335 #15 0x000055b9a2a08ba4 in handle_one_connection (arg=0x55b9a5cce450) at /data/src/10.2-bug/sql/sql_connect.cc:1241 #16 0x00007fa3a7821494 in start_thread (arg=0x7fa3a421f700) at pthread_create.c:333 #17 0x00007fa3a5c0793f in clone () from /lib/x86_64-linux-gnu/libc.so.6 MariaDB [test]> show processlist; +----+-------------+-----------------+------+---------+------+--------------------------+------------------------------------------------+----------+ | Id | User | Host | db | Command | Time | State | Info | Progress | +----+-------------+-----------------+------+---------+------+--------------------------+------------------------------------------------+----------+ | 3 | system user | | NULL | Daemon | NULL | InnoDB purge worker | NULL | 0.000 | | 2 | system user | | NULL | Daemon | NULL | InnoDB purge worker | NULL | 0.000 | | 1 | system user | | NULL | Daemon | NULL | InnoDB purge coordinator | NULL | 0.000 | | 4 | system user | | NULL | Daemon | NULL | InnoDB purge worker | NULL | 0.000 | | 5 | system user | | NULL | Daemon | NULL | InnoDB shutdown handler | NULL | 0.000 | | 9 | root | localhost:60418 | test | Query | 38 | enabling keys | ALTER TABLE t1 MODIFY COLUMN f18 DECIMAL(10,2) | 0.000 | | 10 | root | localhost:60419 | test | Query | 0 | init | show processlist | 0.000 | +----+-------------+-----------------+------+---------+------+--------------------------+------------------------------------------------+----------+ 7 rows in set (0.00 sec) The server doesn't hang, it keeps looping inside fts_sync*, hence different stack traces. I don't know why exactly copy_data_between_tables would fail for the user, it appears there can be different reasons, the function seems to have many exit points. Since, as per marko 's analysis (which haven't been posted here yet, but it should be), FTS is only marginally fixable in current versions, maybe it makes sense to investigate the cause of copy_data_between_tables failure (assuming that's indeed what failed on the user's side – so far it's the only route that brought me to this exact pair of stack traces and ALTER hanging exactly in enabling keys ). The effect seems to depend on the amount or length or contents of the data in the table. If I remove a part of the data from my attached test case, then instead of the hang I get the assertion failure on the debug build: mysqld: /data/src/10.2-bug/sql/protocol.cc:587: void Protocol::end_statement(): Assertion `0' failed. 171227 21:22:09 [ERROR] mysqld got signal 6 ;   #6 0x00007ff4cbfcbe37 in __assert_fail_base () from /lib/x86_64-linux-gnu/libc.so.6 #7 0x00007ff4cbfcbee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6 #8 0x000055fa8fed0703 in Protocol::end_statement (this=0x7ff478001098) at /data/src/10.2-bug/sql/protocol.cc:587 #9 0x000055fa8ff97192 in dispatch_command (command=COM_QUERY, thd=0x7ff478000b00, packet=0x7ff478008941 "", packet_length=46, is_com_multi=false, is_next_command=false) at /data/src/10.2-bug/sql/sql_parse.cc:2361 #10 0x000055fa8ff93ff6 in do_command (thd=0x7ff478000b00) at /data/src/10.2-bug/sql/sql_parse.cc:1360 #11 0x000055fa900e1e17 in do_handle_one_connection (connect=0x55fa94255340) at /data/src/10.2-bug/sql/sql_connect.cc:1335 #12 0x000055fa900e1ba4 in handle_one_connection (arg=0x55fa94255340) at /data/src/10.2-bug/sql/sql_connect.cc:1241 #13 0x000055fa90500814 in pfs_spawn_thread (arg=0x55fa9425f8b0) at /data/src/10.2-bug/storage/perfschema/pfs.cc:1863 #14 0x00007ff4cdca2494 in start_thread (arg=0x7ff4c4438700) at pthread_create.c:333 #15 0x00007ff4cc08893f in clone () from /lib/x86_64-linux-gnu/libc.so.6   Some pointers may be invalid and cause the dump to abort. Query (0x7ff4780124e8): ALTER TABLE t1 MODIFY COLUMN f18 DECIMAL(10,2) Connection ID (thread ID): 9 Status: NOT_KILLED It's not a very helpful assertion, but the point is, if the user tries to change the amount of data in the problematic table, maybe there will be more information about the underlying problem, e.g. some SQL errors, which will help to find a viable workaround. Meanwhile, since marko already worked on the issue, I'm assigning it to him. Note: The hang/loop is reproducible on 10.2 and 10.1, I didn't try 10.0 or 10.3 or MySQL.
            valerii Valerii Kravchuk added a comment - See upstream https://bugs.mysql.com/bug.php?id=88844 also...

            I can repeat the hang in MariaDB Server 10.0 and MySQL 5.7.21.
            The hang occurs when trying to drop the fully created intermediate copy of the table:

            10.0 4a5c237c76232cdd054bca5cb4affc2b9c22b688

            #2  0x00000000012b78ca in row_drop_table_for_mysql (
                name=0x7fffe3b963e0 "test/#sql-b0e_2", trx=<optimized out>, 
                drop_db=<optimized out>, create_failed=0, nonatomic=248)
                at /mariadb/10.0/storage/xtradb/row/row0mysql.cc:3924
            #3  0x0000000001165b78 in ha_innobase::delete_table (this=<optimized out>, 
                name=0x621000333ce8 "./test/#sql-b0e_2")
                at /mariadb/10.0/storage/xtradb/handler/ha_innodb.cc:11547
            #4  0x000000000073fcd5 in rm_temporary_table (base=<optimized out>, 
                path=0x7fffe3b96c20 "\350<3") at /mariadb/10.0/sql/sql_base.cc:5737
            #5  0x000000000073de26 in close_temporary (table=0x621000332d70, 
                free_share=<error reading variable: access outside bounds of object referenced via synthetic pointer>, 
                delete_table=<error reading variable: access outside bounds of object referenced via synthetic pointer>) at /mariadb/10.0/sql/sql_base.cc:1762
            #6  0x000000000073fa13 in close_temporary_table (thd=<optimized out>, 
                table=<optimized out>, 
                free_share=<error reading variable: access outside bounds of object referenced via synthetic pointer>, 
                delete_table=<error reading variable: access outside bounds of object referenced via synthetic pointer>) at /mariadb/10.0/sql/sql_base.cc:1739
            #7  0x0000000000934175 in mysql_alter_table (thd=0x62a000066270, 
                new_db=0x7fffe3b98390 "\360(\005", new_name=<optimized out>, 
                create_info=0x7fffe3b98ce0, table_list=<optimized out>, 
                alter_info=0x7fffe3b98e50, order_num=0, order=<optimized out>, 
                ignore=<optimized out>) at /mariadb/10.0/sql/sql_table.cc:9284
            #8  0x00000000009fec03 in Sql_cmd_alter_table::execute (this=<optimized out>, 
                thd=<optimized out>) at /mariadb/10.0/sql/sql_alter.cc:306
            

            The table contains 3 indexes: PRIMARY, the hidden FTS_DOC_ID_INDEX(FTS_DOC_ID), and the FULLTEXT INDEX ft_unit. In the FULLTEXT INDEX, the field index_fts_syncing is set, preventing the table from being dropped.

            marko Marko Mäkelä added a comment - I can repeat the hang in MariaDB Server 10.0 and MySQL 5.7.21. The hang occurs when trying to drop the fully created intermediate copy of the table: 10.0 4a5c237c76232cdd054bca5cb4affc2b9c22b688 #2 0x00000000012b78ca in row_drop_table_for_mysql ( name=0x7fffe3b963e0 "test/#sql-b0e_2", trx=<optimized out>, drop_db=<optimized out>, create_failed=0, nonatomic=248) at /mariadb/10.0/storage/xtradb/row/row0mysql.cc:3924 #3 0x0000000001165b78 in ha_innobase::delete_table (this=<optimized out>, name=0x621000333ce8 "./test/#sql-b0e_2") at /mariadb/10.0/storage/xtradb/handler/ha_innodb.cc:11547 #4 0x000000000073fcd5 in rm_temporary_table (base=<optimized out>, path=0x7fffe3b96c20 "\350<3") at /mariadb/10.0/sql/sql_base.cc:5737 #5 0x000000000073de26 in close_temporary (table=0x621000332d70, free_share=<error reading variable: access outside bounds of object referenced via synthetic pointer>, delete_table=<error reading variable: access outside bounds of object referenced via synthetic pointer>) at /mariadb/10.0/sql/sql_base.cc:1762 #6 0x000000000073fa13 in close_temporary_table (thd=<optimized out>, table=<optimized out>, free_share=<error reading variable: access outside bounds of object referenced via synthetic pointer>, delete_table=<error reading variable: access outside bounds of object referenced via synthetic pointer>) at /mariadb/10.0/sql/sql_base.cc:1739 #7 0x0000000000934175 in mysql_alter_table (thd=0x62a000066270, new_db=0x7fffe3b98390 "\360(\005", new_name=<optimized out>, create_info=0x7fffe3b98ce0, table_list=<optimized out>, alter_info=0x7fffe3b98e50, order_num=0, order=<optimized out>, ignore=<optimized out>) at /mariadb/10.0/sql/sql_table.cc:9284 #8 0x00000000009fec03 in Sql_cmd_alter_table::execute (this=<optimized out>, thd=<optimized out>) at /mariadb/10.0/sql/sql_alter.cc:306 The table contains 3 indexes: PRIMARY , the hidden FTS_DOC_ID_INDEX(FTS_DOC_ID) , and the FULLTEXT INDEX ft_unit . In the FULLTEXT INDEX, the field index_fts_syncing is set, preventing the table from being dropped.

            WIth a simple patch, 10.0 no longer hangs, but crashes in the same way that 10.2 crashed for me:

            Version: '10.0.35-MariaDB-debug'  socket: '/mariadb/10.0/build/mysql-test/var/tmp/mysqld.1.sock'  port: 16000  Source distribution
            [New Thread 0x7fffe3b9c700 (LWP 16831)]
            2018-03-12 12:14:48 7fffe7917700 InnoDB: FTS Optimize Removing table test/#sql-41a8_2
            mysqld: /mariadb/10.0/sql/protocol.cc:530: void Protocol::end_statement(): Assertion `0' failed.
            

            diff --git a/storage/xtradb/fts/fts0fts.cc b/storage/xtradb/fts/fts0fts.cc
            index 8838404460f..45f2e96d52c 100644
            --- a/storage/xtradb/fts/fts0fts.cc
            +++ b/storage/xtradb/fts/fts0fts.cc
            @@ -4627,6 +4627,7 @@ fts_sync(
             			ib_vector_get(cache->indexes, i));
             
             		if (index_cache->index->to_be_dropped
            +		    || index_cache->index->table->to_be_dropped
             		    || fts_sync_index_check(index_cache)) {
             			continue;
             		}
            

            I will try to trigger this hang with DROP TABLE.
            The patch does apply to 10.2 as well, but somehow I did not get the hang on 10.2. Maybe it was just (un)lucky scheduling. The code seems full of race conditions.

            Actually, I can repeat the hang in 10.2 non-debug build if I replace the ALTER TABLE with DROP TABLE. The above patch (to InnoDB) fixes it.

            I see that there has been some code cleanup in fts_sync() in MariaDB 10.2. I would backport that to 10.0 while fixing this.

            marko Marko Mäkelä added a comment - WIth a simple patch, 10.0 no longer hangs, but crashes in the same way that 10.2 crashed for me: Version: '10.0.35-MariaDB-debug' socket: '/mariadb/10.0/build/mysql-test/var/tmp/mysqld.1.sock' port: 16000 Source distribution [New Thread 0x7fffe3b9c700 (LWP 16831)] 2018-03-12 12:14:48 7fffe7917700 InnoDB: FTS Optimize Removing table test/#sql-41a8_2 mysqld: /mariadb/10.0/sql/protocol.cc:530: void Protocol::end_statement(): Assertion `0' failed. diff --git a/storage/xtradb/fts/fts0fts.cc b/storage/xtradb/fts/fts0fts.cc index 8838404460f..45f2e96d52c 100644 --- a/storage/xtradb/fts/fts0fts.cc +++ b/storage/xtradb/fts/fts0fts.cc @@ -4627,6 +4627,7 @@ fts_sync( ib_vector_get(cache->indexes, i)); if (index_cache->index->to_be_dropped + || index_cache->index->table->to_be_dropped || fts_sync_index_check(index_cache)) { continue; } I will try to trigger this hang with DROP TABLE . The patch does apply to 10.2 as well, but somehow I did not get the hang on 10.2. Maybe it was just (un)lucky scheduling. The code seems full of race conditions. Actually, I can repeat the hang in 10.2 non-debug build if I replace the ALTER TABLE with DROP TABLE . The above patch (to InnoDB) fixes it. I see that there has been some code cleanup in fts_sync() in MariaDB 10.2. I would backport that to 10.0 while fixing this.
            marko Marko Mäkelä added a comment - The hang was introduced by the change Oracle Bug #24938374 MYSQL CRASHED AFTER LONG WAIT ON DICT OPERATION LOCK WHILE SYNCING FTS INDEX in MySQL 5.6.38.

            debugging, fix, testing, analysis

            marko Marko Mäkelä added a comment - debugging, fix, testing, analysis

            People

              marko Marko Mäkelä
              valerii Valerii Kravchuk
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.