MariaDB Server / MDEV-18654

Failing assertion: sym_node->table != NULL in buildbot with innodb_fts.sync_ddl and outside

Details

    Description

      http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest2/builds/16190

      innodb_fts.sync_ddl 'innodb'            w2 [ fail ]
              Test ended at 2019-02-04 11:04:55
       
      CURRENT_TEST: innodb_fts.sync_ddl
      Warning: /mnt/buildbot/build/mariadb-10.3.13/libmysqld/examples/mysqltest_embedded: unknown variable 'loose-ssl-ca=/mnt/buildbot/build/mariadb-10.3.13/mysql-test/std_data/cacert.pem'
      Warning: /mnt/buildbot/build/mariadb-10.3.13/libmysqld/examples/mysqltest_embedded: unknown variable 'loose-ssl-cert=/mnt/buildbot/build/mariadb-10.3.13/mysql-test/std_data/client-cert.pem'
      Warning: /mnt/buildbot/build/mariadb-10.3.13/libmysqld/examples/mysqltest_embedded: unknown variable 'loose-ssl-key=/mnt/buildbot/build/mariadb-10.3.13/mysql-test/std_data/client-key.pem'
      Warning: /mnt/buildbot/build/mariadb-10.3.13/libmysqld/examples/mysqltest_embedded: unknown option '--loose-skip-ssl'
      2019-02-04 11:04:53 0xae6f5b40  InnoDB: Assertion failure in file /home/buildbot/buildbot/build/mariadb-10.3.13/storage/innobase/pars/pars0pars.cc line 815
      InnoDB: Failing assertion: sym_node->table != NULL
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mysqld startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
      InnoDB: about forcing recovery.
      mysqltest got signal 6
      read_command_buf (0x829cac98): TRUN
      conn->name (0x82c7c220): 
      conn->cur_query (0x82c933d8): TRUNCATE TABLE t1
      Attempting backtrace...
      stack_bottom = 0x0 thread_stack 0x49000
      /mnt/buildbot/build/mariadb-10.3.13/libmysqld/examples/mysqltest_embedded(my_print_stacktrace+0x3c)[0x803a871a]
      mysys/stacktrace.c:269(my_print_stacktrace)[0x8038af28]
      client/mysqltest.cc:9092(dump_backtrace())[0x8038af66]
      addr2line: '': No such file
      [0xb7725c14]
      [0xb7725c31]
      /lib/i386-linux-gnu/libc.so.6(gsignal+0x39)[0xb7091e89]
      /lib/i386-linux-gnu/libc.so.6(abort+0x157)[0xb70933e7]
      include/sync0types.h:1130(my_atomic_loadlint(unsigned int const*))[0x80d768e2]
      pars/pars0pars.cc:817(pars_retrieve_table_def(sym_node_t*))[0x80e2a457]
      pars/pars0pars.cc:1315(pars_insert_statement(sym_node_t*, void*, sel_node_t*))[0x80e2aff0]
      innobase/pars0grm.y:375(yyparse())[0x80dfbc60]
      pars/pars0pars.cc:2135(pars_sql(pars_info_t*, char const*))[0x80e2c23e]
      fts/fts0sql.cc:184(fts_parse_sql(fts_table_t*, pars_info_t*, char const*))[0x80c5aafd]
      fts/fts0fts.cc:3930(fts_write_node(trx_t*, que_fork_t**, fts_table_t*, fts_string_t*, fts_node_t*))[0x80b69295]
      fts/fts0fts.cc:4071(fts_sync_write_words(trx_t*, fts_index_cache_t*, bool))[0x80b69841]
      fts/fts0fts.cc:4152(fts_sync_index(fts_sync_t*, fts_index_cache_t*))[0x80b69e0e]
      fts/fts0fts.cc:4397(fts_sync(fts_sync_t*, bool, bool, bool))[0x80b6a6de]
      fts/fts0fts.cc:4482(fts_sync_table(dict_table_t*, bool, bool, bool))[0x80b6a9c5]
      fts/fts0opt.cc:2829(fts_optimize_sync_table(unsigned long long))[0x80d2c078]
      fts/fts0opt.cc:2942(fts_optimize_thread(void*))[0x80d2c469]
      /lib/i386-linux-gnu/libpthread.so.0(+0x62b5)[0xb76fc2b5]
      /lib/i386-linux-gnu/libc.so.6(clone+0x6e)[0xb714d16e]
      Writing a core file...
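      The stack shows the background FTS sync (fts_optimize_thread -> fts_sync -> fts_parse_sql) parsing internal SQL against the fulltext auxiliary tables while the client connection is running TRUNCATE TABLE t1. In pars_retrieve_table_def() the auxiliary table can no longer be opened by name, apparently because the concurrent TRUNCATE has dropped it and not yet recreated it, so ut_a(sym_node->table != NULL) fires. A minimal standalone C++ model of that race follows; it uses standard-library stand-ins (the table name, open_table() and dictionary below are illustrative only, not InnoDB code):

      // Standalone model of the race: a std::set and std::mutex stand in for the
      // data dictionary. The "sync" thread re-opens an FTS auxiliary table by name
      // while the "TRUNCATE" thread drops and recreates it; without a metadata
      // lock the lookup can land in the window where the table does not exist,
      // which is what ut_a(sym_node->table != NULL) catches in the real code.
      #include <cassert>
      #include <chrono>
      #include <mutex>
      #include <set>
      #include <string>
      #include <thread>

      static std::mutex dict_mutex;              // stands in for the dictionary mutex
      static std::set<std::string> dictionary;   // stands in for the dictionary cache

      // Stand-in for dict_table_open_on_name(): "not found" models a NULL return.
      static bool open_table(const std::string& name)
      {
              std::lock_guard<std::mutex> g(dict_mutex);
              return dictionary.count(name) != 0;
      }

      int main()
      {
              const std::string aux = "FTS_AUX_INDEX_1";  // illustrative name
              dictionary.insert(aux);

              std::thread ddl([&] {                       // TRUNCATE TABLE t1
                      {
                              std::lock_guard<std::mutex> g(dict_mutex);
                              dictionary.erase(aux);      // auxiliary table dropped...
                      }
                      std::this_thread::sleep_for(std::chrono::milliseconds(10));
                      {
                              std::lock_guard<std::mutex> g(dict_mutex);
                              dictionary.insert(aux);     // ...and recreated later
                      }
              });

              std::thread sync([&] {                      // fts_optimize_thread -> fts_sync
                      std::this_thread::sleep_for(std::chrono::milliseconds(5));
                      assert(open_table(aux));            // may fire, like pars0pars.cc:815
              });

              ddl.join();
              sync.join();
      }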
      

      Activity

            thiru Thirunarayanan Balathandayuthapani added a comment -

            /* Make a concurrent Drop fts Index to wait until sync of that
            fts index is happening in the background */
            for (int retry_count = 0;;) {
                    bool retry = false;

                    for (inplace_alter_handler_ctx** pctx = ctx_array;
                         *pctx; pctx++) {
                            ha_innobase_inplace_ctx* ctx
                                    = static_cast<ha_innobase_inplace_ctx*>(*pctx);
                            DBUG_ASSERT(new_clustered == ctx->need_rebuild());

                            if (dict_fts_index_syncing(ctx->old_table)) {
                                    retry = true;
                                    break;
                            }

                            if (new_clustered && dict_fts_index_syncing(ctx->new_table)) {
                                    retry = true;
                                    break;
                            }
                    }

            We don't need to check the new table in ha_innobase::commit_inplace_alter_table(), because we already call fts_optimize_remove_table() for the new table just before locking the data dictionary. The simplest solution is to acquire an MDL lock for fts_sync(), so that DDL cannot drop the table or index in the meantime. This is already done for the purge thread when it encounters a virtual column.
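
            A toy sketch of that locking idea, continuing the standard-library model from the description (std::shared_mutex stands in for MDL; the names and structure are illustrative, not the code that was eventually committed): the background sync holds the lock shared for the whole sync, while the DDL holds it exclusive across the drop and re-create, so the sync can never observe the window where the auxiliary tables are missing.

            // Toy illustration of guarding the background FTS sync with a
            // metadata lock. std::shared_mutex stands in for MDL; this is a
            // sketch of the idea only, not the actual fix.
            #include <cassert>
            #include <shared_mutex>
            #include <thread>

            static std::shared_mutex mdl;          // per-table metadata lock (stand-in)
            static bool aux_tables_exist = true;   // stands in for the FTS_* auxiliary tables

            static void truncate_table()           // DDL: TRUNCATE TABLE t1
            {
                    std::unique_lock<std::shared_mutex> x(mdl);  // exclusive MDL
                    aux_tables_exist = false;      // drop the auxiliary tables...
                    aux_tables_exist = true;       // ...and recreate them
            }                                      // the gap is never visible to readers

            static void fts_sync_sketch()          // background sync thread
            {
                    std::shared_lock<std::shared_mutex> s(mdl);  // shared MDL for the whole sync
                    assert(aux_tables_exist);      // cannot fire while the shared lock is held
            }

            int main()
            {
                    std::thread a(truncate_table);
                    std::thread b(fts_sync_sketch);
                    a.join();
                    b.join();
            }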


            marko Marko Mäkelä added a comment -

            I pushed some cleanup as the first commit associated with MDEV-18220. Because the logic related to dict_index_t::index_fts_syncing was not touched, the assertion should still be able to fail when using KILL QUERY or when shutting down the server.

            marko Marko Mäkelä added a comment -

            elenst, could this have been fixed in 10.5 by MDEV-16678?

            elenst Elena Stepanova added a comment -

            Among my test failure records, I have one from June 3rd, 2020 which was automatically recognized as MDEV-18654. The logs are no longer available, so I can't confirm or deny that it was an accurate recognition. It happened on the 10.4 main branch; the revision hash isn't recorded, but it's typically up to date when the tests are run.

            marko Marko Mäkelä added a comment -

            In the buildbot cross-reference, the latest failure of this type was on the bb-10.5-MDEV-19176 branch on 2019-12-20 and 2019-12-22, on two builders for the same commit. It is possible that the commit itself was at fault. Its parent commit 305081a7354f4ef17d2e16ca16f747ee754fee69 did include MDEV-16678, which I believe should have fixed this bug.

            There were some follow-up fixes to MDEV-16678 for the MDL acquisition. The last one is MDEV-21344, which was applied to 10.5.1. I think we can conclude that this bug was fixed in MariaDB 10.5.1.

            People

              Assignee: thiru Thirunarayanan Balathandayuthapani
              Reporter: elenst Elena Stepanova
              Votes: 0
              Watchers: 3

