MariaDB Server / MDEV-27103

mariadb-upgrade fails with 'System table spider_tables is different version' => Can't create database 'performance_schema'

Details

    Description

      e.g. upgrade 10.2 -> latest 10.6

      Phase 4/7: Running 'mysql_fix_privilege_tables'
      ERROR 12609 (HY000) at line 786: System table spider_tables is different version
      ERROR 12609 (HY000) at line 805: System table spider_tables is different version
      ERROR 1007 (HY000) at line 810: Can't create database 'performance_schema'; database exists
      

          Activity

            ycp Yuchen Pei added a comment - - edited

            I'm not sure how to debug this with gdb, so I added an abort()
            at the point where the failure is triggered.

            10.6 90e11488ac1eafaede6e921133059bd2e08da2be

            modified   storage/spider/spd_sys_table.cc
            @@ -359,6 +359,7 @@ TABLE *spider_open_sys_table(
                     {
                       spider_close_sys_table(thd, table, open_tables_backup, need_lock);
                       table = NULL;
            +          abort();
                       my_printf_error(ER_SPIDER_SYS_TABLE_VERSION_NUM,
                         ER_SPIDER_SYS_TABLE_VERSION_STR, MYF(0),
                         SPIDER_SYS_TABLES_TABLE_NAME_STR);
             

            and here's the trace at the abort

            spider/spd_sys_table.cc:371(spider_open_sys_table(THD*, char const*, int, bool, start_new_trans**, bool, int*))[0x7ff50965c951]
            spider/ha_spider.cc:11867(ha_spider::delete_table(char const*))[0x7ff509742f0c]
            addr2line: 'sql/mysqld': No such file
            sql/mysqld(+0xd074de)[0x5609613924de]
            sql/mysqld(_Z15ha_delete_tableP3THDP10handlertonPKcPK25st_mysql_const_lex_stringS7_b+0x102)[0x560961398abb]
            sql/mysqld(+0xd16515)[0x5609613a1515]
            sql/mysqld(_Z24plugin_foreach_with_maskP3THDPFcS0_PP13st_plugin_intPvEijS4_+0x28f)[0x5609610100da]
            sql/mysqld(_Z21ha_delete_table_forceP3THDPKcPK25st_mysql_const_lex_stringS5_+0xe2)[0x5609613a1683]
            sql/mysqld(_Z23mysql_rm_table_no_locksP3THDP10TABLE_LISTPK25st_mysql_const_lex_stringP16st_ddl_log_statebbbbbb+0x13e3)[0x5609610e94fe]
            sql/mysqld(_Z14mysql_rm_tableP3THDP10TABLE_LISTbbbb+0x3a3)[0x5609610e7f36]
            sql/mysqld(_Z21mysql_execute_commandP3THDb+0x5099)[0x560960ff6646]
            sql/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x2a7)[0x56096100065e]
            sql/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjb+0x10b7)[0x560960feca67]
            sql/mysqld(_Z10do_commandP3THDb+0x96a)[0x560960feb435]
            sql/mysqld(_Z24do_handle_one_connectionP7CONNECTb+0x195)[0x5609611b0c56]
            sql/mysqld(handle_one_connection+0x5b)[0x5609611b09c1]
            sql/mysqld(+0x105134c)[0x5609616dc34c]
            /lib/x86_64-linux-gnu/libpthread.so.0(+0x7ea7)[0x7ff531998ea7]
            /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7ff531216a2f]
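
Several frames in the trace are printed with mangled C++ names. As a reading aid, here is a minimal sketch that demangles such a symbol with abi::__cxa_demangle from <cxxabi.h> (available with GCC and Clang). The helper name demangle() is an illustration and is not part of the MariaDB sources.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: return the demangled form of an Itanium-ABI symbol,
// or the input unchanged if it is not a valid mangled name.
std::string demangle(const std::string &mangled) {
  int status = 0;
  char *out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status != 0 || out == nullptr) {
    std::free(out);  // free(nullptr) is a no-op
    return mangled;
  }
  std::string result(out);
  std::free(out);  // the buffer is malloc()-allocated by __cxa_demangle
  return result;
}
```

For example, passing "_Z14mysql_rm_tableP3THDP10TABLE_LISTbbbb" yields a readable mysql_rm_table(...) signature, while a plain C symbol such as handle_one_connection comes back unchanged. The c++filt command-line tool does the same job on whole trace lines.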

            So mysql_rm_table_no_locks() is iterating over all storage engines to
            delete some table here, and somehow the spider system table
            mysql.spider_tables has the 10.5 schema with 26 columns, rather
            than the 10.6 schema with 28 columns.

            So I suspect a race condition caused by the asynchronous
            spider init during mysql_upgrade: the spider init had not
            completed when the server tried to delete a table. The fix for
            MDEV-22979, which makes the spider init synchronous, should
            therefore fix this issue too. Given that the affected versions
            are <=10.6, we have to wait for MDEV-29870.
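
To make the suspected ordering problem concrete, here is a small single-threaded model of it. It is only a sketch: the names SysTable, spider_init(), drop_table_passes_check() and the two upgrade_* functions are hypothetical, the check is assumed to be a plain comparison of the column count, and no real threads or Spider code are involved. The column counts 26 and 28 are the ones quoted above.

```cpp
#include <cstddef>
#include <functional>
#include <queue>

// Hypothetical stand-in for mysql.spider_tables.
struct SysTable {
  std::size_t columns = 26;  // pre-upgrade (10.5) layout
};

// Hypothetical stand-in for the init queries that bring the table
// to the 10.6 layout.
void spider_init(SysTable &t) { t.columns = 28; }

// Assumed shape of the version check reached from ha_spider::delete_table():
// the table must have the column count the running server expects.
bool drop_table_passes_check(const SysTable &t) { return t.columns == 28; }

// Asynchronous model: init is only queued at plugin load and runs later,
// so a DROP TABLE issued in between still sees the 26-column layout.
bool upgrade_with_deferred_init() {
  SysTable t;
  std::queue<std::function<void()>> pending;
  pending.push([&t] { spider_init(t); });
  bool ok = drop_table_passes_check(t);  // upgrade statement arrives first
  while (!pending.empty()) {             // init finally runs, too late
    pending.front()();
    pending.pop();
  }
  return ok;
}

// Synchronous model: init has completed before any statement is served.
bool upgrade_with_sync_init() {
  SysTable t;
  spider_init(t);
  return drop_table_passes_check(t);
}
```

In this model upgrade_with_deferred_init() reports a failed check and upgrade_with_sync_init() reports success, which is the behaviour difference the synchronous-init fix is expected to bring.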

            ycp Yuchen Pei added a comment -

            I tested the 10.5->10.6 upgrade at the following MDEV-29870
            commits, and I can confirm that the fixes for the init bugs also
            fix this issue.

            5b847372069 upstream/bb-10.5-mdev-29870 MDEV-32753 Make spider init queries compatible with oracle sql mode

            c5bab76559c upstream/bb-10.6-mdev-29870 MDEV-32753 Make spider init queries compatible with oracle sql mode

            ycp Yuchen Pei added a comment -

            blocked by having MDEV-27595 in 10.6

            ycp Yuchen Pei added a comment - - edited

            Now that MDEV-29870 is in both 10.5 and 10.6, strangely, I could reproduce the bug again...

            10.5 2b8c59fffaaf33455d2a226ecc209b5763c2b7bf
            10.6 f9ae553067143a9db496e49929488c4b87bb2d24

            I also tested the other two combinations, and here's the summary:

            worked:
            10.5 5b847372069
            10.6 c5bab76559c

            10.5 2b8c59fffaa
            10.6 c5bab76559c

            not working:
            10.5 2b8c59fffaa
            10.6 f9ae5530671

            10.5 5b847372069
            10.6 f9ae5530671

            So something in current 10.6 is preventing it from working, as the
            custom branch for 10.6 was ok.
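
The conclusion above can be checked mechanically from the four reported runs. The following sketch encodes them and tests which side of the upgrade determines the outcome; the type and function names are illustrative only, and the data is exactly the summary listed above.

```cpp
#include <string>
#include <vector>

// One upgrade attempt: the 10.5 commit upgraded from, the 10.6 commit
// upgraded to, and whether mysql_upgrade succeeded.
struct UpgradeRun {
  std::string from_10_5;
  std::string to_10_6;
  bool worked;
};

// The four combinations reported in the summary.
std::vector<UpgradeRun> reported_runs() {
  return {
      {"5b847372069", "c5bab76559c", true},
      {"2b8c59fffaa", "c5bab76559c", true},
      {"2b8c59fffaa", "f9ae5530671", false},
      {"5b847372069", "f9ae5530671", false},
  };
}

// True if runs that share the same key always share the same outcome.
template <typename KeyFn>
bool outcome_determined_by(const std::vector<UpgradeRun> &runs, KeyFn key) {
  for (const auto &a : runs)
    for (const auto &b : runs)
      if (key(a) == key(b) && a.worked != b.worked) return false;
  return true;
}

bool determined_by_10_6(const std::vector<UpgradeRun> &runs) {
  return outcome_determined_by(
      runs, [](const UpgradeRun &r) { return r.to_10_6; });
}

bool determined_by_10_5(const std::vector<UpgradeRun> &runs) {
  return outcome_determined_by(
      runs, [](const UpgradeRun &r) { return r.from_10_5; });
}
```

With this data, determined_by_10_6() holds and determined_by_10_5() does not: each 10.5 commit appears in both a working and a failing run, so only the 10.6 side explains the difference.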

            I also tested 4f091920679, the earliest merge to 10.6
            containing the MDEV-29870 patches, plus the fixup that moves
            ddl_log_initialize() to before plugin_init(). That fixup differs
            from the current fix for the same problem (0930eb86cb0 Spider
            cannot run DDL (e.g. create tables) before ddl recovery), but
            the upgrade fails there too.

            It turned out to be caused by a recent merge that did not contain
            the correct spider init queries. After a fixup, the 10.5->10.6
            upgrade works, and so does the 10.4->10.6 upgrade. I will close
            this ticket as fixed once the fixup is pushed.

            ycp Yuchen Pei added a comment -

            Fixed by MDEV-29870


            People

              ycp Yuchen Pei
              tturenko Timofey Turenko