[MDEV-27103] mariadb-upgrade fails with 'System table spider_tables is different version' => Can't create database 'performance_schema' Created: 2021-11-19  Updated: 2023-12-22  Resolved: 2023-12-22

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Spider, Upgrades
Affects Version/s: 10.4, 10.5, 10.6, 10.7
Fix Version/s: 10.4.33, 10.5.24, 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3

Type: Bug
Priority: Critical
Reporter: Timofey Turenko
Assignee: Yuchen Pei
Resolution: Fixed
Votes: 0
Labels: affects-tests

Issue Links:
Blocks
is blocked by MDEV-29870 Backport fixes to spider init bugs to... Closed
Relates
relates to MDEV-22979 "mysqld --bootstrap" / mysql_install_... Closed
relates to MDEV-27233 Server hangs when using --init-file w... Closed

 Description   

For example, when upgrading from 10.2 to the latest 10.6:

Phase 4/7: Running 'mysql_fix_privilege_tables'
ERROR 12609 (HY000) at line 786: System table spider_tables is different version
ERROR 12609 (HY000) at line 805: System table spider_tables is different version
ERROR 1007 (HY000) at line 810: Can't create database 'performance_schema'; database exists



 Comments   
Comment by Elena Stepanova [ 2021-11-21 ]

My initial suspicions were wrong: this is related neither to Spider being non-stable in 10.2 nor to the race condition upon mysql_upgrade. It is a plain upgrade problem.

During the mysql_upgrade process, the Spider errors are thrown when the performance_schema database is dropped, so the performance_schema error that follows is only a consequence: it happens because the DROP failed.

It can be reproduced without package installation, as long as the differences in Spider setup procedure are taken into account.

For example,

  • start 10.2 server on a clean datadir;
  • run share/install_spider.sql which creates a number of tables and installs the plugin;
  • shut down the server;
  • start 10.6 server on the same datadir;
  • run mysql_upgrade.

Or,

  • start 10.5 server on a clean datadir;
  • run INSTALL SONAME 'ha_spider' (there is no SQL script anymore; the Spider tables now appear automatically upon plugin installation);
  • shut down the server;
  • start 10.6 server on the same datadir;
  • run mysql_upgrade.

Affected upgrade combinations:

  • 10.[23] => 10.[4567]
  • 10.[45] => 10.[67]

Oddly, the upgrade from 10.4 to 10.5 is not affected; maybe it was handled in some special way.
The upgrade from 10.6 to 10.7 does not seem to be affected either.
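
To make the shorthand above explicit, here is a small sketch that expands the two bracket patterns into individual version pairs. The helper names (expand, affected_pairs) are made up for this illustration, and the parser assumes single-digit minor versions, which is all the patterns above use.

```python
import re
from itertools import product

# The two shorthand rules, copied from the list above.
RULES = ["10.[23] => 10.[4567]", "10.[45] => 10.[67]"]

def expand(pattern):
    """Expand '10.[23]' into ['10.2', '10.3']; assumes single-digit minors."""
    match = re.fullmatch(r"(\d+)\.\[(\d+)\]", pattern)
    if match is None:
        return [pattern]  # already a plain version such as '10.6'
    major, minors = match.groups()
    return [f"{major}.{minor}" for minor in minors]

def affected_pairs(rules):
    """Return the set of (old, new) version pairs covered by the rules."""
    pairs = set()
    for rule in rules:
        old, new = (side.strip() for side in rule.split("=>"))
        pairs.update(product(expand(old), expand(new)))
    return pairs

PAIRS = affected_pairs(RULES)

for old, new in sorted(PAIRS):
    print(f"{old} -> {new}")

# The two upgrades reported as unaffected are not covered by either rule.
print(("10.4", "10.5") in PAIRS)
print(("10.6", "10.7") in PAIRS)
```

The rules expand to 12 distinct pairs; neither 10.4 -> 10.5 nor 10.6 -> 10.7 is among them, which matches the observations above.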

Comment by Nayuta Yanagisawa (Inactive) [ 2022-02-07 ]

The error "System table spider_tables is different version" is raised when the number of columns in a Spider system table differs from what is expected. I would expect the system tables to be fixed once Spider is loaded after the upgrade, but this does not seem to work properly. I have not yet fully understood the problem, but my guess is that the upgrades from 10.4 to 10.5 and from 10.6 to 10.7 are not affected because the number of columns in the Spider system tables did not change between these versions.
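
As a rough illustration of the check described here, the following toy model refuses to open a table whose column count is not the expected one. It is not Spider's code: the function and exception names are invented for the example, and the counts 26 and 28 are the ones reported later in this ticket for the 10.5 and 10.6 layouts of mysql.spider_tables.

```python
class SysTableVersionError(Exception):
    """Stands in for ERROR 12609 in this toy model."""

def open_sys_table(name, actual_columns, expected_columns):
    """Refuse to open a system table whose column count is unexpected."""
    if actual_columns != expected_columns:
        raise SysTableVersionError(f"System table {name} is different version")
    return name

def try_open(name, actual_columns, expected_columns):
    """Return 'ok' or the error text, so both cases can be printed."""
    try:
        open_sys_table(name, actual_columns, expected_columns)
        return "ok"
    except SysTableVersionError as err:
        return str(err)

# Table layout and server agree on the column count: the open succeeds.
print(try_open("spider_tables", 28, 28))
# Table still has the older layout while the server expects the newer one.
print(try_open("spider_tables", 26, 28))
```

Under this model, an upgrade between two versions with the same column count never trips the check, which is consistent with the guess about 10.4 -> 10.5 and 10.6 -> 10.7.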

The Spider initialization process will be changed significantly in MDEV-27233 and MDEV-22979. So, I will handle the present issue after these ones have been completed.

Comment by Timofey Turenko [ 2022-06-22 ]

In the latest ES release upgrade tests, the issue is reproducible in the following situations:

10.2 -> 10.6.8
Ubuntu Bionic aarch64
Debian Stretch x86/aarch64

10.3 -> 10.6.8
10.4 -> 10.6.8
Ubuntu Bionic aarch64
Ubuntu Focal aarch64
Debian Stretch x86/aarch64
Debian Buster aarch64

10.5 -> 10.6.8
Debian Stretch x86

Comment by Nayuta Yanagisawa (Inactive) [ 2022-06-27 ]

elenst, do you have any idea where Spider is loaded after the upgrade (between booting the newer version and running mysql_upgrade)?

Comment by Elena Stepanova [ 2022-06-27 ]

In both scenarios described here Spider would be first loaded dynamically (and added to mysql.plugin) by INSTALL PLUGIN on the old server version, would remain in mysql.plugin after the old version was shut down, and thus would be further loaded on server startup after the new version was started on the same datadir.

Comment by Yuchen Pei [ 2023-11-03 ]

I'm not sure how to debug this with gdb, so I added an abort()
at the point where the failure is triggered.

10.6 90e11488ac1eafaede6e921133059bd2e08da2be

modified   storage/spider/spd_sys_table.cc
@@ -359,6 +359,7 @@ TABLE *spider_open_sys_table(
         {
           spider_close_sys_table(thd, table, open_tables_backup, need_lock);
           table = NULL;
+          abort();
           my_printf_error(ER_SPIDER_SYS_TABLE_VERSION_NUM,
             ER_SPIDER_SYS_TABLE_VERSION_STR, MYF(0),
             SPIDER_SYS_TABLES_TABLE_NAME_STR);
 

And here is the trace at the abort:

spider/spd_sys_table.cc:371(spider_open_sys_table(THD*, char const*, int, bool, start_new_trans**, bool, int*))[0x7ff50965c951]
spider/ha_spider.cc:11867(ha_spider::delete_table(char const*))[0x7ff509742f0c]
addr2line: 'sql/mysqld': No such file
sql/mysqld(+0xd074de)[0x5609613924de]
sql/mysqld(_Z15ha_delete_tableP3THDP10handlertonPKcPK25st_mysql_const_lex_stringS7_b+0x102)[0x560961398abb]
sql/mysqld(+0xd16515)[0x5609613a1515]
sql/mysqld(_Z24plugin_foreach_with_maskP3THDPFcS0_PP13st_plugin_intPvEijS4_+0x28f)[0x5609610100da]
sql/mysqld(_Z21ha_delete_table_forceP3THDPKcPK25st_mysql_const_lex_stringS5_+0xe2)[0x5609613a1683]
sql/mysqld(_Z23mysql_rm_table_no_locksP3THDP10TABLE_LISTPK25st_mysql_const_lex_stringP16st_ddl_log_statebbbbbb+0x13e3)[0x5609610e94fe]
sql/mysqld(_Z14mysql_rm_tableP3THDP10TABLE_LISTbbbb+0x3a3)[0x5609610e7f36]
sql/mysqld(_Z21mysql_execute_commandP3THDb+0x5099)[0x560960ff6646]
sql/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x2a7)[0x56096100065e]
sql/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjb+0x10b7)[0x560960feca67]
sql/mysqld(_Z10do_commandP3THDb+0x96a)[0x560960feb435]
sql/mysqld(_Z24do_handle_one_connectionP7CONNECTb+0x195)[0x5609611b0c56]
sql/mysqld(handle_one_connection+0x5b)[0x5609611b09c1]
sql/mysqld(+0x105134c)[0x5609616dc34c]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7ea7)[0x7ff531998ea7]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7ff531216a2f]

So mysql_rm_table_no_locks() is iterating over all storage engines to
delete some table here, and somehow the spider system table
mysql.spider_tables has the 10.5 schema with 26 columns, rather
than the 10.6 schema with 28 columns.

So I suspect that this is a race condition caused by the asynchronous
spider init during mysql_upgrade. That is, the spider init has not
completed by the time the server tries to delete some table. So the
fix for MDEV-22979, which makes the spider init synchronous, should
fix this issue too. Since the versions affected by this issue are
<= 10.6, we have to wait for MDEV-29870.
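
The suspected ordering problem can be sketched as a replay of startup events. This only models the hypothesis stated above; the event names and the replay function are hypothetical and do not correspond to server internals.

```python
OLD_COLUMNS = 26  # 10.5 layout of mysql.spider_tables, per the trace analysis
NEW_COLUMNS = 28  # 10.6 layout

def replay(events):
    """Replay events in order and collect the errors a table delete would hit."""
    columns = OLD_COLUMNS  # the datadir was created by the old server
    errors = []
    for event in events:
        if event == "spider_init_done":
            columns = NEW_COLUMNS  # init has brought the table up to date
        elif event == "drop_table" and columns != NEW_COLUMNS:
            errors.append("System table spider_tables is different version")
    return errors

# Asynchronous init: the delete issued during mysql_upgrade may come first.
async_order = ["drop_table", "spider_init_done"]
# Synchronous init: statements are only processed after init has finished.
sync_order = ["spider_init_done", "drop_table"]

print(replay(async_order))
print(replay(sync_order))
```

In this model the error appears only when the delete is processed before the init completes, which is why a synchronous init would be expected to remove it.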

Comment by Yuchen Pei [ 2023-11-28 ]

I tested the 10.5->10.6 upgrade at the following MDEV-29870
commits, and I can confirm that the fixes for the init bugs also fix
this issue:

5b847372069 upstream/bb-10.5-mdev-29870 MDEV-32753 Make spider init queries compatible with oracle sql mode

c5bab76559c upstream/bb-10.6-mdev-29870 MDEV-32753 Make spider init queries compatible with oracle sql mode

Comment by Yuchen Pei [ 2023-12-07 ]

blocked by having MDEV-27595 in 10.6

Comment by Yuchen Pei [ 2023-12-21 ]

Now that MDEV-29870 is in both 10.5 and 10.6, strangely, I could reproduce the bug again...

10.5 2b8c59fffaaf33455d2a226ecc209b5763c2b7bf
10.6 f9ae553067143a9db496e49929488c4b87bb2d24

I also tested the other two combinations, and here's the summary:

worked:
10.5 5b847372069
10.6 c5bab76559c

10.5 2b8c59fffaa
10.6 c5bab76559c

not working:
10.5 2b8c59fffaa
10.6 f9ae5530671

10.5 5b847372069
10.6 f9ae5530671

So something in current 10.6 is preventing it from working, as the
custom branch for 10.6 was ok.
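
The inference from the four combinations can be checked mechanically. The sketch below records the results from the summary and asks, for each side of the upgrade, whether the commit on that side alone determines the outcome; the helper names are made up for this illustration.

```python
from collections import defaultdict

# (10.5 commit, 10.6 commit) -> did the upgrade work? Copied from the summary.
RESULTS = {
    ("5b847372069", "c5bab76559c"): True,
    ("2b8c59fffaa", "c5bab76559c"): True,
    ("2b8c59fffaa", "f9ae5530671"): False,
    ("5b847372069", "f9ae5530671"): False,
}

def outcomes_by(index, results):
    """Group outcomes by one side of the pair (0 = 10.5 commit, 1 = 10.6 commit)."""
    grouped = defaultdict(set)
    for pair, worked in results.items():
        grouped[pair[index]].add(worked)
    return dict(grouped)

def determines_outcome(index, results):
    """True if every commit on that side always gives the same outcome."""
    return all(len(seen) == 1 for seen in outcomes_by(index, results).values())

print(determines_outcome(0, RESULTS))  # 10.5 side: False
print(determines_outcome(1, RESULTS))  # 10.6 side: True
```

Each 10.5 commit appears in both a working and a failing combination, while each 10.6 commit always gives the same result, so the difference has to be on the 10.6 side.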

I also tested 4f091920679, which is the earliest merge to 10.6
containing the MDEV-29870 patches, plus the fixup that moves
ddl_log_initialize() to before plugin_init(). That fixup differs
from the current fix for the same problem (0930eb86cb0 Spider cannot
run DDL (e.g. create tables) before ddl recovery), but the upgrade
fails there too.

It turned out to be caused by a recent merge that did not contain the
correct spider init queries. After a fixup, the 10.5->10.6 upgrade now
works, and so does the 10.4->10.6 upgrade. I will close this ticket as
fixed once the fixup is pushed.

Comment by Yuchen Pei [ 2023-12-22 ]

Fixed by MDEV-29870

Generated at Thu Feb 08 09:50:21 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.