The issues with the Spider init (MDEV-22979,MDEV-27233,MDEV-29904,MDEV-30370) is more complicated than I thought. We want to fix the former two bugs without causing regression in the latter two.
The spider init queries require:
- Aria engine, for creation of spider system tables (mysql.spider_tables, etc.)
- mysql database, for creation of spider system tables
- acl init etc., for creation of udfs. I have not looked into exactly what is needed for creation of udfs, but acl_init is a necessary condition. In any case, I assume udf creation can wait for the server start.
For Aria engine, we can rearrange the plugin init order so that dynamic plugins like the spider are initialised after builtin plugins like Aria. For udfs, we can separate out these queries to a background thread to wait for server start.
The problem is when we load spider in bootstrap. In mysqld_main() is that it does the following things in order
- initialises all plugins (init_server_components())
- acl_init
- executes bootstrap queries (bootstrap(mysql_stdin))
- executes init file queries (read_init_file(opt_init_file))
The mysql_install_db script passes the creation of mysql database as part of the bootstrap queries. AIUI mysqld_main() assumes at this stage any builtin or dynamic (from plugin-load-add) storage engine either have been initialised to accept these queries, or will be taken care of by the bootstrap queries ("insider help"). For example, innodb system tables mysql.innodb_table_stats and mysql.innodb_index_stats are created using bootstrap queries which guarantees the mysql database exists first.
Spider does not have any "insider help" and cannot assume that mysql database exists when it is being initialised. Worse, it is possible that the spider storage engine is used before the mysql database is creatred: consider the following bootstrap queries:
drop table if exists foo.bar;
create database mysql;
The drop table statement may require all storage engine to be initialised, but spider requires the mysql database for initialisation. If we put spider init queries into a background thread waiting for the creation of the mysql database, it will deadlock on these queries.
I can think of 3 ways to fix this, in the order of increasing amount of work:
1. advise users to not use plugin-load-add=ha_spider, but use init_file with a file including install soname ha_spider; instead. If possible, abort with an error on or ignore plugin-load-add=ha_spider. This also means no spider during bootstrap because --bootstrap causes the server to exit upon execution of the bootstrap queries thus skipping everything that may happen afterwards including init_file. It is tempting to ban plugin-load-add=ha_spider only for bootstrap, but given the complexity introduced by the plugin init reordering and udf creation background thread and the risk of race conditions and deadlocks, it is much simpler to just run all the init queries synchronously and ban plugin-load-add=ha_spider altogether.
2. make bootstrap+spider invoke a init sql script instead ("insider help"). A bit like the good old install_spider.sql, except that 1) now we place it alongside other bootstrapping scripts make it so that it is run only with say an --install_spider flag, 2) the init of spider outside of bootstrapping (e.g. with --init_file or install soname) still follow the current code path of doing it in C++, both coming from a single source of truth.
3. full on resurrection of install_spider.sql. I am not sure what this entails. But will look more into it if needed.
For simplicity, I think the first option is the best for now.
In
MDEV-27107there was an explicit lock for the mariadb-upgrade (where a second mariadb-upgrade could terminate if it doesn't get the lock), but I was thinking of a second mariadb-plugin-init lock exclusively acquired when mariadb does its system tables updates, where it will wait. The intent is that this lock can be acquired by plugins like spider/columnstore for the install plugin / udf that may not have an easy work around.The other option that is probably also equally valid is ensuring that INSTALL PLUGIN and similar SQL on the system tables acquires metadata locks of the tables correctly (worthy of another task) if found to be deficient.