[MDEV-14622] Running mysql_upgrade in background upon package installation or upgrade causes deadlocks and other side-effects Created: 2017-12-11  Updated: 2022-01-29  Resolved: 2021-11-19

Status: Closed
Project: MariaDB Server
Component/s: Platform Debian
Affects Version/s: 10.1, 10.2, 10.3, 10.4, 10.5
Fix Version/s: 10.5.4

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Otto Kekäläinen
Resolution: Fixed Votes: 2
Labels: None

Attachments: PNG File screenshot-1.png    
Issue Links:
Duplicate
duplicates MDEV-19233 mysql_upgrade is not executed on pack... Closed
Relates
relates to MCOL-4057 Package ColumnStore 5.x with 10.5 ser... Closed
relates to MDEV-23061 Extra engines enabled through config ... Stalled
relates to MDEV-27068 running mariadb-upgrade on mariadb se... Closed
relates to MDEV-27095 Simplify spider init queries Closed

 Description   

MariaDB Debian startup scripts, among other things, after server startup invoke debian-start, which starts a block of functions in background and exits. The block of functions includes mysql_upgrade script (from within upgrade_system_tables_if_necessary).

trap "" SIGHUP
(
  upgrade_system_tables_if_necessary;
  check_root_accounts;
  check_for_crashed_tables;
) >&2 &

On non-systemd installations, this block proceeds after the main script ends, and eventually performs the necessary checks. It's a questionable decision, because the server might start being used before mysql_upgrade has been run. It would be more reasonable either wait till it finishes, if mysql_upgrade is really necessary, or not run it at all, if it's not. However, it's a minor problem.

On systemd installations, when the main script finishes, the background block aborts too. So, mysql_upgrade has almost no chance to be executed. It remains unnoticed, the server reports that debian-start was executed successfully:

$ sudo service mariadb status
● mariadb.service - MariaDB database server
   Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─migrated-from-my.cnf-settings.conf
   Active: active (running) since Mon 2017-12-11 11:21:33 EST; 4s ago
  Process: 13680 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 13678 ExecStartPost=/etc/mysql/debian-start (code=exited, status=0/SUCCESS)
  Process: 13622 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ]   && systemctl set-environment _WSR
  Process: 13618 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 13615 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
 Main PID: 13647 (mysqld)
   Status: "Taking your SQL requests now..."
    Tasks: 30 (limit: 4915)
   CGroup: /system.slice/mariadb.service
           └─13647 /usr/sbin/mysqld
 
Dec 11 11:21:32 debian-9-stretch-amd64 mysqld[13647]: 2017-12-11 11:21:32 0 [Note] InnoDB: 5.7.19 started; log sequence number 1630818
Dec 11 11:21:32 debian-9-stretch-amd64 mysqld[13647]: 2017-12-11 11:21:32 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
Dec 11 11:21:32 debian-9-stretch-amd64 mysqld[13647]: 2017-12-11 11:21:32 0 [Note] Plugin 'FEEDBACK' is disabled.
Dec 11 11:21:32 debian-9-stretch-amd64 mysqld[13647]: 2017-12-11 11:21:32 0 [Note] InnoDB: Buffer pool(s) load completed at 171211 11:21:32
Dec 11 11:21:32 debian-9-stretch-amd64 mysqld[13647]: 2017-12-11 11:21:32 0 [Note] Server socket created on IP: '::'.
Dec 11 11:21:32 debian-9-stretch-amd64 mysqld[13647]: 2017-12-11 11:21:32 0 [Note] Reading of all Master_info entries succeded
Dec 11 11:21:32 debian-9-stretch-amd64 mysqld[13647]: 2017-12-11 11:21:32 0 [Note] Added new Master_info '' to hash table
Dec 11 11:21:32 debian-9-stretch-amd64 mysqld[13647]: 2017-12-11 11:21:32 0 [Note] /usr/sbin/mysqld: ready for connections.
Dec 11 11:21:32 debian-9-stretch-amd64 mysqld[13647]: Version: '10.3.2-MariaDB-10.3.2+maria~stretch-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  mariadb.org
Dec 11 11:21:33 debian-9-stretch-amd64 systemd[1]: Started MariaDB database server.
Warning: mariadb.service changed on disk. Run 'systemctl daemon-reload' to reload units.

but both syslog and general log, if enabled, confirms that the background jobs have never been finished.

Also, result of these jobs is ignored, even if mysql_upgrade fails, it is not reflected in the service status anyhow.



 Comments   
Comment by Marc Olzheim [ 2018-01-02 ]

Indeed, I discovered the same problem, while migrating from Ubuntu 14.04 to 16.04, so actually to systemd.

The systemd ExecPostStart subprocesses seem to be cleaned up after exit.
You can easily verify this by adding a "sleep 1000" in the subshell block after the sighup and notice that it is not running after it has started.

This patch does the trick for me:

--- /etc/mysql/debian-start        2016-09-05 15:29:12.576049459 +0200
+++ /etc/mysql/debian-start.systemd        2018-01-02 17:18:03.831166836 +0100
@@ -29,13 +29,10 @@
 # (There may be no output to stdout inside the background process!)
 echo "Checking for corrupt, not cleanly closed and upgrade needing tables."
 
-# Need to ignore SIGHUP, as otherwise a SIGHUP can sometimes abort the upgrade
-# process in the middle.
-trap "" SIGHUP
 (
   upgrade_system_tables_if_necessary;
   check_root_accounts;
   check_for_crashed_tables;
-) >&2 &
+) >&2
 
 exit 0

Comment by Jean Weisbuch [ 2018-01-26 ]

The execution of debian-start can be very long (several minutes) on servers having thousands of tables.

For example, on a server with about 120000 tables, it takes more than 6 minutes to do the mysql_upgrade itself, the check_for_crashed_tables takes almost 5 minutes.


To speed up the upgrade_system_tables_if_necessary, it could only check/upgrade system tables (as its name states it should be) by using the --upgrade-system-tables option of mysql_upgrade (takes less than 5 seconds on my test server instead of 6 minutes) and checking the non-system tables could be executed on background.


check_for_crashed_tables should be executed on the background as crashed tables are not blocking the server execution and this check has always been done on the background.

It might also be disabled as the engine itself will check the table at first openening of the file and the behavior could be customized using the myisam_recover_options and aria_recover_options.

Comment by Daniel Black [ 2018-04-01 ]

https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecStartPre= says this should be used for long running processes and also describes the kill off of process.

I suspect that creating 1 (or more individual) services that "After=mariadb.service/BindsTo=mariadb.service" that way they length of their execution and status is independent of the mariadb.service. As independent services other services could depend the completion of the table check for instance.

Options to increase the parallelism of these operations could also be considered.

Comment by Otto Kekäläinen [ 2018-04-09 ]

Oracle MySQL seems to have removed this script completely: https://salsa.debian.org/mariadb-team/mysql/commit/f12dd3fb5387113585a981e2b8d234e81c6a630d

It is legacy anyway, and I don't think anybody has that much deeply reviewed it since systemd was introduced and/or pre/postinstal scripts updated to fix other bugs.

Comment by Jean Weisbuch [ 2018-04-17 ]

check_for_crashed_tables could probably be removed altogether as automatic detection and recovery of crashed tables at opening has improved.

Keeping the automatic mysql_upgrade execution shouldnt be useful if we are sure that the post-install does work well (which doesnt work well in my experience in MySQL 5.7 official packages at least on systems not using Systemd by not running mysql_upgrade at package upgrades) and the check_root_accounts is nice to have but its removal wont break anything.

If you are sure that mysql_upgrade is correctly executed by the post-install of the package upgrade i think it could be removed, meaning that the MYCHECK_RCPT variable on mariadb-server-10.*.mysql.default could also be removed.

Comment by Elena Stepanova [ 2020-07-05 ]

The problems from running mysql_upgrade in the background still exist, moreover they seem to worsen in higher versions, as both mysql_upgrade and post-install scripts of engine-specific packages get more SQL logic and thus the probability of conflicts grows.

The obvious problem is when the server installation ends and the client may start running something while mysql_upgrade is happening in the background – and since it deals with system tables, various strange effects may occur.

The less obvious but even bigger problem is that when the server is installed together with other packages, mysql_upgrade starts as soon as server installation is finished, but other packages are still being installed, and it causes race conditions. Specifically, the frequent outcome in buildbot is a hang upon spider installation. It happens due to a deadlock between spider creating its own objects and mysql_upgrade modifying system tables. One I managed to catch live is below, there may be other ones. Technically the deadlock isn't permanent, it would probably resolve itself after lock_wait_timeout is exceeded, but since it's 1 day by default, for all practical purposes it makes no difference. Besides, even if it was set to a lower level, nothing good would come out of either mysql_upgrade or Spider logic failing with a lock wait timeout.

10.3.23

+----+------------------+-----------+-------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| Id | User             | Host      | db    | Command | Time | State                           | Info                                                                                                                                                                                                 | Progress |
+----+------------------+-----------+-------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|  1 | system user      |           | NULL  | Daemon  | NULL | InnoDB purge coordinator        | NULL                                                                                                                                                                                                 |    0.000 |
|  2 | system user      |           | NULL  | Daemon  | NULL | InnoDB purge worker             | NULL                                                                                                                                                                                                 |    0.000 |
|  3 | system user      |           | NULL  | Daemon  | NULL | InnoDB shutdown handler         | NULL                                                                                                                                                                                                 |    0.000 |
|  5 | system user      |           | NULL  | Daemon  | NULL | InnoDB purge worker             | NULL                                                                                                                                                                                                 |    0.000 |
|  4 | system user      |           | NULL  | Daemon  | NULL | InnoDB purge worker             | NULL                                                                                                                                                                                                 |    0.000 |
| 14 | debian-sys-maint | localhost | mysql | Query   |  221 | Waiting for table metadata lock | ALTER TABLE proc MODIFY type enum('FUNCTION',
                                  'PROCEDURE',
                                  'PACKAGE',
                                  'PACKAGE BODY') NOT NULL |    0.000 |
| 15 | debian-sys-maint | localhost | NULL  | Query   |  221 | Waiting for table level lock    | drop procedure mysql.spider_fix_one_table                                                                                                                                                            |    0.000 |
| 17 | root             | localhost | NULL  | Query   |    0 | Init                            | show full processlist                                                                                                                                                                                |    0.000 |
+----+------------------+-----------+-------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+

$ ps -ef | grep mysql
root      1060   780  0 12:57 pts/0    00:00:00 sudo sh -c DEBIAN_FRONTEND=noninteractive MYSQLD_STARTUP_TIMEOUT=180 apt-get -o Dpkg::Options::=--force-confnew install --allow-unauthenticated -y libmariadb3 libmariadb3-compat libmariadb3-dbgsym libmariadbclient18 libmariadbd19 libmariadbd19-dbgsym libmariadbd-dev libmariadb-dev libmariadb-dev-compat libmariadb-dev-dbgsym libmysqlclient18 mariadb-backup mariadb-backup-dbgsym mariadb-client mariadb-client-10.3 mariadb-client-10.3-dbgsym mariadb-client-core-10.3 mariadb-client-core-10.3-dbgsym mariadb-common mariadb-plugin-connect mariadb-plugin-connect-dbgsym mariadb-plugin-cracklib-password-check mariadb-plugin-cracklib-password-check-dbgsym mariadb-plugin-gssapi-client mariadb-plugin-gssapi-client-dbgsym mariadb-plugin-gssapi-server mariadb-plugin-gssapi-server-dbgsym mariadb-plugin-mroonga mariadb-plugin-mroonga-dbgsym mariadb-plugin-oqgraph mariadb-plugin-oqgraph-dbgsym mariadb-plugin-rocksdb mariadb-plugin-rocksdb-dbgsym mariadb-plugin-spider mariadb-plugin-spider-dbgsym mariadb-plugin-tokudb mariadb-plugin-tokudb-dbgsym mariadb-server mariadb-server-10.3 mariadb-server-10.3-dbgsym mariadb-server-core-10.3 mariadb-server-core-10.3-dbgsym mariadb-test mariadb-test-data mariadb-test-dbgsym mysql-common
root      1061  1060  0 12:57 pts/0    00:00:00 sh -c DEBIAN_FRONTEND=noninteractive MYSQLD_STARTUP_TIMEOUT=180 apt-get -o Dpkg::Options::=--force-confnew install --allow-unauthenticated -y libmariadb3 libmariadb3-compat libmariadb3-dbgsym libmariadbclient18 libmariadbd19 libmariadbd19-dbgsym libmariadbd-dev libmariadb-dev libmariadb-dev-compat libmariadb-dev-dbgsym libmysqlclient18 mariadb-backup mariadb-backup-dbgsym mariadb-client mariadb-client-10.3 mariadb-client-10.3-dbgsym mariadb-client-core-10.3 mariadb-client-core-10.3-dbgsym mariadb-common mariadb-plugin-connect mariadb-plugin-connect-dbgsym mariadb-plugin-cracklib-password-check mariadb-plugin-cracklib-password-check-dbgsym mariadb-plugin-gssapi-client mariadb-plugin-gssapi-client-dbgsym mariadb-plugin-gssapi-server mariadb-plugin-gssapi-server-dbgsym mariadb-plugin-mroonga mariadb-plugin-mroonga-dbgsym mariadb-plugin-oqgraph mariadb-plugin-oqgraph-dbgsym mariadb-plugin-rocksdb mariadb-plugin-rocksdb-dbgsym mariadb-plugin-spider mariadb-plugin-spider-dbgsym mariadb-plugin-tokudb mariadb-plugin-tokudb-dbgsym mariadb-server mariadb-server-10.3 mariadb-server-10.3-dbgsym mariadb-server-core-10.3 mariadb-server-core-10.3-dbgsym mariadb-test mariadb-test-data mariadb-test-dbgsym mysql-common
root      1062  1061  0 12:57 pts/0    00:00:01 apt-get -o Dpkg::Options::=--force-confnew install --allow-unauthenticated -y libmariadb3 libmariadb3-compat libmariadb3-dbgsym libmariadbclient18 libmariadbd19 libmariadbd19-dbgsym libmariadbd-dev libmariadb-dev libmariadb-dev-compat libmariadb-dev-dbgsym libmysqlclient18 mariadb-backup mariadb-backup-dbgsym mariadb-client mariadb-client-10.3 mariadb-client-10.3-dbgsym mariadb-client-core-10.3 mariadb-client-core-10.3-dbgsym mariadb-common mariadb-plugin-connect mariadb-plugin-connect-dbgsym mariadb-plugin-cracklib-password-check mariadb-plugin-cracklib-password-check-dbgsym mariadb-plugin-gssapi-client mariadb-plugin-gssapi-client-dbgsym mariadb-plugin-gssapi-server mariadb-plugin-gssapi-server-dbgsym mariadb-plugin-mroonga mariadb-plugin-mroonga-dbgsym mariadb-plugin-oqgraph mariadb-plugin-oqgraph-dbgsym mariadb-plugin-rocksdb mariadb-plugin-rocksdb-dbgsym mariadb-plugin-spider mariadb-plugin-spider-dbgsym mariadb-plugin-tokudb mariadb-plugin-tokudb-dbgsym mariadb-server mariadb-server-10.3 mariadb-server-10.3-dbgsym mariadb-server-core-10.3 mariadb-server-core-10.3-dbgsym mariadb-test mariadb-test-data mariadb-test-dbgsym mysql-common
mysql     2755     1  0 13:03 ?        00:00:00 /usr/sbin/mysqld
root      2789     1  0 13:03 ?        00:00:00 /bin/bash /etc/mysql/debian-start
root      2791  2789  0 13:03 ?        00:00:00 /usr/bin/mysql_upgrade --defaults-extra-file=/etc/mysql/debian.cnf --version-check
root      2793  2789  0 13:03 ?        00:00:00 logger -p daemon warn -i -t/etc/mysql/debian-start
root      2811  2791  0 13:03 ?        00:00:00 sh -c '/usr/bin/mysql' --defaults-file=/tmp/mysql_upgrade-W1q3os --database=mysql --batch --force --silent < /tmp/sqlvLM0bR 2>&1 
root      2812  2811  0 13:03 ?        00:00:00 /usr/bin/mysql --defaults-file=/tmp/mysql_upgrade-W1q3os --database=mysql --batch --force --silent
root      2814  2813  0 13:03 pts/1    00:00:00 mysql --defaults-file=/etc/mysql/debian.cnf

Thread 32 (Thread 0x7f3d40067700 (LWP 2815)):
#0  futex_abstimed_wait_cancelable (private=0, abstime=0x7f3d40064400, expected=0, futex_word=0x7f3cd4000b50) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  __pthread_cond_wait_common (abstime=0x7f3d40064400, mutex=0x7f3ce005e1e8, cond=0x7f3cd4000b28) at pthread_cond_wait.c:539
#2  __pthread_cond_timedwait (cond=cond@entry=0x7f3cd4000b28, mutex=0x7f3ce005e1e8, abstime=abstime@entry=0x7f3d40064400) at pthread_cond_wait.c:667
#3  0x00005582110094ce in inline_mysql_cond_timedwait (src_file=0x5582111f7800 "/home/buildbot/buildbot/build/mariadb-10.3.23/mysys/thr_lock.c", src_line=576, abstime=0x7f3d40064400, mutex=<optimized out>, that=0x7f3cd4000b28) at ./include/mysql/psi/mysql_thread.h:1222
#4  wait_for_lock (wait=wait@entry=0x7f3ce005e238, data=data@entry=0x7f3cd40fd118, in_wait_list=in_wait_list@entry=0 '\000', lock_wait_timeout=lock_wait_timeout@entry=31536000) at ./mysys/thr_lock.c:576
#5  0x000055821100a079 in thr_lock (lock_wait_timeout=31536000, owner=0x7f3cd4002670, data=0x7f3cd40fd118) at ./mysys/thr_lock.c:1014
#6  thr_multi_lock (data=<optimized out>, count=<optimized out>, owner=owner@entry=0x7f3cd4002670, lock_wait_timeout=lock_wait_timeout@entry=31536000) at ./mysys/thr_lock.c:1295
#7  0x0000558210bfbadb in mysql_lock_tables (thd=thd@entry=0x7f3cd4000c08, sql_lock=sql_lock@entry=0x7f3cd400f718, flags=<optimized out>, flags@entry=2048) at ./sql/lock.cc:347
#8  0x0000558210bfc65f in mysql_lock_tables (thd=thd@entry=0x7f3cd4000c08, tables=tables@entry=0x7f3d40064838, count=count@entry=1, flags=flags@entry=2048) at ./sql/lock.cc:299
#9  0x00005582108ee028 in open_ltable (thd=thd@entry=0x7f3cd4000c08, table_list=table_list@entry=0x7f3d40064740, lock_type=TL_WRITE, lock_flags=lock_flags@entry=2048) at ./sql/sql_base.cc:4986
#10 0x00005582108f4438 in open_system_table_for_update (thd=thd@entry=0x7f3cd4000c08, one_table=one_table@entry=0x7f3d40064740) at ./sql/sql_base.cc:8948
#11 0x0000558210c35f91 in open_proc_table_for_update (thd=thd@entry=0x7f3cd4000c08) at ./sql/sp.cc:521
#12 0x0000558210c38a7a in Sp_handler::sp_drop_routine (this=this@entry=0x5582116c8330 <sp_handler_procedure>, thd=thd@entry=0x7f3cd4000c08, name=0x7f3cd400f6f0) at ./sql/sp.cc:1579
#13 0x00005582109473b1 in mysql_execute_command (thd=0x7f3cd4000c08) at ./sql/sql_parse.cc:5734
#14 0x000055821094bc29 in mysql_parse (thd=0x7f3cd4000c08, rawbuf=<optimized out>, length=41, parser_state=0x7f3d40066570, is_com_multi=<optimized out>, is_next_command=<optimized out>) at ./sql/sql_parse.cc:7817
#15 0x000055821094dac5 in dispatch_command (command=COM_QUERY, thd=0x7f3cd4000c08, packet=<optimized out>, packet_length=<optimized out>, is_com_multi=<optimized out>, is_next_command=<optimized out>) at ./sql/sql_class.h:1137
#16 0x000055821094f2c2 in do_command (thd=0x7f3cd4000c08) at ./sql/sql_parse.cc:1401
#17 0x0000558210a26a56 in do_handle_one_connection (connect=connect@entry=0x558213fa5e68) at ./sql/sql_connect.cc:1403
#18 0x0000558210a26bcd in handle_one_connection (arg=0x558213fa5e68) at ./sql/sql_connect.cc:1308
#19 0x00007f3d566bbfa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#20 0x00007f3d55bd04cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
 
 
Thread 31 (Thread 0x7f3d400b2700 (LWP 2800)):
#0  futex_abstimed_wait_cancelable (private=0, abstime=0x7f3d400ad630, expected=0, futex_word=0x7f3ce0000d80) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  __pthread_cond_wait_common (abstime=0x7f3d400ad630, mutex=0x7f3ce0000d28, cond=0x7f3ce0000d58) at pthread_cond_wait.c:539
#2  __pthread_cond_timedwait (cond=cond@entry=0x7f3ce0000d58, mutex=mutex@entry=0x7f3ce0000d28, abstime=abstime@entry=0x7f3d400ad630) at pthread_cond_wait.c:667
#3  0x0000558210a2cf46 in inline_mysql_cond_timedwait (src_file=0x55821107f880 "/home/buildbot/buildbot/build/mariadb-10.3.23/sql/mdl.cc", src_line=1094, abstime=0x7f3d400ad630, mutex=0x7f3ce0000d28, that=0x7f3ce0000d58) at ./include/mysql/psi/mysql_thread.h:1222
#4  MDL_wait::timed_wait (this=this@entry=0x7f3ce0000d28, owner=0x7f3ce0000cd8, abs_timeout=abs_timeout@entry=0x7f3d400ad630, set_status_on_timeout=set_status_on_timeout@entry=false, wait_state_name=<optimized out>) at ./sql/mdl.cc:1093
#5  0x0000558210a2e7a7 in MDL_context::acquire_lock (this=this@entry=0x7f3ce0000d28, mdl_request=mdl_request@entry=0x7f3d400ad700, lock_wait_timeout=lock_wait_timeout@entry=86400) at ./sql/mdl.h:395
#6  0x0000558210a2f33e in MDL_context::upgrade_shared_lock (this=this@entry=0x7f3ce0000d28, mdl_ticket=0x7f3ce00870b0, new_type=new_type@entry=MDL_EXCLUSIVE, lock_wait_timeout=86400) at ./sql/mdl.cc:2336
#7  0x00005582108ea875 in wait_while_table_is_used (thd=thd@entry=0x7f3ce0000c08, table=table@entry=0x7f3ce0057238, function=function@entry=HA_EXTRA_PREPARE_FOR_RENAME) at ./sql/sql_base.cc:1254
#8  0x00005582109cf5da in mysql_inplace_alter_table (thd=0x7f3ce0000c08, table_list=0x7f3ce0011918, table=0x7f3ce0057238, altered_table=0x7f3ce00df858, ha_alter_info=0x7f3d400ae190, inplace_supported=<optimized out>, alter_ctx=0x7f3d400aea00, target_mdl_request=<optimized out>) at ./sql/sql_table.cc:7641
#9  0x00005582109da2f1 in mysql_alter_table (thd=thd@entry=0x7f3ce0000c08, new_db=new_db@entry=0x7f3ce0005130, new_name=new_name@entry=0x7f3ce00054f0, create_info=create_info@entry=0x7f3d400af5c0, table_list=<optimized out>, table_list@entry=0x7f3ce0011918, alter_info=alter_info@entry=0x7f3d400af500, order_num=0, order=0x0, ignore=false) at ./sql/sql_table.cc:9922
#10 0x0000558210a29567 in Sql_cmd_alter_table::execute (this=<optimized out>, thd=0x7f3ce0000c08) at ./sql/sql_alter.cc:512
#11 0x00005582109451f1 in mysql_execute_command (thd=0x7f3ce0000c08) at ./sql/sql_parse.cc:6030
#12 0x000055821094bc29 in mysql_parse (thd=0x7f3ce0000c08, rawbuf=<optimized out>, length=196, parser_state=0x7f3d400b1570, is_com_multi=<optimized out>, is_next_command=<optimized out>) at ./sql/sql_parse.cc:7817
#13 0x000055821094dac5 in dispatch_command (command=COM_QUERY, thd=0x7f3ce0000c08, packet=<optimized out>, packet_length=<optimized out>, is_com_multi=<optimized out>, is_next_command=<optimized out>) at ./sql/sql_class.h:1137
#14 0x000055821094f2c2 in do_command (thd=0x7f3ce0000c08) at ./sql/sql_parse.cc:1401
#15 0x0000558210a26a56 in do_handle_one_connection (connect=connect@entry=0x558213f60778) at ./sql/sql_connect.cc:1403
#16 0x0000558210a26bcd in handle_one_connection (arg=0x558213f60778) at ./sql/sql_connect.cc:1308
#17 0x00007f3d566bbfa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#18 0x00007f3d55bd04cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment by Otto Kekäläinen [ 2020-07-06 ]

Good debugging. I am sure the mysql_upgrade is not run optimally now, nor is the server restarted after installs/upgrades optimally.

Something to mitigate the parallel runs can maybe be engineered in the Debian maintainer scripts, but as suggested in https://jira.mariadb.org/browse/MDEV-14622?focusedCommentId=109469&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-109469 we could probably remove this debian-start script completely? If the RPM installs can do without it, why not the Debian as well?

Somebody with deeper understanding of the philosophy behind the mysql_upgrade script and when and how it should run could comment here. And why do we even need a mysql_upgrade script, can't the mariadbd daemon itself just auto-upgrade it's tables if they seem outdated?

Comment by Otto Kekäläinen [ 2020-07-07 ]

Buildbot seems to have tests for this now, making all builds fail:

+ for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
+ ps -ef
+ grep -v grep
+ grep -iE 'mysql_upgrade|mysqlcheck|mysqlrepair|mysqlanalyze|mysqloptimize|mariadb-upgrade|mariadb-check'
root      3566  3564  0 21:39 ?        00:00:00 /usr/bin/mysql_upgrade --defaults-extra-file=/etc/mysql/debian.cnf --version-check
root      3587  3566  0 21:39 ?        00:00:00 sh -c '/usr/bin/mysql' --defaults-file=/tmp/mysql_upgrade-kjjeOp --database=mysql --batch --force --silent < /tmp/sqlI4MSPB 2>&1 
root      3588  3587  0 21:39 ?        00:00:00 /usr/bin/mysql --defaults-file=/tmp/mysql_upgrade-kjjeOp --database=mysql --batch --force --silent
+ sleep 2

Comment by Elena Stepanova [ 2020-07-07 ]

No, that's not that. The logic is contrary, buildbot has to wait for mysql_upgrade to finish before it continues the test, in order to reduce the probability of the failures caused by it running in the background. It doesn't eliminate the problem completely, as the main issue is the conflict between installation scripts themselves, when we install packages in a bulk and have no control over it (e.g. a race between mysql_upgrade after server installation and engine packages still being installed at the same time).

The waiting logic has been there in buildbot for a very long time.

The cause of recent "all builds fail" was a typo in the buildbot config file, it affected debian installation tests yesterday evening and upgrade tests today for half a day, should already be fixed by now.

Comment by Jean Weisbuch [ 2020-07-07 ]

I agree that it should be the server itself that checks at startup that its internal data structure (the "mysql" database) is up-to-date then corrects/update it before accepting connections so you don't risk having a half-working server online with some of its functions that are disabled or limited as long as the structure is not right.

Checking the non-internal databases for UPGRADE during startup would be problematic on servers with a great number of databases/tables and would require a different logic.

Comment by Elena Stepanova [ 2020-07-08 ]

> it should be the server itself that checks at startup that its internal data structure (the "mysql" database) is up-to-date then corrects/update

That's all good as an idea for remote future, but we have several GA versions which will still be supported for years. They obviously can't take that kind of a change, but are still suffering from this problem.
Having mysql_upgrade abort in the middle of execution, if it still happens (I think it does, although I didn't re-check, but it looks so from some logs and side-effects) can have all sorts of consequences.
Race conditions with other packages become more severe too, the more engines/packages with non-trivial installation logic we get. While a race with Spider causes a hang upon installation, as shown above, a race with Columnstore, for example, can leave the server in such a state that it keeps crashing/restarting constantly until systemd gives up trying.

Something has to be done about all of it much earlier than a radical change making the server itself take care of it is implemented, if ever.

Comment by Jean Weisbuch [ 2020-07-08 ]

In this case i believe that executing "mysql_upgrade --upgrade-system-tables" (if it's enough to fix the issues you are talking about) at the end of "mariadb-server-$VERSION.postinst", just after it started the server could be a simple solution.
Or maybe executing this directly on the systemd unit on the start action, as the upgrade operation on system tables should be pretty quick (~1 to 2 seconds when upgrading or almost instant if the server has already been upgraded).


These solutions have a drawback : The mysql_upgrade_info will have been upgraded and the subsequent mysql_upgrade that would be executed by debian-start won't run (except if --force is used) so non-system tables wont have a CHECK $table FOR UPGRADE; executed on them.

A dirty but simple fix could be to put a "flag" file on the datadir after executing a "mysql_upgrade --upgrade-system-tables" to inform that the "mysql_upgrade --force" must be executed in the background after the server is started, it could be executed by debian-start in nohup or a similar non blocking fashion and the flag would be removed once the mysql_upgrade execution has finished.


check_for_crashed_tables on debian-start should be commented as its extremely long and resource consuming on servers with a great number of tables and because it's simply useless because the default behavior has been for a very long time for Aria and MyISAM to repair automatically crashed tables when first opening/using them.

Comment by Otto Kekäläinen [ 2020-08-03 ]

I've added this to https://jira.mariadb.org/browse/MCOL-4057 to track it for ColumnStore packaging. Currently the way ColumnStore works requires it to restart the whole server, just adding the plugin does not activate (unlike it does for other plugins).

grep restart *inst *rm
mariadb-server-10.5.postinst:      invoke-rc.d mariadb restart
mariadb-plugin-columnstore.postrm:# Automatically restart MariaDB after ColumnStore plugin has been removed
mariadb-plugin-columnstore.postrm:      deb-systemd-invoke restart mariadb.service >/dev/null
mariadb-plugin-columnstore.postrm:      invoke-rc.d mariadb restart || exit $?

Triggers are also used during installation but for ColumnStore it is just about ldconfig, not the service.

± cat mariadb-server-10.5.triggers
interest-noawait /etc/mysql
interest-noawait /etc/systemd/system/mariadb.service.d
 
± cat mariadb-plugin-columnstore.triggers
activate-noawait ldconfig

Comment by Otto Kekäläinen [ 2020-08-22 ]

I maybe managed to reproduce this on Salsa-CI as well, there the Spider installation step gets stuck and does not progress:

https://salsa.debian.org/mariadb-team/mariadb-10.3/-/jobs/949386

Comment by Otto Kekäläinen [ 2021-02-06 ]

I believe this was fixed in MDEV-19917 via ba4148698fa648aaf15aa3daea857edcdd0d52c8

Feel free to re-open this elenst if you come across the same failure again.

Comment by Daniel Black [ 2021-11-19 ]

Edit: spider moved to MDEV-27095

Comment by Daniel Black [ 2021-11-19 ]

closing again - MDEV-27068 covers the modern more general version.

Comment by Michael Widenius [ 2022-01-18 ]

Just a note: The parallel run problem is fixed in Daniel's and mine changes to mysql_upgrade. Will soon to be pushed in 10.2 (final testing is going on)

Generated at Thu Feb 08 08:15:00 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.