[MDEV-23755] safety for upgrades (innodb) and packaging - Jira

Marko Mäkelä added a comment - 2021-01-13 07:01

I think that ~~MDEV-11634~~ must retain the code to handle old change buffer formats, to make upgrade easier. The undo log format change in ~~MDEV-12288~~ was an exception, and such format breakage should generally be avoided. After ~~MDEV-11634~~ has been implemented, we can retain the support for the old formats for a few major versions, to allow a smooth upgrade process. This approach was not feasible with regard to ~~MDEV-12288~~, because we moved from a partitioned undo log to a single log. And despite a sincere attempt, I did not get the upgrade logic working, as witnessed in ~~MDEV-15912~~. (But to be fair, the old format suffers from MySQL Bug #55283 assert trx0roll.c undo_no + 1 == trx->undo_no in trx_rollback_or_clean_all_recovered, which was never fixed.)

I do not think that it is reasonable to force a change buffer merge (which can take hours) by setting innodb_fast_shutdown=0 before the shutdown. Besides, the change buffer may be corrupted (for example due to ~~MDEV-24449~~), and the operation could hang (~~MDEV-20934~~) or the server could crash due to the corruption.

Marko Mäkelä added a comment - 2021-01-13 07:01 I think that MDEV-11634 must retain the code to handle old change buffer formats, to make upgrade easier. The undo log format change in MDEV-12288 was an exception, and such format breakage should generally be avoided. After MDEV-11634 has been implemented, we can retain the support for the old formats for a few major versions, to allow a smooth upgrade process. This approach was not feasible with regard to MDEV-12288 , because we moved from a partitioned undo log to a single log. And despite a sincere attempt, I did not get the upgrade logic working, as witnessed in MDEV-15912 . (But to be fair, the old format suffers from MySQL Bug #55283 assert trx0roll.c undo_no + 1 == trx->undo_no in trx_rollback_or_clean_all_recovered , which was never fixed.) I do not think that it is reasonable to force a change buffer merge (which can take hours) by setting innodb_fast_shutdown=0 before the shutdown. Besides, the change buffer may be corrupted (for example due to MDEV-24449 ), and the operation could hang ( MDEV-20934 ) or the server could crash due to the corruption.

Otto Kekäläinen added a comment - 2021-01-13 08:15 - edited

I've just read https://mariadb.com/kb/en/innodb-system-variables/#innodb_fast_shutdown

So we do not want to run `innodb_fast_shutdown=0` on every stop/restart, so we can't inject it permanently into the systemd service (or sysv init) file.

For Debian, we could modify the current `invoke-rc.d mariadb stop` in the mariadb-server.preinst/postinst/prerm/postrm scripts. What shall one run on the command line to tell a running mariadbd to stop with `innodb_fast_shutdown=0`? I am not aware of any method.

Or do we need to enter the MariaDB console and run some `SET GLOBAL innodb_fast_shutdown=0;` this setting just before we stop the service? This should also work, most of the time the mariadb-server.* scripts have console access.

Otto Kekäläinen added a comment - 2021-01-13 08:15 - edited I've just read https://mariadb.com/kb/en/innodb-system-variables/#innodb_fast_shutdown So we do not want to run `innodb_fast_shutdown=0` on every stop/restart, so we can't inject it permanently into the systemd service (or sysv init) file. For Debian, we could modify the current `invoke-rc.d mariadb stop` in the mariadb-server.preinst/postinst/prerm/postrm scripts. What shall one run on the command line to tell a running mariadbd to stop with `innodb_fast_shutdown=0`? I am not aware of any method. Or do we need to enter the MariaDB console and run some `SET GLOBAL innodb_fast_shutdown=0;` this setting just before we stop the service? This should also work, most of the time the mariadb-server.* scripts have console access.

Daniel Black added a comment - 2021-01-13 23:45

For clarity (as a question) so this is largely logic to avoid ~~MDEV-15912~~?

Its no necessary to change systemd/sysv script.

Based on the updated description innodb_max_purge_lag_wait is preferred. So the following as part of the scripts (preinst upgrade?) in the new package operating on the previously still running mariadb version.

mysql -e 'SET GLOBAL innodb_max_purge_lag_wait=0'

Sometime before invoke-rc.d mariadb stop.

We need to ignore failures of this command as older versions don't have this variable.

marko are you suggesting never do innodb_fast_shutdown=0, not even as a fallback to innodb_max_purge_lag_wait not existing? Won't this leave pre-10.3 versions without an upgrade path due to undo log format changes?

Is there an alternate to triggering the purge? Is `set global read_only=1`and waiting for transactions to finish enough to ensure there is no undo log entries that could trigger ~~MDEV-15912~~?

Daniel Black added a comment - 2021-01-13 23:45 For clarity (as a question) so this is largely logic to avoid MDEV-15912 ? Its no necessary to change systemd/sysv script. Based on the updated description innodb_max_purge_lag_wait is preferred. So the following as part of the scripts (preinst upgrade?) in the new package operating on the previously still running mariadb version. mysql -e 'SET GLOBAL innodb_max_purge_lag_wait=0' Sometime before invoke-rc.d mariadb stop. We need to ignore failures of this command as older versions don't have this variable. marko are you suggesting never do innodb_fast_shutdown=0, not even as a fallback to innodb_max_purge_lag_wait not existing? Won't this leave pre-10.3 versions without an upgrade path due to undo log format changes? Is there an alternate to triggering the purge? Is `set global read_only=1`and waiting for transactions to finish enough to ensure there is no undo log entries that could trigger MDEV-15912 ?

Marko Mäkelä added a comment - 2021-01-14 06:38

danblack, if innodb_max_purge_lax_wait does not exist, we surely can fall back to slow shutdown.

On a second thought, for avoiding ~~MDEV-15912~~, it should suffice to ensure that no uncommitted transactions will exist, that is, no insert_undo log records will exist. In ~~MDEV-12288~~, this separate undo log stream was basically merged with the one that was originally called update_undo. Purge will process that stream. The problematic insert_undo stream would be discarded on transaction COMMIT or ROLLBACK. Thus, the problem with ~~MDEV-15912~~ should only be the existence of uncommitted transactions. Waiting for a complete purge of history should be unnecessary.

Now, what kind of uncommitted transactions can exist in the system?

Transactions in XA PREPARE state: Unlikely, because XA transactions were of very limited use before ~~MDEV-742~~ was fixed in 10.5.
Any active transactions: Normally, with the default value of innodb_fast_shutdown=1 (as well as with the value 0) we should wait for these to exit.
Any transactions that were recovered and have not been rolled back by the background thread yet.

Let us double-check logs_empty_and_mark_files_at_shutdown() in 10.1 and 10.2. In 10.2, we have the following code there:

	if (ulint total_trx = srv_was_started && !srv_read_only_mode

	    && srv_force_recovery < SRV_FORCE_NO_TRX_UNDO

	    ? trx_sys_any_active_transactions() : 0) {

		if (srv_print_verbose_log && count > COUNT_INTERVAL) {

			service_manager_extend_timeout(

				COUNT_INTERVAL * CHECK_INTERVAL/1000000 * 2,

				"Waiting for %lu active transactions to finish",

				(ulong) total_trx);

			ib::info() << "Waiting for " << total_trx << " active"

				<< " transactions to finish";

			count = 0;

		goto loop;

In 10.1, the condition is similar. According to this, on the pre-upgrade shutdown, we will also need innodb_read_only_mode=OFF and innodb_force_recovery<3 (ideally it should be 0).

But, this is not the whole truth. For recovered transactions that might still be waiting for a rollback in a dedicated thread, we also have the check trx_roll_must_shutdown(), which includes the following:

	if (trx_get_dict_operation(trx) == TRX_DICT_OP_NONE

	    && srv_shutdown_state != SRV_SHUTDOWN_NONE

	    && !srv_undo_sources && srv_fast_shutdown) {

		return true;

The rollback of a recovered DDL transaction (there can be at most one) should be finished before server starts up. The final condition is that if innodb_fast_shutdown≠0, we will abort the rollback of recovered normal transactions. In 10.3 (which does not even matter for this upgrade problem), this check is in the function row_undo_step() and it should apply to innodb_fast_shutdown=3 (In ~~MDEV-15832~~) only.

It looks like the upgrade scripts would have to implement a polling loop that the number of recovered transactions is zero. It would also be good to check that no transactions exist in XA PREPARE state, so that an useless infinite wait can be avoided. Then, a shutdown with the following settings should be fine:

innodb_fast_shutdown=1 (default); also 0 is acceptable
innodb_read_only_mode=OFF (default, read-only)
innodb_force_recovery=0 (default, read-only); also 1 or 2 are acceptable

Marko Mäkelä added a comment - 2021-01-14 06:38 danblack , if innodb_max_purge_lax_wait does not exist, we surely can fall back to slow shutdown. On a second thought, for avoiding MDEV-15912 , it should suffice to ensure that no uncommitted transactions will exist, that is, no insert_undo log records will exist. In MDEV-12288 , this separate undo log stream was basically merged with the one that was originally called update_undo . Purge will process that stream. The problematic insert_undo stream would be discarded on transaction COMMIT or ROLLBACK . Thus, the problem with MDEV-15912 should only be the existence of uncommitted transactions. Waiting for a complete purge of history should be unnecessary. Now, what kind of uncommitted transactions can exist in the system? Transactions in XA PREPARE state: Unlikely, because XA transactions were of very limited use before MDEV-742 was fixed in 10.5. Any active transactions: Normally, with the default value of innodb_fast_shutdown=1 (as well as with the value 0) we should wait for these to exit. Any transactions that were recovered and have not been rolled back by the background thread yet. Let us double-check logs_empty_and_mark_files_at_shutdown() in 10.1 and 10.2. In 10.2, we have the following code there: if (ulint total_trx = srv_was_started && !srv_read_only_mode && srv_force_recovery < SRV_FORCE_NO_TRX_UNDO ? trx_sys_any_active_transactions() : 0) { if (srv_print_verbose_log && count > COUNT_INTERVAL) { service_manager_extend_timeout( COUNT_INTERVAL * CHECK_INTERVAL/1000000 * 2, "Waiting for %lu active transactions to finish" , (ulong) total_trx); ib::info() << "Waiting for " << total_trx << " active" << " transactions to finish" ; count = 0; } goto loop; } In 10.1, the condition is similar. According to this, on the pre-upgrade shutdown, we will also need innodb_read_only_mode=OFF and innodb_force_recovery<3 (ideally it should be 0). But, this is not the whole truth. For recovered transactions that might still be waiting for a rollback in a dedicated thread, we also have the check trx_roll_must_shutdown() , which includes the following: if (trx_get_dict_operation(trx) == TRX_DICT_OP_NONE && srv_shutdown_state != SRV_SHUTDOWN_NONE && !srv_undo_sources && srv_fast_shutdown) { return true ; } The rollback of a recovered DDL transaction (there can be at most one) should be finished before server starts up. The final condition is that if innodb_fast_shutdown≠0 , we will abort the rollback of recovered normal transactions. In 10.3 (which does not even matter for this upgrade problem), this check is in the function row_undo_step() and it should apply to innodb_fast_shutdown=3 (In MDEV-15832 ) only. It looks like the upgrade scripts would have to implement a polling loop that the number of recovered transactions is zero. It would also be good to check that no transactions exist in XA PREPARE state, so that an useless infinite wait can be avoided. Then, a shutdown with the following settings should be fine: innodb_fast_shutdown=1 (default); also 0 is acceptable innodb_read_only_mode=OFF (default, read-only) innodb_force_recovery=0 (default, read-only); also 1 or 2 are acceptable

Daniel Black added a comment - 2021-01-14 08:36

thanks for the detail.

I tested `set global read_only=1` and it doesn't wait for transactions to complete. It does ensure that existing transactions are doomed when they hit commit ERROR 1290 (HY000): The MariaDB server is running with the --read-only option so it cannot execute this statement (but not XA transactions - bug MDEV-24587)

select SUM(trx_rows_modified) from information_schema.INNODB_TRX; can show if undo records exist. This also included XA prepare statements exist on innodb.

innodb_read_only_mode and innodb_force_recovery aren't dynamic which helps, we can abort an upgrade if those are outside bounds.

if [ $(mysql  -B --column-names=0 -e 'select @@innodb_force_recovery < 3 and @@innodb_read_only=0') -eq 0 ]; then

  # upgrade fail

else

TODO - some polling loop on outstanding modified rows.

(maybe set global readonly), and loop again.

fi

Daniel Black added a comment - 2021-01-14 08:36 thanks for the detail. I tested `set global read_only=1` and it doesn't wait for transactions to complete. It does ensure that existing transactions are doomed when they hit commit ERROR 1290 (HY000): The MariaDB server is running with the --read-only option so it cannot execute this statement (but not XA transactions - bug MDEV-24587 ) select SUM(trx_rows_modified) from information_schema.INNODB_TRX; can show if undo records exist. This also included XA prepare statements exist on innodb. innodb_read_only_mode and innodb_force_recovery aren't dynamic which helps, we can abort an upgrade if those are outside bounds. if [ $(mysql -B --column-names=0 -e 'select @@innodb_force_recovery < 3 and @@innodb_read_only=0') -eq 0 ]; then # upgrade fail else TODO - some polling loop on outstanding modified rows. (maybe set global readonly), and loop again. fi

Marko Mäkelä added a comment - 2021-01-14 09:39

danblack, please note that read_only does not prevent InnoDB from internally modifying data (purging transaction history, merging buffered changes, and so on). The setting innodb_read_only would be for that. Sorry, I made a mistake earlier regarding the parameter name. I am not a fan of the MySQL practice of naming the underlying variables differently:

static MYSQL_SYSVAR_BOOL(read_only, srv_read_only_mode,

  PLUGIN_VAR_OPCMDARG | PLUGIN_VAR_READONLY,

  "Start InnoDB in read only mode (off by default)",

  NULL, NULL, FALSE);

Marko Mäkelä added a comment - 2021-01-14 09:39 danblack , please note that read_only does not prevent InnoDB from internally modifying data (purging transaction history, merging buffered changes, and so on). The setting innodb_read_only would be for that. Sorry, I made a mistake earlier regarding the parameter name. I am not a fan of the MySQL practice of naming the underlying variables differently: static MYSQL_SYSVAR_BOOL(read_only, srv_read_only_mode, PLUGIN_VAR_OPCMDARG | PLUGIN_VAR_READONLY, "Start InnoDB in read only mode (off by default)" , NULL, NULL, FALSE);

Otto Kekäläinen added a comment - 2021-02-03 20:18

I am experimenting this in https://github.com/MariaDB/server/commit/ab989c82231cabddf0db0fb9f8808984133a4eee

Can you provide some test case to use to verify that `innodb_fast_shutdown` did indeed run etc? I don't like adding features without some kind of tests to verify them.

Otto Kekäläinen added a comment - 2021-02-03 20:18 I am experimenting this in https://github.com/MariaDB/server/commit/ab989c82231cabddf0db0fb9f8808984133a4eee Can you provide some test case to use to verify that `innodb_fast_shutdown` did indeed run etc? I don't like adding features without some kind of tests to verify them.

Daniel Black added a comment - 2021-02-03 22:11 - edited

On innodb_fast_shutdown:

10.3 onwards will have an message InnoDB: to purge {num} transactions in the error log on shutdown if a slow shutdown is occurring. For this there actually needs to be transactions to purge. For CI test something like the following is sufficient:

 create table t (i int);

 insert into t select * from seq_1_to_4000;

Then package upgrade/replace.

On the patch do you need to use /etc/mysql/debian.cnf if it exists for client auth?

(I'm re-reading the above to see what a minimal (impract) form of script would like)

Daniel Black added a comment - 2021-02-03 22:11 - edited On innodb_fast_shutdown: 10.3 onwards will have an message InnoDB: to purge {num} transactions in the error log on shutdown if a slow shutdown is occurring. For this there actually needs to be transactions to purge. For CI test something like the following is sufficient: create table t (i int); insert into t select * from seq_1_to_4000; Then package upgrade/replace. On the patch do you need to use /etc/mysql/debian.cnf if it exists for client auth? (I'm re-reading the above to see what a minimal (impract) form of script would like)

Daniel Black added a comment - 2021-02-04 00:23

otto's approach here (as I guess it) is to get something into downstream debian for the next release on 10.5. For simplicity keeping that as innodb_fast_shutdown=0 will save significant version checks and script complexity.

marko should innodb_fast_shutdown=0 be limited to major upgrades only? (like the following suggests https://mariadb.com/kb/en/upgrading-between-major-mariadb-versions/#requirements-for-doing-an-upgrade-between-major-versions). I think it would really help some users if there is time to implement it.

If so prerm (someone is upgrade from us) needs

if [ ${1:-} == upgrade ] && [ ${1:-1} != 10.5 ]; then ...

preinst upgrade $oldversion
could check old version.

prerm remove
good do do purge as next version is unknown

otto your postinst configure implementation looks odd to see why you've added it here? Sorry if I'm missing something.

ref: https://www.debian.org/doc/debian-policy/ap-flowcharts.html

Next step, even minor upgrades should ensure no insert_undo on shutdown to avoid ~~MDEV-15912~~:
(not needed if

if [ $(mysql  -B --column-names=0 -e 'select @@innodb_force_recovery !=0 and @@innodb_read_only=0') -eq 0 ]; then

  # upgrade fail

fi

if innodb_fast_shutdown=0 ; continue # respect users wish for slow shutdown, or maybe we did this on major version change.

set global read_only=1;

# hurry the process: select  trx_mysql_thread_id from  information_schema.INNODB_TRX where  trx_rows_modified > 0; and kill {query}

loop select SUM(trx_rows_modified) from information_schema.INNODB_TRX (while != NULL or 0)

marko is IS.INNODB_TRX's trx_rows_modified the most script accessible way to see if insert undo records exist (even though these may be other records?)

Daniel Black added a comment - 2021-02-04 00:23 otto 's approach here (as I guess it) is to get something into downstream debian for the next release on 10.5. For simplicity keeping that as innodb_fast_shutdown=0 will save significant version checks and script complexity. marko should innodb_fast_shutdown=0 be limited to major upgrades only? (like the following suggests https://mariadb.com/kb/en/upgrading-between-major-mariadb-versions/#requirements-for-doing-an-upgrade-between-major-versions ). I think it would really help some users if there is time to implement it. If so prerm (someone is upgrade from us) needs if [ ${1:-} == upgrade ] && [ ${1:-1} != 10.5 ]; then ... preinst upgrade $oldversion could check old version. prerm remove good do do purge as next version is unknown otto your postinst configure implementation looks odd to see why you've added it here? Sorry if I'm missing something. ref: https://www.debian.org/doc/debian-policy/ap-flowcharts.html Next step, even minor upgrades should ensure no insert_undo on shutdown to avoid MDEV-15912 : (not needed if if [ $(mysql -B --column-names=0 -e 'select @@innodb_force_recovery !=0 and @@innodb_read_only=0') -eq 0 ]; then # upgrade fail fi if innodb_fast_shutdown=0 ; continue # respect users wish for slow shutdown, or maybe we did this on major version change. set global read_only=1; # hurry the process: select trx_mysql_thread_id from information_schema.INNODB_TRX where trx_rows_modified > 0; and kill {query} loop select SUM(trx_rows_modified) from information_schema.INNODB_TRX (while != NULL or 0) marko is IS.INNODB_TRX's trx_rows_modified the most script accessible way to see if insert undo records exist (even though these may be other records?)

Marko Mäkelä added a comment - 2021-02-06 10:40

danblack, like I wrote earlier, I believe that it should suffice to ensure that no uncommitted transactions will exist when upgrading from older versions to 10.3 or later. That is, innodb_fast_shutdown=0 should not be necessary. What is necessary is to ensure that all transactions have been rolled back before shutting down the old version. This must be tested extensively, of course.

I think that it is a reasonable requirement to always ensure that no transactions exist before upgrading between major versions. An existing transaction in XA PREPARE state could hold locks that would make some upgrade script hang. This assumes that minor version upgrades are not going to execute any DDL scripts. Maybe it is not too intrusive to always require that no transactions are pending before upgrading?

A clean shutdown is always recommended (instead of killing the server), especially because some versions refuse to perform a crash-upgrade from older versions. The most recent example is 10.5 (due to ~~MDEV-12353~~). So, basically we would want innodb_fast_shutdown=1 (or 0).

Now, how to detect transactions that are active? I was thinking that one way could be to parse the SHOW ENGINE INNODB STATUS and look for the word ACTIVE, as it is displayed by trx_print_low(). This should include also recovered transactions that are being rolled in the background, or remain in XA PREPARE state. I think that your suggestion should work even better. I found the following code:

row->trx_rows_modified = trx->undo_no;

This will initialize information_schema.innodb_trx.trx_rows_modified. If there are no rows to be rolled back, ~~MDEV-15912~~ should not be an issue, even if the transaction were active and possibly locking some records. (Locks will be discarded on restart, except we recover explicit locks for the modified rows.)

Once you have something, I think that elenst should incorporate it in her upgrade testing framework, to validate these assumptions and check that ~~MDEV-15912~~ will be avoided when upgrading from 10.2 or earlier.

Marko Mäkelä added a comment - 2021-02-06 10:40 danblack , like I wrote earlier, I believe that it should suffice to ensure that no uncommitted transactions will exist when upgrading from older versions to 10.3 or later. That is, innodb_fast_shutdown=0 should not be necessary. What is necessary is to ensure that all transactions have been rolled back before shutting down the old version. This must be tested extensively, of course. I think that it is a reasonable requirement to always ensure that no transactions exist before upgrading between major versions. An existing transaction in XA PREPARE state could hold locks that would make some upgrade script hang. This assumes that minor version upgrades are not going to execute any DDL scripts. Maybe it is not too intrusive to always require that no transactions are pending before upgrading? A clean shutdown is always recommended (instead of killing the server), especially because some versions refuse to perform a crash-upgrade from older versions. The most recent example is 10.5 (due to MDEV-12353 ). So, basically we would want innodb_fast_shutdown=1 (or 0). Now, how to detect transactions that are active? I was thinking that one way could be to parse the SHOW ENGINE INNODB STATUS and look for the word ACTIVE , as it is displayed by trx_print_low() . This should include also recovered transactions that are being rolled in the background, or remain in XA PREPARE state. I think that your suggestion should work even better. I found the following code: row->trx_rows_modified = trx->undo_no; This will initialize information_schema.innodb_trx.trx_rows_modified . If there are no rows to be rolled back, MDEV-15912 should not be an issue, even if the transaction were active and possibly locking some records. (Locks will be discarded on restart, except we recover explicit locks for the modified rows.) Once you have something, I think that elenst should incorporate it in her upgrade testing framework, to validate these assumptions and check that MDEV-15912 will be avoided when upgrading from 10.2 or earlier.

Otto Kekäläinen added a comment - 2021-02-10 10:07 - edited

Docs about Debian maintainer scripts: https://www.debian.org/doc/debian-policy/ch-maintainerscripts.html

It might be possible to hook into the mechanism that calls the `mariadb-server-10.5.preinst upgrade <old-version>` and then issue commands to the existing server (if still running) to do a different kind of shutdown on minor and major upgrades. Note that there is also the `dh_systemd_start --restart-after-upgrade` customization to consider in `debian/rules`.

You can debug what the maintainer scripts do by running `DEBIAN_SCRIPT_TRACE=1` before running `apt install` or `apt upgrade`. I Just did it for a 10.5.6 install and then single plugin upgrade to 10.5.8 (which triggers a restart of mariadbd) and a full server 10.5.6-1 (Debian) to 10.5.8-1 (Debian) upgrade. Logs attached debian-script-debug-upgrade-server-2.txt debian-script-debug-upgrade-server.txt debian-script-debug-install.txt

Breakdown of maintainer scripts and their arguments run on mariadb-server install:

mariadb-server-10.5.config configure
mariadb-server-10.5.preinst install
mariadb-server-10.5.config configure
mariadb-server-10.5.postinst configure
mariadb-server-10.5.config configure /etc/mysql
mariadb-server-10.5.postinst triggered /etc/mysql

Breakdown of maintainer scripts and their arguments run on mariadb-plugin-spider upgrade:

mariadb-server-10.5.config configure /etc/mysql
mariadb-server-10.5.postinst triggered /etc/mysql

Breakdown of maintainer scripts and their arguments run on mariadb-server-10.5 upgrade:

new/mariadb-server-10.5.config configure 1:10.5.6-2
new/mariadb-server-10.5.preinst upgrade 1:10.5.6-2 1:10.5.8-3
old/mariadb-server-10.5.postrm upgrade 1:10.5.8-3
new/mariadb-server-10.5.config configure 1:10.5.6-2
new/mariadb-server-10.5.postinst configure 1:10.5.6-2

Note: mariadb-server-10.5.prerm does not contain code to be debugged with DEBIAN_SCRIPT_TRACE, so not visible in lists above.

Otto Kekäläinen added a comment - 2021-02-10 10:07 - edited Docs about Debian maintainer scripts: https://www.debian.org/doc/debian-policy/ch-maintainerscripts.html It might be possible to hook into the mechanism that calls the `mariadb-server-10.5.preinst upgrade <old-version>` and then issue commands to the existing server (if still running) to do a different kind of shutdown on minor and major upgrades. Note that there is also the `dh_systemd_start --restart-after-upgrade` customization to consider in `debian/rules`. You can debug what the maintainer scripts do by running `DEBIAN_SCRIPT_TRACE=1` before running `apt install` or `apt upgrade`. I Just did it for a 10.5.6 install and then single plugin upgrade to 10.5.8 (which triggers a restart of mariadbd) and a full server 10.5.6-1 (Debian) to 10.5.8-1 (Debian) upgrade. Logs attached debian-script-debug-upgrade-server-2.txt debian-script-debug-upgrade-server.txt debian-script-debug-install.txt Breakdown of maintainer scripts and their arguments run on mariadb-server install: mariadb-server-10.5.config configure mariadb-server-10.5.preinst install mariadb-server-10.5.config configure mariadb-server-10.5.postinst configure mariadb-server-10.5.config configure /etc/mysql mariadb-server-10.5.postinst triggered /etc/mysql Breakdown of maintainer scripts and their arguments run on mariadb-plugin-spider upgrade: mariadb-server-10.5.config configure /etc/mysql mariadb-server-10.5.postinst triggered /etc/mysql Breakdown of maintainer scripts and their arguments run on mariadb-server-10.5 upgrade: new/mariadb-server-10.5.config configure 1:10.5.6-2 new/mariadb-server-10.5.preinst upgrade 1:10.5.6-2 1:10.5.8-3 old/mariadb-server-10.5.postrm upgrade 1:10.5.8-3 new/mariadb-server-10.5.config configure 1:10.5.6-2 new/mariadb-server-10.5.postinst configure 1:10.5.6-2 Note: mariadb-server-10.5.prerm does not contain code to be debugged with DEBIAN_SCRIPT_TRACE, so not visible in lists above.

Otto Kekäläinen added a comment - 2021-05-09 06:57

marko danblack I've done some experiments on this topic in https://github.com/ottok/mariadb/commit/50b1bc808a2b0d76ae02bafa2f333ae3b66bbe67

However, to be able to complete this, you need to be very clear on what is the actual problem we try to fix here and what is the correct way to initiate a graceful shutdown on MariaDB.

Since the information is useful beyond just this Jira issue, I suggest you write a knowledge base article that clearly states:

When and why are graceful MariaDB shutdowns needed?
Which commands should be used? What is the difference between the alternative commands? Is it OK to run all of them, any downsides?
What versions of MariaDB introduced new binary formats? Meaning, when upgrading from what version to what version is it in particular important to do a graceful shutdown?
What are the symptoms is the shutdown was not graceful and start fails? How does such a start failure look like?
How long does the graceful shutdown take? Is it slow because of database size, of constant load or some special kind of load (e.g. long transactions)?

For the purpose of this Jira issue, I personally would also like to know:

How to trigger the failure mode? It is difficult to know if the changes fixed it unless there is a way to fail and then to see that failures stopped.
What is the backup method if `mariadb` does not work, for example if the installation has turned of unix socket authentication access for root user? Can the graceful shutdown be initiated by sending a special shutdown signal (e.g. kill -19) to mariadbd?

Otto Kekäläinen added a comment - 2021-05-09 06:57 marko danblack I've done some experiments on this topic in https://github.com/ottok/mariadb/commit/50b1bc808a2b0d76ae02bafa2f333ae3b66bbe67 However, to be able to complete this, you need to be very clear on what is the actual problem we try to fix here and what is the correct way to initiate a graceful shutdown on MariaDB. Since the information is useful beyond just this Jira issue, I suggest you write a knowledge base article that clearly states: When and why are graceful MariaDB shutdowns needed? Which commands should be used? What is the difference between the alternative commands? Is it OK to run all of them, any downsides? What versions of MariaDB introduced new binary formats? Meaning, when upgrading from what version to what version is it in particular important to do a graceful shutdown? What are the symptoms is the shutdown was not graceful and start fails? How does such a start failure look like? How long does the graceful shutdown take? Is it slow because of database size, of constant load or some special kind of load (e.g. long transactions)? For the purpose of this Jira issue, I personally would also like to know: How to trigger the failure mode? It is difficult to know if the changes fixed it unless there is a way to fail and then to see that failures stopped. What is the backup method if `mariadb` does not work, for example if the installation has turned of unix socket authentication access for root user? Can the graceful shutdown be initiated by sending a special shutdown signal (e.g. kill -19) to mariadbd?

Marko Mäkelä added a comment - 2021-05-17 05:24

otto, thank you for taking a look at this. The main motivation for a clean shutdown before upgrade is to avoid hitting ~~MDEV-15912~~. The InnoDB undo log format was simplified in ~~MDEV-12288~~ in MariaDB Server 10.3, going from 2 persistent undo logs per transaction to 1. That is, log for INSERT operations will no longer be written to a separate log. While some logic was implemented to deal with upgrades, it later turned out that it was not perfect.

One way to repeat the problem should be to kill the pre-10.3 server during a write workload, and then upgrade to 10.3. When upgrading from a version earlier than 10.2, or to a version later than 10.3, startup after such a crash would be refused because the redo log would not be empty. That can be ‘fixed’ by starting the old server with innodb_force_recovery=3 and initiating a clean shutdown. After that step, the undo logs would still be in the problematic state, but the redo log would look clean and the server should be able to repeat ~~MDEV-15912~~.

I hope that elenst can confirm or deny my above claim that ~~MDEV-15912~~ can only be hit when recovered transactions would exist after an upgrade.

I think that the the following constraints must be upheld during upgrade:

No transactions must exist in XA PREPARE state. (This is important for ~~MDEV-15912~~, but I think that it should be ‘common sense’ also when upgrading from 10.3 or later versions.)
A clean shutdown of the old server must succeed. Only innodb_fast_shutdown=1 or innodb_fast_shutdown=0 are allowed. We probably want innodb_fast_shutdown=1, because a slow shutdown could take tens of minutes, depending on the size of the change buffer and the transaction history.
If the server hung during shutdown (due to some bug), there are good chances that the newer server will refuse to start up because the redo log does not look empty. So, maybe there should be an option for the upgrade script to abort and require manual intervention.

The redo log format was radically changed in 10.2 (with the introduction of the log file format identifier that I implemented in MySQL 5.7.9) and 10.5 (~~MDEV-12353~~) but not changed at all in 10.6. It will likely be changed in 10.7 once more (~~MDEV-14425~~). 10.2+ will refuse to start up after a crash of 10.1 or earlier, and 10.5+ will refuse to start up after a crash of 10.4 or earlier.

Marko Mäkelä added a comment - 2021-05-17 05:24 otto , thank you for taking a look at this. The main motivation for a clean shutdown before upgrade is to avoid hitting MDEV-15912 . The InnoDB undo log format was simplified in MDEV-12288 in MariaDB Server 10.3, going from 2 persistent undo logs per transaction to 1. That is, log for INSERT operations will no longer be written to a separate log. While some logic was implemented to deal with upgrades, it later turned out that it was not perfect. One way to repeat the problem should be to kill the pre-10.3 server during a write workload, and then upgrade to 10.3. When upgrading from a version earlier than 10.2, or to a version later than 10.3, startup after such a crash would be refused because the redo log would not be empty. That can be ‘fixed’ by starting the old server with innodb_force_recovery=3 and initiating a clean shutdown. After that step, the undo logs would still be in the problematic state, but the redo log would look clean and the server should be able to repeat MDEV-15912 . I hope that elenst can confirm or deny my above claim that MDEV-15912 can only be hit when recovered transactions would exist after an upgrade. I think that the the following constraints must be upheld during upgrade: No transactions must exist in XA PREPARE state. (This is important for MDEV-15912 , but I think that it should be ‘common sense’ also when upgrading from 10.3 or later versions.) A clean shutdown of the old server must succeed. Only innodb_fast_shutdown=1 or innodb_fast_shutdown=0 are allowed. We probably want innodb_fast_shutdown=1 , because a slow shutdown could take tens of minutes, depending on the size of the change buffer and the transaction history. If the server hung during shutdown (due to some bug), there are good chances that the newer server will refuse to start up because the redo log does not look empty. So, maybe there should be an option for the upgrade script to abort and require manual intervention. The redo log format was radically changed in 10.2 (with the introduction of the log file format identifier that I implemented in MySQL 5.7.9) and 10.5 ( MDEV-12353 ) but not changed at all in 10.6. It will likely be changed in 10.7 once more ( MDEV-14425 ). 10.2+ will refuse to start up after a crash of 10.1 or earlier, and 10.5+ will refuse to start up after a crash of 10.4 or earlier.

Otto Kekäläinen added a comment - 2021-05-17 05:55

Thanks Marko for the info, but please consider documenting the
failures and steps to resolve them in a KB article so that a database
admin reading them would be able to manually do the correct
shutdowns/starts. When we have that level of clarity, I can easily
write the Debian maint scripts to automate those manual steps.

Otto Kekäläinen added a comment - 2021-05-17 05:55 Thanks Marko for the info, but please consider documenting the failures and steps to resolve them in a KB article so that a database admin reading them would be able to manually do the correct shutdowns/starts. When we have that level of clarity, I can easily write the Debian maint scripts to automate those manual steps.

Marko Mäkelä added a comment - 2021-05-17 15:49

We will not have that clarify before my claim has been validated by testing an upgrade, by elenst when time permits. I think that it suffices to test an upgrade from 10.2 to 10.3.

Marko Mäkelä added a comment - 2021-05-17 15:49 We will not have that clarify before my claim has been validated by testing an upgrade, by elenst when time permits. I think that it suffices to test an upgrade from 10.2 to 10.3.

Otto Kekäläinen added a comment - 2021-05-21 06:21

Based on the discussion in https://jira.mariadb.org/browse/MDEV-22373 that is another potential case where innodb_fast_shutdown=0 could have prevented issues on next startup.

innodb_fast_shutdown=0 never produces a post-restart count mismatch corruption, while innodb_fast_shutdown=1 very reliably does

Otto Kekäläinen added a comment - 2021-05-21 06:21 Based on the discussion in https://jira.mariadb.org/browse/MDEV-22373 that is another potential case where innodb_fast_shutdown=0 could have prevented issues on next startup. innodb_fast_shutdown=0 never produces a post-restart count mismatch corruption, while innodb_fast_shutdown=1 very reliably does

Elena Stepanova added a comment - 2021-05-22 21:38

marko wrote:

One way to repeat the problem should be to kill the pre-10.3 server during a write workload, and then upgrade to 10.3. When upgrading from a version earlier than 10.2, or to a version later than 10.3, startup after such a crash would be refused because the redo log would not be empty. That can be ‘fixed’ by starting the old server with innodb_force_recovery=3 and initiating a clean shutdown. After that step, the undo logs would still be in the problematic state, but the redo log would look clean and the server should be able to repeat ~~MDEV-15912~~.

I hope that Elena Stepanova can confirm or deny my above claim that ~~MDEV-15912~~ can only be hit when recovered transactions would exist after an upgrade.

What you describe as a workaround is the exact test scenario described in ~~MDEV-15912~~:

start the current 10.0 server;
create some tables and run some DML on them;
kill the server during operation (with SIGKILL);
restart the server with innodb-force-recovery=3, no client activity;
shutdown the server normally;

Then the current 10.3 server is started on the same datadir. It starts, but crashes immediately afterwards as below.

If there is any doubt in the accuracy of the description, it can be easily checked on the datadir attached to ~~MDEV-15912~~.
I tried it now.
10.1.45 was started on that datadir with innodb-force-recovery=3 (and with innodb-fast-shutdown=0, just for the sake of it. I tried without it, too). It said:

...

2021-05-23  0:25:23 139806970283904 [Note] InnoDB: Initializing buffer pool, size = 128.0M

2021-05-23  0:25:23 139806970283904 [Note] InnoDB: Completed initialization of buffer pool

2021-05-23  0:25:23 139806970283904 [Note] InnoDB: Highest supported file format is Barracuda.

InnoDB: 1 transaction(s) which must be rolled back or cleaned up

InnoDB: in total 2 row operations to undo

InnoDB: Trx id counter is 170752

2021-05-23  0:25:23 139806970283904 [Note] InnoDB: 128 rollback segment(s) are active.

2021-05-23  0:25:23 139806970283904 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.47-87.0 started; log sequence number 34853352

2021-05-23  0:25:23 139806970283904 [Note] InnoDB: !!! innodb_force_recovery is set to 3 !!!

Then I shut it down, it said:

2021-05-23  0:25:30 139806952339200 [Note] /data/releases/10.1.45/bin/mysqld: Normal shutdown

2021-05-23  0:25:30 139806952339200 [Note] Event Scheduler: Purging the queue. 0 events

2021-05-23  0:25:30 139806319290112 [Note] InnoDB: FTS optimize thread exiting.

2021-05-23  0:25:30 139806952339200 [Note] InnoDB: Starting shutdown...

2021-05-23  0:25:31 139806952339200 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool

2021-05-23  0:25:32 139806952339200 [Note] InnoDB: Shutdown completed; log sequence number 34853362

2021-05-23  0:25:32 139806952339200 [Note] /data/releases/10.1.45/bin/mysqld: Shutdown complete

Then I started recent 10.3 on the same datadir. It said:

...

2021-05-23  0:25:55 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M

2021-05-23  0:25:55 0 [Note] InnoDB: Completed initialization of buffer pool

2021-05-23  0:25:55 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().

2021-05-23  0:25:55 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 2 row operations to undo

2021-05-23  0:25:55 0 [Note] InnoDB: Trx id counter is 170753

2021-05-23  0:25:55 0 [Note] InnoDB: Upgrading redo log: 2*50331648 bytes; LSN=34853362

2021-05-23  0:25:56 0 [Note] InnoDB: Starting to delete and rewrite log files.

2021-05-23  0:25:56 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 50331648 bytes

2021-05-23  0:25:56 0 [Note] InnoDB: Setting log file ./ib_logfile1 size to 50331648 bytes

2021-05-23  0:25:56 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0

2021-05-23  0:25:56 0 [Note] InnoDB: New log files created, LSN=34853362

2021-05-23  0:25:56 0 [Note] InnoDB: 128 out of 128 rollback segments are active.

2021-05-23  0:25:56 0 [Note] InnoDB: Starting in background the rollback of recovered transactions

2021-05-23  0:25:56 0 [Note] InnoDB: Creating sys_virtual system tables.

2021-05-23  0:25:56 0 [Note] InnoDB: Creating shared tablespace for temporary tables

2021-05-23  0:25:56 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...

2021-05-23  0:25:56 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.

2021-05-23  0:25:56 0 [Note] InnoDB: Waiting for purge to start

2021-05-23  0:25:56 0 [Note] InnoDB: 10.3.29 started; log sequence number 34853362; transaction id 170757

2021-05-23  0:25:56 0 [Note] InnoDB: Loading buffer pool(s) from /mnt-hd8t/bld/10.3-debug/data/ib_buffer_pool

2021-05-23  0:25:56 0 [Note] InnoDB: Cannot open '/mnt-hd8t/bld/10.3-debug/data/ib_buffer_pool' for reading: No such file or directory

2021-05-23  0:25:56 0 [Note] Plugin 'FEEDBACK' is disabled.

2021-05-23  0:25:56 0 [Note] InnoDB: Rolled back recovered transaction 169859

2021-05-23  0:25:56 0 [Note] InnoDB: Rollback of non-prepared transactions completed

2021-05-23  0:25:56 0 [Note] Server socket created on IP: '::'.

2021-05-23  0:25:56 0 [ERROR] Incorrect definition of table mysql.event: expected column 'sql_mode' at position 14 to have type set('REAL_AS_FLOAT','PIPES_AS_CONCAT','ANSI_QUOTES','IGNORE_SPACE','IGNORE_BAD_TABLE_OPTIONS','ONLY_FULL_GROUP_BY','NO_UNSIGNED_SUBTRACTION','NO_DIR_IN_CREATE','POSTGRESQL','ORACLE','MSSQL','DB2','MAXDB','NO_KEY_OPTIONS','NO_TABLE_OPTIONS','NO_FIELD_OPTIONS','MYSQL323','MYSQL40','ANSI','NO_AUTO_VALUE_ON_ZERO','NO_BACKSLASH_ESCAPES','STRICT_TRANS_TABLES','STRICT_ALL_TABLES','NO_ZERO_IN_DATE','NO_ZERO_DATE','INVALID_DATES','ERROR_FOR_DIVISION_BY_ZERO','TRADITIONAL','NO_AUTO_CREATE_USER','HIGH_NOT_PRECEDENCE','NO_ENGINE_SUBSTITUTION','PAD_CHAR_TO_FULL_LENGTH','EMPTY_STRING_IS_NULL','SIMULTANEOUS_ASSIGNMENT'), found type set('REAL_AS_FLOAT','PIPES_AS_CONCAT','ANSI_QUOTES','IGNORE_SPACE','IGNORE_BAD_TABLE_OPTIONS','ONLY_FULL_GROUP_BY','NO_UNSIGNED_SUBTRACTION','NO_DIR_IN_CREATE','POSTGRESQL','ORACLE','MSSQL','DB2','MAXDB','NO_KEY_OPTIONS','NO_TABLE_OPTIONS','NO_FIELD_OPTIONS','MYSQL323','MYSQL40','ANSI','NO_AUTO_VALU

2021-05-23  0:25:56 0 [ERROR] mysqld: Event Scheduler: An error occurred when initializing system tables. Disabling the Event Scheduler.

2021-05-23  0:25:56 6 [Warning] InnoDB: Table mysql/innodb_table_stats has length mismatch in the column name table_name.  Please run mysql_upgrade

2021-05-23  0:25:56 6 [Warning] InnoDB: Table mysql/innodb_index_stats has length mismatch in the column name table_name.  Please run mysql_upgrade

2021-05-23  0:25:56 0 [Note] Reading of all Master_info entries succeeded

2021-05-23  0:25:56 0 [Note] Added new Master_info '' to hash table

2021-05-23  0:25:56 0 [Note] /data/bld/10.3/bin/mysqld: ready for connections.

Version: '10.3.29-MariaDB-debug'  socket: '/data/bld/10.3/data/tmp/mysql.sock'  port: 3306  Source distribution

2021-05-23 00:25:56 0x7f81ea7fc700  InnoDB: Assertion failure in file /data/src/10.3/storage/innobase/trx/trx0purge.cc line 118

InnoDB: Failing assertion: purge_sys.tail.commit <= purge_sys.rseg->last_commit

InnoDB: We intentionally generate a memory trap.

...

So, I can neither confirm nor deny your claim:

yes, at least in this case the failure happens when recovered transactions exist after upgrade, but
no, starting the old server with innodb-force-recovery=3 and doing a clean shutdown doesn't prevent it.

Elena Stepanova added a comment - 2021-05-22 21:38 marko wrote: One way to repeat the problem should be to kill the pre-10.3 server during a write workload, and then upgrade to 10.3. When upgrading from a version earlier than 10.2, or to a version later than 10.3, startup after such a crash would be refused because the redo log would not be empty. That can be ‘fixed’ by starting the old server with innodb_force_recovery=3 and initiating a clean shutdown. After that step, the undo logs would still be in the problematic state, but the redo log would look clean and the server should be able to repeat MDEV-15912 . I hope that Elena Stepanova can confirm or deny my above claim that MDEV-15912 can only be hit when recovered transactions would exist after an upgrade. What you describe as a workaround is the exact test scenario described in MDEV-15912 : start the current 10.0 server; create some tables and run some DML on them; kill the server during operation (with SIGKILL); restart the server with innodb-force-recovery=3, no client activity; shutdown the server normally; Then the current 10.3 server is started on the same datadir. It starts, but crashes immediately afterwards as below. If there is any doubt in the accuracy of the description, it can be easily checked on the datadir attached to MDEV-15912 . I tried it now. 10.1.45 was started on that datadir with innodb-force-recovery=3 (and with innodb-fast-shutdown=0 , just for the sake of it. I tried without it, too). It said: ... 2021-05-23 0:25:23 139806970283904 [Note] InnoDB: Initializing buffer pool, size = 128.0M 2021-05-23 0:25:23 139806970283904 [Note] InnoDB: Completed initialization of buffer pool 2021-05-23 0:25:23 139806970283904 [Note] InnoDB: Highest supported file format is Barracuda. InnoDB: 1 transaction(s) which must be rolled back or cleaned up InnoDB: in total 2 row operations to undo InnoDB: Trx id counter is 170752 2021-05-23 0:25:23 139806970283904 [Note] InnoDB: 128 rollback segment(s) are active. 2021-05-23 0:25:23 139806970283904 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.47-87.0 started; log sequence number 34853352 2021-05-23 0:25:23 139806970283904 [Note] InnoDB: !!! innodb_force_recovery is set to 3 !!! Then I shut it down, it said: 2021-05-23 0:25:30 139806952339200 [Note] /data/releases/10.1.45/bin/mysqld: Normal shutdown 2021-05-23 0:25:30 139806952339200 [Note] Event Scheduler: Purging the queue. 0 events 2021-05-23 0:25:30 139806319290112 [Note] InnoDB: FTS optimize thread exiting. 2021-05-23 0:25:30 139806952339200 [Note] InnoDB: Starting shutdown... 2021-05-23 0:25:31 139806952339200 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool 2021-05-23 0:25:32 139806952339200 [Note] InnoDB: Shutdown completed; log sequence number 34853362 2021-05-23 0:25:32 139806952339200 [Note] /data/releases/10.1.45/bin/mysqld: Shutdown complete Then I started recent 10.3 on the same datadir. It said: ... 2021-05-23 0:25:55 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M 2021-05-23 0:25:55 0 [Note] InnoDB: Completed initialization of buffer pool 2021-05-23 0:25:55 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority(). 2021-05-23 0:25:55 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 2 row operations to undo 2021-05-23 0:25:55 0 [Note] InnoDB: Trx id counter is 170753 2021-05-23 0:25:55 0 [Note] InnoDB: Upgrading redo log: 2*50331648 bytes; LSN=34853362 2021-05-23 0:25:56 0 [Note] InnoDB: Starting to delete and rewrite log files. 2021-05-23 0:25:56 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 50331648 bytes 2021-05-23 0:25:56 0 [Note] InnoDB: Setting log file ./ib_logfile1 size to 50331648 bytes 2021-05-23 0:25:56 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0 2021-05-23 0:25:56 0 [Note] InnoDB: New log files created, LSN=34853362 2021-05-23 0:25:56 0 [Note] InnoDB: 128 out of 128 rollback segments are active. 2021-05-23 0:25:56 0 [Note] InnoDB: Starting in background the rollback of recovered transactions 2021-05-23 0:25:56 0 [Note] InnoDB: Creating sys_virtual system tables. 2021-05-23 0:25:56 0 [Note] InnoDB: Creating shared tablespace for temporary tables 2021-05-23 0:25:56 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ... 2021-05-23 0:25:56 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB. 2021-05-23 0:25:56 0 [Note] InnoDB: Waiting for purge to start 2021-05-23 0:25:56 0 [Note] InnoDB: 10.3.29 started; log sequence number 34853362; transaction id 170757 2021-05-23 0:25:56 0 [Note] InnoDB: Loading buffer pool(s) from /mnt-hd8t/bld/10.3-debug/data/ib_buffer_pool 2021-05-23 0:25:56 0 [Note] InnoDB: Cannot open '/mnt-hd8t/bld/10.3-debug/data/ib_buffer_pool' for reading: No such file or directory 2021-05-23 0:25:56 0 [Note] Plugin 'FEEDBACK' is disabled. 2021-05-23 0:25:56 0 [Note] InnoDB: Rolled back recovered transaction 169859 2021-05-23 0:25:56 0 [Note] InnoDB: Rollback of non-prepared transactions completed 2021-05-23 0:25:56 0 [Note] Server socket created on IP: '::'. 2021-05-23 0:25:56 0 [ERROR] Incorrect definition of table mysql.event: expected column 'sql_mode' at position 14 to have type set('REAL_AS_FLOAT','PIPES_AS_CONCAT','ANSI_QUOTES','IGNORE_SPACE','IGNORE_BAD_TABLE_OPTIONS','ONLY_FULL_GROUP_BY','NO_UNSIGNED_SUBTRACTION','NO_DIR_IN_CREATE','POSTGRESQL','ORACLE','MSSQL','DB2','MAXDB','NO_KEY_OPTIONS','NO_TABLE_OPTIONS','NO_FIELD_OPTIONS','MYSQL323','MYSQL40','ANSI','NO_AUTO_VALUE_ON_ZERO','NO_BACKSLASH_ESCAPES','STRICT_TRANS_TABLES','STRICT_ALL_TABLES','NO_ZERO_IN_DATE','NO_ZERO_DATE','INVALID_DATES','ERROR_FOR_DIVISION_BY_ZERO','TRADITIONAL','NO_AUTO_CREATE_USER','HIGH_NOT_PRECEDENCE','NO_ENGINE_SUBSTITUTION','PAD_CHAR_TO_FULL_LENGTH','EMPTY_STRING_IS_NULL','SIMULTANEOUS_ASSIGNMENT'), found type set('REAL_AS_FLOAT','PIPES_AS_CONCAT','ANSI_QUOTES','IGNORE_SPACE','IGNORE_BAD_TABLE_OPTIONS','ONLY_FULL_GROUP_BY','NO_UNSIGNED_SUBTRACTION','NO_DIR_IN_CREATE','POSTGRESQL','ORACLE','MSSQL','DB2','MAXDB','NO_KEY_OPTIONS','NO_TABLE_OPTIONS','NO_FIELD_OPTIONS','MYSQL323','MYSQL40','ANSI','NO_AUTO_VALU 2021-05-23 0:25:56 0 [ERROR] mysqld: Event Scheduler: An error occurred when initializing system tables. Disabling the Event Scheduler. 2021-05-23 0:25:56 6 [Warning] InnoDB: Table mysql/innodb_table_stats has length mismatch in the column name table_name. Please run mysql_upgrade 2021-05-23 0:25:56 6 [Warning] InnoDB: Table mysql/innodb_index_stats has length mismatch in the column name table_name. Please run mysql_upgrade 2021-05-23 0:25:56 0 [Note] Reading of all Master_info entries succeeded 2021-05-23 0:25:56 0 [Note] Added new Master_info '' to hash table 2021-05-23 0:25:56 0 [Note] /data/bld/10.3/bin/mysqld: ready for connections. Version: '10.3.29-MariaDB-debug' socket: '/data/bld/10.3/data/tmp/mysql.sock' port: 3306 Source distribution 2021-05-23 00:25:56 0x7f81ea7fc700 InnoDB: Assertion failure in file /data/src/10.3/storage/innobase/trx/trx0purge.cc line 118 InnoDB: Failing assertion: purge_sys.tail.commit <= purge_sys.rseg->last_commit InnoDB: We intentionally generate a memory trap. ... So, I can neither confirm nor deny your claim: yes, at least in this case the failure happens when recovered transactions exist after upgrade, but no, starting the old server with innodb-force-recovery=3 and doing a clean shutdown doesn't prevent it.

Marko Mäkelä added a comment - 2021-05-24 05:03

elenst, sorry, I used confusing language. I had put the word ‘fixed’ in quotes, because I think that the step of making the write-ahead log (redo log) clean is artificial and hopefully not executed by any serious DBA. The only purpose of the extra step of restarting the old server with innodb_force_recovery=3 is to make the redo log logically empty while retaining incomplete transactions in the undo logs.

My claim (which I would like you to confirm or refute) is that if there is an orderly shutdown of the 10.2 or earlier server (with innodb_force_recovery<3 and innodb_fast_shutdown≠2), then the 10.3 or later server should not hit ~~MDEV-15912~~. That is, if the redo log is logically empty after a shutdown, then also there must not exist incomplete transactions. (The setting innodb_force_recovery=3 specifically forbids the rollback of incomplete transactions.)

If my claim holds, we could make the upgrade process more robust by making the 10.3 or later server refuse to start up when an incomplete transaction in the pre-~~MDEV-12288~~ format (with separate insert_undo records) is recovered. In this way, it should be possible to correct the situation by starting up and shutting down the older server.

Marko Mäkelä added a comment - 2021-05-24 05:03 elenst , sorry, I used confusing language. I had put the word ‘fixed’ in quotes, because I think that the step of making the write-ahead log (redo log) clean is artificial and hopefully not executed by any serious DBA. The only purpose of the extra step of restarting the old server with innodb_force_recovery=3 is to make the redo log logically empty while retaining incomplete transactions in the undo logs. My claim (which I would like you to confirm or refute) is that if there is an orderly shutdown of the 10.2 or earlier server (with innodb_force_recovery<3 and innodb_fast_shutdown≠2 ), then the 10.3 or later server should not hit MDEV-15912 . That is, if the redo log is logically empty after a shutdown, then also there must not exist incomplete transactions. (The setting innodb_force_recovery=3 specifically forbids the rollback of incomplete transactions.) If my claim holds, we could make the upgrade process more robust by making the 10.3 or later server refuse to start up when an incomplete transaction in the pre- MDEV-12288 format (with separate insert_undo records) is recovered. In this way, it should be possible to correct the situation by starting up and shutting down the older server.

Elena Stepanova added a comment - 2021-05-25 15:54

I have sampled recorded failures attributed to ~~MDEV-15912~~ (there have been 73 of them over ~1.5 years, of which I checked a part), and each time it was happening in the same variation of the test as described above, with incomplete transactions remaining after upgrade, even though there are also normal upgrade tests running even more frequently.

For the directory attached to ~~MDEV-15912~~, a normal startup/shutdown on 10.2 server makes the problem go away – after that 10.3 starts on the same directory all right.

I think these two evidences combined are a good indication that your theory is correct and can be acted upon.

Elena Stepanova added a comment - 2021-05-25 15:54 I have sampled recorded failures attributed to MDEV-15912 (there have been 73 of them over ~1.5 years, of which I checked a part), and each time it was happening in the same variation of the test as described above, with incomplete transactions remaining after upgrade, even though there are also normal upgrade tests running even more frequently. For the directory attached to MDEV-15912 , a normal startup/shutdown on 10.2 server makes the problem go away – after that 10.3 starts on the same directory all right. I think these two evidences combined are a good indication that your theory is correct and can be acted upon.

Marko Mäkelä added a comment - 2021-05-25 18:33

greenman, can you please update the knowledge base according to my latest comments above?

otto and danblack, when it comes to upgrading, I would recommend the following. Hopefully you can implement this:

Before upgrading, the upgrade script will warn if innodb_force_recovery (a read-only parameter) differs from its default value 0. Upgrade will be refused if the value is 3 or more.
Before upgrading, if any transaction exists in the XA PREPARE state, a warning will be issued. Maybe the upgrade should be refused too? After all, ~~MDEV-742~~ could have changed some handling of this in 10.5.
Before upgrading, a shutdown of the server with innodb_fast_shutdown=1 will be executed. Upgrade will be aborted if the shutdown was not graceful. This could be determined from the exit code and the last messages of the error log. If the shutdown takes longer than some reasonable timeout, the upgrade would be aborted.
After upgrading: As a fix of ~~MDEV-15912~~, we will refuse InnoDB startup if incomplete transactions have any undo log in the pre-~~MDEV-12288~~ format. This would only be a safety measure for upgrading from an older version than 10.3, to defend against an error of the upgrade procedure.

Note: starting with 10.3 there is also the option innodb_fast_shutdown=3, which ends in a logically empty redo log, but skips the rollback of any transactions. While I do not anticipate any trouble related to using that option, it might be better to always use innodb_fast_shutdown=1 before upgrading

Marko Mäkelä added a comment - 2021-05-25 18:33 greenman , can you please update the knowledge base according to my latest comments above? otto and danblack , when it comes to upgrading, I would recommend the following. Hopefully you can implement this: Before upgrading, the upgrade script will warn if innodb_force_recovery (a read-only parameter) differs from its default value 0. Upgrade will be refused if the value is 3 or more. Before upgrading, if any transaction exists in the XA PREPARE state, a warning will be issued. Maybe the upgrade should be refused too? After all, MDEV-742 could have changed some handling of this in 10.5. Before upgrading, a shutdown of the server with innodb_fast_shutdown=1 will be executed. Upgrade will be aborted if the shutdown was not graceful. This could be determined from the exit code and the last messages of the error log. If the shutdown takes longer than some reasonable timeout, the upgrade would be aborted. After upgrading: As a fix of MDEV-15912 , we will refuse InnoDB startup if incomplete transactions have any undo log in the pre- MDEV-12288 format. This would only be a safety measure for upgrading from an older version than 10.3, to defend against an error of the upgrade procedure. Note: starting with 10.3 there is also the option innodb_fast_shutdown=3 , which ends in a logically empty redo log, but skips the rollback of any transactions. While I do not anticipate any trouble related to using that option, it might be better to always use innodb_fast_shutdown=1 before upgrading

Marko Mäkelä added a comment - 2021-06-21 15:29

~~MDEV-15912~~ is now fixed, and InnoDB will refuse to start up if the data directory is from an earlier version than MariaDB 10.3 that had not been shut down in a clean fashion. Hence, we now should have a reliable post-upgrade check that will prevent bad things from happening. But we still need a pre-upgrade check for smooth upgrades from 10.2 or earlier. I genuinely do not know if this needs to be addressed by packaging, or whether documenting this should be enough. Maybe the current scripts in (say) Debian are adequate for the masses who might be running a mostly idle instance of MariaDB. Those who are hosting valuable data should think twice before blindly executing something like sudo apt dist-upgrade.

Marko Mäkelä added a comment - 2021-06-21 15:29 MDEV-15912 is now fixed, and InnoDB will refuse to start up if the data directory is from an earlier version than MariaDB 10.3 that had not been shut down in a clean fashion. Hence, we now should have a reliable post-upgrade check that will prevent bad things from happening. But we still need a pre-upgrade check for smooth upgrades from 10.2 or earlier. I genuinely do not know if this needs to be addressed by packaging, or whether documenting this should be enough. Maybe the current scripts in (say) Debian are adequate for the masses who might be running a mostly idle instance of MariaDB. Those who are hosting valuable data should think twice before blindly executing something like sudo apt dist-upgrade .

Marko Mäkelä added a comment - 2021-07-26 12:56

danblack, I hope that you can review our upgrade scripts that they correspond to what I suggested here and in ~~MDEV-25793~~.

I think that it would be nice to ensure that ‘failure is an option’. First, if the server hangs on shutdown, or if some incompatible option (such as innodb_force_recovery) was in effect, maybe the upgrade script should abort just there. Ultimately, it is the upgraded server’s duty to perform further consistency checks and refuse startup if appropriate.

Should the upgraded server fail to start up, then I think that manual intervention by the system administrator will be needed: For example, install an old version of the server again and try to perform a proper shutdown. I do not think that we should attempt to implement rollback to an older version.

Marko Mäkelä added a comment - 2021-07-26 12:56 danblack , I hope that you can review our upgrade scripts that they correspond to what I suggested here and in MDEV-25793 . I think that it would be nice to ensure that ‘failure is an option’. First, if the server hangs on shutdown, or if some incompatible option (such as innodb_force_recovery ) was in effect, maybe the upgrade script should abort just there. Ultimately, it is the upgraded server’s duty to perform further consistency checks and refuse startup if appropriate. Should the upgraded server fail to start up, then I think that manual intervention by the system administrator will be needed: For example, install an old version of the server again and try to perform a proper shutdown. I do not think that we should attempt to implement rollback to an older version.

Otto Kekäläinen added a comment - 2021-08-23 18:22

Note illuusio that among the 17 comments in this issue there is a prototype in https://jira.mariadb.org/browse/MDEV-23755?focusedCommentId=188808&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-188808 - you don't need to start from scratch on this one.

Otto Kekäläinen added a comment - 2021-08-23 18:22 Note illuusio that among the 17 comments in this issue there is a prototype in https://jira.mariadb.org/browse/MDEV-23755?focusedCommentId=188808&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-188808 - you don't need to start from scratch on this one.

Tuukka Pasanen added a comment - 2021-08-25 05:14

otto As commented in Pull and reading what 'innodb_max_purge_lag_wait' should do I'm favor to believe that it's not actually needed as it but then if it doesn't make problem worse then extra care problems like this is not bad thing.
It should be enough to wait clean shutdown with mariadb-admin or does it just sigkill process?
Just reading comments and is there good way to hit this problem. Create database on 10.2 (or below) with client write to it then kill it when there is load and then upgrade to 10.3 and try to start?

Tuukka Pasanen added a comment - 2021-08-25 05:14 otto As commented in Pull and reading what 'innodb_max_purge_lag_wait' should do I'm favor to believe that it's not actually needed as it but then if it doesn't make problem worse then extra care problems like this is not bad thing. It should be enough to wait clean shutdown with mariadb-admin or does it just sigkill process? Just reading comments and is there good way to hit this problem. Create database on 10.2 (or below) with client write to it then kill it when there is load and then upgrade to 10.3 and try to start?

Tuukka Pasanen added a comment - 2023-01-23 09:28

PR is closed and this seems be in order

Tuukka Pasanen added a comment - 2023-01-23 09:28 PR is closed and this seems be in order

MariaDB Server

safety for upgrades (innodb) and packaging

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration