[MDEV-27613] Fixing debian to only run the full mysql_upgrade process when necessary Created: 2022-01-25  Updated: 2023-11-27

Status: Stalled
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: 10.5

Type: Bug Priority: Major
Reporter: Faustin Lammler Assignee: Faustin Lammler
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
PartOf
is part of MDEV-27068 running mariadb-upgrade on mariadb se... Closed
Relates
relates to MDEV-27167 system variable deferred-networking t... Closed
relates to MDEV-27606 change debian packaging to not run ma... Open
relates to MDEV-30499 mariadb in-server upgrade Open
relates to MDEV-25887 "Got notification message from PID xx... Closed

 Description   

On Debian, the mariadb_upgrade process is run at every start/reboot of the mariadb-server.

This is problematic because:

  • most of the time it's not needed;
  • if maria_upgrade has to be run, as in case of a major upgrade, then check and possible fix of tables can take from a few seconds to 20-30 seconds (normally) but potentially much longer if there is was a change in a character collation;
  • If during this time another maria_upgrade is run then it is very likely that mariadb_upgrade will hang because of unexpected conflicts. This is because maria_upgrade was not designed to be run in parallel (should now be fixed by MDEV-27279);
  • If anyone else connects to mysqld/mariadbd during maria_upgrade, bad things can happen as the tables are not up to date for the system. The possible problems are hangs or wrong data from selects or errors from table updates (as the indexes may not be up to date).

Here are some questions summarized from MDEV-27068:

  • how to integrate this in a many packages upgrade scenario?
  • what is the right moment to run the mariadb_upgrade (post-install script, startup/restart), if needed?
  • should we split system tables upgrade (quick) and the rest of the tables upgrade (can be very long)?
  • is this feasible with systemd directives (man systemd.directives) or should a script handle alone the whole process?
  • what about sysv init?


 Comments   
Comment by Faustin Lammler [ 2022-01-27 ]

Suggested solution:

  1. Start server
  2. Check with mysql_upgrade if we need to do an upgrade
    • If not, do nothing
    • If upgrade needed
      • Take down server with 'shutdown'
      • Start server with separate socket (to ensure that no one can connect)
      • Run upgrade
      • Shutdown server and restart

Note: 1 is necessary because mysql_upgrade needs to connect to the server and should no try to access the upgrade_info directly.

The reasons are:

  • mysql_upgrade needs to verify the following version information
    • mysql_upgrade version (from compile time)
    • Version in the mysql_upgrade_info file
    • Version of the installed (running) server: All version info are needed to know if there is a need to do an upgrade and if the upgrade can be done.

mysql_upgrade also need to know that the mysql_upgrade info it access/creates is for the running server.
The safest way to do that (which will work for any use case) is to get the datadir from the server.

Another option would be to add an option --datadir to mysql_upgrade, but this is less safe as there are several ways this can go wrong:

  • The datadir could point to another directory than the running databases directory, in which case the upgrade will probably never be done (if this is done in a script).
  • One could add the mysql_update --datadir option to a global config file. In this case any local users installation of MariaDB would be affected and if they run 'mysql_upgrade' it would use the global installation and probably not do anything even if a local upgrade is needed.
Comment by Daniel Black [ 2022-01-28 ]

Analysis of suggested solution in systemd:

Concept systemd directives used pros cons
Start server ExecStartPre that services depending on mariadb.service won't start The 'do nothing' step below means that this can never be the main process
  ExecStart this is now the main process services depending on mariadb.service will start. It might be very short lived if a shutdown occurs
Check with mysql_upgrade if we need to do an upgrade
If not, do nothing
If upgrade needed
Take down server with 'shutdown'
Start server with separate socket (to ensure that no one can connect)
Run upgrade
Shutdown server and restart
ExecStartPre its before activity occurs none
  ExecStart might work in only starting the server once Once the main event loop is there its online, dependant services start and use the database, and you're running the upgrade at the same time as active services

So in systemd the only way to run an offline mysql_upgrade as I prototyped (under multiinstance) as follows:

ExecStartPre=sh -c '/usr/local/mysql/bin/mariadbd $MYSQLD_MULTI_INSTANCE $MYSQLD_OPTS & echo $! > /tmp/p.pid'
ExecStartPre=sleep 5
ExecStartPre=/usr/local/mysql/bin/mariadb-upgrade $MYSQLD_MULTI_INSTANCE $MYSQLD_OPTS
ExecStartPre=sh -c 'kill $(< /tmp/p.pid)'
ExecStartPre=sleep 5

concept only - Putting a pre-pid in a proper location and doing proper termination waits instead, could wrap all pre bits into a single script. first start should include the skip-networking - different socket and skipping plugins (MDEV-27606).

Results in:

    Process: 127859 ExecStartPre=sh -c /usr/local/mysql/bin/mariadbd $MYSQLD_MULTI_INSTANCE $MYSQLD_OPTS & echo $! > /tmp/p.pid (code=exited, status=0/SUCCESS)
    Process: 127861 ExecStartPre=sleep 5 (code=exited, status=0/SUCCESS)
    Process: 127884 ExecStartPre=/usr/local/mysql/bin/mariadb-upgrade $MYSQLD_MULTI_INSTANCE $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
    Process: 127893 ExecStartPre=sh -c kill $(< /tmp/p.pid) (code=exited, status=0/SUCCESS)
    Process: 127896 ExecStartPre=sleep 5 (code=exited, status=0/SUCCESS)
   Main PID: 127905 (mariadbd)
     Status: "Taking your SQL requests now..."
      Tasks: 15 (limit: 18775)
     Memory: 55.9M
        CPU: 265ms
     CGroup: /system.slice/system-mariadb.slice/mariadb@dan.service
             └─127905 /usr/local/mysql/bin/mariadbd --defaults-file=/home/dan/.my.cnf

So there's no checks, because we can't make the temporary started server the main process. So the default original mysql_upgrade already checked the version.

So cons:

  • two server start up processes are always required
  • the first server startup is almost always unnecessary as its already upgraded (hence MDEV-27636)

pros:

  • achieves bug goals of not running mysql_upgrade at the same time as the service is available to users
Comment by Daniel Black [ 2022-01-28 ]

Note - MDEV-27060 - without the upgrade-system-tables option the whole upgrade can take 2 hrs depending on the users data.

Comment by Daniel Black [ 2022-01-29 ]

Note 1: on the temporary server start, it might be prudent to run it in --skip-grant-tables so any user changed root authentication doesn't stand in the way of upgrades. Check the mysql_upgrade_info file is written; the changes to mysql_upgrade_info in MDEV-27068 are probably needed. The current released versions appear not to be able to retrieve SELECT VERSION() under FLUSH PRIVILEGES in the upgrade.

Note 2: suggest running mysql_upgrade as unix mysql user so that the datadir has its mysql_upgrade_info file owned by the mysql user.

Comment by Faustin Lammler [ 2022-02-01 ]

So, we can not use `ExecStartPre=` to start MariaDB server (and then use `ExecStart=` to do the rest of what is needed) because of:

> Note that ExecStartPre= may not be used to start long-running processes. All processes forked off by processes invoked via ExecStartPre= will be killed before the next service process is run.

See: https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecStartPre=

So I can not think of a better option than doing all of it in `ExecStart=` with a bash script.

With this solution, there is still a problem. Once MariaDB server start, it will notify sytemd that it has started and depending services will start (which is fine if no mysql_upgrade is needed but problematic if we immediately shutdown the server to run mysq_upgrade on a non accessible socket).
This is what I understand from how the `Type=notify` works (see https://www.freedesktop.org/software/systemd/man/systemd.service.html#Type=, https://www.freedesktop.org/software/systemd/man/sd_notify.html# and https://github.com/MariaDB/server/blob/10.5/sql/mysqld.cc#L6202-L6203).

danblack not exactly related to Note 2 but good to know, I believe that we are using a deprecated Systemd option, see my last comment in MDEV-14707. Using prefix should simplify and harden the script.

Comment by Daniel Black [ 2022-02-02 ]

> See: https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecStartPre=
> So I can not think of a better option than doing all of it in `ExecStart=` with a bash script.

Ouch, what a way to revert the quality of 7 years of semi-decent systemd scripts (MDEV-19210 would raise it to mostly decent).

Continuing this approach would need: https://www.freedesktop.org/software/systemd/man/sd_notify.html#MAINPID=%E2%80%A6, but lets keep looking.

mariadb-upgrade as subprocess (preferred)

With Monty's "It is trivially to add a patch to mariadb to delay notifications" perhaps MDEV-27167 is back on the table. To continue the functionality for the upgrade, the following extra server engineering:

  • getting mariadbd to directly launch the mariadb-upgrade.
  • even before other sockets are listening, create own dedicated temporary unix socket (in mkdtemp dir), abstract on linux (for ease of filesystem permissions), could be passed to the mariadb-upgrade
  • the SO_PASSCRED (man unix), same as unix_socket auth, can validate the mariadb-upgrade pid is the one connected which bypasses the need to have some auth problem (you know how Debian people fiddle their root user). Except there's mysql client sub-processes. hmm, solution incomplete here, env MYSQL_PWD to mariadb-upgrade might work around user manipulated passwords?
  • on subprocess completion continue startup
  • service_manager_extend_timeout on new connections to this temp socket. Maybe also per SQL CHECK TABLE, ALTER .... This prevents systemd timing out on startup.

Other reuses of subprocess engineering on other business causes:

  • reset root password script
  • galera sst donor
  • arbitrary user data load script for automated environments (like MDEV-27435 ; pr #1932 that wanted more)

scripted ExecStart (non-preferred)

A (yuk) ExecStart script.

  • starts mariadbd deferred networking MDEV-27167 with explicit temporary socket specified
  • loop until access on script
  • mariadb-upgrade

Server engineering:

  • MDEV-27167
  • service_manager_extend_timeout per above
  • "sd_notify(MAINPID=x " to let systemd recovery from the hackery
  • FLUSH PRIVILEGES to end deferred networking
Comment by Michael Widenius [ 2022-02-15 ]

Answering Daniel:

  • I don't see what MDEV-19210 has to do with this task. There is no description for the task, so there is apparently nothing to do related to it.
  • You already agreed one our call we cannot use ExecStartPre for starting MariaDB.
  • The whole idea of this task is to increase quality as the current one approach does not work (which is the whole reason for this task).
  • Yes, we would have to use MANPID, but that is a trivial change.
  • With "trivial" I mean that something that can be done in a few hours of work. This particular thing can be done in < 30 min.

I don't think it is a good idea for the server to start sub processes to feed itself. Too complex and too many things that can go wrong.
(For example, the server does not know where the mysql client or mysql_upgrade utility are located, especially for local installations)
Doing this just to satisfy debian mariadb_upgrade problems in MariaDB 10.2 is definitely an overkill.

  • Creating another socket is possible and use this for start & upgrade could be a viable solution. However this is more development work
    than creating a simple script that restarts the server and we probably don't need to do this at all.
  • I don't see how authentication is a problem for this case. If the startup scripts already run mysql_upgrade, the authentication problem must already be solved or have I missed anything?
  • "service_manager_extend_timeout on new connections to this temp socket. Maybe also per SQL CHECK TABLE, ALTER .... " ; There is no relevant timeouts for this part when it comes to mysql_upgrade. Client and server timeouts are easily managed in a script.
  • Yes, systemd startup can timeout for mysql_upgrade. However it can also timeout for normal server start, because of recovery, so there is nothing new here. There is not much we can do regarding this. However as it only happens when doing a full system upgrade (which changes the MariaDB main version) this is probably acceptable.

Regarding ExecStart script, the above is not the latest suggested solution.
You also have not explain why this is not your preferred option even if it more general and require less development work to do.

Here is a more complete spec:

Suggested solution for mysql_upgrade on Debian

  • Assumption: We cannot use ExecStartPre for anything as it is only
    usable for things that takes a 'short time' which means we cannot
    use it to even normal start of the server as database recovery can
    in the worst case take hours.
  • Suggestion is to start the server during ExceStart and do recovery
    and potentially mysql_upgrade before allowing normal users to access
    the server.

Have ExecStart run a script that shell starts the server instead of
starting the server itself.

The script could look like the following:

  • Start the server with all old options and a new option --skip-notifications
    (This will not do any sd_notify() calls)
  • Check with mysql_upgrade --check-if-upgrade-is-needed if we need to do an
    upgrade
  • If yes
  • Stop the server
  • Restart with all the above options and --socket=/tmp/mariadb_startup.sock
  • Run upgrade on the server
  • Restart the server with original options (no --skip-notification)
  • If no
  • Excecute SET @@notifications=1, which enables notfications.

For the initial normal start, there is no need to not start networking or
use a special socket as this stage is done during in systemd startup and there
are no user connected to the system when this happens. Any sub process that
depends on the server will wait for the systemd to get the notification that
they can start. If users are connected, they will get disconnected for the duration
for the mysql_upgrade, which they have to be anyhow as there seams to be possible
deadlocks happening if this is the case.

Comment by Daniel Black [ 2022-02-16 ]

For the initial normal start, there is no need to not start networking or
use a special socket as this stage is done during in systemd startup and there are no user connected to the system when this happens

We can't rely on this. Load balancers, web sites, and monitors will take a socket accepting a connection as available. Systemd's READY=1 service availability is not the only way dependent services and server determine database availability.

systemd startup can timeout for mysql_upgrade. However it can also timeout for normal server start, because of recovery,

This was mitigated in MDEV-14705.

...so there is nothing new here. There is not much we can do regarding this.

If the mysql_upgrade is left at --upgrade-system-tables, I don't think we should do anything. I don't know if this is a good idea.

If we don't, the (--force mis)configuration of MDEV-27060 shows the CHECK TABLES component of the mysql_upgrade can take hours. This would timeout. The mysql_upgrade itself could issue the sd_notify extend timeout between table checks, but it would probably need a time estimate (based on size of table?).

This would need NotifyAccess=all in the systemd service file which carries no downsides.

I don't see how authentication is a problem for this case. If the startup scripts already run mysql_upgrade, the authentication problem must already be solved or have I missed anything?

I see enough people that have run mysql_secure_installation (I just saw MDEV-26593 mentioned - explains exasperation at least) or just changed the root user (because they seen unix socket auth equated to passwordless and just removed it) such that their log rotate and various other dependent packaging issues fail. I don't think these users rational is always good, but its not an always solved problem. Increasing the quality accounting for less common scenarios like root password changes and additional mariadb installs is good.

As I've tested with the mariadb docker library image, running mysql_upgrade under mariadbd --skip-grant-tables works, and if we move/secure the socket per the first add requirement here, there isn't downside.

So refining previous idea with these two constraints:

ExecStart script:

  • create private tmp directory for socket/unique name
  • start mariadbd --early-socket=$socket --pid-file=systemd --skip-grant-tables
  • mysql_upgrade -S $socket
  • remove private tmp directory

Changes systemd service:

  • ExecStart=/boot_and_upgrade.sh
  • NotifyAccess=all (if check all tables in mysql_upgrade is extending timeout)

Server change --early-socket=x

That:

  • binds to the unix socket
  • Unknowns: (detection of Innodb recovery completion? Address other plugin/mysql_upgrade problems like spider MDEV-27095?)
  • enters a pre-main loop without binding/listening to all sockets
  • pre-main loop terminates under FLUSH PRIVILEGES, or an explicit system server variable like @@notifications=1, or @@early_socket=)
  • early socket is unbound

Server change --pid-file=systemd

Rational

  • I discovered recently that systemd launching containers (like https://github.com/eriksjolund/mariadb-podman-socket-activation) does so with a conman process (connection manager) taking the MAINPID. Trumping the MAINPID in these cases would be unexpected behaviour.
  • pid-files haven't been ever needed under systemd, or containers, so repurposing an almost obsolete system server variable with an explicit unlikely value has trivial implementation aspects and low risk.

Function:

  • when pid-file=systemd, pushes sd_notify(MAINPID=%d before READY=1

mysql_upgrade

  • extend timeout for during CHECK TABLES
  • notification of end of early loop (if not using FLUSH PRIVILEGES).

Questions

Even Monty's proposal above, I don't see the requirement to restart the server.

Without a restart of the server:

  • FLUSH PRIVILEGES from mysql_upgrade should put it in a good state.
  • SET @@notifications=1 at end of upgrade is effectively read
  • the innodb buffer pool status from applying redo log / innodb buffer pool load can be considered warmed up.

Disadvantages of not restarting a server:

  • We get to avoid writing a different early socket scenario and loop.
  • There may be server command line options that would make applying the upgrade simpler/faster (but I can't think of any at the moment that would have advantage over a single server start).
Comment by Otto Kekäläinen [ 2022-02-18 ]

I don't quite get why does the title include "Fixing debian...". How MariaDB does in-place upgrades for its data has nothing to do with distribution and packaging of files in distros, let alone anything specific to Debian?

MariaDB should not rely on deb nor rpm pre/post scripts to ensure its data files are in correct format (=upgraded). In the most simple scenario deb/rpm is just a way to download files and unpack them on the system with a couple of integrations to the distro paths and facilities. There is no guarantee for example that the MariaDB Server would be running nor restarted during the deb/rpm pre/post script execution.

The most sane solution is that the mariadbd binary itself checks that its data is on correct format, and if not, upgrades it or does whatever is needed. There should be just one way for it to do it and it should not be specific to any distro type or distro version. If mariadbd can't do it independently but needs a separate script mariadb-upgrade, then injecting a call to mariadb-upgrade in systemd and SysV init scripts is probably the best solution, as that is the way systems typically run mariadbd instead of calling it directly. You seem to be on that path already. If you choose to use PreExec or Exec is just a systemd implementation detail.

Comment by Otto Kekäläinen [ 2022-02-18 ]

faust

On Debian, the mariadb_upgrade process is run at every start/reboot of the mariadb-server.

When does other distros or Windows run mariadb-upgrade if not at start of the server? As stated in MDEV-27606, mariadb-upgrade runs as part of the systemd/init.d scripts and are largely the same across all distros.

In native Fedora they made a custom script to check for the need of running mariadb-upgrade, but it is still run as part of the server startup: https://src.fedoraproject.org/rpms/mariadb/blob/rawhide/f/mariadb-check-upgrade.sh

Ideally the mariadbd would itself keep track of its data directory and upgrade it when needed, and not rely on external bash scripts to do it for it, nor even have the possibility to run new mariadbd version without having the datadir also upgraded.

Comment by Vicențiu Ciorbaru [ 2023-11-27 ]

Discussed plan with faust. Finalizing the details will happen next week, but for record:

  1. We will use systemd to automate the process where possible.
    (in the future, we will have a way to --enable-networking, much like we have --skip-networking for similar automation in Docker Files)
  2. We will use mariadb-upgrade's --check-if-upgrade-is-needed flag
    1. check-if-upgrade-is-needed only looks at the datadir for mysql_upgrade_info file and returns 1 if there is a need to upgrade, otherwise 0. This is a fast operation, but requires read access on the datadir.
    2. We can run --check-if-upgrade-is-needed as part of PreExec step of SystemD, provided we can save the result for the actual Exec step.
    3. For the Exec step, we will start the server and only run mariadb-upgrade if the PreExec step returned 1.
Comment by Sergei Golubchik [ 2023-11-27 ]

Remember the main thread starting this this proposal ?

The conclusion was that mariadb-upgrade is only required for major and minor upgrades, but not for patchlevel upgrades.

So, you shouldn't need to run mariadb-upgrade for patchlevel upgrades. At all.

Generated at Thu Feb 08 09:54:15 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.