|
Suggested solution:
- Start server
- Check with mysql_upgrade if we need to do an upgrade
- If not, do nothing
- If upgrade needed
- Take down server with 'shutdown'
- Start server with separate socket (to ensure that no one can connect)
- Run upgrade
- Shutdown server and restart
Note: 1 is necessary because mysql_upgrade needs to connect to the server and should no try to access the upgrade_info directly.
The reasons are:
- mysql_upgrade needs to verify the following version information
- mysql_upgrade version (from compile time)
- Version in the mysql_upgrade_info file
- Version of the installed (running) server: All version info are needed to know if there is a need to do an upgrade and if the upgrade can be done.
mysql_upgrade also need to know that the mysql_upgrade info it access/creates is for the running server.
The safest way to do that (which will work for any use case) is to get the datadir from the server.
Another option would be to add an option --datadir to mysql_upgrade, but this is less safe as there are several ways this can go wrong:
- The datadir could point to another directory than the running databases directory, in which case the upgrade will probably never be done (if this is done in a script).
- One could add the mysql_update --datadir option to a global config file. In this case any local users installation of MariaDB would be affected and if they run 'mysql_upgrade' it would use the global installation and probably not do anything even if a local upgrade is needed.
|
|
Analysis of suggested solution in systemd:
| Concept |
systemd directives used |
pros |
cons |
| Start server |
ExecStartPre |
that services depending on mariadb.service won't start |
The 'do nothing' step below means that this can never be the main process |
| |
ExecStart |
this is now the main process |
services depending on mariadb.service will start. It might be very short lived if a shutdown occurs |
Check with mysql_upgrade if we need to do an upgrade
If not, do nothing
If upgrade needed
Take down server with 'shutdown'
Start server with separate socket (to ensure that no one can connect)
Run upgrade
Shutdown server and restart |
ExecStartPre |
its before activity occurs |
none |
| |
ExecStart |
might work in only starting the server once |
Once the main event loop is there its online, dependant services start and use the database, and you're running the upgrade at the same time as active services |
So in systemd the only way to run an offline mysql_upgrade as I prototyped (under multiinstance) as follows:
ExecStartPre=sh -c '/usr/local/mysql/bin/mariadbd $MYSQLD_MULTI_INSTANCE $MYSQLD_OPTS & echo $! > /tmp/p.pid'
|
ExecStartPre=sleep 5
|
ExecStartPre=/usr/local/mysql/bin/mariadb-upgrade $MYSQLD_MULTI_INSTANCE $MYSQLD_OPTS
|
ExecStartPre=sh -c 'kill $(< /tmp/p.pid)'
|
ExecStartPre=sleep 5
|
concept only - Putting a pre-pid in a proper location and doing proper termination waits instead, could wrap all pre bits into a single script. first start should include the skip-networking - different socket and skipping plugins (MDEV-27606).
Results in:
Process: 127859 ExecStartPre=sh -c /usr/local/mysql/bin/mariadbd $MYSQLD_MULTI_INSTANCE $MYSQLD_OPTS & echo $! > /tmp/p.pid (code=exited, status=0/SUCCESS)
|
Process: 127861 ExecStartPre=sleep 5 (code=exited, status=0/SUCCESS)
|
Process: 127884 ExecStartPre=/usr/local/mysql/bin/mariadb-upgrade $MYSQLD_MULTI_INSTANCE $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
|
Process: 127893 ExecStartPre=sh -c kill $(< /tmp/p.pid) (code=exited, status=0/SUCCESS)
|
Process: 127896 ExecStartPre=sleep 5 (code=exited, status=0/SUCCESS)
|
Main PID: 127905 (mariadbd)
|
Status: "Taking your SQL requests now..."
|
Tasks: 15 (limit: 18775)
|
Memory: 55.9M
|
CPU: 265ms
|
CGroup: /system.slice/system-mariadb.slice/mariadb@dan.service
|
└─127905 /usr/local/mysql/bin/mariadbd --defaults-file=/home/dan/.my.cnf
|
So there's no checks, because we can't make the temporary started server the main process. So the default original mysql_upgrade already checked the version.
So cons:
- two server start up processes are always required
- the first server startup is almost always unnecessary as its already upgraded (hence MDEV-27636)
pros:
- achieves bug goals of not running mysql_upgrade at the same time as the service is available to users
|
|
Note - MDEV-27060 - without the upgrade-system-tables option the whole upgrade can take 2 hrs depending on the users data.
|
|
Note 1: on the temporary server start, it might be prudent to run it in --skip-grant-tables so any user changed root authentication doesn't stand in the way of upgrades. Check the mysql_upgrade_info file is written; the changes to mysql_upgrade_info in MDEV-27068 are probably needed. The current released versions appear not to be able to retrieve SELECT VERSION() under FLUSH PRIVILEGES in the upgrade.
Note 2: suggest running mysql_upgrade as unix mysql user so that the datadir has its mysql_upgrade_info file owned by the mysql user.
|
|
So, we can not use `ExecStartPre=` to start MariaDB server (and then use `ExecStart=` to do the rest of what is needed) because of:
> Note that ExecStartPre= may not be used to start long-running processes. All processes forked off by processes invoked via ExecStartPre= will be killed before the next service process is run.
See: https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecStartPre=
So I can not think of a better option than doing all of it in `ExecStart=` with a bash script.
With this solution, there is still a problem. Once MariaDB server start, it will notify sytemd that it has started and depending services will start (which is fine if no mysql_upgrade is needed but problematic if we immediately shutdown the server to run mysq_upgrade on a non accessible socket).
This is what I understand from how the `Type=notify` works (see https://www.freedesktop.org/software/systemd/man/systemd.service.html#Type=, https://www.freedesktop.org/software/systemd/man/sd_notify.html# and https://github.com/MariaDB/server/blob/10.5/sql/mysqld.cc#L6202-L6203).
danblack not exactly related to Note 2 but good to know, I believe that we are using a deprecated Systemd option, see my last comment in MDEV-14707. Using prefix should simplify and harden the script.
|
|
> See: https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecStartPre=
> So I can not think of a better option than doing all of it in `ExecStart=` with a bash script.
Ouch, what a way to revert the quality of 7 years of semi-decent systemd scripts (MDEV-19210 would raise it to mostly decent).
Continuing this approach would need: https://www.freedesktop.org/software/systemd/man/sd_notify.html#MAINPID=%E2%80%A6, but lets keep looking.
mariadb-upgrade as subprocess (preferred)
With Monty's "It is trivially to add a patch to mariadb to delay notifications" perhaps MDEV-27167 is back on the table. To continue the functionality for the upgrade, the following extra server engineering:
- getting mariadbd to directly launch the mariadb-upgrade.
- even before other sockets are listening, create own dedicated temporary unix socket (in mkdtemp dir), abstract on linux (for ease of filesystem permissions), could be passed to the mariadb-upgrade
- the SO_PASSCRED (man unix), same as unix_socket auth, can validate the mariadb-upgrade pid is the one connected which bypasses the need to have some auth problem (you know how Debian people fiddle their root user). Except there's mysql client sub-processes. hmm, solution incomplete here, env MYSQL_PWD to mariadb-upgrade might work around user manipulated passwords?
- on subprocess completion continue startup
- service_manager_extend_timeout on new connections to this temp socket. Maybe also per SQL CHECK TABLE, ALTER .... This prevents systemd timing out on startup.
Other reuses of subprocess engineering on other business causes:
- reset root password script
- galera sst donor

- arbitrary user data load script for automated environments (like
MDEV-27435 ; pr #1932 that wanted more)
scripted ExecStart (non-preferred)
A (yuk) ExecStart script.
- starts mariadbd deferred networking
MDEV-27167 with explicit temporary socket specified
- loop until access on script
- mariadb-upgrade
Server engineering:
MDEV-27167
- service_manager_extend_timeout per above
- "sd_notify(MAINPID=x " to let systemd recovery from the hackery
- FLUSH PRIVILEGES to end deferred networking
|
|
Answering Daniel:
- I don't see what MDEV-19210 has to do with this task. There is no description for the task, so there is apparently nothing to do related to it.
- You already agreed one our call we cannot use ExecStartPre for starting MariaDB.
- The whole idea of this task is to increase quality as the current one approach does not work (which is the whole reason for this task).
- Yes, we would have to use MANPID, but that is a trivial change.
- With "trivial" I mean that something that can be done in a few hours of work. This particular thing can be done in < 30 min.
I don't think it is a good idea for the server to start sub processes to feed itself. Too complex and too many things that can go wrong.
(For example, the server does not know where the mysql client or mysql_upgrade utility are located, especially for local installations)
Doing this just to satisfy debian mariadb_upgrade problems in MariaDB 10.2 is definitely an overkill.
- Creating another socket is possible and use this for start & upgrade could be a viable solution. However this is more development work
than creating a simple script that restarts the server and we probably don't need to do this at all.
- I don't see how authentication is a problem for this case. If the startup scripts already run mysql_upgrade, the authentication problem must already be solved or have I missed anything?
- "service_manager_extend_timeout on new connections to this temp socket. Maybe also per SQL CHECK TABLE, ALTER .... " ; There is no relevant timeouts for this part when it comes to mysql_upgrade. Client and server timeouts are easily managed in a script.
- Yes, systemd startup can timeout for mysql_upgrade. However it can also timeout for normal server start, because of recovery, so there is nothing new here. There is not much we can do regarding this. However as it only happens when doing a full system upgrade (which changes the MariaDB main version) this is probably acceptable.
Regarding ExecStart script, the above is not the latest suggested solution.
You also have not explain why this is not your preferred option even if it more general and require less development work to do.
Here is a more complete spec:
Suggested solution for mysql_upgrade on Debian
- Assumption: We cannot use ExecStartPre for anything as it is only
usable for things that takes a 'short time' which means we cannot
use it to even normal start of the server as database recovery can
in the worst case take hours.
- Suggestion is to start the server during ExceStart and do recovery
and potentially mysql_upgrade before allowing normal users to access
the server.
Have ExecStart run a script that shell starts the server instead of
starting the server itself.
The script could look like the following:
- Start the server with all old options and a new option --skip-notifications
(This will not do any sd_notify() calls)
- Check with mysql_upgrade --check-if-upgrade-is-needed if we need to do an
upgrade
- If yes
- Stop the server
- Restart with all the above options and --socket=/tmp/mariadb_startup.sock
- Run upgrade on the server
- Restart the server with original options (no --skip-notification)
- If no
- Excecute SET @@notifications=1, which enables notfications.
For the initial normal start, there is no need to not start networking or
use a special socket as this stage is done during in systemd startup and there
are no user connected to the system when this happens. Any sub process that
depends on the server will wait for the systemd to get the notification that
they can start. If users are connected, they will get disconnected for the duration
for the mysql_upgrade, which they have to be anyhow as there seams to be possible
deadlocks happening if this is the case.
|
For the initial normal start, there is no need to not start networking or
use a special socket as this stage is done during in systemd startup and there are no user connected to the system when this happens
We can't rely on this. Load balancers, web sites, and monitors will take a socket accepting a connection as available. Systemd's READY=1 service availability is not the only way dependent services and server determine database availability.
systemd startup can timeout for mysql_upgrade. However it can also timeout for normal server start, because of recovery,
This was mitigated in MDEV-14705.
...so there is nothing new here. There is not much we can do regarding this.
If the mysql_upgrade is left at --upgrade-system-tables, I don't think we should do anything. I don't know if this is a good idea.
If we don't, the (--force mis)configuration of MDEV-27060 shows the CHECK TABLES component of the mysql_upgrade can take hours. This would timeout. The mysql_upgrade itself could issue the sd_notify extend timeout between table checks, but it would probably need a time estimate (based on size of table?).
This would need NotifyAccess=all in the systemd service file which carries no downsides.
I don't see how authentication is a problem for this case. If the startup scripts already run mysql_upgrade, the authentication problem must already be solved or have I missed anything?
I see enough people that have run mysql_secure_installation (I just saw MDEV-26593 mentioned - explains exasperation at least) or just changed the root user (because they seen unix socket auth equated to passwordless and just removed it) such that their log rotate and various other dependent packaging issues fail. I don't think these users rational is always good, but its not an always solved problem. Increasing the quality accounting for less common scenarios like root password changes and additional mariadb installs is good.
As I've tested with the mariadb docker library image, running mysql_upgrade under mariadbd --skip-grant-tables works, and if we move/secure the socket per the first add requirement here, there isn't downside.
So refining previous idea with these two constraints:
ExecStart script:
- create private tmp directory for socket/unique name
- start mariadbd --early-socket=$socket --pid-file=systemd --skip-grant-tables
- mysql_upgrade -S $socket
- remove private tmp directory
Changes systemd service:
- ExecStart=/boot_and_upgrade.sh
- NotifyAccess=all (if check all tables in mysql_upgrade is extending timeout)
Server change --early-socket=x
That:
- binds to the unix socket
- Unknowns: (detection of Innodb recovery completion? Address other plugin/mysql_upgrade problems like spider
MDEV-27095?)
- enters a pre-main loop without binding/listening to all sockets
- pre-main loop terminates under FLUSH PRIVILEGES, or an explicit system server variable like @@notifications=1, or @@early_socket=)
- early socket is unbound
Server change --pid-file=systemd
Rational
- I discovered recently that systemd launching containers (like https://github.com/eriksjolund/mariadb-podman-socket-activation) does so with a conman process (connection manager) taking the MAINPID. Trumping the MAINPID in these cases would be unexpected behaviour.
- pid-files haven't been ever needed under systemd, or containers, so repurposing an almost obsolete system server variable with an explicit unlikely value has trivial implementation aspects and low risk.
Function:
- when pid-file=systemd, pushes sd_notify(MAINPID=%d before READY=1
mysql_upgrade
- extend timeout for during CHECK TABLES
- notification of end of early loop (if not using FLUSH PRIVILEGES).
Questions
Even Monty's proposal above, I don't see the requirement to restart the server.
Without a restart of the server:
- FLUSH PRIVILEGES from mysql_upgrade should put it in a good state.
- SET @@notifications=1 at end of upgrade is effectively read
- the innodb buffer pool status from applying redo log / innodb buffer pool load can be considered warmed up.
Disadvantages of not restarting a server:
- We get to avoid writing a different early socket scenario and loop.
- There may be server command line options that would make applying the upgrade simpler/faster (but I can't think of any at the moment that would have advantage over a single server start).
|
|
I don't quite get why does the title include "Fixing debian...". How MariaDB does in-place upgrades for its data has nothing to do with distribution and packaging of files in distros, let alone anything specific to Debian?
MariaDB should not rely on deb nor rpm pre/post scripts to ensure its data files are in correct format (=upgraded). In the most simple scenario deb/rpm is just a way to download files and unpack them on the system with a couple of integrations to the distro paths and facilities. There is no guarantee for example that the MariaDB Server would be running nor restarted during the deb/rpm pre/post script execution.
The most sane solution is that the mariadbd binary itself checks that its data is on correct format, and if not, upgrades it or does whatever is needed. There should be just one way for it to do it and it should not be specific to any distro type or distro version. If mariadbd can't do it independently but needs a separate script mariadb-upgrade, then injecting a call to mariadb-upgrade in systemd and SysV init scripts is probably the best solution, as that is the way systems typically run mariadbd instead of calling it directly. You seem to be on that path already. If you choose to use PreExec or Exec is just a systemd implementation detail.
|
|
faust
On Debian, the mariadb_upgrade process is run at every start/reboot of the mariadb-server.
When does other distros or Windows run mariadb-upgrade if not at start of the server? As stated in MDEV-27606, mariadb-upgrade runs as part of the systemd/init.d scripts and are largely the same across all distros.
In native Fedora they made a custom script to check for the need of running mariadb-upgrade, but it is still run as part of the server startup: https://src.fedoraproject.org/rpms/mariadb/blob/rawhide/f/mariadb-check-upgrade.sh
Ideally the mariadbd would itself keep track of its data directory and upgrade it when needed, and not rely on external bash scripts to do it for it, nor even have the possibility to run new mariadbd version without having the datadir also upgraded.
|
|
Discussed plan with faust. Finalizing the details will happen next week, but for record:
- We will use systemd to automate the process where possible.
(in the future, we will have a way to --enable-networking, much like we have --skip-networking for similar automation in Docker Files)
- We will use mariadb-upgrade's --check-if-upgrade-is-needed flag
- check-if-upgrade-is-needed only looks at the datadir for mysql_upgrade_info file and returns 1 if there is a need to upgrade, otherwise 0. This is a fast operation, but requires read access on the datadir.
- We can run --check-if-upgrade-is-needed as part of PreExec step of SystemD, provided we can save the result for the actual Exec step.
- For the Exec step, we will start the server and only run mariadb-upgrade if the PreExec step returned 1.
|
|
Remember the main thread starting this this proposal ?
The conclusion was that mariadb-upgrade is only required for major and minor upgrades, but not for patchlevel upgrades.
So, you shouldn't need to run mariadb-upgrade for patchlevel upgrades. At all.
|