The problem of early initialization of MariaDB manifests in a few ways.
Currently, containers restart the mariadbd instance multiple times because initialization happens in several stages, which causes slow starts such as the one reported in MDEV-27074.
Work on running mariadb-upgrade in a container (MDEV-25670) also hits this limitation. MDEV-27068 showed the need to run a skip-networking mariadbd instance so that mariadb-upgrade completes before user traffic occurs.
Both of these rely on the absence of TCP networking to prevent user traffic from occurring.
The problems with a background mariadbd instance are:
- Wasted CPU/IO initializing everything, just to be torn down after a few actions
- Poor integration with systemd and other service managers
- The actual startup then triggers InnoDB recovery
The steps between initialization and a user-accessible service are:
- The accessibility of TCP ports
- The systemd message indicating that the server is ready, which allows dependent services to start up.
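For reference, the systemd readiness message is a small datagram sent to the socket named in the NOTIFY_SOCKET environment variable. A minimal Python sketch of that protocol follows; the function name notify_ready is made up for illustration, and this is not how mariadbd itself sends the message.

```python
import os
import socket


def notify_ready(env=None):
    """Send READY=1 to the service manager, if one is listening.

    Returns True if a message was sent, False when NOTIFY_SOCKET is unset.
    """
    env = os.environ if env is None else env
    path = env.get("NOTIFY_SOCKET")
    if not path:
        return False  # not running under a notify-type service manager
    # A leading '@' denotes a Linux abstract-namespace socket.
    addr = "\0" + path[1:] if path.startswith("@") else path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(b"READY=1", addr)
    return True
```

Holding back this one call until initialization has finished is what keeps dependent services from starting too early.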
This MDEV proposes the addition of a system variable deferred_networking to address this. Its behaviour is:
- Defaults to 0, for backwards compatibility
- When 1, TCP ports are not listened on
- When 1, no systemd READY=1 message is sent
Together these provide a window for mariadb-upgrade and container-based initialization (of user data, timezones) to occur. Once this is finished, setting deferred_networking to 0 results in:
- manipulation of the poll/fd structures to include TCP file descriptors
- a wakeup of the connection handler, like the one used for termination but without setting abort_loop=1, to move forward from the existing select/poll
- A systemd message READY=1
- No further manipulation of the deferred_networking system variable is permitted.
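The transition above can be sketched as a toy accept loop. This is a rough Python model of the idea only, not the server's C++ connection handler: the class DeferredListener, its method names, and the ready_sent flag (standing in for the READY=1 message) are all hypothetical.

```python
import selectors
import socket


class DeferredListener:
    """Toy accept loop: TCP is opened only after release() is called."""

    def __init__(self):
        self.sel = selectors.DefaultSelector()
        # Wakeup channel: a byte written here interrupts select() without
        # asking the loop to shut down (unlike a termination request).
        self.wake_r, self.wake_w = socket.socketpair()
        self.sel.register(self.wake_r, selectors.EVENT_READ, "wakeup")
        self.deferred = True
        self.tcp = None
        self.ready_sent = False

    def release(self):
        """Model of clearing deferred_networking; permitted only once."""
        if not self.deferred:
            raise RuntimeError("deferred_networking can no longer be changed")
        self.deferred = False
        self.wake_w.send(b"x")

    def _open_tcp(self):
        self.tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.tcp.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        self.tcp.listen()
        # Add the TCP descriptor to the set the loop is already polling.
        self.sel.register(self.tcp, selectors.EVENT_READ, "tcp")
        self.ready_sent = True  # stand-in for sending READY=1

    def poll_once(self, timeout=1.0):
        """Run one select() pass and return the labels of handled events."""
        handled = []
        for key, _events in self.sel.select(timeout):
            if key.data == "wakeup":
                self.wake_r.recv(16)  # drain the wakeup byte
                self._open_tcp()
            elif key.data == "tcp":
                conn, _addr = self.tcp.accept()
                conn.close()  # a real server would hand this to a thread
            handled.append(key.data)
        return handled
```

Until release() is called the loop has no TCP descriptor to poll, so no client can connect over TCP; a second release() raises, mirroring the rule that the variable cannot be changed again.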
In addition to the above:
- mariadb-upgrade will add a --end-deferred-networking option to set this system variable.
Hopefully the complexity of this is minimal and it can make 10.7, and possibly earlier.
serg, what do you think of the concept? Any improvements? I'm happy to write it.
A couple of notes:
- We want any startup, regardless of whether mysql_upgrade is to be used or not, to run recovery of ALL used storage engines!
If not, we cannot run a full upgrade!
- We don't need to start with --skip-networking during the systemd init; there are no users that can or will connect during that period