[MDEV-22580] MariaDB crashes on startup Created: 2020-05-15  Updated: 2023-03-12  Resolved: 2023-03-11

Status: Closed
Project: MariaDB Server
Component/s: Authentication and Privilege System
Affects Version/s: 10.4
Fix Version/s: 10.11.3, 10.4.29, 10.5.20, 10.6.13, 10.8.8, 10.9.6, 10.10.4

Type: Bug Priority: Major
Reporter: Georgi Sinapov Assignee: Sergei Golubchik
Resolution: Duplicate Votes: 0
Labels: None
Environment:

CentOS 7.7 latest.
MariaDB 10.4.13 - via your repository.


Attachments: PNG File Selection_341.png     PNG File Selection_342.png     File core.23694.gz     Text File mariadb_10_4_28_crash_log.txt     HTML File mdev22580_threads     HTML File mdev22580_threads_full     File not_ok.cfg.tar.gz     File ok.cfg.tar.gz     File signature.asc     Text File startup_crash.txt    
Issue Links:
Duplicate
is duplicated by MDEV-30826 Invalid data on mysql.host segfaults ... Closed

 Description   

Hello,
Server crashes on startup on a subset of hosts. The rest have been upgraded successfully.
Attached is the startup log. I can't correlate the hosts environment, they are a bit different, but not fundamentally. I'm also not able to find the core file, although is says one has been written.

Please, advise.



 Comments   
Comment by Elena Stepanova [ 2020-05-15 ]

Were all hosts upgraded from the same version, and if so, which is it? If they were upgraded from different ones, then maybe it's the difference?
Also, could the difference be in the previous session? For example, hosts which fail were upgraded after a crash, while the "good" ones were shut down normally before upgrade?

Comment by Georgi Sinapov [ 2020-05-15 ]

1. All host had been upgraded from the same version - CentOS stock 5.5.65-1.
2. None of the hosts had been reloaded recently or just before the upgrade. The running servers had not been shut down prior the upgrade either.
3. mysql_upgrade had to be run manually, after a successful upgrade.

HTH

Comment by Elena Stepanova [ 2020-05-15 ]

Just to clarify, what's the actual chain of events were? So, you had 5.5.65 running, then you upgraded to 10.4.13 (I guess uninstalled CentOS-made 5.5 packages and installed MariaDB's 10.4.13), it started, then you ran mysql_upgrade manually, it succeeded, but after restart the servers crashed and continued crashing?

Comment by Georgi Sinapov [ 2020-05-15 ]

The upgrade was done on a running 5.5.65. It might not be a recommended approach, but I do it like that for all applications.

Then, if the server could start, mysql_upgrade had to be run manually. After that - no problems, whatsoever.

If the server didn't start, nothing could have been done, except trying to start it manually and collect crash info.

Comment by Elena Stepanova [ 2020-05-15 ]

Okay. Could you please include in the log excerpt the last 5.5 session before the upgrade? That is, from the same instance, one of the problematic ones,
5.5 server starts
5.5 server shuts down or otherwise ends working
10.4 server starts
10.4 server crashes

Your attachment currently contains 10.4 part, but not the 5.5 part.

If you have any instance where the data isn't confidential (e.g. a testing box), it would be helpful if you could provide the contents of <datadir>/mysql folder. If the data is confidential, then at least .frm files from there – they are just table structures, don't contain any data. You can pack them and upload to our ftp.askmonty.org/private.

Comment by Georgi Sinapov [ 2020-05-16 ]

Nothing weird on 5.5.65 startup. This is a box, which was unsuccessfully upgraded to 10.4 and then rolled back. Had to delete the following files first.

aria_log.00000001
aria_log_control
ibdata1
ib_logfile0
ib_logfile1

200516 11:13:39 mysqld_safe mysqld from pid file /var/run/mariadb/mariadb.pid ended
200516 11:14:13 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
200516 11:14:13 [Note] /usr/libexec/mysqld (mysqld 5.5.65-MariaDB) starting as process 69111 ...
200516 11:14:13 InnoDB: The InnoDB memory heap is disabled
200516 11:14:13 InnoDB: Mutexes and rw_locks use GCC atomic builtins
200516 11:14:13 InnoDB: Compressed tables use zlib 1.2.7
200516 11:14:13 InnoDB: Using Linux native AIO
200516 11:14:13 InnoDB: Initializing buffer pool, size = 128.0M
200516 11:14:13 InnoDB: Completed initialization of buffer pool
200516 11:14:13 InnoDB: highest supported file format is Barracuda.
200516 11:14:13 InnoDB: Waiting for the background threads to start
200516 11:14:14 Percona XtraDB (http://www.percona.com) 5.5.61-MariaDB-38.13 started; log sequence number 1597945
200516 11:14:14 [Note] Plugin 'FEEDBACK' is disabled.
200516 11:14:14 [Note] Server socket created on IP: '0.0.0.0'.
200516 11:14:14 [Note] Event Scheduler: Loaded 0 events
200516 11:14:14 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.5.65-MariaDB' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server
200516 11:14:34 [Note] /usr/libexec/mysqld: Normal shutdown
200516 11:14:34 [Note] Event Scheduler: Purging the queue. 0 events
200516 11:14:34 InnoDB: Starting shutdown...
200516 11:14:38 InnoDB: Shutdown completed; log sequence number 1597945
200516 11:14:38 [Note] /usr/libexec/mysqld: Shutdown complete

200516 11:14:38 mysqld_safe mysqld from pid file /var/run/mariadb/mariadb.pid ended

I also managed to save a core dump and uploaded it here. This site doesn't work for me.

Comment by Elena Stepanova [ 2020-05-21 ]

Stack trace from the coredump (all threads are attached, just in case):

10.4.13 CentOS 7 RPM

#2  <signal handler called>
#3  get_magic_sort (templ=0x55992d9468c4 "d") at /usr/src/debug/MariaDB-10.4.13/src_0/sql/sql_acl_getsort.ic:174
#4  0x000055992d08bfc6 in acl_reload (thd=0x55992fd63558) at /usr/src/debug/MariaDB-10.4.13/src_0/sql/sql_acl.cc:2367
#5  0x000055992d08d023 in acl_init (dont_read_acl_tables=<optimized out>) at /usr/src/debug/MariaDB-10.4.13/src_0/sql/sql_acl.cc:2292
#6  0x000055992d04f4a7 in mysqld_main (argc=3, argv=0x55992f490970) at /usr/src/debug/MariaDB-10.4.13/src_0/sql/mysqld.cc:5775
#7  0x00007fd5024ff555 in __libc_start_main () from /lib64/libc.so.6
#8  0x000055992d043024 in _start ()

mdev22580_threads_full mdev22580_threads

Comment by Elena Stepanova [ 2020-05-21 ]

Could you please paste or attach configuration file(s) from one of the boxes where the crash happened?

Comment by Georgi Sinapov [ 2020-05-22 ]

Attached. It is very simple and practically, the same.

Comment by Daniel Black [ 2022-11-01 ]

Belatedly, it looks like the mysql.host table contained a db column with an entry, when retrieved, wasn't null terminated, possibly because it was right at the maximum length of the db column.

Comment by Georgi Sinapov [ 2022-11-03 ]

Hello,
This is definitely not the case.

Comment by Sergei Golubchik [ 2023-03-10 ]

This is likely the same bug as in MDEV-30826

Comment by Georgi Sinapov [ 2023-03-12 ]

Unfortunately, the problem is still there.

Comment by Sergei Golubchik [ 2023-03-12 ]

Still where? You have attached mariadb_10_4_28_crash_log.txt, it says

Mar 12 13:02:18 localhost mysqld: 2023-03-12 13:02:18 0 [Note] Starting MariaDB 10.4.28-MariaDB source revision c8f2e9a5c0ac5905f28b050b7df5a9ffd914b7e7 as process 14549

while the bug was fixed in 10.4.29 (see Fix Version/s field above) in the commit 8145b308b02ee7ee657851c1faee49c7b7ebf511 (see the [Git Commits] tab of the linked bug).

As a workaround you can try

UPDATE mysql.host SET db='%' WHERE db=''

Generated at Thu Feb 08 09:15:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.