[MDEV-25782] MariaDB all nodes stuck on shutdown Created: 2021-05-26 Updated: 2021-06-11 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | 10.5.10 |
| Fix Version/s: | 10.5 |
| Type: | Bug | Priority: | Major |
| Reporter: | Jason Logan | Assignee: | Julius Goryavsky |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | None | ||
| Environment: |
Ubuntu 20.10 minimal, no GUI |
||
| Attachments: |
|
| Description |
|
All of the following was done while logged in as root. If I reboot any node, it gets stuck shutting down MariaDB. The same thing happens if I do a "systemctl stop mariadb". My workaround has been to start the shutdown, run "ps -A | grep mariadb", and then "kill -9 pid" to make it stop; only then can I perform upgrades or reboots. The service must be killed to stop. I've attached a log, which is the same on all my nodes when I attempt to stop the service; it shows where they all get stuck. I saw someone mention time zones, so I set one group of nodes in a cluster to UTC, but it did not help. These are in the Central time zone. I've also let it sit for days and it still will not shut down. I am happy to try anything, because I have created a test cluster to see if I can get things working. |
| Comments |
| Comment by Sergei Golubchik [ 2021-05-26 ] | |
|
Do you have tables with indexed virtual columns? | |
| Comment by Jason Logan [ 2021-05-26 ] | |
|
I do not use virtual columns. | |
| Comment by Sergei Golubchik [ 2021-05-26 ] | |
|
Can you run something like
replacing 12345 with the actual mariadbd pid. You might need to install the mariadb*dbgsym* packages to get meaningful output. | |
| Comment by Jason Logan [ 2021-05-26 ] | |
|
I uploaded the trace file: mariadbd.trace | |
| Comment by Sergei Golubchik [ 2021-05-27 ] | |
|
sysprg, what do you take from it? | |
| Comment by Jason Logan [ 2021-06-10 ] | |
|
Is there any movement on this? This happens on both CentOS 8 and Ubuntu 20.10. | |
| Comment by Julius Goryavsky [ 2021-06-10 ] | |
|
Usually at this point (judging by the log) we have a fully completed wsrep deinitialization and a transition to InnoDB deinitialization. However, here I do not see the line "[Note] InnoDB: FTS optimize thread exiting." in the log. Question to jason1430: do I understand correctly that "mariadb.log" is the full server log, or is it just a snippet that refers to wsrep, without the other lines? Question to marko: can you tell me (based on "mariadbd.trace") whether we got into InnoDB deinitialization in this case? If not, then the server probably got stuck in wsrep deinitialization, which did not complete correctly. | |
| Comment by Jason Logan [ 2021-06-10 ] | |
|
Yes, it is the full log. | |
| Comment by Jason Logan [ 2021-06-10 ] | |
|
I had a node successfully stop. Here is the resulting log entry: Normal Shutdown: This is where it stops when it does not stop correctly: | |
| Comment by Julius Goryavsky [ 2021-06-11 ] | |
|
jason1430 thanks. Now we must study the issue further to understand whether this is a hang in wsrep, or whether wsrep managed to complete deinitialization and the hang then occurs in FTS processing (in InnoDB). | |
| Comment by Marko Mäkelä [ 2021-06-11 ] | |
|
I do not see any occurrence of storage/innobase in mariadbd.trace. Thread 6 doesn’t look like InnoDB: some ?? is calling signal_hand(). The other threads seem to be an idle thread pool handler and something Galera-related. It seems that InnoDB is at a very late phase of shutdown. Possibly Thread 4 is executing a sleep in logs_empty_and_mark_files_at_shutdown(). Where is the server error log? mariadb.log Could the parameter innodb_disallow_writes be a culprit for this? It is used by some Galera scripts, and the implementation is, in my opinion, misplaced: it blocks writes at the low level instead of at the high level (blocking any operation that would generate redo log). Finally, are there any InnoDB tables with FULLTEXT INDEX? I suspect that they cannot work correctly with Galera. | |
| Comment by Jason Logan [ 2021-06-11 ] | |
|
I can get you any log you need. These servers have no additional databases on them. They have the default DBs. I'm not using this cluster because I need it to be stable. |