Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
I have a backup cluster of 3 nodes (#1, #6, #7); one of the servers (#6) is a slave of a node in a primary cluster.
When I restart MariaDB on node #1 or #7, it simply does not come back up.
Here is the end of the error log:
#1:~$ tail -f /var/log/mysql/error.log
2024-03-19 8:32:47 0 [Note] WSREP: Flushing memory map to disk...
2024-03-19 8:32:47 0 [Note] InnoDB: FTS optimize thread exiting.
2024-03-19 8:32:48 0 [Note] InnoDB: Starting shutdown...
2024-03-19 8:32:48 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool
2024-03-19 8:32:48 0 [Note] InnoDB: Restricted to 778752 pages due to innodb_buf_pool_dump_pct=25
2024-03-19 8:32:48 0 [Note] InnoDB: Buffer pool(s) dump completed at 240319 8:32:48
2024-03-19 8:32:51 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
2024-03-19 8:32:51 0 [Note] InnoDB: Shutdown completed; log sequence number 65275667813314; transaction id 7267229134
2024-03-19 8:32:51 0 [Note] /usr/sbin/mariadbd: Shutdown complete
The process running is:
#1:~$ ps aux | grep maria
mysql 1477 100 3.7 57497224 4955180 ? Sl 08:36 49:23 /usr/sbin/mariadbd --user=mysql --wsrep_recover --disable-log-error
And it consumes 2 cores at 100%, indefinitely (confirmed after one full hour).
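For reference, a minimal sketch of how the stuck recovery process can be inspected (PID 1477 taken from the ps output above; the exact tooling is a matter of preference):

# Per-thread CPU usage of the stuck process (PID from the ps output above)
top -H -p 1477

# Check whether the process is making system calls or just spinning in user space
sudo strace -f -tt -p 1477

# Dump all thread stacks (useful names require debug symbols for mariadbd)
sudo gdb -p 1477 -batch -ex "thread apply all bt"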
Both clusters have the exact same configuration, but I have this problem only on the backup cluster since the last MariaDB update (10.6.16 or 10.6.17).
How can I investigate this to find what's wrong?
I have already reproduced this 3 times, and every time I restart MariaDB (the process or the whole server) I have to go through this. Once the node is synchronized, I can restart MariaDB fine, but I am not sure for how many days that lasts.
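For clarity, by "synchronized" I mean the standard Galera status variables report the node as synced, which can be checked for example with:

# Should report "Synced" on a healthy node, and wsrep_ready = ON
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'"
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_ready'"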
The only workaround is to delete the contents of /var/lib/mysql, kill the process, and start MariaDB so it runs a full SST, which takes 30+ minutes.
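Roughly, that workaround looks like this (a sketch assuming systemd and the default datadir; exact paths and service name may differ):

# Stop the service and kill the stuck recovery process if it survives the stop
sudo systemctl stop mariadb
sudo pkill -9 -f wsrep_recover

# Wipe the datadir so the node is forced into a full SST on the next start
sudo rm -rf /var/lib/mysql/*

# Start the node again; it will request a full state transfer from a donor
sudo systemctl start mariadb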
The only recent configuration change is the option gcache.size=1G, added to allow IST to proceed when servers are rebooted (after firmware updates).
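That option is set through wsrep_provider_options; the relevant part of the Galera configuration looks roughly like this (the file path below is distribution-specific):

# e.g. /etc/mysql/mariadb.conf.d/60-galera.cnf
[galera]
wsrep_provider_options="gcache.size=1G"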