[MDEV-29006] Mariadb replication hangs Created: 2022-07-02 Updated: 2022-09-25 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | 10.6.8 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Yevgeny Kosarzhevsky | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Environment: | CentOS 7 |
| Attachments: |
|
| Description |
|
Hello. Replication hangs at random on my slave servers. Here is some strace output. Nothing else happens.
|
| Comments |
| Comment by Sergei Golubchik [ 2022-07-08 ] |
|
Instead of strace, could you attach with gdb and run thread apply all bt full? |
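The gdb request above can also be run non-interactively, which makes the output easier to attach. The following is only a sketch: the helper name `gdb_backtrace_argv` and the PID are hypothetical placeholders, and it assumes gdb is installed and the caller has permission to attach to the mariadbd process. `--batch`, `-p` and `-ex` are standard gdb command-line options.

```python
import shlex


def gdb_backtrace_argv(pid):
    """Build a gdb command line that dumps full backtraces of all threads.

    Hypothetical helper for illustration; it only builds the argument
    list and does not run gdb itself.
    """
    return [
        "gdb",
        "--batch",                         # run the commands below, then exit
        "-p", str(pid),                    # attach to the running process
        "-ex", "set pagination off",       # do not pause on long output
        "-ex", "thread apply all bt full",  # the command requested above
    ]


if __name__ == "__main__":
    # 12345 is a placeholder; substitute the real PID of mariadbd.
    print(shlex.join(gdb_backtrace_argv(12345)))
```

Redirecting the output of the printed command to a file gives something that can be attached to the issue directly.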
| Comment by Yevgeny Kosarzhevsky [ 2022-07-10 ] |
|
Attached gdb output. |
| Comment by Yevgeny Kosarzhevsky [ 2022-07-17 ] |
|
When this happens, I see a system process in the 'Closing tables' state:
After I try to kill this process, it is shown as killed:
Neither the shutdown command nor a TERM signal terminates the process; it keeps running. After the TERM signal I get a single line in the logs:
Then I have to kill it with a KILL signal, as nothing happens for hours. |
| Comment by Sergei Golubchik [ 2022-07-25 ] |
|
Does the gdb output you attached correspond to the moment when connection id 5 (which is a Slave_SQL thread, according to the first show processlist) is closing tables? In the gdb output, the slave SQL thread (Thread 229 in gdb) doesn't seem to be closing tables; it's applying a Write_rows_log_event. |
| Comment by Yevgeny Kosarzhevsky [ 2022-07-27 ] |
|
Hello. Here is newly collected gdb output with all debug symbols loaded. Slave_SQL_Running_State: closing tables |
| Comment by Yevgeny Kosarzhevsky [ 2022-07-27 ] |
|
After I sent TERM to mariadbd, it stayed in the sleep state. I am attaching one more output that reflects this. |
| Comment by Yevgeny Kosarzhevsky [ 2022-09-25 ] |
|
After downgrading to 10.6.7, the issue is no longer observed. |