[MDEV-29141] InnoDB hangs some time after performing live VM migration Created: 2022-07-20  Updated: 2022-07-26  Resolved: 2022-07-25

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6.5, 10.6.8
Fix Version/s: 10.6.9, 10.7.5, 10.8.4, 10.9.2

Type: Bug Priority: Critical
Reporter: Niels Hendriks Assignee: Marko Mäkelä
Resolution: Duplicate Votes: 0
Labels: None
Environment:

Debian 11
XCP-ng 8.2.1


Attachments: File deadlock1-broke-state.json     File deadlock2-after-restart-working-state.json     Text File gdb.txt    
Issue Links:
Duplicate
duplicates MDEV-28665 aio_uring::thread_routine terminates ... Closed

 Description   

Hello,

The first 10.6 release we used was 10.6.5. Since upgrading to it, whenever we perform live VM migrations we sometimes see that InnoDB at some point no longer responds. We also see this behaviour in 10.6.8.

We perform live VM migrations on XCP-ng (XenServer). We have done this for roughly 300 VMs, and about 40 had issues. The problem appeared anywhere from a few minutes up to about two weeks after the migration (on less busy servers it usually manifested later).

Queries remain stuck in the states Updating, Sending data, Statistics, Filling schema table and Commit, and never complete.
Running SHOW ENGINE INNODB STATUS hangs indefinitely and never gives any output.

Restarting mysql also does not work; we have to kill -9 the process in order to restart it. After that it works again.

We have not noticed this on MariaDB 10.5 and 10.4.

We have noticed this on single instances as well as instances running galera.

I did make a gcore of one of the instances that has issues (it is almost 5 GB). Perhaps I can do something with that, but I'm not sure what.

Any ideas on what is wrong here?



 Comments   
Comment by Daniel Black [ 2022-07-21 ]


Can you install the debug symbols, then use gdb to grab a backtrace of all threads and attach its output here?

Comment by Niels Hendriks [ 2022-07-21 ]

Hi Daniel,

Thanks for your response.
It seems I can reproduce it fairly easily. I did the following:

apt-get install sysbench
sysbench --db-driver=mysql --table-size=1000000 --mysql-db=sysbench --mysql-user=root --mysql-host=127.0.0.1 --mysql-password=_PASS_ /usr/share/sysbench/oltp_read_write.lua prepare
sysbench --db-driver=mysql --table-size=1000000 --mysql-db=sysbench --mysql-user=root --mysql-host=127.0.0.1 --mysql-password=_PASS_ --threads=2 --time=0 --report-interval=1 /usr/share/sysbench/oltp_read_write.lua run

Then, while it is running, migrate the VM back and forth a few times. Usually once or twice is enough. At some point sysbench reports no more completed queries in the last seconds. In `show full processlist` I see:

+----+------------------+-----------------+----------+---------+------+----------------------+-------------------------------------------+----------+
| Id | User             | Host            | db       | Command | Time | State                | Info                                      | Progress |
+----+------------------+-----------------+----------+---------+------+----------------------+-------------------------------------------+----------+
| 31 | root             | 127.0.0.1:42924 | sysbench | Execute |  249 | starting             | COMMIT                                    |    0.000 |
| 32 | root             | 127.0.0.1:42926 | sysbench | Execute |  249 | starting             | COMMIT                                    |    0.000 |
| 36 | root             | localhost       | NULL     | Killed  |  211 | starting             | show engine innodb status                 |    0.000 |
| 39 | root             | localhost       | NULL     | Killed  |  198 | starting             | show engine innodb status                 |    0.000 |
| 44 | nagiosmonitoring | 127.0.0.1:43360 | NULL     | Query   |  160 | Filling schema table | SHOW GLOBAL STATUS LIKE 'threads_running' |    0.000 |
| 45 | nagiosmonitoring | 127.0.0.1:43412 | NULL     | Query   |   58 | Filling schema table | SHOW GLOBAL STATUS LIKE 'threads_running' |    0.000 |
| 48 | root             | localhost       | NULL     | Killed  |    3 | starting             | show engine innodb status                 |    0.000 |
| 51 | root             | localhost       | NULL     | Query   |    0 | starting             | show processlist                          |    0.000 |
+----+------------------+-----------------+----------+---------+------+----------------------+-------------------------------------------+----------+

gdb.txt

EDIT:
In case it helps, this is all I see when I strace the mariadb process. It seems to be looping, reporting a "Connection timed out" error:

# strace -s9999 -ff -p2722170
strace: Process 2722170 attached with 21 threads
[pid 2731134] futex(0x563ba5e8cfe0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2726313] restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 2725553] futex(0x563ba5e8cfe0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2722982] futex(0x563ba5544750, FUTEX_WAIT_PRIVATE, 2147483651, NULL <unfinished ...>
[pid 2722938] futex(0x563ba5e8cfe0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2722252] futex(0x7f6fcb803898, FUTEX_WAIT_PRIVATE, 2147483650, NULL <unfinished ...>
[pid 2722199] futex(0x7f6fcb803a78, FUTEX_WAIT_PRIVATE, 2147483650, NULL <unfinished ...>
[pid 2722179] futex(0x563ba55d136c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 2722176] futex(0x563ba5e924c8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 2730245] futex(0x563ba5e8cfe0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2727959] futex(0x563ba5e8cfe0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2726997] futex(0x563ba5e8cfe0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2722172] restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 2726745] futex(0x563ba5e5a1a0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2726783] restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 2726498] futex(0x563ba5e8cfe0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2725279] futex(0x563ba5e5a1a0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2722339] futex(0x563ba5544750, FUTEX_WAIT_PRIVATE, 2147483651, NULL <unfinished ...>
[pid 2722180] rt_sigtimedwait([HUP QUIT ALRM TERM TSTP],  <unfinished ...>
[pid 2722171] restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 2722170] restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 2722171] <... restart_syscall resumed>) = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410543, tv_nsec=542899136}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410543, tv_nsec=543079650}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410543, tv_nsec=543210453}) = 0
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410543, tv_nsec=866787000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410543, tv_nsec=868672354}) = 0
[pid 2722171] futex(0x563ba78fff4c, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2726313] <... restart_syscall resumed>) = 0
[pid 2722171] <... futex resumed>)      = 0
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410543, tv_nsec=943210000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 2726313] clock_gettime(CLOCK_MONOTONIC, {tv_sec=4247514, tv_nsec=137132943}) = 0
[pid 2726313] futex(0x563ba78f0490, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2726313] clock_gettime(CLOCK_MONOTONIC, {tv_sec=4247514, tv_nsec=137575353}) = 0
[pid 2726313] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410543, tv_nsec=869427669}) = 0
[pid 2726313] clock_gettime(CLOCK_MONOTONIC, {tv_sec=4247514, tv_nsec=137720612}) = 0
[pid 2726313] futex(0x563ba78fff48, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=4247574, tv_nsec=137720612}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 2722171] <... futex resumed>)      = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410543, tv_nsec=943702954}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410543, tv_nsec=943998970}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410543, tv_nsec=944257926}) = 0
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410544, tv_nsec=344257000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410544, tv_nsec=344682585}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410544, tv_nsec=345168126}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410544, tv_nsec=345405141}) = 0
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410544, tv_nsec=745405000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410544, tv_nsec=745974061}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410544, tv_nsec=746048634}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410544, tv_nsec=746111415}) = 0
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410544, tv_nsec=869427000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410544, tv_nsec=869912977}) = 0
[pid 2722171] futex(0x563ba78fff48, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2726313] <... futex resumed>)      = 0
[pid 2722171] <... futex resumed>)      = 1
[pid 2726313] futex(0x563ba78f0490, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2722171] futex(0x563ba78f0490, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2726313] <... futex resumed>)      = -1 EAGAIN (Resource temporarily unavailable)
[pid 2722171] <... futex resumed>)      = 0
[pid 2726313] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2726313] <... clock_gettime resumed>{tv_sec=4247515, tv_nsec=138455194}) = 0
[pid 2722171] <... futex resumed>)      = 0
[pid 2726313] futex(0x563ba78f0490, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410545, tv_nsec=146111000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 2726313] <... futex resumed>)      = 0
[pid 2726313] clock_gettime(CLOCK_MONOTONIC, {tv_sec=4247515, tv_nsec=138766460}) = 0
[pid 2726313] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410544, tv_nsec=870552160}) = 0
[pid 2726313] clock_gettime(CLOCK_MONOTONIC, {tv_sec=4247515, tv_nsec=138845952}) = 0
[pid 2726313] futex(0x563ba78fff4c, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=4247575, tv_nsec=138845952}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 2722171] <... futex resumed>)      = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410545, tv_nsec=146488322}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410545, tv_nsec=146618144}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410545, tv_nsec=146717487}) = 0
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410545, tv_nsec=546717000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410545, tv_nsec=547630667}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410545, tv_nsec=547840491}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410545, tv_nsec=547955034}) = 0
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410545, tv_nsec=870552000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410545, tv_nsec=870860631}) = 0
[pid 2722171] futex(0x563ba78fff4c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2726313] <... futex resumed>)      = 0
[pid 2722171] <... futex resumed>)      = 1
[pid 2726313] futex(0x563ba78f0490, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2722171] futex(0x563ba78f0490, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2726313] <... futex resumed>)      = -1 EAGAIN (Resource temporarily unavailable)
[pid 2722171] <... futex resumed>)      = 0
[pid 2726313] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2726313] <... clock_gettime resumed>{tv_sec=4247516, tv_nsec=139298985}) = 0
[pid 2722171] <... futex resumed>)      = 0
[pid 2726313] futex(0x563ba78f0490, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410545, tv_nsec=947955000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 2726313] <... futex resumed>)      = 0
[pid 2726313] clock_gettime(CLOCK_MONOTONIC, {tv_sec=4247516, tv_nsec=139440168}) = 0
[pid 2726313] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410545, tv_nsec=871224618}) = 0
[pid 2726313] clock_gettime(CLOCK_MONOTONIC, {tv_sec=4247516, tv_nsec=139506523}) = 0
[pid 2726313] futex(0x563ba78fff48, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=4247576, tv_nsec=139506523}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 2722171] <... futex resumed>)      = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410545, tv_nsec=948270421}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410545, tv_nsec=948392294}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410545, tv_nsec=948488326}) = 0
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410546, tv_nsec=348488000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410546, tv_nsec=349263804}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410546, tv_nsec=349475968}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410546, tv_nsec=349657743}) = 0
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410546, tv_nsec=749657000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410546, tv_nsec=749983835}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410546, tv_nsec=750485896}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410546, tv_nsec=750708850}) = 0
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410546, tv_nsec=871224000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410546, tv_nsec=871457110}) = 0
[pid 2722171] futex(0x563ba78fff48, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2726313] <... futex resumed>)      = 0
[pid 2722171] <... futex resumed>)      = 1
[pid 2726313] futex(0x563ba78f0490, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2722171] futex(0x563ba78f0490, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2726313] <... futex resumed>)      = -1 EAGAIN (Resource temporarily unavailable)
[pid 2722171] <... futex resumed>)      = 0
[pid 2726313] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2726313] <... clock_gettime resumed>{tv_sec=4247517, tv_nsec=139955857}) = 0
[pid 2722171] <... futex resumed>)      = 0
[pid 2726313] futex(0x563ba78f0490, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410547, tv_nsec=150708000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 2726313] <... futex resumed>)      = 0
[pid 2726313] clock_gettime(CLOCK_MONOTONIC, {tv_sec=4247517, tv_nsec=140094590}) = 0
[pid 2726313] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410546, tv_nsec=871894070}) = 0
[pid 2726313] clock_gettime(CLOCK_MONOTONIC, {tv_sec=4247517, tv_nsec=140193112}) = 0
[pid 2726313] futex(0x563ba78fff4c, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=4247577, tv_nsec=140193112}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 2722171] <... futex resumed>)      = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410547, tv_nsec=151111343}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410547, tv_nsec=151240035}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410547, tv_nsec=151345988}) = 0
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410547, tv_nsec=551345000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410547, tv_nsec=551701056}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410547, tv_nsec=552043594}) = 0
[pid 2722171] clock_gettime(CLOCK_REALTIME, {tv_sec=1658410547, tv_nsec=552157955}) = 0
[pid 2722171] futex(0x563ba5e9c680, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2722171] futex(0x563ba5e9c668, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1658410547, tv_nsec=871894000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)

Comment by Daniel Black [ 2022-07-22 ]

Thank you.

Thread 9 - a SHOW ENGINE INNODB STATUS under trx_sys_t::history_size stuck waiting on 563ba5544748 lock (page 206?)

Thread 11 purge thread, under trx_sys_t::history_size also stuck on same lock

Thread 14 - SHOW GLOBAL STATUS LIKE 'threads_running' - srv_innodb_monitor_mutex

Thread 13 - tpool::thread_pool_generic::get_task waiting on task outside innodb. seems normal

Thread 12 SHOW ENGINE INNODB STATUS , waiting on srv_monitor_file_mutex, which Thread 9 holds

Thread 8 a commit - under trx_purge_add_undo_to_history - waiting on mtr_t::page_lock (page 206)

Thread 7 a commit - under trx_purge_add_undo_to_history - waiting on a different mtr_t::page_lock (page 207)

Thread 6 - my_sigwait
Thread 5 - handle manager - idle waiting on COND_manager
Thread 4 - buf_dblwr_t::flush_buffered_writes waiting on condition
Thread 3 - ma_checkpoint_background (Aria) - cond wait - checkpoint_control
Thread 2 timer
Thread 1 poll (incoming connections)

The threads missing from gdb compared to the process list are 1 x "SHOW GLOBAL STATUS" and 1 x "SHOW ENGINE INNODB STATUS".

observations:

  • lock_print_info_summary - can use trx_sys_t::history_size_approx instead of trx_sys_t::history_size
  • trx_sys_t::history_size - can release lock immediately after obtaining size?
  • trx_sys_t::history_exceeds - do locks need to be maintained?
Comment by Daniel Black [ 2022-07-22 ]

While the deadlock on threads 7, 8 and maybe others is probably decipherable, I'm wondering if you could repeat your test with /usr/sbin/deadlock-bpfcc from bpfcc-tools, as a test of whether the tool provides the deadlock information more easily than manually reading code and gdb backtraces. I've no idea how well BPF traces in the kernel survive a VM migration.

There's a very narrow window before the next release, and the more obvious it is that bugs like this are important, the more likely they are to be fixed.

Comment by Niels Hendriks [ 2022-07-23 ]

Hello Daniel,

Sure, I hope I did it correctly. I noticed that when running the command, it only runs for a few seconds, detects a potential deadlock, then quits. Even just after restarting MariaDB, when everything is working, it still reports a potential deadlock.

I have two logs: the first one in the broken state, the other taken just after a restart of MariaDB. Since the command reports a deadlock after a few seconds either way, I cannot keep it running during a live migration.

Broken state:

# deadlock-bpfcc 837 --binary /lib/x86_64-linux-gnu/libpthread.so.0 --dump-graph deadlock1 --verbose
In file included from <built-in>:2:
In file included from /virtual/include/bcc/bpf.h:12:
In file included from include/linux/types.h:6:
In file included from include/uapi/linux/types.h:14:
In file included from include/uapi/linux/posix_types.h:5:
In file included from include/linux/stddef.h:5:
In file included from include/uapi/linux/stddef.h:2:
In file included from include/linux/compiler_types.h:69:
include/linux/compiler-clang.h:51:9: warning: '__HAVE_BUILTIN_BSWAP32__' macro redefined [-Wmacro-redefined]
#define __HAVE_BUILTIN_BSWAP32__
        ^
<command line>:4:9: note: previous definition is here
#define __HAVE_BUILTIN_BSWAP32__ 1
        ^
In file included from <built-in>:2:
In file included from /virtual/include/bcc/bpf.h:12:
In file included from include/linux/types.h:6:
In file included from include/uapi/linux/types.h:14:
In file included from include/uapi/linux/posix_types.h:5:
In file included from include/linux/stddef.h:5:
In file included from include/uapi/linux/stddef.h:2:
In file included from include/linux/compiler_types.h:69:
include/linux/compiler-clang.h:52:9: warning: '__HAVE_BUILTIN_BSWAP64__' macro redefined [-Wmacro-redefined]
#define __HAVE_BUILTIN_BSWAP64__
        ^
<command line>:5:9: note: previous definition is here
#define __HAVE_BUILTIN_BSWAP64__ 1
        ^
In file included from <built-in>:2:
In file included from /virtual/include/bcc/bpf.h:12:
In file included from include/linux/types.h:6:
In file included from include/uapi/linux/types.h:14:
In file included from include/uapi/linux/posix_types.h:5:
In file included from include/linux/stddef.h:5:
In file included from include/uapi/linux/stddef.h:2:
In file included from include/linux/compiler_types.h:69:
include/linux/compiler-clang.h:53:9: warning: '__HAVE_BUILTIN_BSWAP16__' macro redefined [-Wmacro-redefined]
#define __HAVE_BUILTIN_BSWAP16__
        ^
<command line>:3:9: note: previous definition is here
#define __HAVE_BUILTIN_BSWAP16__ 1
        ^
3 warnings generated.
Tracing... Hit Ctrl-C to end.
Mutexes: 0, Edges: 0
Mutexes: 4, Edges: 7
----------------
Potential Deadlock Detected!
 
Cycle in lock order graph: Mutex M0 (0x000055af06073490) => Mutex M1 (0x000055af061d76e0) => Mutex M0 (0x000055af06073490)
 
Mutex M1 (0x000055af061d76e0) acquired here while holding Mutex M0 (0x000055af06073490) in Thread 838 (mariadbd):
@ 00007ff0ee6fd760 b'__GI___pthread_mutex_lock'
@ 000055af030d1258 b'tpool::thread_pool_generic::submit_task(tpool::task*)'
@ 000055af0313daa0 b'timer_handler'
@ 00007ff0ee6faea7 b'start_thread'
 
Mutex M0 (0x000055af06073490) previously acquired by the same Thread 838 (mariadbd) here:
@ 00007ff0ee6fd760 b'__GI___pthread_mutex_lock'
@ 000055af0313daa0 b'timer_handler'
@ 00007ff0ee6faea7 b'start_thread'
 
Mutex M0 (0x000055af06073490) acquired here while holding Mutex M1 (0x000055af061d76e0) in Thread 18339 (mariadbd):
@ 00007ff0ee6fd760 b'__GI___pthread_mutex_lock'
@ 000055af030d1a6d b'tpool::thread_pool_generic::worker_main(tpool::worker_data*)'
@ 00007ff0ee5eded0 b'[unknown]'
@ 000055af06082e20 b'[unknown]'
@ 000055af030d1b30 b'std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::~_State_impl()'
@ 8de907894810c083 b'[unknown]'
 
Mutex M1 (0x000055af061d76e0) previously acquired by the same Thread 18339 (mariadbd) here:
@ 00007ff0ee6fd760 b'__GI___pthread_mutex_lock'
@ 000055af030d1a5f b'tpool::thread_pool_generic::worker_main(tpool::worker_data*)'
@ 00007ff0ee5eded0 b'[unknown]'
@ 000055af06082e20 b'[unknown]'
@ 000055af030d1b30 b'std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::~_State_impl()'
@ 8de907894810c083 b'[unknown]'
 
Could not find stack trace where Thread 18339 was created
 
Could not find stack trace where Thread 838 was created

And just after the restart when everything is working:

# deadlock-bpfcc 20430 --binary /lib/x86_64-linux-gnu/libpthread.so.0 --dump-graph deadlock2 --verbose
3 warnings generated.
Tracing... Hit Ctrl-C to end.
Mutexes: 0, Edges: 0
Mutexes: 4, Edges: 7
----------------
Potential Deadlock Detected!
 
Cycle in lock order graph: Mutex M0 (0x000055f3b6ef1490) => Mutex M1 (0x000055f3b7128010) => Mutex M0 (0x000055f3b6ef1490)
 
Mutex M1 (0x000055f3b7128010) acquired here while holding Mutex M0 (0x000055f3b6ef1490) in Thread 20431 (mariadbd):
@ 00007f0a32fc5760 b'__GI___pthread_mutex_lock'
@ 000055f3b4114258 b'tpool::thread_pool_generic::submit_task(tpool::task*)'
@ 000055f3b4180aa0 b'timer_handler'
@ 00007f0a32fc2ea7 b'start_thread'
 
Mutex M0 (0x000055f3b6ef1490) previously acquired by the same Thread 20431 (mariadbd) here:
@ 00007f0a32fc5760 b'__GI___pthread_mutex_lock'
@ 000055f3b4180aa0 b'timer_handler'
@ 00007f0a32fc2ea7 b'start_thread'
 
Mutex M0 (0x000055f3b6ef1490) acquired here while holding Mutex M1 (0x000055f3b7128010) in Thread 20441 (mariadbd):
@ 00007f0a32fc5760 b'__GI___pthread_mutex_lock'
@ 000055f3b4114a6d b'tpool::thread_pool_generic::worker_main(tpool::worker_data*)'
@ 00007f0a32eb5ed0 b'[unknown]'
@ 000055f3b6f00fa0 b'[unknown]'
@ 000055f3b4114b30 b'std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::~_State_impl()'
@ 8de907894810c083 b'[unknown]'
 
Mutex M1 (0x000055f3b7128010) previously acquired by the same Thread 20441 (mariadbd) here:
@ 00007f0a32fc5760 b'__GI___pthread_mutex_lock'
@ 000055f3b4114a5f b'tpool::thread_pool_generic::worker_main(tpool::worker_data*)'
@ 00007f0a32eb5ed0 b'[unknown]'
@ 000055f3b6f00fa0 b'[unknown]'
@ 000055f3b4114b30 b'std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::~_State_impl()'
@ 8de907894810c083 b'[unknown]'
 
Could not find stack trace where Thread 20441 was created
 
Could not find stack trace where Thread 20431 was created

deadlock1-broke-state.json deadlock2-after-restart-working-state.json

Comment by Niels Hendriks [ 2022-07-23 ]

Could this be related to https://github.com/MariaDB/server/commit/db0fde3f24b37cfac9a4125ce888f1650a20db7b ?

I've been trying to pin this down to the specific commit where it starts, but I'm having a difficult time understanding git bisect on the MariaDB repo. I keep ending up on the wrong branches (suddenly building 10.5, for example, despite being on the 10.6 branch). I don't often use bisect, so perhaps I am just using it wrong.

Either way, I think that so far I cannot reproduce this on the latest commit in the 10.6 branch. With a specific checkout of commit db0fde3f24b37cfac9a4125ce888f1650a20db7b I can also not reproduce it. Looking at https://github.com/MariaDB/server/commits/10.6?after=654236c06d231461c66e2f3c5c4fd3b35cba3869+139&branch=10.6&qualified_name=refs%2Fheads%2F10.6 it seems that commit a0e4853eff028fa9db9ba0421309e2bd1124ab26 comes just prior to db0fde3f24b37cfac9a4125ce888f1650a20db7b, but this compiles to MariaDB 10.5, so I'm probably doing something wrong there.

I can reproduce it on commit 57e66dc7e60 (which seems close to commit db0fde3f24b, where I cannot reproduce it so far).

I'm curious about your opinion: does the commit I linked seem related to this issue to you?

Comment by Marko Mäkelä [ 2022-07-25 ]

nielsh, in gdb.txt I cannot find any occurrence of the string thread_routine. If that run reported InnoDB: Using liburing in the server error log, then the cause of that hang should be fixed in MDEV-28665.
Also worth noting:

git diff db0fde3f24b..57e66dc7e60 storage/innobase tpool
git log 57e66dc7e60..db0fde3f24b

These commands only report the fix of MDEV-28665. So, this report would indeed seem to be a duplicate of MDEV-28665. Thank you both for narrowing it down.

Comment by Marko Mäkelä [ 2022-07-25 ]

danblack, thank you for reviewing the trx_sys code changes that I made in MDEV-25062. It was a plausible candidate for this hang. To answer your observations:

  • Yes, lock_print_info_summary() could use trx_sys.history_size_approx(). I would not object to a patch to optimize it.
  • No, trx_sys_t::history_size() cannot release any rseg.latch before acquiring them all, because we want an exact snapshot across all rollback segments.
  • The trx_sys_t::history_exceeds() is only used in SET GLOBAL innodb_max_purge_lag_wait, introduced in MDEV-16952 primarily for making tests more stable, but also to help users ‘prepare’ for a slow shutdown so that the actual shutdown will complete faster. By design, it requires an exact count, similar to trx_sys_t::history_size().
Comment by Daniel Black [ 2022-07-25 ]

OK, I implemented the inexact counts in https://github.com/MariaDB/server/pull/2202

I don't quite follow how the exactness of trx_sys_t::history_size() / trx_sys_t::history_exceeds() is lost if the locks are released before the function returns.

Comment by Marko Mäkelä [ 2022-07-26 ]

Thank you, MDEV-29166 is now in. There are a few cases where we actually care about an exact trx_sys_t::history_size() count. I would not dare to change those without thinking about it very carefully. In the past, we had problems with the purge being unreliable, such as MDEV-11802.

Generated at Thu Feb 08 10:06:13 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.