Unfortunately I can reproduce this bug consistently, roughly 6 times out of 10, on a good Galera SST of a very large data set (approx. 16TB) on a machine with 1TB of RAM, --use-memory=800G, and everything stored on a fast 24TB SSD RAID0 array. Mariabackup either hangs with no CPU usage, waiting on a thread, or spins on 2 or more cores like this:
{{[pid 18785] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 18785] futex(0x7ff13810fc50, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 18787] futex(0x7ff13810fc50, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 18786] futex(0x7ff13810fc50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 18787] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 18786] <... futex resumed> ) = 0
[pid 18787] futex(0x7ff13810fc50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 18785] futex(0x7ff13810fc50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 18787] <... futex resumed> ) = 0
[pid 18785] <... futex resumed> ) = 0
[pid 18519] futex(0x7ff13810fc50, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 18787] futex(0x7ff13810fc50, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 18786] futex(0x7ff13810fc50, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 18787] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 18786] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 18787] futex(0x7ff13810fc50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 18786] futex(0x7ff13810fc50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 18787] <... futex resumed> ) = 0
[pid 18786] <... futex resumed> ) = 0
[pid 18785] futex(0x7ff13810fc50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 18784] futex(0x7ff13810fc50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 18785] <... futex resumed> ) = 0
[pid 18784] <... futex resumed> ) = 0
[pid 18519] futex(0x7ff13810fc50, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 18787] futex(0x7ff13810fc50, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 18786] futex(0x7ff13810fc50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 18787] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 18786] <... futex resumed> ) = 0
[pid 18787] futex(0x7ff13810fc50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 18519] futex(0x7ff13810fc50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 18787] <... futex resumed> ) = 0
[pid 18519] <... futex resumed> ) = 0
[pid 18786] futex(0x7ff13810fc50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 18519] futex(0x7ff13810fc50, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
}}
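For reference, a futex trace like the one above can be collected by attaching strace to the hung mariabackup process, for example:
{{# Follow all threads (-f), timestamp each line (-tt) and trace only futex
# syscalls of the already-running mariabackup process (PID from the trace above):
strace -f -tt -e trace=futex -p 18519}}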
There are no errors in the log; everything looks fine up to this point in the crash-recovery (apply) phase of the SST:
Dec 7 02:28:09 x -innobackupex-apply: 2021-12-07 2:28:09 139656444233472 [Note] InnoDB: To recover: 9930460 pages from log
Dec 7 02:28:24 x -innobackupex-apply: 2021-12-07 2:28:24 139656444233472 [Note] InnoDB: To recover: 9792342 pages from log
Dec 7 02:28:39 x -innobackupex-apply: 2021-12-07 2:28:39 139656444233472 [Note] InnoDB: To recover: 9622947 pages from log
Dec 7 02:28:54 x -innobackupex-apply: 2021-12-07 2:28:54 139656444233472 [Note] InnoDB: To recover: 9510081 pages from log
Dec 7 02:29:09 x -innobackupex-apply: 2021-12-07 2:29:09 139656444233472 [Note] InnoDB: To recover: 9503157 pages from log
Dec 7 02:29:24 x -innobackupex-apply: 2021-12-07 2:29:24 139656427448064 [Note] InnoDB: To recover: 9402421 pages from log
Each attempt can freeze at a random point, never with the same number of pages left to recover, and sometimes it simply works: the prepare passes and the member rejoins the cluster as if this were never an issue.
Example of a hung mariabackup:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18519 mysql 15 -5 0.939t 0.853t 4456 R 507.1 86.8 2780:00 mariabackup
I will rerun with --dbug=d,ib_log during the log-apply phase to see what can be found. This used to happen only rarely, but it has become a far more regular occurrence. I've seen other reports of this happening but no real follow-ups. At first I assumed it was a RAM-limitation issue, but it can also happen early in the first batch read, before the process has consumed a large amount of memory.
├─mysqld─┬─sh───wsrep_sst_maria─┬─logger
│        │                      ├─mariabackup───12*[{mariabackup}]
│        │                      └─wsrep_sst_maria───logger
│        └─7*[{mysqld}]
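For the next hang, a backtrace of all threads can be collected from the running process with gdb, roughly like this (PID taken from the top output above):
{{# Attach to the hung process, dump a backtrace of every thread, then exit;
# gdb detaches in batch mode without killing the process:
gdb -p 18519 -batch -ex 'thread apply all backtrace' > backtrace-txt.log}}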
Thanks in advance,
Bren
claudio.nanni, are you sure that the backtrace-txt.log is not from mariabackup --backup? The threads there look more like --backup to me, not --prepare. I see no InnoDB threads executing.
mariabackup --prepare is essentially invoking InnoDB crash recovery on the backup. I do not see any recovery here, only log-copying and data-file-copying threads, which should not be running during --prepare.
For debugging --prepare, I would suggest invoking a debug version, to get verbose output from the recovery code:
mariabackup --prepare --dbug=d,ib_log …
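The DBUG output goes to stderr and can be very verbose, so it is worth redirecting to a file; a minimal sketch, with /path/to/backup as a placeholder for the actual backup directory:
{{# Capture the verbose recovery trace in a file; --target-dir points at the
# backup to be prepared (placeholder path):
mariabackup --prepare --dbug=d,ib_log --target-dir=/path/to/backup 2> prepare-dbug.log}}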