[MDEV-30970] Server partially freezed after upgrade to 10.6 Created: 2023-03-30  Updated: 2023-05-10  Resolved: 2023-05-09

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6.12
Fix Version/s: 11.1.1, 10.11.3, 11.0.2, 10.6.13, 10.8.8, 10.9.6, 10.10.4

Type: Bug Priority: Critical
Reporter: Patrick Moiroux Assignee: Marko Mäkelä
Resolution: Duplicate Votes: 0
Labels: innodb
Environment:

CentOS 7 (reproducible on Rocky 8)

MariaDB-client.x86_64 10.6.12-1.el7.centos @mariadb
MariaDB-common.x86_64 10.6.12-1.el7.centos @mariadb
MariaDB-compat.x86_64 10.6.12-1.el7.centos @mariadb
MariaDB-devel.x86_64 10.6.12-1.el7.centos @mariadb
MariaDB-server.x86_64 10.6.12-1.el7.centos @mariadb
MariaDB-shared.x86_64 10.6.12-1.el7.centos @mariadb
MariaDB-test.x86_64 10.6.12-1.el7.centos @mariadb
galera-4.x86_64 26.4.14-1.el7.centos @mariadb

CentOS Linux release 7.9.2009 (Core)

Memory: 192 GB in test / 256 GB in Prod


Issue Links:
Duplicate
duplicates MDEV-29835 Partial server freeze Closed

 Description   

After upgrading from 10.3 to 10.6.12, queries keep running forever. If I kill a query, it stays in status "Killed" forever.

The only way out is to kill the mysql process.

Maybe related to MDEV-30638 or MDEV-29835 but I'm not sure

I generated a core dump (17 GB) and the output of SHOW ENGINE INNODB STATUS.

Let me know how I can send you the core dump. It's really critical for us, affecting Production



 Comments   
Comment by Daniel Black [ 2023-03-31 ]

On MDEV-30637 I added some instructions for a later build that might fix the issue.

For large uploads, the FTP service described at https://mariadb.com/kb/en/meta/mariadb-ftp-server/ is one option.

Alternatively, with the core dump and the debug symbols installed, you can create a backtrace yourself using gdb.

Comment by Marko Mäkelä [ 2023-03-31 ]

Thank you, danblack. One thing that I didn’t find in that document with quick searching is a mention that if a core dump is copied to another system for analysis, all dynamic libraries on the receiving system need to be identical, or the stack traces may be completely messed up. To get a list of most libraries, run ldd /usr/sbin/mariadbd. To tell GDB to use different libraries, use set solib-search-path or set solib-absolute-prefix.

Currently, the only InnoDB hang that I know exists in 10.6.12 is MDEV-29835, and it seems to be relatively easy to hit, based on the number of reports. To verify that, I would need to see the output of thread apply all backtrace full generated from the core dump. The -debuginfo package for the server must definitely be installed before generating the stack traces. Otherwise I will be unable to say for certain if it is a duplicate of MDEV-29835.
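As an illustration of the step described above, here is a minimal sketch of a shell helper that asks GDB for thread apply all backtrace full in batch mode. The function name dump_backtraces, its argument order, and the optional library directory argument are assumptions made for this example; it also assumes that gdb and the server -debuginfo package are already installed.

```shell
# Sketch only: the function name and arguments are hypothetical.
# Usage: dump_backtraces BINARY CORE OUTPUT [LIBDIR]
#   BINARY  server executable, e.g. /usr/sbin/mariadbd
#   CORE    core dump file
#   OUTPUT  text file that receives the stack traces
#   LIBDIR  optional directory with libraries copied from the original host
dump_backtraces() {
    binary=$1 core=$2 output=$3 libdir=${4:-}

    # Refuse to start GDB if either input file is missing or unreadable.
    if [ ! -r "$binary" ] || [ ! -r "$core" ]; then
        echo "dump_backtraces: cannot read binary or core file" >&2
        return 1
    fi

    # Build the GDB argument list; the library path is set before the
    # core file is loaded so that it applies when libraries are resolved.
    set -- --batch
    if [ -n "$libdir" ]; then
        set -- "$@" -ex "set solib-search-path $libdir"
    fi
    set -- "$@" -ex "file $binary" -ex "core-file $core" \
                -ex 'thread apply all backtrace full'

    gdb "$@" > "$output" 2>&1
}

# Example call (the core file path is a placeholder):
#   dump_backtraces /usr/sbin/mariadbd /path/to/core backtrace.txt
```

The resulting text file is usually far smaller than the core dump itself, which makes it easier to attach or upload.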

Comment by Patrick Moiroux [ 2023-03-31 ]

Thanks for your reply.

I generated the backtrace.

ldd /usr/sbin/mariadbd
linux-vdso.so.1 => (0x00007ffd314c0000)
libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f1a376ed000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f1a374b6000)
libaio.so.1 => /lib64/libaio.so.1 (0x00007f1a372b4000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f1a3708e000)
libpmem.so.1 => /lib64/libpmem.so.1 (0x00007f1a36e6c000)
libsystemd.so.0 => /lib64/libsystemd.so.0 (0x00007f1a36c3b000)
libz.so.1 => /lib64/libz.so.1 (0x00007f1a36a25000)
libssl.so.10 => /lib64/libssl.so.10 (0x00007f1a367b3000)
libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007f1a36350000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f1a36134000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f1a35f30000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f1a35c28000)
libm.so.6 => /lib64/libm.so.6 (0x00007f1a35926000)
libc.so.6 => /lib64/libc.so.6 (0x00007f1a35558000)
/lib64/ld-linux-x86-64.so.2 (0x00007f1a39cba000)
libfreebl3.so => /lib64/libfreebl3.so (0x00007f1a35355000)
libcap.so.2 => /lib64/libcap.so.2 (0x00007f1a35150000)
librt.so.1 => /lib64/librt.so.1 (0x00007f1a34f48000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f1a34d21000)
liblz4.so.1 => /lib64/liblz4.so.1 (0x00007f1a34b12000)
libgcrypt.so.11 => /lib64/libgcrypt.so.11 (0x00007f1a34891000)
libgpg-error.so.0 => /lib64/libgpg-error.so.0 (0x00007f1a3468c000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f1a34472000)
libdw.so.1 => /lib64/libdw.so.1 (0x00007f1a34221000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f1a3400b000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007f1a33dbe000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007f1a33ad5000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007f1a338d1000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007f1a3369e000)
libattr.so.1 => /lib64/libattr.so.1 (0x00007f1a33499000)
libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f1a33237000)
libelf.so.1 => /lib64/libelf.so.1 (0x00007f1a3301f000)
libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f1a32e0f000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007f1a32bff000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007f1a329fb000)
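If the core dump is analysed on another machine, a listing like the one above can be reduced to bare file paths and used to check that both systems have identical libraries. The following is a rough sketch; the function name ldd_paths and the checksum file name are assumptions made for this example.

```shell
# Sketch only: ldd_paths is a hypothetical helper name.
# Reads `ldd` output on stdin and prints the resolved library paths,
# one per line, sorted and without duplicates.
ldd_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3; next }   # name => /path (addr)
         $1 ~ /^\//               { print $1 }         # /path (addr), e.g. the loader
        ' | LC_ALL=C sort -u
}

# Possible usage (not run here):
#   on the original host:
#     ldd /usr/sbin/mariadbd | ldd_paths | xargs sha256sum > libs.sha256
#   on the analysis host, after copying libs.sha256 over:
#     sha256sum -c libs.sha256
```

Entries without a file path, such as linux-vdso.so.1, are skipped because they do not correspond to a file on disk.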

I uploaded the file to the FTP private folder:

curl -T MDEV-30970.tgz --ssl-reqd ftp://ftp.mariadb.org/private/

Please let me know if it's related to MDEV-29835 and if it's safe to apply the build that may fix the issue.

Do you know when 10.6.13 will be available?

Thanks

Comment by Daniel Black [ 2023-04-03 ]

Thanks pmoiroux,

Looking at the backtrace provided: Thread 464 (Thread 0x7f79b5f2f700 (LWP 55845)) has a recursive btr_page_split_and_insert call, like MDEV-29835.
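A similar check can be approximated on a saved GDB backtrace file. The sketch below lists the threads in which a given function appears in more than one stack frame. The helper name threads_with_repeated_frame is an assumption for this example, and it assumes the usual GDB text layout ("Thread N (...)" headers followed by "#N" frame lines). A match only flags a thread for closer inspection; it does not by itself confirm MDEV-29835.

```shell
# Sketch only: threads_with_repeated_frame is a hypothetical helper name.
# Reads GDB "thread apply all backtrace" output on stdin and prints the
# header line of every thread whose stack contains FUNCTION more than once.
# Usage: threads_with_repeated_frame FUNCTION < backtrace.txt
threads_with_repeated_frame() {
    awk -v f="$1" '
        /^Thread [0-9]+ / {            # a new thread starts: report the previous one
            if (n > 1) print hdr
            hdr = $0; n = 0; next
        }
        /^#[0-9]+ / {                  # a stack frame line
            if (index($0, " " f " ")) n++
        }
        END { if (n > 1) print hdr }   # report the last thread if needed
    '
}

# Possible usage (backtrace.txt is a placeholder file name):
#   threads_with_repeated_frame btr_page_split_and_insert < backtrace.txt
```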

MDEV-30481 has instructions for latest 10.6 RPM packages.

10.6.13 scheduled for 2023-04-27.

Comment by Patrick Moiroux [ 2023-04-05 ]

Thanks,

I've installed the latest version in test and I'm trying to reproduce the issue. I will let you know how it goes.

Patrick

Comment by Marko Mäkelä [ 2023-05-08 ]

pmoiroux, did the hang disappear when using the development snapshot?

Comment by Patrick Moiroux [ 2023-05-09 ]

Hi Marko,

I was not able to reproduce the issue in test, so you can close this one. I'm still waiting for the official release to update Production. Any idea when it will be available? Wasn't it supposed to be 2023-04-27?

Thanks,
Patrick

Comment by Marko Mäkelä [ 2023-05-09 ]

pmoiroux, thank you. I would expect the 10.x quarterly releases to become available this week. The release process has been started.

Comment by Marko Mäkelä [ 2023-05-10 ]

The quarterly releases of MariaDB Server 10.6.13 through 10.11.3 are now available.

Generated at Thu Feb 08 10:20:16 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.