[MDEV-28698] MariaDB instance crashed after connection Out of memory issue Created: 2022-05-30  Updated: 2022-07-04

Status: Open
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.4.18
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Hemachandran M Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None


 Description   

Hello Team,

Recently, we faced a weird issue with a database running on MariaDB 10.4.18. The cluster is set up as three nodes with semi-sync replication. All the instances had to be rebuilt after applying OS patches. The rebuild was completed for two instances, and the primary role was moved to one of the patched instances. While the last instance was being rebuilt and was syncing, the primary went down after connections started to throw Out of memory exceptions.

When we dug into the issue, the metrics showed that the replication user was consuming more resources before the database crashed. We have never seen a scenario where the primary crashes while the secondary is syncing. The error log didn't have any information for 2 minutes during the crash at 2022-05-04 0:25:14:

2022-05-04  0:14:14 6920340 [Warning] Aborted connection 6920340 to db: 'unconnected' user: 'unauthenticated' host: 'connecting host' (Out of memory.)
2022-05-04  0:14:14 6920341 [Warning] Aborted connection 6920341 to db: 'unconnected' user: 'unauthenticated' host: 'connecting host' (Out of memory.)
......
2022-05-04  0:25:14 0 [Warning] Aborted connection 0 to db: 'unconnected' user: 'unauthenticated' host: 'connecting host' (Too many connections)
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@2022-05-04  0:27:07 0 [Note] InnoDB: Using Linux native AIO
2022-05-04  0:27:07 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
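
The `^@` characters in the excerpt above are NUL bytes shown in caret notation. As a minimal sketch for locating such runs in an error log, the following Python reports the byte offset and length of each run. The function names, the `min_len` threshold, and the idea of scanning the whole file in memory are assumptions for illustration, not part of the original report.

```python
import re
from typing import List, Tuple


def find_nul_runs(data: bytes, min_len: int = 8) -> List[Tuple[int, int]]:
    """Return (offset, length) for each run of at least min_len NUL bytes."""
    # \x00{N,} matches N or more consecutive NUL bytes.
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]


def scan_log(path: str, min_len: int = 8) -> List[Tuple[int, int]]:
    """Read a log file in binary mode and report its NUL runs.

    The path is hypothetical; pass the location of your own error log.
    """
    with open(path, "rb") as f:
        return find_nul_runs(f.read(), min_len)
```

Comparing the reported offsets with the timestamps of the surrounding log lines may help narrow down when the gap in the log began and ended.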



 Comments   
Comment by Marko Mäkelä [ 2022-05-31 ]

The NUL bytes in the log file make me suspect that multiple MariaDB Server processes were started on the same set of data files. Could that be the case? Did one of the servers perhaps run out of memory when trying to allocate the InnoDB buffer pool?

Because the reported version is 10.4.18, you must be lucky. Starting with MDEV-24393 in 10.4.21, InnoDB would by default no longer lock its data files, and an attempt to start up multiple server instances on the same files would likely corrupt the database.

Comment by Hemachandran M [ 2022-06-06 ]

From the metrics graph, we didn't see the buffer pool run out of memory. It was a normal shutdown when the connections ran out of memory. The cluster runs as a three-node semi-sync replica setup, and one of the nodes was syncing from the primary, which had this issue.

Generated at Thu Feb 08 10:02:47 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.