[MDEV-33294] terminate called after throwing an instance of 'boost::wrapexcept<std::system_error>' Created: 2024-01-22  Updated: 2024-02-08

Status: Needs Feedback
Project: MariaDB Server
Component/s: Memory management
Affects Version/s: 10.6.12
Fix Version/s: None

Type: Bug Priority: Major
Reporter: 01234xy Assignee: Jan Lindström
Resolution: Unresolved Votes: 0
Labels: OOM, galera, mariadb, memory
Environment:

Server version: 10.6.12-MariaDB-0ubuntu0.22.04.1 source revision
VM on a Hyper-V host with 16 GB RAM and 4 cores
Galera cluster



 Description   

Hello,

This is the second time I have hit this issue: MariaDB was involved in a service being killed, which caused an outage for me.
After the first occurrence I doubled the memory to 16 GB (swap turned off), but that apparently is still not enough, even though my Nextcloud server is not heavily used.

Relevant parts of my config:

/etc/mysql/mariadb.conf.d/50-server.cnf
[server]
[mysqld]
pid-file = /run/mysqld/mysqld.pid
basedir = /usr
max_allowed_packet = 256M
expire_logs_days = 10
character-set-server = utf8mb4
collation-server = utf8mb4_general_ci
[embedded]
[mariadb]
[mariadb-10.6]
[galera]
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_address = "gcomm://1.1.1.1,1.1.1.2"
binlog_format = row
transaction_isolation = READ-COMMITTED
default_storage_engine = InnoDB
innodb_autoinc_lock_mode = 2
bind-address = 0.0.0.0
wsrep_cluster_name = "MariaDB_Nextcloud_Cluster"
wsrep_node_name = 'DB1'
wsrep_node_address = "1.1.1.1"
innodb_buffer_pool_size = 1G
innodb_log_file_size = 256M
read_rnd_buffer_size = 4M
sort_buffer_size = 4M
max_allowed_packet = 1G
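Note that max_allowed_packet is set twice above (256M under [mysqld], then 1G under [galera]); since mariadbd reads all of these groups, the later value should win. A minimal way to confirm which values actually took effect, assuming a mysql client on the node:

SHOW GLOBAL VARIABLES LIKE 'max_allowed_packet';
SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW GLOBAL VARIABLES LIKE 'sort_buffer_size';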

Here is the issue in the logs:

Jan 22 13:30:49 servername mariadbd[1068]: 2024-01-22 13:30:49 514463 [Warning] Aborted connection 514463 to db: 'unconnected' user: 'haproxy' host: 'servername' (Got an error writing communication packets)
Jan 22 13:30:50 servername mariadbd[1068]: terminate called after throwing an instance of 'boost::wrapexcept<std::system_error>'
Jan 22 13:30:51 servername mariadbd[1068]:   what():  remote_endpoint: Transport endpoint is not connected
Jan 22 13:30:51 servername mariadbd[1068]: 240122 13:30:51 [ERROR] mysqld got signal 6 ;
Jan 22 13:30:51 servername mariadbd[1068]: This could be because you hit a bug. It is also possible that this binary
Jan 22 13:30:51 servername mariadbd[1068]: or one of the libraries it was linked against is corrupt, improperly built,
Jan 22 13:30:51 servername mariadbd[1068]: or misconfigured. This error can also be caused by malfunctioning hardware.
Jan 22 13:30:51 servername mariadbd[1068]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Jan 22 13:30:51 servername mariadbd[1068]: We will try our best to scrape up some info that will hopefully help
Jan 22 13:30:51 servername mariadbd[1068]: diagnose the problem, but since we have already crashed,
Jan 22 13:30:51 servername mariadbd[1068]: something is definitely wrong and this may fail.
Jan 22 13:30:51 servername mariadbd[1068]: Server version: 10.6.12-MariaDB-0ubuntu0.22.04.1 source revision:
Jan 22 13:30:51 servername mariadbd[1068]: key_buffer_size=134217728
Jan 22 13:30:51 servername mariadbd[1068]: read_buffer_size=131072
Jan 22 13:30:51 servername mariadbd[1068]: max_used_connections=152
Jan 22 13:30:51 servername mariadbd[1068]: max_threads=153
Jan 22 13:30:51 servername mariadbd[1068]: thread_count=153
Jan 22 13:30:51 servername mariadbd[1068]: It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 781311 K  bytes of memory
Jan 22 13:30:51 servername mariadbd[1068]: Hope that's ok; if not, decrease some variables in the equation.
Jan 22 13:30:51 servername mariadbd[1068]: Thread pointer: 0x0
Jan 22 13:30:51 servername mariadbd[1068]: Attempting backtrace. You can use the following information to find out
Jan 22 13:30:51 servername mariadbd[1068]: where mysqld died. If you see no messages after this, something went
Jan 22 13:30:51 servername mariadbd[1068]: terribly wrong...
...
Jan 22 13:37:12 servername kernel: [516744.982529] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=init.scope,mems_allowed=0,global_oom,task_memcg=/system.slice/php8.2-fpm.service,task=php-fpm8.2,pid=962,uid=33
Jan 22 13:37:12 servername kernel: [516744.982547] Out of memory: Killed process 962 (php-fpm8.2) total-vm:680576kB, anon-rss:240744kB, file-rss:3588kB, shmem-rss:101428kB, UID:33 pgtables:912kB oom_score_adj:0
Jan 22 13:37:12 servername kernel: [516745.023948] systemd[1]: php8.2-fpm.service: A process of this unit has been killed by the OOM killer.
...
Jan 22 13:37:13 servername systemd[1]: php8.2-fpm.service: Failed with result 'oom-kill'.
Jan 22 13:37:13 servername systemd[1]: php8.2-fpm.service: Consumed 4h 17min 42.625s CPU time.
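For reference, the 781311 K figure in the crash report comes from the formula the handler prints (key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads). It can be approximated live from the server's own variables; this is only an approximation, since max_threads is not exposed as a system variable, so max_connections stands in for it here:

SELECT ROUND((@@key_buffer_size
            + (@@read_buffer_size + @@sort_buffer_size) * @@max_connections)
            / 1024) AS approx_peak_kb;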



 Comments   
Comment by Jan Lindström [ 2024-02-08 ]

01234xy Please provide the full unedited error log, and, if you have them, the output of show processlist, show status, and show engine innodb status from before the OOM.
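For reference, the requested statements, to be captured while the server is under load and before the next OOM event (SHOW FULL PROCESSLIST is a reasonable substitute if the query text gets truncated):

SHOW FULL PROCESSLIST;
SHOW GLOBAL STATUS;
SHOW ENGINE INNODB STATUS;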
