MariaDB Server / MDEV-23901

Server Crashes With Signal 6 When Disk Full


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.5.5
    • Fix Version/s: None
    • Component/s: Server
    • Labels: None
    • Environment: RHEL 8 on x86_64
      Ubuntu 20.04 on x86_64

    Description

      MariaDB versions from 10.2 onward, up to and including 10.5.5, crash when the disk is full, whereas MariaDB 10.1 seems to handle this situation gracefully.

      In affected versions, the server crashes when trying to process an INSERT statement that would fill the disk, with these messages in the log:

      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: 2020-10-06 20:59:46 0 [ERROR] InnoDB: preallocating 104857600 bytes for file ./storage_test/test_table.ibd failed >
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: 2020-10-06 20:59:46 0 [ERROR] [FATAL] InnoDB: Error (Out of disk space) in rollback.
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: 201006 20:59:46 [ERROR] mysqld got signal 6 ;
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: This could be because you hit a bug. It is also possible that this binary
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: or one of the libraries it was linked against is corrupt, improperly built,
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: or misconfigured. This error can also be caused by malfunctioning hardware.
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: We will try our best to scrape up some info that will hopefully help
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: diagnose the problem, but since we have already crashed,
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: something is definitely wrong and this may fail.
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: Server version: 10.5.5-MariaDB
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: key_buffer_size=134217728
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: read_buffer_size=131072
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: max_used_connections=0
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: max_threads=153
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: thread_count=1
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: It is possible that mysqld could use up to
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467803 K  bytes of memory
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: Hope that's ok; if not, decrease some variables in the equation.
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: Thread pointer: 0x0
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: Attempting backtrace. You can use the following information to find out
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: where mysqld died. If you see no messages after this, something went
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: terribly wrong...
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: stack_bottom = 0x0 thread_stack 0x49000
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: /usr/sbin/mariadbd(my_print_stacktrace+0x2e)[0x555a6f7bddee]
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: /usr/sbin/mariadbd(handle_fatal_signal+0x485)[0x555a6f248ec5]
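
      Signal 6 in the log above is SIGABRT: the `[FATAL] InnoDB: Error (Out of disk space) in rollback` line indicates InnoDB deliberately aborts the process when it cannot complete a rollback, rather than dying in an uncontrolled way. The signal-number-to-name mapping can be confirmed from any shell:

      ```shell
      # Signal 6 from the log is SIGABRT (a deliberate abort), not a segfault.
      kill -l 6
      # ABRT
      ```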
      

      In my case I'm actually using file system quotas and hitting the problem when the quota limit is reached, but I've found the behavior to be the same when the disk itself is full.

      The problem is easy to recreate using the RHEL 8 AMI on Amazon Web Services. After starting an instance with this AMI, use these commands to install MariaDB and enable XFS project quotas on the root file system:

      # Install MariaDB and enable XFS pquota option on root file system.
      sudo yum -y update
      curl -LsS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | sudo bash
      sudo yum -y install MariaDB-server MariaDB-client
      sudo systemctl enable mariadb
      sudo sed -i 's/\(GRUB_CMDLINE_LINUX=\)"\(.*\)"/\1"\2 rootflags=pquota"/' /etc/default/grub
      sudo grub2-mkconfig -o /boot/grub2/grub.cfg
      sudo reboot
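
      The sed one-liner above appends rootflags=pquota to whatever is already in GRUB_CMDLINE_LINUX. Its effect can be checked on a scratch copy before touching /etc/default/grub (the sample kernel arguments below are made up for illustration):

      ```shell
      # Apply the same sed rewrite to a throwaway copy and inspect the result.
      printf '%s\n' 'GRUB_CMDLINE_LINUX="console=ttyS0 crashkernel=auto"' > /tmp/grub-test
      sed -i 's/\(GRUB_CMDLINE_LINUX=\)"\(.*\)"/\1"\2 rootflags=pquota"/' /tmp/grub-test
      cat /tmp/grub-test
      # GRUB_CMDLINE_LINUX="console=ttyS0 crashkernel=auto rootflags=pquota"
      ```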
      

      Then, after the reboot, set up a test database and configure a 100 MB directory quota with these commands:

      # Create test database.
      sudo mariadb -e 'create database storage_test;
      create table storage_test.test_table (col1 mediumtext)'
       
      # Set up XFS project quota on database directory.
      echo '10:/var/lib/mysql/storage_test' | sudo tee -a /etc/projects
      echo 'storage_test:10' | sudo tee -a /etc/projid
      sudo xfs_quota -x -c 'project -s 10'
      sudo xfs_quota -x -c 'limit -p bhard=100M 10'
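
      The two tee commands associate XFS project id 10 with the database directory (/etc/projects holds id:path pairs) and give the project a name (/etc/projid holds name:id pairs); project -s then initializes the project and limit -p sets the 100 MB hard block limit. The id in the two files must agree, which can be cross-checked on scratch copies without touching /etc:

      ```shell
      # Recreate the two mapping lines in /tmp and verify the project ids match.
      echo '10:/var/lib/mysql/storage_test' > /tmp/projects-test
      echo 'storage_test:10' > /tmp/projid-test
      path_id=$(cut -d: -f1 /tmp/projects-test)
      name_id=$(cut -d: -f2 /tmp/projid-test)
      [ "$path_id" = "$name_id" ] && echo "project id $path_id is consistent"
      ```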
      

      Then fill up the database to the quota limit by running this command 10 times:

      # Insert 10MB of data.
      sudo mariadb -e "insert into storage_test.test_table values(repeat('x', 10485760))" 
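
      Each statement inserts repeat('x', 10485760), i.e. exactly 10 MiB of column data per row (ignoring InnoDB row and page overhead), so ten rows meet the bhard=100M limit, which xfs_quota interprets as 100 MiB:

      ```shell
      # Ten rows of 10 MiB each equal the 100 MiB (bhard=100M) hard limit.
      echo $((10485760 * 10))
      # 104857600
      ```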
      

      The 10th INSERT hits the database directory quota and MariaDB crashes with the messages shown above. MariaDB starts back up the first time, but if you run another INSERT statement it crashes again and then won't restart, logging messages similar to those above.

      In 10.1, this type of situation is handled gracefully. When a disk quota is reached, the INSERT statement fails with:

      ERROR 1114 (HY000) at line 10: The table 'test_table' is full
      

      The 10.1 server does not crash, other databases/schemas are not affected, and recovery is easy – the disk quota can be increased, or rows can be deleted and OPTIMIZE TABLE can be run to reclaim space.

      We operate a multi-tenant MariaDB server (currently 10.1) and this issue is preventing us from upgrading. We enforce disk space limits on our tenants using XFS directory quotas. In versions after 10.1, if one tenant fills up their database, the entire server crashes for everyone!


          People

            Assignee: Unassigned
            Reporter: David Russo (drusso)
            Votes: 0
            Watchers: 3
