MariaDB Server / MDEV-23901

Server Crashes With Signal 6 When Disk Full


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.5.5
    • Fix Version/s: None
    • Component/s: Server
    • Labels: None
    • Environment: RHEL 8 on x86_64
      Ubuntu 20.04 on x86_64

    Description

      MariaDB versions from 10.2 onward, up to and including 10.5.5, crash when the disk is full, whereas MariaDB 10.1 seems to handle this situation gracefully.

      In affected versions, the server crashes when trying to process an INSERT statement that would fill the disk, with these messages in the log:

      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: 2020-10-06 20:59:46 0 [ERROR] InnoDB: preallocating 104857600 bytes for file ./storage_test/test_table.ibd failed >
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: 2020-10-06 20:59:46 0 [ERROR] [FATAL] InnoDB: Error (Out of disk space) in rollback.
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: 201006 20:59:46 [ERROR] mysqld got signal 6 ;
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: This could be because you hit a bug. It is also possible that this binary
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: or one of the libraries it was linked against is corrupt, improperly built,
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: or misconfigured. This error can also be caused by malfunctioning hardware.
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: We will try our best to scrape up some info that will hopefully help
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: diagnose the problem, but since we have already crashed,
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: something is definitely wrong and this may fail.
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: Server version: 10.5.5-MariaDB
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: key_buffer_size=134217728
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: read_buffer_size=131072
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: max_used_connections=0
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: max_threads=153
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: thread_count=1
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: It is possible that mysqld could use up to
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467803 K  bytes of memory
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: Hope that's ok; if not, decrease some variables in the equation.
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: Thread pointer: 0x0
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: Attempting backtrace. You can use the following information to find out
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: where mysqld died. If you see no messages after this, something went
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: terribly wrong...
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: stack_bottom = 0x0 thread_stack 0x49000
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: /usr/sbin/mariadbd(my_print_stacktrace+0x2e)[0x555a6f7bddee]
      Oct 06 20:59:46 ip-172-31-44-55.us-east-2.compute.internal mariadbd[1428]: /usr/sbin/mariadbd(handle_fatal_signal+0x485)[0x555a6f248ec5]
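
      Signal 6 in the log above is SIGABRT: the `[FATAL] InnoDB: Error (Out of disk space) in rollback` line indicates InnoDB deliberately aborts the process when it cannot complete a rollback, rather than dying in an uncontrolled way. The signal-number-to-name mapping can be confirmed from any shell:

      ```shell
      # Signal 6 from the log is SIGABRT (a deliberate abort), not a segfault.
      kill -l 6
      # ABRT
      ```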
      

      In my case I'm actually using file system quotas and hitting the problem when the quota limit is reached, but I've found the behavior to be the same when the disk itself is full.

      The problem is easy to recreate using the RHEL 8 AMI on Amazon Web Services. After starting an instance with this AMI, use these commands to install MariaDB and enable XFS project quotas on the root file system:

      # Install MariaDB and enable XFS pquota option on root file system.
      sudo yum -y update
      curl -LsS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | sudo bash
      sudo yum -y install MariaDB-server MariaDB-client
      sudo systemctl enable mariadb
      sudo sed -i 's/\(GRUB_CMDLINE_LINUX=\)"\(.*\)"/\1"\2 rootflags=pquota"/' /etc/default/grub
      sudo grub2-mkconfig -o /boot/grub2/grub.cfg
      sudo reboot
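
      The sed one-liner above appends rootflags=pquota to whatever is already in GRUB_CMDLINE_LINUX. Its effect can be checked on a scratch copy before touching /etc/default/grub (the sample kernel arguments below are made up for illustration):

      ```shell
      # Apply the same sed rewrite to a throwaway copy and inspect the result.
      printf '%s\n' 'GRUB_CMDLINE_LINUX="console=ttyS0 crashkernel=auto"' > /tmp/grub-test
      sed -i 's/\(GRUB_CMDLINE_LINUX=\)"\(.*\)"/\1"\2 rootflags=pquota"/' /tmp/grub-test
      cat /tmp/grub-test
      # GRUB_CMDLINE_LINUX="console=ttyS0 crashkernel=auto rootflags=pquota"
      ```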
      

      Then, after the reboot, set up a test database and configure a 100 MB directory quota with these commands:

      # Create test database.
      sudo mariadb -e 'create database storage_test;
      create table storage_test.test_table (col1 mediumtext)'
       
      # Set up XFS project quota on database directory.
      echo '10:/var/lib/mysql/storage_test' | sudo tee -a /etc/projects
      echo 'storage_test:10' | sudo tee -a /etc/projid
      sudo xfs_quota -x -c 'project -s 10'
      sudo xfs_quota -x -c 'limit -p bhard=100M 10'
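
      The two tee commands associate XFS project id 10 with the database directory (/etc/projects holds id:path pairs) and give the project a name (/etc/projid holds name:id pairs); project -s then initializes the project and limit -p sets the 100 MB hard block limit. The id in the two files must agree, which can be cross-checked on scratch copies without touching /etc:

      ```shell
      # Recreate the two mapping lines in /tmp and verify the project ids match.
      echo '10:/var/lib/mysql/storage_test' > /tmp/projects-test
      echo 'storage_test:10' > /tmp/projid-test
      path_id=$(cut -d: -f1 /tmp/projects-test)
      name_id=$(cut -d: -f2 /tmp/projid-test)
      [ "$path_id" = "$name_id" ] && echo "project id $path_id is consistent"
      ```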
      

      Then fill up the database to the quota limit by running this command 10 times:

      # Insert 10MB of data.
      sudo mariadb -e "insert into storage_test.test_table values(repeat('x', 10485760))" 
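
      Each statement inserts repeat('x', 10485760), i.e. exactly 10 MiB of column data per row (ignoring InnoDB row and page overhead), so ten rows meet the bhard=100M limit, which xfs_quota interprets as 100 MiB:

      ```shell
      # Ten rows of 10 MiB each equal the 100 MiB (bhard=100M) hard limit.
      echo $((10485760 * 10))
      # 104857600
      ```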
      

      The 10th INSERT hits the database directory quota and MariaDB crashes with the messages shown above. MariaDB starts back up the first time, but if you run another INSERT statement it crashes again and then won't restart, logging messages similar to those above.

      In 10.1, this type of situation is handled gracefully. When a disk quota is reached, the INSERT statement fails with:

      ERROR 1114 (HY000) at line 10: The table 'test_table' is full
      

      The 10.1 server does not crash, other databases/schemas are not affected, and recovery is easy – the disk quota can be increased, or rows can be deleted and OPTIMIZE TABLE can be run to reclaim space.

      We operate a multi-tenant MariaDB server (currently 10.1) and this issue is preventing us from upgrading. We enforce disk space limits on our tenants using XFS directory quotas. In versions after 10.1, if one tenant fills up their database, the entire server crashes for everyone!


          People

            Assignee: Unassigned
            Reporter: David Russo (drusso)
            Votes: 0
            Watchers: 3
