Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-29255

SQL waits forever when tmpdir full. Expect getting disk space full error.

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Open (View Workflow)
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.5, 10.6
    • Fix Version/s: None
    • Component/s: Server
    • Labels:
      None
    • Environment:
      redhat on VM

      Description

      Hi all,

      Our production DB version is MariaDB 10.6.7 with a dedicate 10GB mount point for tmpdir. We got a few incidents that - when a SQL uses up tmpdir disk space for sorting, that SQL will keep waiting forever. All new SQL needs tmpdir will hang. Needs manual stop DB and remove files in tmpdir. This affects the availability a lot.

      Checked the DB log saying waiting someone to free disk space. However, it is tmpdir and no one is touching it manually until service impact.

      We increased disk space as workaround. However, we never know when a even bigger tmpdir usage SQL will come. We have 200 more mariadb and this is no way to estimate the optimal tmpdir for each DB.

      Move tmpdir to same mount point as datadir cannot solve the problem. Bad SQL can still use up the disk space and introuduces availability issue.

      The same problem also found in MariaDB 10.5.12.

      While using MySQL, the SQL got disk full error immediately and, then, tmpdir disk space released to free. Other SQL statements are running fine and NO avaiability issue.

      Kindly advice how can we have the behavior same as MySQL in this situation ?

      Below is the step to repeat the issue

      # at OS level, setup mariadb 10.6.7 and run it

      # at VM level, create new disk with a few GB

      # back to OS level

      # rescan disk
      for i in $(ls /sys/class/scsi_host/)
      do
      echo "- - -" > /sys/class/scsi_host/${i}/scan
      done

      # check rescan result
      for i in $(ls /sys/class/scsi_device/)
      do
      echo 1 > /sys/class/scsi_device/${i}/device/rescan
      done

      # find the dev name (below example use /dev/sdb)
      fdisk -l

      # create partition
      fdisk /dev/sdb
      # p -> n -> p -> w

      # mount the tmpdir with 2MB only
      partprobe
      pvcreate /dev/sdb1
      vgcreate vgtmpdir /dev/sdb1
      lvcreate -n lvtmpdir -L 2M vgtmpdir
      mkfs.ext4 /dev/vgtmpdir/lvtmpdir
      mkdir -p /var/tmp
      mount /dev/vgtmpdir/lvtmpdir /var/tmp
      chmod 4777 /var/tmp

      # set tmpdir DB parameter
      rm -f /etc/my.cnf.d/tmpdir.cnf
      echo "[mysqld]" >> /etc/my.cnf.d/tmpdir.cnf
      echo "tmpdir=/var/tmp" >> /etc/my.cnf.d/tmpdir.cnf

      # restart DB
      systemctl stop mariadb
      systemctl start mariadb

      # run large sorting SQL

      show variables like 'tmpdir' ;

      select * from information_schema.COLUMNS union all
      select * from information_schema.COLUMNS union all
      select * from information_schema.COLUMNS union all
      select * from information_schema.COLUMNS union all
      select * from information_schema.COLUMNS union all
      select * from information_schema.COLUMNS union all
      select * from information_schema.COLUMNS union all
      select * from information_schema.COLUMNS union all
      select * from information_schema.COLUMNS
      order by TABLE_NAME ;

      #### SQL keeps waiting at this point but expect it returns error

      MariaDB [(none)]> show processlist ;
      --------------------------------------------------------------------------------------------------------------------------------------------------------------

      Id User Host db Command Time State Info Progress

      --------------------------------------------------------------------------------------------------------------------------------------------------------------

      3 root localhost NULL Query 109 Removing tmp table select * from information_schema.COLUMNS union all
      select * from information_schema.COLUMNS union a
      0.000
      5 root localhost NULL Query 0 starting show processlist 0.000

      --------------------------------------------------------------------------------------------------------------------------------------------------------------
      2 rows in set (0.000 sec)

      #### tmpdir keeps 100%

      Filesystem Size Used Avail Use% Mounted on
      /dev/mapper/vgtmpdir-lvtmpdir 2.9M 2.6M 3.0K 100% /var/tmp

      #### DB log
      [Warning] mariadbd: Disk is full writing '/var/tmp/#sql-temptable-d1d7-3-9.MAD' (Errcode: 28 "No space left on device"). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)

        Attachments

          Activity

            People

            Assignee:
            Unassigned Unassigned
            Reporter:
            frelist William Wong
            Votes:
            1 Vote for this issue
            Watchers:
            3 Start watching this issue

              Dates

              Created:
              Updated:

                Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.