  1. MariaDB Server
  2. MDEV-33813

ERROR 1021 (HY000): Disk full (./org/test1.MAI); waiting for someone to free some space... (errno: 28 "No space left on device")

Details

    Description

      We have a test scenario with Aria tables on MariaDB ES versions 10.6.14, 10.6.15 and 10.6.17 (current).
      We ran SELECT, INSERT/DELETE and ALTER statements in parallel. When the ALTER TABLE failed because the MariaDB "tmp" dir reached 100% utilization, the transaction was rolled back and "tmp" utilization returned to normal.
      But subsequent queries that use the "tmp" dir kept failing with "ERROR 1021 (HY000): Disk full (./org/test.MAI); waiting for someone to free some space... (errno: 28 "No space left on device")".

      In each case we observed "(deleted)" files showing up in the output of the "lsof" command as referenced below.

      -----------------------------------------------------------------------------------------
      Version:10.6.14
       
      MariaDB [org]> select version();
      +----------------------------------+
      | version()                        |
      +----------------------------------+
      | 10.6.14-9-MariaDB-enterprise-log |
      +----------------------------------+
      1 row in set (0.001 sec)
       
      MariaDB [org]> show processlist;
      +----+-------+-----------+------+---------+------+----------+------------------+----------+
      | Id | User  | Host      | db   | Command | Time | State    | Info             | Progress |
      +----+-------+-----------+------+---------+------+----------+------------------+----------+
      | 77 | mysql | localhost | org  | Query   |    0 | starting | show processlist |    0.000 |
      +----+-------+-----------+------+---------+------+----------+------------------+----------+
      1 row in set (0.001 sec)
       
      MariaDB [org]> select count(*) from  test1 where a=1;
      ERROR 1021 (HY000): Disk full (./org/test1.MAI); waiting for someone to free some space... (errno: 28 "No space left on device")
       
      MariaDB [org]> select count(*) from test1;
      +----------+
      | count(*) |
      +----------+
      |  2338896 |
      +----------+
      1 row in set (0.000 sec)
       
      MariaDB [org]>
      MariaDB [org]> \! df -h
      Filesystem      Size  Used Avail Use% Mounted on
      tmpfs           391M  1.6M  390M   1% /run
      /dev/sda3        49G   18G   29G  40% /
      tmpfs           2.0G     0  2.0G   0% /dev/shm
      tmpfs           5.0M  4.0K  5.0M   1% /run/lock
      tmpfs           2.0M     0  2.0M   0% /mariadb/temp
      /dev/sda2       512M  6.1M  506M   2% /boot/efi
      tmpfs           391M  116K  391M   1% /run/user/1000
      tmpfs           2.0M     0  2.0M   0% /mariadb/10614/temp
       
      MariaDB [(none)]> \! lsof +L1 |grep -i "/mariadb/10614"
      lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
            Output information may be incomplete.
      lsof: WARNING: can't stat() fuse.portal file system /run/user/1000/doc
            Output information may be incomplete.
      mariadbd  3418   mysql    8u   REG   0,48        0     0      36 /mariadb/10614/temp/#36 (deleted)
      mariadbd  3418   mysql    9u   REG   0,48        0     0      37 /mariadb/10614/temp/#37 (deleted)
      mariadbd  3418   mysql   11u   REG   0,48        0     0      38 /mariadb/10614/temp/#38 (deleted)
      mariadbd  3418   mysql   14u   REG   0,48        0     0      39 /mariadb/10614/temp/#39 (deleted)
      MariaDB [(none)]>
       
      --------------------------------------------------------------------------
       
      Version : 10.6.15
      MariaDB [org]> select version();
      +-----------------------------------+
      | version()                         |
      +-----------------------------------+
      | 10.6.15-10-MariaDB-enterprise-log |
      +-----------------------------------+
      1 row in set (0.000 sec)
       
      MariaDB [org]> show processlist;
      +----+------+-----------+------+---------+------+----------+------------------+----------+
      | Id | User | Host      | db   | Command | Time | State    | Info             | Progress |
      +----+------+-----------+------+---------+------+----------+------------------+----------+
      | 20 | root | localhost | org  | Query   |    0 | starting | show processlist |    0.000 |
      +----+------+-----------+------+---------+------+----------+------------------+----------+
      1 row in set (0.000 sec)
       
      MariaDB [org]> select count(*) from test1 where a=1;
      ERROR 1021 (HY000): Disk full (./org/test1.MAI); waiting for someone to free some space... (errno: 28 "No space left on device")
       
      MariaDB [org]> select count(*) from test1;
      +----------+
      | count(*) |
      +----------+
      | 14745600 |
      +----------+
      1 row in set (0.001 sec)
      MariaDB [org]>
      MariaDB [org]> \! df -h
      Filesystem      Size  Used Avail Use% Mounted on
      tmpfs           391M  2.0M  389M   1% /run
      /dev/sda3        49G   17G   30G  35% /
      tmpfs           2.0G     0  2.0G   0% /dev/shm
      tmpfs           5.0M  4.0K  5.0M   1% /run/lock
      /dev/sda2       512M  6.1M  506M   2% /boot/efi
      tmpfs           391M  104K  391M   1% /run/user/1000
      tmpfs           2.0M     0  2.0M   0% /mariadb/temp
      tmpfs           391M   92K  391M   1% /run/user/128
      MariaDB [org]> \! lsof +L1 |grep -i "/mariadb/"
      lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
            Output information may be incomplete.
      lsof: WARNING: can't stat() fuse.portal file system /run/user/1000/doc
            Output information may be incomplete.
      lsof: WARNING: can't stat() fuse.portal file system /run/user/128/doc
            Output information may be incomplete.
      lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/128/gvfs
            Output information may be incomplete.
      mariadbd  3674   mysql    8u   REG   0,47        0     0      40 /mariadb/temp/#40 (deleted)
      mariadbd  3674   mysql    9u   REG   0,47        0     0      41 /mariadb/temp/#41 (deleted)
      mariadbd  3674   mysql   11u   REG   0,47        0     0      42 /mariadb/temp/#42 (deleted)
      mariadbd  3674   mysql   14u   REG   0,47        0     0      43 /mariadb/temp/#43 (deleted)
       
      ----------------------------------------------------------------------
      Version : 10.6.17
      MariaDB [org]> select version();
      +---------------------+
      | version()           |
      +---------------------+
      | 10.6.17-MariaDB-log |
      +---------------------+
      1 row in set (0.001 sec)
       
      MariaDB [org]> show processlist;
      +----+------+-----------+------+---------+------+----------+------------------+----------+
      | Id | User | Host      | db   | Command | Time | State    | Info             | Progress |
      +----+------+-----------+------+---------+------+----------+------------------+----------+
      | 24 | root | localhost | org  | Query   |    0 | starting | show processlist |    0.000 |
      +----+------+-----------+------+---------+------+----------+------------------+----------+
      1 row in set (0.003 sec)
       
      MariaDB [org]> select count(*) from test where a=1;
      ERROR 1021 (HY000): Disk full (./org/test.MAI); waiting for someone to free some space... (errno: 28 "No space left on device")
       
      MariaDB [org]> select count(*) from test;
      +----------+
      | count(*) |
      +----------+
      |        9 |
      +----------+
      1 row in set (0.002 sec)
       
      @node1:/mariadb# df -h
      Filesystem      Size  Used Avail Use% Mounted on
      tmpfs           391M  1.6M  390M   1% /run
      /dev/sda3        49G   16G   31G  34% /
      tmpfs           2.0G     0  2.0G   0% /dev/shm
      tmpfs           5.0M  4.0K  5.0M   1% /run/lock
      /dev/sda2       512M  6.1M  506M   2% /boot/efi
      tmpfs           391M  120K  391M   1% /run/user/1000
      tmpfs           2.0M  4.0K  2.0M   1% /mariadb/temp
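
For readers unfamiliar with the "(deleted)" entries in the lsof output above: on a POSIX system a file that is unlinked while a process still holds it open keeps its inode, and any blocks it occupies, until the last descriptor is closed. The following is a minimal Python sketch of that general behaviour, not of MariaDB's own code; the function name `unlinked_file_demo` is made up for illustration. Note that in the captures above the size column appears to read 0 for these files, so by themselves they do not seem to account for a full "tmp" dir.

```python
import os
import tempfile


def unlinked_file_demo(size=64 * 1024):
    """Create a file, unlink it while it is open, and report what the OS still sees."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"x" * size)
        os.unlink(path)        # the name is gone; lsof would list it as "(deleted)"
        st = os.fstat(fd)      # the open descriptor still reaches the inode
        return {
            "path_exists": os.path.exists(path),
            "link_count": st.st_nlink,
            "size_via_fd": st.st_size,
        }
    finally:
        os.close(fd)           # only after this can the filesystem reclaim the space


if __name__ == "__main__":
    print(unlinked_file_demo())
```

This assumes a POSIX filesystem such as the tmpfs mounts shown above; on Windows, unlinking an open file normally fails instead.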
      
      

          Activity

            willfong Will Fong added a comment -

            Steps to reproduce:

            1. Run a query that fills up the temporary drive (ALTER, large SELECT, etc)
            2. Kill that query
            3. Any other query that requires a temporary file will fail


            monty Michael Widenius added a comment -

            Things to test/fix:

            • Internal temporary tables used by SELECT should never wait for disk full.
            • There should be a server option to get an error instead of 'waiting for disk full'.
            • Killing a query that uses temporary files should always delete the temporary files as part of ending the query. (I thought this was already the case, but apparently there are some cases where this is not true.)
            monty Michael Widenius added a comment - - edited

            Note that in the original example: "Disk full (./org/test.MAI); waiting for someone to free some space... (errno: 28 "No space left on device")"
            This is not using temporary space but comes from flushing the data from cache to the primary data-storage disk. In this case the correct behavior is to wait, as otherwise the table will be corrupted.
            At kill, the table is left 'as is', as we should not delete 'real data' automatically.
            To free the space in 'data' one has to drop the table.
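
The distinction drawn in these comments (wait for space when flushing real table data, but never wait for internal temporary tables) can be sketched roughly as follows. This is an illustrative Python sketch only, not the server's implementation: `write_with_disk_full_policy`, `write_fn`, `is_temporary` and `retry_delay` are hypothetical names, and the 60-second default is an assumption.

```python
import errno
import time


def write_with_disk_full_policy(write_fn, data, *, is_temporary,
                                retry_delay=60.0, sleep=time.sleep):
    """Call write_fn(data); on ENOSPC fail fast for temporary data, else retry."""
    while True:
        try:
            return write_fn(data)
        except OSError as exc:
            if exc.errno != errno.ENOSPC:
                raise              # unrelated I/O error: propagate unchanged
            if is_temporary:
                raise              # temporary data is disposable: report the error now
            sleep(retry_delay)     # real table data: wait for space, then try again
```

The design point is that abandoning a half-written temporary file loses nothing, whereas abandoning a flush of real table data could leave the table corrupted, so only the latter is worth blocking for.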

            monty Michael Widenius added a comment -

            Fixed: internal temporary tables no longer wait for disk space to be freed.

            Other things:

            • 'kill id' will now kill a query waiting for free disk space instantly.
              Previously it could take up to 60 seconds for the kill to be noticed.
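
The difference between a kill that is noticed "instantly" and one that may take up to a full wait interval can be illustrated with a small Python sketch. It shows the general technique of an interruptible wait and assumes nothing about how the server implements it; `wait_for_space`, `killed` and `has_space` are hypothetical names.

```python
import threading
import time


def wait_for_space(killed, has_space, poll_interval=60.0):
    """Wait until has_space() is true or the `killed` event is set.

    Returns True if space became available, False if the wait was killed.
    """
    while not has_space():
        # Event.wait returns True as soon as the event is set, so a kill ends
        # the wait immediately instead of after a full poll interval.
        if killed.wait(timeout=poll_interval):
            return False
    return True


if __name__ == "__main__":
    killed = threading.Event()
    threading.Timer(0.1, killed.set).start()   # simulate 'kill id' shortly after start
    start = time.monotonic()
    result = wait_for_space(killed, has_space=lambda: False)
    print(result, round(time.monotonic() - start, 2))
```

A plain `time.sleep(poll_interval)` in the loop would instead check for the kill only once per interval, which matches the delay of up to 60 seconds described above.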

            monty Michael Widenius added a comment -

            Note that I have not been able to reproduce the case where /tmp would be full after a SELECT query using a temporary table is killed.
            If one fills the disk with a normal table, killing the query will not help. One must first free some space and then kill the query to resolve the issue.

            monty Michael Widenius added a comment -

            Fix pushed to 10.6
            willfong Will Fong added a comment -

            Hi monty

            From the case, they are reporting that no queries are running, but the tmpdir was still full and subsequent queries that needed to create a temporary table were not able to execute. Does your fix address this case?

            Thanks,
            -will


            monty Michael Widenius added a comment -

            After the patch in 10.6.18, there have not been any more incidents of MariaDB waiting for disk space to be freed for internal temporary tables.

            I did, however, notice that we are still writing 'Waiting for someone to free space' in case of disk full.
            I am now removing the error message when there is no wait.

            People

              monty Michael Widenius
              vigneswara.bandi Venkata Vigneswara Reddy Bandi