MariaDB Server / MDEV-27803

MariaDB binlog corruption when "No space left on device" and stuck session killed by client

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.2(EOL), 10.6
    • Fix Version/s: 10.6
    • Component/s: Replication
    • Labels: None

    Description

      MariaDB server should be able to recover from a storage-full condition. However, in the scenario described below the binlog becomes corrupted while the disk is full (0 bytes free), which can cause binlog replay failure and replication failure.

      The issue is reproducible in 10.6.5 and was also seen in 10.2.40.

      Issue description:

      When MariaDB server runs out of storage, it fails to write to the binlog file with "No space left on device". The server keeps running at this point:

      2022-02-10 23:51:28 6 [Warning] mysqld: Disk is full writing '/data/log/binlog/mysql-bin-changelog.000001' (Errcode: 28 "No space left on device"). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
      2022-02-10 23:51:28 6 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
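
      To confirm that the volume really is out of space while the server is waiting, free space can be read directly from the filesystem. The helper below is a minimal sketch, assuming GNU `stat` (as on Amazon Linux 2); the function name `free_space` is hypothetical. It prints both the total free bytes and the bytes available to non-root users, since ext4 reserves some blocks for root.

```shell
# Report free space on the filesystem holding $1, in bytes.
# "free" counts every unused block; "avail" excludes blocks reserved for root.
free_space() {
    # GNU stat -f: %f = free blocks, %a = blocks available to non-root,
    # %S = fundamental block size used for those counts.
    set -- $(stat -f -c '%f %a %S' "$1")
    echo "free=$(( $1 * $3 )) avail=$(( $2 * $3 ))"
}

# Example: free_space /data
```

      When both values are 0, no further writes to the binlog directory can succeed until space is released.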
      

      Because the server keeps retrying the binlog write, it is supposed to recover once some storage is released or more storage is added.

      However, if the stuck session is killed by the client while the server is in this state, binlog writing breaks and cannot recover.

      After the binlog is corrupted:

      1. New INSERT queries fail with errno: 11 "Resource temporarily unavailable":

        MariaDB [(none)]> INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa');
        2022-02-10 23:55:46 11 [ERROR] mysqld: Error writing file '/data/log/binlog/mysql-bin-changelog' (errno: 11 "Resource temporarily unavailable")
        

      2. Parsing the problematic binlog file with mysqlbinlog fails with the following error:

        # /usr/local/mysql/bin/mysqlbinlog /data/log/binlog/mysql-bin-changelog.000001 > /tmp/0001
        ERROR: Error in Log_event::read_log_event(): 'Event truncated', data_len: 673207109, event_type: 32
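
      When several binlog files exist, the same check can be run over all of them. This is a minimal sketch, assuming that mysqlbinlog exits with a non-zero status when it hits a parse error such as 'Event truncated'; the function name `check_binlogs` and the `MYSQLBINLOG` override variable are hypothetical.

```shell
# Parse each binlog file given as an argument and report the ones that fail.
# MYSQLBINLOG may be set to override the parser; it defaults to the path
# used in this report.
check_binlogs() {
    parser="${MYSQLBINLOG:-/usr/local/mysql/bin/mysqlbinlog}"
    rc=0
    for f in "$@"; do
        # Discard the decoded events; only the exit status matters here.
        if "$parser" "$f" > /dev/null 2>&1; then
            echo "OK $f"
        else
            echo "FAILED $f"
            rc=1
        fi
    done
    return "$rc"
}

# Example: check_binlogs /data/log/binlog/mysql-bin-changelog.0*
```

      A FAILED line only means that the file could not be parsed; rerun mysqlbinlog on that file without redirection to see the actual error message.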
        

      The only way to recover in this scenario is to restart the MariaDB server. However, the corrupted binlog file is left behind and cannot be replayed. If there are replicas, replication also fails because of it:

      Last_IO_Error: Relay log write failure: could not queue event from master
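
      On a replica, the error fields can be pulled out of `SHOW SLAVE STATUS` with a short filter. This is a minimal sketch; the function name `replica_errors` and the `MYSQL` override variable are hypothetical, and the client path is the one used elsewhere in this report.

```shell
# Print the I/O and SQL thread error fields from SHOW SLAVE STATUS.
# MYSQL may be set to override the client binary.
replica_errors() {
    "${MYSQL:-/usr/local/mysql/bin/mysql}" -e 'SHOW SLAVE STATUS\G' \
        | grep -E '^ *Last_(IO|SQL)_(Errno|Error):'
}

# Example (run on the replica): replica_errors
```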
      

      How to reproduce:

      The issue can be reproduced with the following steps, building and installing from source (https://github.com/MariaDB/server/tree/mariadb-10.6.5/) on an AWS EC2 instance.

      1. Create an EC2 instance running Amazon Linux 2 and attach a 1 GB EBS volume. Mount the volume at `/data` on the instance:

        sudo su -
        fdisk -l
        mkdir /data
        mkfs.ext4 /dev/nvme1n1
        mount /dev/nvme1n1 /data
        

      2. Check out MariaDB 10.6.5, then build and install it. These are the parameters and commands I used:

        yum-builddep -y mariadb-server
        yum install -y git gcc gcc-c++ bison libxml2-devel libevent-devel rpm-build
        git clone https://github.com/MariaDB/server.git --branch mariadb-10.6.5 --depth 1
        cd server && cmake . && make -j `nproc` && make install
        

      3. Stop any running server and prepare clean directories:

        pkill mysqld && sleep 1
        sudo rm -rf /data/*
        sudo mkdir -p /data/log/binlog /data/log/error /data/log/innodb/ /data/db/innodb/ /data/tmp/
        sudo chown `whoami`:`whoami` /data -R
        

      4. Initialize the database and start the server in the background with `--log-bin`:

        sudo /usr/local/mysql/scripts/mysql_install_db \
         --no-defaults --user=`whoami` --datadir=/data/db --basedir=/usr/local/mysql/ --innodb-log-group-home-dir /data/log/innodb --innodb-log-file-size 134217728 \
         --auth-root-authentication-method=normal --force --skip-name-resolve --skip-test-db --cross-bootstrap --innodb-data-home-dir /data/db/innodb
         
        /usr/local/mysql/bin/mysqld --no-defaults --user=`whoami` --datadir=/data/db --basedir=/usr/local/mysql/ --innodb-log-group-home-dir /data/log/innodb --innodb-log-file-size 134217728 --innodb-data-home-dir /data/db/innodb  \
        --binlog_cache_size 32768  --binlog_format MIXED   --max_binlog_size 134217728   --sync-binlog 1 --log-bin='/data/log/binlog/mysql-bin-changelog'  --skip-grant-tables --server_id=2 &
        

      5. Connect to the database and create test db/table:

        /usr/local/mysql/bin/mysql -e "\
        create database t; \
        CREATE TABLE t.t1 (a INT, b MEDIUMTEXT) ENGINE=Innodb; \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa');"
        

      6. Fill up the storage:

        let size1=`df /dev/nvme1n1 | sed -n '2p' | awk '{print $4}'`*1024
        fallocate -l $size1 /data/1
        cat /data/1 > /data/2
        

      7. Keep inserting until a query gets stuck:

        /usr/local/mysql/bin/mysql -e " \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
        INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa');"
        

      8. At this point the MariaDB error log should contain the following lines:

        2022-02-09 21:22:30 31 [Warning] mysqld: Disk is full writing '/data/log/binlog/mysql-bin-changelog.000011' (Errcode: 28 "No space left on device"). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
        2022-02-09 21:22:30 31 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
        

      9. `show full processlist` shows the query in state "Commit":

        MariaDB [(none)]> show full processlist;
        +----+------+-----------+------+---------+------+----------+----------------------------------------------------------------------------------------------------------------------------+----------+
        | Id | User | Host      | db   | Command | Time | State    | Info                                                                                                                       | Progress |
        +----+------+-----------+------+---------+------+----------+----------------------------------------------------------------------------------------------------------------------------+----------+
        | 11 | root | localhost | NULL | Query   |   48 | Commit   | INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa') |    0.000 |
        | 12 | root | localhost | NULL | Query   |    0 | starting | show full processlist                                                                                                      |    0.000 |
        +----+------+-----------+------+---------+------+----------+----------------------------------------------------------------------------------------------------------------------------+----------+
        2 rows in set (0.000 sec)
        

      10. Kill the stuck client with Ctrl+C:

        MariaDB [(none)]> INSERT INTO t.t1 VALUES (1, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
        ^CCtrl-C -- query killed. Continuing normally.
        ^CCtrl-C -- query killed. Continuing normally.
        ERROR 2013 (HY000): Lost connection to MySQL server during query
        MariaDB [(none)]> Ctrl-C -- exit!
        Aborted
        

      11. At this point `show processlist` shows the query with "Command=Killed" and "State=Commit":

        #/usr/local/mysql/bin/mysql -e "show processlist"
        +----+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
        | Id | User | Host      | db   | Command | Time | State    | Info                                                                                                 | Progress |
        +----+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
        |  5 | root | localhost | NULL | Killed  |   22 | Commit   | INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa |    0.000 |
        |  8 | root | localhost | NULL | Query   |    0 | starting | show processlist                                                                                     |    0.000 |
        +----+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
        

      12. Wait for 1 minute (until Time in the processlist reaches 59).
        The following error should then appear:

        [ERROR] mysqld: Error writing file '/data/log/binlog/mysql-bin-changelog' (errno: 28 "No space left on device")
        

      13. Release storage by removing /data/1:

        [21:24:54][root][~]$ rm /data/1 
        rm: remove regular file '/data/1'? y
        

      14. Ideally, since there is now enough storage, binlog writing should recover.
      15. However, the binlog is already corrupted at this point.
        Parsing it with mysqlbinlog produces errors like:

        [root@ip-172-31-41-130 tmp]# /usr/local/mysql/bin/mysqlbinlog /data/log/binlog/mysql-bin-changelog.000001 > /tmp/0001
        ERROR: Error in Log_event::read_log_event(): 'Event truncated', data_len: 673207109, event_type: 32
        

      16. Retrying the insert shows errors about the binlog write:

        /usr/local/mysql/bin/mysql -e "INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa')"
        ERROR 1026 (HY000) at line 1: Error writing file '/data/log/binlog/mysql-bin-changelog' (errno: 11 "Resource temporarily unavailable")
        

        Error log:

        2022-02-10 23:18:17 11 [ERROR] mysqld: Error writing file '/data/log/binlog/mysql-bin-changelog' (errno: 11 "Resource temporarily unavailable")
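
      Step 10 above kills the stuck query interactively with Ctrl+C. For a scripted reproduction, the kill could instead be issued from a second session. The sketch below assumes that `KILL QUERY` sent from another connection has the same effect as Ctrl+C in the client, which was not verified in this report; the function names, the `MYSQL` override variable and the 10-second threshold are hypothetical.

```shell
# List ids of client threads that have been in the "Commit" state for at
# least $1 seconds (default 10), one per line.
stuck_commit_ids() {
    "${MYSQL:-/usr/local/mysql/bin/mysql}" -N -B -e \
        "SELECT ID FROM information_schema.PROCESSLIST WHERE STATE = 'Commit' AND TIME >= ${1:-10}"
}

# Send KILL QUERY for each thread id read from standard input.
kill_queries() {
    while read -r id; do
        # Skip anything that is not a plain number.
        case "$id" in
            ''|*[!0-9]*) continue ;;
        esac
        # Keep the client from consuming the remaining ids on stdin.
        "${MYSQL:-/usr/local/mysql/bin/mysql}" -e "KILL QUERY $id" < /dev/null
    done
}

# Example: stuck_commit_ids 10 | kill_queries
```

      After the kill, steps 11 to 16 can be followed unchanged to check whether the binlog ends up in the same state.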
        

          Activity

            wenhug Hugo Wen created issue -
            # Show processlist shows query state is "Commit":
            {code}
            MariaDB [(none)]> show full processlist;
            +----+------+-----------+------+---------+------+----------+----------------------------------------------------------------------------------------------------------------------------+----------+
            | Id | User | Host | db | Command | Time | State | Info | Progress |
            +----+------+-----------+------+---------+------+----------+----------------------------------------------------------------------------------------------------------------------------+----------+
            | 11 | root | localhost | NULL | Query | 48 | Commit | INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa') | 0.000 |
            | 12 | root | localhost | NULL | Query | 0 | starting | show full processlist | 0.000 |
            +----+------+-----------+------+---------+------+----------+----------------------------------------------------------------------------------------------------------------------------+----------+
            2 rows in set (0.000 sec)
            {code}
            # Kill the stuck client using ctrl+c
            {code}
            MariaDB [(none)]> INSERT INTO t.t1 VALUES (1, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
            ^CCtrl-C -- query killed. Continuing normally.
            ^CCtrl-C -- query killed. Continuing normally.
            ERROR 2013 (HY000): Lost connection to MySQL server during query
            MariaDB [(none)]> Ctrl-C -- exit!
            Aborted
            {code}
            # At this point `show processlist`shows the query "Command=Killed" and "State=Commit"
            {code}
            #/usr/local/mysql/bin/mysql -e "show processlist"
            +----+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
            | Id | User | Host | db | Command | Time | State | Info | Progress |
            +----+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
            | 5 | root | localhost | NULL | Killed | 22 | Commit | INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa | 0.000 |
            | 8 | root | localhost | NULL | Query | 0 | starting | show processlist | 0.000 |
            +----+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
            {code}
            # Wait for 1 min (Time in processlist becomes 59)
            Then should see this error:
            {code}
            [ERROR] mysqld: Error writing file '/data/log/binlog/mysql-bin-changelog' (errno: 28 "No space left on device")
            {code}
            # release storage by rm /data/1
            {code}
            [21:24:54][root][~]$ rm /data/1
            rm: remove regular file '/data/1'? y
            {code}
            # *Ideally at this time since there's enough storage, the binlog should be recovered.*
            # However, the binlog will be corrupted at this point:
            use mysqlbinlog to parse the binlog will see errors like:
            {code}
            [root@ip-172-31-41-130 tmp]# /usr/local/mysql/bin/mysqlbinlog /data/log/binlog/mysql-bin-changelog.000001 > /tmp/0001
            ERROR: Error in Log_event::read_log_event(): 'Event truncated', data_len: 673207109, event_type: 32
            {code}
            # Retry the inserting it will show errors about the binlog writing.
            {code}
            /usr/local/mysql/bin/mysql -e "INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa')"
            ERROR 1026 (HY000) at line 1: Error writing file '/data/log/binlog/mysql-bin-changelog' (errno: 11 "Resource temporarily unavailable")
            {code}
            Error log:
            {code}
            2022-02-10 23:18:17 11 [ERROR] mysqld: Error writing file '/data/log/binlog/mysql-bin-changelog' (errno: 11 "Resource temporarily unavailable")
            {code}
            wenhug Hugo Wen made changes -
            Description
            *MariaDB server should be recoverable from a storage-full condition, but in the following scenario the binlog is corrupted while storage is full (0 bytes free on disk). This can cause binlog replay failure and replication failure.*

            The issue is reproducible in 10.6.5 and was also seen in 10.2.40.

            h3. *Issue description*:

            When the MariaDB server runs out of storage, it fails to write the binlog file because of "No space left on device". At this point, the server is still running.

            {code:java}
            2022-02-10 23:51:28 6 [Warning] mysqld: Disk is full writing '/data/log/binlog/mysql-bin-changelog.000001' (Errcode: 28 "No space left on device"). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
            2022-02-10 23:51:28 6 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
            {code}

            Because the server keeps retrying the binlog write, it is supposed to recover once some storage is released or more storage is added.

            However, in this condition, *if the stuck session is killed by the client, binlog writing breaks and cannot recover.*

            After the binlog is corrupted:
            # New INSERT queries fail with {{errno: 11 "Resource temporarily unavailable"}}
            {code:java}
            MariaDB [(none)]> INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa');
            2022-02-10 23:55:46 11 [ERROR] mysqld: Error writing file '/data/log/binlog/mysql-bin-changelog' (errno: 11 "Resource temporarily unavailable")
            {code}
            # Parsing the problematic binlog file with mysqlbinlog fails with the following error:
            {code}
            # /usr/local/mysql/bin/mysqlbinlog /data/log/binlog/mysql-bin-changelog.000001 > /tmp/0001
            ERROR: Error in Log_event::read_log_event(): 'Event truncated', data_len: 673207109, event_type: 32
            {code}
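The second symptom can be checked across several binlog files at once. The following is a minimal sketch, not part of the original report: the function name `unreadable_binlogs` is hypothetical, and it assumes that mysqlbinlog exits with a non-zero status when it hits an error such as 'Event truncated'. The parser command is passed as a parameter so the function can be tried without a server installation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the binlog files that a parser command fails to read.
# Usage: unreadable_binlogs PARSER FILE...
# PARSER would normally be the path to mysqlbinlog (assumed to exit non-zero
# on a parse error such as 'Event truncated').
unreadable_binlogs() {
  local parser=$1
  shift
  local f
  for f in "$@"; do
    # Discard the decoded events; only the exit status matters here.
    if ! "$parser" "$f" >/dev/null 2>&1; then
      printf '%s\n' "$f"
    fi
  done
}
```

With the paths used in this report, it could be called as `unreadable_binlogs /usr/local/mysql/bin/mysqlbinlog /data/log/binlog/mysql-bin-changelog.0*`.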

            The only way to recover in this scenario is to restart the MariaDB server. *However, the restart leaves the problematic binlog behind, and it cannot be replayed. If there are replicas, replication also fails because of it:*
            {code}
            Last_IO_Error: Relay log write failure: could not queue event from master
            {code}
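On a replica, the failure above appears in the `Last_IO_Error` field of the replica status output. As a rough sketch (the function name `replica_io_error` is hypothetical and not from the original report), the field can be pulled out of the vertical status output like this:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Last_IO_Error value from replica status
# text read on stdin. Prints nothing when the field is absent or empty.
replica_io_error() {
  sed -n 's/^[[:space:]]*Last_IO_Error:[[:space:]]*\([^[:space:]].*\)$/\1/p'
}
```

It is meant to be fed from the client, for example `mysql -e 'SHOW SLAVE STATUS\G' | replica_io_error`; an empty result means the replica reported no I/O error.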

            h3. *How to reproduce:*
            The issue can be reproduced with the following steps, using the source code at https://github.com/MariaDB/server/tree/mariadb-10.6.5/, built and installed on an AWS EC2 instance.

            # Create an Amazon Linux 2 EC2 instance and attach a 1 GB EBS volume. Mount the volume at `/data` on the instance:
            {code}
            sudo su -
            fdisk -l
            mkdir /data
            mkfs.ext4 /dev/nvme1n1
            mount /dev/nvme1n1 /data
            {code}
            # Check out MariaDB 10.6.5, then build and install it. Here are the parameters and commands I used:
            {code}
            yum-builddep -y mariadb-server
            yum install -y git gcc gcc-c++ bison libxml2-devel libevent-devel rpm-build
            git clone https://github.com/MariaDB/server.git --branch mariadb-10.6.5 --depth 1
            cd server && cmake . && make -j `nproc` &&make install
            {code}
            # Prepare
            {code}
            pkill mysqld && sleep 1
            sudo rm -rf /data/*
            sudo mkdir -p /data/log/binlog /data/log/error /data/log/innodb/ /data/db/innodb/ /data/tmp/
            sudo chown `whoami`:`whoami` /data -R
            {code}
            # Initialize the DB and start the server with `--log-bin` in the background
            {code}
            sudo /usr/local/mysql/scripts/mysql_install_db \
             --no-defaults --user=`whoami` --datadir=/data/db --basedir=/usr/local/mysql/ --innodb-log-group-home-dir /data/log/innodb --innodb-log-file-size 134217728 \
             --auth-root-authentication-method=normal --force --skip-name-resolve --skip-test-db --cross-bootstrap --innodb-data-home-dir /data/db/innodb

            /usr/local/mysql/bin/mysqld --no-defaults --user=`whoami` --datadir=/data/db --basedir=/usr/local/mysql/ --innodb-log-group-home-dir /data/log/innodb --innodb-log-file-size 134217728 --innodb-data-home-dir /data/db/innodb \
            --binlog_cache_size 32768 --binlog_format MIXED --max_binlog_size 134217728 --sync-binlog 1 --log-bin='/data/log/binlog/mysql-bin-changelog' --skip-grant-tables --server_id=2 &
            {code}
            # Connect to the database and create test db/table:
            {code}
            /usr/local/mysql/bin/mysql -e "\
            create database t; \
            CREATE TABLE t.t1 (a INT, b MEDIUMTEXT) ENGINE=Innodb; \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa');"
            {code}
            # Fill up the storage
            {code}
            let size1=`df /dev/nvme1n1 | sed -n '2p' | awk '{print $4}'`*1024
            fallocate -l $size1 /data/1
            cat /data/1 > /data/2
            {code}
            # Keep inserting until a query gets stuck
            {code}
            /usr/local/mysql/bin/mysql -e " \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'); \
            INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa');"
            {code}
            # At this point, the MySQL error log should contain the following lines:
            {code}
            2022-02-09 21:22:30 31 [Warning] mysqld: Disk is full writing '/data/log/binlog/mysql-bin-changelog.000011' (Errcode: 28 "No space left on device"). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
            2022-02-09 21:22:30 31 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
            {code}
            # `show full processlist` shows the query in state "Commit":
            {code}
            MariaDB [(none)]> show full processlist;
            +----+------+-----------+------+---------+------+----------+----------------------------------------------------------------------------------------------------------------------------+----------+
            | Id | User | Host | db | Command | Time | State | Info | Progress |
            +----+------+-----------+------+---------+------+----------+----------------------------------------------------------------------------------------------------------------------------+----------+
            | 11 | root | localhost | NULL | Query | 48 | Commit | INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa') | 0.000 |
            | 12 | root | localhost | NULL | Query | 0 | starting | show full processlist | 0.000 |
            +----+------+-----------+------+---------+------+----------+----------------------------------------------------------------------------------------------------------------------------+----------+
            2 rows in set (0.000 sec)
            {code}
            # Kill the stuck client with Ctrl+C
            {code}
            MariaDB [(none)]> INSERT INTO t.t1 VALUES (1, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
            ^CCtrl-C -- query killed. Continuing normally.
            ^CCtrl-C -- query killed. Continuing normally.
            ERROR 2013 (HY000): Lost connection to MySQL server during query
            MariaDB [(none)]> Ctrl-C -- exit!
            Aborted
            {code}
            # At this point `show processlist` shows the query with "Command=Killed" and "State=Commit"
            {code}
            #/usr/local/mysql/bin/mysql -e "show processlist"
            +----+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
            | Id | User | Host | db | Command | Time | State | Info | Progress |
            +----+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
            | 5 | root | localhost | NULL | Killed | 22 | Commit | INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa | 0.000 |
            | 8 | root | localhost | NULL | Query | 0 | starting | show processlist | 0.000 |
            +----+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
            {code}
            # Wait for 1 minute (Time in the processlist reaches 59)
            You should then see this error:
            {code}
            [ERROR] mysqld: Error writing file '/data/log/binlog/mysql-bin-changelog' (errno: 28 "No space left on device")
            {code}
            # Release storage by removing /data/1
            {code}
            [21:24:54][root][~]$ rm /data/1
            rm: remove regular file '/data/1'? y
            {code}
            # *Ideally, since there is now enough storage, binlog writing should recover at this point.*
            # However, the binlog is already corrupted at this point:
            Parsing the binlog with mysqlbinlog produces errors like:
            {code}
            [root@ip-172-31-41-130 tmp]# /usr/local/mysql/bin/mysqlbinlog /data/log/binlog/mysql-bin-changelog.000001 > /tmp/0001
            ERROR: Error in Log_event::read_log_event(): 'Event truncated', data_len: 673207109, event_type: 32
            {code}
            # Retrying the insert shows errors about binlog writing.
            {code}
            /usr/local/mysql/bin/mysql -e "INSERT INTO t.t1 VALUES (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa')"
            ERROR 1026 (HY000) at line 1: Error writing file '/data/log/binlog/mysql-bin-changelog' (errno: 11 "Resource temporarily unavailable")
            {code}
            Error log:
            {code}
            2022-02-10 23:18:17 11 [ERROR] mysqld: Error writing file '/data/log/binlog/mysql-bin-changelog' (errno: 11 "Resource temporarily unavailable")
            {code}
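The state that precedes the corruption in the steps above, a session with Command "Killed" that is still in State "Commit", can be spotted mechanically. This is a minimal sketch under the assumption that the processlist is fetched in the mysql client's tab-separated batch format; the function name `killed_in_commit` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read tab-separated processlist rows on stdin (the
# format the mysql client emits in batch mode) and print the Id of each
# session whose Command is "Killed" while its State is still "Commit".
# Column order: Id, User, Host, db, Command, Time, State, Info, Progress.
killed_in_commit() {
  awk -F '\t' '$5 == "Killed" && $7 == "Commit" { print $1 }'
}
```

For example: `/usr/local/mysql/bin/mysql -B -e "show processlist" | killed_in_commit`. Any Id printed while the disk is full marks a session in the state described in this report.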
            wenhug Hugo Wen made changes -
            Summary MariaDB binlog corruption when "No space left on device" and stuck session killed from client MariaDB binlog corruption when "No space left on device" and stuck session killed by client
            danblack Daniel Black added a comment -

            Thank you for the detailed bug report.

            rdem Richard DEMONGEOT made changes -

            rdem Richard DEMONGEOT added a comment -

            Hello,

            I opened a bug a month ago that seems closely related (MDEV-27436), so I've linked the two tickets as related.

            Regards,

            otto Otto Kekäläinen added a comment -

            marko, what are your thoughts on the expected behavior of InnoDB when the data directory runs out of disk space (or, more generally, if the filesystem suddenly goes into read-only mode for whatever reason)? Should mariadbd shut down automatically in such a case? Or should the database stay up but yield errors, and continue once the disk is writeable again? Should SELECT queries and database connections still work, with only write operations yielding errors while the filesystem does not accept writes?
            danblack Daniel Black added a comment -

            For general InnoDB I/O error handling, MDEV-27593 is currently looking at how to handle these cases. It would be good if the binlog were handled the same way.

            Do you have a user preference?


            marko Marko Mäkelä added a comment -

            otto, InnoDB normally attempts to allocate space upfront. The InnoDB redo log should never run out of space, because it is circular. If an InnoDB data file needs to be extended, then I believe that a failure to extend a file currently results in the server being killed. A more robust error handling would be to refuse the write operation that resulted in the need to extend the data file. In any case, no data should be lost in the InnoDB layer due to running out of space.

            To my understanding, both log_bin and the Aria storage engine recovery log are appended on write.

            When it comes to the Aria storage engine, I believe that it cannot be changed to use a circular log file without restricting the maximum size of a transaction. Since InnoDB writes undo log into data pages that are covered by the circular redo log, open transactions do not prevent any redo log checkpoints. I do not know anything about the binlog, but I would not be surprised if the maximum transaction size is the minimum binlog file size.

            otto Otto Kekäläinen added a comment -

            Thanks for the InnoDB description Marko!

            Actually we should perhaps ask Elkin to chime in on what his thoughts are about the expected behavior of binlogs when disk is full (or filesystem goes into read-only mode for some other reason)?
            Elkin Andrei Elkin added a comment - - edited

            otto, in case the binlog file system gets full and the server crashes (in my testing the server could only be killed), there will (or should) be the following at restart:
            1. The last filed transaction may be incomplete
            2. It *should* be trimmed from binlog according to WL#5493: Binlog crash-safe when master crashed
            3. The trimmed transaction won't be committed at recovery either.

            As for the WL#5493 trimming, I could not actually confirm it in my testing (on 10.6); it needs investigating. I am not sure, though, whether the report's mysqlbinlog read failures were observed after the server was restarted. Probably not, and if so, it makes sense to restart the server and then check the old binlog file with mysqlbinlog.


            otto Otto Kekäläinen added a comment -

            Thanks Elkin for the comments. One problem here is that the binlogs are written by the primary DB and used for replication, to be applied by replicas. Thus in theory the primary DB could continue working even when the disk is full, and only replication would fail as binlogs are no longer written. If the replicas exist for fail-over and high availability purposes, it would be a bit counterproductive to shut down the primary DB and make the whole application fail. Or is the assumption that if replication is on, the primary DB can be shut down and the app should fail over to one of the replicas? And hopefully the replicas don't have their disk in read-only mode. Or is the assumption that if the filesystem goes into read-only mode, the primary DB would continue running but emit alerts, then the replicas would shut down, and recovery after the disk is writeable again would happen by creating new replicas from the primary DB? Or is there some middle ground, where the primary DB closes all connections and refuses new writes but still keeps flushing the binlogs to the replicas, allowing one of the replicas to be promoted to primary as soon as it has caught up with the primary DB? And only after that fully shut down the primary DB?

            What if MariaDB had some code that would trigger a safe shutdown and flush before it runs out of disk space?
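
A minimal sketch of that last idea, written as an external watchdog rather than as server code. Everything here is an assumption for illustration: the helper names `free_space_low` and `watch` are hypothetical, the threshold is arbitrary, and what the `on_low` callback does (alerting, blocking writes, requesting a clean shutdown) is left to the caller.

```python
import shutil
import time


def free_space_low(path, min_free_bytes):
    """Return True when the filesystem holding `path` reports fewer than
    `min_free_bytes` bytes free."""
    return shutil.disk_usage(path).free < min_free_bytes


def watch(path, min_free_bytes, on_low, interval_s=5.0, max_checks=None):
    """Poll free space on `path`; call `on_low(path)` once when it drops
    below the threshold and return True. Return False if `max_checks`
    polls pass without hitting the threshold (None means poll forever)."""
    checks = 0
    while max_checks is None or checks < max_checks:
        if free_space_low(path, min_free_bytes):
            on_low(path)
            return True
        checks += 1
        time.sleep(interval_s)
    return False
```

The threshold would need to leave enough headroom for whatever the server still has to write while it shuts down, which this sketch does not attempt to estimate.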
            Elkin Andrei Elkin added a comment -

            otto, Yw.
            From my capsule review of your questions/proposals (I did not have much time today for a deeper look), the following method

            > is the assumption that if replication is on, the primary DB can be shut down and the app should fail-over to one of the replicas?

            must be viable. On the server side, though, we need to ensure a smooth shutdown (I did have a hang in my testing, to be explored and fixed if that's the case). It would then be the application's burden to find the most up-to-date slave to fail the master role over to. (Automatic fail-over, announced as part of MDEV-19140, is not yet in the plans.)

            I did not get your idea, sorry, in

            > Or is the assumption that if filesystem goes into read-only mode, the primary DB would continue running but emit alerts, and then the replicas would shut down

            So could you please explain?

            Cheers.

            Andrei


            otto Otto Kekäläinen added a comment -

            > Or is the assumption that if filesystem goes into read-only mode, the primary DB would continue running but emit alerts, and then the replicas would shut down

            > So could you please explain?

            I meant that if the filesystem holding the binlogs goes into read-only mode (disk full, filesystem corrupted so that the kernel remounts it read-only, network filesystem hiccup, or whatever reason) and new binlog entries cannot be written, then the primary database could still continue to serve both writes and reads if the InnoDB data tables are on a different filesystem that is still writeable. However, since no new binlogs are written, replication would be broken, and the primary DB should tell the replicas that they are no longer up to date and thus inconsistent.

            Alternatively, if neither the binlog nor the data tables can be written, the primary database could still continue to run but only serve SELECT queries, and issue warnings both to client connections and in server logs that it does not accept write operations.

            My purpose was just to list different scenarios so you can consider what the designed behavior should be in them.
            Elkin Andrei Elkin added a comment - - edited

            > the primary DB should tell the replicas that they are no longer up to date and thus inconsistent.

            This is a good idea and can be implemented separately or along with cherry-picking
            --binlog-error-action to extend upstream's set of policies.
            We already have the INCIDENT event to notify replicas of certain abnormalities on the primary; in your proposal INCIDENT would just be sent directly to replicas, bypassing the binlog.

            We can also help a primary whose binlog is out of disk space to continue replication, again by sending replicated events directly (not touching the binlog).
            In this scenario, the primary would have to be demoted to a slave at its first restart (see MDEV-21117, semisync slave recovery, which has already tried this approach), to receive its own events, not to execute them but merely to binlog them.

            Elkin Andrei Elkin made changes -
            Assignee Andrei Elkin [ elkin ]
            marko Marko Mäkelä made changes -
            elenst Elena Stepanova made changes -
            Fix Version/s 10.6 [ 24028 ]
            Affects Version/s 10.2 [ 14601 ]
            Affects Version/s 10.6 [ 24028 ]
            monty Michael Widenius added a comment - - edited

            Things to do on the MariaDB server side to make this easier to handle if something like this happens again:

            • On the master, if the thread-local binlog cached event causes /tmp to fill up during
              binlog-commit, we should skip the event and instead write an incident event to the binary
              log to mark the binlog as corrupted on the slave (as the transaction is already committed in
              the engine but the binary log will not contain it).
            • Better error message when we get a "half event" from the master.
            • When we get a 'half event', wait until the SQL slave threads have executed all events found so
              far before stopping replication (this may be the case already, but it has to be checked).
              From the above it looks like we start applying from 1-2-8425025170 over and over again
              when we should start applying from 1-2-8425025171.
              If this is the case, it is a bug, because with 'half events' we will also miss any events
              that we read before the 'half event', as we abort replication before we
              have applied the previous events.
              If the user then tries to use "SQL_SLAVE_SKIP_COUNTER" to skip some events,
              the events that were not yet applied will be skipped as well.
            • SQL_SLAVE_SKIP_COUNTER should also be able to skip a 'half event' if this is the last
              event in a log.

            Fix the following messages to make it clearer what is going on:

            2023-10-16 11:54:18 14 [Note] Slave I/O thread exiting, read up to log 'bin_log.003418', position 10664265; GTID position 1-2-8425025170, master xxxx
            ->
            2023-10-16 11:54:18 14 [Note] Slave I/O thread exiting, read up to log 'bin_log.003418', position 10664265; Last applied GTID 1-2-8425025170, master xxxx

            2023-10-16 12:28:02 142367 [Note] Slave SQL thread exiting, replication stopped in log 'bin_log.003418' at position 10664156; GTID position '1-2-8425025170', master: xxxx
            ->
            2023-10-16 12:28:02 142367 [Note] Slave SQL thread exiting, replication stopped in log 'bin_log.003418' at position 10664156; Last applied GTID '1-2-8425025170', master: xxxx

            2023-10-16 13:02:07 491 [Note] Slave I/O thread: connected to master 'xxxx',replication starts at GTID position xxx
            ->
            2023-10-16 13:02:07 491 [Note] Slave I/O thread: connected to master 'xxxx',replication starting on next event after GTID xxx

            monty Michael Widenius made changes -
            Labels CS0665999
            julien.fritsch Julien Fritsch made changes -
            Labels CS0665999
            julien.fritsch Julien Fritsch made changes -
            Priority Major [ 3 ] Critical [ 2 ]
            julien.fritsch Julien Fritsch made changes -
            Labels triage
            Roel Roel Van de Paar made changes -
            Roel Roel Van de Paar made changes -
            julien.fritsch Julien Fritsch made changes -
            Priority Critical [ 2 ] Major [ 3 ]
            julien.fritsch Julien Fritsch made changes -
            Labels triage

            People

              Elkin Andrei Elkin
              wenhug Hugo Wen
              Votes: 2
              Watchers: 15
