Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-29610

uring tmpfs InnoDB: IO Error: 125 during async write of X bytes, for file Y, returned 0

Details

    • Bug
    • Status: Stalled (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL)
    • 10.6, 10.11
    • None

    Description

      Almost always on kvm-deb-bullseye-aarch64 (old bb) worker. tmpfs based directory.

      Usually encryption.innodb* tests though mariabackup.huge_lsn too.

      Notable non-aarch64 exception:

      Attachments

        Issue Links

          Activity

            danblack, any news on this? I see that 125 is ECANCELED in /usr/include/asm-generic/errno.h.

            mleich got this on 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux on an ext4 file system that was loopback mounted in a file that resided in /dev/shm. (Such indirection makes sense when using the older libaio; it does not support files residing directly on /dev/shm.)

            marko Marko Mäkelä added a comment - danblack , any news on this? I see that 125 is ECANCELED in /usr/include/asm-generic/errno.h . mleich got this on 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux on an ext4 file system that was loopback mounted in a file that resided in /dev/shm . (Such indirection makes sense when using the older libaio ; it does not support files residing directly on /dev/shm .)

            Yes, "doing fork/exit with IO inflight" seems to apply here:

            # 2023-05-17T11:03:59 [216405] | 2023-05-17 11:03:55 0 [Note] /data/Server_bin/bb-10.6-MDEV-31234_RelWithDebInfo/bin/mysqld (initiated by: root[root] @ localhost [127.0.0.1]): Normal shutdown
            # 2023-05-17T11:03:59 [216405] | 2023-05-17 11:03:55 0 [Note] InnoDB: FTS optimize thread exiting.
            # 2023-05-17T11:03:59 [216405] | 2023-05-17 11:03:55 0 [ERROR] [FATAL] InnoDB: IO Error: 125 during async lru write of 4096 bytes, for file 9, returned 0
            

            danblack, do you have a patch that makes it easier to hit this bug?

            marko Marko Mäkelä added a comment - Yes, "doing fork/exit with IO inflight" seems to apply here: # 2023-05-17T11:03:59 [216405] | 2023-05-17 11:03:55 0 [Note] /data/Server_bin/bb-10.6-MDEV-31234_RelWithDebInfo/bin/mysqld (initiated by: root[root] @ localhost [127.0.0.1]): Normal shutdown # 2023-05-17T11:03:59 [216405] | 2023-05-17 11:03:55 0 [Note] InnoDB: FTS optimize thread exiting. # 2023-05-17T11:03:59 [216405] | 2023-05-17 11:03:55 0 [ERROR] [FATAL] InnoDB: IO Error: 125 during async lru write of 4096 bytes, for file 9, returned 0 danblack , do you have a patch that makes it easier to hit this bug?

            In MDEV-27593 I changed the wording of the message and made it a non-fatal error. Actually, I accidentally made it Note instead of ERROR, so this should not cause any mtr test failures anymore:

            diff --git a/storage/innobase/os/os0file.cc b/storage/innobase/os/os0file.cc
            index 906ce79e8fd..99f212b6eeb 100644
            --- a/storage/innobase/os/os0file.cc
            +++ b/storage/innobase/os/os0file.cc
            @@ -3459,8 +3459,11 @@ static void write_io_callback(void *c)
               const IORequest &request= *static_cast<const IORequest*>
                 (static_cast<const void*>(cb->m_userdata));
             
            +  if (!ut_rnd_interval(100))
            +    cb->m_err= 42;
            +
               if (UNIV_UNLIKELY(cb->m_err != 0))
            -    ib::info () << "IO Error: " << cb->m_err
            +    ib::error() << "IO Error: " << cb->m_err
                             << "during write of "
                             << cb->m_len << " bytes, for file "
                             << request.node->name << "(" << cb->m_fh << "), returned "
            

            10.6 72928e640e62559f4d4b82e9ff1102bb72bd5828 with patch

            var/9/log/mysqld.1.err:2023-08-01 15:27:09 0 [Note] InnoDB: IO Error: 42during write of 16384 bytes, for file ./ibdata1(11), returned 16384
            

            marko Marko Mäkelä added a comment - In MDEV-27593 I changed the wording of the message and made it a non-fatal error. Actually, I accidentally made it Note instead of ERROR , so this should not cause any mtr test failures anymore: diff --git a/storage/innobase/os/os0file.cc b/storage/innobase/os/os0file.cc index 906ce79e8fd..99f212b6eeb 100644 --- a/storage/innobase/os/os0file.cc +++ b/storage/innobase/os/os0file.cc @@ -3459,8 +3459,11 @@ static void write_io_callback(void *c) const IORequest &request= *static_cast<const IORequest*> (static_cast<const void*>(cb->m_userdata)); + if (!ut_rnd_interval(100)) + cb->m_err= 42; + if (UNIV_UNLIKELY(cb->m_err != 0)) - ib::info () << "IO Error: " << cb->m_err + ib::error() << "IO Error: " << cb->m_err << "during write of " << cb->m_len << " bytes, for file " << request.node->name << "(" << cb->m_fh << "), returned " 10.6 72928e640e62559f4d4b82e9ff1102bb72bd5828 with patch var/9/log/mysqld.1.err:2023-08-01 15:27:09 0 [Note] InnoDB: IO Error: 42during write of 16384 bytes, for file ./ibdata1(11), returned 16384

            I got some logs of failures that occurred after MDEV-27593 was pushed into 10.6 (2003-08-01). I found only one failure that includes that fix:

            bb-11.0-ycp-mdev-31996 d07012f4d84e57f1895f848b029fabd220b5d5be ppc64le-ubuntu-2204

            encryption.innodb-checksum-algorithm '32k,ctr,innodb,strict_crc32' w8 [ fail ]
            …
            2023-10-06  3:37:06 4 [Note] InnoDB: Importing tablespace for table 'test/tp' that was exported from host 'c8473bd36d6b'
            2023-10-06  3:37:06 4 [Note] InnoDB: Phase I - Update all pages
            2023-10-06  3:37:06 4 [Note] InnoDB: Sync to disk
            2023-10-06  3:37:06 4 [Note] InnoDB: Sync to disk - done!
            2023-10-06  3:37:06 4 [Note] InnoDB: Phase III - Flush changes to disk
            2023-10-06  3:37:06 4 [Note] InnoDB: Phase IV - Flush complete
            2023-10-06  3:37:06 4 [Note] InnoDB: `test`.`tp` autoinc value set to 2
            2023-10-06  3:37:07 0 [Note] InnoDB: IO Error: 125during write of 491520 bytes, for file ./ibdata1(11), returned 0
            ----------SERVER LOG END-------------
            mysqltest failed but provided no output
            

            The server apparently hung here. The reason could be unrelated to the io_uring error, such as MDEV-32588.

            marko Marko Mäkelä added a comment - I got some logs of failures that occurred after MDEV-27593 was pushed into 10.6 (2003-08-01). I found only one failure that includes that fix: bb-11.0-ycp-mdev-31996 d07012f4d84e57f1895f848b029fabd220b5d5be ppc64le-ubuntu-2204 encryption.innodb-checksum-algorithm '32k,ctr,innodb,strict_crc32' w8 [ fail ] … 2023-10-06 3:37:06 4 [Note] InnoDB: Importing tablespace for table 'test/tp' that was exported from host 'c8473bd36d6b' 2023-10-06 3:37:06 4 [Note] InnoDB: Phase I - Update all pages 2023-10-06 3:37:06 4 [Note] InnoDB: Sync to disk 2023-10-06 3:37:06 4 [Note] InnoDB: Sync to disk - done! 2023-10-06 3:37:06 4 [Note] InnoDB: Phase III - Flush changes to disk 2023-10-06 3:37:06 4 [Note] InnoDB: Phase IV - Flush complete 2023-10-06 3:37:06 4 [Note] InnoDB: `test`.`tp` autoinc value set to 2 2023-10-06 3:37:07 0 [Note] InnoDB: IO Error: 125during write of 491520 bytes, for file ./ibdata1(11), returned 0 ----------SERVER LOG END------------- mysqltest failed but provided no output The server apparently hung here. The reason could be unrelated to the io_uring error, such as MDEV-32588 .

            A Linux kernel bug that was behind MDEV-30728 is specific to the POWER ISA. Therefore, this issue (errors on io_uring write completion callback, observed on ARMv8 and AMD64) should be unaffected by that fix.

            marko Marko Mäkelä added a comment - A Linux kernel bug that was behind MDEV-30728 is specific to the POWER ISA. Therefore, this issue (errors on io_uring write completion callback, observed on ARMv8 and AMD64) should be unaffected by that fix.

            People

              Unassigned Unassigned
              danblack Daniel Black
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.