  1. MariaDB Server
  2. MDEV-17482

InnoDB fails to say which fatal error fsync() returned

Details

    Description

      Hi,
      Today one node of our Galera cluster crashed.
      I don't know why it happened.
      The node was first partitioned from the cluster and crashed some time later (28 minutes).

      I think it looks like a bug in the InnoDB engine.
      I have attached the error log.

      Thank you.

      Attachments

          Activity

            jplindst Jan Lindström (Inactive) added a comment -

            fsync() failed, but the log does not contain an error code, so it is not clear why. Did you check that the file system had enough free space? Have you seen this crash more than once, or is this the only time?
            Seonghwan Kim Seonghwan Kim added a comment -

            I think there was enough disk space on that node at that time.
            As far as I know, this is the only time the crash has happened.

            marko Marko Mäkelä added a comment -

            The Linux manual page for fsync(2) reports the following possible errors, which could cause the intentional crash in InnoDB:

            EBADF
            fd is not a valid open file descriptor.
            ENOSPC
            Disk space was exhausted while synchronizing.
            EROFS, EINVAL
            fd is bound to a special file (e.g., a pipe, FIFO, or socket) which does not support synchronization.
            ENOSPC, EDQUOT
            fd is bound to a file on NFS or another filesystem which does not allocate space at the time of a write(2) system call, and some previous write failed due to insufficient storage space.

            Since MariaDB Server 10.2.17, EIO will also result in an intentional crash.

            Theoretically, once MDEV-13542 has finally been fixed, we might handle fsync(), write, or allocation failures on user data files a little more gracefully: flag the affected index or table as corrupted, and mark the table read-only.

            If a write to the redo log fails, I think that the only reasonable options are to kill the server (like we do now) or to make all InnoDB tables read-only.

            For now, I will only change the diagnostics so that before killing the server, InnoDB will say which error code was returned by fsync().
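To illustrate the kind of diagnostic described in the comment above, here is a minimal sketch in C. It is not the actual InnoDB change: the function names `checked_fsync`, `fsync_errno_name`, and `format_fsync_error` are hypothetical and the message format is an assumption. It only shows how the errno value set by fsync() could be captured and named before the caller decides to abort.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not InnoDB code): call fsync() and return 0 on
   success, or the errno value that fsync() set on failure. */
int checked_fsync(int fd)
{
    if (fsync(fd) == 0) {
        return 0;
    }
    return errno; /* read immediately, before any other call can change it */
}

/* Map the error codes discussed in this issue to their symbolic names. */
const char *fsync_errno_name(int err)
{
    switch (err) {
    case EBADF:  return "EBADF";
    case EIO:    return "EIO";
    case ENOSPC: return "ENOSPC";
    case EROFS:  return "EROFS";
    case EINVAL: return "EINVAL";
    case EDQUOT: return "EDQUOT";
    default:     return "unknown";
    }
}

/* Build a message that states which error fsync() returned.
   The wording is illustrative, not the real server log format. */
int format_fsync_error(int err, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "fsync() returned %d (%s: %s)",
                    err, fsync_errno_name(err), strerror(err));
}
```

A caller would invoke `checked_fsync(fd)`, and on a nonzero result pass it to `format_fsync_error()` and log the message before terminating, so that the log distinguishes, for example, ENOSPC from EIO.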

            People

              marko Marko Mäkelä
              Seonghwan Kim Seonghwan Kim