  1. MariaDB Server
  2. MDEV-35860

use O_TMPFILE for (implicit) temporary Aria/MyISAM tables

Details

    Description

      The current usage of a temporary table involves the following system calls that explicitly reference the filename.

      843157 openat(AT_FDCWD, "/tmp/#sql-temptable-cdd6a-3-0.MAI", O_RDWR|O_CREAT|O_TRUNC|O_NOFOLLOW|O_CLOEXEC, 0660) = 49
      843157 openat(AT_FDCWD, "/tmp/#sql-temptable-cdd6a-3-0.MAD", O_RDWR|O_CREAT|O_TRUNC|O_NOFOLLOW|O_CLOEXEC, 0660) = 50
      843157 readlink("/tmp/#sql-temptable-cdd6a-3-0.MAI", 0x7fd25c0b7400, 1023) = -1 EINVAL (Invalid argument)
      843157 newfstatat(AT_FDCWD, "/tmp/#sql-temptable-cdd6a-3-0.MAI", {st_mode=S_IFREG|0660, st_size=8192, ...}, AT_SYMLINK_NOFOLLOW) = 0
      843157 openat(49, "#sql-temptable-cdd6a-3-0.MAI", O_RDWR|O_NOFOLLOW|O_CLOEXEC) = 50
      843157 newfstatat(AT_FDCWD, "/tmp/#sql-temptable-cdd6a-3-0.MAD", {st_mode=S_IFREG|0660, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
      843157 openat(AT_FDCWD, "/tmp/#sql-temptable-cdd6a-3-0.MAD", O_RDWR|O_CLOEXEC) = 49
      843157 newfstatat(AT_FDCWD, "/tmp/#sql-temptable-cdd6a-3-0.MAI", {st_mode=S_IFREG|0660, st_size=8192, ...}, AT_SYMLINK_NOFOLLOW) = 0
      843157 unlink("/tmp/#sql-temptable-cdd6a-3-0.MAI") = 0
      843157 newfstatat(AT_FDCWD, "/tmp/#sql-temptable-cdd6a-3-0.MAD", {st_mode=S_IFREG|0660, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
      843157 unlink("/tmp/#sql-temptable-cdd6a-3-0.MAD") = 0
      

      As seen in MDEV-34577, there are inefficiencies in overlayfs (used by container runtimes such as Docker's Moby and Podman) that result in poor performance.

      Overall, eleven system calls by filename cause overhead and lock contention inside the kernel, on top of the context-switch cost of each of those system calls.

      With O_TMPFILE, as implemented in MDEV-15584, there is no need to maintain on-disk filenames for temporary tables.
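
      A minimal sketch of the idea, not the MDEV-15584 code: the helper name open_anonymous_tmp, the "tmp-XXXXXX" template and the mkostemp/unlink fallback are assumptions made for illustration. It shows a single open call producing a file that never has a directory entry.

```c
#define _GNU_SOURCE            /* exposes O_TMPFILE and mkostemp on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns an fd for a nameless file in `dir`, or -1. */
static int open_anonymous_tmp(const char *dir)
{
    assert(dir != NULL);

    /* One system call; the file is never linked into the directory. */
    int fd = open(dir, O_TMPFILE | O_RDWR | O_CLOEXEC, 0660);
    if (fd >= 0)
        return fd;

    /* Fall back only when the kernel or filesystem lacks O_TMPFILE. */
    if (errno != EOPNOTSUPP && errno != EISDIR && errno != ENOENT)
        return -1;

    char path[PATH_MAX];
    int n = snprintf(path, sizeof path, "%s/tmp-XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;

    fd = mkostemp(path, O_CLOEXEC);
    if (fd >= 0)
        unlink(path);          /* drop the name immediately */
    return fd;
}
```

      Calling open_anonymous_tmp("/tmp") and then fstat on the result should report a link count of zero, so nothing is left to unlink when the descriptor is closed.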

      As seen in MDEV-17420, when errors occur there are a number of temporary files to clean up.

      At the SQL layer, this will result in combining create_internal_tmp_table and open_tmp_table; one always follows the other, so this is easy.

      maria_create (mi_create) would need to be adjusted to incorporate the state set up by ha_maria::open, resulting in a table that is opened only once. Currently maria_create leaves the newly created files closed on storage.

      With O_TMPFILE passed as a flag to open, there should be two open calls instead of the previous eleven system calls, with the additional performance benefit that ha_maria::open won't need to revalidate everything that maria_create already did.
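
      The shape of such a combined create-and-open step might look like the sketch below. The struct tmp_table_files and the functions tmp_table_create_open and tmp_table_close are hypothetical stand-ins, not Aria or MyISAM APIs; the sketch only illustrates that the index and data files can each be obtained with one open call and kept open, with no fallback for filesystems that lack O_TMPFILE.

```c
#define _GNU_SOURCE            /* exposes O_TMPFILE on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for an opened temporary table:
   one descriptor for the index file, one for the data file. */
struct tmp_table_files {
    int index_fd;
    int data_fd;
};

/* Create and open both files in exactly two open() calls.
   Returns 0 on success, -1 on failure with errno preserved. */
static int tmp_table_create_open(const char *tmpdir, struct tmp_table_files *t)
{
    assert(tmpdir != NULL && t != NULL);
    const int flags = O_TMPFILE | O_RDWR | O_CLOEXEC;

    t->index_fd = t->data_fd = -1;

    t->index_fd = open(tmpdir, flags, 0660);
    if (t->index_fd < 0)
        return -1;

    t->data_fd = open(tmpdir, flags, 0660);
    if (t->data_fd < 0) {
        int saved = errno;     /* close() may overwrite errno */
        close(t->index_fd);
        t->index_fd = -1;
        errno = saved;
        return -1;
    }
    return 0;
}

/* Closing the descriptors is the only cleanup; there are no names to unlink. */
static void tmp_table_close(struct tmp_table_files *t)
{
    if (t->data_fd >= 0)
        close(t->data_fd);
    if (t->index_fd >= 0)
        close(t->index_fd);
    t->index_fd = t->data_fd = -1;
}
```

      Because the descriptors returned by the create step are the ones used afterwards, there is no close-then-reopen sequence and no stat or unlink by name.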

      An additional benefit is that if something crashes, no temporary files will be left lingering on the filesystem.

            People

              Unassigned Unassigned
              danblack Daniel Black