Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33203

storage/innobase/os/os0file.cc doesn't compile on Solaris

Details

    Description

      To provide patches for the Solaris build issues I've reported before against MariaDB 10.11.6, I've tried a build on 11.4 instead. Unfortunately, there's a new build failure there:
      ```
      /vol/src/mariadb/mariadb-server/storage/innobase/os/os0file.cc: In function ‘pfs_os_file_t os_file_create_func(const char*, ulint, ulint, ulint, bool, bool*)’:
      /vol/src/mariadb/mariadb-server/storage/innobase/os/os0file.cc:1233:49: error: ‘mode_str’ was not declared in this scope; did you mean ‘mode_t’?
      1233 | os_file_set_nocache(file, name, mode_str);

      ^~~~~~~~
      mode_t
      ```
      While all other uses of `mode_str` are wrapped in `#ifdef O_DIRECT`, this one is not, which breaks the Solaris build which lacks `O_DIRECT`.

      Fixed by wrapping that case in `#ifdef O_DIRECT`, too.

      Attachments

        Issue Links

          Activity

            rorth, thank you. What you say makes complete sense. (By the way, Jira uses its own proprietary markup language, not Markdown.)

            I do not have much experience with copy-on-write file systems, or in fact anything else than journal based ones (Linux ext4 and Microsoft NTFS). I understood that ZFS is also compressing data on the fly. That kind of stuff can actually only work if the data is being copied between user and kernel address space, that is, there is no innodb_flush_method=O_DIRECT or innodb_log_file_buffering=OFF or innodb_data_file_buffering=OFF.

            Are you suggesting that we should simply remove the directio() code on Solaris and (among others) treat innodb_flush_method=O_DIRECT as if innodb_flush_method=fsync had been specified? If I remember correctly, the directio() change was added by one colleague and not double-checked by anyone else. It looks like the documentation was misunderstood already back then.

            marko Marko Mäkelä added a comment - rorth , thank you. What you say makes complete sense. (By the way, Jira uses its own proprietary markup language, not Markdown.) I do not have much experience with copy-on-write file systems, or in fact anything else than journal based ones (Linux ext4 and Microsoft NTFS). I understood that ZFS is also compressing data on the fly. That kind of stuff can actually only work if the data is being copied between user and kernel address space, that is, there is no innodb_flush_method=O_DIRECT or innodb_log_file_buffering=OFF or innodb_data_file_buffering=OFF . Are you suggesting that we should simply remove the directio() code on Solaris and (among others) treat innodb_flush_method=O_DIRECT as if innodb_flush_method=fsync had been specified? If I remember correctly, the directio() change was added by one colleague and not double-checked by anyone else. It looks like the documentation was misunderstood already back then.

            I just came across https://blogs.oracle.com/solaris/post/zfs-and-directio from 2006. If ZFS never overwrites live data on Solaris, I guess that innodb_doublewrite=OFF would be a safe default. Not only on a copy-on-write file system, but also on a journaling file system, the doublewrite buffer should be redundant, in theory. In practice, implementation flaws may exist: https://www.percona.com/blog/2015/06/17/update-on-the-innodb-double-write-buffer-and-ext4-transactions/

            With OpenZFS on Linux I would have some doubts regarding the atomicity of writes. I don’t know if the userspace to kernel space data copying has been fixed in Linux meanwhile. In the past, I observed that any writes (not only to the file system, but also to an RS-232 serial line) may be truncated at multiples of 4096 bytes when merely the process is being killed.

            I’ve heard that on Linux you may safely disable the doublewrite buffer when using O_DIRECT on the ext4 file system and an SSD that supports atomic writes of innodb_page_size, provided that the file system allocation block size as well as the alignment of the partition are compatible with innodb_page_size.

            marko Marko Mäkelä added a comment - I just came across https://blogs.oracle.com/solaris/post/zfs-and-directio from 2006. If ZFS never overwrites live data on Solaris, I guess that innodb_doublewrite=OFF would be a safe default. Not only on a copy-on-write file system, but also on a journaling file system, the doublewrite buffer should be redundant, in theory. In practice, implementation flaws may exist: https://www.percona.com/blog/2015/06/17/update-on-the-innodb-double-write-buffer-and-ext4-transactions/ With OpenZFS on Linux I would have some doubts regarding the atomicity of writes. I don’t know if the userspace to kernel space data copying has been fixed in Linux meanwhile. In the past, I observed that any writes (not only to the file system, but also to an RS-232 serial line) may be truncated at multiples of 4096 bytes when merely the process is being killed. I’ve heard that on Linux you may safely disable the doublewrite buffer when using O_DIRECT on the ext4 file system and an SSD that supports atomic writes of innodb_page_size , provided that the file system allocation block size as well as the alignment of the partition are compatible with innodb_page_size .
            rorth Rainer Orth added a comment -

            Just for correctness: ZFS *can* compress data on the fly, but that's off by default. It can be enabled on a per-dataset/per-filesystem basis as need be, though.

            As for removing directio(3C) code: while I cannot say for certain if there are any users of MariaDB on NFS storage (where it could benefit), I'm quite certain that UFS isn't used with current versions of MariaDB these days. So unless NFS storage support is deemed worthwhile, I'd say that removing directio() support would just simplify the codebase for no significant loss in functionality.

            Instead of treating innodb_flush_method=O_DIRECT, I'd argue that mariadbd should error out if this is selected explicitly on targets that cannot support it. In our case, we hadn't set the variable at all and got tons of warnings. In this case, I believe the daemon should default to innodb_flush_method=fsync instead, which we had to do explicitly to get rid of the warnings.

            rorth Rainer Orth added a comment - Just for correctness: ZFS * can * compress data on the fly, but that's off by default. It can be enabled on a per-dataset/per-filesystem basis as need be, though. As for removing directio(3C) code: while I cannot say for certain if there are any users of MariaDB on NFS storage (where it could benefit), I'm quite certain that UFS isn't used with current versions of MariaDB these days. So unless NFS storage support is deemed worthwhile, I'd say that removing directio() support would just simplify the codebase for no significant loss in functionality. Instead of treating innodb_flush_method=O_DIRECT , I'd argue that mariadbd should error out if this is selected explicitly on targets that cannot support it. In our case, we hadn't set the variable at all and got tons of warnings. In this case, I believe the daemon should default to innodb_flush_method=fsync instead, which we had to do explicitly to get rid of the warnings.

            Before innodb_flush_method was changed to an enumeration in MDEV-12218, it was a string parameter. The validation before and after that change accepted all values, no matter if they were actually implemented on the current platform. Only for Microsoft Windows, the additional values normal, unbuffered, and async_unbuffered had been allowed.

            I did implement a prototype that would restrict the choice of allowed values, but in the end I would avoid that. We have to keep in mind that the parameter innodb_flush_method was deprecated in MDEV-30136. It could make sense to conditionally define the Boolean parameters innodb_log_file_buffering and innodb_data_file_buffering, or to issue errors or warnings when they have no effect.

            marko Marko Mäkelä added a comment - Before innodb_flush_method was changed to an enumeration in MDEV-12218 , it was a string parameter. The validation before and after that change accepted all values, no matter if they were actually implemented on the current platform. Only for Microsoft Windows, the additional values normal , unbuffered , and async_unbuffered had been allowed. I did implement a prototype that would restrict the choice of allowed values, but in the end I would avoid that. We have to keep in mind that the parameter innodb_flush_method was deprecated in MDEV-30136 . It could make sense to conditionally define the Boolean parameters innodb_log_file_buffering and innodb_data_file_buffering , or to issue errors or warnings when they have no effect.

            I think that this was fixed by MDEV-33095, by removing any support for O_DIRECT on Solaris.

            marko Marko Mäkelä added a comment - I think that this was fixed by MDEV-33095 , by removing any support for O_DIRECT on Solaris.

            People

              marko Marko Mäkelä
              rorth Rainer Orth
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.