[MDEV-33203] storage/innobase/os/os0file.cc doesn't compile on Solaris Created: 2024-01-09 Updated: 2024-01-19 Resolved: 2024-01-19 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 11.0, 11.1, 11.2, 11.3, 11.4 |
| Fix Version/s: | 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3, 11.3.2, 11.4.1 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Rainer Orth | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | regression | ||
| Environment: |
Solaris 11.4/amd64 |
||
| Attachments: |
|
||||||||||||||||
| Issue Links: |
|
||||||||||||||||
| Description |
|
To provide patches for the Solaris build issues I've reported before against MariaDB 10.11.6, I've tried a build on 11.4 instead. Unfortunately, there's a new build failure there:
Fixed by wrapping that case in `#ifdef O_DIRECT`, too. |
| Comments |
| Comment by Marko Mäkelä [ 2024-01-09 ] | |
|
The function os_file_set_nocache() should be defined on all other platforms than Microsoft Windows. Solaris at least used to define a preprocessor symbol DIRECTIO_ON and a system call directio(), which the implementation of that function should be using. Could you please follow the code contribution instructions and file a pull request? | |
| Comment by Rainer Orth [ 2024-01-09 ] | |
|
My current intent is just to unbreak the build, not implement directio on Solaris. You mean a pull request for the current patch? So far, Sergei always told me to just attach patches to the issue. | |
| Comment by Marko Mäkelä [ 2024-01-09 ] | |
|
Originally, Solaris directio() was implemented for InnoDB in MySQL 5.1.18 if not earlier. Unfortunately, we no longer have any Solaris or its derivatives on the CI systems, so some breakage is inevitable. I think that the correct preprocessor directive around the local variable mode_str would be the following:
As far as I understand, this should have been broken in | |
| Comment by Rainer Orth [ 2024-01-10 ] | |
|
For one, please don't mix the build-breaking use of undefined `mode_str` with direct I/O on Solaris. If we really want to tackle the latter, this should be a separate issue. That said, the Solaris code would have to be completely different: while `O_DIRECT` is an `open(2)` flag, Solaris `DIRECTIO_ON` etc. are options to a completely different `directio(3)` function. More importantly, the `directio(3C)` manpage states: I doubt many people a using MariaDB with storage on NFS these days, and UFS is considered completely obsolete with the advent of ZFS. In fact, it's more than 15 years that I've last used UFS on Solaris, so any effort to implement `directio` support would be almost completely wasted. | |
| Comment by Marko Mäkelä [ 2024-01-10 ] | |
|
rorth, thank you. What you say makes complete sense. (By the way, Jira uses its own proprietary markup language, not Markdown.) I do not have much experience with copy-on-write file systems, or in fact anything else than journal based ones (Linux ext4 and Microsoft NTFS). I understood that ZFS is also compressing data on the fly. That kind of stuff can actually only work if the data is being copied between user and kernel address space, that is, there is no innodb_flush_method=O_DIRECT or innodb_log_file_buffering=OFF or innodb_data_file_buffering=OFF. Are you suggesting that we should simply remove the directio() code on Solaris and (among others) treat innodb_flush_method=O_DIRECT as if innodb_flush_method=fsync had been specified? If I remember correctly, the directio() change was added by one colleague and not double-checked by anyone else. It looks like the documentation was misunderstood already back then. | |
| Comment by Marko Mäkelä [ 2024-01-12 ] | |
|
I just came across https://blogs.oracle.com/solaris/post/zfs-and-directio from 2006. If ZFS never overwrites live data on Solaris, I guess that innodb_doublewrite=OFF would be a safe default. Not only on a copy-on-write file system, but also on a journaling file system, the doublewrite buffer should be redundant, in theory. In practice, implementation flaws may exist: https://www.percona.com/blog/2015/06/17/update-on-the-innodb-double-write-buffer-and-ext4-transactions/ With OpenZFS on Linux I would have some doubts regarding the atomicity of writes. I don’t know if the userspace to kernel space data copying has been fixed in Linux meanwhile. In the past, I observed that any writes (not only to the file system, but also to an RS-232 serial line) may be truncated at multiples of 4096 bytes when merely the process is being killed. I’ve heard that on Linux you may safely disable the doublewrite buffer when using O_DIRECT on the ext4 file system and an SSD that supports atomic writes of innodb_page_size, provided that the file system allocation block size as well as the alignment of the partition are compatible with innodb_page_size. | |
| Comment by Rainer Orth [ 2024-01-15 ] | |
|
Just for correctness: ZFS *can* compress data on the fly, but that's off by default. It can be enabled on a per-dataset/per-filesystem basis as need be, though. As for removing directio(3C) code: while I cannot say for certain if there are any users of MariaDB on NFS storage (where it could benefit), I'm quite certain that UFS isn't used with current versions of MariaDB these days. So unless NFS storage support is deemed worthwhile, I'd say that removing directio() support would just simplify the codebase for no significant loss in functionality. Instead of treating innodb_flush_method=O_DIRECT, I'd argue that mariadbd should error out if this is selected explicitly on targets that cannot support it. In our case, we hadn't set the variable at all and got tons of warnings. In this case, I believe the daemon should default to innodb_flush_method=fsync instead, which we had to do explicitly to get rid of the warnings. | |
| Comment by Marko Mäkelä [ 2024-01-16 ] | |
|
Before innodb_flush_method was changed to an enumeration in I did implement a prototype that would restrict the choice of allowed values, but in the end I would avoid that. We have to keep in mind that the parameter innodb_flush_method was deprecated in | |
| Comment by Marko Mäkelä [ 2024-01-19 ] | |
|
I think that this was fixed by |