[MDEV-33095] innodb_flush_method=O_DIRECT creates excessive errors on Solaris Created: 2023-12-20  Updated: 2024-02-02  Resolved: 2024-01-19

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0, 11.1, 11.2, 11.3, 11.4, 10.11.6
Fix Version/s: 10.6.17, 10.11.7, 11.0.5, 11.1.4, 11.2.3, 11.3.2, 11.4.1

Type: Bug
Priority: Critical
Reporter: Rainer Orth
Assignee: Marko Mäkelä
Resolution: Fixed
Votes: 0
Labels: portability, regression
Environment: Solaris 11.4/x86

Issue Links:
Blocks
blocks MDEV-33203 storage/innobase/os/os0file.cc doesn'... Closed
Problem/Incident
is caused by MDEV-24854 Change innodb_flush_method=O_DIRECT b... Closed
Relates
relates to MDEV-30136 Map innodb_flush_method to new settab... Closed

 Description   

After upgrading MariaDB from 10.3.9 to 10.11.6, we got an excessive number of errors in the server logs:
```
2023-12-20 11:25:48 0 [ERROR] InnoDB: Failed to set DIRECTIO_ON on file ./mysql/gtid_slave_pos.ibd; OPEN: Inappropriate ioctl for device, continuing anyway.
```
It seems the general change of the default for `innodb_flush_method` from `fsync` to `O_DIRECT` on every platform is ill-advised: `O_DIRECT` is *not* portable across Unix platforms. It is a Linux extension that some non-Linux targets have adopted, and it is certainly not POSIX.

While I can manually restore the default to `fsync` to avoid this flood of errors, `mysqld` should only change the default to `O_DIRECT` on targets where it's supported and known to work.
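
One way to read the request above, as a minimal sketch and not MariaDB's actual code: try `O_DIRECT` where the header defines it, and fall back to buffered I/O when the file system rejects it. The helper name `open_data_file` and the warn-once behaviour are assumptions made for illustration. `EINVAL` is the error Linux documents for a file system without `O_DIRECT` support; other platforms may report something different.

```cpp
// Sketch only: open_data_file() is a hypothetical helper, not MariaDB code.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // glibc exposes O_DIRECT only when this is defined
#endif
#include <fcntl.h>
#include <unistd.h>
#include <atomic>
#include <cassert>
#include <cerrno>
#include <cstdio>

// Open an existing file for read/write. Prefer unbuffered I/O, but fall
// back to ordinary buffered I/O (the fsync-style path) when O_DIRECT is
// not defined or the file system rejects it.
// *used_direct reports which mode was actually obtained.
// Returns a file descriptor, or -1 with errno set.
int open_data_file(const char *path, bool *used_direct)
{
  assert(path != nullptr && used_direct != nullptr);
  *used_direct = false;
#ifdef O_DIRECT
  int fd = open(path, O_RDWR | O_DIRECT);
  if (fd >= 0) {
    *used_direct = true;
    return fd;
  }
  if (errno != EINVAL)
    return -1;  // unrelated failure (ENOENT, EACCES, ...): report it as-is

  // Assumption: EINVAL means "O_DIRECT unsupported here" (Linux semantics).
  // Warn once per process instead of once per file.
  static std::atomic<bool> warned{false};
  if (!warned.exchange(true))
    std::fprintf(stderr,
                 "O_DIRECT not supported for %s; using buffered I/O\n", path);
#endif
  return open(path, O_RDWR);
}
```

A caller would check the returned descriptor as usual and can consult `used_direct` to decide whether an explicit `fsync()` is still needed after writes.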



 Comments   
Comment by Marko Mäkelä [ 2024-01-04 ]

O_DIRECT was the name of one value of the configuration parameter innodb_flush_method. Yes, the name was somewhat misleading. The parameter has since been deprecated and mapped to 4 new settable Boolean parameters in MDEV-30136.

Comment by Marko Mäkelä [ 2024-01-04 ]

Because we no longer have any Solaris derivatives in our CI systems, it is difficult for me to do anything about this bug. I am happy to review any code contribution.

While we are at it, I wonder if a Solaris variant of MDEV-26476 could be implemented.

Comment by Marko Mäkelä [ 2024-01-12 ]

rorth, based on your comment in MDEV-33203, I would assume that the correct course of action would be to simply remove support for innodb_flush_method=O_DIRECT on Solaris, as well as support for the related parameters that were implemented in MDEV-30136.

Can you confirm that this is fine with you?

Comment by Marko Mäkelä [ 2024-01-12 ]

rorth, please check if https://github.com/MariaDB/server/pull/3002 works for you. I added a new cmakedefine for checking if O_DIRECT is usable. Solaris defines both fcntl(2) and O_DIRECT, but they cannot be used together. Therefore, I added a tweak to cmake/os/SunOS.cmake to disable this.

Comment by Marko Mäkelä [ 2024-01-15 ]

I found out that we were invoking fcntl() on IBM AIX with an undocumented argument O_DIRECT. I implemented a similar way of disabling O_DIRECT on both AIX and Solaris. I think that this is best reviewed by danblack, who set up IBM AIX on our CI.
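
The compile-time approach described in the last two comments might look roughly like this. It is an illustration only: the macro `SKETCH_O_DIRECT_USABLE` and the function `sketch_disable_caching` are hypothetical names, the real check lives in the CMake files touched by the pull request, and the preprocessor test on `__sun` and `_AIX` merely stands in for it.

```cpp
// Sketch only: names below are hypothetical, not taken from the MariaDB source.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // glibc exposes O_DIRECT only when this is defined
#endif
#include <fcntl.h>
#include <unistd.h>
#include <cassert>

// Stand-in for a build-system check: treat O_DIRECT as usable only where
// the header defines it AND the platform is not Solaris or AIX, where
// (per this thread) fcntl() and O_DIRECT cannot be combined.
#if defined(O_DIRECT) && !defined(__sun) && !defined(_AIX)
# define SKETCH_O_DIRECT_USABLE 1
#else
# define SKETCH_O_DIRECT_USABLE 0
#endif

// Try to switch an already open descriptor to unbuffered I/O.
// Returns true if O_DIRECT was set, false if the platform was excluded
// at compile time or the fcntl() call failed.
bool sketch_disable_caching(int fd)
{
  assert(fd >= 0);
#if SKETCH_O_DIRECT_USABLE
  int flags = fcntl(fd, F_GETFL);
  if (flags == -1)
    return false;
  return fcntl(fd, F_SETFL, flags | O_DIRECT) != -1;
#else
  (void) fd;  // nothing to attempt on excluded platforms
  return false;
#endif
}
```

With a guard of this kind, excluded platforms never attempt the call, so there is no per-file error to log.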
