digdilem, thank you. In mycnf.txt I see that your innodb_log_file_size is only half a gigabyte, which is a fraction of the innodb_buffer_pool_size. That will force very frequent log checkpoints, which in turn will cause stalls. That could actually be the root cause of your messages, if those messages always say "0 pending operations and pending fsync". I do not think that there is any need to execute fsync() or fdatasync() outside log checkpoints. Starting with 10.5, thanks to MDEV-19176, recovery should work fine even if you make the redo log as big as the buffer pool, or possibly even larger.
Another factor is that if innodb_open_files is not specified, open_files_limit will be used instead. That may force frequent closing of data files. In mycnf.txt I see open_files_limit=65535, which may or may not be reasonable. But the innodb_log_file_size is definitely way too small.
Internally, we have reproduced something similar, but only when using system-versioned tables and FULLTEXT INDEX. Theoretically, there could be a bug in some system versioning code that forgets to commit a mini-transaction (which would release a page latch). But I think that such a bug should cause more trouble than just these messages. At the very least, it should prevent log checkpoints and cause log file overruns, for which we have a separate message.
Some of the logs from our internal testing show a nonzero "pending operations" count, but we also have logs that exclusively show "0 pending operations and pending fsync". I see that within a second, we can issue the message for several files. For us, they are the numerous internal tables for FULLTEXT INDEX for a single user table. In one log that I checked, all file names would start with the same prefix that includes the numeric main table identifier, such as test/FTS_000000000000125b_. Those messages were almost certainly issued by the LRU logic that is attempting to enforce the ‘open files’ limit in InnoDB.
We should definitely rate-limit that output somehow, even if this run is using a ‘misconfigured’ system. If all files are located in the same journaled file system, all files should be ‘in the same boat’, waiting for the fsync() or fdatasync() to complete.
We encountered this bug today on a busy production server, and researching it brought us here.
MariaDB server version: 10.5.11 (from MariaDB's CentOS 7 repo)
OS: CentOS 7
2021-08-09 22:49:11 5443978 [Note] InnoDB: Cannot close file ./DB0/Tab1.ibd because of 20 pending operations and pending fsync
2021-08-03 13:05:10 145279 [Note] InnoDB: Cannot close file ./DB1/Tab1#P#per.ibd because of 0 pending operations and pending fsync
2021-08-03 13:05:10 145911 [Note] InnoDB: Cannot close file ./DB1/Tab1#P#per.ibd because of 0 pending operations and pending fsync
2021-08-03 13:05:10 145515 [Note] InnoDB: Cannot close file ./DB1/Tab1#P#per.ibd because of 0 pending operations and pending fsync
Variations of this, affecting multiple databases and tables.
Total number of similar log entries is 200149 during this period.
Restarting MariaDB appears to have (temporarily) resolved the issue.
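
For anyone who wants to check whether their messages always report "0 pending operations", as asked earlier in the thread, the error log can be tallied with a short script. This is only a minimal sketch: the regular expression is derived from the log lines quoted above, the function name tally is made up for illustration, and the log path is whatever you pass on the command line (nothing here is part of MariaDB itself).

```python
import re
import sys
from collections import Counter

# Pattern inferred from the [Note] lines quoted in this thread;
# other server versions may word the message differently.
PATTERN = re.compile(
    r"InnoDB: Cannot close file (?P<path>.+?) because of "
    r"(?P<pending>\d+) pending operations"
)


def tally(lines):
    """Return (total, nonzero, per_file) for the matching log lines.

    total    -- number of 'Cannot close file' messages
    nonzero  -- how many of them report more than 0 pending operations
    per_file -- Counter mapping each file path to its message count
    """
    per_file = Counter()
    total = 0
    nonzero = 0
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # unrelated log line
        total += 1
        per_file[match.group("path")] += 1
        if int(match.group("pending")) > 0:
            nonzero += 1
    return total, nonzero, per_file


if __name__ == "__main__" and len(sys.argv) == 2:
    # The error log path is supplied by the user; no default is assumed.
    with open(sys.argv[1], errors="replace") as log:
        total, nonzero, per_file = tally(log)
    print(f"{total} messages, {nonzero} with nonzero pending operations")
    for path, count in per_file.most_common(10):
        print(f"{count:8d}  {path}")
```

If the nonzero count stays at 0, that would be consistent with the checkpoint/fsync explanation above; a large nonzero count would point more towards files that are genuinely busy when InnoDB tries to close them.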