[MDEV-24313] Hang with innodb_use_native_aio=0 and innodb_write_io_threads=1 Created: 2020-11-30  Updated: 2021-06-17  Resolved: 2020-12-14

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.5.7, 10.5.8
Fix Version/s: 10.5.9

Type: Bug Priority: Blocker
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: hang, regression-10.5

Attachments: File mysql.err    
Issue Links:
Problem/Incident
is caused by MDEV-16264 Implement a common work queue for Inn... Closed
is caused by MDEV-23855 InnoDB log checkpointing causes regre... Closed
Relates
relates to MDEV-25953 Tpool - prevent potential deadlock in... Closed

 Description   

elenst reports that her tests are hanging.

The reason seems to be the following combination of parameters:

  • innodb_write_io_threads=1
  • innodb_use_native_aio=0
  • innodb_doublewrite=1

To add confusion, InnoDB will claim that it is using native AIO (because innodb_use_native_aio=1 by default), but a little later, it will silently change to simulated asynchronous I/O (innodb_use_native_aio=0) due to io_setup() failure.

The hang is possible, because starting with MDEV-23855, the doublewrite buffer issues asynchronous writes. On buf_dblwr_t::flush_buffered_writes_completed(), up to 128 page writes may be submitted. But, there may already be 128 outstanding page writes. The thread would then hang here, because the maximum number of outstanding requests per thread is 256:

mariadb-10.5.8

#14 0x000055c699979745 in io_slots::acquire (this=0x55c69c565d50) at /home/mariadb/semaphore1/10.5-enterprise/storage/innobase/os/os0file.cc:103
#15 0x000055c699977bfe in os_aio (type=..., buf=0x7f9c7847c000, offset=4128768, n=16384) at /home/mariadb/semaphore1/10.5-enterprise/storage/innobase/os/os0file.cc:4211
#16 0x000055c699c14ce0 in fil_space_t::io (this=0x55c69cd565e8, type=..., offset=4128768, len=16384, buf=0x7f9c7847c000, bpage=0x7f9c78009ab0)
    at /home/mariadb/semaphore1/10.5-enterprise/storage/innobase/fil/fil0fil.cc:3431
#17 0x000055c699b9113c in buf_dblwr_t::flush_buffered_writes_completed (this=0x55c69b62b960 <buf_dblwr>, request=...) at /home/mariadb/semaphore1/10.5-enterprise/storage/innobase/buf/buf0dblwr.cc:678
#18 0x000055c699c14fb9 in fil_aio_callback (request=...) at /home/mariadb/semaphore1/10.5-enterprise/storage/innobase/fil/fil0fil.cc:3466
#19 0x000055c699976827 in io_callback (cb=0x55c69c565f40) at /home/mariadb/semaphore1/10.5-enterprise/storage/innobase/os/os0file.cc:3918
#20 0x000055c699ca75dd in tpool::simulated_aio::simulated_aio_callback (param=0x55c69c565f40) at /home/mariadb/semaphore1/10.5-enterprise/tpool/aio_simulated.cc:162

A possible workaround could be to make innodb_write_io_threads=2 the minimum value.

Thanks to wlad for participating in the debugging.



 Comments   
Comment by Elena Stepanova [ 2020-12-01 ]

I want to emphasize for posterity: innodb_use_native_aio=0 mentioned in the description is NOT a parameter used by the test. The test runs the server with the default innodb_use_native_aio=1, and all necessary packages are installed on the machines where it is run. Whatever makes InnoDB ignore it happens internally and not configured by the test.

This is an important distinction, because stating that it is dependent on innodb_use_native_aio=0 setting would make people discard it as a possible cause of similar problems, since they are not using such an option. It would be wrong – the server can be run with the default innodb_use_native_aio and still be affected.

For the other two parameters listed in the description, innodb_doublewrite=1 is the default value; and the test does indeed set innodb_write_io_threads=1.

Comment by Marko Mäkelä [ 2020-12-14 ]

Before MDEV-16264, the function os_aio_init() was able to return an error and thus abort the server startup. That was changed to disabling of innodb_use_native_aio=0. That should be indicated by the following kind of message in the server error log:

2020-12-14 10:11:52 0 [Note] InnoDB: Linux native AIO disabled

elenst, I hope that such a message was present in the log.

To address this report, I would prefer to avoid touching that logic, but only increase the minimum value of innodb_write_io_threads from 1 to 2.

Comment by Elena Stepanova [ 2020-12-14 ]

I can't see any such message in the error logs. There was

mysqld: O_TMPFILE is not supported on /home/mariadb/semaphore1/var/tmp (disabling future attempts)

I don't know if it's related or not.
An example of an error log is attached.
mysql.err

Comment by Marko Mäkelä [ 2020-12-14 ]

I now see that if is_linux_native_aio_supported() would return false, the native AIO would be silently disabled.

Comment by Marko Mäkelä [ 2020-12-14 ]

Also, thread_pool::configure_aio() would silently fall back to simulated AIO.

Comment by Marko Mäkelä [ 2020-12-14 ]

The minimum value of innodb_write_io_threads will be 2 instead of 1. When we fall back to innodb_use_native_aio=0, we will issue a message about it. The message InnoDB: Using Linux native AIO will no longer be issued before the code has actually been enabled.

Generated at Thu Feb 08 09:29:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.