Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.11, 11.4
Description
While working on MDEV-33894, I noticed a rather deep call stack:
#0 __libc_pwrite64 (fd=7, buf=0x7f0d44b15000, count=count@entry=8192, offset=104448) at ../sysdeps/unix/sysv/linux/pwrite64.c:24
|
#1 0x0000564264c10e07 in SyncFileIO::execute_low (this=this@entry=0x7f0d1729bc20, request=..., n=8192) at /data/Server/10.11-MDEV-33894/storage/innobase/os/os0file.cc:692
|
#2 0x0000564264c10f17 in SyncFileIO::execute (this=this@entry=0x7f0d1729bc20, request=...) at /data/Server/10.11-MDEV-33894/storage/innobase/os/os0file.cc:715
|
#3 0x0000564264c12402 in os_file_io (in_type=..., file=file@entry=7, buf=buf@entry=0x7f0d44b15000, n=n@entry=8192, offset=offset@entry=104448, err=err@entry=0x7f0d1729be7c)
|
at /data/Server/10.11-MDEV-33894/storage/innobase/os/os0file.cc:2584
|
#4 0x0000564264c125d9 in os_file_pwrite (type=..., file=file@entry=7, buf=buf@entry=0x7f0d44b15000 "w\a\f", n=8192, offset=104448, err=err@entry=0x7f0d1729be7c)
|
at /data/Server/10.11-MDEV-33894/storage/innobase/os/os0file.cc:2659
|
#5 0x0000564264c14299 in os_file_write_func (type=..., name=<optimized out>, name@entry=0x56426540609f "ib_logfile0", file=7, buf=0x7f0d44b15000, offset=<optimized out>, offset@entry=104448, n=<optimized out>)
|
at /data/Server/10.11-MDEV-33894/storage/innobase/os/os0file.cc:2688
|
#6 0x0000564264bbed28 in log_file_t::write (this=this@entry=0x5642665238c0 <log_sys+128>, offset=offset@entry=104448, buf=...) at /data/Server/10.11-MDEV-33894/storage/innobase/log/log0log.cc:173
|
#7 0x0000564264bbf1a7 in log_write_buf (buf=buf@entry=0x7f0d44b15000 "w\a\f", length=length@entry=8192, offset=offset@entry=104448) at /data/Server/10.11-MDEV-33894/storage/innobase/log/log0log.cc:624
|
#8 0x0000564264bc1936 in log_t::write_buf<true> (this=this@entry=0x564266523840 <log_sys>) at /data/Server/10.11-MDEV-33894/storage/innobase/log/log0log.cc:944
|
The InnoDB write-ahead log is not instrumented by PERFORMANCE_SCHEMA; we only have innodb_data_file_key and innodb_temp_file_key for data files and temporary files. Therefore, log_file_t::write could simply invoke IF_WIN(tpool::,)pwrite().
Attachments
Issue Links
- relates to
-
MDEV-33894 MariaDB does unexpected storage read IO for the redo log
-
- Closed
-
Also the reads of ib_logfile0 (during early startup and crash recovery) are unnecessarily complex, including some ‘unreachable’ bookkeeping in os_file_pread():
MONITOR_ATOMIC_INC_LOW(MONITOR_OS_PENDING_READS, monitor);
ssize_t n_bytes = os_file_io(type, file, buf, n, offset, err);
MONITOR_ATOMIC_DEC_LOW(MONITOR_OS_PENDING_READS, monitor);
#0 __libc_pread64 (fd=10, buf=0x7f3e4a400000, count=count@entry=2097152, offset=49152) at ../sysdeps/unix/sysv/linux/pread64.c:25
#1 0x0000558ef654a6bd in SyncFileIO::execute_low (this=this@entry=0x7ffd94df1ff0, request=@0x7ffd94df1fd0: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_SYNC}, n=2097152)
at /mariadb/10.11m/storage/innobase/os/os0file.cc:691
#2 0x0000558ef654a811 in SyncFileIO::execute (this=this@entry=0x7ffd94df1ff0, request=@0x7ffd94df1fd0: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_SYNC})
at /mariadb/10.11m/storage/innobase/os/os0file.cc:715
#3 0x0000558ef654b70a in os_file_io (in_type=@0x558ef5891080: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_SYNC}, file=file@entry=10, buf=buf@entry=0x7f3e4a400000, n=n@entry=2097152,
offset=offset@entry=49152, err=err@entry=0x7ffd94df222c) at /mariadb/10.11m/storage/innobase/os/os0file.cc:2584
#4 0x0000558ef654b9c1 in os_file_pread (type=@0x558ef5891080: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_SYNC}, file=file@entry=10, buf=buf@entry=0x7f3e4a400000, n=n@entry=2097152,
offset=offset@entry=49152, err=err@entry=0x7ffd94df222c) at /mariadb/10.11m/storage/innobase/os/os0file.cc:2742
#5 0x0000558ef654d58e in os_file_read_func (type=@0x558ef5891080: {bpage = 0x0, slot = 0x0, node = 0x0, type = IORequest::READ_SYNC}, file=10, buf=0x7f3e4a400000, offset=offset@entry=49152, n=2097152,
o=o@entry=0x0) at /mariadb/10.11m/storage/innobase/os/os0file.cc:2772
#6 0x0000558ef651cc53 in log_file_t::read (this=0x558ef72044c0 <log_sys+128>, offset=49152, buf={data_ = 0x7f3e4a400000 "", size_ = 2097152}) at /mariadb/10.11m/storage/innobase/log/log0log.cc:166
We have similar accounting in os_file_pwrite() that currently covers also log writes:
++os_n_file_writes;
MONITOR_ATOMIC_INC_LOW(MONITOR_OS_PENDING_WRITES, monitor);
n, offset, err);
MONITOR_ATOMIC_DEC_LOW(MONITOR_OS_PENDING_WRITES, monitor);
I think that we can do without that and only count pending writes to data files, not ib_logfile0. In fact, we may remove or replace the entire MONITOR_OS_PENDING_WRITES, because os_aio_pending_writes() will cover all writes to data files. The os_n_file_writes is an atomic counter, because data page writes require that. We do not need to increment that counter for log writes, because they are additionally covered by the regular variable log_sys.write_to_log, which is protected by log_sys.latch.
Currently, the status variable Innodb_data_writes includes the count of Innodb_log_writes. With my intended change, Innodb_data_writes would no longer cover writes to ib_logfile0.