[MDEV-39780] Server crash in INFORMATION_SCHEMA.INNODB_BUFFER_PAGE after innodb_buffer_pool_size change - Jira

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Duplicate
Affects Version/s: 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5(EOL), 10.6
Fix Version/s: 10.11.12, 11.4.6, 11.8.2
Component/s: Storage Engine - InnoDB
Labels:
- crash
- rr-profile-analyzed

Bug Category:
Can result in hang or crash

Description

mleich produced an rr replay trace in which MariaDB Server 10.6 crashes after executing the following statements:

SET GLOBAL innodb_buffer_pool_size=268435456;

SET GLOBAL innodb_buffer_pool_size=134217728;

SELECT * FROM INFORMATION_SCHEMA.INNODB_BUFFER_PAGE;

10.6-MDEV-39344 cb6b5e26d1837c007b87bd4ad676b7c1720ffe8e
2026-05-27 12:55:11 17 [Note] InnoDB: Requested to resize buffer pool. (new size: 134217728 bytes)
2026-05-27 12:55:11 0 [Note] InnoDB: Resizing buffer pool from 268435456 to 134217728 (unit=134217728).
2026-05-27 12:55:11 0 [Note] InnoDB: Disabling adaptive hash index.
2026-05-27 12:55:11 0 [Note] InnoDB: Withdrawing blocks to be shrunken.
2026-05-27 12:55:11 0 [Note] InnoDB: start to withdraw the last 15977 blocks
2026-05-27 12:55:11 0 [Note] InnoDB: withdrawing blocks. (15977/15977)
2026-05-27 12:55:11 0 [Note] InnoDB: withdrew 15977 blocks from free list. Tried to relocate 0 pages (15977/15977)
2026-05-27 12:55:11 0 [Note] InnoDB: withdrawn target: 15977 blocks
2026-05-27 12:55:11 0 [Note] InnoDB: Latching whole of buffer pool.
2026-05-27 12:55:11 0 [Note] InnoDB: buffer pool resizing with chunks 2 to 1.
2026-05-27 12:55:11 0 [Note] InnoDB: 1 chunks (15977 blocks) were freed.
2026-05-27 12:55:11 0 [Note] InnoDB: Completed to resize buffer pool from 268435456 to 134217728.
2026-05-27 12:55:11 0 [Note] InnoDB: Completed resizing buffer pool at 260527 12:55:11.
[New Thread 1857251.1870065]
[New Thread 1857251.1857468]
…
[New Thread 1857251.1917025]
[New Thread 1857251.1946944]

Thread 10 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1857251.1870065]
i_s_innodb_buffer_page_get_info (bpage=bpage@entry=0x31e84fa9ad00,
pos=pos@entry=10000, page_info=page_info@entry=0x6d1108077010)
at /data/Server/10.6-MDEV-39344/storage/innobase/handler/i_s.cc:4070
4070 page_info->state = bpage->state();
(rr) awatch (size_t)0x31e84fa9ad00
Hardware access (read/write) watchpoint 1: (size_t)0x31e84fa9ad00
(rr) rc
Continuing.

Thread 50 hit Hardware access (read/write) watchpoint 1: (size_t)0x31e84fa9ad00

Value = 18446744073709551615
0x0000000070000000 in syscall_traced ()
(rr) bt
#0 0x0000000070000000 in syscall_traced ()
#1 0x00006298fbba1018 in _raw_syscall () at /home/ubuntu/rr/src/preload/raw_syscall.S:120
#2 0x00006298fbb9a909 in traced_raw_syscall (call=0x7f64a4dfffa0) at /home/ubuntu/rr/src/preload/syscallbuf.c:379
#3 0x00006298fbb9e0bc in sys_futex (call=<optimized out>) at /home/ubuntu/rr/src/preload/syscallbuf.c:2085
#4 syscall_hook_internal (call=call@entry=0x7f64a4dfffa0) at /home/ubuntu/rr/src/preload/syscallbuf.c:4203
#5 0x00006298fbba0da4 in syscall_hook (call=0x7f64a4dfffa0) at /home/ubuntu/rr/src/preload/syscallbuf.c:4387
#6 syscall_hook (call=0x7f64a4dfffa0) at /home/ubuntu/rr/src/preload/syscallbuf.c:4371
#7 0x00006298fbb9a323 in _syscall_hook_trampoline () at /home/ubuntu/rr/src/preload/syscall_hook.S:308
#8 0x00006298fbb9a38d in __morestack () at /home/ubuntu/rr/src/preload/syscall_hook.S:443
#9 0x00006298fbb9a394 in _syscall_hook_trampoline_48_3d_01_f0_ff_ff () at /home/ubuntu/rr/src/preload/syscall_hook.S:457
#10 0x0000048130297d81 in __GI_munmap () at ../sysdeps/unix/syscall-template.S:117
#11 0x00006298fcbf8cdb in my_large_free (ptr=0x31e84f89f000, size=134217728) at /data/Server/10.6-MDEV-39344/mysys/my_largepage.c:430
#12 0x00006298fc9f60a8 in ut_allocator<unsigned char, true>::deallocate_large (this=this@entry=0x6298fd743c60 <buf_pool+16672>, ptr=<optimized out>, pfx=pfx@entry=0x62990f6dca80)
at /data/Server/10.6-MDEV-39344/storage/innobase/include/ut0new.h:675
#13 0x00006298fcacf879 in ut_allocator<unsigned char, true>::deallocate_large_dodump (pfx=0x62990f6dca80, ptr=<optimized out>, this=0x6298fd743c60 <buf_pool+16672>)
at /data/Server/10.6-MDEV-39344/storage/innobase/include/ut0new.h:679
#14 buf_pool_t::resize (this=this@entry=0x6298fd73fb40 <buf_pool>) at /data/Server/10.6-MDEV-39344/storage/innobase/buf/buf0buf.cc:1747
#15 0x00006298fcac8940 in buf_resize_callback () at /data/Server/10.6-MDEV-39344/storage/innobase/buf/buf0buf.cc:1935
#16 0x00006298fcb873ed in tpool::task_group::execute (this=0x6298fe04e6c0 <single_threaded_group>, t=t@entry=0x6298fe04e620 <buf_resize_task>) at /data/Server/10.6-MDEV-39344/tpool/task_group.cc:55
#17 0x00006298fcb87489 in tpool::task::execute (this=0x6298fe04e620 <buf_resize_task>) at /data/Server/10.6-MDEV-39344/tpool/task.cc:32
#18 0x00006298fcb84bc9 in tpool::thread_pool_generic::worker_main (this=0x62990f68b120, thread_var=0x62990f68b7b0) at /data/Server/10.6-MDEV-39344/tpool/tpool_generic.cc:573

We can see that the view information_schema.innodb_buffer_page is accessing a block descriptor that had been freed as part of shrinking the buffer pool.

When ~~MDEV-29445~~ reimplemented buffer pool resizing and the way the buffer pool is allocated, it created a static mapping between page frame addresses and block descriptors. The virtual addresses are preallocated at server startup. Resizing the buffer pool only changes how many addresses, starting from buf_pool.memory, are addressable.
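The idea of preallocating virtual addresses and only changing how much of the range is addressable can be sketched as follows. This is a minimal Linux-oriented illustration, not the MariaDB implementation: the type reserved_pool and its members are hypothetical names, the use of mmap(), mprotect() and madvise() is an assumption about one way to get this behaviour, and sizes are assumed to be multiples of the system page size.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>

// Hypothetical illustration only; none of these names exist in MariaDB.
struct reserved_pool
{
  char *memory= nullptr; // start of the reserved virtual address range
  size_t reserved= 0;    // bytes of address space reserved at startup
  size_t usable= 0;      // bytes that are currently addressable

  // Reserve address space once, without making any of it accessible.
  bool create(size_t max_bytes)
  {
    void *p= mmap(nullptr, max_bytes, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED)
      return false;
    memory= static_cast<char*>(p);
    reserved= max_bytes;
    return true;
  }

  // Grow or shrink the addressable prefix. The base address never moves,
  // so an object at memory + offset keeps its address across resizes.
  // new_bytes is assumed to be a multiple of the page size.
  bool resize(size_t new_bytes)
  {
    if (!memory || new_bytes > reserved)
      return false;
    if (new_bytes > usable)
    {
      if (mprotect(memory + usable, new_bytes - usable,
                   PROT_READ | PROT_WRITE))
        return false;
    }
    else if (new_bytes < usable)
    {
      // Give the physical pages back, but keep the addresses reserved.
      if (madvise(memory + new_bytes, usable - new_bytes, MADV_DONTNEED))
        return false;
      if (mprotect(memory + new_bytes, usable - new_bytes, PROT_NONE))
        return false;
    }
    usable= new_bytes;
    return true;
  }

  // Unmap the whole reservation at shutdown.
  void destroy()
  {
    if (memory)
      munmap(memory, reserved);
    memory= nullptr;
    reserved= usable= 0;
  }
};
```

With such a layout, shrinking never unmaps the range itself, so a reader that limits its index to the current usable size under the appropriate lock cannot dereference an address that has been handed back to the operating system.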

Let us look at the crash to see whether it would still be possible with the ~~MDEV-29445~~ fix present:

10.6-MDEV-39344 cb6b5e26d1837c007b87bd4ad676b7c1720ffe8e
(rr) bt
#0 i_s_innodb_buffer_page_get_info (bpage=bpage@entry=0x31e84fa9ad00, pos=pos@entry=10000, page_info=page_info@entry=0x6d1108077010) at /data/Server/10.6-MDEV-39344/storage/innobase/handler/i_s.cc:4070
#1 0x00006298fc915497 in i_s_innodb_buffer_page_fill (thd=0x6d1108000d58, tables=0x6d1108013d00) at /data/Server/10.6-MDEV-39344/storage/innobase/handler/i_s.cc:4179
#2 0x00006298fc47af3c in get_schema_tables_result (join=join@entry=0x6d1108015728, executed_place=executed_place@entry=PROCESSED_BY_JOIN_EXEC) at /data/Server/10.6-MDEV-39344/sql/sql_show.cc:9056

The culprit is clear:

i_s_innodb_buffer_page_fill()
  for (ulint n = 0;
       n < ut_min(buf_pool.n_chunks, buf_pool.n_chunks_new); n++) {
    // skip some code
    /* Obtain appropriate mutexes. Since this is diagnostic
    buffer pool info printout, we are not required to
    preserve the overall consistency, so we can
    release mutex periodically */
    mysql_mutex_lock(&buf_pool.mutex);

    /* GO through each block in the chunk */
    for (n_blocks = num_to_process; n_blocks--; block++) {
      i_s_innodb_buffer_page_get_info(
        &block->page, block_id,
        info_buffer + num_page);

This function reads many structures that ought to be protected by buf_pool.mutex, yet it releases the mutex periodically. In fact, this code was writing output to an ENGINE=Aria internal temporary table while the memory buffer was freed:

10.6-MDEV-39344 cb6b5e26d1837c007b87bd4ad676b7c1720ffe8e
#11 0x00006298fc471821 in schema_table_store_record (thd=thd@entry=0x6d1108000d58, table=table@entry=0x6d110806ef80) at /data/Server/10.6-MDEV-39344/sql/sql_show.cc:3942
#12 0x00006298fc90fc22 in i_s_innodb_buffer_page_fill (thd=thd@entry=0x6d1108000d58, tables=tables@entry=0x6d1108013d00, info_array=info_array@entry=0x6d1108077010, num_page=num_page@entry=10000) at /data/Server/10.6-MDEV-39344/storage/innobase/handler/i_s.cc:3987
#13 0x00006298fc9154e6 in i_s_innodb_buffer_page_fill (thd=0x6d1108000d58, tables=0x6d1108013d00) at /data/Server/10.6-MDEV-39344/storage/innobase/handler/i_s.cc:4190
#14 0x00006298fc47af3c in get_schema_tables_result (join=join@entry=0x6d1108015728, executed_place=executed_place@entry=PROCESSED_BY_JOIN_EXEC) at /data/Server/10.6-MDEV-39344/sql/sql_show.cc:9056

In ~~MDEV-29445~~ the loop was rewritten to be much shorter, and with proper mutex protection:

  for (size_t j= 0;;)
  {
    memset((void*) b, 0, MAX_BUF_INFO_CACHED * sizeof *b);
    mysql_mutex_lock(&buf_pool.mutex);
    const size_t N= buf_pool.curr_size();
    const size_t n= std::min<size_t>(N, MAX_BUF_INFO_CACHED);
    for (size_t i= 0; i < n && j < N; i++, j++)
      i_s_innodb_buffer_page_get_info(&buf_pool.get_nth_page(j)->page, j,
                                      &b[i]);
    mysql_mutex_unlock(&buf_pool.mutex);
    status= i_s_innodb_buffer_page_fill(thd, tables, b, n);
    if (status || j >= N)
      break;
  }

In this loop, we first safely copy the information to an intermediate array of buf_page_info_t, at most MAX_BUF_INFO_CACHED entries at a time, while holding buf_pool.mutex. Then, after releasing buf_pool.mutex, we copy the intermediate array to the output buffer. Because buf_pool.curr_size() is read again under the mutex on every iteration, a change of the buffer pool size before the next iteration is handled correctly: we keep iterating only until the current buffer pool size is reached.
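The pattern described above (copy a bounded batch under the mutex, emit it without the mutex, and read the size again before the next batch) can be sketched in isolation. This is a simplified stand-alone illustration, not the server code: toy_pool, BATCH and scan() are hypothetical names, std::mutex stands in for buf_pool.mutex, and a plain int stands in for buf_page_info_t. Counting only the entries actually copied in each batch is a choice of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-ins; none of these names are MariaDB identifiers.
struct toy_pool
{
  std::mutex mutex;       // protects size and state
  size_t size;            // number of currently addressable descriptors
  std::vector<int> state; // preallocated for the maximum size

  toy_pool(size_t max_size, size_t initial)
    : size(initial), state(max_size, 0) {}
};

constexpr size_t BATCH= 4; // plays the role of MAX_BUF_INFO_CACHED

// Copy descriptors in batches under the mutex, and emit each batch after
// releasing it. Returns the number of entries emitted.
template<typename Emit>
size_t scan(toy_pool &pool, Emit emit)
{
  int batch[BATCH];
  size_t emitted= 0;
  for (size_t j= 0;;)
  {
    size_t n= 0, N;
    {
      std::lock_guard<std::mutex> guard(pool.mutex);
      N= pool.size; // read again on every iteration; it may have changed
      for (; n < BATCH && j < N; n++, j++)
        batch[n]= pool.state[j];
    }
    // The slow output step runs without the mutex, on private copies only.
    for (size_t i= 0; i < n; i++)
      emit(batch[i]);
    emitted+= n;
    if (j >= N)
      break;
  }
  return emitted;
}
```

A caller passes any callable as emit, for example a lambda that appends to a std::vector<int>. If another thread shrinks pool.size between two batches, the next batch stops at the new size and never indexes past it, because the bound is taken from the value read under the mutex in that same iteration.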

Issue Links

relates to:
- MDEV-29445 reorganise innodb buffer pool (and remove buffer pool chunks) (Closed)
- MDEV-35485 The test innodb.innodb_buffer_pool_resize occasionally crashes (Closed)

People

Assignee: Unassigned
Reporter: Marko Mäkelä
Votes: 0
Watchers: 1

Dates

Created: 2026-05-28 05:25
Updated: 2026-05-28 08:04
Resolved: 2026-05-28 05:32

Time Tracking

Estimated: Not Specified
Remaining:
Logged: 0.25d
