Details
- Bug
- Status: Closed
- Major
- Resolution: Duplicate
- 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5(EOL), 10.6
- Can result in hang or crash
Description
mleich produced an rr replay trace in which MariaDB Server 10.6 crashes after executing the following commands:
SET GLOBAL innodb_buffer_pool_size=268435456;
SET GLOBAL innodb_buffer_pool_size=134217728;
SELECT * FROM INFORMATION_SCHEMA.INNODB_BUFFER_PAGE;
10.6-MDEV-39344 cb6b5e26d1837c007b87bd4ad676b7c1720ffe8e
2026-05-27 12:55:11 17 [Note] InnoDB: Requested to resize buffer pool. (new size: 134217728 bytes)
2026-05-27 12:55:11 0 [Note] InnoDB: Resizing buffer pool from 268435456 to 134217728 (unit=134217728).
2026-05-27 12:55:11 0 [Note] InnoDB: Disabling adaptive hash index.
2026-05-27 12:55:11 0 [Note] InnoDB: Withdrawing blocks to be shrunken.
2026-05-27 12:55:11 0 [Note] InnoDB: start to withdraw the last 15977 blocks
2026-05-27 12:55:11 0 [Note] InnoDB: withdrawing blocks. (15977/15977)
2026-05-27 12:55:11 0 [Note] InnoDB: withdrew 15977 blocks from free list. Tried to relocate 0 pages (15977/15977)
2026-05-27 12:55:11 0 [Note] InnoDB: withdrawn target: 15977 blocks
2026-05-27 12:55:11 0 [Note] InnoDB: Latching whole of buffer pool.
2026-05-27 12:55:11 0 [Note] InnoDB: buffer pool resizing with chunks 2 to 1.
2026-05-27 12:55:11 0 [Note] InnoDB: 1 chunks (15977 blocks) were freed.
2026-05-27 12:55:11 0 [Note] InnoDB: Completed to resize buffer pool from 268435456 to 134217728.
2026-05-27 12:55:11 0 [Note] InnoDB: Completed resizing buffer pool at 260527 12:55:11.
[New Thread 1857251.1870065]
[New Thread 1857251.1857468]
…
[New Thread 1857251.1917025]
[New Thread 1857251.1946944]

Thread 10 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1857251.1870065]
i_s_innodb_buffer_page_get_info (bpage=bpage@entry=0x31e84fa9ad00,
    pos=pos@entry=10000, page_info=page_info@entry=0x6d1108077010)
    at /data/Server/10.6-MDEV-39344/storage/innobase/handler/i_s.cc:4070
4070 page_info->state = bpage->state();
(rr) awatch *(size_t*)0x31e84fa9ad00
Hardware access (read/write) watchpoint 1: *(size_t*)0x31e84fa9ad00
(rr) rc
Continuing.

Thread 50 hit Hardware access (read/write) watchpoint 1: *(size_t*)0x31e84fa9ad00

Value = 18446744073709551615
0x0000000070000000 in syscall_traced ()
(rr) bt
#0 0x0000000070000000 in syscall_traced ()
#1 0x00006298fbba1018 in _raw_syscall () at /home/ubuntu/rr/src/preload/raw_syscall.S:120
#2 0x00006298fbb9a909 in traced_raw_syscall (call=0x7f64a4dfffa0) at /home/ubuntu/rr/src/preload/syscallbuf.c:379
#3 0x00006298fbb9e0bc in sys_futex (call=<optimized out>) at /home/ubuntu/rr/src/preload/syscallbuf.c:2085
#4 syscall_hook_internal (call=call@entry=0x7f64a4dfffa0) at /home/ubuntu/rr/src/preload/syscallbuf.c:4203
#5 0x00006298fbba0da4 in syscall_hook (call=0x7f64a4dfffa0) at /home/ubuntu/rr/src/preload/syscallbuf.c:4387
#6 syscall_hook (call=0x7f64a4dfffa0) at /home/ubuntu/rr/src/preload/syscallbuf.c:4371
#7 0x00006298fbb9a323 in _syscall_hook_trampoline () at /home/ubuntu/rr/src/preload/syscall_hook.S:308
#8 0x00006298fbb9a38d in __morestack () at /home/ubuntu/rr/src/preload/syscall_hook.S:443
#9 0x00006298fbb9a394 in _syscall_hook_trampoline_48_3d_01_f0_ff_ff () at /home/ubuntu/rr/src/preload/syscall_hook.S:457
#10 0x0000048130297d81 in __GI_munmap () at ../sysdeps/unix/syscall-template.S:117
#11 0x00006298fcbf8cdb in my_large_free (ptr=0x31e84f89f000, size=134217728) at /data/Server/10.6-MDEV-39344/mysys/my_largepage.c:430
#12 0x00006298fc9f60a8 in ut_allocator<unsigned char, true>::deallocate_large (this=this@entry=0x6298fd743c60 <buf_pool+16672>, ptr=<optimized out>, pfx=pfx@entry=0x62990f6dca80)
    at /data/Server/10.6-MDEV-39344/storage/innobase/include/ut0new.h:675
#13 0x00006298fcacf879 in ut_allocator<unsigned char, true>::deallocate_large_dodump (pfx=0x62990f6dca80, ptr=<optimized out>, this=0x6298fd743c60 <buf_pool+16672>)
    at /data/Server/10.6-MDEV-39344/storage/innobase/include/ut0new.h:679
#14 buf_pool_t::resize (this=this@entry=0x6298fd73fb40 <buf_pool>) at /data/Server/10.6-MDEV-39344/storage/innobase/buf/buf0buf.cc:1747
#15 0x00006298fcac8940 in buf_resize_callback () at /data/Server/10.6-MDEV-39344/storage/innobase/buf/buf0buf.cc:1935
#16 0x00006298fcb873ed in tpool::task_group::execute (this=0x6298fe04e6c0 <single_threaded_group>, t=t@entry=0x6298fe04e620 <buf_resize_task>) at /data/Server/10.6-MDEV-39344/tpool/task_group.cc:55
#17 0x00006298fcb87489 in tpool::task::execute (this=0x6298fe04e620 <buf_resize_task>) at /data/Server/10.6-MDEV-39344/tpool/task.cc:32
#18 0x00006298fcb84bc9 in tpool::thread_pool_generic::worker_main (this=0x62990f68b120, thread_var=0x62990f68b7b0) at /data/Server/10.6-MDEV-39344/tpool/tpool_generic.cc:573
We can see that the view information_schema.innodb_buffer_page is accessing a block descriptor that had been freed as part of shrinking the buffer pool.
When MDEV-29445 reimplemented buffer pool resizing and the way the buffer pool is allocated, it created a static mapping between page frame addresses and block descriptors. The virtual addresses are preallocated at server startup, and resizing the buffer pool only changes how many addresses starting from buf_pool.memory are addressable.
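The idea of a preallocated address range with a resizable addressable prefix can be illustrated with a small POSIX sketch. This is not the MariaDB implementation: `Arena` and all of its members are hypothetical names, the frame size is simply the OS page size, and the mapping from a frame address to its index is an assumed simplification of the real layout.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical arena; NOT buf_pool_t, only the address-space idea.
struct Arena {
  unsigned char* base = nullptr;  // plays the role of buf_pool.memory
  std::size_t page = 0;           // OS page size, used as the frame size here
  std::size_t max_pages = 0;      // reserved once at startup, never changes
  std::size_t cur_pages = 0;      // currently addressable prefix

  // Reserve virtual addresses only; nothing is readable or writable yet.
  bool reserve(std::size_t pages) {
    page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    void* p = mmap(nullptr, pages * page, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return false;
    base = static_cast<unsigned char*>(p);
    max_pages = pages;
    return true;
  }

  // Grow or shrink the addressable prefix. The base address and every frame
  // address stay where they are. A real implementation would also hand the
  // physical memory of a shrunk range back to the OS; that is omitted here.
  bool resize(std::size_t pages) {
    if (pages > max_pages) return false;
    if (pages > cur_pages) {
      if (mprotect(base + cur_pages * page, (pages - cur_pages) * page,
                   PROT_READ | PROT_WRITE) != 0) return false;
    } else if (pages < cur_pages) {
      if (mprotect(base + pages * page, (cur_pages - pages) * page,
                   PROT_NONE) != 0) return false;
    }
    cur_pages = pages;
    return true;
  }

  // Static mapping in both directions: index <-> frame address.
  unsigned char* frame(std::size_t i) const { return base + i * page; }
  std::size_t index_of(const unsigned char* f) const {
    return static_cast<std::size_t>(f - base) / page;
  }

  void release() {
    if (base) munmap(base, max_pages * page);
    base = nullptr;
  }
};
```

Because nothing is unmapped or moved on resize in this sketch, a pointer computed before a shrink still denotes the same frame after a later grow; only whether it is currently addressable changes.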
Let us look at the crash to see if it would be possible when the MDEV-29445 fix is present:
10.6-MDEV-39344 cb6b5e26d1837c007b87bd4ad676b7c1720ffe8e
(rr) bt
#0 i_s_innodb_buffer_page_get_info (bpage=bpage@entry=0x31e84fa9ad00, pos=pos@entry=10000, page_info=page_info@entry=0x6d1108077010) at /data/Server/10.6-MDEV-39344/storage/innobase/handler/i_s.cc:4070
#1 0x00006298fc915497 in i_s_innodb_buffer_page_fill (thd=0x6d1108000d58, tables=0x6d1108013d00) at /data/Server/10.6-MDEV-39344/storage/innobase/handler/i_s.cc:4179
#2 0x00006298fc47af3c in get_schema_tables_result (join=join@entry=0x6d1108015728, executed_place=executed_place@entry=PROCESSED_BY_JOIN_EXEC) at /data/Server/10.6-MDEV-39344/sql/sql_show.cc:9056
The culprit is clear:
i_s_innodb_buffer_page_fill()
for (ulint n = 0;
     n < ut_min(buf_pool.n_chunks, buf_pool.n_chunks_new); n++) {
    // skip some code
    /* Obtain appropriate mutexes. Since this is diagnostic
    buffer pool info printout, we are not required to
    preserve the overall consistency, so we can
    release mutex periodically */
    mysql_mutex_lock(&buf_pool.mutex);

    /* GO through each block in the chunk */
    for (n_blocks = num_to_process; n_blocks--; block++) {
        i_s_innodb_buffer_page_get_info(
            &block->page, block_id,
            info_buffer + num_page);
This function is reading many structures that ought to be protected by buf_pool.mutex. In fact, this code was writing output to an ENGINE=Aria internal temporary table while the memory buffer was freed:
10.6-MDEV-39344 cb6b5e26d1837c007b87bd4ad676b7c1720ffe8e
#11 0x00006298fc471821 in schema_table_store_record (thd=thd@entry=0x6d1108000d58, table=table@entry=0x6d110806ef80) at /data/Server/10.6-MDEV-39344/sql/sql_show.cc:3942
#12 0x00006298fc90fc22 in i_s_innodb_buffer_page_fill (thd=thd@entry=0x6d1108000d58, tables=tables@entry=0x6d1108013d00, info_array=info_array@entry=0x6d1108077010, num_page=num_page@entry=10000) at /data/Server/10.6-MDEV-39344/storage/innobase/handler/i_s.cc:3987
#13 0x00006298fc9154e6 in i_s_innodb_buffer_page_fill (thd=0x6d1108000d58, tables=0x6d1108013d00) at /data/Server/10.6-MDEV-39344/storage/innobase/handler/i_s.cc:4190
#14 0x00006298fc47af3c in get_schema_tables_result (join=join@entry=0x6d1108015728, executed_place=executed_place@entry=PROCESSED_BY_JOIN_EXEC) at /data/Server/10.6-MDEV-39344/sql/sql_show.cc:9056
In MDEV-29445 the loop was rewritten to be much shorter, and with proper mutex protection:
for (size_t j= 0;;)
{
  memset((void*) b, 0, MAX_BUF_INFO_CACHED * sizeof *b);
  mysql_mutex_lock(&buf_pool.mutex);
  const size_t N= buf_pool.curr_size();
  const size_t n= std::min<size_t>(N, MAX_BUF_INFO_CACHED);
  for (size_t i= 0; i < n && j < N; i++, j++)
    i_s_innodb_buffer_page_get_info(&buf_pool.get_nth_page(j)->page, j,
                                    &b[i]);

  mysql_mutex_unlock(&buf_pool.mutex);
  status= i_s_innodb_buffer_page_fill(thd, tables, b, n);
  if (status || j >= N)
    break;
}
In this loop, we are first safely copying the information to an intermediate array of buf_page_info_t, at most MAX_BUF_INFO_CACHED entries. Then, after releasing buf_pool.mutex, we copy the intermediate array to the output buffer. If the size of the buffer pool is changed before the next iteration, we will correctly keep iterating until the current buffer pool size is reached.
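The batching pattern described above can be sketched in isolation. This is a minimal, self-contained illustration and not the MariaDB code: `Pool`, `PageInfo`, `kMaxCached` and `scan_pool` are hypothetical stand-ins for the buffer pool, buf_page_info_t, MAX_BUF_INFO_CACHED and the fill loop, and a `std::mutex` stands in for buf_pool.mutex. Unlike the loop above, the sketch emits only the entries it actually filled in each batch.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for buf_page_info_t.
struct PageInfo { std::size_t pos; unsigned state; };

// Hypothetical stand-in for the buffer pool.
struct Pool {
  std::mutex mutex;              // plays the role of buf_pool.mutex
  std::vector<unsigned> states;  // one entry per block; size() plays curr_size()
};

constexpr std::size_t kMaxCached = 4;  // plays the role of MAX_BUF_INFO_CACHED

// Scans the pool in batches: snapshot under the mutex, emit after unlocking.
// Returns the number of rows passed to emit.
template <typename Emit>
std::size_t scan_pool(Pool& pool, Emit emit) {
  PageInfo batch[kMaxCached];
  std::size_t total = 0;
  for (std::size_t j = 0;;) {
    std::size_t n = 0, N = 0;
    {
      std::lock_guard<std::mutex> guard(pool.mutex);
      N = pool.states.size();  // re-read: the pool may have been resized
      for (; n < kMaxCached && j < N; n++, j++)
        batch[n] = PageInfo{j, pool.states[j]};
    }
    assert(n <= kMaxCached);
    // Slow output runs without the mutex and touches only the private copy.
    for (std::size_t i = 0; i < n; i++)
      emit(batch[i]);
    total += n;
    if (j >= N)
      break;
  }
  return total;
}
```

For example, if a pool of 10 entries is shrunk to 6 while the first batch of 4 is being emitted, the second iteration re-reads the size, copies entries 4 and 5, and the scan ends after 6 rows without touching the removed entries.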
Issue Links
- relates to
- MDEV-29445 reorganise innodb buffer pool (and remove buffer pool chunks) (Closed)
- MDEV-35485 The test innodb.innodb_buffer_pool_resize occasionally crashes (Closed)