This happens when buf_LRU_block_free_non_file_page() calls UNIV_MEM_FREE(block->frame, srv_page_size); after buffer pool was resized. Memory was allocated in os_mem_alloc_large() via shmget(). Then it was freed at buffer pool resize. And that block->frame became a dangling pointer. Sure, program should not touch freed memory. Looks like an ownership + use-after-free error.
#0 os_mem_free_large (ptr=0x288bc0000001c, size=713428427603996) at os0proc.cc:157
#1 0x0000000002847b95 in ut_allocator<unsigned char, true>::deallocate_large (this=0x61b000003280, ptr=0x7fffdffad000 "\377\377\377\377\377\377\377\377@\031\a\340\377\177", size=9895936) at ut0new.h:681
#2 0x00000000028420a7 in ut_allocator<unsigned char, true>::deallocate_large_dodump (this=0x61b000003280, ptr=0x7fffdffad000 "\377\377\377\377\377\377\377\377@\031\a\340\377\177", size=9895936) at ut0new.h:695
#3 0x00000000027fc6b9 in buf_pool_resize () at buf0buf.cc:2966
#4 0x00000000027f8e32 in buf_resize_thread () at buf0buf.cc:3242
#5 0x00007ffff7f87669 in start_thread (arg=<optimized out>) at pthread_create.c:479
#6 0x00007ffff7450323 in clone () at clone.S:95
Right after that operation some memory blocks are still alive. Their block->frame points to a freed memory. And buf_LRU_block_free_non_file_page() touches this memory here:
Memory is touched more in debug mode and less in release mode.
Eugene Kosov (Inactive)
added a comment - Memory is freed here:
#0 os_mem_free_large (ptr=0x288bc0000001c, size=713428427603996) at os0proc.cc:157
#1 0x0000000002847b95 in ut_allocator<unsigned char, true>::deallocate_large (this=0x61b000003280, ptr=0x7fffdffad000 "\377\377\377\377\377\377\377\377@\031\a\340\377\177", size=9895936) at ut0new.h:681
#2 0x00000000028420a7 in ut_allocator<unsigned char, true>::deallocate_large_dodump (this=0x61b000003280, ptr=0x7fffdffad000 "\377\377\377\377\377\377\377\377@\031\a\340\377\177", size=9895936) at ut0new.h:695
#3 0x00000000027fc6b9 in buf_pool_resize () at buf0buf.cc:2966
#4 0x00000000027f8e32 in buf_resize_thread () at buf0buf.cc:3242
#5 0x00007ffff7f87669 in start_thread (arg=<optimized out>) at pthread_create.c:479
#6 0x00007ffff7450323 in clone () at clone.S:95
Right after that operation some memory blocks are still alive. Their block->frame points to a freed memory. And buf_LRU_block_free_non_file_page() touches this memory here:
UNIV_MEM_ALLOC(block->frame, srv_page_size);
...
memset(block->frame, '\0', srv_page_size);
...
memset(block->frame + FIL_PAGE_OFFSET, 0xfe, 4);
memset(block->frame + FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID, 0xfe, 4);
...
UNIV_MEM_FREE(block->frame, srv_page_size);
Memory is touched more in debug mode and less in release mode.
#13 0x000055db30e05486 in btr_pcur_open_low (index=<optimized out>, level=0, tuple=0x7f3fbc234ee8, mode=PAGE_CUR_LE, latch_mode=2, cursor=0x55db32eaaae8, file=0x0, line=1225, autoinc=0, mtr=0x8) at /mariadb/10.5m/storage/innobase/include/btr0pcur.ic:441
#14 0x000055db30e03f63 in row_search_on_row_ref (pcur=0x55db32eaaae8, mode=2, table=0x7f407411f248, ref=0x7f3fbc234ee8, mtr=0x7f403a7fb740) at /mariadb/10.5m/storage/innobase/row/row0row.cc:1225
#15 0x000055db30df3e98 in row_purge_reposition_pcur (mode=2, node=0x55db32eaaa48, mtr=0x7f403a7fb740) at /mariadb/10.5m/storage/innobase/row/row0purge.cc:79
#16 0x000055db30df6980 in row_purge_reset_trx_id (node=0x55db32eaaa48, mtr=0x7f403a7fb740) at /mariadb/10.5m/storage/innobase/row/row0purge.cc:660
Marko Mäkelä
added a comment - If there is a real problem, maybe that could explain MDEV-16964 . On 10.5, I just got this on a non-ASAN build:
10.5 ce41a9075ad63e5388045943b80f69b7af0048d2 with some changes
CURRENT_TEST: innodb.innodb_buffer_pool_resize
mysqltest: In included file "./include/wait_condition.inc":
included from /mariadb/10.5m/mysql-test/suite/innodb/t/innodb_buffer_pool_resize.test at line 58:
At line 54: query 'let $success= `$wait_condition`' failed with wrong errno 2013: 'Lost connection to MySQL server during query', instead of 0...
…
2019-12-13 16:00:38 0 [Note] InnoDB: Resizing also other hash tables.
mysqld: /mariadb/10.5m/storage/innobase/include/buf0buf.ic:972: buf_page_t *buf_page_hash_get_low(buf_pool_t *, const page_id_t): Assertion `rw_lock_own(hash_lock, RW_LOCK_X) || rw_lock_own(hash_lock, RW_LOCK_S)' failed.
191213 16:00:38 [ERROR] mysqld got signal 6 ;
…
#6 0x00007f40ae1f0b92 in __GI___assert_fail (assertion=0x55db313684f3 "rw_lock_own(hash_lock, RW_LOCK_X) || rw_lock_own(hash_lock, RW_LOCK_S)", file=0x55db313551ab "/mariadb/10.5m/storage/innobase/include/buf0buf.ic", line=972, function=0x55db3136853a "buf_page_t *buf_page_hash_get_low(buf_pool_t *, const page_id_t)") at assert.c:101
#7 0x000055db30f14dee in buf_page_hash_get_low (buf_pool=0x55db32c7d080, page_id=...) at /mariadb/10.5m/storage/innobase/include/buf0buf.ic:971
#8 0x000055db30f1f180 in buf_page_init_for_read (err=<optimized out>, mode=132, page_id=..., zip_size=2048, unzip=<optimized out>) at /mariadb/10.5m/storage/innobase/buf/buf0buf.cc:5303
#9 0x000055db30f53f73 in buf_read_page_low (err=0x7f403a7f9bbc, sync=false, mode=132, page_id=..., zip_size=2048, unzip=<optimized out>, ignore=<optimized out>) at /mariadb/10.5m/storage/innobase/buf/buf0rea.cc:150
#10 0x000055db30f55c1f in buf_read_ahead_linear (page_id=..., zip_size=<optimized out>, ibuf=<optimized out>) at /mariadb/10.5m/storage/innobase/buf/buf0rea.cc:700
#11 0x000055db30f1b247 in buf_page_get_gen (page_id=..., zip_size=2048, rw_latch=<optimized out>, guess=<optimized out>, mode=<optimized out>, file=0x55db313a13c5 "/mariadb/10.5m/storage/innobase/row/row0row.cc", line=1225, mtr=0x7f403a7fb740, err=0x7f403a7fa13c, allow_ibuf_merge=<optimized out>) at /mariadb/10.5m/storage/innobase/buf/buf0buf.cc:4842
#12 0x000055db30ecb2ea in btr_cur_search_to_nth_level_func (index=<optimized out>, level=<optimized out>, tuple=<optimized out>, mode=<optimized out>, latch_mode=<optimized out>, cursor=<optimized out>, ahi_latch=0x0, file=0x55db313a13c5 "/mariadb/10.5m/storage/innobase/row/row0row.cc", line=1225, mtr=0x7f403a7fb740, autoinc=0) at /mariadb/10.5m/storage/innobase/btr/btr0cur.cc:1584
#13 0x000055db30e05486 in btr_pcur_open_low (index=<optimized out>, level=0, tuple=0x7f3fbc234ee8, mode=PAGE_CUR_LE, latch_mode=2, cursor=0x55db32eaaae8, file=0x0, line=1225, autoinc=0, mtr=0x8) at /mariadb/10.5m/storage/innobase/include/btr0pcur.ic:441
#14 0x000055db30e03f63 in row_search_on_row_ref (pcur=0x55db32eaaae8, mode=2, table=0x7f407411f248, ref=0x7f3fbc234ee8, mtr=0x7f403a7fb740) at /mariadb/10.5m/storage/innobase/row/row0row.cc:1225
#15 0x000055db30df3e98 in row_purge_reposition_pcur (mode=2, node=0x55db32eaaa48, mtr=0x7f403a7fb740) at /mariadb/10.5m/storage/innobase/row/row0purge.cc:79
#16 0x000055db30df6980 in row_purge_reset_trx_id (node=0x55db32eaaa48, mtr=0x7f403a7fb740) at /mariadb/10.5m/storage/innobase/row/row0purge.cc:660
In MDEV-18851 and PR#1221 there is a proposal to replace os_mem_alloc_large() with my_large_malloc(). Those should be kept in sync with what we are going to do here.
Marko Mäkelä
added a comment - In MDEV-18851 and PR#1221 there is a proposal to replace os_mem_alloc_large() with my_large_malloc() . Those should be kept in sync with what we are going to do here.
Eugene Kosov (Inactive)
added a comment - ASAN documentation requires us to manually unpoison when we manually poisoned https://github.com/llvm/llvm-project/blob/master/compiler-rt/include/sanitizer/asan_interface.h#L21 And we didn't, which is our failure.
marko sorry, I can't explain your failure here. It seems unrelated to ASAN. And I didn't find any issues with buffer pool while debugging this MDEV.
Eugene Kosov (Inactive)
added a comment - marko sorry, I can't explain your failure here. It seems unrelated to ASAN. And I didn't find any issues with buffer pool while debugging this MDEV.
Memory is freed here:
#0 os_mem_free_large (ptr=0x288bc0000001c, size=713428427603996) at os0proc.cc:157
#1 0x0000000002847b95 in ut_allocator<unsigned char, true>::deallocate_large (this=0x61b000003280, ptr=0x7fffdffad000 "\377\377\377\377\377\377\377\377@\031\a\340\377\177", size=9895936) at ut0new.h:681
#2 0x00000000028420a7 in ut_allocator<unsigned char, true>::deallocate_large_dodump (this=0x61b000003280, ptr=0x7fffdffad000 "\377\377\377\377\377\377\377\377@\031\a\340\377\177", size=9895936) at ut0new.h:695
#3 0x00000000027fc6b9 in buf_pool_resize () at buf0buf.cc:2966
#4 0x00000000027f8e32 in buf_resize_thread () at buf0buf.cc:3242
#5 0x00007ffff7f87669 in start_thread (arg=<optimized out>) at pthread_create.c:479
#6 0x00007ffff7450323 in clone () at clone.S:95
Right after that operation some memory blocks are still alive. Their block->frame points to a freed memory. And buf_LRU_block_free_non_file_page() touches this memory here:
UNIV_MEM_ALLOC(block->frame, srv_page_size);
...
memset(block->frame, '\0', srv_page_size);
...
memset(block->frame + FIL_PAGE_OFFSET, 0xfe, 4);
memset(block->frame + FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID, 0xfe, 4);
...
UNIV_MEM_FREE(block->frame, srv_page_size);
Memory is touched more in debug mode and less in release mode.