Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.6, 10.11
-
None
Description
The problem was found on
|
origin/10.6-MDEV-33508 208f3ee34381f7c8a79d114f610c42bb53a5f394 2024-02-21T11:35:06+02:00
|
Per Marko the official 10.6 would suffer from that too.
|
|
Scenario:
|
1. Start the server and generate some initial data.
|
2. 33 "worker" run concurrent some DML mix
|
3. Some RQG component checks periodic the processlist.
|
pluto:/data/results/1708610110/HANG1/rqg.log
|
Line 1322 (2024-02-22T16:08:28) 9 workers active, the other have disconnected
|
They all run some INSERT IGNORE INTO ... (one row only).
|
Line 1797 (2024-02-22T16:12:39) 9 workers active
|
They all run the same statement like above.
|
Only the time was going up by ~ 250s.
|
|
State somewhere between 16:12:39 and 16:16
|
pluto:/data/results/1708610110/HANG1$ gdb -c ./1/data/gcore.2353714 /data/Server_bin/10.6-MDEV-33508_debug_Og/bin/mariadbd
|
State somewhere between 16:16 and 16:19
|
pluto:/data/results/1708610110/HANG1$ gdb -c ./1/data/core.2353714 /data/Server_bin/10.6-MDEV-33508_debug_Og/bin/mariadbd
|
|
RQG
|
===
|
# git clone https://github.com/mleich1/rqg --branch <pick the right branch> RQG
|
#
|
# GIT_SHOW: HEAD -> master, origin/master, origin/HEAD 24efdf20d844da7e26bf02b494515a503ef8e474 2024-01-18T19:39:19+01:00
|
# rqg.pl : Version 4.4.0 (2023-08)
|
#
|
# $RQG_HOME/rqg.pl \
|
# --gendata=conf/mariadb/oltp.zz \
|
# --max_gd_duration=900 \
|
# --grammar=conf/mariadb/oltp.yy \
|
# --mysqld=--transaction-isolation=REPEATABLE-READ \
|
# --validator=SelectStability \
|
# --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \
|
# --mysqld=--loose-idle_write_transaction_timeout=0 \
|
# --mysqld=--loose-idle_transaction_timeout=0 \
|
# --mysqld=--loose-idle_readonly_transaction_timeout=0 \
|
# --mysqld=--connect_timeout=60 \
|
# --mysqld=--interactive_timeout=28800 \
|
# --mysqld=--slave_net_timeout=60 \
|
# --mysqld=--net_read_timeout=30 \
|
# --mysqld=--net_write_timeout=60 \
|
# --mysqld=--loose-table_lock_wait_timeout=50 \
|
# --mysqld=--wait_timeout=28800 \
|
# --no_mask \
|
# --queries=10000000 \
|
# --seed=random \
|
# --reporters=None \
|
# --reporters=ErrorLog \
|
# --reporters=Deadlock \
|
# --validators=None \
|
# --mysqld=--log_output=none \
|
# --mysqld=--log_bin_trust_function_creators=1 \
|
# --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
|
# --engine=InnoDB \
|
# --restart_timeout=240 \
|
# --mysqld=--plugin-load-add=file_key_management.so \
|
# --mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
|
# --mysqld=--plugin-load-add=provider_lzo.so \
|
# --mysqld=--plugin-load-add=provider_bzip2.so \
|
# --mysqld=--plugin-load-add=provider_lzma.so \
|
# --mysqld=--plugin-load-add=provider_snappy.so \
|
# --mysqld=--plugin-load-add=provider_lz4.so \
|
# --mysqld=--loose-innodb_compression_level=1 \
|
# --duration=300 \
|
# --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
|
# --mysqld=--lock-wait-timeout=86400 \
|
# --mysqld=--innodb-lock-wait-timeout=50 \
|
# --mysqld=--sql_mode=STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION \
|
# --mysqld=--loose-innodb_read_only_compressed=OFF \
|
# --mysqld=--innodb_stats_persistent=off \
|
# --mysqld=--innodb_adaptive_hash_index=off \
|
# --mysqld=--innodb_random_read_ahead=OFF \
|
# --mysqld=--log-bin \
|
# --mysqld=--sync-binlog=1 \
|
# --mysqld=--loose-innodb_evict_tables_on_commit_debug=on \
|
# --threads=33 \
|
# --mysqld=--innodb_use_native_aio=1 \
|
# --mysqld=--innodb_undo_log_truncate=OFF \
|
# --mysqld=--loose_innodb_change_buffering=all \ # <============
|
# --mysqld=--innodb_rollback_on_timeout=OFF \
|
# --mysqld=--innodb_page_size=16K \
|
# --mysqld=--innodb-buffer-pool-size=10M \
|
# --vardir_type=slow \
|
# <local settings>
|
|
Attachments
Issue Links
- causes
-
MDEV-34458 wait_for_read in buf_page_get_low hurts performance
-
- Closed
-
- mentioned in
-
Page Failed to load
There is a deadlock between two threads that are buffering an insert at the same time:
ssh pluto
gdb -c /data/results/1708610110/HANG1/1/data/gcore.2353714 /data/Server_bin/10.6-MDEV-33508_debug_Og/bin/mariadbd
10.6-MDEV-33508 208f3ee34381f7c8a79d114f610c42bb53a5f394
Thread 12 (Thread 0x7f4cc99ed640 (LWP 2363768)):
…
#6 sux_lock<ssux_lock_impl<true> >::s_lock (this=this@entry=0x7f4cdc564a68) at /data/Server/10.6-MDEV-33508/storage/innobase/include/sux_lock.h:362
#7 0x00005555dd5575f2 in buf_page_get_low (page_id=..., page_id@entry=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=8, guess=guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x7f4cc99ea070, err=0x7f4cc99e99ec, allow_ibuf_merge=false) at /data/Server/10.6-MDEV-33508/storage/innobase/buf/buf0buf.cc:2577
#8 0x00005555dd559c06 in buf_page_get_gen (page_id=page_id@entry=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=8, guess=guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x7f4cc99ea070, err=0x7f4cc99e99ec, allow_ibuf_merge=false) at /data/Server/10.6-MDEV-33508/storage/innobase/buf/buf0buf.cc:2927
#9 0x00005555dd52a694 in btr_latch_prev (block=block@entry=0x7f4cdc55f430, page_id=..., page_id@entry=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=RW_X_LATCH, mtr=mtr@entry=0x7f4cc99ea070, err=err@entry=0x7f4cc99e99ec) at /data/Server/10.6-MDEV-33508/storage/innobase/btr/btr0cur.cc:959
#10 0x00005555dd52d03a in btr_cur_t::search_leaf (this=this@entry=0x7f4cc99e9f90, tuple=tuple@entry=0x7f4c9404f7b0, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=BTR_MODIFY_PREV, mtr=mtr@entry=0x7f4cc99ea070) at /data/Server/10.6-MDEV-33508/storage/innobase/btr/btr0cur.cc:1375
#11 0x00005555dd331f5a in btr_pcur_open (mtr=0x7f4cc99ea070, cursor=0x7f4cc99e9f90, latch_mode=BTR_MODIFY_PREV, mode=PAGE_CUR_LE, tuple=0x7f4c9404f7b0) at /data/Server/10.6-MDEV-33508/storage/innobase/include/btr0pcur.h:431
#12 ibuf_insert_low (mode=mode@entry=BTR_MODIFY_PREV, op=op@entry=IBUF_OP_INSERT, no_counter=no_counter@entry=0, entry=entry@entry=0x7f4c8c01a5a8, entry_size=entry_size@entry=13, index=index@entry=0x7f4ca4084070, page_id=..., zip_size=0, thr=0x7f4c8801d038) at /data/Server/10.6-MDEV-33508/storage/innobase/ibuf/ibuf0ibuf.cc:3195
…
Thread 27 (Thread 0x7f4cc9588640 (LWP 2363877)):
…
#5 sux_lock<ssux_lock_impl<true> >::x_lock_upgraded (this=0x7f4cdc55f448) at /data/Server/10.6-MDEV-33508/storage/innobase/include/sux_lock.h:428
#6 buf_page_get_low (page_id=..., page_id@entry=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=2, guess=<optimized out>, guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x7f4cc9585070, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.6-MDEV-33508/storage/innobase/buf/buf0buf.cc:2882
#7 0x00005555dd559c06 in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=2, guess=guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x7f4cc9585070, err=0x7f4cc95849ec, allow_ibuf_merge=false) at /data/Server/10.6-MDEV-33508/storage/innobase/buf/buf0buf.cc:2927
#8 0x00005555dd4ee074 in btr_block_get (index=..., page=387, mode=mode@entry=RW_X_LATCH, merge=merge@entry=false, mtr=mtr@entry=0x7f4cc9585070, err=err@entry=0x7f4cc95849ec, first=0x0) at /data/Server/10.6-MDEV-33508/storage/innobase/btr/btr0btr.cc:230
#9 0x00005555dd52ca7e in btr_cur_t::search_leaf (this=this@entry=0x7f4cc9584f90, tuple=tuple@entry=0x7f4c50050f20, mode=mode@entry=PAGE_CUR_LE, latch_mode=<optimized out>, latch_mode@entry=BTR_MODIFY_PREV, mtr=mtr@entry=0x7f4cc9585070) at /data/Server/10.6-MDEV-33508/storage/innobase/btr/btr0cur.cc:1378
#10 0x00005555dd331f5a in btr_pcur_open (mtr=0x7f4cc9585070, cursor=0x7f4cc9584f90, latch_mode=BTR_MODIFY_PREV, mode=PAGE_CUR_LE, tuple=0x7f4c50050f20) at /data/Server/10.6-MDEV-33508/storage/innobase/include/btr0pcur.h:431
#11 ibuf_insert_low (mode=mode@entry=BTR_MODIFY_PREV, op=op@entry=IBUF_OP_INSERT, no_counter=no_counter@entry=0, entry=entry@entry=0x7f4c340242a8, entry_size=entry_size@entry=13, index=index@entry=0x7f4ca407e4a0, page_id=..., zip_size=0, thr=0x7f4c340358f8) at /data/Server/10.6-MDEV-33508/storage/innobase/ibuf/ibuf0ibuf.cc:3195
#12 0x00005555dd3336f1 in ibuf_insert (op=op@entry=IBUF_OP_INSERT, entry=entry@entry=0x7f4c340242a8, index=0x7f4ca407e4a0, page_id=page_id@entry=..., zip_size=zip_size@entry=0, thr=0x7f4c340358f8) at /data/Server/10.6-MDEV-33508/storage/innobase/ibuf/ibuf0ibuf.cc:3520
Thread 12 is waiting for a latch on the change buffer page 0x182 in the system tablespace, which is held by Thread 27, which in turn is waiting for a latch on page 0x183, which is held by Thread 12.