[MDEV-13497] concurrently used sequence hangs in SEQUENCE::next_value - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.3.0
Fix Version/s: 10.3.1
Component/s: Sequences
Labels:
None

Description

A workload where many workers concurrently pull a number from a InnoDB sequence and insert the number into another InnoDB table using transactions, occasionally causes all worker threads to hang in SEQUENCE::next_value. Everything is fine when no transactions are used (autocommit) or when the sequence object is using the MyISAM engine.

How to repeat:

You need Lua-enabled sysbench, i.e. from https://github.com/hgxl64/sysbench-mariadb.git. Then use the attached test script sequence_native.lua:

mysql -S ... -u root -e "drop database if exists sbtest"

mysql -S ... -u root -e "create database sbtest"

sysbench --test=sequence_native.lua --oltp-tables-count=32 \

 --mysql-table-engine=InnoDB --sequence-cache=10 \

 --mysql-user=root --mysql-socket=... prepare

sysbench --test=sequence_native.lua --oltp-tables-count=32 \

 --trx-size=10 --num-threads=2 --report-interval=1 --max-time=0 \

 --max-requests=0 --mysql-user=root --mysql-socket=... run

You might need to alter the sequence cache size or number of threads. The Lua script understands a few options:
--sequence-cache is used for the CACHE option when creating the sequence
--sequence-engine is used for the ENGINE option of the sequence
--trx-size gives the number of inserts in each transaction

My guess is, that something goes wrong if the sequence cache runs empty for multiple worker threads and they have to dive for the base table at the same time. The bigger the sequence cache is, the more worker threads are needed to trigger the hang. I.e. with Cache=1000 I need 16 threads.

Attached is also my.cnf and a stack trace from a hanging mysqld.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

hang.trace.txt.gz
8 kB
2017-08-10 19:51
my.cnf
0.3 kB
2017-08-10 19:51
sequence_native.lua
2 kB
2017-08-10 19:51

Issue Links

relates to

MDEV-10139 Support for SEQUENCE objects

Closed

MDEV-12930 Testing SEQUENCE object

Closed

MDEV-10139 Support for SEQUENCE objects

Closed

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2017-08-15 13:22

This looks like an InnoDB bug. From the hang.trace.txt.gz:

#4  0x00007fde473f63ab in lock_wait_suspend_thread (thr=thr@entry=0x7fdb8801d8a0) at /home/axel/mariadb-source/mariadb-10.3.0/storage/innobase/lock/lock0wait.cc:337

#5  0x00007fde47486b77 in row_mysql_handle_errors (new_err=new_err@entry=0x7fdbec090d30, trx=trx@entry=0x7fde4409ed58, thr=thr@entry=0x7fdb8801d8a0, savept=savept@entry=0x0) at /home/axel/mariadb-source/mariadb-10.3.0/storage/innobase/row/row0mysql.cc:795

#6  0x00007fde474a75ea in row_search_mvcc (buf=buf@entry=0x7fdb88018460 "\377\301\031\262", mode=mode@entry=PAGE_CUR_G, prebuilt=0x7fdb8801cca0, match_mode=match_mode@entry=0, direction=direction@entry=0) at /home/axel/mariadb-source/mariadb-10.3.0/storage/innobase/row/row0sel.cc:5653

#7  0x00007fde473b210a in ha_innobase::index_read (this=0x7fdb88017830, buf=0x7fdb88018460 "\377\301\031\262", key_ptr=<optimized out>, key_len=<optimized out>, find_flag=<optimized out>) at /home/axel/mariadb-source/mariadb-10.3.0/storage/innobase/handler/ha_innodb.cc:9945

#8  0x00007fde4739acb7 in ha_innobase::index_first (this=<optimized out>, buf=<optimized out>) at /home/axel/mariadb-source/mariadb-10.3.0/storage/innobase/handler/ha_innodb.cc:10373

#9  0x00007fde473b16b9 in ha_innobase::rnd_next (this=0x7fdb88017830, buf=<optimized out>) at /home/axel/mariadb-source/mariadb-10.3.0/storage/innobase/handler/ha_innodb.cc:10473

#10 0x00007fde47217467 in handler::ha_rnd_next (this=this@entry=0x7fdb88017830, buf=0x7fdb88018460 "\377\301\031\262") at /home/axel/mariadb-source/mariadb-10.3.0/sql/handler.cc:2593

#11 0x00007fde47217535 in handler::update_first_row (this=0x7fdb88017830, new_data=0x7fdb88018420 "\377\211L\262") at /home/axel/mariadb-source/mariadb-10.3.0/sql/handler.cc:6041

#12 0x00007fde477a7d18 in ha_sequence::write_row (this=0x7fdb88017fb0, buf=0x7fdb88018420 "\377\211L\262") at /home/axel/mariadb-source/mariadb-10.3.0/sql/ha_sequence.cc:216

There should be no InnoDB table or record locking for the SEQUENCE pseudo-table.

Marko Mäkelä added a comment - 2017-08-15 13:22 This looks like an InnoDB bug. From the hang.trace.txt.gz : #4 0x00007fde473f63ab in lock_wait_suspend_thread (thr=thr@entry=0x7fdb8801d8a0) at /home/axel/mariadb-source/mariadb-10.3.0/storage/innobase/lock/lock0wait.cc:337 #5 0x00007fde47486b77 in row_mysql_handle_errors (new_err=new_err@entry=0x7fdbec090d30, trx=trx@entry=0x7fde4409ed58, thr=thr@entry=0x7fdb8801d8a0, savept=savept@entry=0x0) at /home/axel/mariadb-source/mariadb-10.3.0/storage/innobase/row/row0mysql.cc:795 #6 0x00007fde474a75ea in row_search_mvcc (buf=buf@entry=0x7fdb88018460 "\377\301\031\262", mode=mode@entry=PAGE_CUR_G, prebuilt=0x7fdb8801cca0, match_mode=match_mode@entry=0, direction=direction@entry=0) at /home/axel/mariadb-source/mariadb-10.3.0/storage/innobase/row/row0sel.cc:5653 #7 0x00007fde473b210a in ha_innobase::index_read (this=0x7fdb88017830, buf=0x7fdb88018460 "\377\301\031\262", key_ptr=<optimized out>, key_len=<optimized out>, find_flag=<optimized out>) at /home/axel/mariadb-source/mariadb-10.3.0/storage/innobase/handler/ha_innodb.cc:9945 #8 0x00007fde4739acb7 in ha_innobase::index_first (this=<optimized out>, buf=<optimized out>) at /home/axel/mariadb-source/mariadb-10.3.0/storage/innobase/handler/ha_innodb.cc:10373 #9 0x00007fde473b16b9 in ha_innobase::rnd_next (this=0x7fdb88017830, buf=<optimized out>) at /home/axel/mariadb-source/mariadb-10.3.0/storage/innobase/handler/ha_innodb.cc:10473 #10 0x00007fde47217467 in handler::ha_rnd_next (this=this@entry=0x7fdb88017830, buf=0x7fdb88018460 "\377\301\031\262") at /home/axel/mariadb-source/mariadb-10.3.0/sql/handler.cc:2593 #11 0x00007fde47217535 in handler::update_first_row (this=0x7fdb88017830, new_data=0x7fdb88018420 "\377\211L\262") at /home/axel/mariadb-source/mariadb-10.3.0/sql/handler.cc:6041 #12 0x00007fde477a7d18 in ha_sequence::write_row (this=0x7fdb88017fb0, buf=0x7fdb88018420 "\377\211L\262") at /home/axel/mariadb-source/mariadb-10.3.0/sql/ha_sequence.cc:216 There should be no InnoDB table or record locking for the SEQUENCE pseudo-table.

Marko Mäkelä added a comment - 2017-08-16 13:19

I implemented the InnoDB part of SEQUENCE in ~~MDEV-10139~~ in the bb-10.2-ext branch, which is in the merge workflow: 10.2→bb-10.2-ext→10.3.
Since the last merge from bb-10.2-ext before the 10.3.0 release, there are multiple fixes to SEQUENCE bugs.

Because the test did not involve restarting the server, we can rule out this bug:
~~MDEV-13015~~ After restart, InnoDB wrongly thinks that a SEQUENCE is a TABLE

The stack trace indicates that InnoDB wrongly tried to acquire a transactional lock on a sequence. Sequences are not supposed to use any transactions or transactional locks. They are being persisted via the InnoDB redo log. Concurrency control inside InnoDB is supposed to rely solely on the buf_page_t::lock on the page that stores the state of the sequence. The sequence is internally maintained in a 1-page, 1-record clustered index that is updated-in-place.

I believe that this bug was fixed in this MDEV-10139 follow-up commit that includes the following change:

diff --git a/storage/innobase/row/row0sel.cc b/storage/innobase/row/row0sel.cc

index 106845f73fa..18ae2a8a4e8 100644

--- a/storage/innobase/row/row0sel.cc

+++ b/storage/innobase/row/row0sel.cc

@@ -4481,8 +4481,6 @@ row_search_mvcc(

 	      || MVCC::is_view_active(trx->read_view)

 	      || srv_read_only_mode);

-	trx_start_if_not_started(trx, false);

 	if (trx->isolation_level <= TRX_ISO_READ_COMMITTED

 	    && prebuilt->select_lock_type != LOCK_NONE

 	    && trx->mysql_thd != NULL

@@ -4533,9 +4531,14 @@ row_search_mvcc(

 			fputc('\n', stderr);

 			ut_error;

+	} else if (prebuilt->table->no_rollback()) {

+		/* NO_ROLLBACK tables do not support MVCC or locking. */

+		prebuilt->select_lock_type = LOCK_NONE;

+		prebuilt->sql_stat_start = FALSE;

 	} else if (prebuilt->select_lock_type == LOCK_NONE) {

 		/* This is a consistent read */

 		/* Assign a read view for the query */

+		trx_start_if_not_started(trx, false);

 		if (!srv_read_only_mode) {

 			trx_assign_read_view(trx);

@@ -4543,6 +4546,7 @@ row_search_mvcc(

 		prebuilt->sql_stat_start = FALSE;

 	} else {

+		trx_start_if_not_started(trx, false);

 wait_table_again:

 		err = lock_table(0, prebuilt->table,

 				 prebuilt->select_lock_type == LOCK_S

Marko Mäkelä added a comment - 2017-08-16 13:19 I implemented the InnoDB part of SEQUENCE in MDEV-10139 in the bb-10.2-ext branch, which is in the merge workflow: 10.2→bb-10.2-ext→10.3. Since the last merge from bb-10.2-ext before the 10.3.0 release , there are multiple fixes to SEQUENCE bugs. Because the test did not involve restarting the server, we can rule out this bug: MDEV-13015 After restart, InnoDB wrongly thinks that a SEQUENCE is a TABLE The stack trace indicates that InnoDB wrongly tried to acquire a transactional lock on a sequence. Sequences are not supposed to use any transactions or transactional locks. They are being persisted via the InnoDB redo log. Concurrency control inside InnoDB is supposed to rely solely on the buf_page_t::lock on the page that stores the state of the sequence. The sequence is internally maintained in a 1-page, 1-record clustered index that is updated-in-place. I believe that this bug was fixed in this MDEV-10139 follow-up commit that includes the following change: diff --git a/storage/innobase/row/row0sel.cc b/storage/innobase/row/row0sel.cc index 106845f73fa..18ae2a8a4e8 100644 --- a/storage/innobase/row/row0sel.cc +++ b/storage/innobase/row/row0sel.cc @@ -4481,8 +4481,6 @@ row_search_mvcc( || MVCC::is_view_active(trx->read_view) || srv_read_only_mode); - trx_start_if_not_started(trx, false); - if (trx->isolation_level <= TRX_ISO_READ_COMMITTED && prebuilt->select_lock_type != LOCK_NONE && trx->mysql_thd != NULL @@ -4533,9 +4531,14 @@ row_search_mvcc( fputc('\n', stderr); ut_error; } + } else if (prebuilt->table->no_rollback()) { + /* NO_ROLLBACK tables do not support MVCC or locking. */ + prebuilt->select_lock_type = LOCK_NONE; + prebuilt->sql_stat_start = FALSE; } else if (prebuilt->select_lock_type == LOCK_NONE) { /* This is a consistent read */ /* Assign a read view for the query */ + trx_start_if_not_started(trx, false); if (!srv_read_only_mode) { trx_assign_read_view(trx); @@ -4543,6 +4546,7 @@ row_search_mvcc( prebuilt->sql_stat_start = FALSE; } else { + trx_start_if_not_started(trx, false); wait_table_again: err = lock_table(0, prebuilt->table, prebuilt->select_lock_type == LOCK_S

Marko Mäkelä added a comment - 2017-08-16 13:21

axel, please confirm if this bug is repeatable with

git checkout 8acf4d6f783e5a5f736624064989b88867143685~

and fixed in

git checkout 8acf4d6f783e5a5f736624064989b88867143685

If this is the case, please close the bug as fixed in 10.3.1, because that fix is present in 10.3.1.

Marko Mäkelä added a comment - 2017-08-16 13:21 axel , please confirm if this bug is repeatable with git checkout 8acf4d6f783e5a5f736624064989b88867143685~ and fixed in git checkout 8acf4d6f783e5a5f736624064989b88867143685 If this is the case, please close the bug as fixed in 10.3.1, because that fix is present in 10.3.1.

Marko Mäkelä added a comment - 2017-08-17 07:59

I confirmed that the bug was fixed by commit 8acf4d6f783e5a5f736624064989b88867143685. I can immediately reproduce the hang with the parent of that commit, and not with that commit.

Marko Mäkelä added a comment - 2017-08-17 07:59 I confirmed that the bug was fixed by commit 8acf4d6f783e5a5f736624064989b88867143685 . I can immediately reproduce the hang with the parent of that commit, and not with that commit.

MariaDB Server

concurrently used sequence hangs in SEQUENCE::next_value

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration