[MDEV-13820] trx_id_check() fails during row_log_table_apply() - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.3(EOL)
Fix Version/s: 10.3.3
Component/s: Storage Engine - InnoDB
Labels:
None

Description

10.3 3722372ae56b7ea90
mysqld: /data/src/10.3/storage/innobase/include/trx0sys.h:186: bool trx_id_check(const void*, trx_id_t): Assertion `id == 0 \|\| id > trx_id' failed.
171024 20:59:53 [ERROR] mysqld got signal 6 ;

#7 0x00007f00bc64bee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#8 0x00005639db364f32 in trx_id_check (db_trx_id=0x7f00b6258082, trx_id=1287) at /data/src/10.3/storage/innobase/include/trx0sys.h:186
#9 0x00005639db384c68 in row_log_table_apply_delete (thr=0x7f006c14b160, trx_id_col=1, mrec=0x7f00b40df01f "\200", moffsets=0x7f006c17d230, offsets_heap=0x7f006c0645d0, heap=0x7f006c060540, log=0x7f006c040400, save_ext=0x0, ext_size=0) at /data/src/10.3/storage/innobase/row/row0log.cc:2065
#10 0x00005639db3869b9 in row_log_table_apply_op (thr=0x7f006c14b160, new_trx_id_col=1, dup=0x7f00b426d560, error=0x7f00b426d30c, offsets_heap=0x7f006c0645d0, heap=0x7f006c060540, mrec=0x7f00b40df01f "\200", mrec_end=0x7f00b40df030 "", offsets=0x7f006c17d230) at /data/src/10.3/storage/innobase/row/row0log.cc:2624
#11 0x00005639db388ca8 in row_log_table_apply_ops (thr=0x7f006c14b160, dup=0x7f00b426d560, stage=0x7f006c17b350) at /data/src/10.3/storage/innobase/row/row0log.cc:3177
#12 0x00005639db389267 in row_log_table_apply (thr=0x7f006c14b160, old_table=0x7f006c093d88, table=0x7f006c152bd0, stage=0x7f006c17b350) at /data/src/10.3/storage/innobase/row/row0log.cc:3276
#13 0x00005639db259e2f in commit_try_rebuild (ha_alter_info=0x7f00b426e520, ctx=0x7f006c015fc8, altered_table=0x7f006c152bd0, old_table=0x7f006c14fbd0, trx=0x7f00b65780d8, table_name=0x7f006c14e20d "t2") at /data/src/10.3/storage/innobase/handler/handler0alter.cc:8301
#14 0x00005639db254a90 in ha_innobase::commit_inplace_alter_table (this=0x7f006c1507f8, altered_table=0x7f006c152bd0, ha_alter_info=0x7f00b426e520, commit=true) at /data/src/10.3/storage/innobase/handler/handler0alter.cc:9137
#15 0x00005639daef0c76 in handler::ha_commit_inplace_alter_table (this=0x7f006c1507f8, altered_table=0x7f006c152bd0, ha_alter_info=0x7f00b426e520, commit=true) at /data/src/10.3/sql/handler.cc:4251
#16 0x00005639dacf7f58 in mysql_inplace_alter_table (thd=0x7f006c000b00, table_list=0x7f006c014ae8, table=0x7f006c14fbd0, altered_table=0x7f006c152bd0, ha_alter_info=0x7f00b426e520, inplace_supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x7f00b426e590, alter_ctx=0x7f00b426f150) at /data/src/10.3/sql/sql_table.cc:7229
#17 0x00005639dacfd23c in mysql_alter_table (thd=0x7f006c000b00, new_db=0x7f006c0150f8 "test", new_name=0x0, create_info=0x7f00b426fd60, table_list=0x7f006c014ae8, alter_info=0x7f00b426fcb0, order_num=0, order=0x0, ignore=false) at /data/src/10.3/sql/sql_table.cc:9205
#18 0x00005639dad7be7b in Sql_cmd_alter_table::execute (this=0x7f006c015100, thd=0x7f006c000b00) at /data/src/10.3/sql/sql_alter.cc:331
#19 0x00005639dac2f172 in mysql_execute_command (thd=0x7f006c000b00) at /data/src/10.3/sql/sql_parse.cc:6236
#20 0x00005639dac33a1c in mysql_parse (thd=0x7f006c000b00, rawbuf=0x7f006c0149f8 "ALTER TABLE t2 ROW_FORMAT=DYNAMIC", length=33, parser_state=0x7f00b4271610, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:7921
#21 0x00005639dac211bd in dispatch_command (command=COM_QUERY, thd=0x7f006c000b00, packet=0x7f006c08fce1 "ALTER TABLE t2 ROW_FORMAT=DYNAMIC", packet_length=33, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:1819
#22 0x00005639dac1fc1b in do_command (thd=0x7f006c000b00) at /data/src/10.3/sql/sql_parse.cc:1370
#23 0x00005639dad769f4 in do_handle_one_connection (connect=0x5639de539bc0) at /data/src/10.3/sql/sql_connect.cc:1418
#24 0x00005639dad76781 in handle_one_connection (arg=0x5639de539bc0) at /data/src/10.3/sql/sql_connect.cc:1324
#25 0x00005639db1f0f4c in pfs_spawn_thread (arg=0x5639de5567b0) at /data/src/10.3/storage/perfschema/pfs.cc:1862
#26 0x00007f00be590494 in start_thread (arg=0x7f00b4272700) at pthread_create.c:333
#27 0x00007f00bc70893f in clone () from /lib/x86_64-linux-gnu/libc.so.6

The test case is not deterministic. If it doesn't fail right away, try running with --repeat.

Also, running with innodb_flush_log_at_trx_commit=2 significantly increased the probability of the race condition in original stress tests while running on disk – once in a few minutes vs once in many hours; and further, running in shm increased the probability even more, to once in a few seconds. Currently the MTR test below fails pretty much every time for me, even on disk and without innodb_flush_log_at_trx_commit=2, so I cannot say whether those options have any effect on it.

--source include/have_innodb.inc

CREATE TABLE t1 (pk INT PRIMARY KEY, i INT) ENGINE=InnoDB;

CREATE TABLE t2 (pk INT PRIMARY KEY, i INT) ENGINE=InnoDB;

--connect (con1,localhost,root,,test)

BEGIN;

--send

  INSERT INTO t1 VALUES (1,10);

--connection default

--send

  ALTER TABLE t2 ROW_FORMAT=DYNAMIC;

--connection con1

--reap

INSERT INTO t2 VALUES (2,20);

ROLLBACK;

--connection default

--reap

# Cleanup

--disconnect con1

--connection default

DROP TABLE t1, t2;

Does not fail on 10.2.
No visible effect on 10.3 non-debug build.
The problem rather badly affects upgrade tests on 10.3 (the laste step, when DDL/DML is running on the upgraded server), causes a lot of failures unrelated to upgrade as such.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

threads
32 kB
2017-10-24 18:09

Issue Links

is caused by

MDEV-12288 Reset DB_TRX_ID when the history is removed, to speed up MVCC

Closed

MDEV-13536 DB_TRX_ID is not actually being reset when the history is removed

Closed

MDEV-13654 Various crashes due to DB_TRX_ID mismatch in table-rebuilding ALTER TABLE…LOCK=NONE

Closed

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2017-10-24 20:03

This assertion only exists in 10.3.
The assertion fails on the record that was inserted by

INSERT INTO t2 VALUES (2,20);

There is a race condition in the test, but in my case this transaction ID was assigned right before the transaction for ALTER TABLE t2 started.

It looks like the ROLLBACK completed execution. In t2.ibd (and in the buffer pool copy of its root page 5:3) there is no user record. As part of ~~MDEV-11369~~ we would empty the root page when the table becomes empty. So, there is not even PAGE_FREE garbage for the (2,20).

The assertion fails on the intermediate copy that is being created for t2.ibd (page 6:3). That record must have been copied during the initial copy phase. At that point, the transaction was active.

The proper fix would seem to be to relax this assertion. After all, we are applying the delete operation related to rolling back the INSERT.

Currently row_merge_read_clustered_index() only resets the DB_TRX_ID for records that belonged to transactions that were committed before the ALTER TABLE started:

		if (old_table == new_table) {

			/* Do not bother touching DB_TRX_ID,DB_ROLL_PTR

			because they are not going to be written into

			secondary indexes. */

		} else if (rec_trx_id < trx->id) {

			/* Reset the DB_TRX_ID,DB_ROLL_PTR of old rows

			for which history is not going to be

			available after the rebuild operation.

			This essentially mimics row_purge_reset_trx_id(). */

			row->fields[new_trx_id_col].data

				= const_cast<byte*>(reset_trx_id);

			row->fields[new_trx_id_col + 1].data

				= const_cast<byte*>(reset_trx_id

						    + DATA_TRX_ID_LEN);

Marko Mäkelä added a comment - 2017-10-24 20:03 This assertion only exists in 10.3. The assertion fails on the record that was inserted by INSERT INTO t2 VALUES (2,20); There is a race condition in the test, but in my case this transaction ID was assigned right before the transaction for ALTER TABLE t2 started. It looks like the ROLLBACK completed execution. In t2.ibd (and in the buffer pool copy of its root page 5:3) there is no user record. As part of MDEV-11369 we would empty the root page when the table becomes empty. So, there is not even PAGE_FREE garbage for the (2,20). The assertion fails on the intermediate copy that is being created for t2.ibd (page 6:3). That record must have been copied during the initial copy phase. At that point, the transaction was active. The proper fix would seem to be to relax this assertion. After all, we are applying the delete operation related to rolling back the INSERT. Currently row_merge_read_clustered_index() only resets the DB_TRX_ID for records that belonged to transactions that were committed before the ALTER TABLE started: if (old_table == new_table) { /* Do not bother touching DB_TRX_ID,DB_ROLL_PTR because they are not going to be written into secondary indexes. */ } else if (rec_trx_id < trx->id) { /* Reset the DB_TRX_ID,DB_ROLL_PTR of old rows for which history is not going to be available after the rebuild operation. This essentially mimics row_purge_reset_trx_id(). */ row->fields[new_trx_id_col].data = const_cast <byte*>(reset_trx_id); row->fields[new_trx_id_col + 1].data = const_cast <byte*>(reset_trx_id + DATA_TRX_ID_LEN); }

Marko Mäkelä added a comment - 2017-10-24 20:06

We cannot simply relax the assertion.
mrec_trx_id is 0 in this case, and we are comparing it to the rolled-back rec_trx_id. This comparison would fail, causing the delete operation to be skipped:

		if (memcmp(mrec_trx_id, rec_trx_id,

			   DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN)) {

			/* The ROW_T_DELETE was logged for a different

			PRIMARY KEY,DB_TRX_ID,DB_ROLL_PTR.

			This is possible if a ROW_T_INSERT was skipped

			or a ROW_T_UPDATE was interpreted as ROW_T_DELETE

			because some BLOBs were missing due to

			(1) rolling back the initial insert, or

			(2) purging the BLOB for a later ROW_T_DELETE

			(3) purging 'old values' for a later ROW_T_UPDATE

			or ROW_T_DELETE. */

			ut_ad(!log->same_pk);

			goto all_done;

Special consideration is needed when the operation involves DROP PRIMARY KEY, ADD PRIMARY KEY.
This could affect the correctness of table-rebuilding online ALTER in non-debug builds.

Marko Mäkelä added a comment - 2017-10-24 20:06 We cannot simply relax the assertion. mrec_trx_id is 0 in this case, and we are comparing it to the rolled-back rec_trx_id. This comparison would fail, causing the delete operation to be skipped: if ( memcmp (mrec_trx_id, rec_trx_id, DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN)) { /* The ROW_T_DELETE was logged for a different PRIMARY KEY,DB_TRX_ID,DB_ROLL_PTR. This is possible if a ROW_T_INSERT was skipped or a ROW_T_UPDATE was interpreted as ROW_T_DELETE because some BLOBs were missing due to (1) rolling back the initial insert, or (2) purging the BLOB for a later ROW_T_DELETE (3) purging 'old values' for a later ROW_T_UPDATE or ROW_T_DELETE. */ ut_ad(! log ->same_pk); goto all_done; } Special consideration is needed when the operation involves DROP PRIMARY KEY, ADD PRIMARY KEY. This could affect the correctness of table-rebuilding online ALTER in non-debug builds.

Marko Mäkelä added a comment - 2017-12-07 08:47

Here is a more deterministic version of the test. It is important that the transaction that did performed concurrent modifications was started (and a transaction ID assigned) before the ALTER TABLE was started:

--source include/have_innodb.inc

--source include/have_debug.inc

--source include/have_debug_sync.inc

CREATE TABLE t1 (pk INT PRIMARY KEY) ENGINE=InnoDB;

CREATE TABLE t2 (pk INT PRIMARY KEY) ENGINE=InnoDB;

BEGIN;

INSERT INTO t2 SET pk=1;

--connect (con1,localhost,root,,test)

SET DEBUG_SYNC='alter_table_inplace_after_lock_downgrade SIGNAL prepared WAIT_FOR logged';

send ALTER TABLE t1 FORCE;

--connection default

SET DEBUG_SYNC='now WAIT_FOR prepared';

INSERT INTO t1 VALUES (1);

SET DEBUG_SYNC='now SIGNAL logged';

ROLLBACK;

--connection con1

reap;

--disconnect con1

--connection default

SET DEBUG_SYNC='RESET';

DROP TABLE t1;

This certainly highlights a thinking mistake that I made when developing the following fix:

~~MDEV-13654~~ Various crashes due to DB_TRX_ID mismatch in table-rebuilding ALTER TABLE…LOCK=NONE

The assertion trx_id_check() was introduced in the above fix. It is valid by design in row_merge_read_clustered_index(), because there we are performing REPEATABLE READ, so we will not see any records that were written after the ALTER TABLE. Also, thanks to the exclusive lock during ha_innobase::prepare_inplace_alter_table(), there cannot exist any older transactions that have already modified the table at the time the ALTER TABLE started the copying. So, at that point, REPEATABLE READ cannot see rows from active but earlier-started transactions.
In row_log_table_apply() it is a different matter. We can indeed see changes by older-started transactions, and ideally we should reset these DB_TRX_ID to 0, because purge would not be run on the rebuilt table. This may be challenging at least in the case of ADD PRIMARY KEY, because there records in the rebuilt table would be identified by both PRIMARY KEY and DB_TRX_ID.

Marko Mäkelä added a comment - 2017-12-07 08:47 Here is a more deterministic version of the test. It is important that the transaction that did performed concurrent modifications was started (and a transaction ID assigned) before the ALTER TABLE was started: --source include/have_innodb.inc --source include/have_debug.inc --source include/have_debug_sync.inc CREATE TABLE t1 (pk INT PRIMARY KEY ) ENGINE=InnoDB; CREATE TABLE t2 (pk INT PRIMARY KEY ) ENGINE=InnoDB; BEGIN ; INSERT INTO t2 SET pk=1; --connect (con1,localhost,root,,test) SET DEBUG_SYNC= 'alter_table_inplace_after_lock_downgrade SIGNAL prepared WAIT_FOR logged' ; send ALTER TABLE t1 FORCE ; --connection default SET DEBUG_SYNC= 'now WAIT_FOR prepared' ; INSERT INTO t1 VALUES (1); SET DEBUG_SYNC= 'now SIGNAL logged' ; ROLLBACK ; --connection con1 reap; --disconnect con1 --connection default SET DEBUG_SYNC= 'RESET' ; DROP TABLE t1; This certainly highlights a thinking mistake that I made when developing the following fix: MDEV-13654 Various crashes due to DB_TRX_ID mismatch in table-rebuilding ALTER TABLE…LOCK=NONE The assertion trx_id_check() was introduced in the above fix. It is valid by design in row_merge_read_clustered_index(), because there we are performing REPEATABLE READ, so we will not see any records that were written after the ALTER TABLE. Also, thanks to the exclusive lock during ha_innobase::prepare_inplace_alter_table(), there cannot exist any older transactions that have already modified the table at the time the ALTER TABLE started the copying. So, at that point, REPEATABLE READ cannot see rows from active but earlier-started transactions. In row_log_table_apply() it is a different matter. We can indeed see changes by older-started transactions, and ideally we should reset these DB_TRX_ID to 0, because purge would not be run on the rebuilt table. This may be challenging at least in the case of ADD PRIMARY KEY, because there records in the rebuilt table would be identified by both PRIMARY KEY and DB_TRX_ID.

Marko Mäkelä added a comment - 2017-12-07 12:25

When logging ROW_T_INSERT or ROW_T_UPDATE records, we did not normalize the DB_TRX_ID of the current transaction into 0 if the current transaction had started (modifying other tables) before the ALTER TABLE started.
Starting with ~~MDEV-13654~~, this normalization was already performed for ROW_T_DELETE, and for all operations with ADD PRIMARY KEY, in row_log_table_get_pk().

Marko Mäkelä added a comment - 2017-12-07 12:25 When logging ROW_T_INSERT or ROW_T_UPDATE records, we did not normalize the DB_TRX_ID of the current transaction into 0 if the current transaction had started (modifying other tables) before the ALTER TABLE started. Starting with MDEV-13654 , this normalization was already performed for ROW_T_DELETE, and for all operations with ADD PRIMARY KEY, in row_log_table_get_pk().

MariaDB Server

trx_id_check() fails during row_log_table_apply()

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration