This failure had been reported in MDEV-15533, which was filed about a different failure in 10.2.
There are multiple assertions log->blobs in the function row_log_table_apply_update(). The one that fails in the rr replay trace appears to be specific to 10.3 and later, likely caused by MDEV-12288:
if (memcmp(rec_trx_id, old_pk_trx_id->data,
DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN)) {
/* The ROW_T_UPDATE was logged for a different
DB_TRX_ID,DB_ROLL_PTR. This is possible if an
earlier ROW_T_INSERT or ROW_T_UPDATE was diverted
because some BLOBs were missing due to rolling
back the initial insert or due to purging
the old BLOB values of an update. */
ut_ad(log->blobs);
In the trace that I analyzed, old_trx_id->data is reset_trx_id.
We should adjust the condition to take log->min_trx into account. At the time of the mismatch, we have mach_read_from_6(rec_trx_id) == 6388 and log->min_trx == 6385. It seems that we are correctly treating this as a mismatch of the DB_TRX_ID, because in row_log_table_get_pk() the original DB_TRX_ID must have been smaller than 6385. So, this might be a 'harmless' debug assertion failure.
This should only affect cases where the PRIMARY KEY is being redefined while concurrent DML is being executed. In the trace that I analyzed, ALTER TABLE t2 ADD PRIMARY KEY (col2) was being executed.
Attachments
Issue Links
is caused by
MDEV-13654Various crashes due to DB_TRX_ID mismatch in table-rebuilding ALTER TABLE…LOCK=NONE
Closed
relates to
MDEV-15533Assertion `log->blobs' failed in row_log_table_apply_update
mleich, you posted a RQG grammar to MDEV-15533. Could you please generate a new (hopefully smaller) trace with it, hopefully smaller than the one that I analyzed? Any version 10.3, 10.4, 10.5 is fine.
I just realized that applying the reset_trx_id should not affect the outcome of the memcmp(), and the assertion should still fail. So, this might not be caused by MDEV-12288 after all, and the same failure might be possible on 10.2.
Marko Mäkelä
added a comment - - edited mleich , you posted a RQG grammar to MDEV-15533 . Could you please generate a new (hopefully smaller) trace with it, hopefully smaller than the one that I analyzed? Any version 10.3, 10.4, 10.5 is fine.
I just realized that applying the reset_trx_id should not affect the outcome of the memcmp() , and the assertion should still fail. So, this might not be caused by MDEV-12288 after all, and the same failure might be possible on 10.2.
I checked the shorter trace (with 31666 events until the crash). It is executing ADD PRIMARY KEY on a ROW_FORMAT=COMPRESSED table. I think that I must first parse the log and then try to understand what exactly went wrong in the INSERT part of applying a ROW_T_UPDATE record:
The 0xc014 indicate BLOB pointers (0x14=20 bytes of pointer written to the clustered index record).
On a quick look, the immediate problem seems to be that both ROW_T_INSERT of the old primary key values DB_ROW_ID=0x205 and DB_ROW_ID=0x206 are using reset_trx_id while the ROW_T_UPDATE are for DB_TRX_ID=0x4f. The log->min_trx is 0x4b. I do not yet see any evidence that MDEV-12288 would be to blame. I must continue the analysis further.
Marko Mäkelä
added a comment - I checked the shorter trace (with 31666 events until the crash). It is executing ADD PRIMARY KEY on a ROW_FORMAT=COMPRESSED table. I think that I must first parse the log and then try to understand what exactly went wrong in the INSERT part of applying a ROW_T_UPDATE record:
10.3 b75563cdfd161e373480949ffa498bd4a8087b12
(rr) p/x *log.tail.block@326
$1 = {0x41, 0x4, 0x14, 0xc0, 0xa, 0x10, 0x0, 0x0, 0x0, 0x0, 0x2, 0x5, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x1,
0x6b, 0x80, 0x0, 0x1, 0x6b, 0x80, 0x0, 0x1, 0x6b, 0x33, 0x33, 0x33, 0x33,
0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x0, 0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x5,
0x0, 0x0, 0x0, 0xc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x8, 0x1, 0x41, 0x4, 0x14,
0xc0, 0xa, 0x10, 0x0, 0x0, 0x0, 0x0, 0x2, 0x6, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x1, 0x6b, 0x80, 0x0, 0x1,
0x6b, 0x80, 0x0, 0x1, 0x6b, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33,
0x33, 0x33, 0x0, 0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x6, 0x0, 0x0, 0x0, 0xc, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x8, 0x1, 0x42, 0x2, 0x9, 0x0, 0x80, 0x0, 0x1,
0x6b, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4, 0x14, 0xc0, 0xa,
0x10, 0x0, 0x0, 0x0, 0x0, 0x2, 0x5, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4f, 0x25,
0x0, 0x0, 0x3, 0x4d, 0x1, 0x10, 0x80, 0x0, 0x1, 0x6b, 0x80, 0x0, 0x0, 0x1,
0x80, 0x0, 0x1, 0x6b, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33,
0x33, 0x0, 0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x5, 0x0, 0x0, 0x0, 0xc, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x8, 0x1, 0x42, 0x2, 0x9, 0x0, 0x80, 0x0, 0x1, 0x6b,
0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4, 0x14, 0xc0, 0xa, 0x10,
0x0, 0x0, 0x0, 0x0, 0x2, 0x6, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4f, 0x25, 0x0, 0x0,
0x3, 0x4d, 0x1, 0x33, 0x80, 0x0, 0x1, 0x6b, 0x80, 0x0, 0x0, 0x1, 0x80, 0x0,
0x1, 0x6b, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x0,
0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x6, 0x0, 0x0, 0x0, 0xc, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x8, 0x1}
(rr) p mrec-log.tail.block
$2 = 265
Partially decoded:
offset 0: ROW_T_INSERT (extra=(0x10,0xa,0xc014),DB_ROW_ID=0x205,reset_trx_id,1,'k',1,'k',1,'k…')
offset 67: ROW_T_INSERT (extra=(0x10,0xa,0xc014),DB_ROW_ID=0x206,reset_trx_id,1,'k',1,'k',1,'k…')
offset 134: ROW_T_UPDATE (old_pk=(363,'333333333',reset_trx_id),extra=(0x10,0xa,0xc014),DB_ROW_ID=0x205,DB_TRX_ID=0x4f,DB_ROLL_PTR=(update),363,1,363,…)
offset 230: ROW_T_UPDATE (old_pk=(363,'333333333',reset_trx_id),extra=(0x10,0xa,0xc014),DB_ROW_ID=0x206,DB_TRX_ID=0x4f,DB_ROLL_PTR=(update),363,1,363,…)
offset 326: EOF
The 0xc014 indicate BLOB pointers (0x14=20 bytes of pointer written to the clustered index record).
On a quick look, the immediate problem seems to be that both ROW_T_INSERT of the old primary key values DB_ROW_ID=0x205 and DB_ROW_ID=0x206 are using reset_trx_id while the ROW_T_UPDATE are for DB_TRX_ID=0x4f . The log->min_trx is 0x4b. I do not yet see any evidence that MDEV-12288 would be to blame. I must continue the analysis further.
Even though the rr replay trace is relatively short, it includes many irrelevant statements, because the RQG grammar has not been simplified for this error.
Both the original table and the copy of the table are initially empty at the time the first ROW_T_INSERT is logged. The sequence of interesting rr replay events (identifed by when) follows:
29335: A 2-row INSERT into the empty table by transaction 47 reaches ha_innobase::write_row() and row_insert_for_mysql(), in Thread 35. (Maybe this was rolled back later?)
29355: The ALTER TABLE t2 starts executing in Thread 3.
29558: Thread 35 starts executing ROLLBACK, to remove the 1 inserted row of transaction 47.
30225: A 2-row INSERT INTO t2 reaches dispatch_command() in Thread 35.
30919: The first row of the 2-row INSERT INTO t2 reaches ha_innobase::write_row() and row_insert_for_mysql() in Thread 35 (transaction 63 with no modifications yet).
30942: The first ROW_T_INSERT is logged by Thread 35.
30944: The second row of the 2-row INSERT INTO t2 reaches ha_innobase::write_row() and row_insert_for_mysql() in Thread 35.
31013: The second ROW_T_INSERT is logged by Thread 35.
31109: Thread 33 (transaction 79, with no modifications to InnoDB yet, UPDATE t2 SET col2 = 1 LIMIT 2) reaches row_update_for_mysql().
31242: The first ROW_T_UPDATE is logged by Thread 33 (the first modification of transaction 79).
31244: The second ROW_T_UPDATE is logged by Thread 33.
31666: Applying the second ROW_T_UPDATE triggers the assertion failure in the ALTER TABLE.
Based on the above, I created a reasonably reduced test case that repeats the crash. 10.1 and 10.2 are not affected:
Marko Mäkelä
added a comment - Even though the rr replay trace is relatively short, it includes many irrelevant statements, because the RQG grammar has not been simplified for this error.
Both the original table and the copy of the table are initially empty at the time the first ROW_T_INSERT is logged. The sequence of interesting rr replay events (identifed by when ) follows:
29335: A 2-row INSERT into the empty table by transaction 47 reaches ha_innobase::write_row() and row_insert_for_mysql() , in Thread 35. (Maybe this was rolled back later?)
29355: The ALTER TABLE t2 starts executing in Thread 3.
29558: Thread 35 starts executing ROLLBACK , to remove the 1 inserted row of transaction 47.
30225: A 2-row INSERT INTO t2 reaches dispatch_command() in Thread 35.
30919: The first row of the 2-row INSERT INTO t2 reaches ha_innobase::write_row() and row_insert_for_mysql() in Thread 35 (transaction 63 with no modifications yet).
30942: The first ROW_T_INSERT is logged by Thread 35.
30944: The second row of the 2-row INSERT INTO t2 reaches ha_innobase::write_row() and row_insert_for_mysql() in Thread 35.
31013: The second ROW_T_INSERT is logged by Thread 35.
31109: Thread 33 (transaction 79, with no modifications to InnoDB yet, UPDATE t2 SET col2 = 1 LIMIT 2 ) reaches row_update_for_mysql() .
31242: The first ROW_T_UPDATE is logged by Thread 33 (the first modification of transaction 79).
31244: The second ROW_T_UPDATE is logged by Thread 33.
31666: Applying the second ROW_T_UPDATE triggers the assertion failure in the ALTER TABLE .
Based on the above, I created a reasonably reduced test case that repeats the crash. 10.1 and 10.2 are not affected:
--source include/have_innodb.inc
CREATE TABLE t1 (a INT ) ENGINE=InnoDB;
CREATE TABLE t2 (col1 INT , col_text TEXT) ENGINE=InnoDB;
connect (con1,localhost,root,,);
BEGIN ;
INSERT INTO t1 VALUES (1);
connection default ;
--echo # needed in 10.1: # SET SQL_MODE=STRICT_TRANS_TABLES;
SET DEBUG_SYNC= 'row_log_table_apply1_before SIGNAL dml WAIT_FOR dml_done' ;
send ALTER TABLE t2 ADD PRIMARY KEY (col_text(9));
connection con1;
SET DEBUG_SYNC= 'now WAIT_FOR dml' ;
INSERT INTO t2 VALUES (363,REPEAT( '3' , 2049)),(363,REPEAT( '3' , 2049));
COMMIT ;
disconnect con1;
connect (con2,localhost,root,,);
UPDATE t2 SET col1 = 1;
SET DEBUG_SYNC= 'now SIGNAL dml_done' ;
disconnect con2;
connection default ;
reap;
SET DEBUG_SYNC= 'RESET' ;
CHECK TABLE t2;
DROP TABLE t1,t2;
The RQG test simplifier was used. But it generated only two bug replays on > 3000 RQG test runs.
So the final simplification was only a reduction of the number of involved concurrent sessions.
Matthias Leich
added a comment - The RQG test simplifier was used. But it generated only two bug replays on > 3000 RQG test runs.
So the final simplification was only a reduction of the number of involved concurrent sessions.
Notably, in 10.3, if I append LIMIT 1 to the UPDATE statement, the test will no longer crash, but it will not report the duplicate PRIMARY KEY either, like it should.
Marko Mäkelä
added a comment - This problem is caused by the fix of MDEV-13654 . Removing all references to row_log_t::min_trx from row0log.cc makes the test pass.
I simplified the test case even further and analyzed the contents of the log in both 10.2 and 10.3.
--source include/have_innodb.inc
CREATE TABLE t1(a TINYINT PRIMARY KEY ) ENGINE=InnoDB;
CREATE TABLE t2(
a TINYINT UNSIGNED PRIMARY KEY ,
b TINYINT UNSIGNED NOT NULL ,
c CHAR (2) NOT NULL )
ENGINE=InnoDB;
connect (con1,localhost,root,,);
BEGIN ;
INSERT INTO t1 VALUES (1);
connection default ;
SET DEBUG_SYNC= 'row_log_table_apply1_before SIGNAL dml WAIT_FOR dml_done' ;
send ALTER TABLE t2 DROP PRIMARY KEY , ADD PRIMARY KEY (c(1));
connection con1;
SET DEBUG_SYNC= 'now WAIT_FOR dml' ;
INSERT INTO t2 VALUES (160,16, 'ab' ),(161,16, 'ac' );
COMMIT ;
disconnect con1;
connect (con2,localhost,root,,);
UPDATE t2 SET b=17;
SET DEBUG_SYNC= 'now SIGNAL dml_done' ;
disconnect con2;
connection default ;
--error ER_DUP_ENTRY
reap;
SET DEBUG_SYNC= 'RESET' ;
CHECK TABLE t2;
DROP TABLE t1,t2;
10.3
ROW_T_INSERT(160,reset_trx_id,16,'ab')
ROW_T_INSERT(161,reset_trx_id,16,'ac')
ROW_T_UPDATE(old_pk('a'),reset_trx_id,
160,DB_TRX_ID=0x2b,DB_ROLL_PTR=(update),17,'ab')
ROW_T_UPDATE(old_pk('a'),reset_trx_id,
161,DB_TRX_ID=0x2b,DB_ROLL_PTR=(update),17,'ac')
10.2
ROW_T_INSERT(160,DB_TRX_ID=0x506,DB_ROLL_PTR=(insert1),16,'ab')
ROW_T_INSERT(161,DB_TRX_ID=0x506,DB_ROLL_PTR=(insert2),16,'ac')
ROW_T_UPDATE(old_pk('a'),DB_TRX_ID=0x506,(insert1),
160,DB_TRX_ID=0x50b,DB_ROLL_PTR=(update),17,'ab')
ROW_T_UPDATE(old_pk('a'),DB_TRX_ID=0x506,DB_ROLL_PTR=(insert1),
161,DB_TRX_ID=0x50b,DB_ROLL_PTR=(update),17,'ac')
Notably, in 10.3, if I append LIMIT 1 to the UPDATE statement, the test will no longer crash, but it will not report the duplicate PRIMARY KEY either, like it should.
It is not an actual fix yet, and we would definitely be missing a change to row_log_table_low_redundant() if this is the right direction.
I believe that the root cause is that applying the second ROW_T_INSERT did not flag a duplicate key error. Here is a simpler test for that, causing no crash, but a wrong result (the ALTER TABLE wrongly succeeds, and the table will contain 1 row, even though 2 were inserted):
SET DEBUG_SYNC='row_log_table_apply1_before SIGNAL dml WAIT_FOR dml_done';
send ALTERTABLE t2 ADDPRIMARYKEY(c(1));
connection con1;
SET DEBUG_SYNC='now WAIT_FOR dml';
INSERTINTO t2 VALUES ('ab'),('ac');
COMMIT;
SET DEBUG_SYNC='now SIGNAL dml_done';
connectiondefault;
--error ER_DUP_ENTRY
reap;
SET DEBUG_SYNC='RESET';
SELECT * FROM t2;
DROPTABLE t1,t2;
Marko Mäkelä
added a comment - The following partial revert of the MDEV-13654 changes makes this particular test pass:
diff --git a/storage/innobase/row/row0log.cc b/storage/innobase/row/row0log.cc
index a3bf91a1c74..267e56bf828 100644
--- a/storage/innobase/row/row0log.cc
+++ b/storage/innobase/row/row0log.cc
@@ -1079,15 +1079,7 @@ row_log_table_low(
b += rec_extra_size;
memcpy(b, fake_extra_buf + 1, fake_extra_size);
b += fake_extra_size;
- ulint len;
- ulint trx_id_offs = rec_get_nth_field_offs(
- offsets, index->n_uniq, &len);
- ut_ad(len == DATA_TRX_ID_LEN);
memcpy(b, rec, rec_offs_data_size(offsets));
- if (trx_read_trx_id(b + trx_id_offs) < log->min_trx) {
- memcpy(b + trx_id_offs,
- reset_trx_id, sizeof reset_trx_id);
- }
b += rec_offs_data_size(offsets);
row_log_table_close(index, b, mrec_size, avail_size);
@@ -1392,13 +1384,7 @@ row_log_table_get_pk(
/* Copy the fields, because the fields will be updated
or the record may be moved somewhere else in the B-tree
as part of the upcoming operation. */
- if (trx_read_trx_id(trx_roll) < log->min_trx) {
- trx_roll = reset_trx_id;
- if (sys) {
- memcpy(sys, trx_roll,
- DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
- }
- } else if (sys) {
+ if (sys) {
memcpy(sys, trx_roll,
DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
trx_roll = sys;
@@ -1409,8 +1395,6 @@ row_log_table_get_pk(
DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN));
}
- ut_d(trx_id_check(trx_roll, log->min_trx));
-
dfield_set_data(dtuple_get_nth_field(tuple, new_n_uniq),
trx_roll, DATA_TRX_ID_LEN);
dfield_set_data(dtuple_get_nth_field(tuple, new_n_uniq + 1),
It is not an actual fix yet, and we would definitely be missing a change to row_log_table_low_redundant() if this is the right direction.
I believe that the root cause is that applying the second ROW_T_INSERT did not flag a duplicate key error. Here is a simpler test for that, causing no crash, but a wrong result (the ALTER TABLE wrongly succeeds, and the table will contain 1 row, even though 2 were inserted):
--source include/have_innodb.inc
CREATE TABLE t1(a TINYINT PRIMARY KEY ) ENGINE=InnoDB;
CREATE TABLE t2(c CHAR (2) NOT NULL ) ENGINE=InnoDB;
connect (con1,localhost,root,,);
BEGIN ;
INSERT INTO t1 VALUES (1);
connection default ;
SET DEBUG_SYNC= 'row_log_table_apply1_before SIGNAL dml WAIT_FOR dml_done' ;
send ALTER TABLE t2 ADD PRIMARY KEY (c(1));
connection con1;
SET DEBUG_SYNC= 'now WAIT_FOR dml' ;
INSERT INTO t2 VALUES ( 'ab' ),( 'ac' );
COMMIT ;
SET DEBUG_SYNC= 'now SIGNAL dml_done' ;
connection default ;
--error ER_DUP_ENTRY
reap;
SET DEBUG_SYNC= 'RESET' ;
SELECT * FROM t2;
DROP TABLE t1,t2;
Finally, after some debugging on 10.2 and 10.3, I found the actual root cause. The following fix should have limited negative impact: more ‘fake’ duplicate key errors could be reported.
+ /* rec is an exact match of entry, and DB_TRX_ID belongs
+ to a transaction that started after our ALTER TABLE. */
return(DB_SUCCESS_LOCKED_REC);
}
Marko Mäkelä
added a comment - Finally, after some debugging on 10.2 and 10.3, I found the actual root cause. The following fix should have limited negative impact: more ‘fake’ duplicate key errors could be reported.
diff --git a/storage/innobase/row/row0ins.cc b/storage/innobase/row/row0ins.cc
index d6596d586ab..13649777419 100644
--- a/storage/innobase/row/row0ins.cc
+++ b/storage/innobase/row/row0ins.cc
@@ -2241,8 +2241,14 @@ row_ins_duplicate_online(
return(DB_SUCCESS);
}
- if (fields == n_uniq + 2) {
- /* rec is an exact match of entry. */
+ ulint trx_id_len;
+
+ if (fields == n_uniq + 2
+ && memcmp(rec_get_nth_field(rec, offsets, n_uniq, &trx_id_len),
+ reset_trx_id, DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN)) {
+ ut_ad(trx_id_len == DATA_TRX_ID_LEN);
+ /* rec is an exact match of entry, and DB_TRX_ID belongs
+ to a transaction that started after our ALTER TABLE. */
return(DB_SUCCESS_LOCKED_REC);
}
Also the following would fail to flag a duplicate key error from an INSERT (or possibly UPDATE) transaction that was started before the ALTER TABLE but did not access the table in question before the ALTER TABLE started:
CREATETABLE t2(a CHAR(2) NOTNULL) ENGINE=InnoDB;
ALTERTABLE t2 ADDPRIMARYKEY(a);
and
CREATETABLE t2(a CHAR(1), b CHAR(1), PRIMARYKEY(a, b)) ENGINE=InnoDB;
ALTERTABLE t2 DROP b, DROPPRIMARYKEY, ADDPRIMARYKEY(a);
Starting with MariaDB 10.2, a simple ALTER TABLE t2 DROP b would be refused. In MariaDB 10.0 and 10.1, it would imply the PRIMARY KEY operations.
Marko Mäkelä
added a comment - Also the following would fail to flag a duplicate key error from an INSERT (or possibly UPDATE ) transaction that was started before the ALTER TABLE but did not access the table in question before the ALTER TABLE started:
CREATE TABLE t2(a CHAR (2) NOT NULL ) ENGINE=InnoDB;
ALTER TABLE t2 ADD PRIMARY KEY (a);
and
CREATE TABLE t2(a CHAR (1), b CHAR (1), PRIMARY KEY (a, b)) ENGINE=InnoDB;
ALTER TABLE t2 DROP b, DROP PRIMARY KEY , ADD PRIMARY KEY (a);
Starting with MariaDB 10.2, a simple ALTER TABLE t2 DROP b would be refused. In MariaDB 10.0 and 10.1, it would imply the PRIMARY KEY operations.
mleich, you posted a RQG grammar to
MDEV-15533. Could you please generate a new (hopefully smaller) trace with it, hopefully smaller than the one that I analyzed? Any version 10.3, 10.4, 10.5 is fine.I just realized that applying the reset_trx_id should not affect the outcome of the memcmp(), and the assertion should still fail. So, this might not be caused by
MDEV-12288after all, and the same failure might be possible on 10.2.