Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-23244

ALTER TABLE…ADD PRIMARY KEY fails to flag duplicate key error from concurrent DML

Details

    Description

      This failure had been reported in MDEV-15533, which was filed about a different failure in 10.2.

      There are multiple assertions log->blobs in the function row_log_table_apply_update(). The one that fails in the rr replay trace appears to be specific to 10.3 and later, likely caused by MDEV-12288:

      		if (memcmp(rec_trx_id, old_pk_trx_id->data,
      			   DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN)) {
      			/* The ROW_T_UPDATE was logged for a different
      			DB_TRX_ID,DB_ROLL_PTR. This is possible if an
      			earlier ROW_T_INSERT or ROW_T_UPDATE was diverted
      			because some BLOBs were missing due to rolling
      			back the initial insert or due to purging
      			the old BLOB values of an update. */
      			ut_ad(log->blobs);
      

      In the trace that I analyzed, old_trx_id->data is reset_trx_id.

      The value was assigned earlier:

      #0  0x000055bd086d1405 in dfield_set_data (len=6, 
          data=0x55bd0997f4e0 <reset_trx_id>, field=0x61200036ddb0)
          at /home/mleich/10.5_A/storage/innobase/include/data0data.ic:93
      #1  row_log_table_get_pk (rec=<optimized out>, 
          index=index@entry=0x61800020f520, offsets=offsets@entry=0x7f9a4025f9d0, 
          sys=sys@entry=0x0, heap=heap@entry=0x7f9a4025f6f0)
          at /home/mleich/10.5_A/storage/innobase/row/row0log.cc:1416
      #2  0x000055bd0876a3fd in row_upd_clust_rec (flags=flags@entry=0, 
          node=node@entry=0x621000c6f220, index=index@entry=0x61800020f520, 
          offsets=<optimized out>, offsets@entry=0x7f9a4025f9d0, 
          offsets_heap=offsets_heap@entry=0x7f9a4025f850, 
          thr=thr@entry=0x621000c6f600, mtr=<optimized out>)
          at /home/mleich/10.5_A/storage/innobase/row/row0upd.cc:2550
      

      The code in row_log_table_get_pk() is directly related to this:

      		if (trx_read_trx_id(trx_roll) < log->min_trx) {
      			trx_roll = reset_trx_id;
      			if (sys) {
      				memcpy(sys, trx_roll,
      				       DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
      			}
      ...
      		}
       
      		ut_d(trx_id_check(trx_roll, log->min_trx));
       
      		dfield_set_data(dtuple_get_nth_field(tuple, new_n_uniq),
      				trx_roll, DATA_TRX_ID_LEN);
      		dfield_set_data(dtuple_get_nth_field(tuple, new_n_uniq + 1),
      				trx_roll + DATA_TRX_ID_LEN, DATA_ROLL_PTR_LEN);
      

      We should adjust the condition to take log->min_trx into account. At the time of the mismatch, we have mach_read_from_6(rec_trx_id) == 6388 and log->min_trx == 6385. It seems that we are correctly treating this as a mismatch of the DB_TRX_ID, because in row_log_table_get_pk() the original DB_TRX_ID must have been smaller than 6385. So, this might be a 'harmless' debug assertion failure.

      This should only affect cases where the PRIMARY KEY is being redefined while concurrent DML is being executed. In the trace that I analyzed, ALTER TABLE t2 ADD PRIMARY KEY (col2) was being executed.

      Attachments

        Issue Links

          Activity

            marko Marko Mäkelä created issue -
            marko Marko Mäkelä made changes -
            Field Original Value New Value
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä added a comment - - edited

            mleich, you posted a RQG grammar to MDEV-15533. Could you please generate a new (hopefully smaller) trace with it, hopefully smaller than the one that I analyzed? Any version 10.3, 10.4, 10.5 is fine.

            I just realized that applying the reset_trx_id should not affect the outcome of the memcmp(), and the assertion should still fail. So, this might not be caused by MDEV-12288 after all, and the same failure might be possible on 10.2.

            marko Marko Mäkelä added a comment - - edited mleich , you posted a RQG grammar to MDEV-15533 . Could you please generate a new (hopefully smaller) trace with it, hopefully smaller than the one that I analyzed? Any version 10.3, 10.4, 10.5 is fine. I just realized that applying the reset_trx_id should not affect the outcome of the memcmp() , and the assertion should still fail. So, this might not be caused by MDEV-12288 after all, and the same failure might be possible on 10.2.
            marko Marko Mäkelä made changes -
            Labels ddl not-10.2 regression-10.3 rr-profile-analyzed ddl need_rr not-10.2 regression-10.3 rr-profile-analyzed
            mleich Matthias Leich made changes -
            Labels ddl need_rr not-10.2 regression-10.3 rr-profile-analyzed ddl not-10.2 regression-10.3 rr-profile rr-profile-analyzed

            I checked the shorter trace (with 31666 events until the crash). It is executing ADD PRIMARY KEY on a ROW_FORMAT=COMPRESSED table. I think that I must first parse the log and then try to understand what exactly went wrong in the INSERT part of applying a ROW_T_UPDATE record:

            10.3 b75563cdfd161e373480949ffa498bd4a8087b12

            (rr) p/x *log.tail.block@326
            $1 = {0x41, 0x4, 0x14, 0xc0, 0xa, 0x10, 0x0, 0x0, 0x0, 0x0, 0x2, 0x5, 0x0, 
              0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x1, 
              0x6b, 0x80, 0x0, 0x1, 0x6b, 0x80, 0x0, 0x1, 0x6b, 0x33, 0x33, 0x33, 0x33, 
              0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x0, 0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x5, 
              0x0, 0x0, 0x0, 0xc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x8, 0x1, 0x41, 0x4, 0x14, 
              0xc0, 0xa, 0x10, 0x0, 0x0, 0x0, 0x0, 0x2, 0x6, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 
              0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x1, 0x6b, 0x80, 0x0, 0x1, 
              0x6b, 0x80, 0x0, 0x1, 0x6b, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 
              0x33, 0x33, 0x0, 0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x6, 0x0, 0x0, 0x0, 0xc, 0x0, 
              0x0, 0x0, 0x0, 0x0, 0x0, 0x8, 0x1, 0x42, 0x2, 0x9, 0x0, 0x80, 0x0, 0x1, 
              0x6b, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x0, 0x0, 0x0, 
              0x0, 0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4, 0x14, 0xc0, 0xa, 
              0x10, 0x0, 0x0, 0x0, 0x0, 0x2, 0x5, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4f, 0x25, 
              0x0, 0x0, 0x3, 0x4d, 0x1, 0x10, 0x80, 0x0, 0x1, 0x6b, 0x80, 0x0, 0x0, 0x1, 
              0x80, 0x0, 0x1, 0x6b, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 
              0x33, 0x0, 0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x5, 0x0, 0x0, 0x0, 0xc, 0x0, 0x0, 
              0x0, 0x0, 0x0, 0x0, 0x8, 0x1, 0x42, 0x2, 0x9, 0x0, 0x80, 0x0, 0x1, 0x6b, 
              0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x0, 0x0, 0x0, 0x0, 
              0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4, 0x14, 0xc0, 0xa, 0x10, 
              0x0, 0x0, 0x0, 0x0, 0x2, 0x6, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4f, 0x25, 0x0, 0x0, 
              0x3, 0x4d, 0x1, 0x33, 0x80, 0x0, 0x1, 0x6b, 0x80, 0x0, 0x0, 0x1, 0x80, 0x0, 
              0x1, 0x6b, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x0, 
              0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x6, 0x0, 0x0, 0x0, 0xc, 0x0, 0x0, 0x0, 0x0, 
              0x0, 0x0, 0x8, 0x1}
            (rr) p mrec-log.tail.block
            $2 = 265
            

            Partially decoded:
            offset 0: ROW_T_INSERT(extra=(0x10,0xa,0xc014),DB_ROW_ID=0x205,reset_trx_id,1,'k',1,'k',1,'k…')
            offset 67: ROW_T_INSERT(extra=(0x10,0xa,0xc014),DB_ROW_ID=0x206,reset_trx_id,1,'k',1,'k',1,'k…')
            offset 134: ROW_T_UPDATE(old_pk=(363,'333333333',reset_trx_id),extra=(0x10,0xa,0xc014),DB_ROW_ID=0x205,DB_TRX_ID=0x4f,DB_ROLL_PTR=(update),363,1,363,…)
            offset 230: ROW_T_UPDATE(old_pk=(363,'333333333',reset_trx_id),extra=(0x10,0xa,0xc014),DB_ROW_ID=0x206,DB_TRX_ID=0x4f,DB_ROLL_PTR=(update),363,1,363,…)
            offset 326: EOF

            The 0xc014 indicate BLOB pointers (0x14=20 bytes of pointer written to the clustered index record).

            On a quick look, the immediate problem seems to be that both ROW_T_INSERT of the old primary key values DB_ROW_ID=0x205 and DB_ROW_ID=0x206 are using reset_trx_id while the ROW_T_UPDATE are for DB_TRX_ID=0x4f. The log->min_trx is 0x4b. I do not yet see any evidence that MDEV-12288 would be to blame. I must continue the analysis further.

            marko Marko Mäkelä added a comment - I checked the shorter trace (with 31666 events until the crash). It is executing ADD PRIMARY KEY on a ROW_FORMAT=COMPRESSED table. I think that I must first parse the log and then try to understand what exactly went wrong in the INSERT part of applying a ROW_T_UPDATE record: 10.3 b75563cdfd161e373480949ffa498bd4a8087b12 (rr) p/x *log.tail.block@326 $1 = {0x41, 0x4, 0x14, 0xc0, 0xa, 0x10, 0x0, 0x0, 0x0, 0x0, 0x2, 0x5, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x1, 0x6b, 0x80, 0x0, 0x1, 0x6b, 0x80, 0x0, 0x1, 0x6b, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x0, 0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x5, 0x0, 0x0, 0x0, 0xc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x8, 0x1, 0x41, 0x4, 0x14, 0xc0, 0xa, 0x10, 0x0, 0x0, 0x0, 0x0, 0x2, 0x6, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x1, 0x6b, 0x80, 0x0, 0x1, 0x6b, 0x80, 0x0, 0x1, 0x6b, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x0, 0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x6, 0x0, 0x0, 0x0, 0xc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x8, 0x1, 0x42, 0x2, 0x9, 0x0, 0x80, 0x0, 0x1, 0x6b, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4, 0x14, 0xc0, 0xa, 0x10, 0x0, 0x0, 0x0, 0x0, 0x2, 0x5, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4f, 0x25, 0x0, 0x0, 0x3, 0x4d, 0x1, 0x10, 0x80, 0x0, 0x1, 0x6b, 0x80, 0x0, 0x0, 0x1, 0x80, 0x0, 0x1, 0x6b, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x0, 0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x5, 0x0, 0x0, 0x0, 0xc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x8, 0x1, 0x42, 0x2, 0x9, 0x0, 0x80, 0x0, 0x1, 0x6b, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4, 0x14, 0xc0, 0xa, 0x10, 0x0, 0x0, 0x0, 0x0, 0x2, 0x6, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4f, 0x25, 0x0, 0x0, 0x3, 0x4d, 0x1, 0x33, 0x80, 0x0, 0x1, 0x6b, 0x80, 0x0, 0x0, 0x1, 0x80, 0x0, 0x1, 0x6b, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x0, 0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x6, 0x0, 0x0, 0x0, 0xc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x8, 0x1} (rr) p mrec-log.tail.block $2 = 265 Partially decoded: offset 0: ROW_T_INSERT (extra=(0x10,0xa,0xc014),DB_ROW_ID=0x205,reset_trx_id,1,'k',1,'k',1,'k…') offset 67: ROW_T_INSERT (extra=(0x10,0xa,0xc014),DB_ROW_ID=0x206,reset_trx_id,1,'k',1,'k',1,'k…') offset 134: ROW_T_UPDATE (old_pk=(363,'333333333',reset_trx_id),extra=(0x10,0xa,0xc014),DB_ROW_ID=0x205,DB_TRX_ID=0x4f,DB_ROLL_PTR=(update),363,1,363,…) offset 230: ROW_T_UPDATE (old_pk=(363,'333333333',reset_trx_id),extra=(0x10,0xa,0xc014),DB_ROW_ID=0x206,DB_TRX_ID=0x4f,DB_ROLL_PTR=(update),363,1,363,…) offset 326: EOF The 0xc014 indicate BLOB pointers (0x14=20 bytes of pointer written to the clustered index record). On a quick look, the immediate problem seems to be that both ROW_T_INSERT of the old primary key values DB_ROW_ID=0x205 and DB_ROW_ID=0x206 are using reset_trx_id while the ROW_T_UPDATE are for DB_TRX_ID=0x4f . The log->min_trx is 0x4b. I do not yet see any evidence that MDEV-12288 would be to blame. I must continue the analysis further.
            marko Marko Mäkelä made changes -
            Status Open [ 1 ] In Progress [ 3 ]

            Even though the rr replay trace is relatively short, it includes many irrelevant statements, because the RQG grammar has not been simplified for this error.

            Both the original table and the copy of the table are initially empty at the time the first ROW_T_INSERT is logged. The sequence of interesting rr replay events (identifed by when) follows:

            1. 29335: A 2-row INSERT into the empty table by transaction 47 reaches ha_innobase::write_row() and row_insert_for_mysql(), in Thread 35. (Maybe this was rolled back later?)
            2. 29355: The ALTER TABLE t2 starts executing in Thread 3.
            3. 29558: Thread 35 starts executing ROLLBACK, to remove the 1 inserted row of transaction 47.
            4. 30225: A 2-row INSERT INTO t2 reaches dispatch_command() in Thread 35.
            5. 30919: The first row of the 2-row INSERT INTO t2 reaches ha_innobase::write_row() and row_insert_for_mysql() in Thread 35 (transaction 63 with no modifications yet).
            6. 30942: The first ROW_T_INSERT is logged by Thread 35.
            7. 30944: The second row of the 2-row INSERT INTO t2 reaches ha_innobase::write_row() and row_insert_for_mysql() in Thread 35.
            8. 31013: The second ROW_T_INSERT is logged by Thread 35.
            9. 31109: Thread 33 (transaction 79, with no modifications to InnoDB yet, UPDATE t2 SET col2 = 1 LIMIT 2) reaches row_update_for_mysql().
            10. 31242: The first ROW_T_UPDATE is logged by Thread 33 (the first modification of transaction 79).
            11. 31244: The second ROW_T_UPDATE is logged by Thread 33.
            12. 31666: Applying the second ROW_T_UPDATE triggers the assertion failure in the ALTER TABLE.

            Based on the above, I created a reasonably reduced test case that repeats the crash. 10.1 and 10.2 are not affected:

            --source include/have_innodb.inc
             
            CREATE TABLE t1 (a INT) ENGINE=InnoDB;
             
            CREATE TABLE t2 (col1 INT, col_text TEXT) ENGINE=InnoDB;
             
            connect (con1,localhost,root,,);
            BEGIN;
            INSERT INTO t1 VALUES(1);
             
            connection default;
            --echo # needed in 10.1: # SET SQL_MODE=STRICT_TRANS_TABLES;
            SET DEBUG_SYNC='row_log_table_apply1_before SIGNAL dml WAIT_FOR dml_done';
            send ALTER TABLE t2 ADD PRIMARY KEY(col_text(9));
             
            connection con1;
            SET DEBUG_SYNC='now WAIT_FOR dml';
            INSERT INTO t2 VALUES (363,REPEAT('3', 2049)),(363,REPEAT('3', 2049));
            COMMIT;
            disconnect con1;
             
            connect (con2,localhost,root,,);
            UPDATE t2 SET col1 = 1;
            SET DEBUG_SYNC='now SIGNAL dml_done';
            disconnect con2;
             
            connection default;
            reap;
            SET DEBUG_SYNC='RESET';
             
            CHECK TABLE t2;
            DROP TABLE t1,t2;
            

            marko Marko Mäkelä added a comment - Even though the rr replay trace is relatively short, it includes many irrelevant statements, because the RQG grammar has not been simplified for this error. Both the original table and the copy of the table are initially empty at the time the first ROW_T_INSERT is logged. The sequence of interesting rr replay events (identifed by when ) follows: 29335: A 2-row INSERT into the empty table by transaction 47 reaches ha_innobase::write_row() and row_insert_for_mysql() , in Thread 35. (Maybe this was rolled back later?) 29355: The ALTER TABLE t2 starts executing in Thread 3. 29558: Thread 35 starts executing ROLLBACK , to remove the 1 inserted row of transaction 47. 30225: A 2-row INSERT INTO t2 reaches dispatch_command() in Thread 35. 30919: The first row of the 2-row INSERT INTO t2 reaches ha_innobase::write_row() and row_insert_for_mysql() in Thread 35 (transaction 63 with no modifications yet). 30942: The first ROW_T_INSERT is logged by Thread 35. 30944: The second row of the 2-row INSERT INTO t2 reaches ha_innobase::write_row() and row_insert_for_mysql() in Thread 35. 31013: The second ROW_T_INSERT is logged by Thread 35. 31109: Thread 33 (transaction 79, with no modifications to InnoDB yet, UPDATE t2 SET col2 = 1 LIMIT 2 ) reaches row_update_for_mysql() . 31242: The first ROW_T_UPDATE is logged by Thread 33 (the first modification of transaction 79). 31244: The second ROW_T_UPDATE is logged by Thread 33. 31666: Applying the second ROW_T_UPDATE triggers the assertion failure in the ALTER TABLE . Based on the above, I created a reasonably reduced test case that repeats the crash. 10.1 and 10.2 are not affected: --source include/have_innodb.inc   CREATE TABLE t1 (a INT ) ENGINE=InnoDB;   CREATE TABLE t2 (col1 INT , col_text TEXT) ENGINE=InnoDB;   connect (con1,localhost,root,,); BEGIN ; INSERT INTO t1 VALUES (1);   connection default ; --echo # needed in 10.1: # SET SQL_MODE=STRICT_TRANS_TABLES; SET DEBUG_SYNC= 'row_log_table_apply1_before SIGNAL dml WAIT_FOR dml_done' ; send ALTER TABLE t2 ADD PRIMARY KEY (col_text(9));   connection con1; SET DEBUG_SYNC= 'now WAIT_FOR dml' ; INSERT INTO t2 VALUES (363,REPEAT( '3' , 2049)),(363,REPEAT( '3' , 2049)); COMMIT ; disconnect con1;   connect (con2,localhost,root,,); UPDATE t2 SET col1 = 1; SET DEBUG_SYNC= 'now SIGNAL dml_done' ; disconnect con2;   connection default ; reap; SET DEBUG_SYNC= 'RESET' ;   CHECK TABLE t2; DROP TABLE t1,t2;
            marko Marko Mäkelä made changes -
            Labels ddl not-10.2 regression-10.3 rr-profile rr-profile-analyzed ddl not-10.2 regression-10.3 rr-profile-analyzed

            The RQG test simplifier was used. But it generated only two bug replays on > 3000 RQG test runs.
            So the final simplification was only a reduction of the number of involved concurrent sessions.

            mleich Matthias Leich added a comment - The RQG test simplifier was used. But it generated only two bug replays on > 3000 RQG test runs. So the final simplification was only a reduction of the number of involved concurrent sessions.

            This problem is caused by the fix of MDEV-13654. Removing all references to row_log_t::min_trx from row0log.cc makes the test pass.

            I simplified the test case even further and analyzed the contents of the log in both 10.2 and 10.3.

            --source include/have_innodb.inc
             
            CREATE TABLE t1(a TINYINT PRIMARY KEY) ENGINE=InnoDB;
             
            CREATE TABLE t2(
             a TINYINT UNSIGNED PRIMARY KEY,
             b TINYINT UNSIGNED NOT NULL,
             c CHAR(2) NOT NULL)
            ENGINE=InnoDB;
             
            connect (con1,localhost,root,,);
            BEGIN;
            INSERT INTO t1 VALUES(1);
             
            connection default;
            SET DEBUG_SYNC='row_log_table_apply1_before SIGNAL dml WAIT_FOR dml_done';
            send ALTER TABLE t2 DROP PRIMARY KEY, ADD PRIMARY KEY(c(1));
             
            connection con1;
            SET DEBUG_SYNC='now WAIT_FOR dml';
            INSERT INTO t2 VALUES (160,16,'ab'),(161,16,'ac');
            COMMIT;
            disconnect con1;
             
            connect (con2,localhost,root,,);
            UPDATE t2 SET b=17;
            SET DEBUG_SYNC='now SIGNAL dml_done';
            disconnect con2;
             
            connection default;
            --error ER_DUP_ENTRY
            reap;
            SET DEBUG_SYNC='RESET';
             
            CHECK TABLE t2;
            DROP TABLE t1,t2;
            

            10.3

            ROW_T_INSERT(160,reset_trx_id,16,'ab')
            ROW_T_INSERT(161,reset_trx_id,16,'ac')
            ROW_T_UPDATE(old_pk('a'),reset_trx_id,
                         160,DB_TRX_ID=0x2b,DB_ROLL_PTR=(update),17,'ab')
            ROW_T_UPDATE(old_pk('a'),reset_trx_id,
                         161,DB_TRX_ID=0x2b,DB_ROLL_PTR=(update),17,'ac')
            

            10.2

            ROW_T_INSERT(160,DB_TRX_ID=0x506,DB_ROLL_PTR=(insert1),16,'ab')
            ROW_T_INSERT(161,DB_TRX_ID=0x506,DB_ROLL_PTR=(insert2),16,'ac')
            ROW_T_UPDATE(old_pk('a'),DB_TRX_ID=0x506,(insert1),
                         160,DB_TRX_ID=0x50b,DB_ROLL_PTR=(update),17,'ab')
            ROW_T_UPDATE(old_pk('a'),DB_TRX_ID=0x506,DB_ROLL_PTR=(insert1),
                         161,DB_TRX_ID=0x50b,DB_ROLL_PTR=(update),17,'ac')
            

            Notably, in 10.3, if I append LIMIT 1 to the UPDATE statement, the test will no longer crash, but it will not report the duplicate PRIMARY KEY either, like it should.

            marko Marko Mäkelä added a comment - This problem is caused by the fix of MDEV-13654 . Removing all references to row_log_t::min_trx from row0log.cc makes the test pass. I simplified the test case even further and analyzed the contents of the log in both 10.2 and 10.3. --source include/have_innodb.inc   CREATE TABLE t1(a TINYINT PRIMARY KEY ) ENGINE=InnoDB;   CREATE TABLE t2( a TINYINT UNSIGNED PRIMARY KEY , b TINYINT UNSIGNED NOT NULL , c CHAR (2) NOT NULL ) ENGINE=InnoDB;   connect (con1,localhost,root,,); BEGIN ; INSERT INTO t1 VALUES (1);   connection default ; SET DEBUG_SYNC= 'row_log_table_apply1_before SIGNAL dml WAIT_FOR dml_done' ; send ALTER TABLE t2 DROP PRIMARY KEY , ADD PRIMARY KEY (c(1));   connection con1; SET DEBUG_SYNC= 'now WAIT_FOR dml' ; INSERT INTO t2 VALUES (160,16, 'ab' ),(161,16, 'ac' ); COMMIT ; disconnect con1;   connect (con2,localhost,root,,); UPDATE t2 SET b=17; SET DEBUG_SYNC= 'now SIGNAL dml_done' ; disconnect con2;   connection default ; --error ER_DUP_ENTRY reap; SET DEBUG_SYNC= 'RESET' ;   CHECK TABLE t2; DROP TABLE t1,t2; 10.3 ROW_T_INSERT(160,reset_trx_id,16,'ab') ROW_T_INSERT(161,reset_trx_id,16,'ac') ROW_T_UPDATE(old_pk('a'),reset_trx_id, 160,DB_TRX_ID=0x2b,DB_ROLL_PTR=(update),17,'ab') ROW_T_UPDATE(old_pk('a'),reset_trx_id, 161,DB_TRX_ID=0x2b,DB_ROLL_PTR=(update),17,'ac') 10.2 ROW_T_INSERT(160,DB_TRX_ID=0x506,DB_ROLL_PTR=(insert1),16,'ab') ROW_T_INSERT(161,DB_TRX_ID=0x506,DB_ROLL_PTR=(insert2),16,'ac') ROW_T_UPDATE(old_pk('a'),DB_TRX_ID=0x506,(insert1), 160,DB_TRX_ID=0x50b,DB_ROLL_PTR=(update),17,'ab') ROW_T_UPDATE(old_pk('a'),DB_TRX_ID=0x506,DB_ROLL_PTR=(insert1), 161,DB_TRX_ID=0x50b,DB_ROLL_PTR=(update),17,'ac') Notably, in 10.3, if I append LIMIT 1 to the UPDATE statement, the test will no longer crash, but it will not report the duplicate PRIMARY KEY either, like it should.
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -

            The following partial revert of the MDEV-13654 changes makes this particular test pass:

            diff --git a/storage/innobase/row/row0log.cc b/storage/innobase/row/row0log.cc
            index a3bf91a1c74..267e56bf828 100644
            --- a/storage/innobase/row/row0log.cc
            +++ b/storage/innobase/row/row0log.cc
            @@ -1079,15 +1079,7 @@ row_log_table_low(
             		b += rec_extra_size;
             		memcpy(b, fake_extra_buf + 1, fake_extra_size);
             		b += fake_extra_size;
            -		ulint len;
            -		ulint trx_id_offs = rec_get_nth_field_offs(
            -			offsets, index->n_uniq, &len);
            -		ut_ad(len == DATA_TRX_ID_LEN);
             		memcpy(b, rec, rec_offs_data_size(offsets));
            -		if (trx_read_trx_id(b + trx_id_offs) < log->min_trx) {
            -			memcpy(b + trx_id_offs,
            -			       reset_trx_id, sizeof reset_trx_id);
            -		}
             		b += rec_offs_data_size(offsets);
             
             		row_log_table_close(index, b, mrec_size, avail_size);
            @@ -1392,13 +1384,7 @@ row_log_table_get_pk(
             		/* Copy the fields, because the fields will be updated
             		or the record may be moved somewhere else in the B-tree
             		as part of the upcoming operation. */
            -		if (trx_read_trx_id(trx_roll) < log->min_trx) {
            -			trx_roll = reset_trx_id;
            -			if (sys) {
            -				memcpy(sys, trx_roll,
            -				       DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
            -			}
            -		} else if (sys) {
            +		if (sys) {
             			memcpy(sys, trx_roll,
             			       DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN);
             			trx_roll = sys;
            @@ -1409,8 +1395,6 @@ row_log_table_get_pk(
             					DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN));
             		}
             
            -		ut_d(trx_id_check(trx_roll, log->min_trx));
            -
             		dfield_set_data(dtuple_get_nth_field(tuple, new_n_uniq),
             				trx_roll, DATA_TRX_ID_LEN);
             		dfield_set_data(dtuple_get_nth_field(tuple, new_n_uniq + 1),
            

            It is not an actual fix yet, and we would definitely be missing a change to row_log_table_low_redundant() if this is the right direction.

            I believe that the root cause is that applying the second ROW_T_INSERT did not flag a duplicate key error. Here is a simpler test for that, causing no crash, but a wrong result (the ALTER TABLE wrongly succeeds, and the table will contain 1 row, even though 2 were inserted):

            --source include/have_innodb.inc
            CREATE TABLE t1(a TINYINT PRIMARY KEY) ENGINE=InnoDB;
            CREATE TABLE t2(c CHAR(2) NOT NULL) ENGINE=InnoDB;
             
            connect (con1,localhost,root,,);
            BEGIN;
            INSERT INTO t1 VALUES(1);
             
            connection default;
            SET DEBUG_SYNC='row_log_table_apply1_before SIGNAL dml WAIT_FOR dml_done';
            send ALTER TABLE t2 ADD PRIMARY KEY(c(1));
             
            connection con1;
            SET DEBUG_SYNC='now WAIT_FOR dml';
            INSERT INTO t2 VALUES ('ab'),('ac');
            COMMIT;
            SET DEBUG_SYNC='now SIGNAL dml_done';
             
            connection default;
            --error ER_DUP_ENTRY
            reap;
            SET DEBUG_SYNC='RESET';
             
            SELECT * FROM t2;
            DROP TABLE t1,t2;
            

            marko Marko Mäkelä added a comment - The following partial revert of the MDEV-13654 changes makes this particular test pass: diff --git a/storage/innobase/row/row0log.cc b/storage/innobase/row/row0log.cc index a3bf91a1c74..267e56bf828 100644 --- a/storage/innobase/row/row0log.cc +++ b/storage/innobase/row/row0log.cc @@ -1079,15 +1079,7 @@ row_log_table_low( b += rec_extra_size; memcpy(b, fake_extra_buf + 1, fake_extra_size); b += fake_extra_size; - ulint len; - ulint trx_id_offs = rec_get_nth_field_offs( - offsets, index->n_uniq, &len); - ut_ad(len == DATA_TRX_ID_LEN); memcpy(b, rec, rec_offs_data_size(offsets)); - if (trx_read_trx_id(b + trx_id_offs) < log->min_trx) { - memcpy(b + trx_id_offs, - reset_trx_id, sizeof reset_trx_id); - } b += rec_offs_data_size(offsets); row_log_table_close(index, b, mrec_size, avail_size); @@ -1392,13 +1384,7 @@ row_log_table_get_pk( /* Copy the fields, because the fields will be updated or the record may be moved somewhere else in the B-tree as part of the upcoming operation. */ - if (trx_read_trx_id(trx_roll) < log->min_trx) { - trx_roll = reset_trx_id; - if (sys) { - memcpy(sys, trx_roll, - DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN); - } - } else if (sys) { + if (sys) { memcpy(sys, trx_roll, DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN); trx_roll = sys; @@ -1409,8 +1395,6 @@ row_log_table_get_pk( DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN)); } - ut_d(trx_id_check(trx_roll, log->min_trx)); - dfield_set_data(dtuple_get_nth_field(tuple, new_n_uniq), trx_roll, DATA_TRX_ID_LEN); dfield_set_data(dtuple_get_nth_field(tuple, new_n_uniq + 1), It is not an actual fix yet, and we would definitely be missing a change to row_log_table_low_redundant() if this is the right direction. I believe that the root cause is that applying the second ROW_T_INSERT did not flag a duplicate key error. Here is a simpler test for that, causing no crash, but a wrong result (the ALTER TABLE wrongly succeeds, and the table will contain 1 row, even though 2 were inserted): --source include/have_innodb.inc CREATE TABLE t1(a TINYINT PRIMARY KEY ) ENGINE=InnoDB; CREATE TABLE t2(c CHAR (2) NOT NULL ) ENGINE=InnoDB;   connect (con1,localhost,root,,); BEGIN ; INSERT INTO t1 VALUES (1);   connection default ; SET DEBUG_SYNC= 'row_log_table_apply1_before SIGNAL dml WAIT_FOR dml_done' ; send ALTER TABLE t2 ADD PRIMARY KEY (c(1));   connection con1; SET DEBUG_SYNC= 'now WAIT_FOR dml' ; INSERT INTO t2 VALUES ( 'ab' ),( 'ac' ); COMMIT ; SET DEBUG_SYNC= 'now SIGNAL dml_done' ;   connection default ; --error ER_DUP_ENTRY reap; SET DEBUG_SYNC= 'RESET' ;   SELECT * FROM t2; DROP TABLE t1,t2;

            Finally, after some debugging on 10.2 and 10.3, I found the actual root cause. The following fix should have limited negative impact: more ‘fake’ duplicate key errors could be reported.

            diff --git a/storage/innobase/row/row0ins.cc b/storage/innobase/row/row0ins.cc
            index d6596d586ab..13649777419 100644
            --- a/storage/innobase/row/row0ins.cc
            +++ b/storage/innobase/row/row0ins.cc
            @@ -2241,8 +2241,14 @@ row_ins_duplicate_online(
             		return(DB_SUCCESS);
             	}
             
            -	if (fields == n_uniq + 2) {
            -		/* rec is an exact match of entry. */
            +	ulint trx_id_len;
            +
            +	if (fields == n_uniq + 2
            +	    && memcmp(rec_get_nth_field(rec, offsets, n_uniq, &trx_id_len),
            +		      reset_trx_id, DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN)) {
            +		ut_ad(trx_id_len == DATA_TRX_ID_LEN);
            +		/* rec is an exact match of entry, and DB_TRX_ID belongs
            +		to a transaction that started after our ALTER TABLE. */
             		return(DB_SUCCESS_LOCKED_REC);
             	}
             
            

            marko Marko Mäkelä added a comment - Finally, after some debugging on 10.2 and 10.3, I found the actual root cause. The following fix should have limited negative impact: more ‘fake’ duplicate key errors could be reported. diff --git a/storage/innobase/row/row0ins.cc b/storage/innobase/row/row0ins.cc index d6596d586ab..13649777419 100644 --- a/storage/innobase/row/row0ins.cc +++ b/storage/innobase/row/row0ins.cc @@ -2241,8 +2241,14 @@ row_ins_duplicate_online( return(DB_SUCCESS); } - if (fields == n_uniq + 2) { - /* rec is an exact match of entry. */ + ulint trx_id_len; + + if (fields == n_uniq + 2 + && memcmp(rec_get_nth_field(rec, offsets, n_uniq, &trx_id_len), + reset_trx_id, DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN)) { + ut_ad(trx_id_len == DATA_TRX_ID_LEN); + /* rec is an exact match of entry, and DB_TRX_ID belongs + to a transaction that started after our ALTER TABLE. */ return(DB_SUCCESS_LOCKED_REC); }

            Also the following would fail to flag a duplicate key error from an INSERT (or possibly UPDATE) transaction that was started before the ALTER TABLE but did not access the table in question before the ALTER TABLE started:

            CREATE TABLE t2(a CHAR(2) NOT NULL) ENGINE=InnoDB;
            ALTER TABLE t2 ADD PRIMARY KEY(a);
            

            and

            CREATE TABLE t2(a CHAR(1), b CHAR(1), PRIMARY KEY(a, b)) ENGINE=InnoDB;
            ALTER TABLE t2 DROP b, DROP PRIMARY KEY, ADD PRIMARY KEY(a);
            

            Starting with MariaDB 10.2, a simple ALTER TABLE t2 DROP b would be refused. In MariaDB 10.0 and 10.1, it would imply the PRIMARY KEY operations.

            marko Marko Mäkelä added a comment - Also the following would fail to flag a duplicate key error from an INSERT (or possibly UPDATE ) transaction that was started before the ALTER TABLE but did not access the table in question before the ALTER TABLE started: CREATE TABLE t2(a CHAR (2) NOT NULL ) ENGINE=InnoDB; ALTER TABLE t2 ADD PRIMARY KEY (a); and CREATE TABLE t2(a CHAR (1), b CHAR (1), PRIMARY KEY (a, b)) ENGINE=InnoDB; ALTER TABLE t2 DROP b, DROP PRIMARY KEY , ADD PRIMARY KEY (a); Starting with MariaDB 10.2, a simple ALTER TABLE t2 DROP b would be refused. In MariaDB 10.0 and 10.1, it would imply the PRIMARY KEY operations.
            marko Marko Mäkelä made changes -
            Summary Assertion log->blobs failed after memcmp(rec_trx_id, ...) ALTER TABLE…ADD PRIMARY KEY fails to flag duplicate key error from concurrent DML
            marko Marko Mäkelä made changes -
            issue.field.resolutiondate 2020-07-23 14:48:56.0 2020-07-23 14:48:56.191
            marko Marko Mäkelä made changes -
            Fix Version/s 10.3.24 [ 24306 ]
            Fix Version/s 10.4.14 [ 24305 ]
            Fix Version/s 10.5.5 [ 24423 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            Fix Version/s 10.5 [ 23123 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]
            marko Marko Mäkelä made changes -
            Labels ddl not-10.2 regression-10.3 rr-profile-analyzed ddl not-10.1 not-10.2 regression-10.3 rr-profile-analyzed
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 111460 ] MariaDB v4 [ 158126 ]

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.