Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-30100

Assertion `purge_sys.tail.trx_no <= purge_sys.rseg->last_trx_no()'

Details

    Description

      The process exits during mysqld execution, and an error occurs after restarting.

      lead to:

      InnoDB: Failing assertion: purge_sys.tail.trx_no <= purge_sys.rseg->last_trx_no()
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mysqld startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
      InnoDB: about forcing recovery.
      220914 18:23:30 [ERROR] mysqld got signal 6 ;
      

      Attachments

        Issue Links

          Activity

            I tried but failed to find the exact root cause of this.

            break ibuf0ibuf.cc:3509
            cond 1 index==0x12222402ac38 && *(char*)entry.fields[0].data=='2'
            break ibuf_merge_or_delete_for_page
            cond 2 page_id.m_id==0x50000000e
            continue
            

            The first breakpoint is being hit exactly once in the entire trace, for a different primary key, col1=0x81c (2076) and not col1=0x11a. I think that in order to find and fix the cause of this corruption, we might need some test case simplification.

            marko Marko Mäkelä added a comment - I tried but failed to find the exact root cause of this. break ibuf0ibuf.cc:3509 cond 1 index==0x12222402ac38 && *(char*)entry.fields[0].data=='2' break ibuf_merge_or_delete_for_page cond 2 page_id.m_id==0x50000000e continue The first breakpoint is being hit exactly once in the entire trace, for a different primary key, col1=0x81c (2076) and not col1=0x11a. I think that in order to find and fix the cause of this corruption, we might need some test case simplification.

            I just spent 4 hours analyzing the root cause of another failure:

            # 2023-08-17T13:18:56 [33159] ERROR: checkDatabaseIntegrity  server[1]:  Query ->ALTER TABLE test . t5 FORCE<- failed with 1062 : Duplicate entry '1900-01-01' for key 'col_date'
            

            Yes, there was a duplicate value 0x8ed821 for that unique column, with the PRIMARY KEY values pk=0x875 and pk=0x876. It took me some effort to find the original INSERT statement. At that point, the unique key did exist, but the constraint was violated because innodb_change_buffering was enabled and unique_checks=0 was in effect. The problematic INSERT only inserted that single row. In the clustered index page at that time, we got:

            #1  page_cur_insert_rec_low (cur=cur@entry=0x79f71d2ada20, rec=rec@entry=0x1b2d141183be "\200", offsets=<optimized out>, mtr=mtr@entry=0x79f71d2adfc0)
                at /data/Server/bb-10.6-MDEV-30100/storage/innobase/page/page0cur.cc:1569
            1569	    mach_write_to_2(cur->rec - REC_NEXT,
            (rr) p/x *rec@30
            $44 = {0x80, 0x0, 0x8, 0x76, 0x0, 0x0, 0x0, 0x0, 0x80, 0x12, 0x92, 0x0, 0x0, 0x4, 0x66, 0xc, 0x10, 0x8e, 0xd8, 0x21, 0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
            (rr) p/x *cur.rec@30
            $46 = {0x80, 0x0, 0x8, 0x75, 0x0, 0x0, 0x0, 0x0, 0x80, 0x4, 0x89, 0x0, 0x0, 0x4, 0x65, 0xa, 0x88, 0x8e, 0xd8, 0x21, 0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
            

            That is, pk=0x875 was inserted by DB_TRX_ID=0x8004, and the current transaction is inserting pk=0x876,DB_TRX_ID=0x8012. In both records right after the 7-byte DB_ROLL_PTR, we see the duplicate col_date value 0x8ed821. This is still OK; the duplicate should be detected when we attempt to insert that key to the unique index. After that point, this last insert should be rolled back. However, the duplicate key was not noticed due to the following:

            Thread 29 hit Breakpoint 19, ibuf_insert_low (mode=BTR_MODIFY_PREV, op=IBUF_OP_INSERT, no_counter=1, entry=0x1b2d140aaee8, entry_size=12, index=0x1b2d1409e898, page_id={m_id = 38654705818}, zip_size=0, 
                thr=0x1b2d1411f890) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/ibuf/ibuf0ibuf.cc:3111
            3111	{
            5: x/i $pc
            => 0x555a1593cd90 <ibuf_insert_low(btr_latch_mode, ibuf_op_t, ulint, dtuple_t const*, ulint, dict_index_t*, page_id_t, ulint, que_thr_t*)>:	push   %rbp
            (rr) p index.name
            $54 = {m_name = 0x1b2d1409ea28 "col_date"}
            (rr) fin
            Run till exit from #0  ibuf_insert_low (mode=BTR_MODIFY_PREV, op=IBUF_OP_INSERT, no_counter=1, entry=0x1b2d140aaee8, entry_size=12, index=0x1b2d1409e898, page_id={m_id = 38654705818}, zip_size=0, 
                thr=0x1b2d1411f890) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/ibuf/ibuf0ibuf.cc:3111
            ibuf_insert (op=op@entry=IBUF_OP_INSERT, entry=entry@entry=0x1b2d140aaee8, index=0x1b2d1409e898, page_id=page_id@entry={m_id = 38654705818}, zip_size=zip_size@entry=0, thr=<optimized out>)
                at /data/Server/bb-10.6-MDEV-30100/storage/innobase/ibuf/ibuf0ibuf.cc:3512
            3512		if (err == DB_FAIL) {
            5: x/i $pc
            => 0x555a1593ead5 <ibuf_insert(ibuf_op_t, dtuple_t const*, dict_index_t*, page_id_t, unsigned long, que_thr_t*)+533>:	add    $0x20,%rsp
            Value returned is $55 = DB_SUCCESS
            (rr) 
            Run till exit from #0  ibuf_insert (op=op@entry=IBUF_OP_INSERT, entry=entry@entry=0x1b2d140aaee8, index=0x1b2d1409e898, page_id=page_id@entry={m_id = 38654705818}, zip_size=zip_size@entry=0, 
                thr=<optimized out>) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/ibuf/ibuf0ibuf.cc:3512
            0x0000555a15fc0f8a in btr_cur_t::search_leaf (this=this@entry=0x79f71d2ad760, tuple=tuple@entry=0x1b2d140aaee8, mode=mode@entry=PAGE_CUR_LE, latch_mode=BTR_MODIFY_LEAF, latch_mode@entry=2114, 
                mtr=mtr@entry=0x79f71d2adfb0) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/include/btr0cur.h:744
            744	  dict_index_t *index() const { return page_cur.index; }
            5: x/i $pc
            => 0x555a15fc0f8a <btr_cur_t::search_leaf(dtuple_t const*, page_cur_mode_t, btr_latch_mode, mtr_t*)+2026>:	mov    -0x538(%rbp),%r10
            Value returned is $56 = true
            (rr) bt
            #0  0x0000555a15fc0f8a in btr_cur_t::search_leaf (this=this@entry=0x79f71d2ad760, tuple=tuple@entry=0x1b2d140aaee8, mode=mode@entry=PAGE_CUR_LE, latch_mode=BTR_MODIFY_LEAF, latch_mode@entry=2114, 
                mtr=mtr@entry=0x79f71d2adfb0) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/include/btr0cur.h:744
            #1  0x0000555a15f4ac05 in row_ins_sec_index_entry_low (flags=0, mode=BTR_MODIFY_LEAF, index=0x1b2d1409e898, offsets_heap=<optimized out>, heap=0x1b2d14256430, entry=0x1b2d140aaee8, trx_id=0, thr=0x1b2d1411f890)
                at /data/Server/bb-10.6-MDEV-30100/storage/innobase/row/row0ins.cc:3015
            #2  0x0000555a15f4e312 in row_ins_sec_index_entry (index=0x1b2d1409e898, entry=0x1b2d140aaee8, thr=0x1b2d1411f890, check_foreign=<optimized out>)
                at /data/Server/bb-10.6-MDEV-30100/storage/innobase/row/row0ins.cc:3322
            #3  0x0000555a15f4e7ec in row_ins_index_entry (thr=0x1b2d1411f890, entry=<optimized out>, index=<optimized out>) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/row/row0ins.cc:3370
            #4  row_ins_index_entry_step (thr=0x1b2d1411f890, node=<optimized out>) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/row/row0ins.cc:3536
            #5  row_ins (thr=0x1b2d1411f890, node=<optimized out>) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/row/row0ins.cc:3661
            #6  row_ins_step (thr=thr@entry=0x1b2d1411f890) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/row/row0ins.cc:3790
            #7  0x0000555a15f5b227 in row_insert_for_mysql (mysql_rec=mysql_rec@entry=0x1b2d140ac2c0 "\377", prebuilt=0x1b2d1412ed98, ins_mode=ROW_INS_NORMAL)
                at /data/Server/bb-10.6-MDEV-30100/storage/innobase/row/row0mysql.cc:1315
            #8  0x0000555a15eca90e in ha_innobase::write_row (this=0x1b2d140a8490, record=0x1b2d140ac2c0 "\377") at /data/Server/bb-10.6-MDEV-30100/storage/innobase/handler/ha_innodb.cc:7905
            

            This statement

            INSERT IGNORE INTO `t5` ( `pk`, `col_char` ) VALUES ( NULL, 'l' ) /* E_R Thread1 QNO 63149 CON_ID 14 */;
            

            was successfully autocommitted with that 1 invalid row inserted, because unique_checks=0 allowed the buffered insert into the unique index col_date to happen. Had it not been possible to use the change buffer, the uniqueness violation would have been noticed.

            marko Marko Mäkelä added a comment - I just spent 4 hours analyzing the root cause of another failure: # 2023-08-17T13:18:56 [33159] ERROR: checkDatabaseIntegrity server[1]: Query ->ALTER TABLE test . t5 FORCE<- failed with 1062 : Duplicate entry '1900-01-01' for key 'col_date' Yes, there was a duplicate value 0x8ed821 for that unique column, with the PRIMARY KEY values pk=0x875 and pk=0x876. It took me some effort to find the original INSERT statement. At that point, the unique key did exist, but the constraint was violated because innodb_change_buffering was enabled and unique_checks=0 was in effect. The problematic INSERT only inserted that single row. In the clustered index page at that time, we got: #1 page_cur_insert_rec_low (cur=cur@entry=0x79f71d2ada20, rec=rec@entry=0x1b2d141183be "\200", offsets=<optimized out>, mtr=mtr@entry=0x79f71d2adfc0) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/page/page0cur.cc:1569 1569 mach_write_to_2(cur->rec - REC_NEXT, (rr) p/x *rec@30 $44 = {0x80, 0x0, 0x8, 0x76, 0x0, 0x0, 0x0, 0x0, 0x80, 0x12, 0x92, 0x0, 0x0, 0x4, 0x66, 0xc, 0x10, 0x8e, 0xd8, 0x21, 0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0} (rr) p/x *cur.rec@30 $46 = {0x80, 0x0, 0x8, 0x75, 0x0, 0x0, 0x0, 0x0, 0x80, 0x4, 0x89, 0x0, 0x0, 0x4, 0x65, 0xa, 0x88, 0x8e, 0xd8, 0x21, 0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0} That is, pk=0x875 was inserted by DB_TRX_ID=0x8004, and the current transaction is inserting pk=0x876,DB_TRX_ID=0x8012. In both records right after the 7-byte DB_ROLL_PTR, we see the duplicate col_date value 0x8ed821. This is still OK; the duplicate should be detected when we attempt to insert that key to the unique index. After that point, this last insert should be rolled back. However, the duplicate key was not noticed due to the following: Thread 29 hit Breakpoint 19, ibuf_insert_low (mode=BTR_MODIFY_PREV, op=IBUF_OP_INSERT, no_counter=1, entry=0x1b2d140aaee8, entry_size=12, index=0x1b2d1409e898, page_id={m_id = 38654705818}, zip_size=0, thr=0x1b2d1411f890) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/ibuf/ibuf0ibuf.cc:3111 3111 { 5: x/i $pc => 0x555a1593cd90 <ibuf_insert_low(btr_latch_mode, ibuf_op_t, ulint, dtuple_t const*, ulint, dict_index_t*, page_id_t, ulint, que_thr_t*)>: push %rbp (rr) p index.name $54 = {m_name = 0x1b2d1409ea28 "col_date"} (rr) fin Run till exit from #0 ibuf_insert_low (mode=BTR_MODIFY_PREV, op=IBUF_OP_INSERT, no_counter=1, entry=0x1b2d140aaee8, entry_size=12, index=0x1b2d1409e898, page_id={m_id = 38654705818}, zip_size=0, thr=0x1b2d1411f890) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/ibuf/ibuf0ibuf.cc:3111 ibuf_insert (op=op@entry=IBUF_OP_INSERT, entry=entry@entry=0x1b2d140aaee8, index=0x1b2d1409e898, page_id=page_id@entry={m_id = 38654705818}, zip_size=zip_size@entry=0, thr=<optimized out>) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/ibuf/ibuf0ibuf.cc:3512 3512 if (err == DB_FAIL) { 5: x/i $pc => 0x555a1593ead5 <ibuf_insert(ibuf_op_t, dtuple_t const*, dict_index_t*, page_id_t, unsigned long, que_thr_t*)+533>: add $0x20,%rsp Value returned is $55 = DB_SUCCESS (rr) Run till exit from #0 ibuf_insert (op=op@entry=IBUF_OP_INSERT, entry=entry@entry=0x1b2d140aaee8, index=0x1b2d1409e898, page_id=page_id@entry={m_id = 38654705818}, zip_size=zip_size@entry=0, thr=<optimized out>) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/ibuf/ibuf0ibuf.cc:3512 0x0000555a15fc0f8a in btr_cur_t::search_leaf (this=this@entry=0x79f71d2ad760, tuple=tuple@entry=0x1b2d140aaee8, mode=mode@entry=PAGE_CUR_LE, latch_mode=BTR_MODIFY_LEAF, latch_mode@entry=2114, mtr=mtr@entry=0x79f71d2adfb0) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/include/btr0cur.h:744 744 dict_index_t *index() const { return page_cur.index; } 5: x/i $pc => 0x555a15fc0f8a <btr_cur_t::search_leaf(dtuple_t const*, page_cur_mode_t, btr_latch_mode, mtr_t*)+2026>: mov -0x538(%rbp),%r10 Value returned is $56 = true (rr) bt #0 0x0000555a15fc0f8a in btr_cur_t::search_leaf (this=this@entry=0x79f71d2ad760, tuple=tuple@entry=0x1b2d140aaee8, mode=mode@entry=PAGE_CUR_LE, latch_mode=BTR_MODIFY_LEAF, latch_mode@entry=2114, mtr=mtr@entry=0x79f71d2adfb0) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/include/btr0cur.h:744 #1 0x0000555a15f4ac05 in row_ins_sec_index_entry_low (flags=0, mode=BTR_MODIFY_LEAF, index=0x1b2d1409e898, offsets_heap=<optimized out>, heap=0x1b2d14256430, entry=0x1b2d140aaee8, trx_id=0, thr=0x1b2d1411f890) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/row/row0ins.cc:3015 #2 0x0000555a15f4e312 in row_ins_sec_index_entry (index=0x1b2d1409e898, entry=0x1b2d140aaee8, thr=0x1b2d1411f890, check_foreign=<optimized out>) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/row/row0ins.cc:3322 #3 0x0000555a15f4e7ec in row_ins_index_entry (thr=0x1b2d1411f890, entry=<optimized out>, index=<optimized out>) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/row/row0ins.cc:3370 #4 row_ins_index_entry_step (thr=0x1b2d1411f890, node=<optimized out>) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/row/row0ins.cc:3536 #5 row_ins (thr=0x1b2d1411f890, node=<optimized out>) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/row/row0ins.cc:3661 #6 row_ins_step (thr=thr@entry=0x1b2d1411f890) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/row/row0ins.cc:3790 #7 0x0000555a15f5b227 in row_insert_for_mysql (mysql_rec=mysql_rec@entry=0x1b2d140ac2c0 "\377", prebuilt=0x1b2d1412ed98, ins_mode=ROW_INS_NORMAL) at /data/Server/bb-10.6-MDEV-30100/storage/innobase/row/row0mysql.cc:1315 #8 0x0000555a15eca90e in ha_innobase::write_row (this=0x1b2d140a8490, record=0x1b2d140ac2c0 "\377") at /data/Server/bb-10.6-MDEV-30100/storage/innobase/handler/ha_innodb.cc:7905 This statement INSERT IGNORE INTO `t5` ( `pk`, `col_char` ) VALUES ( NULL , 'l' ) /* E_R Thread1 QNO 63149 CON_ID 14 */ ; was successfully autocommitted with that 1 invalid row inserted, because unique_checks=0 allowed the buffered insert into the unique index col_date to happen. Had it not been possible to use the change buffer, the uniqueness violation would have been noticed.

            origin/bb-10.6-MDEV-30100 562850823ba74e15ae638205aeacee9f678de393 2023-08-17T11:14:02+03:00
            performed well in the last round of RQG testing. No new problems.

            mleich Matthias Leich added a comment - origin/bb-10.6- MDEV-30100 562850823ba74e15ae638205aeacee9f678de393 2023-08-17T11:14:02+03:00 performed well in the last round of RQG testing. No new problems.

            The code looks correct to me. Thanks to marko for explaining some details.

            vlad.lesin Vladislav Lesin added a comment - The code looks correct to me. Thanks to marko for explaining some details.

            Thanks again to mleich for testing this, and vlad.lesin for the review. I added some comments to trx_t::commit_empty() to explain the logic.

            marko Marko Mäkelä added a comment - Thanks again to mleich for testing this, and vlad.lesin for the review. I added some comments to trx_t::commit_empty() to explain the logic.

            People

              marko Marko Mäkelä
              dongjian dongjian
              Votes:
              1 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.