Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-24156

trx_undo_left() fails to prevent overflow

Details

    Description

      The function trx_undo_left() checks whether an undo log record (or a part of it) would fit in the undo log page. If not, a new undo log page will be allocated.

      There is some wishful thinking in the design of the function, all the way from the very start of the public history of InnoDB:

      /**************************************************************************
      Calculates the free space left for extending an undo log record. */
      UNIV_INLINE
      ulint
      trx_undo_left(
      /*==========*/
      			/* out: bytes left */
      	page_t* page,	/* in: undo log page */
      	byte*	ptr)	/* in: pointer to page */
      {
      	/* The '- 10' is a safety margin, in case we have some small
      	calculation error below */
       
      	return(UNIV_PAGE_SIZE - (ptr - page) - 10 - FIL_PAGE_DATA_END);
      }
      

      Due to the unsigned return type, the additional 10 bytes actually constitute an unsafety margin that causes us to waste 10 bytes in every undo log page. If the ptr is already past those 10+4 bytes from the end of the page, the function will return a large positive value. But the numerous trx_undo_left(...) < ... checks would expect a negative value.

      I think that we must add debug assertions to the function in all maintained versions (backport parts of MDEV-24096), and make the function return 0 when ptr is already past the end, to make the overflow checks work in a desirable way.

      Also, we should re-review all code that calls the function, especially the code that writes indexed virtual columns to the undo log.

      Attachments

        Issue Links

          Activity

            I believe that this could fix the root cause of MySQL Bug #94553 Crash in trx_undo_rec_copy and similar reports.

            marko Marko Mäkelä added a comment - I believe that this could fix the root cause of MySQL Bug #94553 Crash in trx_undo_rec_copy and similar reports.

            Workflow:
            1. Start server
            2. Some sessions run a concurrent SQL mix
            3. During 2. is ongoing KILL the server
            4. Restart
            5. Some Select harvests
            [rr 881931 64512]mysqld: /Server/bb-10.6-MDEV-23497/storage/innobase/include/mach0data.ic:691: void mach_write_to_n_little_endian(byte*, ulint, ulint): Assertion `dest_size <= sizeof(ulint)' failed.
             
            sdp:/RQG/storage/1605036797/tmp/dev/shm/vardir/1605036797/121/1/rr
            _RR_TRACE_DIR="." rr replay --mark-stdio
            

            mleich Matthias Leich added a comment - Workflow: 1. Start server 2. Some sessions run a concurrent SQL mix 3. During 2. is ongoing KILL the server 4. Restart 5. Some Select harvests [rr 881931 64512]mysqld: /Server/bb-10.6-MDEV-23497/storage/innobase/include/mach0data.ic:691: void mach_write_to_n_little_endian(byte*, ulint, ulint): Assertion `dest_size <= sizeof(ulint)' failed.   sdp:/RQG/storage/1605036797/tmp/dev/shm/vardir/1605036797/121/1/rr _RR_TRACE_DIR="." rr replay --mark-stdio

            mleich, sorry, I thought that you would have triggered the new debug assertion ut_ad(left >= 0) that we introduced in trx_undo_left().
            The assertion that you triggered is something else and deserves a bug report on its own:

            #3  0x00005aff78c9bf36 in __GI___assert_fail (
                assertion=assertion@entry=0x55ea195f4980 "dest_size <= sizeof(ulint)", 
                file=file@entry=0x55ea195f1220 "/Server/bb-10.6-MDEV-23497/storage/innobase/include/mach0data.ic", line=line@entry=691, 
                function=function@entry=0x55ea195f49c0 "void mach_write_to_n_little_endian(byte*, ulint, ulint)") at assert.c:101
            #4  0x000055ea18404318 in mach_write_to_n_little_endian (n=32769, 
                dest_size=18446744073709551612, dest=0x61a000319ebd "")
                at /Server/bb-10.6-MDEV-23497/storage/innobase/include/mach0data.ic:691
            #5  row_mysql_store_blob_ref (dest=dest@entry=0x61a000319ebd "", col_len=4, 
                data=data@entry=0x6310045248a0, len=len@entry=32769)
                at /Server/bb-10.6-MDEV-23497/storage/innobase/row/row0mysql.cc:251
            #6  0x000055ea184a833d in row_sel_field_store_in_mysql_format_func (
                dest=0x61a000319ebd "", templ=templ@entry=0x618000046528, 
                index=index@entry=0x6170000203a0, field_no=field_no@entry=7, 
                data=0x6310045248a0 '3' <repeats 200 times>..., len=32769)
            

            The large dest_size looks like -4 when interpreted as a signed integer. The col_len=4 that was passed to row_mysql_store_blob_ref() is definitely too small.
            My guess would be that the .frm file is out of sync with the InnoDB data dictionary. It should be possible to check this hypothesis by looking at the rr replay traces in more detail.

            marko Marko Mäkelä added a comment - mleich , sorry, I thought that you would have triggered the new debug assertion ut_ad(left >= 0) that we introduced in trx_undo_left() . The assertion that you triggered is something else and deserves a bug report on its own: #3 0x00005aff78c9bf36 in __GI___assert_fail ( assertion=assertion@entry=0x55ea195f4980 "dest_size <= sizeof(ulint)", file=file@entry=0x55ea195f1220 "/Server/bb-10.6-MDEV-23497/storage/innobase/include/mach0data.ic", line=line@entry=691, function=function@entry=0x55ea195f49c0 "void mach_write_to_n_little_endian(byte*, ulint, ulint)") at assert.c:101 #4 0x000055ea18404318 in mach_write_to_n_little_endian (n=32769, dest_size=18446744073709551612, dest=0x61a000319ebd "") at /Server/bb-10.6-MDEV-23497/storage/innobase/include/mach0data.ic:691 #5 row_mysql_store_blob_ref (dest=dest@entry=0x61a000319ebd "", col_len=4, data=data@entry=0x6310045248a0, len=len@entry=32769) at /Server/bb-10.6-MDEV-23497/storage/innobase/row/row0mysql.cc:251 #6 0x000055ea184a833d in row_sel_field_store_in_mysql_format_func ( dest=0x61a000319ebd "", templ=templ@entry=0x618000046528, index=index@entry=0x6170000203a0, field_no=field_no@entry=7, data=0x6310045248a0 '3' <repeats 200 times>..., len=32769) The large dest_size looks like -4 when interpreted as a signed integer. The col_len=4 that was passed to row_mysql_store_blob_ref() is definitely too small. My guess would be that the .frm file is out of sync with the InnoDB data dictionary. It should be possible to check this hypothesis by looking at the rr replay traces in more detail.

            For the rr trace mentioned above I have created
            https://jira.mariadb.org/browse/MDEV-24202

            mleich Matthias Leich added a comment - For the rr trace mentioned above I have created https://jira.mariadb.org/browse/MDEV-24202

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.