Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32820

Race condition between trx_purge_free_segment() and trx_undo_create()

    XMLWordPrintable

Details

    Description

      The test encryption.create_or_replace failed on a IA-32 debug build like this:

      bb-11.2-release 23651e27c672aa57d0dfba1251a4fc0abc6c95d6

      encryption.create_or_replace 'ctr,innodb' w3 [ fail ]
              Test ended at 2023-11-03 14:56:55
       
      CURRENT_TEST: encryption.create_or_replace
      mysqltest: At line 68: query 'CREATE OR REPLACE TABLE `create_or_replace_t` AS SELECT * FROM `table10_int_autoinc`' failed: <Unknown> (2013): Lost connection to server during query
      ...
      Version: '11.2.2-MariaDB-debug-log'  socket: '/mnt/buildbot/build/mariadb-11.2.2/mysql-test/var/tmp/3/mysqld.1.sock'  port: 16040  Source distribution
      2023-11-03 14:56:46 4 [Note] InnoDB: Creating #1 encryption thread id 2890922816 total threads 4.
      2023-11-03 14:56:46 4 [Note] InnoDB: Creating #2 encryption thread id 2882530112 total threads 4.
      2023-11-03 14:56:46 4 [Note] InnoDB: Creating #3 encryption thread id 2874137408 total threads 4.
      2023-11-03 14:56:46 4 [Note] InnoDB: Creating #4 encryption thread id 2865744704 total threads 4.
      mariadbd: /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/include/buf0buf.h:789: uint32_t buf_page_t::fix(uint32_t): Assertion `f >= FREED' failed.
      ...
      #9  0xb7042d8b in __assert_fail () from /lib/i386-linux-gnu/libc.so.6
      #10 0x810c6b5b in buf_page_t::fix (this=0xb0c00068, count=1) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/include/buf0buf.h:789
      #11 0x8119a97d in buf_page_create_low (page_id=..., zip_size=0, mtr=0xaefa5724, free_block=0xb09ff298) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/buf/buf0buf.cc:2861
      #12 0x8119b23f in buf_page_create (space=0x83a87ed8, offset=46, zip_size=0, mtr=0xaefa5724, free_block=0xb09ff298) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/buf/buf0buf.cc:3010
      #13 0x8123c91e in fsp_page_create (space=0x83a87ed8, offset=46, mtr=0xaefa5724) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/fsp/fsp0fsp.cc:1066
      #14 0x8123cdaa in fsp_alloc_free_page (space=0x83a87ed8, hint=0, mtr=0xaefa5724, init_mtr=0xaefa5724, err=0xaefa56f4) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/fsp/fsp0fsp.cc:1178
      #15 0x81240017 in fseg_alloc_free_page_low (space=0x83a87ed8, seg_inode=0xb0eca072 "", iblock=0xb0dffa08, hint=0, direction=111 'o', has_done_reservation=true, mtr=0xaefa5724, init_mtr=0xaefa5724, err=0xaefa56f4) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/fsp/fsp0fsp.cc:2143
      #16 0x8123e8e5 in fseg_create (space=0x83a87ed8, byte_offset=60, mtr=0xaefa5724, err=0xaefa56f4, has_done_reservation=true, block=0x0) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/fsp/fsp0fsp.cc:1755
      #17 0x811349b0 in trx_undo_seg_create (space=0x83a87ed8, rseg_hdr=0xb0fff870, id=0xaefa5638, err=0xaefa56f4, mtr=0xaefa5724) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/trx/trx0undo.cc:482
      #18 0x81136d11 in trx_undo_create (trx=0xb1a00b80, rseg=0x81e6f280 <trx_sys+22336>, undo=0xb1a010f4, err=0xaefa56f4, mtr=0xaefa5724) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/trx/trx0undo.cc:1180
      #19 0x81138602 in trx_undo_assign_low<false> (trx=0xb1a00b80, rseg=0x81e6f280 <trx_sys+22336>, undo=0xb1a010f4, mtr=0xaefa5724, err=0xaefa56f4) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/trx/trx0undo.cc:1372
      #20 0x8111321e in trx_undo_report_row_operation (thr=0xa7b65fc8, index=0xac50a7f8, clust_entry=0x0, update=0xa7b69158, cmpl_info=0, rec=0xb1138152 "testcreate_or_replace_t", offsets=0xaefa60ac, roll_ptr=0xaefa5ac8) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/trx/trx0rec.cc:1898
      

      Concurrently, another thread is trying to free pages from this undo log tablespace:

      bb-11.2-release 23651e27c672aa57d0dfba1251a4fc0abc6c95d6

      #8  0x80fc5b41 in fil_space_t::x_lock (this=0x83a87ed8) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/include/fil0fil.h:983
      #9  0x80fc1175 in mtr_t::x_lock_space (this=0xa845bc44, space=0x83a87ed8) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/mtr/mtr0mtr.cc:790
      #10 0x80fc1062 in mtr_t::x_lock_space (this=0xa845bc44, space_id=2) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/mtr/mtr0mtr.cc:776
      #11 0x812420f9 in fseg_free_step_not_header (header=0xb0c5c03c "", mtr=0xa845bc44, ahi=false) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/fsp/fsp0fsp.cc:2905
      #12 0x810f5896 in trx_purge_free_segment (block=0xb0bfebb0, mtr=...) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/trx/trx0purge.cc:354
      #13 0x810f5ed7 in trx_purge_truncate_rseg_history (rseg=..., limit=..., all=true) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/trx/trx0purge.cc:459
      #14 0x810f65fd in purge_sys_t::iterator::free_history (this=0xa845bf80) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/trx/trx0purge.cc:546
      #15 0x810e0851 in purge_truncation_callback () at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/srv/srv0srv.cc:1101
      

      This task was added in MDEV-32050. Previously this was part of the purge_coordinator_callback and executed less frequently.

      Yet another thread is trying to write out pages of this undo tablespace:

      bb-11.2-release 23651e27c672aa57d0dfba1251a4fc0abc6c95d6

      #5  0x811af4d0 in inline_mysql_mutex_lock (that=0x81e72080 <buf_pool>, src_file=0x81800010 "/home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/buf/buf0flu.cc", src_line=1677) at /home/buildbot/buildbot/build/mariadb-11.2.2/include/mysql/psi/mysql_thread.h:746
      #6  0x811b6bd9 in buf_flush_list_space (space=0x83a87ed8, n_flushed=0xabcfdc8c) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/buf/buf0flu.cc:1677
      #7  0x81236df5 in fil_crypt_flush_space (state=0xabcfe008) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/fil/fil0crypt.cc:1909
      #8  0x812372a3 in fil_crypt_complete_rotate_space (state=0xabcfe008) at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/fil/fil0crypt.cc:1994
      #9  0x812375af in fil_crypt_thread () at /home/buildbot/buildbot/build/mariadb-11.2.2/storage/innobase/fil/fil0crypt.cc:2066
      

      So far, I failed to reproduce this locally by running 2,800 repetitions of the test on a local 32-bit build. I started another campaign of 56,000 repetitions, and it is over ¼ through, with no failures.

      Attachments

        Issue Links

          Activity

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.