MariaDB Server / MDEV-25457

Server crashes in row_undo_mod_clust_low upon rollback of read-only transaction

Details

    Description

      --source include/have_innodb.inc
       
      CREATE TEMPORARY TABLE t (a INT) ENGINE=InnoDB;
      INSERT INTO t VALUES (1);
      START TRANSACTION READ ONLY;
      UPDATE t SET a = NULL;
      ROLLBACK;
      

      10.2 635b5ce3

      #3  <signal handler called>
      #4  0x000055629fb2feb7 in row_undo_mod_clust_low (node=0x7fd6d40a8bd0, offsets=0x7fd72c243b58, offsets_heap=0x7fd72c243b50, heap=0x7fd6d40a90e0, rebuilt_old_pk=0x7fd72c243b60, sys=0x7fd72c243b83 "", thr=0x7fd6d4036878, mtr=0x7fd72c243b90, mode=2) at /data/src/10.2/storage/innobase/row/row0umod.cc:110
      #5  0x000055629fb306af in row_undo_mod_clust (node=0x7fd6d40a8bd0, thr=0x7fd6d4036878) at /data/src/10.2/storage/innobase/row/row0umod.cc:298
      #6  0x000055629fb32d63 in row_undo_mod (node=0x7fd6d40a8bd0, thr=0x7fd6d4036878) at /data/src/10.2/storage/innobase/row/row0umod.cc:1277
      #7  0x000055629f94eed6 in row_undo (node=0x7fd6d40a8bd0, thr=0x7fd6d4036878) at /data/src/10.2/storage/innobase/row/row0undo.cc:303
      #8  0x000055629f94f037 in row_undo_step (thr=0x7fd6d4036878) at /data/src/10.2/storage/innobase/row/row0undo.cc:351
      #9  0x000055629f8b2c91 in que_thr_step (thr=0x7fd6d4036878) at /data/src/10.2/storage/innobase/que/que0que.cc:1039
      #10 0x000055629f8b2eb1 in que_run_threads_low (thr=0x7fd6d4036878) at /data/src/10.2/storage/innobase/que/que0que.cc:1103
      #11 0x000055629f8b307b in que_run_threads (thr=0x7fd6d4036878) at /data/src/10.2/storage/innobase/que/que0que.cc:1143
      #12 0x000055629f9adbfd in trx_rollback_to_savepoint_low (trx=0x7fd72c5b5120, savept=0x0) at /data/src/10.2/storage/innobase/trx/trx0roll.cc:107
      #13 0x000055629f9adee6 in trx_rollback_for_mysql_low (trx=0x7fd72c5b5120) at /data/src/10.2/storage/innobase/trx/trx0roll.cc:169
      #14 0x000055629f9ae1f2 in trx_rollback_for_mysql (trx=0x7fd72c5b5120) at /data/src/10.2/storage/innobase/trx/trx0roll.cc:200
      #15 0x000055629f7bb501 in innobase_rollback (hton=0x5562a1e8fd80, thd=0x7fd6d4000d90, rollback_trx=true) at /data/src/10.2/storage/innobase/handler/ha_innodb.cc:4832
      #16 0x000055629f5b779f in ha_rollback_trans (thd=0x7fd6d4000d90, all=true) at /data/src/10.2/sql/handler.cc:1708
      #17 0x000055629f49bd35 in trans_rollback (thd=0x7fd6d4000d90) at /data/src/10.2/sql/transaction.cc:415
      #18 0x000055629f33770d in mysql_execute_command (thd=0x7fd6d4000d90) at /data/src/10.2/sql/sql_parse.cc:5411
      #19 0x000055629f33e562 in mysql_parse (thd=0x7fd6d4000d90, rawbuf=0x7fd6d40126f8 "ROLLBACK", length=8, parser_state=0x7fd72c245570, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:7796
      #20 0x000055629f32c78c in dispatch_command (command=COM_QUERY, thd=0x7fd6d4000d90, packet=0x7fd6d4008b51 "ROLLBACK", packet_length=8, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:1827
      #21 0x000055629f32b287 in do_command (thd=0x7fd6d4000d90) at /data/src/10.2/sql/sql_parse.cc:1381
      #22 0x000055629f485e36 in do_handle_one_connection (connect=0x5562a23dca80) at /data/src/10.2/sql/sql_connect.cc:1336
      #23 0x000055629f485b9b in handle_one_connection (arg=0x5562a23dca80) at /data/src/10.2/sql/sql_connect.cc:1241
      #24 0x000055629fcb0cec in pfs_spawn_thread (arg=0x5562a247fcd0) at /data/src/10.2/storage/perfschema/pfs.cc:1869
      #25 0x00007fd731bd9609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #26 0x00007fd7317b5293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      The failure started happening in its current form after this commit:

      commit a3871cd2832dec43ca4ad6592646f58a7acf6630 7fa12b1e34d3d561baed9e7f2aacb0d6a3eb7062
      Author:     Eugene Kosov
      AuthorDate: Wed Mar 31 16:36:36 2021 +0300
      Commit:     Eugene Kosov <claprix@yandex.ru>
      CommitDate: Thu Apr 15 17:53:33 2021 +0300
       
          MDEV-22255 SIGABRT: Assertion `id' failed in trx_write_trx_id on INSERT | Assertion `id > 0' failed in trx_write_trx_id | Assertion `val > 0' failed in row_upd_index_entry_sys_field | Assertion `thr_get_trx(thr)->id || index->table->no_rollback()' failed.
      

      Before the patch, the test case failed on UPDATE with the assertion failure reported in MDEV-22257. That was fixed along with MDEV-22255, as expected, but the test case now crashes on rollback instead.

      It doesn't crash on non-debug builds on my machine, but since this is a SIGSEGV, that may just be a matter of luck (unless the crash happens in debug-only code, of course).

      The patch hasn't yet been merged into higher versions, so 10.3+ aren't affected yet.

        Activity

          My intuition would say that if a transaction is already in progress, START TRANSACTION or BEGIN should do one of the following:

          1. Return an error. (This would be clearest, but I am afraid that it would be a major change in behaviour, breaking many applications.)
          2. Commit the active transaction, release all its resources, and start a new one. (Likewise, this could break some things.)
          3. Be ignored. (In the described case, the transaction would remain in read-write mode.)

          Based on the popularity of the function trx_start_if_not_started(), I am afraid that we have to go with the second or third choice.

          Transactions are controlled both on the SQL layer (for the purpose of metadata locks and replication) and the InnoDB layer (ROLLBACK, durability, MVCC).

          kevg, please describe in detail what currently happens during START TRANSACTION READ ONLY. Please also consider what would happen if an XA transaction were started while an implicitly started transaction is in progress.

          (Comment added by marko, Marko Mäkelä.)
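The three options listed in the comment above can be compared with a small model. This is only an illustrative sketch: `Session`, `BeginPolicy`, and `start_transaction` are hypothetical names invented here, not server or InnoDB code, and the sketch makes no claim about which option the server actually implements.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical model of a client session; not MariaDB or InnoDB code.
enum class BeginPolicy { Error, ImplicitCommit, Ignore };

struct Session {
    bool active = false;     // a transaction is currently in progress
    bool read_only = false;  // mode of the current transaction
    int commits = 0;         // number of commits performed so far
};

// Handle START TRANSACTION [READ ONLY] under the chosen policy.
void start_transaction(Session& s, bool read_only, BeginPolicy policy) {
    if (s.active) {
        switch (policy) {
        case BeginPolicy::Error:
            // Option 1: refuse to start a nested transaction.
            throw std::logic_error("transaction already in progress");
        case BeginPolicy::ImplicitCommit:
            // Option 2: commit the active transaction, then start a new one.
            ++s.commits;
            s.active = false;
            break;
        case BeginPolicy::Ignore:
            // Option 3: keep the existing transaction and its mode.
            return;
        }
    }
    s.active = true;
    s.read_only = read_only;
}
```

Under the third policy, a read-write transaction that is already active stays read-write even after a READ ONLY request, which is the behaviour described for that option in the comment.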

          I applied the fix for the SIGSEGV in the debug assertion (and the test case to exercise it) from bb-10.2-kevgs to 10.3 when merging MDEV-22255 and many other changes. The patch had not been pushed to 10.2 yet at that point.

          The patch is OK to push to 10.2 too. Apparently the only issue was that the debug assertion was trying to access dict_table_t via a null pointer.

          (Comment added by marko, Marko Mäkelä.)
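To illustrate the kind of problem described above, here is a minimal sketch of an assertion-style check that dereferences a possibly null table pointer, next to a null-safe variant. It is not the actual patch: `Table` is a stand-in for `dict_table_t`, and `undo_precondition_holds` and its parameters are hypothetical names chosen for this example.

```cpp
#include <cassert>

// Stand-in for dict_table_t; only the field needed for the example.
struct Table {
    bool temporary;
};

// Unsafe form (shown as a comment only):
//   bool ok = trx_id != 0 || table->temporary;
// Because || short-circuits, `table` is dereferenced whenever trx_id is 0,
// which is undefined behaviour (typically a SIGSEGV) if `table` is null.

// Null-safe form: the pointer is checked before it is dereferenced.
bool undo_precondition_holds(unsigned long trx_id, const Table* table) {
    if (trx_id != 0) {
        return true;  // the transaction has an id; no table lookup is needed
    }
    return table != nullptr && table->temporary;
}
```

If a check like this lives only inside a debug assertion, release builds never evaluate it, which would be consistent with the crash appearing only in debug builds.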

          People

            kevg Eugene Kosov (Inactive)
            elenst Elena Stepanova