MariaDB Server / MDEV-33593

Auto increment deadlock error causes ASSERT in subsequent save point

Details

    Description

      The issue was observed by mleich while testing the fix for MDEV-31154. It turns out to be an independent issue, repeatable in 10.5, in the handling of auto-increment deadlock errors.

      sdp:/data1/results/1708353845/ExMDEV-19555$ _RR_TRACE_DIR=./1/rr rr replay --mark-stdio
       
      [rr 2526265 1220974]mariadbd: /data/Server/11.1-MDEV-31154/sql/sql_error.h:1068: uint Diagnostics_area::sql_errno() const: Assertion `m_status == DA_ERROR' failed.
       
      (rr) bt
      #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=12455983052352) at ./nptl/pthread_kill.c:44
      #1  __pthread_kill_internal (signo=6, threadid=12455983052352) at ./nptl/pthread_kill.c:78
      #2  __GI___pthread_kill (threadid=12455983052352, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
      #3  0x000051bc67b5d476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
      #4  0x000051bc67b437f3 in __GI_abort () at ./stdlib/abort.c:79
      #5  0x000051bc67b4371b in __assert_fail_base (fmt=0x51bc67cf8150 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x555c8e7f4fb5 "m_status == DA_ERROR",  file=0x555c8e7f09c0 "/data/Server/11.1-MDEV-31154/sql/sql_error.h", line=1068, function=<optimized out>) at ./assert/assert.c:92
      #6  0x000051bc67b54e96 in __GI___assert_fail (assertion=0x555c8e7f4fb5 "m_status == DA_ERROR", file=0x555c8e7f09c0 "/data/Server/11.1-MDEV-31154/sql/sql_error.h", line=1068,  function=0x555c8e7f50b8 "uint Diagnostics_area::sql_errno() const") at ./assert/assert.c:101
      #7  0x0000555c8db9becf in Diagnostics_area::sql_errno (this=0x74f01c006c78) at /data/Server/11.1-MDEV-31154/sql/sql_error.h:1068
      #8  thd_get_error_number (thd=<optimized out>) at /data/Server/11.1-MDEV-31154/sql/sql_class.cc:540
      #9  0x0000555c8e461b6e in trx_state_eq (trx=trx@entry=0x7f1a4c675780, state=state@entry=TRX_STATE_ACTIVE, relaxed=relaxed@entry=true) at /data/Server/11.1-MDEV-31154/storage/innobase/include/trx0trx.inl:65
      #10 0x0000555c8e4625c9 in trx_release_savepoint_for_mysql (trx=trx@entry=0x7f1a4c675780, savepoint_name=savepoint_name@entry=0xb542271dbc0 "19KQCZL8W8") at /data/Server/11.1-MDEV-31154/storage/innobase/trx/trx0roll.cc:555
      #11 0x0000555c8e258a06 in innobase_release_savepoint (hton=<optimized out>, thd=<optimized out>, savepoint=0x74f01c017348) at /data/Server/11.1-MDEV-31154/storage/innobase/handler/ha_innodb.cc:4829
      #12 0x0000555c8df3c9d1 in ha_release_savepoint (thd=thd@entry=0x74f01c000d58, sv=sv@entry=0x74f01c017310) at /data/Server/11.1-MDEV-31154/sql/handler.cc:3058
      #13 0x0000555c8dda564b in trans_savepoint (thd=thd@entry=0x74f01c000d58, name=...) at /data/Server/11.1-MDEV-31154/sql/transaction.cc:599
      #14 0x0000555c8dc16879 in mysql_execute_command (thd=thd@entry=0x74f01c000d58, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /data/Server/11.1-MDEV-31154/sql/sql_parse.cc:5551
      #15 0x0000555c8dc18b33 in mysql_parse (thd=thd@entry=0x74f01c000d58, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0xb542271e2b0) at /data/Server/11.1-MDEV-31154/sql/sql_parse.cc:7849
      #16 0x0000555c8dc1aecb in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x74f01c000d58, packet=packet@entry=0x74f01c00af59 "SAVEPOINT A /* E_R Thread4 QNO 1281 CON_ID 17 */ ",  packet_length=packet_length@entry=49, blocking=blocking@entry=true) at /data/Server/11.1-MDEV-31154/sql/sql_parse.cc:1892
      #17 0x0000555c8dc1ccc4 in do_command (thd=0x74f01c000d58, blocking=blocking@entry=true) at /data/Server/11.1-MDEV-31154/sql/sql_parse.cc:1405
      #18 0x0000555c8dd8f525 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x555c91265f28, put_in_cache=put_in_cache@entry=true) at /data/Server/11.1-MDEV-31154/sql/sql_connect.cc:1415
      #19 0x0000555c8dd8f77e in handle_one_connection (arg=0x555c91265f28) at /data/Server/11.1-MDEV-31154/sql/sql_connect.cc:1317
      #20 0x000051bc67bafb43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
      #21 0x000051bc67c40bb4 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100
      (rr) quit
      

          Activity

            I was able to analyze the issue from the rr trace and create a repeatable mtr test with three concurrent transactions.

            T1: Acquire row X lock on a row of table t2
            T2: Wait for (T1) row lock on t2 after acquiring gap lock on t1
            T3: Wait for (T2) insert intention (II) row lock on t1 after acquiring auto-increment lock on t1
            T1: First savepoint S1
            T1: Wait for (T3) auto-increment lock on t1, causing T1 -> T3 -> T2 -> T1 deadlock: returns error
            T1: Second savepoint S1: asserts when trying to release the first savepoint
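
To make the cycle in the steps above concrete, here is a minimal sketch of a wait-for graph check. It is a toy model, not InnoDB's deadlock detector: the names WaitsFor, find_deadlock_cycle and scenario_edges are hypothetical, and it assumes each transaction waits for at most one other transaction.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model: each transaction waits for at most one other.
// Key = waiting transaction, value = transaction holding the wanted lock.
using WaitsFor = std::map<std::string, std::string>;

// Follow wait-for edges from `start`. Returns the path start -> ... -> start
// if the chain leads back to `start`, otherwise an empty vector.
std::vector<std::string> find_deadlock_cycle(const WaitsFor& waits_for,
                                             const std::string& start)
{
  std::vector<std::string> path{start};
  std::set<std::string> seen{start};
  std::string cur = start;
  for (;;) {
    auto it = waits_for.find(cur);
    if (it == waits_for.end())
      return {};                 // chain ends: this transaction is not waiting
    cur = it->second;
    path.push_back(cur);
    if (cur == start)
      return path;               // back at the start: deadlock
    if (!seen.insert(cur).second)
      return {};                 // a cycle exists, but it does not include `start`
  }
}

// Wait-for edges of the scenario described in the comment:
// T2 waits for T1 (row lock on t2), T3 waits for T2 (insert intention lock
// on t1), and finally T1 waits for T3 (auto-increment lock on t1).
WaitsFor scenario_edges()
{
  return {{"T2", "T1"}, {"T3", "T2"}, {"T1", "T3"}};
}
```

Calling find_deadlock_cycle(scenario_edges(), "T1") yields the path T1, T3, T2, T1. Without the last edge (T1 waiting for T3) the result is empty, which matches the point that the deadlock only forms once T1 requests the auto-increment lock.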

            The issue here is that ha_innobase::get_auto_increment() can hit a deadlock involving the auto-increment lock, in which case the transaction is rolled back implicitly.

            #7  0x0000555c8e463105 in trx_t::rollback (this=this@entry=0x7f1a4c675780, savept=savept@entry=0x0) at /data/Server/11.1-MDEV-31154/storage/innobase/trx/trx0roll.cc:178
            #8  0x0000555c8e3d24a6 in row_mysql_handle_errors (new_err=new_err@entry=0xb542271d73c, trx=trx@entry=0x7f1a4c675780, thr=thr@entry=0x6e08740492a0, savept=savept@entry=0x0)
                at /data/Server/11.1-MDEV-31154/storage/innobase/row/row0mysql.cc:702
            #9  0x0000555c8e3d2758 in row_lock_table_autoinc_for_mysql (prebuilt=0x6e0874048b40) at /data/Server/11.1-MDEV-31154/storage/innobase/row/row0mysql.cc:1138
            #10 0x0000555c8e25cea0 in ha_innobase::innobase_lock_autoinc (this=this@entry=0x6e0874047e80) at /data/Server/11.1-MDEV-31154/storage/innobase/handler/ha_innodb.cc:7696
            #11 0x0000555c8e25f06d in ha_innobase::innobase_get_autoinc 
            

            For such cases, storage engines usually call thd_mark_transaction_to_rollback() to tell the server to roll back in the other SEs and close the transaction. In InnoDB we call it while converting the error code to a MySQL error code. However, since ::innobase_get_autoinc() returns void, we don't return any MySQL error code, so we skip the error code conversion and, for a deadlock error, never mark the transaction for rollback.

            The way the caller of ::innobase_get_autoinc() learns of an error is somewhat confusing: the auto-increment value ULONGLONG_MAX communicates the error.

            int handler::update_auto_increment()
            {
              ...
              get_auto_increment(variables->auto_increment_offset,
                                 variables->auto_increment_increment,
                                 nb_desired_values, &nr,
                                 &nb_reserved_values);
              if (nr == ULONGLONG_MAX)
                DBUG_RETURN(HA_ERR_AUTOINC_READ_FAILED);  // Mark failure
              ...
            }
            

            The solution is quite straightforward. Since convert_error_code_to_mysql() also performs some generic error handling, such as invoking the callback when needed, we should call that function in ha_innobase::get_auto_increment() even if we don't return the resulting error code.
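
The idea can be illustrated with a small self-contained sketch. This is not the actual patch and not the server's API: Session, EngineError, convert_error, AUTOINC_FAILED and both get_auto_increment_* functions are hypothetical stand-ins, and the numeric codes are arbitrary toy values. It only shows why calling the conversion routine for its side effect matters when the interface returns void and reports failure through a sentinel.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-ins for engine-level error codes.
enum class EngineError { Success, Deadlock, LockWaitTimeout };

// Hypothetical stand-in for the server session; the flag models what a call
// such as thd_mark_transaction_to_rollback() would record.
struct Session {
  bool marked_for_rollback = false;
};

// Plays the role of ULONGLONG_MAX as the "read failed" sentinel.
constexpr std::uint64_t AUTOINC_FAILED =
    std::numeric_limits<std::uint64_t>::max();

// Toy analogue of an error conversion routine: maps the engine error to a
// server-side code (arbitrary toy values) and, as a side effect, marks the
// session when the engine has already rolled the transaction back.
int convert_error(EngineError err, Session& session)
{
  switch (err) {
  case EngineError::Success:
    return 0;
  case EngineError::Deadlock:
    session.marked_for_rollback = true;  // the generic part that was skipped
    return 1;
  case EngineError::LockWaitTimeout:
    return 2;
  }
  return -1;
}

// Before: the void interface can only signal failure through the sentinel,
// so the conversion (and its side effect) never happens.
void get_auto_increment_before(EngineError lock_result, Session&,
                               std::uint64_t* first_value)
{
  if (lock_result != EngineError::Success) {
    *first_value = AUTOINC_FAILED;
    return;
  }
  *first_value = 1;
}

// After: still void and still sentinel-based, but the conversion is invoked
// purely for its side effect and its return value is discarded.
void get_auto_increment_after(EngineError lock_result, Session& session,
                              std::uint64_t* first_value)
{
  if (lock_result != EngineError::Success) {
    (void) convert_error(lock_result, session);
    *first_value = AUTOINC_FAILED;
    return;
  }
  *first_value = 1;
}
```

In this model, a deadlock passed to get_auto_increment_before() leaves marked_for_rollback false even though the sentinel is returned, while get_auto_increment_after() returns the same sentinel and also sets the flag.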

            debarun Debarun Banerjee added the comment above.
            debarun Debarun Banerjee added a comment - marko for your review. https://github.com/MariaDB/server/pull/3102

            marko Marko Mäkelä added a comment - Thank you, the code change looks good and simple.

            People

              debarun Debarun Banerjee
              debarun Debarun Banerjee