Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13935

INSERT INTO stuck at state "Unlocking tables"

Details

    Description

      Disabling query cache and persistent stats does not help.

      | query_cache_type             | OFF                 |
      | innodb_stats_persistent              | OFF           |
      

      Inner deadlock?

      Thread 4 (Thread 0x7ffb000c9700 (LWP 11346)):
      #0  0x00007ffff6a4b00d in nanosleep () from /lib64/libpthread.so.0
      #1  0x00007ffff7742ce0 in os_thread_sleep(unsigned long) ()
      #2  0x00007ffff76d7c9f in TrxInInnoDB::wait(trx_t*) ()
      #3  0x00007ffff76ca63e in TrxInInnoDB::enter ()
      #4  0x00007ffff76d2ea4 in ha_innobase::external_lock(THD*, int) ()
      #5  0x00007ffff75298f9 in handler::ha_external_lock(THD*, int) ()
      #6  0x00007ffff75e4c0b in unlock_external(THD*, TABLE**, unsigned int) ()
      #7  0x00007ffff75e4ecc in mysql_unlock_tables(THD*, st_mysql_lock*, bool) ()
      #8  0x00007ffff735d9c7 in close_thread_tables(THD*) ()
      #9  0x00007ffff73a0bf7 in mysql_execute_command(THD*) ()
      #10 0x00007ffff73b910e in Prepared_statement::execute(String*, bool) ()
      #11 0x00007ffff73bc74b in Prepared_statement::execute_loop ()
      #12 0x00007ffff73bd3c2 in mysql_stmt_execute_common(THD*, unsigned long, unsigned char*, unsigned char*, unsigned long, bool, bool) ()
      #13 0x00007ffff73bd507 in mysqld_stmt_execute(THD*, char*, unsigned int) ()
      #14 0x00007ffff73a9d2a in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) ()
      #15 0x00007ffff73ab14e in do_command(THD*) ()
      #16 0x00007ffff7467f4f in do_handle_one_connection(CONNECT*) ()
      #17 0x00007ffff7468074 in handle_one_connection ()
      #18 0x00007ffff769eedd in pfs_spawn_thread ()
      #19 0x00007ffff6a43aa1 in start_thread () from /lib64/libpthread.so.0
      #20 0x00007ffff515bbcd in clone () from /lib64/libc.so.6
      

      |  48 | dxxdata     | 172.19.171.239:49345 | muikku | Execute | 42884 | Unlocking tables               | INSERT INTO osrchan (cc, osrhandle, operid, osrcode, objtype, time, osrcomm, keyval, dbtable, valtype, fields, val, changroupid, numfields, wgroupid, keytype, transactid, nofchanges ) VALUES (400200,1076913024,69,1,8,getutcdate(),554698778,0x0,0,1,0x0,0x5b00000023000000250000002a000000150200000014015265734c6576656c000523000000140152657356616c756500055b3c000015020000001401333500014d504c5356504e3a3135343531001401416374696f6e5479706500013131,400197,0,0,1,400197,4)
                                                                                                                 |    0.000 |
      |  52 | dxxdata     | 172.19.171.239:49567 | muikku | Prepare | 15422 | Waiting for table metadata lock | SELECT   operator.operid, operator.previouslogin, operator.previouslogout, operator.maxtimepreinactive, operator.profid FROM operator  WHERE operator.operatorstate = 1 AND operator.operatortype != 99 AND operator.maxtimepreinactive != 0  ORDER BY  operator.operid ASC
                                                                                                                 |    0.000 |
      |  54 | dxxdata     | 172.19.171.239:49584 | muikku | Execute | 41355 | query end                      | INSERT INTO osrchan (cc, osrhandle, operid, osrcode, objtype, time, osrcomm, keyval, dbtable, valtype, fields, val, changroupid, numfields, wgroupid, keytype, transactid, nofchanges ) VALUES (566222,1076912960,69,1,720,getutcdate(),554881030,0x0,0,2,0x0,0x0,566221,0,0,1,566221,2)
                                                                                                                 |    0.000 |
      |  57 | dxxdata     | 172.19.132.29:57559  | muikku | Sleep   |     5 |                                 | NULL
                                                                                                                 |    0.000 |
      |  58 | mysql       | %:57560              | muikku | Execute |     4 | Sending data                    | insert into pms15minpwe3mibintervalhistory SELECT * FROM  pms15minpwe3mibintervalhistory58 on duplicate key update pms15minpwe3mibintervalhistory.nid = pms15minpwe3mibintervalhistory58.nid
                                                                                                                 |    0.000 |
      |  60 | dxxdata     | 172.19.171.239:49596 | muikku | Execute | 42958 | NULL                            | INSERT INTO osrchan (cc, osrhandle, operid, osrcode, objtype, time, osrcomm, keyval, dbtable, valtype, fields, val, changroupid, numfields, wgroupid, keytype, transactid, nofchanges ) VALUES (392715,1076913472,69,1,0,getutcdate(),555845638,0x010007004f3c000001,392,1,0x0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f,0x4f3c000001000000000000000700000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000404320032,392709,31,0,1,392709,8)
                                                                                                                 |    0.000 |
      |  62 | dxxdata     | 172.19.132.29:57592  | muikku | Sleep   |     5 |                                 | NULL
      

      Attachments

        1. stack.txt
          130 kB
        2. fies1udb32.err
          12 kB
        3. global_variables.log
          18 kB
        4. global_status.log
          12 kB
        5. mysqld_hanged_20171108.txt
          64 kB
        6. gdb_mysqld_20171123.txt
          10 kB
        7. mysqld_10_2_13_hanged_spid_65.zip
          23 kB

        Issue Links

          Activity

            I have now pushed this to 10.2. Some WSREP related cleanup I pushed to 10.1.
            It turns out that MariaDB 10.2 uses a different mechanism for asynchronously killing transactions; see my follow-up fix in 10.1 after porting the cleanup from 10.2. In 10.2, the function thd_report_wait_for() has apparently been replaced with thd_rpl_deadlock_check().

            marko Marko Mäkelä added a comment - I have now pushed this to 10.2. Some WSREP related cleanup I pushed to 10.1. It turns out that MariaDB 10.2 uses a different mechanism for asynchronously killing transactions; see my follow-up fix in 10.1 after porting the cleanup from 10.2. In 10.2, the function thd_report_wait_for() has apparently been replaced with thd_rpl_deadlock_check() .
            alibo Ali Borhani added a comment -

            Since we've updated to 10.2.14, we haven't faced this issue. Also, the average response time of our endpoints is decreased by 100-150%. Thank you all

            alibo Ali Borhani added a comment - Since we've updated to 10.2.14, we haven't faced this issue. Also, the average response time of our endpoints is decreased by 100-150%. Thank you all
            manttila Manu Anttila added a comment -

            10.2.14 has solved the issue. Thank you very much.

            manttila Manu Anttila added a comment - 10.2.14 has solved the issue. Thank you very much.

            We faced the same issue, but Galera
            MariaDB 10.2.13
            It has been resolved too?

            matteraf Fernando Mattera added a comment - We faced the same issue, but Galera MariaDB 10.2.13 It has been resolved too?

            matteraf, Galera is using InnoDB underneath, and the InnoDB in MariaDB Server contains some related modifications (WSREP patches, for write-set replication). Most of the Galera-related changes in InnoDB have something to do with record locking. So yes, it looks like you were affected by exactly this issue. The hang was caused by new code that was introduced in MySQL 5.7 (and merged to MariaDB 10.2) for Galera-like functionality in MySQL 5.7. This code was mostly dead code, because some related callback functions were replaced with stubs in MariaDB. So, it was safe to revert those changes. The hang was caused by something in the class TrxInInnoDB that was part of the added 5.7 code. I did not study the exact mechanism. Now that the unnecessary 5.7 code has been removed from MariaDB, it is safe to say that this hang is not possible any more.

            marko Marko Mäkelä added a comment - matteraf , Galera is using InnoDB underneath, and the InnoDB in MariaDB Server contains some related modifications ( WSREP patches, for write-set replication). Most of the Galera-related changes in InnoDB have something to do with record locking. So yes, it looks like you were affected by exactly this issue. The hang was caused by new code that was introduced in MySQL 5.7 (and merged to MariaDB 10.2) for Galera-like functionality in MySQL 5.7. This code was mostly dead code, because some related callback functions were replaced with stubs in MariaDB. So, it was safe to revert those changes. The hang was caused by something in the class TrxInInnoDB that was part of the added 5.7 code. I did not study the exact mechanism. Now that the unnecessary 5.7 code has been removed from MariaDB, it is safe to say that this hang is not possible any more.

            People

              marko Marko Mäkelä
              manttila Manu Anttila
              Votes:
              6 Vote for this issue
              Watchers:
              16 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.