Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5863

"Lock wait timeout exceeded", lost events, Assertion `!table || !table->in_use || ta ble->in_use == _current_thd()' failure upon parallel replicating



    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.0.10
    • Component/s: None
    • Labels:


      The attached binary log was created by a fair concurrent test on master, without any manual editing.
      In case it relevant, the master server was running with the following options:

      10.0/sql/mysqld --no-defaults --basedir=10.0 --datadir=/home/elenst/test_results/analyze2/data --lc-messages-dir=10.0/sql/share --character-sets-dir=10.0/sql/share/charsets --tmpdir=/home/elenst/test_results/analyze2/tmp --core-file --max-allowed-packet=128Mb --port=10730 --socket=/home/elenst/test_results/analyze2/mysql.sock --pid-file=/home/elenst/test_results/analyze2/mysql.pid --general-log --general-log-file=/home/elenst/test_results/analyze2/mysql.log --server_id=1 --report-host= --report_port=10730 --innodb --sql-mode=no_engine_substitution --binlog_commit_wait_count=10 --binlog_commit_wait_usec=1000000 --ignore-builtin-innodb --plugin-load=ha_innodb.so --slave-skip-errors=1054,1317,1049,1305,1539,1505 --log-output=FILE --slave_parallel_threads=16 --slave_parallel_max_queued=4194304

      Despite slave-specific options, it was not configured as a slave and was not replicating from anywhere.

      Now I'm feeding this binary log to a slave running with slave-parallel-threads=16 (no other special options either on master or on slave is necessary).

      I'm getting several problems.

      Replication failure

      Initially replication breaks with

      140314 18:04:40 [ERROR] Slave SQL: Error 'Lock wait timeout exceeded; try restarting transaction' on query. Default database: 'test'. Query: 'INSERT INTO `view_A` ( `pk` ) VALUES (NULL)', Internal MariaDB error code: 1205
      140314 18:04:40 [Warning] Slave: Lock wait timeout exceeded; try restarting transaction Error_code: 1205
      140314 18:04:40 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000001' position 312054

      It happens every time, in the same place, does not seem to be a race condition, probably some logic specific to parallel replication.

      Data discrepancy

      After the previous failure, I simply issue START SLAVE again, and replication does not fail anymore.
      But when I check the contents of view_A, it turns out that it only has 4 rows while the master has 30+ (either the master on the original datadir, or after feeding the binlog into a new one via MySQL client).
      It also happens every time.

      Assertion failure

      When slave reports that it has finished replication, I attempt to shut down the server, and get the assertion failure upon shutdown:

      mysqld: 10.0/sql/handler.cc:2381: THD* handler::ha_thd() const: Assertion `!table || !table->in_use || table->in_use == _current_thd()' failed.
      140314 18:12:47 [ERROR] mysqld got signal 6 ;

      #5  0x00007f0a9a4b46f0 in *__GI_abort () at abort.c:92
      #6  0x00007f0a9a4aa621 in *__GI___assert_fail (assertion=0xfdbc48 "!table || !table->in_use || table->in_use == _current_thd()", file=<optimized out>, line=2381, function=0xfde5c0 "THD* handler::ha_thd() const") at assert.c:81
      #7  0x000000000085f648 in handler::ha_thd (this=0x7f0a6d62f088) at 10.0/sql/handler.cc:2381
      #8  0x0000000000b74035 in ha_innobase::extra (this=0x7f0a6d62f088, operation=HA_EXTRA_DETACH_CHILDREN) at 10.0/storage/xtradb/handler/ha_innodb.cc:12762
      #9  0x000000000060fc14 in mark_tmp_table_for_reuse (table=0x7f0a7121c070) at 10.0/sql/sql_base.cc:696
      #10 0x000000000060faf2 in mark_temp_tables_as_free_for_reuse (thd=0x7f0a6bd11070) at 10.0/sql/sql_base.cc:663
      #11 0x0000000000610224 in close_thread_tables (thd=0x7f0a6bd11070) at 10.0/sql/sql_base.cc:943
      #12 0x000000000077d1e1 in rpl_group_info::slave_close_thread_tables (this=0x7f0a6bcde870, thd=0x7f0a6bd11070) at 10.0/sql/rpl_rli.cc:1699
      #13 0x000000000077ced0 in rpl_group_info::cleanup_context (this=0x7f0a6bcde870, thd=0x7f0a6bd11070, error=true) at 10.0/sql/rpl_rli.cc:1611
      #14 0x00000000005e3381 in handle_slave_sql (arg=0x7f0a7c3ad070) at 10.0/sql/slave.cc:4594
      #15 0x0000000000a2da58 in pfs_spawn_thread (arg=0x7f0a72887ef0) at 10.0/storage/perfschema/pfs.cc:1853
      #16 0x00007f0a9c04eb50 in start_thread (arg=<optimized out>) at pthread_create.c:304
      #17 0x00007f0a9a559a7d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

      Stack trace from:

      revision-id: monty@mariadb.org-20140313144311-hmphz2gup3qdjlp9
      date: 2014-03-13 16:43:11 +0200
      build-date: 2014-03-14 19:43:28 +0400
      revno: 4051
      branch-nick: 10.0

      Server where the binary log was created

      revision-id: svoj@mariadb.org-20140307074306-blf3begv8ee0tb5j
      date: 2014-03-07 11:43:06 +0400
      build-date: 2014-03-14 17:43:49 +0200
      revno: 4045
      branch-nick: 10.0


          Issue Links



              • Assignee:
                knielsen Kristian Nielsen
                elenst Elena Stepanova
              • Votes:
                0 Vote for this issue
                3 Start watching this issue


                • Created: