  1. MariaDB Server
  2. MDEV-3925

Galera: Node aborts with a bogus error 1317: InnoDB: Cannot delete/update rows with cascading foreign key constraints that exceed max depth of 255

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 5.5.28a-galera
    • Fix Version/s: 5.5.34-galera
    • None

    Description

      General description
      The RQG grammars, data file, and server command lines follow below.

      I have a 3-node cluster (2 servers and the arbitrator).
      Test data consists of one InnoDB table with two columns, a PK and a non-unique key on a char(1) column. The table contains 20 rows.

      The test is very similar to the one in MDEV-3924, with two differences: more threads on the 1st node and a slightly different query on the 2nd node. On a release build, these differences consistently produce a different replication failure; on a debug build, the assertion failure is the same.

      Test flow on the 1st node:
      several threads (7 in the provided test)
      all but one run UPDATE <table> SET <non-pk> = ... ORDER BY ... LIMIT 8
      the last one runs KILL QUERY <a random one of the other threads>

      Test flow on the 2nd node:
      Single thread runs UPDATE <table> SET <non-pk> = ... ORDER BY <field list> LIMIT <a few rows>

      After a few minutes on a release build, replication on the first node fails with:

      [ERROR] Slave SQL: Could not execute Update_rows event on table test.table20_innodb_int_autoinc; Query execution was interrupted, Error_code: 1317; InnoDB: Cannot delete/update rows with cascading foreign key constraints that exceed max depth of 255. Please drop extra constraints and try again, Error_code: 152; Got error -1 from storage engine, Error_code: 1030; handler error No Error!; the event's master log FIRST, end_log_pos 126, Error_code: 1317
      [Warning] WSREP: RBR event 2 Update_rows apply warning: -1, 3687
      [Warning] WSREP: failed to replay trx: source: a713e54f-421f-11e2-0800-0ad5746759b3 version: 2 local: 1 state: REPLAYING flags: 1 conn_id: 7 trx_id: 15082 seqnos (l: 3741, g: 3687, s: 3685, d: 3686, ts: 1355071510004060772)
      [Warning] WSREP: Failed to apply app buffer: ^U<C0><C4>P^S^A, seqno: 3687, status: WSREP_FATAL
               at galera/src/replicator_smm.cpp:apply_wscoll():49
               at galera/src/replicator_smm.cpp:apply_trx_ws():120
      [ERROR] WSREP: trx_replay failed for: 5, query: UPDATE `table20_innodb_int_autoinc` SET `col_char_1_key` = 'f' ORDER BY `col_char_1_key`,`pk` LIMIT 8
      [ERROR] Aborting

      After that the server hangs.

      The weird thing is, there are no foreign key constraints on the table, which is why the report summary says "bogus error".
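
      The applier error line above stacks several error codes in one message, which makes it easy to misread. As a rough triage aid, the sketch below pulls the codes out in order. It is only an illustration: the function name error_codes is made up here, and the log line is copied from this report.

```python
import re

# The "Slave SQL" line from the release-build log in this report.
LOG_LINE = (
    "[ERROR] Slave SQL: Could not execute Update_rows event on table "
    "test.table20_innodb_int_autoinc; Query execution was interrupted, "
    "Error_code: 1317; InnoDB: Cannot delete/update rows with cascading "
    "foreign key constraints that exceed max depth of 255. Please drop extra "
    "constraints and try again, Error_code: 152; Got error -1 from storage "
    "engine, Error_code: 1030; handler error No Error!; the event's master "
    "log FIRST, end_log_pos 126, Error_code: 1317"
)

def error_codes(line):
    """Return every 'Error_code: N' value in the order it appears (hypothetical helper)."""
    return [int(code) for code in re.findall(r"Error_code: (\d+)", line)]

print(error_codes(LOG_LINE))  # [1317, 152, 1030, 1317]
```

      Read this way, the outer code is 1317 ("Query execution was interrupted"), which matches the KILL QUERY workload, while the foreign key message is attached to the inner code 152.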

      On a debug build it takes longer, but the server eventually aborts with:

      maria-5.5-galera/sql/sql_error.h:76: uint Diagnostics_area::sql_errno() const: Assertion `m_status == DA_ERROR' failed.
      [ERROR] mysqld got signal 6 ;

      #6  0x00007f1e0aeb6d4d in __GI___assert_fail (assertion=0xd495da "m_status == DA_ERROR", file=<optimized out>, line=76, function=<optimized out>) at assert.c:81
      #7  0x000000000057bf15 in Diagnostics_area::sql_errno (this=0x7f1dd804a530) at maria-5.5-galera/sql/sql_error.h:76
      #8  0x00000000008cd766 in Rows_log_event::do_apply_event (this=0x4b121b0, rli=0x4816d10) at maria-5.5-galera/sql/log_event.cc:8280
      #9  0x0000000000592e8c in Log_event::apply_event (this=0x4b121b0, rli=0x4816d10) at maria-5.5-galera/sql/log_event.h:1230
      #10 0x00000000006284e9 in wsrep_apply_rbr (thd=0x7f1dd8046c00, rbr_buf=0x4af37c0 "\343\302\304P\023\001", buf_len=0) at maria-5.5-galera/sql/sql_parse.cc:8098
      #11 0x0000000000628ad5 in wsrep_apply_cb (ctx=0x7f1dd8046c00, buf=0x4af37c0, buf_len=112, global_seqno=33591) at maria-5.5-galera/sql/sql_parse.cc:8177
      #12 0x00007f1e0a1faabf in apply_wscoll (trx=..., apply_cb=0x628a27 <wsrep_apply_cb(void*, void const*, unsigned long, long)>, recv_ctx=0x7f1dd8046c00) at galera/src/replicator_smm.cpp:37
      #13 apply_trx_ws (recv_ctx=0x7f1dd8046c00, apply_cb=0x628a27 <wsrep_apply_cb(void*, void const*, unsigned long, long)>, commit_cb=0x628d32 <wsrep_commit_cb(void*, long, bool)>, trx=...) at galera/src/replicator_smm.cpp:81
      #14 0x00007f1e0a1ff00f in galera::ReplicatorSMM::replay_trx (this=0x30faed0, trx=0x4a82520, trx_ctx=0x7f1dd8046c00) at galera/src/replicator_smm.cpp:821
      #15 0x00007f1e0a216b76 in galera_replay_trx (gh=<optimized out>, trx_handle=<optimized out>, recv_ctx=0x7f1dd8046c00) at galera/src/wsrep_provider.cpp:658
      #16 0x000000000062330b in wsrep_mysql_parse (thd=0x7f1dd8046c00, rawbuf=0x49cdc28 "UPDATE `table20_innodb_int_autoinc` SET `col_char_1_key` = 'a' ORDER BY `col_char_1_key`,`pk` LIMIT 8", length=101, parser_state=0x7f1ddc3f8550) at maria-5.5-galera/sql/sql_parse.cc:6085
      #17 0x000000000061543b in dispatch_command (command=COM_QUERY, thd=0x7f1dd8046c00, packet=0x7f1dd810b6b1 "UPDATE `table20_innodb_int_autoinc` SET `col_char_1_key` = 'a' ORDER BY `col_char_1_key`,`pk` LIMIT 8", packet_length=101) at maria-5.5-galera/sql/sql_parse.cc:1230
      #18 0x000000000061429e in do_command (thd=0x7f1dd8046c00) at maria-5.5-galera/sql/sql_parse.cc:890
      #19 0x000000000071c1b8 in do_handle_one_connection (thd_arg=0x7f1dd8046c00) at maria-5.5-galera/sql/sql_connect.cc:1278
       

      Some pointers may be invalid and cause the dump to abort.
      Query (0x49cdc28): UPDATE `table20_innodb_int_autoinc` SET `col_char_1_key` = 'a' ORDER BY `col_char_1_key`,`pk` LIMIT 8
      Connection ID (thread ID): 12
      Status: KILL_QUERY

      branch: maria-5.5-galera
      revision-id: seppo.jaakola@codership.com-20121130113629-lhwlr2ncrib15h18
      date: 2012-11-30 13:36:29 +0200
      revno: 3358

      Command lines:

      maria-5.5-galera/sql/mysqld --defaults-file=maria-5.5-galera/mydef1.cnf --datadir=maria-5.5-galera/data1 --wsrep_provider=galera-23.2.2-src/libgalera_smm.so --wsrep_sst_method=rsync --core --default-storage-engine=InnoDB --innodb_autoinc_lock_mode=2 --innodb_locks_unsafe_for_binlog=1 --binlog-format=row --innodb_flush_log_at_trx_commit=0 --log-error=log.err --basedir=maria-5.5-galera/ --port=8306 --loose-lc-messages-dir=maria-5.5-galera/sql/share --socket=/tmp/elenst-galera-1.sock --tmpdir=maria-5.5-galera/data1/tmp --general-log=1 --wsrep_cluster_address=gcomm:// --core --log-bin=master-bin
       
      maria-5.5-galera/sql/mysqld --defaults-file=maria-5.5-galera/mydef2.cnf --datadir=maria-5.5-galera/data2 --wsrep_provider=galera-23.2.2-src/libgalera_smm.so --wsrep_sst_method=rsync --core --default-storage-engine=InnoDB --innodb_autoinc_lock_mode=2 --innodb_locks_unsafe_for_binlog=1 --binlog-format=row --innodb_flush_log_at_trx_commit=0 --log-error=log.err --basedir=maria-5.5-galera/ --port=8307 --loose-lc-messages-dir=maria-5.5-galera/sql/share --socket=/tmp/elenst-galera-2.sock --tmpdir=maria-5.5-galera/data2/tmp --general-log=1 --wsrep_cluster_address=gcomm://127.0.0.1:4567?gmcast.listen_addr=tcp://127.0.0.1:4566 --core --log-bin=master-bin

      (The mydefX.cnf files are irrelevant; they only contain datadirs and ports.)

      The test is run via RQG, one instance per node. The data file is the same for both instances; the grammars differ slightly.

      RQG command lines:

      perl gentest.pl --gendata=1.zz --threads=7 --queries=100M --duration=21600 --dsn=dbi:mysql:host=127.0.0.1:port=8306:user=root:database=test --grammar=1a.yy 
       
      perl gentest.pl --gendata=1.zz --threads=1 --queries=100M --duration=21600 --dsn=dbi:mysql:host=127.0.0.1:port=8307:user=root:database=test --grammar=1b.yy

      data file (1.zz):

      $tables = {
              rows => [ 20 ],
              engines => [ 'InnoDB' ]
      };
       
      $fields = {
              types => [ 'char(1)' ],
              pk => [ 'int' ],
              indexes => [ 'key' ]
      };
       
      $data = {
              numbers => [ 'digit' ],
              strings => [ 'letter' ]
      }

      Grammar for the 1st node (1a.yy):

      thread7_init:
              SELECT CONNECTION_ID() INTO @killer ;
       
      thread7:
              KILL QUERY @killer - kill_thread ;
       
      kill_thread:
              1 | 2 | 3 | 4 | 5 | 6 ;
       
      query:
              UPDATE _table SET _field_no_pk = _varchar(1) ORDER BY _field_list LIMIT 8 ;
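
      To make the kill targeting in 1a.yy concrete: thread 7 stores its own connection ID in @killer and subtracts one of the kill_thread values. The sketch below lists the connection IDs it can hit. It assumes the seven RQG connections received consecutive IDs (not stated in this report); if they did not, some KILL QUERY statements would target a nonexistent or unrelated connection. The function names are made up for illustration.

```python
import random

# Offsets listed by the kill_thread rule in 1a.yy.
KILL_OFFSETS = (1, 2, 3, 4, 5, 6)

def possible_targets(killer_id):
    """Connection IDs that `KILL QUERY @killer - kill_thread` can address."""
    return [killer_id - offset for offset in KILL_OFFSETS]

def kill_statement(killer_id, rng=random):
    """Expand the thread7 rule once with a randomly chosen offset."""
    return f"KILL QUERY {killer_id - rng.choice(KILL_OFFSETS)}"

# Hypothetical IDs: thread 7 holds connection 7, the updaters hold 1 to 6.
print(possible_targets(7))  # [6, 5, 4, 3, 2, 1]
print(kill_statement(7))
```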

      Grammar for the 2nd node (1b.yy):

      query:
              UPDATE _table SET _field_no_pk = _varchar(1) ORDER BY _field_list LIMIT large_digit ;
       
      large_digit:
              5 | 6 | 7 | 8 ;

          People

            Assignee: jplindst Jan Lindström (Inactive)
            Reporter: elenst Elena Stepanova