Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-28777

binlog.binlog_truncate_multi_engine failed in bb with Lost connection to server during query

Details

    Description

      https://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos74-amd64-debug/builds/8853

      10.6 4179f93d28035ea2798cb1c16feeaaef

      binlog.binlog_truncate_multi_engine 'innodb,row' w4 [ fail ]
              Test ended at 2022-06-06 16:16:56
       
      CURRENT_TEST: binlog.binlog_truncate_multi_engine
      mysqltest: In included file "/usr/share/mysql-test/suite/binlog/t/binlog_truncate_multi_engine.inc": 
      included from /usr/share/mysql-test/suite/binlog/t/binlog_truncate_multi_engine.test at line 47:
      At line 42: query 'if (`SELECT $case = "B"`)
      {' failed: <Unknown> (2013): Lost connection to server during query
       
      The result from queries just before the failure was:
      < snip >
      #
      # Case "B" : "one engine has committed its transaction branch"
      #
      RESET MASTER;
      FLUSH LOGS;
      SET GLOBAL max_binlog_size= 4096;
      connect con1,localhost,root,,;
      List of binary logs before rotation
      show binary logs;
      Log_name	File_size
      master-bin.000001	#
      master-bin.000002	#
      INSERT INTO t1 VALUES (1, REPEAT("x", 1));
      INSERT INTO t2 VALUES (1, REPEAT("x", 1));
      SET GLOBAL debug_dbug="d,enable_log_write_upto_crash";
      BEGIN;
      INSERT INTO t2 VALUES (2, REPEAT("x", 4100));
      INSERT INTO t1 VALUES (2, REPEAT("x", 4100));
      COMMIT;
      connection default;
      

      Has happened a few times, always on kvm-rpm-centos74-amd64-debug.

      Attachments

        Activity

          angelique.sklavounos Angelique Sklavounos (Inactive) created issue -
          serg Sergei Golubchik made changes -
          Field Original Value New Value
          Assignee Andrei Elkin [ elkin ]
          serg Sergei Golubchik made changes -
          Fix Version/s 10.6 [ 24028 ]
          Fix Version/s 10.7 [ 24805 ]
          Fix Version/s 10.8 [ 26121 ]
          Fix Version/s 10.9 [ 26905 ]
          Fix Version/s 10.10 [ 27530 ]

          Binlog does

          void
          MYSQL_BIN_LOG::trx_group_commit_leader(group_commit_entry *leader)
          {
          ...
                mysql_mutex_lock(&current->thd->LOCK_thd_data);
                run_commit_ordered(current->thd, current->all);
                mysql_mutex_unlock(&current->thd->LOCK_thd_data);
          

          Then in run_commit_ordered()

          void
          TC_LOG::run_commit_ordered(THD *thd, bool all)
          {
          ...    DEBUG_SYNC(thd, "commit_after_run_commit_ordered");
            }
          }
          

          When the server is being shut down it runs

          static my_bool kill_thread_phase_1(THD *thd, int *n_threads_awaiting_ack)
          {
            if (thd->slave_thread || thd->is_binlog_dump_thread() ||
                (shutdown_wait_for_slaves &&
                ...
          

          which does

          bool THD::is_binlog_dump_thread()
          {
            mysql_mutex_lock(&LOCK_thd_data);
          

          but this mutex is already taken by the group commit leader, which is forever waiting on the "commit_after_run_commit_ordered" sync point. mysqltest gives up waiting for the server to shut down and kills it with SIGABRT.

          serg Sergei Golubchik added a comment - Binlog does void MYSQL_BIN_LOG::trx_group_commit_leader(group_commit_entry *leader) { ... mysql_mutex_lock(&current->thd->LOCK_thd_data); run_commit_ordered(current->thd, current->all); mysql_mutex_unlock(&current->thd->LOCK_thd_data); Then in run_commit_ordered() void TC_LOG::run_commit_ordered(THD *thd, bool all) { ... DEBUG_SYNC(thd, "commit_after_run_commit_ordered" ); } } When the server is being shut down it runs static my_bool kill_thread_phase_1(THD *thd, int *n_threads_awaiting_ack) { if (thd->slave_thread || thd->is_binlog_dump_thread() || (shutdown_wait_for_slaves && ... which does bool THD::is_binlog_dump_thread() { mysql_mutex_lock(&LOCK_thd_data); but this mutex is already taken by the group commit leader, which is forever waiting on the "commit_after_run_commit_ordered" sync point. mysqltest gives up waiting for the server to shut down and kills it with SIGABRT.
          angelique.sklavounos Angelique Sklavounos (Inactive) made changes -
          julien.fritsch Julien Fritsch made changes -
          Fix Version/s 10.7 [ 24805 ]
          Elkin Andrei Elkin made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Elkin Andrei Elkin added a comment -

          Interestingly S@rg's comments in there talk about a different already fixed by him `2ab52cc0e5c3` issue in the case C.

          This time it is B.
          Could you please study the commit which is hopefully instructive.

          Elkin Andrei Elkin added a comment - Interestingly S@rg's comments in there talk about a different already fixed by him `2ab52cc0e5c3` issue in the case C. This time it is B. Could you please study the commit which is hopefully instructive.
          Elkin Andrei Elkin made changes -
          Assignee Andrei Elkin [ elkin ] Angelique Sklavounos [ JIRAUSER50741 ]
          Status In Progress [ 3 ] In Review [ 10002 ]

          Thanks, the fix makes sense to me.

          angelique.sklavounos Angelique Sklavounos (Inactive) added a comment - Thanks, the fix makes sense to me.
          angelique.sklavounos Angelique Sklavounos (Inactive) made changes -
          Assignee Angelique Sklavounos [ JIRAUSER50741 ] Andrei Elkin [ elkin ]
          Status In Review [ 10002 ] Stalled [ 10000 ]
          Elkin Andrei Elkin made changes -
          Fix Version/s 10.6.13 [ 28514 ]
          Fix Version/s 10.9.6 [ 28520 ]
          Fix Version/s 10.10.4 [ 28522 ]
          Fix Version/s 10.11.3 [ 28524 ]
          Fix Version/s 10.8.8 [ 28518 ]
          Fix Version/s 10.6 [ 24028 ]
          Fix Version/s 10.8 [ 26121 ]
          Fix Version/s 10.9 [ 26905 ]
          Fix Version/s 10.10 [ 27530 ]
          Resolution Fixed [ 1 ]
          Status Stalled [ 10000 ] Closed [ 6 ]

          People

            Elkin Andrei Elkin
            angelique.sklavounos Angelique Sklavounos (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.