Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-8039

parallel replication slave - indefinate lock wait in rpl_parallel::wait_for_done when terminate_slave_thread with stop slave while show global status/show slave status was running

Details

    Description

      from MDEV-7126 - same symptoms - different case

      reproduced on 10.0.17:

      its quite likely this had an active IO thread and stopped SQL thread (was running pt-slave-delay --delay 24h --interval 1hr)

        mysql  -S /var/lib/mariadb/mysql-10.sock 
      Welcome to the MariaDB monitor.  Commands end with ; or \g.
      Your MariaDB connection id is 1438
      Server version: 10.0.17-MariaDB-log MariaDB Server
       
      Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
       
      Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
       
      MariaDB [(none)]> stop slave; set global slave_domain_parallel_threads=15; start slave;
      

      ]$ mysql -u openquery   -S /var/lib/mariadb/mysql-10.sock 
      Welcome to the MariaDB monitor.  Commands end with ; or \g.
      Your MariaDB connection id is 1443
      Server version: 10.0.17-MariaDB-log MariaDB Server
       
      Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
       
      Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
       
      MariaDB [(none)]> show processlist;
      +------+-------------+-----------+--------------------+---------+--------+--------------------------------------------------------------------------------+--------------------+----------+
      | Id   | User        | Host      | db                 | Command | Time   | State                                                                          | Info               | Progress |
      +------+-------------+-----------+--------------------+---------+--------+--------------------------------------------------------------------------------+--------------------+----------+
      |    3 | system user |           | NULL               | Connect | 264856 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |    4 | system user |           | NULL               | Connect | 264856 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |    5 | system user |           | NULL               | Connect | 264856 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |    6 | system user |           | NULL               | Connect | 264856 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |    7 | system user |           | NULL               | Connect | 264856 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |    8 | system user |           | NULL               | Connect | 264856 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |    9 | system user |           | NULL               | Connect | 264856 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |   10 | system user |           | NULL               | Connect | 264856 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |   11 | system user |           | NULL               | Connect | 264856 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |   12 | system user |           | NULL               | Connect | 264856 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |   13 | system user |           | NULL               | Connect | 114397 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   14 | system user |           | NULL               | Connect | 114397 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   15 | system user |           | NULL               | Connect | 114397 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   16 | system user |           | NULL               | Connect | 114396 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   17 | system user |           | NULL               | Connect | 114398 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   18 | system user |           | NULL               | Connect | 114396 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   19 | system user |           | NULL               | Connect | 114398 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   20 | system user |           | NULL               | Connect | 114398 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   21 | system user |           | NULL               | Connect | 114396 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   22 | system user |           | NULL               | Connect | 114397 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   24 | system user |           | NULL               | Connect | 264856 | Waiting for master to send event                                               | NULL               |    0.000 |
      |   39 | openquery   | localhost | NULL               | Sleep   |    907 |                                                                                | NULL               |    0.000 |
      | 1349 | system user |           | NULL               | Connect |  26107 | Reading event from the relay log                                               | NULL               |    0.000 |
      | 1438 | root        | localhost | NULL               | Query   |    147 | Killing slave                                                                  | stop slave         |    0.000 |
      | 1440 | munin       | localhost | information_schema | Query   |     76 | Filling schema table                                                           | SHOW GLOBAL STATUS |    0.000 |
      | 1441 | root        | localhost | NULL               | Query   |     41 | init                                                                           | show slave status  |    0.000 |
      | 1442 | munin       | localhost | information_schema | Query   |     14 | Filling schema table                                                           | SHOW GLOBAL STATUS |    0.000 |
      | 1443 | openquery   | localhost | NULL               | Query   |      0 | init                                                                           | show processlist   |    0.000 |
      +------+-------------+-----------+--------------------+---------+--------+--------------------------------------------------------------------------------+--------------------+----------+
      

      SQL thread was stopped - set password on non-existent user.

      MariaDB [(none)]> set global sql_slave_skip_counter=1; start slave;
      ERROR 1198 (HY000): This operation cannot be performed as you have a running slave ''; run STOP SLAVE '' first
      Query OK, 0 rows affected, 1 warning (0.00 sec)
       
      MariaDB [(none)]> stop slave;set global sql_slave_skip_counter=1; start slave;
      

      erver version: 10.0.17-MariaDB-log MariaDB Server
       
      Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
       
      Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
       
      MariaDB [(none)]> show processlist;
      | Id   | User        | Host      | db                 | Command | Time    | State                                                                          | Info               | Progress |
      |    3 | system user |           | NULL               | Connect | 1010764 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |    4 | system user |           | NULL               | Connect | 1010764 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |    5 | system user |           | NULL               | Connect | 1010764 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |    6 | system user |           | NULL               | Connect | 1010764 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |    7 | system user |           | NULL               | Connect | 1010764 | Waiting for work from SQL thread                                               | NULL               |    0.000 |
      |    8 | system user |           | NULL               | Connect |  944326 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |    9 | system user |           | NULL               | Connect |  944324 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   10 | system user |           | NULL               | Connect |  944327 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   11 | system user |           | NULL               | Connect |  944326 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   12 | system user |           | NULL               | Connect |  944327 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   13 | system user |           | NULL               | Connect |  944326 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   14 | system user |           | NULL               | Connect |  944325 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   15 | system user |           | NULL               | Connect |  944328 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   16 | system user |           | NULL               | Connect |  944328 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   17 | system user |           | NULL               | Connect |  944324 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   18 | system user |           | NULL               | Connect |  944328 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   19 | system user |           | NULL               | Connect |  944324 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   20 | system user |           | NULL               | Connect |  944326 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   21 | system user |           | NULL               | Connect |  944324 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   22 | system user |           | NULL               | Connect |  944324 | Waiting for prior transaction to start commit before starting next transaction | NULL               |    0.000 |
      |   23 | system user |           | NULL               | Connect | 1010763 | Waiting for master to send event                                               | NULL               |    0.000 |
      |   28 | openquery   | localhost | NULL               | Killed  |  127489 | init                                                                           | SHOW SLAVE STATUS  |    0.000 |
      |  291 | system user |           | NULL               | Connect |  854691 | Reading event from the relay log                                               | NULL               |    0.000 |
      | 2711 | openquery   | localhost | NULL               | Query   |  129024 | Killing slave                                                                  | stop slave         |    0.000 |
      | 2715 | munin       | localhost | information_schema | Killed  |  128765 | Filling schema table                                                           | SHOW GLOBAL STATUS |    0.000 |
      | 2716 | root        | localhost | NULL               | Killed  |  128709 | init                                                                           | show slave status  |    0.000 |
      | 2719 | munin       | localhost | information_schema | Killed  |  128703 | Filling schema table                                                           | SHOW GLOBAL STATUS |    0.000 |
      | 2741 | openquery   | localhost | NULL               | Query   |       0 | init                                                                           | show processlist   |    0.000 |
       
      28 rows in set (0.00 sec)
      

      Threads where well and truly halted before i killed off the global status threads.

      (gdb) where
      #0  0x000000348e40b98e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
      #1  0x000000000056167f in terminate_slave_thread ()
      #2  0x0000000000564d45 in terminate_slave_threads(Master_info*, int, bool) ()
      #3  0x00000000005efabf in stop_slave(THD*, Master_info*, bool) ()
      #4  0x00000000005d3fe2 in mysql_execute_command(THD*) ()
      #5  0x00000000005d8552 in mysql_parse ()
      #6  0x00000000005da7b0 in dispatch_command(enum_server_command, THD*, char*, unsigned int) ()
      #7  0x00000000006963d3 in do_handle_one_connection(THD*) ()
      #8  0x00000000006964a2 in handle_one_connection ()
      #9  0x000000348e4079d1 in start_thread () from /lib64/libpthread.so.0
      #10 0x000000348e0e8b6d in clone () from /lib64/libc.so.6
      

      full bt in gdb-10.0-terminate_slave_thread.txt

      Analysis

      Positions are from objdump on the executable

      Looking at the show slave status this is at the pthread_mutex_lock position 0x63551a in fill_status though a copy show pthread_mutex lock is from fill_status(635419)->show_status_array(62a94c)->show_heartbeat_period(54a982).

      show_heartbeat_period is blocked acquiring the LOCK_active_mi.

      fill_status(63551a) is waiting on LOCK_show_status just before show_status_array which means its waiting on the fill_status that reached show_heartbeat_period.

      0x5d5953 in mysql_execute_command is waiting on a lock LOCK_active_mi for a show slave status command.

      terminate_slave_thread looking at the disassembled code the following is the only cond_timedwait

      mysql_cond_timedwait(term_cond, term_lock, &abstime)
      

      and its waiting here.

      thd->LOCK_thd_data is unlocked.
      term_lock is sql_lock (in terminate_slave_threads) based on the position in the terminate_slave_threads, which is (mi->rli.run_lock) (which is locked) in stop_slave (by calling lock_slave_thread).
      term_cond is mi->rli.stop_cond

      We can see this is terminating the sql thread.

      In both stack traces have handle_slave_sql(line 4676) calling in rpl_parallel::wait_for_done(6d52ec) at mysql_cond_wait(&e->COND_parallel_entry, &rpt->LOCK_rpl_thread); gdb-10.0-terminate_slave_thread.txt has two of these threads in the same place which strikes me as odd (wasn't multisource).

      So the blocking rpl_parallel::wait_for_done at a lock point and is preventing the mysql_cond_broadcast(&rli->stop_cond); at the bottom of the handle_slave_sql thread.

      So the point of being stuck is at rpl_parallel::wait_for_done(6d52ec) at mysql_cond_wait(&e->COND_parallel_entry, &rpt->LOCK_rpl_thread)

      Attachments

        Issue Links

          Activity

            danblack Daniel Black added a comment - Slave status locks fixed in MDEV-7126 https://github.com/MariaDB/server/commit/f8320210e78232fed350754faf72d14af63d0c3e
            elenst Elena Stepanova added a comment - - edited

            Got one as well:

            Thread 3 (Thread 0x7fc2dd490700 (LWP 20035)):
            #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:215
            #1  0x0000000000eed6fd in safe_cond_timedwait (cond=0x7fc30c1ebdf8, mp=0x7fc30c1ebd10, abstime=0x7fc2dd48e4b0, file=0xf8bbe0 "10.1/include/mysql/psi/mysql_thread.h", line=1202) at 10.1/mysys/thr_mutex.c:547
            #2  0x00000000005dcfca in inline_mysql_cond_timedwait (that=0x7fc30c1ebdf8, mutex=0x7fc30c1ebd10, abstime=0x7fc2dd48e4b0, src_file=0xf8c2f0 "10.1/sql/slave.cc", src_line=791) at 10.1/include/mysql/psi/mysql_thread.h:1202
            #3  0x00000000005debb2 in terminate_slave_thread (thd=0x7fc2dc0e3070, term_lock=0x7fc30c1ebd10, term_cond=0x7fc30c1ebdf8, slave_running=0x7fc30c1ec7ec, skip_lock=true) at 10.1/sql/slave.cc:791
            #4  0x00000000005de554 in terminate_slave_threads (mi=0x7fc30c1e8000, thread_mask=3, skip_lock=true) at 10.1/sql/slave.cc:636
            #5  0x00000000006b8fb7 in stop_slave (thd=0x7fc2eecf5070, mi=0x7fc30c1e8000, net_report=true) at 10.1/sql/sql_repl.cc:3186
            #6  0x0000000000686841 in mysql_execute_command (thd=0x7fc2eecf5070) at 10.1/sql/sql_parse.cc:3530
            #7  0x000000000069218b in mysql_parse (thd=0x7fc2eecf5070, rawbuf=0x7fc2e6ec5088 "STOP SLAVE SQL_THREAD, IO_THREAD", length=32, parser_state=0x7fc2dd48f110) at 10.1/sql/sql_parse.cc:7165
            #8  0x0000000000681199 in dispatch_command (command=COM_QUERY, thd=0x7fc2eecf5070, packet=0x7fc2eec92071 "STOP SLAVE SQL_THREAD, IO_THREAD", packet_length=32) at 10.1/sql/sql_parse.cc:1462
            #9  0x000000000067ff63 in do_command (thd=0x7fc2eecf5070) at 10.1/sql/sql_parse.cc:1090
            #10 0x00000000007ad7a0 in do_handle_one_connection (thd_arg=0x7fc2eecf5070) at 10.1/sql/sql_connect.cc:1347
            #11 0x00000000007ad4e5 in handle_one_connection (arg=0x7fc2eecf5070) at 10.1/sql/sql_connect.cc:1258
            #12 0x00007fc30dab6e9a in start_thread (arg=0x7fc2dd490700) at pthread_create.c:308
            #13 0x00007fc30d1e78bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
            #14 0x0000000000000000 in ?? ()

            Thread 5 (Thread 0x7fc2dde8b700 (LWP 20028)):
            #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:215
            #1  0x0000000000eed6fd in safe_cond_timedwait (cond=0x7fc2eefc8218, mp=0x7fc2eefc8168, abstime=0x7fc2dde87aa0, file=0xfd2760 "10.1/include/mysql/psi/mysql_thread.h", line=1202) at 10.1/mysys/thr_mutex.c:547
            #2  0x00000000007b6784 in inline_mysql_cond_timedwait (that=0x7fc2eefc8218, mutex=0x7fc2eefc8168, abstime=0x7fc2dde87aa0, src_file=0xfd29c8 "10.1/sql/mdl.cc", src_line=1066) at 10.1/include/mysql/psi/mysql_thread.h:1202
            #3  0x00000000007b7902 in MDL_wait::timed_wait (this=0x7fc2eefc8168, owner=0x7fc2eefc8120, abs_timeout=0x7fc2dde87aa0, set_status_on_timeout=false, wait_state_name=0x170e6f0) at 10.1/sql/mdl.cc:1066
            #4  0x00000000007b92ab in MDL_context::acquire_lock (this=0x7fc2eefc8168, mdl_request=0x7fc2dde87b70, lock_wait_timeout=31536000) at 10.1/sql/mdl.cc:2039
            #5  0x00000000007b9a2a in MDL_context::upgrade_shared_lock (this=0x7fc2eefc8168, mdl_ticket=0x7fc30c01f2c0, new_type=MDL_EXCLUSIVE, lock_wait_timeout=31536000) at 10.1/sql/mdl.cc:2232
            #6  0x000000000073f186 in mysql_inplace_alter_table (thd=0x7fc2eefc8070, table_list=0x7fc2eefce128, table=0x7fc3072dcc70, altered_table=0x7fc2eefd5070, ha_alter_info=0x7fc2dde87eb0, inplace_supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x7fc2dde89460, alter_ctx=0x7fc2dde87fd0) at 10.1/sql/sql_table.cc:6944
            #7  0x0000000000743c9b in mysql_alter_table (thd=0x7fc2eefc8070, new_db=0x7fc2eefce708 "test", new_name=0x0, create_info=0x7fc2dde896a0, table_list=0x7fc2eefce128, alter_info=0x7fc2dde897d0, order_num=0, order=0x0, ignore=false) at 10.1/sql/sql_table.cc:8821
            #8  0x00000000007b2325 in Sql_cmd_alter_table::execute (this=0x7fc2eefce8a8, thd=0x7fc2eefc8070) at 10.1/sql/sql_alter.cc:325
            #9  0x000000000068de51 in mysql_execute_command (thd=0x7fc2eefc8070) at 10.1/sql/sql_parse.cc:5558
            #10 0x000000000069218b in mysql_parse (thd=0x7fc2eefc8070, rawbuf=0x7fc2e6d4d1b9 "ALTER TABLE table11_innodb ADD COLUMN `default` INTEGER DEFAULT 0", length=65, parser_state=0x7fc2dde8a9f0) at 10.1/sql/sql_parse.cc:7165
            #11 0x000000000099f402 in Query_log_event::do_apply_event (this=0x7fc2e6c1f230, rgi=0x7fc2e6df0800, query_arg=0x7fc2e6d4d1b9 "ALTER TABLE table11_innodb ADD COLUMN `default` INTEGER DEFAULT 0", q_len_arg=65) at 10.1/sql/log_event.cc:4287
            #12 0x000000000099e6ed in Query_log_event::do_apply_event (this=0x7fc2e6c1f230, rgi=0x7fc2e6df0800) at 10.1/sql/log_event.cc:4013
            #13 0x00000000005f0654 in Log_event::apply_event (this=0x7fc2e6c1f230, rgi=0x7fc2e6df0800) at 10.1/sql/log_event.h:1347
            #14 0x00000000005e64df in apply_event_and_update_pos (ev=0x7fc2e6c1f230, thd=0x7fc2eefc8070, rgi=0x7fc2e6df0800, rpt=0x7fc2dc008380) at 10.1/sql/slave.cc:3274
            #15 0x00000000008179e7 in rpt_handle_event (qev=0x7fc2e6db7a70, rpt=0x7fc2dc008380) at 10.1/sql/rpl_parallel.cc:49
            #16 0x0000000000819b6c in handle_rpl_parallel_thread (arg=0x7fc2dc008380) at 10.1/sql/rpl_parallel.cc:942
            #17 0x00007fc30dab6e9a in start_thread (arg=0x7fc2dde8b700) at pthread_create.c:308
            #18 0x00007fc30d1e78bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
            #19 0x0000000000000000 in ?? ()

            Thread 6 (Thread 0x7fc2ddf1d700 (LWP 20027)):
            #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
            #1  0x0000000000eed3f3 in safe_cond_wait (cond=0x7fc2e6df10e8, mp=0x7fc2e6df1038, file=0xf9c780 "10.1/include/mysql/psi/mysql_thread.h", line=1165) at 10.1/mysys/thr_mutex.c:493
            #2  0x000000000063f9ce in inline_mysql_cond_wait (that=0x7fc2e6df10e8, mutex=0x7fc2e6df1038, src_file=0xf9cfe8 "10.1/sql/sql_class.cc", src_line=6853) at 10.1/include/mysql/psi/mysql_thread.h:1165
            #3  0x000000000065004e in wait_for_commit::wait_for_prior_commit2 (this=0x7fc2e6df1038, thd=0x7fc2e6c95070) at 10.1/sql/sql_class.cc:6853
            #4  0x000000000062fbc3 in wait_for_commit::wait_for_prior_commit (this=0x7fc2e6df1038, thd=0x7fc2e6c95070) at 10.1/sql/sql_class.h:1780
            #5  0x000000000062fed2 in THD::wait_for_prior_commit (this=0x7fc2e6c95070) at 10.1/sql/sql_class.h:3781
            #6  0x00000000008af50c in ha_commit_one_phase (thd=0x7fc2e6c95070, all=true) at 10.1/sql/handler.cc:1536
            #7  0x00000000008aee1d in ha_commit_trans (thd=0x7fc2e6c95070, all=true) at 10.1/sql/handler.cc:1406
            #8  0x00000000007c14e9 in trans_commit (thd=0x7fc2e6c95070) at 10.1/sql/transaction.cc:236
            #9  0x00000000009a699d in Xid_log_event::do_apply_event (this=0x7fc2dc024570, rgi=0x7fc2e6df1000) at 10.1/sql/log_event.cc:7393
            #10 0x00000000005f0654 in Log_event::apply_event (this=0x7fc2dc024570, rgi=0x7fc2e6df1000) at 10.1/sql/log_event.h:1347
            #11 0x00000000005e64df in apply_event_and_update_pos (ev=0x7fc2dc024570, thd=0x7fc2e6c95070, rgi=0x7fc2e6df1000, rpt=0x7fc2dc0081b0) at 10.1/sql/slave.cc:3274
            #12 0x00000000008179e7 in rpt_handle_event (qev=0x7fc2e6db8970, rpt=0x7fc2dc0081b0) at 10.1/sql/rpl_parallel.cc:49
            #13 0x0000000000819b6c in handle_rpl_parallel_thread (arg=0x7fc2dc0081b0) at 10.1/sql/rpl_parallel.cc:942
            #14 0x00007fc30dab6e9a in start_thread (arg=0x7fc2ddf1d700) at pthread_create.c:308
            #15 0x00007fc30d1e78bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
            #16 0x0000000000000000 in ?? ()

            All threads' bt attached as all_threads_bt.

            10.1, git commit 53382ac128.

            Slave command line:

            00:00:59 /home/elenst/git/10.1/sql/mysqld --no-defaults --basedir=/home/elenst/git/10.1 --datadir=/home/elenst/test_results/analyze_slave/data --lc-messages-dir=/home/elenst/git/10.1/sql/share --character-sets-dir=/home/elenst/git/10.1/sql/share/charsets --tmpdir=/home/elenst/test_results/analyze_slave/tmp --core-file --max-allowed-packet=128Mb --port=10732 --socket=/home/elenst/test_results/analyze_slave/mysql.sock --pid-file=/home/elenst/test_results/analyze_slave/mysql.pid --general-log --general-log-file=/home/elenst/test_results/analyze_slave/mysql.log --server_id=2 --report-host=127.0.0.1 --report_port=10732 --log_output=FILE --slow_query_log --log_bin_trust_function_creators=1 --query_cache_size=64M --slave-skip-errors=1054,1317,1049,1305,1539,1505 --slave-parallel-mode=optimistic --binlog_commit_wait_count=20 --binlog_commit_wait_usec=1000000 --slave-parallel-threads=20 --max-binlog-size=4096 --innodb --sql-mode=no_engine_substitution

            End of slave error log:

            150502 21:33:31 [Note] Slave I/O thread: connected to master 'root@127.0.0.1:10730',replication starts at GTID position '0-1-65'
            150502 21:33:31 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000004' at position 1755, relay log './mysql-relay-bin.000001' position: 4; GTID position '0-1-65'
            150502 21:34:02 [Note] Error reading relay log event: slave SQL thread was killed
            150502 21:34:02 [ERROR] Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013)
            150502 21:34:02 [Note] Slave I/O thread killed while reading event
            150502 21:34:02 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000012', position 3799; GTID position 0-1-241
            150502 21:34:22 [Note] Slave I/O thread: connected to master 'root@127.0.0.1:10730',replication starts at GTID position '0-1-203'
            150502 21:34:22 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000011' at position 534, relay log './mysql-relay-bin.000001' position: 4; GTID position '0-1-203'
            150502 21:34:32 [Note] Error reading relay log event: slave SQL thread was killed
            150503  5:35:08 [ERROR] Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013)
            150503  5:35:08 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.000020' at position 4146; GTID position '0-1-399'
            150503  5:35:08 [ERROR] Slave I/O: error reconnecting to master 'root@127.0.0.1:10730' - retry-time: 1  retries: 86400  message: Can't connect to MySQL server on '127.0.0.1' (111 "Connection refused"), Internal MariaDB error code: 2003
            150503 15:25:18 [Note] /home/elenst/git/10.1/sql/mysqld: Normal shutdown
             
            150503 15:25:18 [Note] Event Scheduler: Purging the queue. 0 events
            150503 15:25:19 [Note] Slave I/O thread killed during or after a reconnect done to recover from failed read
            150503 15:25:19 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000020', position 4146; GTID position 0-1-399

            Test command line:

            perl /home/elenst/bzr/randgen-mariadb-patches/runall-new.pl --no-mask --seed=1430530643 --threads=16 --duration=400 --queries=100M --reporters=QueryTimeout,Backtrace,ErrorLog,Deadlock,ReplicationThreadRestarter --redefine=conf/mariadb/general-workarounds.yy --redefine=conf/mariadb/10.0-features-redefine.yy --mysqld=--log_output=FILE --mysqld=--slow_query_log --mysqld=--log_bin_trust_function_creators=1 --mysqld=--query_cache_size=64M --engine=InnoDB --views --grammar=conf/runtime/metadata_stability.yy --gendata=conf/runtime/metadata_stability.zz --rpl_mode=statement --mysqld=--slave-skip-errors=1054,1317,1049,1305,1539,1505 --mysqld=--slave-parallel-mode=optimistic --mysqld=--binlog_commit_wait_count=20 --mysqld=--binlog_commit_wait_usec=1000000 --mysqld=--slave-parallel-threads=20 --use-gtid=current_pos --mysqld=--max-binlog-size=4096 --mtr-build-thread=73 --basedir1=/home/elenst/git/10.1 --vardir1=/home/elenst/test_results/analyze

            elenst Elena Stepanova added a comment - - edited Got one as well: Thread 3 (Thread 0x7fc2dd490700 (LWP 20035)): #0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:215 #1 0x0000000000eed6fd in safe_cond_timedwait (cond=0x7fc30c1ebdf8, mp=0x7fc30c1ebd10, abstime=0x7fc2dd48e4b0, file=0xf8bbe0 "10.1/include/mysql/psi/mysql_thread.h", line=1202) at 10.1/mysys/thr_mutex.c:547 #2 0x00000000005dcfca in inline_mysql_cond_timedwait (that=0x7fc30c1ebdf8, mutex=0x7fc30c1ebd10, abstime=0x7fc2dd48e4b0, src_file=0xf8c2f0 "10.1/sql/slave.cc", src_line=791) at 10.1/include/mysql/psi/mysql_thread.h:1202 #3 0x00000000005debb2 in terminate_slave_thread (thd=0x7fc2dc0e3070, term_lock=0x7fc30c1ebd10, term_cond=0x7fc30c1ebdf8, slave_running=0x7fc30c1ec7ec, skip_lock=true) at 10.1/sql/slave.cc:791 #4 0x00000000005de554 in terminate_slave_threads (mi=0x7fc30c1e8000, thread_mask=3, skip_lock=true) at 10.1/sql/slave.cc:636 #5 0x00000000006b8fb7 in stop_slave (thd=0x7fc2eecf5070, mi=0x7fc30c1e8000, net_report=true) at 10.1/sql/sql_repl.cc:3186 #6 0x0000000000686841 in mysql_execute_command (thd=0x7fc2eecf5070) at 10.1/sql/sql_parse.cc:3530 #7 0x000000000069218b in mysql_parse (thd=0x7fc2eecf5070, rawbuf=0x7fc2e6ec5088 "STOP SLAVE SQL_THREAD, IO_THREAD", length=32, parser_state=0x7fc2dd48f110) at 10.1/sql/sql_parse.cc:7165 #8 0x0000000000681199 in dispatch_command (command=COM_QUERY, thd=0x7fc2eecf5070, packet=0x7fc2eec92071 "STOP SLAVE SQL_THREAD, IO_THREAD", packet_length=32) at 10.1/sql/sql_parse.cc:1462 #9 0x000000000067ff63 in do_command (thd=0x7fc2eecf5070) at 10.1/sql/sql_parse.cc:1090 #10 0x00000000007ad7a0 in do_handle_one_connection (thd_arg=0x7fc2eecf5070) at 10.1/sql/sql_connect.cc:1347 #11 0x00000000007ad4e5 in handle_one_connection (arg=0x7fc2eecf5070) at 10.1/sql/sql_connect.cc:1258 #12 0x00007fc30dab6e9a in start_thread (arg=0x7fc2dd490700) at pthread_create.c:308 #13 0x00007fc30d1e78bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #14 0x0000000000000000 in ?? () Thread 5 (Thread 0x7fc2dde8b700 (LWP 20028)): #0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:215 #1 0x0000000000eed6fd in safe_cond_timedwait (cond=0x7fc2eefc8218, mp=0x7fc2eefc8168, abstime=0x7fc2dde87aa0, file=0xfd2760 "10.1/include/mysql/psi/mysql_thread.h", line=1202) at 10.1/mysys/thr_mutex.c:547 #2 0x00000000007b6784 in inline_mysql_cond_timedwait (that=0x7fc2eefc8218, mutex=0x7fc2eefc8168, abstime=0x7fc2dde87aa0, src_file=0xfd29c8 "10.1/sql/mdl.cc", src_line=1066) at 10.1/include/mysql/psi/mysql_thread.h:1202 #3 0x00000000007b7902 in MDL_wait::timed_wait (this=0x7fc2eefc8168, owner=0x7fc2eefc8120, abs_timeout=0x7fc2dde87aa0, set_status_on_timeout=false, wait_state_name=0x170e6f0) at 10.1/sql/mdl.cc:1066 #4 0x00000000007b92ab in MDL_context::acquire_lock (this=0x7fc2eefc8168, mdl_request=0x7fc2dde87b70, lock_wait_timeout=31536000) at 10.1/sql/mdl.cc:2039 #5 0x00000000007b9a2a in MDL_context::upgrade_shared_lock (this=0x7fc2eefc8168, mdl_ticket=0x7fc30c01f2c0, new_type=MDL_EXCLUSIVE, lock_wait_timeout=31536000) at 10.1/sql/mdl.cc:2232 #6 0x000000000073f186 in mysql_inplace_alter_table (thd=0x7fc2eefc8070, table_list=0x7fc2eefce128, table=0x7fc3072dcc70, altered_table=0x7fc2eefd5070, ha_alter_info=0x7fc2dde87eb0, inplace_supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x7fc2dde89460, alter_ctx=0x7fc2dde87fd0) at 10.1/sql/sql_table.cc:6944 #7 0x0000000000743c9b in mysql_alter_table (thd=0x7fc2eefc8070, new_db=0x7fc2eefce708 "test", new_name=0x0, create_info=0x7fc2dde896a0, table_list=0x7fc2eefce128, alter_info=0x7fc2dde897d0, order_num=0, order=0x0, ignore=false) at 10.1/sql/sql_table.cc:8821 #8 0x00000000007b2325 in Sql_cmd_alter_table::execute (this=0x7fc2eefce8a8, thd=0x7fc2eefc8070) at 10.1/sql/sql_alter.cc:325 #9 0x000000000068de51 in mysql_execute_command (thd=0x7fc2eefc8070) at 10.1/sql/sql_parse.cc:5558 #10 0x000000000069218b in mysql_parse (thd=0x7fc2eefc8070, rawbuf=0x7fc2e6d4d1b9 "ALTER TABLE table11_innodb ADD COLUMN `default` INTEGER DEFAULT 0", length=65, parser_state=0x7fc2dde8a9f0) at 10.1/sql/sql_parse.cc:7165 #11 0x000000000099f402 in Query_log_event::do_apply_event (this=0x7fc2e6c1f230, rgi=0x7fc2e6df0800, query_arg=0x7fc2e6d4d1b9 "ALTER TABLE table11_innodb ADD COLUMN `default` INTEGER DEFAULT 0", q_len_arg=65) at 10.1/sql/log_event.cc:4287 #12 0x000000000099e6ed in Query_log_event::do_apply_event (this=0x7fc2e6c1f230, rgi=0x7fc2e6df0800) at 10.1/sql/log_event.cc:4013 #13 0x00000000005f0654 in Log_event::apply_event (this=0x7fc2e6c1f230, rgi=0x7fc2e6df0800) at 10.1/sql/log_event.h:1347 #14 0x00000000005e64df in apply_event_and_update_pos (ev=0x7fc2e6c1f230, thd=0x7fc2eefc8070, rgi=0x7fc2e6df0800, rpt=0x7fc2dc008380) at 10.1/sql/slave.cc:3274 #15 0x00000000008179e7 in rpt_handle_event (qev=0x7fc2e6db7a70, rpt=0x7fc2dc008380) at 10.1/sql/rpl_parallel.cc:49 #16 0x0000000000819b6c in handle_rpl_parallel_thread (arg=0x7fc2dc008380) at 10.1/sql/rpl_parallel.cc:942 #17 0x00007fc30dab6e9a in start_thread (arg=0x7fc2dde8b700) at pthread_create.c:308 #18 0x00007fc30d1e78bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #19 0x0000000000000000 in ?? () Thread 6 (Thread 0x7fc2ddf1d700 (LWP 20027)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x0000000000eed3f3 in safe_cond_wait (cond=0x7fc2e6df10e8, mp=0x7fc2e6df1038, file=0xf9c780 "10.1/include/mysql/psi/mysql_thread.h", line=1165) at 10.1/mysys/thr_mutex.c:493 #2 0x000000000063f9ce in inline_mysql_cond_wait (that=0x7fc2e6df10e8, mutex=0x7fc2e6df1038, src_file=0xf9cfe8 "10.1/sql/sql_class.cc", src_line=6853) at 10.1/include/mysql/psi/mysql_thread.h:1165 #3 0x000000000065004e in wait_for_commit::wait_for_prior_commit2 (this=0x7fc2e6df1038, thd=0x7fc2e6c95070) at 10.1/sql/sql_class.cc:6853 #4 0x000000000062fbc3 in wait_for_commit::wait_for_prior_commit (this=0x7fc2e6df1038, thd=0x7fc2e6c95070) at 10.1/sql/sql_class.h:1780 #5 0x000000000062fed2 in THD::wait_for_prior_commit (this=0x7fc2e6c95070) at 10.1/sql/sql_class.h:3781 #6 0x00000000008af50c in ha_commit_one_phase (thd=0x7fc2e6c95070, all=true) at 10.1/sql/handler.cc:1536 #7 0x00000000008aee1d in ha_commit_trans (thd=0x7fc2e6c95070, all=true) at 10.1/sql/handler.cc:1406 #8 0x00000000007c14e9 in trans_commit (thd=0x7fc2e6c95070) at 10.1/sql/transaction.cc:236 #9 0x00000000009a699d in Xid_log_event::do_apply_event (this=0x7fc2dc024570, rgi=0x7fc2e6df1000) at 10.1/sql/log_event.cc:7393 #10 0x00000000005f0654 in Log_event::apply_event (this=0x7fc2dc024570, rgi=0x7fc2e6df1000) at 10.1/sql/log_event.h:1347 #11 0x00000000005e64df in apply_event_and_update_pos (ev=0x7fc2dc024570, thd=0x7fc2e6c95070, rgi=0x7fc2e6df1000, rpt=0x7fc2dc0081b0) at 10.1/sql/slave.cc:3274 #12 0x00000000008179e7 in rpt_handle_event (qev=0x7fc2e6db8970, rpt=0x7fc2dc0081b0) at 10.1/sql/rpl_parallel.cc:49 #13 0x0000000000819b6c in handle_rpl_parallel_thread (arg=0x7fc2dc0081b0) at 10.1/sql/rpl_parallel.cc:942 #14 0x00007fc30dab6e9a in start_thread (arg=0x7fc2ddf1d700) at pthread_create.c:308 #15 0x00007fc30d1e78bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #16 0x0000000000000000 in ?? () All threads' bt attached as all_threads_bt. 10.1, git commit 53382ac128. Slave command line: 00:00:59 /home/elenst/git/10.1/sql/mysqld --no-defaults --basedir=/home/elenst/git/10.1 --datadir=/home/elenst/test_results/analyze_slave/data --lc-messages-dir=/home/elenst/git/10.1/sql/share --character-sets-dir=/home/elenst/git/10.1/sql/share/charsets --tmpdir=/home/elenst/test_results/analyze_slave/tmp --core-file --max-allowed-packet=128Mb --port=10732 --socket=/home/elenst/test_results/analyze_slave/mysql.sock --pid-file=/home/elenst/test_results/analyze_slave/mysql.pid --general-log --general-log-file=/home/elenst/test_results/analyze_slave/mysql.log --server_id=2 --report-host=127.0.0.1 --report_port=10732 --log_output=FILE --slow_query_log --log_bin_trust_function_creators=1 --query_cache_size=64M --slave-skip-errors=1054,1317,1049,1305,1539,1505 --slave-parallel-mode=optimistic --binlog_commit_wait_count=20 --binlog_commit_wait_usec=1000000 --slave-parallel-threads=20 --max-binlog-size=4096 --innodb --sql-mode=no_engine_substitution End of slave error log: 150502 21:33:31 [Note] Slave I/O thread: connected to master 'root@127.0.0.1:10730',replication starts at GTID position '0-1-65' 150502 21:33:31 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000004' at position 1755, relay log './mysql-relay-bin.000001' position: 4; GTID position '0-1-65' 150502 21:34:02 [Note] Error reading relay log event: slave SQL thread was killed 150502 21:34:02 [ERROR] Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013) 150502 21:34:02 [Note] Slave I/O thread killed while reading event 150502 21:34:02 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000012', position 3799; GTID position 0-1-241 150502 21:34:22 [Note] Slave I/O thread: connected to master 'root@127.0.0.1:10730',replication starts at GTID position '0-1-203' 150502 21:34:22 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000011' at position 534, relay log './mysql-relay-bin.000001' position: 4; GTID position '0-1-203' 150502 21:34:32 [Note] Error reading relay log event: slave SQL thread was killed 150503 5:35:08 [ERROR] Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013) 150503 5:35:08 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.000020' at position 4146; GTID position '0-1-399' 150503 5:35:08 [ERROR] Slave I/O: error reconnecting to master 'root@127.0.0.1:10730' - retry-time: 1 retries: 86400 message: Can't connect to MySQL server on '127.0.0.1' (111 "Connection refused"), Internal MariaDB error code: 2003 150503 15:25:18 [Note] /home/elenst/git/10.1/sql/mysqld: Normal shutdown   150503 15:25:18 [Note] Event Scheduler: Purging the queue. 0 events 150503 15:25:19 [Note] Slave I/O thread killed during or after a reconnect done to recover from failed read 150503 15:25:19 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000020', position 4146; GTID position 0-1-399 Test command line: perl /home/elenst/bzr/randgen-mariadb-patches/runall-new.pl --no-mask --seed=1430530643 --threads=16 --duration=400 --queries=100M --reporters=QueryTimeout,Backtrace,ErrorLog,Deadlock,ReplicationThreadRestarter --redefine=conf/mariadb/general-workarounds.yy --redefine=conf/mariadb/10.0-features-redefine.yy --mysqld=--log_output=FILE --mysqld=--slow_query_log --mysqld=--log_bin_trust_function_creators=1 --mysqld=--query_cache_size=64M --engine=InnoDB --views --grammar=conf/runtime/metadata_stability.yy --gendata=conf/runtime/metadata_stability.zz --rpl_mode=statement --mysqld=--slave-skip-errors=1054,1317,1049,1305,1539,1505 --mysqld=--slave-parallel-mode=optimistic --mysqld=--binlog_commit_wait_count=20 --mysqld=--binlog_commit_wait_usec=1000000 --mysqld=--slave-parallel-threads=20 --use-gtid=current_pos --mysqld=--max-binlog-size=4096 --mtr-build-thread=73 --basedir1=/home/elenst/git/10.1 --vardir1=/home/elenst/test_results/analyze

            08:35 < knielsen> elenst: can you supply binlogs of MDEV-8039?
            08:38 < knielsen> elenst: binlog from slave (if enabled) would be useful also, and general log of slave
            08:39 < knielsen> elenst: and value of @@gtid_slave_pos on slave, if you still have it hanging

            knielsen Kristian Nielsen added a comment - 08:35 < knielsen> elenst: can you supply binlogs of MDEV-8039 ? 08:38 < knielsen> elenst: binlog from slave (if enabled) would be useful also, and general log of slave 08:39 < knielsen> elenst: and value of @@gtid_slave_pos on slave, if you still have it hanging
            knielsen Kristian Nielsen added a comment - - edited

            Daniel, I agree that it is odd that there are two threads in
            handle_slave_sql(). This indicates that two SQL threads are running. I do
            not see how that is possible without using multi-source replication?
            There are also two sets of parallel replication worker threads.

            Anyway, this case seems unrelated to show slave status. What we see is 10
            worker threads idle, and 10 worker threads in "Waiting for prior transaction
            to start commit before starting next transaction".
            (In the second case, there is an additional set of 5 idle and 15 waiting
            worker threads).

            This indicates that the parallel replication worker threads have become
            stuck, independent of anything else going on. This causes STOP SLAVE to
            block waiting for the stuck worker threads, and in turn SHOW STATUS to
            block, but that seems unrelated. All the worker threads are idle, not
            executing any transactions, and all waiting (incorrectly).

            There were several bugs fixes recently where this could happen. Basically,
            some internal lists of group_commit_orderer objects gets corrupt, and a
            wakeup is missed, causing all threads to get stuck waiting. These should all
            be fixed in 10.0.18, which I believe is due for release shortly?

            We can think about trying to track down why it hangs, but I would prefer to
            first see if 10.0.18 fixes it. Tracking down these hangs can be very hard,
            and would require at least very carefully collected binlog around the
            point of hang, SHOW FULL PROCESSLIST, and error log, all painstakingly
            correct, complete, and matching each other.

            When this condition happens, it is possible to recover by using KILL queries
            against the threads stuck in "Waiting for prior transaction to start commit
            before starting next transaction". Note that it may be necessary to kill
            multiple times, as they have to be killed in the correct order, which is
            unknown to the user.

            knielsen Kristian Nielsen added a comment - - edited Daniel, I agree that it is odd that there are two threads in handle_slave_sql(). This indicates that two SQL threads are running. I do not see how that is possible without using multi-source replication? There are also two sets of parallel replication worker threads. Anyway, this case seems unrelated to show slave status. What we see is 10 worker threads idle, and 10 worker threads in "Waiting for prior transaction to start commit before starting next transaction". (In the second case, there is an additional set of 5 idle and 15 waiting worker threads). This indicates that the parallel replication worker threads have become stuck, independent of anything else going on. This causes STOP SLAVE to block waiting for the stuck worker threads, and in turn SHOW STATUS to block, but that seems unrelated. All the worker threads are idle, not executing any transactions, and all waiting (incorrectly). There were several bugs fixes recently where this could happen. Basically, some internal lists of group_commit_orderer objects gets corrupt, and a wakeup is missed, causing all threads to get stuck waiting. These should all be fixed in 10.0.18, which I believe is due for release shortly? We can think about trying to track down why it hangs, but I would prefer to first see if 10.0.18 fixes it. Tracking down these hangs can be very hard, and would require at least very carefully collected binlog around the point of hang, SHOW FULL PROCESSLIST, and error log, all painstakingly correct, complete, and matching each other. When this condition happens, it is possible to recover by using KILL queries against the threads stuck in "Waiting for prior transaction to start commit before starting next transaction". Note that it may be necessary to kill multiple times, as they have to be killed in the correct order, which is unknown to the user.

            So to clarify, Elenas and Daniels conditions seem unrelated.

            In Daniel's case, the worker threads have become stuck, all waiting for
            another thread to start first.

            In Elena's case, there is an ALTER TABLE stuck waiting on a table lock while
            three other transactions have completed and are waiting for a prior
            transaction to commit first (presumably the ALTER TABLE?). So this looks
            like incorrectly executing the ALTER TABLE ddl speculatively, hopefully logs
            can help clarify this.

            knielsen Kristian Nielsen added a comment - So to clarify, Elenas and Daniels conditions seem unrelated. In Daniel's case, the worker threads have become stuck, all waiting for another thread to start first. In Elena's case, there is an ALTER TABLE stuck waiting on a table lock while three other transactions have completed and are waiting for a prior transaction to commit first (presumably the ALTER TABLE?). So this looks like incorrectly executing the ALTER TABLE ddl speculatively, hopefully logs can help clarify this.
            danblack Daniel Black added a comment -

            thanks knielsen. Happy to see if 10.0.18 fixes in it, especially in light of contradictory information on my part.

            Thanks for looking. Once released I'll hammer the stop slave/start slave in parallel replication for a while. If I come across anything I'll be sure the collect those information bits carefully.

            danblack Daniel Black added a comment - thanks knielsen . Happy to see if 10.0.18 fixes in it, especially in light of contradictory information on my part. Thanks for looking. Once released I'll hammer the stop slave/start slave in parallel replication for a while. If I come across anything I'll be sure the collect those information bits carefully.

            08:35 < knielsen> elenst: can you supply binlogs of MDEV-8039?
            08:38 < knielsen> elenst: binlog from slave (if enabled) would be useful also, and general log of slave
            08:39 < knielsen> elenst: and value of @@gtid_slave_pos on slave, if you still have it hanging

            So to clarify, Elenas and Daniels conditions seem unrelated.

            In order to separate unrelated issues, the new JIRA report MDEV-8113 has been created, it contains the requested information.

            elenst Elena Stepanova added a comment - 08:35 < knielsen> elenst: can you supply binlogs of MDEV-8039 ? 08:38 < knielsen> elenst: binlog from slave (if enabled) would be useful also, and general log of slave 08:39 < knielsen> elenst: and value of @@gtid_slave_pos on slave, if you still have it hanging So to clarify, Elenas and Daniels conditions seem unrelated. In order to separate unrelated issues, the new JIRA report MDEV-8113 has been created, it contains the requested information.
            danblack Daniel Black added a comment -

            Apologies that this has been waiting for feedback for years. Due to employer changes I cannot test this as promised.

            danblack Daniel Black added a comment - Apologies that this has been waiting for feedback for years. Due to employer changes I cannot test this as promised.

            People

              knielsen Kristian Nielsen
              danblack Daniel Black
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.