Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-14069

galera_sst_mysqldump.test fails with: 'show status' failed with wrong errno 1205: 'Lock wait timeout exceeded; try restarting transaction', instead of 0...

Details

    Description

      This test fails always:

       bld/mysql-test/mtr galera_sst_mysqldump --mem
      Logging: /home/a/e1/m1-10.1/src/mysql-test/mysql-test-run.pl  galera_sst_mysqldump --mem
      vardir: /home/a/e1/_depot/m-branch/m1-10.1-bld/mysql-test/var
      Checking leftover processes...
      Removing old var directory...
      Creating var directory '/home/a/e1/_depot/m-branch/m1-10.1-bld/mysql-test/var'...
       - symlinking 'var' to '/dev/shm/var_auto_tjWY'
      Checking supported features...
      MariaDB Version 10.1.29-MariaDB-debug
       - SSL connections supported
       - binaries are debug compiled
      Sphinx 'indexer' binary not found, sphinx suite will be skipped
      Collecting tests...
      Installing system database...
       
      ==============================================================================
       
      TEST                                      RESULT   TIME (ms) or COMMENT
      --------------------------------------------------------------------------
       
      worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
      galera.galera_sst_mysqldump 'innodb_plugin' [ fail ]
              Test ended at 2017-10-14 14:11:25
       
      CURRENT_TEST: galera.galera_sst_mysqldump
      mysqltest: In included file "./include/wait_until_connected_again.inc": 
      included from ./suite/galera/include/galera_load_provider.inc at line 10:
      included from ./suite/galera/include/galera_st_disconnect_slave.inc at line 57:
      included from /home/a/e1/_depot/m-branch/m1-10.1-src/mysql-test/suite/galera/t/galera_sst_mysqldump.test at line 9:
      At line 26: query 'show status' failed with wrong errno 1205: 'Lock wait timeout exceeded; try restarting transaction', instead of 0...
       
      The result from queries just before the failure was:
      < snip >
      INSERT INTO t1 VALUES ('node1_committed_during');
      INSERT INTO t1 VALUES ('node1_committed_during');
      INSERT INTO t1 VALUES ('node1_committed_during');
      INSERT INTO t1 VALUES ('node1_committed_during');
      INSERT INTO t1 VALUES ('node1_committed_during');
      COMMIT;
      START TRANSACTION;
      INSERT INTO t1 VALUES ('node1_to_be_committed_after');
      INSERT INTO t1 VALUES ('node1_to_be_committed_after');
      INSERT INTO t1 VALUES ('node1_to_be_committed_after');
      INSERT INTO t1 VALUES ('node1_to_be_committed_after');
      INSERT INTO t1 VALUES ('node1_to_be_committed_after');
      SET AUTOCOMMIT=OFF;
      START TRANSACTION;
      INSERT INTO t1 VALUES ('node1_to_be_rollbacked_after');
      INSERT INTO t1 VALUES ('node1_to_be_rollbacked_after');
      INSERT INTO t1 VALUES ('node1_to_be_rollbacked_after');
      INSERT INTO t1 VALUES ('node1_to_be_rollbacked_after');
      INSERT INTO t1 VALUES ('node1_to_be_rollbacked_after');
      Loading wsrep provider ...
       
      More results from queries before failure can be found in /home/a/e1/_depot/m-branch/m1-10.1-bld/mysql-test/var/log/galera_sst_mysqldump.log
       
       - saving '/home/a/e1/_depot/m-branch/m1-10.1-bld/mysql-test/var/log/galera.galera_sst_mysqldump-innodb_plugin/' to '/home/a/e1/_depot/m-branch/m1-10.1-bld/mysql-test/var/log/galera.galera_sst_mysqldump-innodb_plugin/'
       
      Only  1  of 2 completed.
      --------------------------------------------------------------------------
      The servers were restarted 0 times
      Spent 0.000 of 59 seconds executing testcases
       
      Failure: Failed 1/1 tests, 0.00% were successful.
       
      Failing test(s): galera.galera_sst_mysqldump
       
      The log files in var/log may give you some hint of what went wrong.
       
      If you want to report this error, please read first the documentation
      at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
       
      mysql-test-run: *** ERROR: there were failing test cases
      

      Attachments

        Issue Links

          Activity

            After changing some include files the joiner seem to hanging after first rsync sst and these two threads are probably to blame:

            Thread 24 (Thread 0x7fd279095700 (LWP 10989)):
            #0  0x00007fd2772c1d8d in poll () at ../sysdeps/unix/syscall-template.S:84
            #1  0x000055d05a4b9fc4 in poll (__timeout=28800000, __nfds=1, __fds=0x7fd279094758) at /usr/include/x86_64-linux-gnu/bits/poll2.h:46
            #2  vio_io_wait (vio=vio@entry=0x7fd276441c08, event=event@entry=VIO_IO_EVENT_READ, timeout=28800000) at /home/a/env1/m3-10.0-galera/src/vio/viosocket.c:992
            #3  0x000055d05a4ba068 in vio_socket_io_wait (vio=vio@entry=0x7fd276441c08, event=event@entry=VIO_IO_EVENT_READ) at /home/a/env1/m3-10.0-galera/src/vio/viosocket.c:108
            #4  0x000055d05a4ba171 in vio_read (vio=0x7fd276441c08, buf=0x7fd268fca008 "<", size=4) at /home/a/env1/m3-10.0-galera/src/vio/viosocket.c:184
            #5  0x000055d059ebffda in my_real_read (net=0x7fd267b23238, complen=complen@entry=0x7fd279094940, header=header@entry=1 '\001') at /home/a/env1/m3-10.0-galera/src/sql/net_serv.cc:874
            #6  0x000055d059ec0df1 in my_net_read_packet (net=net@entry=0x7fd267b23238, read_from_server=read_from_server@entry=1 '\001') at /home/a/env1/m3-10.0-galera/src/sql/net_serv.cc:1137
            #7  0x000055d059f46ff0 in do_command (thd=0x7fd267b23008) at /home/a/env1/m3-10.0-galera/src/sql/sql_parse.cc:953
            #8  0x000055d059ffe98c in do_handle_one_connection (thd_arg=thd_arg@entry=0x7fd267b23008) at /home/a/env1/m3-10.0-galera/src/sql/sql_connect.cc:1394
            #9  0x000055d059ffe9f8 in handle_one_connection (arg=arg@entry=0x7fd267b23008) at /home/a/env1/m3-10.0-galera/src/sql/sql_connect.cc:1306
            #10 0x000055d05a2b3b14 in pfs_spawn_thread (arg=0x7fd276441e08) at /home/a/env1/m3-10.0-galera/src/storage/perfschema/pfs.cc:1860
            #11 0x00007fd2784d56da in start_thread (arg=0x7fd279095700) at pthread_create.c:456
            #12 0x00007fd2772cdd7f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
             
            Thread 23 (Thread 0x7fd279127700 (LWP 10986)):
            #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
            #1  0x000055d059eb6b3f in inline_mysql_cond_wait (src_file=0x55d05a4dfb88 "/home/a/env1/m3-10.0-galera/src/sql/mysqld.cc", src_line=5658, mutex=0x55d05ac84540 <LOCK_thread_count>, 
                that=0x55d05ac83e80 <COND_thread_count>) at /home/a/env1/m3-10.0-galera/src/include/mysql/psi/mysql_thread.h:1154
            #2  wsrep_close_client_connections (wait_to_end=wait_to_end@entry=1 '\001') at /home/a/env1/m3-10.0-galera/src/sql/mysqld.cc:5658
            #3  0x000055d05a044f18 in wsrep_view_handler_cb (app_ctx=<optimized out>, recv_ctx=<optimized out>, view=0x7fd26a85a400, state=<optimized out>, state_len=<optimized out>, sst_req=0x7fd279125ca0, 
                sst_req_len=0x7fd279125ca8) at /home/a/env1/m3-10.0-galera/src/sql/wsrep_mysqld.cc:288
            #4  0x00007fd271979550 in galera::ReplicatorSMM::process_conf_change (this=0x7fd2727f5000, recv_ctx=0x7fd26a812008, view_info=..., repl_proto=7, next_state=galera::Replicator::S_CONNECTED, seqno_l=<optimized out>)
                at galera/src/replicator_smm.cpp:1414
            #5  0x00007fd27194f19b in galera::GcsActionSource::dispatch (this=this@entry=0x7fd2727f5730, recv_ctx=recv_ctx@entry=0x7fd26a812008, act=..., exit_loop=@0x7fd2791261ff: false)
                at galera/src/gcs_action_source.cpp:139
            #6  0x00007fd271950af6 in galera::GcsActionSource::process (this=0x7fd2727f5730, recv_ctx=0x7fd26a812008, exit_loop=@0x7fd2791261ff: false) at galera/src/gcs_action_source.cpp:181
            #7  0x00007fd271978e4b in galera::ReplicatorSMM::async_recv (this=0x7fd2727f5000, recv_ctx=0x7fd26a812008) at galera/src/replicator_smm.cpp:371
            #8  0x00007fd27199332b in galera_recv (gh=<optimized out>, recv_ctx=<optimized out>) at galera/src/wsrep_provider.cpp:244
            #9  0x000055d05a0502ec in wsrep_replication_process (thd=0x7fd26a812008) at /home/a/env1/m3-10.0-galera/src/sql/wsrep_thd.cc:359
            #10 0x000055d059ebacce in start_wsrep_THD (arg=0x55d05a050290 <wsrep_replication_process(THD*)>) at /home/a/env1/m3-10.0-galera/src/sql/mysqld.cc:5417
            #11 0x00007fd2784d56da in start_thread (arg=0x7fd279127700) at pthread_create.c:456
            #12 0x00007fd2772cdd7f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
            

            It looks Galera is waiting mtr thread to disconnect gracefully, which is just looping in wait_until_connected_again.inc

            anikitin Andrii Nikitin (Inactive) added a comment - After changing some include files the joiner seem to hanging after first rsync sst and these two threads are probably to blame: Thread 24 (Thread 0x7fd279095700 (LWP 10989)): #0 0x00007fd2772c1d8d in poll () at ../sysdeps/unix/syscall-template.S:84 #1 0x000055d05a4b9fc4 in poll (__timeout=28800000, __nfds=1, __fds=0x7fd279094758) at /usr/include/x86_64-linux-gnu/bits/poll2.h:46 #2 vio_io_wait (vio=vio@entry=0x7fd276441c08, event=event@entry=VIO_IO_EVENT_READ, timeout=28800000) at /home/a/env1/m3-10.0-galera/src/vio/viosocket.c:992 #3 0x000055d05a4ba068 in vio_socket_io_wait (vio=vio@entry=0x7fd276441c08, event=event@entry=VIO_IO_EVENT_READ) at /home/a/env1/m3-10.0-galera/src/vio/viosocket.c:108 #4 0x000055d05a4ba171 in vio_read (vio=0x7fd276441c08, buf=0x7fd268fca008 "<", size=4) at /home/a/env1/m3-10.0-galera/src/vio/viosocket.c:184 #5 0x000055d059ebffda in my_real_read (net=0x7fd267b23238, complen=complen@entry=0x7fd279094940, header=header@entry=1 '\001') at /home/a/env1/m3-10.0-galera/src/sql/net_serv.cc:874 #6 0x000055d059ec0df1 in my_net_read_packet (net=net@entry=0x7fd267b23238, read_from_server=read_from_server@entry=1 '\001') at /home/a/env1/m3-10.0-galera/src/sql/net_serv.cc:1137 #7 0x000055d059f46ff0 in do_command (thd=0x7fd267b23008) at /home/a/env1/m3-10.0-galera/src/sql/sql_parse.cc:953 #8 0x000055d059ffe98c in do_handle_one_connection (thd_arg=thd_arg@entry=0x7fd267b23008) at /home/a/env1/m3-10.0-galera/src/sql/sql_connect.cc:1394 #9 0x000055d059ffe9f8 in handle_one_connection (arg=arg@entry=0x7fd267b23008) at /home/a/env1/m3-10.0-galera/src/sql/sql_connect.cc:1306 #10 0x000055d05a2b3b14 in pfs_spawn_thread (arg=0x7fd276441e08) at /home/a/env1/m3-10.0-galera/src/storage/perfschema/pfs.cc:1860 #11 0x00007fd2784d56da in start_thread (arg=0x7fd279095700) at pthread_create.c:456 #12 0x00007fd2772cdd7f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105   Thread 23 (Thread 0x7fd279127700 (LWP 10986)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x000055d059eb6b3f in inline_mysql_cond_wait (src_file=0x55d05a4dfb88 "/home/a/env1/m3-10.0-galera/src/sql/mysqld.cc", src_line=5658, mutex=0x55d05ac84540 <LOCK_thread_count>, that=0x55d05ac83e80 <COND_thread_count>) at /home/a/env1/m3-10.0-galera/src/include/mysql/psi/mysql_thread.h:1154 #2 wsrep_close_client_connections (wait_to_end=wait_to_end@entry=1 '\001') at /home/a/env1/m3-10.0-galera/src/sql/mysqld.cc:5658 #3 0x000055d05a044f18 in wsrep_view_handler_cb (app_ctx=<optimized out>, recv_ctx=<optimized out>, view=0x7fd26a85a400, state=<optimized out>, state_len=<optimized out>, sst_req=0x7fd279125ca0, sst_req_len=0x7fd279125ca8) at /home/a/env1/m3-10.0-galera/src/sql/wsrep_mysqld.cc:288 #4 0x00007fd271979550 in galera::ReplicatorSMM::process_conf_change (this=0x7fd2727f5000, recv_ctx=0x7fd26a812008, view_info=..., repl_proto=7, next_state=galera::Replicator::S_CONNECTED, seqno_l=<optimized out>) at galera/src/replicator_smm.cpp:1414 #5 0x00007fd27194f19b in galera::GcsActionSource::dispatch (this=this@entry=0x7fd2727f5730, recv_ctx=recv_ctx@entry=0x7fd26a812008, act=..., exit_loop=@0x7fd2791261ff: false) at galera/src/gcs_action_source.cpp:139 #6 0x00007fd271950af6 in galera::GcsActionSource::process (this=0x7fd2727f5730, recv_ctx=0x7fd26a812008, exit_loop=@0x7fd2791261ff: false) at galera/src/gcs_action_source.cpp:181 #7 0x00007fd271978e4b in galera::ReplicatorSMM::async_recv (this=0x7fd2727f5000, recv_ctx=0x7fd26a812008) at galera/src/replicator_smm.cpp:371 #8 0x00007fd27199332b in galera_recv (gh=<optimized out>, recv_ctx=<optimized out>) at galera/src/wsrep_provider.cpp:244 #9 0x000055d05a0502ec in wsrep_replication_process (thd=0x7fd26a812008) at /home/a/env1/m3-10.0-galera/src/sql/wsrep_thd.cc:359 #10 0x000055d059ebacce in start_wsrep_THD (arg=0x55d05a050290 <wsrep_replication_process(THD*)>) at /home/a/env1/m3-10.0-galera/src/sql/mysqld.cc:5417 #11 0x00007fd2784d56da in start_thread (arg=0x7fd279127700) at pthread_create.c:456 #12 0x00007fd2772cdd7f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105 It looks Galera is waiting mtr thread to disconnect gracefully, which is just looping in wait_until_connected_again.inc

            I can make test pass with following changes in 10.0-galera:

            index 6f64ef45440..8cbdcea09f0 100644
            --- a/mysql-test/include/wait_until_connected_again.inc
            +++ b/mysql-test/include/wait_until_connected_again.inc
            @@ -14,8 +14,15 @@ while ($mysql_errno)
               # Strangely enough, the server might return "Too many connections"
               # while being shutdown, thus 1040 is an "allowed" error
               # See BUG#36228
            -  --error 0,1040,1053,2002,2003,2005,2006,2013,1927
            -  show status;
            +  --error 0,1053,2002,2003,2005,2006,2013,1927,1047
            +  select 1;
             
               dec $counter;
               if (!$counter)
            diff --git a/mysql-test/suite/galera/t/galera_sst_mysqldump.cnf b/mysql-test/suite/galera/t/galera_sst_mysqldump.cnf
            index 574ae28b54a..4208637b69d 100644
            --- a/mysql-test/suite/galera/t/galera_sst_mysqldump.cnf
            +++ b/mysql-test/suite/galera/t/galera_sst_mysqldump.cnf
            @@ -5,7 +5,11 @@
             
             [mysqld.1]
             wsrep_provider_options='base_port=@mysqld.1.#galera_port;gcache.size=1;pc.ignore_sb=true'
            +# must present because test uses both 192.0.0.1 and 192.0.0.2
            +bind-address=0.0.0.0
             
             [mysqld.2]
             wsrep_provider_options='base_port=@mysqld.2.#galera_port;gcache.size=1;pc.ignore_sb=true'
            +# must present because test uses both 192.0.0.1 and 192.0.0.2
            +bind-address=0.0.0.0
            diff --git a/sql/wsrep_mysqld.cc b/sql/wsrep_mysqld.cc
            index 49988287933..3a162f88e2f 100644
            --- a/sql/wsrep_mysqld.cc
            +++ b/sql/wsrep_mysqld.cc
             @@ -285,7 +285,7 @@ wsrep_view_handler_cb (void*                    app_ctx,
                 if (!wsrep_before_SE())
                 {
                     WSREP_DEBUG("[debug]: closing client connections for PRIM");
            -        wsrep_close_client_connections(TRUE);
            +        wsrep_close_client_connections(FALSE);
                 }
             
                 ssize_t const req_len= wsrep_sst_prepare (sst_req);
            

            (Change in wsrep_mysqld.cc is related to MDEV-13329 )

            Now need to find why 'show status' returns 1205

            anikitin Andrii Nikitin (Inactive) added a comment - I can make test pass with following changes in 10.0-galera: index 6f64ef45440..8cbdcea09f0 100644 --- a/mysql-test/include/wait_until_connected_again.inc +++ b/mysql-test/include/wait_until_connected_again.inc @@ -14,8 +14,15 @@ while ($mysql_errno) # Strangely enough, the server might return "Too many connections" # while being shutdown, thus 1040 is an "allowed" error # See BUG#36228 - --error 0,1040,1053,2002,2003,2005,2006,2013,1927 - show status; + --error 0,1053,2002,2003,2005,2006,2013,1927,1047 + select 1; dec $counter; if (!$counter) diff --git a/mysql-test/suite/galera/t/galera_sst_mysqldump.cnf b/mysql-test/suite/galera/t/galera_sst_mysqldump.cnf index 574ae28b54a..4208637b69d 100644 --- a/mysql-test/suite/galera/t/galera_sst_mysqldump.cnf +++ b/mysql-test/suite/galera/t/galera_sst_mysqldump.cnf @@ -5,7 +5,11 @@ [mysqld.1] wsrep_provider_options='base_port=@mysqld.1.#galera_port;gcache.size=1;pc.ignore_sb=true' +# must present because test uses both 192.0.0.1 and 192.0.0.2 +bind-address=0.0.0.0 [mysqld.2] wsrep_provider_options='base_port=@mysqld.2.#galera_port;gcache.size=1;pc.ignore_sb=true' +# must present because test uses both 192.0.0.1 and 192.0.0.2 +bind-address=0.0.0.0 diff --git a/sql/wsrep_mysqld.cc b/sql/wsrep_mysqld.cc index 49988287933..3a162f88e2f 100644 --- a/sql/wsrep_mysqld.cc +++ b/sql/wsrep_mysqld.cc @@ -285,7 +285,7 @@ wsrep_view_handler_cb (void* app_ctx, if (!wsrep_before_SE()) { WSREP_DEBUG("[debug]: closing client connections for PRIM"); - wsrep_close_client_connections(TRUE); + wsrep_close_client_connections(FALSE); } ssize_t const req_len= wsrep_sst_prepare (sst_req); (Change in wsrep_mysqld.cc is related to MDEV-13329 ) Now need to find why 'show status' returns 1205

            MDEV-14070 must be fixed because mtr adds bind-address=127.0.0.1 to all tests by default and there is really no reasons to have it 0.0.0.0 for sst mysqldump method

            anikitin Andrii Nikitin (Inactive) added a comment - MDEV-14070 must be fixed because mtr adds bind-address=127.0.0.1 to all tests by default and there is really no reasons to have it 0.0.0.0 for sst mysqldump method

            Ok, actually error 1205 happens with wsrep_sync_wait=15 , so final patch to make test pass with dependence on MDEV-13329 :

            --- a/mysql-test/suite/galera/t/galera_sst_mysqldump.cnf
            +++ b/mysql-test/suite/galera/t/galera_sst_mysqldump.cnf
            @@ -5,7 +5,13 @@
             
             [mysqld.1]
             wsrep_provider_options='base_port=@mysqld.1.#galera_port;gcache.size=1;pc.ignore_sb=true'
            +# must present because test uses both 192.0.0.1 and 192.0.0.2
            +bind-address=0.0.0.0
            +wsrep_sync_wait=0
             
             [mysqld.2]
             wsrep_provider_options='base_port=@mysqld.2.#galera_port;gcache.size=1;pc.ignore_sb=true'
            +# must present because test uses both 192.0.0.1 and 192.0.0.2
            +bind-address=0.0.0.0
            +wsrep_sync_wait=0
            

            anikitin Andrii Nikitin (Inactive) added a comment - Ok, actually error 1205 happens with wsrep_sync_wait=15 , so final patch to make test pass with dependence on MDEV-13329 : --- a/mysql-test/suite/galera/t/galera_sst_mysqldump.cnf +++ b/mysql-test/suite/galera/t/galera_sst_mysqldump.cnf @@ -5,7 +5,13 @@ [mysqld.1] wsrep_provider_options='base_port=@mysqld.1.#galera_port;gcache.size=1;pc.ignore_sb=true' +# must present because test uses both 192.0.0.1 and 192.0.0.2 +bind-address=0.0.0.0 +wsrep_sync_wait=0 [mysqld.2] wsrep_provider_options='base_port=@mysqld.2.#galera_port;gcache.size=1;pc.ignore_sb=true' +# must present because test uses both 192.0.0.1 and 192.0.0.2 +bind-address=0.0.0.0 +wsrep_sync_wait=0
            anikitin Andrii Nikitin (Inactive) added a comment - - edited

            Please review the patch in the last comment , which makes the test pass in 10.0-galera debug build with patch from MDEV-13329

            anikitin Andrii Nikitin (Inactive) added a comment - - edited Please review the patch in the last comment , which makes the test pass in 10.0-galera debug build with patch from MDEV-13329

            People

              jplindst Jan Lindström (Inactive)
              anikitin Andrii Nikitin (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.