MDEV-23612 (MariaDB Server)

galera_sr.galera_sr_shutdown_master MTR failed: WSREP_SST: [ERROR] Possible timeout in receving first data from donor in gtid stage

Details

    Description

      The MTR test galera_sr.galera_sr_shutdown_master failed on Buildbot (BB), branch 10.5, with "WSREP_SST: [ERROR] Possible timeout in receving first data from donor in gtid stage".
      The failure appears to be sporadic.

      stdio.log:

      10.5.6 8f8f2aea93835899345454f87768fd649749e29c

      galera_sr.galera_sr_shutdown_master 'innodb' w2 [ fail ]  Found warnings/errors in server log file!
              Test ended at 2020-08-26 07:20:57
      line
      WSREP_SST: [ERROR] Possible timeout in receving first data from donor in gtid stage (20200826 07:20:53.339)
      WSREP_SST: [ERROR] Cleanup after exit with status:32 (20200826 07:20:53.343)
      ^ Found warnings in /dev/shm/var/2/log/mysqld.2.err
      ok
       
      worker[2] > Restart  - not started
      worker[2] > Restart  - not started
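
When triaging a sporadic failure like this one, it can help to pull the WSREP_SST error lines and their timestamps out of the server error log. The following is a minimal sketch, not part of mysql-test-run: the helper name `sst_errors` and the regular expression are assumptions based only on the two log lines quoted above.

```python
import re

# Hypothetical helper (not part of MTR): matches lines of the form
#   WSREP_SST: [ERROR] <message> (<YYYYMMDD HH:MM:SS.mmm>)
# The pattern is inferred from the two lines quoted in this report.
SST_ERROR = re.compile(
    r"WSREP_SST: \[ERROR\] (?P<message>.*?) "
    r"\((?P<stamp>\d{8} \d{2}:\d{2}:\d{2}\.\d{3})\)"
)


def sst_errors(log_text):
    """Return (timestamp, message) pairs for every WSREP_SST [ERROR] line."""
    return [(m.group("stamp"), m.group("message"))
            for m in SST_ERROR.finditer(log_text)]


# Log excerpt copied verbatim from the report (typo "receving" included).
excerpt = """\
WSREP_SST: [ERROR] Possible timeout in receving first data from donor in gtid stage (20200826 07:20:53.339)
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20200826 07:20:53.343)
"""

for stamp, message in sst_errors(excerpt):
    print(stamp, message)
```

Run against /dev/shm/var/2/log/mysqld.2.err, a helper like this would list each SST error with its timestamp, which makes it easier to compare against the "Test ended at" time printed by MTR.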
      

          Activity

            The test also failed on BB 10.5, this time with signal 11 (SIGSEGV) in mysqld.1.
            stdio.log:

            10.5.9, e8217d070fc3e60870131615a48515836c773b07, kvm-deb-xenial-amd64

            galera_sr.galera_sr_shutdown_master 'innodb' w1 [ fail ]
                    Test ended at 2020-12-14 14:47:50
             
            CURRENT_TEST: galera_sr.galera_sr_shutdown_master
            mysqltest: In included file "./include/galera_init.inc": 
            included from ./include/galera_cluster.inc at line 16:
            included from /usr/share/mysql/mysql-test/suite/galera_sr/t/galera_sr_shutdown_master.test at line 6:
            At line 25: query 'connect $galera_connection_name,127.0.0.1,root,,test,$_galera_port,' failed: 2013: Lost connection to MySQL server at 'handshake: reading initial communication packet', system error: 11
             
             
            Server [mysqld.1 - pid: 5735, winpid: 5735, exit: 256] failed during test run
            Server log from this test:
            ----------SERVER LOG START-----------
            201214 14:47:36 [ERROR] mysqld got signal 11 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.
             
            To report this bug, see https://mariadb.com/kb/en/reporting-bugs
             
            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed, 
            something is definitely wrong and this may fail.
             
            Server version: 10.5.9-MariaDB-1:10.5.9+maria~xenial-log
            key_buffer_size=1048576
            read_buffer_size=131072
            max_used_connections=64
            max_threads=153
            thread_count=67
            It is possible that mysqld could use up to 
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63638 K  bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.
             
            Thread pointer: 0x7f87bcb3d778
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x7f87d0357cb8 thread_stack 0x49000
            ??:0(my_print_stacktrace)[0x55d6e0acae2e]
            ??:0(handle_fatal_signal)[0x55d6e04ff1bf]
            ??:0(__restore_rt)[0x7f8821e8d390]
            ??:0(thd_clear_errors(THD*))[0x55d6e02aaa6e]
            ??:0(THD::change_user())[0x55d6e02af941]
            ??:0(THD::reset_for_reuse())[0x55d6e02afb29]
            ??:0(CONNECT::create_thd(THD*))[0x55d6e03f0a24]
            ??:0(do_handle_one_connection(CONNECT*, bool))[0x55d6e03f103b]
            ??:0(handle_one_connection)[0x55d6e03f1454]
            ??:0(MyCTX_nopad::finish(unsigned char*, unsigned int*))[0x55d6e07358f1]
            ??:0(start_thread)[0x7f8821e836ba]
            x86_64/clone.S:111(clone)[0x7f882132a4dd]
             
            Trying to get some variables.
            Some pointers may be invalid and cause the dump to abort.
            Query (0x0): (null)
            Connection ID (thread ID): 319
            Status: NOT_KILLED
             
            Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off
             
            The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
            information that should help you find out what is causing the crash.
             
            We think the query pointer is invalid, but we will try to print it anyway. 
            Query: 
             
            Writing a core file...
            Working directory at /dev/shm/var/1/mysqld.1/data
            Resource Limits:
            Limit                     Soft Limit           Hard Limit           Units     
            Max cpu time              unlimited            unlimited            seconds   
            Max file size             unlimited            unlimited            bytes     
            Max data size             unlimited            unlimited            bytes     
            Max stack size            8388608              unlimited            bytes     
            Max core file size        unlimited            unlimited            bytes     
            Max resident set          unlimited            unlimited            bytes     
            Max processes             23720                23720                processes 
            Max open files            1024                 1024                 files     
            Max locked memory         65536                65536                bytes     
            Max address space         unlimited            unlimited            bytes     
            Max file locks            unlimited            unlimited            locks     
            Max pending signals       23720                23720                signals   
            Max msgqueue size         819200               819200               bytes     
            Max nice priority         0                    0                    
            Max realtime priority     0                    0                    
            Max realtime timeout      unlimited            unlimited            us        
            Core pattern: |/usr/share/apport/apport %p %s %c %P
             
            ----------SERVER LOG END-------------
             
             
             - found 'core' (0/0)
             
            Trying 'dbx' to get a backtrace
             
            Trying 'gdb' to get a backtrace from coredump /dev/shm/var/1/log/galera_sr.galera_sr_shutdown_master-innodb/mysqld.1/data/core
             
            Trying 'lldb' to get a backtrace from coredump /dev/shm/var/1/log/galera_sr.galera_sr_shutdown_master-innodb/mysqld.1/data/core
             - deleting it, already saved 0
             - saving '/dev/shm/var/1/log/galera_sr.galera_sr_shutdown_master-innodb/' to '/dev/shm/var/log/galera_sr.galera_sr_shutdown_master-innodb/'
             
            Retrying test galera_sr.galera_sr_shutdown_master, attempt(2/3)...
             
            worker[1] > Restart  - not started
            worker[1] > Restart  - not started
            

            10.5.9 Server crash logs
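
The backtrace in the crash log can be read mechanically. The sketch below assumes the usual Linux x86_64 layout, in which the frames printed before `__restore_rt` belong to the signal handler and the first frame after it is where the fault occurred. The helper names `parse_frames` and `faulting_frame` are hypothetical, and the frame pattern is inferred from the lines in this log only.

```python
import re

# Matches frames such as "??:0(thd_clear_errors(THD*))[0x55d6e02aaa6e]".
# Pattern inferred from this log; other builds may print frames differently.
FRAME = re.compile(
    r"^\s*(?P<loc>\S+?)\((?P<func>.*)\)\[(?P<addr>0x[0-9a-f]+)\]\s*$"
)


def parse_frames(trace):
    """Return (function, address) pairs for every recognisable frame line."""
    frames = []
    for line in trace.splitlines():
        m = FRAME.match(line)
        if m:
            frames.append((m.group("func"), m.group("addr")))
    return frames


def faulting_frame(frames, trampoline="__restore_rt"):
    """Return the first frame after the signal trampoline, or None."""
    names = [name for name, _ in frames]
    if trampoline not in names:
        return None
    idx = names.index(trampoline) + 1
    return frames[idx] if idx < len(frames) else None


# Top of the backtrace, copied from the server log above.
trace = """\
stack_bottom = 0x7f87d0357cb8 thread_stack 0x49000
??:0(my_print_stacktrace)[0x55d6e0acae2e]
??:0(handle_fatal_signal)[0x55d6e04ff1bf]
??:0(__restore_rt)[0x7f8821e8d390]
??:0(thd_clear_errors(THD*))[0x55d6e02aaa6e]
??:0(THD::change_user())[0x55d6e02af941]
??:0(THD::reset_for_reuse())[0x55d6e02afb29]
??:0(CONNECT::create_thd(THD*))[0x55d6e03f0a24]
"""

print(faulting_frame(parse_frames(trace)))
```

Under that assumption, the crash happened in `thd_clear_errors(THD*)`, reached through `THD::change_user()` and `THD::reset_for_reuse()` from `CONNECT::create_thd(THD*)`, that is, while a THD was being reset for reuse by a new connection.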

            stepan.patryshev Stepan Patryshev (Inactive) added a comment (edited)

            People

              jplindst Jan Lindström (Inactive)
              stepan.patryshev Stepan Patryshev (Inactive)