Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-35940

Unallowed state transition: donor -> synced in galera_wan

Details

    • Bug
    • Status: Open (View Workflow)
    • Critical
    • Resolution: Unresolved
    • 10.5.27
    • 10.5
    • Galera
    • None

    Description

      galera_wan test failed on node_2:

      CURRENT_TEST: galera.galera_wan
       
      Failed to start mysqld.4
      mysqltest failed but provided no output
       
       - saving '/home/panda/maria-10.5/build/mysql-test/var/1/log/galera.galera_wan/' to '/home/panda/maria-10.5/build/mysql-test/var/log/galera.galera_wan/'
      ***Warnings generated in error logs during shutdown after running tests: galera.galera_wan
       
      2025-01-26  1:00:48 2 [Warning] WSREP: server: panda unallowed state transition: donor -> synced
      mariadbd: /home/panda/maria-10.5/wsrep-lib/src/server_state.cpp:1368: void wsrep::server_state::state(wsrep::unique_lock<wsrep::mutex>&, state): Assertion `0' failed.
      Attempting backtrace. You can use the following information to find out
      2025-01-26  1:00:54 0 [Warning] WSREP: Donor 904badfd-db78-11ef-8383-262f50186ceb is no longer in the group. State transfer cannot be completed, need to abort.
      2025-01-26  1:00:54 0 [ERROR] WSREP: /home/panda/galera-es/gcs/src/gcs_core.cpp:core_handle_state_msg():1090: Failed to handle state message: -131 (State not recoverable)
      2025-01-26  1:00:54 0 [ERROR] WSREP: /home/panda/galera-es/gcs/src/gcs_core.cpp:gcs_core_recv():1350: Unrecoverable error happened above. Aborting...
      WSREP_SST: [ERROR] Parent mysqld process (PID: 1118096) terminated unexpectedly. (20250126 01:00:56.610)
      WSREP_SST: [ERROR] Cleanup after exit with status: 3 (20250126 01:00:56.613)

      log fragment and stack trace for node_2:

      2025-01-26  1:00:48 2 [Note] WSREP: Read WSREPXid from InnoDB:	8f2d1325-db78-11ef-b682-2a2e8fd12b2d:3
      2025-01-26  1:00:48 2 [Note] WSREP: Lowest cert index boundary for CC from group: 3
      2025-01-26  1:00:48 2 [Note] WSREP: Min available from gcache for CC from group: 2
      2025-01-26  1:00:48 2 [Note] WSREP: Server panda synced with group
      2025-01-26  1:00:48 2 [Note] WSREP: server panda state change: joined -> synced
      2025-01-26  1:00:48 2 [Note] WSREP: Server status change joined -> synced
      2025-01-26  1:00:48 2 [Note] WSREP: Synchronized with group, ready for connections
      2025-01-26  1:00:48 2 [Note] WSREP: Setting wsrep_ready to 1
      2025-01-26  1:00:48 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      2025-01-26  1:00:48 2 [Note] WSREP: Detected STR version: 1, req_len: 153, req: STRv1
      2025-01-26  1:00:48 2 [Note] WSREP: Cert index preload: 3 -> 3
      2025-01-26  1:00:48 2 [Note] WSREP: server panda state change: synced -> donor
      2025-01-26  1:00:48 0 [Note] WSREP: async IST sender starting to serve tcp://127.0.0.1:19044 sending 3-3, preload starts from 3
      2025-01-26  1:00:48 2 [Note] WSREP: Server status change synced -> donor
      2025-01-26  1:00:48 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      2025-01-26  1:00:48 0 [Note] WSREP: IST sender 3 -> 3
      2025-01-26  1:00:48 0 [Note] WSREP: Donor monitor thread started to monitor
      2025-01-26  1:00:48 0 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'donor' --address '127.0.0.1:19045/rsync_sst' --local-port 19031 --socket '/home/panda/maria-10.5/build/mysql-test/var/tmp/1/mysqld.2.sock' --progress 0 --datadir '/home/panda/maria-10.5/build/mysql-test/var/1/mysqld.2/data/' --defaults-file '/home/panda/maria-10.5/build/mysql-test/var/1/my.cnf' --defaults-group-suffix '.2' --gtid '8f2d1325-db78-11ef-b682-2a2e8fd12b2d:3' --gtid-domain-id 0 --binlog-index 'mysqld-bin.index' --mysqld-args --defaults-group-suffix=.2 --defaults-file=/home/panda/maria-10.5/build/mysql-test/var/1/my.cnf --log-output=file --innodb --innodb-cmpmem --innodb-cmp-per-index --innodb-trx --innodb-locks --innodb-lock-waits --innodb-metrics --innodb-buffer-pool-stats --innodb-buffer-page --innodb-buffer-page-lru --innodb-sys-columns --innodb-sys-fields --innodb-sys-foreign --innodb-sys-foreign-cols --innodb-sys-indexes --innodb-sys-tables --innodb-sys-virtual --innodb --innodb-cmpmem --innodb-cmp-per-index --innodb-trx --innodb-locks --innodb-lock-wa
      2025-01-26  1:00:48 0 [Note] WSREP: 1.1 (panda): State transfer to 3.3 (panda) complete.
      2025-01-26  1:00:48 0 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 3)
      2025-01-26  1:00:48 0 [Note] WSREP: Processing event queue:... -nan% (0/0 events) complete.
      2025-01-26  1:00:48 2 [Note] WSREP: sst_donor_thread signaled with 0
      2025-01-26  1:00:48 0 [Note] WSREP: Member 1.1 (panda) synced with group.
      2025-01-26  1:00:48 0 [Note] WSREP: Processing event queue:... 100.0% (1/1 events) complete.
      2025-01-26  1:00:48 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 3)
      2025-01-26  1:00:48 2 [Note] WSREP: Server panda synced with group
      2025-01-26  1:00:48 2 [Warning] WSREP: server: panda unallowed state transition: donor -> synced
       
      mariadbd: /home/panda/maria-10.5/wsrep-lib/src/server_state.cpp:1368: void wsrep::server_state::state(wsrep::unique_lock<wsrep::mutex>&, state): Assertion `0' failed.
      250126	1:00:48 [ERROR] mysqld got signal 6 ;
      Sorry, we probably made a mistake, and this is a bug.
       
      Your assistance in bug reporting will enable us to fix this for the next release.
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.
       
      Server version: 10.5.28-22-MariaDB-debug-log source revision: 1335b809745edeb87ae84f8693ff52cbee02d0a5
      key_buffer_size=1048576
      read_buffer_size=131072
      max_used_connections=1
      max_threads=153
      thread_count=5
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63826 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      WSREP: Suppressing further logging
      WSREP: Shutting down network communications
       
      Thread pointer: 0x7f45f8000dc8
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f461408b000 thread_stack 0x49000
      WSREP_SST: [INFO] rsync SST started on donor (20250126 01:00:48.132)
      2025-01-26  1:00:48 13 [Note] WSREP: assigned new next trx id: 24
      2025-01-26  1:00:48 13 [Note] WSREP: wsrep_after_statement for 13 client_state exec  client_mode local trans_state aborted
      2025-01-26  1:00:48 13 [Note] WSREP: wsrep_thd_binlog_reset
      2025-01-26  1:00:48 13 [Note] WSREP: assigned new next trx id: 25
      2025-01-26  1:00:48 13 [Note] WSREP: wsrep_after_statement for 13 client_state exec  client_mode local trans_state aborted
      2025-01-26  1:00:48 13 [Note] WSREP: wsrep_thd_binlog_reset
      2025-01-26  1:00:48 13 [Note] WSREP: wsrep_after_statement for 13 client_state exec  client_mode local trans_state aborted
      2025-01-26  1:00:48 13 [Note] WSREP: wsrep_thd_binlog_reset
      2025-01-26  1:00:48 13 [Note] WSREP: assigned new next trx id: 26
      WSREP_SST: [INFO] Using stunnel for SSL encryption: CA: '/home/panda/maria-10.5/mysql-test/std_data/cacert.pem', CAPATH='', ssl-mode='VERIFY_CA' (20250126 01:00:48.273)
      2025-01-26  1:00:48 0 [Note] WSREP: Flushing tables for SST...
      mysys/stacktrace.c:212(my_print_stacktrace)[0x562710e944e8]
      sql/signal_handler.cc:246(handle_fatal_signal)[0x5627104640da]
      libc_sigaction.c:0(__restore_rt)[0x7f4619124520]
      nptl/pthread_kill.c:44(__pthread_kill_implementation)[0x7f46191789fc]
      posix/raise.c:27(__GI_raise)[0x7f4619124476]
      stdlib/abort.c:81(__GI_abort)[0x7f461910a7f3]
      intl/loadmsgcat.c:1177(_nl_load_domain)[0x7f461910a71b]
      /lib/x86_64-linux-gnu/libc.so.6(+0x39e96)[0x7f461911be96]
      src/server_state.cpp:1371(wsrep::server_state::state(std::unique_lock<wsrep::mutex>&, wsrep::server_state::state))[0x5627110d2fc6]
      src/server_state.cpp:1095(wsrep::server_state::on_sync())[0x5627110d1491]
      2025-01-26  1:00:49 0 [Note] WSREP: (904badfd-8383, 'tcp://0.0.0.0:19037') turning message relay requesting on, nonlive peers: tcp://127.0.0.1:19034 tcp://127.0.0.1:19040 tcp://127.0.0.1:19043
      src/wsrep_provider_v26.cpp:532((anonymous namespace)::synced_cb(void*))[0x5627110f8f11]
      src/replicator_smm.cpp:3104(galera::ReplicatorSMM::process_sync(long))[0x7f460fc77082]
      src/gcs_action_source.cpp:144(galera::GcsActionSource::dispatch(void*, gcs_action const&, bool&))[0x7f460fccbc82]
      src/gcs_action_source.cpp:192(galera::GcsActionSource::process(void*, bool&))[0x7f460fccc21d]
      src/replicator_smm.cpp:404(galera::ReplicatorSMM::async_recv(void*))[0x7f460fc660ba]
      2025-01-26  1:00:50 0 [Note] WSREP: (904badfd-8383, 'tcp://0.0.0.0:19037') reconnecting to 8f2589b0-b794 (tcp://127.0.0.1:19034), attempt 0
      2025-01-26  1:00:50 0 [Note] WSREP: (904badfd-8383, 'tcp://0.0.0.0:19037') reconnecting to 905c6a98-9e45 (tcp://127.0.0.1:19040), attempt 0
      2025-01-26  1:00:50 0 [Note] WSREP: (904badfd-8383, 'tcp://0.0.0.0:19037') reconnecting to 90665be4-87d2 (tcp://127.0.0.1:19043), attempt 0
      2025-01-26  1:00:50 0 [Note] WSREP: Failed to establish connection: Operation aborted.
      src/wsrep_provider.cpp:292(galera_recv)[0x7f460fc3cc86]
      2025-01-26  1:00:50 0 [Note] WSREP: Failed to establish connection: Operation aborted.
      2025-01-26  1:00:50 0 [Note] WSREP: Failed to establish connection: Operation aborted.
      src/wsrep_provider_v26.cpp:870(wsrep::wsrep_provider_v26::run_applier(wsrep::high_priority_service*))[0x5627110fa330]
      sql/wsrep_thd.cc:57(wsrep_replication_process(THD*, void*))[0x5627108aec29]
      sql/wsrep_mysqld.cc:3655(start_wsrep_THD(void*))[0x562710891f32]
      perfschema/pfs.cc:2203(pfs_spawn_thread)[0x5627107e0b00]
      nptl/pthread_create.c:442(start_thread)[0x7f4619176ac3]
      x86_64/clone3.S:83(__clone3)[0x7f4619208850]
       
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x0): (null)
      Connection ID (thread ID): 2
      Status: NOT_KILLED

      and for node_4:

      WSREP_SST: [INFO] rsync SST started on joiner (20250126 01:00:41.770)
      WSREP_SST: [INFO] Using stunnel for SSL encryption: CA: '/home/panda/maria-10.5/mysql-test/std_data/cacert.pem', CAPATH='', ssl-mode='VERIFY_CA' (20250126 01:00:41.864)
      WSREP_SST: [INFO] 'ss' is selected as a socket information utility. (20250126 01:00:41.871)
      2025.01.26 01:00:41 LOG4[ui]: Insecure file permissions on /home/panda/maria-10.5/mysql-test/std_data/server-key.pem
      2025-01-26  1:00:41 2 [Note] WSREP: Prepared SST request: rsync|localhost:057c9d21918efafdc5e3826a19021dba@127.0.0.1:19045/rsync_sst
      2025-01-26  1:00:41 2 [Note] WSREP: App waits SST: 1
      2025-01-26  1:00:41 2 [Note] WSREP: ####### IST uuid:00000000-0000-0000-0000-000000000000 f: 0, l: 3, STRv: 3
      2025-01-26  1:00:41 2 [Note] WSREP: IST receiver addr using tcp://127.0.0.1:19044
      2025-01-26  1:00:41 2 [Note] WSREP: Prepared IST receiver for 0-3, listening at: tcp://127.0.0.1:19044
      2025-01-26  1:00:41 0 [Note] WSREP: Member 2.2 (panda) requested state transfer from '*any*', but it is impossible to select State Transfer donor: No donor candidates temporarily available in suitable state
      2025-01-26  1:00:41 0 [Note] WSREP: Member 3.3 (panda) requested state transfer from '*any*', but it is impossible to select State Transfer donor: No donor candidates temporarily available in suitable state
      2025-01-26  1:00:41 2 [Note] WSREP: Requesting state transfer failed: No donor candidates temporarily available in suitable state. Will keep retrying every 1 second(s)
      2025-01-26  1:00:43 0 [Note] WSREP: Member 2.2 (panda) requested state transfer from '*any*', but it is impossible to select State Transfer donor: No donor candidates temporarily available in suitable state
      2025-01-26  1:00:43 0 [Note] WSREP: Member 3.3 (panda) requested state transfer from '*any*', but it is impossible to select State Transfer donor: No donor candidates temporarily available in suitable state
      2025-01-26  1:00:44 0 [Note] WSREP: Member 2.2 (panda) requested state transfer from '*any*', but it is impossible to select State Transfer donor: No donor candidates temporarily available in suitable state
      2025-01-26  1:00:44 0 [Note] WSREP: Member 3.3 (panda) requested state transfer from '*any*', but it is impossible to select State Transfer donor: No donor candidates temporarily available in suitable state
      2025-01-26  1:00:44 0 [Note] WSREP: 0.1 (panda): State transfer to -1.-1 (left the group) complete.
      2025-01-26  1:00:44 0 [Note] WSREP: Member 0.1 (panda) synced with group.
      2025-01-26  1:00:45 0 [Note] WSREP: Member 2.2 (panda) requested state transfer from '*any*'. Selected 0.1 (panda)(SYNCED) as donor.
      2025-01-26  1:00:45 0 [Note] WSREP: Member 3.3 (panda) requested state transfer from '*any*', but it is impossible to select State Transfer donor: No donor candidates temporarily available in suitable state
      2025-01-26  1:00:46 0 [Note] WSREP: Member 3.3 (panda) requested state transfer from '*any*', but it is impossible to select State Transfer donor: No donor candidates temporarily available in suitable state
      2025-01-26  1:00:47 0 [Note] WSREP: 1.1 (panda): State transfer from -1.-1 (left the group) complete.
      2025-01-26  1:00:47 0 [Note] WSREP: Member 1.1 (panda) synced with group.
      2025-01-26  1:00:47 0 [Note] WSREP: There are no nodes in the same segment that will ever be able to become donors, yet there is a suitable donor outside. Will use that one.
      2025-01-26  1:00:47 0 [Note] WSREP: Member 3.3 (panda) requested state transfer from '*any*'. Selected 1.1 (panda)(SYNCED) as donor.
      2025-01-26  1:00:47 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 3)
      2025-01-26  1:00:47 2 [Note] WSREP: Requesting state transfer: success after 6 tries, donor: 1
      2025-01-26  1:00:47 2 [Note] WSREP: Resetting GCache seqno map due to different histories.
      2025-01-26  1:00:47 2 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 8f2d1325-db78-11ef-b682-2a2e8fd12b2d:3
      2025-01-26  1:00:48 0 [Note] WSREP: 1.1 (panda): State transfer to 3.3 (panda) complete.
      2025-01-26  1:00:48 0 [Note] WSREP: Member 1.1 (panda) synced with group.
      2025-01-26  1:00:48 0 [Note] WSREP: (90665be4-87d2, 'tcp://0.0.0.0:19043') turning message relay requesting on, nonlive peers: tcp://127.0.0.1:19037
      2025-01-26  1:00:50 0 [Note] WSREP: (90665be4-87d2, 'tcp://0.0.0.0:19043') reconnecting to 904badfd-8383 (tcp://127.0.0.1:19037), attempt 0
      2025-01-26  1:00:53 0 [Note] WSREP: Failed to establish connection: Connection refused
      2025-01-26  1:00:54 0 [Note] WSREP: evs::proto(90665be4-87d2, GATHER, view_id(REG,8f2589b0-b794,3)) suspecting node: 904badfd-8383
      2025-01-26  1:00:54 0 [Note] WSREP: evs::proto(90665be4-87d2, GATHER, view_id(REG,8f2589b0-b794,3)) suspected node without join message, declaring inactive
      2025-01-26  1:00:54 0 [Note] WSREP: 0.1 (panda): State transfer to 2.2 (panda) complete.
      2025-01-26  1:00:54 0 [Note] WSREP: 2.2 (panda): State transfer from 0.1 (panda) complete.
      2025-01-26  1:00:54 0 [Note] WSREP: declaring 8f2589b0-b794 at tcp://127.0.0.1:19034 stable
      2025-01-26  1:00:54 0 [Note] WSREP: declaring 905c6a98-9e45 at tcp://127.0.0.1:19040 stable
      2025-01-26  1:00:54 0 [Note] WSREP: Node 8f2589b0-b794 state prim
      2025-01-26  1:00:54 0 [Note] WSREP: view(view_id(PRIM,8f2589b0-b794,4) memb {
      	8f2589b0-b794,1
      	905c6a98-9e45,2
      	90665be4-87d2,3
      } joined {
      } left {
      } partitioned {
      	904badfd-8383,1
      })
      2025-01-26  1:00:54 0 [Note] WSREP: save pc into disk
      2025-01-26  1:00:54 0 [Note] WSREP: forgetting 904badfd-8383 (tcp://127.0.0.1:19037)
      2025-01-26  1:00:54 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
      2025-01-26  1:00:54 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
      2025-01-26  1:00:54 0 [Note] WSREP: (90665be4-87d2, 'tcp://0.0.0.0:19043') turning message relay requesting off
      2025-01-26  1:00:54 0 [Note] WSREP: SYNC message from member 0 in non-primary configuration. Ignored.
      2025-01-26  1:00:54 0 [Note] WSREP: SYNC message from member 1 in non-primary configuration. Ignored.
      2025-01-26  1:00:54 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 9cdc9e4a-db78-11ef-b541-2e7d86acb5fe
      2025-01-26  1:00:54 0 [Note] WSREP: STATE EXCHANGE: got state msg: 9cdc9e4a-db78-11ef-b541-2e7d86acb5fe from 0 (panda)
      2025-01-26  1:00:54 0 [Note] WSREP: STATE EXCHANGE: got state msg: 9cdc9e4a-db78-11ef-b541-2e7d86acb5fe from 1 (panda)
      2025-01-26  1:00:54 0 [Note] WSREP: STATE EXCHANGE: got state msg: 9cdc9e4a-db78-11ef-b541-2e7d86acb5fe from 2 (panda)
      2025-01-26  1:00:54 0 [Note] WSREP: Quorum results:
      	version    = 6,
      	component  = PRIMARY,
      	conf_id    = 3,
      	members    = 2/3 (joined/total),
      	act_id	   = 3,
      	last_appl. = 0,
      	protocols  = 5/11/4 (gcs/repl/appl),
      	vote policy= 0,
      	group UUID = 8f2d1325-db78-11ef-b682-2a2e8fd12b2d
      2025-01-26  1:00:54 0 [Warning] WSREP: Donor 904badfd-db78-11ef-8383-262f50186ceb is no longer in the group. State transfer cannot be completed, need to abort.
      2025-01-26  1:00:54 0 [ERROR] WSREP: /home/panda/galera-es/gcs/src/gcs_core.cpp:core_handle_state_msg():1090: Failed to handle state message: -131 (State not recoverable)
      2025-01-26  1:00:54 0 [Note] WSREP: gcomm: terminating thread
      2025-01-26  1:00:54 0 [Note] WSREP: gcomm: joining thread
      2025-01-26  1:00:54 0 [Note] WSREP: gcomm: closing backend
      2025-01-26  1:00:54 0 [Note] WSREP: view(view_id(NON_PRIM,8f2589b0-b794,4) memb {
      	90665be4-87d2,3
      } joined {
      } left {
      } partitioned {
      	8f2589b0-b794,1
      	905c6a98-9e45,2
      })
      2025-01-26  1:00:54 0 [Note] WSREP: PC protocol downgrade 1 -> 0
      2025-01-26  1:00:54 0 [Note] WSREP: view((empty))
      2025-01-26  1:00:54 0 [Note] WSREP: gcomm: closed
      2025-01-26  1:00:54 0 [ERROR] WSREP: /home/panda/galera-es/gcs/src/gcs_core.cpp:gcs_core_recv():1350: Unrecoverable error happened above. Aborting...
      2025-01-26  1:00:55 0 [Note] WSREP: /home/panda/maria-10.5/build/sql/mariadbd: Terminated.
      WSREP_SST: [ERROR] Parent mysqld process (PID: 1118096) terminated unexpectedly. (20250126 01:00:56.610)
      WSREP_SST: [ERROR] Cleanup after exit with status: 3 (20250126 01:00:56.613)
      WSREP_SST: [INFO] Joiner cleanup: rsync PID=0, stunnel PID=1120147 (20250126 01:00:56.616)
      WSREP_SST: [INFO] Joiner cleanup done. (20250126 01:00:56.648)

      Attachments

        1. mysqld.1.err
          62 kB
        2. mysqld.2.err
          44 kB
        3. mysqld.3.err
          73 kB
        4. mysqld.4.err
          24 kB
        5. stdout.log
          1 kB

        Activity

          There are no comments yet on this issue.

          People

            Lampio Pekka
            sysprg Julius Goryavsky
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.