[MDEV-18422] Galera: Rolling upgrade: 10.4 node stopped with signal 6 after upgrade and trying to replicate DDL Created: 2019-01-30  Updated: 2019-02-21  Resolved: 2019-02-21

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.4.2, 10.3.13
Fix Version/s: 10.4.3, 10.3.14

Type: Bug Priority: Major
Reporter: Shahriyar Rzayev (Inactive) Assignee: Teemu Ollakka
Resolution: Fixed Votes: 0
Labels: galera
Environment:

Ubuntu 18.04


Attachments: File node5_fail_without_mysqld_upgrade_sql_on_node1.err    

 Description   

Hi,
The following scenario was used for this test:

  • Tested on Ubuntu 18.04 desktop with 10.3.13-MariaDB-debug and 10.4.2-MariaDB-debug.
    • Start the cluster: 5 nodes in my test, each running MariaDB 10.3 + Galera 3.x.
    • Stop node5.
    • Start node5 with MariaDB 10.4 + Galera 4.x.
    • Do NOT run mysql_upgrade on node5.
    • Create a database on node1 -> create database test_cluster;
    • node5 is lost with the following error:

[Warning] WSREP: no corresponding NBO begin found for NBO end source: 2222148c-2450-11e9-90e3-be38054a3929 version: 4 local: 0 flags: 5 conn_id: 12 trx_id: -1 tstamp: 1432598153916; state: REPLICATING:0->CERTIFYING:3034 seqnos (l: 3, g: 1, s: 0, d: 0) WS pa_range: 0; state history: REPLICATING:0->CERTIFYING:3034
190130  9:37:28 [ERROR] mysqld got signal 6 ;
 
Server version: 10.4.2-MariaDB-debug
stack_bottom = 0x7fee581e3ac0 thread_stack 0x49000
/home/shako/Galera_Tests/dbs/maria_10.4/bin/mysqld(my_print_stacktrace+0x40)[0x55f40f152c55]
/home/shako/Galera_Tests/dbs/maria_10.4/bin/mysqld(handle_fatal_signal+0x3e1)[0x55f40e98a54c]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7fee5c950890]
linux/raise.c:51(__GI_raise)[0x7fee5ba48e97]
stdlib/abort.c:81(__GI_abort)[0x7fee5ba4a801]
/home/shako/Galera_Tests/dbs/maria_10.4/bin/mysqld(+0x991ad5)[0x55f40e884ad5]
/home/shako/Galera_Tests/dbs/maria_10.4/bin/mysqld(+0x12e8996)[0x55f40f1db996]
src/server_state.cpp:324(apply_toi(wsrep::provider&, wsrep::high_priority_service&, wsrep::ws_handle const&, wsrep::ws_meta const&, wsrep::const_buffer const&))[0x55f40f1de890]
src/server_state.cpp:912(wsrep::server_state::on_apply(wsrep::high_priority_service&, wsrep::ws_handle const&, wsrep::ws_meta const&, wsrep::const_buffer const&))[0x55f40f1f3757]
wsrep/high_priority_service.hpp:47(wsrep::high_priority_service::apply(wsrep::ws_handle const&, wsrep::ws_meta const&, wsrep::const_buffer const&))[0x55f40f1f101d]
src/trx_handle.cpp:414(galera::TrxHandleSlave::apply(void*, wsrep_cb_status (*)(void*, wsrep_ws_handle const*, unsigned int, wsrep_buf const*, wsrep_trx_meta const*, bool*), wsrep_trx_meta const&, bool&))[0x7fee596e7e0e]
src/replicator_smm.cpp:505(galera::ReplicatorSMM::apply_trx(void*, galera::TrxHandleSlave&))[0x7fee597373b0]
src/replicator_smm.cpp:2137(galera::ReplicatorSMM::process_trx(void*, boost::shared_ptr<galera::TrxHandleSlave> const&))[0x7fee5973e28d]
src/gcs_action_source.cpp:63(galera::GcsActionSource::process_writeset(void*, gcs_action const&, bool&))[0x7fee59710a28]
src/gcs_action_source.cpp:109(galera::GcsActionSource::dispatch(void*, gcs_action const&, bool&))[0x7fee597117d2]
src/gcs_action_source.cpp:29(galera::GcsActionSource::process(void*, bool&))[0x7fee59711a98]
src/replicator_smm.cpp:383(galera::ReplicatorSMM::async_recv(void*))[0x7fee59739a80]
src/wsrep_provider.cpp:263(galera_recv)[0x7fee597577fb]
/home/shako/Galera_Tests/dbs/maria_10.4/bin/mysqld(_ZN5wsrep18wsrep_provider_v2611run_applierEPNS_21high_priority_serviceE+0x30)[0x55f40f1f1bd0]
src/wsrep_provider_v26.cpp:646(wsrep::wsrep_provider_v26::run_applier(wsrep::high_priority_service*))[0x55f40e8a9e0a]
sql/wsrep_thd.cc:61(wsrep_replication_process(THD*, void*))[0x55f40e89b66b]
nptl/pthread_create.c:463(start_thread)[0x7fee5c9456db]
x86_64/clone.S:97(clone)[0x7fee5bb2b88f]
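
When comparing error logs from several nodes, it can help to pull the header fields out of the WSREP warning above instead of reading them by eye. The following is a minimal sketch, not part of the original report: the function name parse_wsrep_warning is hypothetical, and the regular expressions assume the field layout seen in this one log line, which other Galera versions may not share.

```python
import re

# Warning text copied from the node5 error log in this report,
# joined onto a single line.
WARNING = (
    "[Warning] WSREP: no corresponding NBO begin found for NBO end "
    "source: 2222148c-2450-11e9-90e3-be38054a3929 version: 4 local: 0 "
    "flags: 5 conn_id: 12 trx_id: -1 tstamp: 1432598153916; "
    "state: REPLICATING:0->CERTIFYING:3034 "
    "seqnos (l: 3, g: 1, s: 0, d: 0) WS pa_range: 0; "
    "state history: REPLICATING:0->CERTIFYING:3034"
)

# Simple "key: value" fields that appear once in the warning header.
HEADER_KEYS = ("source", "version", "local", "flags",
               "conn_id", "trx_id", "tstamp")


def parse_wsrep_warning(line):
    """Return the header fields (as strings) and the seqnos (as ints).

    Hypothetical helper: the layout is inferred from the single log
    line in this report, not from any Galera documentation.
    """
    fields = {}
    for key in HEADER_KEYS:
        # A value runs until the next whitespace or semicolon.
        match = re.search(r"\b%s: ([^\s;]+)" % key, line)
        if match:
            fields[key] = match.group(1)

    # seqnos are printed as "(l: N, g: N, s: N, d: N)".
    match = re.search(
        r"seqnos \(l: (-?\d+), g: (-?\d+), s: (-?\d+), d: (-?\d+)\)", line
    )
    if match:
        fields["seqnos"] = dict(zip("lgsd", map(int, match.groups())))
    return fields


if __name__ == "__main__":
    for name, value in parse_wsrep_warning(WARNING).items():
        print(name, "=", value)
```

The same function can be applied to each matching line of an attached .err file to check whether the failing write set always carries the same version and flags values.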



 Comments   
Comment by Shahriyar Rzayev (Inactive) [ 2019-01-30 ]

If the scenario is changed slightly so that the DDL is issued on the upgraded node, there is no error:

• Start the cluster: 5 nodes in my test, each running MariaDB 10.3 + Galera 3.x.
• Stop node5.
• Start node5 with MariaDB 10.4 + Galera 4.x.
• Do NOT run `mysql_upgrade` on node5.
• Create a database on `node5` -> `create database test_cluster;`
• The statement is replicated to the other nodes without errors.

Comment by Shahriyar Rzayev (Inactive) [ 2019-01-31 ]

The same failure also occurred after running mysql_upgrade on node5.

Comment by Shahriyar Rzayev (Inactive) [ 2019-02-21 ]

This can be closed, as the related PR has been merged into 10.4.

Generated at Thu Feb 08 08:43:59 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.