Details
-
Bug
-
Status: Closed (View Workflow)
-
Blocker
-
Resolution: Fixed
-
11.1(EOL), 11.2, 11.4, 11.3(EOL)
-
None
Description
Description below was for original 10.11 to 11.4 failure, further testing showed that the actual same problem occurs betweein 11.0 and 11.1. Rolling upgrades from 10.11 to 11.0 and from 11.1 to 11.4 work as expected.
So it is only the 11.0 to 11.1 step that is critical.
I'm installing a 10.11 two node cluster with rather basic configuration, wait for a minute, then try a rolling upgrade to 11.4, starting with the 2nd node.
The upgraded node at first fails to join the cluster with
Sep 02 12:58:49 node-2 mariadbd[3973]: 2024-09-02 12:58:49 0 [Note] WSREP: Member 1.0 (node-1) synced with group.
|
Sep 02 12:58:49 node-2 -wsrep-sst-joiner[4188]: receiving process ended without creating magic file (/tmp/tmp.IJSXAIA9T4/mariadb_backup_galera_info)
|
Sep 02 12:58:50 node-2 -wsrep-sst-joiner[4146]: logger: invalid option -- 'r'
|
Sep 02 12:58:50 node-2 -wsrep-sst-joiner[4146]: Try 'logger --help' for more information.
|
Sep 02 12:58:50 node-2 -wsrep-sst-joiner[4195]: Removing the sst_in_progress file
|
Sep 02 12:58:50 node-2 mariadbd[3973]: 2024-09-02 12:58:50 0 [ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_domain_id from joiner script.
|
Sep 02 12:58:50 node-2 mariadbd[3973]: 2024-09-02 12:58:50 0 [ERROR] WSREP: Process completed with error: wsrep_sst_mariabackup --role 'joiner' --address 'node-2' --datadir '/var/lib/mysql/' --parent 3973 --progress 0 [...]
|
while the donor side does not report anything unusual, showing the state transfer as successful IST, not needding a ST:
Sep 02 12:58:49 node-1 mariadbd[1878]: 2024-09-02 12:58:49 0 [Note] WSREP: Member 0.0 (node-2) requested state transfer from '*any*'. Selected 1.0 (node-1)(SYNCED) as donor.
|
Sep 02 12:58:49 node-1 mariadbd[1878]: 2024-09-02 12:58:49 0 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 6)
|
Sep 02 12:58:49 node-1 mariadbd[1878]: 2024-09-02 12:58:49 2 [Note] WSREP: Detected STR version: 1, req_len: 115, req: STRv1
|
Sep 02 12:58:49 node-1 mariadbd[1878]: 2024-09-02 12:58:49 2 [Note] WSREP: IST request: b6d18f4d-692a-11ef-abf2-8bd2c43da2fb:4-6|tcp://node-2:4568
|
Sep 02 12:58:49 node-1 mariadbd[1878]: 2024-09-02 12:58:49 2 [Note] WSREP: Server status change synced -> donor
|
Sep 02 12:58:49 node-1 mariadbd[1878]: 2024-09-02 12:58:49 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
|
Sep 02 12:58:49 node-1 mariadbd[1878]: 2024-09-02 12:58:49 0 [Note] WSREP: Running: 'wsrep_sst_mariabackup --role 'donor' --address 'node-2:4444/xtrabackup_sst//1' --local-por>
|
Sep 02 12:58:49 node-1 mariadbd[1878]: 2024-09-02 12:58:49 0 [Note] WSREP: Donor monitor thread started to monitor
|
Sep 02 12:58:49 node-1 mariadbd[1878]: 2024-09-02 12:58:49 2 [Note] WSREP: sst_donor_thread signaled with 0
|
Sep 02 12:58:49 node-1 mariadbd[1878]: 2024-09-02 12:58:49 0 [Note] WSREP: async IST sender starting to serve tcp://node-2:4568 sending 2-6, preload starts from 2
|
Sep 02 12:58:49 node-1 mariadbd[1878]: 2024-09-02 12:58:49 0 [Note] WSREP: IST sender 2 -> 6
|
Sep 02 12:58:49 node-1 mariadbd[2359]: WSREP_SST: [INFO] mariabackup IST started on donor (20240902 12:58:49.919)
|
Sep 02 12:58:49 node-1 mariadbd[2359]: WSREP_SST: [INFO] SSL configuration: CA='', CAPATH='', CERT='', KEY='', MODE='DISABLED', encrypt='0' (20240902 12:58:49.959)
|
Sep 02 12:58:50 node-1 mariadbd[2359]: WSREP_SST: [INFO] Logging all stderr of SST/mariadb-backup to syslog (20240902 12:58:50.054)
|
Sep 02 12:58:50 node-1 -wsrep-sst-donor[2543]: Streaming with mbstream
|
Sep 02 12:58:50 node-1 -wsrep-sst-donor[2546]: Bypassing the SST for IST
|
Sep 02 12:58:50 node-1 mariadbd[1878]: 2024-09-02 12:58:50 0 [Note] WSREP: SST sent: b6d18f4d-692a-11ef-abf2-8bd2c43da2fb:4
|
Sep 02 12:58:50 node-1 mariadbd[1878]: 2024-09-02 12:58:50 0 [Note] WSREP: Server status change donor -> joined
|
Sep 02 12:58:50 node-1 mariadbd[1878]: 2024-09-02 12:58:50 0 [Note] WSREP: 1.0 (node-1): State transfer to 0.0 (node-2) complete.
|
Sep 02 12:58:50 node-1 mariadbd[1878]: 2024-09-02 12:58:50 0 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 6)
|
Sep 02 12:58:50 node-1 mariadbd[1878]: 2024-09-02 12:58:50 0 [Note] WSREP: Processing event queue:... -nan% (0/0 events) complete.
|
Sep 02 12:58:50 node-1 mariadbd[1878]: 2024-09-02 12:58:50 0 [Note] WSREP: Member 1.0 (node-1) synced with group.
|
Sep 02 12:58:50 node-1 mariadbd[1878]: 2024-09-02 12:58:50 0 [Note] WSREP: Processing event queue:... 100.0% (1/1 events) complete.
|
Sep 02 12:58:50 node-1 mariadbd[1878]: 2024-09-02 12:58:50 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 6)
|
Sep 02 12:58:50 node-1 mariadbd[1878]: 2024-09-02 12:58:50 2 [Note] WSREP: Server node-1 synced with group
|
Sep 02 12:58:50 node-1 mariadbd[1878]: 2024-09-02 12:58:50 2 [Note] WSREP: Server status change joined -> synced
|
Sep 02 12:58:50 node-1 mariadbd[1878]: 2024-09-02 12:58:50 2 [Note] WSREP: Synchronized with group, ready for connections
|
Sep 02 12:58:50 node-1 mariadbd[1878]: 2024-09-02 12:58:50 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
|
Sep 02 12:58:50 node-1 -wsrep-sst-donor[2550]: Total time on donor: 0 seconds
|
Sep 02 12:58:50 node-1 -wsrep-sst-donor[2551]: mariabackup IST completed on donor
|
Sep 02 12:58:50 node-1 mariadbd[1878]: 2024-09-02 12:58:50 0 [Note] WSREP: Donor monitor thread ended with total time 1 sec
|
SystemD restarts the server process on node-2 and it runs into the same problem two more times before the restart starts to fail due to a crash instead:
Sep 02 12:59:19 node-2 mariadbd[5558]: 2024-09-02 12:59:19 2 [Note] WSREP: Prepared IST receiver for 0-12, listening at: tcp://10.30.36.12:4568
|
Sep 02 12:59:19 node-2 mariadbd[5558]: 2024-09-02 12:59:19 0 [Note] WSREP: Member 0.0 (node-2) requested state transfer from '*any*'. Selected 1.0 (node-1)(SYNCED) as donor.
|
Sep 02 12:59:19 node-2 mariadbd[5558]: 2024-09-02 12:59:19 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 12)
|
Sep 02 12:59:19 node-2 mariadbd[5558]: 2024-09-02 12:59:19 2 [Note] WSREP: Requesting state transfer: success, donor: 1
|
Sep 02 12:59:19 node-2 mariadbd[5558]: 2024-09-02 12:59:19 2 [Note] WSREP: Resetting GCache seqno map due to different histories.
|
Sep 02 12:59:19 node-2 mariadbd[5558]: munmap_chunk(): invalid pointer
|
Sep 02 12:59:19 node-2 mariadbd[5558]: 240902 12:59:19 [ERROR] mysqld got signal 6 ;
|
Sep 02 12:59:19 node-2 mariadbd[5558]: Sorry, we probably made a mistake, and this is a bug.
|
Sep 02 12:59:19 node-2 mariadbd[5558]: Your assistance in bug reporting will enable us to fix this for the next release.
|
Sep 02 12:59:19 node-2 mariadbd[5558]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
|
Sep 02 12:59:19 node-2 mariadbd[5558]: We will try our best to scrape up some info that will hopefully help
|
Sep 02 12:59:19 node-2 mariadbd[5558]: diagnose the problem, but since we have already crashed,
|
Sep 02 12:59:19 node-2 mariadbd[5558]: something is definitely wrong and this may fail.
|
Sep 02 12:59:19 node-2 mariadbd[5558]: Server version: 11.4.3-MariaDB-ubu2004-log source revision: 5ab81ffe0097a22a774957df28c5223cf0201de3
|
Sep 02 12:59:19 node-2 mariadbd[5558]: key_buffer_size=0
|
Sep 02 12:59:19 node-2 mariadbd[5558]: read_buffer_size=131072
|
Sep 02 12:59:19 node-2 mariadbd[5558]: max_used_connections=0
|
Sep 02 12:59:19 node-2 mariadbd[5558]: max_threads=153
|
Sep 02 12:59:19 node-2 mariadbd[5558]: thread_count=3
|
Sep 02 12:59:19 node-2 mariadbd[5558]: It is possible that mysqld could use up to
|
Sep 02 12:59:19 node-2 mariadbd[5558]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 336996 K bytes of memory
|
Sep 02 12:59:19 node-2 mariadbd[5558]: Hope that's ok; if not, decrease some variables in the equation.
|
Sep 02 12:59:19 node-2 mariadbd[5558]: Thread pointer: 0x7f2954000c58
|
Sep 02 12:59:19 node-2 mariadbd[5558]: Attempting backtrace. You can use the following information to find out
|
Sep 02 12:59:19 node-2 mariadbd[5558]: where mysqld died. If you see no messages after this, something went
|
Sep 02 12:59:19 node-2 mariadbd[5558]: terribly wrong...
|
Sep 02 12:59:19 node-2 mariadbd[5558]: stack_bottom = 0x7f297875cd88 thread_stack 0x49000
|
Sep 02 12:59:19 node-2 mariadbd[5558]: /usr/sbin/mariadbd(my_print_stacktrace+0x32)[0x5615938a01c2]
|
Sep 02 12:59:19 node-2 mariadbd[5558]: /usr/sbin/mariadbd(handle_fatal_signal+0x485)[0x56159335a875]
|
Sep 02 12:59:19 node-2 mariadbd[5558]: sigaction.c:0(__restore_rt)[0x7f298056a420]
|
Sep 02 12:59:19 node-2 mariadbd[5774]: addr2line: DWARF error: section .debug_info is larger than its filesize! (0x93f189 vs 0x531098)
|
Sep 02 12:59:19 node-2 mariadbd[5558]: /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f298007000b]
|
Sep 02 12:59:19 node-2 mariadbd[5558]: /lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7f298004f859]
|
Sep 02 12:59:19 node-2 mariadbd[5558]: /lib/x86_64-linux-gnu/libc.so.6(+0x8d26e)[0x7f29800ba26e]
|
Sep 02 12:59:19 node-2 mariadbd[5558]: /lib/x86_64-linux-gnu/libc.so.6(+0x952fc)[0x7f29800c22fc]
|
Sep 02 12:59:19 node-2 mariadbd[5558]: /lib/x86_64-linux-gnu/libc.so.6(+0x9554c)[0x7f29800c254c]
|
Sep 02 12:59:19 node-2 mariadbd[5558]: src/gu_deqmap.hpp:117(gu::DeqMap<long, void const*, std::allocator<void const*> >::index_back() const)[0x7f297a22083e]
|
Sep 02 12:59:19 node-2 mariadbd[5558]: src/GCache_seqno.cpp:32(gcache::GCache::seqno_reset(gu::GTID const&))[0x7f297a2055da]
|
Sep 02 12:59:19 node-2 mariadbd[5558]: src/replicator_str.cpp:917(galera::ReplicatorSMM::request_state_transfer(void*, int, wsrep_uuid const&, long, void const*, long))[0x7f297a0e6a10]
|
Sep 02 12:59:20 node-2 mariadbd[5558]: src/replicator_smm.cpp:2722(galera::ReplicatorSMM::process_st_required(void*, int, wsrep_view_info const*))[0x7f297a0cfcc7]
|
Sep 02 12:59:20 node-2 mariadbd[5558]: src/replicator_smm.cpp:3012(galera::ReplicatorSMM::process_prim_conf_change(void*, gcs_act_cchange const&, int, void*))[0x7f297a0d06ff]
|
Sep 02 12:59:20 node-2 mariadbd[5558]: src/galera_gcs.hpp:186(galera::Gcs::resume_recv())[0x7f297a0d0d4d]
|
Sep 02 12:59:20 node-2 mariadbd[5558]: bits/stl_vector.h:677(std::vector<gcs_act_cchange::member, std::allocator<gcs_act_cchange::member> >::~vector())[0x7f297a10236b]
|
Sep 02 12:59:20 node-2 mariadbd[5558]: src/gcs_action_source.cpp:29(Release::~Release())[0x7f297a10285f]
|
Sep 02 12:59:20 node-2 mariadbd[5558]: src/replicator_smm.cpp:404(galera::ReplicatorSMM::async_recv(void*))[0x7f297a0cef10]
|
Sep 02 12:59:20 node-2 mariadbd[5558]: src/wsrep_provider.cpp:292(galera_recv)[0x7f297a0a2581]
|
Sep 02 12:59:20 node-2 mariadbd[5558]: /usr/sbin/mariadbd(_ZN5wsrep18wsrep_provider_v2611run_applierEPNS_21high_priority_serviceE+0x12)[0x56159396cbf2]
|
Sep 02 12:59:20 node-2 mariadbd[5558]: /usr/sbin/mariadbd(+0xdcc9a1)[0x56159362e9a1]
|
Sep 02 12:59:20 node-2 mariadbd[5558]: /usr/sbin/mariadbd(_Z15start_wsrep_THDPv+0x26b)[0x56159361c3bb]
|
Sep 02 12:59:20 node-2 mariadbd[5558]: /usr/sbin/mariadbd(+0xd413a6)[0x5615935a33a6]
|
Sep 02 12:59:20 node-2 mariadbd[5558]: nptl/pthread_create.c:478(start_thread)[0x7f298055e609]
|
Sep 02 12:59:20 node-2 mariadbd[5778]: addr2line: DWARF error: section .debug_info is larger than its filesize! (0x93f189 vs 0x531098)
|
Sep 02 12:59:20 node-2 mariadbd[5558]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7f298014c353]
|
Sep 02 12:59:20 node-2 mariadbd[5558]: Trying to get some variables.
|
Sep 02 12:59:20 node-2 mariadbd[5558]: Some pointers may be invalid and cause the dump to abort.
|
Sep 02 12:59:20 node-2 mariadbd[5558]: Query (0x0): (null)
|
Sep 02 12:59:20 node-2 mariadbd[5558]: Connection ID (thread ID): 2
|
Sep 02 12:59:20 node-2 mariadbd[5558]: Status: NOT_KILLED
|
Sep 02 12:59:20 node-2 mariadbd[5558]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on>
|
Sep 02 12:59:20 node-2 mariadbd[5558]: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mariadbd/ contains
|
Sep 02 12:59:20 node-2 mariadbd[5558]: information that should help you find out what is causing the crash.
|
Sep 02 12:59:20 node-2 mariadbd[5558]: We think the query pointer is invalid, but we will try to print it anyway.
|
Sep 02 12:59:20 node-2 mariadbd[5558]: Query:
|
Sep 02 12:59:20 node-2 mariadbd[5558]: Writing a core file...
|
Attachments
Issue Links
- is caused by
-
MDEV-18931 Rename Mariabackup's xtrabackup_* files to mariadb_backup_*
- Closed
- is duplicated by
-
MDEV-30822 Adjust wsrep_sst_mariabackup.sh to understand both pre-MDEV-18931 and post-MDEV-18931 file naming
- Closed