Details
Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Affects Version/s: 10.3.36, 10.4.26
Description
Test case
Install 10.3.36 on vagrant boxes (2 nodes)
shutdown node2
remove 10.3.36 packages from node2
install 10.4.26 packages after updating repo
> server startup is failing after package installation
Error info
2022-08-24 14:48:04 0 [Note] WSREP: Service thread queue flushed.
2022-08-24 14:48:04 0 [Note] WSREP: ####### Assign initial position for certification: 5c410a36-23a8-11ed-a44c-f6f37823dd10:3, protocol version: -1
2022-08-24 14:48:04 0 [ERROR] WSREP: Corrupt buffer header: addr: 0x7f722bd5b530, seqno: 7019267256999739392, size: 825111097, ctx: 0x559652a28678, flags: 14391. store: 46, type: 49
220824 14:48:04 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
GDB stack
(gdb) bt
#0 __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
#1 0x000055906acb5508 in handle_fatal_signal ()
#2 <signal handler called>
#3 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#4 0x00007f6b47065859 in __GI_abort () at abort.c:79
#5 0x00007f6b470d026e in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f6b471fa298 "%s\n")
    at ../sysdeps/posix/libc_fatal.c:155
#6 0x00007f6b470d82fc in malloc_printerr (str=str@entry=0x7f6b471f84c1 "free(): invalid pointer") at malloc.c:5347
#7 0x00007f6b470d9b2c in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:4173
#8 0x00007f6b4654069c in gcache::MemStore::discard (bh=0x7f6afffff528, this=0x55906e177620) at ./gcache/src/gcache_mem_store.hpp:136
#9 gcache::GCache::discard_buffer (this=0x55906e1774f0, bh=0x7f6afffff528, ptr=<optimized out>) at ./gcache/src/GCache_memops.cpp:18
#10 0x00007f6b46540cde in gcache::GCache::discard_tail (this=this@entry=0x55906e1774f0, seqno=seqno@entry=3)
    at ./gcache/src/GCache_memops.cpp:161
#11 0x00007f6b465265da in gcache::GCache::seqno_reset (this=this@entry=0x55906e1774f0, gtid=...) at ./gcache/src/GCache_seqno.cpp:31
#12 0x00007f6b463f5408 in galera::ReplicatorSMM::ReplicatorSMM (this=0x55906e177040, args=<optimized out>)
    at ./galerautils/src/gu_uuid.hpp:203
#13 0x00007f6b463c4f52 in galera_init (gh=0x55906e142ef0, args=0x7fff984bad20) at ./galera/src/wsrep_provider.cpp:48
#14 0x000055906b2b0cbc in wsrep::wsrep_provider_v26::wsrep_provider_v26(wsrep::server_state&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, wsrep::provider::services const&) ()
#15 0x000055906b2ada84 in wsrep::provider::make_provider(wsrep::server_state&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, wsrep::provider::services const&) ()
#16 0x000055906b298d43 in wsrep::server_state::load_provider(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, wsrep::provider::services const&) ()
#17 0x000055906af40db4 in wsrep_init() ()
#18 0x000055906af41416 in wsrep_init_startup(bool) ()
#19 0x000055906a9e2679 in ?? ()
#20 0x000055906a9e7666 in mysqld_main(int, char**) ()
#21 0x00007f6b47067083 in __libc_start_main (main=0x55906a9c2d30 <main>, argc=2, argv=0x7fff984bb908, init=<optimized out>,
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff984bb8f8) at ../csu/libc-start.c:308
#22 0x000055906a9db6be in _start ()
(gdb)
Attachments
Activity
Field | Original Value | New Value |
---|---|---|
Priority | Critical [ 2 ] | Blocker [ 1 ] |
Description | Earlier revision of the description above; identical except for "Install10.3.36" (missing space) in the first test-case step | Current description, as shown under Description above |
Attachment | node2.err [ 65175 ] |
Attachment | node1.err [ 65176 ] |
Attachment | gcache.tar.gz [ 65229 ] |
Status | Open [ 1 ] | Confirmed [ 10101 ] |
Labels | regression |
Affects Version/s | 10.3.36 [ 27513 ] |
Comment |
[ Codership's Galera 4.11 does not have this bug. This is what it shows when recovering the provided galera.cache file:
{noformat}
2022-10-08 22:49:41 0 [Note] WSREP: GCache::RingBuffer initial scan... 0.0% ( 0/1073741848 bytes) complete.
2022-10-08 22:49:44 0 [Note] WSREP: GCache::RingBuffer initial scan...100.0% (1073741848/1073741848 bytes) complete.
2022-10-08 22:49:44 0 [Note] WSREP: Recovering GCache ring buffer: Recovery failed, need to do full reset.
{noformat}
This means that the previous contents of the file are ignored.
MariaDB's Galera version shows:
{noformat}
2022-10-09 11:17:14 0 [Note] WSREP: GCache::RingBuffer initial scan... 0.0% ( 0/2097176 bytes) complete.
2022-10-09 11:17:14 0 [Note] WSREP: GCache::RingBuffer initial scan...100.0% (2097176/2097176 bytes) complete.
2022-10-09 11:17:14 0 [Note] WSREP: Recovering GCache ring buffer: didn't recover any events.
{noformat}
This means that the file is taken as is and its structures were found valid. Since the gcache buffer header format differs between 3.x and 4.x, this inevitably leads to a crash.
The diff between Codership's and MariaDB's gcache sources (only that part) is 3K lines, meaning that at least that part of the Galera library shipped with MariaDB is terribly outdated. ] |
Status | Confirmed [ 10101 ] | In Progress [ 3 ] |
Status | In Progress [ 3 ] | Stalled [ 10000 ] |
Assignee | Alexey [ yurchenko ] | Jan Lindström [ jplindst ] |
Assignee | Jan Lindström [ jplindst ] | Alexey [ yurchenko ] |
Status | Stalled [ 10000 ] | Needs Feedback [ 10501 ] |
Assignee | Alexey [ yurchenko ] | Jan Lindström [ jplindst ] |
Status | Needs Feedback [ 10501 ] | Open [ 1 ] |
Status | Open [ 1 ] | In Progress [ 3 ] |
Status | In Progress [ 3 ] | In Testing [ 10301 ] |
Assignee | Jan Lindström [ jplindst ] | Ramesh Sivaraman [ JIRAUSER48189 ] |
Assignee | Ramesh Sivaraman [ JIRAUSER48189 ] | Jan Lindström [ jplindst ] |
Status | In Testing [ 10301 ] | Stalled [ 10000 ] |
issue.field.resolutiondate | 2022-10-12 04:57:31.0 | 2022-10-12 04:57:31.048 |
Fix Version/s | 10.3.37 [ 28404 ] | |
Fix Version/s | 10.4.27 [ 28405 ] | |
Fix Version/s | 10.4 [ 22408 ] | |
Resolution | Fixed [ 1 ] | |
Status | Stalled [ 10000 ] | Closed [ 6 ] |
Attachment | logs.tar.gz [ 69160 ] |
Zendesk Related Tickets | 129199 |
Did not see this issue in a 10.4.25 > 10.4.26 rolling upgrade.
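The activity-log comment attributes the crash to the gcache buffer header format differing between Galera 3.x and 4.x. As a rough illustration of why reading bytes written with one layout through another layout produces nonsense, the sketch below packs a header with a hypothetical "old" layout and unpacks the same bytes with a hypothetical "new" one. Both layouts, the field names and the sample values are assumptions made for this example; they are not taken from the Galera sources.

```python
import struct

# Hypothetical layouts: little-endian, no padding (assumptions for illustration only).
OLD_HEADER = struct.Struct("<qqqQIi")  # seqno_g, seqno_d, size, ctx, flags, store
NEW_HEADER = struct.Struct("<qQIHbb")  # seqno_g, ctx, size, flags, store, type


def write_old(seqno_g, seqno_d, size, ctx, flags, store):
    """Serialize a header using the hypothetical old layout."""
    return OLD_HEADER.pack(seqno_g, seqno_d, size, ctx, flags, store)


def read_as_new(raw):
    """Interpret the leading bytes of raw using the hypothetical new layout."""
    seqno_g, ctx, size, flags, store, type_ = NEW_HEADER.unpack_from(raw)
    return {
        "seqno_g": seqno_g,
        "ctx": ctx,
        "size": size,
        "flags": flags,
        "store": store,
        "type": type_,
    }


if __name__ == "__main__":
    # An old-layout header: seqno_g=3, seqno_d=-1, size=256, ctx=0, flags=0, store=1.
    raw = write_old(3, -1, 256, 0, 0, 1)
    print(read_as_new(raw))
```

In this toy case the old seqno_d value of -1 is read back as ctx 18446744073709551615, and the store value of 1 is read back as 0, because the fields no longer line up. A reader that accepts such bytes as valid, rather than rejecting the file, would go on to act on the misread fields.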