[MCOL-4088] workernode or controllernode sporadically crash with SIGABRT in __gnu_cxx::__verbose_terminate_handler() upon Columnstore plugin package installation Created: 2020-06-20  Updated: 2020-11-09  Resolved: 2020-11-09

Status: Closed
Project: MariaDB ColumnStore
Component/s: installation
Affects Version/s: 1.5.2
Fix Version/s: 5.5.1

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Roman
Resolution: Fixed Votes: 0
Labels: None
Environment:

1.5.2-1 / packages from bb-10.5-cs 19d09e49912, tarbuildnum 33657 / Ubuntu 16.04.6 Xenial / 4.15.0-106-generic / x86_64 ; VirtualBox 6.1, Debian Stretch host, guests installed from the official .iso images, run with default VirtualBox settings (only memory and disk size adjusted)


Attachments: File controllernode_crash.tar.gz     File workernode_crash.tar.gz    
Issue Links:
PartOf
is part of MCOL-4134 Clean and fix remaining columnstore c... Closed

 Description   

The crashes happen occasionally, without any obvious reason, upon installation of mariadb-server and mariadb-plugin-columnstore, on the same VM image where the installation works uneventfully most of the time.

Stacktrace:
 #0  0x00007fc7044b4428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
         resultvar = 0
         pid = 9048
         selftid = 9048
 #1  0x00007fc7044b602a in __GI_abort () at abort.c:89
         save_stage = 2
         act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 16 times>}}, sa_flags = 0, sa_restorer = 0x564b301c2618}
         sigs = {__val = {32, 0 <repeats 15 times>}}
 #2  0x00007fc704be184d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #3  0x00007fc704bdf6b6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #4  0x00007fc704bdf701 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #5  0x00007fc704bdf919 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #6  0x00007fc707c116fe in idbdatafile::IDBPolicy::init(bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) () from /usr/lib/x86_64-linux-gnu/libidbdatafile.so
 No symbol table info available.
 #7  0x00007fc707c121dd in idbdatafile::IDBPolicy::configIDBPolicy() () from /usr/lib/x86_64-linux-gnu/libidbdatafile.so
 No symbol table info available.
 #8  0x0000564b2f34d062 in ?? ()
 No symbol table info available.
 #9  0x00007fc70449f830 in __libc_start_main (main=0x564b2f34cf90, argc=3, argv=0x7ffc7c18cbf8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffc7c18cbe8) at ../csu/libc-start.c:291
         result = <optimized out>
         unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, 4368315612961404667, 94880914529056, 140722390486000, 0, 0, 8066503001669072635, 8036402091289966331}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x7ffc7c18cc18, 0x7fc70b292168}, data = {prev = 0x0, cleanup = 0x0, canceltype = 2081999896}}}
         not_first_call = <optimized out>
 #10 0x0000564b2f34db49 in ?? ()
 No symbol table info available.
StacktraceAddressSignature: /usr/bin/workernode:6:/lib/x86_64-linux-gnu/libc-2.23.so+35428:/lib/x86_64-linux-gnu/libc-2.23.so+3702a:/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21+8f84d:/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21+8d6b6:/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21+8d701:/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21+8d919:/usr/lib/x86_64-linux-gnu/libidbdatafile.so+1f6fe:/usr/lib/x86_64-linux-gnu/libidbdatafile.so+201dd:/usr/bin/workernode+d062:/lib/x86_64-linux-gnu/libc-2.23.so+20830:/usr/bin/workernode+db49
StacktraceTop:
 __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 idbdatafile::IDBPolicy::init(bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) () from /usr/lib/x86_64-linux-gnu/libidbdatafile.so
Tags: xenial third-party-packages
ThreadStacktrace:
 .
 Thread 2 (Thread 0x7fc7016eb700 (LWP 9058)):
 #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
 No locals.
 #1  0x00007fc70811b19b in threadpool::ThreadPool::pruneThread() () from /usr/lib/x86_64-linux-gnu/libthreadpool.so
 No symbol table info available.
 #2  0x00007fc7066055d5 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.58.0
 No symbol table info available.
 #3  0x00007fc70ae556ba in start_thread (arg=0x7fc7016eb700) at pthread_create.c:333
         __res = <optimized out>
         pd = 0x7fc7016eb700
         now = <optimized out>
         unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140492699252480, -8067915554564114693, 0, 140722390485087, 140492699253184, 94880929547504, 8036395707065399035, 8036414354446448379}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
         not_first_call = <optimized out>
         pagesize_m1 = <optimized out>
         sp = <optimized out>
         freesize = <optimized out>
         __PRETTY_FUNCTION__ = "start_thread"
 #4  0x00007fc70458641d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
 No locals.
 .
 Thread 1 (Thread 0x7fc70b260740 (LWP 9048)):
 #0  0x00007fc7044b4428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
         resultvar = 0
         pid = 9048
         selftid = 9048
 #1  0x00007fc7044b602a in __GI_abort () at abort.c:89
         save_stage = 2
         act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 16 times>}}, sa_flags = 0, sa_restorer = 0x564b301c2618}
         sigs = {__val = {32, 0 <repeats 15 times>}}
 #2  0x00007fc704be184d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #3  0x00007fc704bdf6b6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #4  0x00007fc704bdf701 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #5  0x00007fc704bdf919 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #6  0x00007fc707c116fe in idbdatafile::IDBPolicy::init(bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) () from /usr/lib/x86_64-linux-gnu/libidbdatafile.so
 No symbol table info available.
 #7  0x00007fc707c121dd in idbdatafile::IDBPolicy::configIDBPolicy() () from /usr/lib/x86_64-linux-gnu/libidbdatafile.so
 No symbol table info available.
 #8  0x0000564b2f34d062 in ?? ()
 No symbol table info available.
 #9  0x00007fc70449f830 in __libc_start_main (main=0x564b2f34cf90, argc=3, argv=0x7ffc7c18cbf8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffc7c18cbe8) at ../csu/libc-start.c:291
         result = <optimized out>
         unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, 4368315612961404667, 94880914529056, 140722390486000, 0, 0, 8066503001669072635, 8036402091289966331}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x7ffc7c18cc18, 0x7fc70b292168}, data = {prev = 0x0, cleanup = 0x0, canceltype = 2081999896}}}
         not_first_call = <optimized out>
 #10 0x0000564b2f34db49 in ?? ()
 No symbol table info available.
Title: workernode crashed with SIGABRT in __gnu_cxx::__verbose_terminate_handler()

 #0  0x00007f8e2f4c1428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
         resultvar = 0
         pid = 9289
         selftid = 9289
 #1  0x00007f8e2f4c302a in __GI_abort () at abort.c:89
         save_stage = 2
         act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 13 times>, 140734883982640, 140248655882267, 94179417462920}}, sa_flags = 801028224, sa_restorer = 0x55a7dab91888}
         sigs = {__val = {32, 0 <repeats 15 times>}}
 #2  0x00007f8e2fbee84d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #3  0x00007f8e2fbec6b6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #4  0x00007f8e2fbec701 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #5  0x00007f8e2fbec919 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #6  0x00007f8e32c1e6fe in idbdatafile::IDBPolicy::init(bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) () from /usr/lib/x86_64-linux-gnu/libidbdatafile.so
 No symbol table info available.
 #7  0x00007f8e32c1f1dd in idbdatafile::IDBPolicy::configIDBPolicy() () from /usr/lib/x86_64-linux-gnu/libidbdatafile.so
 No symbol table info available.
 #8  0x000055a7d8b51378 in ?? ()
 No symbol table info available.
 #9  0x00007f8e2f4ac830 in __libc_start_main (main=0x55a7d8b51220, argc=2, argv=0x7fff64c47888, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff64c47878) at ../csu/libc-start.c:291
         result = <optimized out>
         unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, 7434466455411384146, 94179383652128, 140734883985536, 0, 0, 3719409514439139154, 3710799113800539986}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x7fff64c478a0, 0x7f8e3629f168}, data = {prev = 0x0, cleanup = 0x0, canceltype = 1690597536}}}
         not_first_call = <optimized out>
 #10 0x000055a7d8b52f49 in ?? ()
 No symbol table info available.
StacktraceAddressSignature: /usr/bin/controllernode:6:/lib/x86_64-linux-gnu/libc-2.23.so+35428:/lib/x86_64-linux-gnu/libc-2.23.so+3702a:/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21+8f84d:/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21+8d6b6:/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21+8d701:/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21+8d919:/usr/lib/x86_64-linux-gnu/libidbdatafile.so+1f6fe:/usr/lib/x86_64-linux-gnu/libidbdatafile.so+201dd:/usr/bin/controllernode+f378:/lib/x86_64-linux-gnu/libc-2.23.so+20830:/usr/bin/controllernode+10f49
StacktraceTop:
 __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 idbdatafile::IDBPolicy::init(bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) () from /usr/lib/x86_64-linux-gnu/libidbdatafile.so
Tags: xenial third-party-packages
ThreadStacktrace:
 .
 Thread 2 (Thread 0x7f8e2c6f8700 (LWP 9302)):
 #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
 No locals.
 #1  0x00007f8e3312819b in threadpool::ThreadPool::pruneThread() () from /usr/lib/x86_64-linux-gnu/libthreadpool.so
 No symbol table info available.
 #2  0x00007f8e316125d5 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.58.0
 No symbol table info available.
 #3  0x00007f8e35e626ba in start_thread (arg=0x7f8e2c6f8700) at pthread_create.c:333
         __res = <optimized out>
         pd = 0x7f8e2c6f8700
         now = <optimized out>
         unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140248607590144, -3719068704764698798, 0, 140734883984623, 140248607590848, 94179417322736, 3710806016627728210, 3710787404626419538}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
         not_first_call = <optimized out>
         pagesize_m1 = <optimized out>
         sp = <optimized out>
         freesize = <optimized out>
         __PRETTY_FUNCTION__ = "start_thread"
 #4  0x00007f8e2f59341d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
 No locals.
 .
 Thread 1 (Thread 0x7f8e3626d740 (LWP 9289)):
 #0  0x00007f8e2f4c1428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
         resultvar = 0
         pid = 9289
         selftid = 9289
 #1  0x00007f8e2f4c302a in __GI_abort () at abort.c:89
         save_stage = 2
         act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 13 times>, 140734883982640, 140248655882267, 94179417462920}}, sa_flags = 801028224, sa_restorer = 0x55a7dab91888}
         sigs = {__val = {32, 0 <repeats 15 times>}}
 #2  0x00007f8e2fbee84d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #3  0x00007f8e2fbec6b6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #4  0x00007f8e2fbec701 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #5  0x00007f8e2fbec919 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #6  0x00007f8e32c1e6fe in idbdatafile::IDBPolicy::init(bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) () from /usr/lib/x86_64-linux-gnu/libidbdatafile.so
 No symbol table info available.
 #7  0x00007f8e32c1f1dd in idbdatafile::IDBPolicy::configIDBPolicy() () from /usr/lib/x86_64-linux-gnu/libidbdatafile.so
 No symbol table info available.
 #8  0x000055a7d8b51378 in ?? ()
 No symbol table info available.
 #9  0x00007f8e2f4ac830 in __libc_start_main (main=0x55a7d8b51220, argc=2, argv=0x7fff64c47888, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff64c47878) at ../csu/libc-start.c:291
         result = <optimized out>
         unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, 7434466455411384146, 94179383652128, 140734883985536, 0, 0, 3719409514439139154, 3710799113800539986}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x7fff64c478a0, 0x7f8e3629f168}, data = {prev = 0x0, cleanup = 0x0, canceltype = 1690597536}}}
         not_first_call = <optimized out>
 #10 0x000055a7d8b52f49 in ?? ()
 No symbol table info available.
Title: controllernode crashed with SIGABRT in __gnu_cxx::__verbose_terminate_handler()

The suggestion from the first report to upgrade ubuntu-keyring didn't help; the second report was produced with it already upgraded.

The attached archive contains the Ubuntu crash report, logs from /var/log/mariadb/columnstore, files from /tmp/columnstore_tmp_files, and stdout of the installation.



 Comments   
Comment by Elena Stepanova [ 2020-06-22 ]

The same or similar crashes happen on CentOS 7 too; they are just much less obvious, as the installation process doesn't complain about anything at all. The symptom is that after a pseudo-successful installation and startup, all attempts to use Columnstore end with

ERROR 1815 (HY000) at line 1: Internal error: CAL0009: Error while calling getSysCatDBRoot    

Restarting, rebooting, etc. doesn't help.
Eventually I found these logs in the trace folder:

[elenst@localhost rpms]$ sudo cat /var/log/mariadb/columnstore/trace/controllernode.30278.log
Date/time: 2020-05-10 18:31:13
Signal: 6
 
/usr/bin/controllernode(+0x23100)[0x55a90fae5100]
/lib64/libpthread.so.0(+0xf630)[0x7f49558a4630]
/lib64/libc.so.6(gsignal+0x37)[0x7f494f13f387]
/lib64/libc.so.6(abort+0x148)[0x7f494f140a78]
/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x165)[0x7f494f8397d5]
/lib64/libstdc++.so.6(+0x5e746)[0x7f494f837746]
/lib64/libstdc++.so.6(+0x5e773)[0x7f494f837773]
/lib64/libstdc++.so.6(+0x5e993)[0x7f494f837993]
/lib64/libidbdatafile.so(_ZN11idbdatafile9IDBPolicy4initEbbRKSsl+0x390)[0x7f49527729b0]
/lib64/libidbdatafile.so(_ZN11idbdatafile9IDBPolicy15configIDBPolicyEv+0x393)[0x7f49527730b3]
/usr/bin/controllernode(+0xf18b)[0x55a90fad118b]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f494f12b555]
/usr/bin/controllernode(+0x10e3f)[0x55a90fad2e3f]
 
 
[elenst@localhost rpms]$ sudo cat /var/log/mariadb/columnstore/trace/PrimProc.30280.log 
Date/time: 2020-05-10 18:31:13
Signal: 6
 
/usr/bin/PrimProc(+0x8d710)[0x56229bfb1710]
/lib64/libpthread.so.0(+0xf630)[0x7fafbdfe1630]
/lib64/libc.so.6(gsignal+0x37)[0x7fafb69df387]
/lib64/libc.so.6(abort+0x148)[0x7fafb69e0a78]
/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x165)[0x7fafb70d97d5]
/lib64/libstdc++.so.6(+0x5e746)[0x7fafb70d7746]
/lib64/libstdc++.so.6(+0x5e773)[0x7fafb70d7773]
/lib64/libstdc++.so.6(+0x5e993)[0x7fafb70d7993]
/lib64/libidbdatafile.so(_ZN11idbdatafile9IDBPolicy4initEbbRKSsl+0x390)[0x7fafba24e9b0]
/lib64/libidbdatafile.so(_ZN11idbdatafile9IDBPolicy15configIDBPolicyEv+0x393)[0x7fafba24f0b3]
/usr/bin/PrimProc(+0x26704)[0x56229bf4a704]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fafb69cb555]
/usr/bin/PrimProc(+0x3f337)[0x56229bf63337]

Somebody should really come up with a decent course of action that can be presented to users, as currently about half of my attempts to install Columnstore end with this, and even re-installation of the package doesn't necessarily help.

Comment by Roman [ 2020-07-13 ]

We haven't been able to reproduce the issue with either Docker or VMs. I think there is a potential race when starting workernode and controllernode, so we will migrate towards socket activation in the systemd units.

Comment by David Hall (Inactive) [ 2020-09-08 ]

This may be a situation where controllernode or workernode attempted to open the tablelocks file, but the file wasn't available at the precise moment of the attempt. We need to add retry/recovery code for this situation, and better handling of the exception.
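
To illustrate the retry/recovery idea, here is a minimal sketch rather than the actual fix. It assumes the failure is a file that cannot be opened at startup; the ticket does not show which exception IDBPolicy::init throws or where the tablelocks file lives. The helper names openWithRetry and startupCheck, the attempt count, and the delay are all hypothetical and are not part of the ColumnStore code base.

```cpp
#include <chrono>
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical helper: try to open `path` up to `maxAttempts` times,
// sleeping `delay` between attempts. Throws std::runtime_error if
// every attempt fails.
std::ifstream openWithRetry(const std::string& path, int maxAttempts,
                            std::chrono::milliseconds delay)
{
    for (int attempt = 1; attempt <= maxAttempts; ++attempt)
    {
        std::ifstream in(path);
        if (in.is_open())
            return in;
        if (attempt < maxAttempts)
            std::this_thread::sleep_for(delay);
    }
    throw std::runtime_error("could not open " + path + " after " +
                             std::to_string(maxAttempts) + " attempts");
}

// Hypothetical startup wrapper: returns 0 on success and 1 on failure.
// The exception is caught and reported here, so it never propagates out
// of main() and triggers std::terminate() followed by abort().
int startupCheck(const std::string& path)
{
    try
    {
        // Assumed values: 3 attempts, 10 ms apart.
        std::ifstream f = openWithRetry(path, 3, std::chrono::milliseconds(10));
        return 0;
    }
    catch (const std::exception& e)
    {
        std::cerr << "startup failed: " << e.what() << std::endl;
        return 1;
    }
}
```

A process using this pattern would return the result of startupCheck from main(), so a missing file produces a logged message and a non-zero exit status instead of a SIGABRT and a core dump.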
