[MCOL-4933] Columnstore MTR fails Created: 2021-11-25  Updated: 2022-05-25  Resolved: 2022-05-25

Status: Closed
Project: MariaDB ColumnStore
Component/s: installation
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Critical
Reporter: Timofey Turenko
Assignee: Sergey Zefirov
Resolution: Won't Do
Votes: 0
Labels: affects-tests
Environment: SLES 12
Attachments: core-dump1.tar.gz, core-dump2.tar.gz
Sprint: 2021-16

 Description   

Reproducible only on SLES 12.

 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
columnstore.basic                        [ fail ]
        Test ended at 2021-11-25 19:34:26
 
CURRENT_TEST: columnstore.basic
mysqltest: At line 7: query 'CREATE TABLE t1 (a INT, b VARCHAR(255)) ENGINE=columnstore' failed: ER_INTERNAL_ERROR (1815): Internal error: CAL0009: IDB-2044: An internal error occurred.  Check the error log file & contact support.   
 
The result from queries just before the failure was:
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a INT, b VARCHAR(255)) ENGINE=columnstore;
 
 - saving '/var/tmp/mtr/log/columnstore.basic/' to '/var/tmp/mtr/log/columnstore.basic/'
 
Retrying test columnstore.basic, attempt(2/3)...

From syslog:

 
Nov 25 16:32:11 mdbci-rmbsqg3k-1637828109-build env[5684]: Starting PrimitiveServer: st = 1, sq = 10, pw = 128, pq = 10240, nb = 509714
Nov 25 16:32:11 mdbci-rmbsqg3k-1637828109-build env[5684]: PrimProc main process has started
Nov 25 16:32:11 mdbci-rmbsqg3k-1637828109-build systemd[1]: Started mcs-primproc.
Nov 25 17:03:14 mdbci-rmbsqg3k-1637828109-build systemd[1]: mcs-primproc.service: Main process exited, code=killed, status=11/SEGV
Nov 25 17:03:14 mdbci-rmbsqg3k-1637828109-build systemd[1]: mcs-primproc.service: Unit entered failed state.
Nov 25 17:03:14 mdbci-rmbsqg3k-1637828109-build systemd[1]: mcs-primproc.service: Failed with result 'signal'.
Nov 25 17:03:15 mdbci-rmbsqg3k-1637828109-build systemd[1]: mcs-primproc.service: Service RestartSec=100ms expired, scheduling restart.
Nov 25 17:03:15 mdbci-rmbsqg3k-1637828109-build systemd[1]: Stopped mcs-primproc.
Nov 25 17:03:15 mdbci-rmbsqg3k-1637828109-build systemd[1]: Starting mcs-primproc...
Nov 25 17:03:15 mdbci-rmbsqg3k-1637828109-build env[6056]: Starting PrimitiveServer: st = 1, sq = 10, pw = 128, pq = 10240, nb = 509714
Nov 25 17:03:15 mdbci-rmbsqg3k-1637828109-build env[6056]: PrimProc main process has started
Nov 25 17:03:15 mdbci-rmbsqg3k-1637828109-build systemd[1]: Started mcs-primproc.

From core dump:

(gdb) bt
#0  0x00007f2548915680 in __lll_unlock_elision () from /lib64/libpthread.so.0
#1  0x000055806a36a2f8 in primitiveprocessor::BatchPrimitiveProcessor::unlock (this=<optimized out>)
    at /usr/src/debug/MariaDB-/src_0/storage/columnstore/columnstore/primitives/primproc/batchprimitiveprocessor.h:166
#2  primitiveprocessor::BPPV::abort (this=<optimized out>) at /usr/src/debug/MariaDB-/src_0/storage/columnstore/columnstore/primitives/primproc/primitiveserver.cpp:2634
#3  0x000055806a36d5e4 in (anonymous namespace)::BPPHandler::destroyBPP (dieTime=..., bs=..., this=0x7f23e0001890)
    at /usr/src/debug/MariaDB-/src_0/storage/columnstore/columnstore/primitives/primproc/primitiveserver.cpp:1684
#4  (anonymous namespace)::BPPHandler::Destroy::operator() (this=0x7f23d00019e0)
    at /usr/src/debug/MariaDB-/src_0/storage/columnstore/columnstore/primitives/primproc/primitiveserver.cpp:1355
#5  0x00007f25480b87a8 in threadpool::PriorityThreadPool::threadFcn (this=0x55806bbdb2a0, preferredQueue=threadpool::PriorityThreadPool::HIGH)
    at /usr/src/debug/MariaDB-/src_0/storage/columnstore/columnstore/utils/threadpool/prioritythreadpool.cpp:191
#6  0x00007f25497c0b6a in ?? () from /usr/lib64/libboost_thread.so.1.54.0
#7  0x00007f254890b6da in start_thread () from /lib64/libpthread.so.0
#8  0x00007f2546ecd16d in clone () from /lib64/libc.so.6

Server version: 10.6-enterprise, commit ID 57f6f59b634535781e54d207929a6f23131dd8e0

Binaries can be found at:
https://mdbe-test-repo@mdbe-ci-repo.mariadb.net/MariaDBEnterprise/10.6-enterprise-57f6f59b634535781e54d207929a6f23131dd8e0/sles/12/x86_64/
or
https://es-repo.mariadb.net/jenkins/DEVBUILDS/10.6/origin/10.6-enterprise/57f6f59b634535781e54d207929a6f23131dd8e0/RPMS/sles-12/



 Comments   
Comment by Sergey Zefirov [ 2021-11-28 ]

Of interest is this GNU Fortran error: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100352

It appears that __lll_unlock_elision (frame #0 in the backtrace) segfaults when a mutex is released more than once.
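
The double-release scenario described in this comment can be illustrated with a minimal C++ sketch. Calling unlock() on a std::mutex that the calling thread does not hold is undefined behavior, and on glibc builds with lock elision enabled it may surface as a SIGSEGV like the one in the backtrace. The GuardedLock class and run_demo function below are hypothetical names introduced only for illustration; they are not part of ColumnStore and are not the fix applied to BatchPrimitiveProcessor. The sketch assumes the lock is only touched by one thread at a time, because std::unique_lock tracks ownership per object and is not itself thread-safe.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical wrapper for illustration; not part of ColumnStore.
// It remembers whether the mutex is held so that a second unlock()
// becomes a no-op instead of undefined behavior.
class GuardedLock {
 public:
  explicit GuardedLock(std::mutex& m) : lock_(m, std::defer_lock) {}

  // Acquire the mutex unless this object already holds it.
  void lock() {
    if (!lock_.owns_lock()) lock_.lock();
  }

  // Release the mutex only if this object holds it.
  // Returns true if this call actually released the mutex.
  bool unlock() {
    if (!lock_.owns_lock()) return false;
    lock_.unlock();
    return true;
  }

  bool held() const { return lock_.owns_lock(); }

 private:
  std::unique_lock<std::mutex> lock_;
};

// Exercises the double-release path: the first unlock() releases the
// mutex, the second is ignored rather than reaching the raw mutex.
bool run_demo() {
  std::mutex m;
  GuardedLock g(m);
  g.lock();
  bool first = g.unlock();
  bool second = g.unlock();
  assert(!g.held());
  return first && !second;
}
```

A guard of this kind only hides the symptom. If two code paths both believe they own the lock, the underlying ownership logic still needs to be reviewed.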

Comment by Roman [ 2021-12-09 ]

SLES 12 isn't officially supported.

Comment by Sergei Golubchik [ 2022-03-09 ]

SLES 12 most definitely is officially supported.

Here's the latest engineering policy: https://mariadb.com/wp-content/uploads/2021/12/mariadb-engineering-policies-v4-09_policy_1130.pdf#section.3

Generated at Thu Feb 08 02:54:06 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.