[MCOL-1040] ERROR 2013 (HY000): Lost connection to MySQL server during query
Created: 2017-11-19  Updated: 2017-12-05  Resolved: 2017-12-05

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: None
Fix Version/s: 1.0.12, 1.1.3

Type: Bug
Priority: Major
Reporter: hiller1
Assignee: Daniel Lee (Inactive)
Resolution: Fixed
Votes: 0
Labels: None

Attachments: all.sql.gz, columnstoreSupportReport.columnstore-1.tar.gz, sql_by_day.sql
Sprint: 2017-24

 Description   

I have a very large SQL query that LEFT JOINs 30 tables.

max_allowed_packet = 1G

Running this query gives:

ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 3
Current database: test

ERROR 2013 (HY000): Lost connection to MySQL server during query



 Comments   
Comment by Andrew Hutchings (Inactive) [ 2017-11-19 ]

Can you please provide more information on this one, specifically:

Comment by hiller1 [ 2017-11-20 ]

columnstoreSupportReport.columnstore-1.tar.gz

Server version: 10.1.26-MariaDB Columnstore 1.0.11-1

Comment by hiller1 [ 2017-11-20 ]

sql_by_day.sql (the very large SQL query)

Comment by Andrew Hutchings (Inactive) [ 2017-11-20 ]

Many thanks for the information.

OK, so, this is a crash that appears to be happening when linking the return columns from a subquery to the parent query during ColumnStore's optimisation phase. Unfortunately there is not enough information in the logs to explain where in the query and why this is happening.

Can you please send us the "SHOW CREATE TABLE" output for the tables involved? We should be able to reproduce it from this and track down where and why it is happening.

Comment by hiller1 [ 2017-11-20 ]

all.sql.gz (schema dump), generated with:

/usr/local/mariadb/columnstore/mysql/bin/mysqldump --defaults-extra-file=/usr/local/mariadb/columnstore/mysql/my.cnf --single-transaction -uroot -d -A > all.sql

Comment by Andrew Hutchings (Inactive) [ 2017-11-29 ]

Crash happens in replaceRefCol() on this line:

ReturnedColumn* tmp = derivedColList[sc->colPosition()]->clone();

sc is finup_core.lend_repay_record.repaid_time. sc->colPosition() returns -1, which is then used as the index into derivedColList, hence the crash.

(gdb) p *sc
$3 = {<execplan::ReturnedColumn> = {<execplan::TreeNode> = {
      _vptr.TreeNode = 0x7fbe9cbfbd78 <vtable for execplan::SimpleColumn+16>, 
      fResult = {intVal = 0, uintVal = 0, origIntVal = 0, dummy = 0, 
        doubleVal = 0, floatVal = 0, boolVal = false, strVal = "", 
        decimalVal = {value = 0, scale = 0 '\000', precision = 0 '\000'}, 
        valueConverted = false}, fResultType = {colWidth = 8, 
        constraintType = execplan::CalpontSystemCatalog::NO_CONSTRAINT, 
        colDataType = execplan::CalpontSystemCatalog::DATETIME, ddn = {
          dictOID = -2147483648, listOID = -2147483648, treeOID = -2147483648, 
          compressionType = 0}, defaultValue = "", colPosition = 16, 
        scale = 0, precision = 10, compressionType = 0, columnOID = 33946, 
        autoincrement = false, nextvalue = 1}, fOperationType = {colWidth = 0, 
        constraintType = execplan::CalpontSystemCatalog::NO_CONSTRAINT, 
        colDataType = execplan::CalpontSystemCatalog::MEDINT, ddn = {
          dictOID = 0, listOID = 0, treeOID = 0, compressionType = 0}, 
        defaultValue = "", colPosition = -1, scale = 0, precision = -1, 
        compressionType = 0, columnOID = 0, autoincrement = false, 
        nextvalue = 0}, fRegex = {px = 0x0, pn = {pi_ = 0x0}}, 
      tmp = '\000' <repeats 311 times>, fDerivedTable = "", fRefCount = 0, 
      fDerivedRefCol = 0x0}, fReturnAll = false, fSessionID = 0, 
    fSequence = -1, fCardinality = 0, fAlias = "", fDistinct = false, 
    fJoinInfo = 0, fAsc = true, fNullsFirst = true, 
    fOrderPos = 18446744073709551615, fColSource = 0, fColPosition = -1, 
    fSimpleColumnList = std::vector of length 0, capacity 0, 
    fAggColumnList = std::vector of length 0, capacity 0, 
    fWindowFunctionColumnList = std::vector of length 0, capacity 0, 
    fHasAggregate = false, fData = "", fErrMsg = "", fInputIndex = 4294967295, 
    fOutputIndex = 4294967295, fExpressionId = 4294967295}, 
  fSchemaName = "finup_core", fTableName = "lend_repay_record", 
  fColumnName = "repaid_time", fOid = 33946, fTableAlias = "lh02", 
  fData = "finup_core.lh02.repaid_time", fIndexName = "", fViewName = "", 
  fIsInfiniDB = true}
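
For illustration, the failure mode described above can be sketched in isolation: a column whose position is -1 is used as an index into a list of derived columns. This is only a minimal sketch, not the ColumnStore code and not the change made in the pull request; Column, cloneDerivedColumn, demo, and the bounds guard are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a returned column; not the execplan classes.
struct Column {
    std::string name;
    int colPosition;  // -1 means "no position in a derived table's column list"
};

// Returns a copy of the derived column that `sc` points at, or nullptr when
// `sc` carries no valid position. Indexing the vector with -1 directly would
// be an out-of-bounds access (undefined behaviour), which is the kind of
// access the crashing line performs.
std::unique_ptr<Column> cloneDerivedColumn(const std::vector<Column>& derivedColList,
                                           const Column& sc) {
    if (sc.colPosition < 0) {
        return nullptr;  // not a reference into the derived column list
    }
    const std::size_t idx = static_cast<std::size_t>(sc.colPosition);
    if (idx >= derivedColList.size()) {
        return nullptr;  // position is past the end of the list
    }
    return std::unique_ptr<Column>(new Column(derivedColList[idx]));
}

// Small self-check exercising both paths.
inline void demo() {
    const std::vector<Column> derived = {{"a", 0}, {"c", 1}};
    // A valid position yields a copy of the matching derived column.
    assert(cloneDerivedColumn(derived, Column{"c", 1})->name == "c");
    // A position of -1 (as fColPosition shows in the gdb output) is rejected.
    assert(cloneDerivedColumn(derived, Column{"repaid_time", -1}) == nullptr);
}
```

A guard like this only avoids the invalid read; it does not answer why a column that is not a derived-table reference reaches this code path in the first place.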

Simplified down to just this part:

SELECT
        lh02.core_lend_request_id,
        max(date(lh02.repaid_time)) AS max_repaid_time,
        lh01.real_due_date AS 'real_due_date'
    FROM
        finup_core.lend_repay_record lh02
    LEFT JOIN (
        SELECT
            core_lend_request_id,
            due_date AS real_due_date
        FROM
            finup_core.lend_repay_record
        WHERE
            TIMESTAMPDIFF(DAY,date(due_date),date("''' + date01 + '''")) =1
    ) lh01 ON lh02.core_lend_request_id = lh01.core_lend_request_id
    WHERE
        TIMESTAMPDIFF(DAY,date(lh02.due_date),DATE("''' + date01 + '''")) >= 0
    AND TIMESTAMPDIFF(DAY,date(lh02.repaid_time),DATE(lh01.real_due_date)) >= 0
    AND lh02.repaid_time IS NOT NULL 
    GROUP BY
        lh02.core_lend_request_id

It is the TIMESTAMPDIFF on lh02.repaid_time that is triggering it; I am not sure why yet.

Comment by Andrew Hutchings (Inactive) [ 2017-11-29 ]

Most simplified form of test case:

create table mcol1040 (a int, b datetime, c datetime) engine=columnstore;
 
select mcol1040.a from mcol1040 left join (select a, c from mcol1040 mc) mcs on mcol1040.a = mcs.a where timestampdiff(DAY, b, mcs.c) >= 0;

The key piece here is mcs.c being inside the timestampdiff. I can't find a way to trigger the crash if the argument is mcol1040.c instead, or if mcs.c is used in a different WHERE condition.

Comment by Andrew Hutchings (Inactive) [ 2017-11-29 ]

The problem appears to be an assumption about what is and is not a derived column. There is a pull request for 1.0; it will be merged up to 1.1 in the regular cycle.

For QA: See my simplified test case in my previous comment.

Comment by Daniel Lee (Inactive) [ 2017-12-05 ]

Builds verified: github source
1.0.12-1

[root@localhost ~]# cat mariadb-columnstore-1.0.12-1-centos7.x86_64.bin.tar.txt
/root/columnstore/mariadb-columnstore-server
commit 25e9d054cd3d05683fade1b974e1730316d256ed
Merge: 89b2ea1 7c52a83
Author: David.Hall <david.hall@mariadb.com>
Date: Tue Nov 21 10:49:11 2017 -0600

Merge pull request #79 from mariadb-corporation/MCOL-954-1.0

MCOL-954 Init vtable state

/root/columnstore/mariadb-columnstore-server/mariadb-columnstore-engine
commit b112e826a2793228f5f3c1312fec5291fc1d8bf5
Merge: 7c2640f b657938
Author: David.Hall <david.hall@mariadb.com>
Date: Fri Dec 1 16:17:28 2017 -0600

Merge pull request #338 from mariadb-corporation/MCOL-1068

MCOL-1068 Improve compression_ratio() procedure

1.1.3-1

/root/columnstore/mariadb-columnstore-server
commit 632e265687674fb66bd1d704bc18032b00dd6b17
Merge: 5e9fe52 200f5be
Author: david hill <david.hill@mariadb.com>
Date: Tue Nov 21 15:22:06 2017 -0600

Merge branch 'develop-1.1' of https://github.com/mariadb-corporation/mariadb-columnstore-server into develop-1.1

/root/columnstore/mariadb-columnstore-server/mariadb-columnstore-engine
commit 4d8026618cfb5377c9a200170848092ce5660f10
Author: david hill <david.hill@mariadb.com>
Date: Wed Nov 29 09:36:24 2017 -0600

change how the os_detect is run on remote nodes

Reproduced the issue in 1.1.0-1 and verified it has been fixed in 1.0.12 and 1.1.3.
