[MCOL-4807] SegFault when using Distributed Aggregate Functions Created: 2021-07-09  Updated: 2022-02-22  Resolved: 2022-02-22

Status: Closed
Project: MariaDB ColumnStore
Component/s: ExeMgr
Affects Version/s: 5.5.2
Fix Version/s: 6.3.1

Type: Bug Priority: Blocker
Reporter: Gale Temp Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: None
Environment:

Centos 8 Linux 4.18.0-305.7.1.el8_4.x86_64 run inside Hyper-V



 Description   

I run the following query and the connection get closed with error.

SELECT m.Year, m.Origin, m.Dest, m.PAX, REGR_INTERCEPT(CAST(p.PAX AS double),CAST(p.Year AS double)) INTERCEPT, REGR_SLOPE(CAST(p.PAX AS double),CAST(p.Year AS double)) SLOPE, REGR_R2(CAST(p.PAX AS double),CAST(p.Year AS double)) R2
FROM MarketYearSize m
LEFT JOIN MarketYearSize p ON m.Origin = p.Origin AND m.Dest = p.Dest AND m.Year >= p.year AND m.year - 5 <= p.Year
GROUP BY m.Year, m.Origin, m.Dest, m.PAX;

When I look at the trace I see the following:

Date/time: 2021-07-08 19:46:56
Signal: 11
/usr/bin/ExeMgr(+0x2c735)[0x5597030e5735]
/lib64/libpthread.so.0(+0x12b20)[0x7f0cae9c9b20]
/lib64/librowgroup.so(_ZNK8rowgroup13UserDataStore11getUserDataEj+0x66)[0x7f0cb24d29f6]
/lib64/librowgroup.so(_ZN8rowgroup14RowAggregation6doUDAFERKNS_3RowElllRm+0x988)[0x7f0cb24bccf8]
/lib64/librowgroup.so(_ZN8rowgroup16RowAggregationUM11updateEntryERKNS_3RowE+0x16a)[0x7f0cb24b884a]
/lib64/librowgroup.so(_ZN8rowgroup14RowAggregation12aggregateRowERNS_3RowE+0x25d)[0x7f0cb24bf5bd]
/lib64/librowgroup.so(_ZN8rowgroup14RowAggregation11addRowGroupEPKNS_8RowGroupERSt6vectorINS_3Row7PointerESaIS6_EE+0xab)[0x7f0cb24b5e6b]
/lib64/libjoblist.so(_ZN7joblist18TupleAggregateStep26threadedAggregateRowGroupsEj+0x271a)[0x7f0cb34c38fa]
/lib64/libthreadpool.so(_ZN10threadpool10ThreadPool11beginThreadEv+0x560)[0x7f0cadefe8d0]
/lib64/libboost_thread.so.1.66.0(+0x155e1)[0x7f0caf8915e1]
/lib64/libpthread.so.0(+0x814a)[0x7f0cae9bf14a]
/lib64/libc.so.6(clone+0x43)[0x7f0caced1dc3]

I also notice that it works if I use less aggregate functions, however the results seems correct only if I use 1 aggregate function. I tried reducing the number of rows but it does not solve the problem.



 Comments   
Comment by David Hall (Inactive) [ 2021-07-09 ]

It would help us diagnose this if you could give us the CREATE TABLE statements and a data file to load the tables.

Comment by Gale Temp [ 2021-07-09 ]

Thanks for your quick reply. See the code below.

DROP TABLE IF EXISTS MarketSize;
CREATE TABLE MarketSize (`YEAR` integer, ORIGIN varchar(3), DEST varchar(3), PAX integer) ENGINE=COLUMNSTORE COMMENT='compression=1';
 
INSERT INTO MarketSize VALUES
	 (2016,'MCO','BWI',48648),
	 (2015,'MCO','BWI',36833),
	 (2017,'MCO','BWI',52975),
	 (2018,'MCO','BWI',54494),
	 (2019,'MCO','BWI',50467),
	 (2020,'MCO','BWI',13263);
 
SELECT m.Year, m.Origin, m.Dest, m.PAX,   REGR_INTERCEPT(CAST(p.PAX AS double),CAST(p.Year AS double)) INTERCEPT,  REGR_SLOPE(CAST(p.PAX AS double),CAST(p.Year AS double)) SLOPE, REGR_R2(CAST(p.PAX AS double),CAST(p.Year AS double)) R2
FROM MarketSize m
LEFT JOIN MarketSize p ON m.Origin = p.Origin AND m.Dest = p.Dest AND m.Year >= p.year AND m.year - 5 <= p.Year
GROUP BY m.Year, m.Origin, m.Dest, m.PAX;

Comment by Gale Temp [ 2021-07-09 ]

If you use only one aggregate function (as in the query below) it completes.

SELECT m.Year, m.Origin, m.Dest, m.PAX,   REGR_INTERCEPT(CAST(p.PAX AS double),CAST(p.Year AS double)) INTERCEPT   -- ,  REGR_SLOPE(CAST(p.PAX AS double),CAST(p.Year AS double)) SLOPE, REGR_R2(CAST(p.PAX AS double),CAST(p.Year AS double)) R2
FROM MarketSize m
LEFT JOIN MarketSize p ON m.Origin = p.Origin AND m.Dest = p.Dest AND m.Year >= p.year AND m.year - 5 <= p.Year
GROUP BY m.Year, m.Origin, m.Dest, m.PAX;

Comment by Gale Temp [ 2021-07-09 ]

Also if you run the following two queries separately, the correlation gives different results. I hope this helps.

SELECT m.Year, m.Origin, m.Dest, m.PAX,  CORR(CAST(p.PAX AS double),CAST(p.Year AS double)) CORRELATION  
FROM MarketSize m
LEFT JOIN MarketSize p ON m.Origin = p.Origin AND m.Dest = p.Dest AND m.Year >= p.year AND m.year - 5 <= p.Year
GROUP BY m.Year, m.Origin, m.Dest, m.PAX;
 
----------------------------------------------------------------------------------
 
SELECT m.Year, m.Origin, m.Dest, m.PAX,  CORR(CAST(p.PAX AS double),CAST(p.Year AS double)) CORRELATION ,   REGR_INTERCEPT(CAST(p.PAX AS double),CAST(p.Year AS double)) INTERCEPT 
FROM MarketSize m
LEFT JOIN MarketSize p ON m.Origin = p.Origin AND m.Dest = p.Dest AND m.Year >= p.year AND m.year - 5 <= p.Year
GROUP BY m.Year, m.Origin, m.Dest, m.PAX;

Comment by David Hall (Inactive) [ 2021-07-14 ]

I ran the above tests against Columnstore 6.1.1 (Due in July 2021). I found no issues. The problem has been fixed.

MariaDB [dhall]> SELECT m.Year, m.Origin, m.Dest, m.PAX,
    ->     REGR_INTERCEPT(CAST(p.PAX AS double),CAST(p.Year AS double)) INTERCEPT,
    ->     REGR_SLOPE(CAST(p.PAX AS double),CAST(p.Year AS double)) SLOPE, REGR_R2(CAST(p.PAX AS double),CAST(p.Year AS double)) R2
    -> FROM MarketSize m
    -> LEFT JOIN MarketSize p ON m.Origin = p.Origin AND m.Dest = p.Dest AND m.Year >= p.year AND m.year - 5 <= p.Year
    -> GROUP BY m.Year, m.Origin, m.Dest, m.PAX;
+------+--------+------+-------+-------------------+---------------------+---------------------+
| Year | Origin | Dest | PAX   | INTERCEPT         | SLOPE               | R2                  |
+------+--------+------+-------+-------------------+---------------------+---------------------+
| 2017 | MCO    | BWI  | 52975 |         -16224984 |                8071 |  0.9330715627666164 |
| 2016 | MCO    | BWI  | 48648 |         -23770392 |               11815 |                   1 |
| 2015 | MCO    | BWI  | 36833 |              NULL |                NULL |                NULL |
| 2019 | MCO    | BWI  | 50467 |        -6630410.4 |              3311.4 |  0.5600423661101044 |
| 2018 | MCO    | BWI  | 54494 |         -11508324 |                5731 |  0.8561297441764697 |
| 2020 | MCO    | BWI  | 13263 | 6433874.142857143 | -3167.8285714285716 | 0.14147668561205626 |
+------+--------+------+-------+-------------------+---------------------+---------------------+
6 rows in set (0.084 sec)
 
MariaDB [dhall]> 
MariaDB [dhall]> SELECT m.Year, m.Origin, m.Dest, m.PAX,   
    ->     REGR_INTERCEPT(CAST(p.PAX AS double),CAST(p.Year AS double)) INTERCEPT 
    -> FROM MarketSize m
    -> LEFT JOIN MarketSize p ON m.Origin = p.Origin AND m.Dest = p.Dest AND m.Year >= p.year AND m.year - 5 <= p.Year
    -> GROUP BY m.Year, m.Origin, m.Dest, m.PAX;
+------+--------+------+-------+-------------------+
| Year | Origin | Dest | PAX   | INTERCEPT         |
+------+--------+------+-------+-------------------+
| 2017 | MCO    | BWI  | 52975 |         -16224984 |
| 2016 | MCO    | BWI  | 48648 |         -23770392 |
| 2015 | MCO    | BWI  | 36833 |              NULL |
| 2019 | MCO    | BWI  | 50467 |        -6630410.4 |
| 2018 | MCO    | BWI  | 54494 |         -11508324 |
| 2020 | MCO    | BWI  | 13263 | 6433874.142857143 |
+------+--------+------+-------+-------------------+
6 rows in set (0.082 sec)
 
MariaDB [dhall]> 
MariaDB [dhall]> SELECT m.Year, m.Origin, m.Dest, m.PAX,  
    ->     CORR(CAST(p.PAX AS double),CAST(p.Year AS double)) CORRELATION  
    -> FROM MarketSize m
    -> LEFT JOIN MarketSize p ON m.Origin = p.Origin AND m.Dest = p.Dest AND m.Year >= p.year AND m.year - 5 <= p.Year
    -> GROUP BY m.Year, m.Origin, m.Dest, m.PAX;
+------+--------+------+-------+----------------------+
| Year | Origin | Dest | PAX   | CORRELATION          |
+------+--------+------+-------+----------------------+
| 2017 | MCO    | BWI  | 52975 |   0.9659562944391513 |
| 2016 | MCO    | BWI  | 48648 |                    1 |
| 2015 | MCO    | BWI  | 36833 |                 NULL |
| 2019 | MCO    | BWI  | 50467 |   0.7483597838674286 |
| 2018 | MCO    | BWI  | 54494 |   0.9252727944646756 |
| 2020 | MCO    | BWI  | 13263 | -0.37613386661141834 |
+------+--------+------+-------+----------------------+
6 rows in set (0.088 sec)
 
MariaDB [dhall]> 
MariaDB [dhall]> SELECT m.Year, m.Origin, m.Dest, m.PAX, 
    ->      CORR(CAST(p.PAX AS double),CAST(p.Year AS double)) CORRELATION,   
    ->      REGR_INTERCEPT(CAST(p.PAX AS double),CAST(p.Year AS double)) INTERCEPT
    ->  FROM MarketSize m
    ->  LEFT JOIN MarketSize p ON m.Origin = p.Origin AND m.Dest = p.Dest AND m.Year >= p.year AND m.year - 5 <= p.Year
    ->  GROUP BY m.Year, m.Origin, m.Dest, m.PAX;
+------+--------+------+-------+----------------------+-------------------+
| Year | Origin | Dest | PAX   | CORRELATION          | INTERCEPT         |
+------+--------+------+-------+----------------------+-------------------+
| 2017 | MCO    | BWI  | 52975 |   0.9659562944391513 |         -16224984 |
| 2016 | MCO    | BWI  | 48648 |                    1 |         -23770392 |
| 2015 | MCO    | BWI  | 36833 |                 NULL |              NULL |
| 2019 | MCO    | BWI  | 50467 |   0.7483597838674286 |        -6630410.4 |
| 2018 | MCO    | BWI  | 54494 |   0.9252727944646756 |         -11508324 |
| 2020 | MCO    | BWI  | 13263 | -0.37613386661141834 | 6433874.142857143 |
+------+--------+------+-------+----------------------+-------------------+
6 rows in set (0.100 sec)

Comment by Gale Temp [ 2021-07-15 ]

Thanks for the update. I'll wait for the new version.

Generated at Thu Feb 08 02:53:09 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.