[MCOL-901] group_concat() consumes a great amount of memory Created: 2017-08-31  Updated: 2020-08-25  Resolved: 2019-03-11

Status: Closed
Project: MariaDB ColumnStore
Component/s: ExeMgr
Affects Version/s: 1.0.11
Fix Version/s: 1.2.3

Type: Bug Priority: Major
Reporter: Daniel Lee (Inactive) Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Sprint: 2019-01, 2019-02, 2019-03

 Description   

Build tested: 1.0.11-1, 1.0.9-1

Using a VM configured with 60 GB of memory, I ran the following test:

1) Create a 1 GB dbt3 database
The lineitem table holds about 750 MB of data in 6001215 rows. There are 1500000 unique order keys in the table.

2) Execute the query
select l_orderkey, group_concat(l_partkey) from lineitem group by l_orderkey;

1500000 rows in set, 1 warning (4 min 13.48 sec)

3) Check cal trace
MariaDB [tpch1]> select calgettrace();
calgettrace()

Desc  Mode  Table            TableOID  ReferencedColumns       PIO   LIO   PBE  Elapsed  Rows
BPS   PM    orders           3044      (o_custkey,o_orderkey)  1474  1486  0    0.102    1500000
BPS   PM    lineitem         3092      (l_orderkey,l_partkey)  0     5866  0    215.113  6001215
HJS   PM    lineitem-orders  3092      -                       -     -     -    -----    -
TAS   UM    -                -         -                       -     -     -    244.862  1500000

4) getcalstat

Query Stats: MaxMemPct-68; NumTempFiles-0; TempFileSpace-0B; ApproxPhyI/O-5870; CacheI/O-5886; BlocksTouched-5866; PartitionBlocksEliminated-0; MsgBytesIn-57MB; MsgBytesOut-2KB; Mode-Distributed
1 row in set (0.01 sec)

ExeMgr's memory utilization peaked at 40 GB. Although group_concat() is known to use more memory, using 40 GB of memory to process two columns totaling 44 MB of data seems excessive.
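The gap between the raw column data and the observed peak can be sanity-checked with a short calculation. The sketch below is only an estimate: the row count, group count, and 40 GB peak come from this report, while the 4-byte width of l_orderkey and l_partkey and the use of decimal units are assumptions, and the function name is illustrative.

```python
# Back-of-the-envelope comparison of raw column size vs. observed ExeMgr peak.
ROWS = 6_001_215        # lineitem rows, from the report
GROUPS = 1_500_000      # distinct l_orderkey values, from the report
PEAK_GB = 40            # observed ExeMgr peak, from the report
BYTES_PER_VALUE = 4     # ASSUMPTION: both columns are stored as 4-byte integers
COLUMNS = 2             # l_orderkey and l_partkey


def estimate(rows, groups, peak_gb, bytes_per_value, columns):
    """Return raw size, average group size, and memory amplification."""
    raw_bytes = rows * columns * bytes_per_value
    peak_bytes = peak_gb * 1_000_000_000  # ASSUMPTION: decimal gigabytes
    return {
        "raw_mb": raw_bytes / 1_000_000,
        "avg_rows_per_group": rows / groups,
        "amplification": peak_bytes / raw_bytes,
        "peak_kb_per_group": peak_bytes / groups / 1_000,
    }


result = estimate(ROWS, GROUPS, PEAK_GB, BYTES_PER_VALUE, COLUMNS)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the raw data comes to about 48 MB, in the same range as the size quoted above for the two columns. The peak is then several hundred times the input size, or tens of kilobytes per group, even though each group concatenates only about four values on average.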

I also did the same test on 1.0.9-1 and it showed the same behavior.



 Comments   
Comment by Roman [ 2019-02-01 ]

Please review.

Comment by Daniel Lee (Inactive) [ 2019-02-08 ]

Build verified: 1.2.3-1 from buildbot nightly

server commit:
61f32f2
engine commit:
46cc344

Executed the same test on a VM with 48 GB of memory. It used a maximum of 9%, or 4.32 GB, of memory. That is about 90% less than the 40 GB peak observed on 1.0.11-1.
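The percentages in this comparison can be reproduced with a few lines of arithmetic. This is a rough check, not a measurement: it assumes the reported percentage is relative to total VM memory, and the helper names are illustrative. Note that the two runs used VMs of different sizes (60 GB and 48 GB), so absolute figures are compared rather than percentages.

```python
def used_gb(vm_gb, pct):
    """Memory used, given VM size and a percentage of total memory.

    ASSUMPTION: the percentage is measured against total VM memory.
    """
    return vm_gb * pct / 100


def reduction_pct(before_gb, after_gb):
    """Relative reduction in memory use, as a percentage."""
    return (1 - after_gb / before_gb) * 100


# Values taken from the report.
before_from_stats = used_gb(60, 68)   # MaxMemPct-68 on the 60 GB VM
before_observed = 40                  # ExeMgr peak noted in the description
after = used_gb(48, 9)                # 9% on the 48 GB VM with 1.2.3-1

print(f"before (from MaxMemPct): {before_from_stats:.2f} GB")
print(f"after: {after:.2f} GB")
print(f"reduction vs. observed peak: {reduction_pct(before_observed, after):.1f}%")
```

Under that assumption, MaxMemPct-68 on the 60 GB VM works out to roughly 40.8 GB, consistent with the 40 GB peak in the description, and the drop to 4.32 GB is a reduction of roughly 89%.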

Comment by David Hill (Inactive) [ 2019-02-22 ]

Issue reported by a customer who had a beta version of 1.2.3.

Comment by Roman [ 2019-03-04 ]

Please review

Comment by Daniel Lee (Inactive) [ 2019-03-11 ]

The ticket had already been closed. According to the developer, it was reopened by mistake, so I closed it again.

Generated at Thu Feb 08 02:24:40 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.