MariaDB ColumnStore / MCOL-5153

Disk-based aggregation fails with ERROR 1815 (HY000): Internal error: TupleAggregateStep::threadedAggregateRowGroups()[24] MCS-2054: Unknown error while aggregation. (part 1)

Details


    Description

      Aggregation on a VARCHAR(128) column (the number of distinct values is approximately 31 billion) fails with an obscure error:

      ERROR 1815 (HY000): Internal error: TupleAggregateStep::threadedAggregateRowGroups()[24] MCS-2054: Unknown error while aggregation.  
      

      The current implementation of RowAggStorage::increaseSize() can raise RowAggStorage::Data::fMask four times before rehashing happens. The guarding check in increaseSize() is too restrictive and trips easily with large values of fCurData->fMask and fCurData->fSize (see RowAggStorage::increaseSize() for details).

      The suggested solution is to increase the multiplier in the expression:

      if (fCurData->fSize * 2 < calcMaxSize(fCurData->fMask + 1))
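
      For context, the following is a minimal sketch of the shape of that logic in C++. The names fSize, fMask and calcMaxSize() come from this ticket; the surrounding structure, the 80% load factor, and the exception type are simplified assumptions, not the actual RowAggStorage code.

      #include <cstddef>
      #include <stdexcept>

      // Simplified stand-in for RowAggStorage's hash table data (assumed layout).
      struct Data
      {
        std::size_t fSize = 0;  // number of stored rows
        std::size_t fMask = 0;  // capacity - 1; capacity is a power of two
      };

      // Maximum load before a resize is required. The 80% load factor is an
      // assumption here; the real constant lives in the ColumnStore sources.
      static std::size_t calcMaxSize(std::size_t capacity)
      {
        return capacity * 8 / 10;
      }

      void increaseSize(Data* fCurData)
      {
        // Guarding check: if the table is forced to grow while holding less
        // than half of the next capacity's maximum load, the implementation
        // assumes something went wrong and raises the error that surfaces as
        // MCS-2054. Per this ticket, very large fSize/fMask values trip it
        // spuriously; raising the multiplier (fSize * 2 -> a larger factor)
        // makes the guard fire only when the table is far emptier than it
        // should be.
        if (fCurData->fSize * 2 < calcMaxSize(fCurData->fMask + 1))
          throw std::runtime_error("MCS-2054: Unknown error while aggregation.");

        // ... otherwise rehash into a table of twice the capacity ...
      }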
      

Activity

            drrtuy Roman added a comment -

            Please review.

            drrtuy Roman added a comment - edited

            For QA: I have seen this in the wild on beefy hardware with 1.5 TB of RAM, running an S3-based cluster on NVMe. The issue happens with aggregation on a VARCHAR(30) column when the number of DISTINCT values equals 31 billion.
            With the data I have, a reproduction is to aggregate 5 million distinct VARCHAR(30) values, e.g. SELECT c1 FROM t1 GROUP BY c1, where c1 is the mentioned VARCHAR(30) column with 5 million distinct values.

            dleeyh Daniel Lee (Inactive) added a comment - edited

            Build tested: 6.4.2-1 (Jenkins build bb-10.6.8-4-cs-6.4.2-1)

            storage: local
            3 PM cluster, with 30 GB of memory in each node.
            Dataset size: 10 GB; lineitem, where l_comment is VARCHAR(44)
            rows = 59,986,052 (close to 60 million)
            distinct rows = 19,439,546 (about 19 million)
            query: select l_comment from lineitem group by l_comment;

            With disk-based aggregation disabled, the query would run out of memory.
            With disk-based aggregation enabled, the query executed successfully.

            Also tested "select count(*) from lineitem, orders where l_orderkey = o_orderkey" on 100 GB, 200 GB, and 300 GB datasets. All succeeded.

            Is this test for S3 only, or would local storage be sufficient?

            drrtuy Roman added a comment -

            Another iteration on the disk-based aggregation code. This attempt replaces MariaDB collation-aware hashing with a combination of strnxfrm (which converts a byte array into a collation-aware array of weights) and an MM3 (MurmurHash3) byte-array hash. There is also an optimization borrowed from Robin Hood that is triggered when RowStorage::increaseSize() is called while there is plenty of space available in the current fCurData, without taking more RAM (see the patch for the details). A sketch of the hashing scheme follows.
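
            To illustrate the idea, here is a self-contained sketch (not the actual patch): collationWeights() below is a hypothetical stand-in for MariaDB's strnxfrm, mapping a string to a byte array of collation weights so that strings that compare equal under the collation produce identical bytes; the weights are then hashed with the standard MurmurHash3 x86 32-bit routine (the "MM3" above).

            #include <cstdint>
            #include <cstring>
            #include <string>
            #include <vector>

            // Hypothetical stand-in for MariaDB's strnxfrm: maps a string to an
            // array of collation weights so that strings that are equal under the
            // collation yield identical byte sequences. As a toy "case-insensitive"
            // collation, ASCII letters are folded to lower case here.
            static std::vector<uint8_t> collationWeights(const std::string& s)
            {
              std::vector<uint8_t> w(s.size());
              for (size_t i = 0; i < s.size(); ++i)
              {
                uint8_t c = static_cast<uint8_t>(s[i]);
                w[i] = (c >= 'A' && c <= 'Z') ? c + 32 : c;
              }
              return w;
            }

            // Standard MurmurHash3 x86_32 over a byte array.
            static uint32_t murmur3_32(const uint8_t* data, size_t len, uint32_t seed)
            {
              const uint32_t c1 = 0xcc9e2d51u, c2 = 0x1b873593u;
              uint32_t h = seed;
              size_t i = 0;
              for (; i + 4 <= len; i += 4)  // body: 4-byte blocks
              {
                uint32_t k;
                std::memcpy(&k, data + i, 4);
                k *= c1; k = (k << 15) | (k >> 17); k *= c2;
                h ^= k; h = (h << 13) | (h >> 19); h = h * 5 + 0xe6546b64u;
              }
              uint32_t k = 0;  // tail: the trailing len % 4 bytes
              switch (len & 3)
              {
                case 3: k ^= uint32_t(data[i + 2]) << 16; [[fallthrough]];
                case 2: k ^= uint32_t(data[i + 1]) << 8; [[fallthrough]];
                case 1: k ^= uint32_t(data[i]);
                        k *= c1; k = (k << 15) | (k >> 17); k *= c2; h ^= k;
              }
              h ^= static_cast<uint32_t>(len);  // finalization mix
              h ^= h >> 16; h *= 0x85ebca6bu;
              h ^= h >> 13; h *= 0xc2b2ae35u;
              h ^= h >> 16;
              return h;
            }

            // Collation-aware hash: transform first, then hash the weights, so that
            // e.g. "Comment" and "comment" collide intentionally under this toy collation.
            uint32_t collationAwareHash(const std::string& s, uint32_t seed = 0)
            {
              const auto w = collationWeights(s);
              return murmur3_32(w.data(), w.size(), seed);
            }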


            dleeyh Daniel Lee (Inactive) added a comment -

            Build tested: 22.08-1 (#5243)

            Executed the same test against the 300 GB DBT3 database above successfully.

            People

              alexey.antipovsky Alexey Antipovsky
              drrtuy Roman
              dleeyh Daniel Lee (Inactive)