[MDEV-16520] Out-Of-Memory running big aggregate query on Spider Engine Created: 2018-06-19  Updated: 2020-08-25  Resolved: 2019-02-05

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Spider
Affects Version/s: 10.3
Fix Version/s: 10.4.3, 10.3.13

Type: Bug Priority: Major
Reporter: Mattias Jonsson Assignee: Kentoku Shiba (Inactive)
Resolution: Fixed Votes: 1
Labels: None
Environment:

Linux


Issue Links:
Relates
relates to MDEV-16249 CHECKSUM TABLE for a spider table is ... Closed
relates to MDEV-16279 Spider crashes on CHECKSUM TABLE with... Closed
relates to MDEV-16880 Provide checksum aggregate functions,... Open

 Description   

When running 6 concurrent queries (2 queries per 3 tables) the mysqld process gets killed by the OOM killer.

The queries was:

select min(id), max(id), avg(id), sum(id), min(id2), max(id2), avg(id2), sum(id2), count(*) FROM t2; -- Finished
select min(id), max(id), avg(id), sum(id), min(id2), max(id2), avg(id2), sum(id2), count(*) FROM t3;
select min(id), max(id), avg(id), sum(id), count(*) FROM t1;
select min(id), max(id), avg(id), bit_xor(id), min(id2), max(id2), avg(id2), bit_xor(id2), count(*) FROM t2; -- Finished
select min(id), max(id), avg(id), bit_xor(id), min(id2), max(id2), avg(id2), bit_xor(id2), count(*) FROM t3;
select min(id), max(id), avg(id), bit_xor(id), count(*) FROM t1;


Where the tables has more than 100M rows.



 Comments   
Comment by Mattias Jonsson [ 2018-06-19 ]

Same issue as MDEV-16249?

Comment by Jacob Mathew (Inactive) [ 2018-11-12 ]

I have been unable to reproduce this problem. However, I have been able to figure out what is causing the out-of-memory condition.

If you are not explicitly setting the spider_quick_mode setting, then you are getting spider_quick_mode = 0. This means that result rows are buffered by the libmysqlclient used by Spider to interface with its data nodes. The memory used for the result rows is freed only at the end of the query. So if the aggregate query is not pushed to the data nodes (when pushdown of one of the aggregate functions is not supported) and the number of result rows returned by the data nodes is enormous, then the Spider node can run out of memory.

To prevent the out-of-memory condition, try setting the following variables:

spider_quick_mode = 3
spider_quick_page_size = <large positive number>

spider_quick_mode = 3 will cause the result rows to be written to disk when the number of result rows exceeds the value of spider_quick_page_size. You will need to tune the value of spider_quick_page_size so that it is large enough to not have a negative effect on performance and small enough to prevent the out-of-memory condition.

I am closing the bug with this recommended action. If this does not work for you, please reopen the bug.

Comment by Jacob Mathew (Inactive) [ 2018-11-12 ]

Please take the action recommended in the previous comment. If it does not resolve your problem, please reopen the bug.

Comment by Mattias Jonsson [ 2018-11-13 ]

Since spider_quick_mode = 3 crashes the spider server with CHECKSUM TABLE, see MDEV-16279, I don't see the suggested workaround as acceptable. We cannot be required to switch variables depending on which query we want to run, just to avoid crashes.

Comment by Michael Widenius [ 2018-11-13 ]

Eric Herman has more comments on this task

Comment by Kentoku Shiba (Inactive) [ 2018-11-30 ]

I changed default value of the followings for avoiding "out of memory" error.
quick_mode 0 -> 3
quick_page_size 100 -> 1024

I added the following variabless for limiting result page size by byte
Spider Server System Variable

  • spider_quick_page_byte
    Number of bytes in a page when acquisition one by one.
    When quick_mode is 1 or 2, Spider stores at least 1 record even if
    quick_page_byte is smaller than 1 record. When quick_mode is 3,
    quick_page_byte is used for judging using temporary table.
    -1 : The table variable is adopted.
    0 or more : Number of bytes.
    The default value is -1

Spider Table System Variable

  • quick_page_byte(qpb)
    Number of bytes in a page when acquisition one by one.
    When quick_mode is 1 or 2, Spider stores at least 1 record even if
    quick_page_byte is smaller than 1 record. When quick_mode is 3,
    quick_page_byte is used for judging using temporary table.
    That is given to priority when server variable spider_quick_page_byte
    is set.
    The default value is 10485760

for 10.3
b07cb08
for 10.4
8ff0655

Comment by Kentoku Shiba (Inactive) [ 2018-12-24 ]

Sachin, please review this.

Comment by Kentoku Shiba (Inactive) [ 2019-01-31 ]

Sachin, please review the following changes.
94bfa39

Comment by Sachin Setiya (Inactive) [ 2019-02-03 ]

okay to push

Comment by Kentoku Shiba (Inactive) [ 2019-02-05 ]

Merged to 10.3 and 10.4 tree.

Generated at Thu Feb 08 08:29:31 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.