MariaDB Server / MDEV-6142

CONNECT Engine could eat all the memory

Details

    Description

      Hi,

      I've just tested the CONNECT engine on a "big" table (24GB) on my local server:

      CREATE TABLE `url_list_connect` (
        `id` int(10) unsigned NOT NULL,
        `url` varchar(2048) NOT NULL DEFAULT '',
        `date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
        `id_license_domain` int(10) unsigned NOT NULL DEFAULT '0',
        `id_license` int(10) unsigned NOT NULL DEFAULT '0',
        `dom` mediumint(8) unsigned NOT NULL DEFAULT '0',
        `loadtime` mediumint(8) unsigned NOT NULL DEFAULT '0',
        `gen` mediumint(8) unsigned NOT NULL DEFAULT '0',
        `script` varchar(2048) NOT NULL DEFAULT '',
        `database_time` mediumint(8) unsigned NOT NULL DEFAULT '0'
      ) ENGINE=CONNECT DEFAULT CHARSET=latin1 `table_type`=MYSQL `dbname`=test `tabname`=url_list `option_list`='user=root,host=localhost,password=xxxx'

      The original table is a TokuDB table with a tokudb_cache_size of 12GB
      (I can see the same behaviour with an InnoDB table).

      Then I've tested the following:
      SELECT * FROM url_list_connect LIMIT 10;

      This query seems to execute a full scan of the table and load the whole table into memory before returning the first 10 results.
      As a result, the mysql server eventually crashes because it runs out of memory (even with a small innodb_buffer_pool_size / tokudb_cache_size / tmp_table_size).
      I don't see any variable, like the one in the Cassandra SE, that allows controlling the full-scan behaviour:

      "Full table scans are performed in a memory-efficient way. Cassandra SE performs a full table scan as a series of batches, each of which reads not more than cassandra_rnd_batch_size records."

      To reproduce: create a table whose size is greater than the total memory on the server, then select from it without a WHERE clause.

      Thanks and regards,
      Jocelyn Fournier


        Activity

          bertrandop Olivier Bertrand added a comment - - edited

          Normally, CONNECT does not make a full table scan when retrieving query results. However, there are exceptions:

          1 - XML tables, because the libraries used (DOMDOC or libxml2) are run in DOM mode rather than SAX mode. This means that they parse the whole document and build an in-memory image that CONNECT then uses.

          2 - MYSQL tables.

          For MYSQL tables, CONNECT constructs a SELECT statement specifying only the columns used in the query and adds a WHERE clause retrieved via cond_push (unlike FEDERATED, which retrieves the whole table except when indexing is used). The latest version of CONNECT can also use "REMOTE" indexing.
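
          For illustration (a hypothetical example based on the url_list_connect table above; the exact statement CONNECT builds may differ in quoting and column order), a query such as the first one below would be turned into roughly the second one before being sent to the remote server, with unused columns pruned and the WHERE clause pushed down through cond_push:

          -- query issued against the CONNECT table
          SELECT url, loadtime FROM url_list_connect WHERE id = 12345;

          -- roughly what CONNECT would send to the remote server
          SELECT `url`, `loadtime`, `id` FROM url_list WHERE id = 12345;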

          Unfortunately, it seems difficult to handle the LIMIT clause. Firstly because engines have no way to know about it (except by re-parsing the original query), but chiefly because of queries like this:

          SELECT col1, count(*) FROM t1 GROUP BY col1 LIMIT 10;

          The engine is called by MariaDB to return the table data, without knowing for what purpose it is needed. Here CONNECT will construct the query:

          SELECT col1 FROM tab;

          (tab being the source table of t1). However, adding the LIMIT clause would be wrong, because it applies to the GROUP BY result, not to the table being grouped.

          Perhaps a solution could be to retrieve the table data row by row, or by a fixed number of rows as CASSANDRA does. This is a little more complicated to implement, but feasible if the MySQL API permits doing so; I am not sure whether it builds the complete result in memory in any case. As for ODBC tables, CONNECT retrieves them in blocks whose size is given by the BLOCK_SIZE option.

          Meanwhile, you can browse the first rows of a big MYSQL table using a view, or a SRCDEF option that includes the LIMIT clause, for instance by replacing:

          `tabname`=url_list

          by

          `srcdef`='SELECT * FROM url_list LIMIT 10'
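
          A complete definition using this workaround could look like the following (a hypothetical sketch based on the original table; the table name url_list_head is made up and the column list is shortened to match the SRCDEF select list):

          CREATE TABLE `url_list_head` (
            `id` int(10) unsigned NOT NULL,
            `url` varchar(2048) NOT NULL DEFAULT ''
          ) ENGINE=CONNECT DEFAULT CHARSET=latin1 `table_type`=MYSQL `dbname`=test
            `srcdef`='SELECT id, url FROM url_list LIMIT 10'
            `option_list`='user=root,host=localhost,password=xxxx';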


          jocel1 jocelyn fournier added a comment

          Hi Olivier,

          I think the best solution to avoid the crash would be to implement the same mechanism as CASSANDRA, fetching rows in blocks.
          Thanks for the srcdef example.

          Jocelyn

          bertrandop Olivier Bertrand added a comment

          Hi, Jocelyn,

          I don't know how CASSANDRA does this. CONNECT uses the MySQL API, and it offers nothing like the ODBC recordset size and extended fetch.

          Therefore, what I did is to enable CONNECT MYSQL tables to retrieve the result via mysql_use_result instead of mysql_store_result. Apparently it does the same thing, but without allocating the whole result set in memory. My fix was pushed as revision 4159.

          To trigger this, use the HUGE option in the CREATE or ALTER statement, for instance:

          ALTER TABLE `url_list_connect` huge=1;
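
          Since the option can also be given in the CREATE statement, the table could have been declared with it from the start; the exact placement of the attribute below is my assumption, mirroring the definition in the description:

          CREATE TABLE `url_list_connect` (
            `id` int(10) unsigned NOT NULL,
            `url` varchar(2048) NOT NULL DEFAULT '',
            `date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
            `id_license_domain` int(10) unsigned NOT NULL DEFAULT '0',
            `id_license` int(10) unsigned NOT NULL DEFAULT '0',
            `dom` mediumint(8) unsigned NOT NULL DEFAULT '0',
            `loadtime` mediumint(8) unsigned NOT NULL DEFAULT '0',
            `gen` mediumint(8) unsigned NOT NULL DEFAULT '0',
            `script` varchar(2048) NOT NULL DEFAULT '',
            `database_time` mediumint(8) unsigned NOT NULL DEFAULT '0'
          ) ENGINE=CONNECT DEFAULT CHARSET=latin1 `table_type`=MYSQL `dbname`=test `tabname`=url_list
            `huge`=1 `option_list`='user=root,host=localhost,password=xxxx';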

          I cannot test it, not having such big tables; let me know whether it works for you.

          Note that, given the way these API functions work, using LIMIT does not reduce the time spent retrieving the result. It seems that even when the result set is not stored, MySQL must internally read to the end of it before the connection is released.

          Olivier

          jocel1 jocelyn fournier added a comment - - edited

          Hi Olivier,

          I'll take a look ASAP.
          Not related to the initial issue, but I've just noticed that CTRL+C doesn't work properly with CONNECT:

          MariaDB [test]> SELECT * FROM url_list_connect LIMIT 10;
          ^CCtrl-C -- query killed. Continuing normally.
          ^CCtrl-C -- query killed. Continuing normally.
          ERROR 2013 (HY000): Lost connection to MySQL server during query
          MariaDB [test]> show processlist;
          ERROR 2006 (HY000): MySQL server has gone away
          No connection. Trying to reconnect...
          Connection id:    350
          Current database: test
           
          +-----+------+-----------------+--------+---------+------+-----------------------------+------------------------------------------------------------------------------------------------------+----------+
          | Id  | User | Host            | db     | Command | Time | State                       | Info                                                                                                 | Progress |
          +-----+------+-----------------+--------+---------+------+-----------------------------+------------------------------------------------------------------------------------------------------+----------+
          | 309 | jira | 127.0.0.1:58647 | jiradb | Sleep   |   39 |                             | NULL                                                                                                 |    0.000 |
          | 346 | root | localhost       | test   | Killed  |   13 | Sending data                | SELECT * FROM url_list_connect LIMIT 10                                                              |    0.000 |
          | 347 | root | localhost       | test   | Query   |   13 | Queried about 14730000 rows | SELECT `id`, `url`, `date`, `id_license_domain`, `id_license`, `dom`, `loadtime`, `gen`, `script`, ` |    0.000 |
          | 350 | root | localhost       | test   | Query   |    0 | init                        | show processlist                                                                                     |    0.000 |
          +-----+------+-----------------+--------+---------+------+-----------------------------+------------------------------------------------------------------------------------------------------+----------+
          4 rows in set (0.00 sec)
           
          MariaDB [test]> kill 347;
          Query OK, 0 rows affected (0.00 sec)
           
          MariaDB [test]> show processlist;
          +-----+------+-----------------+--------+---------+------+-------+------------------+----------+
          | Id  | User | Host            | db     | Command | Time | State | Info             | Progress |
          +-----+------+-----------------+--------+---------+------+-------+------------------+----------+
          | 309 | jira | 127.0.0.1:58647 | jiradb | Sleep   |   55 |       | NULL             |    0.000 |
          | 350 | root | localhost       | test   | Query   |    0 | init  | show processlist |    0.000 |
          +-----+------+-----------------+--------+---------+------+-------+------------------+----------+
          2 rows in set (0.00 sec)

          Do you want me to open a new bug report for this one?

          Thanks,
          Jocelyn

          bertrandop Olivier Bertrand added a comment - - edited

          Hmmm... I did not think plugins had to do anything about this. I thought this was handled by MySQL/MariaDB. Are you sure this misbehaves only with CONNECT?

          If so, then yes, this could be a new bug report.

          jocel1 jocelyn fournier added a comment - - edited

          At least it's the first time I've seen this kind of behaviour (it works properly with InnoDB / TokuDB / MyISAM), so I've just opened a new bug report for it: MDEV-6182


          bertrandop Olivier Bertrand added a comment

          The huge=1 option solves the problem.

          People

            bertrandop Olivier Bertrand
            jocel1 jocelyn fournier