Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Fix Version: 10.0.10
Description
Hi,
I've just tested the CONNECT engine on a "big" table (24GB) on my local server:
CREATE TABLE `url_list_connect` (
  `id` int(10) unsigned NOT NULL,
  `url` varchar(2048) NOT NULL DEFAULT '',
  `date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `id_license_domain` int(10) unsigned NOT NULL DEFAULT '0',
  `id_license` int(10) unsigned NOT NULL DEFAULT '0',
  `dom` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `loadtime` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `gen` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `script` varchar(2048) NOT NULL DEFAULT '',
  `database_time` mediumint(8) unsigned NOT NULL DEFAULT '0'
) ENGINE=CONNECT DEFAULT CHARSET=latin1 `table_type`=MYSQL `dbname`=test `tabname`=url_list `option_list`='user=root,host=localhost,password=xxxx'
The original table is a TokuDB table with a tokudb_cache_size of 12GB (I can see the same behaviour with an InnoDB table).
Then I've tested the following:
SELECT * FROM url_list_connect LIMIT 10;
This query seems to execute a full scan of the table and load the whole table into memory before returning the first 10 results. As a result, the MySQL server eventually crashes because it runs out of memory (even with a small innodb_buffer_pool_size / tokudb_cache_size / tmp_table_size).
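My guess (just an assumption on my part, to illustrate the behaviour) is that the MYSQL wrapper sends the unrestricted query to the remote server and only applies the LIMIT locally, after the whole result set has been fetched; something like:

SELECT `id`, `url`, `date`, `id_license_domain`, `id_license`, `dom`, `loadtime`, `gen`, `script`, `database_time` FROM `url_list`;  -- no LIMIT pushed down, so the entire result set gets buffered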
I don't see any variable like the one in the Cassandra SE that allows controlling the full-scan behaviour:
"Full table scans are performed in a memory-efficient way. Cassandra SE performs a full table scan as a series of batches, each of which reads not more than cassandra_rnd_batch_size records."
To reproduce: create a table whose size is greater than the total memory on the server, then SELECT from it through a CONNECT table without a WHERE clause.
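Condensed, the reproduction looks like this (credentials are placeholders, and the step that populates the source table is elided):

-- Source table; fill it until its size exceeds the server's total memory.
CREATE TABLE `url_list` (
  `id` int(10) unsigned NOT NULL,
  `url` varchar(2048) NOT NULL DEFAULT ''
) ENGINE=InnoDB;

-- CONNECT wrapper pointing at the source table.
CREATE TABLE `url_list_connect` (
  `id` int(10) unsigned NOT NULL,
  `url` varchar(2048) NOT NULL DEFAULT ''
) ENGINE=CONNECT `table_type`=MYSQL `dbname`=test `tabname`=url_list `option_list`='user=root,host=localhost,password=xxxx';

-- Loads the entire table into memory before applying the LIMIT.
SELECT * FROM `url_list_connect` LIMIT 10;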
Thanks and regards,
Jocelyn Fournier