|
please add output of SHOW CREATE TABLE stock; and also the output of explain, when FORCE INDEX is used.
|
|
Having more info , first it can be reproduce on 10.1.24
the show index is:
| stock | 1 | date | 1 | date | A | 58354077 | NULL | NULL | YES | BTREE |
|
If is reduce the scope of the range it is than picking the range scan
explain select `payment_uid`,`date` from `stock` where `date` >= '2017-05-31 22:00:00' and `date` <= '2017-08-30 22:00:00' order by `date`;
|
+------+-------------+-------+------+---------------+------+---------+------+-----------+-----------------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+-------+------+---------------+------+---------+------+-----------+-----------------------------+
|
| 1 | SIMPLE | stock | ALL | date | NULL | NULL | NULL | 233416309 | Using where; Using filesort |
|
+------+-------------+-------+------+---------------+------+---------+------+-----------+-----------------------------+
|
1 row in set (0.00 sec)
|
explain select `payment_uid`,`date` from `stock` where `date` >= '2017-05-31 22:00:00' and `date` <= '2017-07-30 22:00:00' order by `date`;
|
+------+-------------+-------+-------+---------------+------+---------+------+----------+-------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+-------+-------+---------------+------+---------+------+----------+-------------+
|
| 1 | SIMPLE | stock | range | date | date | 6 | NULL | 34892628 | Using where |
|
+------+-------------+-------+-------+---------------+------+---------+------+----------+-------------+
|
1 row in set (0.01 sec)
|
forcing :
explain select `payment_uid`,`date`,`simulation_state` from `stock` FORCE INDEX(date) where `date` >= '2017-05-31 22:00:00' and `date` <= '2018-05-31 22:00:00' order by `date`;
|
+------+-------------+-------+-------+---------------+------+---------+------+-----------+-----------------------+
|
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|
+------+-------------+-------+-------+---------------+------+---------+------+-----------+-----------------------+
|
| 1 | SIMPLE | stock | range | date | date | 6 | NULL | 116708154 | Using index condition |
|
+------+-------------+-------+-------+---------------+------+---------+------+-----------+-----------------------+
|
1 row in set (0.00 sec)
|
show create table
CREATE TABLE `stock` (
|
`uid` bigint(20) unsigned NOT NULL,
|
`order_id` bigint(20) unsigned NOT NULL,
|
`explanation_uid` bigint(20) unsigned DEFAULT NULL,
|
`node_uid` bigint(20) unsigned DEFAULT NULL,
|
`section_uid` bigint(20) unsigned DEFAULT NULL,
|
`payment_uid` bigint(20) unsigned DEFAULT NULL,
|
`function_uid` bigint(20) unsigned DEFAULT NULL,
|
`project_uid` bigint(20) unsigned DEFAULT NULL,
|
`funding_uid` bigint(20) unsigned DEFAULT NULL,
|
`ledger_uid` bigint(20) unsigned DEFAULT NULL,
|
`payment_request_uid` bigint(20) unsigned DEFAULT NULL,
|
`mirror_section_uid` bigint(20) unsigned DEFAULT NULL,
|
`mirror_node_uid` bigint(20) unsigned DEFAULT NULL,
|
`resource_uid` bigint(20) unsigned DEFAULT NULL,
|
`quantity` double DEFAULT NULL,
|
`is_cancellation` tinyint(1) DEFAULT NULL,
|
`is_accountable` tinyint(1) DEFAULT NULL,
|
`date` datetime DEFAULT NULL,
|
`mirror_date` datetime DEFAULT NULL,
|
`total_price` double DEFAULT NULL,
|
`portal_type` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
|
`simulation_state` varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
|
`variation_text` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
|
`sub_variation_text` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
|
`is_asset_increase` tinyint(1) DEFAULT NULL,
|
`is_input` tinyint(1) DEFAULT NULL,
|
`grouping_date` datetime DEFAULT NULL,
|
PRIMARY KEY (`uid`,`order_id`),
|
KEY `quantity` (`quantity`),
|
KEY `mirror_section_uid` (`mirror_section_uid`),
|
KEY `mirror_node_uid` (`mirror_node_uid`),
|
KEY `node_uid` (`node_uid`),
|
KEY `payment_uid` (`payment_uid`),
|
KEY `function_uid` (`function_uid`),
|
KEY `project_uid` (`project_uid`),
|
KEY `funding_uid` (`funding_uid`),
|
KEY `resource_node_uid` (`resource_uid`,`node_uid`),
|
KEY `resource_section_node_uid` (`resource_uid`,`section_uid`,`node_uid`,`simulation_state`),
|
KEY `date` (`date`),
|
KEY `payment_request_uid` (`payment_request_uid`),
|
KEY `state_section_increase_node_date` (`simulation_state`,`section_uid`,`is_asset_increase`,`node_uid`,`date`),
|
KEY `explanation_uid` (`explanation_uid`),
|
KEY `section_uid_portal_type_mirror_section_uid` (`section_uid`,`portal_type`,`mirror_section_uid`),
|
KEY `ledger_uid` (`ledger_uid`,`section_uid`,`date`),
|
KEY `ledger_section_grouping_date_node` (`ledger_uid`,`section_uid`,`grouping_date`,`node_uid`)
|
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
|
I have tested without success
index_condition_pushdown=off
increase innodb_stats_sample_pages=1024;
extended_keys=off
Note that similar issue was already reported in : MDEV-10046
|
|
Hi Alice,
sorry MDEV-10046 is not related
What happens here is that i think there are wrong cost calculation of how long it takes to sort 240M rows , the sort merge pass will be huge with sort_buffer_size | 2097152 and a varchar(255) utf8 that require 240*(8+4+756) let's assume 200G of memory is required at the same time the cost of full scanning the index vs range on the index should be the same , so range without sorting should always be better
Tx
|
|
So,
- Full scan expects to scan 233 M rows
- Range access expects to scan 35 M rows
which is ~15 % of the table.
|
|
Well, currently the optimizer doesn't take into account the cost of sorting AT ALL. It does take into account that ORDER BY ... LIMIT n will only need to read n rows if it uses a matching index (while filesort will have to read all rows and then sort). The cost of sorting is typically not very high.
|
|
Another observation:
where `date` >= '2017-05-31 22:00:00' and `date` <= '2018-05-31 22:00:00' order by date ;
|
here the range on "date" actually includes only one value and so sorting should not be needed at all? (Or are the constants different in the general case?)
|
|
stephane@skysql.com, to make sure the cost of sorting is the issue, can you provide the output of
analyze format=json
|
select `payment_uid`,`date`,`simulation_state`
|
from `stock`
|
where `date` >= '2017-05-31 22:00:00' and `date` <= '2018-05-31 22:00:00' order by date ;
|
|
|
" on "date" actually includes only one value "
humm the year is different so the range is a big part of count rows in the table
|
|
(upgraded to 10.3.10)
{
|
"query_block": {
|
"select_id": 1,
|
"r_loops": 1,
|
"r_total_time_ms": 520933,
|
"read_sorted_file": {
|
"r_rows": 1.03e8,
|
"filesort": {
|
"sort_key": "stock.`date`",
|
"r_loops": 1,
|
"r_total_time_ms": 490330,
|
"r_used_priority_queue": false,
|
"r_output_rows": 102963264,
|
"r_sort_passes": 6654,
|
"r_buffer_size": "2047Kb",
|
"table": {
|
"table_name": "stock",
|
"access_type": "ALL",
|
"possible_keys": ["date"],
|
"r_loops": 1,
|
"rows": 216966915,
|
"r_rows": 2.41e8,
|
"r_total_time_ms": 151393,
|
"filtered": 50,
|
"r_filtered": 42.692,
|
"attached_condition": "stock.`date` >= '2017-05-31 22:00:00' and stock.`date` <= '2018-05-31 22:00:00'"
|
}
|
}
|
}
|
}
|
}
|
I don't know why the original post reports more rows. Maybe it was after some processing that would have added rows.
|
|
Looking at the above...
filesort.r_total_time_ms= 490,330
filesort.table.r_total_time_ms= 151,393
So, the filesort call took 490 seconds, of which 150 seconds were spent reading the table. The rest 340 must be sorting.
(This was to read 2.4e8 rows, then filter out 50%, and then sort 1e8 rows)
Indeed, the cost of sorting is 2x higher than the cost of reading the data.
|
|
Hi Julien, Ivan,
Could you re run analyze query with bigger sort buffer like 512M or 1G and tell us what the current value of the sort_buffer_size ?
It does not change the fact that on SSD or NVMe sorting to read in the correct pk order the majority of record is less an issue to compare to spinning disk
But it can help quantify the cost of a reasonable sorting .
|
|
Actually, the analyze output in my above comment was with tmpdir (/tmp) in tmpfs. And we should also not ignore that using a secondary index to avoid filesort implies reading more the disk. I'm surprised nobody checked the same query with FORCE INDEX(date), to actually show there's improvement to expect.
In my current setup (~227 GiB RAM, 16Gi buffer pool size, SSD, stock table: 40Gi data, 178Gi index), forcing the index (tried twice: 11m34 & 18m53) is slower than sorting in tmpfs (8m41). Now retrying twice with tmpdir in same device as datadir: 24m23 & 29m (again, the second run is slower, no idea why). Now, I'm going to play with sort_buffer_size...
(note that many processes are currently stopped on the machine and during normal operation, there's much less RAM available, that's why innodb_buffer_pool_size is not bigger)
|
I'm surprised nobody checked the same query with FORCE INDEX(date), to actually show there's improvement to expect.
(even if I missed the first comments, explain does not show much)
Could you re run analyze query with bigger sort buffer like 512M or 1G and tell us what the current value of the sort_buffer_size ?
It was the default value: 2M. I changed to 1G and ran twice: 12min15 & 11m22.
|
|
Julien this issue was created with spider testing in mind , while the sorting is happening on the catalog table no records is flowing to a joined spider category table from the original query we wanted to improve with sharding .
We are really missing the result of ANALYZE query with FORCE INDEX to get the time and range
Conclusion :
- Forcing the index avoid file sort cost
- Forcing the index save 25% reading on disk
select 1.8/2.4*100;
|
+-------------+
|
| 1.8/2.4*100 |
|
+-------------+
|
| 75.00000 |
|
+-------------+
|
- Forcing the index : how much time, reading data is taking. I can't see this from previous comments ? (Cost of rnd read + pk clustered vs direct clustered lookup on SSD)
- Forcing the index enable constant work using spider on joined tables as data flow to next state of the plan without without waiting for sorting
|
|
Leaving it to psergey to decide whether there is an issue to solve, and if so, in which version.
|