|
Hi Stephane,
I've tried to investigate the issue. I created this testcase:
CREATE TABLE `login2` (
`user_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`user_name` varchar(50) DEFAULT NULL,
`user_email` varchar(2000) CHARACTER SET utf8 DEFAULT NULL,
`created` datetime DEFAULT NULL,
PRIMARY KEY (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=102048 DEFAULT CHARSET=latin1;
insert into login2 values (100002, 'user-100002', 'email-100002', now());
insert into login2 values (100003, 'user-100003', 'email-100003', now());
insert into login2 values (100004, 'user-100004', 'email-100004', now());
insert into login2 values (100007, 'user-100007', 'email-100007', now());
insert into login2 values (100008, 'user-100008', 'email-100008', now());
select * from (
select
user_id,
created,
(select user_email from login2 b where b.user_id=a.user_id ) as mail
from
login2 a
where
a.user_id IN (100005, 100002, 100004, 100003)
) as a order by created;
and ran it.
Then, I replaced varchar(2000) with varchar(200) and ran it again. In both cases, the EXPLAIN was
----------------------------------------------------------------------------------------------------------+
| id |
select_type |
table |
type |
possible_keys |
key |
key_len |
ref |
rows |
Extra |
----------------------------------------------------------------------------------------------------------+
| 1 |
SIMPLE |
a |
ALL |
PRIMARY |
NULL |
NULL |
NULL |
5 |
Using where; Using filesort |
| 3 |
DEPENDENT SUBQUERY |
b |
eq_ref |
PRIMARY |
PRIMARY |
4 |
j11.a.user_id |
1 |
|
----------------------------------------------------------------------------------------------------------+
|
|
In both cases, the slow query log looks like this: (query was artificially slowed down in gdb)
- Time: 130726 10:20:33
- User@Host: root[root] @ localhost []
- Thread_id: 2 Schema: j11 QC_hit: No
- Query_time: 1.996783 Lock_time: 0.006994 Rows_sent: 3 Rows_examined: 11
- Full_scan: Yes Full_join: No Tmp_table: Yes Tmp_table_on_disk: No
- Filesort: Yes Filesort_on_disk: No Merge_passes: 0
...
That is, Tmp_table: Yes, Tmp_table_on_disk: No.
|
|
Note that this "Tmp_table: Yes" is not a "real" tmp-table.
This is an artifact of derived_merge feature: The optimizer creates a kind of temporary table template . The table is not really created but the counter is incremented. Then, the optimizer figures that derived_merge=on removes the need for temporary table, so the actual table is never created or written-to.
|
|
The above effect is filed as MDEV-710.
|
|
We are really talking about disk tmp table , we have set tmp table size and max heap table size to 1G to make sure this was not the issue , here is the original client query
SELECT
(SELECT LEFT(fv.value,1502) AS value FROM
fphd_link_housing_variability AS flinkv
INNER JOIN fphd_variability AS fv ON ( flinkv.variability_field_code = fv.field_code AND flinkv.variability_field_name = fv.field_name AND flinkv.variability_language_code = fv.language_code AND flinkv.variability_season_code = fv.season_code ) WHERE fv.language_code = 'fr' AND fv.season_code = 'SUMMER' AND fv.table_name = 'fphd_housing' AND
fv.field_name = 'description' AND
flinkv.housing_code = fh.code LIMIT 1) AS description,
(SELECT LEFT(fv.value,586) AS value FROM
fphd_link_housing_variability AS flinkv
INNER JOIN fphd_variability AS fv ON ( flinkv.variability_field_code = fv.field_code AND flinkv.variability_field_name = fv.field_name AND flinkv.variability_language_code = fv.language_code AND flinkv.variability_season_code = fv.season_code ) WHERE fv.language_code = 'fr' AND fv.season_code = 'SUMMER' AND fv.table_name = 'fphd_housing' AND
fv.field_name = 'other_description' AND
flinkv.housing_code = fh.code LIMIT 1) AS other_description,
(SELECT LEFT(fv.value,516) AS value FROM
fphd_link_housing_variability AS flinkv
INNER JOIN fphd_variability AS fv ON ( flinkv.variability_field_code = fv.field_code AND flinkv.variability_field_name = fv.field_name AND flinkv.variability_language_code = fv.language_code AND flinkv.variability_season_code = fv.season_code ) WHERE fv.language_code = 'fr' AND fv.season_code = 'SUMMER' AND fv.table_name = 'fphd_housing' AND
fv.field_name = 'excluded' AND
flinkv.housing_code = fh.code LIMIT 1) AS excluded,
(SELECT LEFT(fv.value,1096) AS value FROM
fphd_link_housing_variability AS flinkv
INNER JOIN fphd_variability AS fv ON ( flinkv.variability_field_code = fv.field_code AND flinkv.variability_field_name = fv.field_name AND flinkv.variability_language_code = fv.language_code AND flinkv.variability_season_code = fv.season_code ) WHERE fv.language_code = 'fr' AND fv.season_code = 'SUMMER' AND fv.table_name = 'fphd_housing' AND
fv.field_name = 'teaser' AND
flinkv.housing_code = fh.code LIMIT 1) AS teaser,(SELECT LEFT(fv.value,13) AS value FROM
fphd_link_comfort_level_variability AS flinkv
INNER JOIN fphd_variability AS fv ON ( flinkv.variability_field_code = fv.field_code AND flinkv.variability_field_name = fv.field_name AND flinkv.variability_language_code = fv.language_code ) WHERE fv.language_code = 'fr' AND fv.table_name = 'fphd_comfort_level' AND
fv.field_name = 'name' AND
flinkv.comfort_level_code = fh.comfort_level_code LIMIT 1) AS comfort_level_name,
(SELECT LEFT(fv.value,525) AS value FROM
fphd_link_comfort_level_variability AS flinkv
INNER JOIN fphd_variability AS fv ON ( flinkv.variability_field_code = fv.field_code AND flinkv.variability_field_name = fv.field_name AND flinkv.variability_language_code = fv.language_code AND flinkv.variability_season_code = fv.season_code ) WHERE fv.language_code = 'fr' AND fv.season_code = 'SUMMER' AND fv.table_name = 'fphd_comfort_level' AND
fv.field_name = 'description' AND
flinkv.comfort_level_code = fh.comfort_level_code LIMIT 1) AS comfort_level_description,
(SELECT LEFT(fv.value,604) AS value FROM
fphd_link_comfort_level_variability AS flinkv
INNER JOIN fphd_variability AS fv ON ( flinkv.variability_field_code = fv.field_code AND flinkv.variability_field_name = fv.field_name AND flinkv.variability_language_code = fv.language_code AND flinkv.variability_season_code = fv.season_code ) WHERE fv.language_code = 'fr' AND fv.season_code = 'SUMMER' AND fv.table_name = 'fphd_comfort_level' AND
fv.field_name = 'specifications' AND
flinkv.comfort_level_code = fh.comfort_level_code LIMIT 1) AS comfort_level_specifications,(SELECT LEFT(fv.value,55) AS value FROM
fphd_link_architectural_line_variability AS flinkv
INNER JOIN fphd_variability AS fv ON ( flinkv.variability_field_code = fv.field_code AND flinkv.variability_field_name = fv.field_name AND flinkv.variability_language_code = fv.language_code ) WHERE fv.language_code = 'fr' AND fv.table_name = 'fphd_architectural_line' AND
fv.field_name = 'name' AND
flinkv.architectural_line_code = fh.architectural_line_code LIMIT 1) AS architectural_line_name,
(SELECT LEFT(fv.value,33) AS value FROM
fphd_link_architectural_line_variability AS flinkv
INNER JOIN fphd_variability AS fv ON ( flinkv.variability_field_code = fv.field_code AND flinkv.variability_field_name = fv.field_name AND flinkv.variability_language_code = fv.language_code ) WHERE fv.language_code = 'fr' AND fv.table_name = 'fphd_architectural_line' AND
fv.field_name = 'catch' AND
flinkv.architectural_line_code = fh.architectural_line_code LIMIT 1) AS architectural_line_catch,
(SELECT LEFT(fv.value,2284) AS value FROM
fphd_link_architectural_line_variability AS flinkv
INNER JOIN fphd_variability AS fv ON ( flinkv.variability_field_code = fv.field_code AND flinkv.variability_field_name = fv.field_name AND flinkv.variability_language_code = fv.language_code ) WHERE fv.language_code = 'fr' AND fv.table_name = 'fphd_architectural_line' AND
fv.field_name = 'description' AND
flinkv.architectural_line_code = fh.architectural_line_code LIMIT 1) AS architectural_line_description,
(SELECT LEFT(fv.value,106) AS value FROM
fphd_link_architectural_line_variability AS flinkv
INNER JOIN fphd_variability AS fv ON ( flinkv.variability_field_code = fv.field_code AND flinkv.variability_field_name = fv.field_name AND flinkv.variability_language_code = fv.language_code ) WHERE fv.language_code = 'fr' AND fv.table_name = 'fphd_architectural_line' AND
fv.field_name = 'advantage' AND
flinkv.architectural_line_code = fh.architectural_line_code LIMIT 1) AS architectural_line_advantage,(SELECT LEFT(fv.value,18) AS value FROM
fphd_link_housing_type_variability AS flinkv
INNER JOIN fphd_variability AS fv ON ( flinkv.variability_field_code = fv.field_code AND flinkv.variability_field_name = fv.field_name AND flinkv.variability_language_code = fv.language_code ) WHERE fv.language_code = 'fr' AND fv.table_name = 'fphd_housing_type' AND
fv.field_name = 'name' AND
flinkv.housing_type_code = fh.housing_type_code LIMIT 1) AS type_name,
fh.code,
fh.comfort_level_code,
fh.architectural_line_code,
fh.housing_type_code AS type,
fh.url_housing_map AS plan,
flosya.brand_label_group_code,
floha.exception_housing AS exception,
fal.owner_code AS architectural_line_owner_code,
fcl.owner_code AS comfort_level_owner_code
FROM
fphd_housing AS fh
INNER JOIN fphd_link_offer_housing_assembly AS floha ON ( fh.code = floha.housing_code )
INNER JOIN fphd_link_offer_season_year_assembly AS flosya ON ( floha.offer_code = flosya.offer_code AND
floha.season_code = flosya.season_code )
LEFT OUTER JOIN fphd_architectural_line AS fal ON ( fh.architectural_line_code = fal.code )
LEFT OUTER JOIN fphd_comfort_level AS fcl ON ( fh.comfort_level_code = fcl.code ) WHERE ( floha.offer_code IN ( SELECT
IF ( blro_group.offer_code = 'CEL', blro_group.main_offer_code, blro_group.offer_code ) as offer_code
FROM
b2chd_link_referent_offer AS blro
INNER JOIN b2chd_link_referent_offer AS blro_group ON ( blro.main_offer_code = blro_group.main_offer_code )
WHERE
blro.offer_code = 'CEL' OR
blro.main_offer_code = 'CEL') OR
floha.offer_code = 'CEL' ) AND floha.season_code = 'SUMMER' ORDER BY fh.code ASC;
|
|
If I switch 'derived_merge=off', I don't see the difference.
|
|
Stephane, the query you've posted doesn't have any FROM subqueries. This means the problem doesn't have anything to do with derived_merge feature.
|
|
Stephane, could you describe the problem in detail?
I suppose the problem is: you see the query to use on-disk temporary tables, which makes it slow? Is it possible to make the query shorter ? It seems to use a lot of repetitive constructs.
Then, with shorter query, could you provide
- the shorter query
- DDL of the tables involved
- EXPLAIN statement
That way, I'll be able to analyze.
|
|
With the supplied dataset and query (test-case-database-MDEV-4793.sql.bz2):
MariaDB 5.3:
creates (but not opens) two temporary tables, both with engine=Aria:
- one for handling GROUP BY
- one for subquery cache.
the table for handling group is later "opened" (with open_tmp_table()) which causes a MAI/MAD file to be created on disk.
|
|
The same in MariaDB 5.5
|
|
What is interesting, it seems to be possible to manually create a HEAP table with this DDL:
##
-
- Create a table with the same DDL that the subuqery would output:
##
MariaDB [j1]> create table t2 as SELECT
-> fv.value
-> FROM
-> fphd_content AS fv
-> WHERE
-> fv.language_code = 'fr';
Query OK, 3305 rows affected (2.20 sec)
Records: 3305 Duplicates: 0 Warnings: 0
MariaDB [j1]> show create table t2\G
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 1. row ***************************
Table: t2
Create Table: CREATE TABLE `t2` (
`value` varchar(3000) CHARACTER SET utf8 NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
MariaDB [j1]> create table t3 like t2;
Query OK, 0 rows affected (0.11 sec)
MariaDB [j1]> alter table t3 engine=heap;
Query OK, 0 rows affected (0.07 sec)
Records: 0 Duplicates: 0 Warnings: 0
MariaDB [j1]> show create table t3\G
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 1. row ***************************
Table: t3
Create Table: CREATE TABLE `t3` (
`value` varchar(3000) CHARACTER SET utf8 NOT NULL
) ENGINE=MEMORY DEFAULT CHARSET=latin1
1 row in set (0.01 sec)
|
|
I have explored why MySQL/MariaDB has CONVERT_IF_BIGGER_TO_BLOB constant.
The reason is: temporary tables use "Fixed" row format. This means that a temporary table with this definition
CREATE TABLE tmp (col1 varchar(3000) CHARACTER SET utf8);
will use 3000* max_char_len(utf8) = 3000* 3= 9000 bytes for each row. Dynamic row format (and blobs will use dynamic row format) is much more economical.
One may ask a question: why can't we use a heap temporary table with a dynamic row format? The answer for this is not yet clear for me. I'll investigate.
|
|
It turns out, MariaDB doesn't support row_format=dynamic for heap tables.
Percona supports row_format=dynamic as of 5.5. However, their create_tmp_table() code is not aware of that, so Percona Server will still create an on-disk table for the provided example.
|
|
In my opinion, this cannot be fixed in MariaDB 5.5. It is quite risky to make changes in create_tmp_table() code. We could consider this as a feature for MariaDB 10.0.
|
|
Hi Sergei,
Should we convert it into a task and target for 10.1?
|
|
Personally, I would like to see this implemented, so I think converting it into a task is a good idea. 
|