Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-9968

Subquery not optimized when join is between Aria and InnoDB tables

Details

    Description

      See my stack exchange post and answer here for reference:

      http://dba.stackexchange.com/questions/134813/materialized-subquery-not-optimized

      Primary query table uses Aria engine. Subquery uses InnoDB. Query was not optimizing properly and was joining on "index" type. Expecting "eq_ref" type because primary table constraint was using primary key. Changing primary table to InnoDB fixed the issue and used eq_ref as expected.

      Attachments

        Issue Links

          Activity

            The problem disappeared from 10.2 branch after this commit:

            commit c8dc866fdeee551993ef91fb321135f9106ea00e
            Author: Sergei Petrunia <psergey@askmonty.org>
            Date:   Tue Sep 10 23:51:42 2019 +0300
             
                MDEV-20371: Invalid reads at plan refinement stage: join->positions...
            

            For a reference, here is the test case I used for checking:

            --source include/have_innodb.inc
            DROP TABLE IF EXISTS ri_email_messages, ri_email_messages_contacts, ri_email_contacts;
            CREATE TABLE ri_email_messages (id INT PRIMARY KEY) ENGINE=Aria;
            INSERT INTO ri_email_messages VALUES (1),(2),(3),(4),(5),(6),(7),(8);
            CREATE TABLE ri_email_messages_contacts (contact_id INT, message_id INT, PRIMARY KEY(contact_id,message_id), KEY(contact_id), KEY(message_id)) ENGINE=InnoDB;
            INSERT INTO ri_email_messages_contacts VALUES (1,1),(2,2),(3,2),(4,2),(4,3),(4,4),(5,2);
            CREATE TABLE ri_email_contacts (id INT PRIMARY KEY, email VARCHAR(128), KEY(email)) ENGINE=InnoDB;
            INSERT INTO ri_email_contacts VALUES (1,'foo'),(2,'bar'),(3,'baz'),(4,'qux'),(5,'foobar');
             
            EXPLAIN EXTENDED SELECT rem.id                 
            FROM ri_email_messages rem
            WHERE rem.id IN(
                SELECT remc.message_id
                FROM ri_email_messages_contacts remc  
                  INNER JOIN ri_email_contacts rec ON remc.contact_id = rec.id
                WHERE rec.email = 'email@address.com'
            );
            DROP TABLE IF EXISTS ri_email_messages, ri_email_messages_contacts, ri_email_contacts;
            

            Before the patch above:

            10.2 863a9517311

            id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
            1	PRIMARY	<subquery2>	ALL	distinct_key	NULL	NULL	NULL	1	100.00	
            1	PRIMARY	rem	index	PRIMARY	PRIMARY	4	NULL	8	75.00	Using where; Using index; Using join buffer (flat, BNL join)
            2	MATERIALIZED	rec	ref	PRIMARY,email	email	131	const	1	100.00	Using where; Using index
            2	MATERIALIZED	remc	ref	PRIMARY,contact_id,message_id	PRIMARY	4	test.rec.id	1	100.00	Using index
            Warnings:
            Note	1003	select `test`.`rem`.`id` AS `id` from `test`.`ri_email_messages` `rem` semi join (`test`.`ri_email_messages_contacts` `remc` join `test`.`ri_email_contacts` `rec`) where `test`.`rec`.`email` = 'email@address.com' and `test`.`rem`.`id` = `test`.`remc`.`message_id` and `test`.`remc`.`contact_id` = `test`.`rec`.`id`
            

            After the patch:

            10.2 c8dc866fde

            id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
            1	PRIMARY	<subquery2>	ALL	distinct_key	NULL	NULL	NULL	1	100.00	
            1	PRIMARY	rem	eq_ref	PRIMARY	PRIMARY	4	test.remc.message_id	1	100.00	Using index
            2	MATERIALIZED	rec	ref	PRIMARY,email	email	131	const	1	100.00	Using where; Using index
            2	MATERIALIZED	remc	ref	PRIMARY,contact_id,message_id	PRIMARY	4	test.rec.id	1	100.00	Using index
            Warnings:
            Note	1003	select `test`.`rem`.`id` AS `id` from `test`.`ri_email_messages` `rem` semi join (`test`.`ri_email_messages_contacts` `remc` join `test`.`ri_email_contacts` `rec`) where `test`.`rec`.`email` = 'email@address.com' and `test`.`rem`.`id` = `test`.`remc`.`message_id` and `test`.`remc`.`contact_id` = `test`.`rec`.`id`
            

            elenst Elena Stepanova added a comment - The problem disappeared from 10.2 branch after this commit: commit c8dc866fdeee551993ef91fb321135f9106ea00e Author: Sergei Petrunia <psergey@askmonty.org> Date: Tue Sep 10 23:51:42 2019 +0300   MDEV-20371: Invalid reads at plan refinement stage: join->positions... For a reference, here is the test case I used for checking: --source include/have_innodb.inc DROP TABLE IF EXISTS ri_email_messages, ri_email_messages_contacts, ri_email_contacts; CREATE TABLE ri_email_messages (id INT PRIMARY KEY ) ENGINE=Aria; INSERT INTO ri_email_messages VALUES (1),(2),(3),(4),(5),(6),(7),(8); CREATE TABLE ri_email_messages_contacts (contact_id INT , message_id INT , PRIMARY KEY (contact_id,message_id), KEY (contact_id), KEY (message_id)) ENGINE=InnoDB; INSERT INTO ri_email_messages_contacts VALUES (1,1),(2,2),(3,2),(4,2),(4,3),(4,4),(5,2); CREATE TABLE ri_email_contacts (id INT PRIMARY KEY , email VARCHAR (128), KEY (email)) ENGINE=InnoDB; INSERT INTO ri_email_contacts VALUES (1, 'foo' ),(2, 'bar' ),(3, 'baz' ),(4, 'qux' ),(5, 'foobar' );   EXPLAIN EXTENDED SELECT rem.id FROM ri_email_messages rem WHERE rem.id IN ( SELECT remc.message_id FROM ri_email_messages_contacts remc INNER JOIN ri_email_contacts rec ON remc.contact_id = rec.id WHERE rec.email = 'email@address.com' ); DROP TABLE IF EXISTS ri_email_messages, ri_email_messages_contacts, ri_email_contacts; Before the patch above: 10.2 863a9517311 id select_type table type possible_keys key key_len ref rows filtered Extra 1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 1 100.00 1 PRIMARY rem index PRIMARY PRIMARY 4 NULL 8 75.00 Using where ; Using index ; Using join buffer (flat, BNL join ) 2 MATERIALIZED rec ref PRIMARY ,email email 131 const 1 100.00 Using where ; Using index 2 MATERIALIZED remc ref PRIMARY ,contact_id,message_id PRIMARY 4 test.rec.id 1 100.00 Using index Warnings: Note 1003 select `test`.`rem`.`id` AS `id` from `test`.`ri_email_messages` `rem` semi join (`test`.`ri_email_messages_contacts` `remc` join `test`.`ri_email_contacts` `rec`) where `test`.`rec`.`email` = 'email@address.com' and `test`.`rem`.`id` = `test`.`remc`.`message_id` and `test`.`remc`.`contact_id` = `test`.`rec`.`id` After the patch: 10.2 c8dc866fde id select_type table type possible_keys key key_len ref rows filtered Extra 1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 1 100.00 1 PRIMARY rem eq_ref PRIMARY PRIMARY 4 test.remc.message_id 1 100.00 Using index 2 MATERIALIZED rec ref PRIMARY ,email email 131 const 1 100.00 Using where ; Using index 2 MATERIALIZED remc ref PRIMARY ,contact_id,message_id PRIMARY 4 test.rec.id 1 100.00 Using index Warnings: Note 1003 select `test`.`rem`.`id` AS `id` from `test`.`ri_email_messages` `rem` semi join (`test`.`ri_email_messages_contacts` `remc` join `test`.`ri_email_contacts` `rec`) where `test`.`rec`.`email` = 'email@address.com' and `test`.`rem`.`id` = `test`.`remc`.`message_id` and `test`.`remc`.`contact_id` = `test`.`rec`.`id`

            People

              psergei Sergei Petrunia
              mr.rost Mike
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.