[MDEV-5177] ha_partition and innodb index intersection produce fewer rows (MySQL Bug#70703) Created: 2013-10-23  Updated: 2014-03-11  Resolved: 2014-03-11

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 5.5.33a
Fix Version/s: 5.5.37

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None

Attachments: File mdev5555-innodb.test    
Issue Links:
Relates
relates to MDEV-5821 Wrong result of SELECT query while tw... Closed

 Description   

Even after fix for MDEV-5112 (and MySQL Bug#69581), a query that runs index_merge intersection (union is probably affected too) over partitioned table, may return wrong query result.

Testcase:

create table t11 (
  a int not null,
  b int not null,
  pk int not null,
  primary key (pk),
  key(a),
  key(b)
) partition by hash(pk) partitions 10;
 
insert into t11 values (1,2,4); -- both
insert into t11 values (1,0,17);  -- left
insert into t11 values (1,2,25);   -- both
 
insert into t11 values (10,20,122); 
insert into t11 values (10,20,123);
 
-- Now, fill in some data so that the optimizer choses index_merge
create table t12 (a int);
insert into t12 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
 
insert into t11 select 1,2, 200 + A.a + 10*B.a + 100*C.a from t12 A, t12 B, t12 C;
 
insert into t11 select 10+A.a + 10*B.a + 100*C.a + 1000*D.a,
                       10+A.a + 10*B.a + 100*C.a  + 1000*D.a, 
                       2000 + A.a + 10*B.a + 100*C.a + 1000*D.a
                       from t12 A, t12 B, t12 C ,t12 D;
 
-- This should show index_merge, using intersect
explain select * from t11 where a=1 and b=2 and  pk between 1 and 999999 ;
-- 794 rows in output
select * from t11 where a=1 and b=2 and  pk between 1 and 999 ;
-- 802 rows in output
select * from t11 ignore index(a,b)  where a=1 and b=2 and  pk between 1 and 999 ;



 Comments   
Comment by Arnaud Gadal [ 2014-02-07 ]

It seems it's ok with MySQL 5.6.16, I obtain 802 rows in either cases.
MariaDB 5.5.35 still returns 794 and 802.

Comment by Sergei Petrunia [ 2014-02-14 ]

Repeatable on MySQL 5.6.15 though.

Comment by Sergei Petrunia [ 2014-02-14 ]

But not on mysql-5.5.35. It seems the bug is somewhere in the code common between mariadb-5.5 and mysql-5.6.

Comment by Sergei Petrunia [ 2014-02-14 ]

MDEV-5555 might be the same issue as this one.

Comment by Elena Stepanova [ 2014-02-14 ]

The bug was fixed in mysql/5.6 tree by the following revision:

revno: 5592 [merge]
revision-id: aditya.a@oracle.com-20131105140125-tecvsaelewn37kla
parent: tor.didriksen@oracle.com-20131105112641-n71zwdnmx9q8kq8f
parent: aditya.a@oracle.com-20131105135526-agm7nxh5mbf6s6fn
committer: Aditya A <aditya.a@oracle.com>
branch nick: mysql-5.6
timestamp: Tue 2013-11-05 19:31:25 +0530
message:
  Bug#17588348: INDEX MERGE USED ON PARTITIONED TABLE 
                 CAN RETURN WRONG RESULT SET
  
  [Merge from 5.5]
  -
    ------------------------------------------------------------
    revno: 2875.437.288
    revision-id: aditya.a@oracle.com-20131105135526-agm7nxh5mbf6s6fn
    parent: tor.didriksen@oracle.com-20131105090257-p7pfzgf7wgs8eau1
    committer: Aditya A <aditya.a@oracle.com>
    branch nick: mysql-5.5
    timestamp: Tue 2013-11-05 19:25:26 +0530
    message:
      Bug#17588348: INDEX MERGE USED ON PARTITIONED TABLE 
                     CAN RETURN WRONG RESULT SET
      
      PROBLEM
      -------
      In ha_partition::cmp_ref() we were only calling the 
      underlying cmp_ref() of storage engine if the records
      are in the same partiton,else we sort by partition and
      returns the result.But the index merge intersect 
      algorithm expects first to sort by row-id first and 
      then by partition id.
      
      FIX
      ---
      Compare the refernces first using storage engine cmp_ref
      and then if references are equal(only happens if 
      non clustered index is used) then sort it by partition id.
      
      [Approved by Mattiasj #rb3755]
      -

But it does not fix the test case from MDEV-5555. It fails on the current mysql/5.6 tree.

Comment by Sergei Petrunia [ 2014-02-17 ]

Again, full analysis:

  • index_merge/intersect requires that scans over key=const are ROR-scans
    (i.e. the storage engine produces records in rowid order).
  • ha_partition::cmp_ref() defines rowid order as lexicographic order of
    (partition_id, underlying_engine_rowid) pairs.
  • ha_partition::handle_ordered_index_scan() merges row streams from underlying
    partitions. Its priority queue is set to sort rows by key column values.
    partition_id values are ignored. As a result, index scans do not have the ROR
    property.

MySQL's solution is to change ha_partition::cmp_ref() so that the rowid order
is (underlying_engine_rowid, partition_id). This doesn't solve the problem.
The priority queue in handle_ordered_index_scan()
there is still no warranty that handle_ordered_index_scan() will produce rows
in rowid order.

Our solution is to make ha_partition::handle_ordered_index_scan() to produce rows in the order of (key_value, partition_id).

Comment by Arnaud Gadal [ 2014-02-26 ]

I've just tried on the latest MariaDB 5.5.36 and I still get 794 and 802 rows although this bug is tagged as fixed...

Comment by Elena Stepanova [ 2014-02-26 ]

Indeed, the bug was fixed by revno 4074, but immediately re-appered with the postfix 4075:

------------------------------------------------------------
revno: 4074 [merge]
revision-id: psergey@askmonty.org-20140217145354-3apakilmvbwmbflc
parent: sergii@pisem.net-20140217100940-phj213cgrun90d4j
parent: psergey@askmonty.org-20140217145054-90puw5v7oi6iapi4
committer: Sergey Petrunya <psergey@askmonty.org>
branch nick: 5.5-push
timestamp: Mon 2014-02-17 18:53:54 +0400
message:
  Merge
    ------------------------------------------------------------
    revno: 4066.1.1
    revision-id: psergey@askmonty.org-20140217145054-90puw5v7oi6iapi4
    parent: svoj@mariadb.org-20140213124108-tyzdv1v2u8zg2dfu
    committer: Sergey Petrunya <psergey@askmonty.org>
    branch nick: 5.5
    timestamp: Mon 2014-02-17 18:50:54 +0400
    message:
      MDEV-5177: ha_partition and innodb index intersection produce fewer rows (MySQL Bug#70703)
      MDEV-5555: Incorrect index_merge on BTREE indices
      - In ha_partition, make ordered index reads return rows in rowid order
        when index columns are the same.

revno: 4075
revision-id: psergey@askmonty.org-20140217192007-zp0pafkydlzgd0yw
parent: psergey@askmonty.org-20140217145354-3apakilmvbwmbflc
committer: Sergey Petrunya <psergey@askmonty.org>
branch nick: 5.5
timestamp: Mon 2014-02-17 23:20:07 +0400
message:
  Post-merge fixes: merge MariaDB's fix for MDEV-5177 and MDEV-5555 
  with Oracle's fix for Bug#17588348 by reverting Oracle's fix.

Comment by Sergei Petrunia [ 2014-02-27 ]

It is interesting, how the bug could reappear.. mysql-test/t/partition.test has a testcase for MDEV-5177. Why didn't the testcase fail?

Comment by Sergei Petrunia [ 2014-02-27 ]

Still, somehow it reappeared. If I run manually the testcase provided here, I'm able to observe different numbers of rows to be returned.

Comment by Sergei Petrunia [ 2014-02-27 ]

Ok, the problem with the testcase is that I've added into a file that runs with storage_engine=myisam. This was appropriate for MDEV-5555 (which has a testcase with MyISAM), but not for this bug.

Comment by Sergei Petrunia [ 2014-02-27 ]

Debugging the query, I see that merged index scans do not return records in
rowid order:
index=1 rowid=0400-04000000
index=1 rowid=0700-11000000
index=1 rowid=0500-19000000
index=1 rowid=0000-c8000000
index=1 rowid=0100-c9000000
index=1 rowid=0200-ca000000
index=1 rowid=0300-cb000000
index=1 rowid=0400-cc000000
index=1 rowid=0500-cd000000
index=1 rowid=0600-ce000000
index=1 rowid=0700-cf000000
index=1 rowid=0800-d0000000
...

Debugging further, one can see the reason of this. The table's keys are:

primary key (pk),
key(a),
key(b)

extended_keys feature causes the key to be (pk, a)

When I added cmp_key_then_part_id(), it defined the ordering as
1. order by key
2. then order by part_id.

index scan is done on a=1, which means #1 is constant. From that it followed that cmp_key_then_part_id will return records ordered by part_id. That ordering will match the ordering defined by ha_partition::cmp_ref() which starts by comparing part_id.

In reality, the following happens: cmp_key_then_part_id() defines ordering as
1. order by key, extended_key_parts (i.e. pk)
2. then order by part_id.

index scan returns values with a=1, but different values of pk. The ordering doesn't match ordering of ha_partition::cmp_ref anymore.

Comment by Sergei Petrunia [ 2014-02-27 ]

== Ideas on how this could be fixed ==

Ordered index scans over ha_partition return records in the order of (key, extended_key_parts, ...).
There is no way to change that.

It could be nice to get index_merge to inform the storage engine that it needs the index scans to be rowid-ordered but not key-ordered, but this is a too big change for a GA version.

Comment by Sergei Petrunia [ 2014-02-27 ]

A possible solution:

index scan on ha_partition must return records ordered by

key_value, ext_key_parts (if there are any)

When ROR-index_merge is used, the key_value part is fixed, which means that
index scans will return rows ordered by:

ext_key_parts (if there are any)

On the other hand, ROR-index_merge requires that rows come in the order that
matches the ordering imposed by ha_partition::cmp_ref().

The solution is:
1. Let index scan on ha_partition return rows ordered by

key_rec_cmp, then part_id - for tables with extended keys (we do this now)
key_rec_cmp, then underlying_rowid, then part_id - for tables without extended keys.

2. Let ha_partition::cmp_ref() compare rowids by comparing underlying_rowid
first, then comparing part_id.

Comment by Sergei Petrunia [ 2014-02-27 ]

Committed another fix. I'll need a review for it.

Comment by Sergei Petrunia [ 2014-03-04 ]

igor, as you have requested: please find the attached testcase (mdev5555-innodb.test) which fails in MySQL 5.6.

Comment by Sergei Petrunia [ 2014-03-11 ]

Fix pushed into 5.5 tree

Generated at Thu Feb 08 07:02:16 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.