[MDEV-5555] Incorrect index_merge on BTREE indices - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 5.5.34, 10.0.8
Fix Version/s: 5.5.36, 10.0.9
Component/s: None
Labels:
None
Environment:
mysql Ver 15.1 Distrib 5.5.34-MariaDB, for debian-linux-gnu (x86_64) using readline 5.1

Ubuntu 12.04 LTS

Description

Trying to simplify the context for this problem, but let me know if you require additional information.

I have a table named base with several columns, including a field named id.

Each field has a BTREE index created via create index X on base(X)

I've found that queries of the type select id from base where X = 'value1' and Y = 'value2' do not return the correct number of results.

If I do a select id, X from base where Y = 'value2' order by X, I can see a full and accurate list of all combinations of X and Y = 'value2', since this query does not execute an index_merge (running an explain on this shows that its a ref query using the index on Y).

But running select id from base where X = 'value1' and Y = 'value2' clearly shows that some rows are dropped from the results when compared to what you see with select id, X from base where Y = 'value2' order by X. Running an explain shows that an index_merge is run on the indices on both X and Y.

Rebuilding the indices does not fix the problem, and results in the same rows consistently being dropped from the query results.

I can provide firmer context with actual query outputs if this is too abstract. Thanks for your help.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

1krows_dump_min.sql
3.13 MB
2014-02-10 20:22
mdev5555.test
20 kB
2014-02-13 23:05

Activity

Ascending order - Click to sort in descending order

Nicholas Hwang created issue - 2014-01-23 20:00

Sergei Golubchik made changes - 2014-01-28 18:32

Field	Original Value	New Value
Assignee		Elena Stepanova [ elenst ]

Elena Stepanova added a comment - 2014-01-28 19:21

Hi,

Yes, it would be great if you could provide a data dump for the tables involved into the query, and the actual query. If the data is sensitive, you can upload it to the private section of our FTP (ftp://ftp.askmonty.org/private/)

We have some open bugs regarding index_merge, but it's difficult to say whether your case is related to them or not without the actual data.

Elena Stepanova added a comment - 2014-01-28 19:21 Hi, Yes, it would be great if you could provide a data dump for the tables involved into the query, and the actual query. If the data is sensitive, you can upload it to the private section of our FTP ( ftp://ftp.askmonty.org/private/ ) We have some open bugs regarding index_merge, but it's difficult to say whether your case is related to them or not without the actual data.

Elena Stepanova made changes - 2014-01-28 19:21

Due Date

2014-02-28

Michael Reschly added a comment - 2014-02-07 23:03

Hi Elena,

I work with Nick. I put a data dump in the private FTP folder you referenced (10krows_bugtest.sql.gz). The data isn't private (it's synthetically generated), but it's ~1GB compressed/1.4GB uncompressed, which seemed too big to attach to attach to JIRA.

As for queries which give differing results:

select id, fname, race, state from base where fname = 'Michael' and race = 'White' and state = 'California'; returns 7 rows
select id, fname, race, state from base IGNORE INDEX(fname, race, state) where fname = 'Michael' and race = 'White' and state = 'California'; returns 10 rows
select id, fname, race, state from base where fname = 'Michael' and race = 'White' ORDER BY state; returns 10 rows with state=California
select id, fname, race, state from base where fname = 'Michael' and state = 'California' ORDER BY race; returns 7 rows with race=White
select id, fname, race, state from base IGNORE INDEX(fname, race, state) where fname = 'Michael' and state = 'California' ORDER BY race; returns 10 rows with race=White

Michael Reschly added a comment - 2014-02-07 23:03 Hi Elena, I work with Nick. I put a data dump in the private FTP folder you referenced (10krows_bugtest.sql.gz). The data isn't private (it's synthetically generated), but it's ~1GB compressed/1.4GB uncompressed, which seemed too big to attach to attach to JIRA. As for queries which give differing results: select id, fname, race, state from base where fname = 'Michael' and race = 'White' and state = 'California'; returns 7 rows select id, fname, race, state from base IGNORE INDEX(fname, race, state) where fname = 'Michael' and race = 'White' and state = 'California'; returns 10 rows select id, fname, race, state from base where fname = 'Michael' and race = 'White' ORDER BY state; returns 10 rows with state=California select id, fname, race, state from base where fname = 'Michael' and state = 'California' ORDER BY race; returns 7 rows with race=White select id, fname, race, state from base IGNORE INDEX(fname, race, state) where fname = 'Michael' and state = 'California' ORDER BY race; returns 10 rows with race=White

Michael Reschly added a comment - 2014-02-10 20:22

Significantly smaller database dump that demonstrates the index_merge issue. All data contained within this file is fully synthetic. Any similarity to any persons, real or fictional, is purely coincidental.

Michael Reschly added a comment - 2014-02-10 20:22 Significantly smaller database dump that demonstrates the index_merge issue. All data contained within this file is fully synthetic. Any similarity to any persons, real or fictional, is purely coincidental.

Michael Reschly made changes - 2014-02-10 20:22

Attachment

1krows_dump_min.sql [ 26401 ]

Michael Reschly added a comment - 2014-02-10 20:31

I attached a much smaller database that demonstrates the same issue, which should be easier to work with. As for queries:

Query using index:
select id, dob from base where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21'; returns 2 rows
Query without index:
select id, dob from base IGNORE INDEX(dob, weeks_worked_last_year, hours_worked_per_week) where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21'; returns 8 rows

Query 2 of three terms, order by third:
select id, dob, weeks_worked_last_year from base where hours_worked_per_week = 40 and dob < '1949-11-21' order by weeks_worked_last_year; returns 8 rows that have weeks_worked_last_year = 52
select id, dob, hours_worked_per_week from base where weeks_worked_last_year = 52 and dob < '1949-11-21' order by hours_worked_per_week ; returns 8 rows that have hours_worked_per_week = 40

Michael Reschly added a comment - 2014-02-10 20:31 I attached a much smaller database that demonstrates the same issue, which should be easier to work with. As for queries: Query using index: select id, dob from base where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21'; returns 2 rows Query without index: select id, dob from base IGNORE INDEX(dob, weeks_worked_last_year, hours_worked_per_week) where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21'; returns 8 rows Query 2 of three terms, order by third: select id, dob, weeks_worked_last_year from base where hours_worked_per_week = 40 and dob < '1949-11-21' order by weeks_worked_last_year; returns 8 rows that have weeks_worked_last_year = 52 select id, dob, hours_worked_per_week from base where weeks_worked_last_year = 52 and dob < '1949-11-21' order by hours_worked_per_week ; returns 8 rows that have hours_worked_per_week = 40

Elena Stepanova added a comment - 2014-02-13 23:05 - edited

Thank you for creating the test case!

I reduced it a bit further, and replaced data with generic values, it is attached as mdev5555.test.

Output of the queries:

MariaDB [test]> select * from base where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21';

Empty set (0.01 sec)

MariaDB [test]> select * from base IGNORE INDEX(dob, weeks_worked_last_year, hours_worked_per_week) where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21';

+----+-----+------------+-----------+--------+-----------------------+------------------------+

| id | id2 | dob        | address   | city   | hours_worked_per_week | weeks_worked_last_year |

+----+-----+------------+-----------+--------+-----------------------+------------------------+

| 16 |  16 | 1949-11-07 | address16 | city16 |                    40 |                     52 |

| 50 |  50 | 1923-09-08 | address50 | city50 |                    40 |                     52 |

+----+-----+------------+-----------+--------+-----------------------+------------------------+

2 rows in set (0.00 sec)

Elena Stepanova added a comment - 2014-02-13 23:05 - edited Thank you for creating the test case! I reduced it a bit further, and replaced data with generic values, it is attached as mdev5555.test. Output of the queries: MariaDB [test]> select * from base where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21' ; Empty set (0.01 sec) MariaDB [test]> select * from base IGNORE INDEX (dob, weeks_worked_last_year, hours_worked_per_week) where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21' ; + ----+-----+------------+-----------+--------+-----------------------+------------------------+ | id | id2 | dob | address | city | hours_worked_per_week | weeks_worked_last_year | + ----+-----+------------+-----------+--------+-----------------------+------------------------+ | 16 | 16 | 1949-11-07 | address16 | city16 | 40 | 52 | | 50 | 50 | 1923-09-08 | address50 | city50 | 40 | 52 | + ----+-----+------------+-----------+--------+-----------------------+------------------------+ 2 rows in set (0.00 sec)

Elena Stepanova made changes - 2014-02-13 23:05

Attachment		mdev5555.test [ 26500 ]
Fix Version/s		10.0.9 [ 14400 ]
Fix Version/s		5.5.36 [ 14600 ]
Affects Version/s		10.0.8 [ 14200 ]
Assignee	Elena Stepanova [ elenst ]	Sergei Petrunia [ psergey ]
Due Date	2014-02-28

Sergei Petrunia added a comment - 2014-02-14 16:01

index_merge and partitioned tables- this could be the same issue as ~~MDEV-5177~~.

Sergei Petrunia added a comment - 2014-02-14 16:01 index_merge and partitioned tables- this could be the same issue as MDEV-5177 .

Sergei Petrunia added a comment - 2014-02-15 01:39

Records are read this way:

#0 key_rec_cmp
#1 0x0000000000ab3e57 in _downheap
#2 0x0000000000ab3f70 in queue_fix
#3 0x0000000000e3c459 in ha_partition::handle_ordered_index_scan
#4 0x0000000000e3ad1a in ha_partition::common_index_read
#5 0x0000000000e3b4fc in ha_partition::read_range_first
#6 0x0000000000648467 in handler::multi_range_read_next
#7 0x00000000009765ba in QUICK_RANGE_SELECT::get_next
#8 0x0000000000975929 in QUICK_ROR_INTERSECT_SELECT::get_next
#9 0x000000000099015c in rr_quick
#10 0x00000000007b1a07 in join_init_read_record

QUICK_ROR_INTERSECT_SELECT requires that its inputs are ordered by Rowid. ha_partition does an ordered index scan.

The scan is done as follows: get a row from each partition, put them into a priority queue, then get the row from the top of the priority queue. Put another row there, repeat.

The problem is that ha_partition will not produce records in rowid order when using this strategy.

Rowid for ha_partition table is

{partition_no, underlying_table_rowid}

, and ha_partition::cmp_ref() compares rowid tuples in lexicographic order (ie. partition_no is compared first, then rowid).

Sergei Petrunia added a comment - 2014-02-15 01:39 Records are read this way: #0 key_rec_cmp #1 0x0000000000ab3e57 in _downheap #2 0x0000000000ab3f70 in queue_fix #3 0x0000000000e3c459 in ha_partition::handle_ordered_index_scan #4 0x0000000000e3ad1a in ha_partition::common_index_read #5 0x0000000000e3b4fc in ha_partition::read_range_first #6 0x0000000000648467 in handler::multi_range_read_next #7 0x00000000009765ba in QUICK_RANGE_SELECT::get_next #8 0x0000000000975929 in QUICK_ROR_INTERSECT_SELECT::get_next #9 0x000000000099015c in rr_quick #10 0x00000000007b1a07 in join_init_read_record QUICK_ROR_INTERSECT_SELECT requires that its inputs are ordered by Rowid. ha_partition does an ordered index scan. The scan is done as follows: get a row from each partition, put them into a priority queue, then get the row from the top of the priority queue. Put another row there, repeat. The problem is that ha_partition will not produce records in rowid order when using this strategy. Rowid for ha_partition table is {partition_no, underlying_table_rowid} , and ha_partition::cmp_ref() compares rowid tuples in lexicographic order (ie. partition_no is compared first, then rowid).

Sergei Petrunia added a comment - 2014-02-15 02:02

Elena has found MySQL's fix for Bug#17588348. (See ~~MDEV-5177~~).

That fix changes ha_partition::cmp_ref() to compare

{partition_no, underlying_table_rowid}

pairs by comparing underlying_table_rowid first, and partition_no second.

The problem is that this still doesn't make ha_partition to produce rowid-ordered streams.

Sergei Petrunia added a comment - 2014-02-15 02:02 Elena has found MySQL's fix for Bug#17588348. (See MDEV-5177 ). That fix changes ha_partition::cmp_ref() to compare {partition_no, underlying_table_rowid} pairs by comparing underlying_table_rowid first, and partition_no second. The problem is that this still doesn't make ha_partition to produce rowid-ordered streams.

Sergei Petrunia added a comment - 2014-02-17 17:04

Fix pushed into 5.5 tree and will be in the next 5.5 release.

Thanks for the detailed bug report, and for the testcase.

Sergei Petrunia added a comment - 2014-02-17 17:04 Fix pushed into 5.5 tree and will be in the next 5.5 release. Thanks for the detailed bug report, and for the testcase.

Sergei Petrunia made changes - 2014-02-17 17:04

Resolution		Fixed [ 1 ]
Status	Open [ 1 ]	Closed [ 6 ]

Nicholas Hwang added a comment - 2014-02-18 08:10

Wonderful, thanks!

We have some very large existing datasets that we'd like to use as is, if possible. Does this bug require that these datasets be regenerated using the patched server? Or just that the indices be re-created? Or just that the patched server be running?

Nicholas Hwang added a comment - 2014-02-18 08:10 Wonderful, thanks! We have some very large existing datasets that we'd like to use as is, if possible. Does this bug require that these datasets be regenerated using the patched server? Or just that the indices be re-created? Or just that the patched server be running?

Sergei Petrunia added a comment - 2014-02-18 10:21

The patch only changes the way data is read. There is no need to regenerate data or to rebuild indexes. It should be sufficient to start the patched server on the existing data directory.

Sergei Petrunia added a comment - 2014-02-18 10:21 The patch only changes the way data is read. There is no need to regenerate data or to rebuild indexes. It should be sufficient to start the patched server on the existing data directory.

Nicholas Hwang added a comment - 2014-02-19 02:26

Excellent, thanks so much for the quick turn around.

Nicholas Hwang added a comment - 2014-02-19 02:26 Excellent, thanks so much for the quick turn around.

Michael Reschly added a comment - 2014-02-26 16:00

> There is no need to regenerate data or to rebuild indexes. It should be sufficient to start the patched server on the existing data directory.

Hi Sergi,

I just wanted to confirm that this was the case. I used patched binaries on our existing data and the correct results were returned without regenerating data or re-indexing.

Thanks again.

Michael Reschly added a comment - 2014-02-26 16:00 > There is no need to regenerate data or to rebuild indexes. It should be sufficient to start the patched server on the existing data directory. Hi Sergi, I just wanted to confirm that this was the case. I used patched binaries on our existing data and the correct results were returned without regenerating data or re-indexing. Thanks again.

Sergei Golubchik made changes - 2014-06-13 15:06

Workflow

defaullt [ 33719 ]

MariaDB v2 [ 42989 ]

Rasmus Johansson (Inactive) made changes - 2015-05-18 17:51

Workflow

MariaDB v2 [ 42989 ]

MariaDB v3 [ 62685 ]

Sergei Golubchik made changes - 2021-12-06 21:39

Workflow

MariaDB v3 [ 62685 ]

MariaDB v4 [ 147428 ]

People

Assignee:: Sergei Petrunia

Reporter:: Nicholas Hwang

Votes:: 0 Vote for this issue

Watchers:: 5 Start watching this issue

Dates

Created:: 2014-01-23 20:00

Updated:: 2014-02-26 16:00

Resolved:: 2014-02-17 17:04

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Attachments

Attachments

Activity

People

Dates

Git Integration