Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5555

Incorrect index_merge on BTREE indices

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 5.5.34, 10.0.8
    • 5.5.36, 10.0.9
    • None
    • None
    • mysql Ver 15.1 Distrib 5.5.34-MariaDB, for debian-linux-gnu (x86_64) using readline 5.1

      Ubuntu 12.04 LTS

    Description

      Trying to simplify the context for this problem, but let me know if you require additional information.

      I have a table named base with several columns, including a field named id.

      Each field has a BTREE index created via create index X on base(X)

      I've found that queries of the type select id from base where X = 'value1' and Y = 'value2' do not return the correct number of results.

      If I do a select id, X from base where Y = 'value2' order by X, I can see a full and accurate list of all combinations of X and Y = 'value2', since this query does not execute an index_merge (running an explain on this shows that its a ref query using the index on Y).

      But running select id from base where X = 'value1' and Y = 'value2' clearly shows that some rows are dropped from the results when compared to what you see with select id, X from base where Y = 'value2' order by X. Running an explain shows that an index_merge is run on the indices on both X and Y.

      Rebuilding the indices does not fix the problem, and results in the same rows consistently being dropped from the query results.

      I can provide firmer context with actual query outputs if this is too abstract. Thanks for your help.

      Attachments

        Activity

          njhwang Nicholas Hwang created issue -
          serg Sergei Golubchik made changes -
          Field Original Value New Value
          Assignee Elena Stepanova [ elenst ]

          Hi,

          Yes, it would be great if you could provide a data dump for the tables involved into the query, and the actual query. If the data is sensitive, you can upload it to the private section of our FTP (ftp://ftp.askmonty.org/private/)

          We have some open bugs regarding index_merge, but it's difficult to say whether your case is related to them or not without the actual data.

          elenst Elena Stepanova added a comment - Hi, Yes, it would be great if you could provide a data dump for the tables involved into the query, and the actual query. If the data is sensitive, you can upload it to the private section of our FTP ( ftp://ftp.askmonty.org/private/ ) We have some open bugs regarding index_merge, but it's difficult to say whether your case is related to them or not without the actual data.
          elenst Elena Stepanova made changes -
          Due Date 2014-02-28

          Hi Elena,

          I work with Nick. I put a data dump in the private FTP folder you referenced (10krows_bugtest.sql.gz). The data isn't private (it's synthetically generated), but it's ~1GB compressed/1.4GB uncompressed, which seemed too big to attach to attach to JIRA.

          As for queries which give differing results:

          select id, fname, race, state from base where fname = 'Michael' and race = 'White' and state = 'California'; returns 7 rows
          select id, fname, race, state from base IGNORE INDEX(fname, race, state) where fname = 'Michael' and race = 'White' and state = 'California'; returns 10 rows
          select id, fname, race, state from base where fname = 'Michael' and race = 'White' ORDER BY state; returns 10 rows with state=California
          select id, fname, race, state from base where fname = 'Michael' and state = 'California' ORDER BY race; returns 7 rows with race=White
          select id, fname, race, state from base IGNORE INDEX(fname, race, state) where fname = 'Michael' and state = 'California' ORDER BY race; returns 10 rows with race=White

          reschly Michael Reschly added a comment - Hi Elena, I work with Nick. I put a data dump in the private FTP folder you referenced (10krows_bugtest.sql.gz). The data isn't private (it's synthetically generated), but it's ~1GB compressed/1.4GB uncompressed, which seemed too big to attach to attach to JIRA. As for queries which give differing results: select id, fname, race, state from base where fname = 'Michael' and race = 'White' and state = 'California'; returns 7 rows select id, fname, race, state from base IGNORE INDEX(fname, race, state) where fname = 'Michael' and race = 'White' and state = 'California'; returns 10 rows select id, fname, race, state from base where fname = 'Michael' and race = 'White' ORDER BY state; returns 10 rows with state=California select id, fname, race, state from base where fname = 'Michael' and state = 'California' ORDER BY race; returns 7 rows with race=White select id, fname, race, state from base IGNORE INDEX(fname, race, state) where fname = 'Michael' and state = 'California' ORDER BY race; returns 10 rows with race=White

          Significantly smaller database dump that demonstrates the index_merge issue. All data contained within this file is fully synthetic. Any similarity to any persons, real or fictional, is purely coincidental.

          reschly Michael Reschly added a comment - Significantly smaller database dump that demonstrates the index_merge issue. All data contained within this file is fully synthetic. Any similarity to any persons, real or fictional, is purely coincidental.
          reschly Michael Reschly made changes -
          Attachment 1krows_dump_min.sql [ 26401 ]

          I attached a much smaller database that demonstrates the same issue, which should be easier to work with. As for queries:

          Query using index:
          select id, dob from base where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21'; returns 2 rows
          Query without index:
          select id, dob from base IGNORE INDEX(dob, weeks_worked_last_year, hours_worked_per_week) where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21'; returns 8 rows

          Query 2 of three terms, order by third:
          select id, dob, weeks_worked_last_year from base where hours_worked_per_week = 40 and dob < '1949-11-21' order by weeks_worked_last_year; returns 8 rows that have weeks_worked_last_year = 52
          select id, dob, hours_worked_per_week from base where weeks_worked_last_year = 52 and dob < '1949-11-21' order by hours_worked_per_week ; returns 8 rows that have hours_worked_per_week = 40

          reschly Michael Reschly added a comment - I attached a much smaller database that demonstrates the same issue, which should be easier to work with. As for queries: Query using index: select id, dob from base where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21'; returns 2 rows Query without index: select id, dob from base IGNORE INDEX(dob, weeks_worked_last_year, hours_worked_per_week) where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21'; returns 8 rows Query 2 of three terms, order by third: select id, dob, weeks_worked_last_year from base where hours_worked_per_week = 40 and dob < '1949-11-21' order by weeks_worked_last_year; returns 8 rows that have weeks_worked_last_year = 52 select id, dob, hours_worked_per_week from base where weeks_worked_last_year = 52 and dob < '1949-11-21' order by hours_worked_per_week ; returns 8 rows that have hours_worked_per_week = 40
          elenst Elena Stepanova added a comment - - edited

          Thank you for creating the test case!

          I reduced it a bit further, and replaced data with generic values, it is attached as mdev5555.test.

          Output of the queries:

          MariaDB [test]> select * from base where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21';
          Empty set (0.01 sec)

          MariaDB [test]> select * from base IGNORE INDEX(dob, weeks_worked_last_year, hours_worked_per_week) where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21';
          +----+-----+------------+-----------+--------+-----------------------+------------------------+
          | id | id2 | dob        | address   | city   | hours_worked_per_week | weeks_worked_last_year |
          +----+-----+------------+-----------+--------+-----------------------+------------------------+
          | 16 |  16 | 1949-11-07 | address16 | city16 |                    40 |                     52 |
          | 50 |  50 | 1923-09-08 | address50 | city50 |                    40 |                     52 |
          +----+-----+------------+-----------+--------+-----------------------+------------------------+
          2 rows in set (0.00 sec)

          elenst Elena Stepanova added a comment - - edited Thank you for creating the test case! I reduced it a bit further, and replaced data with generic values, it is attached as mdev5555.test. Output of the queries: MariaDB [test]> select * from base where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21' ; Empty set (0.01 sec) MariaDB [test]> select * from base IGNORE INDEX (dob, weeks_worked_last_year, hours_worked_per_week) where hours_worked_per_week = 40 and weeks_worked_last_year = 52 and dob < '1949-11-21' ; + ----+-----+------------+-----------+--------+-----------------------+------------------------+ | id | id2 | dob | address | city | hours_worked_per_week | weeks_worked_last_year | + ----+-----+------------+-----------+--------+-----------------------+------------------------+ | 16 | 16 | 1949-11-07 | address16 | city16 | 40 | 52 | | 50 | 50 | 1923-09-08 | address50 | city50 | 40 | 52 | + ----+-----+------------+-----------+--------+-----------------------+------------------------+ 2 rows in set (0.00 sec)
          elenst Elena Stepanova made changes -
          Attachment mdev5555.test [ 26500 ]
          Fix Version/s 10.0.9 [ 14400 ]
          Fix Version/s 5.5.36 [ 14600 ]
          Affects Version/s 10.0.8 [ 14200 ]
          Assignee Elena Stepanova [ elenst ] Sergei Petrunia [ psergey ]
          Due Date 2014-02-28

          index_merge and partitioned tables- this could be the same issue as MDEV-5177.

          psergei Sergei Petrunia added a comment - index_merge and partitioned tables- this could be the same issue as MDEV-5177 .

          Records are read this way:

          #0 key_rec_cmp
          #1 0x0000000000ab3e57 in _downheap
          #2 0x0000000000ab3f70 in queue_fix
          #3 0x0000000000e3c459 in ha_partition::handle_ordered_index_scan
          #4 0x0000000000e3ad1a in ha_partition::common_index_read
          #5 0x0000000000e3b4fc in ha_partition::read_range_first
          #6 0x0000000000648467 in handler::multi_range_read_next
          #7 0x00000000009765ba in QUICK_RANGE_SELECT::get_next
          #8 0x0000000000975929 in QUICK_ROR_INTERSECT_SELECT::get_next
          #9 0x000000000099015c in rr_quick
          #10 0x00000000007b1a07 in join_init_read_record

          QUICK_ROR_INTERSECT_SELECT requires that its inputs are ordered by Rowid. ha_partition does an ordered index scan.

          The scan is done as follows: get a row from each partition, put them into a priority queue, then get the row from the top of the priority queue. Put another row there, repeat.

          The problem is that ha_partition will not produce records in rowid order when using this strategy.

          Rowid for ha_partition table is

          {partition_no, underlying_table_rowid}

          , and ha_partition::cmp_ref() compares rowid tuples in lexicographic order (ie. partition_no is compared first, then rowid).

          psergei Sergei Petrunia added a comment - Records are read this way: #0 key_rec_cmp #1 0x0000000000ab3e57 in _downheap #2 0x0000000000ab3f70 in queue_fix #3 0x0000000000e3c459 in ha_partition::handle_ordered_index_scan #4 0x0000000000e3ad1a in ha_partition::common_index_read #5 0x0000000000e3b4fc in ha_partition::read_range_first #6 0x0000000000648467 in handler::multi_range_read_next #7 0x00000000009765ba in QUICK_RANGE_SELECT::get_next #8 0x0000000000975929 in QUICK_ROR_INTERSECT_SELECT::get_next #9 0x000000000099015c in rr_quick #10 0x00000000007b1a07 in join_init_read_record QUICK_ROR_INTERSECT_SELECT requires that its inputs are ordered by Rowid. ha_partition does an ordered index scan. The scan is done as follows: get a row from each partition, put them into a priority queue, then get the row from the top of the priority queue. Put another row there, repeat. The problem is that ha_partition will not produce records in rowid order when using this strategy. Rowid for ha_partition table is {partition_no, underlying_table_rowid} , and ha_partition::cmp_ref() compares rowid tuples in lexicographic order (ie. partition_no is compared first, then rowid).

          Elena has found MySQL's fix for Bug#17588348. (See MDEV-5177).

          That fix changes ha_partition::cmp_ref() to compare

          {partition_no, underlying_table_rowid}

          pairs by comparing underlying_table_rowid first, and partition_no second.

          The problem is that this still doesn't make ha_partition to produce rowid-ordered streams.

          psergei Sergei Petrunia added a comment - Elena has found MySQL's fix for Bug#17588348. (See MDEV-5177 ). That fix changes ha_partition::cmp_ref() to compare {partition_no, underlying_table_rowid} pairs by comparing underlying_table_rowid first, and partition_no second. The problem is that this still doesn't make ha_partition to produce rowid-ordered streams.

          Fix pushed into 5.5 tree and will be in the next 5.5 release.

          Thanks for the detailed bug report, and for the testcase.

          psergei Sergei Petrunia added a comment - Fix pushed into 5.5 tree and will be in the next 5.5 release. Thanks for the detailed bug report, and for the testcase.
          psergei Sergei Petrunia made changes -
          Resolution Fixed [ 1 ]
          Status Open [ 1 ] Closed [ 6 ]

          Wonderful, thanks!

          We have some very large existing datasets that we'd like to use as is, if possible. Does this bug require that these datasets be regenerated using the patched server? Or just that the indices be re-created? Or just that the patched server be running?

          njhwang Nicholas Hwang added a comment - Wonderful, thanks! We have some very large existing datasets that we'd like to use as is, if possible. Does this bug require that these datasets be regenerated using the patched server? Or just that the indices be re-created? Or just that the patched server be running?

          The patch only changes the way data is read. There is no need to regenerate data or to rebuild indexes. It should be sufficient to start the patched server on the existing data directory.

          psergei Sergei Petrunia added a comment - The patch only changes the way data is read. There is no need to regenerate data or to rebuild indexes. It should be sufficient to start the patched server on the existing data directory.

          Excellent, thanks so much for the quick turn around.

          njhwang Nicholas Hwang added a comment - Excellent, thanks so much for the quick turn around.

          > There is no need to regenerate data or to rebuild indexes. It should be sufficient to start the patched server on the existing data directory.

          Hi Sergi,

          I just wanted to confirm that this was the case. I used patched binaries on our existing data and the correct results were returned without regenerating data or re-indexing.

          Thanks again.

          reschly Michael Reschly added a comment - > There is no need to regenerate data or to rebuild indexes. It should be sufficient to start the patched server on the existing data directory. Hi Sergi, I just wanted to confirm that this was the case. I used patched binaries on our existing data and the correct results were returned without regenerating data or re-indexing. Thanks again.
          serg Sergei Golubchik made changes -
          Workflow defaullt [ 33719 ] MariaDB v2 [ 42989 ]
          ratzpo Rasmus Johansson (Inactive) made changes -
          Workflow MariaDB v2 [ 42989 ] MariaDB v3 [ 62685 ]
          serg Sergei Golubchik made changes -
          Workflow MariaDB v3 [ 62685 ] MariaDB v4 [ 147428 ]

          People

            psergei Sergei Petrunia
            njhwang Nicholas Hwang
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.