Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-688

LP:882833 - Query with (a is null or a in (1)) sometimes returns only rows with a = 1

    XMLWordPrintable

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Incomplete
    • None
    • None
    • None

    Description

      I've tried very hard to reproduce this on a test set, or reliably, but I cannot. I spoke with Monty in IRC yesterday, he advised I use a debug build since I could create a test case. That option failed as well - with the debug build the result is reliable.

      I have three mariadb servers all of the same build (3249) all running repl and built from snapshots of 5.5.5-m3-log data. The problem persists after a optimize table t1 on all three servers.

      The table is similar to:

      CREATE TABLE `t1` (
      `a` bigint(20) NOT NULL DEFAULT '0',
      `b` varchar(512) DEFAULT NULL,
      `c` int(10) DEFAULT NULL,
      `d` tinyint(1) DEFAULT NULL,
      `e` int(10) DEFAULT NULL,
      `f` tinyint(1) DEFAULT NULL,
      `g` bigint(20) DEFAULT NULL,
      PRIMARY KEY (`a`),
      UNIQUE KEY (`a`,`d`),
      UNIQUE KEY (`a`,`d`,`e`),
      KEY `h` (`c`),
      KEY `i` (`e`,`b`),
      KEY `j` (`b`,`d`,`e`),
      KEY `k` (`b`),
      KEY `l` (`g`),
      KEY `m` (`e`,`d`,`c`,`b`) USING BTREE,
      KEY `n` (`e`,`d`,`g`,`c`) USING BTREE,
      KEY `o` (`f`,`d`,`e`,`b`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1

      The query is similar to:

      select a from t1 where a in ( select a from t1 where e = 5 and f = 0 ) and e = 5 and f = 0 and ( g is null or g in (1) ) order by b limit 10;

      There are 35m records in the table. There should be 150,000 records returned by this query. There are usually 0-4 of these records where g = 1 when I see the problem manifest. The rest are null.

      The problem appears about 80% of the time, and when it does it only returns the rows where g = 1, it does not honor the g is null portion.

      The problem disappears if I write

      ( g is null or g in (1, null))

      I am fairly certain order by related. The problem has never occurred with no order by.

      I am confident limit is not related. Monty asked me to test this. If I take limit off, it will only return those rows where g = 1.

      I am also confident this is not related to optimizer_switch. I shut all of them off (and alternated with in_to_exists and materialization, where appropriate) and was able to reliably reproduce regardless of any optimizer_switch setting.

      I am also confident query_cache_type is not related. The problem manifests regardless of this setting.

      I have worked around it by using:

      ( g is null or g in (1, null))

      for now. But I did want to open a bug for future reference. I am willing to assist in many ways, but I cannot share my schema or data.

      Attachments

        Activity

          People

            Unassigned Unassigned
            fimbulvetr Dan Vande More
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.