Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Incomplete
-
None
-
None
-
None
Description
I've tried very hard to reproduce this on a test set, or reliably, but I cannot. I spoke with Monty in IRC yesterday, he advised I use a debug build since I could create a test case. That option failed as well - with the debug build the result is reliable.
I have three mariadb servers all of the same build (3249) all running repl and built from snapshots of 5.5.5-m3-log data. The problem persists after a optimize table t1 on all three servers.
The table is similar to:
CREATE TABLE `t1` (
`a` bigint(20) NOT NULL DEFAULT '0',
`b` varchar(512) DEFAULT NULL,
`c` int(10) DEFAULT NULL,
`d` tinyint(1) DEFAULT NULL,
`e` int(10) DEFAULT NULL,
`f` tinyint(1) DEFAULT NULL,
`g` bigint(20) DEFAULT NULL,
PRIMARY KEY (`a`),
UNIQUE KEY (`a`,`d`),
UNIQUE KEY (`a`,`d`,`e`),
KEY `h` (`c`),
KEY `i` (`e`,`b`),
KEY `j` (`b`,`d`,`e`),
KEY `k` (`b`),
KEY `l` (`g`),
KEY `m` (`e`,`d`,`c`,`b`) USING BTREE,
KEY `n` (`e`,`d`,`g`,`c`) USING BTREE,
KEY `o` (`f`,`d`,`e`,`b`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
The query is similar to:
select a from t1 where a in ( select a from t1 where e = 5 and f = 0 ) and e = 5 and f = 0 and ( g is null or g in (1) ) order by b limit 10;
There are 35m records in the table. There should be 150,000 records returned by this query. There are usually 0-4 of these records where g = 1 when I see the problem manifest. The rest are null.
The problem appears about 80% of the time, and when it does it only returns the rows where g = 1, it does not honor the g is null portion.
The problem disappears if I write
( g is null or g in (1, null))
I am fairly certain order by related. The problem has never occurred with no order by.
I am confident limit is not related. Monty asked me to test this. If I take limit off, it will only return those rows where g = 1.
I am also confident this is not related to optimizer_switch. I shut all of them off (and alternated with in_to_exists and materialization, where appropriate) and was able to reliably reproduce regardless of any optimizer_switch setting.
I am also confident query_cache_type is not related. The problem manifests regardless of this setting.
I have worked around it by using:
( g is null or g in (1, null))
for now. But I did want to open a bug for future reference. I am willing to assist in many ways, but I cannot share my schema or data.