Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Duplicate
-
10.11, 11.4, 11.8, 12.3, 11.8.6
-
MariaDB Version: 11.8.6-MariaDB
Client Version: 15.2 for linux-systemd (x86_64)
OS: Ubuntu 20.04.4 LTS
CPU: x86_64 (Intel(R) Xeon(R) Platinum 8358P)
-
Not for Release Notes
Description
Problem Description
A "Wrong Result" bug exists in MariaDB for a join between:
- `t_main.d` of type `DATE`
- `t_ref.mt` of type `MEDIUMTEXT`
In the baseline state S1, the query returns 8 rows and 8 distinct right-side
tags.
Then S2 is obtained from S1 by append-only inserts into the left table,
followed by `ANALYZE TABLE`. In S2, the same query returns 500 rows, but all
rows come from only one right-side tag:
- `:5-bru_e`
The other 7 tags that matched in S1 disappear entirely, even though S2 is
derived from S1 only by appending left-side rows. That is a monotonicity
violation and therefore a wrong-result bug.
Two diagnostics further narrow the issue:
- rewriting the join as `m.d = CAST(r.mt AS DATE)` does *not* restore the
missing tags - avoiding the left-side indexes with `IGNORE INDEX` *does* restore all 8
tags
This shows the bug is plan-dependent and is broader than a simple "implicit
cast omitted" issue. The indexed execution path remains wrong even when the
text-to-date conversion is made explicit.
Step-by-step Reproduction
SET SESSION sql_mode = ''; |
SET SESSION max_recursive_iterations = 10000; |
|
|
DROP DATABASE IF EXISTS repro_mdb_date_mediumtext_join; |
CREATE DATABASE repro_mdb_date_mediumtext_join; |
USE repro_mdb_date_mediumtext_join; |
|
|
DROP TABLE IF EXISTS t_main; |
DROP TABLE IF EXISTS t_ref; |
|
|
CREATE TABLE t_main ( |
id INT NOT NULL PRIMARY KEY, |
c2 VARCHAR(255) NOT NULL, |
c3 VARCHAR(255) NULL, |
k INT NULL, |
d DATE NOT NULL, |
c6 VARCHAR(10) NOT NULL, |
KEY idx_k (k), |
KEY idx_c2 (c2(64)), |
KEY idx_c3 (c3(64)), |
KEY idx_k_d (k, d) |
) ENGINE=InnoDB;
|
|
|
CREATE TABLE t_ref ( |
id INT NOT NULL PRIMARY KEY, |
tag VARCHAR(64) NOT NULL, |
mt MEDIUMTEXT NULL |
) ENGINE=InnoDB;
|
|
|
-- [Phase S1: baseline state]
|
INSERT INTO t_main (id, c2, c3, k, d, c6) VALUES |
(1, 'not-a-date', ' 1', 6, '0000-00-00', 'u59-myefohvf1'); |
|
|
INSERT INTO t_ref (id, tag, mt) VALUES |
(9000000, 'bi4n5oob77', '2:kz-f:8p_ek7du0yf6-24:9h/g70jbm2al7'), |
(9000001, ':5-bru_e', '0000-00-00'), |
(9000002, 'rr2bu/drxg01vjr', '-1'), |
(9000003, '2023-01-01', 'aowrto-k5n: ut1wlwcwa'), |
(9000004, 'xjigzyoch7b _', ':zil dlq5g9'), |
(9000005, 'qcjy__95mec/xbflit', '0'), |
(9000006, ' 1', '_-y91hqjnrylcdavvm1og:cwdw0v0w-yt7kfb5aw'), |
(9000007, 'not-a-date', '1'); |
|
|
ANALYZE TABLE t_main, t_ref; |
|
|
SELECT 'S1_count' AS stage, COUNT(*) AS cnt |
FROM t_main m |
JOIN t_ref r |
ON m.d = r.mt |
WHERE m.k <= r.id; |
|
|
SELECT 'S1_tags' AS stage, |
GROUP_CONCAT(DISTINCT r.tag ORDER BY r.tag SEPARATOR ' | ') AS tags |
FROM t_main m |
JOIN t_ref r |
ON m.d = r.mt |
WHERE m.k <= r.id; |
|
|
EXPLAIN
|
SELECT COUNT(*) |
FROM t_main m |
JOIN t_ref r |
ON m.d = r.mt |
WHERE m.k <= r.id; |
|
|
-- [Phase S2: append-only expansion]
|
INSERT INTO t_main (id, c2, c3, k, d, c6) |
WITH RECURSIVE seq(n) AS ( |
SELECT 2 |
UNION ALL |
SELECT n + 1 FROM seq WHERE n < 500 |
)
|
SELECT
|
n,
|
'not-a-date', |
' 1', |
CASE MOD(n, 4) |
WHEN 0 THEN 27 |
WHEN 1 THEN 7 |
WHEN 2 THEN 6 |
ELSE 3 |
END, |
'0000-00-00', |
'u59-myefohvf1' |
FROM seq; |
|
|
ANALYZE TABLE t_main; |
|
|
SELECT 'S2_count' AS stage, COUNT(*) AS cnt |
FROM t_main m |
JOIN t_ref r |
ON m.d = r.mt |
WHERE m.k <= r.id; |
|
|
SELECT 'S2_tags' AS stage, |
GROUP_CONCAT(DISTINCT r.tag ORDER BY r.tag SEPARATOR ' | ') AS tags |
FROM t_main m |
JOIN t_ref r |
ON m.d = r.mt |
WHERE m.k <= r.id; |
|
|
EXPLAIN
|
SELECT COUNT(*) |
FROM t_main m |
JOIN t_ref r |
ON m.d = r.mt |
WHERE m.k <= r.id; |
|
|
-- Diagnostic A: explicit cast on the MEDIUMTEXT side
|
SELECT 'S2_cast_count' AS stage, COUNT(*) AS cnt |
FROM t_main m |
JOIN t_ref r |
ON m.d = CAST(r.mt AS DATE) |
WHERE m.k <= r.id; |
|
|
SELECT 'S2_cast_tags' AS stage, |
GROUP_CONCAT(DISTINCT r.tag ORDER BY r.tag SEPARATOR ' | ') AS tags |
FROM t_main m |
JOIN t_ref r |
ON m.d = CAST(r.mt AS DATE) |
WHERE m.k <= r.id; |
|
|
-- Diagnostic B: avoid left-side indexes
|
SELECT 'S2_no_index_count' AS stage, COUNT(*) AS cnt |
FROM t_main m IGNORE INDEX (idx_k, idx_c2, idx_c3, idx_k_d) |
JOIN t_ref r |
ON m.d = r.mt |
WHERE m.k <= r.id; |
|
|
SELECT 'S2_no_index_tags' AS stage, |
GROUP_CONCAT(DISTINCT r.tag ORDER BY r.tag SEPARATOR ' | ') AS tags |
FROM t_main m IGNORE INDEX (idx_k, idx_c2, idx_c3, idx_k_d) |
JOIN t_ref r |
ON m.d = r.mt |
WHERE m.k <= r.id; |
Observed Results
The reproducer produced the following results:
```text
S1_count = 8
S1_tags = 1 | :5-bru_e | 2023-01-01 | bi4n5oob77 |
not-a-date | qcjy__95mec/xbflit | rr2bu/drxg01vjr |
xjigzyoch7b _
S2_count = 500
S2_tags = :5-bru_e
S2_cast_count = 500
S2_cast_tags = :5-bru_e
S2_no_index_count = 4000
S2_no_index_tags = 1 | :5-bru_e | 2023-01-01 | bi4n5oob77 |
not-a-date | qcjy__95mec/xbflit | rr2bu/drxg01vjr |
xjigzyoch7b _
```
The corresponding `EXPLAIN` output shows a plan change:
S1:
```text
----------------------------------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
----------------------------------------------------------------------------------------------------------+
| 1 | SIMPLE | m | index | idx_k,idx_k_d | idx_k_d | 8 | NULL | 1 | Using index |
| 1 | SIMPLE | r | ALL | PRIMARY | NULL | NULL | NULL | 8 | Range checked for each record (index map: 0x1) |
----------------------------------------------------------------------------------------------------------+
```
S2:
```text
-------------------------------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
-------------------------------------------------------------------------------------------------------+
| 1 | SIMPLE | r | ALL | PRIMARY | NULL | NULL | NULL | 8 | |
| 1 | SIMPLE | m | ALL | idx_k,idx_k_d | NULL | NULL | NULL | 500 | Range checked for each record (index map: 0x12) |
-------------------------------------------------------------------------------------------------------+
```
Diagnostic Analysis
This is a wrong-result bug because S2 is obtained from S1 only by appending
rows to `t_main`. Therefore every right-side tag that matched in S1 must still
match in S2.
Instead:
- S1 matches 8 distinct tags
- S2 keeps only `:5-bru_e`
So 7 previously matched right-side tags disappear after `ANALYZE TABLE`.
The bug is not fixed by making the join explicit:
```sql
ON m.d = CAST(r.mt AS DATE)
```
That still returns only `:5-bru_e`.
However, avoiding the left-side indexes with:
```sql
IGNORE INDEX (idx_k, idx_c2, idx_c3, idx_k_d)
```
restores all 8 tags and increases the row count from 500 to 4000.
This shows the defect is plan-dependent and tied to the indexed execution path
chosen after statistics change, not merely to omission of an implicit
conversion.
Impact
MariaDB can silently drop valid join matches for `DATE = MEDIUMTEXT`
comparisons after `ANALYZE TABLE` and append-only growth, while still
returning a plausible-looking non-empty result set.
Because even `CAST(MEDIUMTEXT AS DATE)` remains wrong under the natural S2
plan, applications cannot reliably avoid the issue just by rewriting the
predicate with an explicit cast.
Attachments
Issue Links
- duplicates
-
MDEV-39553 `DATE = VARCHAR` filter loses existing matches after `ANALYZE TABLE`, keeping only the literal `'0000-00-00'` row
-
- Confirmed
-