Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Cannot Reproduce
-
11.4
-
None
-
Not for Release Notes
Description
CREATE TABLE TED (
|
ID VARCHAR(20) NOT NULL, |
DATE VARCHAR(8) NOT NULL, |
COMP VARCHAR(10) NOT NULL DEFAULT 'S000', |
PRIMARY KEY (ID, DATE, COMP)
|
) ENGINE=InnoDB;
|
|
|
CREATE TABLE TEM (
|
ID VARCHAR(20) NOT NULL, |
COMP VARCHAR(10) NOT NULL DEFAULT 'S000', |
STRT_DATE VARCHAR(8) NOT NULL, |
END_DATE VARCHAR(8) NOT NULL, |
PRIMARY KEY (ID, EFFCT_STRT_DATE)
|
) ENGINE=InnoDB;
|
|
|
EXPLAIN
|
SELECT *
|
FROM TED
|
JOIN TEM
|
ON TEM.ID = TED.ID
|
AND TEM.COMP = TED.COMP
|
AND TED.DATE BETWEEN TEM.STRT_DATE AND TEM.END_DATE
|
WHERE TED.ID = '10004093' |
AND TED.DATE BETWEEN '20251221' AND '20261220'; |
In 10.6, the query correctly uses PRIMARY KEY range scan. The optimizer trace would show
- attaching_conditions_to_tables: Constant condition exists on TED.ID → PRIMARY composite range (10004093,20251221) <= (ID,DATE) <= (10004093,20261220) successfully constructed → range scan |
After upgrading to 11.4.9, the same query falls back to a full table scan. The optimizer trace would show
- attaching_conditions_to_tables: No constant condition on TED.ID → PRIMARY composite range cannot be constructed → Falls back to Full Table Scan
|
The new cost model introduced in MariaDB 11.0+ (and inherited by 11.4) changed how the optimizer evaluates costs.
Moving from rule-based to cost-based decisions for things like range vs table scan, index selection for ORDER BY, etc. (https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizer/the-optimizer-cost-model-from-mariadb-11-0).
However, the new cost model is about how costs are calculated, not about how constants are propagated through equality chains. The problem here is in substitute_best_equal, which decides which table gets the constant - that's a logical optimization step, not a cost decision.
1. rows_estimation - initial range analysis per table
2. considered_execution_plans - join order
3. substitute_best_equal - picks which side of multiple equal(10004093, TED.ID, TEM.ID) gets the constant
4. attaching_conditions_to_tables - re-analyzes ranges with the substituted conditions
In 10.6, step 3 propagated the constant 10004093 to both TED and TEM. In step 4, TED had ID = 10004093 AND DATE BETWEEN ..., giving the composite PRIMARY range (10004093, 20251221) <= (ID, DATE) <= (10004093, 20261220) -> range scan
In 11.4, step 3 assigns the constant only to TEM and leaves TED with just the column reference TED.ID = TEM.ID. In step 4, TED only has DATE BETWEEN ... - without the leading ID constant, PRIMARY can't be used. The only date-leading index (N02) estimates a high number of rows and gets rejected on cost. Result: full table scan.
The new cost model documentation explicitly states it changed things like:
1. Cost calculations for disk reads, index lookups, row fetches
2. Rule-based to cost-based decisions for ORDER BY/GROUP BY index usage
3. Removal of the old 10% key lookup cap
4. Removal of InnoDB's 50% range cap
None of these changes should affect which table receives a constant during equality propagation. The constant should be propagated to both sides.