Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-38129

MariaDB Bug Report: Severe Performance Regression with cardinality<7 Indexes in LEFT JOIN

    XMLWordPrintable

Details

    Description

      Summary

      MariaDB 11.0+ and 12.0+ refuse to use indexes with cardinality = 0 (100% NULL values for example) in LEFT JOIN queries, causing catastrophic performance regressions (up to 564Ă— slower) compared to MariaDB 10.11.15.

      Environment

      • Affected Versions: MariaDB 11.4.5, 12.0.2 (tested)
      • Working Version: MariaDB 10.11.15
      • Operating System: Linux 6.11.11-linuxkit (Docker)
      • Storage Engine: InnoDB

      Reproducible Test Case

      Database Setup

      -- Create test database
      DROP DATABASE IF EXISTS mariadb_bug_test;
      CREATE DATABASE mariadb_bug_test;
      USE mariadb_bug_test;
       
      -- Table A: Main table with ~25,000 rows
      CREATE TABLE table_a (
          id INT PRIMARY KEY AUTO_INCREMENT,
          reference_number VARCHAR(50)
      ) ENGINE=InnoDB;
       
      -- Table B: Related table with ~100,000 rows where fk_id is 100% NULL
      CREATE TABLE table_b (
          id_validation INT PRIMARY KEY AUTO_INCREMENT,
          fk_id INT DEFAULT NULL,
          validation_date DATETIME DEFAULT NULL,
          KEY idx_fk (fk_id)
      ) ENGINE=InnoDB;
       
      -- Increase recursion limit for CTE
      SET SESSION max_recursive_iterations = 100000;
       
      -- Populate table_a with 25,000 rows
      INSERT INTO table_a (reference_number)
      WITH RECURSIVE numbers AS (
          SELECT 1 AS n
          UNION ALL
          SELECT n + 1 FROM numbers WHERE n < 25000
      )
      SELECT CONCAT('REF-', LPAD(n, 6, '0')) FROM numbers;
       
      -- Populate table_b with 100,000 rows (fk_id remains NULL)
      INSERT INTO table_b (validation_date)
      WITH RECURSIVE numbers AS (
          SELECT 1 AS n
          UNION ALL
          SELECT n + 1 FROM numbers WHERE n < 100000
      )
      SELECT DATE_ADD('2023-01-01', INTERVAL n DAY) FROM numbers;
       
      -- Reset recursion limit
      SET SESSION max_recursive_iterations = DEFAULT;
       
      -- Update statistics
      ANALYZE TABLE table_a;
      ANALYZE TABLE table_b;
      

      Test Query

      SELECT a.reference_number, b.validation_date
      FROM table_a a
      LEFT OUTER JOIN table_b b ON (a.id = b.fk_id)
      LIMIT 1000;
      

      Performance Results

      Version Execution Time Multiplicateur
      MariaDB 10.11.15 0.075 sec 1Ă— (baseline)
      MariaDB 12.0.2 42.324 sec 564Ă— slower

      Execution Plan Comparison

      MariaDB 10.11.15 (EXPLAIN)

      +-------+------+---------------+------+------+--------+
      | table | type | key           | rows | cost | Extra  |
      +-------+------+---------------+------+------+--------+
      | a     | ALL  | NULL          | 25762| 481  |        |
      | b     | ref  | idx_fk        | 54232| 27M  |        |
      +-------+------+---------------+------+------+--------+
      

      Result: Uses index idx_fk despite high cost estimate.

      MariaDB 12.0.2 (EXPLAIN)

      +-------+------+---------------+--------+---------------+--------------------------------------------------+
      | table | type | key           | rows   | cost          | Extra                                            |
      +-------+------+---------------+--------+---------------+--------------------------------------------------+
      | a     | ALL  | NULL          | 25717  | 4.6130298     |                                                  |
      | b     | ALL  | NULL          | 108984 | 457427.5086   | Using where; Using join buffer (flat, BNL join)  |
      +-------+------+---------------+--------+---------------+--------------------------------------------------+
      

      Result: Full table scan on b with Block Nested Loop (BNL) join buffering and 25,717 loops = ~2.8 billion row reads.

      Optimizer Trace Analysis

      The full optimizer traces are available as attachments:

      • optimizer_trace_10.11.15_anonymized.json - Complete trace from MariaDB 10.11.15
      • optimizer_trace_12.0.2_anonymized.json - Complete trace from MariaDB 12.0.2

      MariaDB 10.11.15 - Critical Decision Point

      {
        "table": "b",
        "considered_access_paths": [
          {
            "access_type": "ref",
            "index": "idx_fk",
            "rows": 54232,
            "cost": 27461203.35,
            "chosen": true
          },
          {
            "access_type": "scan",
            "resulting_rows": 108464,
            "cost": 9093986,
            "chosen": false
          }
        ],
        "chosen_access_method": {
          "type": "ref",
          "records": 54232,
          "cost": 27461203.35,
          "uses_join_buffering": false
        }
      }
      

      Key Observation: Optimizer calculates index as MORE expensive (27.4M vs 9.0M) but still chooses index (chosen: true for ref, chosen: false for scan). This demonstrates the presence of heuristic safeguards that override cost calculations for LEFT JOIN scenarios.

      MariaDB 12.0.2 - Critical Decision Point

      {
        "table": "b",
        "considered_access_paths": [
          {
            "access_type": "ref",
            "index": "idx_fk",
            "rows": 108984,
            "cost": 2784449.486,
            "chosen": true
          },
          {
            "access_type": "scan",
            "rows": 108984,
            "rows_after_filter": 108984,
            "rows_out": 81738,
            "cost": 457427.5086,
            "index_only": false,
            "chosen": true
          }
        ],
        "chosen_access_method": {
          "type": "scan",
          "rows_read": 108984,
          "rows_out": 81738,
          "cost": 457427.5086,
          "uses_join_buffering": true
        }
      }
      

      Key Observation: Optimizer calculates scan as cheaper (457K vs 2.7M) and strictly follows cost calculation (chosen: true for BOTH, but ultimately selects scan). The removal of heuristic safeguards means the optimizer blindly trusts the cost model, which severely underestimates the actual cost of scanning 108,984 rows Ă— 25,717 times = ~2.8 billion row comparisons.

      Root Cause

      According to MariaDB documentation:

      MariaDB 11.0 uses granular microsecond-based cost calculations instead of heuristic-driven planning

      The new cost model removed heuristic safeguards that previously forced index usage for LEFT JOINs even when cost estimates were uncertain. With indexes having 100% NULL values (cardinality = 0), the cost model incorrectly estimates table scans as cheaper, even though index lookups on NULL are actually very fast.

      Configuration Parameters Tested (All Failed)

      Optimizer Switches

      -- Tested individually and in combinations - no improvement:
      SET GLOBAL optimizer_switch='duplicateweedout=off';
      SET GLOBAL optimizer_switch='hash_join_cardinality=off';
      SET GLOBAL optimizer_switch='cset_narrowing=off';
      SET GLOBAL optimizer_switch='sargable_casefold=off';
      SET GLOBAL optimizer_switch='outer_join_with_cache=off';
      

      Cost Model Parameters

      ALL cost model parameters were tested exhaustively with extreme values - no improvement achieved:

      -- Engine-specific costs (tested with values ranging from Ă·113 to Ă—21000):
      SET GLOBAL optimizer_disk_read_cost = 50.0;           -- default: 10.24, tested up to 50.0
      SET GLOBAL optimizer_key_lookup_cost = 1000.0;        -- default: 0.27, tested up to 1000.0
      SET GLOBAL optimizer_row_lookup_cost = 10.0;          -- default: 0.13, tested up to 10.0
      SET GLOBAL optimizer_scan_setup_cost = 50.0;          -- default: 10.0, tested up to 50.0
      SET GLOBAL optimizer_row_copy_cost = 50.0;            -- default: 0.03, tested up to 50.0
       
      -- Sequential access costs (critical for scan vs index decision):
      SET GLOBAL optimizer_row_next_find_cost = 1000.0;     -- default: 0.045916, tested up to 1000.0 (Ă—21000)
      SET GLOBAL optimizer_key_next_find_cost = 0.001;      -- default: 0.082347, tested down to 0.001 (Ă·82)
      SET GLOBAL optimizer_index_block_copy_cost = 0.001;   -- default: 0.035600, tested down to 0.001 (Ă·35)
      SET GLOBAL optimizer_key_compare_cost = 0.0001;       -- default: 0.011361, tested down to 0.0001 (Ă·113)
       
      -- Session-level costs:
      SET GLOBAL optimizer_where_cost = 100.0;              -- default: 0.032, tested from 0.001 to 100.0
      

      Problem: Both scan and index costs scale proportionally, maintaining the same ratio. Even increasing optimizer_row_next_find_cost by 21,000Ă— (to make sequential scans prohibitively expensive) had zero effect on the optimizer's decision.

      Other Parameters

      SET GLOBAL eq_range_index_dive_limit = 0; -- Force statistics dive
      SET GLOBAL eq_range_index_dive_limit = 10000; -- Disable statistics dive
      SET GLOBAL optimizer_adjust_secondary_key_costs = 0; -- Disable adjustment
      SET GLOBAL optimizer_use_condition_selectivity = 1; -- Use less selectivity
      SET GLOBAL join_cache_level = 0; -- Disable join cache
      SET GLOBAL optimizer_join_limit_pref_ratio = 10000; -- Extreme preference for ref
      SET GLOBAL innodb_stats_method = 'nulls_ignored'; -- Ignore NULLs in statistics
       
      ANALYZE TABLE table_b; -- Recalculate statistics
      

      Result: No improvement with any combination of parameters.

      Expected Behavior

      MariaDB should use the index for LEFT JOIN queries when it exists, especially when:

      1. The join is on a foreign key relationship
      2. Index lookups on NULL values are fast (constant time)
      3. Alternative table scan requires millions/billions of row reads

      The optimizer should either:

      1. Maintain heuristic safeguards for LEFT JOIN scenarios
      2. Improve cost estimation for NULL-only indexes
      3. Recognize that index lookups on NULL are efficient

      Actual Behavior

      MariaDB 11.0+ strictly follows cost calculations that incorrectly estimate table scans as cheaper than index lookups for NULL-only indexes, causing 400-600Ă— performance regressions in production systems.

      Workarounds

      The only viable workarounds require application code changes:

      -- Force index usage explicitly:
      SELECT a.reference_number, b.validation_date
      FROM table_a a
      LEFT OUTER JOIN table_b b FORCE INDEX (idx_fk)
      ON (a.id = b.fk_id)
      LIMIT 1000;
      

      No configuration-level workaround exists.

      Impact

      This regression affects any application with:

      • LEFT JOIN queries on tables with partially populated foreign keys
      • Indexes where a significant portion of values are NULL
      • Upgraded from MariaDB 10.x to 11.x/12.x

      The performance impact is severe and cannot be mitigated without code changes or downgrading to MariaDB 10.11 LTS.

      Request

      Please investigate and restore either:

      1. Heuristic safeguards for LEFT JOIN index selection
      2. Improved cost estimation for NULL-value index lookups
      3. Configuration parameter to control this behavior

      This is a critical regression affecting production systems with no viable configuration workaround.

      Attachments

        Activity

          People

            psergei Sergei Petrunia
            BenSia Ben W
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.