Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-39552

Join returns 0 rows after ANALYZE due to inconsistent implicit casting in VARCHAR = BIGINT comparison

    XMLWordPrintable

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Duplicate
    • 10.11, 11.4, 11.8, 12.3, 11.8.6
    • N/A
    • Data types, Optimizer
    • MariaDB Version: 11.8.6-MariaDB
      Client Version: 15.2 for linux-systemd (x86_64)
      OS: Ubuntu 20.04.4 LTS
      CPU: x86_64 (Intel(R) Xeon(R) Platinum 8358P)
    • Not for Release Notes

    Description

      Prblem Description
      A "Wrong Result" bug exists in MariaDB when joining a `VARCHAR` column to a `BIGINT` column using implicit conversion.

      In the baseline state S1, the query returns the expected matches. After append-only inserts into the left table and `ANALYZE TABLE`, the same query returns 0 rows in S2, even though S2 should be a superset of S1.

      The result can be restored by either:

      • making the comparison explicit with `CAST(m.v AS SIGNED) = r.n`
      • avoiding the left-side indexes with `IGNORE INDEX`

      This shows that the wrong result is plan-dependent and tied to the implicit `VARCHAR = BIGINT` comparison path chosen after statistics change.

      Step-by-step Reproduction

      SET SESSION sql_mode = '';
      SET SESSION max_recursive_iterations = 10000;
       
      DROP DATABASE IF EXISTS repro_mdb_varchar_bigint_join2;
      CREATE DATABASE repro_mdb_varchar_bigint_join2;
      USE repro_mdb_varchar_bigint_join2;
       
      DROP TABLE IF EXISTS t_main;
      DROP TABLE IF EXISTS t_ref;
       
      CREATE TABLE t_main (
          id INT NOT NULL PRIMARY KEY,
          v  VARCHAR(64) NOT NULL,
          d  DATE NOT NULL,
          KEY idx_v (v),
          KEY idx_v_d (v, d)
      ) ENGINE=InnoDB;
       
      CREATE TABLE t_ref (
          id  INT NOT NULL PRIMARY KEY,
          n   BIGINT NULL,
          txt LONGTEXT NULL
      ) ENGINE=InnoDB;
       
      -- Right side: all rows should match v='hv_6576' via VARCHAR->BIGINT coercion,
      -- and should match d='0000-00-00' via text->DATE coercion.
      INSERT INTO t_ref (id, n, txt) VALUES
      (1, 0, 'not-a-date'),
      (2, 0, '-1'),
      (3, 0, 'sample_x'),
      (4, 0, 'abc'),
      (5, 0, ':bad:text');
       
      -- [Phase S1: Baseline state]
      INSERT INTO t_main (id, v, d) VALUES
      (1, 'hv_6576', '0000-00-00');
       
      ANALYZE TABLE t_main, t_ref;
       
      -- Query S1: returns 5 rows
      SELECT 'S1_natural' AS stage, COUNT(*) AS cnt
      FROM t_main m
      JOIN t_ref  r
        ON m.v = r.n
      WHERE r.txt = m.d;
       
      EXPLAIN
      SELECT COUNT(*)
      FROM t_main m
      JOIN t_ref  r
        ON m.v = r.n
      WHERE r.txt = m.d;
       
      -- [Phase S2: append-only expansion]
      INSERT INTO t_main (id, v, d)
      WITH RECURSIVE seq(n) AS (
          SELECT 2
          UNION ALL
          SELECT n + 1 FROM seq WHERE n < 500
      )
      SELECT
          n,
          'hv_6576',
          '0000-00-00'
      FROM seq;
       
      ANALYZE TABLE t_main;
       
      -- Query S2: returns 0 rows (wrong)
      SELECT 'S2_natural' AS stage, COUNT(*) AS cnt
      FROM t_main m
      JOIN t_ref  r
        ON m.v = r.n
      WHERE r.txt = m.d;
       
      EXPLAIN
      SELECT COUNT(*)
      FROM t_main m
      JOIN t_ref  r
        ON m.v = r.n
      WHERE r.txt = m.d;
       
      -- Diagnostic 1: explicit coercion restores correctness
      SELECT 'S2_cast_num' AS stage, COUNT(*) AS cnt
      FROM t_main m
      JOIN t_ref  r
        ON CAST(m.v AS SIGNED) = r.n
      WHERE r.txt = m.d;
       
      -- Diagnostic 2: avoiding left-side indexes also restores correctness
      SELECT 'S2_no_index' AS stage, COUNT(*) AS cnt
      FROM t_main m IGNORE INDEX (idx_v, idx_v_d)
      JOIN t_ref  r
        ON m.v = r.n
      WHERE r.txt = m.d;
      

      Observed Results
      The replay produced the following results:

      ```text
      S1_natural = 5
      S2_natural = 0
      S2_cast_num = 2500
      S2_no_index = 2500
      ```

      The corresponding `EXPLAIN` output shows a plan change:

      S1:
      ```text
      -----------------------------------------------------------------------------------------------------------+

      id select_type table type possible_keys key key_len ref rows Extra

      -----------------------------------------------------------------------------------------------------------+

      1 SIMPLE m index idx_v,idx_v_d idx_v_d 261 NULL 1 Using index
      1 SIMPLE r ALL NULL NULL NULL NULL 5 Using where; Using join buffer (flat, BNL join)

      -----------------------------------------------------------------------------------------------------------+
      ```

      S2:
      ```text
      ------------------------------------------------------------------------------------------------------+

      id select_type table type possible_keys key key_len ref rows Extra

      ------------------------------------------------------------------------------------------------------+

      1 SIMPLE r ALL NULL NULL NULL NULL 5  
      1 SIMPLE m ALL idx_v,idx_v_d NULL NULL NULL 500 Range checked for each record (index map: 0x6)

      ------------------------------------------------------------------------------------------------------+
      ```

      Diagnostic Analysis
      This is a wrong-result bug because S2 is derived from S1 by append-only inserts into `t_main`, so every row matched in S1 must still be present in S2.

      Instead:

      • S1 returns 5 rows
      • S2 returns 0 rows

      That violates the monotonicity invariant `COUNT(S2) >= COUNT(S1)`.

      The two controls show that the data is still logically matchable:

      • `CAST(m.v AS SIGNED) = r.n` restores the expected 2500 rows
      • `IGNORE INDEX (idx_v, idx_v_d)` also restores the expected 2500 rows

      This strongly suggests that MariaDB mishandles the implicit `VARCHAR = BIGINT` comparison in the S2 access path chosen after `ANALYZE TABLE`, specifically the `Range checked for each record` plan.

      Expected Behavior
      The query should not lose rows after append-only inserts and `ANALYZE TABLE`.

      At minimum, the original S1 match set must still be present in S2. In this reproducer, the query should return 2500 rows in S2, regardless of whether the optimizer uses indexed access, table scan, or any other join strategy.

      Attachments

        Issue Links

          Activity

            People

              shipjain Shipra Jain
              Dreambreaker yaoruifei
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.