Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-38819

Logic Inconsistency between Direct HAVING Query and Derived Table Relocation

    XMLWordPrintable

Details

    • Bug
    • Status: In Review (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.11, 11.8.5
    • 10.11, 11.4, 11.8, 12.3
    • Optimizer
    • OS: Ubuntu 24.04
      DBMS: MariaDB
      Docker image: mariadb:11.8.5
      Image ID: bfe9184ea9e5
    • Q2/2026 Server Development

    Description

      Summary

      There is a significant logic inconsistency in MariaDB when evaluating complex predicates in a HAVING clause versus a WHERE clause in a derived table. A query using HAVING returns 0 rows, while the logically equivalent query using a derived table returns 2 rows. In contrast, MySQL correctly returns 2 rows for both queries.

      Reproduction Steps

      CREATE TABLE t0(c0 VARCHAR(100), c1 INT UNIQUE);
      CREATE TABLE t1 LIKE t0;
      INSERT INTO t1 VALUES ('C', -1833670268);
      INSERT INTO t1 VALUES ('\\', 1046230419);
      INSERT INTO t0 VALUES ('', -1016012686);
       
      SELECT t1.c1 AS g0, t0.c0 AS g1, t0.c1 AS g2 FROM t1 STRAIGHT_JOIN t0 ON ('pt tw' NOT IN ((- 1873356510), true)) GROUP BY t1.c1, t0.c0, t0.c1 HAVING ((COUNT(g0) < 464443616) AND (g2 NOT IN (g0, (g0 NOT IN (g1, true, g2)))));-- cardinality: 0
      SELECT g0, g1, g2 FROM (SELECT t1.c1 AS g0, t0.c0 AS g1, t0.c1 AS g2, (((COUNT(t1.c1) < 464443616) AND (t0.c1 NOT IN (t1.c1, (t1.c1 NOT IN (t0.c0, true, t0.c1)))))) AS ref1 FROM t1 STRAIGHT_JOIN t0 ON ('pt tw' NOT IN ((- 1873356510), true)) GROUP BY t1.c1, t0.c0, t0.c1) AS s WHERE ref1;-- cardinality: 2;
      

      Observed vs. Expected Behavior

      Observed Behavior (MariaDB):

      Query A returns an empty set (0 rows).
      Query B returns 2 rows (the expected data).

      mysql> SELECT t1.c1 AS g0, t0.c0 AS g1, t0.c1 AS g2 FROM t1 STRAIGHT_JOIN t0 ON ('pt tw' NOT IN ((- 1873356510), true)) GROUP BY t1.c1, t0.c0, t0.c1 HAVING ((COUNT(g0) < 464443616) AND (g2 NOT IN (g0, (g0 NOT IN (g1, true, g2)))));-- cardinality: 0
      Empty set, 1 warning (0.00 sec)
       
      mysql> SELECT g0, g1, g2 FROM (SELECT t1.c1 AS g0, t0.c0 AS g1, t0.c1 AS g2, (((COUNT(t1.c1) < 464443616) AND (t0.c1 NOT IN (t1.c1, (t1.c1 NOT IN (t0.c0, true, t0.c1)))))) AS ref1 FROM t1 STRAIGHT_JOIN t0 ON ('pt tw' NOT IN ((- 1873356510), true)) GROUP BY t1.c1, t0.c0, t0.c1) AS s WHERE ref1;-- cardinality: 2;
      +-------------+------+-------------+
      | g0          | g1   | g2          |
      +-------------+------+-------------+
      | -1833670268 |      | -1016012686 |
      |  1046230419 |      | -1016012686 |
      +-------------+------+-------------+
      2 rows in set, 3 warnings (0.00 sec)
      

      Expected Behavior (Consistent with MySQL v9.6.0):

      Both Query A and Query B should return 2 rows.

      mysql> SELECT t1.c1 AS g0, t0.c0 AS g1, t0.c1 AS g2 FROM t1 STRAIGHT_JOIN t0 ON ('pt tw' NOT IN ((- 1873356510), true)) GROUP BY t1.c1, t0.c0, t0.c1 HAVING ((COUNT(g0) < 464443616) AND (g2 NOT IN (g0, (g0 NOT IN (g1, true, g2)))));-- cardinality: 0
      +-------------+------+-------------+
      | g0          | g1   | g2          |
      +-------------+------+-------------+
      |  1046230419 |      | -1016012686 |
      | -1833670268 |      | -1016012686 |
      +-------------+------+-------------+
      2 rows in set, 1 warning (0.01 sec)
       
      mysql> SELECT g0, g1, g2 FROM (SELECT t1.c1 AS g0, t0.c0 AS g1, t0.c1 AS g2, (((COUNT(t1.c1) < 464443616) AND (t0.c1 NOT IN (t1.c1, (t1.c1 NOT IN (t0.c0, true, t0.c1)))))) AS ref1 FROM t1 STRAIGHT_JOIN t0 ON ('pt tw' NOT IN ((- 1873356510), true)) GROUP BY t1.c1, t0.c0, t0.c1) AS s WHERE ref1;-- cardinality: 2;
      +-------------+------+-------------+
      | g0          | g1   | g2          |
      +-------------+------+-------------+
      |  1046230419 |      | -1016012686 |
      | -1833670268 |      | -1016012686 |
      +-------------+------+-------------+
      2 rows in set, 1 warning (0.00 sec)
      

      Execution Plan Analysis (EXPLAIN)

      The discrepancy seems linked to the optimizer's choice of join/access method:
      Query A (Direct): Uses Range checked for each record (index map: 0x1) for table t0. It appears the complex HAVING predicate involving NOT IN and unique index columns is not correctly evaluated during this optimization phase.
      Query B (Derived): Uses Using join buffer (flat, BNL join). By materializing the subquery, the logic is evaluated row-by-row in the temporary table, bypassing the faulty optimization and producing the correct result.

      Attachments

        Activity

          People

            Johnston Rex Johnston
            Ce Lyu Ce Lyu
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.