Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-39238

NoREC logical bug: unoptimized evaluation of REGEXP_SUBSTR in the WHERE clause produces inconsistent counts when compared with transformed boolean aggregation.

    XMLWordPrintable

Details

    • Bug
    • Status: Confirmed (View Workflow)
    • Major
    • Resolution: Unresolved
    • 11.8, 12.2, 12.3, 12.2.2
    • 11.8, 12.3
    • None
    • OS: Any (Docker container)
      CPU Architecture: Any (x86_64)
      MariaDB version: 12.2.2-MariaDB-ubu2404

    Description

      The original query returns COUNT( * ) = 4, while the NoREC-transformed query using SUM(CASE WHEN ... IS TRUE THEN 1 ELSE 0 END) returns 0.

      The predicate involves REGEXP_SUBSTR and nested CASE expressions that always evaluate to NULL, making the comparison result UNKNOWN. However, the original query incorrectly counts all rows, while the NoREC version correctly filters them out.

       
      -- SCHEMA
       
      CREATE TABLE comments (
          id          INT,
          post_id     INT,
          user_id     INT,
          content     VARCHAR(1000),
          is_spam     INT,
          created_at  TIMESTAMP NULL
      );
       
      INSERT INTO comments VALUES
      (1, 1, 2, 'Nice post', 0, '2022-01-20 10:00:00'),
      (2, 1, 3, 'Spam here', 1,  '2022-01-21 11:00:00'),
      (3, 2, 1, 'Thanks',    0, '2022-01-22 12:00:00'),
      (4, 4, 5, NULL,        0, '2022-01-23 13:00:00');
       
      -- TRIGGER SQLs:
       
      SELECT COUNT(*)
      FROM 
        comments AS ref_0
      where (REGEXP_SUBSTR(
              'sk',
              CASE WHEN ref_0.content IS NULL 
                  THEN CASE WHEN ref_0.created_at IS NULL 
                            THEN COALESCE('8wc3', 'tprjcl') 
                            ELSE REGEXP_SUBSTR(REGEXP_SUBSTR('jo', 'ix1'), 'gii') 
                        END
                  ELSE REGEXP_SUBSTR('5rw6', 'al74v8') 
              END
          ) < FORMAT(ref_0.is_spam, 16));
       
      -- RESULT: {4}
       
      SELECT SUM(
          CASE WHEN (
                  REGEXP_SUBSTR(
                      'sk',
                      CASE WHEN ref_0.content IS NULL 
                          THEN CASE WHEN ref_0.created_at IS NULL 
                                   THEN COALESCE('8wc3', 'tprjcl') 
                                   ELSE REGEXP_SUBSTR(REGEXP_SUBSTR('jo', 'ix1'), 'gii') 
                               END
                          ELSE REGEXP_SUBSTR('5rw6', 'al74v8') 
                      END
                  ) < FORMAT(ref_0.is_spam, 16)
              ) IS TRUE THEN 1 ELSE 0 END + 0
      )
      FROM comments AS ref_0;
       
      -- RESULT: {0}
      
      

      Attachments

        Issue Links

          Activity

            People

              psergei Sergei Petrunia
              fmu Jasper Andrew
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.