Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25168

NULL values from LEFT JOIN

    XMLWordPrintable

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.3.27
    • 10.3(EOL)
    • Optimizer
    • None
    • Focal (docker) and Debian 10

    Description

      Fundamentally I have a query like this:

      SELECT outer.id, leftJoinedSubQuery.something
      FROM innoDbTable outer
      LEFT JOIN (
         SELECT id, COUNT(*) something
         FROM anotherTable
         ...
         GROUP BY id
      ) leftJoinedSubQuery ON leftJoinedSubQuery.id = outer.id
      WHERE
         -- various conditions on outer --
        AND EXISTS (
            SELECT id FROM yetAnotherTable WHERE yetAnotherTable.id = outer.id
        )
      

      Where the leftJoinedSubQuery returns rows like this:

      id            something
      1             123
      

      And the outer returns a row with ID 1 too.

      Now depending on the WHERE clauses - the something value might show up as NULL for *all* rows, including the row where ID = 1.

      The actual query is at
      https://dba.stackexchange.com/q/287133/4628

      and copied below:

      SELECT c.id contact_id, engagement.sent
      FROM contact c
      LEFT JOIN (
          /* ➊ Inner query */
          SELECT eq.contact_id contact_id, COUNT(*) sent
          FROM mailing_job j
          INNER JOIN mailing_event_queue eq ON eq.job_id = j.id
          WHERE j.end_date > NOW() - INTERVAL 3 MONTH
          GROUP BY eq.contact_id
        ) engagement ON engagement.contact_id = c.id
       
       WHERE
        /* ➋ various conditions on table 'c' */
       do_not_email = 0 AND is_opt_out = 0 AND is_deceased = 0 AND is_deleted = 0
         AND contact_type = 'Individual'
       
         /* ➌ */
         AND EXISTS (
            SELECT gc.contact_id
            FROM group_contact gc
            WHERE gc.contact_id = c.id AND gc.group_id IN (30,386,14,6,214,5,88,361,334,18,9,17,240,7,13,10,292,291,290,12)
          )
       
         /* ➍ AND c.id=1 */
      ORDER BY c.id
      

      For example,

      • ommitting ➋ makes it work
      • ommitting ➌ makes it work
      • un-commenting ➍ makes it work

      I can't reproduce this with dummy data. The tables in use have between 200k and 10M rows each and the query takes ~40s+ to run.

      I'm not used to Jira, so apols if I've not done this corectly.

      Note, this rewrite of the query (by dba.stackexchange.com user ypercube - see link above) returns the correct results with and without the ➍ clause. It's very slow compared to the original though.

      -- Query Z
        SELECT z.contact_id, engagement.sent
        FROM
            (
            SELECT c.id contact_id
            FROM contact c
       
             WHERE
              /* ➋ various conditions on table 'c' */
             c.do_not_email = 0 AND c.is_opt_out = 0 AND c.is_deceased = 0 AND c.is_deleted = 0
               AND c.contact_type = 'Individual'
       
               /* ➌ */
               AND EXISTS (
                  SELECT gc.contact_id
                  FROM group_contact gc
                  WHERE gc.contact_id = c.id AND gc.group_id IN (30,386,14,6,214,5,88,361,334,18,9,17,240,7,13,10,292,291,290,12)
                )
       
               /* ➍ */
               AND c.id=1
            ) AS z
            LEFT JOIN (
                /* ➊ Inner query */
                SELECT eq.contact_id contact_id, COUNT(*) sent
                FROM mailing_job j
                INNER JOIN mailing_event_queue eq ON eq.job_id = j.id
                WHERE j.end_date > NOW() - INTERVAL 3 MONTH
                GROUP BY eq.contact_id
              ) engagement ON engagement.contact_id = z.contact_id
            ;
      

      I've attached the create table statements.

      Attachments

        Activity

          People

            psergei Sergei Petrunia
            artfulrobot Rich
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.