Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-9028

Incorrect query result when combining DISTINCT, LEFT JOIN and subquery with constant column

Details

    • 5.5.55

    Description

      There appears to be a problem with the results returned from a query when DISTINCT, LEFT JOIN and a subquery which contains a constant value are combined.

      I've boiled down the problem into a simple example of how it fails:

      SELECT DISTINCT realtable.id AS realid, virtualtable.id AS virtualid, virtualtable.recordexists
      FROM realtable
      LEFT JOIN (SELECT realtable.id, "yes" AS recordexists FROM realtable) virtualtable
      ON realtable.id = virtualtable.id

      If you create a table called "realtable" with columns int "id" and string "data", insert two records (1, "yes") and (2, "yes") and run the query then you'll get:

      realid, virtualid, recordexists
      1, 1, null
      2, 2, null

      which is incorrect. The value "yes" should be returned in the recordexists column. Obviously the "virtual" records are being joined correctly to the "real" records, because the virtualid returns the correct value, but the constant "yes" is missing. If you remove DISTINCT, change LEFT JOIN to just JOIN, or change "yes" to realtable.data then you get the correct results:

      realid, virtualid, recordexists
      1, 1, yes
      2, 2, yes

      I also confirmed that this is NOT caused by using the same table in both the main and sub queries.

      I've got a query structured like this which is working in Postgress and MySQL, and when delivered to a client who tried it in MariaDB it started failing. This could probably be considered a data loss bug.

      Attachments

        Activity

          nathanl Nathan Lewis created issue -
          elenst Elena Stepanova made changes -
          Field Original Value New Value
          Description There appears to be a problem with the results returned from a query when DISTINCT, LEFT JOIN and a subquery which contains a constant value are combined.

          I've boiled down the problem into a simple example of how it fails:

          SELECT DISTINCT realtable.id AS realid, virtualtable.id AS virtualid, virtualtable.recordexists
          FROM realtable
          LEFT JOIN (SELECT realtable.id, "yes" AS recordexists FROM realtable) virtualtable
          ON realtable.id = virtualtable.id

          If you create a table called "realtable" with columns int "id" and string "data", insert two records (1, "yes") and (2, "yes") and run the query then you'll get:

          realid, virtualid, recordexists
          1, 1, null
          2, 2, null

          which is incorrect. The value "yes" should be returned in the recordexists column. Obviously the "virtual" records are being joined correctly to the "real" records, because the virtualid returns the correct value, but the constant "yes" is missing. If you remove DISTINCT, change LEFT JOIN to just JOIN, or change "yes" to realtable.data then you get the correct results:

          realid, virtualid, recordexists
          1, 1, yes
          2, 2, yes

          I also confirmed that this is NOT caused by using the same table in both the main and sub queries.

          I've got a query structured like this which is working in Postgress and MySQL, and when delivered to a client who tried it in MariaDB it started failing. This could probably be considered a data loss bug.
          There appears to be a problem with the results returned from a query when DISTINCT, LEFT JOIN and a subquery which contains a constant value are combined.

          I've boiled down the problem into a simple example of how it fails:

          {code:sql}
          SELECT DISTINCT realtable.id AS realid, virtualtable.id AS virtualid, virtualtable.recordexists
          FROM realtable
          LEFT JOIN (SELECT realtable.id, "yes" AS recordexists FROM realtable) virtualtable
          ON realtable.id = virtualtable.id
          {code}

          If you create a table called "realtable" with columns int "id" and string "data", insert two records (1, "yes") and (2, "yes") and run the query then you'll get:

          {noformat}
          realid, virtualid, recordexists
          1, 1, null
          2, 2, null
          {noformat}

          which is incorrect. The value "yes" should be returned in the recordexists column. Obviously the "virtual" records are being joined correctly to the "real" records, because the virtualid returns the correct value, but the constant "yes" is missing. If you remove DISTINCT, change LEFT JOIN to just JOIN, or change "yes" to realtable.data then you get the correct results:

          {noformat}
          realid, virtualid, recordexists
          1, 1, yes
          2, 2, yes
          {noformat}

          I also confirmed that this is NOT caused by using the same table in both the main and sub queries.

          I've got a query structured like this which is working in Postgress and MySQL, and when delivered to a client who tried it in MariaDB it started failing. This could probably be considered a data loss bug.
          elenst Elena Stepanova made changes -
          Status Open [ 1 ] Confirmed [ 10101 ]
          elenst Elena Stepanova made changes -
          Affects Version/s 5.5.43 [ 18601 ]
          Affects Version/s 10.1 [ 16100 ]
          Affects Version/s 10.0 [ 16000 ]
          Affects Version/s 5.5 [ 15800 ]
          Affects Version/s 5.5.45-galera [ 19503 ]
          elenst Elena Stepanova made changes -
          Labels regression wrong_result
          elenst Elena Stepanova made changes -
          Fix Version/s 10.1 [ 16100 ]
          Fix Version/s 10.0 [ 16000 ]
          Fix Version/s 5.5 [ 15800 ]

          nathanl, thanks for the report.


          Test case

          create table realtable (id int, data varchar(255));
          insert into realtable values (1,'yes'),(2,'yes');
           
          SELECT DISTINCT realtable.id AS realid, virtualtable.id AS virtualid, virtualtable.recordexists
          FROM realtable
          LEFT JOIN (SELECT realtable.id, 'yes' AS recordexists FROM realtable) virtualtable
          ON realtable.id = virtualtable.id;
           
          drop table realtable;

          The problem was introduced by the following revision in 5.5.43:

          commit 20109712aeb3d23e5e975780897ad236cbcd2ddc
          Author: Oleksandr Byelkin <sanja@mariadb.com>
          Date:   Tue Apr 14 23:18:54 2015 +0200
           
              MDEV-6892: WHERE does not apply
              
              Taking into account implicit dependence of constant view field from nullable table of left join added.
              
              Fixed finding real table to check if it turned to NULL (materialized view & derived taken into account)
              
              Removed incorrect uninitialization.

          elenst Elena Stepanova added a comment - nathanl , thanks for the report. Test case create table realtable (id int, data varchar(255)); insert into realtable values (1,'yes'),(2,'yes');   SELECT DISTINCT realtable.id AS realid, virtualtable.id AS virtualid, virtualtable.recordexists FROM realtable LEFT JOIN (SELECT realtable.id, 'yes' AS recordexists FROM realtable) virtualtable ON realtable.id = virtualtable.id;   drop table realtable; The problem was introduced by the following revision in 5.5.43: commit 20109712aeb3d23e5e975780897ad236cbcd2ddc Author: Oleksandr Byelkin <sanja@mariadb.com> Date: Tue Apr 14 23:18:54 2015 +0200   MDEV-6892: WHERE does not apply Taking into account implicit dependence of constant view field from nullable table of left join added. Fixed finding real table to check if it turned to NULL (materialized view & derived taken into account) Removed incorrect uninitialization.
          elenst Elena Stepanova made changes -
          Assignee Oleksandr Byelkin [ sanja ]
          igor Igor Babaev (Inactive) made changes -
          Assignee Oleksandr Byelkin [ sanja ] Igor Babaev [ igor ]
          igor Igor Babaev (Inactive) made changes -
          Sprint 5.5.55&10.0.30 [ 138 ]
          igor Igor Babaev (Inactive) made changes -
          Status Confirmed [ 10101 ] In Progress [ 3 ]
          igor Igor Babaev (Inactive) made changes -
          Assignee Igor Babaev [ igor ] Oleksandr Byelkin [ sanja ]
          Status In Progress [ 3 ] In Review [ 10002 ]

          OK to push! Thank you a lot for the fix!

          sanja Oleksandr Byelkin added a comment - OK to push! Thank you a lot for the fix!
          sanja Oleksandr Byelkin made changes -
          Assignee Oleksandr Byelkin [ sanja ] Igor Babaev [ igor ]
          Status In Review [ 10002 ] Stalled [ 10000 ]

          The fix for this bug was pushed into the 5.5 tree.

          igor Igor Babaev (Inactive) added a comment - The fix for this bug was pushed into the 5.5 tree.
          igor Igor Babaev (Inactive) made changes -
          Fix Version/s 5.5.55 [ 22311 ]
          Fix Version/s 5.5 [ 15800 ]
          Fix Version/s 10.0 [ 16000 ]
          Fix Version/s 10.1 [ 16100 ]
          Resolution Fixed [ 1 ]
          Status Stalled [ 10000 ] Closed [ 6 ]
          serg Sergei Golubchik made changes -
          Workflow MariaDB v3 [ 72320 ] MariaDB v4 [ 149770 ]

          People

            igor Igor Babaev (Inactive)
            nathanl Nathan Lewis
            Votes:
            1 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.