Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-23878

Wrong result with semi-join and splittable derived table

    XMLWordPrintable

Details

    Description

      When the number of conditions inside the "IN" clause matches the value of in_predicate_conversion_threshold , query behavior changes and returned result is incorrect for query.

      Attaching file for case reproduction:

      CREATE DATABASE subscriber_data;
      mysql subscriber_data < subscriber_data.sql

      Execute:

      pager;
      set in_predicate_conversion_threshold = 51;
      pager md5sum;
       
      SELECT object, ind, data FROM `subscriber_data`.`dev_config` AS live JOIN (SELECT object, ind, MAX(collection_time) AS collection_time FROM `subscriber_data`.`dev_config` WHERE collection_time <= '2020-09-24' AND ind IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51) GROUP BY object, ind) AS latest USING (object, ind, collection_time) WHERE ind IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51);
      

      Result is
      13 rows in set (0.039 sec)

      Otherwise setting set in_predicate_conversion_threshold = 1000; (value bigger than number of clauses inside "IN" ):

      pager;
      set in_predicate_conversion_threshold = 1000;
      pager md5sum;
       
      SELECT object, ind, data FROM `subscriber_data`.`dev_config` AS live JOIN (SELECT object, ind, MAX(collection_time) AS collection_time FROM `subscriber_data`.`dev_config` WHERE collection_time <= '2020-09-24' AND ind IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51) GROUP BY object, ind) AS latest USING (object, ind, collection_time) WHERE ind IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51);
      

      Returns
      700 rows in set (0.038 sec)

      The returned result when bug triggers (in_predicate_conversion_threshold = 51) is:
      -------------------------------------------

      object ind data

      -------------------------------------------

      1 51 51
      2 51 0
      3 51 us-east-1d
      4 51 ebd81398742e75f2fafb5511e06720c3
      5 51 0
      6 51 0
      7 51 0
      8 51 9b5d0e93ea552fb8a86420e87410ae21
      10 51 NULL
      11 51 0
      12 51 0
      13 51 0
      14 51 0

      -------------------------------------------
      13 rows in set (0.046 sec)

      But if I reorder the values inside the query IN clause and set "1" at the end instead of 51, result is:
      -------------------------------------------

      object ind data

      -------------------------------------------

      1 1 1
      2 1 0
      3 1 test_by_vcug_device
      4 1 6dc8e300391a4730609c89d9bf5e7564
      5 1 0
      6 1 NULL
      7 1 0
      8 1 8d4ffb6e30134a25b5297975eb3c0155
      10 1 NULL
      11 1 0
      12 1 0
      13 1 0
      14 1 0

      -------------------------------------------
      13 rows in set (0.040 sec)

      It can be seen that on the group by "ind" , the result "ind" value displayed is only the one last used in the "IN" conditions, and there seems to is a "UNION ALL" missing to merge all results when in_predicate_conversion_threshold kicks in.

      Attachments

        Activity

          People

            shagalla Galina Shalygina (Inactive)
            ctutte carlos tutte
            Votes:
            1 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.