Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32349

Window Lag function returning different result under from MySQL-8.0

Details

    Description

      CREATE TABLE aggregates
      (
        `date` date,
        `ticker` varchar(3),
        l float,
        is_fs tinyint(1)
      );
       
      INSERT INTO aggregates
      VALUES
      	('2016-08-02', 'GFF', 16.75, NULL),
      	('2016-08-24', 'GFF', 16.75, NULL),
          ('2016-09-16', 'GFF', 16.75, NULL),
          ('2016-01-06', 'GFF', 16.75, NULL),
          ('2016-10-10', 'GFF', 16.75, NULL);
      

       
      SELECT 
       
      (1=1) 
      AND 
      (
          (LAG(l) OVER (PARTITION BY ticker ORDER BY date) > 1 AND 1=1) 
      OR (LAG(l) OVER (PARTITION BY ticker ORDER BY date) > 1 AND 1=1)
      )
       
      AS is_fs FROM `aggregates` WHERE YEAR(date) = 2016 AND ticker = 'GFF' ORDER BY is_fs DESC
      

      result:

      NULL
      NULL
      NULL
      NULL
      NULL
      

      https://www.db-fiddle.com/f/iEoGYnnKA3pmkB9piu8pWk/0

      MySQL-8.0 result

      1
      1
      1
      1
      NULL
      
      

      Attachments

        Issue Links

          Activity

            cvicentiu Vicențiu Ciorbaru added a comment - - edited

            The problem is caused by the 3-level Window Function dependency tree.

            Simplified query:

             
            -- Setup
            CREATE TABLE aggregates
            (
            `date` date,
            `ticker` varchar(3),
            l float,
            is_fs tinyint(1)
            );
            INSERT INTO aggregates
            VALUES
            ('2016-08-02', 'GFF', 16.75, NULL),
            ('2016-08-24', 'GFF', 16.75, NULL);
             
            -- Problematic Query
            SELECT
            (
              (SUM(1) OVER () > 0 AND true) OR
              (AVG(1) OVER () > 0 AND true)
            ) and true
            FROM `aggregates`;
            

            The execution of this query generates a temporary table with the following fields corresponding to the following Item subtrees (in this order)

            • Field 0 - Item_cond_or(ref_to_field_3, ref_to_field2)
            • Field 1 - Item_cond_and(ref_to_field_2, true)
            • Field 2 - Item_window_func(avg...)
            • Field 3 - Item_cond_and(ref_to_field_4, true)
            • Field 4 - Item_window_func(sum...)
            • Field 5 - Item_cond_and(ref_to_field0, true)

            The problem with this field order is that the final step in window function computation does not consider field dependencies properly. To be able to fill the whole table, one must start by computing window functions first (which is done correctly), then the items depending directly on window functions, then the items depending indirectly and so on.

            We currently only properly handle first and second level items containing window functions. 3 levels and more are not evaluated in proper order, hence this bug.

            This translates into a need to do a topological sort of temp-table fields before saving the actual results.

            cvicentiu Vicențiu Ciorbaru added a comment - - edited The problem is caused by the 3-level Window Function dependency tree. Simplified query:   -- Setup CREATE TABLE aggregates ( ` date ` date , `ticker` varchar (3), l float , is_fs tinyint(1) ); INSERT INTO aggregates VALUES ( '2016-08-02' , 'GFF' , 16.75, NULL ), ( '2016-08-24' , 'GFF' , 16.75, NULL );   -- Problematic Query SELECT ( ( SUM (1) OVER () > 0 AND true ) OR ( AVG (1) OVER () > 0 AND true ) ) and true FROM `aggregates`; The execution of this query generates a temporary table with the following fields corresponding to the following Item subtrees (in this order) Field 0 - Item_cond_or(ref_to_field_3, ref_to_field2) Field 1 - Item_cond_and(ref_to_field_2, true) Field 2 - Item_window_func(avg...) Field 3 - Item_cond_and(ref_to_field_4, true) Field 4 - Item_window_func(sum...) Field 5 - Item_cond_and(ref_to_field0, true) The problem with this field order is that the final step in window function computation does not consider field dependencies properly. To be able to fill the whole table, one must start by computing window functions first (which is done correctly), then the items depending directly on window functions, then the items depending indirectly and so on. We currently only properly handle first and second level items containing window functions. 3 levels and more are not evaluated in proper order, hence this bug. This translates into a need to do a topological sort of temp-table fields before saving the actual results.
            cvicentiu Vicențiu Ciorbaru added a comment - - edited

            Discussion with psergei:

            Items can be split into 3 categories:
            1. Those that do not use window functions
            2. Those that ARE window functions
            3. Those that use window function values.

            With this in mind, there should be no need for topological sorting, as all items of type 3 can be computed regardless of ordering. The current code however creates separate columns in the tmp table depending on the expression tree with type 3 items referencing window functions directly, but also with type 3 items referencing window functions via one item indirection. This needs to be investigated further to see why we have an extra layer of Item_refs placed in the tmp table.

            cvicentiu Vicențiu Ciorbaru added a comment - - edited Discussion with psergei : Items can be split into 3 categories: 1. Those that do not use window functions 2. Those that ARE window functions 3. Those that use window function values. With this in mind, there should be no need for topological sorting, as all items of type 3 can be computed regardless of ordering. The current code however creates separate columns in the tmp table depending on the expression tree with type 3 items referencing window functions directly, but also with type 3 items referencing window functions via one item indirection. This needs to be investigated further to see why we have an extra layer of Item_refs placed in the tmp table.

            People

              cvicentiu Vicențiu Ciorbaru
              danblack Daniel Black
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.