Uploaded image for project: 'MariaDB ColumnStore'
  1. MariaDB ColumnStore
  2. MCOL-2266

Make ColumnStore "flatten" ("merge") subqueries when appropriate

    XMLWordPrintable

Details

    Description

      Queries like SELECT count(a) FROM (SELECT a,b,c,d ..., z FROM table) t are presently insufficiently optimized in columnstore, the way they are optimized in every RDBMS.

      Normally, the execution plan for a query like that should be identical to SELECT count(a) FROM table. But it isn't. For example:

      select count(year) from (select * from flights) q;
      Desc Mode Table TableOID ReferencedColumns PIO LIO PBE Elapsed Rows
      BPS PM flights 3028 (year) 0 3176 0 0.378 12717658
      TNS UM - - - - - - 0.373 12717658
      TAS UM - - - - - - 0.361 1
      TNS UM - - - - - - 0.000 1

      All rows from all nodes are passed to the PM1 and the entire aggregation is conducted by ExeMgr on it. By contrast:

      MariaDB [bts]> select count(year) from flights;

      Desc Mode Table TableOID ReferencedColumns PIO LIO PBE Elapsed Rows
      BPS PM flights 3028 (year) 0 3176 0 0.075 1588
      TAS UM - - - - - - 0.067 1
      TNS UM - - - - - - 0.000 1

      Nodes aggregate locally and in parallel, and send partial aggregated rows of which there are a lot less to the primary which completes the aggregation.

      The difference in response time is obviously very significant (17x in the two queries above).

      Attachments

        Activity

          People

            tntnatbry Gagan Goel (Inactive)
            adrian.zhelev Adrian Zhelev (Inactive)
            Votes:
            1 Vote for this issue
            Watchers:
            9 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.