Uploaded image for project: 'MariaDB ColumnStore'
  1. MariaDB ColumnStore
  2. MCOL-1205

support queries with circular INNER joins

Details

    • Bug
    • Status: Closed (View Workflow)
    • Blocker
    • Resolution: Fixed
    • 1.0.12
    • 6.1.1
    • ExeMgr
    • None
    • 2018-10, 2018-11, 2018-12, 2018-13, 2018-14, 2018-15, 2018-16, 2021-5, 2021-6, 2021-7, 2021-8, 2021-9

    Description

      Certain types of query will result in the following error:
      ERROR 1815 (HY000) at line 6: Internal error: IDB-1003: Circular joins are not supported.

      This happens if a given table is joined more than once. This is a common pattern and can be generated by bi tools including microstrategy.

      dbt3 query 5 is one such example that should be supported:

      execute the query #5
       
      select
      n_name,
      sum(l_extendedprice * (1 - l_discount)) as revenue
      from
      customer,
      orders,
      lineitem,
      supplier,
      nation,
      region
      where
      c_custkey = o_custkey
      and l_orderkey = o_orderkey
      and l_suppkey = s_suppkey
      and c_nationkey = s_nationkey
      and s_nationkey = n_nationkey
      and n_regionkey = r_regionkey
      and r_name = 'AMERICA'
      and o_orderdate >= '1993-01-01'
      and o_orderdate < date_add( '1993-01-01' , interval 1 year)
      group by
      n_name
      order by
      revenue desc;
      

      Attachments

        Issue Links

          Activity

            we are facing the same issue currently, is there an ETA for this feature?

            DanielW Dalu (Inactive) added a comment - we are facing the same issue currently, is there an ETA for this feature?

            no update since over a year?

            DanielW Dalu (Inactive) added a comment - no update since over a year?

            Hi DanielW,

            We have a PoC fix for this but it can make bad decisions on join order so we never merged it. A fix with a new optimizer for this would be a very large task and there hasn't been enough demand for us to work on a proper fix.

            In the mean time there is a workaround, for the example in the description add "+0" to the c_nationkey / s_nationkey join as follows:

            and c_nationkey = s_nationkey + 0
            

            This makes a separate virtual column internally allowing the join to work, but you may still need to manually order the joins if you find this does not perform well.

            LinuxJedi Andrew Hutchings (Inactive) added a comment - Hi DanielW , We have a PoC fix for this but it can make bad decisions on join order so we never merged it. A fix with a new optimizer for this would be a very large task and there hasn't been enough demand for us to work on a proper fix. In the mean time there is a workaround, for the example in the description add "+0" to the c_nationkey / s_nationkey join as follows: and c_nationkey = s_nationkey + 0 This makes a separate virtual column internally allowing the join to work, but you may still need to manually order the joins if you find this does not perform well.

            Build tested: 5.6.1 ( Drone #2270)

            Tested dbt3 query #5 on a 10g (ten gb) database. It returned an error

            [centos8:root~]# mysql tpch10 -vvv </data/qa/autopilot/performance/dbt3/sql/10g/5.sql.original 
            --------------
            select
            	n_name,
            	sum(l_extendedprice * (1 - l_discount)) as revenue
            from
            	customer,
            	orders,
            	lineitem,
            	supplier,
            	nation,
            	region
            where
            	c_custkey = o_custkey
            	and l_orderkey = o_orderkey
            	and l_suppkey = s_suppkey
            	and c_nationkey = s_nationkey
            	and s_nationkey = n_nationkey
            	and n_regionkey = r_regionkey
            	and r_name = 'MIDDLE EAST'
            	and o_orderdate >= '1994-01-01'
            	and o_orderdate < date_add( '1994-01-01' , interval 1 year)
            group by
            	n_name
            order by
            	revenue desc
            --------------
             
            ERROR 1815 (HY000) at line 6: Internal error: InetStreamSocket::readToMagic: Remote is closed
            Bye
            

            I ran the 1gb qeury again on a 1gb (one gb) database and it took 436 seconds.
            The modified version of query #5 took 7 seconds to execute on a 10g.

            [centos8:root~]# mysql tpch1 -vvv </data/qa/autopilot/performance/dbt3/sql/1g/5.sql.original 
            --------------
            select
            	n_name,
            	sum(l_extendedprice * (1 - l_discount)) as revenue
            from
            	customer,
            	orders,
            	lineitem,
            	supplier,
            	nation,
            	region
            where
            	c_custkey = o_custkey
            	and l_orderkey = o_orderkey
            	and l_suppkey = s_suppkey
            	and c_nationkey = s_nationkey
            	and s_nationkey = n_nationkey
            	and n_regionkey = r_regionkey
            	and r_name = 'AMERICA'
            	and o_orderdate >= '1993-01-01'
            	and o_orderdate < date_add( '1993-01-01' , interval 1 year)
            group by
            	n_name
            order by
            	revenue desc
            --------------
             
            +---------------+--------------------+
            | n_name        | revenue            |
            +---------------+--------------------+
            | PERU          | 8255761247026.0164 |
            | CANADA        | 8246925619913.6201 |
            | ARGENTINA     | 8120116982337.0609 |
            | BRAZIL        | 7977931767054.3294 |
            | UNITED STATES | 7772891387103.9486 |
            +---------------+--------------------+
            5 rows in set (7 min 36.532 sec)
             
             
            Bye
            

            I also compared the 10g dbt3 performance of this build to release 5.5.2, using the modified query #5.

            10g DBT3 performance test in seconds
             
            all 22 queries, with query #5 modified
             
            5.6.1-1 is about 20% faster
             
                       diskRun	CachedRun  Delta
            5.5.2-1    341.596  273.675	   -20%
            5.6.1-1    302.928  243.378    -20$
            

            dleeyh Daniel Lee (Inactive) added a comment - Build tested: 5.6.1 ( Drone #2270) Tested dbt3 query #5 on a 10g (ten gb) database. It returned an error [centos8:root~]# mysql tpch10 -vvv </data/qa/autopilot/performance/dbt3/sql/10g/5.sql.original -------------- select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue from customer, orders, lineitem, supplier, nation, region where c_custkey = o_custkey and l_orderkey = o_orderkey and l_suppkey = s_suppkey and c_nationkey = s_nationkey and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'MIDDLE EAST' and o_orderdate >= '1994-01-01' and o_orderdate < date_add( '1994-01-01' , interval 1 year) group by n_name order by revenue desc --------------   ERROR 1815 (HY000) at line 6: Internal error: InetStreamSocket::readToMagic: Remote is closed Bye I ran the 1gb qeury again on a 1gb (one gb) database and it took 436 seconds. The modified version of query #5 took 7 seconds to execute on a 10g. [centos8:root~]# mysql tpch1 -vvv </data/qa/autopilot/performance/dbt3/sql/1g/5.sql.original -------------- select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue from customer, orders, lineitem, supplier, nation, region where c_custkey = o_custkey and l_orderkey = o_orderkey and l_suppkey = s_suppkey and c_nationkey = s_nationkey and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'AMERICA' and o_orderdate >= '1993-01-01' and o_orderdate < date_add( '1993-01-01' , interval 1 year) group by n_name order by revenue desc --------------   +---------------+--------------------+ | n_name | revenue | +---------------+--------------------+ | PERU | 8255761247026.0164 | | CANADA | 8246925619913.6201 | | ARGENTINA | 8120116982337.0609 | | BRAZIL | 7977931767054.3294 | | UNITED STATES | 7772891387103.9486 | +---------------+--------------------+ 5 rows in set (7 min 36.532 sec)     Bye I also compared the 10g dbt3 performance of this build to release 5.5.2, using the modified query #5. 10g DBT3 performance test in seconds   all 22 queries, with query #5 modified   5.6.1-1 is about 20% faster   diskRun CachedRun Delta 5.5.2-1 341.596 273.675 -20% 5.6.1-1 302.928 243.378 -20$
            dleeyh Daniel Lee (Inactive) added a comment - - edited

            Build tested: 6.1.1 ( #2727)

            With 32 gb memory and a 10gb dbt3 database, query #5 returned an error.

            ERROR 1815 (HY000): Internal error: IDB-2001: Join or subselect exceeds memory limit.

            The query worked if I used the analyze command before executing the query.

            Performance is about the same as the modified query #5.

            dleeyh Daniel Lee (Inactive) added a comment - - edited Build tested: 6.1.1 ( #2727) With 32 gb memory and a 10gb dbt3 database, query #5 returned an error. ERROR 1815 (HY000): Internal error: IDB-2001: Join or subselect exceeds memory limit. The query worked if I used the analyze command before executing the query. Performance is about the same as the modified query #5.

            People

              denis0x0D Denis Khalikov (Inactive)
              dthompson David Thompson (Inactive)
              Votes:
              8 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.