Uploaded image for project: 'MariaDB ColumnStore'
  1. MariaDB ColumnStore
  2. MCOL-5316

Subselect sorting is single-threaded always

Details

    • Task
    • Status: Stalled (View Workflow)
    • Major
    • Resolution: Unresolved
    • 22.08.4
    • 23.10
    • PrimProc
    • None
    • 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12, 2024-1, 2024-2, 2025-2

    Description

      The recent external sorting project reveals that subquery has thread factor set to 1 no matter what is a value of columnstore_orderby_threads.

      Attachments

        Activity

          JIraAutomate JiraAutomate added a comment -

          Automated message:
          ----------------------------
          Since this issue has not been updated since 6 weeks, it's time to move it back to Stalled.

          JIraAutomate JiraAutomate added a comment - Automated message: ---------------------------- Since this issue has not been updated since 6 weeks, it's time to move it back to Stalled.
          abukhalov Aleksei Bukhalov added a comment - - edited

          After a series of tests it seems like the patch is not working.

          I used
          mariadb 10.6-enterprise
          columnstore stable-23.10
          DB was filled with 10gb dataset using burza.

          Queries:
          query without subselect:
          SELECT l_orderkey
          FROM LINEITEM
          ORDER BY
          RAND(),
          EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)),
          POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1)
          LIMIT 10;

          query with subselect:
          SELECT l_orderkey
          FROM (
          SELECT l_orderkey
          FROM LINEITEM
          ORDER BY
          RAND(),
          EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)),
          POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1)
          ) AS subquery
          LIMIT 10;

          I tried to use different values of columnstore_orderby_threads (e.g. SET SESSION columnstore_orderby_threads = 4), but strangely, did not see any effect of it. In htop all 16 cores are loaded and that does not change during query execution.

          • results without patch:
            query without subselect:
            there is an even load on all cores the whole time until query finishes execution
            10 rows in set (6.310 sec)
            10 rows in set (6.282 sec)
            10 rows in set (6.207 sec)

          query with subselect:
          with subselect, for several(1-3) seconds there is an even load on all cores, but after that
          there is a stable 100 load on only one core until query finishes execution,
          all the other cores are between 0 and 2 percent load.
          10 rows in set (3 min 7.105 sec)
          10 rows in set (3 min 5.934 sec)
          10 rows in set (2 min 58.059 sec)

          • with patch: (I cherry picked the commit and rebuilt the code)

          query without subselect:
          there is an even load on all cores the whole time until query finishes execution
          10 rows in set (7.110 sec)
          10 rows in set (6.152 sec)
          10 rows in set (6.201 sec)

          query with subselect:
          cores load looks the same as before patch
          10 rows in set (3 min 3.735 sec)
          10 rows in set (3 min 18.246 sec)

          abukhalov Aleksei Bukhalov added a comment - - edited After a series of tests it seems like the patch is not working. I used mariadb 10.6-enterprise columnstore stable-23.10 DB was filled with 10gb dataset using burza. Queries: query without subselect: SELECT l_orderkey FROM LINEITEM ORDER BY RAND(), EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)), POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1) LIMIT 10; query with subselect: SELECT l_orderkey FROM ( SELECT l_orderkey FROM LINEITEM ORDER BY RAND(), EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)), POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1) ) AS subquery LIMIT 10; I tried to use different values of columnstore_orderby_threads (e.g. SET SESSION columnstore_orderby_threads = 4), but strangely, did not see any effect of it. In htop all 16 cores are loaded and that does not change during query execution. results without patch : query without subselect: there is an even load on all cores the whole time until query finishes execution 10 rows in set (6.310 sec) 10 rows in set (6.282 sec) 10 rows in set (6.207 sec) query with subselect: with subselect, for several(1-3) seconds there is an even load on all cores, but after that there is a stable 100 load on only one core until query finishes execution, all the other cores are between 0 and 2 percent load. 10 rows in set (3 min 7.105 sec) 10 rows in set (3 min 5.934 sec) 10 rows in set (2 min 58.059 sec) with patch: (I cherry picked the commit and rebuilt the code) query without subselect: there is an even load on all cores the whole time until query finishes execution 10 rows in set (7.110 sec) 10 rows in set (6.152 sec) 10 rows in set (6.201 sec) query with subselect: cores load looks the same as before patch 10 rows in set (3 min 3.735 sec) 10 rows in set (3 min 18.246 sec)

          People

            abukhalov Aleksei Bukhalov
            drrtuy Roman
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.