Uploaded image for project: 'MariaDB ColumnStore'
  1. MariaDB ColumnStore
  2. MCOL-5316

Subselect sorting is single-threaded always

Details

    • Task
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 22.08.4
    • 23.10.4
    • PrimProc
    • None
    • 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12, 2024-1, 2024-2, 2025-2

    Description

      The recent external sorting project reveals that subquery has thread factor set to 1 no matter what is a value of columnstore_orderby_threads.

      Attachments

        Activity

          drrtuy Roman created issue -
          alexey.vorovich alexey vorovich (Inactive) made changes -
          Field Original Value New Value
          Fix Version/s 23.02 [ 28209 ]
          Fix Version/s 23.03.1 [ 28458 ]
          toddstoffel Todd Stoffel (Inactive) made changes -
          Rank Ranked higher
          toddstoffel Todd Stoffel (Inactive) made changes -
          Sprint 2022-22 [ 672 ] 2022-22, 2022-23 [ 672, 686 ]
          toddstoffel Todd Stoffel (Inactive) made changes -
          Fix Version/s 23.08 [ 28540 ]
          Fix Version/s 23.02 [ 28209 ]
          toddstoffel Todd Stoffel (Inactive) made changes -
          Sprint 2022-22, 2022-23 [ 672, 686 ] 2022-22, 2022-23, 2022-24 [ 672, 686, 698 ]
          toddstoffel Todd Stoffel (Inactive) made changes -
          Sprint 2022-22, 2022-23, 2023-4 [ 672, 686, 698 ] 2022-22, 2022-23, 2023-4, 2023-5 [ 672, 686, 698, 702 ]
          toddstoffel Todd Stoffel (Inactive) made changes -
          Sprint 2022-22, 2022-23, 2023-4, 2023-5 [ 672, 686, 698, 702 ] 2022-22, 2022-23, 2023-4, 2023-5, 2023-6 [ 672, 686, 698, 702, 706 ]
          toddstoffel Todd Stoffel (Inactive) made changes -
          Sprint 2022-22, 2022-23, 2023-4, 2023-5, 2023-6 [ 672, 686, 698, 702, 706 ] 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7 [ 672, 686, 698, 702, 706, 726 ]
          toddstoffel Todd Stoffel (Inactive) made changes -
          Rank Ranked lower
          toddstoffel Todd Stoffel (Inactive) made changes -
          Sprint 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7 [ 672, 686, 698, 702, 706, 726 ] 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8 [ 672, 686, 698, 702, 706, 726, 728 ]
          drrtuy Roman made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          alexey.vorovich alexey vorovich (Inactive) made changes -
          Summary Subselect sorting is single-threaded whatsoever Subselect sorting is single-threaded always
          toddstoffel Todd Stoffel (Inactive) made changes -
          Sprint 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8 [ 672, 686, 698, 702, 706, 726, 728 ] 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-9 [ 672, 686, 698, 702, 706, 726, 728, 733 ]
          toddstoffel Todd Stoffel (Inactive) made changes -
          Sprint 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-9 [ 672, 686, 698, 702, 706, 726, 728, 733 ] 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10 [ 672, 686, 698, 702, 706, 726, 728, 734 ]
          toddstoffel Todd Stoffel (Inactive) made changes -
          Sprint 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10 [ 672, 686, 698, 702, 706, 726, 728, 734 ] 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11 [ 672, 686, 698, 702, 706, 726, 728, 734, 737 ]
          JIraAutomate JiraAutomate added a comment -

          Automated message:
          ----------------------------
          Since this issue has not been updated since 6 weeks, it's time to move it back to Stalled.

          JIraAutomate JiraAutomate added a comment - Automated message: ---------------------------- Since this issue has not been updated since 6 weeks, it's time to move it back to Stalled.
          julien.fritsch Julien Fritsch made changes -
          Status In Progress [ 3 ] Stalled [ 10000 ]
          julien.fritsch Julien Fritsch made changes -
          Status Stalled [ 10000 ] In Progress [ 3 ]
          julien.fritsch Julien Fritsch made changes -
          Status In Progress [ 3 ] Stalled [ 10000 ]
          leonid.fedorov Leonid Fedorov made changes -
          Assignee Roman [ drrtuy ] Leonid Fedorov [ JIRAUSER48443 ]
          leonid.fedorov Leonid Fedorov made changes -
          Status Stalled [ 10000 ] In Progress [ 3 ]
          leonid.fedorov Leonid Fedorov made changes -
          Sprint 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11 [ 672, 686, 698, 702, 706, 726, 728, 734, 737 ] 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-13 [ 672, 686, 698, 702, 706, 726, 728, 734, 737, 748 ]
          JIraAutomate JiraAutomate made changes -
          Status In Progress [ 3 ] Stalled [ 10000 ]
          julien.fritsch Julien Fritsch made changes -
          Sprint 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12 [ 672, 686, 698, 702, 706, 726, 728, 734, 737, 748 ] 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12, 2023-13 [ 672, 686, 698, 702, 706, 726, 728, 734, 737, 748, 755 ]
          leonid.fedorov Leonid Fedorov made changes -
          Component/s PrimProc [ 13700 ]
          Component/s MDB Plugin [ 13801 ]
          julien.fritsch Julien Fritsch made changes -
          Sprint 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12, 2024-1 [ 672, 686, 698, 702, 706, 726, 728, 734, 737, 748, 755 ] 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12, 2024-1, 2024-2 [ 672, 686, 698, 702, 706, 726, 728, 734, 737, 748, 755, 764 ]
          leonid.fedorov Leonid Fedorov made changes -
          Fix Version/s 23.10.3 [ 29862 ]
          Fix Version/s 23.10 [ 28540 ]
          leonid.fedorov Leonid Fedorov made changes -
          Status Stalled [ 10000 ] In Progress [ 3 ]
          leonid.fedorov Leonid Fedorov made changes -
          Status In Progress [ 3 ] In Testing [ 10301 ]
          leonid.fedorov Leonid Fedorov made changes -
          Fix Version/s 23.10 [ 28540 ]
          Fix Version/s 23.10.3 [ 29862 ]
          julien.fritsch Julien Fritsch made changes -
          Sprint 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12, 2024-1, 2024-2 [ 672, 686, 698, 702, 706, 726, 728, 734, 737, 748, 755, 764 ] 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12, 2024-1, 2024-2, 2024-3 [ 672, 686, 698, 702, 706, 726, 728, 734, 737, 748, 755, 764, 784 ]
          allen.herrera Allen Herrera made changes -
          Sprint 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12, 2024-1, 2024-2, 2025-1 [ 672, 686, 698, 702, 706, 726, 728, 734, 737, 748, 755, 764, 784 ] 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12, 2024-1, 2024-2, 2025-2 [ 672, 686, 698, 702, 706, 726, 728, 734, 737, 748, 755, 764, 788 ]
          drrtuy Roman made changes -
          Resolution Fixed [ 1 ]
          Status In Testing [ 10301 ] Closed [ 6 ]
          leonid.fedorov Leonid Fedorov made changes -
          Assignee Leonid Fedorov [ JIRAUSER48443 ] Aleksei Bukhalov [ JIRAUSER56149 ]
          leonid.fedorov Leonid Fedorov made changes -
          Resolution Fixed [ 1 ]
          Status Closed [ 6 ] Stalled [ 10000 ]
          leonid.fedorov Leonid Fedorov made changes -
          Status Stalled [ 10000 ] Confirmed [ 10101 ]
          leonid.fedorov Leonid Fedorov made changes -
          Assignee Aleksei Bukhalov [ JIRAUSER56149 ] Leonid Fedorov [ JIRAUSER48443 ]
          leonid.fedorov Leonid Fedorov made changes -
          Status Confirmed [ 10101 ] In Progress [ 3 ]
          leonid.fedorov Leonid Fedorov made changes -
          Status In Progress [ 3 ] In Testing [ 10301 ]
          leonid.fedorov Leonid Fedorov made changes -
          Assignee Leonid Fedorov [ JIRAUSER48443 ] Aleksei Bukhalov [ JIRAUSER56149 ]
          abukhalov Aleksei Bukhalov added a comment - - edited

          After a series of tests it seems like the patch is not working.

          I used
          mariadb 10.6-enterprise
          columnstore stable-23.10
          DB was filled with 10gb dataset using burza.

          Queries:
          query without subselect:
          SELECT l_orderkey
          FROM LINEITEM
          ORDER BY
          RAND(),
          EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)),
          POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1)
          LIMIT 10;

          query with subselect:
          SELECT l_orderkey
          FROM (
          SELECT l_orderkey
          FROM LINEITEM
          ORDER BY
          RAND(),
          EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)),
          POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1)
          ) AS subquery
          LIMIT 10;

          I tried to use different values of columnstore_orderby_threads (e.g. SET SESSION columnstore_orderby_threads = 4), but strangely, did not see any effect of it. In htop all 16 cores are loaded and that does not change during query execution.

          • results without patch:
            query without subselect:
            there is an even load on all cores the whole time until query finishes execution
            10 rows in set (6.310 sec)
            10 rows in set (6.282 sec)
            10 rows in set (6.207 sec)

          query with subselect:
          with subselect, for several(1-3) seconds there is an even load on all cores, but after that
          there is a stable 100 load on only one core until query finishes execution,
          all the other cores are between 0 and 2 percent load.
          10 rows in set (3 min 7.105 sec)
          10 rows in set (3 min 5.934 sec)
          10 rows in set (2 min 58.059 sec)

          • with patch: (I cherry picked the commit and rebuilt the code)

          query without subselect:
          there is an even load on all cores the whole time until query finishes execution
          10 rows in set (7.110 sec)
          10 rows in set (6.152 sec)
          10 rows in set (6.201 sec)

          query with subselect:
          cores load looks the same as before patch
          10 rows in set (3 min 3.735 sec)
          10 rows in set (3 min 18.246 sec)

          abukhalov Aleksei Bukhalov added a comment - - edited After a series of tests it seems like the patch is not working. I used mariadb 10.6-enterprise columnstore stable-23.10 DB was filled with 10gb dataset using burza. Queries: query without subselect: SELECT l_orderkey FROM LINEITEM ORDER BY RAND(), EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)), POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1) LIMIT 10; query with subselect: SELECT l_orderkey FROM ( SELECT l_orderkey FROM LINEITEM ORDER BY RAND(), EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)), POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1) ) AS subquery LIMIT 10; I tried to use different values of columnstore_orderby_threads (e.g. SET SESSION columnstore_orderby_threads = 4), but strangely, did not see any effect of it. In htop all 16 cores are loaded and that does not change during query execution. results without patch : query without subselect: there is an even load on all cores the whole time until query finishes execution 10 rows in set (6.310 sec) 10 rows in set (6.282 sec) 10 rows in set (6.207 sec) query with subselect: with subselect, for several(1-3) seconds there is an even load on all cores, but after that there is a stable 100 load on only one core until query finishes execution, all the other cores are between 0 and 2 percent load. 10 rows in set (3 min 7.105 sec) 10 rows in set (3 min 5.934 sec) 10 rows in set (2 min 58.059 sec) with patch: (I cherry picked the commit and rebuilt the code) query without subselect: there is an even load on all cores the whole time until query finishes execution 10 rows in set (7.110 sec) 10 rows in set (6.152 sec) 10 rows in set (6.201 sec) query with subselect: cores load looks the same as before patch 10 rows in set (3 min 3.735 sec) 10 rows in set (3 min 18.246 sec)
          abukhalov Aleksei Bukhalov made changes -
          Status In Testing [ 10301 ] Stalled [ 10000 ]
          abukhalov Aleksei Bukhalov added a comment - - edited

          To avoid potential partial sort optimisation and maximize resource consumption of ORDER BY, LIMIT keyword was removed:

          Queries:
          query without subselect:
          SELECT l_orderkey
          FROM LINEITEM
          ORDER BY
          RAND(),
          EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)),
          POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1);

          query with subselect:
          SELECT l_orderkey
          FROM (
          SELECT l_orderkey
          FROM LINEITEM
          ORDER BY
          RAND(),
          EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)),
          POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1)
          ) AS subquery;

          10gb dataset was used.
          To change columnstore_orderby_threads value I used
          SET GLOBAL command, (SET GLOBAL columnstore_orderby_threads = 1; SET GLOBAL columnstore_orderby_threads = 16)

          no subselect, with patch and columnstore_orderby_threads == 16
          59986052 rows in set (1 min 4.815 sec)
          59986052 rows in set (1 min 4.975 sec)

          no subselect, with patch and columnstore_orderby_threads == 1
          59986052 rows in set (3 min 32.506 sec)
          59986052 rows in set (3 min 29.330 sec)

          subselect, with patch and columnstore_orderby_threads == 16
          59986052 rows in set (3 min 29.684 sec)
          59986052 rows in set (3 min 35.516 sec)

          subselect, with patch and columnstore_orderby_threads == 1
          59986052 rows in set (3 min 33.364 sec)
          59986052 rows in set (3 min 27.356 sec)

          Change in columnstore_orderby_threads from 16 to 1 only affects processing time of request without subquery, so I am assuming that sorting in requests with subquery is still single threaded always

          abukhalov Aleksei Bukhalov added a comment - - edited To avoid potential partial sort optimisation and maximize resource consumption of ORDER BY, LIMIT keyword was removed: Queries: query without subselect: SELECT l_orderkey FROM LINEITEM ORDER BY RAND(), EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)), POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1); query with subselect: SELECT l_orderkey FROM ( SELECT l_orderkey FROM LINEITEM ORDER BY RAND(), EXP(SIN(l_quantity) * LOG(l_extendedprice + 1) * COS(l_discount + 1) + TAN(l_discount)), POW(l_extendedprice, 7) / (l_discount + 1) + SQRT(ABS(l_quantity - l_discount)) + LOG(POW(l_quantity, 3) + 1) ) AS subquery; 10gb dataset was used. To change columnstore_orderby_threads value I used SET GLOBAL command, (SET GLOBAL columnstore_orderby_threads = 1; SET GLOBAL columnstore_orderby_threads = 16) no subselect, with patch and columnstore_orderby_threads == 16 59986052 rows in set (1 min 4.815 sec) 59986052 rows in set (1 min 4.975 sec) no subselect, with patch and columnstore_orderby_threads == 1 59986052 rows in set (3 min 32.506 sec) 59986052 rows in set (3 min 29.330 sec) subselect, with patch and columnstore_orderby_threads == 16 59986052 rows in set (3 min 29.684 sec) 59986052 rows in set (3 min 35.516 sec) subselect, with patch and columnstore_orderby_threads == 1 59986052 rows in set (3 min 33.364 sec) 59986052 rows in set (3 min 27.356 sec) Change in columnstore_orderby_threads from 16 to 1 only affects processing time of request without subquery, so I am assuming that sorting in requests with subquery is still single threaded always
          allen.herrera Allen Herrera made changes -
          Status Stalled [ 10000 ] Confirmed [ 10101 ]
          allen.herrera Allen Herrera made changes -
          Assignee Aleksei Bukhalov [ JIRAUSER56149 ] Leonid Fedorov [ JIRAUSER48443 ]
          allen.herrera Allen Herrera made changes -
          Sprint 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12, 2024-1, 2024-2, 2025-2 [ 672, 686, 698, 702, 706, 726, 728, 734, 737, 748, 755, 764, 788 ] 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12, 2024-1, 2024-2 [ 672, 686, 698, 702, 706, 726, 728, 734, 737, 748, 755, 764 ]
          leonid.fedorov Leonid Fedorov made changes -
          Fix Version/s 23.10.4 [ 29991 ]
          Fix Version/s 23.10 [ 28540 ]
          Resolution Fixed [ 1 ]
          Status Confirmed [ 10101 ] Closed [ 6 ]
          leonid.fedorov Leonid Fedorov made changes -
          Sprint 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12, 2024-1, 2024-2 [ 672, 686, 698, 702, 706, 726, 728, 734, 737, 748, 755, 764 ] 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12, 2024-1, 2024-2, 2025-2 [ 672, 686, 698, 702, 706, 726, 728, 734, 737, 748, 755, 764, 788 ]

          Fixed.

          Tested on 10gb dataset:

          set columnstore_orderby_threads=1

          SELECT l_orderkey
          FROM (
          SELECT l_orderkey
          FROM LINEITEM
          ORDER BY 1 LIMIT 100
          ) AS subquery;

          100 rows in set (1.431 sec)
          100 rows in set (1.396 sec)
          100 rows in set (1.438 sec)

          After
          set columnstore_orderby_threads=16

          The same request performs as follows:

          100 rows in set (0.347 sec)
          100 rows in set (0.344 sec)
          100 rows in set (0.349 sec)

          Previously used test request was not showing much difference between orderby_threads == 1 and orderby_threads == 16 apparently due to math functions not being split to threads. Math functions were taking the most of processing time and difference in orderby was not apparent.

          abukhalov Aleksei Bukhalov added a comment - Fixed. Tested on 10gb dataset: set columnstore_orderby_threads=1 SELECT l_orderkey FROM ( SELECT l_orderkey FROM LINEITEM ORDER BY 1 LIMIT 100 ) AS subquery; 100 rows in set (1.431 sec) 100 rows in set (1.396 sec) 100 rows in set (1.438 sec) After set columnstore_orderby_threads=16 The same request performs as follows: 100 rows in set (0.347 sec) 100 rows in set (0.344 sec) 100 rows in set (0.349 sec) Previously used test request was not showing much difference between orderby_threads == 1 and orderby_threads == 16 apparently due to math functions not being split to threads. Math functions were taking the most of processing time and difference in orderby was not apparent.

          People

            leonid.fedorov Leonid Fedorov
            drrtuy Roman
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.