  1. MariaDB ColumnStore
  2. MCOL-5250

Disk-based DISTINCT

Details

    • 2021-17, 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11

    Description

      As of 22.08.01, MCS performs DISTINCT processing in TupleAnnexStep. This step uses a hash map for the purpose. The solution is simple, but it:

      • lacks scalability
      • can't leverage the disk-based capabilities of the RowStorage class used by GROUP BY
      • keeps the hash map outside the RAM accounting done by ResourceManager
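
      To make the limitation concrete, here is a minimal sketch of hash-based DISTINCT. It is not the TupleAnnexStep code: the Row alias and the distinctInMemory function are hypothetical names, and rows are modelled as plain strings rather than the engine's row groups.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a row; the real engine works on binary row data.
using Row = std::string;

// Keeps the first occurrence of every row, preserving input order.
// The set holds one entry per distinct row, so memory grows with the
// number of distinct values. Nothing here reports that growth to a
// memory accountant, and nothing can be moved to disk.
std::vector<Row> distinctInMemory(const std::vector<Row>& input)
{
    std::unordered_set<Row> seen;
    std::vector<Row> out;
    for (const Row& r : input)
    {
        if (seen.insert(r).second)  // true only for a row not seen before
            out.push_back(r);
    }
    return out;
}
```

      With a high-cardinality input the set ends up holding nearly every row, which is the case where spilling and accounting matter most.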

      This issue is about a new DISTINCT implementation (presumably based on RowStorage) that:

      • can do external DISTINCT, spilling to disk if necessary
      • has its RAM consumption counted by ResourceManager
      • scales (this might be tricky, since DISTINCT processing overlaps with ORDER BY)
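
      One possible shape for the first two goals is sketched below, assuming C++17. It does not use RowStorage or ResourceManager, whose interfaces are not described here; SpillingDistinct, its byte budget, and the temporary file names are all hypothetical. Rows are newline-free strings. The sketch swaps in a sort-based technique: unique rows are buffered until a byte budget is exceeded, then written as a sorted run, and the runs are merged at the end with duplicates dropped.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <functional>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical external DISTINCT over newline-free string rows.
class SpillingDistinct
{
public:
    SpillingDistinct(std::size_t budgetBytes, fs::path dir)
        : budget_(budgetBytes), dir_(std::move(dir)) {}

    void add(const std::string& row)
    {
        if (!rows_.insert(row).second)
            return;            // duplicate within the current in-memory run
        used_ += row.size();   // crude accounting: payload bytes only
        if (used_ > budget_)
            spill();
    }

    // Streams every distinct row to emit exactly once, in sorted order.
    void finish(const std::function<void(const std::string&)>& emit)
    {
        if (runs_.empty())
        {
            for (const auto& r : rows_) emit(r);  // nothing was spilled
            return;
        }
        if (!rows_.empty()) spill();
        std::vector<std::ifstream> in;
        for (const auto& p : runs_) in.emplace_back(p);

        using Head = std::pair<std::string, std::size_t>;  // (row, run index)
        std::priority_queue<Head, std::vector<Head>, std::greater<Head>> heap;
        std::string line;
        for (std::size_t i = 0; i < in.size(); ++i)
            if (std::getline(in[i], line)) heap.emplace(line, i);

        bool haveLast = false;
        std::string last;
        while (!heap.empty())
        {
            Head h = heap.top();
            heap.pop();
            if (!haveLast || h.first != last)  // drop duplicates across runs
            {
                emit(h.first);
                last = h.first;
                haveLast = true;
            }
            if (std::getline(in[h.second], line)) heap.emplace(line, h.second);
        }
        in.clear();  // close the streams before deleting the files
        for (const auto& p : runs_) fs::remove(p);
        runs_.clear();
    }

    std::size_t spilledRuns() const { return runs_.size(); }
    std::size_t bytesInUse() const { return used_; }

private:
    void spill()
    {
        // Hypothetical naming scheme; not safe for concurrent instances.
        fs::path p = dir_ / ("distinct_run_" + std::to_string(runs_.size()) + ".tmp");
        std::ofstream out(p, std::ios::trunc);
        for (const auto& r : rows_) out << r << '\n';  // std::set iterates sorted
        runs_.push_back(p);
        rows_.clear();
        used_ = 0;
    }

    std::size_t budget_;
    std::size_t used_ = 0;
    fs::path dir_;
    std::set<std::string> rows_;
    std::vector<fs::path> runs_;
};
```

      The used_ counter stands in for whatever would be reported to a memory accountant. Because the runs are sorted, the merged output comes out ordered as a side effect, which touches on the overlap with ORDER BY noted in the last bullet; a hash-partitioned spill would be an alternative that does not impose an order.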

            People

              drrtuy Roman