MDEV-33305

Testing for MDEV-21829: Packed keys in Unique

Details

    • Task
    • Status: Stalled
    • Major
    • Resolution: Unresolved
    • 10.6
    • Tests
    • None

    Description

      This task covers testing for MDEV-21829: Use packed sort keys in Unique objects.

      Queries affected by MDEV-21829

      Packed Unique has these uses:

      USE1: Histogram collection in EITS
      USE2: agg_func(DISTINCT col, ...)
      USE3: Index_merge stores rowids.

      We need to test the scenario where the data doesn't fit into the in-memory Unique and is spilled to disk.

      The focus is on the scenario where Unique stores variable-sized data, although it's nice to have a test for fixed-size data, too.

      There is no way to turn off the new behavior, so results have to be compared against a server built without the patch.

      A bpftrace script for watching tempfile usage is available at MDEV-32472.

      Variable-sized data

      These are VARCHAR(n) columns.
      (TODO: what happens with CHAR(n)? cvicentiu?)

      Controlling the amount of memory that Unique uses

      How much memory Unique will use for COUNT(DISTINCT) is determined here:

        size_t Item_sum::ram_limitation(THD *thd)
        {
          return MY_MAX(1024,
                   (size_t)MY_MIN(thd->variables.tmp_memory_table_size,
                                  thd->variables.max_heap_table_size));
        }

      How much memory Unique will use in EITS ANALYZE:

        void Column_statistics_collected::init(THD *thd, Field *table_field)
        {
          size_t max_heap_table_size= (size_t)thd->variables.max_heap_table_size;
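
To make it easier to see which session variable has to be lowered for each use, the two excerpts can be restated as a small standalone sketch. It is not server code: `SessionVars`, `count_distinct_ram_limit`, `eits_ram_limit` and `budget_examples` are hypothetical names introduced here, and the byte values are illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the session variables read by the two
// excerpts above; this is not the server's THD structure.
struct SessionVars {
  std::size_t tmp_memory_table_size;
  std::size_t max_heap_table_size;
};

// COUNT(DISTINCT): the smaller of the two variables, but at least
// 1024 bytes (same arithmetic as Item_sum::ram_limitation()).
inline std::size_t count_distinct_ram_limit(const SessionVars &v) {
  return std::max<std::size_t>(
      1024, std::min(v.tmp_memory_table_size, v.max_heap_table_size));
}

// EITS ANALYZE: per the second excerpt, only max_heap_table_size is read.
inline std::size_t eits_ram_limit(const SessionVars &v) {
  return v.max_heap_table_size;
}

// Illustrative values only (bytes).
inline void budget_examples() {
  SessionVars v{16u * 1024 * 1024, 16384};
  assert(count_distinct_ram_limit(v) == 16384);  // max_heap_table_size wins
  assert(eits_ram_limit(v) == 16384);

  v = SessionVars{512, 16u * 1024 * 1024};
  assert(count_distinct_ram_limit(v) == 1024);   // clamped up to the floor
  assert(eits_ram_limit(v) == 16u * 1024 * 1024);
}
```

Read this way, lowering max_heap_table_size alone should shrink the budget for both uses, while tmp_memory_table_size only matters for COUNT(DISTINCT).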
      

      Plan for test#1:

      Create a table with a VARCHAR column.

      • Try different charsets (latin1, utf8mb3, utf8mb4 if you have the data, ucs2)
      • Try a few different collations (There's no need to try all collations)

      Populate the table with enough random data of varying lengths and characters.
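
One possible way to produce such data is a small generator that emits INSERT statements. The sketch below is only an illustration: the function names `random_value` and `make_insert` are hypothetical, the table name `t` and column name `col` follow the statements in this plan, and it assumes the alphabet contains no quotes or backslashes because no escaping is done.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Draw one value: a random number of characters (1..max_chars), each
// picked uniformly from `alphabet`. Alphabet entries are UTF-8 encoded
// strings, so a multi-byte character still counts as one character.
std::string random_value(std::mt19937 &rng,
                         const std::vector<std::string> &alphabet,
                         std::size_t max_chars) {
  assert(!alphabet.empty() && max_chars >= 1);
  std::uniform_int_distribution<std::size_t> len(1, max_chars);
  std::uniform_int_distribution<std::size_t> pick(0, alphabet.size() - 1);
  std::string out;
  for (std::size_t i = len(rng); i > 0; --i) out += alphabet[pick(rng)];
  return out;
}

// Build one multi-row INSERT for the table t(col) used in this plan.
// No escaping is done: keep quotes and backslashes out of the alphabet.
std::string make_insert(std::mt19937 &rng,
                        const std::vector<std::string> &alphabet,
                        std::size_t rows, std::size_t max_chars) {
  assert(rows >= 1);
  std::string sql = "INSERT INTO t (col) VALUES ";
  for (std::size_t i = 0; i < rows; ++i) {
    if (i) sql += ",";
    sql += "('" + random_value(rng, alphabet, max_chars) + "')";
  }
  return sql + ";";
}
```

Seeding `std::mt19937` with a fixed value keeps the data reproducible, so the same statements can be loaded into both the patched and the unpatched server. Adding multi-byte entries to the alphabet is one way to exercise utf8mb3 or utf8mb4 columns, assuming the client connection uses a matching character set.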

      Set @@max_heap_table_size low enough to force Unique to flush to disk.

      Run

      ANALYZE TABLE t PERSISTENT FOR COLUMNS (col) INDEXES ();
      

      Compare the output of

      select * from mysql.column_stats where db_name=database() and table_name='t'
      

      with the server before the patch.

      Also compare bpftrace tempfile usage output for both servers.

      Testing for agg_func(DISTINCT ...)

      TODO

            People

              lstartseva Lena Startseva
              psergei Sergei Petrunia
              Votes: 0
              Watchers: 3