Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-6529

optimize VARCHAR temp storage during EITS ANALYZE

    XMLWordPrintable

Details

    Description

      Filing this https://jira.mariadb.org/browse/MDEV-6181?focusedCommentId=57209&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-57209

      as a separate issue:

      Motivated by CSC#7897, I've tried to see how much space can ANALYZE use. (So far my understanding was that max. temp dir usage would be rougly equal to size of the table w/o indexes).

      Let's try this:

      create table ten(a int);
      insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
       
      create table one_k(a int);
      insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
       
      create table t1 (pk int primary key, col1 varchar(100)) charset=utf8;
      insert into t1 select A.a+B.a*1000, concat('val-', A.a+B.a*1000) from one_k A, one_k B;
      

      The table has 1M rows. Rows in the VARCHAR(100) column occupy less than
      300 bytes (which is typical).

      The table on disk:

       -rw-rw---- 1 psergey psergey  44M Aug  1 23:09 t1.ibd
      

      Now,

      analyze table t1 persistent for all;
      

        Breakpoint 4, unique_write_to_file_with_count (key=0x7ffebf2ce0a0 "\005", count=1, unique=0x7ffebf047ef0) at /home/psergey/dev2/10.0/sql/uniques.cc:53
      (gdb) print key
        $7 = (uchar *) 0x7ffebf2ce0a0 "\005"
      (gdb)  
      (gdb) p unique->size
        $8 = 302
      

      It's writing 300 bytes, the unpacked length.

      In total, we get:

        Breakpoint 3, Unique::flush (this=0x7ffebf047ef0) at /home/psergey/dev2/10.0/sql/uniques.cc:379
        $46 = 281,734,200
        Breakpoint 3, Unique::flush (this=0x7ffebf047ef0) at /home/psergey/dev2/10.0/sql/uniques.cc:379
        $48 = 297,386,100
      ...
        Breakpoint 3, Unique::flush (this=0x7ffebf047a30) at /home/psergey/dev2/10.0/sql/uniques.cc:379
        $47 = 4,194,304
      

      300MBit = 286 M, i.e. ANALYZE may require much more space than is occupied by the table (44M)

      Attachments

        Issue Links

          Activity

            People

              cvicentiu Vicențiu Ciorbaru
              psergei Sergei Petrunia
              Votes:
              3 Vote for this issue
              Watchers:
              14 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.