Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33457

Wrong EITS histogram for large table sizes with charset 'utf8mb4'

    XMLWordPrintable

Details

    Description

      After changes in MDEV-21829 it is generated wrong EITS histogram for big tables with charset 'utf8mb4'

      Table contains 999665 unique and total 10001000 records :

      CREATE TABLE `t100mb4a` (
        `a` varchar(1000) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
        KEY `a` (`a`(575))
      ) ENGINE=Aria;
       
      MariaDB [test]>  SELECT COUNT(*) FROM t100mb4a;
      +----------+
      | COUNT(*) |
      +----------+
      | 10001000 |
      +----------+
       
      MariaDB [test]> SELECT COUNT(DISTINCT a) FROM t100mb4a;
      +-------------------+
      | COUNT(DISTINCT a) |
      +-------------------+
      |            999665 |
      +-------------------+
      

      Variables:

      +---------------------------+----------------+----------------+
      | VARIABLE_NAME             | SESSION_VALUE  | GLOBAL_VALUE   |
      +---------------------------+----------------+----------------+
      | TMP_MEMORY_TABLE_SIZE     | 16777216       | 16777216       |
      | HISTOGRAM_SIZE            | 254            | 254            |
      | MAX_HEAP_TABLE_SIZE       | 1048576        | 1048576        |
      | HISTOGRAM_TYPE            | DOUBLE_PREC_HB | DOUBLE_PREC_HB |
      | ANALYZE_SAMPLE_PERCENTAGE | 100.000000     | 100.000000     |
      +---------------------------+----------------+----------------+
      

      Analysis script:

      set @@rand_seed1=1, @@rand_seed2=333;
      ANALYZE TABLE t100mb4a PERSISTENT FOR COLUMNS (a) INDEXES ();
      

      Result of query:

      select db_name, table_name, column_name, min_value, max_value, nulls_ratio, avg_length, avg_frequency, hist_size, hist_type, hex(histogram) from mysql.column_stats where db_name="test" and table_name="t100mb4a"\G
      

      before changes:

             db_name: test
          table_name: t100mb4a
         column_name: a
           min_value: !ѩᜃ昔뤙檂儰���䤱锪㽇綡또ᒯ䙶∃흩췄繘ສ쾇ḁ븬宛遂븵օ嬿⃘錺糘뛋᪯扇鬖辰ⶬ㘈蕢䥵蛎앨䗜ු瑎ᭋⱉ诋㕛栣桞톫���퐵笽⪊ᄈ홊﬛揩˸欝歀휩ぅắࣙ탮棌∟烳츢댔ᒽ
           max_value: ￿���倅࢏㭹཯魾���烑ꖙ鳠哇툁ᩯས﹬젯䰧둔ꁰӴ㢲ౚ鑭쁒Δ튪ზ���⅞์䟿볘흺ﺚ燓㸏걜륢馔폼噳㔈،翢沆鼵
         nulls_ratio: 0.0000
          avg_length: 224.1367
       avg_frequency: 10.0044
           hist_size: 254
           hist_type: DOUBLE_PREC_HB
      hex(histogram): 3400BB02F00308060A080E0A0C0C050E04100D120C140F160B18181A1E1CE21FE921E523DC25D227CB29C82BCD2DCA2FD131D533CC35C837CE39C93BC93DC83FD241D343CA45CA47C449C24BC34DC84FC751C753C255B857C159CD5BCD5DD65FD061CA63D065CB67D469D56BD76DD76FD071D373D075D677D779DB7BD77DE27FE881EC83E085E487DC89E68BE78DDE8FDA91D993D895D797E099DF9BDD9DE79FE1A1E3A3E3A5E1A7E4A9EEABF7ADEEAFEAB1F9B3FBB5F9B7F1B9F8BBFEBDFFBF03C210C40EC60BC80ECA07CC01CE07D00CD20FD40ED60ED817DA19DC17DE29E02AE223E420E61DE822EA28EC28EE2EF02FF22EF42BF62EF833FA3CFC48FE
      

      after changes:

             db_name: test
          table_name: t100mb4a
         column_name: a
           min_value: !ѩᜃ昔뤙檂儰���䤱锪㽇綡또ᒯ䙶∃흩췄繘ສ쾇ḁ븬宛遂븵օ嬿⃘錺糘뛋᪯扇鬖辰ⶬ㘈蕢䥵蛎앨䗜ු瑎ᭋⱉ诋㕛栣桞톫���퐵笽⪊ᄈ홊﬛揩˸欝歀휩ぅắࣙ탮棌∟烳츢댔ᒽ
           max_value: ￿���倅࢏㭹཯魾���烑ꖙ鳠哇툁ᩯས﹬젯䰧둔ꁰӴ㢲ౚ鑭쁒Δ튪ზ���⅞์䟿볘흺ﺚ燓㸏걜륢馔폼噳㔈،翢沆鼵
         nulls_ratio: 0.0000
          avg_length: 224.1367
       avg_frequency: 1.0843
           hist_size: 254
           hist_type: DOUBLE_PREC_HB
      hex(histogram): 2000200020002000200020002400240024002400240000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
      

      Attachments

        Issue Links

          Activity

            People

              cvicentiu Vicențiu Ciorbaru
              lstartseva Lena Startseva
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.