Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-15592

Column COMPRESSED should select a 'high order' datatype

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.3
    • Fix Version/s: 10.3.7
    • Component/s: Server
    • Labels:
      None
    • Sprint:
      10.3.6-1

      Description

      I'm using a Azure VM (ubuntu), mariadb ubuntu package
      Server version: 10.3.5-MariaDB-10.3.5+maria~xenial mariadb.org binary distribution

      MariaDB [test]> show variables like 'column_compre%';
      +----------------------------------+------------------+
      | Variable_name                    | Value            |
      +----------------------------------+------------------+
      | column_compression_threshold     | 100              |
      | column_compression_zlib_level    | 6                |
      | column_compression_zlib_strategy | DEFAULT_STRATEGY |
      | column_compression_zlib_wrap     | OFF              |
      +----------------------------------+------------------+
      4 rows in set (0.025 sec)
       
      CREATE TABLE `table_without_compression` (
        `id` int(11) NOT NULL DEFAULT 0,
        `gmtime` decimal(22,7) NOT NULL DEFAULT 0.0000000,
        `data` blob NOT NULL,
        PRIMARY KEY (`id`,`gmtime`)
      )  ENGINE=InnoDB DEFAULT CHARSET=latin1;
      #Query OK, 0 rows affected (0.385 sec)
       
      CREATE TABLE `table_compressed` (
        `id` int(11) NOT NULL DEFAULT 0,
        `gmtime` decimal(22,7) NOT NULL DEFAULT 0.0000000,
        `data` blob COMPRESSED NOT NULL DEFAULT '',
        PRIMARY KEY (`id`,`gmtime`)
      )  ENGINE=InnoDB DEFAULT CHARSET=latin1;*
      #Query OK, 0 rows affected (0.385 sec)
       
      INSERT INTO table_without_compression VALUES (1,0,repeat('a',65535));
      #Query OK, 1 row affected (0.135 sec)
       
      INSERT INTO  table_without_compression VALUES (1,1,repeat('a',65536));
      #ERROR 1406 (22001): Data too long for column 'data' at row 1   <- JUST TO CHECK IF WE GOT MAX SIZE OF BLOB COLUMN, maybe that's the problem, internally compression consider the from/to column instead of a bigger column (long blob for example when using blob/longblob, i will test with TEXT/LONG TEXT and VARCHAR too)
       
      INSERT INTO table_compressed SELECT * FROM table_without_compression;
      #ERROR 1406 (22001): Data too long for column 'data' at row 1 <- THAT'S THE ERROR
      


      Same to TEXT

      CREATE TABLE `table_compressed2` (
             `id` int(11) NOT NULL DEFAULT 0,
             `gmtime` decimal(22,7) NOT NULL DEFAULT 0.0000000,
             `data` text COMPRESSED NOT NULL DEFAULT '',
             PRIMARY KEY (`id`,`gmtime`)
           )  ENGINE=InnoDB DEFAULT CHARSET=latin1;
      #Query OK, 0 rows affected (0.231 sec)
       
      CREATE TABLE `table_without_compression2` (
             `id` int(11) NOT NULL DEFAULT 0,
             `gmtime` decimal(22,7) NOT NULL DEFAULT 0.0000000,
             `data` text NOT NULL,
             PRIMARY KEY (`id`,`gmtime`)
           )  ENGINE=InnoDB DEFAULT CHARSET=latin1;
      #Query OK, 0 rows affected (0.253 sec)
       
      INSERT INTO table_without_compression2 VALUES (1,0,repeat('a',65535));
      #Query OK, 1 row affected (0.061 sec)
       
      INSERT INTO table_without_compression2 VALUES (1,0,repeat('a',65536));
      #ERROR 1406 (22001): Data too long for column 'data' at row 1  <- JUST TO CHECK IF WE GOT MAX SIZE OF BLOB COLUMN
       
      INSERT INTO table_compressed2 SELECT * FROM table_without_compression2;
      #ERROR 1406 (22001): Data too long for column 'data' at row 1  <- JUST TO CHECK IF WE GOT MAX SIZE OF BLOB COLUMN
      


      mixing data types:

      INSERT INTO table_compressed SELECT * FROM table_without_compression2;
      #ERROR 1406 (22001): Data too long for column 'data' at row 1
      INSERT INTO table_compressed2 SELECT * FROM table_without_compression;
      #ERROR 1406 (22001): Data too long for column 'data' at row 1
      


      creating now a LONGBLOB / LONGTEXT

      CREATE TABLE table_compressed_long_blob (
           `id` int(11) NOT NULL DEFAULT 0,
           `gmtime` decimal(22,7) NOT NULL DEFAULT 0.0000000,
           `data` LONGBLOB COMPRESSED NOT NULL,
           PRIMARY KEY (`id`,`gmtime`)
           )  ENGINE=InnoDB DEFAULT CHARSET=latin1;
      #Query OK, 0 rows affected (0.141 sec)
       
      CREATE TABLE table_compressed_long_text (
           `id` int(11) NOT NULL DEFAULT 0,
           `gmtime` decimal(22,7) NOT NULL DEFAULT 0.0000000,
           `data` LONGTEXT COMPRESSED NOT NULL,
           PRIMARY KEY (`id`,`gmtime`)
           )  ENGINE=InnoDB DEFAULT CHARSET=latin1;
      #Query OK, 0 rows affected (0.161 sec)
       
      INSERT INTO table_compressed_long_blob SELECT * FROM table_without_compression2;
      #Query OK, 1 row affected (0.045 sec)   <- NICE!
      #Records: 1  Duplicates: 0  Warnings: 0
       
      INSERT INTO table_compressed_long_text SELECT * FROM table_without_compression2;
      #Query OK, 1 row affected (0.009 sec)  <- NICE!
      #Records: 1  Duplicates: 0  Warnings: 0
      

      Possible solutions:
      1) Report at Documentation about data sizes,
      2) Change code to use a bigger (high order) column type <- preffered, since we have users with "INSERT compressed_table SELECT * FROM same_column_type_structure" queries


      problem found:

      MariaDB [test]> truncate table_without_compression;
      Query OK, 0 rows affected (0.131 sec)
       
      MariaDB [test]> INSERT INTO table_without_compression VALUES (1,0,repeat('a',65534));
      Query OK, 1 row affected (0.039 sec)
       
      MariaDB [test]> INSERT INTO table_compressed SELECT * FROM table_without_compression;
      Query OK, 1 row affected (0.044 sec)
      Records: 1  Duplicates: 0  Warnings: 0
      

      at compression source code, we have LENGTH-1 as field size

      https://github.com/MariaDB/server/commit/fdc47792354c820aa4a8542d7c00d434424a63fb#diff-06426300d2671c77dd6763db8df79306R3241

      maybe solution is here (i don't know the full source):
      https://github.com/MariaDB/server/commit/fdc47792354c820aa4a8542d7c00d434424a63fb#diff-06426300d2671c77dd6763db8df79306R3257

      instead of:
      + return n < 256 ? 1 : n < 65536 ? 2 : n < 16777216 ? 3 : 4;

      use:
      + return n < 256-1 ? 1 : n < 65536-1 ? 2 : n < 16777216-1 ? 3 : 4;

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              svoj Sergey Vojtovich
              Reporter:
              rspadim roberto spadim
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: