MariaDB Server / MDEV-13238

Dynamic columns are corrupted by COLUMN_ADD when value goes out of bounds

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not a Bug
    • Affects Versions: 10.0 (EOL), 10.1 (EOL), 10.2 (EOL)
    • Fix Version: N/A
    • Component: Dynamic Columns
    • Labels: None

    Description

      All the contents of a blob containing dynamic columns can easily be corrupted simply by adding a value once the field length is exceeded.

      CREATE TABLE t1 (b TINYBLOB);
      --echo #
      --echo # The column gets created normally, that's good
      INSERT INTO t1 VALUES (COLUMN_CREATE('Column1',REPEAT('a',200)));
      SELECT COLUMN_JSON(b) FROM t1;
      --echo #
      --echo # The column gets created, but the value is truncated, that's expected
      UPDATE t1 SET b = COLUMN_ADD(b,'Column2',REPEAT('b',200));
      SELECT COLUMN_JSON(b) FROM t1;
      --echo #
      --echo # The warning says that the value is truncated again...
      UPDATE t1 SET b = COLUMN_ADD(b,'Column3',REPEAT('c',200));
      --echo #
      --echo # ...but in fact the whole column is corrupt
      SELECT COLUMN_JSON(b) FROM t1;
      DROP TABLE t1;
      

      In this test case the maximum length of the blob is 255 bytes.
      The first dynamic column, with a 200-byte value, fits fine.
      The second one, also with a 200-byte value, is still created, but its value is truncated. That's fine.
      But the third one is still attempted, and the attempt corrupts the whole column.
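      For what it's worth, the corruption can at least be detected after the fact with the COLUMN_CHECK() function, which validates the internal format of a dynamic-column blob. A sketch against the test table above:

      -- Returns 1 while the blob is a valid dynamic-column value,
      -- 0 once it has been corrupted by the truncating UPDATE
      SELECT COLUMN_CHECK(b) FROM t1;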

      MariaDB [test]> UPDATE t1 SET b = COLUMN_ADD(b,'Column3',REPEAT('c',200));
      Query OK, 1 row affected, 1 warning (0.04 sec)
      Rows matched: 1  Changed: 1  Warnings: 1
       
      MariaDB [test]> SHOW WARNINGS;
      +---------+------+----------------------------------------+
      | Level   | Code | Message                                |
      +---------+------+----------------------------------------+
      | Warning | 1265 | Data truncated for column 'b' at row 1 |
      +---------+------+----------------------------------------+
      1 row in set (0.00 sec)
       
      MariaDB [test]> SELECT COLUMN_JSON(b) FROM t1;
      ERROR 1919 (HY000): Encountered illegal format of dynamic column string
      

      The last UPDATE should simply fail and leave the existing value untouched.

      Note: the same result can be achieved with 2 columns; I intentionally made it a 3-step case to show that the current logic is capable of truncating a value in some circumstances but not in others.

      Attachments

        Activity

          sanja Oleksandr Byelkin added a comment:

          It would be a bug if there was no warning; otherwise it is expected behaviour.
          elenst Elena Stepanova added a comment (edited):

          I disagree.

          A warning is meant to warn people before harm is done. This warning comes too late: the value is already corrupt, and there is no way to get it back.

          I mean the whole value, all previously existing dynamic columns, not just the one that you tried to add.

          It is as if inserting a row with a too-long value into a table threw a "value truncated" warning, but afterwards the whole table became corrupt and unreadable. That cannot possibly be expected behavior.

          sanja Oleksandr Byelkin added a comment:

          When a string that cannot fit is written to a field, it is likewise (horror!) first damaged/corrupted (truncated), then written in that malformed state, and only then is a warning issued. What is the difference? Why should a dynamic-column function behave differently from CONCAT or REPEAT?

          sanja Oleksandr Byelkin added a comment:

          BTW, the field has no idea which function produced the string it stores, and it is better to keep it that way.

          sanja Oleksandr Byelkin added a comment:

          set sql_mode='';
          create table t1 (a varchar(10));
          insert into t1 values ('123456789');
          update t1 set a=concat("aaa", a);
          Warnings:
          Warning 1265 Data truncated for column 'a' at row 1
          select * from t1;
          a
          aaa1234567
          drop table t1;
          cmerchan@gmail.com Clark Merchant added a comment (edited):

          I am the author of the closed issue MDEV-13041 and arrived here through a link in that issue. The test case presented here, which reproduces the error by creating the dynamic column in a TINYBLOB, intrigued me, because one avenue we want to investigate for our corruption issue is a possible truncation problem.

          The default method for creating a Dynamic Column as per the documentation is to use a blob, which as far as I know has a 65535 byte max size. From the documentation:

          create table assets (
            item_name varchar(32) primary key, -- A common attribute for all items
            dynamic_cols  blob  -- Dynamic columns will be stored here
          );
          

          While we need to investigate further to see if MDEV-13041 is a blob overflow issue, it concerns me that people storing lots of data in dynamic columns may inadvertently trigger a similar issue with large dynamic-column blobs. Looking at some test projects just now, within a minute I found a few dynamic columns 12 KB+ in size. It isn't inconceivable for our application that they hit 64 KB for folks with long data strings stored per column.

          Is the recommended course of action in this case simply to change the dynamic columns to MEDIUMBLOB or LONGBLOB when they are created, to ensure we don't hit this limit? Are there drawbacks to this?

          And a related question: the documentation states a dynamic-columns limitation of "Max number of columns" = 65535. Is that related to the fact that the documentation example uses a BLOB (which has a maximum size of 65535 bytes), or is the 65535 limit unrelated? My guess is that it's unrelated, but it's worthwhile to ask.

          If it indeed is prudent for us to update all dynamic columns to MEDIUMBLOB or larger, we will definitely do so.
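          If upsizing does turn out to be prudent, the change itself is an ordinary column modification. A sketch using the assets table from the documentation example quoted above (MEDIUMBLOB raises the limit from 64 KB to 16 MB, LONGBLOB to 4 GB):

          -- Widen the dynamic-columns storage from BLOB (64 KB max)
          -- to MEDIUMBLOB (16 MB max)
          ALTER TABLE assets MODIFY dynamic_cols MEDIUMBLOB;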


          sanja Oleksandr Byelkin added a comment:

          Please check warnings and you will see that the data was truncated.
          You could just as well put some video into a TINYBLOB and then see that it becomes corrupted.
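          A stricter alternative to checking warnings after the fact, assuming the application can tolerate failed statements: under a strict sql_mode the truncating UPDATE from the test case is rejected with an error instead of being applied, so the stored value is left untouched. A sketch:

          SET sql_mode = 'STRICT_ALL_TABLES';
          -- The out-of-bounds UPDATE now fails with a data-truncation
          -- error rather than silently corrupting the blob
          UPDATE t1 SET b = COLUMN_ADD(b,'Column3',REPEAT('c',200));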

          People

            Assignee: sanja Oleksandr Byelkin
            Reporter: elenst Elena Stepanova
            Votes: 0
            Watchers: 4

