[MDEV-13238] Dynamic columns are corrupted by COLUMN_ADD when value goes out of bounds Created: 2017-07-03  Updated: 2018-03-29  Resolved: 2017-07-04

Status: Closed
Project: MariaDB Server
Component/s: Dynamic Columns
Affects Version/s: 10.0, 10.1, 10.2
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Oleksandr Byelkin
Resolution: Not a Bug Votes: 0
Labels: None


 Description   

All contents of a blob containing dynamic columns can be easily corrupted by simply adding a value, when the field length is exceeded.

CREATE TABLE t1 (b TINYBLOB);
--echo #
--echo # The column gets created normally, that's good
INSERT INTO t1 VALUES (COLUMN_CREATE('Column1',REPEAT('a',200)));
SELECT COLUMN_JSON(b) FROM t1;
--echo #
--echo # The column gets created, but the value is truncated, that's expected
UPDATE t1 SET b = COLUMN_ADD(b,'Column2',REPEAT('b',200));
SELECT COLUMN_JSON(b) FROM t1;
--echo #
--echo # The warning says that the value is truncated again...
UPDATE t1 SET b = COLUMN_ADD(b,'Column3',REPEAT('c',200));
--echo #
--echo # ...but in fact the whole column is corrupt
SELECT COLUMN_JSON(b) FROM t1;
DROP TABLE t1;

Here in the test case the maximum length of the blob is 255 symbols.
The first dynamic column with value of 200 symbols long fits all right.
The second one with value of 200 symbols long is still created, but its value is truncated. It's fine.
But the third one is still attempted to be added and this attempt corrupts the whole column.

MariaDB [test]> UPDATE t1 SET b = COLUMN_ADD(b,'Column3',REPEAT('c',200));
Query OK, 1 row affected, 1 warning (0.04 sec)
Rows matched: 1  Changed: 1  Warnings: 1
 
MariaDB [test]> SHOW WARNINGS;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning | 1265 | Data truncated for column 'b' at row 1 |
+---------+------+----------------------------------------+
1 row in set (0.00 sec)
 
MariaDB [test]> SELECT COLUMN_JSON(b) FROM t1;
ERROR 1919 (HY000): Encountered illegal format of dynamic column string

The last UPDATE should just fail, but not touch the existing value.

Note: the same result can be achieved by 2 columns, I intentionally made it 3-step to show that the current logic is capable of truncating a value in some circumstances but not other.



 Comments   
Comment by Oleksandr Byelkin [ 2017-07-04 ]

It would be a bug is there was no warning, otherwise it is expected behaviour.

Comment by Elena Stepanova [ 2017-07-04 ]

I disagree.

Warning is meant to warn people from causing harm. This warning comes too late, the value is already corrupt, and there is no way to get it back.

I mean the whole value, all previously existed dynamic columns, not just the one that you tried to add.

It's like, for example, if you try to insert a row with a too long value into a table, it would throw a warning "the value is truncated", but after that the whole table would go corrupt and not readable anymore. It cannot possibly be expected behavior.

Comment by Oleksandr Byelkin [ 2017-07-04 ]

When a string written to the field which can not fit it also (HURROR!!!) first damaged/corrupted (truncated) then written in this malformed state, then warning issued. What is the difference? Why dynamic column function should behave differently from concat or repeat?

Comment by Oleksandr Byelkin [ 2017-07-04 ]

BTW field have no idea which function result is the string it store, and it is better to keep it so.

Comment by Oleksandr Byelkin [ 2017-07-04 ]

set sql_mode='';
create table t1 (a varchar(10));
insert into t1 values ('123456789');
update t1 set a=concat("aaa", a);
Warnings:
Warning 1265 Data truncated for column 'a' at row 1
select * from t1;
a
aaa1234567
drop table t1;

Comment by Clark Merchant [ 2018-03-28 ]

I am the author of the closed issue MDEV-13041, and arrived here through a link in that issue. The test-case presented here intrigued me on reproducing the error by creating the dynamic column using a tinyblob, and one avenue we want to investigate with our corruption issue is a possible truncation issue.

The default method for creating a Dynamic Column as per the documentation is to use a blob, which as far as I know has a 65535 byte max size. From the documentation:

create table assets (
  item_name varchar(32) primary key, -- A common attribute for all items
  dynamic_cols  blob  -- Dynamic columns will be stored here
);

While we need to investigate further to see if MDEV-13041 is a blob overflow issue, it does make me concerned that people storing lots of info/data in the dynamic columns may inadvertently trigger a similar issue for large dynamic column blob data. I am looking at some test projects now and within 1 minute found a few dynamic columns 12k+ in size. It isn't inconceivable for our application that they hit 64k for folks with long data strings stored per column.

Is the recommended point of action in this case to simply change the dynamic columns to be MEDIUMBLOB or LARGEBLOB when they are created, to ensure we don't hit this limit? Are there drawbacks to this?

And a related question, the limitations of dynamic columns are stated in the documentation to be "Max number of columns"=65535. Is that related at all to the fact that the documentation example is using a blob (which has a max size of 65535), or is this 65535 limit unrelated? My guess is that it's unrelated, but it's worthwhile to ask.

If it indeed is prudent for us to update all dynamic columns to MEDIUMBLOB or larger, we will definitely do so.

Comment by Oleksandr Byelkin [ 2018-03-29 ]

Please check warnings and you will see that data was truncated.
With the same success you can try to put some video in the tinyblob and then see that it become corrupted.

Generated at Thu Feb 08 08:04:00 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.