[MDEV-13041] Character set '#0' is not a compiled character set and is not specified in the 'C:\Program Files\MariaDB 10.1\share\charsets\Index.xml' file Created: 2017-06-08  Updated: 2017-08-04  Resolved: 2017-08-04

Status: Closed
Project: MariaDB Server
Component/s: Character Sets, Dynamic Columns
Affects Version/s: 10.1.10
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Clark Merchant Assignee: Unassigned
Resolution: Won't Fix Votes: 0
Labels: None


 Description   

This appears to be an issue where somehow a MariaDB dynamic column value has corrupted and a query can no longer be run on this row.

Running a query like:

SELECT COLUMN_GET(`Fields`, 'SampleName' AS CHAR) FROM `mytablehere`;

would generate an error like:

Character set '#0' is not a compiled character set and is not specified in the 'C:\Program Files\MariaDB 10.1\share\charsets\Index.xml' file

We realized that if we filtered on certain rows, some would work. We were able to isolate to the row(s) in question with a query like:

SELECT COLUMN_GET(`Fields`, 'SampleName' AS CHAR) FROM `mytablehere` WHERE Id IN(43) ;

In this case, Id=43 would fail as before. Looking at the next row, the (Id=44), it had a Sample Name of "[NUL]00 std" when it should have been "100 std", and in this case the NUL is the NULL (0x00) character code that I could see when pasting the value in Notepad++ and viewing all characters/symbols. You actually don't see anything when viewing the result of the query in SQLYog due to that leading NULL, but Notepad++ came to the rescue.

So, something managed to break an entry for the row before completely, and then carried over to the next row to corrupt it too. Any clue as to what may cause something like that? Was there inadvertently a control code character somehow entered in a particular Sample Name that could take out the query?



 Comments   
Comment by Clark Merchant [ 2017-06-08 ]

I should also note that other dynamic columns worked fine for that row. For instance querying on UserName with

SELECT COLUMN_GET(`Fields`, 'UserName' AS CHAR) FROM `mytablehere` WHERE Id IN(43) ;

Works fine and returns a value as expected. It is only the SampleName dynamic column.

Comment by Elena Stepanova [ 2017-06-11 ]

It's rather pointless for us to guess what someone might have done with a column to make it happen, even not considering actual data corruption. It's just a blob which relies on data inside being formatted in a very specific way, any attempt to tamper with it will cause troubles. I can of course give you an example how you can get this specific error, but then you'll say that nobody of course did anything like that in your database, and it can go like that forever. It will be much more reasonable the other way around, when you describe what you did with the table/column and we'll try to reproduce it (the normal bugreporting way).

Anyway, to provide the promised example, here you go.

MariaDB [test]> create table t1 (b blob);
Query OK, 0 rows affected (0.31 sec)
 
MariaDB [test]> insert into t1 values (column_create('SampleName','value' as char charset utf8));
Query OK, 1 row affected (0.04 sec)
 
MariaDB [test]> select COLUMN_GET(`b`, 'SampleName' AS CHAR) from t1;
+---------------------------------------+
| COLUMN_GET(`b`, 'SampleName' AS CHAR) |
+---------------------------------------+
| value                                 |
+---------------------------------------+
1 row in set (0.00 sec)
 
MariaDB [test]> update t1 set b = replace(b,unhex(21),unhex(0));
Query OK, 1 row affected (0.04 sec)
Rows matched: 1  Changed: 1  Warnings: 0
 
MariaDB [test]> select COLUMN_GET(`b`, 'SampleName' AS CHAR) from t1;
ERROR 22 (HY000): Character set '#0' is not a compiled character set and is not specified in the '/data/src/10.1-bug/share/charsets/Index.xml' file

Comment by Clark Merchant [ 2017-06-12 ]

Perfect, thanks for the feedback. I'd love to tell you how the user created the error, but have no idea how they managed it (and I am sure that they don't either) as they simply were entering a field in a web form, but managed to somehow corrupt the table (or at least that row, the rest of the table was fine). Not something where they have direct access to the SQL query as above.

This very well may not be a bug at all, and could be either data corruption (which is unnerving in its own right), or somehow they managed to pass the web form a string with a special character or combination of characters that would break things.

Following your example above, if you can tell me if there is a string that might bork the dynamic column when updated in the fashion of:

UPDATE t1 SET b =  COLUMN_ADD(b, 'SampleName', '<BadString>');  // <BadString> being some mal-formatted string  

then that's something I will protect against, and then likely we can close this issue off as being not a bug. Unless, of course, you want to protect against this directly in your COLUMN_ADD function!

Thanks for the assistance so far.

Comment by Oleksandr Byelkin [ 2017-07-05 ]

Dynamic column function return result in binary cherset, so it is better to store them in the fields with binary charset (after all it is binary data).

Comment by Elena Stepanova [ 2017-08-04 ]

>> Following your example above, if you can tell me if there is a string that might bork the dynamic column when updated in the fashion of:
>> UPDATE t1 SET b = COLUMN_ADD ...

GIven the generic problem described in MDEV-13238, with some luck you can get any kind of malformed value in the field that is supposed to hold dynamic columns. Since MDEV-13238 is said not to be a bug, this one follows.

One reasonable suggestion that comes from the discussion on MDEV-13238 is to make sure that you set sql_mode='STRICT_ALL_TABLES', hopefully it will prevent corruption of the values.

Generated at Thu Feb 08 08:02:26 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.