[MDEV-9874] LOAD XML INFILE does not handle well broken multi-byte characters Created: 2016-04-06 Updated: 2016-11-29 Resolved: 2016-04-08 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Character Sets |
| Affects Version/s: | 5.5, 10.0, 10.1, 10.2 |
| Fix Version/s: | 10.2.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Alexander Barkov | Assignee: | Alexander Barkov |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
| Description |
|
I create an XML file with a broken utf8 byte sequence:
Note that 0xD0 is the utf8 leading byte of a 2-byte character, but here it is not followed by a tail byte.
Now I try to load this file into a table:
It produces this warning:
and this result set:
This is wrong. The closing tag '</a>' was interpreted as data rather than markup. |
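The byte-level reasoning in the description can be checked with a short sketch. It is not MariaDB code and does not reproduce the server's XML reader. The sample bytes `<a>\xd0</a>` are a hypothetical stand-in for the reporter's file, and the helper names `utf8_sequence_length` and `is_continuation` are invented for illustration. The sketch uses Python's own UTF-8 decoder to show the behaviour one would reasonably expect: the lone 0xD0 is treated as a single bad byte, and the `<` that follows remains available as markup.

```python
def utf8_sequence_length(lead: int) -> int:
    """Length implied by a UTF-8 lead byte, or 0 if it cannot start a sequence."""
    if lead < 0x80:
        return 1                      # ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2                      # 0xD0 falls in this range
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0                          # continuation byte or invalid lead


def is_continuation(byte: int) -> bool:
    """True if the byte is a valid UTF-8 tail (continuation) byte, 0x80..0xBF."""
    return 0x80 <= byte <= 0xBF


# Hypothetical sample: a 2-byte lead byte directly followed by a closing tag.
sample = b"<a>\xd0</a>"

pos = sample.index(0xD0)
expected_len = utf8_sequence_length(sample[pos])
tail_ok = is_continuation(sample[pos + 1])

# Python replaces only the broken lead byte and keeps the following '<'.
decoded = sample.decode("utf-8", errors="replace")

print(expected_len)   # length promised by the lead byte
print(tail_ok)        # whether the next byte ('<') is a valid tail byte
print(decoded)
```

Because `<` (0x3C) is outside the tail-byte range, the 2-byte sequence announced by 0xD0 is incomplete. A reader that consumes the next byte as a tail byte without validating it would lose the `<` and could then treat the rest of the tag as data, which matches the symptom reported here. Whether that is the exact code path in the server is an assumption of this sketch.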