returns the following output:
- 0xC2 is an incomplete UTF-8 character (a valid mbhead not followed by an mbtail).
- 0xC223 is an invalid sequence (a valid mbhead followed by a 7-bit ASCII character instead of an mbtail).
- The second row correctly replaced the mbhead with a question mark and appended '#' to it.
- The first row did not replace the mbhead with '?'; the value was just truncated.
- The warnings are different. The warning for the second row is more descriptive.
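The difference between the two inputs can be illustrated outside the server. The following is a minimal Python sketch, not the server's conversion code; the helper names `is_mbhead2`, `is_mbtail` and `classify` are hypothetical, and only 2-byte sequences are considered.

```python
def is_mbhead2(byte: int) -> bool:
    """True for a valid lead byte of a 2-byte UTF-8 sequence (0xC2..0xDF)."""
    return 0xC2 <= byte <= 0xDF


def is_mbtail(byte: int) -> bool:
    """True for a UTF-8 continuation byte (0x80..0xBF)."""
    return 0x80 <= byte <= 0xBF


def classify(data: bytes) -> str:
    """Classify a 1- or 2-byte input that starts with a 2-byte mbhead."""
    if not data or not is_mbhead2(data[0]):
        return "not a 2-byte mbhead"
    if len(data) == 1:
        # The mbhead is the last byte: nothing follows it.
        return "incomplete"
    # The mbhead is followed by something: it must be an mbtail.
    return "valid" if is_mbtail(data[1]) else "invalid"


print(classify(b"\xC2"))      # mbhead with no following byte
print(classify(b"\xC2\x23"))  # mbhead followed by ASCII '#'
```

In this sketch 0xC2 falls into the "incomplete" branch and 0xC223 into the "invalid" branch, which corresponds to the two rows described above.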
The same effect can be achieved using a Latin1 terminal window.
The idea is exactly the same. It just uses direct Latin1 input instead of creating a bad sequence using CONCAT and executing it with a prepared statement.
The column can use any character set other than utf8, so that a conversion takes place.
The expected behaviour would be to replace trailing incomplete characters with question marks,
so that the first row returns '?' instead of an empty string, with a more descriptive warning similar
to the one returned for the second row.
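As an illustration of the expected result only, here is a small Python sketch that treats a trailing incomplete character the same way as an invalid sequence. It relies on Python's own UTF-8 decoder with `errors="replace"` rather than on the server's conversion routines, and the function name `to_question_marks` is hypothetical.

```python
def to_question_marks(data: bytes) -> str:
    """Decode UTF-8, turning every ill-formed byte sequence into '?'.

    Python's decoder substitutes U+FFFD for ill-formed input, including
    an mbhead at the very end of the data. Mapping U+FFFD to '?' mimics
    the question-mark replacement discussed here. Note that a genuine
    U+FFFD in the input would also become '?' in this simplified sketch.
    """
    return data.decode("utf-8", errors="replace").replace("\ufffd", "?")


print(repr(to_question_marks(b"\xC2")))      # trailing incomplete character
print(repr(to_question_marks(b"\xC2\x23")))  # mbhead followed by '#'
```

With this approach both inputs are handled consistently: the lone 0xC2 yields '?' rather than an empty string, and 0xC223 yields '?#'.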