Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Incomplete
-
None
-
CentOS
Server version: 10.3.17-MariaDB-log MariaDB Server
Description
My hosting service runs MariaDB on a CentOS server. I connect via SSH (utf-8) or via PHP (with encoding everywhere set to utf-8 as well). The same issue happens no matter which of these approaches I use.
> show session variables like 'character_set%';
|
+--------------------------+------------------------------+
|
| Variable_name | Value |
|
+--------------------------+------------------------------+
|
| character_set_client | utf8 |
|
| character_set_connection | utf8 |
|
| character_set_database | latin1 |
|
| character_set_filesystem | binary |
|
| character_set_results | utf8 |
|
| character_set_server | latin1 |
|
| character_set_system | utf8 |
|
| character_sets_dir | /usr/share/mariadb/charsets/ |
|
+--------------------------+------------------------------+
|
> DELIMITER //
|
> CREATE FUNCTION return_string( str TEXT ) RETURNS TEXT CHARSET 'utf8'
|
> BEGIN
|
> RETURN str;
|
> END //
|
> DELIMITER ;
|
> SELECT return_string('ā');
|
ERROR 1366 (22007): Incorrect string value: '\xC4\x81' for column ``.``.`str` at row 1
|
Unless I am mistaken, this is a bug. It works correctly in MySQL. MariaDB seems to get confused about which charset it is using with the user function. A simple "SELECT 'ā';" works correctly.
SHOW CREATE FUNCTION return_string;
shows the expected status of the function as being utf-8.
This works correctly if you use this instead:
CREATE FUNCTION return_string( str TEXT CHARSET 'utf8' ) RETURNS TEXT CHARSET 'utf8'
specifying the input encoding too. However, the connection was already in utf-8, the function is created to return values in utf-8 and the database tables (in the case where I discovered this issue) were in utf-8. So it seems really unexpected/buggy for it to try to interpret the function input as latin1. Especially when MySQL treats the input as utf-8 (at least in the slightly older version that I have access to).