Details
-
Task
-
Status: Open
-
Major
-
Resolution: Unresolved
Description
Modern software (including text editors, static analysis software,
and web-based code review interfaces) often requires source code files
to be interpretable via a consistent character encoding, with UTF-8 or
ASCII (a strict subset of UTF-8) as the default. Several of the MariaDB
source files contain bytes that are not valid in either the UTF-8 or
ASCII encodings, but instead represent strings encoded in the
ISO-8859-1/Latin-1 or ISO-8859-2/Latin-2 encodings.
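
To make "not valid in UTF-8" concrete, here is a minimal sketch of a well-formedness check. The helper name `is_valid_utf8` is hypothetical and is not part of the MariaDB tree; it only illustrates why a lone Latin-1 byte such as 0xE9 is rejected by tools that expect UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only (not MariaDB code).
   Returns 1 if buf[0..len) is well-formed UTF-8, 0 otherwise. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
  size_t i = 0;
  assert(buf != NULL || len == 0);

  while (i < len)
  {
    unsigned char c = buf[i];
    size_t extra;       /* continuation bytes expected after the lead byte */
    unsigned long cp;   /* code point being decoded */
    unsigned long min;  /* smallest code point allowed for this length */

    if (c < 0x80) { i++; continue; }                     /* plain ASCII */
    else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; min = 0x80; }
    else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; min = 0x800; }
    else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; min = 0x10000; }
    else return 0;      /* stray continuation byte or invalid lead byte */

    if (len - i <= extra) return 0;                      /* truncated sequence */
    for (size_t k = 1; k <= extra; k++)
    {
      if ((buf[i + k] & 0xC0) != 0x80) return 0;         /* not 10xxxxxx */
      cp = (cp << 6) | (buf[i + k] & 0x3F);
    }
    if (cp < min) return 0;                              /* overlong encoding */
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;          /* UTF-16 surrogate */
    if (cp > 0x10FFFF) return 0;                         /* beyond Unicode */
    i += extra + 1;
  }
  return 1;
}
```

For example, the four Latin-1 bytes of "café" (`63 61 66 E9`) fail the check because 0xE9 looks like the start of a three-byte sequence that never arrives, while the UTF-8 form (`63 61 66 C3 A9`) passes.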
This ticket was split out of the following PR to allow more discussion of how to handle these strings: https://github.com/MariaDB/server/pull/2224
In the PR, we use '\x' escapes to replace the bytes that are not valid ASCII. This does not change the underlying encoding of these strings (ISO-8859-1). This ticket aims to foster discussion on the feasibility of changing MariaDB to output these strings in UTF-8 instead.
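
The difference between the two approaches can be sketched as follows. The names `word_escaped`, `word_split`, and `latin1_to_utf8` are hypothetical and are not taken from the PR. The escaped literals keep the original Latin-1 bytes while making the source file pure ASCII; the conversion function shows what emitting UTF-8 would involve, assuming the input really is ISO-8859-1 (whose bytes map directly to U+0000..U+00FF). ISO-8859-2 text would need a lookup table instead.

```c
#include <assert.h>
#include <stddef.h>

/* "créé" as Latin-1 bytes: 'c' 'r' 0xE9 0xE9. The source stays ASCII. */
static const char word_escaped[] = "cr\xE9\xE9";

/* A \x escape consumes every following hex digit, so "d\xE9fini" would be
   read as one oversized escape (\xE9f). Splitting the literal avoids that. */
static const char word_split[] = "d\xE9" "fini";

/* Hypothetical helper, for illustration only: converts a NUL-terminated
   ISO-8859-1 string to UTF-8. Returns the number of bytes written
   (excluding the terminator), or -1 if out is too small. */
int latin1_to_utf8(const char *in, char *out, size_t out_size)
{
  size_t n = 0;
  assert(in != NULL && out != NULL);
  if (out_size == 0) return -1;

  for (; *in; in++)
  {
    unsigned char c = (unsigned char) *in;
    size_t need = (c < 0x80) ? 1 : 2;
    if (n + need >= out_size) return -1;       /* keep room for '\0' */
    if (c < 0x80)
      out[n++] = (char) c;                     /* ASCII is unchanged */
    else
    {
      out[n++] = (char) (0xC0 | (c >> 6));     /* lead byte: 110000xx */
      out[n++] = (char) (0x80 | (c & 0x3F));   /* continuation: 10xxxxxx */
    }
  }
  out[n] = '\0';
  return (int) n;
}
```

Calling `latin1_to_utf8(word_escaped, buf, sizeof buf)` on a sufficiently large buffer turns each 0xE9 byte into the pair 0xC3 0xA9, so the four-byte word becomes six bytes. Any code that assumes one byte per character in these messages would need to be reviewed before such a change.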
I do not think you have to worry about those French strings in those header files. The header files are not used in compilation; you would need to define the preprocessor constant -DFRENCH for them to be used. Thus, you can change them to whatever you want, or remove them entirely, together with the #if defined(FRENCH) blocks inside the storage/connect directory.
Or you can convert them to UTF-8; whatever you do to those headers has no effect.
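
As a rough illustration of the guard described in this comment, the sketch below uses an invented macro name and invented message texts; the real identifiers in storage/connect are not reproduced here. Unless the compiler is invoked with -DFRENCH, the French branch is never compiled, so its encoding cannot affect the resulting binary.

```c
/* Hypothetical macro and texts, for illustration only. */
#if defined(FRENCH)
#define MSG_INVALID_PARAM "Param\xE8tre invalide"  /* Latin-1 bytes via \x escape */
#else
#define MSG_INVALID_PARAM "Invalid parameter"
#endif

/* Returns whichever message text was selected at compile time. */
const char *invalid_param_msg(void)
{
  return MSG_INVALID_PARAM;
}
```

Built without -DFRENCH, `invalid_param_msg()` returns the English text, and the preprocessor discards the other branch before the compiler proper sees it.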