Details
- Type: New Feature
- Status: Closed
- Priority: Critical
- Resolution: Fixed
Description
The goal of this task is to switch the default global variables to the 4-byte UTF-8 character set, meaning:
- character_set_client: from utf8 to utf8mb4
- character_set_database: from latin1 to utf8mb4
- character_set_server: from latin1 to utf8mb4
- character_set_results: from utf8 to utf8mb4
- character_set_connection: from utf8 to utf8mb4
- collation_database: from latin1_swedish_ci to utf8mb4_uca1400_ai_ci
- collation_server: from latin1_swedish_ci to utf8mb4_uca1400_ai_ci
MySQL changed the corresponding default in 8.0.1.
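To illustrate why the 4-byte character set matters, here is a minimal Python sketch of the UTF-8 width of a few characters. The helper names `utf8_len` and `fits_utf8mb3` are hypothetical and exist only for this illustration. The sketch assumes that utf8mb3 stores at most 3 bytes per character, i.e. only the Basic Multilingual Plane (BMP), while utf8mb4 also covers code points above U+FFFF.

```python
def utf8_len(ch: str) -> int:
    """Return the number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_utf8mb3(text: str) -> bool:
    """True if every character is in the BMP (code point <= U+FFFF).

    Assumption for this sketch: a 3-byte UTF-8 character set can only
    store BMP characters; anything above U+FFFF needs a 4th byte.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


if __name__ == "__main__":
    # "a" (U+0061), "é" (U+00E9), "€" (U+20AC), grinning face (U+1F600)
    for ch in ("a", "\u00e9", "\u20ac", "\U0001F600"):
        print(f"U+{ord(ch):04X}: {utf8_len(ch)} byte(s), "
              f"fits in 3 bytes: {fits_utf8mb3(ch)}")
```

Only the last character, which lies outside the BMP, needs four bytes, so it is the kind of data that a 3-byte default cannot store.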
There are some questions which should be discussed before/while working on this task:
- YES: Should we change the default collation for utf8mb4 from utf8mb4_general_ci to uca1400_ai_ci? The problem is that utf8mb4_general_ci handles non-BMP characters very poorly: it considers all non-BMP characters equal to each other. See MDEV-25829
- YES: Should we reassign the UTF8 Linux locale from utf8mb3 to utf8mb4 in the client? Or to whatever the server side uses as the alias for "utf8"? See MDEV-19123
- YES: Should we change system_charset_info from utf8mb3 to utf8mb4 and allow non-BMP characters in identifiers? See MDEV-27490.
  - If so, the table name to file name encoding should be extended to support non-BMP characters. See MDEV-27490
  - The system charset cannot be utf8mb4 until we fix the collation as above
- YES: Should we change numerous INFORMATION_SCHEMA columns from utf8mb3 to utf8mb4?
  - They should be in the system_charset_info, as they store identifiers
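The collation problem raised in the first question can be sketched with a toy model in Python. This is not the server's implementation: the function names are hypothetical, case folding is deliberately left out, and the single shared weight 0xFFFD for non-BMP characters is an assumption borrowed from how MySQL documents its general collations. The sketch only shows the effect described above, namely that two different non-BMP characters compare as equal.

```python
# Assumed shared weight for every non-BMP character (U+FFFD is the
# Unicode REPLACEMENT CHARACTER); an illustration, not server code.
NON_BMP_WEIGHT = 0xFFFD


def toy_general_ci_weights(text: str) -> list[int]:
    """Toy weights: BMP characters keep their code point,
    every non-BMP character collapses to one shared weight."""
    return [ord(ch) if ord(ch) <= 0xFFFF else NON_BMP_WEIGHT for ch in text]


def toy_equal(a: str, b: str) -> bool:
    """Compare two strings by their toy weights only."""
    return toy_general_ci_weights(a) == toy_general_ci_weights(b)


if __name__ == "__main__":
    grin, rocket = "\U0001F600", "\U0001F680"
    print(grin == rocket)            # byte-wise the strings differ
    print(toy_equal(grin, rocket))   # the toy collation sees them as equal
    print(toy_equal("a", "b"))       # BMP characters stay distinct
```

Under such a collation a UNIQUE key or an equality comparison cannot tell two different emoji apart, which is the motivation for moving to a UCA-based default.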
Issue Links
- blocks
  - MDEV-30041 don't set utf8_is_utf8mb3 by default in the old-mode (Open)
- causes
  - MDEV-34790 Assertion `(mem_root->flags & 4) == 0' failed in main.ps_4heap main.ps_5merge main.ps_3innodb main.ps_2myisam with WITH_PROTECT_STATEMENT_MEMROOT=ON (Closed)
  - MDEV-34883 LOAD DATA INFILE with geometry data fails (Open)
- is blocked by
  - MDEV-22981 Bad "default-character-set" option in [client] option group 50-client.cnf on Debian/Ubuntu (Closed)
  - MDEV-25829 Change default Unicode collation to uca1400_ai_ci (Closed)
  - MDEV-27009 Add UCA-14.0.0 collations (Closed)
  - MDEV-29446 Change SHOW CREATE TABLE to display default collations (Closed)
  - MDEV-30556 UPPER() returns an empty string for U+0251 in Unicode-5.2.0+ collations for utf8 (Closed)
  - MDEV-30577 Case folding for uca1400 collations is not up to date (Closed)
  - MDEV-30661 UPPER() returns an empty string for U+0251 in uca1400 collations for utf8 (Closed)
  - MDEV-34288 SET NAMES DEFAULT crashes `mariadbd --collation-server=utf8mb4_unicode_ci` (Closed)
  - MDEV-34295 CAST(char_col AS DOUBLE) prints redundant spaces in a warning (Closed)
  - MDEV-34305 Redundant truncation errors/warnings with optimizer_trace enabled (Closed)
- is duplicated by
  - MDEV-17662 Default to UTF8 (Closed)
- relates to
  - MDEV-23465 Implement a collation for identifiers (Closed)
  - MDEV-34352 Execution time for Block Nested Loop Hash (BNLH) join with charset utf8mb4 is 7-20 times slower than with latin1 (Closed)
  - MDEV-34387 Too long value in the Duplicate entry message with online alter and utf8 (Open)
  - MDEV-34410 CONNECT doesn't work well with utf8mb4 (Open)
  - MDEV-34700 Connect SQLite3 MTR test fails due to various charset/collation related output changes (Open)
  - MDEV-7128 Configuring charsets or collations as utf8 yields surprising result and leads to data loss (Closed)
  - MDEV-8334 Rename utf8 to utf8mb3 (Closed)
  - MDEV-8872 Performance regressions with utf8mb4 vs utf8 in WordPress (Closed)
  - MDEV-27490 Allow full utf8mb4 for identifiers (Stalled)
  - MDEV-29414 Map utf8 OS locales to utf8mb4 (Open)
  - MDEV-34376 Wrong data types when mixing an utf8 *TEXT column and a short binary (Closed)
  - MDEV-34409 myisamchk is broken for collation IDs >255 (Open)