Details
- Task
- Status: Closed
- Major
- Resolution: Fixed
- None
Description
Case folding tables are stored in the following structures:
typedef struct unicase_info_char_st
{
  uint32 toupper;
  uint32 tolower;
  uint32 sort;
} MY_UNICASE_CHARACTER;

struct unicase_info_st
{
  my_wc_t maxchar;
  MY_UNICASE_CHARACTER **page;
};

The member MY_UNICASE_CHARACTER::sort is used only by Unicode _general_ci collations. For all other collations (Asian collations, Unicode UCA collations, Unicode _bin collations) the underlying code never reads it, so it only wastes memory.
It would be good to refactor the code so that those tables do not waste space.
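As a rough illustration of the space at stake, the sketch below compares the size of a character entry with and without the sort member. The uint32 typedef, the CHARS_PER_PAGE constant (256 characters per page) and the two helper functions are assumptions made for this example and are not taken from the server sources. The sizes also assume a platform where these structs carry no padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the server's uint32 typedef (assumption for this sketch). */
typedef uint32_t uint32;

/* Layout with the sort member: three 32-bit values per character. */
typedef struct unicase_info_char_st
{
  uint32 toupper;
  uint32 tolower;
  uint32 sort;
} MY_UNICASE_CHARACTER;

/* Layout without the sort member: two 32-bit values per character. */
typedef struct casefold_info_char_t
{
  uint32 toupper;
  uint32 tolower;
} MY_CASEFOLD_CHARACTER;

/* Hypothetical page size: number of characters covered by one page. */
#define CHARS_PER_PAGE 256

/* Bytes occupied by one page of MY_UNICASE_CHARACTER entries. */
static size_t unicase_page_bytes(void)
{
  return CHARS_PER_PAGE * sizeof(MY_UNICASE_CHARACTER);
}

/* Bytes occupied by one page of MY_CASEFOLD_CHARACTER entries. */
static size_t casefold_page_bytes(void)
{
  return CHARS_PER_PAGE * sizeof(MY_CASEFOLD_CHARACTER);
}
```

Under these assumptions each entry shrinks from 12 to 8 bytes, so every populated page in a collation that never reads sort becomes one third smaller.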
MDEV-30577 will soon introduce new casefolding tables for Unicode-14.0.0 collations, so it would be good to do this refactoring before MDEV-30577.
Let's add new data types to store casefolding information:
typedef struct casefold_info_char_t
{
  uint32 toupper;
  uint32 tolower;
} MY_CASEFOLD_CHARACTER;

struct casefold_info_st
{
  my_wc_t maxchar;
  MY_CASEFOLD_CHARACTER **page;
};

Then change all Asian collations to store their casefolding tables using the new data types.
Note: some or all Unicode collations will also be modified to use the new data types, but later, under a separate task.
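For readers unfamiliar with the table layout, here is a minimal sketch of how a two-level table built from the proposed types could be consulted. It assumes that the high bits of the code point select a page, that the low 8 bits select the entry within it, and that a NULL page means "no case mapping in this block". The helpers casefold_toupper and casefold_tolower, the demo table, and the uint32 and my_wc_t typedefs are hypothetical stand-ins for this example, not the server's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the server typedefs (assumptions for this sketch). */
typedef uint32_t uint32;
typedef unsigned long my_wc_t;

typedef struct casefold_info_char_t
{
  uint32 toupper;
  uint32 tolower;
} MY_CASEFOLD_CHARACTER;

struct casefold_info_st
{
  my_wc_t maxchar;
  MY_CASEFOLD_CHARACTER **page;
};

/* Hypothetical helper: look up the entry for wc, or NULL if unmapped. */
static const MY_CASEFOLD_CHARACTER *
casefold_entry(const struct casefold_info_st *cf, my_wc_t wc)
{
  const MY_CASEFOLD_CHARACTER *page;
  if (wc > cf->maxchar)
    return NULL;                     /* beyond the table */
  page= cf->page[wc >> 8];           /* high bits pick the page */
  return page ? &page[wc & 0xFF] : NULL;  /* low 8 bits pick the entry */
}

/* Hypothetical helper: upper-case wc, or return it unchanged. */
static my_wc_t casefold_toupper(const struct casefold_info_st *cf, my_wc_t wc)
{
  const MY_CASEFOLD_CHARACTER *ch= casefold_entry(cf, wc);
  return ch ? ch->toupper : wc;
}

/* Hypothetical helper: lower-case wc, or return it unchanged. */
static my_wc_t casefold_tolower(const struct casefold_info_st *cf, my_wc_t wc)
{
  const MY_CASEFOLD_CHARACTER *ch= casefold_entry(cf, wc);
  return ch ? ch->tolower : wc;
}

/* Demo table: page 0 is populated, page 1 is absent (NULL). */
static MY_CASEFOLD_CHARACTER demo_page0[256];
static MY_CASEFOLD_CHARACTER *demo_pages[2]= { demo_page0, NULL };
static struct casefold_info_st demo_casefold= { 0x1FF, demo_pages };

/* Fill page 0 with ASCII-only case mappings for the demo. */
static void demo_init(void)
{
  for (my_wc_t wc= 0; wc < 256; wc++)
  {
    demo_page0[wc].toupper= (uint32) ((wc >= 'a' && wc <= 'z') ? wc - 32 : wc);
    demo_page0[wc].tolower= (uint32) ((wc >= 'A' && wc <= 'Z') ? wc + 32 : wc);
  }
}
```

Nothing in such a lookup touches a sort weight, which is why a collation that only needs case conversion can use the smaller entry type.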
Issue Links
- blocks MDEV-30577 Case folding for uca1400 collations is not up to date (Closed)