Details
- Task
- Status: Closed
- Major
- Resolution: Fixed
Description
my_wildcmp_uca_impl() and my_wildcmp_unicode_impl() are 130-line functions that almost fully duplicate each other.
The only difference between them is in two small fragments comparing the subject character and the pattern character for equality.
For example, the second different fragment looks like this:
- In my_wildcmp_uca_impl():
    if (s_wc == w_wc)
      break;
    if ((scan= mb_wc(cs, &s_wc, (const uchar*)str,
                     (const uchar*)str_end)) <= 0)
      return 1;
    if (!my_uca_charcmp(cs,s_wc,w_wc))
      break;
- In my_wildcmp_unicode_impl():
    if ((scan= mb_wc(cs, &s_wc, (const uchar*)str,
                     (const uchar*)str_end)) <= 0)
      return 1;
    if (weights)
    {
      my_tosort_unicode(weights, &s_wc);
      my_tosort_unicode(weights, &w_wc);
    }
    if (s_wc == w_wc)
      break;
The rest of the code (100+ lines) is fully duplicated, which makes it hard to maintain.
Note that under MDEV-31340 we'll need to extend the wildcmp() implementation to properly support the AS CI comparison style for utf8mb4_general1400_as_ci.
Instead of making the code inside my_wildcmp_unicode_impl() more complex so that it detects which collation style to apply:
- binary
- AI CI, e.g. utf8mb4_general_ci
- AS CI, e.g. utf8mb4_general1400_as_ci (coming in MDEV-31340)
it's better to introduce dedicated functions for every collation style.
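To illustrate what "dedicated functions for every collation style" could mean at the level of a single character comparison, here is a minimal sketch. Everything in it is hypothetical: the type wc_sketch_t, the tiny folding helpers (which handle only ASCII letters and one accented letter) and the three eq_*_style() predicates are illustration-only names, not server code.

```c
#include <assert.h>

typedef unsigned long wc_sketch_t;   /* hypothetical code point type */

/* Fold case only; accents are preserved (toy table: ASCII plus U+00E9). */
static wc_sketch_t fold_case_sketch(wc_sketch_t wc)
{
  if (wc >= 'a' && wc <= 'z')
    return wc - ('a' - 'A');
  if (wc == 0xE9)                    /* e-acute -> E-acute */
    return 0xC9;
  return wc;
}

/* Fold case and additionally strip the accent (toy table). */
static wc_sketch_t fold_case_and_accent_sketch(wc_sketch_t wc)
{
  wc= fold_case_sketch(wc);
  return wc == 0xC9 ? (wc_sketch_t) 'E' : wc;
}

/* One small predicate per comparison style, no run-time style detection. */
static int eq_bin_style(wc_sketch_t a, wc_sketch_t b)
{
  return a == b;
}

static int eq_as_ci_style(wc_sketch_t a, wc_sketch_t b)
{
  return fold_case_sketch(a) == fold_case_sketch(b);
}

static int eq_ai_ci_style(wc_sketch_t a, wc_sketch_t b)
{
  return fold_case_and_accent_sketch(a) == fold_case_and_accent_sketch(b);
}

/* Self-check showing how the three styles differ on the same inputs. */
static void eq_styles_self_check(void)
{
  assert(!eq_bin_style('e', 'E'));      /* binary: case matters      */
  assert(eq_as_ci_style('e', 'E'));     /* AS CI: case ignored       */
  assert(!eq_as_ci_style('e', 0xC9));   /* AS CI: accent matters     */
  assert(eq_ai_ci_style('e', 0xC9));    /* AI CI: accent ignored too */
}
```

Each predicate stays trivial because the style is selected when the function is chosen, not by branching inside one shared comparison routine.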
Proposed changes
The duplicated code should be shared using the same approach that we used earlier in MDEV-17474 for the strnncoll() family of functions:
- The body of the function should be moved to a new shared file ctype-wildcmp.inl
- The shared file should be included from multiple places
The code including the shared file will look roughly like this:
    #define MY_FUNCTION_NAME(x) my_ ## x ## _uca_impl
    #define MY_MB_WC(cs, pwc, s, e) ((cs)->cset->mb_wc)(cs, pwc, s, e)
    #define MY_CHAR_EQ(cs, wc1, wc2) (my_uca_charcmp(cs, wc1, wc2)==0)
    #include "ctype-wildcmp.inl"

    #define MY_FUNCTION_NAME(x) my_ ## x ## _mb2_or_mb4_impl
    #define MY_MB_WC(cs, pwc, s, e) ((cs)->cset->mb_wc)(cs, pwc, s, e)
    #define MY_CHAR_EQ(cs, wc1, wc2) my_char_eq_mb2_or_mb4(cs, wc1, wc2)
    #include "ctype-wildcmp.inl"
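The idea of writing the matching loop once and instantiating it with a different equality test can be sketched in a single file as follows. This is only an approximation under stated assumptions: a function-generating macro stands in for the repeated #include of ctype-wildcmp.inl, the matcher works on plain bytes, handles only '%' and '_' (no escape character), and returns 0 on match and 1 on mismatch. All names (DEFINE_WILDCMP_SKETCH, char_eq_bin_sketch, char_eq_ci_sketch, like_bin_sketch, like_ci_sketch) are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
  Stand-in for the shared body. NAME is the function to generate,
  CHAR_EQ is the per-style character equality test.
  A run of '%' matches any sequence (tried recursively), '_' matches
  exactly one character, anything else is compared with CHAR_EQ.
*/
#define DEFINE_WILDCMP_SKETCH(NAME, CHAR_EQ)                          \
static int NAME(const char *str, const char *str_end,                 \
                const char *pat, const char *pat_end)                 \
{                                                                     \
  assert(str && str_end && pat && pat_end);                           \
  while (pat < pat_end)                                               \
  {                                                                   \
    if (*pat == '%')                                                  \
    {                                                                 \
      while (pat < pat_end && *pat == '%')                            \
        pat++;                                                        \
      if (pat == pat_end)                                             \
        return 0;                                                     \
      for ( ; str < str_end; str++)                                   \
        if (NAME(str, str_end, pat, pat_end) == 0)                    \
          return 0;                                                   \
      return 1;                                                       \
    }                                                                 \
    if (str == str_end)                                               \
      return 1;                                                       \
    if (*pat != '_' && !CHAR_EQ(*str, *pat))                          \
      return 1;                                                       \
    str++;                                                            \
    pat++;                                                            \
  }                                                                   \
  return str == str_end ? 0 : 1;                                      \
}

/* Two hypothetical equality tests: exact bytes and ASCII case-insensitive. */
static int char_eq_bin_sketch(char a, char b)
{
  return a == b;
}

static int char_eq_ci_sketch(char a, char b)
{
  return tolower((unsigned char) a) == tolower((unsigned char) b);
}

/* Two instantiations of the same body, one per comparison style. */
DEFINE_WILDCMP_SKETCH(wildcmp_bin_sketch, char_eq_bin_sketch)
DEFINE_WILDCMP_SKETCH(wildcmp_ci_sketch, char_eq_ci_sketch)

/* Convenience wrappers for NUL-terminated strings. */
static int like_bin_sketch(const char *s, const char *p)
{
  return wildcmp_bin_sketch(s, s + strlen(s), p, p + strlen(p));
}

static int like_ci_sketch(const char *s, const char *p)
{
  return wildcmp_ci_sketch(s, s + strlen(s), p, p + strlen(p));
}
```

With these definitions, like_ci_sketch("Hello", "h%O") reports a match (0) while like_bin_sketch("Hello", "h%O") reports a mismatch (1), even though both functions were generated from the same body.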
Issue Links
relates to:
- MDEV-17474 Change Unicode collation implementation from "handler" to "inline" style (Closed)
- MDEV-17502 Change Unicode xxx_general_ci and xxx_bin collation implementation to "inline" style (Closed)
- MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp() (Closed)