Details
- Bug
- Status: Open
- Major
- Resolution: Unresolved
- 10.6, 11.8, 12.2, 12.3
- None
- Related to performance
Description
I wrote a program testing strnncollsp_nchars() in combination with PAD SPACE collations:
#include <my_global.h>
#include <m_ctype.h>
#include <my_sys.h>

void test_collation(const char *name)
{
  CHARSET_INFO *cs= get_charset_by_name(name, MYF(0));
  if (!cs)
  {
    printf("Collation '%s' not found\n", name);
    return;
  }
  /*
    Intentionally don't pass the
    MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES flag
    to the last argument:
  */
  int cmp= (cs->coll->strnncollsp_nchars)(cs,
                                          (const uchar*) "abc", 3,
                                          (const uchar*) "abc ", 4,
                                          4/*nchars*/, 0/*flags*/);
  printf("cmp=%-10d %s\n", cmp, name);
}

int main()
{
  my_init();
  test_collation("utf8mb3_uca1400_nopad_ai_ci");
  test_collation("utf8mb3_general_nopad_ci");
  test_collation("utf8mb3_nopad_bin");
  test_collation("latin1_swedish_nopad_ci");
  test_collation("latin1_nopad_bin");
  my_end(0);
  return 0;
}
and built it using this Makefile:
SRCDIR=/home/bar/maria-git/12.3.hphsh/
BUILDDIR=/home/bar/maria-git/12.3.hphsh/BUILD-DEB/

INCLUDES=-I$(SRCDIR)/include/ -I$(BUILDDIR)/include/
LIB=-L$(BUILDDIR)/mysys/ -L$(BUILDDIR)/dbug/ -L$(BUILDDIR)/strings/

all: test

test: test.cc
	g++ $(INCLUDES) $(LIB) test.cc -o test -lmysys -ldbug -lstrings

clean:
	rm -rf test
The output of the program is:
cmp=-521       utf8mb3_uca1400_nopad_ai_ci
cmp=0          utf8mb3_general_nopad_ci
cmp=0          utf8mb3_nopad_bin
cmp=0          latin1_swedish_nopad_ci
cmp=0          latin1_nopad_bin
Notice:
- utf8mb3_uca1400_nopad_ai_ci reports that the string "abc" is smaller. This is correct.
- Other collations report that "abc" and "abc " are equal. This is wrong.
Emulation of padding the shorter string "abc" to "abc " (according to nchars=4) should only happen when MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES is passed. When this flag is not passed, with this input, the function should just return what strnncollsp() returns.
All virtual implementations of strnncollsp_nchars() should be checked and fixed to return a negative result meaning that "abc" is smaller than "abc ".
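The expected behavior can be illustrated with a toy model. This is only a sketch, not the server code: it assumes a single-byte binary NO PAD collation, and the names toy_nopad_bin_nchars and TOY_EMULATE_TRIMMED_TRAILING_SPACES are hypothetical stand-ins for strnncollsp_nchars() and MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES. The real flag may have further semantics that this model does not capture.

```cpp
#include <cstddef>
#include <string>

// Hypothetical flag for this toy model only; it stands in for
// MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES.
static const unsigned TOY_EMULATE_TRIMMED_TRAILING_SPACES = 1;

// Toy comparison for a single-byte binary NO PAD collation (an assumption
// made for illustration). Returns a negative value, zero, or a positive
// value when a sorts before, equal to, or after b.
static int toy_nopad_bin_nchars(const std::string &a, const std::string &b,
                                std::size_t nchars, unsigned flags)
{
  // Look at no more than nchars characters of each argument.
  std::string x = a.substr(0, nchars);
  std::string y = b.substr(0, nchars);

  // Only when the flag is passed: pretend that trailing spaces were
  // trimmed earlier and restore them up to nchars characters.
  if (flags & TOY_EMULATE_TRIMMED_TRAILING_SPACES)
  {
    x.resize(nchars, ' ');
    y.resize(nchars, ' ');
  }

  // NO PAD semantics: trailing spaces are significant, so a string that
  // is a proper prefix of the other one is smaller.
  int cmp = x.compare(y);
  return cmp < 0 ? -1 : (cmp > 0 ? 1 : 0);
}
```

In this model, comparing "abc" with "abc " using nchars=4 and flags=0 gives a negative result, and the two strings compare as equal only when the flag is passed.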
After this fix it will be possible to use strnncollsp_nchars() to address the problem reported in MDEV-21543. See the link to the Zulip topic. Without this change a patch for MDEV-21543 can only use strnncollsp_nchars() for NO PAD collations.
Issue Links
- relates to MDEV-21543 hp_rec_key_cmp suboptimal comparison (Open)
- links to