Status: Open
Resolution: Unresolved
Docker image "mariadb:11.2.3"
With this extra config bind-mounted into the MariaDB container:
innodb_ft_enable_stopword = OFF
init-connect='SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci'
collation-server = utf8mb4_unicode_ci
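To confirm that the container actually picked up these settings, a check along the following lines could be run from the client. This is only a sketch: `innodb_ft_min_token_size` is not part of the config shown above, and it is assumed here to be the "min token size" setting referred to later in this report, on the further assumption that the table uses InnoDB.

```sql
-- Inspect the full-text and collation settings seen by the current session.
-- innodb_ft_min_token_size is an assumption: it is not in the config listed above.
show variables like 'innodb_ft_enable_stopword';
show variables like 'innodb_ft_min_token_size';
show variables like 'collation_server';
show variables like 'collation_connection';
```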
Then create a table with an explicit utf8mb4 charset and collation:
create database test;
use test;
create table `test` (`id` bigint(20) not null auto_increment, `text` text character set utf8mb4 collate utf8mb4_unicode_ci default null, primary key (`id`), fulltext key `ftidx` (`text`)) default charset=utf8mb4 collate=utf8mb4_unicode_ci;
If we insert a row like this:
insert into `test` (`text`) values ('a 哈 😂 🐧');
And query it with like and match ... against:
MariaDB [test]> select * from test where text like "%😂%";
| id | text |
| 1 | a 哈 😂 🐧 |
1 row in set (0.001 sec)
MariaDB [test]> select * from test where match text against ("a");
| id | text |
| 1 | a 哈 😂 🐧 |
1 row in set (0.001 sec)
MariaDB [test]> select * from test where match text against ("哈");
| id | text |
| 1 | a 哈 😂 🐧 |
1 row in set (0.001 sec)
MariaDB [test]> select * from test where match text against ("😂");
Empty set (0.001 sec)
With the min token size set to 1 and stopwords disabled, full-text search in MariaDB returns the correct result when searching for "a" or "哈" in this case, but searching for a single emoji character ("😂") returns an empty set.
This does not look like a configuration mistake, since text like "%😂%" returns the row without any problem.
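One visible difference between the tokens that match and the one that does not is their encoded size: 😂 (U+1F602) lies outside the Basic Multilingual Plane and takes four bytes in utf8mb4, while 哈 takes three and "a" takes one. Whether this is related to the failure is an assumption, not something established in this report. The following sketch, assuming a utf8mb4 connection, shows how the encodings could be compared:

```sql
-- Compare the byte encoding of the three searched tokens.
-- Assumes the connection character set is utf8mb4.
select hex('a') as a_hex, hex('哈') as han_hex, hex('😂') as emoji_hex;

-- length() counts bytes, char_length() counts characters.
select length('哈') as han_bytes, length('😂') as emoji_bytes,
       char_length('哈') as han_chars, char_length('😂') as emoji_chars;

-- Same comparison against the stored row from the test table above.
select id, hex(`text`) from `test`;
```

On a utf8mb4 connection the emoji should show up as the four-byte sequence F09F9882 and 哈 as E59388; a different result would point to a connection charset problem rather than a full-text issue.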