[CONJ-345] Regression with unexpected collation error Created: 2016-09-08  Updated: 2016-09-22  Resolved: 2016-09-22

Status: Closed
Project: MariaDB Connector/J
Component/s: Failover
Affects Version/s: 1.5.2
Fix Version/s: 1.5.3

Type: Bug Priority: Major
Reporter: Dan Assignee: Diego Dupin
Resolution: Fixed Votes: 0
Labels: None
Environment:

MariaDB 10.1.17 (MacOS X, Ubuntu 16.04), Java 8u102, 64bit


Attachments: Zip Archive CONJ_345.zip    
Issue Links:
Problem/Incident
is caused by MDEV-10865 COLLATE keyword doesn't work in PREPA... Confirmed
Relates
relates to CONJ-350 Make server prepare fallback to clien... Closed

 Description   

A query that worked fine with mariadb-connector-j 1.4.6 is failing with 1.5.2.

The table structure.

 CREATE TABLE `index` (
  `uid` int(11) NOT NULL DEFAULT '0',
  `term` varchar(500) COLLATE utf8mb4_unicode_ci NOT NULL,
  `prio` int(11) NOT NULL DEFAULT '0',
  `LENGTH` int(11) DEFAULT NULL,
  KEY `ix_ig_term_uid_length` (`term`(191),`uid`,`LENGTH`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

The query.

SELECT uid,length 
            FROM index USE INDEX(ix_ig_term_uid_length)
            WHERE term LIKE "a" COLLATE utf8mb4_unicode_ci
            GROUP BY uid
            ORDER BY length,prio DESC

The error message when 1.5.2 is used.

SQL Error: 1253, SQLState: 42000
Error preparing query: COLLATION 'utf8mb4_unicode_ci' is not valid for CHARACTER SET 'binary'

The reported server configuration from the obtained connection. This doesn't change between 1.4.6 and 1.5.2.

========== DB character set configuration ===========
character_set_client = utf8mb4
character_set_connection = utf8mb4
character_set_database = utf8mb4
character_set_filesystem = binary
character_set_results = utf8mb4
character_set_server = utf8mb4
character_set_system = utf8
========== DB collation configuration ===========
collation_connection = utf8mb4_general_ci
collation_database = utf8mb4_general_ci
collation_server = utf8mb4_general_ci



 Comments   
Comment by Diego Dupin [ 2016-09-15 ]

Hi,
all seems right. I don't understand how this can occur.
can you send the result of this following queries, to help identify to problem ?
select * from information_schema.collations
where id in ( 45, 46, 224, 225, 226, 227, 228,229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 245, 246, 247)
order by id;

select @@session.collation_connection, @@global.collation_connection,
@@session.collation_database, @@global.collation_database,
@@session.collation_server, @@global.collation_server,
@@session.character_set_client, @@global.character_set_client,
@@session.character_set_connection, @@global.character_set_connection,
@@session.character_set_database, @@global.character_set_database,
@@session.character_set_server, @@global.character_set_server,
@@global.character_set_system;

SELECT default_character_set_name FROM information_schema.SCHEMATA
WHERE schema_name = "<The default schema used>";

because, it's seems, server is configured as binary, not utf8mb4, but driver normally change binary to utf8 in this case...but this doesn't correspond to your error.

Comment by Dan [ 2016-09-16 ]

I'm glad to help. I checked the server config and there is no 'binary' to be found. Here is the output:

select * from information_schema.collations 
    -> where id in ( 45, 46, 224, 225, 226, 227, 228,229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 245, 246, 247)
    -> order by id;
+------------------------------+--------------------+-----+------------+-------------+---------+
| COLLATION_NAME               | CHARACTER_SET_NAME | ID  | IS_DEFAULT | IS_COMPILED | SORTLEN |
+------------------------------+--------------------+-----+------------+-------------+---------+
| utf8mb4_general_ci           | utf8mb4            |  45 | Yes        | Yes         |       1 |
| utf8mb4_bin                  | utf8mb4            |  46 |            | Yes         |       1 |
| utf8mb4_unicode_ci           | utf8mb4            | 224 |            | Yes         |       8 |
| utf8mb4_icelandic_ci         | utf8mb4            | 225 |            | Yes         |       8 |
| utf8mb4_latvian_ci           | utf8mb4            | 226 |            | Yes         |       8 |
| utf8mb4_romanian_ci          | utf8mb4            | 227 |            | Yes         |       8 |
| utf8mb4_slovenian_ci         | utf8mb4            | 228 |            | Yes         |       8 |
| utf8mb4_polish_ci            | utf8mb4            | 229 |            | Yes         |       8 |
| utf8mb4_estonian_ci          | utf8mb4            | 230 |            | Yes         |       8 |
| utf8mb4_spanish_ci           | utf8mb4            | 231 |            | Yes         |       8 |
| utf8mb4_swedish_ci           | utf8mb4            | 232 |            | Yes         |       8 |
| utf8mb4_turkish_ci           | utf8mb4            | 233 |            | Yes         |       8 |
| utf8mb4_czech_ci             | utf8mb4            | 234 |            | Yes         |       8 |
| utf8mb4_danish_ci            | utf8mb4            | 235 |            | Yes         |       8 |
| utf8mb4_lithuanian_ci        | utf8mb4            | 236 |            | Yes         |       8 |
| utf8mb4_slovak_ci            | utf8mb4            | 237 |            | Yes         |       8 |
| utf8mb4_spanish2_ci          | utf8mb4            | 238 |            | Yes         |       8 |
| utf8mb4_roman_ci             | utf8mb4            | 239 |            | Yes         |       8 |
| utf8mb4_persian_ci           | utf8mb4            | 240 |            | Yes         |       8 |
| utf8mb4_esperanto_ci         | utf8mb4            | 241 |            | Yes         |       8 |
| utf8mb4_hungarian_ci         | utf8mb4            | 242 |            | Yes         |       8 |
| utf8mb4_sinhala_ci           | utf8mb4            | 243 |            | Yes         |       8 |
| utf8mb4_croatian_mysql561_ci | utf8mb4            | 245 |            | Yes         |       8 |
| utf8mb4_unicode_520_ci       | utf8mb4            | 246 |            | Yes         |       8 |
| utf8mb4_vietnamese_ci        | utf8mb4            | 247 |            | Yes         |       8 |
+------------------------------+--------------------+-----+------------+-------------+---------+
25 rows in set (0.00 sec)

select @@session.collation_connection, @@global.collation_connection,
    -> @@session.collation_database, @@global.collation_database,
    -> @@session.collation_server, @@global.collation_server,
    -> @@session.character_set_client, @@global.character_set_client,
    -> @@session.character_set_connection, @@global.character_set_connection,
    -> @@session.character_set_database, @@global.character_set_database,
    -> @@session.character_set_server, @@global.character_set_server,
    -> @@global.character_set_system;
+--------------------------------+-------------------------------+------------------------------+-----------------------------+----------------------------+---------------------------+--------------------------------+-------------------------------+------------------------------------+-----------------------------------+----------------------------------+---------------------------------+--------------------------------+-------------------------------+-------------------------------+
| @@session.collation_connection | @@global.collation_connection | @@session.collation_database | @@global.collation_database | @@session.collation_server | @@global.collation_server | @@session.character_set_client | @@global.character_set_client | @@session.character_set_connection | @@global.character_set_connection | @@session.character_set_database | @@global.character_set_database | @@session.character_set_server | @@global.character_set_server | @@global.character_set_system |
+--------------------------------+-------------------------------+------------------------------+-----------------------------+----------------------------+---------------------------+--------------------------------+-------------------------------+------------------------------------+-----------------------------------+----------------------------------+---------------------------------+--------------------------------+-------------------------------+-------------------------------+
| utf8mb4_general_ci             | utf8mb4_general_ci            | utf8mb4_general_ci           | utf8mb4_general_ci          | utf8mb4_general_ci         | utf8mb4_general_ci        | utf8mb4                        | utf8mb4                       | utf8mb4                            | utf8mb4                           | utf8mb4                          | utf8mb4                         | utf8mb4                        | utf8mb4                       | utf8                          |
+--------------------------------+-------------------------------+------------------------------+-----------------------------+----------------------------+---------------------------+--------------------------------+-------------------------------+------------------------------------+-----------------------------------+----------------------------------+---------------------------------+--------------------------------+-------------------------------+-------------------------------+
1 row in set (0.00 sec)

SELECT default_character_set_name FROM information_schema.SCHEMATA 
    -> WHERE schema_name = "wd_db";
+----------------------------+
| default_character_set_name |
+----------------------------+
| utf8mb4                    |
+----------------------------+
1 row in set (0.00 sec)

Comment by Dan [ 2016-09-18 ]

I attached a simple test case.

It fails with 1.5.2 and passes with 1.4.6.
You can run it with 1.5.2 by calling

mvn clean package -Pfail

and with 1.4.6 by calling

mvn clean package -Ppass

.

This was tested on Mac OS X with homebrew's MariaDB 10.1.17 (unchanged default configuration).
CONJ_345.zip

Comment by Diego Dupin [ 2016-09-20 ]

Thanks for the test case, it has permit to identify the problem !

Problem is the use of "COLLATE" keywork in PREPARE statement :
CREATE TABLE IF NOT EXISTS `tt` (
`test` varchar(500) COLLATE utf8mb4_unicode_ci NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

#Text work well
SELECT * FROM `tt` WHERE `test` LIKE 'jj' COLLATE utf8mb4_unicode_ci

#will throw an error "Erreur SQL (1253) : COLLATION 'utf8mb4_unicode_ci' is not valid for CHARACTER SET 'binary'"
PREPARE stmt FROM 'SELECT * FROM `tt` WHERE `test` LIKE ? COLLATE utf8mb4_unicode_ci'

I'll create a server issue about this problem.

Second is on driver side : to use text query as a fallback to fail prepare.
Diego.

Comment by Diego Dupin [ 2016-09-22 ]

correction in driver is done with CONJ-350. (if prepare doesn't work, will fallback to use text protocol = without the prepare).
fixed in 1.5.3 (will be released the oct. 3).

correction is available in snapshot release :

<repositories>
    <repository>
        <id>sonatype-nexus-snapshots</id>
        <name>Sonatype Nexus Snapshots</name>
        <url>https://oss.sonatype.org/content/repositories/snapshots</url>
    </repository>
</repositories>
 
<dependencies>
    <dependency>
        <groupId>org.mariadb.jdbc</groupId>
        <artifactId>mariadb-java-client</artifactId>
        <version>1.5.3-SNAPSHOT</version>
    </dependency>
</dependencies>

Server issue MDEV-10865 created to solve initial problem

Comment by Dan [ 2016-09-22 ]

Thank you, I tested with 1.5.3-SNAPSHOT and everything works as expected.

Generated at Thu Feb 08 03:14:59 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.