[CONJ-836] Single Update with nested functions where internal functions return distinct values are returning identical values Created: 2020-10-26  Updated: 2020-12-01  Resolved: 2020-12-01

Status: Closed
Project: MariaDB Connector/J
Component/s: JDBC compatibility
Affects Version/s: 2.3.0, 2.2.4, 2.4.0, 2.5.0, 2.6.0, 2.7.0
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Davi de Paula Cavalcanti Assignee: Diego Dupin
Resolution: Not a Bug Votes: 0
Labels: None
Environment:

MacOS 10.14.6, AWS Aurora (MySQL), MySQL 5.7


Attachments: Java Source File MariaDBTestUUID.java     File build.gradle    
Issue Links:
Relates
relates to CONJ-417 Support characterEncoding and collati... Closed
relates to CONJ-597 default collation use utf8mb4 if serv... Closed

 Description   

I'm trying to update the Java connector in one of my applications from version 2.2.3 to 2.2.4.
However, one of my SQL statements started behaving differently with the new version.

UPDATE TEST_UUID SET new_id = UNHEX(REPLACE(uuid(), '-', ''))

Up to version 2.2.3, this SQL would set a distinct value to `new_id` in each of the rows in the table.
From version 2.2.4, this sets the same exact value to each of the rows.
I've tested this with up to version 2.7.0, the behavior is always the same starting from 2.2.4.

If I run the statement directly on MySQL server (using mysql>), the result is always as expected (distinct values on each row).
I've tried that on MySQL 5.7.22 (locally installed in my env), MySQL 5.7.30 (test docker container), and more importantly, Aurora DB (apparently running MySQL 5.6.10).

I've written a very simple Java application to replicate the issue and do some investigation myself.
The 2 files are attached to the ticket.

From running the application it is clear that the problem is not really with the call to `UUID()`. It is `REPLACE(UUID())` that causes identical values to be returned on 2.2.4.

Is there some parameter I must pass for that to work correctly from 2.2.4?

P.S.:

In case you want to run the attached application:

1 - connect to each of the DBs and run the following queries:

create table testUUID (ID int, new_ID varchar(100), new_ID2 varchar(100), new_ID3 BINARY(16));
insert into testUUID (id) values (1),(2),(3);

2 - replace the connection info masked for security reasons and run it on each of the DBs.

3 - set the version of mariadb client to use in the test in build.gradle

4 - run main()

If the output shows 3 for each of the distinct queries, the update stmt created distinct values in each column. If it shows 1, the update set the same value to all rows for that column. The queries should all return 3.

What I saw, for all DBs:

With mariadb client 2.2.3, it returns 3 for all queries.
With mariadb client 2.2.4, it returns 1 for the second and third queries (new_ID2 and new_ID3).



 Comments   
Comment by Davi de Paula Cavalcanti [ 2020-10-26 ]

This seems to have happened to others as well:
https://stackoverflow.com/questions/44622947/sql-duplicates-while-replaceuuid-on-update-query

Comment by Davi de Paula Cavalcanti [ 2020-10-26 ]

After some more investigation, it seems to be related to this:
https://jira.mariadb.org/browse/CONJ-597

Source: https://stackoverflow.com/a/51393124/5154619
I tested adding the `convert` function: `convert(uuid() using utf8mb4)`), and it showed distinct values in every row.

Comment by Davi de Paula Cavalcanti [ 2020-10-27 ]

https://jira.mariadb.org/browse/CONJ-417 helped a lot, as I could not change the databases that had already been created.

Comment by Diego Dupin [ 2020-11-17 ]

I failed to reproduce the issue, whatever the server I use and connector version.
Could you send the result of the charset you use in connection / database /table with the following command ?

SHOW VARIABLES LIKE '%COLLATION%';
SHOW VARIABLES LIKE '%CHARACTER%';
 
SELECT CCSA.character_set_name FROM information_schema.`TABLES` T,
       information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` CCSA
WHERE CCSA.collation_name = T.table_collation
  AND T.table_name = "testuuid";

Comment by Davi de Paula Cavalcanti [ 2020-11-18 ]

Hi Diego, thank you for looking into this!

I've added the queries to the test program, and this was the output:

10:59:34 am: Executing task 'MariaDBTestUUID.main()'...

> Task :compileJava UP-TO-DATE
> Task :processResources NO-SOURCE
> Task :classes UP-TO-DATE

> Task :MariaDBTestUUID.main()
Running "select ID, new_ID, new_ID2, hex(new_ID3) as new_ID3 from testUUID"...
ID: 1, new_ID: f117abd2-2930-11eb-904c-85ca6393b8de, new_ID2: f117aac4293011eb904c85ca6393b8de, new_ID3: F117AAD8293011EB904C85CA6393B8DE,
ID: 2, new_ID: f117aca4-2930-11eb-904c-85ca6393b8de, new_ID2: f117aac4293011eb904c85ca6393b8de, new_ID3: F117AAD8293011EB904C85CA6393B8DE,
ID: 3, new_ID: f117acd6-2930-11eb-904c-85ca6393b8de, new_ID2: f117aac4293011eb904c85ca6393b8de, new_ID3: F117AAD8293011EB904C85CA6393B8DE,
Running "select count(distinct(new_ID)) as distinct_rows from testUUID"...
distinct_rows: 3,
Running "select count(distinct(new_ID2)) as distinct_rows2 from testUUID"...
distinct_rows2: 1,
Running "select count(distinct(new_ID3)) as distinct_rows3 from testUUID"...
distinct_rows3: 1,
Debugging info for Diego Dupin:
Running "SHOW VARIABLES LIKE '%COLLATION%'"...
Variable_name: collation_connection, Value: utf8mb4_unicode_ci,
Variable_name: collation_database, Value: latin1_swedish_ci,
Variable_name: collation_server, Value: latin1_swedish_ci,
Running "SHOW VARIABLES LIKE '%CHARACTER%'"...
Variable_name: character_set_client, Value: utf8mb4,
Variable_name: character_set_connection, Value: utf8mb4,
Variable_name: character_set_database, Value: latin1,
Variable_name: character_set_filesystem, Value: binary,
Variable_name: character_set_results, Value: utf8mb4,
Variable_name: character_set_server, Value: latin1,
Variable_name: character_set_system, Value: utf8,
Variable_name: character_sets_dir, Value: /usr/local/mysql-5.7.22-macos10.13-x86_64/share/charsets/,
Running "SELECT CCSA.character_set_name FROM information_schema.`TABLES` T,
information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` CCSA
WHERE CCSA.collation_name = T.table_collation
AND T.table_name = "testuuid""...
CHARACTER_SET_NAME: latin1,

BUILD SUCCESSFUL in 0s
2 actionable tasks: 1 executed, 1 up-to-date
10:59:34 am: Task execution finished 'MariaDBTestUUID.main()'.

Comment by Diego Dupin [ 2020-11-18 ]

Ok, with your comment, the problem reside in server part :

ID: 1, new_ID: f117abd2-2930-11eb-904c-85ca6393b8de, new_ID2: f117aac4293011eb904c85ca6393b8de, new_ID3: F117AAD8293011EB904C85CA6393B8DE,
ID: 2, new_ID: f117aca4-2930-11eb-904c-85ca6393b8de, new_ID2: f117aac4293011eb904c85ca6393b8de, new_ID3: F117AAD8293011EB904C85CA6393B8DE,
ID: 3, new_ID: f117acd6-2930-11eb-904c-85ca6393b8de, new_ID2: f117aac4293011eb904c85ca6393b8de, new_ID3: F117AAD8293011EB904C85CA6393B8DE,

new_ID2 and new_ID3 get the exact same generated UUID Value.
What is the environment and server version ?

Comment by Davi de Paula Cavalcanti [ 2020-11-20 ]

Yes, I wrote that test program to demonstrate that (on the previous version of the library it doesn't happen).

I've tested on:
MySQL 5.7.22 on macOS
MySQL 5.7.30 (test docker container, runs on Linux),
Aurora DB (apparently running MySQL 5.6.10)

Comment by Diego Dupin [ 2020-12-01 ]

Problem reside in mysql 5.7 that evaluate UUID() only once when using UTF8MB4 charset.
Connector use by default utf8mb4 charset in 2.2.4. This part is not an issue. The problem is why MySQL server 5.7 has this issue.
I would recommend to report that bug here : https://bugs.mysql.com/

just an example to reproduced (only SQL)

DROP TABLE  if EXISTS testUUID;
create table testUUID (ID int, new_ID varchar(100), new_ID2 varchar(100), new_ID3 BINARY(16)) CHARACTER SET UTF8MB4 COLLATE utf8mb4_general_ci;
insert into testUUID (id) values (1),(2),(3);
 
SET NAMES 'utf8' COLLATE 'utf8_general_ci';
 
UPDATE testUUID SET new_id = uuid(), new_id2 = REPLACE(uuid(), '-', ''), new_id3 = UNHEX(REPLACE(UUID(), '-', ''));
 
SELECT new_ID, new_ID2 FROM testUUID; 
/* good !
new_ID;new_ID2
ac65af1e-33e7-11eb-9b97-c8348e0fed44;ac65af3133e711eb9b97c8348e0fed44
ac65b27d-33e7-11eb-9b97-c8348e0fed44;ac65b28733e711eb9b97c8348e0fed44
ac65b300-33e7-11eb-9b97-c8348e0fed44;ac65b30433e711eb9b97c8348e0fed44
 
*/
SET NAMES 'utf8mb4' COLLATE 'utf8mb4_general_ci';
UPDATE testUUID SET new_id = uuid(), new_id2 = REPLACE(uuid(), '-', ''), new_id3 = UNHEX(REPLACE(UUID(), '-', ''));
SELECT new_ID, new_ID2 FROM testUUID; 
/* BAD !!!! new_ID2 are all the same
new_ID;new_ID2
ac6aa6ae-33e7-11eb-9b97-c8348e0fed44;ac6aa54a33e711eb9b97c8348e0fed44
ac6aa7be-33e7-11eb-9b97-c8348e0fed44;ac6aa54a33e711eb9b97c8348e0fed44
ac6aa7e9-33e7-11eb-9b97-c8348e0fed44;ac6aa54a33e711eb9b97c8348e0fed44
*/

Comment by Diego Dupin [ 2020-12-01 ]

just reported here https://bugs.mysql.com/bug.php?id=101820

Generated at Thu Feb 08 03:18:40 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.