[MDEV-13621] CONNECT engine table_type=JDBC UPDATE containing single or double quote chars produces wrong result Created: 2017-08-22  Updated: 2017-10-26  Resolved: 2017-08-27

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Connect
Affects Version/s: 10.1.26
Fix Version/s: 10.0.33, 10.1.27, 10.2.9, 10.3.1

Type: Bug Priority: Major
Reporter: Robert Dyas Assignee: Olivier Bertrand
Resolution: Fixed Votes: 0
Labels: None
Environment:

Centos7


Issue Links:
Duplicate
duplicates MDEV-13926 CONNECT engine table_type=JDBC UPDATE... Open
duplicates MDEV-13927 CONNECT engine table_type=JDBC escapi... Closed

 Description   

In testing with the Microsoft JDBC environemnt,

INSERTing a value like the following DOES work: Bob's
INSERTing a value like the following DOES work: size is 24"

However, trying to UPDATE a column and specifying those values does NOT work...
for UPDATE with a string like Bob's you get an error, and for UPDATE with size is 24" you don't get an error but it is saved as \" and the next query returns that.

The escaping is the same for INSERT and UPDATE, but it doesn't seem to work correctly for UPDATE.



 Comments   
Comment by Olivier Bertrand [ 2017-08-23 ]

Insert works differently: CONNECT make an INSERT statement (prepared when accepted by the driver) and fills the values before executing it.

For UPDATE and DELETE, the original query is retrieved from MariaDB and edited to change the table name. The issue is that the query retrieved using the thd_query_string funtion is not identical to the original one. For instance, when executing:

update mynumb set v = 'Fred\'s' where n = 1;
update mynumb set v = 'size is 28"' where n = 2;

The thd_query_string funtion returns:

update mynumb set v = 'Fred\\'s' where n = 1;
update mynumb set v = 'size is 28\"' where n = 2;

I think this is a MariaDB problem (bug?), not a CONNECT one.

Note that when connected via JDBC to a MySQL or MariaDB server, the update is done correctly, probably because the parser knows how to eliminate the spurious backslash.

Meanwhile a turnaround seems to replace escaping by doubling the character:

update mynumb set v = 'Fred''s' where n = 1;
update mynumb set v = "size is 28""" where n = 2;

Providing the remote server accepts it.

Comment by Robert Dyas [ 2017-08-24 ]

Given the two bug reports I've submitted where INSERT works correctly and UPDATE does not, would it be possible for UPDATE to be changed to work more like INSERT? ("CONNECT makes an INSERT statement (prepared when accepted by the driver) and fills the values before executing it.")

It seems that if UPDATE were to bind variables using a prepared statement it would avoid multiple problems 1) this one with different servers requiring different escaping of special characters (the driver's bind would deal with this) and 2) issues with certain characters not encoding correctly (bug filed where INSERT does it right and UPDATE does it wrong with many foreign characters, see MDEV-13504).

Comment by Olivier Bertrand [ 2017-08-24 ]

Yes. However, apparently, JDBC does not have the SELECT ... FOR UPDATE statement. Therefore, constructing a prepared statement for UPDATE or DELETE would require to re-parse the statement and I don't think this can be done from a storage engine. Or to make it like for INSERT from the used columns.

The issue is that this prepared statement must also include the WHERE clause. This does not exists for INSERT. It can be retrieved by the cond_push feature but there is no garanty it is allways completely retrieved. This could lead to a somewhat destructing statement that would update more rows than wished.

All these problems would be solved if CONNECT could get the original query unchanged.

Comment by Robert Dyas [ 2017-08-24 ]

RE "It can be retrieved by the cond_push feature but there is no granularity it is always completely retrieved." Can you explain more or provide an example?

pstmt = conn.prepareStatement("UPDATE `Foreign_Table_Name` SET `col1` = ? WHERE `id` = ? ");
pstmt.setString(1, "somevalue not escped");
pstmt.setString(2, "some PK value");
int rowsAffected = pstmt.executeUpdate();

So if the UPDATE given to MariaDB has a WHERE clause specified, it would always be in cond_push correct? And if not, then that is correct as well (i.e. not specified, so applies to all rows).

Comment by Olivier Bertrand [ 2017-08-24 ]

As it is, the cond_push function does not handle all the where clause possibilities that can be fairly complex with functions, like, subselect, and applied to other tables in case of join.

And how to handle in a prepared statement the IN, IS NULL and IS NOT NULL clauses?

I don't say it is impossible but it represents quite a new implementation, possibly buggy, while retrieving the unchanged original query statement is a simple fix for all this.

Comment by Robert Dyas [ 2017-08-24 ]

Not every variable needs to be bound in a prepared statement, you can use a hybrid approach like this:

pstmt = conn.prepareStatement("UPDATE `foreignTable` SET col1 = ? WHERE id = 12345 ");
pstmt.setString(1, "somevalue not escped");

To bind NULL values in a SET clause, this will work:

pstmt.setNull(1, Types.NULL);

For WHERE clause I believe you do need to specify IS NULL or IS NOT NULL which is easy with hybrid approach.

If you could make it so the SET portion of the clause is bound but the WHERE clause is as it is now, that would be a nice improvement if it didn't introduce new bugs.

Comment by Olivier Bertrand [ 2017-08-25 ]

The WHERE clause can contain quotes and double quotes as well.

Comment by Robert Dyas [ 2017-08-25 ]

Absolutely, but in our use cases most queries are primary key based matching integers, or ranges matching dates, or simple text queries (seldom quotes or extended characters). No doubt the WHERE clause is an issue, but the SET clause handling the extended chars, single and double quotes is at least 10x more important for our use cases. So a partial solution is a huge step forward.

Is the "hybrid" approach above possible? (i.e. bind the SET variables and do WHERE like now)

Comment by Olivier Bertrand [ 2017-08-26 ]

Thanks to Sergei information, I found how to retrieve the original query without additional escapes.

This is the perfect fix for all our problems.

Comment by Robert Dyas [ 2017-08-26 ]

Will this work even if the original (by "original" I mean the one the user submits) query includes sub selects in the where clause that reference tables not in the remote system?

Will the following work? (Bob's has two single quotes)

UPDATE remoteT1 SET col1 = 'Bob''s fence is 24" high' WHERE id = (select distinct id from localT2 where a > 17)

The above is not our immediate use case, but super great if supported.

Comment by Olivier Bertrand [ 2017-08-26 ]

Of course it won't work and reply table localT2 does not exist.

Supporting it would require a complete change. Something like replacing the query by something like:

SELECT ... FOR UPDATE

I don't even dream of trying implementing this which, by the way, would not work in our case. Remember that the WHERE clause is parsed and must be understood locally but is applied by the remote server.

Comment by Robert Dyas [ 2017-08-26 ]

Well, the following works today (I just tested it):

select * from MyConnectTable where id in (select distinct Customer_ID from MyLocalNonConnect)

But an UPDATE statement with the same WHERE clause fails.

In theory (I realize not how it works today) couldn't sub queries be processed by MariaDB locally (select distinct Customer_ID from MyLocalNonConnect) and the VALUES from the result be passed to the remote server? To be clear this is not a feature request - just thinking.

Comment by Olivier Bertrand [ 2017-08-26 ]

I am trying to keep things simple and working, even warning that UPDATE and DELETE are supported "in a simplified way".

For instance, how could CONNECT decide whether the sub-query applies to a local or remote table?

Comment by Robert Dyas [ 2017-08-27 ]

I can reopen this discussion in another MDEV at a later date, as I don't want to slow down the fixing of the single and double quote issue itself... those are important core use cases as we can't control what a user types in, and many of our customers have product descriptions using both characters (e.g. Gold 8" long pendant from Kyle's Jewlery). And stability is most important at this stage.

One final thought ... but since I have no idea about the internal workings of MariaDB and how queries interact with storage engines, all I can do is speculate on what a solution might look like. But one solution regarding how CONNECT could decide if sub-query is local or remote --> don't decide If you could take any subquery and pass it back up to the top MariaDB server (local table or remote), get the results as a derived table with a single column, then use those in a hard-coded query... seems like that would handle a lot of use cases (not correlated sub queries, but others).

So this:

UPDATE remoteTable SET col1 = 'abc' WHERE id IN (SELECT id from localTable where a > 7)

takes the sub-query and passes it to mariadb top level to execute, and the results of that are used to re-write the sub-query to something like this:

UPDATE remoteTable SET col1 = 'abc' WHERE id IN (45,27,1234,5554,67)

Right now sub-query in update is purely a speculative use case for us whereas the single and double quote handling (and the Ntext datatype support) are real-life immediate needs. The rest seems to be working very well, and stability of the Java connect stuff is excellent.

Generated at Thu Feb 08 08:07:01 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.